Introduction

Although adolescent idiopathic scoliosis (AIS) has been associated with impaired functioning, body image, and health-related quality of life (HRQoL) [2, 27], the goal of surgical treatment is to prevent curve progression and achieve curve correction, not to improve the possible impairment in HRQoL [11]. Consequently, the information in the literature on HRQoL before and after scoliosis surgery is scant in comparison with the more easily measured technical outcomes such as X-ray measurements. Moreover, surgeon-oriented results have no clear relationship with HRQoL or degree of satisfaction with the treatment [9, 31]. Hence, the effect of surgery on subjective outcomes is difficult to interpret and may be poorly estimated by traditional objective measures. A few short-term longitudinal studies have reported significant improvement in HRQoL domain scores when preoperative and postoperative data are compared [7, 10, 20, 22], but there are no prospective studies that confirm this improvement at long-term follow-up. It is also important to compare the subjective results of patients who have undergone surgery for AIS with those of healthy subjects [11] to determine whether surgery leads to normalization of HRQoL in these patients. A further question is which score change on validated questionnaires best reflects a clinically relevant difference. The use of minimal clinically important difference (MCID) values permits clinical changes in outcomes to be interpreted, and affords further interpretation of previous studies that used the same instruments. The goals of this prospective study were to: (1) evaluate clinically relevant postoperative changes in HRQoL, functionality, and body image as assessed by validated instruments in subjects who had undergone surgery for AIS more than 5 years earlier, (2) compare these self-reported outcomes with the published norms, and (3) identify possible predictors of these subjective outcomes.

Materials and methods

We included only patients who met the following criteria: (1) diagnosis of AIS, (2) a minimum 40° Cobb angle, and (3) a minimum follow-up period of 5 years. Ninety-one consecutive patients with these characteristics were operated on in our department from 2002 to 2009. The surgical correction of the scoliosis was achieved using a hybrid Colorado 2 System. Pedicle screws were inserted in the lumbar and lower thoracic spine with more screws on the concave side. Pedicular hooks were used at the upper instrumented levels and were always stabilized by a locking staple that attached to the inferior facet. Clamps were slid onto the precontoured titanium rod at the concave side and connected to the hooks and screws. Another precontoured rod was attached to the convex side following the same procedure. The rods were rotated and the progressive tightening of the nuts on the threaded posts approximated the spine to the rods. Segmental compression or distraction was carried out before the final locking of the system. Autologous bone grafts from the spinous process were always used. All patients agreed to participate and gave their informed consent to the study. Four patients were lost to follow-up and excluded from the analysis of outcomes. The remaining 87 patients were included in the study (follow-up rate = 96 %). No significant differences between the participants and the subjects lost to follow-up were found with respect to age at operation, gender, or preoperative severity of the scoliotic curve, nor were there any perioperative complications in these four cases. The follow-up data that were available for the subjects lost to follow-up suggested no differences from the study group. The mean length of follow-up for the participants was 6.9 ± 2.0 years (range 5–12).
Clinical data including measurement of body weight, height, trunk shift imbalance, and height of the most prominent paravertebral hump were prospectively collected at set intervals (preoperative, 1 year postoperatively, and annually thereafter). The hump was measured with an air bubble scoliometer with the patient in the forward bent position and recorded in millimeters. The trunk shift imbalance was calculated with a plumb line as the horizontal distance of the spinous process of C7 from the center of the sacral line, expressed in millimeters. Preoperatively and at each subsequent follow-up control, we employed the Italian version of the Short Form-36 Health Survey (SF-36) [3] and the Scoliosis Research Society (SRS)-23 Questionnaire [6]. SF-36 results were compared to the published data [3]; the MCID relative to the normative data was taken as 5–7 points [11]. SRS-23 results were compared to the results in healthy subjects [13]. The following differences between the study group and the normative data were considered an MCID [7]: average sum score 0.6, back pain 0.6, function/activity 0.8, self-image/appearance 0.5, and mental health 0.4. As part of the routine care, full-length posteroanterior and lateral radiographs of the spine were obtained preoperatively and postoperatively at set intervals (1 day postoperatively, 6 months postoperatively, 1 year, and last follow-up) and measured by an unbiased author (CA). The Cobb angle of the scoliotic curve at the 1-year and final follow-up controls was compared to the preoperative angle, and the correction rate was calculated. Preoperative coronal curves were classified according to Lenke et al. [17]. Caudal level and the distal extension of fusion were recorded. Thoracic kyphosis was measured by the Cobb method on lateral films, selecting the segments T3–T12 as limits. Information on both perioperative and delayed complications was collected. We recorded any re-operations.
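The SRS-23 thresholds listed above and the correction rate can be illustrated with a short sketch. The threshold values are those given in the text; the function names are hypothetical, the example inputs are invented and are not study data, and the correction-rate formula (percent reduction from the preoperative Cobb angle) is the conventional definition, assumed here because the text does not spell it out.

```python
# MCID thresholds for SRS-23 comparisons, as listed in the text.
SRS23_MCID = {
    "average sum score": 0.6,
    "back pain": 0.6,
    "function/activity": 0.8,
    "self-image/appearance": 0.5,
    "mental health": 0.4,
}


def exceeds_mcid(domain, patient_mean, reference_mean):
    """Return True if the absolute difference reaches the MCID for the domain."""
    difference = abs(patient_mean - reference_mean)
    # Small tolerance so that a difference equal to the threshold still counts
    # despite floating-point rounding (assumption: "equal or more" qualifies).
    return difference >= SRS23_MCID[domain] - 1e-9


def correction_rate(preop_cobb, followup_cobb):
    """Percent curve correction relative to the preoperative Cobb angle.

    Assumed conventional formula: (preop - followup) / preop * 100.
    """
    if preop_cobb <= 0:
        raise ValueError("preoperative Cobb angle must be positive")
    return (preop_cobb - followup_cobb) / preop_cobb * 100.0


# Hypothetical values for illustration only; they are not study data.
print(exceeds_mcid("back pain", 3.4, 4.2))       # difference of 0.8
print(exceeds_mcid("mental health", 3.9, 4.0))   # difference of 0.1
print(correction_rate(60.0, 30.0))
```

Keeping the thresholds in one mapping makes it easy to apply the same rule both to changes over time and to differences from normative values.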

Statistical analysis

A two-sample t test for paired or unpaired data and an ANOVA test were used to test the significance of the cross-sectional or longitudinal differences in the means. A Bonferroni test was used to test the differences between multiple groups. A multiple stepwise linear regression analysis was used to evaluate the relationship between each explanatory variable and the summary measures and single scale scores on the SF-36 and SRS-23 Questionnaires at the last follow-up. The possible explanatory variables we included were: age at operation, age at follow-up, gender, body mass index (BMI), post-surgery complications or re-operations, Cobb angle of the thoracic curve at the final follow-up, Cobb angle of the lumbar or thoraco-lumbar curve at the final follow-up, height of the most prominent residual rib hump in millimeters (mm), angle of thoracic kyphosis at the final follow-up, distal level of the fusion (L1, L2, L3, or L4), length of the fusion area, and preoperative score of the outcome variable under examination. Before constructing the models, an age-adjusted univariate linear regression analysis was performed. Explanatory variables were included in our multiple regression models if a trend toward an association (i.e. p ≤ 0.10) with the outcome of interest was found in the univariate analysis. In the multiple linear regression analysis, we calculated the total R² for the model and the changes in R² for the independent contribution of single factors, to assess the percent of the total variance in the outcome accounted for by the whole model and by single explanatory variables, respectively. A p value of less than 0.05 was considered significant. SPSS software (SPSS, Inc., Chicago, IL, USA) was used for the database and statistics.
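The screening rule and the R² bookkeeping described above can be sketched as follows. This is an illustration only, not the SPSS procedure used in the study: the function names, p values, and fitted values are hypothetical, and in practice the fitted values would come from the regression software.

```python
def screen_candidates(univariate_p, threshold=0.10):
    """Keep variables whose univariate p value is at or below the threshold."""
    return [name for name, p in univariate_p.items() if p <= threshold]


def r_squared(observed, fitted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_total = sum((y - mean_obs) ** 2 for y in observed)
    ss_resid = sum((y - f) ** 2 for y, f in zip(observed, fitted))
    return 1.0 - ss_resid / ss_total


def r_squared_change(observed, fitted_reduced, fitted_full):
    """Share of variance added by the extra variable(s) in the full model."""
    return r_squared(observed, fitted_full) - r_squared(observed, fitted_reduced)


# Hypothetical univariate p values (not study results).
p_values = {"rib hump height": 0.02, "BMI": 0.45, "distal fusion level": 0.10}
print(screen_candidates(p_values))

# Hypothetical outcome scores and fitted values from two nested models.
observed = [1.0, 2.0, 3.0, 4.0]
fitted_reduced = [2.0, 2.0, 3.0, 3.0]   # model without the extra variable
fitted_full = [1.5, 2.0, 3.0, 3.5]      # model with the extra variable
print(r_squared(observed, fitted_full))
print(r_squared_change(observed, fitted_reduced, fitted_full))
```

The change in R² between nested models is what attributes a share of the explained variance to a single explanatory variable.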

Results

The average age of participants at the time of surgery was 14.8 ± 2.3 years (range 11–22). Clinical and radiographic data are summarized in Table 1. Postoperatively, the mean correction of the thoracic curve at the 1 year and final follow-up was 53 and 47 %, respectively. For the lumbar (or thoraco-lumbar) curve it was 63 and 57 %, respectively. Significant loss of correction (P < 0.001) occurred for both types of curve over the follow-up interval. Thirteen surgery-related complications occurred in the study group (14.9 %) and revision surgery was required for seven patients (revision rate = 8.0 %). In detail, postoperative deep infection developed in three patients. Two of these were early infections that healed with surgical irrigation and debridement and did not require implant removal. In the third patient a hematogenous infection occurred 6 years after the primary procedure and the instrumentation could not be retained. Two patients presented prolonged wound drainage and were successfully treated with conservative methods. Two patients had a prominent instrumentation at the upper levels. One was managed conservatively and one required revision surgery. One patient presented asymptomatic breakage of the connector between transverse process and the upper pedicular hook. Misplacement or pull-out of screws was diagnosed in five cases. Three of these patients who complained of radicular pain were treated by replacement or removal of the pedicle screw.
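The rates quoted in this study can be cross-checked from the reported counts. The sketch below uses only numbers stated in the text (91 operated patients, 4 lost to follow-up, the itemized complications, and 7 revisions); the variable and function names are illustrative.

```python
def percent(count, total):
    """Express count as a percentage of total."""
    return count / total * 100.0


operated = 91
lost_to_follow_up = 4
participants = operated - lost_to_follow_up

# Complications as itemized in the Results text.
complications = {
    "deep infection": 3,
    "prolonged wound drainage": 2,
    "prominent instrumentation": 2,
    "connector breakage": 1,
    "screw misplacement or pull-out": 5,
}
total_complications = sum(complications.values())
revisions = 7

follow_up_rate = round(percent(participants, operated))
complication_rate = round(percent(total_complications, participants), 1)
revision_rate = round(percent(revisions, participants), 1)

print(participants, follow_up_rate)
print(total_complications, complication_rate)
print(revision_rate)
```

The itemized complications sum to the thirteen events reported, and both rates use the 87 participants as the denominator.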

Table 1 Summary of clinical and radiographic data of patients (mean ± SD or N (%))

SF-36 questionnaire

The surgery group’s SF-36 scores preoperatively and at the 1-year and final follow-up are summarized in Fig. 1. Regarding clinically relevant differences, the final MCS of the SF-36 questionnaire compared favorably with the age-matched normative value (47.9 ± 12.3), whereas the final PCS was lower than that of unaffected subjects (55.6 ± 6.8), although this difference did not exceed the MCID (Table 2). The SF-36 single domain scores are reported in Table 2 for male and female patients in comparison with age- and sex-matched norms. At the final follow-up, most scale scores belonging to the physical health category were inferior to normative values. Conversely, the mental health indexes were similar to or surpassed these norms.

Fig. 1 Results of SF-36 Questionnaire (mean ± standard deviation). PF physical functioning, RP role physical, BP bodily pain, GH general health, VT vitality, SF social functioning, RE role emotional, MH mental health, PCS physical component summary, MCS mental component summary

Table 2 Results of SF-36 Questionnaire (mean ± SD) in patients (PTS) stratified by sex in comparison with age-matched healthy subjects (CTR) (Δ = difference) [2]

SRS questionnaire

The overall and single section scores on the SRS-23 questionnaire are reported in Fig. 2. The study group experienced significant improvement in all domains of the questionnaire 1 year postoperatively and at the final follow-up in comparison to their preoperative status. A decrease in all domains except function/activity occurred between the 1-year and final follow-up, but the level of satisfaction remained high (1-year follow-up = 4.3 ± 0.5; final follow-up = 4.4 ± 0.7). Clinically, the self-image and pain domains improved by at least the MCID 1 year after surgery, but pain was the only domain whose improvement remained clinically relevant at the final follow-up. The SRS scores are reported in Table 3 for male and female patients in comparison with age- and sex-matched norms. Preoperatively, the mean differences between the study group and normative values exceeded the MCID for the pain and self-image domains. At the final follow-up, no differences exceeding the MCID were detected between the study group and normative values, but female patients in the study group reported more pain.

Fig. 2 Results of SRS (Scoliosis Research Society) questionnaire (mean ± standard deviation)

Table 3 Results of SRS (Scoliosis Research Society) Questionnaire (mean ± SD) in patients (PTS) stratified by sex in comparison with age-matched healthy subjects (CTR) (Δ = difference) [11]

Relationships

A better final result was recorded on the function/activity domain of the SRS-23 questionnaire in subjects with a Lenke 1 curve pattern as compared to those with a Lenke 3 curve pattern (P = 0.040). Determinants of the SF-36 questionnaire summary and scale scores are reported in Table 4. The most important negative determinants of physical and mental indexes were the occurrence of complications or re-operations and the height of the residual rib hump, respectively. Table 5 reports the predictors of the SRS questionnaire scores at the final follow-up. Height of the residual rib hump and a more caudal level of fusion represented the most important negative predictors of these scores. Patients who had reported complications or re-operations showed worse physical status (SF-36 PCS score = 46.8 ± 6.9 vs 51.1 ± 5.9; P = 0.045) and more pain (SRS-23 pain domain = 3.4 ± 0.7 vs 4.0 ± 0.5; P = 0.017) compared to those who had not. Conversely, no differences between patients with and without complications emerged for the mental indexes of the SF-36 (MCS = 53.7 ± 0.8 vs 54.2 ± 0.8) and SRS-23 (3.8 ± 0.6 vs 3.8 ± 0.5) questionnaires.

Table 4 Predictors of SF-36 Questionnaire scores at the final follow-up (multiple linear regression analysis)
Table 5 Predictors of average SRS-23 questionnaire scores at the final follow-up (multiple linear regression analysis)

Discussion

The assessment of surgical results in AIS should focus not only on the correction of curvature but also on changes in quality of life and functionality, since clinical and radiographic results may be unrelated to patient satisfaction [9, 31]. Nevertheless, the outcome in terms of HRQoL in the short term cannot be considered a primary effect variable after scoliosis surgery and there remains a need for a prospective study with long-term follow-up [11]. While the use of innovative rehabilitation programs carried out until skeletal maturity has been associated with improvement in the HRQoL of adolescents with mild AIS [21], one recent literature review was unable to draw the same conclusion regarding the impact of surgical versus non-surgical interventions on the long-term HRQoL and cosmetic issues of patients with severe AIS [8]. In the present study, the generic and scoliosis-specific HRQoL were prospectively assessed for more than 5 years using validated tools. Preoperatively, our patients had impaired HRQoL with respect to their age- and sex-matched healthy controls, especially in physical domains. Surgery led to significant improvement in the 1-year SF-36 scores; these showed a further slight progression until the final follow-up. One recent prospective study with a 1-year follow-up also showed postoperative improvement in SF-36 indexes [22], but data on SF-36 results from a longer follow-up are lacking. Despite the postoperative improvement found in this study, the final PF, RP, and BP (in males) scale scores still remained significantly lower than age- and sex-matched norms. All of these scales belong to the physical health category. This finding agrees with the results of previous studies [12] indicating that patients who had undergone scoliosis surgery, even with similar or superior mental health characteristics, continue to subjectively demonstrate inferior physical status and role limitations in comparison to unaffected subjects.
In our study group, the preoperative differences from normative values were clinically relevant for the pain and self-image domains of the SRS questionnaire. This result is in keeping with previous studies [30] that found lower scores for the pain and self-image domains on the SRS-24 questionnaire in subjects with untreated scoliosis in comparison with age- and sex-matched normative values. The greater the curve magnitude, the greater this difference. In the present study, the scores on the SRS questionnaire were significantly higher in comparison to the preoperative data at both the 1-year and final follow-up, even though the final scores were slightly lower than in other studies with shorter follow-up [7, 9, 14]. This may reflect a more compromised preoperative status in our patients, since the magnitude of the changes on SRS scales obtained with surgery is similar to the published results. The pain scale showed the lowest preoperative score, confirming that pain is an underestimated problem in AIS [16, 25, 26]. However, it was the only scale that still showed a gain exceeding the MCID at the final follow-up in comparison to the preoperative status. One literature review [23] concluded that postoperative improvements exceeding the MCID occur in self-image only, but other studies have found that reported pain can improve [16, 20, 25]. Contrary to the SF-36 questionnaire, most of the SRS questionnaire scores decreased over the follow-up interval in this study. A decline in SRS questionnaire results between the 2- and 5-year follow-up has been previously noted [14, 28] and one retrospective study reported SRS scores at a mean of 12.7 years after surgery very similar to our findings [26]. The reason for the different trend in the results of the two questionnaires adopted in this study could lie in their different characteristics.
Although high correlations have been reported between relevant SRS and SF-36 domains [5], the SF-36 questionnaire may not be able to capture domains that are more specific and relevant to scoliosis patients’ perceived HRQoL. Furthermore, the score distribution of SF-36 scales favors high scores [5]. Despite the decrease observed over time, the final SRS scores remained significantly higher in comparison with the preoperative period, and only the pain domain in female patients showed an impairment exceeding MCID with respect to normative values [13]. Interestingly, no decrease in the level of satisfaction with surgery was observed over time.

As for the study of outcome predictors, the amount of variance accounted for by our models was higher for SRS questionnaire domains than for SF-36 results, using the same explanatory variables. This result confirms that assessment by SRS questionnaire should be mandatory to study the HRQoL of patients who have undergone surgery for AIS. In this study the height of the most prominent residual hump was the main negative predictor of SRS scales and of the domains of the SF-36 questionnaire belonging to the mental health category. Like previous studies [4, 19, 29], this finding confirms the importance of transverse plane deformity and the associated paravertebral prominence for the patients’ perception of their condition, primarily affecting self-image but also functionality. One previous study [18] showed that all-pedicle screw systems allow better rotational and coronal correction than hybrid constructs like the one used in the current study, and this superior corrective effect leads to improvement in the patients’ perception of their postoperative cosmetic appearance. The authors hypothesized that the improved correction may be partly due to the higher number of spine fixation points in all-screw systems compared to hybrid constructs. However, the above-cited study did not show any differences in SRS overall and subdomain scores between patients who had undergone surgery with these two alternative strategies. In our cohort of patients, a more distal extension of the fusion area was also negatively related to the pain score and other domains of the SF-36 and SRS questionnaires at the final follow-up. One short-term study of patients who were treated with current instrumentation also found similar relationships [24]. However, previous long-term studies of patients who underwent surgery with Harrington instrumentation gave variable results for the association between back pain and disability and the caudal extension of the fusion area [12, 19].
The finding that a more caudal lower fusion level predicts a poor outcome suggests that surgeons should avoid extending fusions caudally, but the surgical decision should also take into account that allowing the curve below the caudal level of fusion to continue to progress may negatively influence the long-term HRQoL. Moreover, the post-surgery restoration of sagittal spino-pelvic alignment has been shown to limit early degenerative changes in the free-motion segment discs after AIS surgery [1].

The methodological strengths of this study include its prospective design, the minimum 5-year follow-up, and the high survey participation rate. Moreover, the longitudinal assessment of patients by validated patient-oriented tools and the analysis of MCID over time allowed comparison with the preoperative status and with age- and sex-matched norms, mitigating the lack of a control population. These methods have been previously listed as features of properly designed studies [11]. Some methodological weaknesses in the present study should also be acknowledged. We used the SRS-23 questionnaire because at the time the first patients in our study group underwent their surgery, the SRS-22 version was not available. We continued to use the SRS-23 version throughout the duration of this study to maintain consistency. This might have caused issues in the comparison of our data with the SRS-22 results reported in the literature. However, excellent correspondence in the questionnaire and domains’ total and mean scores between the SRS-23 and SRS-22 versions of the questionnaire has been reported [15]. Another limitation is that a more detailed analysis of radiographic parameters in the sagittal plane and clinical data related to the specific location of the residual rib hump and the shoulder balance would have provided more explanatory variables for use in the statistical analysis. In particular, shoulder height discrepancy represents one of the key components of the body deformity in AIS patients and its correction might play an important role in patients’ satisfaction after surgical treatment.

In conclusion, this study demonstrates that patients who had undergone surgery for AIS a minimum of 5 years earlier have impaired self-reported physical HRQoL in comparison with healthy subjects, but they still perform better than before their surgery. A decline in SRS questionnaire results can be observed over the follow-up period, but the outcome at the final follow-up still remains subjectively satisfactory. The residual hump and the distal extension of the fusion area are important predictors of the self-reported outcome of patients at the final follow-up.