Introduction

Osteoporosis is a systemic skeletal disease characterised by low bone mass and disruption of bone architecture resulting in reduced bone strength and an increased relative risk of fracture [1]. Post-menopausal osteoporosis, the most common form of the disease, is associated with pain and disability, mainly as a result of vertebral fractures [2, 3]. The age-standardised incidence of vertebral fracture in women with osteoporosis is estimated as 12.1/1000 person years [4] and this is expected to have an increasing impact on healthcare providers in the developed world as the population ages.

Vertebral fractures can have a significant impact on the quality of life (QoL) of post-menopausal patients with osteoporosis [5]. The chronic pain associated with undiagnosed or untreated vertebral fractures, along with other consequences of these events, such as bone deformation and gastrointestinal and respiratory disorders, can compromise patients’ ability to perform normal daily activities. In addition, the psychological consequences of fractures, which include fear of falling and breaking a bone and fear about the future, have been documented as QoL issues in women with osteoporosis [6]. Measuring functional status and QoL is now considered an important part of the evaluation of new treatments, particularly in chronic diseases such as osteoporosis [7].

In clinical studies, QoL can be measured using a disease-specific instrument, a generic instrument or both. While generic instruments have the advantage of providing scores that can be compared with other populations and conditions, these instruments alone do not cover the concepts of greatest importance to patients with osteoporosis. In order to measure specifically and accurately the impact of post-menopausal osteoporosis on QoL, a disease-specific instrument is required. The QUAlity of Life questionnaire In OSTeoporosis (QUALIOST®) [8] was developed as an osteoporosis-specific QoL module to be used in conjunction with one of the most widely used generic instruments, the Medical Outcomes Study short-form (36-item) Health Survey (SF-36®) [9]. The QUALIOST® is a 23-item module that yields a total score and two sub-scores (physical and emotional); it has undergone psychometric validation in France (n = 90) and the UK (n = 40) for internal consistency, reliability and reproducibility [8]. The responsiveness of the QUALIOST® (i.e., its ability to detect a change in QoL when a fracture occurs) has also been assessed in a study in which additional language versions were validated [10].

The Spinal Osteoporosis Therapeutic Intervention (SOTI) 5-year study has evaluated the effect of strontium ranelate versus placebo on the incidence of new vertebral fractures in post-menopausal women with osteoporosis who had previously had one or more vertebral fractures. The clinical effectiveness of this orally administered drug has been evaluated over a 3-year period, at which time patients in the strontium ranelate group had a significantly reduced risk of vertebral fracture (relative risk 0.59; 95% confidence interval [CI] 0.48–0.73; P < 0.001) [11]. This effect was apparent from the end of the first year of treatment. There was also a significant increase in lumbar bone mineral density (BMD) of 14.4% in the strontium ranelate group compared with placebo (P < 0.001) and an increase in femoral neck BMD of 8.3% (P < 0.001 versus placebo) at 3 years. QoL was one of the secondary endpoints of this study and was measured using both the generic SF-36® questionnaire and the disease-specific QUALIOST® module. An analysis of QoL at the 3-year time point was undertaken and is reported here.

Patients and methods

Study population

Post-menopausal women were recruited into this 5-year multicentre, placebo-controlled, randomised, double-blind phase III trial in the following countries: Australia, Belgium, Denmark, France, Germany, Greece, Hungary, Italy, Poland, Spain, Switzerland and the UK [11]. All patients were recruited from the FIRST (Fracture International Run-in for Strontium ranelate Trials) study [12], which aimed to start the normalisation of patients’ calcium and vitamin D status. Patients were required to meet the following inclusion criteria: Caucasian women with osteoporosis who had experienced at least one previous vertebral fracture; lumbar BMD ≤ 0.840 g/cm²; aged ≥50 years; and post-menopausal for ≥5 years. All patients were ambulatory and gave written informed consent to participate in the study, which was conducted in accordance with the Declaration of Helsinki.

Study design

Patients were randomised to receive either placebo or strontium ranelate 2 g/day for 3 years. Patients also received elemental calcium (0–1000 mg/day, depending on their dietary intake) and vitamin D (400–800 IU/day, depending on their baseline serum concentration of 25-hydroxy vitamin D). QoL was assessed at baseline and every 6 months using the SF-36® and the QUALIOST® instruments in each country except for Greece, as a Greek translation of the SF-36® was unavailable at the beginning of the study.

Assessments

The SF-36® is a well-known generic QoL measure that has been used in many different populations and diseases [9]. The 36 items can be grouped into eight multi-item dimensions for which a score can be calculated: physical functioning (ten items), role physical (four items), bodily pain (two items), social functioning (two items), mental health (five items), role emotional (three items), vitality (four items) and general health perception (five items). In addition, two summary scores can be calculated from the dimension scores: the mental component summary (MCS) and physical component summary (PCS) scores [13]. All dimension scores range from 0 to 100, with higher scores indicating better QoL.

The QUALIOST® is a 23-item instrument developed specifically for use in conjunction with the SF-36® to measure post-menopausal osteoporosis-related QoL [8] (see online supplement for details of the QUALIOST®). The QUALIOST® contains two dimensions: physical (ten items, including pain, limitations in moving, hobbies and housework and difficulties in dressing) and emotional (13 items, including negative feelings such as worry, fear of falling and being a burden, being affected by change in physical appearance and decreased health confidence). In addition to the total score, a score per dimension can be calculated. All scores range from 0 to 100, with a score of 100 indicating the greatest impairment in QoL and 0 indicating no impairment (i.e., the best QoL). Item 6 of the physical domain of the QUALIOST® (‘Have you had pain in the middle or upper part of your back?’) was also used on its own to provide a qualitative assessment of the pain associated with vertebral fractures.

Both the SF-36® and the QUALIOST® questionnaires are self-administered.

Statistical analyses

All analyses were performed using SAS software version 8.2. For all tests considered, the α risk threshold was set at 5% and tests were two sided. Analyses of treatment effect were performed on pooled data (across countries). The main dataset analysed was derived from the clinical full analysis set (FAS), which consisted of all randomised patients who had taken at least one sachet of study treatment and who had one X-ray at baseline as well as another post-baseline (intent to treat principle). The QoL-FAS comprised patients from the clinical FAS who had at least one assessable SF-36® (i.e., <50% missing data) and one assessable QUALIOST® completed at baseline, plus at least one assessable SF-36® and one assessable QUALIOST® completed post-baseline.
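The QoL-FAS rule described above can be sketched in a few lines of Python. This is only an illustration, not the study's SAS code: the record layout (`in_clinical_fas`, `visits`, `month`, `sf36`, `qualiost`) is hypothetical, the item lists are shortened for readability, and two points are assumptions: that the <50% missing-data threshold stated for the SF-36® applies equally to the QUALIOST®, and that the post-baseline SF-36® and QUALIOST® need not come from the same visit.

```python
def is_assessable(answers):
    """A questionnaire is assessable when fewer than 50% of its items are missing (None)."""
    if not answers:
        return False
    missing = sum(1 for a in answers if a is None)
    return missing * 2 < len(answers)


def in_qol_fas(patient):
    """Apply the QoL-FAS rule to one patient record (hypothetical dict layout)."""
    if not patient["in_clinical_fas"]:
        return False
    baseline = [v for v in patient["visits"] if v["month"] == 0]
    later = [v for v in patient["visits"] if v["month"] > 0]

    def has(visits, instrument):
        # At least one assessable questionnaire of this type among the visits.
        return any(is_assessable(v.get(instrument)) for v in visits)

    return all(has(group, name)
               for group in (baseline, later)
               for name in ("sf36", "qualiost"))


# Illustrative record with shortened item lists (invented values).
patient = {
    "in_clinical_fas": True,
    "visits": [
        {"month": 0, "sf36": [3, 2, None, 4], "qualiost": [1, 2, 3, 4]},
        {"month": 6, "sf36": [None, None, None, 4], "qualiost": [2, 2, 3, 4]},
        {"month": 12, "sf36": [3, 3, 2, 4], "qualiost": None},
    ],
}
print(in_qol_fas(patient))
```

In this invented record the Month 6 SF-36® is not assessable (three of four items missing), but the Month 12 SF-36® and the Month 6 QUALIOST® are, so the patient would qualify under the assumptions stated above.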

The changes in the MCS and PCS scores of the SF-36® and the total score of the QUALIOST® were compared between groups. A repeated-measures analysis (mixed model) was performed on raw data (missing questionnaires were not replaced) in which a comparison of treatment groups at each time point was performed. When a significant group–time interaction was identified, the treatment groups were compared at each time point using contrasts. If the group–time interaction was not statistically significant, a model without interaction was used. Other QoL analyses included between-group comparisons of the 8 SF-36® dimension scores, and the emotional and physical dimension scores from the QUALIOST®. An analysis of covariance (ANCOVA), using the baseline score as covariate, was also performed on all QoL scores to compare the change between baseline and endpoint. In order to investigate the effects of drug in patients who adhere to treatment, an ANCOVA was also performed on all scores to compare groups on the change between baseline and the last available assessment performed under treatment or within 30 days of treatment.
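For readers unfamiliar with ANCOVA on change scores, the following minimal sketch shows the underlying model: the change from baseline is regressed on the baseline score and a treatment indicator, and the coefficient of the indicator is the baseline-adjusted between-group difference. It uses ordinary least squares written with the Python standard library; it is not the SAS procedure used in the study, it produces no P values, and the data are invented so that the true effect is known. The function names are hypothetical.

```python
def solve_linear(a, b):
    """Solve a small dense system a.x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ancova_treatment_effect(baseline, change, treated):
    """Fit change = b0 + b1*baseline + b2*treated by least squares; return b2."""
    rows = [[1.0, float(x), float(g)] for x, g in zip(baseline, treated)]
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, change)) for i in range(3)]
    return solve_linear(xtx, xty)[2]


# Invented data generated as change = 1 + 0.1*baseline - 2*treated (no noise).
baseline = [30, 40, 50, 35, 45, 55]
treated = [0, 0, 0, 1, 1, 1]
change = [4.0, 5.0, 6.0, 2.5, 3.5, 4.5]
effect = ancova_treatment_effect(baseline, change, treated)
print(round(effect, 6))
```

On the QUALIOST® scale, where higher scores indicate greater impairment, a negative adjusted difference for the treated group would correspond to a result in favour of treatment.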

The number of patients free from back pain (from Item 6 of the QUALIOST®) was compared between groups using a Poisson regression in the QoL-FAS.

Imputation rules for missing data

In order to minimise the potential for bias resulting from missing questionnaires, the analyses described above were also performed on imputed data. Missing data resulting from missing questionnaires were replaced taking into account the fracture status of each patient. For example, for patients who had experienced a fracture and for whom the questionnaire was missing after they had their fracture, the average increase in score seen in patients after they experienced a fracture was added to the last available score for that patient. Missing items within questionnaires had already been taken into account when calculating scores, with dimension scores being calculated as the mean of non-missing items only if at least half of the items in that dimension had been answered.
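The two rules in this paragraph can be illustrated with a short sketch. The half-of-items rule and the post-fracture imputation follow the description above; the item response range (1 to 5) and the linear rescaling to 0–100 are assumptions made for illustration, since the scoring algorithm itself is not reproduced here, and the function names are hypothetical. The rule for patients without a fracture is not described in the text and is therefore not sketched.

```python
def dimension_score(items, low=1, high=5):
    """Mean of answered items rescaled to 0-100; None if fewer than half were answered.

    `low` and `high` are the assumed minimum and maximum item responses.
    """
    answered = [v for v in items if v is not None]
    if len(answered) * 2 < len(items):
        return None  # fewer than half of the items answered: score not calculated
    mean = sum(answered) / len(answered)
    return 100.0 * (mean - low) / (high - low)


def impute_after_fracture(last_score, mean_post_fracture_increase):
    """Impute a missing post-fracture score: last available score plus the
    average increase observed in patients after a fracture."""
    return last_score + mean_post_fracture_increase


print(dimension_score([3, 3, None, 3]))   # three of four items answered
print(dimension_score([3, None, None, None]))  # only one of four answered
print(impute_after_fracture(40.0, 2.5))
```

Computing the score from non-missing items only when at least half are present trades a small loss of precision for fewer discarded questionnaires, which matters in a 3-year follow-up of an elderly population.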

Results

Patients

In total, 1,649 patients were included in the SOTI study. Of these, 1,442 patients (87%) comprised the FAS, 719 in the strontium ranelate group and 723 in the placebo group. The QoL-FAS corresponded to 1,240 of these patients (86% of the total FAS; strontium ranelate: n = 618; placebo: n = 622). Baseline characteristics of the QoL-FAS dataset were comparable in both treatment groups (Table 1).

Table 1 QoL-FAS patient characteristics at baseline

Quality of life

In total, 8,218 QoL questionnaires were completed between the first inclusion visit and the last 3-year visit. The return rate of questionnaires was high in both treatment groups (≥93% of questionnaires at baseline) with an expected decrease during the study as a consequence of the length of follow-up. However, the return rate of assessable questionnaires remained comparably high (79% return rate in the QoL-FAS at the end of Year 3). The quality of completion was also high in both treatment groups, and noticeably greater for the QUALIOST® (72–76% of SF-36® questionnaires had no missing data, compared with 88–90% of QUALIOST® questionnaires); the rate of return of questionnaires was higher at each assessment for the QUALIOST® than for the SF-36®. Overall, 3,159/4,701 questionnaires (67%) from patients in the strontium ranelate group and 3,041/4,746 (64%) in the placebo group were completed alone rather than with help from a third party. Over half of all questionnaires (5,469/9,447 questionnaires; 58%) were completed in the physician’s waiting room; the remainder were completed at home (3,968/9,447 questionnaires; 42%). There were no noticeable differences in completion rates between treatment groups.

QUALIOST®

QUALIOST® total scores increased by 0.77 in the placebo group (before imputation of missing data; Fig. 1), indicating a slight deterioration in the QoL of patients. In contrast, a small negative change (−1.29) was observed in patients treated with strontium ranelate over the treatment period, indicating slightly improved QoL. This between-group difference was statistically significant (mixed-model analysis; P = 0.028). After imputation of missing data, a significant difference between treatment groups in favour of the strontium ranelate group was still observed (mixed-model analysis; P = 0.032), demonstrating the robustness of the results obtained from the primary analysis. Further confirmation was provided with the analysis of the difference between groups in change from baseline to endpoint (ANCOVA; P = 0.034 for strontium ranelate versus placebo) and change from baseline to endpoint under treatment (ANCOVA; P = 0.016 for strontium ranelate versus placebo).

Fig. 1 Change in QUALIOST® scores between baseline and study endpoint (before imputation of missing data). n = number of patients with data. Error bars represent the standard deviation from the mean. *Mixed-model analysis

Analysis of the change in QUALIOST® dimension scores, before imputation of missing data, showed that patients in the placebo group had increased physical and emotional scores (changes of 1.31 and 0.21, respectively), indicating further impairment in physical and psychological function (Fig. 1). In contrast, patients in the strontium ranelate group had a reduction in physical and emotional scores (changes of −0.84 [P = 0.024 versus placebo] and −1.61 [P = 0.046 versus placebo], respectively; repeated-measures analysis), indicating an improvement in physical and psychological functioning. The difference in dimension scores was also statistically significant in favour of strontium ranelate when missing data were imputed (mixed-model analysis; emotional: P = 0.047; physical: P = 0.044). Similarly, the change from baseline to endpoint showed a statistically significant difference in favour of the strontium ranelate group for emotional scores (P = 0.041 versus placebo; ANCOVA) and a trend towards statistical significance for physical scores (P = 0.051). The change from baseline to endpoint under treatment was statistically significant in favour of strontium ranelate for both dimensions (emotional: P = 0.019; physical: P = 0.032; ANCOVA).

SF-36®

SF-36® MCS and PCS scores decreased in both the strontium ranelate (change from baseline to endpoint: −2.51 and −0.30 for MCS and PCS, respectively) and placebo groups (change from baseline to endpoint: −1.95 and −0.95 for MCS and PCS, respectively) over the study period, indicating slight deterioration in QoL (Fig. 2; before imputation of missing data). There was no significant between-group difference in either of these scores over the study period. After imputation of missing data, similar results were obtained, although the General Health Perception score was significantly better in the strontium ranelate group (P = 0.004). Mean changes in PCS and MCS between baseline and endpoint, and between baseline and endpoint under treatment, were negative, indicating poorer QoL at the end of the study than at baseline. The change from baseline to endpoint in PCS score was relatively small (−0.46 for strontium ranelate and −0.53 for placebo) compared with the change in MCS score (−2.47 for strontium ranelate and −2.08 for placebo). Between-group differences were not statistically significant for either score (ANCOVA; P = 0.909 and P = 0.563 for MCS and PCS, respectively). A similar pattern was observed for the difference from baseline to endpoint under treatment: once again, the difference between treatment groups was not statistically significant (ANCOVA; P = 0.438 and P = 0.501 for MCS and PCS, respectively).

Fig. 2 Change in SF-36® scores between baseline and study endpoint (before imputation of missing data). n = number of patients with data. Error bars represent the standard deviation from the mean. BP: bodily pain; GH: general health perception; HT: health transition index; MCS: mental component summary; MH: mental health; NS: non-significant; PCS: physical component summary; PF: physical functioning; RE: role emotional; RP: role physical; SF: social functioning; VT: vitality

Prevention of back pain

Analysis of Item 6 of the QUALIOST® instrument (‘Have you had pain in the middle or upper part of your back?’) revealed that the number of patients without back pain was significantly higher, by 30% over 3 years, in the strontium ranelate group than in the placebo group (P = 0.005), with a significant effect already observed after the first year (31% increase versus placebo; P = 0.023) (Fig. 3).

Fig. 3 Effect of treatment on patient-reported back pain. Patients who answered ‘not at all’ to the question ‘Have you had pain in the middle or upper part of your back?’ were included in this analysis. a Repeated-measures analysis; generalised estimating equation model

Discussion

The QUALIOST® is a recently developed and validated instrument specifically designed for the measurement of QoL in patients with osteoporosis. The results of the present study indicate a statistically significant difference in QUALIOST® total score in favour of strontium ranelate therapy, which has been demonstrated to reduce the incidence of new vertebral fractures, compared with placebo, in patients who have previously had one or more fractures. Statistically significant beneficial effects were also seen on the emotional and physical dimension scores of the QUALIOST®. A significant difference between treatment groups in favour of strontium ranelate was demonstrated for the General Health Perception score of the SF-36®. Although some other trends were observed in SF-36® scores, between-group differences in the MCS and PCS summary scores of this widely used generic instrument did not reach statistical significance. As previously reported, the incidence of new vertebral fractures after 3 years of treatment was 32.8% in the placebo group and 20.9% in the strontium ranelate group, leading to a relative risk reduction of 41% (95% CI 27–52%) [11]. Therefore, this relative risk reduction is associated with a statistically significant benefit on quality of life, the clinical significance of which should be interpreted in the context of fracture prevention and reduction in impact on QoL.
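As a worked check, the relative risk reduction (RRR) and its confidence bounds quoted above follow directly from the relative risk (RR) of 0.59 (95% CI 0.48–0.73) given in the Introduction; note that the bounds swap order when converted:

```latex
\[
\mathrm{RRR} = 1 - \mathrm{RR} = 1 - 0.59 = 0.41,
\qquad
95\%\ \mathrm{CI}:\quad 1 - 0.73 = 0.27 \quad\text{to}\quad 1 - 0.48 = 0.52 .
\]
```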

The change over time in each treatment group was not expected to be large. The hypotheses were that QoL in the strontium ranelate group would be maintained or slightly improved owing to a lower incidence rate of new fractures and that QoL would worsen in the placebo group. This was indeed confirmed in our analysis of the data. Comparison of the change over the study period in QUALIOST® total score in the two treatment groups indicated a significant between-group difference in favour of the strontium ranelate group (difference in score = 1.52; P = 0.034). As with any QoL data, the important question is how to interpret the results to determine the clinical significance of a change in QoL score. Previous studies on the responsiveness of the QUALIOST® suggest that the differences we observed after 3 years of treatment were comparable to the difference between having a new fracture (QUALIOST score −0.92) and not having one (QUALIOST score 0.46; difference in score −1.38) [10].

Previous studies have shown that osteoporosis in general, and vertebral fractures in particular, can have a significant psychological effect on patients [14–16]. In the present study, a significant between-group difference in favour of strontium ranelate was observed in the emotional dimension score of the QUALIOST®. The emotional dimension consists of items related to patients’ worries about their condition, such as fear of falling and being a burden, as well as feeling older, frustration and the effect of changes in physical appearance. Hence, an improvement in the emotional dimension score suggests that these negative feelings are reduced in patients treated with strontium ranelate compared with placebo-treated patients. The physical dimension covers aspects of osteoporosis relating to pain, mobility and activities of daily living, such as getting dressed and doing housework. Although the physical scale score was only very slightly reduced in the strontium ranelate group, there was a significant increase in this score in the placebo group, suggesting that treatment with strontium ranelate has beneficial effects on QoL.

Other instruments have been developed to assess changes in QoL in patients with osteoporosis. One of these, the QUALEFFO, was developed and validated by a working party of the European Foundation of Osteoporosis specifically for use in osteoporosis [17]. It has been shown to be sensitive to the impairment in QoL that results from the occurrence of fracture, demonstrating a significant decrease in physical function and social function scores in patients with fracture. However, the mental and pain domains of the QUALEFFO were not able to discriminate between patients with and without fractures or within patients according to their number of vertebral fractures [18].

Back pain is clinically important in osteoporosis, as it frequently appears as the first symptom of established osteoporosis. Therefore, complaints about back pain are often the driver for initiating management of the disease. The proportion of patients reporting that they had no back pain (as measured by Item 6 of the QUALIOST®) was 30% higher in the strontium ranelate group over 3 years than with placebo. The early effect on back pain, observed from the first year of treatment, provides further support for the effect of strontium ranelate on the physical well-being of the patient.

The results from the present study support the use of a disease-specific instrument for capturing changes that have an impact on patients’ QoL [6]. The inclusion of the SF-36®, on the other hand, is useful as it allows comparison with different populations. As expected, SF-36® scores indicated poorer QoL at baseline in the SOTI population compared with population norms. For example, mean PCS and MCS scores for women aged over 65 years in a US population were 41 and 51.4, respectively [13], compared with mean baseline PCS scores of 37.4–37.9 and MCS scores of 44.9–45.1 in our population. PCS and MCS scores observed in our study were lower than mean scores reported for patients with long-standing illnesses in the UK (SF-36® version 2 scores: PCS 44.6, MCS 48.2), providing further evidence of the impact of osteoporosis on QoL [19]. Since the PCS and MCS scores were consistent at baseline and Year 3 of the SOTI study, the SF-36® may not be sensitive enough to capture treatment effects specific to osteoporosis to the same extent as the QUALIOST®, which is particularly sensitive to the impact of vertebral fractures.

The key limitation commonly associated with the evaluation of QoL in clinical studies is a failure to account for missing data. In a study with long-term follow-up and an elderly population, a decline in sample size over time and an increase in missing data rates is to be expected. Consequently, our analysis plan included measures designed to minimise the impact of missing data. In reality, however, the rates of returned questionnaires were fairly high given the duration of follow-up. Comparison of the results obtained using observed data, i.e., before imputation of missing data, and those using data following imputation revealed similar results, i.e., statistically significant differences in favour of strontium ranelate for QUALIOST® scores and no significant difference in SF-36® MCS and PCS scores. Indeed, similar QoL results were obtained, regardless of the population or statistical method used (before and after imputation of missing data, mixed-model analysis, ANCOVA on endpoint data). These analyses, therefore, provide confidence in the robustness of the study results. A second limitation of many QoL studies is the fact that although statistically significant differences in scores may be observed, the clinical relevance of these differences may be less obvious. QUALIOST® scores can, however, be related to the difference between not having a fracture and having one or more fractures [10]. Effect sizes have also been determined for different types of vertebral fracture [11], further adding to the interpretability of the instrument.

The effects on QoL of other anti-osteoporotic drugs have been evaluated in phase III studies. Treatment with alendronate was shown to improve QoL in patients compared with calcium or calcitonin treatment [20]. However, conclusions from this study are limited by the small number of patients in each group (51, 50 and 50 patients, respectively) and a short follow-up time (1 year). In an open-cohort observational study of 9,188 patients with post-menopausal osteoporosis, an improvement in QoL was observed in risedronate-treated patients after 6 months [21]. Treatment with raloxifene did not demonstrate any significant advantage over placebo in terms of effect on QoL [22]. Treatment with teriparatide has been shown to reduce the risk of new or worsening back pain compared with placebo [23]; however, no results relating to improvements in QoL have been demonstrated.

Thus, strontium ranelate is the only anti-osteoporotic agent for which an improvement in QoL, compared with placebo, has been detected using a sensitive osteoporosis-specific questionnaire in a large prospective placebo-controlled study with a 3-year follow-up period. To date, strontium ranelate is the only anti-osteoporotic drug with a summary of product characteristics issued by the European Medicines Agency that includes a claim regarding the effects of the product on QoL [24]. This is an acknowledgement of the significance of the QoL data presented in this paper and also a reflection of the increasing interest on the part of regulatory authorities in health-related QoL measurements.

In conclusion, treatment of post-menopausal osteoporotic women with strontium ranelate provides clinically relevant benefit in QoL compared with placebo, as assessed using an osteoporosis-specific questionnaire. Using the QUALIOST®, one can demonstrate treatment-related differences in QoL in post-menopausal osteoporosis, and the QUALIOST® appears to have an appropriate responsiveness for this task. Further studies using other treatment options are required to establish the role of the QUALIOST® in assessing QoL in patients with osteoporosis.