Introduction

Bone metastases are a common cause of morbidity in patients with advanced cancer. Pain is the most common presenting symptom and additional complications can include pathological fracture, spinal cord compression and hypercalcemia [1]. Palliative radiotherapy is an effective intervention for patients with painful bone metastases with response rates ranging from 60 to 70% [2]. Data from the NCIC SC.20 trial and other studies demonstrate the effectiveness of radiation in the retreatment setting as well [3, 4].

In addition to pain from bone metastases, patients with advanced cancer often have other symptoms, which can affect multiple domains of life [5, 6]. Given that patients with advanced cancer are often not in a curative situation, maintaining or enhancing quality of life (QOL) is an important therapeutic goal. Therefore, accurately measuring QOL is required both to guide clinical decision making and to evaluate the impact of various interventions in this population. For this purpose, validated instruments have been developed to reliably assess symptoms and evaluate QOL in patients with cancer, including the EORTC QLQ-C30 and the Brief Pain Inventory (BPI). The QLQ-C30 is a cancer-specific questionnaire designed by the EORTC to evaluate relevant QOL issues in patients with cancer [7, 8]. Since pain is a dominant symptom in patients with advanced cancer, the BPI was developed to measure both the intensity of pain and the interference of pain in patients’ lives [9]. Both instruments are validated.

Interpreting the numerical scores on these instruments can be challenging for a number of reasons. Although large clinical trials can show statistically significant differences in the numerical scores, the clinical impact of these differences is open to interpretation. The study of minimal clinically important differences (MCID) explores this particular issue: what numerical changes in the items on the QOL instruments translate to meaningful clinical impact in the patients’ lives? The methodology for determining MCID scores in QOL instruments has been previously established, and both anchor-based and distribution-based methods can be used for this purpose. This study was designed to establish the MCID for the QLQ-C30 and BPI instruments in a prospective cohort of patients undergoing reirradiation for painful bone metastases.

Methods

Patient population

The dataset for this study comes from the NCIC CTG SC.20 randomized controlled trial, which enrolled and randomized 850 patients from multiple centres in 9 countries worldwide [3]. The trial evaluated different radiation schedules for repeat treatment of painful bone metastases. Patients were randomized to receive either 8 Gy in a single fraction or 20 Gy in 5 to 8 fractions as repeat treatment. The inclusion criteria for the trial were as follows: patients 18 years or older who had radiologically confirmed, painful (pain measured as ≥ 2 points using BPI) bone metastases, had previously received radiation therapy to the same area, and were taking a stable dose and schedule of pain-relieving drugs. The primary endpoint of the trial was overall pain response at two months. Institutional research ethics board approval was obtained at all participating centers and written informed consent was signed by all participants prior to enrollment.

QOL instruments and scoring

The BPI is a tool that was specifically developed to assess pain in patients with cancer and it has been extensively validated [10]. It contains both a sensory dimension, which measures the intensity of pain, and a reactive dimension, which measures the interference of pain in the patient’s life [9]. In total, the questionnaire contains 11 items, which include 2 multi-item scales measuring pain intensity and the impact of pain on functioning and well-being. The BPI assesses interference via seven items: general activity, mood, walking ability, normal work, relations with others, sleep, and enjoyment of life. The EORTC QLQ-C30 is a more general QOL instrument designed for patients with cancer [8]. It contains five functional scales (physical, role, cognitive, emotional, and social), three symptom scales (fatigue, pain, and nausea and vomiting), a global health status/QOL scale, and a number of single items assessing additional symptoms commonly reported by cancer patients (dyspnea, loss of appetite, insomnia, constipation, and diarrhea) and perceived financial impact of the disease. In total there are 30 items on the questionnaire.

Each item in the EORTC QLQ-C30 is rated from 1 (not at all) to 4 (very much) in severity, except for the overall QOL scale, which is rated from 1 (very poor) to 7 (excellent). A high score represents increased distress in the symptom scale, whereas a high score in the functional scale suggests increased functional ability. Each scale is converted to a score ranging from 0 to 100. For the BPI, each item is rated from 0 to 10, where higher scores indicate more severe pain or greater interference of pain in activities. This raw score is not scaled to a range from 0 to 100.
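The conversion of a scale to the 0 to 100 range can be sketched as follows. This is a minimal illustration that assumes the standard EORTC linear transformation (raw score taken as the mean of the item responses, with functional scales reversed); the function name `qlq_c30_scale_score` and its arguments are hypothetical, and handling of missing items is omitted.

```python
from statistics import mean


def qlq_c30_scale_score(item_responses, item_range, functional=False):
    """Convert the raw item responses of one scale to a 0-100 score.

    item_range is the maximum minus the minimum possible response:
    3 for items rated 1-4, 6 for the global QOL items rated 1-7.
    """
    raw = mean(item_responses)
    scaled = (raw - 1) / item_range * 100
    # Functional scales are reversed so that a higher score
    # means better functioning; symptom and global scales are not.
    return 100 - scaled if functional else scaled
```

For example, two symptom items answered 2 and 3 give a raw score of 2.5 and a scaled score of 50, while a functional scale with every item answered 1 (not at all) scores 100.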

Statistical analysis

The patient characteristics and demographic profile of the analyzed cohort were recorded, including age, gender, primary malignancy, Karnofsky performance status (KPS), worst pain score (WPS) at baseline, initial response to radiation, site of painful bone lesion, and treatment received. Pain response at two months was determined as per the international consensus on palliative radiotherapy endpoints [11]. Complete response (CR) was defined as a WPS of zero at the treated site with no concomitant increase in analgesia. Partial response (PR) was defined as a WPS reduction of two or more points without analgesic increase, or an opioid analgesic reduction of 25% or more from baseline without an increase in WPS. Pain progression (PD) was defined as an increase in WPS of two or more points above baseline without reduction of analgesic use, or an opioid analgesic increase of 25% or more from baseline without reduction in WPS.
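The response definitions above can be expressed as a simple decision rule. The sketch below is illustrative only: the function name `classify_pain_response` is hypothetical, analgesic use is assumed to be summarized as a single daily opioid dose (for example an oral morphine equivalent), and any rise in that dose is treated as an analgesic increase. None of these details are specified in the text.

```python
def classify_pain_response(wps_base, wps_follow, opioid_base, opioid_follow):
    """Return 'CR', 'PR', 'PD' or 'undefined' for one patient."""
    wps_change = wps_follow - wps_base
    if opioid_base > 0:
        opioid_change = (opioid_follow - opioid_base) / opioid_base
    else:
        # Assumption: starting an opioid from a zero baseline is an increase.
        opioid_change = float("inf") if opioid_follow > 0 else 0.0

    # Complete response: no pain at the treated site, no analgesic increase.
    if wps_follow == 0 and opioid_change <= 0:
        return "CR"
    # Partial response: pain falls by >= 2 without analgesic increase,
    # or opioid use falls by >= 25% without an increase in pain.
    if (wps_change <= -2 and opioid_change <= 0) or (
        opioid_change <= -0.25 and wps_change <= 0
    ):
        return "PR"
    # Progression: pain rises by >= 2 without analgesic reduction,
    # or opioid use rises by >= 25% without a reduction in pain.
    if (wps_change >= 2 and opioid_change >= 0) or (
        opioid_change >= 0.25 and wps_change >= 0
    ):
        return "PD"
    return "undefined"
```

A patient whose WPS falls from 7 to 4 on an unchanged opioid dose would be classified as PR, whereas a one-point fall on an unchanged dose meets none of the criteria and is returned as undefined.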

The methods used to determine MCID follow previously published recommendations and studies by the EORTC [12,13,14]. Both an anchor-based approach and a distribution-based approach were used to determine the MCID scores, as both techniques are thought to produce roughly equivalent results [15]. For the anchor-based approach, the mean change method was used, following the methodology established by Redelmeier et al. [16]. Each QOL subscale was compared or “anchored” to another reference measurement that was considered to be clinically relevant. In this study, the chosen anchor was the global QOL score as measured by the QLQ-C30. The correlation between the items and global QOL score was computed using Spearman’s correlation coefficient. Only those items with |r| ≥ 0.30 were used for the anchor-based analysis, as per the methods recommended by Revicki et al. [14]. A 10-point change in the global QOL score was used to classify improvement or deterioration. Patients with less than a 10-point change were classified as stable. While the ≥ 10-point threshold is arbitrary and requires validation, previous studies have shown that this represents a mild-moderate change that is clinically significant [15, 17]. It is uncertain if smaller changes in scores are also subjectively perceptible and represent a more sensitive measure of clinically significant change. For each item, the mean score change between baseline and 2-month follow-up was calculated for the improved, deteriorated, and stable categories (based on global QOL score change). The MCID scores for improvement were calculated by measuring the difference in mean scores between the improved and stable categories. Likewise, the MCID scores for deterioration were calculated by measuring the difference between the stable and deteriorated patient groups. 95% confidence intervals (CI) were calculated for the differences in mean scores. If the CI did not contain zero, the results were considered to be statistically significant.
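The mean change method described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the analysis code used in the study: the function names are hypothetical, and the confidence interval uses a normal approximation with an unpooled standard error, which the text does not specify.

```python
from math import sqrt
from statistics import mean, variance


def classify_by_anchor(anchor_change, threshold=10):
    """Group a patient by change in global QOL (the anchor)."""
    if anchor_change >= threshold:
        return "improved"
    if anchor_change <= -threshold:
        return "deteriorated"
    return "stable"


def mean_difference_ci(group_a, group_b, z=1.96):
    """Difference in mean change (a minus b) with an approximate 95% CI."""
    diff = mean(group_a) - mean(group_b)
    se = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    return diff, (diff - z * se, diff + z * se)


def anchor_based_mcid(item_changes, anchor_changes):
    """MCID estimates for one item from per-patient change scores."""
    groups = {"improved": [], "stable": [], "deteriorated": []}
    for item, anchor in zip(item_changes, anchor_changes):
        groups[classify_by_anchor(anchor)].append(item)
    return {
        "improvement": mean_difference_ci(groups["improved"], groups["stable"]),
        "deterioration": mean_difference_ci(groups["stable"], groups["deteriorated"]),
    }


def is_significant(ci):
    """An estimate is significant when its CI does not contain zero."""
    low, high = ci
    return low > 0 or high < 0
```

Each entry of the returned dictionary is a pair of the MCID estimate and its interval, so `is_significant(result["improvement"][1])` applies the zero-exclusion rule to the improvement estimate.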

In the distribution-based approach, the statistical variation in scores was used as an alternative estimation of the MCID, as other studies have shown that the MCID can correspond to 0.2–0.5 standard deviations (SD) of the sample [14]. The MCID estimates were expressed as some proportion of the SD at baseline and at follow-up. The distribution-based MCID estimates were compared with the anchor-based MCID scores.
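As a brief sketch of the distribution-based calculation, the functions below compute candidate MCID values as fractions of the sample SD and express an anchor-based MCID as a proportion of the SD for comparison. The function names and the choice of 0.2, 0.3 and 0.5 as fractions are illustrative assumptions.

```python
from statistics import stdev


def distribution_based_mcid(scores, fractions=(0.2, 0.3, 0.5)):
    """Candidate MCID values as fractions of the sample SD of one item."""
    sd = stdev(scores)
    return {fraction: fraction * sd for fraction in fractions}


def mcid_as_sd_fraction(anchor_mcid, scores):
    """Express an anchor-based MCID as a proportion of the sample SD."""
    return abs(anchor_mcid) / stdev(scores)
```

With an item whose SD is 50 points, for instance, 0.2 SD corresponds to 10 points and 0.5 SD to 25 points, and an anchor-based MCID of 15 points corresponds to 0.3 SD.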

Results

In total, 850 patients were enrolled in the trial and 375 patients with documented pain responses and completed QOL questionnaires at 2 months were analyzed in this study. The remaining patients were ineligible for the study, had died, were lost to follow-up or did not have completed QOL questionnaires at 2 months, and were therefore not included in this analysis. The demographic profile of the analyzed cohort is summarized in Table 1 and compared to the baseline factors of patients not included in the analysis. The median age was 64 years (range 18–93) and 54.7% of the patients were male. The most common primary sites were breast cancer (30.9%), followed by prostate cancer (26.1%) and lung cancer (21.9%). The majority of patients had a KPS ≥ 70 (82.4%) and a baseline WPS ≥ 5 (85.1%). The site of the painful bone lesion was mostly in the pelvis/hips (36.3%) or spine (35.5%), and the main reason for retreatment was recurrent pain after an initial response (70.7%).

Table 1 Patient demographics

Table 2 summarizes the response to pain at 2 months. 50.9% of the included patients had a complete or partial response to pain, whereas 12.4% of patients were classified as having progression. The remaining 36.7% of patients had an undefined response (not meeting criteria for CR/PR/PD) or no change in pain (NC), and 9.9% of patients did not have a documented pain response (IN). The majority of patients who were not included in the analysis did not have a documented pain response at 2 months (60.0%). Table 3 summarizes the baseline QLQ-C30 and BPI scores and their correlation with the global QOL score. In the QLQ-C30 instrument, 9 out of 14 items had a correlation ≥ 0.3 with the global QOL score: physical functioning, role functioning, emotional functioning, cognitive functioning, social functioning, fatigue, nausea, pain, and appetite. In the BPI instrument, 6 out of 10 items had a correlation ≥ 0.3 with the global QOL score. The worst pain, average pain, current pain, and sleeping items had correlation < 0.3. The least pain score was not available in the collected data and was excluded from the analysis.

Table 2 Pain response to radiotherapy at month 2
Table 3 Baseline QOL results and correlation with global QOL anchor from the QLQ-C30 instrument

The MCID scores from the anchor-based analysis are summarized in Table 4. Using the global QOL score on the QLQ-C30 as an anchor and a 10-point threshold, 111 patients showed improvement, 159 patients had no change and 90 patients had deterioration. The improvement and deterioration scores are summarized in Tables 5 and 6. For deterioration, statistically significant MCID scores were found for all items on the QLQ-C30 and the BPI. For improvement, statistically significant MCID scores were found in 7/9 items of the QLQ-C30 instrument (all except cognitive functioning and nausea/vomiting). For the BPI, statistically significant MCIDs for improvement could only be found in 2/6 items (mood and relations with others). Uniformly, the MCIDs for deterioration were higher than for improvement. Table 7 shows the distribution-based MCIDs. In general, the MCIDs for improvement were closer to 0.3 SD for the QLQ-C30 and closer to 0.2 SD for the BPI. For deterioration, the MCIDs were closer to 0.5 SD for both the QLQ-C30 and BPI. Table 8 summarizes the absolute values for MCIDs obtained from the anchor-based and the distribution-based approaches.

Table 4 MCIDs for the QLQ-C30 and BPI using the anchor-based analysis
Table 5 Global QOL score changes in the improvement category
Table 6 Global QOL score changes in the deterioration category
Table 7 MCIDs for the QLQ-C30 and BPI using the distribution-based approach
Table 8 Summarized absolute values for MCID scores

Discussion

This study is the first to our knowledge to measure the MCID for the BPI and EORTC QOL instruments in a large prospective cohort of patients with metastatic cancer. Both anchor- and distribution-based methods were used for the analysis and results were compared. The choice of anchor is an important factor for the anchor-based analysis. It must capture a clinically meaningful endpoint and be responsive to small but important changes in patient status. Using global QOL as an anchor is one accepted method and has been used in previous studies [18,19,20]. Other previously used methods include using KPS, WHO performance status, weight changes, well-being, mini-mental state examination and functional impairment as anchors [12, 13, 21,22,23]. The advantages of using a global rating of change as an anchor are that it is easy to obtain, is patient-centric and can take into account a variety of information and determinants of well-being [18]. Another reason for the choice of anchor was to facilitate comparison with another study investigating MCID scores in patients with cancer and painful bone metastases undergoing palliative radiotherapy to a painful site for the first time [20].

In order to use an anchor-based method, there must exist some association (minimum correlation) between the QOL items and the chosen anchor [14]. Therefore, a Spearman’s correlation coefficient was calculated for all items and the global QOL measure. Only the items with at least moderate correlation (|r| ≥ 0.30) were kept for subsequent analysis. In both the BPI and the QLQ-C30, the sleep item did not correlate with the global QOL. The dyspnea, financial, constipation, and diarrhea items in the QLQ-C30 also did not correlate with the global QOL. The patients in this trial were enrolled because they had advanced cancer and recurrent painful bone metastases; therefore, the items that are conceptually related to pain are expected to show better correlation. Accordingly, most items on the BPI interference scale and all the functional domains on the QLQ-C30 had good correlation with global QOL. Interestingly, although the pain item on the QLQ-C30 showed good correlation with the anchor, the pain items on the BPI were more weakly correlated and the reasons for this are unclear.
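The item selection step can be illustrated with a small, dependency-free sketch. Spearman’s coefficient is computed here as the Pearson correlation of average ranks; the function names and the dictionary layout of `item_scores` are hypothetical, and an item with constant scores would need separate handling because its rank variance is zero.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman_r(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def items_for_anchor_analysis(item_scores, anchor_scores, min_abs_r=0.30):
    """Keep only items whose |r| with the anchor meets the threshold."""
    return [
        name
        for name, scores in item_scores.items()
        if abs(spearman_r(scores, anchor_scores)) >= min_abs_r
    ]
```

Because the absolute value of r is used, symptom items that fall as global QOL rises (negative correlation) are retained alongside functional items that rise with it.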

In a survey of palliative cancer patients ranking the importance and relevance of different items on the QLQ-C30, both the financial and diarrhea items were the lowest ranked [24]. These items were subsequently removed from the QLQ-C15-PAL, a shortened questionnaire more appropriately designed for palliative patients with more advanced cancer. Therefore, it is understandable why these items did not correlate with global QOL. Dyspnea, sleep, and constipation are other items with a low symptom burden and a high standard deviation in the scores, consistent with the results of our previous study which also showed low levels of correlation with the global QOL anchor [20].

Using the anchor-based analysis, statistically significant MCID scores for deterioration could be found in all items of the QLQ-C30 and the BPI instruments. However, statistically significant MCID scores for improvement could only be found in 7/9 QLQ-C30 items and 2/6 BPI items. The MCID scores for deterioration were higher compared to improvement across all items in the QLQ-C30 and BPI. We recognize that the anchor-based MCID scores presented here may be overestimated as patients with large global QOL score changes were also included in the calculation; however, these patients were a minority, as presented in Tables 5 and 6. The distribution of scores was very similar for improvement and deterioration. Also, we only performed anchor-based analysis on the items that had at least moderate correlation with the chosen global QOL anchor. The distribution-based method may provide an alternate estimate of clinically significant differences for the items that did not meet these criteria. Another limitation in the study was that 55.7% of patients were excluded from the study due to missing BPI or QOL data. Table 1 shows that there were significant differences in the KPS and cancer type between the two groups. In this cohort of patients, we have previously shown that both lower baseline KPS and non-breast cancers had significantly worse survival outcomes [25]. Presumably many of these patients had deteriorated or were too unwell to complete the required follow-up. This may create a selection bias where the established MCID scores are more applicable to the analyzed population with favorable outcomes.

The results of this study can be most directly compared to other studies by our group which have assessed the MCID scores in patients with painful bone metastases [20,21,22]. In one such study, QOL was measured in a large prospective cohort of patients undergoing palliative radiation for the first time and MCID scores were determined using identical methods in slightly different QOL instruments, the QLQ-C15-PAL and QLQ-BM22 [20]. Using global QOL as the anchor, the MCIDs for improvement for most items were higher compared to deterioration. These results are in stark contrast with the results of the current study, which show the opposite pattern.

This phenomenon can be potentially explained by a “response shift.” Response shift can be defined as a change in the meaning of one’s self-evaluation of a target construct as a result of a change in the respondent’s internal standards of measurement (recalibration), a change in the respondent’s values (reprioritization) or a redefinition of the target construct (reconceptualization) [26]. Prior to receiving palliative radiotherapy, there is an expectation in most patients to improve after treatment [27]. Based on the discussion with the oncologist, patients may understand there is a 60–70% probability that their pain will respond to treatment. Therefore, a small deterioration in any of the QOL domains may have a more significant impact on global QOL. Patients undergoing retreatment of bone metastases are later in their disease course; they may be more accepting of their health-state and may understand that response rates are lower. Despite having a higher symptom burden on the individual items of the QLQ-C30/C15-PAL questionnaire, the patients in the SC.20 (reirradiation trial) had similar global QOL and KPS scores compared to the patients in the SC.23 trial [20, 28]. Also almost 30% of patients in the SC.20 trial did not have any response or an insufficient response to radiation the first time, so their expectations could be modified by this previous experience as well. Therefore, there may be a smaller threshold for improvement and a larger deterioration required in the individual items to impact global QOL.

This type of response shift can be categorized as a “recalibration” where the relative impact of improvement vs. deterioration in different domains changes over the disease trajectory, and/or a “reconceptualization,” where patients alter their view of QOL and choose to focus on the positive aspects only rather than overall deterioration. These explanations are speculative, hypothesis-generating and should be interpreted with caution as the correct way to measure response shift is through longitudinal QOL changes in the same cohort of patients [26]. However, the contrast between the MCID scores for “improvement” vs. “deterioration” in the two studies looking at reirradiation and first-time radiation in the palliative setting warrants further attention. Further studies are required to follow patients in a longitudinal fashion to better understand the response shift phenomenon. In the interim, consideration must be given to the disease trajectory and patient expectations when interpreting the QOL scores in patients undergoing palliative radiotherapy.

Conclusion

In this study, we established the MCID scores for the EORTC QLQ-C30 and the BPI QOL instruments using both anchor-based and distribution-based methods in a prospective cohort of patients undergoing reirradiation for painful bone metastases. The results of the study can guide clinicians and researchers in the interpretation of these instruments. This study also highlights the importance of patient expectations and disease trajectory in the interpretation of patient-reported outcomes.