Introduction

Chronic myeloid leukemia (CML) is a myeloproliferative neoplasm that, without treatment, progresses from a most often asymptomatic chronic phase to an accelerated phase and finally to acute leukemia, i.e. the so-called blastic phase [1]. It is a rare disease with an age‐adjusted annual incidence rate of around 1 per 100,000 person‐years in European countries [2, 3] and a prevalence estimated at 16.3 per 100,000 inhabitants in France [4]. In the last two decades, imatinib, the first tyrosine kinase inhibitor (TKI) approved in CML, followed by subsequent generations of TKIs (dasatinib, nilotinib, bosutinib and ponatinib), has dramatically improved the prognosis of the disease. In patients treated with imatinib who achieve complete cytogenetic response, overall survival is similar to that of the general population [5]. Even more dramatically, patients having achieved a sustained deep molecular response after several years of treatment can safely stop treatment with no evidence of relapse [6, 7]. There is a growing recognition in oncology, and in CML in particular, of the importance of measuring patient quality of life (QoL) throughout the disease course [8]. In addition to safety and efficacy, QoL is often among the endpoints of clinical studies assessing TKIs [9,10,11,12,13,14]. When treatments have similar efficacy, the specific adverse event profile of each molecule and its impact on patients’ quality of life may guide the choice of TKI. Nonetheless, only a few studies have investigated how CML affects the QoL of patients compared to the general population [15, 16] and, specifically, which components are affected.

Because treatment is prolonged and costly, evaluating the value for money of these drugs is a major issue. Patient-reported outcome (PRO) data are needed to investigate whether and how this chronic condition affects the daily life of patients and to assess health state utility values, which will be useful for future economic evaluations. Data on health utility scores in patients with CML are scarce (Online Resource Table S1). Three studies have estimated health state utility values using the EQ-5D-3L questionnaire administered to patients with CML enrolled in clinical trials [11, 17, 18], i.e. in experimental conditions. Szabo et al. conducted a multinational study to estimate time trade-off preference values for seven health states characterizing the course of CML (combining disease phase, response and severe adverse events) in 353 subjects from the general population (103 in Canada, 74 in the United States, 97 in the UK and 79 in Australia) [19]. Similarly, Guest et al. conducted two successive studies, each using four different health states (untreated, hematologic response, cytogenetic response and molecular response in the first study [20]; treatment-free remission, complete molecular response, molecular response and reappearance of detectable disease in the second study [21]), in randomly selected members of the general population in the UK (241 and 235 subjects, respectively). Unfortunately, in these studies using direct elicitation methods, the content of the health states had not been validated by patients. In addition, since the first studies were published [17, 19], new treatments have been approved and the accelerated and blastic phases have almost disappeared. Finally, none of these studies investigated the demographic and clinical factors associated with the utility score.

Our objective was to assess QoL and health utility scores in patients with CML in a real-life setting in France, to study the determinants of utility and to compare health-related QoL values to general population norms.

Methods

Study design and patients

This was a prospective web-based survey directed at patients with CML via the French patients’ association “LMC France”, which contacted patients with CML in 2018 through its website, social media and e-mail. Information on the existence of this web-based survey was also given in clinics by hematologists from the French FiLMC group to patients in advanced phases. Patients could participate via the web-based survey or complete an identical questionnaire provided in a pen-and-paper version. The study questionnaire included demographics, medical data regarding CML, current and past CML treatments and three PRO instruments, both generic and specific: EuroQoL EQ-5D-3L [22], EORTC QLQ-C30 [23] and EORTC QLQ-CML-24 [24]. The number of treatment lines was approximated by the number of different TKIs received. The study protocol was approved by the French ethics committee Comité de Protection des Personnes Ile de France 7.

PRO and instruments

EuroQoL EQ-5D-3L

The EQ-5D-3L is a five-item, validated generic instrument designed to describe and value the collective preferences (utility) for various health states [22]. The EQ-5D-3L consists of five questions, each representing a health dimension (mobility, self-care, usual activities, pain/discomfort and anxiety/depression). For each dimension, the respondent indicates one of three levels of functioning (no problems, moderate problems or severe problems), leading to 243 (3⁵) possible health states. Some severe health states may be considered as worse than being dead and the utility score is, in this case, negative. We used the French value set to calculate utility scores from the EQ-5D questionnaires collected in our patient population [25]. Patients were also asked to rate their overall perception of health on the EQ-5D-3L visual analog scale (EQ-VAS), which ranges from 0 (worst imaginable health state) to 100 (best imaginable health state).
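
The count of 243 health states follows from five dimensions with three levels each. As a minimal illustrative sketch (not the code used in the study), the states can be enumerated in Python. The five-digit profile labels such as "11111" follow the usual EQ-5D notation, and the function name is hypothetical. The French value set [25] is not reproduced here, so no utility is computed.

```python
from itertools import product

# Dimension names as listed in the text; levels 1-3 stand for
# no problems, moderate problems and severe problems.
DIMENSIONS = ("mobility", "self-care", "usual activities",
              "pain/discomfort", "anxiety/depression")
LEVELS = (1, 2, 3)

def enumerate_health_states():
    """Return every EQ-5D-3L profile as a five-digit string, e.g. '11111'."""
    return ["".join(str(level) for level in profile)
            for profile in product(LEVELS, repeat=len(DIMENSIONS))]

states = enumerate_health_states()
print(len(states))            # 243, i.e. 3 ** 5
print(states[0], states[-1])  # 11111 33333
```

A utility score would then be obtained by mapping each profile to its value in the chosen national value set.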

EORTC QLQ-C30 and EORTC QLQ-CML-24

We used the French version of the European Organization for Research and Treatment of Cancer Quality of Life Questionnaire-Core 30 (EORTC QLQ-C30) [23] and EORTC Quality of Life module for patients with Chronic Myeloid Leukemia QLQ-CML24 [24]. The QLQ-C30 is a self-reported questionnaire whose validity and test–retest reliability have been demonstrated in several studies. The QLQ-C30 consists of 30 items and includes five functioning scales (physical, role, emotional, social and cognitive), three symptom scales (fatigue, nausea/vomiting and pain), a global health status/QoL scale and six single items (dyspnea, insomnia, appetite loss, constipation, diarrhea and financial difficulties).

The QLQ-CML24 questionnaire is a module developed and validated [26] to specifically assess the quality of life in patients with CML. It includes 24 items corresponding to two functioning scales (satisfaction with care and information received, satisfaction with social life) and four symptom scales/items (burden of symptoms, impact on worry/mood, impact on daily life and body image problems). The scores of the QLQ-C30 and QLQ-CML24 items were linearly transformed to 0–100 scales. For functioning scales and global health status/QoL scales, higher scores correspond to better levels of functioning. For symptom scales, higher scores represent greater levels of symptoms or problems.
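
The linear transformation to a 0–100 scale can be sketched as follows. This assumes the standard EORTC scoring convention, which the text does not spell out: the raw score is the mean of the items in a scale, and the item range is 3 for four-point items and 6 for the seven-point global health status items. The function names are illustrative only.

```python
def raw_score(item_responses):
    """Mean of the item responses that make up one scale."""
    return sum(item_responses) / len(item_responses)

def functioning_score(item_responses, item_range=3):
    """Functioning scale: a higher transformed score means better functioning."""
    return (1 - (raw_score(item_responses) - 1) / item_range) * 100

def symptom_score(item_responses, item_range=3):
    """Symptom scale/item: a higher transformed score means more symptoms.

    Under the assumed convention the same formula applies to the global
    health status/QoL scale, with item_range=6.
    """
    return (raw_score(item_responses) - 1) / item_range * 100

# Hypothetical responses, for illustration only.
print(functioning_score([1, 1, 1, 1, 1]))     # 100.0 (no limitation)
print(symptom_score([4, 4, 4]))               # 100.0 (maximum symptom level)
print(round(symptom_score([2, 3, 2]), 1))     # 44.4
print(symptom_score([7, 7], item_range=6))    # 100.0 (best global health)
```

Handling of missing items is left out of this sketch.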

Statistical analysis

Mean and standard deviation (SD) as well as median and interquartile range were calculated for the utility score, EQ-VAS score and QoL scores.

Comparison to general population norms

We used French population-based norm reference values for the EQ-5D-3L questionnaire [27] and recently published reference values for the EORTC QLQ-C30 questionnaire [28] to compare utility and QoL scores in patients with CML to those of the general population. Deviations from reference norms for the EQ-5D-3L index were calculated by subtracting the mean utility score of the general population (by age group and gender) from the utility score of each patient with CML. Negative values indicate worse health than counterparts from the general population.

A similar methodology was applied for the EORTC QLQ-C30 questionnaire. For functioning scales, a negative value indicates that patients with CML have worse functioning scores than the general population. For symptom scales, a positive value indicates that patients with CML have more symptoms than the general population. For each scale, we used the threshold proposed by Cocks et al. to evaluate the clinical relevance of any deviation from general population norms [29].
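
The deviation from norms described above amounts to a subtraction per patient followed by averaging. The sketch below illustrates this under stated assumptions: the norm values, the age band label and the function names are invented placeholders and are not the published French norms [27, 28].

```python
# Placeholder norms: mean utility of the general population by (sex, age band).
# These numbers are invented for illustration and are NOT the published norms.
HYPOTHETICAL_NORMS = {
    ("F", "45-54"): 0.85,
    ("M", "45-54"): 0.88,
}

def deviation_from_norm(patient_score, sex, age_band, norms=HYPOTHETICAL_NORMS):
    """Patient score minus the norm mean for the same sex and age band."""
    return patient_score - norms[(sex, age_band)]

def mean_deviation(patients, norms=HYPOTHETICAL_NORMS):
    """Average deviation over (score, sex, age_band) tuples."""
    deviations = [deviation_from_norm(score, sex, band, norms)
                  for score, sex, band in patients]
    return sum(deviations) / len(deviations)

# Two hypothetical patients; a negative result means worse health than the norm.
sample = [(0.70, "F", "45-54"), (0.80, "M", "45-54")]
print(round(mean_deviation(sample), 3))
```

For QLQ-C30 symptom scales the same subtraction applies, but a positive value then indicates more symptoms than in the general population.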

Identification of the determinants of the utility score

We evaluated demographic and clinical factors associated with the utility score using the non-parametric Kruskal–Wallis or Wilcoxon test for categorical variables and computed Spearman correlation coefficients for quantitative variables. Then, we used multiple regression models. In the first model, we included demographic and clinical variables collected in the questionnaire as potential determinants of the utility score. In the second model, we added the symptom scores and the functioning scores from the QLQ-C30 and QLQ-CML24 questionnaires to investigate whether the utility score could capture, to some extent, the level of symptoms. We did not include QLQ-C30 or QLQ-CML24 functioning scales that overlap with the EQ-5D-3L items (e.g. physical functioning, role functioning). For each dimension of the QLQ-C30 questionnaire, the difference in scores between patients with CML and the general population norm was represented graphically using boxplots. Statistical analyses were performed using SAS 9.4. All tests were two-sided, with p ≤ 0.05 indicating statistical significance.
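
For readers unfamiliar with the Spearman coefficient, it is the Pearson correlation computed on ranks, with tied values sharing their average rank. The analyses in this study were run in SAS; the pure-Python sketch below is only an illustration of the statistic, and the function names and data are hypothetical.

```python
from statistics import mean

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Pearson correlation of the rank-transformed samples."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical data: any strictly increasing relationship gives rho = 1.
print(spearman_rho([1, 2, 3, 4, 5], [1, 4, 9, 16, 25]))  # 1.0
```

Because only ranks are used, the coefficient is insensitive to the skewed distribution of utility scores.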

Results

Study population

Among the 412 patients with CML who participated in the survey from April to November 2018, 383 patients were evaluable and data were complete for 350 patients (Online Resource Fig. S1). The questionnaire was almost exclusively completed online. Participants were mainly women (59.6%) with a median age [interquartile range] of 52 years [40–62]. Participants were surveyed at a median of 3.0 years after diagnosis. Ninety-two percent of patients were in the chronic phase and 7.3% were in treatment-free remission (Table 1). Only 3 patients were in the accelerated phase and 1 patient was in the blastic phase. Fifty-nine percent of patients had received one treatment line for CML, 26.4% two treatment lines and 14.9% three treatment lines or more. Imatinib was the main current treatment (41.7%), followed by nilotinib (23.6%) and dasatinib (18.4%). Nine participants (2.3%) had undergone a bone marrow transplant.

Table 1 Patient characteristics

Descriptive results and comparison with population norms

EuroQol EQ-5D-3L

Overall, the mean utility score was 0.73 (SD: 0.25), ranging from −0.3 to 1.0, with a median score of 0.80 (interquartile range: 0.64–0.89). The distributions of both the utility and VAS scores of the study population were negatively skewed (Fig. 1a and b). Of the five EQ-5D-3L domains, problems (“some problems” or “extreme problems”) were most frequently reported for pain/discomfort (72%), followed by anxiety/depression (61%), usual activities (36%), mobility (18%) and self-care (4%) (Fig. 1c). Compared with the general population of the same age and sex group, patients with CML had a mean deviation from the reference norm of −0.15 (SD: 0.25).

Fig. 1 Distribution of EQ-5D-3L scores. (a) Distribution of EQ-5D utility score based on the French value set. (b) Distribution of visual analog scale score. (c) Distribution of responses to EQ-5D-3L, by dimension. VAS visual analog scale

EORTC QLQ-C30 and CML-24

Scores for the functioning and symptom scales of the QLQ-C30 questionnaire and the CML-24 scales are shown in Table 2. The dimensions of QoL that were the most affected in patients with CML compared to the general population of the same age and sex were social functioning, role functioning and cognitive functioning, with mean differences of −16.0, −13.1 and −11.7, respectively (Fig. 2). Using the thresholds proposed by Cocks et al. for all the scales, these differences were considered to be of considerable clinical relevance for social functioning, moderate clinical relevance for cognitive functioning and low clinical relevance for role functioning (Online Resource Table S2). Fatigue, dyspnea and pain were the symptoms with the highest deviation from general population norms (mean differences of 20.6, 14.0 and 8.3, respectively). These differences were considered to be of considerable clinical relevance for fatigue, moderate clinical relevance for dyspnea and low clinical relevance for pain. Diarrhea and nausea/vomiting scores were higher in patients with CML than in the general population (mean differences of 15.6 and 8.4, both corresponding to moderate clinical relevance), but this difference was limited to patients treated with bosutinib or imatinib. Interestingly, the perception that patients with CML had of their global health status was similar to that of the general population (mean difference: 0.8, SD: 19.7), while many of their functioning and symptom scores were worse than in the general population.

Table 2 Health-related quality of life scores: QLQ-C30 and QLQ-CML24 scales
Fig. 2 Difference in QLQ-C30 scores between patients with CML and the general population norm. (a) Difference in functioning scores between CML patients and the general population, matched by sex and age. A negative difference indicates that CML patients have worse functioning scores than the general population. (b) Difference in symptom scores between CML patients and the general population, matched by sex and age. A positive difference indicates that CML patients have more symptoms than the general population

Correlation of global QoL and utility

Spearman’s correlation was performed to assess the relationship between utility and global health status score from the QLQ-C30 questionnaire. There was a positive correlation between utility and global health status (ρs = 0.63).

Determinants of the utility score

The utility score varied according to the phase of the disease (p = 0.005). The mean utility score (SD) was 0.72 (0.25) in the chronic phase, 0.84 (0.21) in treatment-free remission and 0.77 (0.15) in advanced (accelerated/blast) phases, although the latter group comprised only 4 patients. The mean utility score was 0.10 points lower for women than for men (0.69 vs. 0.79, p < 0.0001), as shown in Table 3. It decreased with the number of treatment lines received, from 0.77 in patients having received only one line to 0.59 in patients with four or more treatment lines (p = 0.014). Within the chronic phase, mean utility scores according to the line of treatment are presented in Online Resource Table S3. The mean utility score differed according to the current treatment (p = 0.003). Patients receiving no treatment (likely to correspond to patients in treatment-free remission) had the highest mean utility score (0.82), apart from patients treated with ponatinib, whose estimate was based on only 7 patients, while patients treated with bosutinib had the lowest utility score (mean utility: 0.61). Age and time from diagnosis were not associated with the utility score (Table 3).

Table 3 EQ-5D-3L health utility scores by demographic and disease characteristics

In the multiple regression analysis, model 1, which included only the demographic and medical variables, yielded similar results to the univariate analysis, although the association between the utility score and the current CML treatment was less pronounced after adjustment for other clinical variables such as the number of treatment lines (Table 4). However, model 1 explained only 13.9% of the variance in the utility score. In model 2, which included several scales/items of the QLQ-C30 and QLQ-CML24, fatigue was the most important independent determinant of the utility score (p < 0.0001, Table 4). The burden of symptoms scale, which is based on the main treatment side effects, was also a significant determinant of the utility score (p = 0.003), as were dyspnea (p = 0.030) and financial difficulties (p = 0.033). Of note, excluding the 4 patients in advanced phases and the 2 patients with unusual “other treatments” yielded the same results (Online Resource Table S4).

Table 4 Determinants of EQ-5D-3L health utility scores: multiple regression analysis results

Each 1-point increase on the fatigue scale was associated with a change of −0.00269 in the utility score. When the fatigue score was divided into quintiles ([0–23], [24–34], [35–56], [57–78], [79–100]), the mean utility scores were 0.9, 0.8, 0.8, 0.6 and 0.4, respectively. Model 2 explained a higher proportion (50.2%) of the variance of the utility score than model 1. Surprisingly, gender, number of treatment lines and current CML treatment were no longer associated with the utility score. These three variables were significantly associated with the fatigue score. Women reported a higher mean fatigue score than men (56.3 versus 42.3, p < 0.0001). The mean fatigue score increased with the number of lines received (p = 0.002). It varied from 39.3 in patients not receiving treatment to 60.3 in patients receiving dasatinib (Online Resource Table S5).
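
To give a sense of scale for the fatigue coefficient, the short sketch below multiplies the reported value of −0.00269 per point by a given change in fatigue score. It assumes a purely linear effect with all other covariates of model 2 held constant; the function name is hypothetical.

```python
# Utility change per 1-point increase on the 0-100 fatigue scale (reported value).
FATIGUE_COEFFICIENT = -0.00269

def utility_change(fatigue_increase):
    """Predicted change in utility for a given increase in fatigue score,
    assuming a linear effect and all other covariates unchanged."""
    return FATIGUE_COEFFICIENT * fatigue_increase

print(round(utility_change(20), 4))   # -0.0538
print(round(utility_change(100), 3))  # -0.269 over the full range of the scale
```

The unadjusted quintile means reported above show a steeper gradient than this adjusted coefficient, as would be expected when other symptom scales in the model share part of the association.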

Discussion

This observational study evaluated health-related QoL and health state utility values in a large sample of patients with CML in France using three standardized PRO questionnaires. The mean utility score (SD) was 0.72 (0.25) in the chronic phase (all treatment lines combined) and 0.84 (0.21) in treatment-free remission. Patients with CML had lower global QoL and utility scores than the general population of the same age and sex. Regarding the health state utility value, deviation from population norms amounted to −0.15 (SD: 0.25) on average. This difference exceeds the minimum clinically important difference for the EQ-5D-3L reported in the context of cancer or hemopathy [30,31,32], which represents the smallest impact that an individual would identify as important. Regarding QoL, social functioning, cognitive functioning and role functioning were impaired in patients with CML compared to general population norms, with mean differences of −16.0 (considerable clinical relevance), −11.7 (moderate clinical relevance) and −13.1 (low clinical relevance), respectively. Fatigue, dyspnea and pain were the symptoms with the highest deviation from general population norms (mean differences of 20.6, 14.0 and 8.3, corresponding to considerable, moderate and low clinical relevance, respectively). Our study shows that although TKIs prevent the disease from progressing to the accelerated or blast phase and even allow remission without treatment, the quality of life of patients with CML is notably altered, with a real burden of symptoms.

The mean utility score for patients in first-line chronic phase was estimated at 0.76 in our study, which was close to the mean utility value of 0.80 in newly diagnosed CML patients in the SPIRIT2 trial [18]. In our study, the mean utility value in second-line chronic phase patients (0.68) was also consistent with the mean utility values of a hospital-based Thai study in chronic phase CML patients refractory to first-line treatment with imatinib (mean utility values of 0.647, 0.749 and 0.810 for patients receiving high-dose imatinib, dasatinib and nilotinib, respectively) [33]. Not surprisingly, our estimate was notably lower than estimates derived from studies that used direct elicitation techniques in respondents from the general population (Online Resource Table S1) [19,20,21]. Indeed, Arnold et al. [34] showed in a systematic review of studies providing both direct (time trade-off, standard gamble) and indirect (EQ-5D) utility estimates that direct methods resulted in higher health ratings than indirect methods.

One criticism of the EQ-5D-3L questionnaire is its lack of sensitivity to changes in health [35]. One of our objectives was to study whether the utility score could capture meaningful symptoms of the disease or of its treatments in CML patients. Our study showed that the utility score varied depending on symptoms such as fatigue, dyspnea and the global symptom score from the QLQ-CML24. Their effects on the utility score outweighed the effects of gender, current treatment and the number of treatment lines received, which ceased to be statistically significant after adjustment for symptoms. Gender and the number of treatment lines were both correlated with fatigue. However, the cross-sectional design of our study does not allow the causal effect of each of these variables to be disentangled.

Regarding the comparison of quality of life between CML patients and the general population, our results are consistent with previous studies that have used other instruments [15, 16]. In a study of 448 Italian patients having received long-term imatinib, Efficace et al. found that QoL as measured by the Short Form Health Survey (SF-36) was significantly worse relative to adjusted population norms for the physical components, but less markedly for mental health dimensions [15]. Clinically meaningful differences were observed for the “Role limitation because of physical problems” scale (−11.5; 95% confidence interval [CI], −16.8 to −6.3), the “General health perception” scale (−8.9; 95% CI, −11.7 to −6.0) and the “Role limitation because of emotional problems” scale (−9.6; 95% CI, −14.9 to −4.3). Differences in QoL between patients and population norms were particularly pronounced among females and younger individuals (ages 18–39 years). In CML patients 60 years and older, scores were almost identical to population norms on all scales. Fatigue was the most frequently reported symptom and severe fatigue was more frequently reported by women (39%) than by men (22%), which is consistent with our results. In another study, 62 US CML patients treated with a first- or second-generation TKI reported significantly worse fatigue severity, fatigue interference, depression, symptom burden and physical QoL than respondents from the general population [16]. The impairment of cognitive functioning in CML patients that we found in our study is in line with the results of Zulbaran-Rojas et al. [36]. In that study, which enrolled 219 patients on frontline TKI trials with a second-generation TKI, one of the top five symptoms measured by the MD Anderson Symptom Inventory questionnaire was difficulty remembering. Of note, as in our survey, the patient population in that study was quite young (median age 50 years).

Our study has limitations. First, our population was selected. Compared to a population-based study of prevalent CML patients in France [4], the study population is younger (median age: 52 versus 63 years) with a female preponderance (59.6% versus 45%). CML patients were mainly informed of the existence of the web-based survey through the French patients’ association “LMC France”. As respondents had to have access to the internet and be motivated to answer the survey, they may not be fully representative of the whole population of French CML patients. However, our utility estimates in the chronic phase were consistent with estimates from recent clinical trials or hospital-based studies conducted in other countries [18, 33]. Secondly, due to the cross-sectional nature of the study, the study population combines patients in different disease settings. Most of the patients (58.6%) had received only one line of treatment, but some patients had received up to five different treatments and some were in treatment-free remission. However, this design allowed us to study potential determinants of utility among clinical and treatment-related factors. Because there were only four patients in advanced phases, we cannot report precise values for those health states. This illustrates the fact that advanced phases are now quite rare, even for expert CML hematologists. Finally, no data regarding response to treatment or the detailed chronology of treatment administration were collected, precluding the estimation of QoL or utility values according to treatment response. This is because the questionnaire was meant to be completed by patients and not by physicians. Questions were intentionally simple and easy to complete in order to obtain reliable data.

Our work also has several strengths. Firstly, the sample size is large given the rarity of the disease. The study provides recent QoL and utility data for different TKIs obtained in a real-life setting, in contrast to available results obtained from clinical trials [11, 17, 18]. Secondly, we used several standardized questionnaires to gain more insight into the QoL of CML patients, from a generic questionnaire (EuroQol EQ-5D-3L) to a cancer-specific questionnaire (EORTC QLQ-C30) and a disease-specific questionnaire (EORTC QLQ-CML-24). The comparison to population norms for the EuroQol EQ-5D-3L and EORTC QLQ-C30 allowed us to quantify how much the QoL of CML patients was altered compared to the general population of the same age and sex and to identify the most affected functions and symptoms.

Conclusions

Our study provides utility values for future economic evaluations of treatments in patients with CML. Unlike other studies that presented utility scores according to the cytogenetic or molecular response [19,20,21], we provided utility scores according to the number of treatment lines received and showed that they decreased with the number of treatment lines. The availability of utility scores according to the number of lines may be useful for researchers using health care claims databases as a data source for cost-utility modeling. These databases, such as the US Medicare/Medicaid databases, the French National Health Data System (SNDS) and Taiwan’s National Health Insurance Research Database, provide detailed data regarding health care costs, treatments received by patients and their vital status. However, biological results such as cytogenetic or molecular response are usually not available in such databases, which limits their use for cost-utility modeling in CML. Our study results may help researchers to develop cost-utility models using those data.

We also showed that, although TKIs prevent the disease from progressing and even allow remission without treatment, QoL in patients with CML is notably altered. The utility scores deteriorate with CML symptoms and particularly with fatigue.