Introduction

With a 13% lifetime incidence risk [1], breast cancer is the most common malignancy in women [2], and their quality of life (QoL) suffers from this disease and its treatment [3]. In one-third of the cases [4], chemotherapy may alleviate the symptoms [5], and, for high-risk early breast cancer [6], reduce the 10-year recurrence by ≥ 5%. Gemcitabine is a pyrimidine analogue that stops DNA elongation after adding a physiological nucleotide (masked termination). It competitively inhibits DNA polymerase and ribonucleotide reductase [7]. Phase II studies have reported 19% WHO-grade 3 and 3% WHO-grade 4 toxicities with Gemcitabine and 30% WHO-grade 3 and 11% WHO-grade 4 toxicities with Docetaxel–Gemcitabine (DG) [8].

The unique triple action and moderate toxicity of Gemcitabine deserve particular attention in breast cancer. In the SUCCESS A trial, we found more women with haematologic toxicities such as thrombocytopenia (2% vs. 0%) and leukopenia (64% vs. 58%) with DG than with Docetaxel (D), both following fluorouracil–epirubicin–cyclophosphamide (FEC) [9]. Fifty-nine percent with FEC-DG versus 36% with FEC-D needed more granulocyte colony-stimulating factor, and 4% versus 2% needed dose reductions of more than 20%. Neuropathy (1% vs. 0%), arthralgia (2% vs. 1%), and bone pain (3% vs. 1%) were more frequent with FEC-D.

Recent neoadjuvant [10] and adjuvant breast cancer trials [11,12,13] challenge the superior recurrence-free survival of combinations over single agents [14, 15], and QoL remains largely unexplored. A meta-analysis comparing combinations of taxanes and novel non-taxane agents, such as Vinorelbine, Gemcitabine, or Capecitabine, with single taxanes, reported a pooled hazard ratio of 0.8 (95% confidence interval (CI) 0.7–0.9) favouring the combinations [15]. One of the rare [16] DG versus D trials found a hazard ratio of 0.8 for time-to-treatment failure [11], another found a hazard ratio of 0.9 for disease-free survival [12], and we found the same [13]. The former favoured the single agent, the latter two the combination, and all were statistically nonsignificant. None reported QoL.

One reason for the under-investigation of QoL in chemo versus chemo studies [17] might be that time-to-event outcomes are easier to analyse than QoL curves [18, 19]. Using the 0-to-100-point, 30-item core questionnaire of the European Organisation for Research and Treatment of Cancer (EORTC QLQ-C30), one study confirmed a previous finding [20] of an eight-point difference in global QoL at three months (p = 0.001) for a higher dosage compared to a longer FEC [21]. While another trial found better QoL with oral rather than with intravenous chemotherapy [22], most QoL breast cancer trials compared chemotherapy with hormone therapy, stem cell transplant, or surgery, or analysed QoL as a predictor rather than as an outcome [17].

This report compares the QoL of women with high-risk early breast cancer randomized to two different adjuvant chemotherapies, namely three cycles of FEC in both groups followed by three cycles of DG in one group versus three cycles of D in the other.

Methods

Design

From September 2005 to March 2007, 271 study centres across Germany coordinated the SUCCESS A trial (clinicaltrials.gov: NCT02181101). The centres informed the local gynaecologists and gynaecological oncologists about the trial, who then informed their potentially eligible patients orally and in writing. After confirming eligibility and obtaining written informed consent, they transmitted their patients’ contact information and baseline characteristics to the centre. The centre completed this information as necessary and, before therapy, sent the first QoL questionnaire (t1) to each participant’s postal address. Varying recovery times might have prolonged the 21 days scheduled between each of the six chemotherapy cycles.

The centres sent the second (t2) and third (t3) QoL questionnaires after the 3rd and 6th cycles respectively and assigned the women to FEC-DG versus FEC-D arms before the 4th cycle. Further questionnaires followed 3 (t4), 6 (t5), 9 (t6), and 12 months (t7) after chemotherapy. The women were advised to complete the questionnaires at rest and independently, and they received stamped postal envelopes to return the completed questionnaires. To ensure quality, a clinical research organisation (CRO) regularly visited the centres and electronically managed the data, including automated plausibility checks. Led by the ethical board of the Ludwig-Maximilian-University of Munich, 37 local boards approved the study. The full protocol is available at http://www.success-studie.de/a/downloads.htm. The dataset generated and analysed during this study is available from the steering committee of the SUCCESS A trial upon reasonable request.

Eligibility

Eligible women were ≥ 18 years old with a ≤ 6-week-old R0 resection of an invasive primary epithelial breast cancer without distant metastases. They had a high recurrence risk, namely age ≤ 35 years at diagnosis, multifocal, multicentric or bilateral cancer, stage ≥ T2 (> 2 cm), G3 differentiation, hormone receptor negative tissue, or lymph node metastases. Their condition on the scale of the Eastern Cooperative Oncology Group was ≤ 2 (i.e. capable of all self-care), and they were able to understand the study concept well. They consented to regular aftercare. They had ≥ 3.0 × 10⁹ leucocytes and ≥ 100 × 10⁹ thrombocytes per litre of blood, and their aspartate aminotransferase, alanine aminotransferase, and alkaline phosphatase were within 1.5 times the reference laboratory’s normal range.

Baseline characteristics

Demographic characteristics collected were age, body mass index, and menopausal status. Cancer characteristics were size, tissue origin and differentiation, hormone and human epidermal growth factor receptor 2 status, and number of lymph node metastases. The tissue differentiation was graded by the Elston-Ellis modification of the Scarff–Bloom–Richardson grading [23]. A positive hormone status was an expression of oestrogen receptors, progesterone receptors, or both on at least 10% of the cancer cells.

Quality of life and sample size

The EORTC QLQ-C30, version 3.0, contains one global QoL, five functional, and nine symptom scales. The 16-item breast cancer module contains four functional and four symptom scales. All scales result from adding their items and transforming the sum scores to range between 0 and 100 points. Each scale needs at least 50% valid items. Higher scores indicate better global and functional, but worse symptom-related QoL. The between-item correlations, retest reliabilities, and convergent and discriminant validities are well established [24,25,26].
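The scoring rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's scoring code: the function name `scale_score` and its parameters are hypothetical, and the sketch follows the usual EORTC convention of averaging the answered items, which is equivalent to rescaling the sum when no item is missing. Functional items are reversed so that a higher score means better functioning.

```python
from typing import Optional, Sequence


def scale_score(
    items: Sequence[Optional[int]],
    item_min: int,
    item_max: int,
    functional: bool,
) -> Optional[float]:
    """Return a 0-100 scale score, or None if too few items are valid.

    items      -- raw item responses; None marks a missing answer
    item_min   -- lowest possible response (1 in the QLQ-C30)
    item_max   -- highest possible response (4, or 7 for global QoL items)
    functional -- True for functional scales, whose items are reverse-scored
    """
    valid = [x for x in items if x is not None]
    # The scale needs at least half of its items to be answered.
    if not valid or 2 * len(valid) < len(items):
        return None
    raw = sum(valid) / len(valid)                      # mean of answered items
    fraction = (raw - item_min) / (item_max - item_min)
    if functional:
        fraction = 1.0 - fraction                      # high score = good functioning
    return 100.0 * fraction


# Hypothetical responses: two seven-point global QoL items answered 6 and 5.
global_qol = scale_score([6, 5], item_min=1, item_max=7, functional=False)
```

With these hypothetical answers the raw mean is 5.5, which maps to 75 points on the 0-to-100 scale.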

Based on phase II Gemcitabine studies [8], global QoL was the primary outcome. Secondary core outcomes were fatigue, emotional and physical functioning, and pain. Side effects of systemic therapy were the secondary breast module outcome. Global QoL includes two seven-point items rating overall health and QoL during the past week. A ≥ five-point change is clinically relevant [27]. With a planned study sample of 3658 women [13], the power was sufficient for a 95% CI of less than ± one QoL point.

Randomisation and concealment

An external statistician performed the randomisation. Allocation was in a 1:1 ratio, stratified by menopausal status, cancer differentiation, hormone and human epidermal growth factor receptor status, and number of lymph node metastases. The CRO informed the centres about the allocation by facsimile and the electronic case report form. The study was open-label; however, the CRO concealed the sequence.
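To make the idea of stratified 1:1 allocation concrete, the following Python sketch uses permuted blocks within each stratum. The text does not state which algorithm the external statistician used, so permuted blocks, the block size of four, the seed, the class name `StratifiedBlockRandomiser`, and the stratum labels are all assumptions made for illustration.

```python
import random
from collections import defaultdict


class StratifiedBlockRandomiser:
    """Permuted-block 1:1 allocation, kept separately for every stratum."""

    def __init__(self, arms=("FEC-DG", "FEC-D"), block_size=4, seed=0):
        if block_size % len(arms) != 0:
            raise ValueError("block_size must be a multiple of the number of arms")
        self.arms = list(arms)
        self.block_size = block_size
        self.rng = random.Random(seed)       # seeded for a reproducible sequence
        self.pending = defaultdict(list)     # stratum -> allocations left in block

    def assign(self, stratum: str) -> str:
        """Return the next allocation for a woman in the given stratum."""
        block = self.pending[stratum]
        if not block:
            # Start a new block with equal numbers of each arm, then shuffle it.
            block.extend(self.arms * (self.block_size // len(self.arms)))
            self.rng.shuffle(block)
        return block.pop()


# Hypothetical stratum label combining some of the stratification factors.
randomiser = StratifiedBlockRandomiser(block_size=4, seed=1)
allocations = [randomiser.assign("premenopausal|G3|HR+") for _ in range(8)]
```

After every completed block, each stratum holds equal numbers of women in both arms, which is what keeps the 1:1 ratio balanced across the stratification factors.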

Chemotherapy and further treatment

All cycles were body-surface adapted intravenous infusions. The FEC dose was respectively 500 mg/m² in 15 min, 100 mg/m² in 15 min, and 500 mg/m² in 60 min. The DG dose was respectively 75 mg/m² in 60 min and 1000 mg/m² in 30 min. The D dose was 100 mg/m² in 60 min. Dexamethasone, 2-mercaptoethanesulfonate-sodium, and serotonin-3-antagonists identically decreased the toxicity during the cycles in both groups [9]. After chemotherapy, all hormone receptor-positive women received 20 mg/day Tamoxifen orally until study end. Oral 1 mg/day Anastrozole replaced it in the event of contraindications. From the end of chemo to study end, all received 4 mg Zoledronate intravenous infusions quarterly. Radiotherapy followed chemotherapy in all women with breast-conserving surgery, cancer size > 3 cm, multifocal, multicentric or bilateral cancer, an N2 lymph node status, or with carcinomatous lymph- or haemangiosis [9].
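As a purely arithmetic illustration of body-surface adapted dosing, the sketch below multiplies the per-square-metre doses quoted above by a body surface area. The mapping of the three FEC doses to fluorouracil, epirubicin, and cyclophosphamide follows the order of the abbreviation and is our reading of "respectively"; the function name `cycle_doses_mg` and the body surface area of 1.8 m² are hypothetical. The text does not say how body surface area was estimated, so it is taken as an input here.

```python
# Doses per cycle in mg per square metre of body surface area, as quoted in the text.
REGIMENS = {
    "FEC": {"fluorouracil": 500, "epirubicin": 100, "cyclophosphamide": 500},
    "DG": {"docetaxel": 75, "gemcitabine": 1000},
    "D": {"docetaxel": 100},
}


def cycle_doses_mg(regimen: str, bsa_m2: float) -> dict:
    """Return the absolute dose in mg of every drug for one cycle."""
    if bsa_m2 <= 0:
        raise ValueError("body surface area must be positive")
    return {drug: per_m2 * bsa_m2 for drug, per_m2 in REGIMENS[regimen].items()}


# Hypothetical woman with a body surface area of 1.8 square metres.
fec = cycle_doses_mg("FEC", 1.8)
dg = cycle_doses_mg("DG", 1.8)
d = cycle_doses_mg("D", 1.8)
```

For this hypothetical body surface area, the Docetaxel dose is 135 mg per cycle in the DG arm and 180 mg per cycle in the D arm.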

Analysis

A blinded independent institution described and analysed the data with IBM SPSS Statistics 21. All analyses were intention-to-treat, complete time-point analyses.

Multilevel linear models of repeated measurements estimated the mean QoL differences between FEC-DG and FEC-D. T1 to t7 and their interaction with chemotherapy were fixed effects. The number of days between time-points was a covariate. The covariance structure between time-points was the one with the lowest −2 log-likelihood for global QoL. For the time-points overall and for the one with the largest difference, we computed the number of included women, the QoL differences, the CIs, and the p values. A line chart illustrated the prediction of global QoL by group and time-point.

To analyse informative participation—that is, whether the probability of reporting QoL is associated with higher QoL [18]—a generalised multilevel linear model estimated the odds ratio between FEC-DG versus FEC-D of reporting global QoL. Assuming binomial distribution of the repeated outcome, the logit function linked chemotherapy with it. The fixed effects, covariance structure, and numerical and graphical presentation were as above. However, including the number of days between time-points would have excluded most non-participants.
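For readers less familiar with odds ratios, the following sketch computes a crude odds ratio of reporting global QoL with a Wald 95% CI from a single two-by-two table. This is a deliberately simpler technique than the generalised multilevel model described above, because it ignores the repeated measurements per woman. The function name `odds_ratio_wald` and the counts are hypothetical and are not study data.

```python
import math


def odds_ratio_wald(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Crude odds ratio with a Wald confidence interval.

    a, b -- women reporting / not reporting QoL in the first arm
    c, d -- women reporting / not reporting QoL in the second arm
    z    -- normal quantile; 1.96 corresponds to a 95% CI
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)   # standard error of log(OR)
    return math.exp(log_or), math.exp(log_or - z * se), math.exp(log_or + z * se)


def inverse_logit(eta: float) -> float:
    """Map a linear predictor to a probability, as the logit link implies."""
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical counts, not taken from the trial.
ratio, lower, upper = odds_ratio_wald(a=700, b=300, c=680, d=320)
```

A CI that includes 1 indicates that the odds of reporting do not differ significantly between the arms at the chosen level.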

Using Cox regression, the continuation of chemo- and bisphosphonate therapy until t7 was analysed. Sensitivity models on specific reasons for premature discontinuation (for reasons that occurred sufficiently often) treated discontinuation for reasons other than the reason currently being modelled as censored. The expected duration of 105 days of chemo plus 12 months of bisphosphonate therapy replaced the real duration if this information was missing. If women discontinued both therapies, the model accounted for the former.
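The censoring rules of this paragraph can be sketched as a small data-preparation step that produces the duration and event indicator a Cox model would need. This is an illustration under stated assumptions rather than the study's code: the names `Participant` and `cox_row` are hypothetical, 12 months are counted as 365 days, and the reason labels are invented examples.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

EXPECTED_CHEMO_DAYS = 105            # expected chemotherapy duration, from the text
EXPECTED_BISPHOSPHONATE_DAYS = 365   # assumption: 12 months counted as 365 days


@dataclass
class Participant:
    days_on_therapy: Optional[int]          # None if the duration is missing
    discontinuation_reason: Optional[str]   # None if therapy continued until t7


def cox_row(p: Participant, modelled_reason: Optional[str] = None) -> Tuple[int, int]:
    """Return (duration in days, event indicator) for one woman.

    With modelled_reason=None, every discontinuation counts as an event.
    In a sensitivity model, only discontinuation for modelled_reason is an
    event; discontinuation for any other reason is treated as censored.
    """
    if p.days_on_therapy is None:
        # Missing durations are replaced by the expected duration.
        duration = EXPECTED_CHEMO_DAYS + EXPECTED_BISPHOSPHONATE_DAYS
    else:
        duration = p.days_on_therapy
    if p.discontinuation_reason is None:
        event = 0
    elif modelled_reason is None:
        event = 1
    else:
        event = int(p.discontinuation_reason == modelled_reason)
    return duration, event


# Hypothetical woman who stopped after 77 days because of toxicity.
main_model = cox_row(Participant(77, "toxicity"))
sensitivity = cox_row(Participant(77, "toxicity"), modelled_reason="patient wish")
```

In the main model this woman contributes an event at day 77, whereas in the sensitivity model for a different reason she is censored at day 77.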

Results

Baseline characteristics

The analysis included 3691 women (Fig. 1). Most had small hormone receptor-positive cancers and few lymph node metastases (Table 1).

Fig. 1 Diagram of the participant flow

Table 1 Characteristics before chemotherapy

QoL

Altogether, 3454 women returned at least one QoL questionnaire (Fig. 1). The last time point was t7 in 61% of the FEC-DG cases versus 60% with FEC-D, t6 in 7% versus 8%, t5 in 4% of both arms, t4 in 2% of both arms, t3 in 9% versus 10%, t2 in 4% versus 3%, and t1 in 5% of both arms. The average delay between these and the other women was only 0.7 days.

The average global QoL varied between 51 and 69 points. The standard deviation varied between 19 and 21 points. Physical functioning scored best with little difference between groups. The dates of questionnaire completion varied most at t4. QoL, particularly global QoL and emotional functioning, was highest from t5 to t7 (Table 2). Apart from side effects of systemic therapy, t1 to t7 accounted for less than 10% of the variance. Pain varied the least over time. The between-time variance of side effects of systemic therapy was stronger with FEC-DG than with FEC-D (Table 3).

Table 2 Quality of life by chemotherapy and time-point
Table 3 Variance of quality of life between women and time-points by chemotherapy

Therapy completion

Of the 3410 (92%) who completed chemotherapy (Fig. 1), 1619 (88%) started bisphosphonate therapy after FEC-DG and 1685 (90%) after FEC-D. Of the 281 (8%) discontinuing women, 65 (3%) began bisphosphonate therapy after FEC-DG and 50 (3%) after FEC-D. Severe toxicity was the main reason for discontinuing chemotherapy, namely in 4% with FEC-DG after an average of 77 days versus 3% with FEC-D after an average of 59 days. Most women stopping the bisphosphonate therapy chose to do this themselves (6% with FEC-DG vs. 5% with FEC-D; Table 4).

Table 4 Reason for and time to discontinuation from therapy

Effect of chemotherapy on QoL

Over all time points, the average global QoL was one point higher with FEC-DG than with FEC-D (95% CI: ± one point; p = 0.05), which is a fifth of the minimal clinically relevant difference [27]. At t3, this difference reached its maximum of two points (95% CI: ± two points; p = 0.02), again favouring the FEC-DG group and again below the clinical relevance threshold. QoL decreased during chemotherapy and ended six points higher than before (Fig. 2).

Fig. 2 General linear model of average global quality of life by chemotherapy and time-point

While no QoL outcome differed by more than one point over all time points between FEC-DG versus FEC-D (95% CI always ± 1), t3 was always the time point with the largest difference, always favouring FEC-DG. This difference was clinically relevant for side effects of systemic therapy (Table 5). Women with FEC-DG reported significantly less pain and fatigue and a significantly better physical functioning at t3. However, as these differences were maximally four points, they were probably below clinical relevance.

Table 5 General linear models of secondary outcomes by chemotherapy and time-point (Docetaxel–Gemcitabine minus Docetaxel)

Effect of chemotherapy on reporting global QoL

Over all time points, the odds of self-assessment of global QoL were 1.1 times higher with FEC-D than with FEC-DG (95% CI 1.0–1.1; p = 0.23). At t3, the ratio reached its maximum of 1.1 (95% CI 1.0–1.2; p = 0.15), again favouring FEC-D. That is, the questionnaire return proportion, which accounts for participation and correlates with higher QoL [18], did not differ significantly between the regimens. With both treatments, the probability of reporting QoL decreased by 25% by t4 and then remained stable (Fig. 3).

Fig. 3 Logistic linear model of the average probability of self-assessment of global health by chemotherapy and time-point

Effect of chemotherapy on continuation of therapy

Table 6 shows that the hazard of continuing therapy was 1.2 times higher with FEC-D than with FEC-DG (95% CI 1.0–1.4; p = 0.03). That is, women in the former group were slightly more likely to continue.

Table 6 Continuation of chemo- and bisphosphonate therapy by chemotherapy (Docetaxel–Gemcitabine divided by Docetaxel)

Discussion

After prior anthracycline treatment, DG is as beneficial as D for the QoL course of women with high-risk early breast cancer, as the one-point difference was below clinical relevance and the participation was the same. With both regimens, the long-term increase in QoL is clinically relevant, as the improvement seen from t1 to t5 and lasting until t7 was six points (p < 0.001) and good participation correlates with good QoL [18]. More precisely, the favourable effects of both chemotherapies on QoL are probably more durable than their short-term adverse influences. However, additional treatment with Zoledronate applied equally to both groups might also contribute to the long-term increase of QoL. Zoledronate may contribute by preventing disease recurrences or by promoting faster bone recovery from chemotherapy [28]. A perhaps even simpler explanation is that the cancer diagnosis and the need for surgery diminish QoL. In the long term, the recovery from chemotherapy and the hope that this therapy removed the last remnants of the cancer probably restore QoL to a level similar to that before diagnosis.

The secondary outcomes support the conclusion of the primary outcome. In line with prior findings [20, 21], the superiority of DG in four of five scales at t3 is short-lived. It is clinically relevant for therapeutic side effects (six points, p < 0.001) and perhaps relevant for pain (four points, p = 0.001) and physical functioning (three points, p < 0.001). A pain increase needs three points for clinical relevance, but a decrease needs five. The circumstances are similar for physical functioning [27].

A strength of this study is that four independent institutions collected and analysed representative nationwide data, with regular quality checks during the study and a thorough final validation after long-term follow-up. The separate responsibilities for randomisation, allocation concealment, data management, data collection, and blinded analysis minimised influences of potential conflicts of interest. Each institution counter-checked the information transmitted by the others.

We were the first to compare the one-year evolution of QoL between a taxane-Gemcitabine combination and a single-agent taxane after prior anthracycline treatment [10,11,12, 15]. The improvement that we found with both regimens was stronger and more persistent than in studies comparing other treatments [20,21,22]. This could be due to the superiority of anthracycline combinations followed by taxanes or to our non-chemotherapeutic modalities, such as bisphosphonates. Future clinical trials may address these important hypotheses.

Adding to the disagreement of prior studies regarding recurrence-free survival [10,11,12, 15], in the SUCCESS A trial we found that it was equal in the FEC-D and FEC-DG arms [13]. The more frequent haematologic toxicities, need for granulocyte colony-stimulating factor, and dose reductions that we found with FEC-DG [9] seem to disagree with the better QoL related to side effects of therapy at t3 with this treatment. Perhaps neuropathy, arthralgia, and bone pain, which are more frequent with FEC-D [9], influence QoL more strongly than the former.

Taken together, we favour taxane therapies without Gemcitabine after prior anthracycline treatment because of equal survival [10,11,12,13, 15], fewer haematologic toxicities and need for adaptations [9], and equal QoL. Thereby, we challenge prior recommendations favouring combination therapies [14, 15].