Introduction

Although tamoxifen had long been the standard adjuvant endocrine therapy for postmenopausal women with breast cancer based on the results of randomized controlled trials (RCTs), treatments including aromatase inhibitors (AIs) are now regarded as the standard adjuvant endocrine therapies for these patients [1–9]. In RCTs, 5 years of tamoxifen was compared with three types of AI use in the adjuvant setting, namely, 5 years of AI [1, 2], 2–3 years of AI after 3–2 years of tamoxifen [3–6], and AI after 5 years of tamoxifen [7–9]. All arms including AIs exhibited somewhat better disease-free survival than the arms of 5 years of tamoxifen. Health-related quality of life (HRQOL) outcomes examined in some of these RCTs have also been reported. Such HRQOL data have so far shown no difference between the group administered tamoxifen for 5 years and that including AI use [10–13].

We conducted an RCT comparing 5 years of tamoxifen with 1–4 years of anastrozole after 4–1 years of tamoxifen in the adjuvant setting for postoperative postmenopausal women with breast cancer. We here report the HRQOL data from the trial.

Patients and methods

In November 2002 we started to enroll eligible patients in this open-label RCT, the National Surgical Adjuvant Study of Breast Cancer 03 (N-SAS BC 03). Eligibility criteria were described previously [14]. In brief, postoperative postmenopausal women aged 75 years or younger with estrogen and/or progesterone receptor-positive breast cancer of stage I, IIA, IIB, IIIA, or IIIB according to the UICC–TNM classification entered the trial. They were free of recurrence and had been on adjuvant tamoxifen for 1–4 years.

After obtaining written informed consent, the patients were randomly assigned either to continue tamoxifen 20 mg/day (tamoxifen group) or to switch from tamoxifen to anastrozole 1 mg/day (anastrozole group) until individual patients in both groups had completed 5 years of the respective treatment (Fig. 1).

Fig. 1 Study design. TAM tamoxifen, ANA anastrozole

Primary endpoints of the trial were disease-free survival and adverse events. Secondary endpoints were relapse-free survival, overall survival, and HRQOL. The results of the primary endpoints and secondary endpoints other than HRQOL were already reported [14].

Although we had planned to enroll 2,500 patients in this trial, the results of several RCTs determining the role of AIs in the adjuvant setting had shown fewer recurrences in the arms with AIs. Therefore, we decided to close the enrollment after 700 patients had entered. Although patients of the tamoxifen group were then given the option to switch treatment to anastrozole, only 13 patients did so. Patient enrollment in the study was closed in December 2005.

For the HRQOL assessment, all participants of N-SAS BC 03 were asked to complete self-administered questionnaires, namely FACT-B (breast cancer scale), FACT-ES (endocrine symptom scale), and a measure of psychological distress (CES-D: Center for Epidemiologic Studies Depression scale), at randomization (baseline) and at 3 months, 1 year, and 2 years after randomization. HRQOL and psychological distress scores over that period were compared between the two treatment groups. The officially validated Japanese version of FACT-B [15, 16] is a 36-item questionnaire that measures both general HRQOL associated with cancer (27 items, referred to as FACT-G) and additional HRQOL specifically related to breast cancer (9 items, referred to as the breast cancer subscale [BCS]). FACT-G has four subscales, i.e., physical (7 items), social (7 items), emotional (6 items), and functional well-being (7 items). The Japanese version of FACT-ES is a combination of FACT-G and an endocrine symptom subscale composed of 18 items. In the FACT system, each item is rated on a 5-point scale (from 0 to 4), and higher scores indicate better HRQOL status. The CES-D, a questionnaire that measures the severity of depression, has 20 items, each rated on a 4-point scale (from 0 to 3). Higher CES-D scores indicate worse conditions.
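The score ranges implied by these item counts can be sketched as follows. This is a minimal illustration, assuming that each scale is a simple sum of its items; reverse scoring of negatively worded items and prorating of missing items, which the official scoring manuals handle, are deliberately ignored. The names `scale_ranges` and `max_score` are hypothetical and introduced only for this example.

```python
# Item counts and per-item ranges as stated in the text above.
FACT_MAX_ITEM = 4   # FACT items are rated 0-4
CESD_MAX_ITEM = 3   # CES-D items are rated 0-3

FACT_G_SUBSCALES = {
    "physical": 7,
    "social": 7,
    "emotional": 6,
    "functional": 7,
}
BCS_ITEMS = 9    # breast cancer subscale
ESS_ITEMS = 18   # endocrine symptom subscale
CESD_ITEMS = 20


def max_score(n_items, max_item):
    """Upper bound of a summed scale with n_items items rated 0..max_item."""
    return n_items * max_item


def scale_ranges():
    """Return {scale name: (min, max)} implied by the item counts.

    Assumption: plain summation of item scores (no reverse scoring,
    no prorating for missing items).
    """
    fact_g_items = sum(FACT_G_SUBSCALES.values())
    return {
        "FACT-G": (0, max_score(fact_g_items, FACT_MAX_ITEM)),
        "FACT-B": (0, max_score(fact_g_items + BCS_ITEMS, FACT_MAX_ITEM)),
        "FACT-ES": (0, max_score(fact_g_items + ESS_ITEMS, FACT_MAX_ITEM)),
        "CES-D": (0, max_score(CESD_ITEMS, CESD_MAX_ITEM)),
    }


if __name__ == "__main__":
    for name, (low, high) in scale_ranges().items():
        print(f"{name}: {low}-{high}")
```

Under these assumptions the FACT-G total ranges from 0 to 108, which gives a sense of scale for the point changes discussed later.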

The data were analyzed on an intention-to-treat basis to provide a realistic indication of the generalizability and effectiveness of the intervention. Differences in the categorical sociodemographic variables between the treatment groups were analyzed using the chi-square test. Analysis of variance was used to test differences in the continuous variables between the groups.
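As an illustration of the chi-square comparison of a categorical variable between the two groups, the following is a minimal sketch for a 2 × 2 table. It assumes the Pearson statistic without continuity correction (the text does not state which variant was used), and the function name `chi_square_2x2` is hypothetical. It is not the SAS procedure used in the trial.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square statistic and p-value for a 2x2 table.

    `table` is [[a, b], [c, d]] with rows = treatment groups and
    columns = categories. No continuity correction is applied
    (an assumption made for this sketch).
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of group and category.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, P(X > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


if __name__ == "__main__":
    # Hypothetical counts for illustration only; not trial data.
    stat, p = chi_square_2x2([[10, 20], [30, 40]])
    print(f"chi-square = {stat:.3f}, P = {p:.3f}")
```

A P value of 0.05 or greater from such a test would be read as no statistically significant imbalance between the groups for that variable.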

Although we expected that some HRQOL data would be missing, all available data from each time point were used in these analyses, under the assumption that the data were missing at random and that the missingness was therefore ignorable. In other words, no specific analytical techniques, such as imputation-based approaches, were applied to deal with missing data. For the longitudinal HRQOL and CES-D variables, we used a generalized linear model fitted by generalized estimating equations (GEE) to account for correlations among repeated measurements. In all the models, we treated time as a categorical variable, included the baseline measurement as a covariate, and used a compound-symmetry structure for the initial error-correlation matrix. Standard errors were calculated using robust sandwich variances. All analyses were performed using SAS for Windows version 9.1 (SAS Institute, Cary, NC). P < 0.05 was considered statistically significant.
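For orientation, a marginal model of this kind can be written in the standard GEE form below. This is a sketch under stated assumptions rather than the trial's actual specification: an identity link is assumed, and the notation (Y_ij for the score of patient i at post-baseline time point j, trt_i for the treatment indicator, Y_i0 for the baseline score, X_i for the design matrix, V_i for the working covariance, r_i for the residual vector) is introduced here for illustration only.

```latex
\begin{align*}
\mathrm{E}[Y_{ij}] &= \beta_0 + \beta_1\,\mathrm{trt}_i + \beta_2\,Y_{i0}
  + \sum_{k} \gamma_k\,\mathbf{1}(t_j = k), \\
\mathrm{Corr}(Y_{ij}, Y_{ij'}) &= \alpha \quad (j \neq j'), \\
\widehat{\mathrm{Var}}(\hat{\theta}) &= B^{-1} M B^{-1}, \qquad
B = \sum_i X_i^{\top} V_i^{-1} X_i, \qquad
M = \sum_i X_i^{\top} V_i^{-1} r_i r_i^{\top} V_i^{-1} X_i,
\end{align*}
```

Here θ collects all regression coefficients (the β and γ terms), the sum over k runs over the post-baseline time points other than a reference category, and r_i = Y_i − X_i θ̂. The second line is the compound-symmetry (exchangeable) working correlation, and the third is the robust sandwich variance, which remains valid even if the working correlation is misspecified. A treatment-by-time interaction would add terms of the form δ_k trt_i 1(t_j = k) to the first line.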

Results

The response rates of the questionnaires were 98.6, 97.2, 90.9, and 78.5% at baseline, 3 months, 1, and 2 years, respectively. At baseline, 694 patients (346 in the tamoxifen group and 348 in the anastrozole group) responded. The patients’ characteristics were well balanced between the two treatment groups as shown in Table 1.

Table 1 Patient and tumor characteristics and treatments

Regarding performance status (PS) of the participants, most of them were PS 0 according to ECOG criteria, and less than 3% of them were PS 1 in both treatment groups during the 2 years. Only one patient of the anastrozole group was PS 2 at 1 and 2 years, respectively. There were no statistically significant differences between the treatment groups at any time point.

As for drug compliance, the percentages of the patients who had taken more than 80% of the prescribed drug doses were 94.6% in the anastrozole group and 97.4% in the tamoxifen group at 3 months, 87.0 and 90.1% at 1 year, and 72.6 and 74.1% at 2 years, respectively. There were no statistically significant differences between the groups at any time point.

At baseline, the scores in the two treatment groups were almost the same in any questionnaire and in any subscale thereof (Table 2). The scores at 3 months were almost the same as those at baseline, evidencing no differences between the two groups. On the other hand, the total scores of FACT-G declined in the anastrozole group at 1 year and continued to do so until 2 years, whereas the scores of the tamoxifen group were generally stable in all questionnaires. This change of FACT-G in the anastrozole group caused a difference in the total scores of FACT-B and -ES between the two groups.

Table 2 Mean scores (standard error) at four time points by treatment group

The results of the generalized linear model with correlated errors, which is an analysis including a time factor, showed a statistically significant difference in the main effect in favor of the tamoxifen group in the total scores of FACT-G and -ES (P = 0.042, and 0.038, respectively) (Figs. 2, 3). Meanwhile, the total scores of FACT-B were marginally better in the tamoxifen group than in the anastrozole group (P = 0.066). Among the scores of the FACT subscales, only the scores of the physical well-being (PWB) subscale were statistically significantly better in the tamoxifen group compared to those in the anastrozole group (P = 0.005) (Fig. 4). On the other hand, there were no statistically significant differences in the scores of any item in the PWB subscale between the two treatment groups.

Fig. 2 FACT-G total scores (least-squares means). TAM tamoxifen, ANA anastrozole. Although the scores of the TAM group were stable, those of the ANA group declined and did not recover

Fig. 3 FACT-ES total scores (least-squares means). TAM tamoxifen, ANA anastrozole. The changes in the FACT-ES total scores of the two treatment groups were quite similar to those in the FACT-G total scores

Fig. 4 Scores of physical well-being of FACT-G (least-squares means). TAM tamoxifen, ANA anastrozole. The scores of the physical well-being subscale of FACT-G were statistically significantly better in the TAM group than in the ANA group

No statistically significant differences were found between the two treatment groups in the scores of CES-D, and subscales other than PWB of FACT-G, FACT-B, and FACT-ES.

In all analyses, interaction terms between treatments and time were not significant.

Although there was no statistically significant difference between the two treatment groups in the scores of the endocrine symptom subscale of FACT-ES, some items in the endocrine symptoms showed statistically significant differences. Hot flashes and vaginal discharge were worse in the tamoxifen group than in the anastrozole group (P = 0.0002, and <0.0001, respectively), while dizziness, diarrhea, and headache were worse in the anastrozole group (P = 0.0165, <0.0001, and 0.0023, respectively).

The mean scores (standard error) of total and each domain of FACT-G, -B, and -ES, and 5 items of the endocrine symptom subscale of FACT-ES, and CES-D at 4 time points by treatment group are shown in Table 2.

We conducted exploratory analyses of the scores that had shown statistically significant differences between the treatment groups by separating the patients by the duration of tamoxifen at study entry (shorter than 2.5 years vs. 2.5 years or longer) or by age at entry (younger than 60 years vs. 60 or older). In terms of the duration of tamoxifen, statistical significance was lost in the total scores of FACT-G and FACT-ES after separating the patients at 2.5 years, but a trend favoring tamoxifen was seen both in patients with the shorter duration on tamoxifen and in those with the longer duration (P = 0.115–0.238). Regarding PWB of FACT-G, the scores of the tamoxifen group were still statistically significantly better than those of the anastrozole group among patients with the shorter duration on tamoxifen (P = 0.039), while they were marginally better among patients with the longer duration (P = 0.057). After separating the patients at the mean age of 60, the results of the analyses were essentially the same as those after separating them by the duration of tamoxifen. That is, the total scores of FACT-G and FACT-ES were better in the tamoxifen group than in the anastrozole group in both age groups, but the differences were not statistically significant (P = 0.078–0.315). Although the PWB scores of FACT-G were statistically significantly better in the tamoxifen group than in the anastrozole group among the younger patients (P = 0.019), the difference favoring tamoxifen among the older patients did not reach statistical significance (P = 0.105).

We also conducted exploratory analyses of the endocrine symptom subscale items of FACT-ES that showed statistically significant differences between the two treatment groups, namely hot flashes, vaginal discharge, dizziness, diarrhea, and headache, by separating the patients by their age at entry (younger than 60 years vs. 60 or older), because those symptoms might have been affected by the patients' ages. Hot flashes and vaginal discharge were statistically significantly worse in the tamoxifen group than in the anastrozole group among the younger patients (P = 0.0003 and <0.0001, respectively), while only vaginal discharge was statistically significantly worse in the tamoxifen group among the older patients (P < 0.0001). On the other hand, dizziness and diarrhea were statistically significantly worse in the anastrozole group than in the tamoxifen group among the younger patients (P = 0.023 and 0.0015, respectively), while diarrhea and headache were statistically significantly worse in the anastrozole group among the older patients (P = 0.0039 and 0.014, respectively).

Discussion

Until recently, the standard adjuvant endocrine therapy had been tamoxifen for 5 years for both pre- and postmenopausal women with early breast cancer. However, after the results of RCTs comparing tamoxifen with AIs of the third generation in the adjuvant setting revealed the superiority of AIs for postmenopausal patients, at least in terms of disease-free survival, the standard adjuvant endocrine therapy for postmenopausal patients with early breast cancer was changed from tamoxifen to AIs.

Three third-generation AIs are available: anastrozole, letrozole, and exemestane. The first two agents are non-steroidal, while the third is steroidal. Each agent was compared with tamoxifen in the adjuvant setting in RCTs using one of three ways of administration. First, the up-front administrations of tamoxifen and AIs for 5 years were compared. In this setting, anastrozole was examined in the ATAC trial [1] and was found to be superior to tamoxifen in terms of disease-free survival. Letrozole was then compared with tamoxifen in the BIG 1-98 trial [2], and the rate of disease-free survival was better in the letrozole group.

Second, a comparison was made between continuing tamoxifen and switching to AIs after 2–3 years of tamoxifen. In this setting, the results of a comparison between tamoxifen and exemestane in the IES trial were reported [3]. In the first report, published in 2004, exemestane was shown to be better in terms of disease-free survival, while the second report in 2007 demonstrated superior overall survival with exemestane [4]. So far, three RCTs have examined whether switching from tamoxifen to anastrozole is superior to 5 years of tamoxifen in terms of survival: the ITA trial [5], the ARNO 95 trial, and ABCSG trial 8 [6]. The results of those three RCTs were combined and examined in a meta-analysis [17] that showed overall survival to be statistically significantly better in the switched groups. Our N-SAS BC 03 trial was also an RCT focusing on the same issue. However, after the publication of an ASCO technology assessment that recommended the use of AIs for postmenopausal women with early breast cancer [18], patients of the tamoxifen group were offered the option to switch to anastrozole, although only a few did so.

Third, a comparison was conducted between no further treatment and AI administration after 5 years of tamoxifen. In the MA.17 trial, letrozole was used in this setting [7]. The superiority of the letrozole group in disease-free survival was shown in 2003, after which patients in the placebo group were also offered letrozole. Therefore, the data on overall survival were not analyzed in a randomized fashion. The ABCSG trial 6a [8] revealed the effectiveness of anastrozole after 5 years of tamoxifen in terms of recurrence-free and distant metastatic recurrence-free survival. The NSABP B-33 trial [9] investigated the effectiveness of exemestane, but the study was ended before the planned number of patients was enrolled because of the publication of the MA.17 trial results, and exemestane was then offered to the patients in the control group. However, even in such a situation, exemestane administration was shown to be effective in terms of relapse-free survival.

In most of the RCTs mentioned above, the HRQOL of the patients was evaluated using established HRQOL questionnaires as a secondary endpoint. The results of some of them were published, and are summarized below.

In the ATAC trial, 1,021 women were enrolled in the HRQOL subprotocol. FACT-B and -ES were used, and the participants were asked to complete the questionnaires at baseline, 3 and 6 months, and every 6 months thereafter, or until disease recurrence. The results of the first interim analysis at 2 years were reported in 2004 [10]. The overall HRQOL for both the tamoxifen and anastrozole groups improved from baseline during the 2-year period. There were no significant differences in the FACT-B Trial Outcome Index (TOI) and ES scores. The TOI is the sum of the scores from the physical and functional well-being subscales and the breast cancer subscale. The second report regarding the HRQOL at 5 years in the ATAC trial was published in 2006 [11]. The results at 5 years were almost the same as at 2 years, i.e., there was no statistically significant difference between the tamoxifen and anastrozole groups in the TOI of FACT-B and ES total scores. Therefore, the authors concluded that anastrozole and tamoxifen had a similar impact on the HRQOL. The results of the HRQOL study in IES were reported in 2006 [12]. Five hundred eighty-two patients were enrolled in that substudy. As in the ATAC trial, FACT-B and -ES were used, and the FACT-B TOI scores, total FACT-B + ES scores, and total ES scores up to 24 months were evaluated. HRQOL was stable over 2 years, with no statistically significant differences observed between the tamoxifen and exemestane groups in the TOI and ES scores except for the TOI score change from baseline at 6 months, which favored the tamoxifen group. The investigators of IES concluded that the clinical benefits of exemestane over tamoxifen were achieved without a significantly detrimental effect on HRQOL. The results of the HRQOL assessment in MA.17 were published in 2005 [13]. A total of 3,612 patients were enrolled in that HRQOL substudy. The Short Form 36-item Health Survey (SF-36) and the Menopause Specific Quality of Life Questionnaire (MENQOL) were used for the assessment of HRQOL. The participants completed them at baseline, 6 months, and annually thereafter. No differences were seen between groups in the mean change scores from baseline for the SF-36 physical and mental component summary scores at 6, 12, 24, and 36 months. The investigators concluded that letrozole had no adverse impact on overall HRQOL.

In our study, the scores of the tamoxifen group were unchanged for 2 years after randomization, as expected, but the FACT-G scores of the anastrozole group declined by 2 to 3 points. It is difficult to judge whether anastrozole actually compromised the HRQOL of the patients in the anastrozole group. Cella et al. [19] examined how large a score change must be to be clinically meaningful. They showed that a decline of 8–10 points or more in FACT-G was associated with a clinically meaningful worsening in HRQOL. Therefore, the changes observed in the anastrozole group in this study might be of limited clinical relevance. However, the difference between the groups was statistically significant. Such a detrimental impact might have been caused by the musculoskeletal symptoms that are known to be induced by AIs at a very high incidence [20]. Indeed, arthralgia was reported as an adverse event more commonly in the anastrozole group than in the tamoxifen group in this trial [14]. Although none of the items in the questionnaires used specifically probed joint pain, the scores of the PWB subscale of FACT were statistically significantly worse in the anastrozole group and might have affected the overall HRQOL to a certain degree.

Our results are not consistent with those of the trials mentioned above, possibly for the following reasons. First, there might be some racial differences in HRQOL or in the side effects of the drugs. In the MA.17 trial, such differences were suggested among patients taking letrozole [21]. Second, the differences in our trial might have arisen by chance. Third, the differences might have been caused by the possible biases mentioned below. Fourth, antidepressants of the selective serotonin reuptake inhibitor (SSRI) class reportedly interfere with the activity of CYP2D6, one of the enzymes that convert tamoxifen to a more active metabolite, endoxifen [22]. SSRI usage may also have affected the results of the present trial, but we have no data in this regard.

There are some limitations to our study. First, it was conducted as an open-label RCT, not a blinded one. Therefore, the participants knew which drug they were taking, a factor that may have affected the results. In addition, all the participants in our study had already taken tamoxifen for 1–4 years. Patients who experienced severe side effects due to tamoxifen may have stopped taking it before they were invited to participate in our trial. Thus, patients who were assigned to the tamoxifen group would be expected to continue to do well. On the other hand, if patients experienced new adverse effects after starting a new drug, their HRQOL would have worsened. The present results might have been affected by this type of selection bias. However, HRQOL data obtained in another RCT of ours comparing up-front tamoxifen, anastrozole, and exemestane in the adjuvant setting have shown that HRQOL measured with FACT was better in the tamoxifen group than in the AI groups [23]. Thus, the present data may well be accurate. Second, while our trial was under way, the results of large RCTs comparing 5 years of tamoxifen with 2–3 years of tamoxifen followed by 3–2 years of one of the AIs showed the latter to be superior, at least in terms of disease-free survival. The participants of our trial were informed of those results. Although this information might have affected the HRQOL data, this bias should not have been the cause of the lower FACT scores in the anastrozole group.

To identify subgroups that would have a remarkably worse HRQOL when taking anastrozole compared with tamoxifen, we performed exploratory analyses by separating the participants by the duration of tamoxifen at study entry and by their age at entry. Although some of the scores examined lost statistical significance, possibly because of the smaller sample sizes, the trend was almost the same as that seen in the participants as a whole.

In conclusion, for Japanese postmenopausal breast cancer patients who would benefit from adjuvant endocrine therapy but whose absolute benefit would be small, tamoxifen for 5 years may be a better choice than adjuvant endocrine therapy including AIs in terms of HRQOL.