Introduction

Breast cancer is one of the most commonly diagnosed cancers among women of all racial and ethnic groups and the second leading cause of cancer death in the United States [1]. Breast cancer survivors are at risk of recurrence, second cancers, and premature death [2]. Research on how individual disease management affects the natural course of breast cancer is therefore essential for improving survival after diagnosis. There is convincing evidence that physically active women have a substantially lower risk of developing breast cancer than inactive women [3, 4]. The preventive effects of physical activity (PA) on breast cancer development might also act after a breast cancer diagnosis, inhibiting progression and improving prognosis. A number of studies have examined whether exercise influences mortality among breast cancer survivors, but with varying results. A meta-analysis [5] that included four studies on prediagnosis PA and three on postdiagnosis PA found that prediagnosis PA reduced all-cause mortality by 18 % but had no effect on breast cancer deaths, whereas postdiagnosis PA reduced breast cancer deaths by 34 % and all-cause mortality by 41 %. Since that meta-analysis, ten large prospective cohort studies have estimated the association between PA and mortality in breast cancer [2, 6–14]. In addition, the previous meta-analysis did not include all the published studies available at the time of its compilation [15]. Therefore, we conducted a systematic meta-analysis combining all available study data to derive a more precise estimate of this association. We also performed a dose–response analysis, because the PA categories differed between studies, which might complicate the interpretation of results pooled across study populations.

Material and methods

Literature search

We searched PubMed (from 1967 to present) and Embase (from 1965 to present) for studies in humans of the association between PA and mortality in breast cancer. The search strategy used the terms “exercise”, “physical activity”, “motor activity”, “breast cancer”, and “mortality”. The latest search was run in January 2014, with no language restriction. Reference lists from selected articles and relevant review articles were examined manually to identify further potentially relevant studies. All searches were conducted independently by two reviewers; differences were checked by both and resolved by discussion. When the same patient population was included in several publications, only the most recent or largest study was used in this meta-analysis.

Inclusion criteria

The following inclusion criteria were used in selecting literature for the meta-analysis: (a) the exposure of interest was PA assessed before or after diagnosis; (b) the outcomes of interest were all-cause mortality or breast cancer-specific mortality; (c) the type of study was cohort; (d) the relative risk (RR) of mortality and 95 % confidence intervals (CIs) were reported (or information to calculate them); (e) the study compared at least two different PA levels, e.g. the most active subjects versus the least active subjects (i.e. the reference category); and (f) PA was assessed directly, not measured indirectly through sitting time.

Data extraction

Two investigators extracted the data independently. Discrepancies were adjudicated by a third investigator until consensus was achieved on every item. The following information was abstracted from each included article: the name of the first author, year of publication, country of origin, follow-up period, sample size, PA measurements, the RRs and corresponding 95 % CIs, and the confounders adjusted for in multivariate analysis. For studies that provided more than one RR, the RRs from the multivariate models with the most complete adjustment for confounding factors were abstracted for analysis.

Assessment of methodological quality

The methodological quality of the included studies was independently evaluated by two investigators mostly based on the Newcastle–Ottawa Scale (NOS) [16]. Each study was assessed based on (a) selection: whether or not the study was population-based (a representative cohort of patients with breast cancer); (b) exposure: how the PA questionnaire was administered (interviewer or self-administered), whether or not a more precise scale [e.g. metabolic equivalent task (MET) hours per week (MET-h/week), and kilocalories (kcal) per week] was used to measure the levels of PA, and whether or not the PA was assessed at more than one point in a person’s life; (c) comparability: whether analyses had been adjusted for the important confounding factors (age and BMI) and any additional factor; and (d) outcome: how the outcome was assessed (medical records or self-report), whether follow-up was long enough for outcomes to occur (>5 years), and whether follow-up was near-complete (≥90 %). Discrepancies were adjudicated through discussion and re-evaluation of the methodology of the study in question.

Statistical methods

All statistical analyses were done with Stata software (Version 12; Stata Corporation, College Station, TX, USA), and all tests were two-sided. If a study provided separate RR estimates by body mass index (BMI), we treated them as different studies [17]. The natural logarithms of the RRs from each study were combined to estimate a summary RR for PA and mortality using the DerSimonian and Laird random-effects model [18], which accounts for both within- and between-study variation. For each study, low-level PA represented the reference category, high-level PA represented the highest category, moderate-level PA represented the categories in between, and moderate-high level of PA represented both moderate- and high-level PA. First, we compared high-level PA with low PA. Second, we calculated estimates comparing moderate-level PA with low PA. Third, we calculated estimates for moderate-high level of PA versus low PA. For studies that did not report a single RR estimate for moderate-level PA [2, 7, 11, 12, 14, 15, 17, 19, 20], a summary estimate was calculated from the RR estimates for each of the moderate-level PA categories. This summary estimate was used in the meta-analysis of moderate versus low PA. For studies that did not report a RR estimate for moderate-high level of PA [2, 7–15, 17, 19–22], a summary estimate was also calculated. Statistical heterogeneity among studies was assessed with the Q and I² statistics [23], and P < 0.1 was considered significant [24]. Sensitivity analysis was performed to assess the influence of each individual study on the summary RRs by removing the studies from the meta-analysis one at a time. Publication bias was evaluated using Begg’s and Egger’s tests [25].
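The pooling step described above can be illustrated with a minimal Python sketch. It is not the Stata code used for the analyses: the function names, the recovery of standard errors from the reported 95 % CIs (assuming a normal approximation on the log scale with z = 1.96), and the input values are all illustrative assumptions.

```python
import math

def se_from_ci(lower, upper, z=1.96):
    """Standard error of log(RR) recovered from a 95 % CI (normal approximation)."""
    return (math.log(upper) - math.log(lower)) / (2 * z)

def dersimonian_laird(rrs, cis):
    """Pool relative risks with the DerSimonian-Laird random-effects estimator.

    rrs: list of study RRs; cis: list of (lower, upper) 95 % CI tuples.
    Returns the pooled RR, its 95 % CI, Cochran's Q, I^2 (%) and tau^2.
    """
    y = [math.log(rr) for rr in rrs]                  # log RRs
    v = [se_from_ci(lo, hi) ** 2 for lo, hi in cis]   # within-study variances
    w = [1.0 / vi for vi in v]                        # fixed-effect weights
    sw = sum(w)
    y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sw
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y))  # Cochran's Q
    df = len(y) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0   # between-study variance
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    w_re = [1.0 / (vi + tau2) for vi in v]            # random-effects weights
    y_re = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se_re = math.sqrt(1.0 / sum(w_re))
    return {
        "rr": math.exp(y_re),
        "ci": (math.exp(y_re - 1.96 * se_re), math.exp(y_re + 1.96 * se_re)),
        "q": q,
        "i2": i2,
        "tau2": tau2,
    }

if __name__ == "__main__":
    # Hypothetical study estimates, not taken from the included cohorts.
    print(dersimonian_laird([0.70, 0.85, 0.90],
                            [(0.55, 0.89), (0.70, 1.03), (0.78, 1.04)]))
```

Calling the same function repeatedly with one study omitted each time gives a simple form of the leave-one-out sensitivity analysis.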

A two-stage random-effects dose–response meta-analysis was performed to compute the trend from the correlated log RR estimates across levels of PA, taking into account the between-study heterogeneity [26]. Briefly, a restricted cubic spline model, with three knots at the 25th, 50th, and 75th percentiles of the levels of PA, was estimated by generalized least squares regression, taking into account the correlation within each set of published RRs [27]. Then, we combined the study-specific estimates by the restricted maximum likelihood method in a multivariate random-effects meta-analysis [28]. A P value for nonlinearity was calculated by testing the null hypothesis that the coefficient of the second spline term is equal to 0. For each study, we calculated the median level of PA for each category by assigning the midpoint of the upper and lower boundaries of the category as the average PA level. When the highest category was open-ended, we assigned the lower boundary of the category multiplied by 1.5. Studies were not eligible if the required data were not reported or could not be estimated.
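The dose assignment and the spline term described above can be sketched as follows. This is an illustration only: the category cut points and knot positions are hypothetical, and the spline term uses the common Harrell parameterization for three knots, normalized by the squared distance between the outer knots; whether the original software used exactly this normalization is an assumption.

```python
def category_dose(lower, upper=None):
    """Dose assigned to a PA category: the midpoint of its boundaries,
    or 1.5 times the lower boundary when the category is open-ended."""
    if upper is None:
        return 1.5 * lower
    return (lower + upper) / 2.0

def rcs_nonlinear_term(x, knots):
    """Second (nonlinear) term of a three-knot restricted cubic spline.

    The first spline term is x itself. A model log RR = b1*x + b2*term is
    linear when b2 = 0, which is the hypothesis tested for nonlinearity.
    """
    k1, k2, k3 = knots

    def cube_plus(u):
        # Truncated cubic: zero below the knot, (x - knot)^3 above it.
        return max(u, 0.0) ** 3

    numerator = (cube_plus(x - k1)
                 - cube_plus(x - k2) * (k3 - k1) / (k3 - k2)
                 + cube_plus(x - k3) * (k2 - k1) / (k3 - k2))
    return numerator / (k3 - k1) ** 2

if __name__ == "__main__":
    # Hypothetical MET-h/week categories; None marks an open-ended top category.
    categories = [(0, 3), (3, 9), (9, 15), (15, None)]
    doses = [category_dose(lo, hi) for lo, hi in categories]
    print(doses)
    # Hypothetical knots standing in for the 25th, 50th and 75th percentiles.
    print([rcs_nonlinear_term(d, (2.0, 5.0, 10.0)) for d in doses])
```

The spline term is zero below the first knot and linear above the last knot, which is what restricts the curve in the tails.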

Results

Characteristics of the studies

Figure 1 outlines the search strategy used to obtain relevant literature. A total of 4,992 titles and abstracts were identified and screened, and 23 studies were reviewed in detail. Two articles without data about total PA were excluded [29, 30]. Four studies [17, 31–33] were excluded from the analysis of the association between postdiagnosis PA and breast cancer-specific and/or all-cause mortality because their subjects overlapped with those of a larger study [12]. Nevertheless, one of these studies [17] was included in the analysis of the association between prediagnosis PA and all-cause mortality, and three [17, 32, 33] were included in the dose–response analysis. After further excluding two reviews, sixteen cohort studies [2, 6–15, 17, 19–22] involving 42,602 patients with breast cancer were selected for meta-analysis. The characteristics of the included studies are shown in Table 1. Among these sixteen cohort studies, fourteen reported on the association between prediagnosis PA and breast cancer-specific and/or all-cause mortality and four reported on the association between postdiagnosis PA and breast cancer-specific and all-cause mortality, with two studies having data on both prediagnosis and postdiagnosis PA (Table 1). Table S1 presents the methodological quality of the studies included in the final analysis. The average NOS score was 6.7, ranging from 5 to 8.

Fig. 1 Flow chart of the selection of publications included in the meta-analysis

Table 1 Prospective cohort studies of physical activity and survival outcomes in breast cancer patients

Association of prediagnosis PA with mortality

Figure 2 presents the estimated RRs of breast cancer patients with prediagnosis PA. The results showed that patients who participated in moderate to high levels of PA before diagnosis had a RR of 0.82 (95 % CI 0.74–0.91, P < 0.01) for breast cancer-specific mortality (vs. low PA). The RRs of breast cancer-specific mortality for moderate versus low PA and high versus low PA were 0.83 (95 % CI 0.73–0.94, P < 0.01) and 0.81 (95 % CI 0.72–0.90, P < 0.01), respectively. There was some evidence of heterogeneity among studies for moderate-high versus low PA (P = 0.07, I² = 39.0 %).

Fig. 2 Relative risks for the association between prediagnostic physical activity and breast cancer-specific and all-cause mortality in breast cancer patients

Regarding all-cause mortality, prediagnosis PA was also associated with a protective effect. Moderate-high level of PA before diagnosis conferred a RR of 0.79 (95 % CI 0.73–0.85, P < 0.01) for all-cause mortality compared to low PA. When the association between PA and all-cause mortality was analyzed as moderate versus low PA and high versus low PA, RRs of 0.80 (95 % CI 0.73–0.88, P < 0.01) and 0.76 (95 % CI 0.69–0.83, P < 0.01) were found, respectively. There was some evidence of heterogeneity among studies for moderate-high versus low PA (P = 0.06, I² = 43.0 %).

In the leave-one-out sensitivity analysis, none of the results above was materially altered (data not shown). We found no evidence of publication bias in any analysis using Begg’s and Egger’s tests (P ≥ 0.12).

Association of postdiagnosis PA with mortality

Risk estimates of the association between postdiagnosis PA and mortality in breast cancer are shown in Fig. 3. The results revealed that patients who participated in moderate to high levels of PA after diagnosis had a RR of 0.71 (95 % CI 0.58–0.87, P < 0.01) for breast cancer-specific mortality compared to low PA. The RRs of breast cancer-specific mortality for moderate versus low PA and high versus low PA were 0.81 (95 % CI 0.70–0.94, P < 0.01), and 0.68 (95 % CI 0.57–0.82, P < 0.01), respectively. There was no statistically significant heterogeneity among the studies in any analyses (P ≥ 0.17).

Fig. 3 Relative risks for the association between postdiagnostic physical activity and breast cancer-specific and all-cause mortality in breast cancer patients

In terms of all-cause mortality, moderate-high level of PA reduced all-cause mortality by 43 % (RR 0.57, 95 % CI 0.45–0.72, P < 0.01) compared to low PA. When the association between PA and all-cause mortality was analyzed as moderate versus low PA and high versus low PA, we found that moderate and high levels of PA decreased all-cause mortality by 39 % (RR 0.61, 95 % CI 0.46–0.81, P < 0.01) and 48 % (RR 0.52, 95 % CI 0.43–0.64, P < 0.01), respectively. There was some evidence of heterogeneity among studies for moderate-high versus low PA (P = 0.01, I² = 76.2 %), and for moderate versus low PA (P = 0.01, I² = 74.0 %).
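The percentage reductions quoted above follow directly from the pooled RRs as (1 − RR) × 100. A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
def percent_reduction(rr):
    """Relative reduction in mortality (%) implied by a relative risk below 1."""
    return (1.0 - rr) * 100.0

if __name__ == "__main__":
    # Pooled postdiagnosis RRs for all-cause mortality reported in the text.
    for label, rr in [("moderate-high vs. low", 0.57),
                      ("moderate vs. low", 0.61),
                      ("high vs. low", 0.52)]:
        print(label, round(percent_reduction(rr)))
```

Applying the same conversion to the CI bounds gives the range of reductions compatible with the data; note that the upper RR bound corresponds to the smaller reduction.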

In the leave-one-out sensitivity analysis, none of the results above was materially altered (data not shown). We found no evidence of publication bias in any analysis using Begg’s and Egger’s tests (P ≥ 0.14).

Associations between PA and mortality according to BMI

Figure 4 shows risk estimates for moderate-high versus low PA according to BMI. PA prior to diagnosis reduced breast cancer-specific and all-cause mortality for patients with BMI ≥ 25 kg/m² (RR 0.63, 95 % CI 0.49–0.81, P < 0.01; and RR 0.80, 95 % CI 0.69–0.94, P < 0.01, respectively); however, it had no significant effect on those with BMI < 25 kg/m².

Fig. 4 Relative risks for the association between pre- and post-diagnostic physical activity and breast cancer-specific and all-cause mortality according to BMI

Postdiagnosis PA reduced breast cancer-specific mortality (RR 0.73, 95 % CI 0.61–0.86, P < 0.01 for BMI < 25 kg/m²; and RR 0.74, 95 % CI 0.65–0.83, P < 0.01 for BMI ≥ 25 kg/m²) and all-cause mortality (RR 0.68, 95 % CI 0.55–0.83, P < 0.01 for BMI < 25 kg/m²; and RR 0.60, 95 % CI 0.39–0.94, P = 0.02 for BMI ≥ 25 kg/m²) for breast cancer patients regardless of BMI.

Associations between PA and mortality according to menopausal status

Risk estimates for moderate-high versus low PA according to menopausal status are shown in Fig. 5. Prediagnosis PA reduced all-cause mortality for postmenopausal women (RR 0.77, 95 % CI 0.60–1.00, P < 0.05) but had no effect on premenopausal women.

Fig. 5 Relative risks for the association between pre- and post-diagnostic physical activity and breast cancer-specific and all-cause mortality according to menopausal status

Postdiagnosis PA reduced breast cancer-specific and all-cause mortality for postmenopausal patients (RR 0.70, 95 % CI 0.60–0.81, P < 0.01; and RR 0.66, 95 % CI 0.56–0.78, P < 0.01, respectively) but not for premenopausal patients.

Dose–response meta-analysis

We assessed the dose–response relationship between PA and mortality in breast cancer with seven studies [11, 13, 17, 20, 22, 32, 33]. A statistically significant departure from linearity was found for the relationship between postdiagnosis PA and breast cancer-specific or all-cause mortality (P = 0.01, Fig. 6b; and P < 0.01, Fig. 6d, respectively), but not for the relationship between prediagnosis PA and breast cancer-specific or all-cause mortality (P = 0.07, Fig. 6a; and P = 0.10, Fig. 6c, respectively). A 3 MET-h/week increment in prediagnosis PA conferred a RR of 0.95 (95 % CI 0.92–0.97) for breast cancer-specific mortality, and 0.90 (95 % CI 0.86–0.94) for all-cause mortality. Regarding postdiagnosis PA, each 1 MET-h/week increment between 0 and 5 MET-h/week was associated with a 6 % lower breast cancer-specific or all-cause mortality, a steeper decline in RR than that seen at PA levels above 5 MET-h/week (Fig. 6b, d).
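To make the per-increment estimates for prediagnosis PA easier to read, a RR reported for one increment can be rescaled to another increment if the association is assumed to be log-linear. That assumption is only an approximation here (the departures from linearity for prediagnosis PA were not statistically significant), and the helper below is a hypothetical illustration rather than part of the original analysis.

```python
def rescale_rr(rr, from_increment, to_increment):
    """Rescale a per-increment RR under a log-linear dose-response assumption.

    log(RR) is taken to be proportional to the dose, so the RR for a different
    increment is the original RR raised to the ratio of the two increments.
    """
    return rr ** (to_increment / from_increment)

if __name__ == "__main__":
    # RR of 0.95 per 3 MET-h/week (breast cancer-specific mortality) rescaled
    # to a 9 MET-h/week increment, i.e. 0.95 cubed.
    print(rescale_rr(0.95, 3, 9))
    # RR of 0.90 per 3 MET-h/week (all-cause mortality), same rescaling.
    print(rescale_rr(0.90, 3, 9))
```

The same rescaling should not be applied to the postdiagnosis estimates, for which the relationship was significantly nonlinear.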

Fig. 6 The dose–response analysis with restricted cubic splines in a multivariate random-effects dose–response model for the relationships between (a) prediagnostic physical activity and breast cancer-specific mortality; (b) postdiagnostic physical activity and breast cancer-specific mortality; (c) prediagnostic physical activity and all-cause mortality; and (d) postdiagnostic physical activity and all-cause mortality. The solid line and the long-dash lines represent the estimated RR and its 95 % CI. The short-dash line represents the linear relationship

Discussion

This meta-analysis investigated the association between PA and breast cancer survival outcomes, involving 27,805 patients for prediagnosis PA and 23,360 patients for postdiagnosis PA. The summary results, derived from sixteen cohort studies, indicated that both prediagnosis and postdiagnosis PA were associated with reduced breast cancer-specific mortality and all-cause mortality, with a slightly more beneficial effect for postdiagnosis PA. The previous meta-analysis conducted by Ibrahim et al. indicated that postdiagnosis PA sharply reduced breast cancer-specific mortality and all-cause mortality. However, Ibrahim et al. failed to find a relationship between prediagnosis PA and breast cancer-specific mortality, and found only a borderline inverse association between prediagnosis PA and all-cause mortality. The limited number of studies, with small sample sizes, may have been underpowered to detect the association, which may explain why their results differ from ours.

The effect of PA within different subgroups of the population defined by BMI (<25 vs. ≥25 kg/m²) was examined in eleven different studies. The results showed that prediagnosis PA was more beneficial for overweight women. There is plenty of evidence that overweight and obesity at the time of diagnosis are associated with a worse prognosis in breast cancer survivors [34]. A recent large prospective study by Etemadi et al. [35] also reported an increased mortality rate among obese adolescents and young adults, especially the cancer mortality rate for obese adolescents. The relationship between obesity and cancer may be mediated through insulin resistance [36], which is thought to influence the risk of breast cancer recurrence and mortality [37]. The effect of PA on the reduction in weight and, subsequently, on insulin levels might explain why prediagnosis PA was more beneficial for overweight women.

In the subgroup analyses by menopausal status, only postmenopausal women experienced a benefit from PA, especially postdiagnosis PA (Fig. 5). Aging may be one of the factors contributing to the different effect of PA on pre- and postmenopausal breast cancer patients [37]. It is well known that aging is associated with declines in physical and cognitive functioning. Several mechanisms have been demonstrated for the positive effects of PA in older people, including improved muscle strength and gait speed, fewer falls, improved balance and bone mineral density, and better mental health [37].

There was not much evidence of heterogeneity among studies. However, heterogeneity cannot be ruled out, since PA assessment methods varied across the included studies. PA is a complex behavior that has many inter-related components, such as energy intake and body size [15]. It is difficult to examine the effect of PA independently of the other factors. Therefore, the measurement of PA is methodologically challenging. Furthermore, PA is difficult to measure accurately since the type (i.e., occupational, household, recreational), dose (i.e., frequency, intensity, and duration), and timing in life all need to be considered [4]. Misclassification of exposure might also have arisen, because PA was assessed primarily by self-report, which is limited by the respondent’s ability to recall and quantify PA. Nonetheless, most included studies used a reliable and valid instrument to assess PA and a more precise scale to measure the levels of PA. It seems that the PA assessment methods used in all included studies should have been adequate to distinguish the more active subjects from the least active subjects. In addition, most studies provided information on PA history or reassessed the levels of PA during the course of follow-up (Table S1). Furthermore, because differing PA categories complicate the interpretation of results pooled across study populations, we also performed dose–response analyses with studies measuring levels of PA in MET-h/week. Therefore, our results, based on the sixteen cohort studies, seem to be robust.

There is convincing evidence that PA may significantly affect breast cancer outcomes. The biological mechanisms underlying the relationship between PA and breast cancer are not completely understood. Several mechanisms have been postulated to explain the inverse association between PA and mortality in breast cancer patients. One of the potential mechanisms is the effect of PA on insulin resistance. PA has been shown to reduce insulin resistance and lower fasting insulin levels, through which breast cancer prognosis may be mediated [38]. Another potential mechanism involves PA-associated reduction in inflammation [39, 40]. Evidence also suggests that inflammation may up-regulate aromatase, which could result in higher production of estrogens both in the breast tissue and in circulation [41]. In addition, increased PA could lower endogenous estrogens [42, 43]. Increases in estrogens and inflammation are associated with increased breast cancer risk and poor prognosis [40, 44]. Taken together with our results, these findings suggest that PA intervenes in breast cancer development.

The potential limitations of our study should be considered when interpreting the results. First, our results are likely to be affected by some misclassification of PA exposure levels. In addition, several studies assessed PA at only one point in a person’s life, so the measure of exposure may not adequately reflect the person’s true PA exposure. Second, although many of the studies had adjusted for important risk factors, unmeasured factors related to PA may also have influenced the results of individual studies. Third, the studies included in this meta-analysis were mainly conducted in Western countries, so the results should be extrapolated to other populations with caution.

In conclusion, our data, based on the findings of sixteen cohort studies, suggest that PA, whether prediagnosis or postdiagnosis, is associated with a better prognosis of breast cancer. Future research should examine the role of PA in patients with breast cancer in randomized controlled trials with larger sample sizes, well-controlled confounding factors, sufficiently long follow-up, and more accurate assessment of PA exposure levels.