Introduction

Breast cancer is the most common cancer and the second leading cause of cancer death among women in the United States [1]. A recent analysis by the National Institutes of Health estimated the national medical cost of breast cancer care at $16.5 billion in 2010 and $20.5 billion in 2020, the largest share of all cancer costs [2]. Preventive strategies are of paramount importance to reduce the huge burden caused by breast cancer.

Dairy consumption has long been thought to play a role in the development of breast cancer. Many epidemiologic studies [3–20] that examined the association between dairy product consumption and risk of breast cancer have produced conflicting results, with both inverse and positive associations reported. A previous pooled analysis by Missmer et al. [21] of primary data from eight prospective cohort studies failed to detect any significant association of dairy product consumption, regardless of type, with incidence of breast cancer. During the past decade, the number of subsequent original studies on this issue has doubled.

Given the newly emerging evidence, we conducted a meta-analysis of prospective cohort studies with the following objectives: (1) to review and summarize the epidemiologic evidence on the relation of dairy consumption with risk of breast cancer; (2) to examine the dairy and breast cancer association according to study characteristics; and (3) to quantify dose–response patterns between dairy intake and risk of breast cancer.

Methods

Search strategy

We attempted to plan, conduct, and report this meta-analysis in accordance with the Meta-Analysis of Observational Studies in Epidemiology guidelines [22]. A PubMed database search through January 2011 was performed to identify relevant studies regarding the association between dairy consumption and risk of breast cancer. We used search terms “dairy products,” “dairy,” or “milk” in combination with “breast cancer” or “breast neoplasms.” No restrictions were imposed. In addition, we reviewed the reference lists of retrieved studies and recent reviews. We did not contact authors of original studies for additional information. No attempt was made to identify unpublished reports.

Study selection

Study selection was based on an initial screen of identified abstracts or titles and a second screen of full-text articles. Studies were considered eligible if they met the following criteria: (1) the study design was a prospective cohort study, which provides stronger evidence than a retrospective design; (2) the main exposure of interest was dairy products; (3) the outcome of interest was breast cancer incidence; and (4) relative risks (RRs) with corresponding 95% confidence intervals (CIs) for the highest versus lowest categories of dairy consumption were reported.

Data extraction and quality assessment

We extracted all data using a standardized data-collection form. Information was recorded as follows: last name of the first author, publication year; study population, period, and location; mean length of follow-up; number of cases and participants; measurement of exposure and outcome; risk estimate from multivariable model for the highest versus the lowest category of dairy intake with corresponding 95% CI; and statistical adjustment for the main confounding factors of interest. In two studies [11, 15], we extracted the RRs with full adjustment for all potential confounding factors but not for calcium and vitamin D, because controlling for calcium and vitamin D may represent over-adjustment for variables on the causal pathway. For three studies [9, 11, 13] with multiple assessments of dairy consumption during follow-up, data for the longest follow-up were extracted.

Instead of providing aggregate scores, we assessed the quality of individual studies by reporting the key components of study designs [22], including characteristics of study populations, assessments of exposure and outcome, duration of follow-up, and statistical control for potential confounding factors. Two authors (J.Y.D. and L.Q.Q.) independently performed the study selection and data extraction. Any disagreements were resolved by discussion.

Statistical analysis

Our main analyses focused on the associations between consumption of total dairy food and milk and risk of breast cancer. Total dairy food was defined as skim/low-fat milk, whole/high-fat milk, yogurt, cottage cheese, butter, and other dairy products. Milk was defined as skim/low-fat milk and whole/high-fat milk. Because certain exposures, such as cheese, yogurt, or butter, were seldom assessed in individual reports, separate analyses of these exposures were not performed.

The RR was used as the common measure of association across studies, and the hazard ratio and incidence rate ratio were directly considered as RR. RRs from individual studies for each category of exposure and the corresponding standard errors, which were derived from CIs or P values, were transformed to their natural logarithms to stabilize the variances and to normalize the distributions.
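As an illustration of this transformation (a minimal sketch, not the code used for the analyses, which were run in STATA), the standard error of ln(RR) can be recovered from a reported 95% CI because the interval is symmetric on the log scale. The function name and the example inputs below are assumptions chosen for illustration.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def log_rr_and_se(rr, ci_low, ci_high, z=Z_95):
    """Return (ln RR, SE of ln RR) from a reported RR and its 95% CI."""
    if not 0 < ci_low <= rr <= ci_high:
        raise ValueError("expected 0 < ci_low <= rr <= ci_high")
    log_rr = math.log(rr)
    # On the log scale the CI is ln(RR) +/- z * SE, so its width is 2 * z * SE.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return log_rr, se


# Illustrative input only: an RR of 0.85 with 95% CI 0.76-0.95.
log_rr, se = log_rr_and_se(0.85, 0.76, 0.95)
```

Deriving a standard error from a P value instead would require the corresponding test statistic and is not shown here.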

Homogeneity of RRs across studies was tested by the Q statistic (significance level at P < 0.10) and the I² statistic, which is a quantitative measure of inconsistency across studies [23]. A random effects model [24] was used to take into account both within-study and between-study variation. We conducted subgroup analyses stratified by geographic region, length of follow-up, fat content of dairy food, and menopausal status at baseline to assess the impacts of these variables on outcomes. We also conducted a sensitivity analysis to investigate the influence of a single study on the overall risk estimate by omitting one study at a time.
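The pooling, heterogeneity, and leave-one-out steps can be sketched as follows. This is an illustration only: it assumes the DerSimonian-Laird moment estimator for the between-study variance, which the text does not name explicitly, and all function names are hypothetical. The P value for Q (a chi-square test with k − 1 degrees of freedom) is omitted to keep the sketch free of external libraries.

```python
import math


def random_effects(log_rrs, ses):
    """Pool ln(RR)s with a DerSimonian-Laird random effects model (assumed)."""
    k = len(log_rrs)
    if k < 2 or k != len(ses):
        raise ValueError("need at least two studies with matching SEs")
    w = [1.0 / s ** 2 for s in ses]  # inverse-variance (fixed effect) weights
    fixed = sum(wi * y for wi, y in zip(w, log_rrs)) / sum(w)
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_rrs))  # Cochran's Q
    df = k - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)  # between-study variance
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0  # I-squared, percent
    w_re = [1.0 / (s ** 2 + tau2) for s in ses]  # random effects weights
    pooled = sum(wi * y for wi, y in zip(w_re, log_rrs)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))
    return {
        "rr": math.exp(pooled),
        "ci": (math.exp(pooled - 1.96 * se_pooled),
               math.exp(pooled + 1.96 * se_pooled)),
        "q": q, "df": df, "tau2": tau2, "i2": i2,
    }


def leave_one_out(log_rrs, ses):
    """Re-pool after omitting each study in turn (needs three or more studies)."""
    return [
        random_effects(log_rrs[:i] + log_rrs[i + 1:], ses[:i] + ses[i + 1:])["rr"]
        for i in range(len(log_rrs))
    ]
```

Both functions take plain Python lists of ln(RR) values and their standard errors, one entry per study.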

We next quantified dose–response relationships of total dairy food and milk consumption with risk of breast cancer based on the method proposed by Greenland and Longnecker [25]. To perform this analysis, dairy intakes were converted from servings or other units into grams per day (g/d) using standard conversions from the Food Standards Agency (1 serving = 200 g; 1 cup = 237 g; 1 glass = 200 g) [26]. This analysis is based on data for each category of average intake, number of cases, person-years at risk, and adjusted logarithm of the RR with its standard error. The average dose was assigned as the mean of the upper and lower bounds in each category. If the upper bound of the highest category was not reported in an individual study, we assumed that the category had the same width as the preceding category when calculating its average intake.
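The dose assignment described above can be sketched as follows. The sketch covers only the unit conversion and the midpoint rule, not the Greenland and Longnecker trend estimation itself; the conversion factors are those quoted in the text, and the function and variable names are hypothetical.

```python
# Conversion factors quoted in the text (grams per unit).
GRAMS_PER_UNIT = {"serving": 200.0, "cup": 237.0, "glass": 200.0}


def to_grams_per_day(amount, unit):
    """Convert a daily intake in servings, cups, or glasses to g/d."""
    return amount * GRAMS_PER_UNIT[unit]


def category_doses(bounds):
    """Assign each intake category its midpoint dose in g/d.

    `bounds` is a list of (lower, upper) tuples in ascending order; use
    upper=None for an open-ended highest category.
    """
    doses = []
    for i, (lower, upper) in enumerate(bounds):
        if upper is None:
            if i == 0:
                raise ValueError("an open-ended category needs a preceding category")
            prev_lower, prev_upper = bounds[i - 1]
            # Borrow the width of the preceding category for the open-ended one.
            upper = lower + (prev_upper - prev_lower)
        doses.append((lower + upper) / 2.0)
    return doses
```

For example, categories of 0–200, 200–400, and more than 400 g/d would be assigned doses of 100, 300, and 500 g/d under this rule.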

Potential publication bias was assessed by both Begg rank correlation test and Egger linear regression test [27, 28]. All analyses were performed using STATA version 11.0 (StataCorp, College Station, TX). P < 0.05 was considered statistically significant, except where otherwise specified. All statistical tests were two-sided.
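For illustration, the Egger test regresses each study's standardized effect (ln RR divided by its standard error) on its precision (the reciprocal of the standard error); an intercept that differs from zero suggests funnel-plot asymmetry. The sketch below is not the STATA routine used for the analyses, and the function name is hypothetical. It returns the intercept and its standard error; their ratio would be compared with a t distribution on k − 2 degrees of freedom.

```python
import math


def egger_intercept(log_rrs, ses):
    """Return (intercept, SE of intercept) from Egger's regression."""
    k = len(log_rrs)
    if k < 3 or k != len(ses):
        raise ValueError("need at least three studies with matching SEs")
    x = [1.0 / s for s in ses]                  # precision
    y = [b / s for b, s in zip(log_rrs, ses)]   # standardized effect
    x_bar, y_bar = sum(x) / k, sum(y) / k
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all standard errors are identical")
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    # Residual variance of the ordinary least squares fit (k - 2 df).
    resid_var = sum(
        (yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y)
    ) / (k - 2)
    se_intercept = math.sqrt(resid_var * (1.0 / k + x_bar ** 2 / sxx))
    return intercept, se_intercept
```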

Results

Literature search

A flow chart showing the study selection is presented in Fig. 1. Briefly, we identified 21 potentially relevant studies for full-text review. Two studies [29, 30] were excluded because they used a retrospective cohort design or nested case–control design. We further excluded one study [31], which was a subset of another main study or had overlapping data. Finally, 18 studies [3–20] were selected for analysis.

Fig. 1 Flow chart of study selection

Study characteristics

The characteristics of the selected studies are presented in Table 1. The 18 prospective cohort studies were published between 1989 and 2010. Nine studies were conducted in the United States, eight in Europe, and one in Japan. Most of the included studies were population-based, whereas three [11, 13, 20] were conducted in nurses. The number of cases diagnosed in the original studies ranged from 29 to 7,119, with a sum of 24,187. The number of participants ranged from 2,215 to 319,826, with a sum of 1,063,471. Three studies were conducted among premenopausal women only, three among postmenopausal women only, and four [11, 15, 16, 18] of the remaining studies presented results by menopausal status. The median length of follow-up ranged from 3.9 to 65 years, with a median of 10 years. Among the 18 studies, 10 reported on total dairy food intake and 12 reported on milk intake. Assessments of dairy intake were not consistent between studies; diet questionnaires and structured food frequency questionnaires were most commonly used. Case ascertainment also differed between studies, with most using medical records and some using self-report, most of which was confirmed by medical records. Adjustment for potential confounding factors differed across studies, and most risk estimates were adjusted for age, body mass index, family history of breast cancer, reproductive factors, hormone replacement therapy, and total energy intake.

Table 1 Characteristics of 18 prospective cohort studies of dairy consumption and breast cancer included in this meta-analysis

Main analysis

The multivariable-adjusted RRs for each study and all studies combined for the highest versus lowest categories of total dairy food and milk consumption in relation to breast cancer risk are shown in Figs. 2 and 3. Results from 10 studies on total dairy food intake were inconsistent, with most showing an inverse relation. The summary RR comparing the highest with the lowest categories of total dairy food consumption was 0.85 (95% CI: 0.76–0.95), with evidence of heterogeneity (P = 0.01, I² = 54.5%). Results from 12 studies on milk intake were also conflicting. For milk intake, the summary RR was 0.90 (95% CI: 0.80–1.02), and substantial heterogeneity was observed (P = 0.003, I² = 59.7%).

Fig. 2 Forest plot of studies examining the association between total dairy food intake and risk of breast cancer

Fig. 3 Forest plot of studies examining the association between milk intake and risk of breast cancer

Subgroup and sensitivity analyses

Table 2 shows the results of subgroup analyses stratified by geographic region, duration of follow-up, fat content of dairy, and menopausal status of participants at baseline. For total dairy food consumption, a significantly inverse relation with breast cancer risk was observed in most subgroups but not in the subgroup with high-fat dairy consumption or the subgroup of postmenopausal women. The association was somewhat stronger for low-fat dairy food intake compared with high-fat dairy food intake and for premenopausal women compared with postmenopausal women. For milk consumption, only low-fat milk intake was statistically significantly associated with a reduced risk of breast cancer (RR = 0.93, 95% CI: 0.88–0.99). Similarly, premenopausal women experienced a somewhat greater, although not significant, risk reduction (RR = 0.79, 95% CI: 0.60–1.02) in relation to milk intake compared with postmenopausal women.

Table 2 Relative risks (RR) of breast cancer in relation to total dairy food and milk consumption according to study characteristics

Sensitivity analyses that omitted one study at a time suggested that the overall risk estimates were not substantially modified by any single study, with a range from 0.82 (95% CI: 0.72–0.94) to 0.88 (95% CI: 0.80–0.97) for total dairy food intake and from 0.88 (95% CI: 0.77–1.01) to 0.94 (95% CI: 0.84–1.05) for milk intake.

Dose–response analysis

As required data were not provided in four studies [4, 8, 19, 20], the dose–response analysis of breast cancer risk finally included eight studies on total dairy food intake and nine studies on milk intake. Overall, an increment of 200 g/d of total dairy food intake was associated with a significant, although slight, risk reduction of 4% (RR = 0.96, 95% CI: 0.94–0.98), whereas an increment of 200 g/d of milk intake was not associated with breast cancer risk (RR = 0.98, 95% CI: 0.95–1.01). No evidence of heterogeneity was observed for either exposure (both P > 0.30).
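To help interpret a per-increment estimate, the following sketch rescales an RR reported per 200 g/d to another increment. It assumes a log-linear dose–response relation (an assumption of the sketch, with a hypothetical function name), so the result is only as reliable as that assumption over the intake range considered.

```python
import math


def rescale_rr(rr, from_increment, to_increment):
    """Rescale a per-increment RR to another increment, assuming log-linearity."""
    slope = math.log(rr) / from_increment  # ln(RR) per g/d
    return math.exp(slope * to_increment)


# An RR of 0.96 per 200 g/d corresponds to 0.96 squared per 400 g/d.
rr_400 = rescale_rr(0.96, 200, 400)
```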

Publication bias

There was no evidence of publication bias with regard to consumption of total dairy food or milk in relation to breast cancer risk, as suggested by Begg rank correlation test and Egger linear regression test (all P > 0.05).

Discussion

Dairy consumption has long been thought to play a role in the development of breast cancer, yet evidence from observational studies is not conclusive. The findings of the present meta-analysis of prospective cohort studies indicated that increased consumption of total dairy food may be associated with a reduced risk of breast cancer. Yet milk consumption was not associated with breast cancer risk. Subgroup analyses based on limited numbers of studies suggested that the associations were somewhat stronger for low-fat dairy intake than for high-fat dairy intake and for premenopausal women than for postmenopausal women.

Several components in dairy products, including vitamin D, calcium, conjugated linoleic acids (CLA), and saturated fatty acids, may be responsible for either a protective or a harmful association between dairy and breast cancer. In vitro studies have suggested that calcium and vitamin D exert anticarcinogenic effects on breast cancer cells [32, 33]. A recent meta-analysis has provided evidence that vitamin D and calcium intakes protect against breast cancer, particularly in premenopausal women [34]. Experimental studies in animals and in vitro have shown protective effects of CLA against carcinogenesis in the mammary gland, potentially by inhibiting the cyclooxygenase-2 or the lipoxygenase pathway or by inducing the expression of apoptotic genes [35]. Yet data from population studies on the association between dietary CLA intake and risk of breast cancer are sparse and conflicting [12, 36–38].

Dietary fat intake has long been hypothesized to increase the incidence of breast cancer. Previous meta-analyses have produced conflicting results regarding the association of dietary fat with breast cancer [39–42]. Different types of fatty acids may, at least partly, contribute to the controversy, as several recent large prospective cohort studies have documented a positive association between saturated fat consumption and breast cancer [43, 44]. Our finding that low-fat, but not high-fat, dairy consumption is associated with a reduced risk of breast cancer is broadly in line with current evidence.

We observed substantial heterogeneity across studies of the associations of total dairy food and milk consumption with breast cancer risk. This is not surprising given the variation in study designs and characteristics of populations between studies. As indicated by our subgroup analyses, menopausal status of participants likely contributed to the observed heterogeneity. In fact, the magnitudes of associations by menopausal status differed within single studies [11, 15, 16]. Data from individual studies [11, 15, 16] consistently suggested that the association between dairy product consumption and risk of breast cancer was stronger in premenopausal women than in postmenopausal women. In subgroup analyses, we observed a significant relation of total dairy intake and a marginally significant relation of milk intake with premenopausal, but not postmenopausal, breast cancer, indicating that menopausal status may serve as a potential effect modifier of the dairy and breast cancer association. Interpreting the difference by menopausal status is challenging. One possible explanation is that a potentially inverse association between dairy product intake and risk of postmenopausal breast cancer might have been obscured by use of hormone replacement therapy. An alternative explanation might be related to the interaction among calcium, vitamin D, and insulin-like growth factors, which may be stronger for premenopausal women than for postmenopausal women and thus lead to greater risk reduction in premenopausal breast cancer [16].

Our study has strengths. With the accumulated evidence and the larger number of studies available to date, we have enhanced statistical power to detect associations and to quantify dose–response relationships between dairy product intake and risk of breast cancer. In addition, all the original studies included in the present meta-analysis used a prospective cohort design, which minimizes the recall, interviewer, and selection biases that are common concerns in retrospective studies.

Several limitations of this study should be considered. First, unmeasured or uncontrolled confounding inherited from the original studies is a concern in this meta-analysis, as consumption of dairy food, especially low-fat dairy, is probably associated with a healthy lifestyle. All risk estimates were derived from multivariable models, but individual studies did not adjust for potential risk factors in a consistent way. With intensive research on risk factors and hence an advanced understanding of their effects on breast cancer, recent studies generally controlled for a more complete set of confounders than earlier ones. For example, smoking, an important risk factor for breast cancer [45], was assessed and controlled for in most studies [12, 13, 15, 16, 18] published in the past decade but seldom in those published before this century. We therefore could not exclude the possibility that inadequate control for confounding factors may have biased the findings.

Second, misclassification bias should be noted. Inevitably, dietary assessments suffer from measurement errors. Within single studies, non-differential misclassification may occur in classifying categories of dairy product consumption. Across studies, differential misclassification may be introduced as dietary assessments were based on different questionnaires and different nutrient databases. For instance, in the definition of total dairy products, some cohort studies included ice cream or butter, while others did not. Non-differential misclassification generally biases associations toward the null, whereas differential misclassification could bias the results in either direction.

Third, there is substantial heterogeneity across studies. The heterogeneity was likely due to the variation in exposure definitions, exposure ranges, dietary assessment methods, and population characteristics between studies. Further, our subgroup analyses indicated menopausal status of participants potentially contributed to the variation in the strengths of associations.

Fourth, we could not rule out the influence of diet change on the risk estimates among women with sub-clinical breast cancer, even though individuals with known preexisting breast cancer were excluded from all original studies. Sensitivity analyses that excluded cases diagnosed within the initial several years of follow-up could help examine this influence and achieve reliable results, yet they were seldom performed in single studies. Furthermore, few studies [9, 11, 13] assessed dairy consumption more than once during the follow-up period, and regression dilution bias may therefore exaggerate or obscure the true associations.

Fifth, as the dose–response analyses were based on a limited number of studies, the results should be treated with caution. In addition, the average dose for the highest category was not reported in several studies, and the estimated value may not reflect the actual intake of dairy food.

Finally, potential publication bias might influence the findings, yet little evidence of publication bias was observed in the present meta-analysis, as indicated by the formal statistical tests [27, 28].

In summary, findings of the present meta-analysis of prospective cohort studies indicate that increased consumption of total dairy products may be associated with a reduced risk of breast cancer. Yet there is insufficient evidence to support a significantly inverse relation between milk consumption and breast cancer risk. As limited evidence suggests that menopausal status may serve as a potential effect modifier, further large prospective studies are warranted to clarify the role of menopausal status in the dairy and breast cancer association.