Introduction

Almost two decades ago, Wetzels et al. [1] used immunohistochemical markers to identify a subset of breast tumors that exhibited a “basal cell phenotype,” in that the tumors expressed cytokeratins normally found only in the cell layer lying closest to the basement membrane of the mammary gland epithelium. Perou and colleagues [2–4] further characterized “basal-like” breast cancer as one of five principal subtypes identified in an unsupervised gene expression analysis of breast tumors. The “intrinsic” subtypes consisted of estrogen receptor (ER)-positive (luminal) tumors, two separate groups of ER-negative tumors [basal-like and human epidermal growth factor receptor 2 (HER2)-positive] and a group with a pattern resembling normal breast [2, 3]. Luminal tumors stained for cytokeratins normally expressed in the upper, more differentiated breast epithelial layer (i.e., keratins 8/18), while basal-like tumors expressed cytokeratin 5/6. Luminal tumors were further subdivided into luminal A (ER-positive, HER2-negative) and luminal B (ER-positive, HER2-positive). The “intrinsic” subtypes have been reproduced across a variety of microarray platforms [5, 6] and validated in numerous patient datasets from around the world [7, 8]. The “intrinsic” classification system showed significant agreement in predicting clinical outcomes when compared with three other gene-expression based classification schemes, suggesting that these profiling methods identify distinct, stable biologic properties of breast tumors [9].

To investigate the prevalence of “intrinsic” subtypes in large, population-based datasets where fresh tumor tissue was not available, immunohistochemistry (IHC) surrogate markers were developed that could be applied to formalin-fixed, paraffin-embedded tumor blocks [10]. We applied these IHC markers to tumor blocks collected as part of the Carolina Breast Cancer Study (CBCS), a population-based, case–control study conducted among African-American and white women in North Carolina. The “intrinsic” subtypes were observed in invasive [11] as well as in situ [12] breast cancer. The presence of the basal-like subtype in in situ breast cancer suggests that this phenotype is established early in breast carcinogenesis, and could therefore reflect a distinct pathway for disease etiology. In the CBCS, the prevalence of basal-like breast cancer was highest among premenopausal African-American women, while luminal A was most common among postmenopausal white women [11].

In the present analysis, we used exposure information collected from the CBCS to identify risk factors for the five breast cancer subtypes, with an emphasis on comparing two of the most distinct subtypes, namely luminal A and basal-like. We estimated the prevalence of risk factors for basal-like breast cancer among controls in the CBCS dataset, and estimated population attributable fractions that may be useful for prioritizing interventions to reduce the incidence of basal-like breast cancer, particularly among younger African-American women.

Methods

Study design and sampling

The CBCS is a population-based, case–control study conducted in 24 counties of North Carolina that combines molecular biology and population-based epidemiology to understand the causes of breast cancer [13]. Cases were identified from the North Carolina Central Cancer Registry, and controls were identified using Drivers’ License and Medicare beneficiary lists [14]. Participants provided informed consent using documents approved by the Institutional Review Board at the University of North Carolina School of Medicine. Women with invasive breast cancer and population controls were enrolled during Phase 1 (1993–1996) and Phase 2 (1996–2001). Randomized recruitment was used to oversample younger and African-American cases so that sample sizes would be sufficient for separate analyses [15]. Women with carcinoma in situ (CIS) and population controls were enrolled only during the latter time period (1996–2001). All cases of CIS [including ductal carcinoma in situ (DCIS), DCIS with microinvasion to a depth of 2 mm, and lobular carcinoma in situ (LCIS)] were eligible. Controls were frequency matched to cases by age and race using randomized recruitment [15]. Participants ranged in age from 20 to 74 years. Contact, cooperation, and overall response rates have previously been published for each phase of the study [16]. The portion of the CBCS designed to evaluate invasive breast cancer included 1,803 cases (787 African-American, 1,016 white) and 1,564 controls (718 African-American, 846 white), with overall response rates of 76% for cases and 55% for controls. The portion of the CBCS that evaluated CIS comprised 508 cases (107 African-American, 401 white) and 458 controls (70 African-American, 388 white), with overall response rates of 83% for cases and 65% for controls.

In-person interviews and body size measurements

In-person interviews were conducted for cases and controls by trained nurses. Participants were asked for detailed information about family history of cancer and reproductive history, including age at onset of regular menstruation, age at first full-term pregnancy (AFFTP) and number of children, breastfeeding and onset of menopause. Women were asked to compare their weight to their peers during fifth grade, and to provide information on recreational physical activity, household or farm chores, and walking or biking to school at age 12, and frequency of recreational physical activity as an adult. Additional information was obtained on environmental exposures (smoking, alcohol use), hormone use (oral contraceptives; hormone replacement therapy, HRT), and socioeconomic status (income, education, occupational history) [17–20]. Participants were also asked about prior medical conditions, including diabetes mellitus.

Measurements were taken of waist circumference, hip circumference and body weight at the time of interview.

Tumor blocks and immunohistochemistry assays

Women with invasive and in situ breast cancer were asked for permission to obtain relevant medical records and pathology reports (to confirm eligibility) and access to tumor blocks (for centralized review, sectioning and immunohistochemistry assays) [21]. The distribution of breast cancer “intrinsic” subtypes was previously published for 496 cases from Phase 1 of the invasive portion [11] and 245 cases from the CIS portion [12]. For the present analysis, we added data from an additional 653 cases of invasive breast cancer from Phase 2 and 30 cases from the CIS study. The additional CIS cases included three women with DCIS, 17 with DCIS with microinvasion and 10 with LCIS that were not included in the previous analysis [12]. In total, this article therefore presents data from 1,424 cases (1,149 invasive and 275 in situ) with sufficient tissue for IHC analysis, comprising 62% (1,424/2,311) of enrolled CBCS cases. A comparison of cases with IHC marker data to those without yielded no statistically significant differences for age, menopausal status, family history of breast cancer, or other covariates, with the following exceptions: African Americans and patients with later stage at diagnosis were more highly represented in the IHC marker dataset compared to cases without marker data. African-American women in the CBCS tended to be diagnosed with later stage tumors than white women, and tumors with adequate tissue for IHC assays tended to be slightly larger than tumors with insufficient tissue [11].

Tumor blocks were sectioned and stained for a panel of IHC markers at the Immunohistochemistry Core Laboratory, University of North Carolina (UNC). For invasive cases, ER and progesterone receptor (PR) status were abstracted from medical records for 80% of cases and determined using IHC assays performed at UNC for the remaining cases [22]. For in situ cases, ER status was determined using IHC. For all cases, IHC assays for HER2, HER1 (EGFR), and CK5/6 were conducted using assay procedures and cutpoints for positivity as previously described [11, 12]. Subtype definitions for invasive cases were based upon five IHC markers: luminal A (ER+ and/or PR+, HER2−), luminal B (ER+ and/or PR+, HER2+), basal-like (ER−, PR−, HER2−, HER1+ and/or CK5/6+), HER2+/ER− (ER−, PR−, HER2+) and unclassified (negative for all five markers). For in situ disease, four IHC markers were used: luminal A (ER+, HER2−), luminal B (ER+, HER2+), basal-like (ER−, HER2−, HER1+ and/or CK5/6+), HER2+/ER− (ER−, HER2+) and unclassified (negative for all four markers). PR status was not determined for in situ cases in order to preserve tissue sections. For in situ breast cancer, PR+ tumors are almost always ER+. In one recent study of DCIS, ER and PR status were strongly correlated (P < 0.001), and fewer than 1% of tumors were ER− and PR+ [23].
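The subtype definitions above amount to a small decision rule. The following Python sketch is an illustration only, not the code used in the CBCS; the function names and the use of Boolean marker values are assumptions introduced here, and the assay cutpoints that determine positivity are not modeled.

```python
def classify_invasive(er: bool, pr: bool, her2: bool, her1: bool, ck56: bool) -> str:
    """Assign an IHC surrogate subtype to an invasive case from five markers.

    Hypothetical helper: each argument is True when the marker was scored
    positive under the cutpoints described in the text.
    """
    if er or pr:
        # Hormone receptor positive: luminal A or B depending on HER2.
        return "luminal B" if her2 else "luminal A"
    if her2:
        # ER-, PR-, HER2+
        return "HER2+/ER-"
    if her1 or ck56:
        # ER-, PR-, HER2-, and HER1+ and/or CK5/6+
        return "basal-like"
    # Negative for all five markers.
    return "unclassified"


def classify_in_situ(er: bool, her2: bool, her1: bool, ck56: bool) -> str:
    """Four-marker version for in situ cases, where PR was not measured."""
    return classify_invasive(er, False, her2, her1, ck56)
```

Under this rule a hormone receptor-positive tumor is always luminal, regardless of its HER1 or CK5/6 staining, which matches the definitions as stated.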

Statistical analysis

Race was categorized based upon self-report as African-American or white. The latter category included fewer than 2% of participants who listed their race as Native American, Asian, mixed or other race, while the remainder classified themselves as white. Menopausal status was determined using information from the interview. Women younger than 50 years were classified as postmenopausal if they had undergone natural menopause, bilateral oophorectomy, or irradiation to the ovaries, otherwise they were classified as premenopausal. For women aged 50 or older, menopausal status was assigned based upon cessation of menstruation.

Body mass index (BMI) was calculated as body weight (kg)/height (m)² and used as a measure of general adiposity. Categories for BMI were based upon National Heart, Lung, and Blood Institute (NHLBI) cutpoints (<25 normal or underweight, 25–29 overweight, ≥30 obese) [24]. Waist-hip ratio (WHR) was calculated as the ratio of waist to hip circumference (cm) and used as a measure of abdominal adiposity. Cutpoints for WHR were tertiles based upon the distribution in controls. Other covariates were defined as previously reported [14, 17–20]. Briefly, women who had smoked at least 100 cigarettes in their lifetime, consumed any alcoholic beverages, or used oral contraceptives or HRT at any time were classified as “ever users.” Breastfeeding was categorized according to the total lifetime number of months of breastfeeding, the number of children breastfed, months of breastfeeding per child, and use of medications to suppress lactation. The average number of children breastfed and months breastfeeding per child were calculated for each woman based upon information obtained for each live birth. Women were also asked about lactation failure or other problems with breastfeeding.
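The body size variables can be sketched as follows. This is an illustrative Python fragment under stated assumptions: the function names are hypothetical, BMI values from 25 up to but not including 30 are treated as overweight, a WHR equal to a cutpoint is placed in the higher tertile, and the control values in the usage example are invented rather than taken from the CBCS.

```python
import bisect
import statistics


def body_mass_index(weight_kg: float, height_m: float) -> float:
    """BMI as weight in kilograms divided by height in meters squared."""
    return weight_kg / height_m ** 2


def bmi_category(bmi: float) -> str:
    """NHLBI-style categories (assumes 25 <= BMI < 30 is overweight)."""
    if bmi < 25:
        return "normal or underweight"
    if bmi < 30:
        return "overweight"
    return "obese"


def whr_tertile_cutpoints(control_whr: list) -> list:
    """Two cutpoints splitting the control WHR distribution into thirds."""
    return statistics.quantiles(control_whr, n=3)


def whr_tertile(whr: float, cutpoints: list) -> int:
    """Return 1, 2 or 3; a value equal to a cutpoint goes to the higher tertile."""
    return bisect.bisect_right(cutpoints, whr) + 1


# Invented control values, for illustration only.
example_controls = [0.70, 0.72, 0.75, 0.78, 0.80, 0.83, 0.85, 0.88, 0.90]
example_cutpoints = whr_tertile_cutpoints(example_controls)
```

Deriving the cutpoints from controls only, and then applying them to cases and controls alike, keeps the exposure categories anchored to the source population rather than to the diseased group.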

The prevalence of breast cancer subtypes (among cases) and participant characteristics and risk factors (cases and controls) were adjusted for the sampling probabilities used to select eligible participants, as implemented in SUDAAN version 9.0.1 (Research Triangle Institute, Research Triangle Park, NC). Distributions across categories were compared using adjusted chi-square tests.

Unconditional logistic regression was used to calculate odds ratios (ORs) as a measure of association, as implemented in SAS version 8.2 (SAS Institute, Cary, NC). Odds ratios were calculated among cases only using luminal A, the most common subtype, as the comparison group. Case-only analyses using disease subtypes are a useful exploratory tool to uncover etiologic heterogeneity [25]. In the present application, the case-only OR estimates the relative strength of association between a risk factor and a given disease subtype (basal-like, luminal B, HER2+/ER−, or unclassified) versus the same exposure and luminal A (ratio of ORs). Case-only ORs were adjusted for age and/or race, and supplemental analyses were conducted adjusting for American Joint Committee on Cancer (AJCC) stage at diagnosis (stage 0 or in situ, 1, 2, 3 + 4).
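The interpretation of the case-only OR as a ratio of ORs can be checked with crude 2 × 2 counts. The Python sketch below uses invented counts, not CBCS data, and an unadjusted OR in place of the adjusted logistic regression estimates reported in this article; the function name is hypothetical.

```python
def odds_ratio(exposed_a: int, unexposed_a: int, exposed_b: int, unexposed_b: int) -> float:
    """Crude odds ratio for exposure, comparing group A with group B."""
    return (exposed_a * unexposed_b) / (unexposed_a * exposed_b)


# Invented (exposed, unexposed) counts, for illustration only.
basal_like = (60, 40)
luminal_a = (150, 250)
controls = (400, 600)

# Case-only OR: basal-like cases versus luminal A cases.
case_only_or = odds_ratio(*basal_like, *luminal_a)

# Case-control ORs, each subtype against the same control group.
or_basal_like = odds_ratio(*basal_like, *controls)
or_luminal_a = odds_ratio(*luminal_a, *controls)

# With a shared control group, the control odds cancel in the ratio.
ratio_of_ors = or_basal_like / or_luminal_a
```

Here the case-only OR and the ratio of the two case–control ORs are both 2.5, because the exposure odds among controls appear in the numerator and the denominator of the ratio and cancel.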

Odds ratios comparing cases and controls were calculated to further investigate the etiology of the five subtypes (as estimates of risk ratios), with each subtype separately compared to all controls (N = 2,022). Potential confounders were selected based upon prior knowledge, directed acyclic graphs (DAGs) [26, 27] and by selecting variables that resulted in a 10% or greater change in the beta estimate for the exposures of interest. Prior knowledge dictated that ORs for waist circumference and WHR be adjusted for BMI [28, 29]. Odds ratios for BMI and WHR were also calculated after stratifying on menopausal status, and postmenopausal women were further stratified based upon use of HRT [28]. DAGs dictated that we not adjust parity and lactation ORs for WHR or BMI, since the latter variables could lie on a causal pathway between the exposures of interest and breast cancer. The list of exposures of interest and potential confounders included family history, reproductive history, measures of body size, weight gain, physical activity, environmental exposures, hormone use, and socioeconomic status (education and family income).

When the analysis was restricted to parous women, ORs for breastfeeding variables were attenuated slightly and estimates were less precise; therefore, results are presented using the more stable referent category of nulliparous women. Odds ratios for lifetime duration of breastfeeding used a cutpoint of 4 months since no additional protective effects were observed for longer duration. Odds ratios for breastfeeding variables did not differ among women who reported having trouble breastfeeding or being unable to lactate. Odds ratios for reproductive and breastfeeding variables were similar (although less precise) after stratifying on race and menopausal status, and ORs did not differ substantially when CIS cases and controls were removed from the analysis.

To evaluate multiplicative interaction, likelihood ratio tests (LRTs) were used to calculate P-values comparing models with main effects to models with main effects plus relevant interaction term/s. Likelihood ratio tests were not significant for the interaction of the exposures of interest and race or menopausal status. In particular, for basal-like breast cancer, LRTs yielded non-significant results for the interaction of parity and race (P = 0.22), parity and menopausal status (P = 0.46), parity/breastfeeding composite variable and race (P = 0.32), and parity/breastfeeding and menopausal status (P = 0.41). Therefore, results are presented combining African-American and white women, and pre- and postmenopausal women. For BMI and WHR, ORs were similar after stratifying on race and LRTs were not significant.
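For readers unfamiliar with the likelihood ratio test, the calculation can be sketched for the simplest case of a single interaction term. The Python fragment below is an illustration under stated assumptions: it covers only one degree of freedom (interactions involving multi-category variables would need more), the function name is hypothetical, and the log-likelihoods in the usage example are invented.

```python
import math


def lrt_one_df(loglik_main_effects: float, loglik_with_interaction: float):
    """Likelihood ratio test for one added interaction term (1 df only).

    Returns the test statistic and its P-value. For a chi-square variable
    with 1 df, the upper tail probability is erfc(sqrt(x / 2)).
    """
    statistic = 2.0 * (loglik_with_interaction - loglik_main_effects)
    statistic = max(statistic, 0.0)  # guard against tiny negative rounding error
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Invented maximized log-likelihoods, for illustration only.
example_statistic, example_p = lrt_one_df(-100.0, -98.0)
```

A statistic of 4.0 on one degree of freedom corresponds to a P-value of roughly 0.046, just under the two-sided alpha level of 0.05 used for the tests in this article.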

Tests for trend were conducted by calculating P-values for the beta coefficient in logistic regression models with exposure coded as an ordinal variable. All statistical tests were two-sided with an alpha level of 0.05.

Population attributable fractions (PAFs) for basal-like breast cancer were estimated using the method of Bruzzi et al. [30]. PAFs combine information on the relative risk (estimated in the present study by the OR) and prevalence for a given exposure or group of exposures in the dataset of interest. The 95% CIs for PAFs were calculated using the bootstrap method described by Rockhill et al. [31]. Briefly, 1,000 random samples, with replacement, stratified on case–control status were repeatedly drawn from the original dataset. PAFs were calculated for each random sample, resulting in 1,000 PAFs, and the 2.5th and 97.5th percentiles of the frequency distribution served as the approximate 95% CI for the original PAF estimate.
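The logic of the PAF and its bootstrap interval can be sketched for the simplest situation of one binary exposure. The Python fragment below is an illustration only: it uses a crude OR rather than the adjusted ORs and combined exposures analyzed in this article, the function names are hypothetical, the counts are invented, and no guard is included for resamples with an empty cell because the example counts make that very unlikely. For a single binary exposure, the Bruzzi formula reduces to the proportion of cases exposed multiplied by (1 − 1/OR).

```python
import random


def crude_odds_ratio(cases: list, controls: list) -> float:
    """Crude OR from lists of 0/1 exposure indicators."""
    a = sum(cases)
    b = len(cases) - a
    c = sum(controls)
    d = len(controls) - c
    return (a * d) / (b * c)


def paf_binary(cases: list, controls: list) -> float:
    """PAF for one binary exposure: P(exposed | case) * (1 - 1 / OR)."""
    proportion_cases_exposed = sum(cases) / len(cases)
    return proportion_cases_exposed * (1.0 - 1.0 / crude_odds_ratio(cases, controls))


def bootstrap_paf_ci(cases: list, controls: list, n_boot: int = 1000, seed: int = 0):
    """Approximate 95% CI from percentiles of bootstrap PAFs.

    Cases and controls are resampled separately, with replacement,
    which keeps the resampling stratified on case-control status.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        boot_cases = rng.choices(cases, k=len(cases))
        boot_controls = rng.choices(controls, k=len(controls))
        estimates.append(paf_binary(boot_cases, boot_controls))
    estimates.sort()
    lower = estimates[int(0.025 * n_boot)]
    upper = estimates[int(0.975 * n_boot) - 1]
    return lower, upper


# Invented data: 120 of 200 cases and 400 of 1,000 controls exposed.
example_cases = [1] * 120 + [0] * 80
example_controls = [1] * 400 + [0] * 600
example_paf = paf_binary(example_cases, example_controls)
example_lower, example_upper = bootstrap_paf_ci(example_cases, example_controls)
```

With these invented counts the crude OR is 2.25 and the PAF is one third. Because the lower percentile of the bootstrap distribution can fall below zero when the OR is imprecise, negative lower confidence limits are possible with this approach.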

Results

Distribution of breast cancer subtypes

The distribution of “intrinsic” breast cancer subtypes in the combined CBCS datasets (invasive and in situ) is presented in Table 1. Among the 1,424 cases with IHC marker data, 796 (56%) were classified as luminal A, 225 (16%) were basal-like, 116 (8%) were HER2+/ER−, 137 (10%) were luminal B, and the remaining 150 cases (10%) were unclassified. For in situ tumors, all cases of LCIS were classified as luminal A, while DCIS with microinvasion was divided among all five subtypes, similar to the distributions reported previously for pure DCIS [12]. The distribution of “intrinsic” subtypes differed significantly by race and menopausal status (P < 0.0001) (Table 1). Postmenopausal white women showed the highest prevalence of luminal A, while premenopausal African-American women exhibited the highest prevalence of basal-like breast cancer.

Table 1 Distribution of breast cancer subtypes according to race and menopausal status

Case-only odds ratios

Case-only ORs comparing each subtype to luminal A are presented in Table 2, and were minimally adjusted for age and/or race. Compared to luminal A, basal-like cases tended to be younger, African-American, and have younger age at menarche, higher parity, younger age at first full-term pregnancy, shorter duration breastfeeding and higher BMI and WHR (especially among premenopausal women). HER2+/ER−, luminal B, and unclassified cases also tended to be younger than luminal A cases. HER2+/ER− cases were slightly more likely to be African-American but less likely to be premenopausal. Luminal B cases had older age at first full-term pregnancy, and were more likely to consume alcohol and use HRT, and less likely to be obese or have central distribution of fat. Unclassified cases were more likely to be African-American, and had younger age at menarche and increased parity compared with luminal A. There were no significant interactions between race and menopausal status for each of the four subtypes compared to luminal A. Odds ratios did not differ after adjustment for stage at diagnosis (data not shown).

Table 2 Case-only ORs comparing basal-like, HER2+/ER−, luminal B and unclassified to luminal A breast cancer

Case–control odds ratios

Odds ratios for luminal A cases versus controls, and basal-like cases versus controls, are presented in Table 3. Younger age at menarche was positively associated with basal-like, but not luminal A, breast cancer. Parity, regardless of the number of live births, and younger AFFTP (before 26 years) showed inverse associations with luminal A breast cancer. In contrast, significant, positive increases in risk of basal-like breast cancer were observed with increasing number of live births and younger age at first full-term pregnancy. Inverse associations were observed for breastfeeding and basal-like breast cancer, with significant trends for lifetime duration of lactation, number of children breastfed, and average number of months breastfeeding per child. Use of lactation suppressants was positively associated with basal-like but not luminal A breast cancer.

Table 3 Case–control odds ratios comparing luminal A cases versus controls and basal-like cases versus controls

The composite variable “parity and lactation” exhibited a strong positive association for basal-like breast cancer among women who had 1–2 children and never breastfed, and a slightly stronger association for women with 3 or more children who never breastfed (Table 3). The composite variable “parity and AFFTP” showed stronger positive associations with basal-like breast cancer for parous women with AFFTP <26 than women with AFFTP of 26 or greater. In contrast, inverse associations with luminal A were observed for both composite variables. A composite variable that included parity, AFFTP and breastfeeding demonstrated that higher parity and lack of breastfeeding were the main contributors to increased risk of basal-like breast cancer, with little additional contribution from younger AFFTP (data not shown). Among parous women, ORs for breastfeeding and AFFTP did not change after mutual adjustment, and there was no evidence for interaction between the two variables.

Additional analyses were conducted for timing of pregnancy and breast cancer subtypes. The average interval between pregnancies did not differ across the five breast cancer subtypes (P = 0.11). The proportion of women with three or more pregnancies and at least one interval between pregnancies of a year or less was 21% for luminal A and 20% for basal-like cases. The proportion of women who were pregnant or diagnosed with breast cancer within 1 year of being pregnant did not differ across the five subtypes (P = 0.14). However, time between last pregnancy and breast cancer diagnosis was longer for luminal A compared to the other case subtypes (P = 0.002), which may be attributable to the fact that luminal A cases were older relative to the other groups.

For BMI, ORs were slightly inverse or close to the null for both luminal A and basal-like breast cancer (Table 3). Among postmenopausal women, increasing tertiles of WHR were positively associated with luminal A; however, WHR showed stronger positive associations with basal-like breast cancer for pre- and postmenopausal women. Among postmenopausal women, results for BMI and WHR were similar after stratification on use of HRT (data not shown).

Among cases and controls in the CBCS, elevated BMI, WHR, and waist circumference were positively associated with history of diabetes mellitus (data not shown). However, the prevalence of diabetes mellitus did not differ across the five breast cancer subtypes (P = 0.59). Women who reported a gain in adiposity since childhood had increased risk of basal-like breast cancer, while women who decreased in adiposity were at reduced risk. Specifically, women with an elevated WHR measured at interview (≥0.77) who reported being thinner than their peers in fifth grade had elevated risk of basal-like breast cancer (adjusted OR = 2.2, 95% CI 1.5–3.4), relative to women with lower WHR who were thinner than their peers in fifth grade. In contrast, women who reported being heavier than their peers in fifth grade and whose current WHR was low exhibited an inverse association with basal-like breast cancer (OR = 0.5, 95% CI 0.2–1.4). The comparable ORs were close to the null for luminal A breast cancer. The proportion of women reporting gains in adiposity since fifth grade was higher among African-American controls (63%) compared to white controls (42%) (P = 0.0002).

Case–control ORs for the luminal B, HER2+/ER− and unclassified subtypes were largely similar to luminal A, with the following exceptions. Luminal B cases showed a stronger positive association with alcohol use than the other subtypes (adjusted case–control OR = 1.7, 95% CI 1.1–2.7), and no association with elevated WHR. Whereas luminal A, basal-like and HER2+ subtypes showed weak inverse associations with postmenopausal HRT, the case–control OR for luminal B was slightly above the null (OR = 1.1, 95% CI 0.7–1.9).

Prevalence of risk factors for basal-like breast cancer

The distribution of risk factors for basal-like breast cancer differed among the four race-menopausal status groups (Table 4). Prevalence estimates are based upon controls, and represent weighted estimates for women residing in the 24-county region of North Carolina sampled by the CBCS. Premenopausal African-American women showed the highest prevalence of menarche before age 13 years and never breastfeeding, and the lowest prevalence of lifetime breastfeeding of 4 months or longer, ≥2 children breastfed and ≥4 months breastfeeding per child.

Table 4 Distributions of selected basal-like risk factors among controls according to race and menopausal status

Even stronger differences between African-American and white women emerged when we subdivided younger women into two age groups, less than age 40 and aged 40 to 49 (Table 5). Younger African-American women had a higher prevalence of each of the principal risk factors for basal-like breast cancer: higher parity, lower breastfeeding, higher parity combined with lower breastfeeding, greater use of lactation suppressants, and elevated WHR. Among parous women, African Americans in each age group reported younger AFFTP, fewer children breastfed, and fewer months breastfeeding per child.

Table 5 Distributions of selected basal-like risk factors in African-American and white controls under age 40 and aged 40–49

Population attributable fractions

Population attributable fractions for basal-like breast cancer were estimated for the two most easily modified risk factors: breastfeeding (never versus ever) and elevated WHR (≥0.77 vs. <0.77). For the entire study population, the PAF was 53% (95% CI 33.3–68.9). Among the four age-race groups, PAFs for basal-like breast cancer were 68% (95% CI 30.0–90.1) for premenopausal African-American women, 57% (−20.5 to 93.1) for postmenopausal African-American women, 37% (−15.1 to 68.4) for premenopausal white women, and 38% (−12.5 to 74.5) for postmenopausal white women. The PAF for a set of risk factors can be interpreted as the proportion of breast cancer that would be eliminated if the entire study population were moved from the exposed to the unexposed level for each of the relevant exposures.

Discussion

In a population-based epidemiologic study of African-American and white women, we observed differing magnitudes of association for several breast cancer risk factors when we subdivided cases according to the “intrinsic” subtypes (luminal A, luminal B, basal-like, HER2+/ER− and unclassified). Exploratory case-case comparisons were most striking for luminal A versus basal-like breast cancer, and analyses comparing cases and controls yielded several potential risk factors for basal-like cancer that differed in magnitude and direction in comparison with luminal A. Parity combined with lack of breastfeeding, early-onset menarche, younger AFFTP, use of lactation suppressants, elevated WHR and gain in adiposity since childhood were positively associated with basal-like breast cancer. Notably, each of these risk factors was more prevalent among younger African-American women, as represented by controls in the CBCS. The results suggest that a large part of the racial difference in the distribution of the “intrinsic” breast cancer subtypes may be attributable to differing distributions of specific risk factors related to reproductive history, breastfeeding, adiposity and weight gain.

In a recent article, Anderson et al. [32] examined incidence rates for breast tumors with poor prognostic features (ER and PR negative, tumor size greater than 2.0 cm, lymph node positive, high grade) compared to tumors with a more favorable prognosis (hormone receptor positive, size 2.0 cm or less, lymph node negative, low grade). Incidence rates were higher for poor prognosis tumors until ages 30–44, followed by a plateau at age 50 and a subsequent reduction, whereas incidence rates for more favorable prognosis tumors were higher in women aged 50 years and continued to rise as women grew older. The authors hypothesized that high- and low-risk breast tumors represent distinct subtypes of breast cancer with separate risk factor profiles and/or cell types of origin. In a similar vein, Bernards and Weinberg [33] cited biologic data to support a theory that breast cancer prognosis is “preordained by the spectrum of mutations that progenitor cells acquire relatively early in tumorigenesis; that is, some cancers start out on the wrong foot” [33: page 823]. Therefore, incidence rates and genetic data together support the idea that poor prognosis breast tumors in younger women have a different underlying etiology than more favorable breast cancers in older women. This hypothesis is especially relevant for younger African-American women, for whom breast cancer incidence remains high compared to white women [34] and mortality from hormone receptor negative, high grade breast cancer is a major public health problem [35–37].

Increased parity and younger AFFTP have been associated with increased risk of breast cancer among younger African-American women in several studies [38–40] including the CBCS [41], but not in others [42] (for review, see Swanson et al. [35]). We observed a statistically significant increase in risk of basal-like breast cancer with increasing number of children, a relationship that was not observed for luminal A breast cancer. The relationship between parity and basal-like breast cancer was not confined to younger women, and basal-like cases were no more likely to be diagnosed following a pregnancy than luminal A cases. Thus, the positive association between parity and basal-like breast cancer was not restricted to the well-documented short-term increase in risk of breast cancer following live birth [41, 43]. Nor did the increase in risk appear to be attributable to younger age at menarche or younger AFFTP, which have also been associated with increased risk of breast cancer in younger African-American women [35]. Rather, the increased risk for basal-like breast cancer with increasing parity appeared to be largely confined to women who did not breastfeed (Table 3). Furthermore, the effects of increased parity and lower breastfeeding, and the contrast between basal-like and luminal A breast cancer, were observed across all four age-race groups. In the case-only analysis comparing basal-like versus luminal A breast cancer, the OR for parity ≥3 and no breastfeeding (adjusted for age and race) was 1.9 (95% CI 1.1–3.4) for all women. In the four patient groups, ORs (adjusted for age) were 2.2 (95% CI 0.7–6.6) for premenopausal African-American women, 1.9 (95% CI 0.6–5.9) for postmenopausal African-American women, 1.8 (95% CI 0.5–7.0) for premenopausal white women, and 1.7 (95% CI 0.5–5.6) for postmenopausal white women.

The Collaborative Group on Hormonal Risk Factors in Breast Cancer [44] determined that breastfeeding exerts a protective effect on overall breast cancer risk beyond that of parity alone. Potential mechanisms include induction of terminal differentiation and/or removal of initiated breast epithelial cells, removal of estrogens via breast fluid, excretion of carcinogenic agents, delay in ovulation, and changes in breast pH [45]. Use of lactation suppressants has also been associated with increased breast cancer risk, although results were not consistent across studies [45]. Several lines of evidence suggest a link between basal-like breast cancer and lack of breastfeeding. Symmans et al. [46] found that over-expression of the basal-like marker, GABApi, was associated with younger age at diagnosis and shorter duration of breastfeeding among Hispanic breast cancer patients. BRCA1, but not BRCA2, mutation carriers show a high prevalence of basal-like breast cancer (for review, see Tischkowitz and Foulkes [47]). In one study, BRCA1 carriers who breastfed for 1 year or longer were less likely to develop breast cancer than mutation carriers who did not breastfeed; no effect of breastfeeding was seen for BRCA2 carriers [48]. As suggested by Tischkowitz and Foulkes [47], full-term pregnancy followed by failure to breastfeed or reduced duration of breastfeeding could result in retention of initiated progenitor cells that would otherwise die or differentiate during lactation, and these retained cells could presumably develop into basal-like breast tumors. Pregnancy confers specific gene expression signatures on breast tissue and may affect the distribution and differentiation of potential breast cancer stem cells [49], but the effects of lactation on gene expression and the differentiation status of mammary epithelial cells are not well understood.

The other strong risk factor for basal-like breast cancer identified in the CBCS was WHR. Elevated WHR was associated with a strong increase in risk of basal-like breast cancer among pre- and postmenopausal women, and a more modest increase for luminal A among postmenopausal women. When the two components of WHR were examined separately, elevated waist circumference showed a strong positive association with basal-like breast cancer among pre- and postmenopausal women, while ORs for hip circumference were slightly inverse (data not shown). Waist circumference and WHR serve as surrogates for abdominal adiposity: waist circumference is correlated with the amount of visceral and subcutaneous fat, while WHR is used as an index of the relative accumulation of abdominal versus gluteal fat [28]. Previous epidemiologic studies have shown a consistent association between elevated central adiposity and increased breast cancer risk in postmenopausal women [50], while results for premenopausal women have been less consistent [28, 29]. Abdominal adiposity is correlated with hyperinsulinemia and insulin resistance among African-American and white women [51, 52], and insulin resistance has been hypothesized to increase breast cancer risk in premenopausal women through increased mitotic activity and enhanced cell proliferation in breast epithelial tissue [28]. There are currently no biologic data linking insulin resistance with basal-like breast cancer, and our data do not support an association between prior history of diabetes mellitus and increased risk of basal-like disease. However, overexpression of the leptin receptor is found in breast tumors with high grade [53], a feature associated with basal-like breast cancer.

Our results combining recalled weight in fifth grade with measured WHR at the time of interview suggest that weight gain and/or gain in abdominal adiposity over a woman’s lifetime may contribute to increased risk of basal-like breast cancer. Previous studies reported a stronger association of weight gain with postmenopausal than with premenopausal breast cancer [54, 55]. Slattery et al. [56] found that weight gain since age 15 and elevated WHR were both associated with increased risk of ER-negative breast cancer. The latter results were presented combining pre- and postmenopausal women, and HER2 status was not included in tumor subtyping.

In addition to Slattery et al. [56], the work of other researchers suggests that risk factors for breast cancer differ depending upon hormone receptor status of the tumor [22, 57–62]. Although differences were slight, the results suggest that traditional risk factors based upon reproductive history are associated with increased risk of hormone-receptor positive disease [63, 64], which is consistent with our findings for the luminal A breast cancer subtype. Other studies stratified cases based upon HER2 positivity, but strong differences were not noted (for review, see Huang et al. [65]).

Previous studies reported a higher frequency of hormone-receptor negative breast cancer and later stage at diagnosis among African-American and other minority women compared with white women in the United States [37, 60, 61]. Recently, researchers at the California Cancer Registry found that breast cancer patients with the “triple negative” (ER−, PR−, HER2−) phenotype were more likely to be under age 40, African-American, or Hispanic [66]. “Triple negative” breast cancer was more frequent among women of lower socioeconomic status. The authors used the “triple negative” phenotype as a partial surrogate for basal-like breast cancer, since IHC data were limited to ER, PR and HER2 status. Individual-level data were not available on breast cancer risk factors, and socioeconomic status was assigned at the census block level using address at the time of diagnosis. In the CBCS, lower socioeconomic status (based upon income and education) was not associated with increased frequency of basal-like breast cancer. However, lower socioeconomic status was strongly associated with several risk factors for basal-like cancer, including lower prevalence of breastfeeding (P < 0.0001) and elevated WHR (P < 0.0001). Future studies are needed to determine whether the increased prevalence of triple negative breast cancer found among Hispanic women in California may be attributable to reproductive history, breastfeeding, central adiposity and other basal-like risk factors.

Only one previous population-based study examined risk factors for breast cancer based upon the joint distribution of ER, PR, HER2, HER1, and CK5/6, the five IHC markers used to identify the “intrinsic” subtypes in the CBCS. Using data collected from a case–control study in Poland, Yang et al. [67] calculated ORs for each of the five breast cancer subtypes versus controls. Results were similar to the CBCS, in that luminal A and basal tumors showed distinct risk factor profiles, with luminal A showing associations typically described for breast cancer as a whole. The authors reported positive associations for younger age at menarche and parity with basal-like cancer, but breastfeeding was not addressed. An inverse association between elevated BMI and luminal A breast cancer was observed among premenopausal women, but no association was seen for basal-like breast cancer, similar to our results. The authors did not examine WHR. Age at menarche and parity were associated with luminal A but not HER2+/ER− breast cancer. In the CBCS, case–control ORs for HER2+/ER− were almost identical to luminal A, with the exception of a slight inverse association for elevated WHR among postmenopausal women. Yang et al. [67] reported a stronger association with family history for basal-like breast cancer compared to the other subtypes. In our study, associations with family history were nearly identical across the five subtypes, with age- and race-adjusted case–control ORs equal to 1.5 (95% CI 1.2–1.9) for luminal A and 1.7 (95% CI 1.1–2.5) for basal-like breast cancer. The only other epidemiologic study to examine risk factors for the “intrinsic” breast cancer subtypes was a population-based case series from Sweden [8] in which the authors subdivided cases based upon gene expression profiling. Current users of HRT were over-represented in the “normal-like” or “unclassified” breast tumor subtype. In the CBCS, the case–control OR for postmenopausal HRT and the unclassified subtype was 1.0 (95% CI 0.6–1.7).
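For readers less familiar with how a case–control OR and its 95% CI are read, the following minimal sketch computes a crude (unadjusted) OR from a 2×2 table using the standard Woolf (log-based) interval. It is illustrative only: the ORs reported for the CBCS are adjusted for age and race, so they would not be reproduced by this crude calculation, and the function name and the counts in the example are hypothetical rather than study data.

```python
import math


def crude_odds_ratio_ci(exposed_cases, unexposed_cases,
                        exposed_controls, unexposed_controls, z=1.96):
    """Return (OR, lower, upper) for a 2x2 case-control table.

    Uses the Woolf method: the CI is built on the log-odds scale with
    standard error sqrt(1/a + 1/b + 1/c + 1/d). All cells must be > 0.
    """
    cells = (exposed_cases, unexposed_cases,
             exposed_controls, unexposed_controls)
    if min(cells) <= 0:
        raise ValueError("all four cell counts must be positive")

    a, b, c, d = cells
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts, chosen only to illustrate the arithmetic.
    or_, lo, hi = crude_odds_ratio_ci(30, 70, 20, 80)
    print(f"OR = {or_:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

An interval that includes 1.0, as for HRT and the unclassified subtype above, is compatible with no association.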

A primary focus of the present analysis was to identify modifiable risk factors that could be targeted to reduce the risk of basal-like breast cancer, particularly among younger African-American women who have the highest incidence of this breast cancer subtype. Mortality rates are higher among younger African-American breast cancer patients, and the disparity in breast cancer outcomes has worsened over time [34]. Since basal-like breast cancer confers a poor prognosis [6, 11], understanding the etiology of this breast cancer subtype is an important public health problem. We estimated that approximately two-thirds (68%) of basal-like breast cancer in younger African-American women (and over half of the disease in the general population) could be prevented by interventions that increase breastfeeding and decrease abdominal adiposity.

There are a number of limitations to PAF estimates, since they are based upon very strong assumptions. First, PAFs estimate the proportion of disease that would be eliminated if the entire population were moved from the exposed to the unexposed level for each of the relevant risk factors, assuming that the exposures in question are causal. One or more of the associations observed in this article could have resulted from recall bias, confounding, or other sources of systematic error. However, it is unlikely that exposure misclassification would be differential by breast cancer subtype, and extensive analyses were conducted to address the possibility of confounding. Analyses of participants with and without IHC marker data, and previous analyses comparing participants and non-participants in the CBCS [68], suggest that selection bias is also unlikely. Data from subsequent population-based studies that utilize “intrinsic” subtypes will provide important information as to whether the associations observed in this article are causal. Second, the aforementioned PAF estimates assume that all women in the population are able to breastfeed children and reduce their WHR below 0.77. Clearly, not all women will have children, and there may be significant barriers to both breastfeeding and reducing abdominal adiposity. Third, the calculations assume independence of breastfeeding and WHR from other risk factors, such that the remaining risk factors for basal-like breast cancer are not changed by modifying the two exposures in question. Finally, PAFs should not be interpreted as the proportion of disease that can be “explained” by any specified group of risk factors. Since PAFs do not necessarily add up to 100%, it is possible that many additional exposures could contribute to the risk of basal-like disease. Despite these limitations, PAF calculations perform an important function for public health in that they provide a framework for greater understanding of disease etiology in populations, and stimulate the public health community to evaluate the feasibility of primary prevention strategies [69].
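The point that PAFs need not add up to 100% can be illustrated with a small sketch. It uses Miettinen's case-based formula, PAF = p_c × (OR − 1) / OR, where p_c is the proportion of cases exposed. This is an assumption made for illustration and is not necessarily the estimator used for the figures reported in this article. The function name, exposure proportions, and ORs below are hypothetical and are not CBCS estimates.

```python
def paf_miettinen(proportion_cases_exposed, odds_ratio):
    """Population attributable fraction via Miettinen's formula.

    PAF = p_c * (OR - 1) / OR, where p_c is the proportion of cases
    exposed and OR approximates the relative risk. The result is only
    meaningful if the exposure is assumed to be causal.
    """
    if not 0.0 <= proportion_cases_exposed <= 1.0:
        raise ValueError("proportion of cases exposed must be in [0, 1]")
    if odds_ratio <= 0:
        raise ValueError("odds ratio must be positive")
    return proportion_cases_exposed * (odds_ratio - 1.0) / odds_ratio


if __name__ == "__main__":
    # Hypothetical risk factors: (proportion of cases exposed, OR).
    factors = {
        "factor A": (0.6, 2.0),
        "factor B": (0.8, 4.0),
        "factor C": (0.5, 3.0),
    }
    pafs = {name: paf_miettinen(p, o) for name, (p, o) in factors.items()}
    for name, paf in pafs.items():
        print(f"{name}: PAF = {paf:.1%}")
    # Single-factor PAFs are not a partition of cases, so their sum is
    # not constrained to 100%; here it exceeds it.
    print(f"sum of single-factor PAFs = {sum(pafs.values()):.1%}")
```

Because a single case can be attributable to more than one exposure, the separately computed fractions overlap, which is why their sum carries no direct interpretation.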

There are several additional limitations to the present analysis. BRCA1 carrier status was determined for only a small sample of women in the CBCS [70]. It is possible that some basal-like cases were BRCA1 mutation carriers, but this number is likely to be very small given the low frequency of BRCA1 carriers in the CBCS [70] and other population-based studies [71]. Another caveat is that IHC surrogates were used to subtype CBCS cases since fresh tumor samples were unavailable to perform gene expression profiling. The IHC surrogates have been validated in another study population, showing excellent agreement with gene expression profiling [10], and they have been utilized in other studies to detect the presence of the five “intrinsic” breast cancer subtypes [67, 72–74]. Although tumor blocks tended to be available from cases with larger tumors, the case-only subtype comparisons did not differ when we adjusted for stage at diagnosis. Sample size was small for many of the subsets of interest, and our results need to be replicated in other population-based studies. Our study was limited largely to African-American and white women, and the epidemiology of basal-like breast cancer among Hispanic women and other minority groups is an important area for future investigation.

Interventions to reduce the risk of basal-like breast cancer have strong prior justification. In a summary of existing data on breast cancer among younger African-American women, Bernstein et al. [43] targeted increasing breastfeeding, losing weight, and increasing physical activity as the most effective ways of reducing disease risk. Our study adds further support for these recommendations. The benefits of breastfeeding for mother and child are well documented [75]. The Centers for Disease Control and Prevention Goals for Healthy People 2010 lists a target of 75% of mothers breastfeeding in the immediate postpartum period, with at least 50% continuing to breastfeed for 6 months [76]. As observed in the CBCS, the prevalence of breastfeeding is reported to be lower among younger African-American women compared to white women [42, 43, 76]. Lack of information about benefits, restrictions surrounding employment, and social pressures limit breastfeeding [75], and maternal obesity decreases initiation as well as continuation of lactation [77]. Teenage mothers may experience particular barriers to breastfeeding. In the CBCS, the proportion of controls who reported having a child before the age of 20 was higher among African-American (45%) compared with white women (23%, P < 0.0001). Thus, the reasons for lower prevalence of breastfeeding among younger African-American women are complex, and interventions to encourage breastfeeding must operate at the level of the community, the workplace, and society at large [78].

Public health interventions aimed at avoiding over-nutrition, promoting a healthy diet, and encouraging physical activity [79] could impact the incidence of basal-like breast cancer, especially programs that target excessive weight gain. Reduction in abdominal adiposity would provide additional benefits, including reduced risk of diabetes mellitus and heart disease [28, 50]. The prevalence of obesity is increasing among pregnant women [80], leading to increased risk of hypertension and perinatal mortality [81]. A variety of barriers at the school and neighborhood level [82] may need to be overcome to promote physical activity among young girls.

Interventions to reduce risk of basal-like breast cancer would take years to have an impact, especially if early stages of carcinogenesis were targeted. Measures to improve survival for patients with basal-like breast cancer will have a more immediate impact. Increased adiposity at the time of diagnosis can confer a worse prognosis for younger breast cancer patients [83], and this poor prognosis may be especially relevant for women with basal-like disease. Timely and effective treatment is vitally important for patients with basal-like breast cancer, and a variety of new drugs are being evaluated in clinical trials [84]. However, African-American women historically suffer from reduced access to quality health care, delays in diagnosis and treatment, and low enrollment in clinical trials, and these disparities need to be addressed more effectively in the future [85–88]. Health care providers need to be aware of the possibility of a breast cancer subtype with distinct etiology and worse prognosis. Unfortunately, clinicians may overlook breast cancer among younger women when patients do not present with a “classic” set of risk factors [35]. Determination of the sensitivity and specificity of screening mammography for basal-like breast cancer would have important implications for detection and diagnosis of breast cancer, particularly in younger women. Finally, risk assessment models for breast cancer may need to be modified to identify women at high risk for the basal-like subtype.

Conclusions

The “intrinsic” breast cancer subtypes, luminal A and basal-like, exhibit distinct risk factors. Basal-like breast cancer is associated with early-onset menarche, younger age at first full-term pregnancy, high parity combined with lack of breastfeeding, and abdominal adiposity. In contrast to recent commentaries suggesting that basal-like breast cancer represents the “exclusive” property of a specific age and racial group by virtue of genetics [89–91], our data show that the basal-like subtype is present in younger white breast cancer patients as well as older African-American and white patients at appreciable frequencies. Furthermore, distributional differences of basal-like breast cancer by age and race appear to be largely attributable to varying distributions of the currently identified risk factors for basal-like breast cancer. Programs aimed at promoting breastfeeding and reducing abdominal adiposity would reduce the number of cases of basal-like breast cancer among all women. Such interventions would be particularly relevant for younger African-American women, among whom the prevalence of risk factors for basal-like breast cancer is high.