Introduction

Black women have an approximately 40 % higher risk of breast cancer (BC) mortality post-diagnosis compared with White women [17], a disparity that emerged in the 1980s. By contrast, Asian and Hispanic BC patients generally have lower disease-specific mortality than Whites [1, 4, 8, 9]. Several possible causes of the Black–White BC survival disparity have been investigated, including differences in disease characteristics at presentation such as stage at diagnosis [10–12], treatment or delays in treatment [2, 13–17], socioeconomic status [17, 18], and presence of comorbidities [13]. Some studies suggest that when stage, treatment, and follow-up are similar, Black women have similar disease-free survival to White women [19], though studies generally find that Black women have worse overall survival than White women [4, 19, 20]. Although this may implicate treatment differences, it has also led researchers to hypothesize that the disparity may be due to differences in tumor subtype [21]. Black women, and particularly young Black women, with BC have a lower prevalence of ER- or PR-positive tumors [22], which are associated with better prognosis [4, 22], and a higher prevalence of the more aggressive triple-negative BC (TNBC) or basal-like subtype, which has poorer prognosis.

It is unclear whether variation in tumor subtype helps explain the higher post-diagnosis BC mortality rate among Black women. Previous studies have used surrogate subtype categories defined primarily by immunohistochemical (IHC) detection of protein markers, and some have found higher BC mortality risk for Black women within the luminal A [23–25] and TNBC [25, 26] subtypes, though others have reported few overall or BC-specific survival differences among those with TNBC [24, 27]. Patterns are still emerging, and no studies have evaluated whether differences in subtype defined by gene expression, rather than IHC, help to explain survival disparities.

The PAM50 assay, an RT-qPCR-based subtype classifier that incorporates the expression of 50 genes based on gene expression microarrays [28], has been shown to be more prognostic of distant recurrence or BC death than subtype classification by IHC [29]. We investigated associations of race and survival after BC diagnosis, accounting for PAM50 BC intrinsic subtypes.

Methods

Study population

The molecular subtyping study population included BC survivors from the Life After Cancer Epidemiology (LACE) and Pathways cohorts. LACE participants included women diagnosed with early-stage invasive BC from 1996 to 2000. They were recruited primarily from the Kaiser Permanente Northern California (KPNC) Cancer Registry (83 %) and the Utah Cancer Registry (12 %) from 2000 to 2002. Eligibility criteria included 1) ages 18–70 years at enrollment; 2) diagnosis of early-stage primary BC (stage I > 1 cm, II, or IIIA); 3) enrollment 11–39 months post-diagnosis (mean time from diagnosis to enrollment = 23 months); 4) completion of BC treatment (except adjuvant hormonal therapy); 5) no evidence of recurrence; and 6) no history of other cancers <5 years prior to enrollment. Further details are provided elsewhere [30].

For the Pathways cohort, women with invasive BC were recruited from the KPNC patient population from January 2006 to May 2013. Cases were ascertained rapidly by scanning of electronic pathology reports. Eligibility criteria for the study included current KPNC membership, ≥21 years of age at diagnosis, and having a first primary invasive BC with no prior history of cancer other than non-melanoma skin cancer. Women were typically enrolled within two months of diagnosis. Further details are provided elsewhere [31]. For the molecular subtyping population, we included Pathways women diagnosed from 2006 to 2008.

Women were excluded if they had no IHC information, had no suitable tumor block available, did not consent to the study, or had a tumor for which the PAM50 was not measurable. Additional exclusions for molecular subtyping were invasive tumor <0.5 cm diameter, bilateral disease, or neoadjuvant therapy. Participants provided informed consent under protocols approved by institutional review boards at KPNC and the University of Utah.

Selection of participants

We used a stratified case-cohort study design [32], an alternative to the nested case–control study design in studies examining multiple outcomes (e.g., recurrence and survival) [33, 34], with strata defined by IHC subtype [22, 32]. In brief, participants included 1) a random sample of women with the most common luminal A subtype based on IHC classification (positive for ER or PR and negative for HER2); 2) women with the luminal A subtype with an event of interest (BC recurrence or BC death); and 3) all women who had non-luminal A tumors, with follow-up for outcomes of interest. More specific details about selection of participants have been previously published [35]. This cohort of 1,635 women was followed for recurrences and deaths through August 2012. Of these women, 370 had a recurrence and 510 died of any cause, with 274 BC deaths (53.7 %).
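The sampling scheme above implies that sampled non-cases stand in for the non-cases who were not selected. The following Python sketch illustrates the generic inverse-sampling-fraction idea only; it is not the estimator of Borgan et al. used in the study, and the class name, function name, stratum labels, and counts are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    stratum: str   # IHC-defined sampling stratum (hypothetical labels)
    is_case: bool  # True if the woman had an event of interest


def assign_weights(sample, noncase_totals):
    """Weight each sampled participant by the inverse of her sampling fraction.

    sample: list of Participant objects included in the case-cohort sample.
    noncase_totals: number of non-cases per stratum in the full cohort
        (assumed known; the values used below are illustrative).
    Cases are taken with certainty, so they receive weight 1.
    """
    sampled_noncases = {}
    for p in sample:
        if not p.is_case:
            sampled_noncases[p.stratum] = sampled_noncases.get(p.stratum, 0) + 1

    weights = []
    for p in sample:
        if p.is_case:
            weights.append(1.0)
        else:
            weights.append(noncase_totals[p.stratum] / sampled_noncases[p.stratum])
    return weights


# Toy example: 2 of 10 luminal A non-cases sampled; all other non-cases taken.
toy_sample = [
    Participant("luminal_A", False),
    Participant("luminal_A", False),
    Participant("luminal_A", True),
    Participant("non_luminal_A", False),
]
toy_weights = assign_weights(toy_sample, {"luminal_A": 10, "non_luminal_A": 1})
```

In this toy setting each sampled luminal A non-case represents five women from the full cohort, while cases and fully sampled strata represent only themselves.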

Tissue samples

For selected cohort members, we contacted the hospital where surgery for resection of the primary tumor was performed, or the institution’s pathology storage facility, to obtain formalin-fixed, paraffin-embedded (FFPE) tissue blocks, and slides from the surgical resection of the primary tumor. Slides were reviewed by a pathologist (REF) who marked an area of representative tumor tissue on a slide. For eligible cases, 1 mm punches were obtained from areas of representative tumor tissue. Two punches per case, or one punch if the primary tumor was <0.7 cm in diameter, were placed in plastic tubes labeled with a sample identifying number. If the area of invasive tumor was <0.5 cm in diameter, then the case was deemed ineligible.

Clinical tissue markers

Hormone receptor status (ER and PR) and HER2 expression were obtained from medical record review and either the KPNC Cancer Registry (KPNC cases) or Utah Cancer Registry (Utah cases). For KPNC breast surgical specimens, ER, PR, and HER2 status were determined by IHC at the KPNC regional IHC lab; at Utah, by hospital pathology departments or ARUP Laboratories, Inc. (Salt Lake City, UT).

Gene expression assay (PAM50 subtype)

The tissue punch was deparaffinized, and tissue was digested for RNA extraction as previously described [29]. Reverse-transcriptase polymerase chain reaction (RT-PCR) of extracted RNA was conducted to determine expression of the 50 target genes that comprise the PAM50 [36]. Details of RT-PCR methods have been provided elsewhere [29, 37]. Quality control included a negative control (no template) and a positive control (reference DNA template) for each gene in each plate, and PCR of five housekeeping genes from each tissue sample. Laboratory personnel were blinded to clinical information and received only a study identifying number to track the sample. Each batch of tissue samples sent to the lab included a mix of IHC types including events and non-events.

Intrinsic subtype classification

To determine PAM50 molecular subtypes, we used a method of correlation of gene expression patterns to centroids from archetype samples of each subtype in an independent RT-qPCR training set [29]. For each sample, this algorithm generated a categorical subtype call, a Pearson correlation to each subtype in the training set, and a continuous quantitative score (between 1 and 10) for the expression of ESR1, PGR, ERBB2, and proliferation genes. PAM50 subtypes included the Luminal A, Luminal B, Basal-like, and HER2-enriched subtypes. A proliferation score was computed based on the average of gene expression scores associated with proliferation.
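The centroid-correlation step described above can be sketched in a few lines of Python. This is a minimal illustration of nearest-centroid classification by Pearson correlation, not the PAM50 implementation itself: the function names are hypothetical, the centroid values are invented, and only four genes and two subtypes are shown instead of the 50 genes and four subtypes used in the assay.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length expression vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def classify_subtype(sample, centroids):
    """Return the best-correlated subtype and the correlation to every centroid."""
    correlations = {name: pearson(sample, c) for name, c in centroids.items()}
    call = max(correlations, key=correlations.get)
    return call, correlations


# Hypothetical centroids over four genes (illustrative values only).
toy_centroids = {
    "Luminal A": [2.0, 1.5, -1.0, -2.0],
    "Basal-like": [-2.0, -1.5, 0.5, 2.0],
}
toy_call, toy_correlations = classify_subtype([1.8, 1.0, -0.5, -1.5], toy_centroids)
```

The function returns both the categorical call and the per-subtype correlations, mirroring the two outputs the text attributes to the algorithm.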

Data collection

Race/ethnicity and other covariates

Self-reported data on race and ethnicity, other sociodemographic variables, and reproductive and lifestyle factors were obtained at study enrollment using mailed (LACE) or in-person (Pathways) questionnaires. Disease characteristics at diagnosis including age, disease stage, tumor size, node status, histologic grade, estrogen receptor (ER) status, progesterone receptor (PR) status, and HER2 overexpression or amplification in the primary tumor were abstracted from tumor registry data and medical records review. Comorbidity was assessed by the Charlson index [38].

BC outcomes and clinical characteristics

Information on clinical factors was obtained through KPNC electronic data sources for KPNC participants or from medical chart review for non-KPNC participants. Data included breast surgery (lumpectomy, mastectomy), tumor size, number of positive lymph nodes, hormone receptor status, and adjuvant treatment (i.e., chemotherapy, radiation therapy, hormonal therapy, and Herceptin). Tumor stage was calculated according to criteria of the American Joint Committee on Cancer (AJCC).

Recurrences were ascertained by a mailed semi-annual or annual health status questionnaire asking participants to report events occurring in the preceding 6 or 12 months, respectively. Non-respondents were called by telephone to complete questionnaires. Medical records were reviewed to verify reported outcomes.

Mortality

Participant deaths were determined through the KPNC mortality file, a family member responding to a mailed questionnaire or a phone call to the family. Death certificates or physician notes were obtained to verify primary and underlying causes of death (International Classification of Diseases, 9th revision). Overall mortality included death from any cause. BC-specific mortality included death attributable to BC as a primary or underlying cause on the death certificate. A physician reviewer was consulted when the cause of death was unclear.

Statistical analyses

All analyses incorporated sampling weights and the stratified sampling design for unbiased estimation of population parameters and valid estimates of standard errors. This included estimates of frequency distributions of baseline characteristics and IHC–PAM50 concordance measures using the “svy” commands in Stata software (StataCorp, College Station, TX). Cox proportional hazards regression models were used to estimate hazard ratios (HR) and 95 % confidence intervals (CI) for associations of race with recurrence, BC-specific mortality, and overall mortality. Time since diagnosis was the time scale used in the regression models, allowing for delayed entry into the cohort (i.e., left truncation). Point and interval estimation of regression parameters accounted for the case-cohort study design with stratified sampling of the subcohort using the methods of Borgan et al. [32], as implemented in SAS subroutines developed by Langholz and Jia [39]. We compared age-adjusted models to those adjusted for age, education, income, stage, tumor size, number of positive nodes, grade, comorbidity, and reproductive and lifestyle factors. We then evaluated models adjusted additionally for PAM50 subtype, relative to the Luminal A subtype, and then separately for continuous PAM50 scores. We also adjusted for a proliferation score in additional analyses. Finally, we separately considered risks for early (in the first 5 years after diagnosis) and late (≥5 years) events, since we found evidence that the relative hazard of events, compared with the Luminal A subtype, decreases over time in the total sample.
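Delayed entry means that a woman contributes to a risk set only after she has enrolled. The Python sketch below shows one common convention for building such a risk set on the time-since-diagnosis scale; the function name, the tuple layout, and the follow-up times are hypothetical, and handling of ties at the entry time varies between software packages.

```python
def risk_set(subjects, event_time):
    """Indices of subjects at risk at event_time under left truncation.

    subjects: list of (entry, exit) pairs in months since diagnosis, where
        entry is the enrollment time and exit is the event or censoring time.
    A subject is counted as at risk when entry < event_time <= exit
    (one common convention, assumed here).
    """
    return [
        i for i, (entry, exit_time) in enumerate(subjects)
        if entry < event_time <= exit_time
    ]


# Toy follow-up: one woman enrolled about 2 months after diagnosis and
# two enrolled roughly 2 years after diagnosis.
toy_subjects = [(2, 60), (23, 80), (30, 40)]
at_risk_month_20 = risk_set(toy_subjects, 20)
at_risk_month_35 = risk_set(toy_subjects, 35)
at_risk_month_50 = risk_set(toy_subjects, 50)
```

Women enrolled late are therefore not credited with survival time before enrollment, during which they could not have been observed to have an event.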

Since the two parent cohorts (LACE and Pathways) included women with BC diagnosed at different points in time (1996–2000 vs. 2006–2008) and enrolled at different timepoints relative to diagnosis (~2 years vs. ~2 months following diagnosis), we initially conducted analyses separately for each cohort. Since the overall Black–White survival disparity did not differ markedly by cohort, the cohorts were combined in all subsequent analyses.

We also conducted analyses stratified by subtype, but due to power limitations, these analyses were adjusted for age and stage only. Finally, we stratified by education (high school (HS) or some college vs. college graduate or graduate-level education), age (< vs. ≥ median = 58.8 years), stage (Stages I and II vs. III and IV), and treatment (chemotherapy, radiation, hormone therapy; yes or no). We attempted to stratify by income, but relatively few women were in the lower income groups. When associations differed across strata, we used likelihood ratio χ² tests to evaluate interaction terms of stratification variables and race. All tests of statistical significance were two-sided.
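A likelihood ratio test for an interaction compares the log-likelihood of a model with the interaction term against the same model without it. The sketch below assumes a single interaction term (1 degree of freedom) and ignores any design-based adjustment for the case-cohort sampling; the function name and the log-likelihood values are hypothetical.

```python
import math


def lr_test_1df(loglik_reduced, loglik_full):
    """Likelihood ratio chi-square test with 1 degree of freedom.

    loglik_reduced: log-likelihood of the model without the interaction term.
    loglik_full: log-likelihood of the model with the interaction term.
    Returns the test statistic and its p-value. For 1 df, the chi-square
    upper tail equals erfc(sqrt(statistic / 2)).
    """
    statistic = max(0.0, 2.0 * (loglik_full - loglik_reduced))
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Illustrative log-likelihoods, chosen so the statistic is close to 3.84.
toy_statistic, toy_p = lr_test_1df(-100.0, -98.08)
```

A statistic near 3.84 corresponds to a p-value near 0.05 for 1 degree of freedom, which is a convenient sanity check for the tail calculation.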

Results

Racial/ethnic groups included White (n = 1,176), Black (n = 128), Hispanic (n = 138), Asian/Pacific Islander (PI) (n = 149), and other (n = 44) women. White women were more likely to have a family history of BC but were less likely to be diagnosed at an early age than other women. On average, White and Asian/PI women had higher levels of education, lower BMI, lower parity, and were more commonly diagnosed with Stage I cancer compared with Blacks or Hispanics. Black women were more likely to have any comorbidity, BMI ≥ 30 kg/m², and to have been a smoker than other women. White women had a higher proportion with ER- and/or PR-positive BC tumors, whereas Black women were more likely to have TNBC; correspondingly, Black women were less likely to have been treated with tamoxifen or aromatase inhibitors. There were no other race differences for BC treatment (Table 1).

Table 1 Selected baseline characteristics by category of race/ethnicity (n = 1,635)

Adjusted for age, Black women had higher risks of recurrence and BC mortality, whereas Hispanic women had a lower risk of recurrence (Table 2). In analyses adjusted additionally for BC severity, treatment, comorbidity, and socioeconomic, reproductive, and lifestyle factors, results were qualitatively similar (Model I, Table 2). Because Herceptin was only routinely given to women after 2000 and thus pertinent to Pathways participants only, we did not adjust for Herceptin, though additional adjustment for Herceptin in a sensitivity analysis did not qualitatively influence associations.

Table 2 Race/ethnicity and relative hazard of recurrence, breast cancer death, and overall death, in women classified by the PAM50 subtype (n = 1,635)

Further adjustment for PAM50 subtype did not attenuate associations (Model II, Table 2). Black women had a significantly higher risk of recurrence (HR 1.65, 95 % CI 1.06–2.57) and BC-specific death (HR 1.71, 95 % CI 1.02–2.86) compared with White women. By contrast, Hispanic women had a lower risk of recurrence (HR 0.54, 95 % CI 0.30–0.96) than Whites. Associations also did not differ when adjusted for proliferation score, either in analyses including all intrinsic subtypes or among the Luminal subtypes only (data not shown). Associations between Black race and outcomes did not differ by cohort, though the inverse association between Hispanic race and outcomes was apparent only in the LACE cohort (Table 2).
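As a reading aid, a reported hazard ratio and its 95 % CI can be converted back to an approximate standard error and Wald statistic on the log scale. The sketch below assumes the intervals are symmetric on the log scale with a critical value of 1.96; the function names are hypothetical, and the result is a back-calculation from rounded published values, not a reproduction of the study's design-based estimates.

```python
import math


def se_from_ci(lower, upper, z_crit=1.96):
    """Approximate standard error of log(HR) from a 95 % confidence interval."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z_crit)


def wald_from_hr(hr, lower, upper):
    """Approximate Wald z statistic and two-sided p-value for a reported HR."""
    se = se_from_ci(lower, upper)
    z = math.log(hr) / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Values reported in the text for Black vs. White women (recurrence)
# and Hispanic vs. White women (recurrence).
z_black, p_black = wald_from_hr(1.65, 1.06, 2.57)
z_hispanic, p_hispanic = wald_from_hr(0.54, 0.30, 0.96)
```

Both back-calculated p-values fall below 0.05, consistent with confidence intervals that exclude 1.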

Stratified by PAM50 subtype, recurrence risks in Black versus White women appeared elevated for all PAM50 subtypes except for the Basal-like tumors, though analyses were underpowered and nonsignificant (Table 3). Examining follow-up in the first five years, we noted significantly higher risks of recurrence (HR 2.78, 95 % CI 1.19–6.51) and BC mortality (HR 2.20, 95 % CI 1.06–4.55) comparing Black versus White women (Table 2), whereas risks ≥5 years were elevated but not as strong.

Table 3 Race/ethnicity and relative hazard of recurrence by PAM50 intrinsic subtype (n = 1,635)

In other stratified analyses, the Black–White disparity was evident only in women who were younger (recurrence, p-interaction = 0.008; BC mortality, p-interaction = 0.02) (Table 4). Also, the magnitude of the associations between Black race and outcomes appeared stronger among those with later stage cancer (recurrence, p-interaction = 0.02; BC mortality, p-interaction = 0.002) (Table 4). There was no apparent effect modification by treatment or education level (data not shown).

Table 4 Race/ethnicity and relative hazard of recurrence and breast cancer death, stratified by age and stage (n = 1,635)

Conclusions

Similar to most previous studies, Black women had higher risks of recurrence and BC mortality after diagnosis, compared with White women. By contrast, Hispanic women had a lower risk of recurrence. Classifying tumors into subtypes by a more robust gene expression assay and then adjusting for PAM50 intrinsic tumor subtype did not explain the Black–White BC survival disparity. Although analyses within PAM50 subtypes did not produce significant results, given limited power, Black women had a higher recurrence risk for three of the four subtypes, suggesting that the risk was not restricted to a specific subtype. By contrast, the elevated risk in Black versus White women was apparent only in younger women, and there was evidence that the risk was greater among those with later stage cancer. Our study is the first to evaluate associations by PAM50 subtype. Further work is needed to uncover other possible reasons behind racial disparities in BC survival or whether racial disparities differ across populations of women; this could help to elucidate potential BC survival mechanisms.

There is general agreement that Black women have worse BC survival than White women, though the magnitude of the disparity has differed by study. In most cases, adjustment for sociodemographic and clinical factors has not explained the association though some investigators report attenuation of the association with covariate adjustment [40, 41]. When factors such as stage and treatment have been identical, investigators have reported similar disease-free survival [19] but poorer overall survival in Black versus White women [4, 19, 20].

Studies differ as to whether this disparity is evident for all tumor subtypes using the IHC classification, and no clear pattern has yet emerged [23, 25, 27]. In previous studies, Black women have been shown to have worse survival among those with the luminal A and TNBC subtypes. However, our findings [42], in which PAM50 classification resulted in a larger percentage being classified as having Luminal B or HER2-enriched BC than by IHC-based classifications, suggest that results may depend on the method of subtype classification.

We found that the Black–White disparity was evident in younger, but not older, women. Our findings are consistent with some [2, 5] though not all [24] studies. In one study [2] of women ≥65 years, Black women who had received chemotherapy had no worse mortality than White women, though those who did not receive chemotherapy had significantly worse survival. When we stratified by treatment, we did not find differences by chemotherapy. Although treatment, and factors that influence decisions about, delays in, and receipt of treatment, may influence the Black–White disparity, we did not have this detailed information available. Future work should consider in greater detail how race and ethnicity influence the course of treatment.

Although the PAM50 intrinsic subtype classifier did not explain the Black–White disparity, subtype is nonetheless an important reason behind the disparity. In the general population, 15–20 % of women diagnosed with BC are diagnosed with TNBC [43]. Among the younger, African-American women in our study, 36 % were diagnosed with the Basal-like subtype, which has the worst clinical prognosis [44].

In contrast to these findings, Hispanic women had a lower recurrence risk, compared with Whites, consistent with several [1, 4, 8, 9] but not all [45–47] studies. We were also unable to explain the lower risk of recurrence in Hispanic women by tumor subtype with adjustment for numerous covariates. In previous work, Chlebowski et al. [4] reported lower mortality in Hispanic women, explainable by differences in reproductive history. However, reproductive factors such as use of hormonal therapy and oral contraceptives, age at first pregnancy, breastfeeding, and parity did not explain the difference observed in this cohort. Studies of race and health have often shown better outcomes in Hispanics compared with Whites despite lower socioeconomic levels [48], known as the “Hispanic paradox.” Reasons suggested have included stronger social networks, healthy migrant effects, back-migration, and reproductive and other factors [9, 49]. Interestingly, the lower recurrence risk was present only in women from the LACE study, who were enrolled approximately two years post-diagnosis; recurrence rates in Hispanics in the Pathways study were similar to Whites. It is possible that Hispanic women have a survival advantage conditional on surviving past treatment, but this question should be examined in greater depth.

More work is needed in larger, multiethnic cohorts to clarify racial differences in BC survival and reasons for those differences. Additionally, to address the reasons behind disparities more fully, investigators must carefully analyze patterns of disparities across health care systems, geographic regions, and individual-level variables such as education and income. Our study suggests that a focus on young Black women to reduce recurrence and BC mortality in this patient population may be warranted. More in-depth examination of the course of treatment may also produce insights [15, 50].

Strengths of this study include the large sample size of BC survivors, use of the PAM50 subtype, excellent measures of numerous BC clinical characteristics, and the ability to adjust for lifestyle, reproductive, and demographic characteristics. Despite the relatively large sample size, we had limited power to examine analyses of race and BC outcomes stratified by BC subtype. Future studies should include larger racial minority subgroups.

In summary, in this large multiethnic study of women with invasive BC, Black women had a higher risk of recurrence and worse BC survival regardless of PAM50 intrinsic tumor subtype. However, Hispanic women had a lower risk of recurrence compared with Whites. Our findings suggest that with improved classifiers of tumor subtype, racial differences in BC outcomes are present across all subtypes, and prevalence of subtype does not fully explain racial disparities in BC survival. Further work is needed to uncover other possible reasons for BC survival disparities.