Introduction

Breast cancer is the most common cancer and the second leading cause of cancer death among women in the U.S. [1]. In 2008, 210,203 women were diagnosed and 40,589 died from breast cancer [2]; this is compared to an estimated 226,870 who will be diagnosed and 39,510 who will die in 2012 [3]. These statistics suggest that treatment and survival for breast cancer have improved markedly, even in recent years. Moreover, the Surveillance Epidemiology and End Results (SEER) estimated that 2,747,459 women in the U.S. diagnosed with breast cancer were alive as of January 2009 [4]. Longer survival may be attributable to advances in treatment modalities and earlier detection by improved screening protocol adherence.

Although these statistics are encouraging, breast cancer survival disparities by race, ethnicity, and socioeconomic status (SES) still exist [5]. Examining the influential roles that disparities exert on patient outcomes is essential for improving survival rates in all patients. The unique racial, ethnic, and economic groups living in Florida afford the opportunity to explore and assess a large multi-racial, multi-ethnic, and economically diverse population to determine whether there are disparities in survival time after breast cancer diagnosis. Because we are using a large linked dataset which contains complete information on clinical characteristics, treatment, economic status, and comorbidities, we have the unique ability to simultaneously control for previously untested confounding effects on survival.

Methods

Data

Data were obtained between 1996 and 2007 from the Florida Cancer Data System (FCDS) which was linked with Florida’s Agency for Health Care Administration (AHCA) and with information from the U.S. Census. These data were linked using a unique patient ID number to form the full dataset for this study. The matches were confirmed using the patient’s date of birth and Social Security number. FCDS, a Florida population–based cancer registry, identifies incident cancer cases and collects information on diagnosis (ICD-9 codes), stage of disease, medical history, patient demographics, methods of treatment, and date of death from linkage with the National Death Index (NDI) and Florida Mortality Records [6].

When FCDS reported no death, we used the last date of a patient encounter. To establish survival status for patients diagnosed in the last year of the study, 2007, we followed these patients through 2010; thus, all patients in our study have at least 3 years of follow-up data. Patients without a date of death were treated as censored at the last date of contact. Such losses to follow-up may result from underuse of the healthcare system or relocation out of state.
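The censoring rule described above can be sketched as follows. This is a minimal illustration in Python, not the SAS code used for the study; the function name `survival_record`, the day-to-year conversion, and the example dates are assumptions made for the illustration.

```python
from datetime import date
from typing import Optional, Tuple

# Assumption: average year length used to convert days to years.
DAYS_PER_YEAR = 365.25


def survival_record(diagnosis: date,
                    death: Optional[date],
                    last_contact: date) -> Tuple[float, bool]:
    """Return (follow-up time in years, event flag) for one patient.

    The event flag is True when a death date is recorded. Otherwise the
    patient is censored at the last date of contact.
    """
    end = death if death is not None else last_contact
    if end < diagnosis:
        raise ValueError("end of follow-up precedes diagnosis")
    years = (end - diagnosis).days / DAYS_PER_YEAR
    return years, death is not None


# Hypothetical patient diagnosed in 2007 with no recorded death.
years, event = survival_record(date(2007, 6, 1), None, date(2010, 6, 1))
```

Here the hypothetical patient contributes about 3 years of follow-up and is marked as censored rather than as a death.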

Florida’s AHCA dataset contains records on all patients treated at hospitals and free-standing surgical and radiologic treatment centers, and collects diagnosis and procedure codes for all in- and out-patient encounters [7]. From AHCA data, we retrieved all diagnosis information for patients in our population from 1995 through the end of the period where we collected follow-up data on survival (2010). For the descriptive statistics, we counted the Elixhauser comorbidities recorded for each patient in the linked dataset and grouped the counts into 4 categories: 0, 1–2, 3–4, and >4. For the regression analyses, however, each Elixhauser comorbidity was entered into the models as a separate dummy variable.
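The two encodings of comorbidity described above (grouped counts for description, one indicator per condition for regression) can be sketched as follows. The function names and the short condition list are hypothetical; the list is an illustrative subset, not the full Elixhauser index.

```python
from typing import Dict, Iterable, Sequence

# Assumption: illustrative subset only, not the complete Elixhauser list.
EXAMPLE_CONDITIONS = (
    "congestive_heart_failure",
    "hypertension",
    "diabetes",
    "obesity",
)


def comorbidity_group(count: int) -> str:
    """Map a comorbidity count to the descriptive categories 0, 1-2, 3-4, >4."""
    if count < 0:
        raise ValueError("count must be non-negative")
    if count == 0:
        return "0"
    if count <= 2:
        return "1-2"
    if count <= 4:
        return "3-4"
    return ">4"


def comorbidity_dummies(recorded: Iterable[str],
                        conditions: Sequence[str] = EXAMPLE_CONDITIONS
                        ) -> Dict[str, int]:
    """Return one 0/1 indicator per condition, as used in a regression model."""
    present = set(recorded)
    return {name: int(name in present) for name in conditions}
```

For example, `comorbidity_group(3)` returns `"3-4"`, and `comorbidity_dummies(["hypertension"])` sets only the `hypertension` indicator to 1.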

Variables

The primary endpoint, overall survival, was considered as time from date of diagnosis found in FCDS to date of death, taken from FCDS data linked with NDI and Florida Mortality Records, or date of last encounter found in AHCA. Our main predictors of interest were race, ethnicity, and SES. Individuals were classified into 1 of 7 race categories: white, black, Native American, Asian, Pacific Islander, Asian Indian or Pakistani, and Other. Ethnicity was defined as non-Hispanic or Hispanic. Patient’s place of residence was used to approximate patient-level SES. From the U.S. Census, we obtained tract-level information on percentage of households in the neighborhood living below the federal poverty line. Each tract was categorized as: lowest (≥20 %), middle-low (≥10 and <20 %), middle-high (≥5 and <10 %), and highest (<5 %) SES based on percentage of the neighborhood living in poverty. Individuals living in each tract were assigned that tract’s SES level.
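The tract-level SES cut points given above translate directly into a small classification rule. The sketch below uses a hypothetical function name, `ses_category`, and assumes the input is the percentage of households in the tract below the federal poverty line.

```python
def ses_category(pct_below_poverty: float) -> str:
    """Classify a census tract by the percentage of households in poverty.

    Cut points follow the text: lowest (>=20 %), middle-low (>=10 and <20 %),
    middle-high (>=5 and <10 %), and highest (<5 %).
    """
    if not 0.0 <= pct_below_poverty <= 100.0:
        raise ValueError("percentage must lie between 0 and 100")
    if pct_below_poverty >= 20.0:
        return "lowest"
    if pct_below_poverty >= 10.0:
        return "middle-low"
    if pct_below_poverty >= 5.0:
        return "middle-high"
    return "highest"
```

Every patient living in a tract would then receive that tract's category, so a patient in a tract with 12 % of households in poverty is assigned `"middle-low"`.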

Sociodemographic and clinical characteristics such as age, marital status, smoking status, type of insurance, hospital volume and teaching status (a potential measure of quality of care delivered), tumor grade and stage, lymph node status, type of treatments, and Elixhauser comorbidities were available in our linked dataset and were used as covariates in our regression analyses. The FCDS staging criteria are consistent with SEER summary staging, which defines local, regional, and distant stages. Histologic category was based on the International Classification of Diseases for Oncology, 3rd edition, morphology coding scheme.

Population

Our sample included female cancer patients diagnosed with breast cancer in the state of Florida during 1996–2007 (n = 179,072). Patients younger than 18 years of age, with carcinoma in situ, and non-Florida residents (n = 40,358) were excluded. We also excluded those with missing values for race, ethnicity, or SES (n = 10,960) resulting in a total sample size of 127,754.
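The exclusion counts above can be checked with simple arithmetic. The variable names below are ours; the share of missing data is computed against the cohort remaining after the eligibility exclusions, which is an assumption about the intended denominator.

```python
# Counts taken from the text.
initial_sample = 179_072        # female breast cancer patients, 1996-2007
excluded_eligibility = 40_358   # under 18, carcinoma in situ, non-Florida residents
excluded_missing = 10_960       # missing race, ethnicity, or SES

after_eligibility = initial_sample - excluded_eligibility
final_sample = after_eligibility - excluded_missing

# Assumption: share of missing predictors among otherwise eligible patients.
pct_missing = 100.0 * excluded_missing / after_eligibility

print(after_eligibility)       # 138714
print(final_sample)            # 127754
print(round(pct_missing, 1))   # 7.9
```

The final count reproduces the reported sample size of 127,754.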

Statistical analyses

Associations between categorical variables were examined using chi-square tests. Overall median survival and 1-, 3-, and 5-year survival rates were calculated by the Kaplan–Meier method. Log-rank tests were used to compare the survival curves by race, ethnicity, and SES. Univariate and multivariate Cox proportional hazards regression models were used to obtain unadjusted and fully adjusted hazard ratios (HR) and 95 % confidence intervals. The main predictor variables were tested for interactions. Patients treated in the same hospital or facility share some unmeasured characteristics that may affect clinical outcomes and therefore cannot be considered independent observations. Robust standard errors that adjust for clustering of patients within medical facilities were incorporated in all models. The type-I error rate was set at 5 %. SAS v9.3 (SAS Institute Inc., Cary, NC) was used to perform all analyses. This project was approved by the University of Miami Institutional Review Board.
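To make the Kaplan–Meier calculation concrete, the following is a minimal pure-Python sketch of the product-limit estimator and the median survival time derived from it. It is an illustration only, not the SAS procedure used in the study; the function names and the toy data are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def kaplan_meier(times: Sequence[float],
                 events: Sequence[bool]) -> List[Tuple[float, float]]:
    """Return [(t, S(t))] at each distinct death time, in increasing order.

    events[i] is True for a death and False for a censored observation.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve: List[Tuple[float, float]] = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect every observation tied at time t.
        while i < len(data) and data[i][0] == t:
            deaths += int(data[i][1])
            removed += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


def median_survival(curve: Sequence[Tuple[float, float]]) -> Optional[float]:
    """Smallest time with S(t) <= 0.5, or None if the curve never gets there."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Toy data (hypothetical): follow-up in years and death indicators.
toy_curve = kaplan_meier([2, 3, 3, 5, 8, 9],
                         [True, True, False, True, False, True])
toy_median = median_survival(toy_curve)  # 5 for this toy sample
```

When most observations in a group are censored, the estimated curve may never fall to 0.5, and `median_survival` returns `None`; in that situation a median survival time cannot be estimated.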

Results

Patient demographics and clinical variables

The demographic and clinical characteristics of the sample by race, ethnicity, and SES are shown in Tables 1 and 2. Overall, the majority of patients were white (90.4 %), non-Hispanic (90.4 %), and at middle-high or highest SES (59.5 %). The mean age was 63 years with a standard deviation (SD) of 14 (range: 19–105 years). The race categories with the youngest mean age at diagnosis were Asian Indian or Pakistani (54 years, SD 13), Asian (56 years, SD 13), and Pacific Islander (56 years, SD 11). Blacks comprised the majority of patients in the lowest category of SES (45.6 %), and Asian Indian or Pakistani (32.1 %) made up the majority in the highest SES category. The highest proportions of private insurance were seen in Asian Indian or Pakistani (43.9 %) and Asian (43.6 %) patients, and the lowest proportions of Medicare coverage were seen in Asian Indian or Pakistani, Asian, and Pacific Islander patients. The majority of Hispanics were in the lowest or middle-low SES categories (60.7 %), and the majority of non-Hispanics were in the highest or middle-high categories of SES (61.6 %). Among race categories, the largest percentages alive at the end of the study were for Asian Indian or Pakistani (84.0 %) and Asian (82.6 %) patients; blacks had the smallest percentage alive (63.4 %). A majority of patients were treated at non-teaching (90 %) and low-volume (88 %) facilities. The highest proportion of blacks (92 %) was treated at low-volume facilities, and the highest proportion of whites (91 %) was treated at non-teaching facilities.

Table 1 Demographic characteristics of female cancer patients displayed as a percent of total race, ethnicity, and SES categories from the linkage of 1996–2007 Florida Cancer Data System, Agency for Health Care Administration, and U.S. Census tract
Table 2 Pathological and clinical characteristics of female cancer patients displayed as a percent of total race, ethnicity, and SES categories from the linkage of 1996–2007 Florida Cancer Data System, Agency for Health Care Administration, and U.S. Census tract

A majority of the sample had more than 4 comorbidities (57 %), localized tumors (59 %), and received surgery and radiation (52 %) (Table 2). The greatest number of comorbidities was present in patients from the lowest SES category (64 vs. 51 % in the highest). The largest percent of patients with no comorbidities was from Pacific Islanders (48.0 %), Asian Indian or Pakistanis (43.4 %), and Asians (40.7 %). All of the demographic and clinical characteristics were significantly different (p < 0.05) among race, ethnicity, and SES categories.

Survival

Overall median survival time (MST) was 11.8 years; Asians and Asian Indian or Pakistanis had the longest MST at 12.6 and 12.0 years, and Native Americans had the shortest at 7.4 years (Table 3 and Fig. 1a, b). The survival time for Pacific Islanders could not be estimated because they did not cross over the overall median survival for the sample, likely due to small sample size of Pacific Islanders in the state of Florida. Hispanics and those in the highest SES category had the longest MST (11.6 and 13.7 years, respectively) (Fig. 1c, d). Among races, Asians had the highest 1-year survival rate (97.3 %) followed by Asian Indian or Pakistani (96.9 %), while the lowest was for Native Americans and blacks (90.5 % each). By year 5, Pacific Islanders had the highest rate (86.4 %) followed by Asian Indian or Pakistanis (83.9 %) and Asians (80.6 %); blacks had the lowest 5-year survival rate (63.3 %). Hispanics and those in the highest SES categories consistently had the highest survival rates compared to non-Hispanics and those in lower SES categories at 1, 3, and 5 years post-diagnosis.

Table 3 Median survival in years and 1-, 3-, and 5-year post-diagnosis survival rates in percentages of female breast cancer patients from the linkage of 1996–2007 Florida Cancer Data System, Agency for Health Care Administration, and U.S. Census
Fig. 1 Median survival in female breast cancer patients by (a) overall sample, (b) race, (c) ethnicity, and (d) socioeconomic status

A striking result was seen in the status of the registry patients during the study period. The categories of Asian, Pacific Islander, and Asian Indian or Pakistani cancer patients had almost a 20 percentage point higher rate of survival than did blacks and Native Americans (Table 1).

Regression analysis

In univariate analysis, compared to whites, blacks had worse survival (HR = 1.44; p < 0.001), and both Asians (HR = 0.71; p < 0.001) and Asian Indian or Pakistanis (HR = 0.65; p < 0.05) had better survival (Table 4). Improved survival was also seen for Hispanics (HR = 0.92; p < 0.001) compared to non-Hispanics. For SES, there was an incremental improvement in survival for each higher SES category (p < 0.001 for all). In multivariate analysis, while controlling for the effects of race, ethnicity, and SES as well as all covariates described in Tables 1 and 2, worse survival was maintained in blacks (HR = 1.28; p < 0.001) and better survival was sustained in Hispanics (HR = 0.90; p < 0.001). However, the survival advantage of Asians and Asian Indian or Pakistanis compared to whites seen in the unadjusted model was not sustained in the fully adjusted model. Significant results for SES were upheld in the fully adjusted model (p < 0.001 for all). In addition, we tested for interactions among race, ethnicity, and SES in the fully adjusted model. No interactions were found.
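As a reading aid for the hazard ratios reported above, the sketch below converts an HR into the corresponding percent change in hazard relative to the reference group and into the Cox model coefficient (the natural log of the HR). The function names are hypothetical, and the conversion describes an association under the proportional hazards assumption, not a causal effect.

```python
import math


def hazard_change_percent(hr: float) -> float:
    """Percent change in hazard relative to the reference group."""
    if hr <= 0:
        raise ValueError("a hazard ratio must be positive")
    return (hr - 1.0) * 100.0


def log_hazard_coefficient(hr: float) -> float:
    """Cox regression coefficient (beta) corresponding to HR = exp(beta)."""
    if hr <= 0:
        raise ValueError("a hazard ratio must be positive")
    return math.log(hr)


# Fully adjusted hazard ratios quoted in the text.
adjusted_hr = {"black vs. white": 1.28, "Hispanic vs. non-Hispanic": 0.90}

for label, hr in adjusted_hr.items():
    print(f"{label}: {hazard_change_percent(hr):+.0f} % hazard, "
          f"beta = {log_hazard_coefficient(hr):.3f}")
```

Read this way, the adjusted HR of 1.28 for blacks corresponds to a 28 % higher hazard of death than for whites, and the adjusted HR of 0.90 for Hispanics to a 10 % lower hazard than for non-Hispanics.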

Table 4 Univariate and multivariate Cox proportional hazards regression models

Discussion

While breast cancer is the second leading cause of cancer death in women, disparities in overall survival are evident and independently based on race, ethnicity, and SES. By examining a large population-based cancer registry for Florida, we attempted to investigate these diverse outcomes in order to understand how to address the survival disparities. We had access to approximately 95 % of incident breast cancer cases in the state of Florida from 1996 to 2007 with follow-up till the year 2010. To our knowledge, this was the largest and most diverse sample (7 discrete races, 2 ethnicities, and 4 SES categories) employing simultaneous adjustments for the greatest number of demographic and clinical characteristics including comorbidities in the final fully adjusted survival model, in order to comprehensively explore survival differences in the state of Florida. Ours is the largest study of American Asians, and Asian Indians and Pakistani Americans with breast cancer of which we are aware.

Also, in our study, we had a minimum follow-up of 3 years which allowed us to follow the cohort of patients and determine whether death occurred after the study ended. The overall MST of the sample was 11.8 years. However, survival time varied greatly based on patient’s demographic and clinical characteristics. Controlling for these characteristics and comorbidities, we were able to show that survival disparities remained between whites and blacks, between ethnicities, and among SES categories. These findings are very important because without simultaneously controlling for SES, it cannot be known whether the survival disadvantage was due to black race or was confounded by SES. We were able to show that this finding holds true with race, ethnicity, and SES while controlling for many other confounding factors in the state of Florida.

Our study found substantial disparities for black women with breast cancer. Compared with all other races, blacks were the most likely to live in neighborhoods with the lowest SES and to have the greatest number of comorbidities, and the least likely to be diagnosed with localized cancer. Blacks were also diagnosed younger than whites (median age 56 and 65 years, respectively). Overall, 50 % of blacks survived only 8.3 years from time of diagnosis; additionally, at 1, 3, and 5 years post-diagnosis, blacks had the lowest survival rate compared to all other races. Multivariate analyses showed that blacks had a worse survival outcome compared to whites, supporting previous studies’ findings [8–10].

Conversely, our multivariate analysis showed that Asians had better survival than whites, although the difference was not statistically significant. MST for Asians was longer than for whites and a larger percent of Asians survived at 1, 3, and 5 years post-diagnosis. Only 3.4 % of Asians, the smallest rate of all races, were staged as distant at diagnosis. Our findings are in agreement with previous literature [11], although other studies found differences in survival between Asian and white women with breast cancer [12–15]. These contradictory findings may be due to differences in the categorization of Asian race in our study compared to other studies. We included only Asian Americans (within our Asian group) while others combined Asians with other racial groups or divided Asians by distinct ethnicities [16, 17]. We compared all Asians in our study to the reference group of white race (of all ethnicities) and kept Hispanic and non-Hispanic ethnicity as a separate prognostic factor. Others compared Asians and/or Asian ethnicities to non-Hispanic whites as the referent [16, 17].

Asians, Pacific Islanders, and Asian Indian or Pakistani had higher unadjusted rates of survival and fewer comorbidities compared to blacks and Native Americans. This may be because these 3 racial groups were diagnosed at a younger age than the other racial groups. Additionally, for Pacific Islanders, our findings may have been affected by our small sample size (n = 50) for this subpopulation group in the state of Florida. The 1-, 3-, and 5-year survival rates of Asians, Pacific Islanders, and Asian Indian or Pakistani were consistently higher than those of all other races, suggesting that these populations had better health. These phenomena were only discernible as a result of our exclusive access to these linked data while controlling for SES, treatments, tumor characteristics, other demographic/clinical characteristics, and comorbidities.

There were no differences in survival of the other races in our study compared to whites after controlling for all other prognostic factors; the survival advantage in the univariate models for Asians and Asian Indian or Pakistani was no longer significant in the final fully adjusted model. However, there was a beneficial trend in the point estimates in the fully adjusted models for Asians (HR 0.84) and for Asian Indian or Pakistani (HR 0.87). With a larger sample size, these trends might reach significance. Some other studies have found no difference between whites and Filipino Americans [16, 18] or Indian and Pakistani [16, 19]. Another study found lower survival rates in Native Americans compared to whites [16] and Pacific Islanders [16, 20]. However, because we had a very large sample and controlled for numerous factors that could affect outcomes, our finding of no difference in survival may be better supported.

A noteworthy finding in our study was that Hispanics had improved survival compared to non-Hispanics. This survival benefit remained significant, and became more favorable, after controlling for all covariates. Hispanics compared to non-Hispanics had a longer MST, were diagnosed younger, and had more aggressive tumor grades. Our findings differ from other studies which showed no difference in survival between Hispanics and non-Hispanics [16] or a higher adjusted relative risk for death from breast cancer in Hispanics compared to non-Hispanics [20]; however, in those studies, only white Hispanics were compared to non-Hispanics, while we compared all races of Hispanics to non-Hispanics of all races.

Our results show that women who live in progressively poorer socioeconomic neighborhoods have an incrementally shorter survival time. Even our adjusted models demonstrated the same pattern of longer survival at higher economic levels, albeit with smaller effect sizes. MST at the highest economic level was more than 5 years longer than at the lowest level (13.7 vs. 8.5 years, respectively). Our findings agree with other large population–based studies [21–23].

The inter-related effects of race, ethnicity, and SES can be complex. A large Medline literature search was undertaken by Cross et al. [24] which suggested that race could be a surrogate for SES, masking the negative effects of SES on survival, particularly in black women. However, these investigators intimated that large database studies did not ordinarily control for SES in the survival model. Our findings confirmed what others have found that race, ethnicity, and SES are significant predictors of survival [25]. Other social differences (e.g., mistrust of the healthcare system, cultural beliefs, lifestyle factors) as possible contributing factors for these disparities need to be considered and addressed within pockets of subpopulation groups [26, 27].

Several limitations should be noted for our study. The FCDS and AHCA datasets do not collect complete hormone receptor information, limiting our ability to control for hormonal status. We did not have information about treatment dosage or type of chemotherapy used, which may affect outcomes. Additionally, information about screening differences was not available at the time of analysis but it has been reported that black and poorer females may screen less than whites and wealthier women [5]. The databases that we have access to do not have individual-level indicators of SES; therefore, we used neighborhood-level poverty as a proxy. However, using neighborhood indicators of SES has been shown to be a valid and reliable methodology [28]. We excluded approximately 8 % of our original sample for having missing data on our primary predictor variables (race, ethnicity, and SES), and some other variables had relatively large numbers of missing values (>20 %). However, we have no reason to believe these were not missing at random. Finally, our treatment data are likely to miss women who only received oral hormone therapy as a treatment.

In conclusion, significant disparities in female breast cancer survival by race, ethnicity, and SES persist even after simultaneously controlling for age, marital status, smoking status, type of insurance, hospital volume and teaching status, tumor grade and stage, lymph node status, type of treatments, and comorbidities. Multiple years of data and the linkage of two large population-based datasets with the U.S. Census from the multicultural state of Florida allowed us to concurrently explore 7 race categories, Hispanic ethnicity, and 4 SES groups, including those typically not included in breast cancer survival studies (e.g., Asian Indian and Pakistani, Pacific Islander, and Native American). Although previous studies have also explored the disparities in survival, few have had access to as comprehensive data as we have, which allows us to make finer racial/ethnic distinctions and also control for multiple confounders. For example, Li et al. [29] and Deshpande et al. [30] also found worse survival for blacks when compared with non-Hispanic whites and with white women; however, they could not control for SES or comorbidities as we did. It may be that the lack of these control variables is why our findings for races other than black (Native Americans and Pacific Islanders) and ethnicity (Hispanics) are different from those other studies. This is an important consideration, because without controlling for SES, it cannot be known whether the survival disadvantage was due to black race or was confounded by SES. We were able to show that there is a survival disadvantage for both SES and black race, but that there was not an interaction between these factors.

We found that compared to whites, blacks have shorter and Asians have longer survival times; Hispanic ethnicity is protective. Neighborhood socioeconomic conditions are associated with survival in an incremental fashion; the effect is reduced after controlling for demographics and clinical characteristics, but still significant. Organized screening programs have been shown to reduce mortality in cancer patients from neighborhoods with lower SES [23]. Further exploration is needed to clarify the survival gap and tailor programs to ensure earlier diagnosis and access to treatment options, particularly for blacks and those living in poorer neighborhoods. Future research can focus on the interaction of race, ethnicity, and SES with other prognostic indicators such as hospital characteristics to determine whether an intervention such as assignment to specific hospital type for treatment can ameliorate disparities.