Breast cancer is the most commonly diagnosed cancer and the second leading cause of cancer mortality in women in the United States (US).1 Prior research has examined the factors influencing breast cancer presentation and outcome, including age, hormone receptor (HR) status, socioeconomic status, and race/ethnicity.2,3,4,5 Among racial and ethnic subgroups, Hispanic and Black patients are disproportionately burdened with delayed diagnosis and poorer prognosis in breast cancer.2

Hispanic populations represent the second largest ethnic group in the US after non-Hispanic White individuals, and over 21% of Americans will identify as Hispanic by 2030.6,7 Among Hispanic Americans, cancer is the leading cause of death, and breast cancer is the leading cause of cancer-specific death among Hispanic women.8 Hispanic patients face significant barriers to healthcare access due to multiple social determinants of health, including structural discrimination, lower socioeconomic status, and lower health literacy.9,10,11 In addition, Hispanic individuals have the highest uninsured rate in the US.12 Among ethnically Hispanic patients, both Hispanic Black and Hispanic White patients are at higher risk of advanced-stage breast cancer when compared with non-Hispanic White patients.13

Many studies treat Hispanic patients as a monolithic group, and prior research aimed at stratifying in the context of breast cancer has mainly disaggregated by race.4,14 While previous studies have suggested Black patients are at greater risk for triple-negative disease, there are limited studies examining this in Hispanic racial subgroups.3,14,15 Furthermore, few studies have examined the influence of Hispanic country of origin in cancer outcomes.4

Given diverse cultural, socioeconomic, and immigration experiences of Hispanic subgroups, we hypothesize that there may be disparities in breast cancer stage at presentation on disaggregation. As breast cancer survival continues to improve due to advances in early detection and treatment, our study uses national data to evaluate stage at presentation with subgroup analysis by both receptor status and intersectional racial/ethnic subgroups, with further disaggregation by country of origin.16 Our study encourages practicing oncologists to explore the diversity within Hispanic populations and analyze the social determinants of health unique to various subgroups. In analyzing variations in outcome, we hope to help physicians develop more targeted interventions to prevent delays in diagnosis for their Hispanic patients.

Methods and Materials

Throughout this paper, we use the term ‘Hispanic’ to refer to individuals identified as belonging to the ethnic population with cultural lineage from Spain, Latin America (excluding Brazil), or the Caribbean. This definition follows how the National Cancer Database (NCDB) categorizes Hispanic populations; we also reference literature on Latino populations given its relevance to this research. Hispanic populations are often mislabeled as a racial group rather than an ethnicity composed of various racial subgroups. References to race throughout this paper describe racial subgroups (non-Hispanic Black, Hispanic White, etc.), not the Hispanic population overall.

Data Source and Study Population

Data from 2004 to 2017 were extracted from the NCDB. The NCDB provides > 34 million records from hospital cancer registries, with data sourced from > 1500 cancer programs in both the continental US and Puerto Rico, accounting for approximately 70% of new cancer diagnoses.17

Women with breast cancer were included if information regarding their stage (0–IV) and Hispanic ethnicity were both available. They were stratified by race and categorized as non-Hispanic White, non-Hispanic Black, non-Hispanic other, Hispanic White, Hispanic Black, and Hispanic other. ‘Other’ categories included populations who identified as American Indian, Asian, Native Hawaiian, Pacific Islander, or other; individuals with unknown race were not included in this study. The Spanish/Hispanic origin variable was used to further disaggregate individuals identified as Hispanic into Mexican (including Chicano), Puerto Rican, Cuban, South or Central American (except Brazil), other specified Spanish/Hispanic Origin (includes Europe but excludes Dominican Republic), Spanish surname only, and Dominican (Republic). Patients missing data on stage, ethnicity, or other covariates included in our models were excluded.
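The six-group categorization described above can be sketched as follows. This is a minimal illustration rather than the study's actual code: the function name `classify_subgroup` and the string labels for race are hypothetical, and the NCDB itself stores race and Spanish/Hispanic origin as coded variables.

```python
from typing import Optional

# Hypothetical labels for illustration; the NCDB uses coded values, not these strings.
OTHER_RACES = {"American Indian", "Asian", "Native Hawaiian", "Pacific Islander", "Other"}


def classify_subgroup(race: Optional[str], hispanic: Optional[bool]) -> Optional[str]:
    """Return one of the six study subgroups, or None if the patient is excluded."""
    # Patients with unknown race or unknown ethnicity are not included in the study.
    if race is None or hispanic is None:
        return None
    prefix = "Hispanic" if hispanic else "Non-Hispanic"
    if race in ("White", "Black"):
        return f"{prefix} {race}"
    if race in OTHER_RACES:
        return f"{prefix} Other"
    return None  # unrecognized label: treated as unknown race


print(classify_subgroup("Black", True))
print(classify_subgroup("Asian", False))
```

Returning `None` for missing race or ethnicity mirrors the exclusion criteria stated above; country-of-origin disaggregation would be a separate step applied only to patients identified as Hispanic.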

Clinical and Sociodemographic Covariates

The dependent variable was breast cancer stage at presentation, and the primary independent variable was Hispanic ethnicity disaggregated by both race and country of origin. All models adjusted for additional independent variables, including age, facility type, year of diagnosis, and Charlson–Deyo comorbidity.

Given complexities in interpreting models that examine inequalities by race/ethnicity as well as socioeconomic status, epidemiological experts have encouraged a more thoughtful approach to covariate adjustment.18,19 In this study, we use an interpretation of the effects of race/ethnicity proposed by VanderWeele and Robinson: “the health inequality that would remain for persons” after covariates had been set equal between populations.20 As social determinants of health are likely mediators of the relationship between race/ethnicity and stage at presentation, we also created models that further adjusted for three variables: median household income in the patient’s area of residence (neighborhood income: < $38,000, $38,000–$47,999, $48,000–$62,999, ≥ $63,000), percentage of adults in the patient’s ZIP code who did not graduate from high school (neighborhood education: ≥ 21.0%, 13.0–20.9%, 7.0–12.9%, < 7.0%), and insurance status.
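For readers who want to reproduce the neighborhood income and education categories above from raw values, a minimal sketch follows. It assumes continuous inputs (a dollar amount and a percentage); the function names are hypothetical, and the registry may already supply these variables in binned form, in which case no such step is needed.

```python
def neighborhood_income_category(median_income: float) -> str:
    """Bin median household income (USD) into the four categories used in the models."""
    if median_income < 38_000:
        return "< $38,000"
    if median_income < 48_000:
        return "$38,000–$47,999"
    if median_income < 63_000:
        return "$48,000–$62,999"
    return "≥ $63,000"


def neighborhood_education_category(pct_no_high_school: float) -> str:
    """Bin the percentage of adults without a high school diploma into four categories."""
    if pct_no_high_school >= 21.0:
        return "≥ 21.0%"
    if pct_no_high_school >= 13.0:
        return "13.0–20.9%"
    if pct_no_high_school >= 7.0:
        return "7.0–12.9%"
    return "< 7.0%"


print(neighborhood_income_category(45_500))
print(neighborhood_education_category(22.4))
```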

Statistical Analysis

Chi-square tests were used to compare the distribution of categorical variables by racial subgroup. Adjusted odds ratios (aOR) with 95% confidence intervals (CIs) were estimated using ordinal logistic regression. Two models were run to assess the odds of presenting at a later stage, with one model disaggregated by race and the other disaggregated by country of origin. Each model adjusted for (1) age, facility type, year of diagnosis, and Charlson–Deyo comorbidity; and (2) previous covariates plus neighborhood income, neighborhood education, and insurance status. The population was then stratified by receptor status (HR+ with HER2−, any estrogen receptor/progesterone receptor status with HER2+, and triple negative), and the analysis was repeated. Non-Hispanic White women were used as the reference group in all models given that they were the largest population, consistent with prior Hispanic disaggregation studies.4,13,21
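As a reading aid, the ordinal model can be written out under the assumption that the standard proportional-odds (cumulative logit) form was used; the text does not name the exact specification, so the notation below is illustrative. Here $Y_i$ is the stage (0–4) for patient $i$, $\kappa_j$ are the four cutpoints, and $\mathbf{x}_i$ holds the subgroup indicators and covariates.

```latex
\[
\operatorname{logit} P(Y_i \le j) \;=\; \kappa_j - \mathbf{x}_i^{\top}\boldsymbol{\beta},
\qquad j = 0, 1, 2, 3
\]
\[
\mathrm{aOR}_k = \exp(\beta_k), \qquad
95\%\ \mathrm{CI}_k = \exp\!\bigl(\beta_k \pm 1.96\,\mathrm{SE}(\beta_k)\bigr)
\]
```

Under this form, $\exp(\beta_k)$ is the ratio of the odds of presenting above any given stage cutpoint versus at or below it, relative to the reference group, which is why a single aOR per subgroup summarizes the odds of presenting at a later stage.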

Analyses were performed using Stata/MP 17.0 (StataCorp LLC, College Station, TX, USA). This study was deemed exempt from the hospital Institutional Review Board given use of de-identified data.

Results

Baseline Characteristics

Overall, 2,282,691 women were included, of whom 475,749 (20.84%) had stage 0 disease, 968,698 (42.44%) had stage 1 disease, 559,277 (24.50%) had stage 2 disease, 186,588 (8.17%) had stage 3 disease, and 92,379 (4.05%) had stage 4 disease. Patient characteristics are presented in Table 1.
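The stage distribution above can be checked with a few lines of arithmetic. The sketch below uses only the counts reported in this paragraph; the variable names are illustrative.

```python
# Stage counts as reported in the text.
stage_counts = {"0": 475_749, "1": 968_698, "2": 559_277, "3": 186_588, "4": 92_379}

total = sum(stage_counts.values())

# Percentage of the cohort at each stage, rounded to two decimals as in the text.
stage_pct = {stage: round(100 * n / total, 2) for stage, n in stage_counts.items()}

print(total)
for stage, pct in stage_pct.items():
    print(f"stage {stage}: {pct:.2f}%")
```

The counts sum to the reported cohort size, and the computed percentages match those given in the text.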

Table 1 Sample characteristics of the study population

Non-Hispanic Black and Hispanic Black patients had higher comorbidity rates, with 77.11% and 79.77% reporting a comorbidity score of 0, respectively (vs. 84.55% in non-Hispanic White patients). Non-Hispanic White patients were less likely to live in neighborhoods with low education: 13.09% non-Hispanic White compared with 57.32% Hispanic Black women. Non-Hispanic White and non-Hispanic Other populations were least likely (11.98% and 8.27%, respectively) and non-Hispanic Black women (41.55%) were most likely to live in the lowest-income neighborhoods. Hispanic White and Hispanic Other populations were most likely to be uninsured (7.77% and 7.68%, respectively) compared with non-Hispanic White patients (1.18%). Hispanic patients were more likely to have Medicaid, with Hispanic Black populations reporting the highest proportion at 28.13%.

Breast Cancer Stage at Presentation

Compared with non-Hispanic White patients, Hispanic women overall had greater odds of presenting with later-stage breast cancer (aOR 1.19, 95% CI 1.18–1.21; p < 0.01) [electronic supplementary material (ESM) Table 1], with no significant differences noted after adjusting for socioeconomic status (aOR 1.00, 95% CI 0.99–1.02; p = 0.39). By racial subgroup stratification, all groups besides non-Hispanic Other had greater odds of presenting at a later stage than non-Hispanic White women (non-Hispanic Black: aOR 1.29, 95% CI 1.28–1.30, p < 0.01; Hispanic White: aOR 1.20, 95% CI 1.19–1.21, p < 0.01; Hispanic Black: aOR 1.11, 95% CI 1.03–1.19, p ≤ 0.01; Hispanic Other: aOR 1.14, 95% CI 1.09–1.19, p < 0.01) (Fig. 1a; Table 2). Non-Hispanic Other women were less likely to present at a later stage (aOR 0.93, 95% CI 0.92–0.94; p < 0.01). After adjusting for socioeconomic status, the odds of presenting at a later stage remained robust in non-Hispanic Black women (aOR 1.14, 95% CI 1.13–1.15; p < 0.01), while Hispanic Black women were found to have decreased odds of a later stage (aOR 0.88, 95% CI 0.82–0.95; p < 0.01).

Fig. 1 Proportion of individuals presenting with breast cancer stages 0–IV, stratified by (a) racial subgroup and (b) country of origin

Table 2 Adjusted odds ratios with 95% confidence intervals comparing the odds of presenting at progressively later-stage disease for Hispanic women disaggregated by racial subgroup and Hispanic country of origin, with non-Hispanic White as the reference group

All subgroups by country of origin were found to have increased odds of presenting at a later stage compared with the reference non-Hispanic White group, except Puerto Rican (aOR 1.05, 95% CI 1.00–1.10; p = 0.06), Other Specified Spanish (aOR 0.98, 95% CI 0.92–1.05; p = 0.65), and Dominican (aOR 1.06, 95% CI 0.99–1.14; p = 0.09) women (Mexican: aOR 1.55, 95% CI 1.51–1.60, p < 0.01; Cuban: aOR 1.12, 95% CI 1.05–1.19, p < 0.01; South or Central American: aOR 1.09, 95% CI 1.05–1.13, p < 0.01; Not Otherwise Specified (NOS): aOR 1.18, 95% CI 1.16–1.19, p < 0.01; Spanish surname only: aOR 1.10, 95% CI 1.05–1.16, p < 0.01) (Fig. 1b; Table 2). When accounting for socioeconomic factors, odds of presenting at a later stage remained robust for Mexican women (aOR 1.22, 95% CI 1.18–1.25; p < 0.01) but decreased for other groups (Puerto Rican: aOR 0.89, 95% CI 0.85–0.93, p < 0.01; South or Central American: aOR 0.90, 95% CI 0.87–0.94, p < 0.01; Other Specified Spanish: aOR 0.85, 95% CI 0.79–0.90, p < 0.01; and Dominican populations: aOR 0.81, 95% CI 0.75–0.87; p < 0.01). Across all models, odds of later-stage disease appeared to decline over time.

Analyses were repeated in three subpopulations based on receptor status (Table 2; ESM Table 2). Disparities by race and country of origin appeared consistent or even larger in magnitude. Among patients with triple-negative disease, odds of delayed diagnosis worsened in all racial subgroups besides Hispanic Black (non-Hispanic Black: aOR 1.42, 95% CI 1.38–1.45, p < 0.01; non-Hispanic Other: aOR 1.09, 95% CI 1.03–1.15, p < 0.01; Hispanic White: aOR 1.32, 95% CI 1.26–1.38, p < 0.01; Hispanic Black: aOR 1.23, 95% CI 0.99–1.54, p = 0.06; Hispanic Other: aOR 1.35, 95% CI 1.16–1.59, p < 0.01) (Fig. 1a; Table 2). Odds similarly worsened for Mexican (aOR 1.59, 95% CI 1.43–1.76; p < 0.01), South or Central American (aOR 1.33, 95% CI 1.14–1.55; p < 0.01), and NOS (aOR 1.32, 95% CI 1.25–1.39; p < 0.01) women.

Subgroup analysis was conducted in Black women, with non-Hispanic Black used as the reference group (ESM Tables 3 and 4). Hispanic Black women were less likely to present at later stage (aOR 0.89, 95% CI 0.83–0.95; p < 0.01), with the difference becoming stronger in the mediator analysis (aOR 0.81, 95% CI 0.76–0.87; p < 0.01).

Discussion

This national study of over 2 million women with breast cancer found overall later stage at diagnosis in the aggregate Hispanic population, with further disparities in late presentation among specific Hispanic subgroups. Hispanic women (of Black or White race) were more likely to present at a later stage of cancer than non-Hispanic White women. Additionally, Mexican, Cuban, and South or Central American women were also more likely to present with later-stage disease. Disparities in stage remained for Mexican women after socioeconomic adjustment. Our findings suggest that social determinants of health contribute to why certain Hispanic groups present at a later stage, but cannot fully explain late-stage presentation seen in Mexican women. As Hispanic women were more likely to live in neighborhoods with low income and education, our findings support efforts to target social determinants of health as a means to mitigate breast cancer disparities.

Healthcare disparities among Hispanic patients have been driven by a number of social determinants of health, including structural racism, lower socioeconomic status, language barriers, and lack of insurance.9,12,22 Disparities persist even between Hispanic subgroups, with noted differences in insurance status and healthcare utilization.23 These factors appear to be particularly meaningful for Cuban and Central or South American populations in explaining their increased risk of presenting with later-stage breast cancer. For patients of Cuban descent in our study, there were no significant differences when compared with non-Hispanic White populations after we adjusted for socioeconomic determinants of health. For patients of Mexican descent, lower rates of breast cancer screening could play a significant role. Ramirez et al. found that Mexican women along the Texas–Mexico border had lower rates of mammograms/clinical breast examinations when compared with Central American women in San Francisco, Cuban women in Miami, and Puerto Rican women in New York.24

While ‘Central or South American’ remains a substantially heterogeneous population, disruptions in medical care due to recent migrations may mediate the increased risk. According to the United Nations Population Division, nearly 17.6 million South Americans lived outside the country of their birth. While South American immigrants tend to be slightly more educated and more likely to participate in the labor force, political and humanitarian crises may have led to higher rates of mobility and lower rates of preventative screening prior to arrival in the US.25,26 Given more recent immigration compared with other Hispanic populations, language barriers may be more burdensome in this population. It is important to emphasize that the Central or South American population merits further disaggregation to provide meaningful explanations of observed differences given the inherent diversity in this large group.

Aside from experiences prior to entering the US, diverse immigration patterns have resulted in variations in geographic settlement among Hispanic subgroups, further contributing to differences in health status.27 Citizenship may further play a role, particularly with regard to healthcare access. In our study, Puerto Rican women did not have delayed diagnosis when compared with non-Hispanic White populations. As US citizens, Puerto Rican populations may have better access to healthcare resources and Medicaid than other Hispanic populations.

Our study adds to the growing body of literature supporting a more granular approach in identifying groups for targeted interventions. Research efforts should be directed towards (1) greater understanding of the drivers of disparities through quantitative and qualitative studies; and (2) piloting targeted interventions to benefit the most vulnerable. Additionally, understanding how social factors can mediate disparities in stage at presentation for disaggregated Hispanic patients may inform our understanding of other groups that benefit from disaggregation, such as the Asian American, Native Hawaiian, and Pacific Islander populations.28,29

A challenge in disaggregating Latino data comes from the complex cultural heritages resulting from colonization in both North and South America. Many Latinos today carry a combination of European, Native American, and/or African ancestry based on Latin American country of origin.30,31 Furthermore, differential assimilation to American language and culture by Hispanic populations may also drive variability in access to and acceptance of cancer care, including differences in language barriers, socioeconomic status, nativity status, and healthcare literacy, as well as barriers that result from the deleterious effects of structural racism.31,32,33,34

At the intersection of race and ethnicity, our study shows that both Hispanic Black and non-Hispanic Black women are more likely to present with later-stage disease. Although adjusting for income/education ameliorated this disparity in Hispanic Black populations, non-Hispanic Black women remained at increased odds. When further analyzing by receptor status, Hispanic Black women were less likely to have triple-negative disease than non-Hispanic Black women. This is consistent with previous studies showing higher rates of triple-negative disease in non-Hispanic Black women when compared with Hispanic Black women.3 Few studies have explored the intersecting roles that race and ethnicity play in influencing health behaviors and outcome. LaVeist-Ramos et al. observed that ethnicity plays a dominant role in Hispanic Black communities for health behaviors (diet, alcohol use, smoking status), while race plays a more prominent role in health services (insurance coverage and last visit to the doctor).35 Our findings reinforce the intersectionality inherent in identity, which when analyzed in the context of the social determinants of health, collectively contribute to outcome disparities.

Limitations

Limitations include those related to the primary dataset. Although the NCDB contains the majority of cancer cases in the US, it may not be representative of cancer in the population as a whole. Additionally, Hispanic subgroups defined by NCDB are imperfect, with categories such as Central and South American grouping a number of communities together; certain nondescript groupings such as Other Spanish or NOS are also difficult to interpret.36 Grouping of Hispanic populations by race and country of origin may miss other important cultural factors that may play a role in health disparities, such as language or cross-cultural identity. We also acknowledge that large population studies can miss the vast heterogeneity within cultural identity groups and recognize that an individual patient’s identity goes beyond their cultural and/or ethnic groups. Certain groupings created due to small sample size, such as ‘Hispanic Other’, may obscure potential disparities within traditionally underrepresented groups, such as Indigenous populations. Missingness in the data should also be acknowledged, particularly with regard to patient receptor status; approximately 40–50% of patients were missing information on receptor status in each of the subgroups. Lastly, we recognize the imperfect nature of socioeconomic corrections given the neighborhood-level data on income and education available in NCDB, as uncaptured disparities due to structural racism may not be adjusted for accurately.

Future Study

Our findings support several areas of future exploration. While this work suggests that socioeconomic status may mediate some of the observed disparities among Hispanic populations, further studies with more granular patient-level data are needed to validate this observation. Additionally, some groups, such as Mexican populations, continued to have higher odds of presenting at later-stage cancer in spite of adjustment for socioeconomic status, suggesting other mechanisms may be at play. Future studies should also explore language, social ties, cultural factors, and differential participation in cancer screening that may account for some of the observed disparities. Furthermore, studies should explore the role of acculturation and immigration in healthcare access and stage at presentation; differences may be observed between newly arrived immigrants and other immigrant populations who spent prolonged periods of time in the US. Disparities may further exist due to differences in geographic settlement of Hispanic subpopulations and access to targeted resources, which merit further exploration. Additionally, differences between Hispanic Black and non-Hispanic Black populations remain relatively understudied. Despite significant genetic admixture among Hispanic populations, future research could also consider exploring any potential differences in prevalence of known genetic mutations between Hispanic subgroups. Given their aggregation in this study, populations such as Central and South Americans should be better examined in future work. Importantly, qualitative studies are necessary to shed further light on the reasons underlying the observed disparities, and may help to clarify modifiable factors that could be leveraged to promote equity. From a treatment perspective, future studies should also consider access to treatment in these populations, with particular focus on any differential access to breast conservation versus mastectomy with or without reconstruction. 
Further work is needed to identify the causes of these disparities and may take the form of qualitative research.

Conclusion

There are significant disparities in breast cancer stage at presentation across Hispanic populations stratified by race and country of origin, with socioeconomic status potentially playing a large role in the observed disparities. Further research on potential mechanisms, including language, environmental, and cultural factors, is needed to guide development of targeted interventions to reduce both economic and access inequities, which could lead to improved outcomes in vulnerable Hispanic subpopulations.