Introduction

Breast cancer disparities by race/ethnicity span the continuum from etiology, prevention, early detection, diagnosis, and treatment to survivorship [1,2,3]. To understand the complex interactions between biologic, lifestyle, environmental, social, cultural, and community-level factors that underlie these disparities, research in diverse populations is critical [4, 5]. Racial/ethnic minorities are under-represented in observational studies [5, 6], intervention and clinical trials [7, 8], and biorepositories [9, 10], and it is often assumed that minorities are less willing to participate in biomedical research and provide biospecimens [11]. Multiple barriers precluding minority participation in biospecimen collection and genomics research have been identified [12,13,14,15]. Under-representation of racial/ethnic minorities may also be due to failures in recruitment methodology or lack of opportunities to engage in research rather than an inherent unwillingness to participate in biomedical research [14, 16,17,18].

We report on the enrollment experience from the Northern California site of the Breast Cancer Family Registry [19, 20]. Family studies are a powerful study design to investigate gene–environment interactions [21], but they also present unique challenges, as they depend on participants’ willingness to grant permission to contact family members. We evaluated participation in multiple study components, including biospecimen collection, by race/ethnicity and other factors.

Materials and methods

Study sample

The Northern California family registry site recruited female probands aged 18–64 years newly diagnosed with breast cancer through population-based cancer registries that are part of the National Cancer Institute’s Surveillance, Epidemiology, and End Results (SEER) Program and the California Cancer Registry. The San Francisco Bay Area-based recruitment included invasive or in situ breast cancer cases of any race/ethnicity diagnosed between 1 January 1995 and 30 September 1998 (Phase I); Hispanic, African American, Chinese, Filipina, and Japanese invasive cases diagnosed between 1 October 1998 and 30 April 2003 (Phase II); Hispanic and African American invasive cases diagnosed between 1 May 2003 and 31 August 2009 (Phase III); and triple-negative cases (estrogen receptor negative, progesterone receptor negative, human epidermal growth factor receptor 2 negative) diagnosed between 1 January 2007 and 30 June 2009 (Phase IV). Additionally, Hispanic and African American invasive cases from the Sacramento area diagnosed between 1 January 2005 and 31 December 2006 were identified through the Sacramento and Sierra cancer registries.

Of 34,517 ascertained cases, 1,235 (4%) were deceased, 270 (0.8%) had no physician approval to be contacted, and 3,092 (9.1%) had outdated addresses. Except for 361 Phase I cases (diagnosed before age 35 years, with a prior ovarian or childhood cancer, or bilateral breast cancer with a first diagnosis before age 50 years) that were enrolled without screening, the remaining 29,559 cases were screened by telephone to determine self-identified race/ethnicity and study eligibility. Cases with characteristics suggestive of inherited breast cancer (i.e., diagnosis before age 35 years, prior ovarian or childhood cancer, bilateral breast cancer with a first diagnosis before age 50 years, or first-degree family history of breast, ovarian, or childhood cancer) were invited to enroll in the family registry. Cases diagnosed at ages 35–64 years not meeting these criteria were randomly sampled at rates of 33% for racial/ethnic minorities and 2.5% for non-Hispanic whites (NHWs), given the high volume of NHW cases.
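The case flow and sampling scheme described above can be checked with a few lines of arithmetic. The following Python sketch is illustrative only: the counts and sampling fractions come from the text, whereas the function name `invited_to_enroll`, the group labels, and the seeded random number generator are assumptions made for this example and do not describe the study's actual selection procedure.

```python
import random

# Counts reported in the text.
ASCERTAINED = 34_517
DECEASED = 1_235
NO_PHYSICIAN_APPROVAL = 270
OUTDATED_ADDRESS = 3_092
ENROLLED_WITHOUT_SCREENING = 361  # Phase I cases meeting high-risk criteria

contactable = ASCERTAINED - DECEASED - NO_PHYSICIAN_APPROVAL - OUTDATED_ADDRESS
eligible_for_phone_screening = contactable - ENROLLED_WITHOUT_SCREENING

# Sampling fractions for cases aged 35-64 years without high-risk features.
# The two group labels are hypothetical keys chosen for this sketch.
SAMPLING_FRACTION = {"minority": 0.33, "NHW": 0.025}


def invited_to_enroll(high_risk: bool, group: str, rng: random.Random) -> bool:
    """Hypothetical rule: invite every high-risk case, randomly sample the rest."""
    if high_risk:
        return True
    return rng.random() < SAMPLING_FRACTION[group]


if __name__ == "__main__":
    print(contactable, eligible_for_phone_screening)
    rng = random.Random(0)  # arbitrary seed, used only for reproducibility
    n = 10_000
    sampled = sum(invited_to_enroll(False, "minority", rng) for _ in range(n))
    print(f"simulated minority sampling fraction: {sampled / n:.3f}")
```

The subtraction reproduces the 29,559 cases approached for telephone screening, and the simulation shows how the two sampling fractions would translate into selection probabilities for cases without high-risk features.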

We also enrolled the probands’ adult relatives living in North America, primarily first-degree relatives. The present analysis was limited to parents, full and half-sisters, and adult daughters or sons with a prior diagnosis of breast, ovarian, or childhood cancer.

Data collection

For probands, we collected a family history questionnaire by telephone. Probands and relatives completed a risk factor questionnaire (by home visit if residing in the San Francisco Bay Area [Alameda, Contra Costa, Marin, Monterey, Santa Clara, Santa Cruz, San Francisco, and San Mateo counties] or by telephone if residing elsewhere) and a mailed food frequency questionnaire. Probands and relatives with a prior breast cancer completed a treatment questionnaire and a signed medical release to collect the pathology report and tumor tissue. Living parents of probands also completed the risk factor questionnaire, except for parents of probands diagnosed from 1995 to 1998, for whom we collected the risk factor questionnaire only if they had a prior breast or ovarian cancer. The risk factor questionnaire included questions about race/ethnicity, education, country of birth, year of migration to the U.S., years of residence in the U.S. if foreign-born, and first language learned. If English was not the participant’s first language, English language proficiency was assessed in the questionnaire by asking “Which of these choices best describes how well you speak English?” with response options of “well,” “medium,” “little,” or “not at all.” Data on age and stage at diagnosis and neighborhood socioeconomic status (SES) based on U.S. census data were obtained from the cancer registries.

Biospecimen collection

All probands and relatives who enrolled in the family registry study by completing the risk factor questionnaire were invited to provide a biospecimen sample. For parents of probands diagnosed from 1995 to 1998, we collected a biospecimen sample only if they had a prior breast or ovarian cancer. For local participants (San Francisco Bay Area residents; Alameda, Contra Costa, Marin, Monterey, Santa Clara, Santa Cruz, San Francisco, and San Mateo counties), interviewers/phlebotomists collected the blood sample at the home visit after they administered the risk factor questionnaire; for non-local participants, we mailed a blood collection kit with pre-paid postage for return if they were willing to have their blood drawn at their doctor’s office. Participants who declined a blood draw were invited to provide a mouthwash sample using a mailed mouthwash collection kit.

Data and biospecimen collection procedures

We used several strategies to maximize enrollment and biospecimen collection. Trained professional study interviewers and phlebotomists made multiple attempts by phone to reach study participants or by mail to obtain updated telephone numbers. They made up to 10 attempts to conduct the telephone screening or to schedule a home visit or telephone interview. All study materials, except the diet questionnaire, were translated into Spanish and Chinese, and data and biospecimens were collected by bi-cultural and bi-lingual interviewers and phlebotomists. We matched participants and interviewers/phlebotomists on language and cultural background, when possible. To reduce participant burden, we collected all questionnaire data by home visit or telephone interview at a time that was convenient to the participant, including evenings and weekends, and re-scheduled canceled appointments. Participants received $25 for completing the risk factor questionnaire and $25 for providing a biospecimen. Through the consent form, the participants were informed that de-identified data and biospecimens would be stored for future research by approved investigators. The study protocol was approved by the institutional review board of the Cancer Prevention Institute of California and participants provided written informed consent.

Analytic variables

Analyses of case participation in telephone screening relied on race/ethnicity from the cancer registries, whereas analyses of proband and relative participation in data and biospecimen collection relied on self-reported race/ethnicity. For eligible relatives who did not enroll, we used the proband’s race/ethnicity. We classified race/ethnicity as NHW, Hispanic, African American, Chinese, Filipino, Japanese, other Asian American/Pacific Islander, or other (Native American, mixed race/ethnicity). Stage at diagnosis was based on SEER summary stage (in situ, localized, regional, distant); neighborhood SES is a composite measure of seven SES indicators from 2000 census data [22] and was categorized according to the quintile distribution of all ascertained breast cancer cases. Cancer family history was defined as breast, ovarian, or childhood cancer in first-degree relatives, and personal cancer history was defined as a prior diagnosis of breast, ovarian, or childhood cancer.
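As an illustration of the quintile categorization, the sketch below derives cut points from a reference distribution and assigns each case to a quintile. The scores are made-up numbers, the helper names `quintile_cut_points` and `assign_quintile` are hypothetical, and the labeling of quintile 1 as the lowest SES is an assumption; the composite SES index itself [22] is not reproduced here, and the cut-point convention (the default "exclusive" method of Python's `statistics.quantiles`) may differ from the one used in the study.

```python
import bisect
import statistics


def quintile_cut_points(reference_scores):
    """Return the four cut points that split the reference distribution into fifths."""
    return statistics.quantiles(reference_scores, n=5)


def assign_quintile(score, cut_points):
    """Map a score to quintile 1 (assumed lowest SES) through 5 (assumed highest)."""
    return bisect.bisect_right(cut_points, score) + 1


if __name__ == "__main__":
    # Made-up composite SES scores standing in for all ascertained cases.
    reference = [float(x) for x in range(1, 101)]
    cuts = quintile_cut_points(reference)
    for s in (5.0, 35.0, 50.0, 70.0, 95.0):
        print(s, assign_quintile(s, cuts))
```

Deriving the cut points from all ascertained cases, rather than from enrolled participants only, keeps the categories comparable across the screening, enrollment, and biospecimen analyses.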

Statistical analysis

We evaluated racial/ethnic differences in study participation, defined as completion of the telephone screening interview (breast cancer cases), family history and risk factor questionnaires (probands), risk factor questionnaire (relatives), and biospecimen collection (probands and relatives). We calculated participation rates as the number of subjects who completed the study component divided by the number of eligible subjects. For both enrollment and biospecimen collection, we examined differences in proband participation by race/ethnicity, age at diagnosis, stage, cancer family history, and neighborhood SES, and differences in relative participation by race/ethnicity, proband’s age at diagnosis, and personal cancer history. To evaluate differences in biospecimen collection by other characteristics collected in the risk factor questionnaire (i.e., education, country of birth, age at migration to the U.S., years of residence in the U.S., and English language proficiency), we restricted the analyses to enrolled probands and relatives. For single predictors of study participation, we assessed the statistical significance of differences using Chi-square tests. To assess differences in study participation adjusting for multiple predictors, we used multivariable models to calculate odds ratios (OR) and 95% confidence intervals (CI). The ORs compare the odds of participation with those in the reference group. For probands, we used unconditional logistic regression, whereas for relatives we used the generalized estimating equations (GEE) method on logistic models to account for correlation among relatives from the same family. We used the Wald test to test for significant differences in study participation. Two-sided p values < 0.05 were considered statistically significant, and all analyses were performed using SAS Version 9.4 (SAS Institute, Cary, NC).
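The participation rate and the conversion of a logistic regression coefficient into an OR with a Wald 95% CI can be sketched as follows. This is a minimal illustration, not the SAS code used for the analyses: the function names are hypothetical, the coefficient and standard error are invented values, and model fitting (including the GEE models for relatives) is not reproduced. Only the counts in the first two print statements are taken from the Results.

```python
import math


def participation_rate(completed: int, eligible: int) -> float:
    """Number who completed the study component divided by the number eligible."""
    return completed / eligible


def odds_ratio_with_wald_ci(beta: float, se: float, z: float = 1.96):
    """Convert a log-odds coefficient and its standard error into an
    odds ratio with a Wald confidence interval (95% when z = 1.96)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


if __name__ == "__main__":
    # Counts reported in the Results.
    print(f"screening interview: {participation_rate(25_183, 29_559):.0%}")
    print(f"proband enrollment: {participation_rate(3_620, 4_780):.0%}")

    # Hypothetical coefficient and standard error, for illustration only.
    odds_ratio, lower, upper = odds_ratio_with_wald_ci(beta=-0.386, se=0.105)
    print(f"OR = {odds_ratio:.2f}, 95% CI = {lower:.2f}-{upper:.2f}")
```

Because the interval is computed on the log-odds scale and then exponentiated, it is symmetric around the coefficient but not around the OR itself.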

Of the 4,841 eligible probands who were alive and selected to enroll in the family registry, 61 were from multiple-proband families. For this analysis, secondary or tertiary probands were classified as relatives.

Results

Case screening interview

Of 29,559 incident female breast cancer cases contacted, 25,183 (85%) completed the telephone screening interview (Table 1). Participation differed by race/ethnicity (p < 0.01), with higher rates for African Americans, NHWs, and Hispanics (87–88%) than for Asian American subgroups (76–81%). Participation also differed by age (lowest for ages 18–34 years) and stage at diagnosis (lowest for distant stage), but not by neighborhood SES. Lower participation in screening by Asian Americans was also seen in multivariable adjusted models.

Table 1 Participation in screening interview, by race/ethnicity, and neighborhood socioeconomic status

Proband participation

Family history and risk factor questionnaires

Of 4,780 breast cancer cases selected as probands, 3,620 (76%) enrolled in the family registry study by completing the family history and risk factor questionnaires (Table 2). Characteristics of enrolled probands are shown in Supplemental Table 1. Significant differences by race/ethnicity were found for age at diagnosis, stage, cancer family history, education, and country of birth, but not for neighborhood SES. Enrollment varied by race/ethnicity (p < 0.01) and was highest for NHWs (81%), intermediate (74–76%) for all other groups except Filipinas (66%). Enrollment did not differ by age at diagnosis (p = 0.55) or neighborhood SES (p = 0.45), but was higher for those with a cancer family history (p = 0.01) and lower for those with missing stage (p = 0.01). In multivariable adjusted models, compared to NHWs, enrollment was similar for Hispanics and African Americans, but significantly lower for Asian Americans (OR = 0.68, 95% CI = 0.55–0.83).

Table 2 Proband enrollment and biospecimen collection, by race/ethnicity, age at diagnosis, stage, first-degree cancer family history, neighborhood socioeconomic status, education, and country of birth

Biospecimen collection

Of eligible probands, 3,244 (68%) provided a biospecimen sample. Participation differed by race/ethnicity, ranging from 76% among NHW women to 50% among Filipinas, and was higher for probands with a cancer family history than for those without (71% vs. 66%, p < 0.01). Among enrolled probands who completed the risk factor questionnaire, biospecimen collection differed by race/ethnicity (p < 0.01), with similarly high participation by African Americans, NHWs, and Hispanics (92–95%), but notably lower participation by some Asian American subgroups (72–88%; Table 2). Participation in biospecimen collection also differed by cancer family history (p < 0.01), education (p < 0.01), and country of birth (p < 0.01). In multivariable adjusted models, participation in biospecimen collection was 70% lower for Asian Americans than for NHWs, did not differ significantly for Hispanics and African Americans, and was 34% lower for foreign-born than for U.S.-born probands (p = 0.01). Overall, 92% of biospecimens collected were blood samples and 8% were mouthwash samples, with some variation by race/ethnicity (p = 0.01). The proportion of blood samples was highest for Hispanics (97%), followed by Japanese (94%), NHWs (93%), African Americans (90%), Filipinas (89%), Chinese (86%), and other Asians/others (86%) (data not shown in tables).

Relative participation

Of 3,620 enrolled probands, only 62% had eligible first-degree relatives that could be contacted (Supplemental Table 2). Seven percent had no living first-degree relatives, ranging from 3% for Filipinas to 14% for NHWs; 9% had no first-degree relatives living in North America, ranging from 1% for African Americans to 25% for other Asian Americans/others; and 21% did not give permission to contact their relatives, ranging from 11% for NHWs to 35% for Chinese. Overall, 38% of probands did not have a first-degree relative we could contact for enrollment, ranging from 28% for NHWs to 58% for Chinese.

Risk factor questionnaire

Of 4,279 eligible first-degree relatives, 3,306 (77%) enrolled in the family registry study by completing the risk factor questionnaire (Table 3). Characteristics of enrolled relatives are shown in Supplemental Table 3. Statistically significant differences by race/ethnicity were found for age at interview, personal cancer history, education, and country of birth. Relative enrollment differed by race/ethnicity (p < 0.01), and was highest for NHWs (87%); intermediate (74–80%) for African Americans, Hispanics, Japanese, and other Asian Americans/others; and lowest (67%) for Chinese and Filipinas. Enrollment was similar for relatives with or without a personal cancer history (78% vs. 77%) and was higher for relatives of younger probands than those of older probands (84% vs. 76%, p < 0.01). In multivariable adjusted models, race/ethnicity and proband’s age at diagnosis were significant predictors of enrollment.

Table 3 First-degree relative enrollment and biospecimen collection, by race/ethnicity, age, personal cancer history, education, and country of birth

Biospecimen collection

Of eligible relatives, 2,774 (65%) provided a biospecimen sample. Participation differed by race/ethnicity (p < 0.01), ranging from 76% among NHWs to 53% among Filipinas. Participation was higher for relatives of younger probands (p ≤ 0.01), but did not differ by the relative’s personal cancer history. Of 3,306 enrolled relatives, 2,774 (84%) provided a blood or mouthwash sample, with significant differences by race/ethnicity (p < 0.01); rates ranged from 82 to 87% for African Americans, Hispanics, and NHWs (Table 3). For Asian Americans, biospecimen collection was 78% overall, and ranged from 71 to 88% for specific subgroups. Biospecimen collection was higher for younger than older relatives (p < 0.01), but did not differ by education (p = 0.65) or personal history of cancer (p = 0.69). In multivariable models, ORs for biospecimen collection were 1.00 (95% CI = 0.71–1.42) for Hispanics, 0.70 (95% CI = 0.51–0.97) for African Americans, and 0.59 (95% CI = 0.42–0.82) for Asian Americans, compared to NHWs.

Proband and relative biospecimen collection by migration history

Among enrolled Hispanic probands, biospecimen collection (95% participation overall) did not differ by education, country of birth, age at migration to the U.S., duration of residence in the U.S., or English proficiency (Table 4). In multivariable adjusted models, none of the differences in participation were statistically significant. In contrast, among Asian Americans, biospecimen collection rates were higher among more educated probands, and those who were U.S.-born, migrated to the U.S. before age 20 years, lived in the U.S. for ≥ 40 years, or spoke English well or English only (all p values < 0.01). In multivariable models, education, country of birth, years of residence in the U.S., and English language proficiency remained significant predictors of biospecimen collection.

Table 4 Biospecimen collection for enrolled Hispanic and Asian American probands and first-degree relatives, by migration history

Among enrolled Hispanic relatives, biospecimen collection was lower compared to Hispanic probands (87% vs. 95%), and higher for less educated versus more educated relatives (p = 0.02), foreign-born versus U.S.-born relatives (90% vs. 84%; p = 0.01) and those who lived in the U.S. < 40 years versus ≥ 40 years (89–96% vs. 85%; p < 0.01) (Table 4). In multivariable models, country of birth and years of residence in the U.S. remained significant predictors of biospecimen collection. In contrast, among enrolled Asian Americans, biospecimen collection was similar for relatives and probands (78% vs. 77%), lowest for those with low education (p < 0.01), higher for U.S.-born than foreign-born relatives (87% vs. 73%; p < 0.01) and differed by migration history, with the highest participation for relatives who migrated to the U.S. before age 30 years (79–81%), lived in the U.S. for ≥ 40 years (87%), and spoke English well or English only (82%). In multivariable models, only education remained a significant predictor of biospecimen collection; country of birth was a predictor of borderline significance.

Discussion

In this population-based family cohort, study participation was generally high, with some variation by race/ethnicity. Participation in telephone screening was similar for female Hispanic, African American, and NHW breast cancer cases, but lower for Asian American subgroups. Proband enrollment was highest for NHWs, and intermediate for all other groups, except Filipinas. A similar enrollment pattern by race/ethnicity was seen for first-degree relatives. Biospecimen collection rates both for probands and relatives were lowest for Asian Americans, with considerable variation across Asian American subgroups.

Our enrollment rates for Hispanic, African American, and NHW probands are comparable to the participation rates in the San Francisco Bay Area Breast Cancer Study (SFBCS), a population-based case–control study [23]. Both studies used similar recruitment methods and collected interview data and biospecimens through home visits. Proband enrollment was somewhat lower than case participation in SFBCS (Hispanics: 75% vs. 89%; African Americans: 76% vs. 87%; NHWs: 81% vs. 86%), possibly due to long-term follow-up and involvement of family members.

One of our key findings is the high biospecimen collection rate for enrolled Hispanic and African American probands, consistent with the high rates for SFBCS cases who completed the interview [24] (Hispanics: 95% vs. 88%; African Americans: 92% vs. 85%; NHWs: 94% vs. 90%). Furthermore, most participants provided a blood sample rather than a mouthwash sample, demonstrating that California Hispanic and African American women with breast cancer are as willing as their NHW counterparts to donate blood for biomedical research.

Reports on biospecimen collection rates in racial/ethnic minorities are sparse. The Southern Community Cohort Study obtained blood or buccal samples for 96% (half were blood samples) of African Americans recruited from community health centers [25]. Blood or saliva collection by home visits was also high for African Americans in the North Carolina Colorectal Cancer Study (94% overall, 79% for blood) [26]. The Black Women’s Health Study obtained mailed buccal samples for 51% [27], and a blood sample collected at a nearby clinical center for 35% of 1,500 pilot study participants [28]. These data suggest that high biospecimen collection rates are more difficult to attain in studies that increase participant burden (i.e., return of biospecimens by mail, clinic visits). Studies, such as ours, that reduce participant burden through home visits may be more effective in achieving high biospecimen collection. Home visits are more costly, particularly for geographically dispersed participants, but large-scale repeated mailings of biospecimen collection kits that are not returned also come at a considerable cost [29].

Our high biospecimen collection rate for Hispanics is consistent with the Mano a Mano cohort of Mexican Americans [30], which obtained biospecimens (blood, cheek cell, or urine samples) for 94% (collection rates for specific biospecimens were not provided). That cohort had similar collection rates for U.S.-born and foreign-born Hispanics, consistent with our findings for enrolled probands; among our enrolled Hispanic relatives, in contrast, foreign-born relatives were twice as likely to participate in biospecimen collection. Consent to give blood and urine samples for future research did not differ between Mexican Americans and NHWs who participated in the 2011–2012 National Health and Nutrition Examination Survey [31]. Focus groups and surveys assessing knowledge and beliefs about biospecimens among Hispanics have also found high willingness to provide biospecimens for biomedical research [32,33,34,35,36].

For Asian Americans, proband and relative enrollment and biospecimen collection were considerably lower, as reported by others [31, 37], and differed between Asian American subgroups. Associations with migration-related variables differed from those for Hispanic women. Among both enrolled Asian American probands and relatives, biospecimen collection was lower for foreign-born than U.S.-born Asian Americans, whereas among enrolled Hispanics, biospecimen collection was higher for foreign-born relatives. Duration of residence in the U.S. and English language proficiency were significant predictors for enrolled Asian American probands only. The lack of interviewers who spoke Filipino languages may have contributed to the lower enrollment of Filipina cases in our study. Greater reluctance to participate in biospecimen-based research by foreign-born Asian Americans and more recent immigrants may be related to cultural beliefs and lack of knowledge about cancer, biospecimens, and biobanking [38]. Culturally relevant educational programs for Chinese Americans have been successful at increasing knowledge about biospecimens, addressing informed consent procedures and privacy concerns, and generally encouraging participation in biomedical research [14, 39,40,41].

Enrollment of family members presented several challenges. The willingness to grant access to first-degree relatives varied by race/ethnicity, with about a third of Chinese probands not granting permission to contact relatives. Family size and immigrant background also affected the availability of relatives for enrollment; 14% of NHW probands did not have any first-degree relatives who were alive, and 19% of Chinese probands did not have any first-degree relatives who lived in North America. Overall, only 62% of probands had relatives whom we could approach for enrollment, with the lowest percentage for Chinese Americans (42%). Therefore, the enrolled relatives may not be representative of all eligible relatives.

Race/ethnicity was the only statistically significant predictor consistently associated with enrollment and biospecimen collection, both among probands and relatives, with generally similar participation for Hispanics, African Americans, and NHWs, but lower participation in some Asian American subgroups. It is reassuring that case screening, proband enrollment, and biospecimen collection varied little by neighborhood SES. For enrolled probands and relatives, biospecimen collection varied by education, though only among Asian Americans (data not shown for NHWs and African Americans). Similarly, other studies found only small differences in biospecimen collection by education [42] and among Hispanics and African Americans specifically [9, 27].

We employed several strategies to maximize enrollment and biospecimen collection, including data collection by home visits and telephone interviews, which help build rapport and trust with participants, allow participants’ concerns to be addressed and resolved in a timely manner, and overcome literacy issues. Home visits also reduce participant burden and help mitigate barriers such as lack of transportation or interference with family or work responsibilities [43]. Bi-lingual research staff and multi-lingual and culturally sensitive study materials are essential in multiethnic and immigrant study populations, and concordance in language and culture between study participants and interviewers has been shown to increase participation [44]. Community-based participatory research approaches and community outreach have also been shown to be effective in the recruitment of minority populations [45,46,47,48].

Our data and those from other epidemiologic studies demonstrate that minorities are willing to donate biospecimens for biomedical research [24,25,26, 30]. However, given numerous barriers that may hinder participation in research [12, 13, 49, 50], special efforts should be directed towards giving minorities the opportunities to participate in research studies and facilitating their participation by reducing barriers. Increasing knowledge about biospecimens through educational programs and greater transparency by study investigators also helps overcome issues of distrust and reluctance to participate in biospecimen collection [14, 51]. It is important that populations from diverse racial/ethnic and socioeconomic backgrounds are given the opportunity to participate in biomedical research [5] because research that is inclusive of all populations is critical for informing targeted cancer control and prevention strategies, personalized medicine, and health policy.

Our study has several important strengths, including the population-based design for the recruitment of probands, purposeful oversampling of racial/ethnic minorities, and comprehensive collection of breast cancer risk factors by questionnaire and clinical and tumor characteristics from cancer registry records. Recruitment shortly after diagnosis and during treatment may be challenging, but we were successful at contacting cases at least 6 months after breast cancer diagnosis as soon as cases from the cancer registries became available. Women who died soon after diagnosis did not have the opportunity to enroll in the study, but given the high survival rate, the proportion of women who had died before being contacted for eligibility screening was small (3%). Other limitations include inability to assess study eligibility for all breast cancer cases due to language barriers at the screening level, and the less than optimal participation by Filipina and Chinese women. It is also possible that study participation in other U.S. regions differs for minority populations with different cultural backgrounds, countries of origin, or sociodemographic characteristics compared to the urban populations of the San Francisco Bay Area.

Conclusions

Our results show that racial/ethnic minority populations are willing to participate in research and provide biospecimen samples, although recent immigrants may need more directed recruitment methods. Future studies should prioritize culturally sensitive approaches in their design in order to maximize recruitment and biospecimen collection, especially among Asian Americans.