BACKGROUND

The existence of racial disparities in medical treatment, health service utilization and patient–provider interactions is supported by a large body of research from around the world.1–4 Although research on healthcare provider racism was first conducted over 30 years ago,5 it was not until the publication of the landmark report ‘Unequal Treatment’6 that racism was recognized as a key driver of racial/ethnic disparities in healthcare. Over a decade later, there now exists a substantial body of literature devoted to this topic,7 including reviews on perceptions of racial discrimination in healthcare8 and the impact of racism for racial/ethnic minority patients in the U.S.9

Racism can be defined as phenomena that maintain or exacerbate avoidable and unfair inequalities in power, resources or opportunities across racial, ethnic, cultural or religious groups. Racism can be expressed through beliefs (e.g. negative and inaccurate stereotypes), emotions (e.g. fear or hatred) or behaviors/practices (e.g. discrimination or unfair treatment) and can occur at three levels: internalized (incorporating racist beliefs into one’s worldview); interpersonal (racist interactions between individuals); and systemic/institutional (racism occurring through policies, practices or processes within organizations/institutions).10

The National Research Council recognized that: “no single approach to measuring racial discrimination allows researchers to address all the important measurement issues or to answer all the questions of interest”.11 Leading scholars in the study of healthcare provider racism have also noted the need for multi-method studies.7 , 12

Focusing on interpersonal racism rather than internalized or systemic/institutional racism, this paper reviews worldwide evidence (from 1995) for racism among healthcare providers, while also comparing existing measurement approaches to emerging best practice. Notwithstanding the need to understand and address systemic/institutional and internalized racism within healthcare provision,13 , 14 methods to measure these levels of racism, as well as mechanisms of influence, are heterogeneous14 and require separate consideration. Systemic racism within healthcare9 , 14 and the health effects of internalized racism15 , 16 have been the focus of previous reviews.

OBJECTIVES

To systematically review and appraise evidence of healthcare provider racism and assess current approaches to measuring racism amongst healthcare providers.

METHODS

Data Sources

The following databases and electronic journal collections were searched for studies published between January 1995 and June 2012: Medline, CINAHL, PsycInfo and Sociological Abstracts. Authors’ own reference databases and reference lists of included studies were also searched (see Appendix for details).

Study Selection

Types of Studies

Published English-language empirical studies of any design measuring healthcare provider racism (including theses and dissertations).

Racism as reported by patients is beyond the scope of this review. Also excluded are studies focused on knowledge of minority group patients, cultural difference, cross-cultural practice and cultural competence. Moreover, because a range of factors drive racial disparities in health, we only consider disparities to be indicative of racism when found in experimental studies that are robust to alternative explanations. This includes disparities in provider diagnosis; treatment recommendations; behavior/communication; and patient satisfaction, adherence or utilization.

Types of Participants

Healthcare providers included physicians, nurses and allied healthcare professions (such as physiotherapists, social workers) and support staff (e.g. nursing aides and attendants, allied health assistants) involved in direct patient care. Reception and administration staff with direct patient contact were also included. Studies solely focused on medical or allied health students and/or their teaching staff were excluded. Health students and their teaching staff are considerably different from each other, as well as from providers themselves. Students are likely to be negotiating formation of their own individual and professional provider identity, both of which are likely to influence race-related attitudes, beliefs and behaviors.17

Titles and abstracts of all identified studies were screened for inclusion by the second and third authors, with the first author independently screening a 5 % random sample. Inclusion/exclusion decisions did not differ between reviewers.
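As an illustration of how agreement on a double-screened sample can be quantified, the following minimal Python sketch draws a random check sample and computes raw agreement and Cohen's kappa. The function names, the seed and the example decisions are hypothetical and are not taken from this review, which reports only that the reviewers' decisions did not differ.

```python
import random
from collections import Counter


def draw_check_sample(record_ids, fraction=0.05, seed=0):
    """Draw a reproducible random subset of records for independent screening."""
    ids = list(record_ids)
    k = max(1, round(len(ids) * fraction))
    return random.Random(seed).sample(ids, k)


def percent_agreement(a, b):
    """Share of records on which two reviewers made the same decision."""
    if len(a) != len(b) or not a:
        raise ValueError("ratings must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohens_kappa(a, b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(a)
    p_observed = percent_agreement(a, b)
    count_a, count_b = Counter(a), Counter(b)
    p_chance = sum(count_a[k] * count_b[k] for k in set(a) | set(b)) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both reviewers used one identical category throughout
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical decisions on a small double-screened sample
reviewer_1 = ["include", "exclude", "exclude", "include", "exclude", "exclude"]
reviewer_2 = ["include", "exclude", "exclude", "include", "exclude", "exclude"]
print(percent_agreement(reviewer_1, reviewer_2), cohens_kappa(reviewer_1, reviewer_2))
```

Reporting kappa alongside raw agreement is one common choice because raw agreement can be high by chance alone when most records are excluded.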

Data Extraction

Data were extracted for each eligible study by the second author and by the first author on a random selection of 10 %, with full agreement between authors observed. Variables for data extraction from each study were:

  • Study design and objectives;

  • Method of measurement, constructs measured, type of tool;

  • Healthcare provider and patient characteristics using PROGRESS-PLUS18 (Place of residence, Race/ethnicity, Occupation, Gender, Religion, Education, Socio-economic status, Social capital/networks and age, disability and sexual orientation);

  • Healthcare setting (e.g. primary care, tertiary);

  • Country and language of study; and

  • Study outcomes (reporting of PROGRESS-PLUS at outcome).
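The extraction variables listed above could be captured in a structured record along the following lines. This is a sketch only: the class name, field names and the `unreported` helper are hypothetical and do not describe the extraction form actually used in this review.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# PROGRESS-PLUS dimensions as listed in the text (keys are illustrative names)
PROGRESS_PLUS = (
    "place_of_residence", "race_ethnicity", "occupation", "gender",
    "religion", "education", "socioeconomic_status", "social_capital",
    "age", "disability", "sexual_orientation",
)


@dataclass
class ExtractionRecord:
    """One row of a data extraction sheet for an included study."""
    study_id: str
    design: str
    objectives: str
    measurement_method: str          # e.g. "direct", "indirect" or "both"
    constructs: List[str]
    tool_type: str
    setting: str                     # e.g. "primary care", "tertiary"
    country: str
    language: str
    provider_characteristics: Dict[str, Optional[str]] = field(default_factory=dict)
    patient_characteristics: Dict[str, Optional[str]] = field(default_factory=dict)
    outcomes_reported_by: List[str] = field(default_factory=list)

    def unreported(self, who: str = "provider") -> List[str]:
        """Return the PROGRESS-PLUS dimensions a study did not report."""
        chars = (self.provider_characteristics if who == "provider"
                 else self.patient_characteristics)
        return [dim for dim in PROGRESS_PLUS if not chars.get(dim)]


# Hypothetical example record
example = ExtractionRecord(
    study_id="S01", design="vignette experiment", objectives="illustrative",
    measurement_method="indirect", constructs=["treatment recommendation"],
    tool_type="written vignette", setting="primary care",
    country="US", language="English",
    provider_characteristics={"gender": "reported", "occupation": "physician"},
)
print(example.unreported("provider"))
```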

Quality Assessment

The quality of each eligible study was assessed by the second author (and 10 % by the first author) using the Health Evidence Bulletin Wales critical appraisal tool adapted from the Critical Appraisal Skills Programme (CASP) (http://hebw.cf.ac.uk/projectmethod/appendix5.htm#top). This tool assesses key domains of study quality, including clarity of aims, appropriateness and rigor of design and analysis, including risk of bias, and relevance of results.19 The first author also reviewed each completed critical appraisal tool for accuracy. Differences in assessment were resolved by discussion between the two reviewers.

RESULTS

A total of 37 studies published between January 1995 and June 2012 met the inclusion criteria (asterisked in the reference list). See Figure 1 for flow chart of search. A summary of the key characteristics of each study is detailed in Table 1, including study design, country, healthcare setting, provider profession and racial/ethnic background of provider and patient (where applicable). Statistically significant evidence of racist beliefs, emotions or practices among healthcare providers in relation to minority groups was evident in 26 of these studies. No particular patterns emerged by country, study population, healthcare setting or measurement approach.

Figure 1. Systematic review flowchart (initial search conducted December 2010).

Table 1 Characteristics of 37 Studies of Reported Healthcare Provider Racism in Relation to Minority Groups

Measurement of Racism

Direct measures of racism occur when the attribute being assessed is asked about specifically, while indirect measures require inference from collected data.20 , 21 Although the terms direct and indirect are utilized in this review, various other terms are commonly used, including: automatic vs. controlled, spontaneous vs. deliberate, implicit vs. explicit, impulsive vs. reflective and associative vs. rule-based/propositional.22

Direct Measures

Self-completed surveys were the most commonly utilized direct measurement approach.23–37 Van Ryn & Burke23 assessed beliefs about patient abilities and personality characteristics through physicians rating a series of semantic differentials (intelligent-unintelligent; self-controlled-lacking self-control; pleasant-unpleasant; educated-uneducated; rational-irrational; independent-dependent; and responsible-irresponsible). Providers rated patients on stereotypes in terms of how likely they were to: lack social support; exaggerate discomfort; fail to comply with medical advice; abuse drugs, including alcohol; desire a physically active lifestyle; participate in cardiac rehabilitation (if it were prescribed); try to manipulate physicians; initiate a malpractice suit; have major responsibility for the care of a family member(s); and have significant career demands/responsibilities.23

Van Ryn & Burke23 also assessed social distance with one item (‘this patient is the kind of person I could see myself being friends with’), while Green et al.32 utilized the 7-item Affective Racial Attitudes Scale38 to assess social distance and inter-group contact (e.g. ‘my friendship network is very racially mixed’). Sabin et al.35 and Green et al.36 assessed feelings of warmth towards African and European Americans (0 cold to 10 warm), while Sabin et al.35 also assessed stereotypes of compliance by asking providers whether African or European Americans were more generally likely to be compliant patients. Sabin et al.35 asked whether Blacks or Whites were more likely to receive appropriate treatment generally, and in the providers’ own workplace. Sabin et al.37 and Green et al.36 directly assessed racial preference (e.g. ‘I like White Americans and African Americans equally’ to ‘I moderately/strongly prefer African Americans to White Americans’).

Mitchell & Sedlacek29 used the 100-item semantic differential Situational Attitude Scale39 to assess ten emotional reactions relating to Hispanics, African Americans and people of an unspecified racial group across ten social situations controlling for social desirability (e.g. for the situation ‘New family next door’, the items are: good-bad, safe-unsafe, angry-not angry, friendly-unfriendly, sympathetic-not sympathetic, nervous-calm, happy-sad, objectionable-acceptable, desirable-undesirable, suspicious-trusting).

The 20-item Ethnic Attitude Scale40 , 41 assessed beliefs in response to a clinical scenario/vignette,30 while Constantine et al.27 measured racism towards Blacks using the 7-item New Racism Scale42 as well as views of White privilege using the Awareness subscale of the Multicultural Counseling Knowledge and Awareness Scale.43 Constantine et al.27 also utilized the 43-item Visible Racial/Ethnic Identity Attitude Scale44 and 50-item White Racial Identity Attitude Scale45 to assess a range of race-related beliefs, emotions and behaviors, while Michaelsen et al.25 assessed knowledge of and attitudes towards immigrants with 29 items. Penner et al.24 utilized a 25-item scale46 , 47 and Green et al.32 a 9-item scale38 to measure beliefs about race-related policies, including awareness of contemporary racism.

Paez et al.28 used one item each to assess belief in race-based meritocracy, White privilege and assimilationist ideology. Middleton et al.26 assessed self-perceptions of racism among providers (‘When working with minority individuals, I am confident that my conceptualization of client problems do not consist of stereotypes and biases’ and ‘When working with minority clients, I perceive that my race causes clients to mistrust me’) within the 40-item Multicultural Counseling Inventory.

Indirect Measures

Vignettes are indirect measures that infer bias in diagnosis, recommended treatment or patient characteristics (i.e. practices/behaviors) from differential response to hypothetical situations that are identical except for the race/ethnicity of the patients involved. Vignettes are primarily based on brief written scenarios, but can also include more detailed approaches such as medical chart abstraction48 and audio-visual material. For example, Hirsh et al.49 utilized 20-second audio-visual clips of virtually generated characters along with vignettes to examine the influence of contextual information (i.e. sex, race and age) on pain-related decisions among nurses. Vignettes, as an indirect approach to measuring racism, were more commonly utilized than self-completed surveys among studies included in this review.35 , 36 , 49–65 , 70
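The logic of a vignette comparison can be sketched in a few lines of Python: providers are shown otherwise identical cases that differ only in patient race/ethnicity, and the proportion recommending a treatment is compared across arms. The two-proportion z-test below is one simple analysis chosen for illustration; it is not the method of any particular study in this review, and the counts in the example are invented.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test for a difference between two independent proportions.

    x1/n1: providers recommending treatment in vignette arm 1
    x2/n2: providers recommending treatment in vignette arm 2
    Returns (difference in proportions, z statistic, two-sided p-value).
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        raise ValueError("no variation in responses; test is undefined")
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return p1 - p2, z, p_value


# Hypothetical counts: 90 of 100 providers recommend referral in one arm,
# 80 of 100 in the other arm
diff, z, p = two_proportion_z_test(90, 100, 80, 100)
print(f"difference={diff:.2f}, z={z:.2f}, p={p:.3f}")
```

Because the vignettes are identical apart from race/ethnicity, a difference between arms is harder to attribute to clinical factors, which is the inferential step the design relies on. Published studies typically use regression models that also adjust for provider characteristics.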

A range of computer-based indirect measures were also used, notably several versions of the Implicit Association Test (IAT).66 In the studies reviewed here, the IAT involved a comparison of two target objects that produced a measure of relative preference for one race over another. Participants are required to categorize a set of names or faces in terms of their membership in a relevant category (e.g., a race/ethnicity). The IAT differs from priming tasks (semantic or evaluative/affective) in which participants are not explicitly required to process the category membership of the presented stimuli. Instead, primes (either words or images) relating to particular racial/ethnic groups are presented very briefly (80–300 ms), such that they are not consciously recalled by participants. This can be preceded by a cover-up task to allay suspicion. A neutral mask (also a word or image) before and/or after can also be used to reduce visibility of the prime. While there is some evidence that indirect measures are correlated,67 it is likely that measures using different underlying mechanisms such as IATs and priming tasks produce distinct results.68
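To make the idea of a "relative preference" score concrete, the sketch below computes a simplified IAT effect size from response latencies: the difference in mean latency between the two pairing conditions divided by the standard deviation of all retained trials. This is an assumption-laden simplification for illustration. The scoring procedures used by the reviewed studies are not described here, and published scoring algorithms include further steps (for example, error penalties and separate practice and test blocks) that are omitted. All names and latencies are hypothetical.

```python
from statistics import mean, stdev


def simplified_iat_score(block_a_ms, block_b_ms, max_latency_ms=10_000):
    """Standardized latency difference between two IAT pairing conditions.

    block_a_ms: latencies (ms) when categories are paired one way
    block_b_ms: latencies (ms) when the pairing is reversed
    A positive score means slower responding in block B than in block A.
    """
    a = [t for t in block_a_ms if t <= max_latency_ms]  # drop extreme trials
    b = [t for t in block_b_ms if t <= max_latency_ms]
    if len(a) < 2 or len(b) < 2:
        raise ValueError("need at least two retained trials per block")
    sd_all = stdev(a + b)  # SD across all retained trials from both blocks
    if sd_all == 0:
        raise ValueError("latencies do not vary; score is undefined")
    return (mean(b) - mean(a)) / sd_all


# Hypothetical latencies in milliseconds
block_a = [600, 650, 700]
block_b = [800, 850, 900]
print(round(simplified_iat_score(block_a, block_b), 2))
```

Because the score is a difference between two conditions, it expresses only a relative preference for one group over another.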

The IATs reviewed here evaluated Black-White race generally;24 , 35–37 , 65 , 69 , 70 stereotypes about Blacks being uncooperative;36 stereotypes about Blacks being medically uncooperative;36 race in relation to compliant patients;35 and race in relation to the quality of medical care.35 Affective (also known as evaluative) priming tasks were also utilized.53 , 70 , 71 Stepanikova53 used racial labels (African American and Hispanic) and one Black stereotype-related word (i.e. rap) along with an initial cover-up task and a mask presented after each prime, while Abreu71 used stereotypes (Negroes, Blacks, lazy, blues, rhythm, Africa, stereotype, ghetto, welfare, basketball, unemployed, and plantation) with a mask presented after each prime. Moskowitz70 used stereotyped African American diseases such as HIV, hypertension and drug abuse mixed with non-stereotyped diseases such as chicken pox, leukemia and Crohn’s disease.

As an alternative indirect measure, Balsa et al.72 tested for statistical discrimination in relation to survey and interview data. This study examined the extent to which doctors’ rational behavioral reactions to clinical uncertainty explained racial differences in the diagnosis of depression, hypertension and diabetes.

Studies Utilizing Both Direct and Indirect Measures

Five studies used both direct and indirect measures of racism.24 , 31 , 35 , 37 , 69 Cooper et al.69 used both the IAT and self-reported measures (designed to assess concepts in the IATs, including preferences or feelings toward and perceived cooperativeness of Whites and Blacks). Sabin et al.35 , 37 and Penner et al.24 utilized explicit measures of racial attitudes/prejudice in addition to the IAT. Joseph31 used a vignette and questions about the patient care situation, followed by three questions related to cultural diversity.

Extent of Racism

Eleven vignette-based studies found that race influences the medical decision making of healthcare practitioners in relation to minority groups,35 , 36 , 49 , 53 , 54 , 56–59 , 61 , 65 whereas eight studies found no association.50–52 , 55 , 60 , 62–64 For example, Schulman et al.56 found that physicians were less likely to refer Black women for cardiac catheterization, even after adjusting for symptoms, the physicians’ estimates of the probability of coronary disease and clinical characteristics. In contrast, Weisse et al.55 , 63 found that physician decisions related to pain management were not influenced by race.

Four studies24 , 35–37 utilizing the IAT found that implicit racial bias existed among healthcare providers in the absence of explicit bias. In Stepanikova’s53 study, physicians’ medical decisions were influenced when subliminally exposed to Black and Hispanic stimuli. In Abreu’s71 study, participants primed with stereotypes related to African Americans rated a hypothetical patient more negatively.

Four studies using direct measures showed evidence of racism,23 , 29 , 31 , 34 while two studies did not find such evidence.32 , 35 Van Ryn & Burke23 found that physicians were less likely to have positive perceptions of Black than White patients across several dimensions, including compliance with medical advice and level of intelligence. Moskowitz et al.34 found that physicians had lower trust in non-White, compared with White, patients. In contrast, Sabin et al.35 found no significant differences in reported feelings towards European Americans and African Americans.

Pagotto et al.33 found that hospital workers’ interactions with immigrants were associated with lower levels of prejudice towards immigrants in general. Green et al.32 found that social workers possess the same ambivalence and social distance about race as the broader U.S. population, while Noble et al.30 found increased religiosity among hypothetical Jewish patients in clinical scenarios was associated with more racism against them. Balsa et al.72 found that physicians’ perceptions about the prevalence of disease across racial groups was associated with racial differences in the diagnosis of hypertension and diabetes. Although measuring healthcare provider racism, neither Middleton et al.26 nor Constantine et al.27 reported specifically on these findings.

Study Quality

Study quality was assessed in relation to the following areas: clarity of aims, appropriateness and rigor of design and analysis, including risk of bias, and relevance of results.19 The majority of studies were of moderate quality. All studies were cross-sectional, therefore limiting causal inference. Major methodological limitations of studies were: small sample sizes (e.g. n = 11,70 n = 1524),28–31 , 34 , 36 , 49 , 58 , 65 , 69 , 71 low response rates (e.g. 1–2 %,53 11 %26),25 , 32 , 35 , 54 , 57 , 60 , 63 non-representative samples (e.g. army nurses working in one hospital,31 infectious disease physicians59),52 , 71 threats to internal validity due to social desirability (e.g. Constantine et al.,27 Weisse et al.63),32 , 50 not controlling for confounders (e.g. gender),23 , 29 , 33 , 37 , 51 , 54 , 55 , 57–61 , 63 , 70 , 72 using non-randomised samples,28 , 37 , 49 , 55 , 69 and utilizing limited statistical analysis.29 , 31 , 55 , 63 , 70 Thirty of the thirty-seven studies were conducted in the United States, limiting generalizability of results.

DISCUSSION

Over two-thirds of studies included in this review found evidence of racism among healthcare providers. This includes racist beliefs, emotions and behaviors/practices relating to minority patients. No particular patterns emerged by country, study population, healthcare setting or measurement approach. A plethora of measurement approaches were used with little consistency across the included studies. Self-completed surveys were the most commonly utilized direct measurement approach, including assessment of patient abilities and characteristics, stereotypes, social distance, intergroup contact, perception of appropriate treatment, racial preference, emotional reactions and feelings of warmth towards racial/ethnic groups, as well as race-related beliefs and attitudes including White privilege and awareness of contemporary racism. Indirect measures consisted predominantly of clinical scenario vignettes or computer-based versions of the Implicit Association Test (IAT). Five studies used both direct and indirect measures. Eleven vignette-based studies found that race influences the medical decision making of healthcare practitioners, whereas eight studies found no association. Four studies utilizing the IAT found that implicit racial bias existed among healthcare providers in the absence of explicit bias. Four studies using direct measures showed evidence of racism, while two studies did not find such evidence. Findings of this review have substantial relevance to medical and healthcare provision, and highlight an ongoing need to recognize and counter racism among healthcare providers. A critical starting point in such endeavors is a more rigorous, sophisticated and systematic approach to monitoring racism among healthcare providers. 
Concurrently, the implementation and evaluation of multi-strategy, evidence-based, anti-racism approaches that dispel false beliefs and counter stereotypes, build empathy and perspective taking, develop personal responsibility and positive group norms, as well as promote intergroup contact and intercultural understanding73 within healthcare settings are also required.

Studies included in this review were almost solely conducted with physicians in the U.S. As a result meaningful comparison of differences in racism between provider categories was not possible. Further research is required to examine and compare racism among healthcare providers from other professional backgrounds (although, see Halanych et al.),74 and in countries outside of the U.S. The literature also suffers from limited information on racism experienced by patients of non-African American backgrounds (although, see Blair et al.).75

Although studies in this review used a number of measurement approaches (surveys, vignettes and computer-based indirect measures), the range of constructs measured was limited. Furthermore, only five studies utilized both direct and indirect approaches. Direct and indirect measures each have limitations that can be minimized by including both approaches in the same study.11 Self-completed surveys are subject to a range of biases, particularly social desirability.76 They are also unable to provide direct evidence of impact as the extent to which racist attitudes or beliefs translate into poorer healthcare varies. Vignettes can also be subject to social desirability bias if participants are aware of the study aims. Physicians may respond differently to vignettes than to actual clinical encounters. In addition, written vignettes may be less accurate than audio-visual recordings (although subtle differences between actors’ appearances and non-verbal cues may also affect audio-visual approaches). The majority of vignettes included factors such as age, gender and race/ethnicity. However, other factors such as socioeconomic status, employment status, and family situation can influence study findings.

Studies predominantly assessed general knowledge, attitudes, beliefs, emotions and behaviors towards racial groups, without detailing specific constructs or distinguishing between in-group favoritism and out-group derogation. In-group favoritism is defined as positive orientations towards one’s own racial/ethnic group, while out-group derogation constitutes negative orientations towards other racial/ethnic groups. Empirical evidence demonstrates that associations between in-group favoritism and out-group derogation can be negative, zero, or positive.77 As such, studies that do not differentiate between these constructs may be misleading, in that efforts to address prejudice against specific minority groups will differ from those aimed at reducing favored treatment for one’s own ethnic/racial group.78 Although central to social identity theory (a key psychological theory of racism)79 and despite calls to study in-group favoritism among healthcare providers,74 only one study included in this review assessed both in-group favoritism and out-group derogation.23

Unlike the Implicit Association Test, priming tasks are able to distinguish between in-group favoritism and out-group derogation.80 Moreover, priming tasks may more accurately capture associations in memory because they are designed to operate subliminally beyond conscious intention.67 This is especially the case when masks (i.e. symbols unrelated to the study topic) are used before and after the prime to reduce the visibility of the prime. It is notable that the two studies in this review using affective priming tasks only masked after (rather than also before) the prime,53 , 71 possibly compromising prime ‘invisibility’.

Despite a long history in other settings such as employment and housing,81 and calls for adoption in healthcare settings,9 none of the identified studies used a paired-audit design. Such studies could involve, for example, ‘patients’ who differ in race/ethnicity (indicated by accent) but are matched on other relevant characteristics phoning an emergency medicine department/hotline and enacting a set script. Any differences in provider behavior would then be attributable to ‘patient’ race/ethnicity.
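A paired-audit design yields matched pairs of outcomes, one for each member of a matched pair of callers, so the analysis rests on the discordant pairs in which the two callers were treated differently. The sketch below applies an exact McNemar test to such data. It illustrates one possible analysis, not a procedure drawn from the audit literature cited here, and the outcome data are invented.

```python
from math import comb


def mcnemar_exact(pairs):
    """Exact two-sided McNemar test on matched binary outcomes.

    pairs: iterable of (outcome_caller_a, outcome_caller_b) booleans, e.g.
           whether each caller in a matched pair was offered an appointment.
    Returns (pairs favoring A only, pairs favoring B only, p-value).
    """
    pairs = list(pairs)
    a_only = sum(1 for a, b in pairs if a and not b)
    b_only = sum(1 for a, b in pairs if b and not a)
    n = a_only + b_only  # only discordant pairs carry information
    if n == 0:
        return a_only, b_only, 1.0
    # Binomial(n, 0.5) probability of a split at least this uneven
    tail = sum(comb(n, k) for k in range(min(a_only, b_only) + 1)) / 2 ** n
    return a_only, b_only, min(1.0, 2 * tail)


# Hypothetical audit: 8 pairs where only caller A was offered an appointment,
# 12 pairs where both callers were treated the same
audit = [(True, False)] * 8 + [(True, True)] * 10 + [(False, False)] * 2
print(mcnemar_exact(audit))
```

Concordant pairs, in which both callers receive the same response, do not affect the test. This reflects the matching logic: only within-pair differences speak to differential treatment.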

Asking healthcare providers to assess their own level of racism through items such as ‘When working with minority individuals, I am confident that my conceptualization of client problems do not consist of stereotypes and biases’26 is likely to trigger strong social desirability bias that threatens response validity. It may be possible to minimize social desirability bias using computer-based speeded self-report tasks to assess ‘gut reaction’ to a particular topic (e.g. where participants are required to indicate negative or positive responses to questions within 700 milliseconds). After responding to questions on unrelated topics or completing other tasks (e.g., scenario responses), questions focused on the same topic (with no response deadline) can be asked, comparing these considered answers with ‘gut reactions’.22
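The comparison of speeded and considered answers described above might be organized as in the following sketch. The `Trial` structure, the field names and the example responses are hypothetical; only the 700 millisecond deadline comes from the text.

```python
from typing import List, NamedTuple, Tuple


class Trial(NamedTuple):
    item: str                 # question identifier
    gut_response: str         # answer given under the response deadline
    gut_rt_ms: int            # response time for the speeded answer
    considered_response: str  # later answer to the same item, no deadline


def compare_gut_and_considered(
    trials: List[Trial], deadline_ms: int = 700
) -> Tuple[List[str], float]:
    """Return items whose answer changed and the share of valid trials changed.

    Speeded answers slower than the deadline are discarded, since they may
    no longer reflect an immediate reaction.
    """
    valid = [t for t in trials if t.gut_rt_ms <= deadline_ms]
    if not valid:
        raise ValueError("no speeded responses met the deadline")
    changed = [t.item for t in valid if t.gut_response != t.considered_response]
    return changed, len(changed) / len(valid)


# Hypothetical responses from one participant
session = [
    Trial("q1", "negative", 540, "positive"),
    Trial("q2", "positive", 610, "positive"),
    Trial("q3", "negative", 910, "positive"),  # too slow, excluded
]
print(compare_gut_and_considered(session))
```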

Recent scholarship has identified warmth/good-naturedness towards, and perceived competence/capability of, racial/ethnic groups as key dimensions driving emotions that, in turn, drive racism.82 However, only two studies included in this review assessed warmth towards minority groups,35 , 36 with none assessing perceived competence. Future studies should utilize validated scales to assess good-naturedness/warmth, competence/capability, as well as the consequent emotions of admiration/ pride, envy/jealousy, pity/sympathy and contempt/disgust.83

Although no measures identified in this review assessed this, stereotyping is a cognitive process that can’t be effectively suppressed or denied, but rather needs to be recognized and accepted to avoid discriminatory behavior.84 , 85 Example items used to measure this understanding include: ‘It’s OK to have prejudicial thoughts or racial stereotypes’ and ‘When I evaluate someone negatively, I am able to recognize that this is just a reaction, not an objective fact’.86

Explicit prejudice reduction requires cognitive change through egalitarianism-related, non-prejudicial goals and increased awareness of contemporary racism,87 whereas implicit prejudice reduction requires decreased fear of, and positive contact with, members of a specific group.87 However, only two reviewed studies assessed awareness of contemporary racism, fear/anxiety and intergroup contact,23 , 32 with no studies examining egalitarianism or motivation to respond without prejudice. Meritocracy, just-world beliefs88 and White racial identity, privilege and guilt89 are also important constructs that were assessed in only two of the included studies.27 , 28

Other important constructs that remain unexamined to date include ideologies such as color-blindness (i.e., treating everyone the same regardless of their race/ethnicity), multiculturalism (i.e., recognition and celebration of racial/ethnic difference) and anti-racism (i.e., targeted efforts to address racial disparities through, for example, affirmative action),90 genetic determinism (i.e., genes determine life chances)91 and essentialism (i.e., differences between racial/ethnic groups are natural and inherent),92 perceived status differences (i.e., prestige/success of racial/ethnic groups),93 medical authoritarianism (i.e., belief in hierarchical relationships between providers and patients),94 social dominance orientation (i.e., belief that some racial/ethnic groups are or should be superior to others)95 and materialism (i.e., the importance of acquiring and owning possessions),96 as well as realistic threat (e.g., migrants ‘stealing’ jobs) and symbolic threat (e.g., migrants jeopardizing national values).97

Given the extensive research conducted on patient–provider communication,13 , 98 , 99 the relationship between racism and communication requires investigation (e.g., Hagiwara).100 Such research should examine evaluative concerns (e.g., when anxiety about appearing prejudiced is interpreted as prejudice itself) and stereotype threat (e.g., when thinking about common stereotypes, such as being a non-compliant patient, inadvertently causes behavior that aligns with these stereotypes).101–103 This could include emerging research on the counter-intuitive effects of complimentary stereotypes and positive feedback.104 , 105 Furthermore, a virtual immersive environment (i.e. an audio-visual virtual reality simulation in which providers can interact with computer-generated characters and manipulate objects) could increase realism of vignettes.106

It is also notable that none of the included studies examined a combination of racist beliefs, emotions and behaviors/practices. Although two experimental studies suggest causal relationships between stereotypes, emotions and behaviors,82 two meta-analyses and a study utilizing multiple national probability samples reveal only moderate correlations (0.32–0.49) between racist beliefs, emotions and behavior.107–109 Such findings indicate the need to explore how, and to what extent, racist attitudes and beliefs drive healthcare provider behavior and decision-making.9
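One way to read the reported correlation range is through the squared correlation, which for a simple linear relationship gives the share of variance two measures have in common. The short calculation below applies this standard identity to the endpoints 0.32 and 0.49. The interpretation is offered as an illustration and is not a result reported by the cited studies.

```python
def shared_variance(r: float) -> float:
    """Proportion of variance shared by two linearly related measures (r squared)."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation must lie between -1 and 1")
    return r * r


# Endpoints of the correlation range quoted in the text
low, high = shared_variance(0.32), shared_variance(0.49)
print(f"{low:.1%} to {high:.1%} of variance shared")
```

On this reading, roughly a tenth to a quarter of the variation in one construct is linearly associated with another. This is consistent with the point that beliefs and emotions alone do not determine behavior.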

Despite a burgeoning interest in racism as a contributor to racial disparities in healthcare, we still know relatively little about the extent of healthcare provider racism or how best to measure it. This review provides evidence that healthcare provider racism exists, and demonstrates a need for more sophisticated approaches to assessing and monitoring it.