1 Triple Goal

The lack of labour market integration of vulnerable groups, such as refugees and other individuals with a migration background, the elderly, and people with a mental or physical health impairment, has received much attention in both policy and academic circles in the past decade (OECD 2008a, 2010). For policymakers, it is important to understand what factors cause this lack of integration in order to design the appropriate integration policies. Academic scholars have suggested discrimination in hiring as one important factor contributing to the poor labour market integration of these individuals (Altonji and Blank 1999; OECD 2008b). However, it is very challenging to measure discrimination in hiring, which makes it difficult to distinguish the effect of discrimination on employment from the effect of other factors, such as differences in human capital and other skills.

Historically, scholars have measured hiring discrimination through statistical analysis of non-experimental (survey or administrative) data. A commonly used approach has been to control for as many observed individual factors as possible, such as education, experience, and occupation, and then to interpret any unexplained difference in employment between groups as pointing in the direction of discrimination (Blinder 1973; Oaxaca 1973). In general, these studies are likely to suffer from an important endogeneity bias, because job applicants who appear similar to researchers (except for their discrimination ground), based on non-experimental data, might in fact appear different to employers. For example, administrative data seldom contain information about the language skills of individuals with a migration background, but these skills are likely to be observed by the employer, perhaps at a job interview. As long as the researcher does not control for all relevant variables that employers take into account in their hiring decisions, no conclusive proof of discrimination can be provided.
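The decomposition logic described above can be written out as follows. This is a sketch of the standard twofold form only; the notation (groups $A$ and $B$, outcome $Y$, observed characteristics $X$) is ours and does not come from the chapter, and a linear model for the employment outcome is assumed.

```latex
\begin{align}
Y_{ig} &= X_{ig}'\beta_g + \varepsilon_{ig}, \qquad g \in \{A, B\} \\
\bar{Y}_A - \bar{Y}_B &=
  \underbrace{(\bar{X}_A - \bar{X}_B)'\hat{\beta}_A}_{\text{explained by observed characteristics}}
  + \underbrace{\bar{X}_B'(\hat{\beta}_A - \hat{\beta}_B)}_{\text{unexplained}}
\end{align}
```

Any characteristic that employers observe but that is missing from $X$ (language skills, for instance) is absorbed into the second term, which is why the unexplained part cannot be read as conclusive evidence of discrimination.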

In response to this methodological problem, and inspired by the seminal work of Bertrand and Mullainathan (2004), scholars in labour economics, sociology of labour, and personnel psychology during the past decade have turned to so-called correspondence experiments to measure hiring discrimination (Gaddis 2018). In these experiments, fictitious job applications, differing only in a randomly assigned discrimination ground, are sent in response to real job openings. By monitoring the subsequent call-back from employers, unequal treatment based on this single characteristic is identified and can be given a causal interpretation.
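As an illustration of how the call-back comparison yields an estimate, the sketch below computes the gap in call-back rates between two randomly assigned applicant profiles and a two-sided two-proportion z-test. The function name and the counts are hypothetical and are not taken from any study in the register. The sketch also assumes independent samples, whereas many correspondence experiments send matched pairs of applications to the same vacancy, in which case a paired test would be more appropriate.

```python
import math


def callback_gap(callbacks_control, n_control, callbacks_treated, n_treated):
    """Return (gap, z, p_value) for treated minus control call-back rates.

    Uses a pooled two-proportion z-test with a two-sided normal p-value.
    """
    rate_control = callbacks_control / n_control
    rate_treated = callbacks_treated / n_treated
    gap = rate_treated - rate_control

    # Pooled call-back rate under the null hypothesis of equal treatment.
    pooled = (callbacks_control + callbacks_treated) / (n_control + n_treated)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_control + 1 / n_treated))

    z = gap / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return gap, z, p_value


# Hypothetical counts: 1,000 applications per profile.
gap, z, p_value = callback_gap(
    callbacks_control=120, n_control=1000,
    callbacks_treated=80, n_treated=1000,
)
print(f"gap = {gap:.3f}, z = {z:.2f}, p = {p_value:.4f}")
```

A negative gap corresponds to what the register codes as a negative treatment effect, i.e. fewer call-backs for the group hypothesised to be discriminated against.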

Not surprisingly, given the seminal status of the correspondence experimentation framework [Footnote 1] and the numerous academic studies that have adopted it, scholars have in recent years written reviews and meta-analyses of this literature. We are aware of four such meta-studies: Bertrand and Duflo (2016), Neumark (in press), Rich (2014), and Zschirnt and Ruedin (2016). While all are inspiring, high-quality syntheses, with excellent policy links and clever directions for further research, they share two limitations. First, these studies offer an in-depth review of the field experimental evidence on labour market discrimination based on some grounds, while neglecting other grounds on which unequal treatment is also forbidden. Second, none of these studies attempts to provide the reader with an exhaustive list of all experiments (conducted during a particular time frame). They all seem to focus on the better-known experiments (i.e. those from their own country or highly cited) while neglecting complementary work.

This chapter has a different ambition. It starts with identifying all discrimination grounds based on which unequal treatment is prohibited in at least one state of the United States and then provides the reader with a register of all correspondence experiments conducted (later than Bertrand and Mullainathan 2004) to measure these forms of discrimination. Given that the information provided for each study (i.e. particular treatment, country, and sign of the effect) is kept very limited—no effect size information is provided—this chapter has to be seen as a working instrument rather than as a classical review.

The register we will present serves three goals. First, it serves as a reference table to which later chapters of this book will refer. Second, and more broadly, it can be used by scholars in search of a catalogue of all correspondence experiments on hiring discrimination based on a (cluster of) particular ground(s). Third, it implicitly indicates potentially fruitful directions for future correspondence experiments, as it unambiguously shows where the lacunae in this literature are, i.e. the discrimination grounds and regions to which researchers have paid little attention.

2 Scope

The register discussed in the next section is the result of a systematic search for correspondence experiments conducted after Bertrand and Mullainathan (2004) with the aim of measuring forms of unequal treatment in hiring that are prohibited by law in at least one state of the United States, i.e. the country in which the most correspondence experiments have been conducted. Consequently, correspondence experiments conducted to assess the causal effect of other CV characteristics, such as juvenile delinquency, student employment, and (former) unemployment spells, were not included (Baert and Verhofstadt 2015; Baert et al. 2016d; Kroft et al. 2013; Eriksson and Rooth 2014).

Under US federal law, unequal treatment is forbidden based on nine (clusters of) discrimination grounds: (A) race and national origin, (B) gender and pregnancy, (C) religion, (D) disability, (E) (older) age, (F) military service or affiliation, (G) wealth, (H) genetic information, and (I) citizenship status [Footnote 2]. With respect to (B), discrimination based on motherhood is also prohibited in Alaska [Footnote 3] and California [Footnote 4]. Finally, discrimination based on (J) marital status [Footnote 5], (K) sexual orientation and gender identity [Footnote 6], (L) political affiliation [Footnote 7], (M) union affiliation [Footnote 8], and (N) physical appearance [Footnote 9] is forbidden in at least one state.

With this list of discrimination grounds at hand, a key word search (for the word groups ‘correspondence test’, ‘correspondence experiment’, ‘correspondence study’, ‘fictitious resume’, ‘fictitious cv’, ‘fictitious application’, and ‘field experiment’ in combination with ‘discrimination’) was conducted on three sources: Web of Science, Google Scholar, and the IZA Discussion Paper Series. This exercise was followed by the screening of all references in the relevant articles found and the screening of the studies citing these relevant articles.

3 The Register

Table 3.1 provides the reader with an overview of all studies (after Bertrand and Mullainathan 2004) of which we are aware that build on correspondence experiments aimed at measuring discrimination based on one of the grounds mentioned in the previous section. The unit of observation is the individual correspondence experiment. For each such experiment, there is a cell in column (3) of Table 3.1. Some cells contain more than one study, meaning that the studies exploited the same experimental data. Some studies focussed on more than one discrimination ground and are therefore mentioned in more than one cell: Agerström et al. (2012), Albert et al. (2011), Arceo-Gomez and Campos-Vazquez (2014), Banerjee et al. (2009), Berson (2012), Capéau et al. (2012), Patacchini et al. (2015), Pierné (2013), and Stone and Wright (2013).

Table 3.1 Register of correspondence experiments conducted between 2005 and 2016 with the aim of measuring discrimination based on prohibited grounds in US law

In total, we are aware of 90 correspondence experiments conducted between 2005 and 2016 with the aim of measuring discrimination based on grounds prohibited in at least one state of the United States. For 37 of these experiments, the focus (at least partly) was on measuring ethnic discrimination. Other commonly investigated discrimination grounds were gender (14 experiments), sexual orientation (12 experiments), and age (11 experiments). In addition, at least five experiments each focussed on religion, disability, and physical appearance as determinants of employers’ hiring decisions. Only three experiments had a wealth-related focus and only two were related to military experience. Only one experiment each has been conducted on hiring discrimination based on political affiliation and on union membership. We are not aware of any experiments measuring unequal treatment based on genetic information, nor have any experiments investigated citizenship status as a discrimination ground, which is somewhat surprising given the massive migration flows to Europe in recent years.

3.1 Treatment and Treatment Effects

As can be seen in column (1) of Table 3.1, for many of the discrimination grounds studied, a variety of treatment strategies have been used. For instance, ethnic origin is mostly revealed by means of the names of the candidates. The various minority groups studied are always groups that are substantially represented in the country where the data gathering took place. Alternative designs have disclosed ethnic origin by adding a resume picture or revealing the candidate’s nationality.

Column (4) shows the average treatment effect for each experiment (averaged across all vacancies and neglecting analyses by subsamples as presented in many studies). Overall, an overwhelming majority of the studies report negative treatment effects (i.e. discrimination against the group hypothesised to be discriminated against). More concretely, 80 (i.e. 78.4%) treatment effects are significantly negative, 17 (i.e. 16.7%) are insignificantly different from 0, and 5 (i.e. 4.9%) are significantly positive [Footnote 10].
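The shares can be recomputed from the three counts reported in the text. The short check below is only an arithmetic aid; the variable names are ours. Note that the denominator is the 102 treatment effects, which exceeds the 90 experiments, presumably because some experiments test more than one treatment.

```python
# Counts of treatment effects by sign and significance, as reported in the text.
counts = {
    "significantly negative": 80,
    "not significantly different from zero": 17,
    "significantly positive": 5,
}

total = sum(counts.values())

# Percentage share of each category, rounded to one decimal place.
shares = {label: round(100 * n / total, 1) for label, n in counts.items()}

for label, share in shares.items():
    print(f"{label}: {counts[label]} of {total} ({share}%)")
```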

Most of the cases document discrimination against ethnic minorities. There are two important exceptions with respect to this empirical pattern. First, in two recent studies with experiments conducted in the United States, no ethnic discrimination in hiring was found (Darolia et al. 2016; Decker et al. 2015). Second, in Malaysia the (expected) unfavourable treatment of the ethnic majority was found (Lee and Khalid 2016) [Footnote 11]. In addition, research in Belgium (Baert and Vujić 2016; Baert et al. 2015, 2017) revealed situations in which ethnic discrimination disappeared there, i.e. when ethnic minorities mentioned volunteer work for mainstream organisations, when they applied for occupations in which labour market tightness was high, and when they had many years of work experience. For an in-depth review of a selection of the studies in Panel A of Table 3.1, we refer to Bertrand and Duflo (2016), Neumark (in press), Rich (2014), and Zschirnt and Ruedin (2016).

With respect to gender discrimination, i.e. the experiments comparing call-back for male and female candidates, the evidence is very mixed. This is related to the particular occupations tested. Indeed, many authors mentioned that gender discrimination was heterogeneous by occupational characteristics (Baert et al. 2015; Petit 2007; Carlsson 2011). On the other hand, a significant penalty for being pregnant or being a mother was found in a study from Belgium and one from the United States, respectively (Capéau et al. 2012; Correll et al. 2007). Disclosing one’s transgender identity was found to be detrimental to labour market success in the United States (Make the Road New York 2010).

With respect to discrimination based on religion, a majority of the studies focussed on the signal of being a Muslim (directly mentioned or indicated by means of a resume picture in which headscarves were worn), compared with being a Christian (in countries where Christianity was the majority religion). Affiliation with Islam always yielded lower call-back rates (Adida et al. 2010; Banerjee et al. 2009; Pierné 2013; Weichselbaumer 2016). Somewhat surprisingly, no correspondence experiments have been conducted yet with respect to other leading religions (e.g., Hinduism, Buddhism, and Judaism) as well as to various folk religions.

Remarkably, all experiments on discrimination against the disabled have focussed on different dimensions of disability. Thus, we are in favour of replication studies for this dimension of discrimination. Nevertheless, each form of disability revealed in the hiring process seems to result in adverse hiring outcomes. The same is true with respect to age discrimination: across all studies listed in Table 3.1, older age is always punished.

A minority sexual orientation, revealed by means of mentioning membership in a rainbow organisation or the name of one’s (same-sex) marital partner in the resume, has a non-positive effect on employment opportunities. Including an attractive facial picture (compared to a less attractive one) with one’s resume has a beneficial effect. Finally, Table 3.1 lists little evidence for non-negative effects of military service and higher wealth (Baert and Balcaen 2013; Kleykamp 2009), a negative effect of trade union membership (Baert and Omey 2015), and zero effects for marital status (Arceo-Gomez and Campos-Vazquez 2014) and political affiliation (Baert et al. 2014).

3.2 Country of Analysis

Column (2) of Table 3.1 shows that the summarised literature on labour market discrimination is unbalanced with respect to the country of analysis. Grouped at the continental level, 59 of the 90 correspondence experiments were conducted in Europe, compared to 20 in North America, only 7 in the largest continent of Asia, 2 in South America, 2 in Australia, and none in Africa.

At the country level, most experiments (19) were conducted in the United States. The European countries of Belgium (13 experiments), France (8 experiments), Greece (6 experiments), Sweden (9 experiments), and the UK (8 experiments) are clearly overrepresented. On the other hand, these European countries are, together with the United States, the only ones in which within-country comparisons can be made of the discrimination measured for different grounds. In 6 of the 10 largest countries by population (Indonesia, Brazil, Pakistan, Nigeria, Bangladesh, and Russia), no correspondence experiments have been conducted yet.

4 Conclusion

This chapter provided the reader with a catalogue of all correspondence experiments on hiring discrimination conducted after Bertrand and Mullainathan (2004) that could be found through a systematic search. It shows that these experiments have focussed on a few specific grounds for discrimination (race, gender, religion, disability, age, sexual orientation, and physical appearance). An overwhelming majority of these studies reported unfavourable treatment of the group hypothesised to be discriminated against. On the other hand, other topical forms of potential hiring discrimination (e.g., based on genetic information, citizenship status, or political orientation) have hardly been assessed. Moreover, in 6 of the 10 largest countries by population, no correspondence experiments have been conducted yet.

The register presented in Table 3.1—enriched with hyperlinks to the electronic versions of the included studies—is kept updated at the author’s homepage [http://users.UGent.be/~sbaert].