1 Introduction

For international migrants, learning a language of the destination is an important investment in human capital. Empirical evidence shows there are three Es of language learning — exposure, efficiency, and economic incentives — that determine proficiency (Chiswick and Miller 2015; Grenier and Zhang 2021). Closer linguistic proximity between the mother tongue and the learned language allows people to learn a language more quickly. Thus, linguistic proximity allows immigrants to transfer human capital from their home country (e.g. language fluency, culture, institutional knowledge) to the host country’s labour market. Previous work suggests that linguistic proximity decreases the cost of communication and migration (Isphording and Otten 2014; Belot and Ederveen 2012). Furthermore, linguistic proximity may have far-reaching impacts on the next generation and globally. There is a growing literature on the intergeneration transmission of linguistic traits (Galor et al. 2020) and international trade (Egger and Lassmann 2015; Melitz and Toubal 2014). However, evidence about the effect of linguistic proximity on the labour market integration of asylum seekers and refugees (hereafter, the asylum population) is limited.

This paper investigates the role of linguistic proximity on the employment of the asylum population. The research question is of policy relevance because the economic integration of the asylum population has been a challenge across the European Economic Area. The analysis focuses on permanent residents, who are more likely to settle in one place and invest in learning the local language. To obtain plausibly exogenous variation in labour market outcomes, I leverage a natural experiment in which asylum seekers are randomly assigned across 26 cantons (administrative regions) in Switzerland.Footnote 1 The regulation mandates the State Secretariat of Migration (SEM) to allocate the residential location of asylum seekers following a proportional rule (State Secretariat for Migration 2018b). Asylum seekers cannot choose which canton they are sent to upon arrival and generally cannot leave the canton to which they are assigned (State Secretariat for Migration 2018a, b). It takes on average 2 years for an asylum seeker to receive refugee status, after which they can apply for a change of canton (Hainmueller et al. 2016). As a result, the initial assignment and limited movements across cantons create a unique experimental setting.

Switzerland is a location of interest for two specific reasons. First, it ranks amongst the top five European countries in both the number of refugees accepted and the refugee employment rate (European Commission and OECD 2016; Eurostat 2017). Second, it has four official languages: German, French, Italian, and Romansh (see Fig. 4 in the Appendix for the geographical distribution). Each of the over 2500 municipalities has one dominant spoken language relating to the neighbouring country. The country shares borders with Germany in the North, Austria in the East, Italy in the South, and France in the West. The municipalities can be grouped into 26 cantons or two language regions. Cultural and behavioural differences are particularly pronounced between German and Romance (French and Italian) regions. Previous work documents substantial differences along the language borders, such as work attitudes and job search behaviour (Eugster et al. 2017) and voting turnout (Brunner and Kuhn 2018). In the German region, Bernese German is the everyday spoken language. Bernese German is slightly different from Standard German used in Germany. Overall, Switzerland presents an appealing context to investigate the role of language similarity in economic integration.

Using a linked administrative dataset constructed by the Swiss Federal Statistical Office and the University of Geneva, I observe the current legal residential location of asylum seekers in Switzerland and their labour market outcomes over the period 2010 to 2014. The sample consists of permanent residents, meaning they have resided in Switzerland for at least one year before the survey year. I assume the residential location of asylum seekers who have recently arrived since 2005 to be as good as the assigned location.Footnote 2 I use a linguistic proximity measure from the Automated Similarity Judgement Program (ASJP) to identify phonetic similarities between the languages of the origin country and the destination municipality, namely Bernese German, French, and Italian.Footnote 3 I compare outcomes of the asylum population from different countries with similar observable characteristics.

While previous work has reported the positive effect of language skills or proficiency on integration, this paper is the first to consider the role of linguistic proximity in explaining labour market outcomes of the asylum population. Because linguistic proximity is based on country of origin, my results are not meant to capture outcome differences based on nationality change. Instead, I investigate whether language relatedness between populations can explain outcome variations among asylum seekers. My interpretation is that linguistic proximity represents the effort to learn or transfer skills. It is easier to learn a new language when it sounds or looks familiar. My results suggest that a one standard deviation increase in linguistic proximity is associated with 2.4 percentage points (6.8%) increase in employment. A one standard deviation increase in linguistic proximity also increases the likelihood to obtain wages above the national poverty threshold by 1.7 percentage points (9.7%). The national poverty threshold is 26,400 CHF annually (equivalent to 24,750 €) (Federal Statistical Office 2014).Footnote 4 This is referred to as the living wage hereafter.

In a richer model that includes interactions of nationality level characteristics with arrival cohorts, I find that the overall results are driven by the earlier arrival cohorts. This finding suggests that linguistic proximity plays an important role in mid-to-long-term integration. I examine differences between German and Romance regions. While linguistic proximity has different effects in another multilingual country (see Adserà and Ferrer (2015) on Canada), I do not find evidence that it is the case in Switzerland. I also investigate the potential effect of proximity to an international language, English, but I find no discernible effect on outcomes. In further sensitivity checks, I assess the potential effect of omitted variable bias. I find the employment effect is plausibly causal. Considering the asylum population moves for ‘non-economic reasons’ and is less likely to be positively selected than economic migrants, the employment result should be considered an upper bound of the true causal effect. However, the analysis is descriptive and probably not causal given issues such as selection into language regions and selective migration.

This paper makes two key contributions to the literature. First, I employ a precise index to investigate the effect of linguistic proximity on the economic integration of the asylum population. Previous work on linguistic proximity focuses on migrants as a whole, not just on the asylum population (Adserà and Ferrer 2015; Isphording and Otten 2014; Bredtmann et al. 2020). Second, I am among the first to use a linguistic proximity index to examine integration in a multilingual country. The one exception is Adserà and Ferrer (2015), who use the same proximity indices to explore immigrant assimilation in Canada. Contrary to their findings, I find that linguistic proximity does not have a differential effect on labour market outcomes in another multilingual country, Switzerland. Current research has focused on migrant assimilation, selection, and sorting of destination countries (e.g. Dustmann and Van Soest 2002; Belot and Hatton 2012; Adserà and Pytliková 2015). Migrants are positively selected (in terms of ability), but the asylum population flees out of choice and is subject to locational restriction in some countries. Answering the current research question allows us to understand that the initial disadvantage — stemming from a far linguistic origin — imposes a significantly higher cost of language acquisition.

The next section introduces prior empirical work on foreigners’ assimilation and language. Section 3 describes the assignment policy, institutional context, and residential location of asylum seekers. Section 4 describes the dataset and discusses the data limitations. Section 5 lays out the empirical model. Section 6 presents the results. Section 7 tests the validity of my identification strategy. Section 8 reports the robustness checks. Finally, Sect. 9 concludes.

2 Prior empirical work on assimilation and languages

I contribute to the broader literature on the importance of language skills in the economic integration and assimilation of foreigners. Previous work classifies language skills as a form of human capital (Chiswick and Miller 1998, 2012; Chiswick 1978; Isphording and Otten 2014; Bleakley and Chin 2004; Dustmann and Van Soest 2002). The two main threats to identification in this research area are migrant self-selection and measurement error in the language measure. To deal with the two sources of bias, the literature follows three strategies.

The first strand of research considers the endogeneity of language skills in the earnings of migrants. The causal estimate of language skills is obtained from a range of methods, such as instrumental variables, Heckman's (1979) selectivity correction, and decomposition methods. Previous work finds language proficiency increases earnings by 15% in Germany (Dustmann and Van Soest 2002), by 13% in the USA among childhood immigrants (Bleakley and Chin 2004), and by 19 to 24% in Australia (Chiswick et al. 2005). Nadeau and Seckin (2010) estimate that knowledge of French reduces the immigrant-native wage gap by 5% and that bilingual immigrants earn 12% more than unilingual immigrants in Quebec. These papers conclude language skills have positive effects on the economic integration of immigrants.

The second strand of research addresses the issue of migrant selection by exploiting experimental settings in which the labour market outcomes are exogenous from a language-based integration program or policy (Åslund and Johansson 2011; Joona and Nekby 2012; Auer 2018; Slotwinski et al. 2019; Hangartner and Schmid 2021; Lochmann et al. 2019). In Switzerland, researchers exploit the random assignment policy of asylum seekers and find that language skills increase re-employment probability (after job loss) within 2 years by 14% (Auer 2018) and increase employment probability of African refugees by over 150% in the first 5 years of arrival (Hangartner and Schmid 2021). Slotwinski et al. (2019) report that asylum seekers who reside in Swiss cantons with more inclusive labour market policies are 11 percentage points (hereafter, pps) more likely to be employed and that language proximity complements such policies by an additional 0.3 pps increase in the employment rate. Their unit of observation is canton, while I observe individual characteristics. The authors use the common language index to measure the total impact of language (Melitz and Toubal 2014). However, the common language index has a limited geographic coverage and it cannot measure the direct effect of similarities to Bernese German. Overall, this strand of research highlights that language classes have a positive, albeit heterogeneous, impact across the foreigner population.

This paper closely relates to the third strand of research which addresses measurement error in the language measure and provides alternative indices for language skills and linguistic proximity. Earlier work uses self-reported language fluency (Dustmann and Van Soest 2002; Beenstock et al. 2001; Bleakley and Chin 2004) and natives’ ability in learning foreign languages (Chiswick and Miller 1998) to proxy for skills. The main issue with language skills is that it is often over-reported, which underestimates the true language effect (Dustmann and Van Soest 2002). Recent work on language skills uses assessment scores (Auer 2018; Lochmann et al. 2019), while recent work on proximity applies objective time-invariant measures from linguistic research. Economists have used measures such as the linguistic tree, the Levenshtein distance (LDND), and the World Atlas of Language Structures.Footnote 5 In one other study about a multilingual country, Adserà and Ferrer (2015) use the LDND to examine the wage and occupational assimilation of immigrant men in Canada. They find immigrants to perform better in English rather than French-speaking areas. Although there are many linguistic proximity indices, this paper uses the LDND because it measures proximity to Bernese German directly. It includes all languages in the world and can be computed transparently. Nonetheless, I show my results are robust to the choice of measure. See Fig. 1 for the conceptual framework for the analysis.

Fig. 1
figure 1

Conceptual relationship between linguistic proximity and outcome measures. LDND is Levenshtein distance, a database of the Automated Similarity Judgement Program (ASJP) that maps the lexical distance between all language pairs in the world. Source: Author’s drawing based on Chambers and Trudgill (1998), Borin (2013), Chiswick and Miller (2015), Grenier and Zhang (2021), and Education Testing Service (2019)

Although there is merit to study the direct effect of language skills, language skills are subject to measurement error and are endogenous to labour market outcomes. This paper adds value to the literature by focusing on the outcomes of the asylum population in a multilingual country. Using the exogenous residential location of the asylum population, I estimate the reduced-form relationship between linguistic proximity and economic integration. Similar to the existing literature about migrants, I find a strong initial advantage from linguistic similarity for asylum employment. Different from previous work, I do not find linguistic proximity to have differential effects in a multilingual country.

3 The residential location of asylum seekers

In Switzerland, asylum seekers are randomly assigned to one of the 26 cantons following a proportional rule set in 1988 (Vogt 2018; Couttenier et al. 2019; Hangartner and Schmid 2021) (see proportions in Table A 1). The goal of the random assignment policy is to prevent the formation of ethnic enclaves, increase diversity, and encourage equal sharing of resource burden across cantons (Auer 2018; Vogt 2018). Other than unaccompanied minors, family reunification, and medical cases that require attention in university hospitals, asylum seekers must follow the random assignment procedure (State Secretariat for Migration 2018b). Each day, the assignment officer manually assigns cases to cantons based on information available in the Central Migration Information (ZEMIS) system (see an asylum application form in Figure A1). The officer and the asylum seeker never communicate with each other. By the end of each month, the officer verifies that the top 10 nationalities are distributed to cantons proportional to their population size. Although I do not directly observe the randomly assigned canton, I assume the canton of residence to be as good as the initial canton of assignment based on the following descriptive evidence. I investigate the validity of the randomization procedure in Sect. 7.

First, asylum seekers (permit N) do not have a locational choice. The canton authorities determine the housing location of the asylum seeker across municipalities. The individual is required to report to canton authorities within 24 h (State Secretariat for Migration 2018c). Change of canton is rare because it requires valid reasons and an agreement between two cantons (State Secretariat for Migration 2018b). Moreover, the official record from SEM substantiates that they did not relocate asylum seekers between 2006 and 2014 (Table A 2).

Second, the more recent the arrivals, the less likely the asylum population could relocate. Hainmueller et al. (2016) report that the waiting period for asylum decisions (from asylum seeker to refugee status) is on average 665 days with a standard deviation of 478 days. After this period of mandatory location restriction, asylum seekers can be granted either permit B or F. Recognized refugees (permit B) can apply for change of cantons to the canton authority if they are not dependent on social benefits. Temporary accepted persons/refugees (permit F) can apply for change of cantons to SEM if they are not dependent on social benefits. While asylum seekers are not obliged to find work, it provides more economic and (potentially) locational freedom. Throughout the asylum application process, asylum seekers are entitled to social assistance and housing. The amount and distribution methods of social benefits are decided on by each canton individually (SODK 2017). In sum, the lengthy process and social assistance discourage movement across cantons.

In terms of work restrictions, asylum seekers (permit N) face either 3 or 6 months of work ban upon arrival (State Secretariat for Migration 2015a). The initial work ban is 3 months, but cantons can extend it to 6 months (Wichmann et al. 2011). After the initial work ban period, the labour market participation of asylum seekers is subject to employer justification and occupation restrictions (State Secretariat for Migration 2015b). This implies the employer must apply for a work permit for the asylum seeker. The employer may need to provide proof that the position cannot be filled by Swiss residents. About half of all cantons authorize work permits only in industries with labour shortage (e.g. agriculture, construction, and other low-skilled jobs) (Slotwinski et al. 2019; Wichmann et al. 2011). As a result, most asylum seekers lack employment and are reliant on social assistance provided by cantons.

Third, I observe low mobility among asylum seekers who arrived during the survey period. Table A 3, panel A (column 2) (Online Appendix) shows 2.1% of the sample changed canton. Among individuals who moved to another canton, column 2 indicates 4.0% of individuals moved in the first year after arrival and the mobility rate is 1–2% in each subsequent year. Table A 3, panel B (column 2) (Online Appendix) reports that only 0.9% of the sample changed the language region from 2010 to 2014. Column 2 reports 1.7% of individuals moved to another language region after their first year of arrival, and the cross-region mobility is lower than 1% in each subsequent year.

Although descriptive statistics suggest that the mobility rate is low among arrivals from 2010 to 2014 and that Swiss policy impedes one’s movement in the early years of arrival, I cannot completely eliminate the possibility of endogenous internal and outward migration. Thus, I report validity tests in Sect. 7, examine the role of unobservable selection, and test the sensitivity of results by excluding asylum seekers who are legally allowed to move.

4 Data

I use data from administrative records from the Swiss Longitudinal Demographic Database created by researchers of the Institute of Demography and Socioeconomics at the University of Geneva, in cooperation with the Swiss Federal Statistical Office (FSO) and the National Center of Competence in Research (NCCR On the Move) (Steiner and Wanner 2015). The data comprises the population of asylum seekers who are permanent residents and are residing in Switzerland between 2010 and 2014. To be a permanent resident, one needs to reside in Switzerland for a minimum of 12 months. The complete list of variables is available in Table 10 in the Appendix and the descriptive statistics are presented in Table 1.

Table 1 Descriptive statistics, 2010–2014

To understand the socioeconomic characteristics of the stock of the asylum population, the Swiss Labour Force Survey forms the backbone of the analytical dataset. This data covers 4% of the permanent residents of Switzerland (Qualité 2010; Federal Statistical Office 2017, 2018). The survey is nationally representative, and it contains a non-repeated sample. The cross-sectional data is pooled from 2010 to 2014 resulting in a 20% sample of the entire Swiss population. The asylum population in the sample has all passed the three or six months of probation period for work depending on their canton of assignment (State Secretariat for Migration 2015a). The survey collects socioeconomic and sociocultural information, such as the main language and education attainment.

The Structural Survey is linked with administrative records and population registers on an annual basis using the social insurance number (AHV) and probabilistic linkage approaches. This forms the time dimension of the unbalanced panel. The labour market outcomes are taken from individual accounts of the Swiss Central Compensation Office and the Unemployment register (PLASTA). The demographic characteristics, such as permit status, municipality of residence, and household size, are linked from the Population and Households Statistics (STATPOP) and Central Migration Information System (ZEMIS). Although I observe the same individual over time, the final dataset is an unbalanced panel because some individuals may leave Switzerland.

4.1 The asylum population sample

Between 2005 and 2014, 177,402 new asylum applications were filed in Switzerland (Table A 4). A total of 28,747 asylum applicants were granted protection in Switzerland, and the average asylum recognition rate was 18.0% in the same period (State Secretariat for Migration 2016). From 2010 to 2014, a sample of 3571 asylum seekers became permanent residents and responded to the Structural Survey. This represents 12.4% of the total asylum cases granted between 2005 and 2014.

I construct the main sample by including asylum seekers who arrived in Switzerland between 2005 and 2014. The baseline sample includes 3571 individuals (person-year observation (N) = 17,855). Table A 5 shows the number of observations dropped as each sample criteria is imposed. I exclude individuals who are not between 18 to 65 years old for the whole period (person-year observations (N) = 772), with no information on the location of residence (N = 1204), reside in Romansh speaking municipalities (N = 19), and with no information on both nationality and country of birth (N = 1196). Finally, the unbalanced panel consists of 3058 individuals (N = 13,780) from 72 countries who sought asylum in Switzerland and arrived between 2005 and 2014.Footnote 6 They reside in 863 municipalities. In the robustness check section, I verify that my results are robust to the exclusion of asylum seekers with no information on country of birth (N = 914).

4.2 Linguistic proximity

Linguistic proximity measures the similarity between the language from the country of origin (nationality) and the main language of the destination municipality (i.e. Bernese German, French, and Italian). Because I observe the country of origin of asylum seekers and not their native language, I follow the approach of Adserà and Pytliková (2015) to map languages spoken by the majority of the population in each country (hereafter, majority language).

Linguistic proximity is 100% minus Levenshtein distance (LDND). It is the inverse of linguistic distance. LDND is calculated from the Automated Similarity Judgement Program (ASJP) with a database for all language pairs in the world (see Online Appendix C for calculation method).Footnote 7 LDND represents the lexical distance between 40 common words, which is the number of changes in phonetic segments (sounds) to change from one language to another. For example, from ‘beer’ to ‘bier’ is one change, and from ‘heart’ to ‘herz’ are two changes. However, some words can be completely different, such as ‘mountain’ and ‘berg’. Although LDND does not account for grammatical or typological distance, it is applied in linguistic and economics research because it can reflect phonetic differences (Isphording and Otten 2013; Adserà and Pytliková 2015; Taraka and Kolachina 2012; Petroni and Serva 2010; Bredtmann et al. 2020).

The raw linguistic proximity index ranges from approximately 0 to 100, where 0 means the language pair is ‘unrelated’ and 100 means ‘highly related’. For example, the raw linguistic proximity between Bernese German and German is 49.9, and between French and Moroccan Arabic, it is 2.6.Footnote 8 I report the raw proximity indices of the top 10 nationalities in Table A 6. To compare across regions, I standardize all linguistic proximity measures by dividing the standard deviation of the entire sample. A one standard deviation corresponds to 5.1 pps (Bernese German) and 6.3 pps (French and Italian) difference in the raw linguistic proximity index. Figure 2 illustrates the distribution of the standardized measures ranges from − 2 to 13. Table 2 reports the standardized linguistic proximity of the top 10 nationalities. For example, Sinhala (from Sri Lanka) has a standardized linguistic proximity of 1.05 to Bernese German, 0.16 to French, and 1.98 to Italian. Syrian Arabic (from Syria) has a standardized linguistic proximity of − 1.02 to Bernese German, 0.19 to French, and 0.71 to Italian. When the two languages are very dissimilar, the values can be negative. Overall, the values of the linguistic proximity measure are quite low.

Fig. 2
figure 2

Linguistic proximity to Bernese German, French, and Italian, 2010–2014. Notes: The figure illustrates the distribution of linguistic proximity, standardized 1 − LDND based on the majority language of each nationality and destination municipality. Source: Various datasets including the ASJP Database (version 18) and the Federal Statistical Office and Swiss Longitudinal Demographic Database (2010–2014)

Table 2 Standardized linguistic proximity for the top 10 nationalities, 2010–2014

The current linguistic proximity measure has several advantages over other linguistic measures. First, it provides a direct measure for any languages or dialects, namely Bernese German. It includes more than 7500 languages in the world. Second, it has the biggest geographical coverage of all measures. It covers more countries than the common language index that is applied in Slotwinski et al. (2019). Third, it can match flexibly with countries or territories (e.g. Kosovo and Palestine), which are common origin countries of the asylum population. Moreover, the current linguistic proximity measure is highly correlated with other indices used in the literature (correlation coefficient > 0.7).

In addition to the Swiss languages, I consider proximities to English to examine the role of an international language. To test the sensitivity of my results, I use the highest proximity score based on multiple languages in the country of origin, proximities to Standard German (that is used in Germany), and another categorical measure constructed by Adserà and Pytliková (2015) using the linguistic tree levels.

To isolate the pure effect of linguistic proximity from proficiency, I control for the reported main language — a variable for language match between one’s reported language and the municipality’s language — in a robustness test. The Labour Force Survey offers nine languages for one to choose which language they think in and know best.Footnote 9 Unfortunately, information on the language proficiency of the asylum seeker is not formally assessed or collected by SEM upon arrival (see the asylum application form in Figure A 1).

4.3 Country of origin characteristics

I expect the transfer of skills to be more efficient with physical and cultural proximities, but to be less efficient with political instability. To account for physical proximities, I include the distance between capital cities. The greater the distance, the higher the cost of migration is. To account for cultural proximities, I include an indicator for countries with common history with France, Germany, Italy, Switzerland, or the UK. Colonial languages are often taught in schools. I control for drivers for seeking asylum through the Freedom House civil liberties index. I account for population dynamics using the stock of foreigners and population ratio. Importantly, the foreigner statistics from 2010 onwards include the asylum population. I use the FST genetic distance (or coancestor coefficient) constructed by Spolaore and Wacziarg (2009) to isolate the effect of genetic divergence between populations from linguistic influences. To account for exposure to the French language, I use the average share of French speakers reported by l’Observatoire de la langue française (OIF) from 2010 and 2014.Footnote 10

To further control for fluency, I use the Education First (EF) English Proficiency Index (EPI) — a standardized test score on adults’ reading and listening skills — when I consider proximities to English. I add an indicator for former French colonies to account for fluency determined by historical links when I explore regional differences (see the OIF and French fluency variables in Table A 7). All the country of origin (nationality) characteristics are open-source data and are detailed in Online Appendix D.

To treat the origin country characteristics as predetermined and to reduce the risk of reverse causality, all the time-varying variables are lagged by one outcome year (i.e. 2009–2013). The time-varying country of origin characteristics includes the civil liberties index, stock of foreigners, and population ratio. The time-invariant characteristics are the distance between capital cities, FST genetic distance, colonial relationship, the share of French speakers, EF EPI, and French colonial history.

4.4 Labour market outcomes

I consider two outcomes: employment and living wage. Employment is defined as non-zero wages and being absent from the unemployment register in the calendar year. Annual wages are the before-tax income reported by employers for tax accounts and social security, namely contributory pension. Wages are directly reported by employers to the Swiss Central Compensation Office, but they include unemployment benefits. The unconditional mean annual wage is 27,527 CHF (25,480 €). Using this definition of employment, I exclude individuals who are both employed and unemployed during the year. From 2010 to 2014, the average employment rate is 35.3%.

Living wage is an indicator variable for those who earn above the poverty threshold for a single-person household in 2012, which is 2200 CHF per month or 26,400 CHF annually (24,750 €) (Federal Statistical Office 2014). I use the relative poverty threshold because the level of wages is determined through collective bargaining between the trade unions and employers (Muller and Peterson 2007). There is no minimum wage in Switzerland. To my knowledge, no prior work has used a higher employment threshold to study the extent of asylum integration. In Figure A 2 I illustrate that there are sizable (unconditional) outcome variations across and within nationalities. Individuals from Sri Lanka and Serbia are among the best performers in the Swiss labour market.

4.5 Canton characteristics

The 26 Swiss cantons exhibit great variation in economic conditions, geographies, and labour market restrictions to asylum seekers. While the dominant language is defined by the municipality, each canton has its own constitution and laws. Thus, I include four canton variables. I control for the welfare state of the canton using the lagged social assistance rate for all residents. The more generous the welfare state, the more likely the social insurance dependency. Asylum seekers are subject to work ban and occupation restrictions upon arrival (Wichmann et al. 2011; Slotwinski et al. 2019; State Secretariat for Migration 2015b). To consider canton variation in labour market restrictions, I control for the lagged employment rate of asylum seekers (permit N holders). These two time-varying variables (social assistance rate and permit N employment) are lagged by one period to be treated as predetermined conditions. Furthermore, I control for the share of asylum seekers to be assigned to cantons. The rule is proportional to population density and it captures the local’s exposure to the asylum population. These three canton characteristics are publicly available and are detailed in Online Appendix E. Finally, I control for time-varying canton-specific network effects using the share of residents from the same country of birth in each outcome year. I construct this measure using data of all residents and entry records since the 1920s. The stronger the ethnic ties in the surrounding area, the higher the probability that random encounters will result in information sharing and employment opportunities.

4.6 Data limitations

My dataset has four main limitations. First, my findings are based on permanent residents who responded to the Structural Survey. I have no information on the labour market outcomes of asylum seekers who left Switzerland within 12 months upon arrival, who reside in small municipalities, or who are undocumented. Even though the Structural Survey is nationally representative, municipalities with less than 15,000 permanent residents are not sampled (Federal Statistical Office 2017). Morlok et al. (2015) estimate that there are 15,200 undocumented asylum seekers with rejected applications from 2005 to 2014. The estimate is quite low, implying an average of 1520 undocumented asylum seekers per year. Second, I do not have information on the assigned location and reception centres to address potential selection issues. Although randomization is carried out centrally in Bern by the immigration authority (SEM), Hangartner and Schmid (2021) suggest that the location and language of reception centres may influence the region of assignment. Third, inactive and unemployed individuals are indistinguishable in the dataset. In Switzerland, one needs to work for at least two years to be entitled to unemployment benefits. Therefore, individuals without work experience in Switzerland but are actively looking for work, those who are disabled, and seasonally unemployed would be classified as unemployed. Thus, I can only assume the unobserved heterogeneity stemming from non-employment (unemployment and inactive) is correlated with the observed individual characteristics. Fourth, unemployment statistics are only available annually. Individuals who are unemployed for 1 month or 11 months are both classified as unemployed; thus, the effect of unemployment could be overestimated.

5 Estimation model

For an asylum seeker possessing linguistic proximity \({L}_{n}\) between their country of origin (nationality) and destination municipality, the econometric model is given by,

$$\begin{array}{c}{Y}_{inakt}=\alpha +\beta {L}_{n}+{{{\varvec{\gamma}}}^{\boldsymbol{^{\prime}}}{\varvec{X}}}_{{\varvec{i}}{\varvec{t}}}+{{{\varvec{\rho}}}^{\boldsymbol{^{\prime}}}{\varvec{O}}}_{{\varvec{n}}{\varvec{t}}}+{\zeta K}_{nt}+{\eta }_{a}+{\theta }_{k}+{\tau }_{t}+{\varepsilon }_{inakt}\end{array}$$
(1)

where \(Y\) is the labour market outcome of an asylum seeker \(i\) of nationality \(n\), who belongs to arrival cohort \(a\), and resides in canton \(k\) in the observation year \(t\), where \(t=2010, \dots , 2014\). The coefficient of interest \(\beta\) expresses the effect of linguistic proximity based on the asylum seekers’ nationality. It quantifies the outcome differences stemming from language relatedness between populations. To estimate the causal effect of linguistic proximity, the model relies on the random assignment of asylum seekers to residential locations (i.e. cantons) and their restricted mobility in the early years of arrival. The identification assumption is that linguistic proximity is orthogonal to unobserved individual characteristics.

The labour market outcomes are partly determined by the returns to human capital. \({X}_{it}\) is the vector of individual characteristics including female, age, age squared, highest completed education, household size, geography, and asylum process. Education is a series of dummy variables, comparing the labour market outcomes of asylum seekers who completed secondary and tertiary education to those who completed primary education. Household size indicates the number of members living in the same house. Rural municipality is a dummy variable to account for occupations and wages in rural or urban economies. Asylum process is an indicator variable to distinguish the permit status of asylum seekers (permit N) from refugees (e.g. permit F and B). This is to account for the effect of mandatory work and residential mobility restrictions. Permit N holders are not permitted to work for either 3 or 6 months upon arrival. The employment restrictions are dependent on the timing of the first permit decision and the canton of assignment (State Secretariat for Migration 2015a).

\({O}_{nt}\) is the vector of country of origin characteristics to separate the effect of linguistic proximity from other factors driving migration and integration. It includes the time-varying (population ratio, stock of permanent residents in Switzerland from the country of origin, civil liberties) and time-invariant (distance between capital cities, ever in a colonial relationship, FST genetic distance, the share of French speakers) country of origin characteristics. I expect physical and cultural proximities, social networks, and common colonial past to be enablers for integration, while political turmoil and poverty intensify unemployment scarring. Because country of origin characteristics can influence labour market participation in the current year, all the time-varying variables are lagged by 1 year to mitigate the risk of reverse causality.

\({K}_{nt}\) is the time-varying canton-specific country of origin control. I use the share of individuals from the same country of birth for each canton and year to capture the co-ethnic social network effects. The positive effect of ethnic enclave size on refugees’ labour market outcomes has been documented in Switzerland (Martén et al. 2019), Sweden (Edin et al. 2003), and Denmark (Damm 2014, 2009). But Stips and Kis-Katos (2020) find no significant co-national network effects in Germany.

Fixed effects for arrival year cohorts (\({\eta }_{a})\), canton of residence (\({\theta }_{k}\)), and outcome year (\({\tau }_{t}\)) are applied to identify within-group differences. Asylum seekers who arrived in Switzerland between 2005 and 2014 are divided into six cohorts (\({\eta }_{a})\) to account for the speed of assimilation.Footnote 11 Canton fixed effects (\({\theta }_{k}\)) consider the local economies, geographies, and attitudes towards foreigners across the 26 cantons. Outcome year fixed effects (\({\tau }_{t}\)) capture differences in employment and wages throughout the observation period from 2010 to 2014. To ensure robust inference, standard errors are two-way clustered at the nationality and canton levels, with 490 clusters, to correct for heteroskedasticity and correlation within nationality and canton groups. The model assumes linguistic proximity has the same effect on labour market outcomes among individuals of the same nationality. A limitation is that it cannot assess variations within nationality.Footnote 12

Given the canton fixed effects may take away the benefits of random assignment, I also present the \(\beta\) coefficients by substituting a battery of canton characteristics for the canton fixed effects. To account for the relative canton differences in each outcome year, I control for the lagged social assistance rate of all residents and the lagged employment rate of asylum seekers (permit N holders). Moreover, I include the time-invariant assignment proportion rule to control for natives’ exposure to asylum seekers and population density.

Building on the main empirical model, I carry out three additional analyses. First, I examine the heterogeneous effects by arrival cohort. I interact linguistic proximity and other nationality level variables with arrival cohort fixed effects. I consider,

$$\begin{array}{c}{Y}_{inakt}=\alpha +\beta {L}_{n}+{\eta }_{a}+\sigma \left({L}_{n}{\eta }_{a}\right)+{{{\varvec{\rho}}}^{\boldsymbol{^{\prime}}}{\varvec{O}}}_{{\varvec{n}}{\varvec{t}}}+{{\varvec{\lambda}}}^{\boldsymbol{^{\prime}}}\left({{\varvec{O}}}_{{\varvec{n}}{\varvec{t}}}{{\varvec{\eta}}}_{{\varvec{a}}}\right)+{{\varvec{\gamma}}\boldsymbol{^{\prime}}{\varvec{X}}}_{{\varvec{i}}{\varvec{t}}}+{\zeta K}_{nt}+{\theta }_{k}+{\tau }_{t}+{\varepsilon }_{inakt}\end{array}$$
(2)

While Eq. (1) assumes the variable of interest (\({L}_{n}\)) have the same effect on labour market outcomes across individuals of the same nationality, the interaction term in Eq. (2) controls for time-varying characteristics within nationality and arrival cohort. The purpose of Eq. (2) is to examine if the effect of linguistic proximity is driven by the composition of asylum seekers across cohorts. It considers the non-linear effects of assimilation and Swiss integration programmes over years of arrival.

Second, I investigate the heterogeneous effects by language region. Work behaviour and attitudes towards foreigners may differ across language borders (Eugster et al. 2017; Brunner and Kuhn 2018). I start by interacting linguistic proximity with Romance region in the pooled model. Then, I analyse the effect of proximity to Bernese German and the Romance languages (French and Italian) in separate regressions. While separate regressions allow all coefficients and residual variance to vary across regions, the interaction approach constrains the two regions to be equal except for linguistic proximity. In an alternate model, I add the French colony history dummy to account for French fluency.

Third, I test the effect of proximity to an international language, English, on labour market outcomes. English is an effective means of communication. The skill may be valuable in the labour market. I replace the variable of interest with linguistic proximity between the origin country and English. The variable is not added to the main regression because linguistic proximity to English is highly correlated with proximity to German (correlation coefficient = 0.78) and French (correlation coefficient = 0.50). In an alternate model, I include the Education First English Proficiency Index (EF EPI) to control for English fluency.

6 Results

6.1 Linguistic proximity

In Table 3, I estimate OLS regressions following Eq. (1). Employment and living wages are the dependent variables. The linguistic proximity is defined by the destination municipality. Sixty-seven percent of the samples reside in the German-speaking municipalities, 29% reside in the French-speaking municipalities, and 4% live in the Italian-speaking municipalities. There are 17 German cantons, four French cantons, one Italian canton, and four multilingual cantons. But the differences between a multilingual and monolingual canton are absorbed by the canton fixed effects.

Table 3 The effect of linguistic proximity on labour market outcomes, 2010–2014

Table 3 column (1) reports the coefficient of linguistic proximity on employment is significant and positive. A one standard deviation increase in linguistic proximity is correlated with an increase in employment likelihood by 2.4 pps (or 6.8%).Footnote 13 To put this into perspective, a one standard deviation is equivalent to 5.1 pps of the raw linguistic proximity to Bernese German and 6.3 pps of the raw index to French and Italian.Footnote 14 The difference is equivalent to comparing individuals from Eritrea (Bernese German: 2.85; French: 3.11) to North Macedonia (Bernese German: 8.32), or Croatia (French: 9.40).

I find asylum seekers with higher linguistic proximity are more likely to earn a living wage (i.e. above the national poverty threshold). Column (2) suggests a one standard deviation increase in linguistic proximity corresponds to 1.7 pps (9.7%) increase in the likelihood to obtain a living wage. Not surprisingly, the effect size on the living wage is smaller than on employment.Footnote 15

Although I apply the canton fixed effects to control for systematic differences across cantons, one might be concerned that they remove the benefits of random assignment. In columns (3) and (4), I report the point estimates for linguistic proximity controlling for canton characteristics rather than the canton fixed effects. The coefficients are similar in magnitude and significance level. For more details, please refer to Table A 8 on coefficient and r-squared movements, Table A 9 and Table A10 on correlation between the country of origin covariates, and Table A11 on the robustness of the time-varying canton-specific country of origin variable.

While the findings are not directly comparable to those of Auer (2018) and Hangartner and Schmid (2021), as language proximity and proficiency are different measures of learning, it is still informative to understand the relative magnitude of effects. For example, Auer (2018) finds a language match (to the canton) and language course participation to increase labour market re-entry by 14% within 2 years. Hangartner and Schmid (2021) document that Francophone refugees are 4.6 pps (about 150% in relative terms) more likely to be employed after 4 years of stay in the French-speaking municipalities of Switzerland. My result is 2.4 pps (6.8%) regardless of the language region, which suggests that the magnitude of the linguistic proximity coefficient on employment is smaller than those of language proficiency.

To investigate the issue of ‘bad controls’, Table A 12 presents specifications where I remove household size (column 2), the share of French speakers (column 3), and genetic proximity (column 4). ‘Bad control’ is an econometric problem that arises when an additional variable introduces biases to the parameter of interest (Angrist and Pischke 2008, 2015; Cinelli et al. 2022). My results remain similar with the inclusion/exclusion of these observable controls.

For robustness, Table A13 (panel A) estimates alternate models where standard errors are clustered at the nationality level and multi-way clustered at the individual, nationality, and canton levels. Panel B reports the marginal effects of employment and living wage from the logit and probit regressions. Panel C reports the results using three and four arrival cohort categories, instead of six categories in the main specification. The significance, sign, and effect size of coefficients remain insensitive across the choice of clustering (panel A), regression specifications (panel B), and the number of arrival cohorts (panel C).

Overall, the main results report linguistic proximity by country of origin facilitates labour market entry. Linguistic proximity also improves the prospect of obtaining a living wage. These findings are consistent with studies which show linguistic proximity attributes to language acquisition and assimilation among migrants (e.g. Isphording and Otten 2014; Chiswick and Miller 2012). My findings also suggest that the effect of linguistic proximity is weaker than the effects found in previous work on language proficiency (e.g. through language courses).

6.2 Heterogeneous effects by arrival cohorts

In Eq. (1), I constrain the effect of linguistic proximity to be the same for all individuals regardless of nationality and arrival cohort. Results in Table 4 (Eq. (2)) allow the effect of linguistic proximity to vary across arrival cohorts. Using the interaction of the nationality level controls and arrival cohort fixed effects, I examine if the marginal effects of linguistic proximity are sensitive to the timing of arrival. This is similar to the approach of previous work to interact the variable of interest with the duration of stay (Chiswick and Miller 2012) or year of arrival dummies (Fasani et al. 2021) to quantify assimilation effects.

Table 4 The interaction of linguistic proximity and arrival cohort fixed effects, 2010–2014

Table 4 shows linguistic proximity has a positive and significant effect on employment and living wage among the 2005 and 2006 arrival cohort. However, the interaction coefficients indicate the effect of proximity reduces among recent arrivals. For example, in column (1), linguistic proximity increases employment by 3.2 pps among those who arrived from 2005 to 2006 but reduces employment to − 0.5 pps (that is, 0.032–0.037) for 13% (N = 1814) of the most recent (2011–2014) arrival cohort. Similarly, column (2) suggests that the coefficient on living wage has been reduced to about 0 pps (that is, 0.024–0.033) for 24.6% of the sample (N = 3391) who arrived since 2010.

The findings in Table 4 reveal that the positive relationship between linguistic proximity and economic outcomes is driven by earlier cohorts who arrived 5 years before the study period. However, the result may be driven by inherent differences between countries of origin despite Eq. (2) controls for many country of origin factors. I observe the earlier arrival cohorts are dominated by asylum seekers from Eritrea, Turkey, and Sri Lanka. The more recent arrival cohorts (since 2010) are mainly from Afghanistan and Syria.

All in all, the results demonstrate that linguistic proximity — the initial advantage from an origin country — comes into play in the mid-to-long-term integration process. Assimilation through the longer duration of stay is an important mechanism for the positive association between linguistic proximity and labour market outcomes. However, disadvantages upon arrival to Switzerland, such as the lack of country-specific skills, can intensify the employment differences among those of the same nationality and arrival cohort.

6.3 Heterogeneous effects by language region

I investigate whether the linguistic proximity effects are driven by individuals living in the Romance (French and Italian) region. Table 5 tests for the relative difference in linguistic proximity across regions. I interact linguistic proximity with Romance region in the referred model (columns 1 and 2), as well as the model with canton controls (columns 3 and 4). The interaction term is insignificant for employment and living wage, implying insignificant outcome differences across language regions.

Table 5 Heterogeneous effect by language region, 2010–2014

In Table A 14, I account for French ability through an indicator variable for French colonial history. Nine percent (N = 1188) of the overall sample may speak fluent French as they are nationals of former French colonies, such as Syria and Togo. The interaction term remains insignificant for employment and living wage. Furthermore, Table A 15 tests if asylum seekers from Africa have better labour market prospects in the Romance rather than the German region. Sixty-three (N = 3985) of the African sample is from Eritrea and 18% (N = 1113) of which is from Somalia. Eritrea and Somalia are not former French colonies. I find that the linguistic proximity effects are much smaller in this sample, and the interaction effects are insignificant for employment and living wage.

In Table A 16, I estimate separate regressions by language region. This specification is equivalent to a fully interacted model, which shows how all coefficients differ across regions. The magnitude of effects in the Romance region is generally larger than that of the German region. Upon testing the equality across models, I find that the effects on employment and living wage are not statistically different across language regions. Although the effect of linguistic proximity seems to be larger in the Romance region (in separate regressions), the interaction models (Table 5; Table A 14; Table A 15) suggest that there is no significant difference by region. My findings contradict with the existing literature that there is regional difference across Canada (Adserà and Ferrer 2015).

6.4 The importance of English as an international language

English is the international lingua franca and the most common foreign language in Switzerland (Swiss Federal Council 2017). Proximity to an international language is a skill. It may facilitate communications with other foreigners, English-speakers, and facilitate job-search. Analysis in Table 6 replaces the linguistic proximity variable for the municipality language with English. I find no detectable effects of proximity to English on employment and living wage in columns (1) and (4). Although the sign of coefficients is positive across outcomes, the effect sizes are small (0.9 pps for employment and 0.8 pps for the living wages).

Table 6 The role of proximity to English on labour market outcomes, 2010–2014

To test the notion of lingua franca, I control for English proficiency for asylum seekers from 22 countries (42% of the sample).Footnote 16 Columns (2) and (5) demonstrate that the effect of proximity to English on employment is marginally significant and on living wage is insignificant. Columns (3) and (6) add English proficiency (EF EPI), and the effects of proximity to English on employment and living wage are insignificant. However, these estimates are imprecise with large standard errors. I cannot rule out the role of this international language or English as a second language.

To summarise, I do not find evidence that proximity to English helps secure employment. The result supports the hypothesis that asylum seekers, who are mostly employed in low-skill jobs, have lower language requirements and needs of skill transferability. This is consistent with findings from Adserà and Pytlikovà (2015). They document no evidence of proximity to English in predicting migration from countries with low levels of education towards non-English speaking OECD countries. They deduce that the relevance of proximity and proficiency to English likely differs across migrant groups.

7 Testing the identification assumptions

My identifying assumption is that linguistic proximity is independent of unobserved individual characteristics. One way to help ensure that is through random assignment such that asylum seekers are not systematically different (in terms of characteristics) upon arrival before allocation to the 26 administrative regions. The legislation requires the administrators to allocate asylum seekers following a predetermined proportion detailed in Table A 1. Random assignment is a useful technique to generalize the independent and identically distributed (i.i.d.) assumption underlying the linear regression. Another concern in my identification strategy is selective migration over time. In what follows, I test the validity of the random assignment and assess sample selection issues.

7.1 Random assignment

To evaluate the validity of the random assignment assumption, I present three main tests. First, I check the balance of individual characteristics across cantons of residence. I test the hypothesis that the coefficients are equal and whether characteristics within the prearrival and post-arrival groups jointly predict canton assignment. To do so, I regress each observable characteristic, both prearrival and post-arrival, on canton dummies. The regression is conditional on arrival year and top 10 nationality fixed effects. I include the fixed effects because the assignment policy necessitates that asylum seekers from the top 10 nationalities be randomly assigned across cantons in each arrival year (Vogt 2018). Linguistic proximity is not included in this test as it is analogous to the top 10 nationalities.

Table 7 reports the canton coefficients (panel A), p-values for the joint test of equality (panel B), and the joint balance test for prearrival and post-arrival characteristics (panel C). The joint balance tests (panel C) fail to reject the null hypothesis for the prearrival (p = 0.63) and post-arrival characteristics (p = 0.12). This suggests both the prearrival and post-arrival characteristics are not predictive of canton assignment.

Table 7 Identification: balancing the study variables by cantons, 2010–2014

Second, I assess compliance with the regulation (i.e. the proportional distribution of individuals). According to the proportional rule, 70% of asylum seekers from the top 10 nationalities would be assigned to the German region.Footnote 17 I report the means and two-tailed p-values from the one-sample test of proportion for each nationality by arrival cohort (Table A 17) and outcome year (Table A 18). I find statistical evidence to reject the null hypothesis that the proportion of asylum seekers assigned to the German region is 0.7 at 46% (out of 55 tests by arrival cohort) and 32% of the time (out of 50 tests by outcome year) at the 5% significance level. Table A 17 suggests that the sample is less balanced among arrivals in 2005–2006 and 2008, but the imbalance may be associated with selective migration over outcome years. Table A 18 (panel C) indicates that I cannot reject the hypothesis that the proportion of top 10 nationalities is 0.7 at the 10% level across outcome years. However, I find imbalance in the tests for each nationality (Table A 18 panel A). In sum, I cannot conclude if asylum seekers are sorted into language regions.

Third, I check the correlation between linguistic proximity and cantons of residence. I regress each canton on the linguistic proximity conditional on arrival year fixed effects. Accounting for both the balance and size of the differences, Fig. 3 suggests that linguistic proximity may be able to predict 4 of the 26 canton locations. The four cantons are Zurich, Luzern, Vaud, and Geneva. They are among the more populated cantons. Each of them should be assigned with at least 4% of all asylum seekers according to the legislation (Table A 1). These cantons are also more likely to have facilities to handle special cases (e.g. medical cases and minors). Please refer to Online Appendix B for additional balancing tests and validation checks.

Fig. 3
figure 3

Identification: balance of the variable of interest for each canton, 2010–2014. Notes: This figure presents the estimated coefficient of linguistic proximity for each of the 26 cantons. The error bars present the 95% confidence interval. I regress each canton (as a binary dependent variable) on linguistic proximity (independent variable) conditional on arrival year fixed effects. Standard errors are clustered by individual. Source: The Federal Statistical Office and Swiss Longitudinal Demographic Database (2010–2014)

Although the balancing tests suggest less imbalance in the prearrival than post-arrival dimension, my findings cannot rule out that asylum seekers may be systematically different across cantons. There are several plausible explanations for the imbalance. Firstly, my dataset is restricted to a sample of permanent residents. I cannot eliminate the possibility of that selective return or onward migration took place before permanent residency. Compared with the asylum application statistics (see Table A 4), the imbalance may be driven by an underrepresentation of asylum seekers from several top nationalities, such as Nigeria, Serbia, and Tunisia. If the current sample is positively selected to stay and integrate, my results should be viewed as an upper bound.

Secondly, asylum seekers are randomly assigned following the population share of the cantons. The second set of balancing tests cannot check compliance with the regulation along other observable dimensions that can infer labour market skills. Thirdly, special cases (e.g. minors, medical cases, and family reunification) are likely correlated with nationality, but they are not subject to the randomization policy. Although the official record suggests that special cases only account for about 11% of all applications (Table A 2), I cannot identify these individuals from the Structural Survey. If the non-compliers are identifiable and that they are more likely to reside in the four cantons (where I report imbalances in Fig. 3), the baseline balance will likely improve.Footnote 18

7.2 Selective migration

I assess the balance of characteristics between stayers (those who stayed) and leavers (those who departed at any outcome year). I consider individuals with missing residential locations and never returned to Switzerland as leavers. While I drop observations without information on residential location in the main analysis, I incorporate the dropped observations to evaluate selection in migration over time in Table A 19. I examine the difference of means between stayers and leavers (column 3). In the subsample of leavers, I examine the difference in means between the year before departure and other years in Switzerland (column 6).

Table A 19 column (3) shows that leavers perform significantly worse in the labour market than stayers. I also find that leavers are significantly younger, more likely to face residential location and work restrictions (permit N asylum seekers), reside in the Romance region, and reside in border cantons. The (insignificant) results suggest that the leavers are male, less educated, single, and without children. I do not find significant difference in labour market outcomes between the year before departure and other years (column 6).

While asylum seekers and migrants are not directly comparable due to different labour market treatments upon arrival and abilities to return home, my findings broadly align with recent work on emigration behaviour in Switzerland. Wanner et al. (2021) report that migrants who are male, single, residing in Geneva, and undereducated are more mobile. But migrants who are residing in rural areas, re-employed, bilingual, and from transition countries (e.g. the Balkans) tend to stay. My result suggests asylum seekers with weak social ties and who perform poorly economically tend to leave Switzerland. The current sample is likely positively selected over time; thus, my results likely represent an upper bound of the true causal estimate.

Based on a thorough evaluation of the random assignment, theoretical assumptions of the fitted model, and selective migration, I refrain from making strong causal claims on the linguistic proximity estimates. Although the policy and procedural documents support that asylum seekers are randomly assigned to cantons, the randomization tests show that the 26 cantons are not entirely balanced on all observable characteristics. The test on selective migration also suggests that the sample is positively selected over time. To ensure a consistent estimate, I investigate the role of unobservable selection in the robustness section.

8 Robustness checks

8.1 The effect of unobservables

One may be concerned about the effect of unobservables at the individual and country of origin level, such as the assigned canton, health, training and work experience in the home country, and similarity in education systems. These unobserved factors may affect selection into language regions and economic choices, and they would explain the results if observed. To diagnose the extent of omitted variable bias, I implement a method proposed by Altonji et al. (2005) and Oster (2019). The main assumption of the method is that selection between unobservables and observables are proportional to changes in the estimated coefficient and explanatory power (i.e. r-squared), as observed covariates are included in the model.

The method assumes that the observed and unobserved variables are orthogonal components of individual characteristics. When the ratio of selection between unobservables and observables (\(\delta )\) is 1, it suggests that unobservables and observables (that is, all the existing covariates and fixed effects) have equal explanatory power to the outcome. When the values of \(\delta >1\), it suggests a robust result as observables are guided by economic theories and that omitted variables are unlikely to be more important in explaining outcomes than the included variables (Altonji et al. 2005). When the values of \(\delta <0\), it suggests unobservables and observables would explain the outcome in an opposite direction; thus, the bias would drive the estimated effect towards zero.

To assess omitted variable bias, I implement the method in Oster (2019) to estimate the value of \(\delta\) (selection on unobservables relative to observables) under assumptions for (i) zero effect of linguistic proximity and (ii) the maximum explanatory power of the model with all unobservables (\({R}_{\mathrm{max}}=k\times \tilde{R },\) where \(\tilde{R }\) is the r-squared from the preferred model).Footnote 19 Oster (2019) argues that \({R}_{\mathrm{max}}<1\) due to measurement error. Guided by randomised data, \({R}_{\mathrm{max}}\) is recommended as 1.3 times that of the r-squared value of the regression with all observables (\(k=1.3\)).

Table 8 Robustness test: selection on observables versus unobservables, 2010–2014

Table 8 presents the values of \(\delta\) (panel A) and coefficient estimates (panel B) for different values of \({R}_{\mathrm{max}}\). In panel A, I find the positive effect on employment to survive under the conventional assumptions: It is unlikely that the explanatory power of the excluded variables to be 1.2 times as important as the included variables (recall: arrival year, education, and so on). However, the estimated \(\delta\) for living wage is 0.9, implying that the OLS estimate would be driven by selection bias if selection on unobservables is 90% as strong as selection on observables. If I assume a greater \({R}_{\mathrm{max}}\) where \(k=2\), selection on unobservables would only need to be 40% (employment) and 30% (living wage) greater than selection on observables to drive the effect of linguistic proximity to zero. These results imply that the OLS estimates may overstate the positive effects of linguistic proximity on labour market outcomes.

In Table 8 panel B, I estimate the coefficient bounds considering the equal importance of selection on unobservables and observables (\(\delta =1\)). I find that the positive effect of linguistic proximity on employment and living wage to be smaller and close to zero. The coefficient bounds at \(k=1.3\) are [0.006, 0.024] for employment and [− 0.002, 0.017] for living wage. This implies that a one standard deviation increase in linguistic proximity is associated with an increase in employment by 0.6 to 2.4 pps. The coefficient sets on living wage contain zero if \({R}_{\mathrm{max}}\) exceeds 0.22 (where \(k=1.3)\). Considering refugees move due to ‘non-economic’ reasons, they are less positively selected than economic migrants (Chiswick 1999). There are likely other omitted variables for which I cannot control, such as initial canton, health status, work, and education experiences of the analysis sample. Therefore, the employment finding can only be considered an upper bound.

8.2 Alternative linguistic proximity indices

It is crucial to ensure the estimated effects of linguistic proximity are not specific to this index. I compare my main results to three other measures. First, I use the highest proximity score to the municipality language based on the origin country’s first official, second official, majority, and minority language. This would be the ‘best’ proximity score or the highest language capital one can attain. Second, I use another linguistic proximity index calculated from the ASJP program, which is 100% minus LDND between Standard German and the majority language from the origin country. In Switzerland, all official documents are published in the standard form of the official languages (in this case, Standard German) (Swiss Federal Council 2020). I hypothesize the two languages would yield similar effects on labour market outcomes. Third, I use another linguistic proximity index (hereafter index (A&P), to avoid confusion) constructed by Adserà and Pytliková (2015). The authors calculate an index using levels of the linguistic tree between the first official language of countries. I take the index between origin country and destination countries, which are Germany, France, and Italy, to test the robustness of my results.Footnote 20 For ease of comparison, I standardize all the raw indices by their standard deviations.

Table 9 panel A verifies the significant and beneficial effect of linguistic proximity on employment and the living wage measure. Coefficient signs, magnitudes, and significance levels are comparable across specifications. Note that the estimates are generally bigger and more statistically significant from the standardized A&P index. This is likely because the original A&P index is a categorical variable taking six values ranging from 0 to 1. This robustness test suggests that the effect of linguistic proximity is not sensitive to the specific proximity measure. The majority score is still the preferred measure as it avoids overstating the effect of proximity (unlike the highest proximity score) and it directly relates to Bernese German.

Table 9 Summary of robustness tests, 2010–2014

8.3 Further controls and fixed effects

I explore the effect of omitted confounders that are correlated with labour market outcomes and linguistic proximity. I add language skills to the main regression model in Table 9 panel B1. Language skill is the reported main language taken from the Structural Survey. It is a dummy variable of self-reported language which one thinks in and knows best. Asylum seekers who reported using the main language of the destination municipality are compared to individuals who reported otherwise. The main language in the German region is German and in the Romance region is French and Italian. The effect of linguistic proximity remains unchanged with the inclusion of language skills.Footnote 21

Although the preferred specification accounts for individual and country of origin characteristics, one may be concerned about the effect of unobserved continent heterogeneity, for example, racial discrimination from employers, to exacerbate unequal access to employment opportunities. I consider two other specifications where I exclude country of origin characteristics and include (i) nationality and (ii) nationality-region fixed effects. In the latter, I group the nationalities of asylum seekers into 12 regions (United Nations 1999).Footnote 22 The additional specifications represent more conservative assumptions that asylum seekers from the exact nationality or the same region would share similar observable and unobservable characteristics that can explain their labour market performance.

Table 9 panel B2 reports the effect of linguistic proximity is no longer statistically significant in the regression with nationality fixed effects. But the findings are robust to using nationality-region fixed effects (panel B3). Considering there are 72 nationalities and only two language regions, my results suggest that there is less within-nationality variation than across-nationality variation among asylum seekers.

8.4 Sample restrictions

To examine the potential effect of selection and measurement error, I run the main specification with five sample restriction criteria in Table 9 panel C. First, I investigate whether the empirical strategy is sensitive to endogenous internal migration. In panel C1, I exclude 50% of the sample (N = 6874) who are legally allowed to move across cantons (permit B, C, and naturalised citizens). The excluded sample does not face any residential location or labour market restrictions. Both the employment and living wage effects are smaller. The result for employment becomes marginally significant, and living wage is no longer detectable. I observe that the excluded sample is older, more educated, and more likely to reside in urban areas than the remaining (permit N and F) sample. It is plausible that the excluded sample is positively selected as the legislation requires refugees to be economically dependent to be considered for canton change.

Second, I test a potential source of selection stemming from mobility within the four multilingual cantons (i.e. across municipalities): Bern, Fribourg, Graubünden, and Valais. While I observe municipalities of residence, the municipality fixed effects would absorb all the variations across nationalities. To rule out endogenous mobility to a more favourable language or work environment, I exclude 19.5% of the sample (N = 2684) who may move within a multilingual canton. Panel C2 validates my results remain robust.

Third, I explore the effect of outlier countries with high linguistic proximity. I exclude the asylum population from six countries with linguistic proximity at the top 1%, namely Cameroon, Columbia, Cuba, Liberia, Serbia, and the USA. This is equivalent to 1.6% of the sample (N = 223). Panel C3 indicates the result for employment yields the same magnitude and significance level, but the effect on living wage is smaller and no longer discernible.

Fourth, I consider a potential source of measurement error from the nationality information. The information comes from the ZEMIS foreign register that is administered by SEM and that I can verify its accuracy with the country of birth in most cases. However, the country of birth is missing in 6.3% of the sample (N = 870). To rule out this potential source of measurement error, I exclude these individuals from the main regressions. Panel C4 confirms my results are similar.

Finally, I consider the effect of age at arrival on language. To take this into account, I exclude 4.9% of the sample (N = 671) who arrived before the age of 18. The excluded sample originates from 32 countries with the same top 10 countries as the main sample. Seventy percent of the excluded sample resides in the German region, similar to the main sample. Panel C5 confirms that my results are similar.

9 Conclusion

This paper is among the first to explore the relationship between linguistic proximity and labour market outcomes of the asylum population in a multilingual country. The variable of interest — linguistic proximity — captures the skill transferability of human capital among asylum seekers in Switzerland. Using administrative records of permanent residents from 2010 to 2014, I find that linguistic proximity is associated with positive employment outcomes and a higher likelihood of obtaining a living wage. The overall effect of linguistic proximity is driven by the earlier arrival cohorts, suggesting that linguistic proximity cannot mitigate the initial disadvantages upon arrival. I do not find evidence for differences between language regions. I find that proximity to English has no discernible effect on economic integration, which underlines the importance of being fluent in the local language. Further analysis shows that the positive effect on employment is likely to remain even if selection on unobservables is as important as selection on observables. Other than the results on regional differences, these findings are consistent with the wider research on the influence of language on economic integration.

Although the central assignment of asylum seekers by the Swiss immigration authority (SEM) may alleviate endogenous selection bias, I find a lack of perfect balance and the least successful asylum seekers leave Switzerland. Thus, this study is descriptive rather than causal. If I can observe the outcomes of the underrepresented nationalities and unbiased language proficiency of the asylum population, I would expect the effect of linguistic proximity to decrease. The intuition is in line with the assessment of selection issues and the effect of language skills reported in previous studies from Switzerland.

From a policy perspective, the government should continue devoting substantial resources to teaching the Swiss languages and assessing the language proficiency of the asylum population. Several studies show proficiency acquired through language training increases job accessibility (e.g. Adserà and Ferrer 2015; Lochmann et al. 2019). Future work should investigate the underlying mechanisms, whether the role of linguistic proximity acts through the channel of cultural bias, language training, or integration policy (e.g. Slotwinski et al. 2019). This is complementary to existing initiatives, such as improving the acknowledgement of education degrees and providing hiring incentives to firms. It would also be useful to investigate other aspects of language, such as the complexity of grammatical structure, sentence formation, and linguistic diversity in the origin country (see examples of linguistic diversity, e.g. Chevalier et al. 2020; Dale-Olsen and Finseraas 2020). These other dimensions of language may increase returns to language fluency and facilitate skill transfer in host countries.