1 Introduction

It is a common international phenomenon that the immigrant population is geographically concentrated. In 1990, 63% of the foreign-born population in the USA were clustered in the four most populous states, California, New York, Florida and Texas, where only 31% of the overall population lived (Zavodny 1997). In 1998, 52% of the foreign-born population in Denmark lived in the metropolitan area where only 34% of the overall population lived (Ministry of Internal Affairs 1999).

Knowledge of which regional factors influence location choices of recent immigrants helps local policymakers anticipate which locations can expect to receive immigrants in the future. Most US studies on recent immigrants’ location choices find that recent immigrants are attracted to large cities in which earlier cohorts of co-ethnics and other immigrants have settled (Bartel 1989; Zavodny 1997, 1999; Jaeger 2000; Bauer et al. 2002, 2005). Migration theory predicts that immigrants are attracted to regions with favourable income prospects. However, US studies have found contrasting evidence on whether immigrants are sensitive to regional differences in labour market conditions, welfare eligibility and social benefit levels (Bartel 1989; Jaeger 2000; Zavodny 1997, 1999; Borjas 1999). With the exception of the Swedish study by Åslund (2005), few studies have investigated location choices of recent immigrants outside the US.

Most previous studies investigate immigrants’ location preferences by estimating pull factors, i.e. the set of negative or positive social or economic factors in the potential areas of destination which pulls migrants towards them (Lee 1966), using information on immigrants’ first choice of location in the host country. However, estimates from a standard choice model will not reflect preferences, if individual-specific costs of choosing some regions over others are unobserved.

An alternative way of learning about immigrants’ location preferences is to estimate push factors, i.e. the set of negative or positive social or economic factors in an area of origin which pushes migrants away (Lee 1966), based on immigrants’ subsequent internal migration pattern. However, push factor estimates may be biased if immigrants sort into initial locations based on unobserved personal attributes that also influence the secondary migration probability.

The main strength of this paper is that it provides quasi-experimental evidence on recent immigrants’ location preferences by exploiting a governmental spatial dispersal policy on refugees in Denmark, which was carried out between 1986 and 1998. The dispersal policy implied that new refugees were randomly distributed across locations in Denmark conditional on six personal attributes, which are largely observed in Danish administrative registers. The dispersal policy is exploited to estimate push factors because the policy is especially suited for consistent estimation of push factors rather than pull factors in placed refugees’ secondary migration. Controlling for the personal attributes that may have affected the initial location in the migration decision, the initial location can be regarded as exogenous in the subsequent migration decision. Due to the exogeneity of the initial location, the push factor estimates are unaffected by initial location sorting.

An additional strength of the study is the empirical model. Push factors are estimated in a duration model framework that aims at capturing the distribution of the hazard function of secondary migration at time t because the hazard function is closely related to the underlying economic behaviour. The residential history of a refugee since the date of immigration is reconstructed from Danish longitudinal administrative register data on the immigrant population, which also contains a wide range of personal attributes. The location choice analysis is restricted to the first migration investment because of the exogeneity of the initial location in contrast to the endogeneity of subsequent locations. The main geographical unit of location used in the study is a municipality because the Danish spatial dispersal policy aimed at an equal distribution of refugees, not only at the county level, but also at the municipality level. Hence, a move across the municipality border is regarded as a migration investment.

The main contributions of the study are the following. First, the study explains why recent immigrants are attracted to large cities. It does so by providing new, quasi-experimental evidence that a lack of rental housing (including social housing) and a lack of institutions for qualifying education are push factors and that, after controlling for these factors, city size has an insignificant effect on the hazard rate of secondary migration. I interpret this as evidence that recent immigrants are attracted to large cities to get access to housing and institutions for qualifying education. Second, the study contributes to solving the controversy about whether recent immigrants are sensitive to regional differences in economic conditions by providing quasi-experimental evidence that a relatively high regional unemployment rate is a push factor in placed refugees’ secondary migration and by demonstrating that the use of micro-data with endogenous initial location of residence may explain the lack of a robust finding in previous studies.

2 The natural experiment

1986 marks the start of the first Danish spatial dispersal policy on refugees and asylum seekers who had just received a permit to stay for reasons of asylum.Footnote 1 Henceforth, I refer to such recognised refugees and asylum seekers as refugees. The Danish Government urged the Danish Refugee Council to implement the dispersal policy after a surge of refugees in the mid-1980s made it increasingly difficult for the Council to satisfy the location preferences of most new refugees for accommodation in the larger cities. The policy was in force until 1999 under the charge of the Council.

Spatial dispersal was a two-stage process. At the country level, the Council aimed at the attainment of an equal number of refugees relative to the number of inhabitants across counties. At the county level, the Council aimed at attaining an equal number of refugees relative to the number of inhabitants across municipalities (local authority districts) with suitable facilities for reception such as housing, educational institutions, employment opportunities and co-nationals.Footnote 2 These dispersal criteria implied that refugees were provided with permanent housing in cities and towns and to a lesser extent in the rural districts (Ministry of Internal Affairs 1996).Footnote 3

In practice, the settlement took place in three steps. First, as soon as a refugee was granted asylum, the individual was offered assistance from the Council in finding housing. If the individual accepted the offer, he filled in a form concerning his background such as family members and nationality. Second, approximately 10 days later, the Council assigned the individual to 1 of 15 counties. Third, having been provided with temporary housing in the receiving county, local offices of the Council assisted the assigned refugees in finding permanent housing in the county.Footnote 4

Dispersal was voluntary in the sense that only refugees who were unable to find housing themselves were subject to the dispersal policy. However, the take-up rate was high; between 1986 and 1997, approximately 90% of refugees were provided with permanent housing by the Council (after 1995 by a local government) under the terms of the dispersal policy (Annual Reports of the Danish Refugee Council 1986–1994 and the Council’s internal administrative statistics for 1995–1998).

Once settled, the refugees participated in Danish language courses during an introductory period of 18 months while receiving social assistance. The refugees were urged to stay in the assigned municipality during the entire introductory period. However, there were no relocation restrictions. Refugees could move away from the municipality of assignment at any time, in so far as they could find alternative housing elsewhere. Receipt of social benefits was unconditional on residing in the assigned municipality.

Figure 1 provides evidence that the Council aimed at distributing refugees equally between municipalities relative to the number of inhabitants in municipalities and was relatively successful in attaining the goal. The figure is a reproduction of a figure in the Council’s Annual Report in 1987 and shows the number of refugees who were assigned to permanent housing between 1985 and 1987 per 10,000 inhabitants across municipalities. At the end of 1987 the country average was 33 refugees per 10,000 inhabitants (Danish Refugee Council 1987, 30). One can see from the figure that, only 2 years after the introduction of the dispersal policy, refugees lived in 243 out of the 275 municipalities, and the number of refugees per 10,000 inhabitants exceeded the country average by a factor of 2 in only 17 municipalities.

Fig. 1 Number of refugees assigned to permanent housing per 10,000 inhabitants in the period 1985–1987. Note: Reproduction of figure in the Annual Report of the Danish Refugee Council 1987

The important question of whether refugees were randomly distributed across locations under the spatial dispersal policy is analysed in a related study by Damm (2005). The study examines the initial settlement pattern of refugees who got asylum between 1986 and 1998 based on a range of data sources: an interview with two placement officers at the Council, the Council’s internal administrative statistics and administrative registers. The study concludes that the Danish spatial dispersal policy on refugees carried out between 1986 and 1998 gave rise to a random initial distribution of refugees who were provided with or assisted in finding permanent housing by the Council, conditional on six characteristics of the individual: health (in need of special medical or psychiatric treatment), educational needs, location of close relatives, family size, nationality and year of immigration. The main reasons are given below.

First, according to the interview with two former placement officers, the Council aimed at satisfying location wishes of refugees who wished to be assigned to a location near close family members and at assigning refugees who were in need of special medical or psychological treatment or education to a location in which the treatment or education was available. Secondly, in almost every year, a larger share of singles than refugees with family waited more than 9 months for permanent housing (annual reports of the Danish Refugee Council 1986–1996 and internal administrative statistics of the Danish Refugee Council 1992–1997) suggesting that it was typically more difficult to find permanent housing for singles than for refugee families. Thirdly, over time, it became increasingly difficult for the Council to find vacant rental housing units in the larger and medium-sized towns which suggests that refugees who arrived in the first years after the introduction of the dispersal policy were more likely to realise their potential location wish. Fourthly, refugees from certain source countries have apparently been less likely to be assigned to a larger city. Empirical evidence from administrative registers presented in Damm (2005) shows that this is the case for refugees from Sri Lanka, Palestinian refugees from Lebanon and in particular for refugees from Bosnia–Herzegovina who were dispersed under a special introduction programme that included settlement in rural districts. Finally, there is some evidence from the Council that reluctance to accept assignment to a non-preferred county was of minor importance for the initial settlement. First, the Council claims in its annual report of 1986 that the location wish of refugees to live in a larger city became less pronounced after the implementation of the dispersal policy. Second, according to the Council’s internal administrative statistics, the share of placed refugees who were re-assigned to another county was low.

Note that due to the way in which the dispersal policy was implemented, municipalities had little opportunity to cream-skim refugees, i.e. to ask for, say, well-educated refugees. Although the Council did not know in advance which groups of asylum seekers would next be granted asylum, it had to provide refugees with temporary housing shortly after receipt of the residence permit. This procedure left little time for negotiation between the Council and municipalities. Moreover, the Council acted as a private agent searching for housing in the local housing market on behalf of refugees who had just received a residence permit. The local authorities typically were not informed about the settlement of a refugee in the municipality before a refugee had been provided with housing in the municipality. There is some empirical evidence to back this claim. Linear regression of the number of inhabitants in the municipality of assignment on a range of characteristics of the individual shows absence of a significant correlation between the size of the municipality of assignment and an individual’s educational level.

Three of the six personal attributes which may have influenced the location of assignment of a refugee are observable in Danish administrative registers that cover all immigrants: family size, nationality and year of immigration. In addition, the registers contain variables which may be good proxies for two of the three unobservable personal attributes: Age and nationality may be decent proxies for an individual’s educational need, and nationality and the size of the ethnic stock at the time of immigration may be decent proxies for the likelihood of having close relatives in Denmark at the time of immigration. In conclusion, one potentially important individual characteristic for initial settlement is unobserved in administrative registers: health status at the time of immigration.

3 Methodology

3.1 Methodological concerns

The standard micro-econometric approach to revealing location preferences of immigrants is to estimate pull factors by estimating a conditional logit model (McFadden 1973) using cross-sectional data on recent immigrants’ location choices in the host country. However, estimates from a standard choice model will not reflect preferences, if individual-specific costs of choosing some regions over others are ignored. Suppose, for instance, that immigrants who are not proficient in foreign languages have higher costs of settlement outside an ethnic enclave than immigrants with foreign language proficiency and that foreign language proficiency is unobserved to the researcher. In that case, estimation of pull factors will be biased.

An alternative way of learning about immigrants’ location preferences is to estimate push factors using longitudinal data on immigrants’ initial and subsequent location choice. However, push factor estimates may be biased due to location sorting, which is present if characteristics of the area of origin are correlated with unobserved personal attributes that also influence the migration probability. Suppose for example that immigrants who are not proficient in foreign languages tend to live in ethnic enclaves and are relatively less prone to secondary migration. In that case, the correlation between the probability of migration and ethnic enclave size may be driven by the unobserved factor, foreign language proficiency.

A dispersal policy on refugees under which refugees are initially randomly distributed across locations by the authorities will under some circumstances eliminate the bias due to location sorting in a push factor analysis. The first case is if secondary migration costs are small relative to the potential gains from moving. If in contrast secondary migration costs are large relative to the potential gains from moving, secondary migration costs must be constant across locations and uncorrelated with having preferences for one type of region, conditional on personal attributes and regional attributes of the location of assignment (Åslund 2005).

The circumstances under which a dispersal policy on refugees will also eliminate the bias due to location sorting in a pull factor analysis of initially placed refugees’ secondary migration are more restrictive. Åslund (2005) demonstrates formally that, in the case of a pull factor analysis, a further requirement is that the authorities initially distribute people equally between regions of different types. Unfortunately, this condition is not satisfied in the Danish case since refugees who were subject to the Danish dispersal policy were provided with permanent housing in cities and towns and to a lesser extent in the rural districts.

I therefore only exploit the Danish dispersal policy on refugees to estimate push factors. I further restrict the push factor analysis to push factors in the first migration decision after initial assignment. The reason is that even in the case of initially random assignment of individuals across locations, location sorting may be an important issue in subsequent spells (Ham and Lalonde 1996). Hence, initially placed refugees may have a different sorting process into subsequent location spells. Furthermore, I investigate the extent to which push factor estimates from micro-data with endogenous initial location suffer from location sorting by comparing push factor estimates for refugees who were subject to the Danish dispersal policy with push factor estimates for similar refugees who got asylum before the implementation of the dispersal policy.

3.2 Empirical model

The standard empirical approach to estimation of push factors is to model the migration decision as a binary logit model, see e.g. Bartel (1989) and Åslund (2005). That approach aims at capturing the moments (i.e. mean) of the probability of migration at time t. In contrast, the push factor estimation approach used in this study focuses on capturing the distribution of the hazard function of secondary migration at time t because the hazard function is closely related to the underlying economic behaviour.

After assignment to a municipality of residence, a placed refugee has to decide the optimal length of stay in the municipality of assignment. This decision is likely to be a function of the expected utility level associated with continued residence in the municipality of assignment and the expected migration costs. Neither the expected utility level associated with continued residence in the municipality of assignment nor the expected migration costs can be observed directly, but they are likely to be determined by the attributes of the municipality of assignment and personal attributes of the individual. One can obtain consistent estimates of push factors in the migration decision using duration techniques, given access to micro-data for randomly assigned refugees on the actual spell of residence in the municipality of assignment, attributes of the municipality of assignment and other potential determinants of the spell of residence in the municipality of assignment.

Let the random variable \( T_{i} \) denote time until relocation out of the municipality of assignment of individual i. Let \( x_{i} \) be the vector of initial values of personal attributes and attributes of the municipality of assignment of individual i. Let \( v_{i} \) be time-invariant unobserved heterogeneity of individual i. \( v_{i} \) is assumed to be uncorrelated with the observed attributes \( x_{i} \). Subscript i is suppressed below for notational simplicity.

The key variable is the hazard rate of relocation out of the municipality of assignment, which in continuous time is defined as the transition rate out of the municipality of assignment at time t, conditional on having stayed in the municipality of assignment at least until t.
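As a notational sketch, the verbal definition above corresponds to the standard continuous-time hazard. The symbols \( \Delta \) and \( S(t) \) (the survivor function) are introduced here for illustration, and covariates are omitted for brevity:

$$ h(t) = \lim_{\Delta \to 0^{+}} \frac{P\left( t \leqslant T < t + \Delta \mid T \geqslant t \right)}{\Delta}, \qquad S(t) = P\left( T \geqslant t \right) = \exp\left( - \int_{0}^{t} h(s)\, ds \right) $$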

I specify the hazard function for relocation out of the municipality of assignment at time t for an individual with observed characteristics x and unobserved heterogeneity v as a standard mixed proportional hazard (MPH) model with time-invariant explanatory variables (Lancaster 1979; Vaupel et al. 1979):

$$ h{\left( {t\left| {x,v} \right.} \right)} = \lambda {\left( t \right)} \cdot \varphi {\left( x \right)} \cdot v $$
(1)

where λ(t) is the baseline hazard, which gives the shape of the hazard function for any individual, and φ(x) is the systematic part of the hazard. The proportional hazard specification implies that only the level of the hazard function is allowed to differ across individuals since covariates are restricted to have a proportional effect on the baseline hazard. As is commonly done, I specify the systematic part of the hazard φ(x) as

$$ \varphi {\left( x \right)} = \exp {\left( {x\prime \beta } \right)}. $$
(2)

Consequently, the hazard function is multiplicative in all separate elements of x.
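To make the multiplicative structure explicit, Eqs. 1 and 2 can be combined as below. The index j over the elements of x and the number of covariates J are notation introduced here for illustration.

$$ h\left( t \mid x, v \right) = \lambda(t) \cdot v \cdot \prod_{j = 1}^{J} \exp\left( x_{j} \beta_{j} \right) $$

Under this specification, a one-unit increase in \( x_{j} \), holding everything else fixed, multiplies the hazard by \( \exp(\beta_{j}) \) at every duration t.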

A right-censored residential spell contributes to the likelihood with the probability of residence in the municipality of assignment until time t (the survivor function), while a residential spell that is completed at time t contributes with the product of the probability that the spell lasts at least until time t (the survivor function) and the conditional probability of completion of the spell at time t (the hazard function). The total likelihood contribution from a residential spell of an individual is the likelihood contribution of the residential spell conditional on v, integrated over the distribution of the unobserved covariates. The intuition is that because an individual’s type is not known, the likelihood function is a mixture over types weighted by their sample probabilities (Heckman and Singer 1984).
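The likelihood contribution described above can be sketched in symbols as follows. This is an illustration under stated assumptions rather than the exact expression used in estimation: d is an indicator equal to 1 for a completed spell and 0 for a right-censored spell, S is the survivor function and G is the distribution of v. All three symbols are introduced here.

$$ S\left( t \mid x, v \right) = \exp\left( - \varphi(x) \cdot v \int_{0}^{t} \lambda(s)\, ds \right) $$

$$ L\left( t, d \mid x \right) = \int h\left( t \mid x, v \right)^{d} \, S\left( t \mid x, v \right) \, dG(v) $$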

I choose a flexible model for the unobserved covariates v. The marginal distribution of the unobserved term is specified as a discrete distribution with two unrestricted mass point locations. Let \( v_{m} \), m = 1, 2, denote the two mass points of v. Each mass point is observed with probability \( p_{m} \), to be estimated, with \( 0 \leqslant p_{m} \leqslant 1 \) for m = 1, 2 and \( \sum\limits_{m = 1}^{2} p_{m} = 1 \). I normalise the distribution of the unobserved term by letting \( v_{1} = 1 \).

The baseline hazard function is assumed to be piecewise constant, i.e. \( \lambda {\left( t \right)} = \exp {\left( {\alpha _{k} } \right)},k = 1, \ldots ,K \), where K is the number of intervals of the baseline hazard function. The length of the baseline intervals is chosen by inspection of the Kaplan–Meier empirical hazard function for relocation out of the municipality of assignment in Fig. 4.Footnote 5
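The pieces of the model (piecewise-constant baseline, exponential systematic part and two mass points) can be combined into the log-likelihood contribution of a single spell. The following is a minimal Python sketch, not the estimation code used in the study. The function names, the argument layout (`cut_points` for interior interval boundaries, `v2` and `p1` for the free mass point and the probability of the first mass point) and the convention that each interval is closed on the left are assumptions made for illustration.

```python
import math


def baseline_hazard(t, cut_points, alphas):
    """Piecewise-constant baseline: lambda(t) = exp(alpha_k) on interval k.

    cut_points holds the K-1 interior interval boundaries in ascending order;
    alphas holds the K log-levels alpha_1, ..., alpha_K.
    """
    for k, cut in enumerate(cut_points):
        if t < cut:
            return math.exp(alphas[k])
    return math.exp(alphas[-1])


def integrated_baseline(t, cut_points, alphas):
    """Integral of lambda(s) over [0, t], summed interval by interval."""
    total, lower = 0.0, 0.0
    for k, alpha in enumerate(alphas):
        if t <= lower:
            break
        upper = cut_points[k] if k < len(cut_points) else math.inf
        total += math.exp(alpha) * (min(t, upper) - lower)
        lower = upper
    return total


def spell_log_likelihood(t, completed, x, beta, cut_points, alphas, v2, p1):
    """Log-likelihood contribution of one residential spell of length t.

    completed is True for a spell ending in relocation and False for a
    right-censored spell. The mass points are v1 = 1 (normalisation) and v2,
    with probabilities p1 and 1 - p1.
    """
    phi = math.exp(sum(xj * bj for xj, bj in zip(x, beta)))  # exp(x'beta)
    lam = baseline_hazard(t, cut_points, alphas)
    cum = integrated_baseline(t, cut_points, alphas)
    mixture = 0.0
    for v, p in ((1.0, p1), (v2, 1.0 - p1)):
        survivor = math.exp(-cum * phi * v)  # probability of staying until t
        hazard = lam * phi * v               # exit rate at t
        mixture += p * (hazard if completed else 1.0) * survivor
    return math.log(mixture)
```

Summing `spell_log_likelihood` over all individuals gives a sample log-likelihood that could be passed to a numerical optimiser; how the parameters are constrained during optimisation (for example keeping `p1` between 0 and 1) is left open here.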

3.3 Model identification

Given normalisations of the mean of the unobserved covariates (finite means) and weak requirements for variation in the observed covariates, the Mixed Proportional Hazard model is identified non-parametrically if the observed covariates are independent of unobserved characteristics influencing the hazard rate (Elbers and Ridder 1982). The latter identification condition requires the initial settlement to be independent of any unobservable personal attributes affecting the hazard rate of relocation out of the municipality of assignment. This requirement is satisfied if all personal attributes that have influenced the initial settlement are included as explanatory variables in the model.

As mentioned in Section 2, the initial settlement of new refugees may have been influenced by one unobserved personal attribute, namely whether an individual was in need of special health treatment at the time of asylum. However, there was no systematic mental health examination of refugees at the time of assignment. Furthermore, since mental health problems are taboo, they tend to be treated at a late stage, if treated at all. Whether a refugee was in need of special mental health treatment at the time of assignment is therefore likely to have had little influence on initial settlement.

The estimated model will yield consistent estimates of push factors in the absence of initial selection into regions of refugees in need of special health treatment and given that the included proxy variables adequately control for potential selection into regions of refugees with special educational needs and relatives in Denmark. However, some of the estimated effects of demographic characteristics of the individual on the subsequent migration decision reflect correlations to the extent that these characteristics affected the initial location. Hence, they should not be given a causal interpretation and are therefore not reported in Section 5.

4 Data

4.1 Refugee sample

Micro-data on refugees are extracted from longitudinal administrative registers of Statistics Denmark on the immigrant population in Denmark 1984–2000, henceforth referred to as the immigrant data set. The registers contain information on an individual’s county and municipality of residence (at the end of each year) and the date of the last residential move (by the end of each year). Such information is available because Danish law requires a residential move to be reported to the municipality of destination within a fortnight of the move. Using this information, I construct spells of municipality of residence for each individual. The spell durations are measured in months. Since the analysis concerns determinants of the first migration investment, I only follow an individual until the end of the spell of residence in the municipality of assignment or until December 2000 if the spell of residence in the municipality of assignment is right-censored.

The immigrant data set contains information on a variety of demographic and socioeconomic characteristics of the individual. I use the following information as control for the personal attributes that may have affected the initial location: marital status, children indicators, age and size of the ethnic stock at the time of immigration as well as year of immigration and country of origin. These variables may at the same time control for individual-specific differences in the expected utility gain and costs of migration. For the latter reason, I also include sex and years of education as controls. The data reliability is high since the data almost exclusively stem from administrative registers. The information on years of education is an exception; for immigrants who have not studied in Denmark, the information on years of education stems from a survey by Statistics Denmark.

There are 35,563 individuals, of whom 21,108 are men, in the extracted refugee sample. Ideally, this sample should cover observations on all adult refugees who were assigned to a municipality by the Council under the terms of the spatial dispersal policy carried out from 1986 to 1998. However, the administrative registers lack information on admission category of immigrants.

Instead, I extract immigrants who immigrated to Denmark for the first time between October 1985 and December 1997 from refugee-sending countries. I use two criteria for selection of refugee-sending countries. First, a relatively large number of refugees originate from the country. Second, the number of non-refugee immigrants relative to the total number of immigrants from the country is small so that the sample of refugees is relatively uncontaminated. I use the official statistics on the total number of residence permits granted to refugees between 1985 and 1997 reported in Table 6 in the Appendix to select the largest refugee-sending countries in the period. The ten largest refugee-sending countries (ranked in descending order) are Former Yugoslavia, Palestine, Iran, Iraq, Somalia, Sri Lanka, Vietnam, Afghanistan, Poland and Rumania. Furthermore, for each refugee-sending country, I calculate the number of non-refugee immigrants relative to the total number of immigrants from that country during the period in which the country sent refugees. This ratio ranges from 0.029 for Iraq to 0.562 for Poland. The share of non-refugee immigrants is also quite high for Rumania. I therefore limit the refugee-sending countries to the eight largest refugee-sending countries. They are considered as refugee-sending countries in the following periods: Former Yugoslavia, 1994–1997; Palestine, 1985–1997; Iran, 1985–1997; Iraq, 1985–1997; Somalia, 1989–1997; Sri Lanka, 1985–1997; Vietnam, 1985–1997; and Afghanistan, 1985–1997. I calculate the rate of contamination of the overall refugee sample as the number of non-refugee immigrants relative to the total number of immigrants from these countries in the specified periods. The rate of contamination is 15.9%. Note also that the extracted refugee sample is a fairly representative sample of refugees in Denmark since residence permits granted to refugees from one of the eight largest refugee-sending countries constitute 89.4% of the total number of residence permits granted to refugees between 1985 and 1997.
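The contamination measure described above is a simple ratio, computed per country and then pooled over the selected countries. A minimal Python sketch of that arithmetic follows. The function names and the input layout are assumptions made for illustration, and the counts in the usage note are made-up placeholders, not figures from Table 6.

```python
def contamination_rate(non_refugee_immigrants, total_immigrants):
    """Share of non-refugee immigrants among all immigrants from a country."""
    if total_immigrants <= 0:
        raise ValueError("total_immigrants must be positive")
    return non_refugee_immigrants / total_immigrants


def pooled_contamination_rate(counts):
    """Pooled rate over several countries.

    counts maps a country label to a (non_refugee_immigrants, total_immigrants)
    pair for the period in which the country is treated as refugee-sending.
    """
    non_refugees = sum(non_refugee for non_refugee, _ in counts.values())
    total = sum(all_immigrants for _, all_immigrants in counts.values())
    return contamination_rate(non_refugees, total)
```

For example, with placeholder counts `{"A": (10, 200), "B": (90, 300)}` the country-level rates are 0.05 and 0.3, and the pooled rate weights countries by their total number of immigrants rather than averaging the two rates.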

The ethnic composition of the extracted sample by year of immigration is shown in Table 7 in the Appendix. Only individuals aged 18–66 are included in the sample. The age criterion explains why the number of individuals sampled in each year is only around 50% of the actual number of residence permits granted to refugees from the selected countries. Furthermore, since family-reunified immigrants from refugee-sending countries were only subject to spatial dispersal if they immigrated shortly after their spouse, I exclude immigrants from refugee-sending countries who at the time of immigration were married to one of the following: (1) an individual born in Denmark, (2) an immigrant from a non-refugee-sending country or (3) an immigrant from a refugee-sending country who had immigrated at least 1 year earlier. I exclude individuals who were observed in the registers neither in the year of immigration nor in the following year because in that case, information on the initial municipality of residence is missing. Unfortunately, the registers do not allow me to exclude the 10% of refugees who turned down the Council’s offer of housing under the terms of the spatial dispersal policy.

Another weakness of the data is the lack of information on the municipality of assignment. Solving this issue is complicated by the fact that refugees may initially have lived in temporary housing in proximity of the municipality to which they were later assigned, on average after 6–7 months and in general after 3 months. For this reason, I include refugees who immigrated in the last quarter of 1985 in the refugee sample. I identify the municipality of assignment by using the following algorithm which I constructed based on information on the Council’s internal administrative statistics on temporary housing. The first municipality of residence observed in the registers is defined as a municipality of temporary housing if the person relocates to another municipality within the county within 1 year after receipt of residence permit. Otherwise, the first municipality is defined as the municipality of assignment.
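The identification rule above can be sketched as a short Python function. This is an illustration under stated assumptions, not the code used to build the data set: the `ResidenceSpell` record and its fields are hypothetical, "within 1 year" is read as at most 12 months after receipt of the residence permit, and the municipality moved to after temporary housing is taken to be the municipality of assignment.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ResidenceSpell:
    municipality: str
    county: str
    start_month: int  # months elapsed since receipt of the residence permit


def municipality_of_assignment(spells: List[ResidenceSpell]) -> str:
    """Return the inferred municipality of assignment.

    spells must be in chronological order, starting with the first
    municipality of residence observed in the registers.
    """
    first = spells[0]
    if len(spells) > 1:
        second = spells[1]
        moved_within_county = (
            second.county == first.county
            and second.municipality != first.municipality
        )
        moved_within_year = second.start_month <= 12  # assumed reading of "1 year"
        if moved_within_county and moved_within_year:
            # First municipality is treated as temporary housing; the next
            # one is assumed to be the municipality of assignment.
            return second.municipality
    return first.municipality
```

A move to a different county, or a move more than a year after the permit, leaves the first observed municipality as the municipality of assignment under this reading.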

All personal attributes that are included in the set of explanatory variables are defined in Table 8, and their first two moments are shown in Table 9 in the Appendix.

4.2 Descriptive evidence

Denmark is administered at three levels: the state, the county and the municipal level. Denmark has 275 municipalities; 273 of the municipalities constitute 14 counties. Copenhagen and Frederiksberg municipalities are excluded from the county division. The largest metropolitan area, the Greater Copenhagen area, is constituted by Copenhagen and Frederiksberg municipalities and Copenhagen County. The four most populated municipalities are Copenhagen, Aarhus, Odense and Aalborg, which are also the four largest cities.

Table 1 shows the initial geographical distribution across counties of individuals in the refugee sample. Henceforth, I refer to individuals in the refugee sample as post-reform refugees. For reasons of comparison, Table 1 also presents the initial distribution across counties of refugees aged 18–66 from the same 8 refugee-sending countries who immigrated up to 3 years before the implementation of the dispersal policy. Henceforth, I refer to this group of individuals as pre-reform refugees. Finally, Table 1 presents the distribution of the overall Danish population and the immigrant population (immigrants and their descendants) across counties in 1985, i.e. the year before the implementation of the first dispersal policy on refugees in Denmark. The table shows that whereas immigrants and the 1983 and 1984 cohorts of pre-reform refugees are highly over-represented in the Greater Copenhagen area, there is a close correspondence between each county’s share of post-reform refugees and the population share of the county. This confirms that the dispersal policy was successful in distributing new refugees equally across counties.Footnote 6

Table 1 Geographical distribution across counties (percent)

Did the introduction of the spatial dispersal policy on refugees in 1986 also affect the initial geographical distribution of refugees across municipalities? To answer this question, I compare the initial settlement pattern across municipalities of pre-reform refugees to that of post-reform refugees. Under an equal distribution of pre-reform refugees across municipalities, there would be 8 pre-reform refugees per 10,000 inhabitants in each municipality. However, Fig. 2 presents evidence that pre-reform refugees were initially far from equally distributed across the 271 municipalities (Footnote 7). Pre-reform refugees were initially over-represented in 51 municipalities, including the four largest municipalities in Denmark and the Greater Copenhagen area. Pre-reform refugees were initially absent in 50% of the municipalities.

Fig. 2 The initial settlement of pre-reform refugees across municipalities

Under an equal distribution of post-reform refugees across municipalities, there would be 70 post-reform refugees per 10,000 inhabitants in each municipality. Figure 3 presents evidence that post-reform refugees were over-represented in 91 municipalities, but unlike pre-reform refugees, post-reform refugees were not over-represented in the largest metropolitan area of Copenhagen. Furthermore, post-reform refugees were assigned to all but four municipalities. Comparison of Figs. 2 and 3 suggests that the spatial dispersal policy on refugees considerably increased the dispersion of refugees across municipalities and the settlement of refugees outside the larger municipalities.

Fig. 3 The initial settlement of post-reform refugees across municipalities
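The equal-distribution benchmark used in the comparison of Figs. 2 and 3 amounts to a simple rate calculation. A minimal sketch follows, assuming that a municipality counts as over-represented when its local rate exceeds the national rate; the function names and the counts in the example are hypothetical and are not taken from the registers.

```python
def refugees_per_10k(refugees: float, inhabitants: float) -> float:
    """Refugees per 10,000 inhabitants."""
    return 10_000 * refugees / inhabitants


def over_represented(local_refugees: float, local_inhabitants: float,
                     total_refugees: float, total_inhabitants: float) -> bool:
    """True if the local rate exceeds the equal-distribution benchmark."""
    benchmark = refugees_per_10k(total_refugees, total_inhabitants)
    return refugees_per_10k(local_refugees, local_inhabitants) > benchmark


# Hypothetical counts chosen only so that the benchmark equals 8 per 10,000.
benchmark = refugees_per_10k(4_000, 5_000_000)
```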

To investigate how the introduction of the spatial dispersal policy on refugees in 1986 affected the initial settlement pattern of refugees across urban versus rural areas, I divide the municipalities into three categories according to the number of inhabitants. A small municipality is defined as having less than 10,000 inhabitants; a medium-sized municipality has 10,000–100,000 inhabitants; a large municipality is defined as having more than 100,000 inhabitants. According to this definition, Denmark has four large municipalities: Copenhagen, Aarhus, Odense and Aalborg. Large and medium-sized municipalities are predominantly urban areas whereas small municipalities cover urban areas as well as rural districts.
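The size classification can be written as a small function. This is a sketch only; the function name is hypothetical, and the assumption that a municipality with exactly 100,000 inhabitants counts as medium-sized follows from reading the stated ranges literally.

```python
def size_category(inhabitants: int) -> str:
    """Classify a municipality by number of inhabitants."""
    if inhabitants < 10_000:
        return "small"
    if inhabitants <= 100_000:  # assumption: the upper bound is inclusive
        return "medium"
    return "large"
```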

The initial geographical distribution of pre- and post-reform refugees across municipality size categories is reported in Table 2. For comparison, the geographical distribution of the total population in Denmark and the immigrant population in 1985 across the three municipality size categories is also reported. Post-reform refugees were initially slightly over-represented in the large municipalities and slightly under-represented in small municipalities compared to the overall distribution of the Danish population. Furthermore, post-reform refugees were over-represented in the larger municipalities and under-represented in the smaller municipalities to a far lesser extent than the total immigrant population and the 1983 and 1984 cohorts of pre-reform refugees.

Table 2 Geographical distribution across municipality size categories (percent)

Turning to the extent to which post-reform refugees subsequently migrated, descriptive statistics on the spell of residence in the municipality of assignment are reported in Table 3. By 2000, 39% of the individuals have moved out of the municipality of assignment. On average, movers make the first migration investment 28 months after settlement in the municipality of placement. As one would expect, the share of movers is negatively correlated with the year of immigration: 62% of the 1986 cohort of refugees are movers compared to only 29% of the 1997 cohort of refugees.

Table 3 Descriptive statistics on residential spells

The Kaplan–Meier empirical hazard function for relocation out of the municipality of assignment is plotted in Fig. 4. The hazard function peaks 13 months after assignment. The Kaplan–Meier empirical survivor function for residence in the municipality of assignment is plotted in Fig. 5. The figure shows that 15 years after initial settlement 48% of individuals in the sample still live in the assigned municipality.

Fig. 4 Kaplan–Meier empirical hazard function for relocation out of the municipality of assignment

Fig. 5 Kaplan–Meier empirical survivor function for residence in the municipality of assignment
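For readers unfamiliar with the estimator behind Figs. 4 and 5, the following is a minimal sketch of the standard Kaplan–Meier construction in plain Python. It is not the code used to produce the figures; the function name and input layout are assumptions. Each spell has a duration (for example in months) and a flag that is true if the spell ended with a move out of the municipality of assignment and false if it was censored.

```python
from typing import List, Tuple


def kaplan_meier(durations: List[float],
                 moved: List[bool]) -> List[Tuple[float, float, float]]:
    """Return (time, empirical hazard, survivor) at each observed move time."""
    event_times = sorted({t for t, m in zip(durations, moved) if m})
    survivor = 1.0
    result = []
    for t in event_times:
        # Spells still in progress just before t (censored spells included).
        at_risk = sum(1 for d in durations if d >= t)
        # Spells that end with a move exactly at t.
        moves = sum(1 for d, m in zip(durations, moved) if m and d == t)
        hazard = moves / at_risk
        survivor *= 1.0 - hazard
        result.append((t, hazard, survivor))
    return result
```

The hazard at each time is the share of spells still at risk that end with a move at that time, and the survivor function is the running product of one minus the hazard.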

Table 4 presents evidence that the spatial dispersal policy on refugees was successful in augmenting spatial dispersion of refugees relative to other immigrants in the medium run. In 2000, post-reform refugees were to a much lesser extent over-represented in Copenhagen and Frederiksberg municipalities than the overall immigrant population and pre-reform refugees and in fact under-represented in Copenhagen County, which in 2000 had the second highest share of immigrants. In contrast to the overall immigrant population, post-reform refugees were instead slightly over-represented in the counties in which the second and third largest cities in Denmark are situated, Aarhus County (Aarhus) and Funen County (Odense) and only slightly under-represented in the remaining counties.

Table 4 Geographic distribution across counties in 2000 (percent)

4.3 Area of origin data

Regional attributes of the municipality of assignment that I believe may affect placed refugees’ secondary migration propensity fall into three categories: (1) demographic attributes, (2) labour market attributes and (3) housing market attributes.

Four variables describing the demographic attributes of the municipality of assignment are included in the set of explanatory variables. The first variable is the percentage of co-nationals in the host country who live in the municipality of assignment. It is included because some theories predict that recent immigrants are attracted to locations in which earlier cohorts of co-ethnics have settled. According to the ethnic network hypothesis, the presence of co-ethnics constitutes an ethnic network that facilitates new immigrants’ adjustment to the new society by strengthening feelings of security, solidarity and identity within the group due to the common cultural background and due to development of local ethnic labour markets and establishment of social institutions that support its members in relation to the rest of the society and convey information about employment opportunities outside the residential area (Piore 1979; Kobrin and Speare 1983). According to the ethnic goods theory proposed by Chiswick and Miller (2005), living in an ethnic enclave reduces costs of consumption of so-called ethnic goods. Such goods are defined as the consumption characteristics of an ethnic group not shared with the host population, broadly defined to include market and non-market goods and services, including social interactions for themselves and their children with people of the same origin. If these theories are correct, one would expect the hazard rate of relocation out of the municipality of assignment to decrease with the percentage of co-nationals in the host country who live in the municipality.

The second demographic variable in the set of explanatory variables is the percentage of immigrants in the host country who live in the municipality of assignment. It is included to test whether new refugees prefer foreign-born neighbours, possibly for reasons of solidarity. If so, the hazard rate of relocation is likely to decrease with the relative size of the immigrant enclave in the municipality of assignment.

The third demographic variable is the logarithmic value of number of inhabitants in the municipality of assignment, which is included to test whether refugees prefer to live in a large city. They may do so for a variety of reasons, including more job opportunities and general economic activity, easy access to airports that facilitate contact with old networks abroad, access to a large variety of goods and services in general and urban populations being more accustomed to interactions with foreigners. If so, current residence in a large city will decrease the hazard rate of relocation.

The final demographic variable is an indicator variable for initial residence in the largest metropolitan area, the Greater Copenhagen area. New refugees may prefer to live in the capital area due to capital-specific local amenities. If so, initial residence in Copenhagen will decrease the hazard rate of relocation.

Migration theory predicts that regional differences in economic conditions such as regional unemployment, social benefit levels or eligibility rules or public goods provision may determine immigrants’ location choices (e.g. Sjaastad 1962; Bowles 1970). To test whether that is the case, three labour market variables are included in the set of explanatory variables. The regional unemployment rate is included to test whether unfavourable employment prospects are an important push factor in placed refugees’ migration decision. The percentage of right-wing votes at the latest local election is included to test whether placed refugees react to local variation in the extent to which participation in active labour market programmes is required for social benefits receipt. A direct test of the hypothesis is impossible due to lack of municipality data on the use of active labour market programmes before 1995. However, the hypothesis can be tested indirectly since right-wing dominated municipalities are likely to be more prone to enrol social benefits recipients in active labour market programmes than left-wing dominated municipalities. The hazard rate of relocation out of the municipality of assignment is likely to increase with the percentage of right-wing votes at the latest local election, because some individuals may prefer to leave the municipality of assignment to avoid active labour market training. In contrast, since social assistance rules, including entitlement rules, are the same across Danish municipalities, the set of explanatory variables does not include any local welfare generosity variable. The final labour market variable in the set of explanatory variables is the number of institutions for qualifying education in the municipality of assignment. It is included to test whether education opportunities affect recent immigrants’, especially refugees’, utility levels. They may do so for three reasons: first, a lack of education from the source country; second, a lack of approval of foreign education in the host country; and third, a need to upgrade skills to become employable in the host country labour market, for instance due to a high minimum wage and a mismatch between low-skilled job demand and supply in the host country. As a consequence, the hazard rate of relocation out of the municipality of assignment is expected to decrease with the number of institutions for qualifying education in the municipality of assignment.

Finally, it is important to include variables measuring local housing market conditions because relocations out of the municipality of residence may include short-distance relocations, which tend to be carried out for housing consumption adjustment reasons. In Denmark, new immigrants tend to live in rental housing, especially in social housing. In fact, according to Danish law, immigrants are not allowed to buy property during the first 5 years of stay. I therefore include the number of rental units in percent of the total local housing stock and the number of social housing units in percent of the total housing stock in the set of explanatory variables. I expect the local residence offer arrival rate to increase with the number of rental units and number of social housing units in percent of the total local housing stock, since adjustment of housing consumption can take place more easily within the municipality of assignment if the local shares of rental and social housing units are high.

Note that all area-of-origin variables exploit the countrywide variation between Danish municipalities.

All area-of-origin variables are defined in Table 8, and their first two moments are shown in Table 9 in the Appendix.

5 Results

5.1 Baseline results

I estimate five different Mixed Proportional Hazard models for post-reform refugees. The parameter estimates of regional attributes of the municipality of assignment are reported in columns 1–5 of Table 5 (Footnote 8).

Table 5 Mixed Proportional Hazard model coefficient estimates in the baseline models

According to the first model, the hazard rate of relocation out of the municipality of assignment decreases with the logarithmic value of number of inhabitants in the municipality of assignment and the percentage of co-nationals in the host country living in the municipality of assignment. In contrast, the hazard rate of relocation out of the municipality of assignment unexpectedly increases with the percentage of immigrants in the host country living in the municipality of assignment. Finally, the effect of the regional unemployment rate on the hazard rate of relocation is insignificant.

In the second model, the indicator variable for initial residence in the Greater Copenhagen area is included in the set of explanatory variables. The coefficient estimates of the four regional attributes, which were also included in the first model, are insensitive to the inclusion. The hazard rate of relocation out of the municipality of assignment increases with initial residence in Greater Copenhagen, contradicting the hypothesis that recent immigrants have higher utility levels in Greater Copenhagen than elsewhere due to capital-specific amenities.

In the third model, housing market attributes are included as explanatory variables. The inclusion decreases the coefficient estimate and t-statistic of the logarithmic value of number of inhabitants in the municipality of assignment. The interpretation is that refugees prefer living in large cities in part because it facilitates access to rental, including social, housing. In line with my prior beliefs, the hazard rate of relocation decreases both with the number of social housing units and the number of rental units in percent of the total local housing stock.

The percentage of right-wing votes at the latest local election is included as an additional explanatory variable in the fourth model. The hazard rate of relocation increases with the percentage of right-wing votes at the latest local election. This result supports the prior belief that refugees’ utility levels are decreasing in the use of active labour market programme participation rather than passive income support for unemployed individuals. With one exception, the coefficient estimates of the regional attributes included in the third model are insensitive to the variable inclusion; the coefficient estimate of the regional unemployment rate changes sign, but remains insignificant. This indicates a negative correlation between the regional unemployment rate and the percentage of right-wing votes at the latest local election.

In the final model, the number of institutions for qualifying education in the municipality of assignment is included as an additional explanatory variable. As expected, the hazard rate of relocation decreases with the number of institutions for qualifying education in the municipality of assignment. An additional institution for qualifying education decreases the hazard rate of relocation by 4.9% \( \left[ \left( \exp\left( -5.02/100 \right) - 1 \right) \cdot 100 \right] \). Inclusion of the variable causes significant changes in the coefficient estimates of two of the other location characteristic variables in the model: the logarithmic value of number of inhabitants in the municipality of assignment and the regional unemployment rate. The coefficient estimate of the logarithmic value of number of inhabitants in the municipality of assignment drops substantially and becomes insignificant. The interpretation is that refugees’ utility levels increase with local population size partly because access to educational institutions increases with local population size. The coefficient estimate of the regional unemployment rate increases and becomes significant. In other words, the hazard rate of relocation increases with the regional unemployment rate as expected. The coefficient increase implies that the regional unemployment rate is positively correlated with the number of institutions for qualifying education in the municipality of assignment, but the two factors affect refugees’ utility levels in opposite ways. A percentage point increase in the regional unemployment rate increases the hazard rate by 2.3% \( \left[ \left( \exp\left( 0.228/10 \right) - 1 \right) \cdot 100 \right] \) ceteris paribus.

The remaining coefficients of regional attributes are largely unaffected by the inclusion of the number of institutions for qualifying education in the municipality of assignment. Their marginal effects are as follows. A percentage point increase in the percentage of co-nationals in the host country who live in the municipality of assignment decreases the hazard rate by 4% ceteris paribus. A percentage point increase in the percentage of immigrants in the host country who live in the municipality of assignment increases the hazard rate by 8.9% ceteris paribus. Initial residence in the Greater Copenhagen area increases the hazard rate by 49%. A percentage point increase in the number of social housing units in percent of the total local housing stock decreases the hazard rate of relocation by 1.5%. A corresponding increase in the number of rental units in percent of the total local housing stock decreases the hazard rate of relocation by 1.8%. A percentage point increase in the percentage of right-wing votes increases the hazard rate of relocation by 0.7%.
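In a proportional hazard specification, a one-unit increase in a covariate multiplies the hazard rate by the exponential of its coefficient, so the percentage effects quoted above follow from a one-line conversion. The sketch below reproduces the two calculations for which the text gives the underlying numbers. The function name is hypothetical, and the divisors 100 and 10 are copied from the bracketed formulas in the text; that they reflect the scaling of the coefficients reported in Table 5 is an assumption.

```python
import math


def pct_change_in_hazard(coefficient: float, scale: float = 1.0) -> float:
    """Percentage change in the hazard rate for a one-unit covariate increase."""
    return (math.exp(coefficient / scale) - 1.0) * 100.0


# One additional institution for qualifying education (coefficient and divisor as quoted).
education_effect = round(pct_change_in_hazard(-5.02, scale=100), 1)

# One percentage point higher regional unemployment (coefficient and divisor as quoted).
unemployment_effect = round(pct_change_in_hazard(0.228, scale=10), 1)

print(education_effect, unemployment_effect)  # -4.9 2.3
```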

Note that the marginal effect of the percentage of co-nationals in the host country living in the municipality of assignment is robust across model specifications. The marginal effect of the percentage of immigrants in the host country living in the municipality of assignment is fairly robust as well. Note also that the logarithmic value of number of inhabitants in the municipality of assignment has an insignificant effect on the hazard rate of relocation because its effect is captured by the effect of housing and labour market attributes on the hazard rate of relocation. To summarize, refugees prefer living in large cities in part because it facilitates access to housing and educational institutions.

The estimated hazard function of the final model is plotted in Fig. 6 in the Appendix, for an individual with mean observable and unobservable characteristics. The corresponding estimated survivor function is plotted in Fig. 7 in the Appendix.

The fact that post-reform refugees tend to live in temporary housing for at least 3 months before settlement in the municipality of assignment led me to include refugees who got asylum in the last quarter of 1985 in the sample of post-reform refugees. The coefficient estimates are virtually unchanged when excluding individuals who got asylum in the last quarter of 1985.

Due to the lack of exact information on the municipality of assignment, I check whether the results are robust to an alternative definition of the municipality of assignment. If I define the municipality of assignment simply as the individual’s first municipality of residence in Denmark, three of the coefficient estimates change: a relatively small population size becomes an important push factor, while the effects of regional unemployment and access to institutions for qualifying education turn insignificant. The latter results are counterintuitive, which indicates that one should put more weight on the results reported in Table 5.

Estimation results unexpectedly show that the hazard rate of relocation out of the municipality of assignment increases with the share of the immigrant population who lives in the municipality of assignment and is higher for refugees who were assigned to the capital area. Since a substantial share of immigrants live in the capital area, the first unexpected result could in fact be driven by the latter effect. To investigate this, I re-estimate the last model, excluding post-reform refugees who were assigned to Copenhagen city. The parameter estimates for local area attributes are reported in column 6 of Table 5. The only coefficient estimate that is sensitive to the exclusion of refugees who were assigned to Copenhagen city is the estimate of the percentage of immigrants living in the municipality of assignment, which turns significantly negative. This indicates that the result of a positive effect of the share of the immigrant population who lives in the municipality of assignment reported in Table 5 is in fact driven by the relatively high out-migration rate from Copenhagen city.

Turning to the second unexpected result that the hazard rate of relocation out of the municipality of assignment is higher for refugees who were assigned to the Greater Copenhagen area, one explanation could be that post-reform refugees who were assigned to the Greater Copenhagen area were assigned to relatively unattractive neighbourhoods or housing. An alternative explanation is that the out-migration rate of refugees has always been relatively high for Copenhagen due to a tight local housing market, which hampers adjustment of housing consumption by means of an intra-municipality move for refugees in the Greater Copenhagen area. The push factor results for pre-reform refugees reported in column 7 of Table 5 show that the out-migration rate from a municipality in the Greater Copenhagen area was also relatively high before the implementation of the dispersal policy. This lends support to the hypothesis that the relatively high out-migration rate from the Greater Copenhagen area was caused by a tight local housing market. In addition, the hypothesis is supported by the descriptive evidence that 70% of placed refugees in the Greater Copenhagen area who relocate out of a county in the Greater Copenhagen area move to another county in the Greater Copenhagen area.

How do the push factor results for post-reform refugees compare with previous results in the literature? The results that lack of co-nationals and residence outside large cities are important push factors in secondary migration of placed refugees are in line with pull factor results of US and Swedish studies, which have found recent immigrants to be attracted to large cities and to locations in which earlier cohorts of co-nationals have settled (Bartel 1989; Jaeger 2000; Åslund 2005). The first result is also in accordance with Zavodny (1999). The latter result is also in accordance with Bauer et al. (2002, 2005). The result that lack of immigrants is a push factor is in line with the results by Zavodny (1999), Jaeger (2000) and Åslund (2005) that recent immigrants are attracted to locations in which earlier cohorts of immigrants have settled. The result that high regional unemployment is a push factor in secondary migration of placed refugees is in line with the result by Jaeger (2000) that favourable local labour market conditions are a pull factor in the initial location decision for all US immigrants except spouses of US citizens, and contrasts with the result by Bartel (1989) that recent immigrants are insensitive to local labour market conditions. Finally, the result that a relatively high share of right-wing votes is a push factor in secondary migration of placed refugees, which I interpret as evidence of welfare seeking, is in line with the welfare magnet result by Borjas (1999) and in contrast to the result of no welfare seeking among immigrants in Zavodny (1997, 1999).

Suppose I had estimated push factors in recent immigrants’ location decision without access to data for refugees who had been assigned to the initial location in the host country by the authorities. Would the results have been different? To investigate this, I have estimated the final model for pre-reform refugees. The coefficient estimates are reported in column 7 of Table 5. They differ from the estimates for post-reform refugees in a number of ways. The main differences are that an increase in the regional unemployment rate is counter-intuitively associated with a decrease in the hazard rate of relocation out of the initial municipality of residence and that an increase in the percentage of rental housing is counter-intuitively associated with an increase in the hazard rate of relocation out of the initial municipality of residence. Furthermore, a relatively small local population size becomes the most important push factor in terms of statistical significance. Finally, changes in the percentage of co-nationals and immigrants, the number of institutions for qualifying education and the percentage of right-wing votes at the latest local election are uncorrelated with changes in the hazard rate of relocation out of the initial municipality of residence.

The differences between the estimates for pre- and post-reform refugees emphasise the importance of using micro-data with exogenous variation in the location of residence to estimate push factors, notably economic push factors, in individuals’ migration decision. In contrast to post-reform refugees, pre-reform refugees are likely to have selected into initial locations based on unobserved personal attributes. Therefore, the estimates for pre-reform refugees cannot be given a causal interpretation; they are merely associations or statistical correlations rather than causal effects.

6 Conclusion

This paper provides quasi-experimental evidence on how regional attributes affect recent immigrants’ location choices. Push factors in recent immigrants’ secondary migration decisions are estimated for refugees who were subject to the Danish spatial dispersal policy on refugees carried out from 1986 to 1998. During this period around 90% of new refugees were randomly assigned to the initial location by the authorities, conditional on six personal attributes.

The results shed light on the question asked in the literature whether recent immigrants prefer living where co-ethnics as well as immigrants from other countries of origin settled earlier. The results presented show that refugees prefer living in a location in which relatively large shares of co-nationals and immigrants (irrespective of origin) in the host country have settled earlier. In addition, the results provide evidence that recent immigrants are attracted to large cities because they facilitate access to rental, including social, housing and institutions for qualifying education. Furthermore, the results provide evidence on the controversy in the literature on whether recent immigrants’ location choices are affected by economic factors, in particular employment prospects and welfare generosity. Placed refugees do indeed react to relatively high regional unemployment by internal migration. However, placed refugees also react to settlement in a right-wing dominated location by moving to another location. This could be due to a wider use of active labour market programme participation as a requirement for social benefits receipt instead of passive income support in right-wing dominated municipalities. If so, the result could be interpreted as evidence of welfare seeking.

I demonstrate that some of the results are due to access to quasi-experimental data since some of the push factor results differ significantly from push factor results for refugees who chose their initial location in the host country. Most importantly, a relatively low regional unemployment rate and a relatively high share of rental housing are counter-intuitively push factors in the secondary migration decision of refugees who chose their initial location in the host country. These differences provide empirical evidence that lack of use of quasi-experimental data yields biased estimates due to location sorting.

Furthermore, it is possible to determine whether some of the results presented in the study are specific to Denmark by comparing the results to the only previous study that provides quasi-experimental evidence on push factors in placed refugees’ secondary migration (Åslund 2005). A comparison with the Swedish results shows that coefficient estimates for regional attributes that are included as explanatory variables in both studies differ, with one exception, mainly in terms of their statistical significance. Specifically, regional unemployment and lack of co-nationals in the municipality of assignment are found to be important push factors in both studies. In addition, a relatively small population size is found to be an important push factor in the Swedish case while it is an insignificant push factor in the Danish case. The final potential push factor investigated in both studies is the local presence of immigrants, which has a positive but insignificant effect on refugees’ secondary migration decision in the Swedish case and a significant, negative effect in the current study.

To the extent that the set of results presented in this paper holds for all admission categories of immigrants, policy makers should expect new immigrants to settle in large cities in which earlier cohorts of co-ethnics and immigrants have settled and in regions with relatively low unemployment. From a labour market assimilation point of view, recent immigrants’ preference for living with co-ethnics should not necessarily concern policy makers. The high-quality empirical investigation on the effect of living in an ethnic enclave on labour market assimilation of immigrants by Edin et al. (2003) exploits the Swedish spatial dispersal policy on refugees to take location sorting into account. The results of the study show that residence in an ethnic enclave increases earnings of refugees 8 years after immigration. The results of the study could well generalize to the labour market assimilation experience of immigrants in other countries.