1 Introduction

International migration patterns vary considerably over time, and across destination and origin countries. Some OECD countries have experienced a decrease in the size of the annual immigrant inflow between 1980 and 1995. Over the same years, the number of immigrants per year has increased in several other OECD countries.Footnote 1 The percentage change of the annual immigrant inflow from 1980 to 1995 ranges between negative 42% (in Japan) and positive 48% (in Canada). For all destinations, such changes are anything but monotonic (OECD, 1997). The variation in terms of origin countries is remarkable as well (OECD, 1997).

Several factors are likely to influence the size, origin, and destination of labor movements at each point in time and contribute to the variation observed in the data. However, very few empirical works in the literature have tried to understand what drives international migration, perhaps due to past unavailability of cross-country data.

In turn, international migration has recently received a great deal of attention in light of research showing its beneficial effects from an economic-development point of view. For example, the recent literature has pointed out repeatedly the potential of free migration to produce large benefits—most likely greater than the gains from liberalizing existing trade barriers (Rodrik, 2002). To fully understand these and other effects, it is important to identify the forces and constraints that shape international migration movements.

In this paper, I empirically investigate the determinants—economic, geographic, cultural, and demographic—of bilateral immigration flows. My analysis is based on the predictions of a simple theoretical framework that focuses on both supply and demand factors. I use yearly data on immigrant inflows into fourteen OECD countries by country of origin, between 1980 and 1995. The source of this data is the International Migration Statistics for OECD countries (OECD, 1997), based on the OECD’s Continuous Reporting System on Migration (SOPEMI).

My paper is related to a vast literature on the determinants of migration. Clark et al. (2007) and Karemera et al. (2000) both focus on the fundamentals explaining immigrant inflows into the United States by country of origin in the last decades. Other papers in the literature that analyze the determinants of migration to the U.S. are Borjas (1987) and Borjas and Bratsberg (1996). Hatton (2005) investigates trends in UK net migration in the last decades. Finally, Helliwell (1998) sheds light on factors affecting labor movements in his investigation of the magnitude of immigration border effects, using data on Canadian interprovincial, US interstate, and US-Canada cross-border immigration.

This paper makes three contributions to the literature. First, my analysis puts greater emphasis than previous works on the demand side of international migration, namely destination countries’ migration policies. This change of perspective is important, given restrictive immigration policies in the vast majority of host countries. Second, my work is the first one I am aware of to use the OECD (1997) data on international migration to systematically investigate the drivers of international flows of migrants. Previous works have either used country cross-sections (Borjas, 1987; Yang, 1995), or have focused on a single destination country over time (Borjas & Bratsberg, 1996; Brücker et al., 2003; Clark et al., 2007; Karemera et al., 2000) or a single origin country over time (Yang, 2003). By extending the focus of the analysis to a multitude of origin and destination countries and taking advantage of both the time-series and cross-country variation in the data, I can test the robustness and broader validity of the results found in earlier works.Footnote 2 Third, this paper carefully reviews and proposes solutions to various econometric issues that arise in the estimation, such as endogeneity and reverse causality. These econometric complications have not all been addressed in the previous literature.Footnote 3 Once I deal with them (e.g., by controlling for destination and origin countries’ fixed effects and for year effects), my analysis both delivers estimates broadly consistent with the predictions of the international migration model and generates empirical puzzles.

According to the international migration model, pull and push factors have either similar-sized effects (with opposite signs), when migration quotas are not binding, or they both have no (or a small) effect on emigration rates, when migration quotas are binding. It is not clear, ex ante, which one of the two scenarios characterizes actual flows. Migration policies in the majority of destination countries are very restrictive, which should imply binding constraints on the number of migrants. On the other hand, even countries with binding official immigration quotas often accept unwanted (legal) immigration.Footnote 4 Restrictive immigration policies are often characterized by loopholes that leave room for potential migrants to take advantage of economic incentives. For example, immigration to Western European countries still took place after the late 1970s, despite the official closed-door policy. Family-reunification and asylum-seeker policies can explain continuing migration inflows to Western Europe (Joppke, 1998).

My empirical results are puzzling because they are in part consistent with the first scenario and in part with the second one. I find that pull factors, proxied by the per worker GDP in the destination country, significantly increase the size of emigration rates. This result is robust to changes in the specification of the empirical model. Both absolute and relative pull factors matter. That is, the emigration rate to a given destination is an increasing function of that country’s per worker GDP and a decreasing function of the average per worker GDP of all the other host countries in the sampleFootnote 5 (each weighted by the inverse of distance from the origin country). On the other hand, the impact of push factors, proxied by the per worker GDP in the origin country, is seldom negative, as theory suggests it would be with non-binding migration quotas, and, when it is, the size of the effect is smaller than for pull factors and insignificant. Therefore my analysis finds evidence of an asymmetric impact of pull and push factors on emigration rates.Footnote 6

The asymmetry is a familiar puzzle. For example, it has been documented in several works in the literature on internal migration (see, e.g., Hunt (2006) and the papers referenced in its footnote 4). Based on the existing literature, there might be numerous reasons for the asymmetry and possibly different ones operating across borders versus within country borders. At the national level, where migration quotas do not exist, Hunt (2006) provides an explanation of the asymmetry by breaking down data by age group: Origin region’s unemployment rates (push factor) have an insignificant impact on migration flows because the insignificant effect for the young—who are not as sensitive to their own layoffs as the old—dominates the significant positive effect for the old. This explanation cannot be investigated at the international level because of data unavailability.

Another interpretation in the literature of the asymmetry is that migration quotas are effectively not binding but the impact of income opportunities in the origin country is affected by poverty constraints, due to fixed costs of migration and credit-market imperfections (Lopez & Schiff, 1998; Yang, 2003). Since lower levels of per worker GDP in the source country both strengthen incentives to leave and make it more difficult to overcome poverty constraints, the net effect might be close to zero. In the empirical analysis I investigate this possibility and I find very weak evidence that my result on push factors is driven by poverty constraints in the origin country.

Yet an alternative explanation of my findings is that the asymmetric effect I estimate for pull and push factors is explained by the demand side of international migration—namely, migration policies—and not by the supply side as is often assumed in the previous literature. Changes in mean income opportunities in the destination country not only affect migrants’ incentive to move there but also impact the political process behind the formation of migration policies. For example, in periods of economic booms, policymakers are better able to overcome political opposition to and accommodate increasing migration inflows.Footnote 7 If migration quotas are binding, the latter political-economy channel will be at work while the determinants on the supply side will have no (or a small) impact. This would explain the asymmetric effect I estimate for pull and push factors. While I do not investigate this interpretation directly,Footnote 8 I find evidence which is consistent with migration policy playing a constraining role. In the empirical analysis, I differentiate the effect of pull and push factors according to changes in destination countries’ migration policy. I find that the effect of pull factors becomes more positive and the impact of push factors turns negative in those years when a host country’s immigration laws become less restrictive. This is also true for the impact of other supply-side determinants such as geography and demographics (see below). In sum, my results suggest that migration quotas matter as they mitigate supply-side effects.Footnote 9

My empirical analysis also finds that inequality in the source and host economies is related to the size of emigration rates as predicted by Borjas’s (1987) selection model. An increase in the origin country’s relative inequality has a non-monotonic effect on the size of the emigration rate: The impact is estimated to be positive if there is positive selection, negative if there is negative selection. Among the variables affecting the costs of migration, distance between destination and origin countries appears to be the most important one: Its effect is negative, significant, and steady across specifications. On the other hand, there is no evidence that cultural variables related to each country pair play a significant role. Demographics, in particular the share of the origin country’s population that is young, shape bilateral flows as predicted by the theory. Since the effect of geography and demographics works through the supply side of the model, their impact should be even stronger when migration quotas are relaxed, which is what I find in the data.

Finally, I empirically investigate the importance of network effects. Since immigrants are likely to receive support from other immigrants from the same origin country already established in the host country, they will have an incentive to choose destinations with larger communities of fellow citizens. Network effects imply that bilateral migration flows are highly correlated over time, which is what the data shows. However, it is not clear how to interpret this result. While it is consistent with supply factors (i.e., network effects), it could also be driven by demand factors (family reunification policies, for example).

The rest of this paper is organized as follows. Section 2 presents a simple model of international migration. In Sect. 3 I describe the data sets used, while in Sect. 4 I discuss the estimating equations and some econometric issues that complicate the analysis. Finally, I present the main empirical results and additional results in, respectively, Sects. 5 and 6. Section 7 concludes.

2 Theoretical Framework

Both supply and demand factors affect international migration flows. Migrants’ decisions to move, according to economic and non-economic incentives, shape the supply side of labor movements. The host country’s immigration policy represents the demand side, namely the demand for immigrants in the destination country. The theoretical framework in this paper is closely related to the previous literature (Borjas, 1999; Clark et al., 2007), the main difference being the greater emphasis in my model on destination countries’ immigration policy. I consider two countries: country 0, which is the origin of immigrant flows, and country 1, which is the destination. I first focus on the supply side of immigration and look at the probability that an individual chosen randomly from the population of country 0 will migrate to country 1. In each country, wages are a function of the individual skill level (\( {s}_i \)). The wages that individual i receives in country 0 and would receive if he migrated to country 1 are respectively equal to \( {w}_{0i}={\alpha}_0+{\theta}_0{s}_i+{\epsilon}_{0i} \) and \( {w}_{1i}={\alpha}_1+{\theta}_1{s}_i+{\epsilon}_{1i} \), where the two disturbances have zero means over the origin country’s population. In light of the empirical analysis below, based on aggregate data, it is helpful to rewrite individual i’s wages in the two locations as a function of first and second moments of the income distributions (of the origin country’s population) at home and abroad respectively:

$$ {w}_{0i}={\mu}_0+{v}_{0i},\quad \textrm{where}\ {v}_{0i}\sim N\left(0,{\sigma}_0^2\right), $$
(1)
$$ {w}_{1i}={\mu}_1^0+{v}_{1i},\quad \textrm{where}\ {v}_{1i}\sim N\left(0,{\sigma}_1^2\right), $$
(2)

where the correlation coefficient between \( {v}_{0i} \) and \( {v}_{1i} \) equals \( {\rho}_{01} \), \( {\mu}_0 \) equals \( {\alpha}_0+{\theta}_0\cdot {\overline{s}}_0 \) and \( {\mu}_1^0 \) equals \( {\alpha}_1+{\theta}_1\cdot {\overline{s}}_0 \) (\( {\overline{s}}_0 \) is the mean skill level of the origin country’s population).

Notice that \( {\mu}_1^0 \), which is equal to the mean wage of the origin country’s population if it all migrated to country 1, is different from \( {\mu}_1={\alpha}_1+{\theta}_1\cdot {\overline{s}}_1 \), which is equal to the mean wage of the destination country’s population in country 1 (\( {\overline{s}}_1 \) represents the mean skill level of the destination country’s population). This point will be relevant in one of the robustness checks in the empirical analysis.

I assume that each individual has Cobb-Douglas preferences for the two goods produced in the world (\( {x}_A \) and \( {x}_B \)), which implies an indirect utility function from having an income y given by \( v\left({p}_A,{p}_B;y\right)=\overline{A}\left({p}_A,{p}_B\right)\cdot y \). I assume that each country is a small open economy characterized by free trade with the rest of the world: Footnote 10 therefore goods’ prices \( {p}_A \) and \( {p}_B \), as well as \( \overline{A}\left({p}_A,{p}_B\right) \), are given and equal across countries.Footnote 11 An individual in country 0 will migrate to country 1 if the utility of moving is greater than the utility of staying at home, that is, given the assumptions above, if the expected income in country 1 net of migration costs is greater than the expected income in country 0. Following the literature, I can define an index \( {I}_i \) that measures the net benefit of moving relative to staying at home for a risk-neutral individual i:

$$ {I}_i={\eta}_{01}\cdot {w}_{1i}-{C}_i-{w}_{0i}, $$
(3)

where \( {\eta}_{01} \) is the probability that the migrant from country 0 will be allowed to stay in country 1, and \( {C}_i={\mu}_C+{v}_i^C \), with \( {v}_i^C\sim N\left(0,{\sigma}_C^2\right) \), represents the level of individual migration costs.Footnote 12 The correlation coefficients between \( {v}_i^C \) and \( \left({v}_{0i},{v}_{1i}\right) \) are equal to \( \left({\rho}_{0C},{\rho}_{1C}\right) \). The implicit assumption in (3) is that, if the migrant moves to but is not allowed to stay in the destination country, he still incurs the migration costs \( {C}_i \) and gives up the home wage \( {w}_{0i} \). In other words, the individual migrates to the host country before knowing whether he will be able to stay (for a longer period of time) and gain the income \( {w}_{1i} \).Footnote 13 Immigrants may not be able to stay in the host country because of quotas due to a restrictive immigration policy.

The probability that an individual chosen randomly from the population of the origin country will migrate from country 0 to country 1 therefore equals:

$$ P=\Pr \left[{I}_i>0\right]=\Pr \left[{\eta}_{01}\cdot \left({\mu}_1^0+{v}_{1i}\right)-\left({\mu}_C+{v}_i^C\right)-\left({\mu}_0+{v}_{0i}\right)>0\right], $$
(4)

which can be rewritten as \( P=1-\Phi (z) \), where \( z=-\frac{\left({\eta}_{01}\cdot {\mu}_1^0-{\mu}_0-{\mu}_C\right)}{\sigma_v} \), \( {\sigma}_v \) is the standard deviation of \( \left({\eta}_{01}\cdot {v}_{1i}-{v}_{0i}-{v}_i^C\right) \), and \( \Phi \left(\cdot \right) \) is the cumulative distribution function of a standard normal. The probability in (4) is the supply emigration rate \( \frac{I_{01}^S}{P_0} \), where \( {I}_{01}^S \) represents the size of the migration flow as determined by the supply side of the model and \( {P}_0 \) the population in the origin country.
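As an illustration of how (4) maps the model's parameters into an emigration probability, the following Python sketch evaluates P = 1 − Φ(z). The function names and every numerical input are hypothetical and chosen only for illustration; none of them comes from the data used in this paper.

```python
import math


def std_normal_cdf(x):
    """Cumulative distribution function of a standard normal."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def sigma_v(eta, s0, s1, s_c, rho01, rho0c, rho1c):
    """Standard deviation of eta*v1 - v0 - vC, given the standard
    deviations (s0, s1, s_c) and the pairwise correlations."""
    var = (eta ** 2 * s1 ** 2 + s0 ** 2 + s_c ** 2
           - 2.0 * eta * rho01 * s0 * s1     # cov(eta*v1, v0)
           - 2.0 * eta * rho1c * s1 * s_c    # cov(eta*v1, vC)
           + 2.0 * rho0c * s0 * s_c)         # cov(v0, vC)
    return math.sqrt(var)


def supply_emigration_rate(mu1, mu0, mu_c, eta, sv):
    """P = 1 - Phi(z), with z = -(eta*mu1 - mu0 - mu_c) / sv."""
    z = -(eta * mu1 - mu0 - mu_c) / sv
    return 1.0 - std_normal_cdf(z)


# Hypothetical inputs: no migration-cost dispersion, eta = 1.
sv = sigma_v(eta=1.0, s0=1.0, s1=1.0, s_c=0.0,
             rho01=0.5, rho0c=0.0, rho1c=0.0)
p_zero_gain = supply_emigration_rate(2.0, 1.5, 0.5, 1.0, sv)
p_positive_gain = supply_emigration_rate(3.0, 1.5, 0.5, 1.0, sv)
```

When the mean net gain from moving is zero, half of the origin population has a positive index, so the supply emigration rate is one half; a larger mean wage abroad raises it.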

Next, I assume that the destination country’s immigration policy sets quantity constraints for immigrants coming from each origin country. Let \( {I}_{01}^D \) be the maximum number of migrants from country 0 allowed each year into country 1. These immigration quotas, which represent country 1’s demand for immigrants from country 0, may or may not be binding. Only in the latter case does the emigration rate we observe in the data (\( \frac{I_{01}}{P_0} \)) equal the supply emigration rate \( \frac{I_{01}^S}{P_0} \) defined above. On the other hand, if quantity constraints are binding, \( \frac{I_{01}}{P_0} \) will be less than \( \frac{I_{01}^S}{P_0} \). In general, the emigration rate we observe in the data is equal to the minimum of \( \frac{I_{01}^S}{P_0} \) and \( \frac{I_{01}^D}{P_0} \), and is represented in Fig. 1 by the heavy lines, as a function of \( {\mu}_1^0 \), \( {\mu}_0 \) and \( {\mu}_C \). The figure assumes that quotas \( {I}_{01}^D \) are exogenous, which means that they are affected neither by \( {\mu}_1^0 \) nor by \( {\mu}_0 \) nor by \( {\mu}_C \). This is a strong assumption that is questioned in the interpretation of the empirical results.

Fig. 1 The actual emigration rate as a function of mean income opportunities in the destination and origin country and of mean moving costs

I assume that the probability \( {\eta}_{01} \) that the migrant from country 0 will be allowed to stay in country 1 is equal to \( \min \left\{1,\frac{I_{01}^D}{I_{01}^S}\right\} \). It is then possible to derive testable predictions for the impact of \( {\mu}_1^0 \), \( {\mu}_0 \), and \( {\mu}_C \) on the emigration rate from country 0 to country 1:

$$ \frac{d\left(\frac{I_{01}}{P_0}\right)}{d{\mu}_1^0}=\left\{\begin{array}{ll}\frac{\phi (z)}{\sigma_v}>0, & \textrm{if}\ \frac{I_{01}^S}{P_0}<\frac{I_{01}^D}{P_0};\\ 0, & \textrm{if}\ \frac{I_{01}^S}{P_0}\ge \frac{I_{01}^D}{P_0}\end{array}\right. $$
(5)
$$ \frac{d\left(\frac{I_{01}}{P_0}\right)}{d{\mu}_h}=\left\{\begin{array}{ll}-\frac{\phi (z)}{\sigma_v}<0, & \textrm{if}\ \frac{I_{01}^S}{P_0}\le \frac{I_{01}^D}{P_0};\\ 0, & \textrm{if}\ \frac{I_{01}^S}{P_0}>\frac{I_{01}^D}{P_0}\end{array}\right. $$
(6)

where \( \phi \left(\cdot \right) \) is the density function of a standard normal and \( h=0,C \). According to (5), pull effects (namely, improvements in the mean income opportunities in the destination country) are positive and strongest when restrictions are binding neither ex-ante nor ex-post; they are positive but smaller in size when the quota is binding ex-post but not ex-ante; and, finally, they are equal to zero in a quantity-constrained world. A parallel interpretation explains the comparative-static results in (6), which describes push effects (changes of \( {\mu}_0 \), that is, mean income opportunities in the origin country) and the impact of mean migration costs (changes of \( {\mu}_C \)), according to the immigration-policy regime.

Thus, according to this simple model, pull and push factors have either similar-sized effects (with opposite signs), when quotas are not binding, or they both have no (or a small) effect on emigration rates, when quotas are binding. In the empirical analysis I will not be able to control for whether migration quotas are binding for a country pair in a given year (since I do not have data on \( {I}_{01}^D \)). Therefore I will estimate an average effect across country pairs with different degrees of restrictiveness. However, I will be able to use information on changes in \( {I}_{01}^D \): I should find that pull (push) effects are more positive (negative) than average, for a given destination country, if that country’s migration policy becomes less restrictive.Footnote 14
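The two regimes in (5) and (6) can be checked numerically. The sketch below is a simplified illustration, not the estimation procedure of this paper: it assumes an exogenous quota (as Fig. 1 does) and sets the probability of being allowed to stay equal to one inside the supply index, so that the observed rate is simply the minimum of the supply rate and the quota rate. All function names and parameter values are hypothetical.

```python
import math


def phi(x):
    """Density of a standard normal."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def big_phi(x):
    """Cumulative distribution function of a standard normal."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def observed_rate(mu1, mu0, mu_c, sv, quota_rate):
    """Observed emigration rate: min(supply rate, quota rate).
    Simplifying assumption: eta = 1 in the supply index."""
    z = -(mu1 - mu0 - mu_c) / sv
    supply = 1.0 - big_phi(z)
    return min(supply, quota_rate)


def effect(which, mu1, mu0, mu_c, sv, quota_rate, h=1e-6):
    """Central finite-difference derivative of the observed rate
    with respect to mu1 ('pull') or mu0 ('push')."""
    d1, d0 = (h, 0.0) if which == "pull" else (0.0, h)
    up = observed_rate(mu1 + d1, mu0 + d0, mu_c, sv, quota_rate)
    down = observed_rate(mu1 - d1, mu0 - d0, mu_c, sv, quota_rate)
    return (up - down) / (2.0 * h)


# Hypothetical parameters: here z = 0.5 and the supply rate is about 0.31.
args = dict(mu1=1.0, mu0=1.0, mu_c=0.5, sv=1.0)
pull_free = effect("pull", quota_rate=0.9, **args)   # quota not binding
push_free = effect("push", quota_rate=0.9, **args)
pull_bound = effect("pull", quota_rate=0.1, **args)  # quota binding
push_bound = effect("push", quota_rate=0.1, **args)
```

With a non-binding quota the pull and push effects come out equal in size and opposite in sign, matching ϕ(z)/σv and −ϕ(z)/σv; with a binding quota both are zero.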

Fig. 2 Total immigrant inflow by destination country

Focusing for simplicity on the region where immigration quotas are not binding, it is straightforward to derive predictions for the impact of second moments of the income distributions (of the origin country’s population) at home and abroad respectively. In particular, assuming that σC = 0, we obtain the following expressions, where \( k=\phi (z){\left({\sigma}_1^2+{\sigma}_0^2-2{\rho}_{01}{\sigma}_0{\sigma}_1\right)}^{-\frac{1}{2}}\left(-\frac{1}{\sigma_v^2}\right)<0 \) (Borjas, 1987):

$$ \frac{d\left(\frac{I_{01}}{P_0}\right)}{d{\sigma}_1}=k\cdot \left({\mu}_1^0-{\mu}_0-{\mu}_C\right)\cdot \left({\sigma}_1-{\rho}_{01}{\sigma}_0\right), $$
(7)
$$ \frac{d\left(\frac{I_{01}}{P_0}\right)}{d{\sigma}_0}=k\cdot \left({\mu}_1^0-{\mu}_0-{\mu}_C\right)\cdot \left({\sigma}_0-{\rho}_{01}{\sigma}_1\right). $$
(8)

In my discussion I will assume that \( \left({\mu}_1^0-{\mu}_0-{\mu}_C\right)>0 \) so that, based on first-moments considerations, on average immigrants have an incentive to migrate. The results in (7) and (8) imply that, if \( \frac{\sigma_0}{\sigma_1}<1 \) and \( {\rho}_{01} \) is sufficiently high (\( {\rho}_{01}>\frac{\sigma_0}{\sigma_1} \)), then \( d{\sigma}_0>0 \) or \( d{\sigma}_1<0 \) (i.e., an increase in the relative inequality \( \frac{\sigma_0}{\sigma_1} \)) will increase the emigration rate. Similarly, if \( \frac{\sigma_0}{\sigma_1}>1 \) and \( {\rho}_{01} \) is sufficiently high (\( {\rho}_{01}>\frac{\sigma_1}{\sigma_0} \)), then \( d{\sigma}_0>0 \) or \( d{\sigma}_1<0 \) (i.e., an increase in the relative inequality \( \frac{\sigma_0}{\sigma_1} \)) will decrease the emigration rate.
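The sign pattern implied by (7) and (8) can be verified with a short Python sketch. It evaluates the two derivatives under the assumptions used in the text (non-binding quotas and no dispersion in migration costs); the function name and the numerical values are hypothetical and serve only as an illustration.

```python
import math


def inequality_effects(mu1, mu0, mu_c, s0, s1, rho01):
    """Return (d rate / d sigma_0, d rate / d sigma_1) from (8) and (7),
    assuming non-binding quotas and sigma_C = 0."""
    sv = math.sqrt(s1 ** 2 + s0 ** 2 - 2.0 * rho01 * s0 * s1)
    gain = mu1 - mu0 - mu_c                  # assumed positive in the text
    z = -gain / sv
    density = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    k = density * (1.0 / sv) * (-1.0 / sv ** 2)   # k < 0
    d_sigma0 = k * gain * (s0 - rho01 * s1)
    d_sigma1 = k * gain * (s1 - rho01 * s0)
    return d_sigma0, d_sigma1


# Case 1 (hypothetical): sigma_0 / sigma_1 = 0.5 < 1 and rho_01 = 0.8 > 0.5.
pos_d0, pos_d1 = inequality_effects(2.0, 1.0, 0.5, s0=1.0, s1=2.0, rho01=0.8)

# Case 2 (hypothetical): sigma_0 / sigma_1 = 2 > 1 and rho_01 = 0.8 > 0.5.
neg_d0, neg_d1 = inequality_effects(2.0, 1.0, 0.5, s0=2.0, s1=1.0, rho01=0.8)
```

In the first case a higher origin-country dispersion raises the emigration rate and a higher destination-country dispersion lowers it; in the second case both signs flip.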

3 Data

In this paper, I merge data from an international migration panel with macroeconomic and other information on the origin and destination countries of immigrant flows. Data on immigration comes from the International Migration Statistics (IMS) data set for OECD countries (OECD, 1997), which provides information on bilateral immigrant flows based on the OECD’s Continuous Reporting System on Migration (SOPEMI).Footnote 15 In particular, I use data on yearly immigrant inflows into fourteen OECD countries by country of origin, in the period 1980–1995. The IMS data only covers legal immigration; population registers and residence and work permits are the main sources of these statistics.Footnote 16 Based on this dataset, labor movements to the fourteen OECD countries appear to be both South-North and North-North flows. The sample includes seventy-nine origin countries with per worker GDP levels ranging from approximately $1000 to $55,000 (PPP-adjusted) on average in the period considered.

The quality of the IMS data is high even though the coverage is not complete. The data set is supposed to cover immigrant inflows into each of the fourteen destination countries from all over the world. However, the sum by country of origin of the IMS numbers is not equal to 100% of the total flow into each destination country. The percentage of the total immigrant inflow covered by the disaggregate data ranges between 45% (Belgium) and 84% (United States). Put differently, the data set includes zero flows for some country pairs (immigrant inflows from Italy to the United States, for example): some of these observations are likely to correspond to very small flows rather than zero flows. If very small flows are recorded as zeros in the disaggregate data set, there will be a discrepancy between total flows and the sum of flows by origin country. In the empirical analysis I will keep zero-flow observations in the data set and will investigate the robustness of my results to using a Tobit model.

Summary statistics and data sources for the other regressors used in the empirical model are documented in the Appendix. Data on macroeconomic variables comes from various sources: the 2001 World Development Indicators data set (World Bank 2001) and the Penn World Tables (versions 5.6 and 6.1). Geographic and cultural information, such as on great-circle distance,Footnote 17 land border, common language, and colonial ties, comes from Glick and Rose’s (2002) data set on gravity-model variables. I also use statistics on the average number of schooling years in the total population of destination and origin countries (over age 15) from Barro and Lee’s (2000) data set. Data on Gini coefficients of destination and origin countries, used to construct the origin country’s relative inequality variable, comes from Deininger and Squire’s (1996) data set (I only use so-called high-quality observations).Footnote 18 Finally, information on origin countries’ share of young population comes from the United Nations.

Figure 2 shows that many destination countries in the sample are characterized by substantial volatility of immigrant inflows year after year. An important cause of variation over time in the number of immigrants to a given destination country is changes in that country’s migration policy. For example, the United States’ graph in Fig. 2 displays a peak around the year 1990. This is not surprising given that an amnesty law, the Immigration Reform and Control Act, was passed in 1986 and put in effect in the following years, with the bulk of the legalizations taking place in 1989–1991. The graph for Japan, on the other hand, displays a sudden decrease in the total immigrant inflow around the year 1982, which is when the Immigration Control and Refugee Recognition Act was passed. A separate Appendix to this paper documents the main characteristics of the migration policies of the destination countries in the sample and the timing (after 1980) of changes in their legislations (Mayda and Patel, 2004). A data set of destination countries’ migration-policy changes, between 1980 and 1995, was constructed on the basis of the information in this Appendix and used in the empirical analysis.Footnote 19

4 Empirical Model

According to the theoretical framework in Sect. 2, the estimating equation should include the emigration rate as the dependent variable and, among the explanatory variables, the mean wage of the origin country’s population in, respectively, the origin and destination countries. As approximations for the latter two variables, I use the (log) level of per worker GDP, PPP-adjusted (constant 1996 international dollars) in the two countries.Footnote 20 Based on the theoretical model, I expect pull and push effects to be, respectively, positive and negative on average, if migration quotas are not binding, and both zero (or small) otherwise.

Another determinant of bilateral immigration flows implied by the model of Sect. 2 is the physical distance between the two locations, which affects migration costs Ci. The further away the two countries are, the higher the monetary travel costs for the initial move, as well as for visits back home. Remote destinations may also discourage migration because they require longer travel time and thus higher foregone earnings. Another explanation as to why distance may negatively affect migration is that it is more costly to acquire information ex-ante about far-away countries. Besides distance, I introduce additional variables that affect the level of migration costs Ci. A common land border is likely to encourage migration flows, since land travel is usually less expensive than air travel. Linguistic and cultural similarity are also likely to reduce the magnitude of migration costs, for example by improving the transferability of individual skill from one place to the other. Past colonial relationships should increase emigration rates, to the extent that they translate into similar institutions and stronger political ties between the two countries, thus decreasing the level of migration costs Ci.

Finally, I introduce the share of the origin country’s population that is young (between 15 and 29 years old) as a demographic determinant of migration flows. Consider an extension of the basic model in Sect. 2 to a multi-period setting. In this set-up, the individual cares not only about current wage differentials net of moving costs, but about future ones too. This implies that a potential migrant from country 0 will have a bigger incentive to migrate the younger he is, as the present discounted value of net benefits will be higher the longer his remaining working life is (for positive \( {I}_i \) in each year). We would then expect the share of the young population in the origin country to positively affect the emigration rate out of that country.
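The multi-period argument can be made concrete with a small Python sketch. It is only an illustration of the reasoning above: the constant yearly net benefit, the retirement age of 65, and the discount factor of 0.95 are all hypothetical assumptions, not values used in this paper.

```python
def pdv_net_benefit(annual_gain, age, retirement_age=65, discount=0.95):
    """Present discounted value of a constant yearly net benefit of
    migrating, received from the current age until retirement.
    retirement_age and discount are hypothetical illustration values."""
    years_left = max(retirement_age - age, 0)
    return sum(annual_gain * discount ** t for t in range(years_left))


# The same positive yearly net benefit is worth more to a younger migrant.
pdv_young = pdv_net_benefit(annual_gain=1.0, age=20)
pdv_old = pdv_net_benefit(annual_gain=1.0, age=50)
pdv_retired = pdv_net_benefit(annual_gain=1.0, age=65)
```

For any positive yearly net benefit, the value is strictly larger at age 20 than at age 50, which is the mechanism behind the expected positive effect of the young-population share.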

In a cross-country analysis, such as in this paper, unobserved country-specific effects could result in biased estimates. For example, the estimate of the coefficient on the destination country’s per worker GDP may be positive. Based on this result, it is not clear whether immigrants go to countries with higher wages or, alternatively, whether countries with higher wages have other characteristics that attract immigrants. Along the same lines, a negative coefficient on income at home leaves open the question of whether immigrants leave countries with lower wages or, alternatively, whether countries with lower wages have certain features that push immigrants to leave. To (partly) get around this problem, I exploit the panel structure of the data set and I introduce dummy variables for both destination and origin countries. This allows me to control for unobserved country-specific effects which are additive and time-invariant.Footnote 21 All the regressions also have year effects, to account for common time shocks, and robust standard errors clustered by country pair, to address heteroscedasticity and allow for correlation over time of country-pair observations. Notice that destination countries’ fixed effects also allow me to control for features of their immigration policy which are time-invariant and common across origin countries. In order to capture the effect of changes in destination countries’ migration policies, I introduce two interaction terms of an indicator variable of such changes with pull and push factors, respectively. According to the theory, if the migration policy of a destination country becomes less restrictive, the effect of pull (push) factors should turn more positive (negative).

The basic empirical specification thus looks as follows:

$$ {\displaystyle \begin{array}{rlll}\frac{flo{w}_{ij t}}{P_{it}}& =\beta +{\beta}_0 pwgd{p}_{it-1}+{\beta}_1 pwgd{p}_{jt-1}+{\beta}_2 dis{t}_{ij}+{\beta}_3 borde{r}_{ij}& & \\ {}& \kern1em +{\beta}_4 comlan{g}_{ij}+{\beta}_5 colon{y}_{ij}+{\beta}_6 youngpo{p}_{it-1}& & \\ {}& \kern1em +{\beta}_7 pwgd{p}_{it-1}\cdot immigpo{l}_{jt}+{\beta}_8 pwgd{p}_{jt-1}\cdot immigpo{l}_{jt}& & \\ {}& \kern1em +{\delta}_i{I}_i+{\delta}_j{I}_j+{\delta}_t{I}_t+{\varepsilon}_{ij t}& \end{array}} $$
(9)

where i is the origin country, j the destination country, and t time. \( \frac{flo{w}_{ijt}}{P_{it}} \) is the emigration rate from i to j at time t (\( flo{w}_{ijt} \) is the inflow into country j from country i at time t, \( {P}_{it} \) is the population of the origin country at time t). pwgdp is the (log) per worker GDP, PPP-adjusted (constant 1996 international dollars) and dist measures the (log) great-circle distance between the two countries. The variable border equals one if the two countries in the pair share a land border. comlang and colony are two dummy variables equal to one, respectively, if a common language is spoken in the two locations, and for pairs of countries which were, at some point in the past, in a colonial relationship. The variable youngpop is the share of the population in the origin country aged 15–29 years old. The variable immigpol increases by one (decreases by one) if in that year the destination country’s immigration policy became less (more) restrictive, and does not change otherwise. In other words, a change in policy is modelled as leading to a lasting effect (i.e., in the year when the policy change occurred and in the following years). Finally, the basic empirical specification also includes destination and origin countries’ fixed effects (\( {I}_j \) and \( {I}_i \)) and year effects (\( {I}_t \)). According to the model in Sect. 2, I expect that \( {\beta}_0\le 0 \), \( {\beta}_1\ge 0 \), \( {\beta}_2\le 0 \), \( {\beta}_3\ge 0 \), \( {\beta}_4\ge 0 \), \( {\beta}_5\ge 0 \), \( {\beta}_6\ge 0 \), \( {\beta}_7<0 \), and \( {\beta}_8>0 \).
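Because a policy change is modelled as having a lasting effect, immigpol behaves like a running sum of yearly policy changes. The Python sketch below illustrates one way such an index could be built. The function name, the starting value of zero in the first sample year, and the example policy changes are hypothetical and are not taken from the migration-policy data set used in this paper.

```python
from itertools import accumulate


def immigpol_index(changes_by_year, years):
    """Cumulative policy index for one destination country.
    changes_by_year maps a year to +1 (less restrictive) or -1 (more
    restrictive); years without an entry count as no change.
    Assumption: the index starts from zero in the first sample year."""
    yearly_changes = [changes_by_year.get(year, 0) for year in years]
    return dict(zip(years, accumulate(yearly_changes)))


# Hypothetical example: a loosening in 1982 followed by a tightening in 1984.
sample_years = list(range(1980, 1986))
index = immigpol_index({1982: 1, 1984: -1}, sample_years)
```

In this example the index equals one in 1982 and 1983 and returns to zero from 1984 onwards, so the interaction terms in (9) are switched on only while the looser regime is in force.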

An econometric complication is the possibility of reverse causality and, more generally, of endogeneity in the time-series dimension of the analysis. For example, the theoretical model in Sect. 2 predicts that, if migration quotas are not binding, better (worse) income opportunities in the destination (origin) country increase emigration rates. However, a positive \( \beta_1 \) (negative \( \beta_0 \)) may just reflect causation in the opposite direction, that is, the impact of immigrant flows on wages in host and source countries. After all, this channel is the main focus of analysis in many labor-economics papers (see Friedberg & Hunt, 1995 for a survey of this literature). More broadly, other time-variant third factors may drive contemporaneous wages and immigrant flows.

As for reverse causality, notice that it is likely to bias the estimates toward zero. The reason is that, if anything, immigrant inflows are likely to decrease wages in the destination country and outflows are likely to increase wages in the origin country. While the opposite signs are a theoretical possibility (e.g., in the economic-geography literature, because of economies of scale), the empirical evidence in the labor-economics literature is that immigrant inflows have a negative or zero impact on the destination country’s wages (Friedberg & Hunt, 1995; Borjas, 2003) and that immigrant outflows have a positive impact on the origin country’s wages (Mishra, 2007).

Although reverse causality may not be an issue, it is still important to address endogeneity. Thus, I relate current emigration rates to lagged values of (log) per worker GDP, at home and abroad (and to lagged values of all the other time-varying regressors). While it is unrealistic to claim that wages at home and abroad are strictly exogenous, it is plausible to assume that they are predetermined, in the sense that immigrant inflows—and third factors in the error term—can only affect contemporaneous and future wages.Footnote 22

5 Empirical Results

Table 1 presents the results from estimation of Eq. (9). The estimates show a systematic pattern, broadly consistent with the theoretical predictions of the international migration model. The analysis also generates empirical puzzles.

Table 1 Determinants of bilateral immigrant flows

First, the emigration rate is positively related to the destination country’s (log) per worker GDP. According to the estimate in regression (1), a 10% increase in the level of per worker GDP in the destination country increases emigration by 2.6 emigrants per 100,000 individuals of the origin country’s population (significant at the 5% level). In other words, a 10% increase in the host country’s per worker GDP implies a 20% increase in the emigration rate (as the mean of the dependent variable is, in regression (1), 13 emigrants per 100,000 individuals). This result would suggest that migration quotas are not binding on average across destination countries. However, the impact on the emigration rate of a change in the income opportunities at home is not consistent with this interpretation. Push effects are estimated to be insignificantly different from zero in Table 1 (and often of the wrong sign). One possibility is that, in practice, migration quotas are not binding, but push factors are zero due to the effect of poverty constraints in the origin country. I will investigate this hypothesis in Table 2.
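The arithmetic behind the 20% figure can be checked with a short sketch. The two inputs are the numbers reported in the text for regression (1). The implied coefficient is a back-of-the-envelope value derived here under the assumption that the regressor is the natural log of per worker GDP; it is not read from Table 1.

```python
import math

# Figures reported in the text for regression (1), in emigrants per 100,000
# individuals of the origin country's population.
effect_of_10pct = 2.6      # change in emigration rate for a 10% GDP increase
mean_emigration_rate = 13  # sample mean of the dependent variable

# Relative change in the emigration rate.
relative_change = effect_of_10pct / mean_emigration_rate
print(f"{relative_change:.0%}")  # 20%

# Assumption: the regressor is log per worker GDP, so a 10% increase moves it
# by ln(1.10) and the implied coefficient is effect / ln(1.10).
implied_beta1 = effect_of_10pct / math.log(1.10)
print(round(implied_beta1, 1))  # 27.3
```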

Table 2 Economic determinants more in detail

In regressions (1)–(3), Table 1, I also explore the role played by geographic (log distance and land border), cultural (common language and colony), and demographic (share of young population (origin)) determinants, respectively. The picture that emerges from my results is one in which geography and demographics are the most important among this set of drivers of migration flows. According to the estimate in column (1), doubling the great-circle distance between the source and host country decreases the number of emigrants by 41 per 100,000 individuals in the origin country (significant at the 1% level). On the other hand, a common land border does not appear to play a significant role. The impact of a common language, though of the right sign, is not statistically significant and, surprisingly, past colonial relationships do not appear to affect migration rates (this is true whether common language and colony are entered in the regression together or one at a time). Finally, the share of the origin country’s population that is young has a positive and significant impact on emigration rates. A ten percentage point increase in the share of the origin country’s population aged 15–29 raises the emigration rate by 20 emigrants per 100,000 individuals (regression (3)).

Next, I investigate whether per worker GDP (PPP-adjusted) of origin and destination countries is a good proxy for mean income opportunities of migrant workers at home and abroad. Per worker GDP is not a direct measure of wages of a potential migrant, since it depends on rates of return to both capital and labor and on endowments of each factor. For example, a higher per worker GDP in the destination country does not necessarily mean better income opportunities on average for an immigrant worker, since it could be due to a higher capital-labor ratio or to a more skilled labor force in the destination country’s population. To address this concern, I run a robustness check where I control for the mean skill level and per worker capital endowment in destination and origin countries (columns (4)–(5)).Footnote 23 I first control for the average schooling level in both countries in regression (4). I still estimate pull effects which are positive and significant (at the 1% level). The results on push effects are the same as in previous estimates as well. In line with the theoretical predictions, the average skill level in the population of the destination (origin) country has a negative (positive) impact on the emigration rate. In regression (5) I control for the per worker endowments of both skill and capital and find that their coefficients are of the right sign (although not significant). Most importantly, my prior findings on pull and push factors are robust.

In column (6), out of all the geographic, cultural, and demographic determinants, I only include the ones which are significant based on regressions (1)–(3), that is log distance and share of young population (origin). I find evidence consistent with my previous results. Using a specification with these variables, I test how robust the results are—in particular, in terms of the asymmetry between pull and push factors—to using a Tobit specification (regression (7)). The estimates are again in line with the picture based on OLS regressions but they are larger in magnitude.

In the next regression (column (8)) I only exploit the variation over time within country pairs, by introducing fixed effects for each combination of origin and destination countries.Footnote 24 These country-pairs dummy variables allow me to control for time-invariant features of the destination country’s immigration policy which are specific for each origin country. The results from this specification confirm that push and pull factors have an asymmetric effect in terms of magnitudes and significance levels.Footnote 25

Next, I investigate the interaction between changes in destination countries’ migration policies and, respectively, pull and push factors (column (9), Table 1). Consistent with the theoretical predictions, positive pull factors are bigger than average for a destination country whose migration policy becomes less restrictive. Setting aside the average effect, push factors turn negative and significant once migration restrictions are relaxed. The opposite is true when policy becomes more protectionist. In the same regression I also add the interaction of the indicator variable of changes in destination countries’ migration policy with, respectively, log distance and share of young population (origin). I find that the effect of the latter two variables is more pronounced (more negative and more positive, respectively) when a host country’s immigration laws turn less restrictive. The opposite is true when policy becomes more protectionist. Notice that I also include the linear effect of immigration policy changes, which is insignificant. Regression (9) represents the preferred specification of the model. It shows that migration restrictions matter by mitigating effects on the supply side of the model (pull and push factors, geography, and demographics).

6 Additional Results

In Table 2, I analyze economic determinants more in detail. First, I investigate the impact of the second moments of the income distributions in the origin and destination countries. According to the theory (formulas (7) and (8)), given low values of the origin country’s relative inequality (\( \frac{\sigma_0}{\sigma_1} \)), if \( \frac{\sigma_0}{\sigma_1} \) increases, the emigration rate will increase, while given high values of \( \frac{\sigma_0}{\sigma_1} \), if \( \frac{\sigma_0}{\sigma_1} \) increases, the emigration rate will decrease.Footnote 26 The intuition for these results is straightforward. If income inequality in the origin country is lower than in the destination country (\( \frac{\sigma_0}{\sigma_1}<1 \)), there is positive selection of immigrants from country 0 to country 1: migrants are selected from the upper tail of the income distribution at home and end up in the upper tail of the income distribution abroad (in both cases, the relevant distribution is that of the origin country’s population). For example, consider potential migrants from Portugal to the United States. Given that income inequality is lower in Portugal than in the U.S., among Portuguese workers it is the better-off who have an incentive to migrate, while those in the lower tail of the income distribution have an incentive to stay. The reason is that the probability of both very high and very low incomes is higher in the U.S. than in Portugal. An increase in income inequality in Portugal will make the marginal individual (who is in the lower tail of the income distribution) relatively worse-off at home and will increase her incentive to leave. Similarly, if income is more dispersed at home than abroad (\( \frac{\sigma_0}{\sigma_1}>1 \)), then there is negative selection of immigrants from country 0 to country 1: migrants are selected from the lower tail of the income distribution at home and end up in the lower tail of the income distribution abroad. An example of this situation is migration from Brazil to the U.S., given that income inequality in the latter is lower than in the former.Footnote 27 An increase in income inequality in Brazil will lower the emigration rate because those who were not migrating beforehand, the better-off, will have even less incentive to do so afterwards. In order to test these predictions, I introduce in the estimating equation a measure of the origin country’s relative inequality (\( \frac{\sigma_0}{\sigma_1} \)) both in linear and quadratic forms. As expected, I find that the coefficient on the linear term is positive and on the quadratic term is negative (both significant at conventional levels), which is consistent with the Borjas (1987) selection model (regressions (1)–(2), Table 2).Footnote 28
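The inverted-U pattern implied by a positive linear and a negative quadratic coefficient can be made explicit with a short derivation. The symbols \( r \), \( \gamma_1 \), and \( \gamma_2 \) are notation introduced here for illustration only; they do not appear in the paper. Writing \( r = \frac{\sigma_0}{\sigma_1} \) and holding all other regressors fixed:

$$
\begin{aligned}
\frac{flow_{ijt}}{P_{it}} &= \dots + \gamma_1 r + \gamma_2 r^2 + \dots \\
\frac{\partial}{\partial r}\left(\frac{flow_{ijt}}{P_{it}}\right) &= \gamma_1 + 2\gamma_2 r \\
r^{*} &= -\frac{\gamma_1}{2\gamma_2}
\end{aligned}
$$

With \( \gamma_1 > 0 \) and \( \gamma_2 < 0 \), the marginal effect of relative inequality is positive for \( r < r^{*} \) and negative for \( r > r^{*} \), which matches the sign pattern predicted by the theory for low and high values of \( \frac{\sigma_0}{\sigma_1} \).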

The remaining specifications in Table 2 investigate empirically a few extensions of the theoretical framework of Sect. 2. First, it is possible to incorporate poverty constraints in the model, due to fixed costs of migration and credit market imperfections in the origin country. As Yang (2003) shows, these assumptions imply that the effect on emigration rates of income opportunities at home is non-monotonic, positive at very low levels of income and negative for higher levels. Accordingly, I extend the empirical model previously specified by introducing both a linear and a quadratic term in per worker GDP of the origin country. I find very weak evidence of poverty constraints in regression (3). The sign of the coefficients is consistent with the theory but the lack of significance of the estimates prevents me from reading too much support into them.Footnote 29 This result thus leaves open the question of why push and pull effects are different in size and, indirectly, lends support to the alternative hypothesis of binding (and endogenous) migration quotas.

Next, the theoretical model can be modified by taking into account uncertainty in finding a job in each place. This extension suggests using the unemployment rate (which is approximately equal to one minus the probability of finding a job) as a regressor in the estimating equation. My results in column (4) are not significant. In an additional extension (column (5)), I test whether workers choose among multiple destination countries. In the theoretical model, the choice is between the origin country and one particular destination country. In practice, however, potential migrants are likely to compare mean income opportunities in their origin country to those in the destination country considered and in any other host country. For each pair of source and host economies, I construct and control for a multilateral pull term which is an average of per worker GDP levels of all the other destination countries in the sample, each weighted by the inverse of distance from the origin country. Regression (5) shows that third-country effects shape bilateral migration flows as expected, given that the coefficient of the multilateral pull term is indeed negative and significant (at the 10% level).Footnote 30
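The construction of the multilateral pull term can be sketched as follows. This is one possible reading of the description above, not the paper's code: the function name, the dictionary-based inputs, the normalization of the weights to sum to one, the exclusion of the origin country from the set of alternatives, and all numerical values are assumptions.

```python
def multilateral_pull(origin, destination, gdp, distance):
    """Inverse-distance-weighted average of per worker GDP in all
    destinations other than `destination`, as seen from `origin`.

    gdp: dict mapping destination country -> per worker GDP
    distance: dict mapping (origin, destination) -> distance
    """
    # Alternatives: every other destination in the sample (and, by
    # assumption, never the origin country itself).
    others = [d for d in gdp if d != destination and d != origin]
    weights = {d: 1.0 / distance[(origin, d)] for d in others}
    total_weight = sum(weights.values())
    # Assumption: weights are normalized to sum to one (a weighted average).
    return sum(weights[d] * gdp[d] for d in others) / total_weight

# Hypothetical sample: origin X and three destinations A, B, C.
gdp = {"A": 60000, "B": 40000, "C": 50000}
distance = {("X", "A"): 1000, ("X", "B"): 2000, ("X", "C"): 4000}

# Pull term for the pair (X, A): average over B and C, with B weighted
# twice as heavily as C because it is half as far from X.
pull_XA = multilateral_pull("X", "A", gdp, distance)
print(round(pull_XA, 2))  # 43333.33
```

A negative coefficient on this term means that, for a given pair, better income opportunities in alternative destinations close to the origin country divert migrants away from the destination considered.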

To conclude, I investigate the role of past migration flows to the destination country from the same origin country. Lagged emigration rates capture the impact of network effects, which are likely to reduce the cost \( C_i \) of migration. The introduction of the lagged emigration rate among the explanatory variables makes the model a dynamic one. I use Arellano and Bond’s GMM estimator to deal with the incidental parameter problem that arises with fixed-effects estimation of such a dynamic equation.Footnote 31 Emigration rates show considerable inertia in regression (6), where the coefficient of the lagged emigration rate is 0.66 (significant at the 1% level).Footnote 32 However, outside the model of Sect. 2, which assumes exogenous migration quotas, it is unclear how to interpret this autocorrelation. While it is consistent with network effects on the supply side, it could also be driven by factors working on the demand side. In particular, through the latter channel, past migration flows can influence the emigration rate in two different ways: through family-reunification immigration policies and through political-economy factors (see, e.g., Goldin (1994) and Ortega (2005), where the votes of naturalized immigrants affect immigration-policy outcomes).
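To give a sense of how much inertia a coefficient of 0.66 implies, the following sketch works through standard first-order dynamics. It assumes a simple model in which the emigration rate depends only on its own first lag and on regressors that are held fixed; the derived quantities are illustrative calculations, not results reported in the paper, and they say nothing about whether the persistence comes from the supply or the demand side.

```python
import math

rho = 0.66  # coefficient on the lagged emigration rate, regression (6)

# Share of a one-time shock still present after k years: rho ** k.
persistence = [round(rho ** k, 3) for k in range(1, 4)]
print(persistence)  # [0.66, 0.436, 0.287]

# Long-run multiplier: a permanent change in a regressor has a long-run
# effect equal to its short-run effect times 1 / (1 - rho).
long_run_multiplier = 1 / (1 - rho)
print(round(long_run_multiplier, 2))  # 2.94

# Years until half of a one-time shock has died out.
half_life = math.log(0.5) / math.log(rho)
print(round(half_life, 2))  # 1.67
```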

7 Conclusions

In this paper, I empirically investigate the determinants of international bilateral migration flows. This analysis both delivers estimates consistent with the predictions of the international migration model and generates empirical puzzles.

In particular, I find evidence that pull factors, that is, income opportunities in the destination country, significantly increase the size of emigration rates. This result is very robust to changes in the specification of the empirical model. On the other hand, the sign of the impact of push factors (that is, per worker GDP in the origin country) is seldom negative and, when it is, the size of the effect is smaller than for pull factors and insignificant. Therefore the evidence uncovered by the estimates is mixed in terms of the migration-policy regime that characterizes, on average, the destination countries in the sample: Push effects suggest that migration quotas are more binding than pull effects do. A possible explanation of the asymmetry between push and pull factors is the role played by the demand side of the model, that is, destination countries’ migration policies. While the theoretical framework of Sect. 2 assumes that migration quotas are exogenous, in practice they are not. Indeed, migration policies can be thought of as the outcome of a political-economy model in which voters’ attitudes toward immigrants, interest-group pressure, policy-makers’ preferences, and the institutional structure of government interact with each other and give rise to a final immigration-policy outcome (Rodrik, 1995; Facchini & Willmann, 2005; Mayda, 2006). Binding and endogenous migration quotas can explain the asymmetric effect I estimate for pull and push factors. While I do not investigate the endogenous determination of migration policy, I find evidence consistent with the constraining role played by migration policies. In the empirical analysis, I interact an indicator variable of changes in destination countries’ migration policies with pull and push factors, respectively. I find that pull effects become more positive and push effects turn negative in those years when a host country’s immigration laws become less restrictive.

Among the variables affecting the costs of migration, distance appears to be the most important one. Its effect is negative, significant, and steady across specifications. Demographics, in particular the share of the origin country’s population that is young, represent a significant determinant of emigration rates as well. I find that the effect of both variables is more pronounced in those years when a host country’s immigration laws become less restrictive. In sum, my results suggest that migration quotas matter: They mitigate supply-side effects, that is, pull and push factors, geography, and demographics.

The investigation of the determinants of international migration leads to other interesting research questions. The framework I have used in this paper to study migration flows is related to the gravity model of trade, which is used to analyze bilateral trade flows across countries. As a matter of fact, I have used several variables that appear frequently in the trade gravity literature (log distance, land border, common language, and colony). A common framework of empirical analysis for trade and migration makes it possible to combine the study of these two dimensions of international integration.

To conclude, by taking advantage of both the time-series and cross-country variation in an annual panel data set, this paper makes progress in explaining the determinants of international migration flows and in providing a framework for future analyses of migration relative to other dimensions of globalization.