1 Introduction

There is a long tradition of economic research on international migration, which dates back at least to Chiswick (1978) and has been recently summarized by Hatton (2014). Yet the study of international migration is still at the heart of a lively debate that centers mainly on the causes and consequences of geographic mobility at the individual, regional and country levels. In particular, due to the growth of inter-regional and international flows of high-skilled workers, there has been renewed interest in the topics of brain drain and brain gain (Beine et al. 2001; Docquier and Rapoport 2012). Surprisingly, among the highly skilled, less attention has been paid to Ph.D. graduates. Moreover, an analysis of their outcomes related to either inter-regional or international mobility is completely absent. Thus, the present study is an attempt to fill this gap by focusing on doctorate holders and their early performance in the labor market associated with the choice of international mobility.

Understanding the determinants and outcomes of Ph.D. mobility can be of great relevance not only for individuals themselves, but also for society and policy makers. For individuals, private benefits may be monetary and non-monetary, such as employment opportunities, higher wages, career prospects and job satisfaction (Abreu et al. 2015; Pitcher and Purcell 1998). By contrast, social benefits crucially depend on the knowledge transfer to the private sector. Outflow mobility of high-skilled labor is often perceived in terms of human capital depreciation, while the ability to attract high-skilled migrants turns out to be fundamental to the enhancement of the workforce skill composition and the innovative success of regions and countries. For instance, since human capital intrinsically resides in individuals, the flows of highly educated individuals have a direct impact on regional development through their influence on firms’ success in competitive environments (Parey et al. 2015). In this context, studying the migration behavior of doctorate holders is particularly relevant, as they are among the most qualified workers both in terms of educational attainment and their ability to conduct research and contribute to economic development.Footnote 1

This study is also policy relevant, as it represents the first step toward a more comprehensive understanding of the international migration of Ph.D. holders. When the home country provides education, but highly skilled workers migrate elsewhere, the higher education system loses its educational investment. The effectiveness of policies aimed at reducing the brain drain crucially depends on the correct design of private incentives, which must also take into account the individual gains associated with the choice of migration.

Over the past two decades, the number of doctorate graduates has increased sharply, due to the worldwide development of higher education systems. Auriol et al. (2013) report a 38 % increase in the number of Ph.D. students who graduated from universities in OECD countries for the period 2000–2009, and similar trends can also be observed in non-OECD countries. Furthermore, opportunities for academic employment have declined over time and the labor market for Ph.D.s has changed markedly.Footnote 2 In parallel, many countries have adopted quality-selective immigration policies, such as tax benefits and simplified immigration measures,Footnote 3 aimed at attracting the best talents from all over the world (Beine et al. 2008). Thus, both supply and demand factors have changed the composition of international migration flows. Moreover, since mobility has been shown to be a key factor in making investment in human capital more effective, our focus on Ph.D. graduates aims at building an information base for the design of policies directed toward highly qualified individuals.

Like other countries, Italy has experienced steady growth in the provision of Ph.D. programs, with a steady increase in the number of graduates. The Italian National Institute of Statistics reports that in 2000 around 4000 students were granted a Ph.D. degree from Italian universities, while the number had risen to over 12,000 in 2008. However, few permanent positions in universities and public research centers have become available (Ballarino and Colombo 2010). From this point of view, it is interesting to investigate the outcomes of those who are at the top of the educational system, but at the same time face considerable insecurity should they choose to pursue an academic career. Ph.D. students are often trained in internationally vibrant environments and grow close to scientific communities that, by their nature, transcend national borders, and this makes doctorate holders a peculiar population to look at in terms of mobility choices.

Is mobility, then, an option that is rewarded with higher wages? To answer this question, this paper looks at the international mobility of Ph.D. graduates and their earnings performance after graduation. The paper presents novel results on the relatively unexplored topic of Ph.D. graduate mobility by exploiting a population-based survey of two cohorts of Italian Ph.D.s. Our econometric analysis confirms that the choice of mobility is associated with higher wages and that accounting for selection on unobservable factors is essential when estimating the returns to migration. Interestingly, we find negative selection in all our estimates. Additionally, we do not detect individual heterogeneity in the response of wages to migration. Moreover, we show that our results are confirmed when we include two exclusion restrictions in the empirical model and when we restrict the analysis to different subpopulations.

The rest of the paper is organized as follows. Section 2 links this paper to the existing literature. Section 3 outlines the estimation approach and describes the data. Results and robustness checks are discussed in Sect. 4. Section 5 concludes with a discussion on policy implications and offers some directions for further research.

2 Literature review

The theoretical framework that is often adopted to model the interplay of labor market and migration decisions rests upon the theory of human capital investments as discussed in Sjaastad (1962), Becker (1962), Herzog et al. (1985) and Greenwood (1975). Since then, numerous lines of research have emerged. In particular, this paper is mainly related to the strand of empirical research that deals with the migration of the highly skilled.

In surveying the existing literature, it is evident that there is a lack of attention toward Ph.D. graduates, probably because of the limited availability of data covering mobility aspects of doctorates’ careers. While this category has been thoroughly studied in terms of research performance (Athey et al. 2007; Grove and Wu 2007), not so much is known about their migratory trajectories and the effects on their private benefits. Only recently, the international migration literature has been partially extended to doctorate holders. Indeed, Gottlieb and Joseph (2006), which can be considered the first study of the college-to-work migration behavior of Ph.D.s, analyze a sample of technology university and Ph.D. graduates. The authors find that the propensity for migration of doctorates responds to amenity factors like climate and crime, while little evidence is found with respect to economic factors such as recent employment growth and the presence of IT jobs. Using a sample of US Ph.D. graduates in economics, Davis and Patterson (2000) observe stable interstate and international mobility patterns over time and claim that doctoral economists are more likely to change regions for academic and government employment than for jobs in the private sector. The authors also stress the importance of college ranking in shaping the mobility flow of graduates, which is also a crucial factor in determining which regions are net exporters and importers of Ph.Ds. Another example is Grogger and Hanson (2013) who analyze location choices of foreign-born science and engineering students receiving a Ph.D. from US universities. Their findings suggest that individuals in countries with world-leading research organizations show a lesser need to move abroad, and consequently, the US tends to succeed in retaining the most talented students. 
The present study joins this research agenda by deepening our understanding of the labor market outcomes and international migration decisions of Italian doctorate holders.Footnote 4

Although our focus is on international mobility of Ph.D. graduates, many insights may be borrowed from the internal migration literature that specifically looks at college and university graduates. A common ground between these two areas of research is theoretical. Indeed, at both the graduate and postgraduate levels, individuals are endowed with high levels of human capital acquired during their studies and spatial mobility represents a way to capitalize on this investment (Venhorst et al. 2011). Moreover, as pointed out in Sjaastad (1962) and Faggian et al. (2007b), the propensity to migrate is positively associated with the individual’s endowment of human capital. Consequently, it is reasonable to think that higher levels of human capital are related to a higher willingness to undertake wider spatial mobility patterns. At the same time, since postgraduate education is often aimed at acquiring very specific skills, doctorate holders extend the job search to an international basis to increase the chance of obtaining better job matches. Thus, the factors behind the decision to migrate and the expectations on career prospects, employability and wage returns are all aspects that individuals take into account at the initial stage of job placement. In particular, there is ample evidence that the choice of (internal) migration can be effectively explained by factors such as individual characteristics (Corcoran et al. 2010; Faggian and McCann 2009), education (Kodrzycki 2001), and academic and economic factors (Yousefi and Rives 1987; Ciriaci 2014), and the same factors have been recognized as prominent in the decision of Ph.D.s to move internationally (Auriol et al. 2013). Nevertheless, there might also be differences related to the specific research training received during the Ph.D.
For instance, Ph.D.s might react strongly to differences in research funding when choosing where to move or exploit academic networks to increase the chances to obtain a job in their field of specialization.

Even though efforts have been made to explain the factors behind the decision to migrate and the high intensity of graduate migration flows (Groen 2004; Hoare and Corver 2010), contributions on the economic returns to migration that specifically focus on highly educated individuals are still sparse. Indeed, while many authors concentrate on the monetary returns for young workers in general,Footnote 5 only a few have debated the importance of the highly educated. Among them, Abreu et al. (2015) analyze the relationship between migration and inter-industry mobility of UK graduates and show that both earnings and career satisfaction respond positively to a change in location. Di Cintio and Grassi (2013) determine to what extent internal migration (from domicile to higher education and, subsequently, to first employment) affects the wages of young graduates. Lastly, migration has also been found effective in reducing the probability of over-education and in increasing the likelihood of finding a job (Iammarino and Marinelli 2015; Devillanova 2013).

To summarize, the literature has so far mainly focused on the determinants of the migration decision and, partly, the issue of returns to spatial mobility of college and university graduates, mostly neglecting the equally important role of doctorate holders. Thus, the present paper is an attempt to fill this gap by examining the international mobility of Ph.D. graduates and their earnings performance after graduation.

3 Methods

To estimate the impact of mobility on wages, we largely rely on the extensive literature on program evaluation as developed, among others, by Rubin (1974), Angrist et al. (1996) and Heckman et al. (1999).Footnote 6 As is well known in this literature, estimating program impact requires assumptions on how to construct unobserved counterfactuals. Using the notation of Wooldridge (2010), we consider a model in which the individual’s observed wage, y, can be expressed as follows:

$$\begin{aligned} y=y^{0}+t\left( {y^{1}-y^{0}} \right) , \end{aligned}$$
(1)

where the superscripts distinguish the outcome conditional on migration (\(y^{1}\)) from the outcome conditional on immobility (\(y^{0}\)), while t is the binary indicator for mobility. Thus, for those who migrate (\(t = 1\)) we can only observe \(y^{1}\), while for those who do not migrate (\(t = 0\)) we can only observe \(y^{0}\). The evaluation problem is therefore a missing data problem that must be tackled with proper econometric techniques. Two-step selection models can be considered an appropriate evaluation method.

As a general case, potential outcomes are assumed to be related to a set of regressors x:

$$\begin{aligned}&y^{1}=\mu ^{1}+x\beta ^{1}+\epsilon ^{1} \end{aligned}$$
(2)
$$\begin{aligned}&y^{0}=\mu ^{0}+x\beta ^{0}+\epsilon ^{0} \end{aligned}$$
(3)

where the \(\mu \)s are parameters, x is the vector of explanatory variables, and the \(\epsilon \)s are idiosyncratic unobserved terms.

Substituting Eqs. 2 and 3 into Eq. 1 yields:

$$\begin{aligned} y=\mu ^{0}+\left( {\mu ^{1}-\mu ^{0}} \right) t+x\beta ^{0}+t\left( {x\beta ^{1}-x\beta ^{0}} \right) +\epsilon ^{0}+t\left( {\epsilon ^{1}-\epsilon ^{0}} \right) . \end{aligned}$$
(4)
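For readers who want the intermediate step, the substitution can be spelled out as follows. This is only a restatement of Eqs. 1 to 3 with the symbols already defined: each potential outcome is replaced by its linear form, and the terms multiplied by t are then collected.

$$\begin{aligned} y&=\left( {\mu ^{0}+x\beta ^{0}+\epsilon ^{0}} \right) +t\left[ {\left( {\mu ^{1}+x\beta ^{1}+\epsilon ^{1}} \right) -\left( {\mu ^{0}+x\beta ^{0}+\epsilon ^{0}} \right) } \right] \\&=\mu ^{0}+\left( {\mu ^{1}-\mu ^{0}} \right) t+x\beta ^{0}+t\left( {x\beta ^{1}-x\beta ^{0}} \right) +\epsilon ^{0}+t\left( {\epsilon ^{1}-\epsilon ^{0}} \right) . \end{aligned}$$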

By assuming treatment exogeneity, homogeneous treatment response (\(\beta ^{1}=\beta ^{0}=\beta \)) and ruling out the presence of unobserved heterogeneity (\(\epsilon ^{1}=\epsilon ^{0}=\epsilon \)), one could simply run an OLS regression of y on x and t:

$$\begin{aligned} y=\mu ^{0}+\left( {\mu ^{1}-\mu ^{0}} \right) t+x\beta +\epsilon . \end{aligned}$$
(5)

In this simple setting, the program impact corresponds to the estimate of \(\left( {\mu ^{1}-\mu ^{0}} \right) \). In our case, however, OLS will not be able to produce unbiased estimates of the impact of mobility on wages for several reasons.

First, in the absence of an exogenous source of variation in the incentive to migrate, our indicator of mobility is likely to be an endogenous variable. As individuals choose to migrate only if the expected benefits associated with this choice outweigh the costs of moving (Sjaastad 1962; Borjas 1987), the process of self-selection into migration needs to be tackled with alternative empirical strategies. Moreover, the migration literature has well documented the fact that individuals’ unobservable characteristics, such as personal abilities or attitudes toward risk, may affect both the decision to move abroad and the observed wages. For instance, one would expect to observe a higher propensity to migrate for individuals with lower risk aversion, and at the same time, less risk-averse individuals should also be observed in riskier jobs that pay higher wages. Unless selection is fully accounted for by observable variables, empirical methods that allow for unobservable heterogeneity are needed. Thus, we control for selection in a Mincerian-type regression by estimating a selection rule (with and without exclusion restrictions) that predicts whether a Ph.D. graduate migrates abroad.

Second, differences in wages between internationally mobile and non-mobile individuals may not be fully captured by a simple level shift \(\left( {\mu ^{1}-\mu ^{0}} \right) \). Instead, the effect of migration could vary across subpopulations of individuals, which would result in heterogeneous treatment bias (Heckman and Vytlacil 1998; Heckman et al. 2006). We accommodate this heterogeneity in individual responses to the same treatment within a Heckman selection model. In particular, we assume that individuals with higher scientific productivity are likely to be better informed about research funding and job opportunities abroad, and hence about the (potential) costs and benefits associated with the choice of migration. As a result, they could ultimately obtain higher wages. To control for this possibility, we let the treatment impact vary according to a set of variables related to the scientific productivity of individuals. Moreover, younger Ph.D.s may benefit more from mobility because having obtained the Ph.D. at a younger age is often perceived as a signal of effectiveness and commitment, which in turn can be rewarded with higher wages in the labor market. At the same time, being younger is also associated with a higher propensity to migrate; thus, we let the treatment indicator interact with age at Ph.D. For completeness, we also use gender and father’s education to capture other dimensions along which the heterogeneous treatment bias could operate.

A third possible shortcoming is related to the fact that individuals self-select into employment. As commonly pointed out in many empirical studies, if wages are only observed for individuals who actually have a job, then sample selection bias arises. However, in our study, we believe that this source of bias should play only a minor role because more than 93.3 % of Ph.D.s report having a job at the time of the interview.Footnote 7 Moreover, among those without a job, almost 30 % reported that they were waiting either to go back to their previous job or to start a new job/paid training program.Footnote 8 It follows that the fraction of unemployed Ph.D.s is very low. We therefore consider the bias associated with selection into jobs to be very low in our data and restrict the analysis to individuals holding a job.
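The first concern above, self-selection into migration on unobservables, can be illustrated with a small simulation. The sketch below is not part of the paper's analysis: the data-generating process, the parameter values and the function name `ols_migration_coef` are all illustrative assumptions. It regresses a simulated wage on a constant, one covariate and the mobility dummy, as in Eq. 5, and returns the OLS coefficient on the dummy for a chosen correlation between the selection error and the wage error.

```python
import numpy as np


def ols_migration_coef(rho, n=20000, effect=0.10, seed=0):
    """OLS coefficient on t from regressing y on (1, x, t).

    rho is the assumed correlation between the selection error u and the
    wage error eps; rho = 0 corresponds to exogenous migration.
    All numbers are illustrative, not estimates from the paper.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    u = rng.normal(size=n)
    # Wage error with unit variance and corr(u, eps) = rho.
    eps = rho * u + np.sqrt(1.0 - rho**2) * rng.normal(size=n)
    # Selection rule: migrate when the latent net utility is positive.
    t = (0.5 * x + u > 0).astype(float)
    # Homogeneous-response wage equation with a true shift equal to `effect`.
    y = 1.0 + 0.3 * x + effect * t + eps
    X = np.column_stack([np.ones(n), x, t])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[2]


if __name__ == "__main__":
    print("rho =  0.0:", ols_migration_coef(0.0))
    print("rho = -0.5:", ols_migration_coef(-0.5))
```

Under these assumptions, the coefficient should be close to the assumed effect when `rho` is zero and pushed well below it when `rho` is negative, which is the kind of bias the selection correction is meant to remove.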

In what follows, we sketch the strategies adopted in this study to tackle the issues outlined above. In particular, Sect. 3.1 describes the approach to modeling the endogeneity of the migration variable, while Sect. 3.2 deals with the possible heterogeneity in treatment response.

3.1 Endogenous dummy variable model

As a first step, we still assume homogeneous treatment response (\(\beta ^{1}=\beta ^{0}\)), but we acknowledge both the endogeneity of the migration variable and its dichotomous nature by implementing an endogenous binary treatment version of Heckman’s (1976, 1979) two-step model.Footnote 9 In this context, the observed binary treatment is assumed to depend on an unobservable latent variable \(t^{*}\) that is linearly related to a set of covariates, w, so that:

$$\begin{aligned} {t^{*}} = {w\gamma +u} \end{aligned}$$
(6)

and

$$\begin{aligned} t=\left\{ {{\begin{array}{ll} {1,}&{}\quad {\hbox {if}~t^{*}>0} \\ {0,}&{}\quad {\hbox {otherwise}} \\ \end{array} }} \right. , \end{aligned}$$
(7)

where Eq. 7 represents the selection (or behavioral) rule. In particular, the unobserved latent variable can be interpreted as the net value or utility that each individual attaches to the choice of treatment. Individuals choose to migrate only if this value is positive.

We estimate the model in Eqs. 5, 6 and 7 within the two-step framework derived in Maddala (1986). The two-step estimator relies on the assumption that the unobserved heterogeneity is fully captured by the correlation structure between u and \(\epsilon \left( =\epsilon ^{1}=\epsilon ^{0}\right) \), i.e., between the unobservables that affect t and the unobservables that affect y. In particular, u and \(\epsilon \) are assumed to follow a bivariate normal distribution with correlation coefficient \(\rho \). Estimation proceeds as follows. First, we obtain probit estimates of the form:

$$\begin{aligned} Pr\left( {t=1|w} \right) =\varPhi \left( {w\gamma } \right) \end{aligned}$$

and, then, we recover the hazard h for each observation according to the formula:

$$\begin{aligned} h^{t}=\left\{ {{\begin{array}{ll} {\phi \left( {w\hat{\gamma }} \right) /\varPhi \left( {w\hat{\gamma }} \right) }&{}\quad {t=1} \\ {-\phi \left( {w\hat{\gamma }} \right) /\left\{ {1-\varPhi \left( {w\hat{\gamma }} \right) } \right\} }&{}\quad {t=0} \\ \end{array} }} \right. , \end{aligned}$$

where \(\phi \) and \(\varPhi \) are, respectively, the standard normal probability density function and cumulative distribution function.
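As a minimal sketch of the hazard formula, the function below evaluates \(h^{t}\) for one observation given its estimated probit index \(w\hat{\gamma }\) and its treatment status. The function name `hazard` and its argument names are illustrative choices, not taken from the paper; only the Python standard library is used.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def hazard(index, t):
    """Selection hazard h^t evaluated at the probit index w * gamma_hat.

    index: estimated linear index from the first-step probit.
    t:     1 for migrants, 0 for non-migrants.
    """
    pdf = _STD_NORMAL.pdf(index)
    cdf = _STD_NORMAL.cdf(index)
    if t == 1:
        return pdf / cdf          # phi / Phi for the treated
    return -pdf / (1.0 - cdf)     # -phi / (1 - Phi) for the untreated


if __name__ == "__main__":
    for idx in (-1.0, 0.0, 1.0):
        print(idx, hazard(idx, 1), hazard(idx, 0))
```

The hazard is positive for migrants and negative for non-migrants, and for migrants it shrinks as the index grows, that is, as migration becomes more predictable from observables.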

The estimates of \(\beta \) and \(\left( {\mu ^{1}-\mu ^{0}} \right) \) are then obtained with OLS by augmenting the regression equation with h:

$$\begin{aligned} y=\mu ^{0}+x\beta +\left( {\mu ^{1}-\mu ^{0}} \right) t+\rho \sigma h^{t}+\epsilon . \end{aligned}$$
(8)

For the model to be identified, it is not necessary that the vector w includes additional variables relative to the covariates already included in x, because the hazard is a nonlinear function of w. Nevertheless, in separate regressions, we use two exclusion restrictions, which are discussed in more detail in Sect. 3.2.1.

The parameter associated with the hazard is the product of the standard deviation parameter (\(\sigma \), which is always positive) and the correlation coefficient (\(\rho \)) between the error terms of the two equations; thus, it is informative about the strength of the unobserved heterogeneity in our data. \(\rho >0\) implies positive selection into migration: migrants possess unobservable traits that are also responsible for their higher observed wages. On the contrary, if \(\rho <0\), negative selection arises, meaning that low earners possess unobserved traits that also make them more likely to migrate.
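The two steps described above can be sketched on simulated data as follows. This is a hedged illustration rather than the authors' implementation: the data-generating process, the excluded variable `z`, the parameter values and the helper names (`fit_probit`, `two_step`, `simulate`) are all assumptions, and the probit is fitted with a plain Fisher-scoring loop instead of a statistical package.

```python
import math

import numpy as np

_erf = np.vectorize(math.erf)


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + _erf(z / math.sqrt(2.0)))


def norm_pdf(z):
    """Standard normal probability density function."""
    return np.exp(-0.5 * z**2) / math.sqrt(2.0 * math.pi)


def fit_probit(W, t, iters=25):
    """Probit coefficients obtained by Fisher scoring."""
    gamma = np.zeros(W.shape[1])
    for _ in range(iters):
        idx = W @ gamma
        cdf = np.clip(norm_cdf(idx), 1e-10, 1.0 - 1e-10)
        pdf = norm_pdf(idx)
        score = W.T @ (pdf * (t - cdf) / (cdf * (1.0 - cdf)))
        info = (W * (pdf**2 / (cdf * (1.0 - cdf)))[:, None]).T @ W
        gamma = gamma + np.linalg.solve(info, score)
    return gamma


def two_step(y, X, W, t):
    """OLS coefficients on [const, X, t, hazard], mirroring Eq. 8."""
    idx = W @ fit_probit(W, t)                      # step 1: probit index
    cdf = np.clip(norm_cdf(idx), 1e-10, 1.0 - 1e-10)
    pdf = norm_pdf(idx)
    h = np.where(t == 1, pdf / cdf, -pdf / (1.0 - cdf))
    Z = np.column_stack([np.ones(len(y)), X, t, h])  # step 2: augmented OLS
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return coef


def simulate(n=20000, rho=-0.5, effect=0.10, seed=1):
    """Illustrative data with negative selection (all values assumed)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    z = rng.normal(size=n)  # hypothetical variable excluded from the wage equation
    u = rng.normal(size=n)  # selection error, unit variance as in a probit
    eps = rho * u + math.sqrt(1.0 - rho**2) * rng.normal(size=n)
    t = (0.5 * x + z + u > 0).astype(float)
    y = 1.0 + 0.3 * x + effect * t + eps
    return y, x, np.column_stack([np.ones(n), x, z]), t


if __name__ == "__main__":
    y, x, W, t = simulate()
    const, beta, shift, rho_sigma = two_step(y, x, W, t)
    print("estimated shift:", shift, "estimated rho*sigma:", rho_sigma)
```

In this simulated setting the wage error has unit variance, so the coefficient on the hazard estimates \(\rho \sigma \) with \(\sigma =1\), and a negative value signals negative selection. The sketch returns point estimates only: the hazard is a generated regressor, so conventional OLS standard errors from the second step would not be valid without a correction.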

3.2 Treatment heterogeneity

The program evaluation literature has recently recognized that outcomes may respond differently to the same treatment (Heckman et al. 2006). This is related to the fact that the gains from program participation may vary according to individuals’ characteristics. To tackle this issue, in this section we allow for a more flexible model by taking into account the possible heterogeneity in treatment response \((\beta ^{1}\ne \beta ^{0})\) and by relaxing the hypothesis of limited unobserved heterogeneity \(\left( {\epsilon ^{1}\ne \epsilon ^{0}} \right) \). In particular, relaxing the latter hypothesis lets us separately estimate the correlations between each treatment status and wages. Unobserved characteristics may include the set of skills and abilities that contribute to an individual’s wage and his or her propensity to move abroad, which may be different in the subpopulations of migrants and non-migrants.

This variability in treatment response can be captured in a regression framework with the inclusion of the term \(\left( {x-\bar{x}} \right) \gamma t\), which is itself endogenous. Formally, we still rely on a Heckman two-step selection model where Eq. 8 can be reformulated as follows:Footnote 10

$$\begin{aligned} y= & {} \mu ^{0}+x\beta +\left( {\mu ^{1}-\mu ^{0}} \right) t+\left( {x-\bar{x}} \right) \gamma t+\rho ^{1}\sigma t\frac{\phi \left( {w\hat{\gamma }} \right) }{\varPhi \left( {w\hat{\gamma }} \right) }\nonumber \\&+\,\rho ^{0}\sigma \left( {1-t} \right) \frac{\phi \left( {w\hat{\gamma }} \right) }{1-\varPhi \left( {w\hat{\gamma }} \right) }+v, \end{aligned}$$
(9)

where \(\rho ^{1}\) and \(\rho ^{0}\) are the correlations between each treatment status and wages, while v is the error term. As explained in Sect. 3, we allow the heterogeneity to be related to indicators of scientific productivity, age, gender and father’s education.
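To make the structure of Eq. 9 concrete, the sketch below assembles its right-hand-side regressors from a covariate matrix, the treatment dummy and a first-step probit index. The function name `heterogeneous_design` and the use of the full-sample mean for \(\bar{x}\) are assumptions made for illustration; the paper does not spell out these implementation details.

```python
import math

import numpy as np


def heterogeneous_design(X, t, index):
    """Regressor matrix for Eq. 9.

    X:     (n, k) covariates whose effect may differ by treatment status.
    t:     (n,) treatment dummy (1 = migrant).
    index: (n,) first-step probit index w * gamma_hat.
    Columns: const, X, t, (X - X_bar) * t, t*phi/Phi, (1 - t)*phi/(1 - Phi).
    """
    X = np.asarray(X, dtype=float)
    t = np.asarray(t, dtype=float)
    index = np.asarray(index, dtype=float)
    pdf = np.exp(-0.5 * index**2) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(index / math.sqrt(2.0)))
    # Demeaning (here with the full-sample mean, an assumption) keeps the
    # coefficient on t interpretable as the effect at average covariates.
    interactions = (X - X.mean(axis=0)) * t[:, None]
    sel_treated = t * pdf / cdf                    # multiplies rho^1 * sigma
    sel_untreated = (1.0 - t) * pdf / (1.0 - cdf)  # multiplies rho^0 * sigma
    return np.column_stack(
        [np.ones(len(t)), X, t, interactions, sel_treated, sel_untreated]
    )


if __name__ == "__main__":
    D = heterogeneous_design([[1.0], [3.0]], [1, 0], [0.0, 0.0])
    print(D)
```

Regressing wages on these columns by OLS would give separate coefficients on the two selection terms, which is what allows the correlations for migrants and non-migrants to differ.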

Summing up, by modeling the endogeneity of the migration variable and the possible heterogeneity in treatment response, we expect our results to provide clear evidence of the effect of international mobility on the early wages of Ph.D. graduates. Nevertheless, as a robustness check, we also apply an instrumental variable approach. The choice of instruments is discussed in the next subsection.

3.2.1 The choice of instruments

The key variable of interest in this study is the indicator of international mobility, which is clearly an endogenous variable as it is the outcome of a utility maximization problem.

We searched for convincing instruments both within the data and using external data sources. First, the survey conveys information on previous spatial mobility behavior, as respondents were asked to report both whether they spent at least one month abroad for training purposes during the Ph.D. and whether they had changed city to attend the Ph.D. Second, as an external data source, we selected statistics compiled by ISTAT at the NUTS 3 level and chose the number of firm closures as a proxy of employment opportunities in the area of study.

Previous mobility patterns have been widely used in related studies as predictors of future mobility. For instance, Abreu et al. (2015) use migration to attend university and migration after graduation as instruments for post-graduation moves and inter-industry mobility for a sample of UK graduates. In our case, since our focus is on international mobility, we prefer to include past inter-regional mobility to attend the Ph.D. in both equations and use the dummy on foreign training during the Ph.D. as an exclusion restriction. Having spent time abroad during the Ph.D. can in principle be associated with future mobility. During periods abroad, individuals may lower the psychological cost of being mobile, acquire proficiency in a foreign languageFootnote 11 and increase their knowledge of possible future destinations. Thus, this instrument may contain sufficient information to predict future mobility. Nevertheless, it will be a valid instrument only if periods of training abroad in a student’s curriculum can be effectively excluded from the wage regression. From this point of view, we cannot totally rule out the possibility that having undertaken training abroad influences subsequent wages. However, we can argue, first, that in most cases, training programs for Ph.D.s are directed at acquiring specific skills. For instance, summer and winter schools deal with specific topics usually related to new areas of research. Since we observe Ph.D.s three and five years after graduation, the value of those skills has probably depreciated and, thus, is less related to current wages. Moreover, to the extent that wages are advertised as in models of wage posting, our instrumental variable should not play a critical role in wage determination. For instance, models of directed search as in Moen (1997) or Shimer (2005) typically assume that there is wage posting, and empirical evidence of this mechanism can be found in Hall and Krueger (2010) for the USA and Brenzel et al. (2013) for Germany. In both studies, the authors report that two-thirds of hirings are characterized by wage posting. In addition, since we control for both university fixed effects and the type of degree awarded, we do not expect past mobility to have a direct effect on wages, especially three and five years after graduation. Finally, even if we are well aware that causality emerges in multivariate settings, in Fig. 1 we offer a graphical representation of the kernel distributions of wages broken down both by previous experience abroad and by treatment status. In particular, panel (a) shows that the wage distribution of Ph.D.s who received some form of training abroad largely overlaps with that of Ph.D.s who did not spend time abroad for training. On the contrary, in panel (b), it is evident that the wage distribution of migrants is noticeably different from the wage distribution of non-migrants.

Fig. 1 Kernel densities of wage distributions by previous migration experience (a) and treatment regime (b)

Our second instrument is the number of firms that shut down in the province where the Ph.D. was attended. Firm closings directly affect the opportunities for receiving job offers and could act as an instrument for mobility because they are not expected to affect the post-move wage of doctorate holders. It turns out that this instrument performs well only when province fixed effects are excluded from the migration equation. This is reasonable, as firm closings and province fixed effects are likely to be highly correlated. Nevertheless, we prefer to include the set of fixed effects to increase the likelihood that the indicator of past mobility captures the migration decision more effectively.

3.3 Data and summary statistics

To carry out our empirical analysis, we use information contained in a survey administeredFootnote 12 by the Italian National Institute of Statistics (ISTAT) on Italian Ph.D. graduates. In particular, the survey was conducted between December 2009 and February 2010 with the aim of gathering information on the labor market entry conditions of two cohorts of students who received a doctorate degree from an Italian university in 2004 and 2006, respectively. An important remark is that the survey was directed to the universe of Ph.D.s. In detail, on a population of 18,568 Ph.D.s (8443 in 2004 and 10,125 in 2006), almost 13,000 interviews were completed (5689 doctors in 2004 and 7275 in 2006), with an overall response rate of about 70 %.Footnote 13 Nevertheless, from the original population, we lose around 2000 observations because of missing information on wages, hours worked and other job characteristics. The questionnaire contains five sections (curriculum, job characteristics, job search activities, international mobility and family background) from which we extract the information needed for the empirical part.

Table 1 reports descriptive statistics for the variables used in the analysis for all observations included in the regressions. The statistics are further broken down by migrant status. From a simple exploratory analysis, we first notice that only around 7 % of Ph.D. graduates are internationally mobile. In particular, of those who choose this path, 38 % migrate to the USA, 16 % to France and 16 % to the UK. Germany attracts 10 % of mobile Ph.D.s and Spain only 6 %. The remaining 14 % choose a destination either in other European countries or in other continents. We also notice a remarkable difference in the gender composition of migrants and non-migrants, with males accounting for around 58 % of internationally mobile graduates, even though they are around 48 % of the total population.

Migrants earn around 14 % more than non-migrants, are more likely to report both previous experiences abroad and past inter-regional mobility and are younger than non-migrants. Surprisingly, only a few differences emerge between the two groups in terms of field of study, with the exception of physics (whose graduates are more represented among migrants) and medicine (whose graduates tend to be less mobile).

Table 1 Summary statistics

Since the econometric model is made up of two parts, not all the variables listed in the table appear in both equations. In particular, the migration equation includes a subset of the control variables that appear in the wage equation. In some specifications, two exclusion restrictions are included to help identify the propensity to migrate and, consequently, the self-selection correction terms. Among the controls of the migration equation, age at Ph.D. is added to capture the idea that the probability of migration declines with age (Schwartz 1976). As long as migration is thought of as a form of human capital investment, net gains to migration depend on the time horizon left to benefit from the advantage of moving. Thus, net gains are ultimately related to the age of migrants. Since there is a large consensus suggesting that previous migration experiences are correlated with subsequent spatial mobility (Kodrzycki 2001), individual attitude toward mobility is proxied by a dummy equal to one if graduates had changed the city to attend their Ph.D. University-to-Ph.D. mobility is also very low, since more than 87 % of individuals attended the Ph.D. in the same university where they graduated. As already pointed out, we prefer not to use this variable as an exclusion restriction because it pertains more to inter-regional mobility. Thus, this variable is also included in the wage regression to control for the potential increasing knowledge of local labor market characteristics gained through regional mobility patterns. Whether or not Ph.D.s graduated on timeFootnote 14 is also included in the model to capture students’ commitment, ambition and motivation. In line with theoretical models and previous empirical research, expected wages are an important factor in shaping the decision to migrate; thus, we include a destination-to-origin wage ratio. 
In particular, as we lack an external source of data to construct a variable able to account for relative differences in wages between destinations and origins, we rely on information contained in our data. We use the reported net monthly wage and the weights constructed by ISTAT to build a proxy for average wages for each destination and origin location.Footnote 15 While for Italy we are able to construct average wages at the NUTS 3 level, at the international level we are only able to identify the following destinations: France, Germany, UK, Spain, USA, other EU countries, rest of the world. Alternatively, we could have computed an average wage for Italy, so that the wage ratio for Ph.D.s working in Italy would always take the value of 1. Unreported estimates based on these wage ratios do not differ from the ones presented in the paper.
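
As a rough illustration of the construction described above, the following Python sketch computes survey-weighted average wages by location and the resulting destination-to-origin ratio. The record layout, the field names (`location`, `wage`, `weight`) and all numbers are hypothetical and are not taken from the ISTAT data; the actual procedure may differ in its details.

```python
from collections import defaultdict

def weighted_mean_wage_by_location(records):
    """Survey-weighted mean wage for each work location."""
    weighted_sum = defaultdict(float)
    weight_total = defaultdict(float)
    for rec in records:
        weighted_sum[rec["location"]] += rec["weight"] * rec["wage"]
        weight_total[rec["location"]] += rec["weight"]
    return {loc: weighted_sum[loc] / weight_total[loc] for loc in weighted_sum}

def wage_ratio(destination, origin, mean_wage):
    """Destination-to-origin ratio of average wages."""
    return mean_wage[destination] / mean_wage[origin]

# Hypothetical records: net monthly wage (euros) and sampling weight.
records = [
    {"location": "Lecce", "wage": 1200.0, "weight": 2.0},
    {"location": "Lecce", "wage": 1400.0, "weight": 2.0},
    {"location": "Germany", "wage": 2000.0, "weight": 1.0},
    {"location": "Germany", "wage": 2600.0, "weight": 3.0},
]

mean_wage = weighted_mean_wage_by_location(records)
ratio = wage_ratio("Germany", "Lecce", mean_wage)
print(f"destination-to-origin wage ratio: {ratio:.3f}")
```

In this sketch, an individual who works in the location of origin gets a ratio of exactly 1, while ratios above 1 indicate destinations with higher average wages than the origin.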

As further controls, the probit model includes the Ph.D. field of study, parents’ education and the high school degree. Finally, we add province-specific intercepts. Since, with the exception of a few provinces, there is usually only one university in each province, these province fixed effects should also capture differences in the academia of origin.

The second equation in the empirical model is a Mincerian-type wage regression that accounts for self-selection into migration. Here, the set of control variables is richer, as it adds many job-related variables to those of the first equation. In particular, we are able to control for general economic sectors (industry, services and agriculture), contract type (permanent, temporary, postdoc, others), job type (waged and non-waged employment) and job access (public competition, employer knowledge, resumes to employers, others). Moreover, we include two variables (job with R&D and job with teaching) that indicate whether individuals are fully, partially or not involved in R&D and teaching activities. Finally, respondents were asked to report whether professors or other acquaintances acquired during the Ph.D. were helpful in finding a job. In this way, we are able to proxy for network effects.

In addition, the survey delivers some information on scientific productivity in terms of published articles, monographs and conferences.Footnote 16 Scientific productivity is very high: more than 68 % of respondents report having published more than 3 journal articles. This percentage increases for internationally mobile Ph.D.s (74 %). By contrast, around 64 % of graduates have not produced monographs, independently of their migration status. Finally, we detect large differences in participation in national and international conferences. Overall, more than 53 % attended more than three conferences, and this percentage rises to 67.5 % for mobile Ph.D.s. Thus, we use these variables to partially account for ability in conducting research.

Given the large number of explanatory variables, we check the degree of multicollinearity with the variance inflation factors (VIF). In detail, the square root of the VIF indicates how much larger the standard error is compared to what it would have been if that variable were uncorrelated with the other independent variables.

In our analysis, all of the VIFs are lower than 10 (many are lower than 2) and the mean VIF is 2.12. Thus, since all of the VIFs are relatively low, we can be confident that multicollinearity is not an issue for our analysis.
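
For readers who wish to reproduce this kind of diagnostic, the sketch below computes VIFs as the diagonal of the inverse correlation matrix of the regressors, which is equivalent to 1/(1 − R_j²) from regressing each variable on the others. The data are made up and the helper names are illustrative; this is not the code used for the analysis.

```python
import math

def correlation_matrix(columns):
    """Pearson correlation matrix of equal-length data columns."""
    n = len(columns[0])
    centered = []
    for col in columns:
        mean = sum(col) / n
        centered.append([v - mean for v in col])
    norms = [math.sqrt(sum(v * v for v in col)) for col in centered]
    k = len(columns)
    return [
        [sum(a * b for a, b in zip(centered[i], centered[j])) / (norms[i] * norms[j])
         for j in range(k)]
        for i in range(k)
    ]

def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    k = len(matrix)
    # Augment the matrix with the identity.
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(k)]
           for i, row in enumerate(matrix)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]

def variance_inflation_factors(columns):
    """VIF_j = 1 / (1 - R_j^2), read off the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]

# Made-up regressors: x2 closely tracks x1, x3 is nearly unrelated to both.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [1.1, 1.9, 3.2, 3.8, 5.1, 6.0]
x3 = [0.5, -0.5, -0.5, 0.5, 0.5, -0.5]

vifs = variance_inflation_factors([x1, x2, x3])
# The square root of each VIF is the factor by which the standard error is inflated.
se_inflation = [math.sqrt(v) for v in vifs]
for name, vif, infl in zip(["x1", "x2", "x3"], vifs, se_inflation):
    print(f"{name}: VIF = {vif:.2f}, standard error inflated by {infl:.2f}x")
```

In this toy example, the two nearly collinear regressors have VIFs far above the conventional threshold of 10, while the third stays close to 1.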

4 Results

Table 2 reports the results from simple OLS estimates (column a) and the endogenous dummy variable model detailed in Sect. 3.1. In particular, the results of the wage equation and the probit model are listed in column b). Column c) refers to similar estimates, with the only difference that the migration equation also includes province fixed effects. In what follows, we start with a discussion of the migration equation and then move to the analysis of the wage equation, which we also compare with the OLS estimates.

Firstly, the destination-to-origin wage ratio is always significant and positively predicts migration status. We observed that when this variable was left out of the analysis, the predictive power of the model dropped significantly. Consistent with what the theory predicts (Herzog et al. 1985), the wage differential between different labor markets is a crucial determinant of the migration decision, as it is a measure of differential returns to the investment in human capital.

In line with previous research, past mobility is also a good predictor of subsequent mobility. The coefficient is always significant at conventional levels and has a positive sign. As expected from the inspection of the summary statistics, the propensity to migrate is appreciably lower for females.

Table 2 OLS and endogenous treatment effects

Although it is common to find an unexplained gender penalty in the propensity to migrate, which is usually interpreted as evidence of stronger family ties for females and of men being more attached to their careers, the literature also offers examples of different patterns. For instance, Faggian et al. (2007a) find that UK female university graduates are more migratory than men. Age is also crucial when explaining mobility patterns. Compared to the baseline category (being younger than 30), all the coefficients have a negative sign and their magnitude is increasing in age, even though only the last category (being 33 or more) is statistically different from zero. Although not explicitly reported, fields of study turn out to be a relatively poor predictor in the migration equation, as do parents’ education, high school field of study and whether or not the Ph.D. was earned on time. Similar results are found for the UK in Belfield and Morris (1999).

Overall, the fit of the selection equation seems reasonable, with a pseudo R-square of 60.34 % when province fixed effects are not included and 65.26 % if fixed effects are included, and thus, the inclusion of province fixed effects produces only a small improvement of the model fit. Moreover, by inspecting the list of coefficients (which are available upon request), we notice that there is only a slight tendency of an increasing propensity to migrate from northern provinces (and thus universities) and a decreasing propensity to migrate associated with some center and southern provinces, with the exception of Rome.

We now turn to the discussion of the wage equation, in which the natural logarithm of post-move hourly wages was regressed against the indicator of international mobility, individual characteristics, job characteristics, family background, academic background and a full set of origin and destination fixed effects. Interestingly, OLS and two-stage regressions deliver very similar estimates for most of our control variables in terms of magnitudes, signs and significance levels, with the important exception of our endogenous regressor. Indeed, while the estimated impact of international mobility is insignificant in the OLS regression, once we control for its endogeneity, it becomes strongly significant. Quantitatively, migration is associated with an increment in log hourly wages of 0.33 (in the probit model without province fixed effects) and 0.372 (when the selection equation also includes province fixed effects), which is equivalent to an increase of 1.39 and 1.45 in hourly wages. From the original data, we know that, on average, monthly hours worked are around 150; thus, there is a monthly wage gain of around 210 euros. Even though this result seems to support the idea that, through mobility, individuals move to geographic areas in which their investment in human capital is better rewarded, doctorate holders could also benefit from migration in terms of career prospects and other dimensions related to job satisfaction. We believe that future research and data collection should also take these important aspects into account.
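
The arithmetic behind these figures can be checked with a few lines of Python. The sketch assumes, as the monthly figure in the text suggests, that the quantities 1.39 and 1.45 are obtained by exponentiating the two coefficients and are then multiplied by 150 hours. For comparison, it also reports the percent reading 100·(exp(β) − 1), which is the usual interpretation of a dummy coefficient in a log-wage equation. Variable names are illustrative.

```python
import math

MONTHLY_HOURS = 150  # average monthly hours worked, as reported in the text

# Estimated coefficients on international mobility (log hourly wage scale).
coefficients = {"no province FE": 0.33, "province FE": 0.372}

results = {}
for label, beta in coefficients.items():
    factor = math.exp(beta)  # exponentiated coefficient
    results[label] = {
        "exp_beta": round(factor, 2),
        "percent_change": round(100 * (factor - 1), 1),
        "times_monthly_hours": round(factor * MONTHLY_HOURS, 1),
    }

for label, values in results.items():
    print(label, values)
```

Under this reading, both specifications give a product with monthly hours in the neighborhood of the 210 euros quoted above.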

The second interesting result from the analysis concerns the selection mechanism. As expected, the coefficient of the selection correction term is highly significant; thus, we reject the null hypothesis that the error terms of the migration and wage equations are uncorrelated. Surprisingly, we find that unobservables that lower observed wages are also related to a higher propensity to migrate, i.e., there is negative selection. Up to now, only a few authors have addressed selection within the group of high-skilled migrants. Very recently, Parey et al. (2015) have measured selection on observables of high-skilled migrants using survey data on German university graduates and found that migrants to more equal countries, such as Denmark, are negatively selected compared to non-migrants. In our case, interpreting negative selection on unobservables can be challenging. Indeed, if unobservables were mainly related to ability factors, negative selection would imply that graduate migration is a process that relocates less able Ph.D.s abroad. This is of course a possibility, but it is difficult to reconcile this result with the fact that most of our indicators of scientific productivity turned out to be insignificant (and these variables should be at least partially correlated with ability). Thus, we cannot exclude a priori that our set of observable characteristics is incomplete and that other unobservable traits drive the result of negative selection.

Regression results also suggest that many other observed characteristics are important determinants of wages. As expected, we find evidence of a gender gap in favor of men, though small in magnitude, while neither age nor past inter-regional mobility plays a substantial role. Fields of study, even if not important predictors in the migration equation, have some significant impact in the wage equation. In particular, compared to the baseline category of math and computer science, we find that Ph.D. graduates in physics, medicine, industrial engineering and law have, on average, higher wages.Footnote 17 Interestingly, we find that while having completed the Ph.D. on time is not significant in the migration equation, it has a positive impact on wages. Very often, indeed, during the screening of job candidates, the fact that individuals complete each stage of their education on time is interpreted as a signal of efficiency and commitment.

Results on job-related characteristics reveal that, compared to waged employment, non-waged employment is associated with lower wages. Unexpectedly, we find that higher wages are associated with both temporary workers and jobs in which individuals are partially or not involved in R&D activities. These results are clearly counterintuitive, as one would expect higher wages both for permanent and for R&D workers. Nevertheless, in unreported estimates in which the dependent variable is the monthly wage rather than the hourly wage, we find the expected signs. Thus, the results could be simply driven by differences in working hours.

Table 3 Heterogeneous treatment effect

We also find lower wages associated with job access channels other than public competitions. This result probably corroborates the negative association between wages and the variable that captures the help in finding a job received from professors and other acquaintances acquired during the Ph.D. Probably, network effects are stronger in terms of the probability of finding a job (Calvo-Armengol and Jackson 2004), but penalize the job match quality. Alternatively, it might be the case that those who do not need help in finding a job are also the more talented, who are better paid.

Table 3 summarizes the results from the model that tries to capture potential heterogeneous responses to treatment. In particular, heterogeneity has been assumed to depend on factors related to individual productivity, which we proxy with published articles, monographs and conferences. We choose these variables because they are more likely to capture aspects of individual ability. We also use age, gender and father’s education to enrich the way heterogeneity is modeled. As shown in Eq. 9, when we look for heterogeneous responses to treatment, the wage equation is augmented with a set of further endogenous variables which are consistently estimated in a two-step procedure. Except for two out of eleven coefficients significant at the 5 % level, the results reject the presence of heterogeneous responses to treatment. At the same time, the coefficients of the control variables do not exhibit appreciable differences when compared to the estimated coefficients in Table 2. Finally, the impact of geographic mobility is close to 0.3, as in previous estimates.

Since we do not find evidence of treatment heterogeneity, we base our robustness and sensitivity analyses on the model presented in Sect. 3.1 with the inclusion of province fixed effects in the migration equation.

4.1 Robustness and sensitivity

In this section, we carry out a robustness check, which is based on the two exclusion restrictions described in Sect. 3.2.1, and a sensitivity analysis of the coefficients’ stability by repeating the estimation exercise on selected subpopulations.

Table 4 summarizes the results of the robustness check. In particular, we apply the IV framework both to the binary endogenous treatment model (panels a and b) and the model with heterogeneous treatment effect (panels c and d). Of the two instruments, only the variable “training abroad” performs well in terms of the significance level, even though in unreported estimates where province fixed effects were not included in the first stage, our second instrument was also significant at conventional levels. A joint test of significance of the two instruments returns a statistic higher than 60 in both models, which is far above the threshold of 10 often invoked in empirical studies for detecting a problem of weak instruments (Stock and Yogo 2005). The use of previous migratory experiences is very common in the migration literature because it is often highly correlated with subsequent observed mobility. As pointed out in Abreu et al. (2015), after controlling for the university attended and the degree awarded, we would not expect past mobility to have a direct effect on wages, especially 3 and 5 years after graduation.

Table 4 Instrumental variables

Thus, we believe that these estimates are reliable enough to perform a robustness check. In particular, the IV strategy produces a small gain in terms of model fit (the adjusted R-squared values in the probit model are around 1.2 percentage points higher) and the estimated impact of international mobility is similar to that already discussed in the previous section. Estimated coefficients in both equations are also close to previous estimates, and again, negative selection of migrants is confirmed.

Table 5 reports the results aimed at checking the sensitivity of the magnitude of the coefficients when estimates are carried out on selected subpopulations.

Table 5 Endogenous treatment effects: subpopulations

The first check is intended to exclude from the analysis those observations for which the probability of being a migrant is too close to extreme values. Indeed, a critical assumption implicit in our empirical model is the joint normality of the error terms. In practice, this assumption requires that the population under consideration does not contain one or more groups of individuals with extreme behavioral tendencies concerning both program participation and the outcome of interest. Given the nature of our problem, we believe that two cases are the most likely to affect the data. In particular, if a group of Ph.D.s has one or more unobserved features such that they always (never) choose an international migration pattern and are systematically at the right (left) tail of the log hourly wage distribution, then the joint distribution of the error terms might not be unimodal. Heuristically, we use the predicted probabilities of migration to exclude from the analysis individuals whose estimated probability is very close to 0 or 1. Specifically, we truncate the tails of the distribution at 1 %. Results from this check are reported in panel a). The estimated coefficient of the international mobility indicator is still positive, close to 0.3 and statistically significant.
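
A minimal sketch of such a trimming rule is given below. It assumes that the 1 % truncation refers to the 1st and 99th percentiles of the predicted probabilities; an alternative reading, with fixed cutoffs at 0.01 and 0.99, would only require replacing the two thresholds. The function name and the probabilities are illustrative and are not taken from the data.

```python
import statistics

def trim_extreme_propensities(probabilities, tail_percent=1):
    """Return indices of observations whose predicted probability lies
    between the lower and upper tail percentiles (inclusive)."""
    cuts = statistics.quantiles(probabilities, n=100)  # 99 percentile cut points
    lower = cuts[tail_percent - 1]        # e.g. 1st percentile
    upper = cuts[100 - tail_percent - 1]  # e.g. 99th percentile
    return [i for i, p in enumerate(probabilities) if lower <= p <= upper]

# Made-up predicted probabilities of migration, evenly spread over (0, 1).
predicted = [i / 1000 for i in range(1, 1000)]

kept = trim_extreme_propensities(predicted)
trimmed_sample = [predicted[i] for i in kept]
print(f"kept {len(kept)} of {len(predicted)} observations")
```

The model would then be re-estimated on the retained observations only.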

Along the same line of reasoning, we run a second sensitivity check on the subpopulation of Ph.D. graduates with some mobility related to work, either national or international, leaving out of the estimates those individuals who never moved from the province where the Ph.D. was obtained and for whom a lower estimated probability of moving abroad is likely to be observed. Also in this case, we confirm the presence of a wage gain associated with international mobility.

Panels (c–e) report the estimates of the impact of mobility when the population is restricted, respectively, to individuals employed in non-academic environments and to each of the two cohorts used in the analysis. Again, we find that in all these cases the estimated coefficients are stable and remain statistically significant at conventional levels. Finally, negative selection is confirmed by all the sensitivity checks.

5 Discussion and conclusions

In the last two decades, we have witnessed a marked growth of the higher education systems in most OECD and non-OECD countries, with a consequent growth in the number of Ph.D. programs and Ph.D. graduates. At the same time, with the contraction of opportunities for academic employment and the establishment of quality-selective immigration policies, the skill composition of migration flows has changed sharply. Recent studies have documented that Ph.D.s’ employability is no longer restricted to academia (Ballarino and Colombo 2010), and while this is common in the USA, it is a novelty for European countries. With respect to the Italian case, the university system has clear difficulties in absorbing the flow of doctorate holders, and new aspects of their transition into the labor market have emerged, with doctorates flowing into the private sector and becoming internationally mobile. However, the literature on high-skilled migration has so far mainly focused on the determinants and the returns to spatial mobility of college and university graduates, mostly neglecting the equally important role of doctorate holders. In this respect, the present paper is an attempt to fill this gap by examining the topic of international mobility of Ph.D.s and their earnings performance after graduation.

The analysis takes advantage of a population-based survey of two cohorts of Italian Ph.D.s conducted by ISTAT between December 2009 and February 2010. Through an endogenous dummy variable model with self-selection correction, we show that international mobility is associated with higher wages. In detail, the contribution of this paper is the estimation of a Mincerian-type wage equation that tries to capture the monetary returns to international mobility, allowing for both unobserved individual heterogeneity and heterogeneous response of individual earnings to the migration path. In doing so, we also examine the extent to which such mobility is driven by the university in which the Ph.D. was granted, personal and job characteristics, as well as the field of study. Consistent with previous research, we also investigate potential gender differences both in terms of the propensity to migrate and mobility payoffs.

While, in general, research on the returns to migration has not reached a consensus about whether migration increases post-move earnings (Herzog et al. 1993), the literature that specifically focuses on graduate migration (despite the limited number of available studies) has so far documented a wage premium for movers (Di Cintio and Grassi 2013). The results in this paper, even if not completely comparable with existing studies, seem to go in the same direction. When looking at highly educated individuals, who are endowed with high levels of human capital, spatial mobility represents a strategy to increase the chances of capitalizing on the investment in education. The paper also shows that Ph.D. migrants are a negatively self-selected group based on unobservable traits. Nevertheless, as our set of observable characteristics is limited, we cannot fully relate the selection mechanism exclusively to ability factors, making it difficult to rule out the possibility that other unobservable traits drive the result of negative selection.

The findings of this study should be considered in light of some limitations. From a methodological perspective, modeling the international mobility choice in a multinomial context would allow both a finer representation of the location choice and a more specific correction for selectivity, as in Lee (1983) and Dahl (2002). Unfortunately, in our data some of the observed destinations are not disaggregated at the country level. Also, the data preclude a longitudinal analysis that could reveal important aspects of the persistence of wage gains associated with international mobility and that could be useful to track career advancements and job mobility, as well as to assess the short- and long-term impacts of migration. We suggest that further studies consider these and other unexplored questions that we left out of the analysis. For instance, contributions to current knowledge could focus on the impact of migration on job satisfaction and the assimilation of mobile Ph.D.s in foreign countries. With particular attention to Ph.D.s who remain in academia, mobility can also be studied as a way to get into better-ranked universities to benefit from their reputation and research funding opportunities. Also, as Ph.D.s usually maintain network relations in their country of origin, it would be interesting to understand the extent of possible spillover effects.

Moving from the analysis to policy advice, the paper suggests that the private nature of the migration decision and earning rewards of migrants should underpin the rationale for public interventions. The design of incentive schemes to retain human capital within a country needs to account for the individual gains associated with the choice of migration, along with increasing research funds and opportunities. Also, the lack of absorptive capacity in academic institutions needs to be explicitly recognized by policy makers and actively addressed in terms of university career prospects and employment opportunities both in the private and in the public sectors. In this respect, this paper calls for a new placement model for Ph.D.s, particularly based on their inclusion in companies that focus on research and innovation. Indeed, in a publicly funded education system, the migration of the highly educated represents a loss in terms of both the educational investment and the potential economic and cultural growth.