1 Introduction

International migration is a selective process. Human capital models of migration claim that those who choose to leave a country might be more able and/or more motivated than those who choose to stay in their home country (see Chiswick 1999). If this is the case, then immigrants are said to be positively selected compared to the home population. Borjas (1987, 1991), however, showed that immigrants coming from a country with a more unequal wage distribution than that prevailing in the host country may be negatively selected. In an extension of this work, Borjas and Bratsberg (1996) investigated the return migration of foreign-born individuals in the United States and showed how this may influence the type of self-selection characterising the migration flows. Dustmann (2003) studied the optimal length of stay abroad and return behaviour of temporary migrants in the framework of life-cycle analysis, while Dustmann and Kirchkamp (2002) looked at the activity choice of return migrants. Bauer et al. (2002), studying Portuguese immigrants in Germany, concluded that the German guest worker system succeeded in attracting immigrants who were positively self-selected in terms of unobservable characteristics relative to native German workers. Chiquiar and Hanson (2002) studied the performance of Mexican immigrants in the United States and compared them to non-migrant Mexicans. Using the semi-parametric approach of DiNardo et al. (1996), they rejected previous results found in a more descriptive literature that Mexican immigrants in the US tend to be negatively selected in terms of observable skills compared to the stayers.

Most of the literature on return migration deals with the issue of self-selection within the context of the host country. One exception, however, is Co et al. (2000), who studied the potential economic benefits generated by the returning Hungarian migrants after spending time abroad. They addressed two potential selection biases: one due to the decision to migrate vs to stay home and the second due to the decision to work vs not to work. Using the maximum likelihood framework, they found an earnings premium of 40% for women who have had a spell abroad, while for men, the foreign experience effect is insignificant.

The focus of this paper is similar to that of Co et al. (2000). Unlike them, however, we study the wage effects of return migration in Albania, comparing the performance of returnees to those who stayed in the home country. More specifically, we address the self-selection process of out- and then re-migration of the individuals who left the source country and then returned home, using the stayers (non-migrants) as the counterfactual. We address the following questions: (i) Had they chosen not to migrate, what would be the performance of return migrants compared to those who stayed? and (ii) What would be the performance of non-migrants had they decided to migrate and return?Footnote 1 To answer these questions, we use a sample of 694 Albanian individuals and two alternative methodologies: a selection model along the lines of Heckman (1979) and Lee (1982) and a semi-parametric approach proposed by DiNardo et al. (1996). The first approach allows us to address the questions directly but offers only mean conditional earnings, while the second allows us to study the effect of migration on the entire wage distribution.

Evidence suggests that a large number of migrants from Central and Eastern European Countries fall into the category of temporary (or guest) workers. For example, in Greece, amongst the Albanians who received a temporary white card in the regularisation programme in 1998, only 54% proceeded to the second phase of application 1 year later to obtain a permanent green card. In a survey conducted in Albania by the International Organisation for Migration (IOM) in 1992, 79% of respondents said they were ‘likely’ or ‘very likely’ to migrate for a few months, and only 24% wanted to settle permanently in another country (IOM 1995). Other evidence based on Eurobarometer shows that 50% of Albanians planned to emigrate for a short period only (see Papapanagos and Sanfey 2001).

Our paper adds to the limited literature on the analysis of return migrants in their home labour market in the context of a self-selection model. This is the first study of such an issue for Albania, a transition country most affected by migration. Furthermore, this is the first paper to use a semi-parametric kernel density approach to study the impact of return migration.

We find support for the negative self-selection of return migrants compared to the native non-migrant population (stayers). Our empirical results show that stayers would have performed much better than return migrants had they chosen to migrate. We argue that, for stayers, the decision not to migrate stems from the non-transferability of their current skills, owing to the language barrier, and from the low additional return to human capital in the host country. Interpreted within the framework of our model, these results support negative selection of the wave of return migrants relative to non-migrants. These results have potential implications for the migration policies of the host and the source countries.Footnote 2

The rest of the paper is organised as follows. A brief background on Albanian migration is presented in Section 2, while the theoretical model is discussed in Section 3. Section 4 describes the data set and the selection of variables. Section 5 presents the empirical methodology used to examine the issues raised in the theoretical model, while the empirical results are given and discussed in Section 6. Concluding remarks and potential policy implications appear in the last section.

2 Albanian migration: a brief background

Perhaps because of its central location in Europe and its relative poverty, Albania has long been a country of emigration. However, between 1945 and 1990, the state pursued a policy of social and economic isolation, totally restricting any movement of its citizens out of its borders. During the transition period, a large number of people, uncertain about the economic prospects of Albania, left the country. This was taking place against the backdrop of rapid and radical political change that had already begun elsewhere in Central and East European countries (CEEC) at the end of the 1980s. These events provided a further catalyst for change in Albania and helped to put in motion the organisational skills and energy of those who had been waiting for the right time to leave.

Precise figures on Albanian immigrants are difficult to gather due to the potentially high number of non-declared (illegal) individuals either settled or working for short periods in the host countries. For example, officially 4,300 Albanians were issued a residence permit in 1997 in Greece. But when the country adopted a regularisation programme (between November 1997 and May 1998) for undocumented immigrants, 239,000 Albanian immigrants applied (see OECD 2000). Hence, behind the official figures, there is a rather large number of undocumented migrants not only in Greece but also elsewhere in Europe, particularly in Italy. The Albanian Center for Economic Research (2002) estimates that at least 15% of the Albanian population is living abroad, which is by far the highest proportion amongst the Central and East European economies.

A gradual improvement of the economic situation of Albania took place until mid-1996, owing mainly to remittances and macroeconomic policies.Footnote 3 These factors lessened, to a certain extent, the major economic and social problems, which emerged as a result of high unemployment rates and big disparities in wealth. However, these “positive factors” proved temporary as the domestically financed deficit increased to almost 11% of GDP, and inflation tripled to more than 17% by the end of 1996. This was exacerbated by the collapse of the pyramid schemes in early 1997, causing an estimated loss of savings of about $1 billion.Footnote 4

The worsening economic situation led to a second large outflow of individuals as employment prospects in Albania dwindled for many. Emigration has played an important role in reducing unemployment in the country. According to official data, during 1998, unemployment in the country reached 17.7%, with a figure of 19.1% in the north-eastern areas, where the level of emigration is lower, and 13.4% in the south, where mass emigration exists. Because Albanian emigration is often driven by seasonal and temporary employment, it has had an impact on the Albanian labour market. It is estimated that half the overall number of emigrants are seasonally employed in the host countries.

According to data from the Albanian Ministry of Labour and Social Affairs, during the last 10 years, Albanians have emigrated to about 20 European countries. However, by far the largest number goes to Greece, followed by Italy. This may be the result of easier access to information about job availability and wage levels in Greece and also of relatively lower transportation costs. The migration flow is amplified by the need for a flexible non-unionised workforce for the informal economy in Greece. However, as mentioned before, most of the migration appears to be temporary and for a specific purpose: to raise funds to set up an enterprise in Albania and/or to acquire skills by working in a relatively richer and established market economy.

3 Theoretical framework

In earlier literature, migration has been modelled as a one-shot move, where individuals take their decision following an income-maximising strategy to either migrate or stay in the home country (Harris and Todaro 1970). More recently, migration has been considered as a dynamic process within the lifetime expectations of workers (Djajic 1989; Dustmann 1997). In this context, there is evidence that migration is self-selective, i.e. those who migrated would have done better regardless of whether or not they had gone abroad. Immigrants are often found to be “more able and more highly motivated” than those who stay at home. In this study, we question this assertion. We analyse the performance of return migrants in the source country, i.e. those who migrated but then decided to return to participate in the labour market of the source country.Footnote 5

Using the Albanian data, we want to know if migrants who returned home to Albania were selected from the upper or lower part of the ability distribution in the source country. To conduct such an analysis, we investigate their performance once they return to Albania. The problem can be modelled by assuming income-maximising individuals who make a migration decision based on their expected income in the source and the host countries net of any migration (and remigration) costs. More formally, we use a version of the Roy (1951) theoretical model modified by Borjas (1987, 1999) and Borjas and Bratsberg (1996) to analyse this problem. But in contrast with those papers, we analyse the impact of self-selection on the home country rather than the host country.

Let the log earnings distribution in the source country be,

$$w^{{\text{s}}} = \mu ^{{\text{s}}} + \eta \nu $$
(1)

where \(\mu ^{{\text{s}}}\) is the mean of log income in the source country, η is interpreted as the rate of return to skills in the source country relative to that in the host country and is assumed to be known to the migrant, and ν is a random variable that measures deviations from the mean and is independently and normally distributed with mean zero and variance \(\sigma ^{2}_{\nu } \). Now let the log earnings facing the population of the source country if they decide to migrate to the host country be,

$$w^{{\text{h}}} = \mu ^{{\text{h}}} + \nu + \varepsilon $$
(2)

where \(\mu ^{{\text{h}}}\) is the mean income that migrants receive in the host country and ɛ is the random variable that measures deviations from the mean income in the host country and is not known to the migrant, i.e. it captures the luck and/or misinformation about the prospects in the host country. It is assumed to be independently and normally distributed with mean zero and variance \(\sigma ^{2}_{\varepsilon } \).

One of the main reasons for migration from Albania to EU countries is the significant wage gap between the two countries. A temporary migration to Western Europe (primarily to Greece and Italy) offers higher paid employment and the potential to acquire skills and, moreover, helps overcome any capital constraints that an individual may face in the source country to start an enterprise.Footnote 6 Therefore, migrants will only incur migration costs if they expect that after spending a fraction δ of their working life in the host country, they can increase their earnings by some percent, κ, when they return to their home country. We assume that the parameters δ and κ are constant.

Workers in Albania, therefore, have the following option: residing in an EU country for a fraction of their working life, followed by a permanent return to the source country. Ignoring discounting and using a first-order approximation, the log earnings associated with this choice, \(w^{{\text{r}}}\), are given by:

$$w^{{\text{r}}} = \delta w^{{\text{h}}} + {\left( {1 - \delta } \right)}{\left( {w^{{\text{s}}} + \kappa } \right)}$$
(3)

where δ and κ are parameters as defined above.

Workers maximise their lifetime earnings net of all migration costs. For the migration motive to be relevant, a person will only migrate if the expected earnings (due to skill acquisition abroad) in the source country, after returning, are greater than earnings in the source country if the individual did not migrate, net of both migration and remigration costs. Formally, we can write this as:

$$Ew^{{\text{r}}} > w^{{\text{s}}} + C^{{\text{m}}} + C^{{\text{r}}} $$
(4)

where \(C^{{\text{m}}}\) and \(C^{{\text{r}}}\) are the migration and remigration costs, respectively.Footnote 7

Substituting Eqs. 1, 2 and 3 in Eq. 4, we get the condition under which a person will migrate (with the intention of returning to the source country).

$${\left( {1 - \eta } \right)}\nu > {\left( {\mu ^{{\text{s}}} - \mu ^{{\text{h}}} + \kappa } \right)} + \frac{{C^{{\text{m}}} + C^{{\text{r}}} - \kappa }}{\delta }$$
(5)
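
The substitution behind Eq. 5 can be sketched as follows. This is only an outline, under the assumptions already stated in the text: the expectation is taken over ɛ (so that Eɛ=0), ν is known to the individual, and δ>0. Taking expectations of Eq. 3 after inserting Eqs. 1 and 2 gives

$$Ew^{{\text{r}}} = \delta {\left( {\mu ^{{\text{h}}} + \nu } \right)} + {\left( {1 - \delta } \right)}{\left( {\mu ^{{\text{s}}} + \eta \nu + \kappa } \right)}$$

Subtracting \(w^{{\text{s}}} = \mu ^{{\text{s}}} + \eta \nu \) from both sides of Eq. 4 then yields

$$\delta {\left( {\mu ^{{\text{h}}} - \mu ^{{\text{s}}} } \right)} + \delta {\left( {1 - \eta } \right)}\nu + {\left( {1 - \delta } \right)}\kappa > C^{{\text{m}}} + C^{{\text{r}}} $$

and dividing by δ and rearranging gives

$${\left( {1 - \eta } \right)}\nu > {\left( {\mu ^{{\text{s}}} - \mu ^{{\text{h}}} } \right)} + \frac{{C^{{\text{m}}} + C^{{\text{r}}} - {\left( {1 - \delta } \right)}\kappa }}{\delta } = {\left( {\mu ^{{\text{s}}} - \mu ^{{\text{h}}} + \kappa } \right)} + \frac{{C^{{\text{m}}} + C^{{\text{r}}} - \kappa }}{\delta }$$

which is the condition in Eq. 5.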

Note that so far we have been assuming that a migrant must return to Albania as that person is either required to or has already decided at the time of migration to return home. However, to complete the picture, it could be the case that the migrant could stay, either permanently or for a relatively longer period of time, in the host country.Footnote 8 In this circumstance, we need to set out the conditions under which (i) a person will migrate regardless of future intentions and (ii) once migrated, the person will return to the source country after spending a fraction of time in the host country, i.e. has no incentive to stay in the host country permanently. The two conditions are respectively given as,

$$Ew^{{\text{h}}} > w^{{\text{s}}} + C^{{\text{m}}} $$
(6)

and

$$Ew^{{\text{r}}} > w^{{\text{h}}} + C^{{\text{r}}} $$
(7)

Equation (6) states that if the expected wage net of migration costs is greater than the wage in the source country, then it is better for a person to migrate. However, once abroad, a migrant will return to the source country only if the expected wage upon return, net of remigration costs, is greater than the wage in the host country. Substituting for the wages from the above equations, we get the following conditions under which a person will migrate regardless of future intentions

$${\left( {1 - \eta } \right)}\nu > \mu ^{{\text{s}}} - \mu ^{{\text{h}}} + C^{{\text{m}}} $$
(8)

and will migrate and then return home after spending a fraction of time in the host country,

$${\left( {1 - \eta } \right)}\nu < {\left( {\mu ^{{\text{s}}} - \mu ^{{\text{h}}} + \kappa } \right)} - \frac{{C^{{\text{r}}} + C^{{\text{m}}} - \varepsilon }}{{1 - \delta }}$$
(9)

It is easier to explain the intuition behind Eqs. 5, 8 and 9 with a diagram, and we therefore present it using Figs. 1 and 2.

Fig. 1 Returns to skills when η>1

Fig. 2 Returns to skills when η<1

As discussed earlier, return migration arises because a temporary stay in the host country increases the worker's earning potential in the source country. Migration is therefore a self-selection process whose direction depends on the value of η in this model. The migration flow is composed of negatively selected individuals if η>1. In other words, people with lower than average skills in Albania will migrate to the EU because, in this case, the lower skilled gain the most by moving to the host country. Amongst this cohort of negatively selected individuals, only the more able return to the origin country after a spell in the host country. This case is shown in Fig. 1, where we draw the earnings functions \(w^{{\text{s}}}\) and \(w^{{\text{h}}}\) (net of migration costs) as thick lines and \(w^{{\text{r}}}\) (net of migration and remigration costs) as a dotted line.Footnote 9 Assuming that skills are not perfectly transferable across borders, there are gains from moving for individuals with lower skills, whereas those with relatively higher skills are better off staying in Albania (in terms of Eq. 8, to satisfy the inequality condition, it must be the case that \(\mu ^{{\text{h}}} > \mu ^{{\text{s}}} + C^{{\text{m}}}\)). Amongst the lower skilled migrants, only those who have relatively higher skills will face incentives to collect the gains from migration and return to Albania (region A, Fig. 1).

If η<1, however, people with skills higher than the average level will migrate. Also, amongst this pool of positively selected migrants, only the relatively less able will find it worthwhile to return after a spell in the host country (region B, Fig. 2).
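
The direction of selection can be illustrated with a small simulation. The sketch below is not part of the paper's analysis: it draws skills ν from a standard normal, applies the migration rule of Eq. 6 with Eɛ=0, and compares the average skill of migrants and stayers. The function name `simulate_selection` and all parameter values (`mu_s`, `mu_h`, `cost_m`, the two values of η) are hypothetical and chosen only for illustration.

```python
import random
import statistics


def simulate_selection(eta, mu_s=0.0, mu_h=0.5, cost_m=0.2, n=100_000, seed=0):
    """Return (share migrating, mean skill of migrants, mean skill of stayers).

    All default parameter values are illustrative assumptions, not estimates.
    """
    rng = random.Random(seed)
    migrants, stayers = [], []
    for _ in range(n):
        nu = rng.gauss(0.0, 1.0)         # skill, known to the individual
        w_s = mu_s + eta * nu            # Eq. 1: log wage in the source country
        expected_w_h = mu_h + nu         # Eq. 2 with E[epsilon] = 0
        if expected_w_h > w_s + cost_m:  # Eq. 6: migrate if the expected gain covers the cost
            migrants.append(nu)
        else:
            stayers.append(nu)
    return len(migrants) / n, statistics.mean(migrants), statistics.mean(stayers)


# eta > 1: skills are rewarded relatively more at home, so the less skilled leave.
negative = simulate_selection(eta=1.5)
# eta < 1: skills are rewarded relatively more abroad, so the more skilled leave.
positive = simulate_selection(eta=0.5)

for label, (share, mean_mig, mean_stay) in (("eta=1.5", negative), ("eta=0.5", positive)):
    print(f"{label}: share migrating={share:.2f}, "
          f"mean skill migrants={mean_mig:.2f}, stayers={mean_stay:.2f}")
```

With these illustrative values, migrants have below-average skill when η>1 and above-average skill when η<1. The return decision is left out of the sketch because it also depends on the realised ɛ.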

4 Data and choice of variables

Data used in this paper are based on direct interviews of 1,500 individuals in all regions of Albania. The interviews were conducted during the period of March 1998–January 1999.Footnote 10 Names were randomly selected in the district registers. Numbers attributed by districts are proportional to the size of the district, so the sample is regionally representative. No precise question was asked regarding the location of current residence, and therefore, it is not clear whether some individuals actually still work abroad but have been interviewed while taking time off in Albania. In order to select only the “real” returnees, we restricted our analysis to those who had migrated and came back at least 2 months before the day of interview.

Moreover, we wanted to avoid the cross-border or seasonal migrants, i.e. those who spend some time of the year abroad and then come back home for the rest of the year. These individuals are defined as persistent migrants and most probably have different characteristics and preferences than the population we want to study (see Constant and Zimmermann (2003) for an investigation of the determinants of repeat migration). Therefore, we selected only those individuals who live on earned income, excluding all those who live on remittances (transfers), unemployment benefits, unearned income (i.e. personal savings supposedly earned abroad) or social assistance. We also removed pensioners, housewives and students. Imposing these restrictions of course narrows the scope of our analysis, but it reflects our emphasis on the effect of return migration on the source country labour market.Footnote 11 Of the original sample of 1,500 individuals who were interviewed, selection of valid answers led us to a final sample of 594 wage earners, aged between 16 and 65 (see Table 6 in the Appendix for details on the selection).

Focusing on migrants, we note that less than 30% migrated for a period of less than a year, approximately the same percentage migrated for 1 to 2 years, 20% for 2 to 3 years, less than 8% for 3 to 4 years, 7% for 4 to 5 years, and only another 7% migrated for more than 5 years. Looking at the number of times individuals migrated, we find that 53% moved abroad only once, 32% did so twice and only 11% did so more often. Of those who migrated only once, more than 70% did so for more than 1 year, whereas those who migrated twice had an average spell abroad of 13 months each time. The average spell abroad for those who migrated three times is just over 10 months. These findings are consistent with the selection of individuals who are return migrants and not persistent (or seasonal) migrants. Average characteristics are displayed in Table 1.

Table 1 Means of the sample

The hourly wages converted into US dollars are $0.72 for the total sample, $0.81 for return migrants and $0.67 for stayers.Footnote 12 The Albanian Institute of Statistics (INSTAT) gives the monthly mean income of public sector workers (18% of the labour force) as 10,000 Leks for 1998, while, in our sample, using average monthly working hours, the mean monthly income is 15,351 Leks. We expect this difference to be due to individuals in the private sector earning more than those in the public sector (unfortunately, our data set does not record whether individuals work in the public or private sector). The average migrant in our sample is younger, slightly less qualified, less likely to be married and more likely to be male. The differences in average levels of education and age are not statistically significant. Looking at occupation, we note that return migrants are almost twice as likely to be self-employed as the stayers, while the proportions of managers in the stayer and return-migrant subpopulations are nearly identical (12.3% vs 11.3%). These two variables are central to our analysis and are therefore discussed in more detail in the empirical section. Another noticeable difference is the larger proportion of returnees who live in big cities (46% compared to 39%).

5 Empirical methodology

Two methods are used in order to investigate the issues presented in the theoretical model. We begin by making use of a selection model as proposed by Lee (1978, 1982) and applied to migration by Nakosteen and Zimmer (1980). The model can be summarised by the following three equations:

$$w^{{\text{r}}}_{i} = \beta ^{{{\text{r}}\prime }} x_{i} + \varepsilon _{{r_{i} }} $$
(10)
$$w^{{\text{s}}}_{i} = \beta ^{{{\text{s}}\prime }} x_{i} + \varepsilon _{{s_{i} }} $$
(11)
$$m^{*}_{i} = \gamma \prime z_{i} + u_{i} $$
(12)

Here \(w^{{\text{r}}}_{i}\) is the log hourly wage of individuals who migrated at least once and came back to Albania, and \(w^{{\text{s}}}_{i}\) is the log hourly wage of those who stayed in the country. These wages are explained by a set of socio-economic covariates: education, age and its square, dummy variables for gender, marital status (and its interaction with the gender variable), occupation (managers, lower managers, skilled workers, self-employed and other paid workers, with clerical, unskilled and farm workers as the reference category) and a dummy for being paid in a foreign currency.Footnote 13 Equation 12 describes the decision to migrate. The latent variable \(m^{*}_{i}\) is the difference between the benefit and the cost of migration (monetary and psychological). Though it is not observed, we know whether the individual has decided to migrate, so the observed indicator \(m_{i}\) can be defined as follows:

$${\text{For}}\,{\text{migrants}}\quad m_{i} = 1\,\;{\text{iff}}\,\;m^{*}_{i} \geqslant 0$$
(13)
$${\text{and}}\,\,{\text{for}}\,\,{\text{non-migrants}}\quad m_{i} = 0\;\,{\text{iff}}\,\;m^{*}_{i} < 0$$
(14)

Two sets of variables are used to explain the decision to migrate: those included in the wage equations and those excluded from them. The second set is needed to identify the model without relying entirely on the normality assumption. Starting with the first set, education enters both the probit migration decision and the wage equations, as this characteristic may explain both outcomes. Age should be negatively associated with the migration decision, as older individuals are expected to be more attached to local amenities than younger ones. Furthermore, men are more likely to move than women, a common feature of studies on migration. The opposite is true for married individuals. We also add an interaction term between gender and marital status, as the effects of these variables might be correlated.

As additional variables in the migration equation that are not included in the wage equations, we first introduce the number of dependents within the household, on the assumption that tighter liquidity constraints on the household might exert, all else constant, a positive impact on the migration decision. The second is the size of the city where the individual is currently living. Assuming that the individual returned to the place that he/she left when migrating, we expect people living in big cities to be more likely to migrate, as family ties might be looser in an urban environment than in a rural one. As another identifying variable, we introduce a dummy for living in the more mountainous North of the country.Footnote 14

Another variable expected to influence migration but not wages is religion. The main religions in Albania are Islam and Christianity (Orthodox and Roman Catholic). Muslims, who comprise 70% of the population, are expected to face higher (non-pecuniary) costs of migration than the minority Albanian Orthodox and Roman Catholics (20% and 10% of the population, respectively). These costs cover the relatively higher level of difficulty Muslims might face in practising their faith in a non-Muslim country and also the increased difficulty of assimilation in countries with different religions. We therefore introduce a “Muslim” dummy to measure these increased costs of migration for Muslims.

The following two conditional wages are defined as the outcome for those who have already made the choice,

$$E{\left( {\left. {w^{{\text{r}}}_{i} } \right|m_{i} = 1} \right)} = \beta ^{{{\text{r}}\prime }} x_{i} + E{\left( {\left. {\varepsilon _{{r_{i} }} } \right|u_{i} \geqslant - \gamma \prime z_{i} } \right)} = \beta ^{{{\text{r}}\prime }} x_{i} + \sigma _{{e_{{\text{r}}} }} \rho _{{{\text{r}}u}} \frac{{\phi {\left( {\gamma \prime z_{i} } \right)}}}{{\Phi {\left( {\gamma \prime z_{i} } \right)}}}$$
(15)
$$E{\left( {\left. {w^{{\text{s}}}_{i} } \right|m_{i} = 0} \right)} = \beta ^{{{\text{s}}\prime }} x_{i} + E{\left( {\left. {\varepsilon _{{s_{i} }} } \right|u_{i} < - \gamma \prime z_{i} } \right)} = \beta ^{{{\text{s}}\prime }} x_{i} + \sigma _{{e_{{\text{s}}} }} \rho _{{{\text{s}}u}} {\left[ { - \frac{{\phi {\left( {\gamma \prime z_{i} } \right)}}}{{1 - \Phi {\left( {\gamma \prime z_{i} } \right)}}}} \right]}$$
(16)
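
The selection terms in Eqs. 15 and 16 are straightforward to compute once the probit index γ′z is available. The following is a minimal sketch using only the Python standard library. The function names `selection_term` and `conditional_log_wage`, and the numerical inputs in the example, are hypothetical and are not estimates from this paper.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def selection_term(index, migrated):
    """Selection-correction term evaluated at index = gamma'z.

    Migrants (u >= -gamma'z): phi / Phi, as in Eq. 15.
    Stayers  (u <  -gamma'z): -phi / (1 - Phi), as in Eq. 16.
    """
    pdf = _STD_NORMAL.pdf(index)
    cdf = _STD_NORMAL.cdf(index)
    if migrated:
        return pdf / cdf
    return -pdf / (1.0 - cdf)


def conditional_log_wage(xb, sigma, rho, index, migrated):
    """Expected log wage given the observed choice: x'beta + sigma * rho * selection term."""
    return xb + sigma * rho * selection_term(index, migrated)


# Illustrative values only: x'beta = 1.0, sigma = 0.5, rho = -0.4, gamma'z = 0.
wage_migrant = conditional_log_wage(1.0, 0.5, -0.4, 0.0, migrated=True)
wage_stayer = conditional_log_wage(1.0, 0.5, -0.4, 0.0, migrated=False)
print(wage_migrant, wage_stayer)
```

The sign of the product σρ determines whether the observed group earns more or less than a randomly drawn individual with the same characteristics.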

In order to address the questions posed in the Introduction, we need the conditional expected wages of migrants, had they chosen not to migrate, and similarly the conditional expected wages of stayers, had they chosen to migrate. Following Maddala (1983), these are given as:

$$E{\left( {\left. {w^{{\text{r}}}_{i} } \right|m_{i} = 0} \right)} = \beta ^{{{\text{r}}\prime }} x_{i} + E{\left( {\left. {\varepsilon _{{r_{i} }} } \right|u_{i} < - \gamma \prime z_{i} } \right)} = \beta ^{{{\text{r}}\prime }} x_{i} + \sigma _{{e_{{\text{r}}} }} \rho _{{{\text{r}}u}} {\left[ { - \frac{{\phi {\left( {\gamma \prime z_{i} } \right)}}}{{1 - \Phi {\left( {\gamma \prime z_{i} } \right)}}}} \right]}$$
(17)
$$E{\left( {\left. {w^{{\text{s}}}_{i} } \right|m_{i} = 1} \right)} = \beta ^{{{\text{s}}\prime }} x_{i} + E{\left( {\left. {\varepsilon _{{s_{i} }} } \right|u_{i} \geqslant - \gamma \prime z_{i} } \right)} = \beta ^{{{\text{s}}\prime }} x_{i} + \sigma _{{e_{{\text{s}}} }} \rho _{{{\text{s}}u}} \frac{{\phi {\left( {\gamma \prime z_{i} } \right)}}}{{\Phi {\left( {\gamma \prime z_{i} } \right)}}}$$
(18)

Equation 17 is the conditional wage of stayers, had they chosen to migrate, and Eq. 18 is the conditional wage of migrants, had they chosen to stay. Here Φ(.) and ϕ(.) stand, respectively, for the cumulative distribution and density functions of the standard normal; \(\sigma _{{e_{{\text{r}}} }}\) and \(\sigma _{{e_{{\text{s}}} }}\) are the standard deviations of the error terms of the wage equations for migrants and stayers, respectively; and \(\rho _{{{\text{s}}u}}\) and \(\rho _{{{\text{r}}u}}\) are the correlations between the stayers' and migrants' error terms, respectively, and that of the migration decision equation. There is no agreement in the literature as to whether these conditional wages should be preferred over the marginal distributions. So in the section devoted to the results, we give the marginal effects as well. Average wage differentials can be given for different groups of workers and at different ages and levels of education.

So far we have only been able to give average earning differences, whereas the distributional impact of migration might also be of interest to answer the questions posed earlier. One way of identifying the effect of return migration would be to answer the following question: Which density function would prevail if the individual characteristics of migrants had been similar to those of stayers, and if they had been paid according to the wage schedule observed for stayers? This is one counterfactual density. It is the wage density that would prevail if everybody were receiving stayers' wages. But another way of studying the effect of migration could be to construct a density that would prevail if everybody received migrants' wages. Here the question is: What density would prevail if the characteristics of stayers were similar to those of migrants, and if they were paid according to the wage schedule of return migrants? Following DiNardo et al. (1996), we can write down these two counterfactuals by the following steps. First, we represent the observed density of wages for stayers as the integral of the density of their wages conditional on observed characteristics z over the distribution of these characteristics:

$$g{\left( {\left. w \right|m = 0} \right)} = {\int {f^{{\text{s}}} {\left( {\left. w \right|z} \right)}h{\left( {\left. z \right|m = 0} \right)}{\text{d}}z} }$$
(19)

and similarly for migrants, we have:

$$g{\left( {\left. w \right|m = 1} \right)} = {\int {f^{{\text{r}}} {\left( {\left. w \right|z} \right)}h{\left( {\left. z \right|m = 1} \right)}{\text{d}}z} }$$
(20)

We now turn to the required counterfactual densities. The density that would prevail if everybody were receiving stayers' wages is:

$$g^{{\text{s}}} {\left( w \right)} = {\int {f^{{\text{s}}} {\left( {\left. w \right|z} \right)}h{\left( z \right)}{\text{d}}z} }$$
(21)

and the density that would prevail if everyone were receiving migrants' wages is:

$$g^{{\text{r}}} {\left( w \right)} = {\int {f^{{\text{r}}} {\left( {\left. w \right|z} \right)}h{\left( z \right)}{\text{d}}z} }$$
(22)

Following Bayes' Law, these densities can be rewritten as:Footnote 15

$$g^{{\text{s}}} {\left( w \right)} = {\int {\theta ^{1} {\left( z \right)}f^{{\text{s}}} {\left( {\left. w \right|z} \right)}h{\left( {\left. z \right|m = 0} \right)}{\text{d}}z} }$$
(23)
$$g^{{\text{r}}} {\left( w \right)} = {\int {\theta ^{2} {\left( z \right)}f^{{\text{r}}} {\left( {\left. w \right|z} \right)}h{\left( {\left. z \right|m = 1} \right)}{\text{d}}z} }$$
(24)

Note that Eqs. 23 and 24 are similar to Eqs. 19 and 20 except for the weights \(\theta ^{1} {\left( z \right)}\) and \(\theta ^{2} {\left( z \right)}\), which are, respectively:

$$\begin{array}{*{20}l} {{\theta ^{1} {\left( z \right)} = \frac{{{\text{prob}}{\left( {m = 0} \right)}}}{{{\text{prob}}{\left( {\left. {m = 0} \right|z} \right)}}}} \hfill} \\ {{{\text{and}}} \hfill} \\ {{\theta ^{2} {\left( z \right)} = \frac{{{\text{prob}}{\left( {m = 1} \right)}}}{{{\text{prob}}{\left( {\left. {m = 1} \right|z} \right)}}}} \hfill} \\ \end{array} $$
(25)
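
As a sketch of the step behind these weights (shown for stayers; the same argument applies to migrants), Bayes' rule links the unconditional distribution of characteristics to the distribution amongst stayers:

$$h{\left( z \right)} = h{\left( {\left. z \right|m = 0} \right)}\frac{{{\text{prob}}{\left( {m = 0} \right)}}}{{{\text{prob}}{\left( {\left. {m = 0} \right|z} \right)}}}$$

Substituting this expression for h(z) into Eq. 21 gives Eq. 23, with the ratio of probabilities playing the role of the weight.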

\(\theta ^{1} {\left( z \right)}\) can be empirically calculated since prob(m=0) is simply the proportion of stayers in our sample and prob(m=0∣z) is the probability of being a stayer given individual characteristics, which can be estimated by a probit (similar reasoning applies for \(\theta ^{2} {\left( z \right)}\)). Using these weights, we apply weighted kernel densities to the samples of stayers and migrants to estimate the densities of both counterfactual distributions.
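
The reweighting step can be sketched in a few lines of code. The example below is an illustration under stated assumptions rather than the procedure used for the estimates in this paper. It assumes a Gaussian kernel, a fixed bandwidth and weights normalised to sum to one. The helper names `reweighting_factors` and `weighted_kde`, the toy log wages and the fitted probabilities are all hypothetical; in practice the probabilities would come from the estimated probit.

```python
import math


def reweighting_factors(p_group_given_z, p_group):
    """theta(z_i) = prob(m = g) / prob(m = g | z_i) for each individual in group g."""
    return [p_group / p for p in p_group_given_z]


def weighted_kde(points, wages, weights, bandwidth):
    """Weighted Gaussian kernel density of `wages`, evaluated at each value in `points`."""
    total = sum(weights)  # normalise so that the weights sum to one
    norm = total * bandwidth * math.sqrt(2.0 * math.pi)
    density = []
    for x in points:
        acc = sum(w * math.exp(-0.5 * ((x - y) / bandwidth) ** 2)
                  for y, w in zip(wages, weights))
        density.append(acc / norm)
    return density


# Toy data for five stayers (hypothetical values, for illustration only).
log_wages = [-0.9, -0.5, -0.4, -0.1, 0.3]
p_stay_given_z = [0.9, 0.8, 0.7, 0.6, 0.5]  # placeholder for fitted probit probabilities
p_stay = 0.7                                # placeholder for the sample share of stayers

theta = reweighting_factors(p_stay_given_z, p_stay)
grid = [i / 10 for i in range(-40, 41)]     # evaluation points from -4.0 to 4.0
density = weighted_kde(grid, log_wages, theta, bandwidth=0.3)
```

Stayers whose characteristics make staying unlikely receive weights above one, so the reweighted density gives more mass to wages earned by stayers who resemble migrants.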

6 Results and discussion

6.1 Parametric estimates

Following Ham et al. (2001), we conduct tests on the variables that identify the selection into migrants and stayers. More precisely, we introduce these variables in the wage regressions to check whether their coefficients are significantly different from zero. If they are significant, we exclude them from the entire model; if they are not, we include them in the probit but not in the wage equations. We consider four variables: two regional characteristics, i.e. whether individuals live in cities and whether they live in the North of the country, and two personal characteristics, i.e. religion (being a Muslim) and the number of dependents in the household. We expect these variables to affect the migration decision and to be uncorrelated with the error term in the wage equations. We compute Chi-square tests of their individual and joint significance in the probit and Wald tests of their individual and joint significance in the wage equations. The four variables are individually and jointly insignificant in the wage equation for stayers (the individual tests all have a p value higher than 0.17, and joint significance is rejected with a p value of 0.25). For migrants, the coefficient on each variable is insignificant (except for living in cities), and the test of their joint significance gives a p value of 0.075 (without the “living in cities” variable, the p value is 0.45). The instruments are jointly significant in the probit (p value of 0 to the second decimal place), and all are individually significantly different from 0 except the “Muslim” variable (p value of 0.72). The maximum likelihood estimates of the migration model are given in Table 2. For comparison, Tables 7 and 8 in the Appendix also provide estimates of the wage equations using the Lee (1978) endogenous switching model, first with wage equations explained only by education and age and then progressively adding more exogenous variables.
Moreover, in the Appendix, we provide Lee's estimates with only regional characteristics in the probit (first selection rule, Table 9) and then add religion and the number of dependents (second selection rule, Table 10).

Table 2 Maximum likelihood estimates, second selection rule

6.1.1 Comments on estimates

Note that the estimates are rather similar across the different estimations. Generally, the coefficients in the stayers' wage equations take the expected sign and are statistically significantly different from zero. One more year of education leads to an increase of approximately 4% in the hourly wage; age is introduced to measure labour market experience and shows that each additional year gives an increase of approximately 8.5% in the dependent variable. The age profile is concave. One coefficient of interest is that of the male dummy, which is negative and not significant. This result has to be interpreted in the context of an ex-communist country, where work was compulsory for both men and women, and wages were set at the national level. The coefficients for occupations take the expected sign, with managers earning 66% more than the omitted category (clerical, unskilled and farmers). The premium for self-employment is 52%.

Interestingly, for return migrants, the coefficients on education and age are not significantly different from zero.Footnote 16 However, for migrants, the returns to being a manager, self-employed and a “lower” manager are significant and higher than for stayers. Skilled return-migrant workers earn less than skilled stayers. Managers earn between 90 and 100% (depending on the estimation, see Table 2 and Appendix, Tables 7, 8, 9, 10) more than the omitted category (clerical, unskilled, farmer). The premium for self-employed returners is between 69 and 73%. These results are quite interesting as they suggest that the returns to returning take the form of increased earnings through (i) higher positions on the job ladder and (ii) becoming self-employed.Footnote 17 Better educated and more experienced migrants do not earn higher hourly wages when they return. We also observe a negative and significant coefficient on the education variable in the migration decision; therefore, migration is not associated with more highly educated individuals. As the theoretical model shows, individuals choose to migrate if the relative rewards to their skills are higher in the host country and then choose to return if they expect the rewards (promotion and/or higher wages, etc.) to be higher than before in the home country due to newly acquired skills and/or through savings acquired abroad. Therefore, returns to skills take the form of access to better jobs on the career ladder rather than returns to formal skills (education and labour market experience). Individuals who choose to migrate and then return face the prospect of access to high-paid jobs that do not reward formal training (years of education and labour market experience). The data set shows that 10% of the self-employed and the managers used their savings accumulated abroad to set up a business.
This result can therefore be related to the study of Mesnard (2004), who modelled migration as a way of overcoming constraints of the credit market in the home country. In our context, we observe that individuals who lack formal qualifications required for higher paid jobs tend to migrate to overcome their initial disadvantage. This strategy proves particularly successful as the average earnings of return migrants are higher than those of stayers.

Looking at the unobserved characteristics, the signs of the corrections for selectivity allow us to draw interesting conclusions. For instance, the correction for sample selection in the return migrants' wage equation is not significant when using a two-step approach (Appendix, Tables 9 and 10). The maximum likelihood estimation, however, gives a significant and negative estimate for the correlation coefficient. For stayers, the three estimations give a significant and negative sign for the coefficient of the selectivity variable \({\left( {{\left[ { - \frac{{\phi {\left( {\gamma ^{\prime } z_{i} } \right)}}}{{1 - \Phi {\left( {\gamma ^{\prime } z_{i} } \right)}}}} \right]}} \right)}\), which means that the truncation effect is positive. Using the framework of the Roy (1951) self-selection model as formalised by Maddala (1983) and others, this indicates that the expected earnings of those who choose to migrate may be lower than those of a random individual from the entire sample for given characteristics. Conversely, the expected earnings of those who stayed are higher than the expected earnings of a random individual from the sample. There is positive selection for stayers and support for negative selection of the return migrants.Footnote 18 We expand on this issue in the following section, where we directly address the question of whether the stayers would have performed as well as return migrants, had they decided to migrate.

6.1.2 Expected earnings and self-selection

Mean income is higher for return migrants than for stayers by 9 log points, so approximately by 9% (see Table 3). Looking at the two counterfactuals, calculated using simple OLS estimations, we note that, had they chosen to migrate (and return), stayers' earnings would have been higher than the mean income of return migrants. The mean earnings of return migrants, had they chosen to stay, would have been ‘just’ higher than the mean earnings of stayers. However, these estimates are probably biased as they do not take into account the potential self-selection of individuals into either subpopulation. Therefore, we correct for potential self-selection bias and present the results in columns 2 to 6 of Table 3, which are based on Table 2 and Tables 9 and 10 in the Appendix. For each estimation, we give the mean incomes based on the marginal \({\left( {E{\left( {w^{{\text{r}}} } \right)} = \beta ^{{{\text{r}}^{\prime } }} x\,\,{\text{and}}\,\,E{\left( {w^{{\text{s}}} } \right)} = \beta ^{{{\text{s}}^{\prime } }} x} \right)}\) Footnote 19 and the conditional \({\left( {E{\left( {\left. {w^{{\text{r}}} } \right|m = 1} \right)}\,\,{\text{and}}\,\,E{\left( {\left. {w^{{\text{s}}} } \right|m = 0} \right)}} \right)}\) expected wage rates. The marginal distribution should be used for inference on potential migration, and the conditional distribution should be used for inference on realised migration (Maddala 1983, p. 287). Comparing rows 1 and 2 in Table 3, we observe that migrants made the correct decision in choosing to migrate, as their income is higher than what they would have earned by staying. Comparing the performance of return migrants, had they not migrated, with the performance of stayers (rows 2 and 3), we find that the counterfactual mean income of migrants is always lower than the mean income of stayers. This shows that the performance of return migrants, if they had stayed, would have been worse than that of the stayers.
As for the stayers, comparing rows 3 and 4, it can be seen that their mean income would have been higher had they spent time abroad. This advantage is of the order of 0.17 and 1.16 log points using the marginal and conditional expected means, respectively, of the maximum likelihood estimation. In the framework of our theoretical model, we observe that more skilled individuals do not migrate if their potential earnings net of migration costs are lower in the host countries than their wage at home. Our results give rise to a story of the more able/skilled individuals in Albania facing higher assimilation costs in the host labour markets. This may come from the difficulty of practising their profession in a foreign language, which would apply, for instance, to professions such as medical doctors, lawyers or teachers. For the less skilled, such costs may be much lower as the jobs performed in the host countries do not require high fluency in the foreign language.Footnote 20 These results show that return migrants are negatively selected, as depicted in the theoretical analysis in Fig. 1.
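The distinction between marginal and conditional expected wages can be illustrated with a minimal sketch for a single individual. It assumes the standard switching-regression form in which the conditional mean equals the marginal mean plus a covariance term multiplied by the selectivity variable. The names `xb_r`, `xb_s`, `index`, `cov_r` and `cov_s` are hypothetical placeholders for the fitted linear predictors, the probit index \(\gamma^{\prime} z\) and the estimated selectivity coefficients; none of the values are taken from our estimates.

```python
import math


def std_normal_pdf(t):
    """Standard normal density phi(t)."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(t):
    """Standard normal distribution function Phi(t)."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def expected_log_wages(xb_r, xb_s, index, cov_r, cov_s):
    """Marginal and conditional expected log wages for one individual.

    xb_r, xb_s   -- linear predictors beta_r'x and beta_s'x (assumed given)
    index        -- probit index gamma'z (assumed given)
    cov_r, cov_s -- coefficients on the selectivity variables (assumed given)
    """
    pdf, cdf = std_normal_pdf(index), std_normal_cdf(index)
    lambda_migrant = pdf / cdf           # selectivity variable when m = 1
    lambda_stayer = -pdf / (1.0 - cdf)   # selectivity variable when m = 0
    return {
        "marginal_r": xb_r,
        "marginal_s": xb_s,
        "conditional_r": xb_r + cov_r * lambda_migrant,
        "conditional_s": xb_s + cov_s * lambda_stayer,
    }


# Made-up values: negative coefficients on both selectivity variables
wages = expected_log_wages(xb_r=5.0, xb_s=4.8, index=0.0, cov_r=-0.5, cov_s=-0.5)
```

With negative coefficients on both selectivity variables, the conditional mean lies below the marginal mean for return migrants and above it for stayers, which is the pattern described in the text.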

Table 3 Estimated mean hourly wage for return migrants and stayers

6.2 Results using semi-parametric estimates

We now investigate the entire density of hourly wages. All graphs presented here give estimates calculated with a Gaussian kernel function. We used the Silverman (1986, Eq. 3.31) procedure to select the optimal bandwidth; its value lies at around 0.14. Kernel estimates for the entire sample, for the stayers and for the return migrants, are displayed in Fig. 3. In Fig. 4, densities for the total sample are decomposed into the weighted sum of the densities of return migrants and stayers. We simply multiply the sub-group densities of Fig. 3 by the sub-group population shares.
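The bandwidth choice and the decomposition can be sketched as follows. The rule of thumb coded here is our reading of Silverman's Eq. 3.31, namely 0.9 times the smaller of the standard deviation and the interquartile range divided by 1.34, times \(n^{-1/5}\); the function names and the toy data are hypothetical. The sketch also assumes that one common bandwidth is used for all groups, in which case the share-weighted sub-group densities add up exactly to the density of the entire sample.

```python
import numpy as np


def silverman_bandwidth(x):
    """Rule-of-thumb bandwidth: 0.9 * min(sd, IQR / 1.34) * n^(-1/5)."""
    x = np.asarray(x, dtype=float)
    sd = x.std(ddof=1)
    q75, q25 = np.percentile(x, [75, 25])
    spread = min(sd, (q75 - q25) / 1.34)
    return 0.9 * spread * x.size ** (-0.2)


def gaussian_kde(grid, x, h):
    """Unweighted Gaussian kernel density estimate evaluated on grid."""
    grid = np.asarray(grid, dtype=float)[:, None]
    x = np.asarray(x, dtype=float)[None, :]
    kernel = np.exp(-0.5 * ((grid - x) / h) ** 2) / np.sqrt(2.0 * np.pi)
    return kernel.mean(axis=1) / h


def decomposed_density(grid, log_wage, m, h):
    """Sub-group densities scaled by their population shares."""
    log_wage = np.asarray(log_wage, dtype=float)
    m = np.asarray(m, dtype=int)
    share_migrant = m.mean()
    part_migrant = share_migrant * gaussian_kde(grid, log_wage[m == 1], h)
    part_stayer = (1.0 - share_migrant) * gaussian_kde(grid, log_wage[m == 0], h)
    return part_migrant, part_stayer


# Toy illustration with made-up log hourly wages
log_wage = np.array([3.9, 4.2, 4.4, 4.6, 4.7, 5.0, 5.3, 6.1])
m = np.array([0, 0, 1, 0, 0, 1, 0, 1])
grid = np.linspace(2.0, 8.0, 601)
h = silverman_bandwidth(log_wage)
part_migrant, part_stayer = decomposed_density(grid, log_wage, m, h)
total = gaussian_kde(grid, log_wage, h)
```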

Fig. 3 Kernel densities

Fig. 4 Weighted densities

Figure 3 shows that return migrants tend to account for a larger part of the total distribution at higher hourly wages. There is clearly a clustering of the distribution at higher wages for those who have migrated, which leads to a small “bump” at the top of the overall distribution. These observations based on the raw distributions are interesting but cannot reveal the real effect of migration, as we are comparing subpopulations with rather different characteristics. We already know from Table 1 that migrants tend to be less educated, younger and, more often, male.

The different curves may be due more to these individual characteristics than to migration. We therefore have to go a step further and compare populations with similar characteristics. This can be done in two ways: either by displaying the distribution of wages as if everyone were paid the stayers' wage or by graphing the distribution of wages as if everyone were paid the return migrants' wage. More precisely, in the first case, we answer the following question: Which density function would prevail if the individual characteristics of return migrants had been similar to those of stayers, and if they had been paid according to the wage schedule observed for stayers? This is done in Fig. 5, which gives the hypothetical counterfactual density together with the density of the entire population. The difference between the two curves can be interpreted as the effect of return migration. The curve called the density without migration is calculated using Eq. 23. Figs. 7 to 10 in the Appendix present the propensity scores of the probit and also the weights \(\theta^{1}(z)\) and \(\theta^{2}(z)\). Note that the counterfactual density in Fig. 5 is rather similar to the density of the entire sample. Had the return migrants been paid the same as the stayers and had their characteristics been similar, we would have observed a slightly different density function. Mainly, the small cluster at the top of the distribution disappears and is compensated by a shift of the curve to the right just after the mode of the distribution. So, interpreting the effect of return migration as the difference between the two curves, we can say that its effect is rather small at the bottom of the distribution and can explain the bump at around 6 log hourly Lek.
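The comparison between the actual density and the density without migration can be sketched as below. The sketch assumes that the weights \(\theta^{1}(z)\) for the stayers have already been computed, rescales them to sum to one inside the estimator (a choice made here for convenience) and uses the bandwidth of 0.14 reported above purely as an example value. Function names and data are hypothetical.

```python
import numpy as np


def weighted_kde(grid, x, weights, h):
    """Weighted Gaussian kernel density estimate evaluated on grid."""
    grid = np.asarray(grid, dtype=float)[:, None]
    x = np.asarray(x, dtype=float)[None, :]
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()   # rescaled so the density integrates to one
    kernel = np.exp(-0.5 * ((grid - x) / h) ** 2) / np.sqrt(2.0 * np.pi)
    return (kernel * weights[None, :]).sum(axis=1) / h


def migration_effect(grid, log_wage, m, theta1, h):
    """Actual density, density without migration, and their difference."""
    log_wage = np.asarray(log_wage, dtype=float)
    m = np.asarray(m, dtype=int)
    # Actual density: the entire sample with equal weights
    actual = weighted_kde(grid, log_wage, np.ones(log_wage.size), h)
    # Counterfactual g^s(w): stayers only, reweighted by theta1(z)
    without_migration = weighted_kde(grid, log_wage[m == 0], theta1, h)
    return actual, without_migration, actual - without_migration


# Toy illustration with made-up values (three stayers, one return migrant)
grid = np.linspace(0.0, 10.0, 2001)
log_wage = [4.0, 4.5, 5.0, 6.0]
m = [0, 0, 0, 1]
theta1 = [0.9375, 1.5, 1.0]
actual, without_migration, effect = migration_effect(grid, log_wage, m, theta1, h=0.14)
```

Because both curves integrate to one, the difference integrates to zero: mass that the counterfactual removes at one wage level reappears elsewhere, which is why the disappearance of the upper cluster is accompanied by a shift near the mode.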

Fig. 5 Hypothetical density without migration

Figure 6 gives complementary information, as here the reference is the return-migrant subpopulation. The counterfactual curve is now the density that would have prevailed if the characteristics of return migrants were similar to those of stayers. This would have resulted in a density function lying to the right of the actual one. This counterfactual distribution is nearly bimodal, with a second (lower) mode at a higher wage. These figures give more support to the negative selection of return migrants. In particular, we observe here that the effect of migration would have been much stronger had the return migrants' characteristics been more similar to those of stayers.

Fig. 6 Hypothetical density with migration

6.3 Results with disaggregated characteristics

In this section, we check whether the above results, which are based on the mean income of all individuals, still hold when individuals are disaggregated by qualification level, age and type of employment (self- or wage employment). Using the maximum likelihood estimates, we therefore calculate the marginal and conditional expected hourly wages for three different characteristics: those with more and less than 14 years of schooling, those who are more and less than 30 years of age, and wage and self-employed workers (see Appendix, Table 11).Footnote 21

The first cell of the first column of Table 4 shows that the stayers, had they migrated, would have earned 1.17 log points more than the return migrants' actual earnings. The first cell of column 3 shows that the return migrants, had they decided to stay, would have earned 0.42 log points less than the actual earnings of stayers. These results strongly suggest that the subpopulation of stayers is composed of better performers. For all decompositions of the population, by age, employment and level of education, stayers would have performed better had they migrated. We observe that highly educated (young and old) stayers would have gained more, had they decided to move, than the low-educated ones, compared to similar migrants. Also, highly educated return migrants (young and old) would have lost more, had they stayed, compared to stayers with the same education level.

Table 4 Absolute advantage for different characteristics

Another area of interest is the individual comparative advantage for each subpopulation. Here, the comparison is made between what the individuals would have earned (had they decided otherwise) and what they are actually earning. So the first cell of the first column of Table 5 shows that low-educated stayers are earning 1.24 log points less than what they would be earning, had they decided to move. The first cell of column 3 implies that the less qualified return migrants earn 0.59 log points more than if they had chosen not to migrate. The results confirm that, for each type of characteristic, migrants made the right decision. However, as mentioned in the previous section, the stayers must face unobserved costs of migration, which prevent them from migrating despite the fact that they would have been financially better off doing so. Therefore, the results found earlier for the aggregated subpopulation (Table 3) are not affected when we take into account the different characteristics.

Table 5 Comparative advantage for different characteristics

7 Conclusion

Data on Albanian migration suggest that a sizeable proportion of migrants return to Albania after a short spell abroad. This predominance of return behaviour in Albanian migration offers an interesting case study for investigating the effect of migration on the source country labour market. Using a sample of 594 individuals active on the Albanian labour market, we compare those who returned after a spell abroad (204 individuals) with those who never migrated (390 individuals).

We have investigated the negative or positive selection of return migrants by comparing their performance in the source country with that of the stayers in the framework of the Roy theoretical model of self-selection. We found support for the negative selection of return migrants. Using counterfactual analysis, we found that, had the stayers decided to migrate and return, they would have earned a higher hourly wage (in the order of 0.17 to 1.09 log points) than the return migrants. Applying the semi-parametric approach of DiNardo et al. (1996), it was shown that return migration results in a slight rightward shift of the wage distribution in Albania. However, had the characteristics of migrants been more similar to those of stayers, the wage distribution would have moved markedly to the right. We interpret this result as further supporting evidence for the negative selection of return migrants compared to the stayers. We explain the choice of stayers by their higher costs of migration. Being on average more skilled, they would face higher assimilation costs in the host countries, such as acquiring knowledge of the host country language and obtaining recognition of their formal training acquired at home. For typical low-skilled migrants, such costs are much lower as they are expected to be active in menial jobs, where few contacts and relatively little training are required. We also observed that rewards to the typical human capital variables, age and education, are not statistically significant in the home country labour market for the return migrants, whereas the opposite prevails for the stayers.

This paper adds to the scant literature on the self-selection process characterising the flows of return migrants in the context of the source country labour market. Albania is a relatively poor and small country with a dominant agricultural sector typical of a large number of Central and Eastern European and developing countries. We may expect our results to apply to similar countries as well.

As a potential policy implication, we may mention the increased hourly wage of returnees due to their spell abroad, despite their appearing to be negatively selected. This is clearly beneficial for the source country economy, especially as a large proportion of the returnees appear to set up successfully as self-employed. It could therefore be inferred from this behaviour that credit constraints play an important role in the decision to leave, work and save abroad, and then return to participate in the local economy. It seems, therefore, that better access to the credit market would be helpful in promoting higher pay-offs through self-employment in Albania.

For the host countries, a common worry has been the adverse effect of large flows of unskilled immigrants entering their labour markets. It appears that, at least in the case of Albania, a large proportion of immigrants choose the short-term (or guest worker) option. Nevertheless, it may be advisable for countries fearing these adverse effects to implement short-term work permits in order to monitor such flows better. Finally, host countries could also try to lessen the incoming flows by favouring the creation of micro-credit institutions in the source countries.