1 Introduction

Educational integration is an important precondition for the economic assimilation of immigrants in the host societies. International student assessment studies raise concerns about the integration of immigrant children in schools. The Program for International Student Assessment (PISA) and the Trends in International Mathematics and Science Study (TIMSS) show that students who were born abroad perform significantly worse in the achievement tests than native students. The average achievement gaps in the OECD countries amount to about 25 test score points in math and 28 in science (25% and 28% of the standard deviations in test scores).

Several studies investigate the achievement gaps in more detail. Entorf and Minoiu (2005) have shown that not only the PISA achievement gaps between migrants and non-migrants vary substantially across OECD countries, but also the socioeconomic background of the immigrants and its influence on achievement. Ammermüller (2007) has raised the question of why immigrants in Germany performed so poorly in PISA. The answer is twofold: immigrants in Germany come from less favorable social backgrounds and they get lower returns to their characteristics than German natives.

Why do we observe large gaps in cognitive skills between students with foreign background and native students? Can these gaps be explained by differences in student characteristics? And most importantly, what can policy do? How should schooling be organized to further the integration of children with foreign backgrounds?

This essay is aimed at quantifying the disadvantage of immigrant children in education and relating it to institutional conditions of the education system. In the first step, educational integration of immigrants and second-generation immigrants is measured and made comparable across countries and time, using micro-data of several international student assessment studies. In the second step, I estimate the effects of certain characteristics of the education system, such as pre-primary education, time in school, or the segregation of students among schools, on the mean level of educational integration based on a cross-country time-series analysis.

2 The integration of immigrants

The raw data of various achievement tests show a substantial disadvantage for students with migration backgrounds. These achievement gaps cannot be compared directly across countries, since educational success is largely determined by the social background of the students (Hanushek and Luque 2003; Wößmann 2005a) and different countries have different immigrant populations. Depending on their income situation, geographic region, immigration policy, and many other characteristics, countries attract migrants with different abilities and social backgrounds.

I use the Blinder–Oaxaca decomposition to construct a measure of integration that is comparable across countries and time (Blinder 1973; Oaxaca 1973). The mean achievement gap between natives and migrants in a country is decomposed into a part that is explained by differences in social background characteristics and a part that remains unexplained. Educational production functions are estimated separately for natives, immigrants, and second-generation immigrants. The average native, immigrant, and second-generation immigrant test scores (\(\overline{Y}_n, \overline{Y}_i, \overline{Y}_s\)) can be written as products of the estimated coefficients, including the intercepts (\({\hat{\boldsymbol{\upbeta}}_{\bf n}}, \hat{\boldsymbol{\upbeta}}_{\bf i}, \hat{\boldsymbol{\upbeta}}_{\bf s}\)) and the average endowments (\(\mathbf{\overline {X}_n, \overline {X}_i, \overline {X}_s}\)) of the three groups:

$$ \overline{Y}_n=\hat{\boldsymbol{\upbeta}}_{\bf n} \mathbf{\overline{X}_n}, ~~\overline{Y}_i= \hat{\boldsymbol{\upbeta}}_{\bf i} \mathbf{\overline{X}_i} ~~\mbox{and}~~ \overline{Y}_s= \hat{\boldsymbol{\upbeta}}_{\bf s} \mathbf{\overline{X}_s}. $$
(1)

The average achievement gaps between students with foreign background and native students can be formulated as

$$ \Delta \overline{Y}_{i-n}=\hat{\boldsymbol{\upbeta}}_{\bf i} \mathbf{\overline{X}_i} - \hat{\boldsymbol{\upbeta}}_{\bf n} \mathbf{\overline{X}_n} = \underbrace{ {\hat{\boldsymbol{\upbeta}}_{\bf n}} \left(\mathbf{\overline{X}_i} - \mathbf{\overline{X}_n}\right)}_{\mbox{\emph{explained}}} + \underbrace{ \mathbf{\overline{X}_i} \left(\hat{\boldsymbol{\upbeta}}_{\bf i} - {\hat{\boldsymbol{\upbeta}}_{\bf n}}\right)}_{\mbox{\emph{unexplained}}}. $$
(2)
$$ \Delta \overline{Y}_{s-n}=\hat{\boldsymbol{\upbeta}}_{\bf s} \mathbf{\overline{X}_s} - \hat{\boldsymbol{\upbeta}}_{\bf n} \mathbf{\overline{X}_n} = \underbrace{ {\hat{\boldsymbol{\upbeta}}_{\bf n}} \left(\mathbf{\overline{X}_s} - \mathbf{\overline{X}_n}\right)}_{\mbox{\emph{explained}}} + \underbrace{ \mathbf{\overline{X}_s} \left(\hat{\boldsymbol{\upbeta}}_{\bf s} - {\hat{\boldsymbol{\upbeta}}_{\bf n}}\right)}_{\mbox{\emph{unexplained}}}. $$
(3)

The explained part of the test score gap reflects that students with foreign background may be endowed with less favorable socioeconomic characteristics and, therefore, may be less successful in education. The unexplained part of the achievement gap can be interpreted as a measure of integration. Multiplying the unexplained part by − 1 gives the answer to the following question: By how many test score points would immigrants perform better, given their own endowments, if they had the same returns as native students? (Footnote 1) As mentioned above, a similar approach was used by Ammermüller (2007) to study the PISA achievement gap between migrant students and native Germans.
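The decomposition in Eqs. 2 and 3 can be illustrated with a minimal Python sketch. The coefficient vectors and mean endowments below are hypothetical values chosen for arithmetic convenience; they are not estimates from PISA or TIMSS, and the function names are illustrative only.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def oaxaca(beta_n, beta_g, xbar_n, xbar_g):
    """Decompose the mean gap (group minus natives) as in Eqs. 2 and 3.

    beta_*: coefficient vectors, including the intercept.
    xbar_*: mean endowments, with a leading 1.0 for the intercept.
    """
    total = dot(beta_g, xbar_g) - dot(beta_n, xbar_n)
    explained = dot(beta_n, [g - n for g, n in zip(xbar_g, xbar_n)])
    unexplained = dot(xbar_g, [bg - bn for bg, bn in zip(beta_g, beta_n)])
    return total, explained, unexplained


# Hypothetical inputs: intercept, parental education, books at home.
beta_n = [400.0, 8.0, 20.0]   # natives
beta_i = [390.0, 6.0, 15.0]   # immigrants
xbar_n = [1.0, 12.0, 1.5]
xbar_i = [1.0, 10.0, 1.0]

total, explained, unexplained = oaxaca(beta_n, beta_i, xbar_n, xbar_i)
print(total, explained, unexplained)
```

With these made-up inputs, the total gap of −61 points splits into −26 explained and −35 unexplained points, so multiplying the unexplained part by − 1 would suggest a gain of 35 points under native returns.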

The separate estimation of the achievement function for the three groups allows heterogeneous returns to individual characteristics for natives and students with migration background. It is plausible to assume that natives and foreign students are different populations and obtain different returns to their endowments. A high educational attainment of parents, for example, might not have the same positive impact for migrant students as for natives. Similarly, studies on the returns to education on the labor market show that individuals with migration background get a significantly smaller payoff to their education (Chiswick and Miller 2008; Hartog and Zorlu 2009).

The unexplained part is interesting to analyze and compare across countries; nevertheless, in its raw form it is problematic for the purpose of this paper. The analysis of institutional effects needs a measure of integration that is comparable across countries and does not depend on the average characteristics of immigrant students in a certain country. The question the measure should be able to answer is the following: How much better would a representative student with foreign background perform in a given institutional regime if he or she had the same returns as the native students in that regime?

The unexplained test score gaps are, therefore, standardized:

$$ I_{i} = \mathbf{\overline{X}^{st}_{i}} ({\hat{\boldsymbol{\upbeta}}_{\bf i}} - \hat{\boldsymbol{\upbeta}}_{\bf n}) ~~\mbox{and}~~ I_{s} =\mathbf{\overline{X}^{st}_{s}} ( {\hat{\boldsymbol{\upbeta}}_{\bf s}} - {\hat{\boldsymbol{\upbeta}}_{\bf n}}), $$
(4)

where \(\mathbf{\overline{X}^{st}_{i}}\) and \(\mathbf{\overline{X}^{st}_{s}}\) are the mean characteristics of immigrants and second-generation immigrants in the whole sample. The sensitivity of the results on institutional effects to this standardization is discussed later on.
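A small sketch may help to see what the standardization in Eq. 4 does. It assumes, as an illustration only, that the whole-sample mean endowments are obtained by weighting each country-year's group means by the number of students in the group; the exact pooling used in the paper is not stated here, and all numbers are hypothetical.

```python
def pooled_mean(xbars, counts):
    """Student-count-weighted mean of group mean vectors across country-years."""
    total = sum(counts)
    k = len(xbars[0])
    return [sum(x[j] * n for x, n in zip(xbars, counts)) / total for j in range(k)]


def standardized_gap(x_st, beta_g, beta_n):
    """Eq. 4: difference in returns evaluated at the common endowment vector."""
    return sum(x * (bg - bn) for x, bg, bn in zip(x_st, beta_g, beta_n))


# Hypothetical immigrant mean endowments in two country-years
# (intercept, parental education, books at home) and group sizes.
xbars_i = [[1.0, 10.0, 1.0], [1.0, 12.0, 2.0]]
counts_i = [100, 300]
x_st_i = pooled_mean(xbars_i, counts_i)

# Hypothetical coefficients for the first country-year.
beta_n = [400.0, 8.0, 20.0]
beta_i = [390.0, 6.0, 15.0]
I_i = standardized_gap(x_st_i, beta_i, beta_n)
print(x_st_i, I_i)
```

Because the same endowment vector is used for every country-year, differences in the resulting measure reflect only differences in estimated returns, not differences in the composition of the immigrant population.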

Note that the measure of integration is a relative one. It gives the disadvantage of students with migration backgrounds, relative to the native students in that country. This is exactly the measure I need to represent the situation of immigrants. It is not important whether immigrants in the USA do worse than German natives or the average native in the sample. The important question is the relative position of immigrant students in the societies of their host countries, where they are going to live and work.

I include individual and family background variables in the achievement regressions. School resources, like class size or teacher characteristics, may play an important role, but they are not randomly allocated across schools, and neither are students with migration background. The allocation of school resources is itself an instrument of integration policy, so controlling for school characteristics in the educational production functions would overestimate the true level of integration.

2.1 Data from PISA and TIMSS

I use micro-data from several waves of two different international student assessment studies. TIMSS has been conducted by the International Association for the Evaluation of Educational Achievement (IEA) in 1995, 1999, and 2003 in about 50 different countries and PISA has been organized by the OECD in 2000 and 2003. In both surveys, about 4,000 secondary education students from about 170 schools were assessed in each participating country in each wave. Among other things, the surveys provide estimates of student proficiencies in mathematics and science, as well as detailed background information of students and schools.

My sample consists of 167 country-years, which span a time period of 9 years (from 1994 to 2003). See Table 1 for a list of the countries. Some country-years were dropped from the sample because of missing background information. Since the decomposition approach is based on separate estimations for immigrants, second-generation immigrants, and natives in each country, I further dropped all country-years with fewer than 40 students in a group. As will be explained below, I calculated the standard errors of the decompositions and use their inverse as weights in the regressions in the second part of the paper to account for differences in the number of observations.
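The sample restriction described above amounts to a simple filter on group sizes. The sketch below uses hypothetical country labels and student counts and is meant only to make the rule explicit.

```python
MIN_GROUP_SIZE = 40  # threshold stated in the text


def keep_country_year(group_counts, min_size=MIN_GROUP_SIZE):
    """Keep a country-year only if every group has at least min_size students."""
    return all(n >= min_size for n in group_counts.values())


# Hypothetical country-years with the number of students per group.
samples = {
    ("Country A", 2003): {"native": 3900, "immigrant": 210, "second_gen": 150},
    ("Country B", 1999): {"native": 4100, "immigrant": 35, "second_gen": 60},
}

kept = [key for key, counts in samples.items() if keep_country_year(counts)]
print(kept)
```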

Table 1 List of countries used in the analysis

For each of the 167 country-years, I estimate integration of immigrants and second-generation immigrants (\(I_i\), \(I_s\)). The dependent variable in the underlying educational production function is the student test score in PISA and TIMSS, respectively, and individual student characteristics are age, grade, sex, the highest education level obtained by parents, the number of books at home, whether students have a computer, a calculator, and a desk to study at home and whether they speak the national language at home. Table 2 gives summary statistics and a description of the student-level variables (Footnote 2).

Table 2 Student-level variables and decomposition results

This rich list of explanatory variables represents the individual characteristics of the students and their family background. Some more variables concerning the immigration status, like the reasons why the families migrated, the number of years since immigration, and the home countries of the immigrants cannot be observed in most data sets. A variable that is seen to play an important role for the economic assimilation of immigrants is whether the students speak the national language at home. This variable is available in the data and included in the achievement regressions. Since this variable is critical and may be correlated with unobserved characteristics of the host country, a sensitivity check without this control is presented in the second part of the paper.

The achievement functions are estimated with survey regressions, with students weighted according to sampling probabilities, and the dependence of standard errors within clusters (schools) is taken into account. Part of the difference in the study designs between PISA and TIMSS can thereby be eliminated.

PISA and TIMSS are of similar type; both are aimed at obtaining an internationally comparable measure of the proficiency of secondary school students and both incorporate a comparable quality standard with respect to the design and implementation of the assessment (Footnote 3). The similarity of the PISA and TIMSS survey designs allows the use of both studies together. Concentrating on PISA or TIMSS alone is not possible in this study because the sample size would be too small, especially for the country-fixed-effects regressions in the second part of the paper. See the Appendix for a short description of differences between PISA and TIMSS, as well as the applied transformation strategy to reach comparability of student achievement scores. A more detailed analysis of the comparability of PISA and TIMSS is given in Brown et al. (2007).

2.2 Integration in various countries

This section summarizes the actual (non-standardized) results of the Blinder–Oaxaca decompositions in mathematics and science. Figure 1 shows the total achievement gaps between foreign students and natives, decomposed into an explained and an unexplained part. Because of the wide range of countries, these are arranged into nine country groups, for which mean values are reported.

Fig. 1 Achievement gaps by country groups

On average, students with foreign backgrounds achieve lower scores than native students, and a part of the test score gap can be explained with differences in student characteristics in each country group. The total gaps are similar in math and science and range from about − 34 points in Middle and Northern Europe to about − 2 in the Near East. Remember, the test scores are normally distributed with a (weighted) mean of 500 and a (weighted) standard deviation of 100 in math and science.
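For readers who want to see what a weighted mean of 500 and a weighted standard deviation of 100 imply in practice, the following sketch rescales a handful of hypothetical raw scores using sampling weights. It is one plausible way to obtain such a scale and is not necessarily the transformation described in the Appendix.

```python
import math


def rescale_scores(scores, weights, target_mean=500.0, target_sd=100.0):
    """Linearly rescale scores to a given weighted mean and weighted SD."""
    wsum = sum(weights)
    mean = sum(w * s for s, w in zip(scores, weights)) / wsum
    var = sum(w * (s - mean) ** 2 for s, w in zip(scores, weights)) / wsum
    sd = math.sqrt(var)
    return [target_mean + target_sd * (s - mean) / sd for s in scores]


# Hypothetical raw scores and sampling weights.
scores = [420.0, 510.0, 480.0, 590.0]
weights = [1.0, 2.0, 1.0, 1.0]
rescaled = rescale_scores(scores, weights)
print(rescaled)
```

On this scale, a gap of 34 points corresponds to roughly one third of a standard deviation.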

The Middle and Northern European countries show large achievement gaps, − 33 in math and − 35 in science, with about 15% and 26% remaining unexplained. In Southern Europe, both gaps are about − 10, and very small fractions remain unexplained. In Eastern Europe, the gaps are higher in relatively rich countries and the unexplained part is positive, meaning that foreign students receive higher returns to their characteristics than natives. The same holds in the English-speaking countries, which have rather small total gaps.

An interesting pattern arises when comparing the integration measures with and without controlling for the national language proficiency in the achievement regressions (these comparisons are not shown in the figures). Without the language control, 36% and 43% of the gaps remain unexplained in Middle and Northern Europe, and 32% and 50% in Southern Europe. Furthermore, in the relatively rich Eastern European countries, as well as in the English-speaking countries, the unexplained differentials turn zero in math and negative in science. Hence, proficiency in the national language is a major vehicle for migrant students to catch up in education. This result is an important finding, since language proficiency can be influenced by public policy in different ways, like the provision of special language courses in schools or language training for adult immigrants.

Most countries in the sample are members of the OECD and the decomposition results for these countries are shown separately in Fig. 2. In addition to the large variation in achievement gaps and unexplained differentials, the graphs tell us two important stories:

  • The English-speaking countries are found in the middle and lower tail of the gap distribution. In these countries, the unexplained test score gap is always positive in math and mostly positive in science; thus, children with migration background get higher returns to their characteristics than native students. One explanation for this result might be the fact that English is a world language and, therefore, it might be easier for migrants to integrate in such countries. Furthermore, Australia, Canada, New Zealand, and the USA are frequently characterized as traditional countries of immigration, and most of these countries follow a selective immigration policy, targeted at highly educated individuals with professional skills and adequate language proficiency (Miller 1999; Entorf and Minoiu 2005).

  • Within Continental Europe, the German-speaking countries, the Benelux countries, France, and the Scandinavian countries can be found in the upper part of the distribution and the Southern and Eastern European countries are ranked in the lower tail.

Fig. 2 Achievement gaps in math and science in the OECD

As mentioned above, a standardized version of the unexplained differential is used as measure of integration in the regression analysis of the second part. Table 2 gives summary statistics of the standardized unexplained test score gaps in math and science. In both subjects, second-generation immigrants face a higher level of integration than first-generation immigrants. This result was expected as the assimilation of immigrants is associated with their length of stay in the host country.

The unexplained part of the test score gap can be interpreted as a measure of integration, since, multiplied by − 1, it tells us how much better students with migration backgrounds would perform if they had the same returns as native students. The measure is not reliable and cannot be compared if unobserved ability differences between natives and immigrants in the various countries exist that influence the test scores of the students.

2.3 Unobserved ability differences?

The immigrants of a given country are a highly selected group of people. Certain factors motivated their decision to migrate, while others decided to stay in their country. Economic models have been developed that investigate the selectivity of economic migrants with respect to their ability. The most important is the Roy model, applied by Borjas (1987, 1999) and extended by Chiswick (1999). This migration model assumes that the rate of return from migration is different for high-ability and low-ability individuals and determines whether an individual decides to migrate. Positive fixed costs of migration lead to a positive selection of migrants, which is intensified if high-ability individuals are more efficient in the migration process. Furthermore, economic immigrants are negatively selected if the wages in the destination, relative to the home country, are higher for low-ability individuals. This result implies that, for a constant ability distribution across countries, a lower relative income inequality in the destination country negatively selects migrants. In total, due to the costs of migration and the likelihood that such costs are lower for high-ability individuals, economic migrants are positively self-selected. This selectivity is diminished if the relative income inequality is higher in the home country.

Economic reasons are not the only reasons why people migrate. Refugees have to move because their safety or freedom is at risk, and other people move to accompany family members in other countries. Such migrants are mostly not favorably selected, as studies on unemployment and earnings show (Chiswick 1999). Furthermore, not only does the supply of immigrants determine the foreign population in a country, but demand-side effects are relevant, too. Some countries follow an immigration policy that is restricted to well-educated immigrants with good language skills.

Overall, as long as ability and motivation cannot be observed entirely, integration is likely to be over- or underestimated depending on unobserved ability differences. Economic theories predict that, in countries with a relatively low level of income inequality and a big part of immigration due to non-economic reasons, immigrants are likely to be negatively self-selected with respect to their ability. On the contrary, a selective immigration policy leads to a positive selection of immigrants. Thus, the low level of integration in the European countries may be underestimated, while the high (or even positive) level of integration in traditional immigration countries may be overestimated.

3 The role of institutions

What is the influence of the education system and what can policy do to further the integration of students with migration background? To find answers to these questions, integration is related to institutional characteristics of the education system.

Segregation. School systems differ with respect to the segregation of migrants and poor students among schools. A high degree of migrant or social segregation is caused either by selectivity mechanisms of the education system, like general tracking, or by a high degree of residential segregation in comprehensive education systems. On the one hand, immigrants may profit from segregated schools because teachers may be more able to target the needs of the students in more homogeneous classes. On the other hand, a higher degree of segregation can harm immigrant children because they have a higher probability of being allocated to low-grade school types and schools (e.g., Rees et al. 1996; Epple et al. 2002). Attending a lower-grade school type or a school in a poor neighborhood can have negative effects for mainly two reasons: school resources might not be equally allocated to schools, and the absence of clever classmates and students from supporting homes with caring parents may have negative effects on the learning climate. Empirical studies on peer effects show that especially low-ability students and students from less favorable family backgrounds profit from being placed in schools and classes with high-ability students or kids from parents with favorable socioeconomic characteristics (Winston and Zimmerman 2003; Sacerdote 2001; Schneeweis and Winter-Ebmer 2007).

Time in school. If foreign students spend more time in school, pedagogically supported, together with kids of other ethnic groups, they should interact with each other and learn the national language and other national habits, and integration can take place. A full-time school system should, therefore, lead to a higher degree of integration. On the other hand, especially for students with learning or language problems, too much time in school might be too demanding. Aksoy and Link (2000) investigated US panel data and found mathematics achievement to be positively affected by the number of minutes per math class. The number of legal days in school and hours of school per week show no consistent effects. Lewis and Seidman (1994) found large positive effects of the length of the school year in a cross-section analysis.

Pre-primary education. Carneiro et al. (2005) have studied labor market discrimination of ethnic minorities in the USA and argue that deficits in cognitive skills of minorities emerge early and widen with schooling. The authors recommend that policy measures to increase the labor market success of minority groups should be applied as early as possible. Early-childhood programs, like kindergartens, day-care centers, and pre-schools, are aimed at preparing children for primary education and providing an equal starting point for all children. Currie (2001) investigated pre-school programs in the USA and found significant benefits for educational attainment and earnings, especially for disadvantaged children. In another study, Currie and Thomas (1999) focused on the impacts of Head Start, a subsidized pre-school program in the USA. The authors show that all children benefit from Head Start, compared to their siblings who did not attend the program, and Head Start closes one quarter of the test score gap between Hispanic and white children. This evidence on pre-primary education suggests that a country should be more effective in decreasing inequality the more children of immigrants and second-generation immigrants attend pre-primary education.

Entry age of schooling. Whether it is better to enroll children in school at an earlier or later time has been discussed frequently. Most economic studies in this regard rely on within-country variation in entry age due to month or quarter of birth (e.g., Angrist and Krueger 1992). In this study, variation in the school entry age across countries is investigated. It is hypothesized that enrollment at age 7 is detrimental for immigrant students, since the integration process in school starts later. Attendance at age 5 should operate the other way around and reduce the drawback of migration.

Pupil–teacher ratio. The question of whether class size affects student achievement has been studied extensively. Two prominent studies should be mentioned. Krueger (1999) investigated the Tennessee Student–Teacher Achievement Ratio experiment (STAR) and showed that smaller classes in primary education help students, especially those from disadvantaged families. The same result was found by Angrist and Lavy (1999) for Israeli primary-education students. The authors employed a regression discontinuity design, exploiting random variation in class size due to grade enrollment and a maximum class size rule. The pupil–teacher ratio in primary education, thus, is assumed to have a negative impact on the integration of foreign students.

External student assessment. Central examinations restrict the latitude of teachers’ grading practices, provide information on the relative standing of students and schools, and induce parental and public pressure on students, teachers, and schools. Thus, central student assessments should be positively related to academic achievement. Wößmann (2005b) has shown that central exams exert heterogeneous performance effects and reduce the achievement drawback of children with migration background.

Remedial and enrichment courses. Immigrant children should profit from school systems that offer special courses in academic subjects for low-achieving students. Enrichment activities for gifted students, on the other hand, may increase the unexplained achievement gap if students with migration backgrounds are less frequently promoted in such programs.

3.1 Analysis of institutional effects

I use pooled weighted least squares and fixed-effects regressions to analyze institutional effects on integration. The model can be written as

$$ I_{ct} = \alpha_0 + \mathbf{\alpha_1 E_{ct}} + \mathbf{\alpha_2 Y_{ct}} + \mathbf{\alpha_3 C_{ct}} + v_c + u_{ct}, $$
(5)

where \(c\) and \(t\) index countries and time. The dependent variable \(I_{ct}\) is the standardized unexplained test score gap of immigrants and second-generation immigrants, respectively. The vector \(\mathbf{E_{ct}}\) represents educational institutions, \(\mathbf{Y_{ct}}\) denotes the income situation of the country, and \(\mathbf{C_{ct}}\) is a vector of control variables. The error term is split up into a part that is constant within each country, \(v_c\), and an idiosyncratic part, \(u_{ct}\).

A consistent estimation of \(\alpha_1\) requires that \(\mathbf{E_{ct}}\) and the error terms are uncorrelated. A country-fixed-effects model is a natural way to eliminate the country-specific error term \(v_c\), which, among other things, includes the time-invariant unobserved ability composition of the immigrant population. This unobserved ability composition may be correlated, for example, with the segregation of migrants among schools or enrollment in pre-primary education. In a country-fixed-effects model, the identifying assumption is reduced to the condition that foreign students observed in 1994 should not differ from those in 2003 in their unobserved characteristics within each country.
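How the fixed-effects (within) transformation removes the country-specific term can be seen in a deliberately simplified sketch with one regressor, no weights, and hypothetical data generated as a country effect plus two times the regressor. The full model in Eq. 5 has several regressors and weights, which this sketch leaves out.

```python
from collections import defaultdict


def within_transform(rows):
    """rows: list of (country, y, x). Returns (y, x) pairs demeaned by country."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for country, y, x in rows:
        s = sums[country]
        s[0] += y
        s[1] += x
        s[2] += 1
    demeaned = []
    for country, y, x in rows:
        sy, sx, n = sums[country]
        demeaned.append((y - sy / n, x - sx / n))
    return demeaned


def slope_through_origin(pairs):
    """OLS slope of y on x without intercept (valid after demeaning)."""
    sxy = sum(x * y for y, x in pairs)
    sxx = sum(x * x for _, x in pairs)
    return sxy / sxx


# Hypothetical data: y = v_c + 2 * x with v_A = 10 and v_B = -30.
rows = [
    ("A", 12.0, 1.0), ("A", 14.0, 2.0), ("A", 16.0, 3.0),
    ("B", -22.0, 4.0), ("B", -20.0, 5.0), ("B", -18.0, 6.0),
]
fe_slope = slope_through_origin(within_transform(rows))
print(fe_slope)
```

In this constructed example the country effect is correlated with the regressor, yet the within estimator recovers the slope of 2 because only deviations from country means are used.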

The country-fixed-effects model uses the variation within countries only, capturing a time component and a sampling component. Thus, next to changes over a time span of about 10 years, an additional source of variation is available. The dependent variable and most of the institutional variables are extracted from PISA and TIMSS data, as will be explained in the next section. While, in PISA, 15–16-year-old students are sampled, in TIMSS, grade-8 students are assessed. There is additional variation due to the fact that TIMSS and PISA students face different environments in their schooling systems. The students for which integration is calculated at a given year are confronted with a set of environmental characteristics and exactly this set is related to their levels of integration.

A variance decomposition in between and within components shows that, out of the eight institutional variables, the within variance is above 20% for six and above 30% for two variables. The dependent variables show a within variance component of above 40%. Thus, the within variation should be sufficient to estimate a fixed-effects model.
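The within share of the variance can be computed as the ratio of the within-country sum of squares to the total sum of squares. The sketch below assumes this simple unweighted definition, which may differ in detail from the decomposition used for the paper; the values are hypothetical.

```python
def within_share(values_by_country):
    """Share of total variation that occurs within countries."""
    all_values = [v for vals in values_by_country.values() for v in vals]
    grand_mean = sum(all_values) / len(all_values)
    total_ss = sum((v - grand_mean) ** 2 for v in all_values)
    within_ss = 0.0
    for vals in values_by_country.values():
        country_mean = sum(vals) / len(vals)
        within_ss += sum((v - country_mean) ** 2 for v in vals)
    return within_ss / total_ss


# Hypothetical segregation index values for two countries in two waves.
segregation = {"A": [0.2, 0.4], "B": [0.6, 0.8]}
share = within_share(segregation)
print(share)
```

In this example 20% of the variation lies within countries and the remaining 80% lies between them.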

A potential problem of a country-fixed-effects model is that the coefficients of variables that do not change over time cannot be estimated. Furthermore, differencing out country effects may increase attenuation bias due to measurement error. Measurement errors might arise in this study from an imprecise measurement of integration, from the fact that it is calculated from different student assessment studies, and from the aggregation of institutional characteristics to the country level. For these reasons, three methods are used to estimate the model: pooled weighted least squares, a model with country-group dummies as listed in Table 1, and a country-fixed-effects model.

Equation 5 explicitly includes \(\mathbf{Y_{ct}}\), the level of income, as well as income inequality in the host country. The income situation should capture the general availability of resources, as well as the potential selectivity of migrants according to the Roy model. \(\mathbf{C_{ct}}\) includes the mean national test score as an overall quality measure of the education system; a dummy for second-generation immigrants; wave dummies; country-group dummies or country dummies, depending on the model; and country-group-wave dummies in both the country-group and the country-fixed-effects models.

As mentioned above, since the dependent variable is constructed, following Silber and Weber (1999) and Card and Krueger (1992), observations are weighted by the inverse of the disturbance to the dependent variable. Standard errors of the decompositions are obtained by bootstrapping, with 200 bootstrap replications employed, and their inverse is used as weight. Country-years, in which integration is estimated with a lower degree of accuracy, are weighted less in the regressions. This is important because, for example, as the number of migrants is small in some countries, the coefficients of the achievement regressions might not always be significant or the background characteristics of migrants might be measured with error in some countries.
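The weighting logic can be sketched as follows. For brevity, the statistic being bootstrapped is a simple mean rather than the full decomposition, and observations are resampled individually; both are simplifying assumptions, and the two samples are hypothetical. Only the number of replications (200) and the use of the inverse standard error as weight are taken from the text.

```python
import random
import statistics


def mean(values):
    return sum(values) / len(values)


def bootstrap_se(sample, statistic, reps=200, seed=0):
    """Bootstrap standard error of `statistic` from `reps` resamples."""
    rng = random.Random(seed)
    n = len(sample)
    estimates = []
    for _ in range(reps):
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        estimates.append(statistic(resample))
    return statistics.stdev(estimates)


# Two hypothetical country-years: same size, different dispersion.
precise = [490.0, 495.0, 500.0, 505.0, 510.0] * 8
noisy = [300.0, 400.0, 500.0, 600.0, 700.0] * 8

se_precise = bootstrap_se(precise, mean)
se_noisy = bootstrap_se(noisy, mean)
weights = {"precise": 1.0 / se_precise, "noisy": 1.0 / se_noisy}
print(weights)
```

The country-year whose estimate is less precise receives the smaller weight in the second-stage regression.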

The model is estimated for the whole sample and for the subsample of OECD countries. Though the OECD sample is rather small, it includes countries that show comparable characteristics, and the identifying assumptions are more likely satisfied. Furthermore, to account for changes in the unobserved characteristics of migrants over time that might be correlated with institutional characteristics, I control for the home countries of the foreign population as a robustness check. For a small number of countries, I have aggregate data on the migration regions of the migrant population stock.

3.2 Explanatory variables

The empirical analysis of institutions is based on data from different sources. See Table 3 for a description and summary statistics of the country-level variables.

Table 3 Country-level variables

First, the PISA and TIMSS databases include useful information on schools, where the relevant school variables are aggregated to the country level (Footnote 4). One might ask why school data are aggregated to the country level and their effects on immigrant performance are not estimated directly. Exploiting the variation among schools entails the problem of student self-selection. If high-ability immigrants are more likely to choose better schools with clever peers and adequate equipment, the effects of school resources cannot be identified. Aggregation helps to overcome this identification problem at the cost of measurement errors from focusing only on the mean level of resources, regardless of the distribution.

Segregation of students among schools is measured by the Duncan and Duncan (1955) dissimilarity index, recently applied by Burgess and Wilson (2005) and Jenkins et al. (2008). The dissimilarity index of migrant segregation is based on a binary variable that splits the population into two groups: \(\mbox{Migrant segregation} = \frac{1}{2}\sum_{s=1}^S \left| \frac{f_s}{F} - \frac{n_s}{N} \right| \), where \(f_s\) and \(n_s\) are the numbers of foreign and native students in school \(s\) and \(F\) and \(N\) are the total numbers in the country. The index ranges from 0 to 1 and gives the fraction of students with migration background that would have to be moved to other schools to ensure an equal representation of foreign students in each school. Analogously, a social segregation index is calculated, where the two groups represent students with more and less than 25 books at home. The segregation indices differ between PISA and TIMSS. While PISA sampled single students from schools, TIMSS assessed whole classes. Thus, the PISA data refer to school segregation, whereas the TIMSS data measure segregation among classes. Next to segregation, the information on time in school, external student assessment, and remedial and enrichment courses is taken from PISA and TIMSS data. Further data sources are the World Bank’s World Development Indicators 2005, the UNESCO Institute for Statistics, and the Trends in International Migration published by the OECD.
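The dissimilarity index is straightforward to compute from school-level counts. The following sketch implements the formula above for hypothetical counts of foreign and native students per school; sampling weights, which a real application to PISA or TIMSS would presumably need, are omitted.

```python
def dissimilarity_index(schools):
    """Duncan and Duncan dissimilarity index.

    schools: list of (foreign_count, native_count) tuples, one per school.
    """
    total_foreign = sum(f for f, _ in schools)
    total_native = sum(n for _, n in schools)
    return 0.5 * sum(
        abs(f / total_foreign - n / total_native) for f, n in schools
    )


# Hypothetical school systems.
even = [(10, 90), (10, 90)]          # same migrant share in every school
partial = [(15, 50), (5, 150)]       # migrants concentrated in one school
complete = [(20, 0), (0, 180)]       # fully separated

print(dissimilarity_index(even))
print(dissimilarity_index(partial))
print(dissimilarity_index(complete))
```

The three cases give 0, 0.5, and 1: in the intermediate case, half of the students with migration background would have to change schools to equalize their representation.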

GDP per capita and the Gini coefficient represent the income situation of the country. GDP per capita is an indicator for the general availability of resources and the Gini coefficient gives a picture of income inequality. Moreover, immigrants in high-income countries with a lower degree of income inequality may be negatively self-selected with respect to their ability. Unfortunately, there is no time variation in the Gini coefficient in the available data. It is assumed that income inequality has not changed substantially within the analyzed time period and the Gini coefficient is taken as fixed for each country.

For a small number of country years, the Trends in International Migration provides information on regions where the migrants in a country come from. This information is used to account for the possibility that unobserved characteristics of the foreign population change over time.

Unfortunately, the country data used are incomplete. Some missing values are imputed from other years. Observations with data that cannot be imputed from other years are not dropped from the sample; instead, missing dummies are included in the regressions.
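The handling of missing country data can be sketched as follows. The text only states that values are "imputed from other years"; the nearest-year rule, the fill value of zero, and the function name `impute_or_flag` below are assumptions made for illustration, not the author's documented procedure.

```python
def impute_or_flag(series):
    """Fill gaps in one country's yearly series for one variable.

    series: dict mapping year -> value, with None marking a missing value.
    Returns (filled, missing_dummy), both dicts keyed by year.
    Assumed rule: use the nearest observed year (earlier year on ties);
    if the country has no observed value at all, fill with 0.0 and set
    the missing dummy to 1 so the regression can absorb the gap.
    """
    observed = {yr: v for yr, v in series.items() if v is not None}
    filled, missing_dummy = {}, {}
    for yr, v in series.items():
        if v is not None:
            filled[yr], missing_dummy[yr] = v, 0
        elif observed:
            nearest = min(observed, key=lambda o: (abs(o - yr), o))
            filled[yr], missing_dummy[yr] = observed[nearest], 0
        else:
            filled[yr], missing_dummy[yr] = 0.0, 1
    return filled, missing_dummy


# A value observed only in 2000 is carried to the other survey years.
print(impute_or_flag({1995: None, 2000: 14.0, 2003: None}))
# A variable that is never observed is zero-filled and flagged.
print(impute_or_flag({1995: None, 2003: None}))
```

Including the filled variable together with its missing dummy keeps all country-years in the sample, while the dummy captures the mean difference for observations without data.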

4 Results

This section contains the results of the baseline specification for all countries, the results for the OECD sample, and a number of robustness checks.

4.1 Baseline specification for all countries

Table 4 gives the estimation results in math and science for all countries. The first two columns of figures contain the results of the pooled WLS estimations, followed by the country-group effects and country-fixed-effects models. The regressions are weighted with the inverse standard error of the underlying decomposition, and cluster-robust standard errors (clustered by country) are reported. The dependent variable is integration of immigrants and second-generation immigrants in math and science, respectively. The effects of income inequality, as well as of school entry age, cannot be estimated with country-fixed effects because these variables do not change over time.
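The estimation approach described above, weighted least squares with standard errors clustered by country, can be sketched in a few lines. This is an illustrative implementation under stated assumptions: the function name `wls_cluster` is hypothetical, the weights are taken literally as the inverse standard error from the text, and the variance estimator is the plain cluster sandwich without a small-sample correction, which may differ from what the author's software applied.

```python
import numpy as np


def wls_cluster(y, X, se, clusters):
    """WLS with weights 1/se and cluster-robust (sandwich) standard errors.

    y: outcome (here: the integration measure), shape (n,)
    X: regressor matrix including a constant column, shape (n, k)
    se: standard error of each observation's decomposition estimate, shape (n,)
    clusters: cluster label (here: country) for each observation, shape (n,)
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    w = 1.0 / np.asarray(se, dtype=float)  # assumed weight: inverse standard error
    clusters = np.asarray(clusters)

    bread = np.linalg.inv(X.T @ (w[:, None] * X))  # (X'WX)^{-1}
    beta = bread @ (X.T @ (w * y))                  # WLS coefficients
    resid = y - X @ beta

    # Sum the outer products of the weighted score within each cluster.
    meat = np.zeros((X.shape[1], X.shape[1]))
    for g in np.unique(clusters):
        idx = clusters == g
        score = X[idx].T @ (w[idx] * resid[idx])
        meat += np.outer(score, score)

    cov = bread @ meat @ bread  # no small-sample correction in this sketch
    return beta, np.sqrt(np.diag(cov))
```

Country-fixed effects could be added in this sketch by appending one dummy column per country to `X`; time-invariant regressors would then be collinear with those dummies, which is why their effects cannot be estimated in that specification.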

Table 4 Results for all countries

Educational institutions Segregation of migrant students among schools and social segregation show no significant effects in the first four columns. All point estimates have negative signs. In mathematics, the coefficient of migrant segregation becomes positive and significant, and the coefficient for social segregation remains negative and becomes significant, when differencing out country effects. In science, neither estimate is significant. Taken at face value, the results suggest that migrant segregation is good and social segregation is bad for immigrant children, even though both variables are significantly positively correlated. Migrant segregation becomes insignificant and social segregation less significant if the two coefficients are estimated separately. In total, the results are inconsistent and do not allow any conclusion on whether segregation in schools is negatively or positively related to the integration of migrants. As mentioned above, TIMSS measures segregation among classes and PISA refers to schools. Both segregation measures were interacted with TIMSS and PISA dummies, but no systematic differences have been found.

Time in school is represented by the mean number of instructional hours per school year (divided by 100). The variable is included in quadratic form to allow for non-linear returns to schooling hours. The coefficients are statistically significant in all regressions and show the expected signs. According to the country-fixed-effects model, time in school and integration are positively correlated up to about 1,094 h in math and 913 h in science. Given a mean value of 932 instructional hours per school year, a reform towards full-time schooling might be beneficial for migrant students in many countries.
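The hours threshold reported for a quadratic specification is the turning point of the fitted parabola. The sketch below shows the arithmetic; the coefficient values are purely hypothetical placeholders chosen for round numbers and are not the estimates from Table 4, and the function names are assumptions for illustration.

```python
def turning_point_hours(b1, b2):
    """Hours per year at which b1*h + b2*h**2 peaks, with h = hours / 100."""
    if b2 >= 0:
        raise ValueError("an interior maximum requires a negative quadratic coefficient")
    return 100.0 * (-b1 / (2.0 * b2))


def marginal_effect(b1, b2, hours):
    """Slope of the quadratic at a given number of hours (per 100 extra hours)."""
    return b1 + 2.0 * b2 * (hours / 100.0)


# Hypothetical coefficients, NOT taken from the paper's tables.
b1, b2 = 20.0, -1.0
print(turning_point_hours(b1, b2))    # 1000.0
print(marginal_effect(b1, b2, 932))   # positive: the mean lies below the turning point
```

As long as the sample mean of instructional hours lies below the turning point, the marginal effect at the mean is positive, which is the basis for the statement that additional schooling time might benefit migrant students.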

Pre-primary school enrollment and integration are positively correlated; however, the effects are not always significant. The estimates of school entry-age are consistent with these results. Education systems where children start schooling at the age of five, compared to six, are associated with higher levels of integration. The magnitude of the coefficients is about ten test score points, capturing approximately 30% of the standard deviation of the dependent variable. Overall, the results suggest that early education, either in pre-primary or primary schools, is important for the educational assimilation of students with foreign background.

Mixed and mostly insignificant results are obtained for the pupil–teacher ratio and external student assessment. Furthermore, advancement activities for weak students show some significant positive effects, while enrichment courses show some significant negative coefficients. The availability of enrichment courses for gifted students was expected to be detrimental for integration if migrants are less likely to be accepted in such courses (footnote 5).

Income situation Both sets of regressions show that the income situation of the country is significantly correlated with the level of integration. This may be due to a resource effect, but also to the selection of economic migrants. The results indicate that high-income countries show a lower level of integration. Furthermore, a higher level of income inequality reduces integration of foreign students. This result was expected, since migrants often belong to the poorer part of society and unequally distributed resources should affect them negatively. Interestingly, the interactions of GDP and Gini show that foreign students are better integrated in high-income countries with a higher level of income inequality. This result may represent the selectivity of economic migrants, which is exactly what economic theory about migration predicts. Recall that economic migrants with high abilities are more likely to migrate to countries where they earn more. Thus, migrants are positively self-selected in high-income countries with a higher level of income inequality.

Control variables The national average test score is significantly positively related to the integration of migrants in three out of six estimations, and the sign is positive in all six. Interpreting the average test score as a quality measure of the education system, it appears that high-quality systems promote migrant students to a greater extent. Moreover, second-generation immigrants do better than immigrants, both in math and in science. The gaps between immigrants of the first and second generation are of similar magnitude across regressions, about six test score points in mathematics and nine in science. Educational assimilation seems to be easier in math than in science. One explanation for this result is that math scores should be more related to IQ and science scores to factual knowledge; thus, science scores should be more influenced by language skills and social environment.

4.2 Baseline specification for the OECD sample

The identifying assumption of the model is more likely satisfied if one compares similar countries only. This is particularly important for the pooled and country-group effects specifications. Table 5 gives the results for the OECD countries.

Table 5 Results for OECD countries

Both segregation indices, again, show no consistent results. The estimates of hours per school year indicate the same pattern as above; however, the effects are not statistically significant in two out of six regressions. According to the country-fixed-effects specification, schooling time is positively associated with integration in math up to 1,244 instructional hours. The average within the OECD is about 954 h per year, implying considerable potential in a number of countries if the effect were causal. The magnitude of the effect is sizeable as well: starting from the mean, an increase in hours by 100 (approximately one standard deviation within the OECD) is associated with a higher level of integration of 8.5 test score points in math.

The coefficients of pre-primary education are statistically and economically more significant within the OECD and show an important magnitude in the fixed-effects math regression. According to this specification, a total enrollment rate that is higher by 25 percentage points (one standard deviation within the OECD) is related to a higher level of integration of about 34 math points, which is about one standard deviation of the dependent variable. Moreover, the coefficients on school entry at age five show the expected signs, but are not always statistically significant. The coefficients on age seven are mixed and not consistent.

While most of the other variables show mixed and insignificant results, external student assessment seems to be important within the OECD. Integration in science is higher if schools are not responsible for assessment policies. In the country-fixed-effects specification, a large coefficient is obtained that is statistically significant at the 1% level. If this coefficient were causal, an increase of 0.2 (one standard deviation in the OECD) in the fraction of schools that do not have the primary responsibility for student assessment policies would be associated with an increase in integration of about 12 science points.

The income situation gives the same picture as above. This result can be interpreted as a selection mechanism. Immigrants and second-generation immigrants in high-income countries with a higher degree of income inequality are better off, which is consistent with the predictions of economic theory on the selectivity of economic migrants.

Overall, when estimating the model with OECD countries, most effects of institutions that have been obtained for all countries are corroborated, and additional insights for the OECD countries are gained.

4.3 Robustness checks

Four types of robustness checks are carried out. The model is estimated including controls for the composition of the migrant population with respect to the immigrants’ home countries (see Table 6). Furthermore, the sensitivity of the results to the inclusion of language proficiency in the achievement regressions, to some sample restrictions in the micro-data, and to the standardization of the unexplained differentials is investigated. These estimations are not shown in the paper, but the results can be obtained from the author.

Table 6 Results with migration regions

Controls for migration regions The first robustness check is based on a model that controls for the regions where the foreign population of a given country comes from. This strategy should remove the remaining problem that unobserved country attributes, like the composition of the immigrant population, change over time and may be correlated with changes in institutional characteristics of the education system. The information on migration regions is only available for a small number of country-years. As mentioned above, the other observations are not dropped but a missing dummy is included in the regressions. The results are given in Table 6 and are very similar to those of Table 4, where migration regions are not included.

National language in the achievement regressions As discussed in Section 2, the unexplained achievement gap is substantially larger when national language proficiency is not included in the achievement regressions of the decompositions. Whether the true level of integration is obtained by filtering out differences in language skills or not is a matter of opinion. Therefore, I examined whether the results on institutional effects are sensitive to this decision. The results on schooling time are robust to this change. Pre-primary education shows the same coefficients; however, the coefficient in the math country-fixed-effects regression is not statistically significant. Furthermore, school entry at age five and some coefficients on remedial and enrichment courses are no longer significant.

Grade-age restrictions As mentioned in the Appendix, the grade and age variables in the achievement regressions might be endogenous if selection into grades differs between migrants and native students. To test for this bias, I restricted the sample to those grades with more than 30% of PISA students and furthermore dropped the oldest and youngest 5% of TIMSS students. Some coefficients lose statistical significance; however, all significant effects show the same signs and magnitudes as in the baseline model.

Robustness to the standardization Since the unexplained differential is calculated by using the average endowments of foreign students in each country, a standardization was needed to reach comparability. I used the mean characteristics of foreign students in the whole sample. To check whether the results are influenced by this choice, especially the results for the OECD sample, I replaced the standardization vector by the average characteristics of migrants within the OECD. The results are very robust to this sensitivity check.

5 Conclusions

This essay is aimed at quantifying the disadvantage of students with migration background in secondary schools. Blinder–Oaxaca decompositions show that, on average, the test score gaps between students with foreign background and natives cannot be entirely explained by differences in the students’ productivity characteristics. In most countries, a part of the test score gap remains unexplained. As shown in Figs. 1 and 2, educational gaps between native and foreign students and the shares that remain unexplained vary substantially across country groups and OECD countries.

The English-speaking countries or traditional countries of immigration are found in the middle and lower parts of the gap distribution, and here, the unexplained differential is almost always positive, meaning that immigrant children get higher returns to their characteristics than natives. The achievement gaps in Middle and Northern Europe are largest and average out to −33 math and −35 science points, with 15% and 26% remaining unexplained.

The proficiency of the national language turned out to have significant effects on integration. The unexplained test score gap is substantially larger if national language is not included in the achievement regressions. This finding is not new and strongly argues for public policies that encourage immigrants to learn the national language.

In the second part, integration of students with migration background is related to institutional characteristics of the education system, the income situation of the country, and some control variables. Interestingly, the estimated coefficients on the income situation show exactly the results predicted by economic theory of migration: high-income countries with a high level of income inequality should attract immigrants with higher abilities. In fact, educational integration is higher in high-income countries with higher levels of income inequality.

Educational institutions were found to be correlated with the integration of migrants. Specifically, time in school and early education are positively related to integration in both math and science. The result on pre-primary education is in line with the studies on Head Start and indicates that promotion of students with migration background should start as early as possible.

Central examinations, measured by the fraction of schools that do not have the primary responsibility for student assessment policies, are associated with higher science achievement of migrant students in the OECD. This is consistent with other studies and may be due to the resulting restriction of teachers’ latitude in grading and to information-induced pressure on students, teachers, and schools.

Overall, the results of the study suggest that the design of the education system is important for children with migration background. Recent demographic trends in many industrialized countries indicate that the integration of migrants will be a major challenge in the future. To meet this challenge, policy makers should regard educational integration as an important precondition and education policy as a main instrument to further the economic assimilation of immigrants in the host societies.