Introduction

In recent years, growing attention has been focused on the integration of immigrants into host societies around the world. Some areas are characterized by processes leading to segregation (Poulsen et al. 2002), whereas other areas are typified by processes leading to assimilation (Peach 2009), segmented assimilation (Portes and Zhou 1993) or pluralism (Boal 1999). For all these scenarios, researchers have sought to identify such processes and to explain how they lead to variations in settlement patterns. Many of these researchers have thus contributed to the vast body of work on the residential location of immigrants. A large portion of this work focuses on the explanatory factors that determine where immigrant populations are located. These explanatory factors, which are also referred to as determinants, are divided into four categories: labour market conditions (Borjas 1999, 2001), the specificities of the local welfare system (Borjas 1999; Dodson 2001), the presence of an existing immigrant community (Zavodny 1999; Zorlu and Mulder 2008; Åslund 2005) and the size and diversity of the existing immigrant population (Belanger and Rogers 1992; Rogers and Henning 1999). To illustrate the relative importance of these factors, most authors examine data on either an inter-metropolitan or regional scale. Among the research that has examined such inter-metropolitan and regional differences are the studies by several authors in the USA (Alba et al. 1999; Bartel 1989; Edin et al. 2003; Funkhouser 2000; Newbold 1999; Zavodny 1997), and in Northern and Western Europe (Arapoglou 2012; Musterd and Deurloo 2002; Zorlu and Mulder 2008). In addition, within this body of work, much effort has been devoted to improving our understanding of the processes at play in countries with long histories of immigration, whereas few authors have looked at immigration growth in countries where immigration is a more recent phenomenon. 
Moreover, few authors have examined intra-metropolitan factors that determine the location of new immigrants within the city, and when these explanatory factors are examined, the variables are somewhat limited in their scope (Fahey and Fanning 2010). In consequence, little is known about the factors that determine the location of immigrants within urban areas in the southern EU countries where strong immigration growth is a fairly recent phenomenon.

Over the last decade, Southern Europe has been a host destination for many immigrants arriving mainly from Africa, Latin America and Eastern Europe (Kandylis et al. 2012; Rebelo 2012). One of the most receptive of these host countries has been Spain. In 1998, 1.60 % of the Spanish population was foreign-born; this proportion was 3.33 % in 2001 and had increased to 11.41 % by 2008 (INE 2008). After 2008, the proportion of immigrants increased more slowly, to 12.08 % in 2009 and 12.22 % in 2010, the last year of growth (12.18 % in 2011). Furthermore, 2008 is an interesting cutoff for studying immigration in Spain because it was the last year of growth in Spain’s GDP before the current economic crisis: 1.44 trillion US dollars in 2007, 1.59 trillion US dollars in 2008 and 1.45 trillion US dollars in 2009 (Source: The World Bank). Moreover, in the housing market, 2008 represents a peak in housing prices after a decade of continuous growth in building activity. Recent studies have found that immigrant population growth in the 2001–2008 period may explain Spain’s larger housing market boom (Gonzalez and Ortega 2013). The growth of the immigrant population was thus accompanied by a strong economy, which makes this period especially interesting for explaining why the growth was unevenly distributed within an urban area. Consequently, in the section “Data and Methods”, we present an econometric model to explain immigrant population growth over the 2001–2008 period based on the initial conditions in 2001.

From 2008 to 2013, the economic situation has changed dramatically: the recession has reduced GDP by 7 % over the last 5 years; unemployment is stuck at 26 %; the emigration of Spanish people is picking up; and the proportion of foreign-born residents has decreased to 11.76 %. This change in the economic situation has had important consequences for non-EU immigrants and has been analyzed by several authors. The effects of the economic crisis on labour market conditions for the immigrant population have been studied by Castles (2011). Domingo et al. (2010) have compared the situation of two countries of Southern Europe (i.e. Spain and Italy). Galeano et al. (2014) have described the formation and evolution of ethnic enclaves in Catalonia before and during the economic crisis. Taken together, these results indicate a significant decline in the living conditions of immigrants. Several research projects are currently working with the new conditions generated by the economic crisis.

Of the total immigrant population in 2008, 40 % came from non-EU countries. Some EU member states require Bulgarians and Romanians to acquire a permit in order to work, whereas Spain maintains work restrictions only for Romanian citizens; Romanian citizens are therefore included in the group of non-EU immigrants even though they have been part of the EU since 2007.

From a geographic point of view, the distribution of immigrants across Spain has been rather uneven. In 1998, 75.85 % of all non-EU immigrants lived in Barcelona and Madrid, the two largest cities in the country. Ten years later, despite a considerable level of migration to other parts of Spain, the concentration of non-EU immigrants in these two cities remained high (48.68 %). In 2001, Spain had 41,116,842 inhabitants, 1,371,657 of whom were immigrants, compared with 46,157,822 and 5,268,762 respectively in 2008. There were 2,934,591 inhabitants in the Barcelona metropolitan area in 2001, of whom 148,170 were immigrants. In 2008, the Barcelona metropolitan area contained 3,182,739 inhabitants, 482,604 of whom were immigrants. The proportion of immigrants thus increased from 5.05 to 15.16 %. Over the same period, Madrid showed a similar increase in its immigrant population, which rose from 5.68 to 16.03 %.
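The percentages quoted for the Barcelona metropolitan area follow directly from the population counts given above. As a minimal check, the arithmetic can be sketched in Python; the dictionary layout and the function name `immigrant_share` are illustrative choices, not part of the original study.

```python
# Counts quoted in the text for the Barcelona metropolitan area (BMA).
bma = {
    2001: {"inhabitants": 2_934_591, "immigrants": 148_170},
    2008: {"inhabitants": 3_182_739, "immigrants": 482_604},
}


def immigrant_share(inhabitants: int, immigrants: int) -> float:
    """Return immigrants as a percentage of all inhabitants."""
    return 100.0 * immigrants / inhabitants


# Share of immigrants per year, in percent.
shares = {year: immigrant_share(**counts) for year, counts in bma.items()}
for year, share in shares.items():
    print(f"{year}: {share:.2f} %")
```

Rounded to two decimals, the two shares reproduce the 5.05 % and 15.16 % reported in the text.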

Furthermore, in a previous article specifically examining the case of Barcelona, we found evidence of intra-metropolitan segregation among non-EU immigrants (Martori and Apparicio 2011). For all of these reasons, the example of the Barcelona metropolitan area (hereafter referred to as the BMA) represents a unique case that can be used to illustrate the factors that explain, both globally and locally, the changes occurring in the percentage of immigrants over time. We analyze these factors in a context of strong immigration growth over a short period of time (2001–2008), and we attempt to understand what has led to larger increases in the percentage of immigrants in some parts of the BMA as opposed to others.

From a methodological point of view, empirical analyses regarding immigrant location are often based on individual-level data. Among the authors using spatially referenced data, only one researcher uses intra-metropolitan spatial data to analyze the dynamics of immigrant settlement patterns (Bråmå 2008). To our knowledge, no study has as yet introduced spatial effects to explain the growth of the immigrant population within urban areas. Modeling with a spatial dataset requires that researchers control for spatial dependence (Anselin 2002). Spatial econometrics has recently been appraised in a thematic issue of the Journal of Regional Science. In the introduction to this special issue, Partridge et al. (2012) raise doubts about spatial lag models. Although some of the criticisms raised are valid, these issues can be overcome by improving applied spatial econometrics work. Moreover, spatial econometrics methods (spatial lag and spatial error models) are used today in many applied studies in various research fields (Castro et al. 2013; Lazrak et al. 2013; Weidmann and Salehyan 2013), as they are used here.

The remainder of this paper is structured as follows. In the second section, we discuss the theoretical framework of the key explanatory factors determining immigrant location. The third section gives a brief overview of the BMA and the immigration processes under study, and describes the empirical model and the statistical methodology that is applied to validate it. In the fourth section, we present the significance of the different explanatory factors.

Theoretical Framework

In order to appropriately allocate resources and services to the communities or municipalities where they are most needed, decision makers and policy makers managing the integration of immigrants must have a detailed knowledge not only of the current numbers of immigrants but also of the factors that will determine changes in their residential location over time. In researchers’ attempts to understand the spatial distribution of different population groups in urban areas, the residential segregation of ethnic groups has been a key topic (e.g. Gilliland and Olson 2010; Johnston et al. 2001; Massey and Denton 1988; van Kempen and Ozuekren 1998). Segregation is the result of processes that can be external to the ethnic group (i.e. policies or the host society’s attitudes towards immigrants, which may encourage or discourage settlement in certain neighbourhoods), or internal to the ethnic group (i.e. the close social bonds of certain ethnic groups, which favour a clustered settlement pattern) (Aalbers and Deurloo 2003; Arbaci 2007). Additionally, Schelling (1969) presents another explanation for the segregation which results from the native population moving away from the immigrant population. The recent empirical literature on the spatial distribution of immigrant populations in urban areas is rich in studies that have assessed the explanatory factors that determine these populations’ residential location (Fougère et al. 2013; Gabriel and Painter 2012; Lichter et al. 2010; Wadsworth 2010).

This literature is comprised of studies on four main components. The first component consists of factors relating to labour market conditions. The second component includes factors relating to the presence of established immigrants in the census tract, which may be less important in the case of the BMA, where the majority of immigrants have arrived in the last decade. The third component is comprised of factors pertaining to the housing market, and the last component groups together a number of geographic factors. In the sections below, we explain each component in detail.

Labour Market Conditions

Labour market conditions such as average wages and unemployment affect where immigrants settle (Borjas 1999). Another theory, posited by Borjas (1999) and Dodson (2001), is the welfare system’s role at the local level: these authors believe that the local welfare system may be an attractive factor for immigrants, along with unemployment insurance and health insurance (i.e. the notion of “welfare magnets”). But other authors disagree. Zavodny (1997), for example, shows that a more advanced welfare system is not correlated with higher levels of immigration. At the opposite end of the labour spectrum are the immigrants employed in the knowledge economy. Bartel (1989) found a positive correlation between educational level and immigrants’ tendency to move to areas with lower ethnic concentrations. In short, there is an ongoing debate on the importance of labour market conditions in determining the location of immigrant populations.

In Spain, there is a high level of unemployment (more than 10 % in 2001 and 13.79 % in 2008, before the economic crisis, according to the Economically Active Population Survey of the Spanish Statistical Office), and it is generally accepted that this factor is associated with low-skilled jobs and low educational levels (Canal-Dominguez and Rodriguez-Gutierrez 2008; Simon et al. 2008). In this sense, Arbaci (2007, 2008) and Kandylis et al. (2012) point out that a combination of these three factors in southern EU cities may give rise to a scattered spatial pattern of immigrant population distribution. Consequently, immigrants from outside Europe may be likely to settle in areas with high levels of unemployment. In addition, the relatively strong presence of blue-collar workers and the relatively high proportion of residents without a diploma also characterize the labour market conditions of certain areas that have been associated with the strong presence of immigrants (Wadsworth 2010).

Presence and Diversity of the Immigrant Population

The second determinant of immigrants’ residential location is the presence of established immigrants (Zavodny 1999; Åslund 2005; Zorlu and Mulder 2008). In other words, the number of immigrants in a given area has an attractive effect in itself (Belanger and Rogers 1992; Rogers and Henning 1999), so that the higher the number of existing immigrants, the greater is the area’s power of attraction. These factors are also present if we look at the latest waves of immigration in the USA, where it has been found that immigrants tend to move to regions where there are already large immigrant populations (Belanger and Rogers 1992; Rogers and Henning 1999). This type of approach suggests that metropolitan areas in the USA may tend to receive immigrants in the future, and, for all intents and purposes, this theory also suggests which areas may have large immigrant populations in the near future. Since these processes involve significant changes in public services, some European countries have developed policies to direct immigrants to certain regions (see for example, Dutch Refugee Council 1999 or Edin et al. 2003).

The tendency for immigrants to settle in neighbourhoods where other immigrants are present appears to be more important after their initial choice of urban area, as demonstrated by Åslund (2005) in the case of Sweden. At the other end of the spectrum, Funkhouser (2000) also agrees with the secondary migration explanation, and, more importantly, he posits that after many years in the host country, immigrants tend to relocate to areas with low ethnic concentrations. Mendoza (2013) and Yue et al. (2013) have shown the importance of social networks in determining immigrants’ destinations. The explanatory value of this theory is however somewhat limited in our case. As we noted in the introduction, the nature of the migration to the BMA differs from that found in other urban areas. In 1998, with only 1.7 % of the population being foreign-born, the BMA did not have a large existing immigrant population. This population was also highly diverse (i.e. high entropy) (Martori and Apparicio 2011).

Housing Market

The third factor that determines the location of the immigrant population involves dimensions relating to the housing market. By comparing different regions in the Netherlands, Zorlu and Mulder (2008) show how recent immigrants are located in areas where deprivation indicators, such as substandard housing, are high. In the case of Sweden, Bråmå (2008) and Åslund (2005) also introduce a number of housing market conditions. In short, for both of these countries, it is shown that, in general, new immigrants live in very different housing conditions from those of the native population. But these cases differ considerably from the situation in Spain. In the Netherlands and Sweden, rental and cooperative housing represents more than half of the market. In Spain, on the other hand, 77.6 % of the immigrant population lives in rental dwellings, compared with only 15 % of the native population (Colectivo IOE 2005). In the central municipalities of the BMA, 25 % of the dwellings are rented, compared with 11.3 % for Spain as a whole.

This system is “characterized by an imbalance in housing tenure favouring owner occupation and residual social housing” (Fullaondo and Garcia 2007, p. 4). Painter and Yu (2010) introduce other variables, such as housing prices and rent, in order to examine immigrants’ success in the housing markets of a sample of 60 mid-size US urban areas. The results indicate, along the same lines, that immigrants are less successful than their American counterparts in becoming residential property owners. In short, the immigrant population in Spain is concentrated in areas where the majority of the dwellings are small and in poor condition (Ajenjo et al. 2008).

Geographic Factors

Population density is a geographic variable used to identify areas with a high concentration of immigrants (Martori and Hoberg 2008; Qadeer 2010). In Spain, the intense migration growth of the past decade has generated significant territorial dynamics in immigrants’ settlement patterns. With the wave of immigration that began in 2000, the old town cores of cities acted as “gateways” for immigration flows. These areas are characterized by an informal housing market, which is for the most part comprised of substandard dwellings. Thus, in contrast to the findings of Malheiros and Vala (2004) for the late 1990s, the territorial pattern of immigration in Spanish cities has been defined until recently by a high concentration of immigrants in the cores of metropolises, and not in the suburbs (Arbaci and Tapada-Berteli 2012). We have seen a shift in immigrant populations towards the peripheries of urban areas, which has resulted in a new ethnic geography. The primary force behind this shift was the saturation of the traditional “gateway” areas, which could no longer accept more populations. This has led to a process of geographic expansion whereby new immigrants have settled in working class areas in the urban periphery (Martori and Apparicio 2011).

Data and Methods

An Overview of the BMA

The BMA is located on the central coast of Catalonia, Spain. In administrative terms, it is part of the Metropolitan Region (MR), and it accounts for 65 % of the MR’s residents and 89 % of its immigrant population. The BMA is comprised of 36 municipalities occupying 627 km² and contained 3.18 million inhabitants in 2008. The BMA is divided into 2493 census tracts (CTs), each of which includes some 500 to 2000 residents. The data on residents’ nationalities were obtained at the CT level from the Spanish National Statistics Office. In terms of its urban form, the BMA contains three different zones: (1) the central municipality of Barcelona (1,615,908 inhabitants); (2) the inner ring forming a continuous urban space with the main municipalities of L’Hospitalet de Llobregat (253,782 inhabitants), Badalona (215,329) and Santa Coloma de Gramenet (117,336); and (3) the last zone, which is a discontinuous area far from the central business district (CBD) that contains both residential areas and industrial parks. Here, we find the municipalities of Sant Cugat del Vallés (76,274 inhabitants) and Castelldefels (60,572) (Fig. 1).

Fig. 1 Barcelona metropolitan area: municipal division, with district division for Barcelona

An Overview of Immigrant Population Growth in the BMA

The percentage of the non-EU immigrant population rose sharply from 4.17 to 11.91 % between 2001 and 2008. In 2008, the BMA had 3,182,739 inhabitants, of whom 482,604 were immigrants, with 379,198 of them coming from non-EU countries. In 2008, the seven largest immigrant groups were Ecuadorians (10.54 %), Moroccans (9.49 %), Bolivians (7.37 %), Italians (5.62 %), Peruvians (5.15 %), Pakistanis (5.04 %) and Chinese (5.02 %). The percentages of non-EU immigrants in 2001 and 2008 and the relative variation of non-EU immigrants from 2001 to 2008 are mapped at the CT level in Fig. 2a–c. As can be seen, the CTs with the highest proportions of non-EU immigrant populations in 2008 are mainly located in the districts of Barcelona and in the surrounding municipalities of L’Hospitalet de Llobregat, Santa Coloma de Gramenet, Badalona, Gavà and Castelldefels.

Fig. 2 Presence and growth of non-EU immigrant population

Globally speaking, a comparison of Fig. 2a and c shows that CTs with a high proportion of non-EU immigrants in 2001 also have high values for the relative variation of non-EU immigrants. However, we are interested in whether the diverse composition of immigrant populations in 2001 is relevant to explaining the growth rate over the 2001–2008 period. Unfortunately, panel data is not available, and we were not able to test processes such as the replacement in the same area of one immigrant group by another.

The Empirical Model

This paper seeks to shed light on three main questions. Which urban factors affect immigrant settlement patterns? Which factors are the most important? Are there factors associated with the resident population, the labour market, housing characteristics or geographic factors? Due to the nature of the spatial data, we also need to take spatial autocorrelation into account. Consequently, we address these first two questions while capturing the spatial effects with spatial lag and spatial error models. This research therefore uses an econometric modeling strategy to explain the growth of the non-EU immigrant population in the BMA from 2001 to 2008. The unit of analysis in this study is the census tract. Two sources of data, both at the census tract level, were obtained from the Spanish National Statistics Office: the Censo de Población (2001), and the immigration data taken from the continuous registry of population (2001–2008) known as the Padron Municipal. The Padron Municipal is a municipal (non-state) registry that includes both the regular and irregular migrant populations. It contains data on the population by sex, age, place of birth, place of residence and nationality. It is available from 1996 onwards, and is easily accessible for research purposes. González (2005) compared counts of immigrants in the Censo de Población and the Padron Municipal; he concluded that the Padron Municipal overestimates the immigrant population in comparison with the Censo de Población. The variable that we wish to study is the growth rate of the population of non-EU immigrants from 2001 to 2008, the period with the highest levels of immigrant population growth.

In line with the theoretical framework, the predictors that we selected encompass four dimensions: labour market conditions (unemployment, low education, household income (Madariaga et al. 2012) and blue-collar workers), immigrant diversity expressed as an entropy measure (Theil and Finizza 1971), as used by Wright et al. (2005) and White et al. (2005), the housing market (small dwellings and condition of dwelling), and geographic factors (population density and distance to the CBD). The entropy index, which varies from 0 (completely homogenous census tract, i.e. inhabited by only one immigrant group) to 1 (maximally diversified census tract, with all immigrant groups equal in size), is calculated as follows:

$$ H2=-\sum_{i=1}^{n}\left[\left(P_{ij}/P_{j}\right)\ln\left(P_{ij}/P_{j}\right)\right]/\ln(n) $$

where n is the number of immigrant groups, P_{ij} is the population of the ith group in census tract j and P_{j} is the sum of the population of groups 1 to n in census tract j.
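To make the index concrete, the following is a minimal Python sketch of the entropy calculation for a single census tract. The function name `entropy_index` and the sample counts are hypothetical. The treatment of empty groups (a zero count contributes nothing, by the usual convention 0·ln 0 = 0) is an assumption, since the text does not state how empty groups are handled.

```python
import math


def entropy_index(group_counts):
    """Normalised entropy H2 for one census tract.

    group_counts holds the populations P_ij of the n immigrant groups
    in tract j; their sum is P_j.
    """
    n = len(group_counts)
    total = sum(group_counts)  # P_j
    if n < 2 or total <= 0:
        raise ValueError("need at least two groups and a positive total")
    h = 0.0
    for count in group_counts:
        if count > 0:  # assumption: 0 * ln(0) is treated as 0
            p = count / total
            h -= p * math.log(p)
    return h / math.log(n)


# Hypothetical tracts with four immigrant groups each.
print(entropy_index([100, 0, 0, 0]))    # one group only: homogeneous tract
print(entropy_index([25, 25, 25, 25]))  # equal groups: maximally diverse tract
print(entropy_index([70, 20, 5, 5]))    # intermediate case
```

The first two calls correspond to the two bounds described in the text, 0 for a tract inhabited by a single group and 1 for a tract in which all groups are equal in size.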

In the case of the presence of immigrants, according to the findings of Martori and Apparicio (2011), we expect zones with high entropy values to be areas with a sizeable growth of immigrants as well. Table 1 lists the dependent and independent variables. Before proceeding with the estimation of the econometric models, all variables were screened for statistical assumption violations, as well as for missing values and outliers. Almost all continuous variables violated the assumption of normality and, moreover, were highly skewed. We therefore transformed them with a natural logarithm.

Table 1 The selected predictors

According to Anselin (2002), there are three main reasons for introducing spatial effects: spatial interdependence, asymmetry of the spatial reactions, and the relevance of factors located in “other spaces.” Classic econometric models often refer to these effects as interacting agents, and from another angle, sociologists refer to these effects as social interactions. The main approach is referred to as the spillover model, in which an agent i chooses the level of a decision variable y_{i}, but the values of y chosen by other agents affect its objective function as well. Similarly, the location decision of a given immigrant household is determined not only by the presence of immigrants in the neighbourhood but also by the levels of immigration in surrounding areas.

With respect to this approach, various neighbourhood characteristics are related to the percentage of immigrants. These spatial effects on immigrant location choices provide a theoretical basis for us to use the spatial lag and spatial error models.

From a practical standpoint, there is another reason for introducing spatial effects. If, for example, the scale and location of the process under study do not correspond to the available data, the mismatch will tend to result in the modeling of a structure that shows a systematic spatial pattern. In consequence, the off-diagonal elements of the error covariance matrix will not equal zero. In this situation, two broad classes of spatial effects may be distinguished, which are referred to as spatial dependence and spatial heterogeneity.

In summary, we use a strategy for modeling the growth of the immigrant population in the BMA that is based on the predictors discussed in the theoretical framework. Since this general model has spatial effects, we proceed with the computation of two spatial econometric models. Consequently, we use spatial lag and spatial error models to control for spatial dependence.

According to the theoretical framework and the available data at the census tract level, we estimate the model as follows:

$$ \begin{aligned} \log(\mathrm{G\_IMG})_{i,2001-2008} ={}& \beta_1+\beta_2\log(\mathrm{UNEMPL})_{i,2001}+\beta_3\log(\mathrm{BLUEC})_{i,2001}\\ &+\beta_4\log(\mathrm{LOWEDUC})_{i,2001}+\beta_5\log(\mathrm{INCOME})_{i,2001}\\ &+\beta_6\log(\mathrm{IMG\_ENT})_{i,2001}+\beta_7\log(\mathrm{DW75M})_{i,2001}+\beta_8(\mathrm{DW\_MR})_{i,2001}\\ &+\beta_9\log(\mathrm{DENS})_{i,2001}+\beta_{10}\log(\mathrm{DISTCBD})_{i,2001}+\varepsilon_{i,2001} \end{aligned} $$

where the dependent variable is the growth rate of non-EU immigrants in the 2001–2008 period:

$$ \log \left(\frac{y_{i,2008}-{y}_{i,2001}}{y_{i,2001}}\right) $$

with y_{i,2008} and y_{i,2001} being respectively the numbers of non-EU immigrants in census tract i in 2008 and 2001. Note that the “dwellings needing major repairs” variable (DW_MR) is not in logarithmic form because there are several census tracts with zero values for this variable. In short, the model explains immigrant population growth over the period from 2001 to 2008 based on the initial conditions in 2001. On the one hand, cross-sectional data of this kind typically present heteroscedasticity, and it is difficult to determine the appropriate transformation; in this case, a logarithmic transformation is useful. This approach is common in econometrics, where empirical models are often specified with the variables expressed in logarithms. On the other hand, in the spatial models that were estimated, the interpretation of the parameters becomes richer and more complex than with linear regression parameters; we therefore interpret them with the approach proposed by LeSage and Pace (2009), which is based on the S_{r}(W) matrices.
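As an illustration of how the dependent variable is built, the following Python sketch computes the logarithm of the relative growth for one census tract. The function name `log_growth_rate` and the sample counts are hypothetical. Raising an error for tracts with no immigrants in 2001 or with non-positive growth is an assumption of this sketch: the logarithm is undefined in those cases, and the text does not say how such tracts were treated.

```python
import math


def log_growth_rate(y_2001: float, y_2008: float) -> float:
    """Return log((y_2008 - y_2001) / y_2001) for one census tract."""
    if y_2001 <= 0:
        raise ValueError("the growth rate needs a positive 2001 count")
    growth = (y_2008 - y_2001) / y_2001
    if growth <= 0:
        # Assumption of this sketch: such tracts are flagged, not silently dropped.
        raise ValueError("the logarithm is undefined for non-positive growth")
    return math.log(growth)


# Hypothetical tract: 50 non-EU immigrants in 2001 and 200 in 2008,
# i.e. a relative growth of 3 (300 %).
example = log_growth_rate(50, 200)
print(round(example, 4))
```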

To introduce spatial effects, we estimated two models. The spatial lag model is the most frequently encountered specification in spatial econometrics:

$$ y=\rho Wy+X\beta +\varepsilon, $$

where y is an (N × 1) vector of observations on a dependent variable measured at each of N locations, X is an (N × k) matrix of exogenous variables, β is a (k × 1) vector of parameters, ε is an (N × 1) vector of independent and identically distributed disturbances and ρ is a scalar spatial lag parameter. In our case, this means that the growth of the immigrant population in each unit (i.e. census tract) is modeled so as to depend on the growth of the immigrant population in neighbouring units captured by the spatial lag vector Wy.
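Because the spatial lag Wy feeds back into y, a change in one predictor in one tract also affects its neighbours. The following Python sketch shows how the average direct, indirect and total effects in the sense of LeSage and Pace (2009) can be obtained from S_r(W) = (I − ρW)^{-1}β_r. The four-unit weight matrix, the values of ρ and β_r and the function names are hypothetical, and the fixed-point solver is a convenience of this sketch rather than the estimation method used in the paper.

```python
def solve_spatial_filter(W, rho, b, tol=1e-12, max_iter=10_000):
    """Solve (I - rho*W) x = b by the fixed-point iteration x <- rho*W*x + b.

    The iteration converges when W is row-standardised and |rho| < 1.
    """
    n = len(b)
    x = list(b)
    for _ in range(max_iter):
        x_new = [b[i] + rho * sum(W[i][j] * x[j] for j in range(n))
                 for i in range(n)]
        if max(abs(new - old) for new, old in zip(x_new, x)) < tol:
            return x_new
        x = x_new
    raise RuntimeError("fixed-point iteration did not converge")


def average_impacts(W, rho, beta_r):
    """Average direct, indirect and total effects of predictor r."""
    n = len(W)
    # Column j of S_r(W): response of every unit to a unit change of x_r in unit j.
    columns = [
        solve_spatial_filter(W, rho, [beta_r if i == j else 0.0 for i in range(n)])
        for j in range(n)
    ]
    direct = sum(columns[j][j] for j in range(n)) / n        # mean of the diagonal
    total = sum(sum(column) for column in columns) / n       # mean of the row sums
    return direct, total - direct, total


# Hypothetical example: four units on a line, row-standardised weights.
W = [
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0],
]
rho, beta_r = 0.4, 2.0
direct, indirect, total = average_impacts(W, rho, beta_r)
print(f"direct={direct:.4f} indirect={indirect:.4f} total={total:.4f}")
```

With a row-standardised W, the average total effect equals β_r/(1 − ρ), which gives a quick consistency check on any implementation.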

The spatial error model may be written out as follows:

$$ y=X\beta +u,\qquad u=\lambda Wu+\varepsilon, $$

where λ is a scalar spatial error parameter, and u is a spatially autocorrelated disturbance vector. In this model, the spatial influence comes only from the error terms; this means that the growth of the immigrant population in each unit is modeled so as to depend on the error terms in neighbouring units captured by the spatial error vector Wu.
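The error process can also be written in reduced form, which makes explicit how a shock in one tract reaches the others. The following is a sketch under two added assumptions that the text does not state: the matrix (I − λW) is invertible, and ε has mean zero and covariance σ²I, where σ² is a symbol introduced here.

$$ u={\left(I-\lambda W\right)}^{-1}\varepsilon ,\qquad y=X\beta +{\left(I-\lambda W\right)}^{-1}\varepsilon ,\qquad \mathrm{E}\left[u{u}^{\prime}\right]={\sigma}^2{\left[{\left(I-\lambda W\right)}^{\prime}\left(I-\lambda W\right)\right]}^{-1} $$

When λ is not zero, the off-diagonal elements of this covariance matrix are generally non-zero, which is the pattern of spatial dependence that the spatial error model is meant to capture.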

Specification searches in spatial econometrics are a topic that has been discussed in the urban and regional literature. Florax et al. (2003) present new specification search strategies and examples in different research fields, and Mur and Angulo (2009) give a recent and detailed discussion of different strategies for detecting the most appropriate form of spatial autocorrelation. The standard approach in most empirical work is to start with a non-spatial linear regression model (OLS) and then to determine (e.g. by using the Moran’s I test) whether or not the model needs to be extended with spatial effects. Afterwards, if this is confirmed, the introduction of spatial effects is required to determine what kind of model should be used.
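As an illustration of the first diagnostic step, the following Python sketch computes the Moran’s I statistic for a vector of values, for example OLS residuals, given a spatial weights matrix. The function name `morans_i`, the four-unit weight matrix and the sample values are hypothetical. The sketch returns only the statistic and does not perform the significance test.

```python
def morans_i(values, W):
    """Moran's I: (n / S0) * sum_ij w_ij z_i z_j / sum_i z_i^2,
    where z holds deviations from the mean and S0 is the sum of all weights."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in W)
    cross = sum(W[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(d * d for d in z)


# Hypothetical example: four units on a line, row-standardised weights.
W_line = [
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0],
]
print(morans_i([1.0, 2.0, 3.0, 4.0], W_line))    # smooth gradient: positive
print(morans_i([1.0, -1.0, 1.0, -1.0], W_line))  # alternating values: negative
```

Values that resemble their neighbours give a positive statistic, whereas an alternating pattern gives a negative one.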

The results of the Lagrange Multiplier tests (LM-Lag and LM-Error) and their robust versions (RLM-Lag and RLM-Error) may be used to decide what kind of spatial dependence is the most appropriate to control for the presence of spatial dependence in the OLS residuals. Following the decision rule suggested by Anselin and Florax (1995), if LM-Lag is more significant than LM-Error and RLM-Lag is significant but RLM-Error is not (or is less significant), then the appropriate model is the spatial lag model. Equally, if LM-Error is more significant than LM-Lag and RLM-Error is significant but RLM-Lag is not (or is less significant), then the appropriate model is the spatial error model. This classic approach is well known and has been widely used in econometric studies (Brasington 2005).
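The decision rule above can be sketched as a small Python function. Reading “more significant” as “smaller p-value”, the 5 % threshold, the fall-back to OLS when neither LM test is significant and the function name `choose_spatial_model` are assumptions of this sketch. The p-values in the example are hypothetical and are not the values reported in the paper.

```python
def choose_spatial_model(p_lm_lag, p_lm_error, p_rlm_lag, p_rlm_error, alpha=0.05):
    """Apply the LM decision rule to four p-values.

    Returns "ols", "lag", "error" or "undetermined".
    """
    # Assumption: if neither LM test is significant, the OLS model is kept.
    if p_lm_lag >= alpha and p_lm_error >= alpha:
        return "ols"
    lag_robust = p_rlm_lag < alpha
    error_robust = p_rlm_error < alpha
    # Lag model: LM-Lag more significant, RLM-Lag significant,
    # RLM-Error not significant (or less significant).
    if p_lm_lag < p_lm_error and lag_robust and (
            not error_robust or p_rlm_lag < p_rlm_error):
        return "lag"
    # Error model: the symmetric condition.
    if p_lm_error < p_lm_lag and error_robust and (
            not lag_robust or p_rlm_error < p_rlm_lag):
        return "error"
    return "undetermined"


# Hypothetical p-values.
print(choose_spatial_model(0.001, 0.010, 0.002, 0.400))  # lag
print(choose_spatial_model(0.010, 0.001, 0.400, 0.002))  # error
print(choose_spatial_model(0.300, 0.500, 0.200, 0.600))  # ols
```

The "undetermined" outcome covers cases in which the plain and robust tests point in different directions, which the rule as stated does not resolve.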

Results

The results are divided into three parts. First, we present empirical evidence of the presence of spatial autocorrelation in the OLS residuals. Next, we introduce the spatial effects into the model and base our decision on the LM test. To introduce these spatial effects, we use a row-standardized contiguity matrix of first-order rook weights.
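To show what a row-standardized first-order rook matrix looks like, the following Python sketch builds one for a regular grid. The regular grid is a simplifying assumption: the census tracts of the BMA are irregular polygons, for which contiguity has to be derived from shared boundaries. The function name `rook_weights` is hypothetical.

```python
def rook_weights(n_rows, n_cols):
    """Row-standardised first-order rook contiguity matrix for a regular grid.

    Two cells are rook neighbours when they share an edge; cells that
    touch only at a corner are not neighbours.
    """
    n = n_rows * n_cols
    W = [[0.0] * n for _ in range(n)]
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            candidates = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            valid = [(rr, cc) for rr, cc in candidates
                     if 0 <= rr < n_rows and 0 <= cc < n_cols]
            for rr, cc in valid:
                # Row standardisation: each neighbour gets an equal share of 1.
                W[i][rr * n_cols + cc] = 1.0 / len(valid)
    return W


W_grid = rook_weights(3, 3)
for row in W_grid:
    print(" ".join(f"{w:.2f}" for w in row))
```

After row standardization, each row sums to one, so the spatial lag of a variable is the average of its values in the neighbouring units.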

Table 2 shows the univariate statistics for the variables used in the models and, as shown in Table 2, based on Moran’s I, all variables are spatially autocorrelated. It should be noted that the “dwellings needing major repairs” variable has a higher dispersion than those associated with other factors discussed in the theoretical framework (coefficient of variation, CV = 1.04).

Table 2 Univariate statistics for the variables used in the models

As shown in Table 3, the initial analysis of the OLS results reveals the statistical significance of several variables (e.g. BLUEC, INCOME, IMG_ENT, DW_75M and DENS). We thus have significant variables for each of the factors detailed in the theoretical framework. To begin with, two of these variables are related to labour market conditions: the percentage of blue-collar workers is positively associated with non-EU immigrant population growth, while the annual household income is negatively associated. These results are consistent with the findings of Walks and Bourne (2006). In the case of the BMA, the negative and significant income coefficient results from access to a highly informal labour market associated with low incomes, as in the building and hotel industries (Canal-Dominguez and Rodriguez-Gutierrez 2008; Simon et al. 2008). The presence of an immigrant population is represented by the immigrant diversity variable (entropy index), with a high positive value for the coefficient and for significance. “Dwellings less than 75 m²” is the variable that captures the conditions of the housing market for the immigrant population. The most relevant geographic factor is population density, and this result is consistent with the findings of Grant and Sweetman (2004).

Table 3 OLS model

The OLS results thus provide empirical evidence of the importance of the four explanatory factors of immigrant population growth discussed in the theoretical framework. Since we are dealing with a known spatial phenomenon, it is important to introduce spatial effects into the specifications. Following the classic approach, we computed the Lagrange Multiplier (LM) tests for the spatially lagged dependent variable (LM-Lag) and for error dependence (LM-Error), together with their robust versions (RLM-Lag and RLM-Error). The results presented in Table 4 show that LM-Lag and LM-Error are both significant, which indicates the need to include a spatial component in the model. As a general rule, in this situation, the usual practice is to choose the spatial model with the most significant LM test or robust LM test. In our case, RLM-Lag is significant but RLM-Error is not, which indicates that the spatial lag model is preferable to the spatial error model. The results shown in this section were obtained using the R packages spdep (Bivand 2013) and sphet (Piras 2010).

Table 4 Spatial autocorrelation LM tests
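The specification search described above can be summarized as a simple decision rule. The Python sketch below is our own reading of that rule, not code from the original analysis: the function name, the 5% threshold and the p-values in the usage example are hypothetical, and ties between the robust tests are resolved in favour of the lag model as an arbitrary choice.

```python
def choose_spatial_model(p_lm_lag, p_lm_err, p_rlm_lag, p_rlm_err, alpha=0.05):
    """Pick 'ols', 'lag' or 'error' from the p-values of the LM tests.

    Assumed rule: use the plain LM tests first; if both are significant,
    fall back on the robust versions and keep the more significant one.
    """
    lag_significant = p_lm_lag < alpha
    err_significant = p_lm_err < alpha

    if not lag_significant and not err_significant:
        return "ols"    # no evidence of spatial dependence
    if lag_significant and not err_significant:
        return "lag"
    if err_significant and not lag_significant:
        return "error"
    # Both LM tests are significant: compare the robust versions.
    return "lag" if p_rlm_lag <= p_rlm_err else "error"


# Hypothetical p-values reproducing the pattern described in the text:
# both LM tests significant, RLM-Lag significant, RLM-Error not.
choice = choose_spatial_model(0.001, 0.003, 0.010, 0.400)
```

Under these illustrative values, the rule selects the spatial lag model, as in the text.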

Table 5 shows the results of two spatial models: the lag model and the error model. We estimated both models by maximum likelihood (ML). However, because non-normality of the error terms and heteroscedasticity may affect the results, we also estimated the lag and error models using the Generalized Method of Moments (GS2SLS) approach proposed by Arraiz et al. (2010). The two sets of results are qualitatively and quantitatively similar.

Table 5 Spatial models

The spatial coefficient (ρ in the lag model and λ in the error model) is always positive and significant. The significance of the spatial effects points to the presence of social interactions and thus provides evidence of a neighbourhood effect. This is in line with the literature: Dubin (1992) presents empirical evidence of spatial autocorrelation for several neighbourhood characteristics and, from a sociological point of view, Sampson et al. (2002) synthesize the cumulative results of a new neighbourhood effects literature.

Overall, according to the results of the lag model, the significant variables (i.e. P value less than 0.05) explaining the growth of the immigrant population are household income, immigrant diversity, small dwellings, population density and distance to the CBD. We therefore have at least one significant variable for each dimension of our theoretical framework. The significance of the immigrant diversity coefficient confirms that immigrants are attracted to neighbourhoods with a mix of non-EU immigrant groups: as expected, zones with high entropy values are also areas of strong immigrant population growth, which is in line with our previous work (Martori and Apparicio 2011). The negative sign for income suggests greater immigrant population growth in areas with lower income levels (Arbaci and Malheiros 2010; Walks and Bourne 2006). The positive and significant coefficient for distance to the CBD reveals a process of decentralization (Arbaci and Malheiros 2010); in other words, as the distance from the CBD increases, immigrant population growth also increases. It is worth noting that this decentralization is not synonymous with low population density, the purely geographic factor in our framework, because some areas outside the central city also have high density levels. Finally, the housing market is represented by the “dwellings less than 75 m2” variable, which is significant. This confirms previous studies showing that the immigrant population lives in smaller dwellings (Miret and Serra del Pozo 2013).

The spatial connectivity of the different urban zones in the BMA plays an important role in our theoretical framework for understanding the spatial growth of the immigrant population. Our spatial models impose a structure of dependence between census tracts, so the results presented in Table 5 contain information on the relationships between these tracts. As a consequence, the interpretation of the parameters is richer and more complex than in a linear regression. A change in an explanatory variable in a single census tract can affect the dependent variable in that tract (direct impact) as well as in all other census tracts in the BMA (indirect impact). LeSage and Pace (2009) provide computationally feasible means of calculating scalar summary measures of these two types of impact. We use these measures to make inferences about the sign and magnitude of the effects of the explanatory variables. Table 6 presents the direct, indirect and total impacts for each explanatory variable in the spatial lag model (estimated by GS2SLS).

Table 6 Impact measures
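To make the impact measures concrete, the following Python sketch computes them for a spatial lag model of the form y = ρWy + Xβ + ε. For a given explanatory variable with coefficient β, the matrix of partial derivatives is (I − ρW)⁻¹β; the direct impact is the average of its diagonal, the total impact is the average of its row sums, and the indirect impact is the difference between the two. The sketch approximates the inverse by the power series I + ρW + ρ²W² + ..., which assumes |ρ| < 1 and a row-standardized W. The weights matrix and the values of ρ and β are hypothetical and are not those of Table 6.

```python
def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def lag_multiplier(w, rho, n_terms=200):
    """Approximate (I - rho * W)^-1 by the truncated series sum_k rho^k W^k."""
    n = len(w)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total = [row[:] for row in identity]
    term = [row[:] for row in identity]
    for _ in range(n_terms):
        # term becomes rho^k W^k for k = 1, 2, ...
        term = [[rho * v for v in row] for row in matmul(term, w)]
        total = [[t + v for t, v in zip(t_row, v_row)]
                 for t_row, v_row in zip(total, term)]
    return total


def impacts(w, rho, beta):
    """Return (direct, indirect, total) average impacts for one variable."""
    n = len(w)
    s = lag_multiplier(w, rho)
    direct = beta * sum(s[i][i] for i in range(n)) / n
    total = beta * sum(sum(row) for row in s) / n
    return direct, total - direct, total


# Hypothetical row-standardized weights for four units on a line.
W = [
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0],
]

# Hypothetical spatial coefficient and regression coefficient.
direct, indirect, total = impacts(W, rho=0.5, beta=-2.0)
```

With a row-standardized W, the total impact reduces to β / (1 − ρ), so the share of the total attributable to spillovers grows as ρ approaches 1.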

As shown in Table 6, for the lag model, the greatest total effect is that of household income, with a negative sign; it is followed by a housing market variable (dwellings less than 75 m2) and then by immigrant diversity. The indirect effects show that a change in a given census tract has an effect on the rest of the units similar to its effect on the tract itself. Indeed, the indirect effect of each variable is somewhat greater than the direct effect.

Conclusion

In this paper, we have analyzed a number of explanatory factors that are relevant to understanding the variability in the growth of the immigrant population in a Southern European metropolitan area. More specifically, we studied the Barcelona metropolitan area as a special case characterized by a rapidly growing immigrant population over a short period of time. The OLS results demonstrated the importance of the classic factors mentioned in the literature: the labour market, the diversity of the immigrant population, the housing market, and geographic factors.

However, the aim of this paper goes beyond the gathering of new empirical evidence for the theoretical causes of immigrant population growth in a Southern European metropolitan area; in sections three and four, we show the relevance of spatial factors in this kind of analysis.

Firstly, spatial effects have to be introduced into the econometric specifications, for reasons related both to theory and to the available data. The main theoretical argument is that the location decision of a given immigrant household may be determined not only by the presence of immigrants in a given neighbourhood but also by the levels of immigration in nearby areas. As for the data, if the scale and location of the process under study do not correspond to the spatial units for which data are available, the mismatch will tend to produce a spatial pattern in the model.

Secondly, the spatial effects were explored using two models: the spatial lag and spatial error models. Following a standard specification search based on the LM and RLM tests, the spatial lag model was found to best describe the data. The final results indicated the importance of spatial effects in the growth of the immigrant population. Thirdly, we used LeSage and Pace’s (2009) approach to estimate the impact measures for each of the explanatory variables. For the Barcelona metropolitan area, our models revealed that the diversity of the immigrant population in a given area is less important than household income in explaining the growth of the immigrant population, and that this is especially the case with the spatial lag model. This result is important because the literature suggests that the presence of other immigrants is the primary determinant of immigrants’ location choices. It suggests that, in southern EU metropolitan areas, factors other than the presence of other immigrants may be more important in explaining immigrant population growth.

These results suggest a number of avenues for future research. In order to generalize this empirical model, further validation and new data from other southern EU metropolitan areas are needed. Previous studies of urban areas in Portugal (Lisbon and Oporto), Italy (Rome, Milan and Turin) and Greece (Athens) (Arbaci 2007, 2008; Kandylis et al. 2012; Rebelo 2012) pointed out the importance of the housing market. In this work, we have introduced other variables related to the labour market, immigrant diversity and geographic factors. These variables proved to be important for explaining the variability in the growth of the immigrant population within urban areas.

Finally, spatial econometric models should be extended to include not only spatial autocorrelation but also spatial heterogeneity, or at least municipality fixed effects to account for this heterogeneity.