1 Introduction

One distinctive feature of Italy, demographically and socio-economically, is its marked intra- and interregional geographical heterogeneity. Within individual regions, the differences between urban and rural areas, or coastal and mountainous ones, are broad and persistent (Salvati et al. 2020). But the variations between regions, in some cases real gaps, are also wide and deep-rooted, especially with regard to the North–South divide (Asso 2021). Generally speaking, and with some approximation, within the first type (intra-regional differences), urban areas, particularly metropolitan and/or provincial capitals, show more dynamic, or less stagnant, demographic and economic profiles, with populations that are on average less aged than in other contexts, since they are usually characterized by positive (internal and international) migration balances (Strozza et al. 2016). Indeed, international migration guarantees a reliable population turnover in a context such as Italy, where natural growth is nearly zero across the country (Buonomo et al. 2024). These contexts contrast with non-urban, if not properly rural, areas, which often experience systematic depopulation that increases their exposure to exogenous risks, especially natural ones (e.g., hydrogeological disruption, landslides, etc.), and which may be less accessible, even if they have a high territorial capital in natural and environmental terms (Reynaud et al. 2020; Benassi et al. 2021). Between these two poles (urban and rural areas) lie the peri-urban areas that, often because of geographical proximity and/or an adequate logistical and infrastructural system, are frequently affected by suburbanization processes fuelled both by centrifugal movements (i.e., from the centre of the metropolitan area towards the peripheral areas) and by migration from abroad or from other regional contexts (Bottai and Benassi 2016).
Of course, these aspects often affect (and are also affected by) Italians and foreigners differently because of their different age structures, their consequently differing propensity to move, and their socio-economic and professional conditions (Casacchia et al. 2022). The result is a complex and variegated territorial mosaic that becomes increasingly difficult to interpret through the classic quantitative analysis tools, especially when these are not applied at a local scale (Fotheringham and Sachdeva 2022). Even between regions, as mentioned above, the gap is wide and persistent (Casacchia et al. 2005). The differences are rooted in the history of Italy and, to date, do not seem to show any sign of change. Northern regions are, on average, more productive and richer than those of the South, where unemployment rates, especially among the young, are on average higher, productivity is lower, internal emigration is very intense, and the depopulation of entire inner areas is more widespread (Zambon et al. 2020). Naturally, this dual and polarized dynamic translates into a different capacity for attracting (and integrating) the foreign population. In this regard, it is sufficient to say that the last census (2021) registered just over 5 million foreign citizens in Italy. Of these, slightly under 3 million (59.1%) reside in Northern Italy, 1.2 million in the Centre (24.6%), and only 816 thousand in the South and the Islands (16.2%). One of the dimensions most directly affected by the presence of this foreign population is employment.
Even if, in recent years, the forms/types of migration have expanded and work migration (so-called labour-dominant) has become one of the many existing types of migration, and perhaps not even the most relevant, it is also true that work remains a fundamental determinant (driver) both as a push for migration and as a factor for inclusion in different local contexts, including in terms of settlement models (Benassi et al. 2023b). The topic of foreign employment and its role compared to native employment has been addressed several times in relation to Italy (Strozza 1995; Venturini 1999; Ambrosini 2001; Fullin and Reyneri 2011; Bonifazi and Marini 2014; Benassi and Naccarato 2018). Some of these contributions, such as that of Benassi and Naccarato (2018), have also addressed the subject from a spatial point of view using Geographically Weighted Regression (GWR) models but working at a geographical scale that is not strictly local, i.e., provincial (NUTS 3 level). Although many studies have focussed on the topic, some of which are spatial, none have approached the phenomenon of the employed foreign population as a multiscale process to the best of our knowledge. This paper aims to provide some initial observations using a multiscale local approach based on the methodology developed by Oshan and colleagues (2019). The idea is to evaluate the multiscale dimensions that characterize the geographical distribution of the employed foreign citizens residing in two of the most populous regions of Italy: Lombardy and Campania. Being respectively located in the North and in the South of Italy, these regions are representative of the North–South divide, perhaps to a greater extent than other contexts. Moreover, they host two of the most important metropolitan cities among the 14 in Italy, namely Milan and Naples. 
We intend to contribute to advancing the specific knowledge on this subject by answering the following research questions: Is the spatial allocation of employed foreign citizens a multiscale varying process? (RQ1); Are the determinants of this process stable across different spatial scales, or do they produce diversified effects in space? (RQ2); Do these effects vary from one regional context to another, in line with the North–South differential? (RQ3); Do these effects vary inside each regional context between metropolitan and non-metropolitan contexts? (RQ4). This paper is structured as follows: in Sect. 2, the data, methods, and geographical contexts of the analysis are described; subsequently, the results are presented in Sect. 3 and discussed in Sect. 4, where some final considerations are also outlined.

2 Methods and materials

2.1 Data and quantitative approach

Data used in the paper come from the Italian census (2021) and are produced and disseminated by the Italian National Institute of Statistics (Istat). The variables used, the same as those used in a previous study (Benassi and Naccarato 2018), are computed at the municipality level and shown in Table 1.

Table 1 Dependent and independent variables used in the regression analysis

The dependent variable (y) can be seen as a proxy of the importance of the foreign labour force in the local labour market. The three independent variables reflect different dimensions that have been shown to be correlated with the foreign presence: the unemployment rate of the Italian labour force, population density, and the aging index. The first variable (x1) can be considered a proxy of the attractiveness of the local labour market; the second (x2) reflects the level of urbanization of the different territorial contexts; and the third (x3) measures the level of aging of the population. The variables were standardized before model estimation. The regression analysis uses Multiscale Geographically Weighted Regression (MGWR), a model recently proposed by Oshan and colleagues (2019) that can be seen as an extension of the GWR model (Brunsdon et al. 1996; Fotheringham et al. 2003). Following Oshan and colleagues (2019), the MGWR model can be written as follows. Assuming that there are n observations, for observation \(i \in \left\{1,2,\dots ,n\right\}\) at location \(\left({\varphi }_{i},{\delta }_{i}\right)\),

$${y}_{i}= {\beta }_{0}\left({\varphi }_{i},{\delta }_{i}\right)+\sum_{j}{\beta }_{bwj}\left({\varphi }_{i},{\delta }_{i}\right){x}_{ij}+{\varepsilon }_{i}$$
(1)

where \(bwj\) in \({\beta }_{bwj}\) indicates the bandwidth used to calibrate the jth conditional relationship. The idea behind the MGWR model is that the scale of a spatially non-stationary relationship may vary for each predictor variable. The MGWR model can differentiate local, regional, and global processes by optimizing a different bandwidth for each covariate (Li and Fotheringham 2020). As pointed out in Benassi et al. (2023a), a significant advantage of MGWR over traditional local regression models (like GWR) is that it can more accurately capture the spatial heterogeneity within and across spatial processes, minimize overfitting, mitigate concurvity, and reduce bias in the parameter estimates (Oshan et al. 2020). The MGWR model is calibrated using a back-fitting algorithm, which maximizes the expected likelihood. The criteria for selecting the bandwidths are derived from the same procedure used in the conventional GWR framework, using the corrected Akaike Information Criterion (AICc) for finite samples (Fotheringham et al. 2022). In our empirical application, we used an adaptive (bi-square) kernel, which is better suited to handling a non-uniform spatial distribution of observations (i.e., municipalities in our case) and irregularly shaped study areas (Benassi et al. 2023a). To compare the bandwidths obtained from an MGWR model, it is necessary to standardize the dependent and independent variables so that they are zero-centred and based on the same range of variation. Consequently, the bandwidths are independent of the scale and variation of the explanatory variables, which makes their relative comparison possible (Oshan et al. 2020). To measure the multiscale dimension, we compute two parameters initially proposed by Yang and colleagues (2022): level of influence and scalability.
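The two preprocessing and weighting steps mentioned above (standardization and the adaptive bi-square kernel) can be illustrated with a minimal sketch. It is not the MGWR software's implementation: the function names `standardize` and `adaptive_bisquare_weights` are hypothetical, z-scores are assumed as the standardization, planar Euclidean distances are assumed, and the neighbour count `k` is assumed to include the focal municipality itself.

```python
import math


def standardize(values):
    """Return z-scores (zero mean, unit population standard deviation).

    Assumes the values are not all identical (non-zero standard deviation).
    """
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]


def adaptive_bisquare_weights(coords, i, k):
    """Bi-square weights around focal point i with an adaptive bandwidth.

    The bandwidth is the distance from point i to its k-th nearest
    neighbour (counting i itself), so it widens where points are sparse
    and narrows where they are dense.
    """
    xi, yi = coords[i]
    dists = [math.hypot(x - xi, y - yi) for x, y in coords]
    bandwidth = sorted(dists)[k - 1]
    weights = []
    for d in dists:
        if bandwidth > 0 and d < bandwidth:
            # Weight decays smoothly from 1 at the focal point to 0 at the bandwidth.
            weights.append((1 - (d / bandwidth) ** 2) ** 2)
        else:
            # Points at or beyond the bandwidth do not contribute.
            weights.append(0.0)
    return weights
```

In this sketch, a covariate with a small `k` is fitted from a tight neighbourhood around each municipality, while a `k` close to the total number of municipalities gives nearly every observation a non-zero weight, which is what a global relationship looks like.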
In the case of the level of influence, we identify the municipalities where the effects of a given independent variable on the dependent variable are statistically significant using, as suggested in Yu et al. (2020), the adjusted t-value (95%). We then divide the number of municipalities where the estimates are significant by the total number of municipalities covered in the analysis. If the given variable influences more than 50% of the total number of municipalities, it is categorized as a primary influence; otherwise (< 50%), as a secondary influence. Scalability is defined based on the bandwidth (number of nearest-neighbour municipalities for each location) of each variable, as obtained in the calibration procedure of the MGWR model. It has three categories: macro-regional (global), regional, and local. When the bandwidth of a variable covers no more than 25% of the municipalities of the entire region, the variable is considered local; when it covers between 25 and 50%, it is regional; and when it covers more than 50% of the municipalities of the region, it is macro-regional. The model was estimated using MGWR 2.2 software (Oshan et al. 2019), and the maps were created using QGIS 3.36.2.
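The two classification rules described above can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, a share of exactly 50% is assumed to fall in the lower category in both rules, and the counts of significant municipalities in the usage lines are made-up values, not results from this study. The bandwidths 46 and 549 and the total of 550 municipalities are the Campania figures reported in Sect. 3.2.

```python
def level_of_influence(n_significant, n_total):
    """Classify a covariate by the share of municipalities with a
    statistically significant local estimate (adjusted t-value)."""
    share = n_significant / n_total
    return "primary" if share > 0.5 else "secondary"


def scalability(bandwidth, n_total):
    """Classify a covariate by the share of municipalities covered by
    its optimized bandwidth (number of nearest neighbours)."""
    share = bandwidth / n_total
    if share <= 0.25:
        return "local"
    if share <= 0.5:
        return "regional"
    return "macro-regional"


# Bandwidths reported for Campania (550 municipalities).
print(scalability(46, 550))    # URI bandwidth
print(scalability(549, 550))   # PD bandwidth

# Hypothetical counts of significant municipalities, for illustration only.
print(level_of_influence(300, 550))
print(level_of_influence(100, 550))
```

Expressing both rules as shares of the regional total is what allows regions with very different numbers of municipalities to be compared on the same footing.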

2.2 Geographic contexts of analysis

The two regional contexts on which the analysis is based are representative of Northern and Southern Italy. They are, in fact, the most populous regions of each macro area and among the most important, not only demographically, of the entire nation. They include two of the most important Metropolitan Cities among the 14 currently existing in Italy: Milan and Naples. The first, Lombardy, can be considered Italy’s economic and financial engine. Located at the base of the so-called ‘blue banana’, it is a largely urbanized and industrialized region with a dense road infrastructure system and well-connected cities of different population sizes. The regional capital and metropolitan city of Milan is a dynamic and very compact city that has always been a destination for large migratory flows (both national, especially from the South during the economic boom period, and international, which have increased in recent years). In 2021, the residents of Lombardy numbered just under ten million, representing about 17% of the Italian population. Lombardy is the region of Italy where the highest number of foreigners reside. There are more than one million, representing about 23% of all foreign residents in Italy and 12% of the regional population. Milan is the municipality of Lombardy in which the largest numbers of both Italian and foreign residents are concentrated. Specifically, the residents of Milan number 1.5 million, which equals just under 14% of the population of Lombardy. Nearly 250 thousand are foreigners, accounting for 22% of foreigners residing in the region. To give an idea of the concentration of foreign presence in Milan, the second-ranked municipality for resident foreigners is Brescia, where 2% (20 percentage points less than in Milan) of the foreigners in Lombardy reside. Campania, after Lombardy and Lazio, is the third most populated region in Italy.
More than 5.5 million individuals reside in Campania, representing nearly 10% of the peninsula’s residents. It is also the first southern region by number of resident foreigners. However, Campania is the seventh region in this ranking, preceded by central and northern Italy regions. About 240,000 foreigners are residing in Campania; they account for nearly 5% of all foreigners residing in Italy, and their incidence is equal to just over 4% of the resident population in Campania. Naples is the municipality where most of Campania’s population resides. They number 920 thousand individuals, more than 16% of the region’s residents. This municipality is also home to the region’s largest number of foreigners. There are more than 53,000, and they account for more than 5% of the region’s foreign residents. The foreign incidence in the city of Naples is nearly 6%. Among the southern regions of Italy, Campania undoubtedly represents a case study of primary interest for two fundamental reasons. It is a region that hosts a considerable number of resident foreign citizens, especially when compared to other contexts in Southern Italy where foreign presence is, to date, more limited and less stable. It records a significant presence of foreign communities that are predominantly settled in the Campania region itself, such as the Ukrainian and Sri Lankan ones. The differences between the two regional contexts are not only related to the total number of resident foreigners but also to their characteristics. Referring to the country of citizenship (Table 2.A, Annex), we can observe that only two communities, among the five largest, appear in both contexts: the Romanian community, that presents a more recent settlement in Italy, and the Moroccan one, which, on the contrary, has a long tradition of immigration to Italy. Other major communities in Lombardy include the Egyptian, Albanian, and Chinese, while those in Campania include the Ukrainian, Sri Lankan, and Bangladeshi. 
This ethnic diversity is naturally linked to the different economies of the two regional contexts as well as to the various and complex dynamics of the local labour market. Due to length limitations, it is not possible to describe these aspects here, and we refer the reader to the extensive existing literature. We therefore limit ourselves to reporting some indicators from Istat that can provide some reference points. In 2021, the employment rate in Lombardy was 66.5%, higher than the national average (58.2%) and far above that of Campania (41.3%). This difference also characterizes rural areas. In fact, still referring to 2021, the employment rate in rural areas, which have traditionally been a focus of the immigrant labour force, was 65.0% in Lombardy, while in Campania it was only 44%, compared to a much higher national average of 56.2%. If we then consider youth unemployment, an important segment given the higher incidence of young people among the foreign population, the differences are even more striking: 21.2% in Lombardy and slightly less than 45% in Campania, with a national average of 29.7%. Such differences, as expected, are also reflected in wealth and well-being. They are also visible in Fig. 1, which shows the geographical distribution of the share of foreigners employed over the total employment (SFE) among the municipalities of Lombardy and Campania.

Fig. 1 Share of foreigners employed over the total employment (SFE). Quantile maps for the regions of Lombardy (a) and Campania (b)

In Lombardy, the highest SFE is observed in Milan (21%), the capital of both the Metropolitan City and the region, and in some areas located in the southern part of the region in the provinces of Mantova, Cremona, and part of Brescia. Conversely, the northern area of Lombardy (particularly the north-eastern area, provinces of Sondrio, northern part of Bergamo, Lecco, and Como) shows the lowest SFE values. In the case of Campania, the highest SFE values are observed in coastal areas and the municipalities closest to the sea across the provinces of Caserta, Naples, and Salerno. In this case too, the north-eastern area, together with the inner areas of the region, shows the lowest SFE values.

3 Results

Results for the global models are shown in Table 3.A of the Annex.

3.1 The case of Lombardy

The summary results of the MGWR model are shown in Table 2. Clearly, the share of foreigners employed over the total employment is a spatially varying multiscale process. Indeed, the local betas show high variability, as confirmed by the summary statistics in Table 2. Moreover, each independent variable has a different bandwidth, implying the existence of a multiscale process. In comparative terms, the variable with the broadest scale is the AI (68 municipalities, 4.6% of all municipalities). Conversely, the narrowest scale is that of the URI (47 municipalities, 3.1% of all municipalities). The explanatory capacity of the MGWR model is equal to 0.64. This value is higher than that of the global model (Table 3.A), which is actually quite low (0.073). This result is related to the fact that the MGWR model improves the estimation process, but also to the local effects of the intercept. It is worth mentioning that the better performance of the local model is also confirmed by its lower Akaike Information Criterion (AIC) value compared with that of the global model (4167.864). The added value of working with a local regression model is the possibility of mapping the local parameters and thus analysing their spatial patterns (Matthews and Yang 2012; Benassi and Naccarato 2017). The explanatory capacity of the model varies significantly across the entire regional surface (panel a in Fig. 2). In the municipalities that belong to the metropolitan city of Milan, the explanatory capacity of the model is high. These municipalities form an exceptionally clear and compact cluster. In the areas outside the Metropolitan City of Milan, the geographical distribution of the local R2 presents a higher variability and patterns that could be defined as clustered and dispersed. In the rural and inner municipalities of the region, the value of R2 is in general lower than the global value of 0.64 (Table 2).

Table 2 MGWR summary results

Fig. 2 Local R2 (a) and local condition number (b)

The local estimation process has no multicollinearity problem because the local condition number (panel b in Fig. 2) is always below the common threshold of 30. The geographies of the local betas (Fig. 3) help us to better understand each covariate’s scale and spatial patterns. The URI variable, which has a positive mean impact on the dependent variable (0.593 its mean value in the MGWR model; 0.06 with p < 0.05 in the OLS model), records the highest (positive) impact on the dependent variable in a cluster of municipalities located inside the Metropolitan City (MC) and in some other areas, especially in the provinces of Bergamo and Brescia (in the East), and in an area (in the South) that belongs to three different provinces (Pavia, Lodi, and Cremona). What is clear is that URI has an effect on the dependent variable in urban municipalities located both at the core and in the surrounding areas of metropolitan and urban systems. Population density (PD) positively impacts the dependent variable (0.380 its mean value in the MGWR model; 0.186 with p < 0.001 in the OLS model) and presents different geographies. In this case, the major (positive) effects are recorded in the northern area of Sondrio province and the upper part of Brescia, Como, and Bergamo. These are mountain areas, in some cases affected by depopulation processes and, in any case, characterized by small, typically rural municipalities with low population density. The aging level (AI), which presents a negative mean effect on y (− 0.366 its mean value in the MGWR model; − 0.158 with p < 0.001 in the OLS model), also has its own geographies. In this case, the effect is negative in most cases. All else being equal, where the population’s aging level is higher, the share of foreigners employed over the total employment is lower. The strongest negative local parameters are located along the south-east border, in the provinces of Mantova and Brescia, and in some areas of Bergamo.

Fig. 3 Local beta (Adj-t)(a). Quantile map

3.2 The case of Campania

As in the case of Lombardy, in Campania too it seems clear that the share of foreigners employed over the total employment is a spatially varying multiscale process. Indeed, the local betas show high variability, as confirmed by the summary statistics in Table 3. Moreover, each independent variable has a different bandwidth, implying the existence of a multiscale process. The major difference here is that one independent variable, the population density (PD), has a global effect, since its bandwidth (549) is almost the same as the total number of the region’s municipalities (550). Nevertheless, considering that this variable is not statistically significant in the OLS model (Table 3.A in the Annex), we must infer that the share of foreigners employed over the total employed is not influenced by population density in the regional context of Campania. This is a first major difference in comparison with the context of Lombardy. The bandwidth of the other two independent variables is almost the same: 46 municipalities (8.3% of the total municipalities) for URI and 43 municipalities (7.8% of the total municipalities) for AI. The explanatory capacity of the MGWR model is 0.50, which is higher than that of the global model (0.039). As in the case of Lombardy, the MGWR model records a lower AIC value than the global model: 1269.505 versus 1547.128. Looking at Fig. 4, we can see that, in this case too, the local distribution of R2 (panel a) reveals interesting spatial patterns, with clusters of municipalities where the explanatory capacity is much higher than the global value (0.50) and other clusters where the opposite holds. It is interesting to note that a cluster of municipalities with higher values of local R2 is, as in the case of Milan, located in the MC of Naples, in the eastern quadrant, without directly involving the municipality of Naples. Some other clusters of this type are in the provinces of Caserta, Benevento, and Salerno.
The local condition number map (Fig. 4, panel b) excludes the existence of a multicollinearity problem (values are consistently below the threshold of 30).

Table 3 MGWR summary results

Fig. 4 Local R2 (a) and local condition number (b)

The local betas (Fig. 5) reveal specific territorial patterns. The first independent variable (URI) has an overall negative impact on the dependent variable, with a mean value in the MGWR model of − 0.070, meaning that where the unemployment rate of the Italian labour force is higher, the percentage of foreigners employed is lower (i.e., where Italians do not work, foreigners do not work either). It should be noted, however, that in the global model this variable is not statistically significant (Table 3.A in the Annex). Looking at the local estimates, we can see that this relation is not significant everywhere and, more importantly, that it is negative in some areas (especially in the eastern part of the MC of Naples) but positive in others. Therefore, the two components of the labour force are competing in the latter areas but not in the former (where the sign of the coefficient is negative). From this point of view, the distinction between the metropolitan labour market and the non-metropolitan one is quite clear. Where it is significant, the net effect of the URI on the dependent variable is negative in the first case, while the opposite is true in the second case. This could imply that in the urban and metropolitan labour market there is no competition (i.e., there is complementarity) between the two components of the labour force (natives and foreigners). At the same time, in the non-metropolitan context, it seems that the two components of the labour force compete with each other. This could depend on the economic vocations and specializations of the different local contexts. The population density (PD), as in the global model, does not affect the dependent variable at the significance level considered. This result is quite far from the one obtained in the case of Lombardy, but it could depend on the different territorial structures of the two regions.
In Campania, 302 municipalities are classified as rural areas based on the Degree of Urbanization (Degurba) indicator (which is derived from population density) (Eurostat 2019). This means that 55% of the total municipalities are rural. In Lombardy, this share is equal to 41%. The proportion of geographic area covered by rural municipalities compared to the total area of the region is equal to 54.2% in Lombardy and to 66.5% in Campania. As in Lombardy’s case, AI has a negative mean effect on the dependent variable. It is important to note that this is the only independent variable that, in the case of Campania, is also statistically significant in the global model (− 0.177 with p < 0.001). The local geographies clearly distinguish between metropolitan areas (where the effect is negative and comparatively stronger) and non-metropolitan municipalities, where the effect is less negative and in some places positive.

Fig. 5 Local beta (Adj-t)(a). Quantile map

3.3 Detecting and interpreting multiscale effects in both regions

The two parameters of influence and scalability (Yang et al. 2022) allow us to compare the two regional study cases regarding the multiscale effect of the different independent variables. Even if in both regions the level of influence of the independent variables that are significant at local level is of a secondary type, the magnitude is quite different from one context to another. In the case of Lombardy, the percentage of municipalities where the local estimates are significant is always higher than 30% (Table 4).

Table 4 Multiscale effect. Influence and scalability

By contrast, in Campania it is quite low. Furthermore, an element of (partial) homogeneity is represented by the local scalability of all the independent variables (except PD in the case of Campania, which has a macro-regional effect). These results underline the different multiscalar structures of the process observed here and the heterogeneity between the drivers of this process in the two regional contexts. Moreover, it is worth recalling that even larger differences are recorded in the OLS model. In both regions, the global model has low explanatory capacity, but in the case of Lombardy all the independent variables are significant. In the case of Campania, only the AI variable is statistically significant at the global level.

4 Discussions and conclusions

The results of the exploratory analysis carried out so far allow us to formulate some first reflections. First, we must underline the different effects that the unemployment rate of the Italian labour force (URI) has on the dependent variable: a positive effect (both globally and locally) in the case of Lombardy and a negative effect (in some local contexts) in the case of Campania where, however, URI is not significant on a global scale. It seems, therefore, that in the first case, the two components of the labour force are competing. In the second case, the opposite is true in the specific contexts where the effect is negative. However, the analysis reveals the existence of a significant local heterogeneity that proves the importance (and the urgency) of using local scale analysis for statistics and society (Fotheringham and Sachdeva 2022). Each independent variable has its own geography regarding local effects and acts at different scales, proving that the process analysed here is multiscale (RQ1). Therefore, the determinants of the process analysed here are not stable across scales, and they produce (in most cases) diversified effects in space (RQ2). These effects vary from one regional context to another concerning the level of influence (RQ3). All the independent variables have a level of influence of secondary type in the case of Lombardy, while in the case of the Campania region, this is not true. For the most critical covariate, the URI, we can see clear differential patterns between MC municipalities and other municipalities inside each regional context but, as explained before, with different signs and patterns (RQ4). The study represents an attempt to deal with the foreign presence as a multiscalar process. To our knowledge, this is the first such study carried out in Italy. The study has some limitations: it uses just two regions as case studies, and the data are cross-sectional.
The explanatory capacity of the OLS models is very poor, indicating the need to select other independent variables in future studies. Nevertheless, it should be noted that a study carried out at the provincial level with data from the 2011 census (Benassi and Naccarato 2018), using the same variables, proved to be quite good at explaining the variance of the dependent variable. The significant heterogeneity in the model’s performance is another reason why it is essential to adopt a local and multiscalar approach in measuring spatial processes linked to demography and population issues (Voss 2007).