1 Introduction

Demographic dynamics are a pivotal dimension of regional disparities (O’Brien and Eger 2021). Measures that stimulate social cohesion and boost local development represent important policy tools supporting the sustainable management of cities and regions (Ehlert 2021). Sequential expansions and recessions of a given economic system have been demonstrated to alter the demographic response at both regional and local scales (Dijkstra et al. 2015; Cazzola et al. 2016; Carbonaro et al. 2018). A complex interplay of contextual factors that includes diversified production bases and articulated social organizations, variable unemployment rates, and the intensity of poverty along urban–rural gradients, among others, may influence the above-mentioned response to economic shocks (Serra et al. 2014; Carlucci et al. 2017; Wolff et al. 2022). Sudden and unpredictable disturbances to well-established demographic regimes exert a possibly relevant effect on both long-term and short-term population dynamics (Caltabiano 2008; Jackson et al. 2021; Gonzales-Leonardo and Spijker 2022).

Together with economic shocks, global pandemics have had a documented role in shaping population structures and local (socioeconomic) dynamics at large (Kalabikhina 2020; Gonzales-Leonardo et al. 2022; Kahraman et al. 2022), in both emerging and advanced economies (MacKellar and Friedman 2021). As a particularly intense and globalized event, the COVID-19 pandemic caused a sudden fall in economic activity, affecting population dynamics, limiting social cohesion, and exerting indirect impacts that require further investigation (Galati et al. 2022; Gutierrez et al. 2022; Vinci et al. 2022). More generally, the COVID-19 pandemic has influenced a wide ensemble of socioeconomic phenomena heterogeneously over spatial scales (Zhang and Schwartz 2020; Egidi and Manfredi 2021; Xu et al. 2022). Impacts have been positive or negative depending on the local/regional background (Zambon et al. 2020; Turok and Visagie 2021; Salvati 2022). The response to the shock has been demonstrated to be more or less rapid depending on the specific process (Ullah et al. 2020; De Rose et al. 2021; Thomas et al. 2022). Finally, short-term, medium-term, and long-term effects can differ, and the evidence in the literature is still sparse, since only short-term impacts have been investigated in some detail (Aassve et al. 2021; Dumont 2021; Alaimo et al. 2022a; Bailey et al. 2022).

How the COVID-19 pandemic has affected social behaviors (Alaimo et al. 2020, 2022b; Cruz-Cárdenas et al. 2021; Truong and Truong 2022), and particularly demographic dynamics, is a research issue that has been intensively discussed in the last 2 years (Aburto et al. 2022a, b; Mazzucco and Campostrini 2022; Schöley et al. 2022). Being substantially decoupled from demographic theory (Gonzales-Leonardo and Rowe 2022), empirical studies focused on specific phenomena most intuitively assumed to be short-term consequences of the pandemic (e.g. differential mortality and changes in life expectancy). Short-term impacts of COVID-19 on demographic processes supposed to be indirectly associated with the pandemic, such as fertility, nuptiality, childbearing propensity, and internal and international migrations, were largely dismissed or only occasionally investigated at a very local scale (Aassve et al. 2020), depending on the availability of relevant data and statistical indicators released on time (Sun et al. 2020). The debate on the intensity and duration of the consequences of COVID-19 in economic systems and local societies provided mixed results (Kalabikhina 2020), since the pandemic’s impacts were classified as typically short-term in most cases, being progressively reabsorbed in the following years (Marteleto et al. 2022). However, medium-term and long-term effects have been theoretically delineated (Gonzales-Leonardo et al. 2022) and need empirical confirmation once objective information and sufficiently long time-series indicators derived from official statistics become available.

Based on these premises, our study investigates the short-term and medium-term impacts of COVID-19 on a specific dimension of demographic dynamics, namely the consolidation and enlargement of spatial disparities in population growth rates (e.g. Strozza et al. 2016). In other words, we indirectly test the outcomes of the COVID-19 pandemic in the first two years after the outbreak (2020 and 2021) as reflected in the short-term changes of population balance resulting from differential fertility, mortality, and migration patterns over space (e.g. Sobotka et al. 2011; Vignoli et al. 2012; Stockdale 2016). Assuming a different impact of the COVID-19 pandemic on local populations (e.g. Tragaki and Bagavos 2014, 2019), we verify whether spatial divides in fertility, mortality, and migration rates have increased in the last years (sensu Billari and Vitali 2017) compared with previous dynamics investigated over a sufficiently long time window.

We further verify the joint impact of these three processes on the overall demographic balance, testing whether population growth rates converge or diverge over space (Zambon et al. 2020). An evident divergence in population growth rates across regions before and during the COVID-19 health crisis may confirm the role of the pandemic in fueling territorial disparities in complex socio-demographic dynamics (Goujon et al. 2021). The results of this study place empirical knowledge on COVID-19 in the context of local development, applied economics and regional demography (Pomar et al. 2022), allowing for a multi-disciplinary interpretation of the consequences of health crises that can be generalized to vastly different social contexts (Chakraborty and Maity 2020).

Operationally speaking, the present work applies an exploratory multivariate analysis of ten indicators representative of multiple demographic phenomena (fertility, mortality, nuptiality, internal and international migration) and the related outcomes of such dynamics (natural balance, migration balance, total population growth). We developed a quantitative analysis of the statistical distribution of such demographic indicators over space using nine metrics (i.e. descriptive statistics) reflective of spatial divides (Okuoughae and Omame 2020), thus controlling for shifts over time in central tendency, dispersion, and distributional shape.

All indicators were made available over 20 years (2002–2021) at a relatively detailed spatial scale (110 NUTS-3 provinces) in Italy. The COVID-19 pandemic exerted a particularly heavy impact on the Italian population because of multiple intrinsic (e.g. a particularly old population age structure compared with other advanced economies: Billari et al. 2007; Caltabiano et al. 2009; Benassi et al. 2020) and extrinsic (the early start of the pandemic spread compared with the neighboring European countries: Cutrini and Salvati 2021; Alaimo 2022; D’Urso et al. 2022) factors. For such reasons, Italy represents a sort of ‘worst’ demographic scenario for other countries affected by COVID-19 (Alaimo 2021a, b) and the results of our study may be informative when delineating policy measures (with both economic and social impact) mitigating the effect of pandemics on demographic balance (Alaimo and Maggino 2020) and improving the adaptation capacity of local societies to future pandemic crises (Wang and Chi 2017).

With this perspective in mind, the paper is structured into five sections. Sect. 2 illustrates the data and methodologies used; Sect. 3 details the empirical results of our analysis; Sect. 4 discusses the relevance of the main findings in light of the current literature; and Sect. 5 concludes by delineating the original contribution of this work and the prospects for future research.

2 Data and methods

2.1 Study area

Italy extends over nearly 301,330 km² and is partitioned into three basic macro-regions (North, Centre, South) and 20 administrative regions (Salvati et al. 2017) that reflect socioeconomic disparities along the latitude gradient (Ciommi et al. 2018). Such disparities, whose consolidation is reflected in the widely discussed North–South gap, should be kept in mind when analyzing economic phenomena and designing (or implementing) social policies in Italy (Alaimo and Maggino 2020). For decades, Southern Italy was considered a marginal and economically disadvantaged area with a dynamic population (e.g. high fertility, low mortality). Northern Italy, one of the wealthiest areas in Europe, attracted population and workers from both Southern Italy and abroad (Zambon et al. 2020), evidencing in turn a particularly accentuated urban–rural gap (Ferrara et al. 2017) as far as accessibility (Zambon et al. 2017), economic opportunities (Carlucci et al. 2017), and demographic structure (Strozza et al. 2016) are concerned.

2.2 Data and indicators

The present study used official statistics from the website of the Italian National Institute of Statistics (Istat), which releases a full set of population data (www.demo.istat.it). We used spatially stabilized and fully comparable time series from a dashboard of demographic indicators covering the time interval between 2002 and 2021 and calculated from the national population register (Caltabiano 2008; Caltabiano et al. 2009; Vitali and Billari 2017). For all these indicators, yearly figures were calculated at the level of Italian provinces (NUTS-3 level of the European nomenclature). The spatio-temporal data series is the longest (20 years) available for Italy at the province scale (NUTS-3) and covers a large number of indicators representative of different demographic processes (Modena et al. 2014; Del Bono et al. 2015; Recanatesi et al. 2016) at the lowest desirable spatial scale (110 units reflecting administrative boundaries that may describe, likely better than other geographic partitions, the important socioeconomic divides existing in the country).

The selection of a restricted number of non-redundant indicators relevant to this study was based on an earlier study by Alaimo et al. (2022a) that focused on the main changes in population dynamics affecting the demographic balance, and on additional demographic phenomena possibly influenced by the COVID-19 pandemic in the medium term (Fiore et al. 2020). Population balance indicators made available for each year of investigation include: (1) crude birth rate, (2) crude death rate, (3) the consequent natural balance (births–deaths), calculated as a percent rate of native population growth, (4) internal migration rate, (5) foreign migration rate, representative of the net migration balance (immigrants–emigrants), calculated as the percent rate of non-native population growth and, finally, (6) population annual growth rate (%). Ancillary indicators of specific demographic phenomena (basically marriage, fertility, and aging) include: (7) gross marriage rate, (8) mean age at childbirth, (9) total fertility rate, and (10) mean population age.

2.3 Statistical analysis

To analyze short-term changes in the demographic balance of Italy and the formation (or consolidation) of spatial disparities in demographic processes over time, possibly associated with external shocks such as the COVID-19 pandemic, we followed the operational scheme proposed by Aassve et al. (2020). More specifically, we compared the indicators illustrated above (10 variables expressed as an average of two time intervals of equal duration: 2002–2010 and 2011–2019) with the respective annual values in two subsequent years (2020 and 2021), controlling for the regional context, i.e. comparing values along their spatial distribution over Italy, namely 110 values associated with each Italian province (e.g. Colantoni et al. 2015; Bagavos et al. 2018; Ciommi et al. 2018). The two time intervals (i.e. 2002–2010 and 2011–2019) were assumed to be representative of (1) economic expansion (2002–2010) and demographic recovery (mainly of fertility and immigration) after a relatively long decline since the late 1980s, and (2) recession (2011–2019) with a progressive demographic decline (Salvati et al. 2017). Indicators’ values observed in the years 2020 and 2021 were instead assumed to reflect the short- and medium-term impact of the COVID-19 pandemic on population dynamics and demographic structures. The results of the statistical analysis highlight how the COVID-19 pandemic has exerted (more or less) considerable pressure on population dynamics, determining short-term (mortality increase), medium-term (more volatile migration flows) and, possibly, long-term (fertility decline) effects, likely consolidating the existing demographic divide along the latitude gradient in Italy (Salvati and Benassi 2020). Being representative of relevant socio-demographic processes for Italy, e.g. fertility, aging, migrations (Wachter 2005), the indicators selected in this study are rather well known in spatial demography, and are regarded as particularly stable and reliable over both time and space (Colantoni et al. 2012; Salvati 2014; Di Feliciantonio and Salvati 2015; Ferrara et al. 2016).
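The comparison scheme described above can be sketched for a single indicator in a single province. This is a minimal illustration only: the function name `period_comparison` and the example series are hypothetical, and the differences are taken against the 2011–2019 average purely for demonstration.

```python
from statistics import fmean

def period_comparison(series):
    """Compare two baseline averages with the 2020 and 2021 values.

    `series` maps each year (2002..2021) to the value of one indicator
    in one province. All names here are illustrative assumptions.
    """
    expansion = fmean(series[y] for y in range(2002, 2011))   # 2002-2010
    recession = fmean(series[y] for y in range(2011, 2020))   # 2011-2019
    return {
        "2002-2010": expansion,
        "2011-2019": recession,
        "2020": series[2020],
        "2021": series[2021],
        "2020 minus 2011-2019": series[2020] - recession,
        "2021 minus 2011-2019": series[2021] - recession,
    }

# Hypothetical crude birth rate (per 1000 inhabitants) for one province.
example = {y: 10.0 for y in range(2002, 2011)}
example.update({y: 8.0 for y in range(2011, 2020)})
example.update({2020: 7.0, 2021: 7.5})
summary = period_comparison(example)
```

Repeating the calculation for each of the 110 provinces gives the spatial distribution that the descriptive metrics then summarize.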

2.3.1 Descriptive statistics

We studied the statistical distribution of these indicators across Italian provinces using descriptive statistics (2002–2019) compared with the respective values observed for 2020 and 2021. Dissimilarities between the average values (2002–2019) and the current values (2020 and 2021) of these indicators were assumed as estimates of short- and medium-term impacts of COVID-19 on population dynamics in Italy (Boyle 2003; Castro 2007; Ciganda 2015). Descriptive statistics calculated by year (from 2002 to 2021) allowed an explicit analysis of the dissimilarity of a given indicator’s value before and during the pandemic based on nine basic metrics, calculated as follows: (1) the median value of the statistical distribution for a given indicator (‘Median’), (2) the absolute difference between maximum and minimum values (‘Max–Min’), the absolute differences between (3) 95th and 5th percentile values (‘95th-5th’), and between (4) 75th and 25th percentile values (‘75th–25th’), the absolute (5) standard deviation (‘Dev.St’), indicators of (6) Kurtosis (‘Kurtosis’) and (7) Asymmetry (‘Asymmetry’), as well as the ratios of (8) Median to (arithmetic) mean values (‘Med-Mean’) and of (9) mode to median (‘Mode-Med’) values (Salvati 2016). These metrics were considered appropriate to analyze heterogeneous statistical distributions with deviations from normality (Gavalas et al. 2014; Salvati et al. 2018; Ciommi et al. 2018).
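As a rough sketch of how the nine metrics could be computed for one indicator in one year, consider the function below. Several choices are assumptions not stated in the text: the population standard deviation, excess kurtosis, moment-based asymmetry, the "inclusive" percentile interpolation, and a mode taken on values rounded to one decimal (a continuous indicator has no natural mode otherwise).

```python
import statistics as st

def spatial_metrics(values, mode_decimals=1):
    """Nine spatial-divide metrics for one indicator in one year.

    `values` holds one value per province. Assumes the values are not all
    identical and that mean and median are non-zero (needed for the ratios).
    """
    xs = sorted(float(v) for v in values)
    n = len(xs)
    mean = st.fmean(xs)
    median = st.median(xs)
    sd = st.pstdev(xs)  # population standard deviation (assumption)
    # 99 cut points: cuts[k - 1] is the k-th percentile.
    cuts = st.quantiles(xs, n=100, method="inclusive")
    z = [(x - mean) / sd for x in xs]
    # Mode of rounded values (rounding precision is an assumption).
    mode = st.mode([round(x, mode_decimals) for x in xs])
    return {
        "Median": median,
        "Max-Min": xs[-1] - xs[0],
        "95th-5th": cuts[94] - cuts[4],
        "75th-25th": cuts[74] - cuts[24],
        "Dev.St": sd,
        "Kurtosis": sum(v ** 4 for v in z) / n - 3.0,  # excess kurtosis
        "Asymmetry": sum(v ** 3 for v in z) / n,       # moment skewness
        "Med-Mean": median / mean,
        "Mode-Med": mode / median,
    }

# Toy input standing in for 110 provincial values.
metrics = spatial_metrics([1, 2, 3, 4, 5])
```

Applying the function to each year from 2002 to 2021 would yield a 20 × 9 matrix per indicator, which is the layout described for Tables 1–10.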

2.3.2 Principal component analysis

A Principal Component extraction was run separately for each demographic indicator in order to estimate the multivariate distance between population dynamics during the long-term stage before COVID-19 and what has been observed during (2020) and immediately after (2021) the pandemic in Italy (Zambon et al. 2020). More specifically, a Principal Component Analysis (PCA) was run separately on each outcome matrix derived from the descriptive analysis run as above (see Tables 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, each formed of 9 statistical metrics, from ‘median’ to ‘mode-med’, by column, and 20 years, from 2002 to 2021, by row). PCA summarized and depicted graphically the trajectory over time of each demographic indicator (2002–2021) in Italy (Salvati and Serra 2016). Biplots made clear the possible year anomaly (sensu Delfanti et al. 2016; Recanatesi et al. 2016; Egidi and Manfredi 2021) by comparing short-term demographic dynamics observed in 2020 and 2021 with those observed in earlier years, individually and as a long-term average (Ferrara et al. 2016).
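The per-indicator PCA can be illustrated with a short NumPy sketch on a 20 × 9 matrix (years by metrics). The input below is synthetic random data, not the Istat figures, and standardizing the columns before extraction (i.e. a correlation-matrix PCA) is an assumption, since the text does not state which variant was used.

```python
import numpy as np

def pca_year_scores(metrics_by_year, n_components=2):
    """PCA of a (years x metrics) matrix via SVD of standardized columns.

    Returns year scores, metric loadings and the share of variance
    explained by each retained component. Assumes no constant column.
    """
    X = np.asarray(metrics_by_year, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    loadings = Vt[:n_components].T
    explained = s[:n_components] ** 2 / np.sum(s ** 2)
    return scores, loadings, explained

# Synthetic example: 20 years (2002-2021) x 9 metrics, with the last
# two rows (2020, 2021) shifted to mimic an anomalous period.
rng = np.random.default_rng(0)
years = list(range(2002, 2022))
demo = rng.normal(size=(20, 9))
demo[-2:] += 3.0
scores, loadings, explained = pca_year_scores(demo)
for year, (pc1, pc2) in zip(years[-2:], scores[-2:]):
    print(year, round(float(pc1), 2), round(float(pc2), 2))
```

Plotting `scores` for all years on the first two axes gives the kind of trajectory a biplot displays, with 2020 and 2021 readable against the 2002–2019 cloud.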

Table 1 Descriptive statistics of crude birth rate (province-level) in Italy, by year
Table 2 Descriptive statistics of crude death rate (province-level) in Italy, by year
Table 3 Descriptive statistics of crude marriage rate (province-level) in Italy, by year
Table 4 Descriptive statistics of internal migration balance (province-level) in Italy, by year
Table 5 Descriptive statistics of international migration balance (province-level) in Italy, by year
Table 6 Descriptive statistics of total migration balance (province-level) in Italy, by year
Table 7 Descriptive statistics of natural population change (province-level) in Italy, by year
Table 8 Descriptive statistics of total population growth (province-level) in Italy, by year
Table 9 Descriptive statistics of total fertility rate (province-level) in Italy, by year
Table 10 Descriptive statistics of the average mother’s age at birth (province-level) in Italy, by year

2.3.3 Fuzzy clustering

Finally, we verified the existence of homogeneous groups of provinces at three time points (2002–2019, 2020, and 2021), identifying the most characteristic indicators for each province’s profile (D’Urso 2016). In addition, by comparing the year of the outbreak of the COVID-19 pandemic in Italy (2020) with both 2002–2019 and 2021 values, we highlighted a possible pandemic effect in demographic spatial dynamics (D’Urso et al. 2019). In order to correctly deal with the complexity of the phenomenon (Alaimo 2021a, b, 2022), we decided to adopt a fuzzy approach. Fuzzy clustering is an overlapping approach based on Fuzzy Set Theory (Zadeh 1965). By relaxing the condition of mutual exclusivity, overlapping clustering techniques allow units to belong to more than one cluster simultaneously (Bezdek 1981), each with a certain membership degree (D’Urso 2016). The fuzzy approach has several advantages (D’Urso et al. 2019) and is particularly suitable for the analysis of socioeconomic phenomena, as demonstrated in earlier studies (see, for instance, Fiore et al. 2020; D’Urso et al. 2022; Galati et al. 2022). In this paper, the Fuzzy k-Means (FkM) algorithm (Bezdek 1981), a renowned fuzzy clustering technique that generalizes the standard k-means method, was adopted for computation. Let the following matrix be given as:

$${\mathbf{X}} = { }\left\{ {{\text{x}}_{{{\text{ij}}}} :{\text{i}} = 1 \ldots {\text{n}};{\text{j}} = 1 \ldots {\text{p}}} \right\} = \left( {\begin{array}{*{20}c} {{\text{x}}_{11} } & \cdots & {{\text{x}}_{{1{\text{p}}}} } \\ \vdots & \ddots & \vdots \\ {{\text{x}}_{{{\text{n}}1}} } & \cdots & {{\text{x}}_{{{\text{np}}}} } \\ \end{array} } \right)$$
(1)

where i = 1, …, n are the units of analysis (the Italian provinces in this study) and j = 1, …, p are the variables (in this case, 10 demographic indicators). The FkM method is formalized as follows (see Footnote 1):

$$\left\{ {\begin{array}{*{20}l} {min:\quad \mathop \sum \limits_{{{\text{i}} = 1}}^{{\text{n}}} \mathop \sum \limits_{{{\text{c}} = 1}}^{{\text{k}}} {\text{u}}_{{{\text{ic}}}}^{{\text{m}}} \left\| {{\mathbf{x}}_{{\text{i}}} - {\mathbf{h}}_{{\text{c}}} } \right\|^{2} } \hfill \\ {s.t. \quad \mathop \sum \limits_{{{\text{c}} = 1}}^{{\text{k}}} {\text{u}}_{{{\text{ic}}}} = 1, {\text{u}}_{{{\text{ic}}}} \ge 0} \hfill \\ \end{array} } \right.$$
(2)

where \({\text{u}}_{{{\text{ic}}}}\) is the membership degree of the ith observation to the cth cluster; \({\mathbf{x}}_{{\text{i}}} = \left( {{\text{x}}_{{{\text{i}}1}} ,{\text{ x}}_{{{\text{i}}2}} , \ldots {\text{ x}}_{{{\text{ip}}}} } \right)\) represents the vector of the ith observation; \({\mathbf{h}}_{{\text{c}}} = \left( {{\text{h}}_{{{\text{c}}1}} ,{\text{ h}}_{{{\text{c}}2}} , \ldots {\text{h}}_{{{\text{cp}}}} } \right)\) denotes the cth centroid; \(\left\| {{\mathbf{x}}_{{\text{i}}} - {\mathbf{h}}_{{\text{c}}} } \right\|^{2}\) is the squared Euclidean distance between the ith observation and the centroid of the cth cluster; m is a parameter controlling the fuzziness of the partition (in this paper, we used m = 1.3). Centroids thus summarize the characteristics of each cluster. In particular, each centroid represents an appropriately weighted average of the characteristics of the units in the respective cluster, and was used to formulate an augmented interpretation of the underlying phenomena.
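To make the optimization in Eq. (2) more concrete, the sketch below alternates the standard centroid and membership updates of fuzzy k-means. It is a minimal NumPy illustration under stated assumptions (random initial memberships, a fixed seed, a simple convergence tolerance) and is not the software used in the study; the toy data are hypothetical.

```python
import numpy as np

def fuzzy_k_means(X, k, m=1.3, n_iter=200, tol=1e-8, seed=0):
    """Minimal fuzzy k-means: returns memberships U (n x k) and centroids H (k x p)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], k))
    U /= U.sum(axis=1, keepdims=True)            # each row sums to 1
    for _ in range(n_iter):
        W = U ** m
        H = (W.T @ X) / W.sum(axis=0)[:, None]   # membership-weighted centroids
        # Squared Euclidean distances between every unit and every centroid.
        d2 = ((X[:, None, :] - H[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)               # avoid division by zero
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    W = U ** m
    H = (W.T @ X) / W.sum(axis=0)[:, None]       # centroids for the final U
    return U, H

# Toy data: two well-separated groups of four points each.
X_demo = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
                   [10, 10], [10, 11], [11, 10], [11, 11]], dtype=float)
U_demo, H_demo = fuzzy_k_means(X_demo, k=2, m=1.3)
labels_demo = U_demo.argmax(axis=1)
```

With m close to 1, as in the value m = 1.3 used here, memberships tend to be nearly crisp for units near a centroid and intermediate only for units lying between clusters.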

For the choice of the optimal partition, we adopt the Fuzzy Silhouette (FS) index (Campello and Hruschka 2006) formalized as follows:

$${\text{FS}} = \frac{{\mathop \sum \nolimits_{{{\text{i}} = 1}}^{{\text{I}}} \left( {{\text{u}}_{{{\text{pi}}}} - {\text{u}}_{{{\text{qi}}}} } \right)^{{\upalpha }} \cdot {\uplambda }_{{\text{i}}} }}{{\mathop \sum \nolimits_{{{\text{i}} = 1}}^{{\text{I}}} \left( {{\text{u}}_{{{\text{pi}}}} - {\text{u}}_{{{\text{qi}}}} } \right)^{{\upalpha }} }},\quad {\uplambda }_{{\text{i}}} = \frac{{\left( {{\text{b}}_{{{\text{pi}}}} - {\text{a}}_{{{\text{pi}}}} } \right)}}{{\mathop {\max }\limits_{{}} \left\{ {{\text{b}}_{{{\text{pi}}}} } \right.,\left. {{\text{a}}_{{{\text{pi}}}} } \right\}}}$$
(3)

where \({\text{a}}_{{{\text{pi}}}} { }\) is the average distance of the i-th object to all other objects belonging to the same cluster p and \({\text{b}}_{{{\text{pi}}}} { }\) is the minimum (over clusters) average distance of the i-th unit to all units belonging to a cluster q with \({\text{p }} \ne {\text{q}}\); \(\left( {{\text{u}}_{{{\text{pi}}}} - {\text{u}}_{{{\text{qi}}}} } \right)^{{\upalpha }}\) is the weight of each \({{ \lambda }}_{{\text{i}}}\), where \({\text{u}}_{{{\text{pi}}}}\) and \({\text{u}}_{{{\text{qi}}}}\) correspond to the first and second largest elements of the i-th column of the fuzzy partition matrix U, respectively; \({\alpha } \ge 0\) is an optional user-defined weighting coefficient. A higher value of FS means a better assignment of the units to the clusters, implying that, simultaneously, the intra-cluster distance is minimized while the inter-cluster distance is maximized.
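The FS index of Eq. (3) can be sketched as follows. The code assumes Euclidean distances between units, assigns each unit to its highest-membership cluster to compute the crisp silhouette \({\uplambda }_{{\text{i}}}\), and sets the silhouette of a unit that is alone in its cluster to zero; these conventions, the membership matrix laid out as units by clusters, and the toy data are assumptions made for illustration.

```python
import numpy as np

def fuzzy_silhouette(X, U, alpha=1.0):
    """Fuzzy Silhouette: crisp silhouettes weighted by (u_p - u_q) ** alpha.

    X is (n x p) data, U is (n x k) memberships (one row per unit).
    Assumes at least one unit has distinct top-two memberships.
    """
    X = np.asarray(X, dtype=float)
    U = np.asarray(U, dtype=float)
    n, k = U.shape
    labels = U.argmax(axis=1)                      # closest cluster p per unit
    D = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2))
    s = np.zeros(n)
    for i in range(n):
        same = labels == labels[i]
        same[i] = False
        others = [D[i, labels == c].mean()
                  for c in range(k) if c != labels[i] and (labels == c).any()]
        if not same.any() or not others:
            continue                               # silhouette left at 0
        a = D[i, same].mean()                      # within-cluster distance
        b = min(others)                            # nearest other cluster
        s[i] = (b - a) / max(a, b)
    top2 = np.sort(U, axis=1)[:, -2:]              # second largest, largest
    w = (top2[:, 1] - top2[:, 0]) ** alpha
    return float((w * s).sum() / w.sum())

# Toy check: a sensible partition should score higher than a scrambled one.
X_toy = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
                  [10, 10], [10, 11], [11, 10], [11, 11]], dtype=float)
U_good = np.array([[0.9, 0.1]] * 4 + [[0.1, 0.9]] * 4)
U_bad = np.array([[0.9, 0.1], [0.1, 0.9]] * 4)
fs_good = fuzzy_silhouette(X_toy, U_good)
fs_bad = fuzzy_silhouette(X_toy, U_bad)
```

Computing the index for each candidate number of clusters and keeping the partition with the highest value mirrors the selection rule described above.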

3 Results

3.1 Descriptive analysis

Tables 1, 2, 3, 4, 5, 6, 7, 8, 9 and 10 summarize the statistical distribution of selected demographic indicators across Italian provinces, by year. As expected, birth rates and death rates showed a reverse pattern over time. While fertility declined in the last two decades, the last two years with COVID-19 evidenced a further fertility slowdown, which seems unlikely to be recovered in the near future (Table 1).

COVID-19 accelerated population aging in Italy, consolidating a trend observed since the early 1990s. Death rates increased substantially, being the highest both in 2020 and 2021, with COVID-19 representing an additional cause of death in a context of population aging (Table 2). A mild recovery was observed for 2021 as compared with 2020, although a return to pre-COVID values seems unlikely in the near future. Population aging was more intense in Southern Italy, although the mean age of the population was systematically higher in Northern Italy.

Taken together, basic demographic indicators delineated a progressive aging and a sudden fertility decline, made more intense during both 2020 and 2021. In this direction, gross marriage rates (Table 3) declined substantially between 2002–2010 and 2011–2019 and decreased further in 2020, likely because of marriage postponements, with a moderate recovery in 2021.

The COVID-19 pandemic caused a moderate slowdown of internal migrations, preserving (at least temporarily) the traditional south–north flows, and possibly supporting a residual demographic dynamism of Northern Italy (Table 4).

Conversely, the foreign migration balance decreased substantially over time, moving from positive figures in 2002–2010 to almost null values in 2011–2019. These values declined further in 2020. In this case, the impact of COVID-19 seems to add to the medium-term effect of the economic crisis in Italy, lowering the economic attractiveness of regions and cities to foreign migrants (Table 5). A moderate recovery in 2021 consolidated the traditional disparities between Northern Italy (more attractive) and Southern Italy (less attractive).

Total migration rates in the Italian provinces differed largely over time (Table 6). Descriptive statistics indicate 2020 as an outlier reflecting the disturbance of COVID-19 on migration flows, because of lock-down policies all over the world during the first semester of the year. International migration flows recovered moderately during 2021, and the overall impact of COVID-19 (whether temporary or more persistent) is still under scrutiny, needing longer time series for a confident analysis.

As a consequence of such dynamics, natural balance was slightly negative between 2002 and 2010, decreasing in 2011–2019, and shifting toward negative values in 2020, and finally recovering weakly in 2021 (Table 7).

Considering natural balance and migration rates together, total population growth moved from positive rates for 2002–2010 to weakly negative rates for 2011–2019, turning further to negative rates for 2020, with a modest recovery observed for 2021 (Table 8).

Fertility divides (higher birth rates in Northern Italy than in Southern Italy) consolidated over time, reversing, especially in the pandemic years, the traditional interpretation of Southern regions as the (internal) demographic engine of the country (Table 9). The total fertility rate was rather stable in the last two decades and a moderate decline was recorded in 2020 and 2021 (on average, 1 child fewer per 10 women per year).

On the contrary, the mean age at childbirth increased almost linearly over time. COVID-19 was assumed to indirectly consolidate childbearing postponement all over Italy, with a more evident trend in Southern Italy (Table 10).

3.2 Multivariate exploratory analysis

Results of a Principal Component Analysis run separately on each descriptive statistics’ outcome matrix (Tables 1, 2, 3, 4, 5, 6, 7, 8, 9, 10) are reported in Table 11. A careful scrutiny of component loadings shows that almost all indicators reflected a marked increase in the divide between Northern and Southern Italy. Fertility divides (considering both crude birth rates and the total fertility rate) increased strongly, reaching the maximum imbalance in 2020 and 2021. On the contrary, COVID-19 had the indirect effect of levelling out the traditional disparities in death rates, which were lower in Northern Italy before the pandemic but increased substantially in both 2020 and 2021. Consequently, the shift of the natural balance toward negative values was also more homogeneous over space with COVID-19.

Table 11 Spatial distribution of demographic indicators by time and geographical region in Italy and the relative difference of 2020 (bold) and 2021 (italics) with a reference period (2002–2019), considering the outcomes of a Principal Component Analysis (Axes 1 + 2) run separately for each indicator

Biplots (Fig. 1) of PCAs run separately on each descriptive statistics’ outcome matrix (Tables 1, 2, 3, 4, 5, 6, 7, 8, 9, 10) were used to delineate the trajectory over time of each demographic indicator (2002–2021). Biplots outlined the anomaly of 2020 and 2021 dynamics compared with previous years. By reducing processes’ dimensionality and better managing data redundancy over time, principal component extraction ensured a high proportion of explained variance on the first two axes, taken as relevant dimensions for all demographic indicators. The variance extracted from the two axes was rarely less than 70%, ensuring a reliable representation of the demographic trajectories over time, and thus highlighting the anomalies of the demographic dynamics in 2020 and 2021 compared with the earlier two decades, assumed as the baseline.

Fig. 1 Results of a principal component analysis (biplot) run separately on each descriptive statistics’ outcome matrix (Tables 1, 2, 3, 4, 5, 6, 7, 8, 9, 10) and depicting the trajectory over time of each demographic indicator (2002–2021), and thus the possible year anomaly by comparing 2020 and 2021 with previous years, individually and as a long-term average

Considering Components 1 and 2 together, it can be seen that both 2020 and 2021 showed anomalous dynamics compared with the reference period for almost all indicators. In particular, we defined as ‘anomalous’ the years in which loadings were higher than (or close to) the range (max–min) observed in the previous two decades (2002–2019). The highest loadings (anomalous compared with the previous time interval) on component 1 were observed for crude death and marriage rates specifically for 2020, and for the total population growth rate. Component 2 highlighted the highest (anomalous) loading for the total migration rate with reference to 2020.

In general, 2021 represented a less anomalous year from a demographic point of view, according to the results of the PCA. Nevertheless, important anomalies were recorded, sometimes more intense than those previously observed for 2020, with regard to the total fertility rate, the natural population balance, as well as the crude birth rate and the crude death rate, namely on Component 1. The crude rates of births, deaths, and marriages on Component 2 were found anomalous in 2021 compared with 2020 and the earlier years. These results suggest a medium-term pandemic effect that seems to cumulate with the more intense (but likely temporary) impact of 2020.

3.3 Fuzzy clustering

This section presents the results of fuzzy clustering, obtained by applying the FkM algorithm to the data matrix consisting of the Italian provinces (row units) and the demographic indicators considered in this paper (column variables) on three different time occasions: the 2002–2019 average, and the individual years 2020 and 2021. Thus, three different analyses were performed on the same set of units and variables over three different time periods. Table 12 and Fig. 2 report the FS index for \(2{ } \le {\text{C}} \le 6\) for the three analyses. Based on the FS index, we chose the solution with 3 clusters (C = 3) for the average 2002–2019 and for 2021; a solution with 4 clusters (C = 4) was the optimal one for 2020.

Table 12 Cluster validation: fuzzy silhouette (FS) index for different values of c; years: average 2002–2019; 2020; 2021 (bold indicates the highest index, corresponding to the selected solution)
Fig. 2 Cluster validation: fuzzy silhouette (FS) index for different values of c; years: average 2002–2019; 2020; 2021

In order to evaluate clusters’ fuzziness, we specified a cut-off point for the membership degree, which depends on the partition chosen. In a three-cluster solution, if the highest membership degree of a unit is below 0.6, the unit is considered to have a reasonable level of fuzziness in its cluster membership (for more information on the choice of cut-off, see Maharaj and D’Urso 2011). Consequently, 0.6 was chosen as the cut-off, and units whose membership degree to every cluster fell below that value were regarded as fuzzy. In a four-cluster solution, a reasonable choice for the cut-off was 0.5 (D’Urso et al. 2022).
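The cut-off rule can be expressed in a few lines. In the sketch below, the function name `classify_units` is hypothetical; the two reported membership degrees for CA, FR, NU and PV are those given for the 2002–2019 partition, while the third degree of each is filled in as the remainder to 1 (an assumption) and the unit "XX" is an invented, clearly assigned province added for contrast.

```python
def classify_units(memberships, cutoff):
    """Assign each unit to its top cluster (1-based), or None if fuzzy.

    A unit is regarded as fuzzy when its highest membership degree
    is below the cut-off.
    """
    result = {}
    for unit, degrees in memberships.items():
        best = max(range(len(degrees)), key=degrees.__getitem__)
        result[unit] = best + 1 if degrees[best] >= cutoff else None
    return result

# Membership degrees to Clusters 1, 2, 3 (third value: assumed remainder).
memberships_3 = {
    "CA": (0.45, 0.38, 0.17),
    "FR": (0.47, 0.48, 0.05),
    "NU": (0.52, 0.43, 0.05),
    "PV": (0.00, 0.44, 0.56),
    "XX": (0.90, 0.05, 0.05),   # hypothetical non-fuzzy unit
}
assigned = classify_units(memberships_3, cutoff=0.6)
fuzzy_units = sorted(u for u, c in assigned.items() if c is None)
```

Lowering the cut-off to 0.5, as done for the four-cluster solution, would classify a unit with a top membership of 0.52 as non-fuzzy, which shows how strongly the count of fuzzy provinces depends on this threshold.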

We first examined the empirical results of fuzzy clustering with reference to the 2002–2019 average (Table 13). Figure 3 reports both cluster composition and centroids. Cluster 1 was composed of 26 provinces from Southern Italy. By examining the centroid, better values than the national ones were recorded for the crude birth, death, and marriage rates as well as the natural population change. Conversely, internal and international migration balances, as well as the total migration rate and the total population growth rate, showed lower values than the Italian ones. Finally, the total fertility rate was in line with national data, and the mean mother’s age at birth was lower than the Italian one.

Table 13 Values of demographic indicators associated with fuzzy clusters, average 2002–2019
Fig. 3 Cluster composition and centroids for NUTS-3 Italian provinces (average 2002–2019)

Cluster 3 was made up of 43 provinces from Central and Northern Italy, which showed worse values than the national ones in the crude birth, death, and marriage rates, as well as the natural population change. On the contrary, internal and international migration balances, as well as the total migration rate, displayed higher values than the Italian ones.

Cluster 2 included 32 provinces with values in line with the national ones for all selected indicators except the migration variables and the total population growth, which showed higher values than the Italian ones. These findings suggest that these provinces attracted both internal and foreign migration flows. Additionally, six provinces were classified as fuzzy, presenting values in between two different clusters: CA (with membership degree 0.45 to Cluster 1 and 0.38 to Cluster 2); FR (with membership degree 0.47 to Cluster 1 and 0.48 to Cluster 2); NU (with membership degree 0.52 to Cluster 1 and 0.43 to Cluster 2); PV (with membership degree 0.44 to Cluster 2 and 0.56 to Cluster 3); RG (with membership degree 0.40 to Cluster 1 and 0.58 to Cluster 3); and SS (with membership degree 0.55 to Cluster 2 and 0.31 to Cluster 3).

Figure 4 shows the composition of the clusters and the related centroids with reference to the year 2020, documenting a partition formed of four clusters. Cluster 3 included 19 provinces, almost all in Southern Italy (except Bolzano, BZ), with values higher than the national ones in the crude birth rate, crude death rate, crude marriage rate and the natural population change, and lower in the internal migration balance, the international migration balance, the total migration rate and the total population growth; this group resembled Cluster 1 of the 2002–2019 analysis. By contrast, Cluster 1 was composed of 22 provinces mostly located in Central and Northern Italy, with the centroid presenting values in line with the national ones for all indicators except the internal migration balance, international migration balance, total migration rate and total population growth, which displayed lower values than the national ones (Table 14). Cluster 2 (made up of 44 provinces) assumed higher values than the Italian ones in the international migration balance, total migration rate and total population growth, being in line with the national data for the other indicators. Cluster 4 comprised 18 provinces with higher values than the Italian ones in the crude marriage rate, the internal migration balance, the international migration balance and the total migration rate, while assuming lower values than the country figures for the remaining indicators. Finally, four fuzzy provinces were identified: AO (with membership degree 0.42 to Cluster 2 and 0.48 to Cluster 4), RI (with membership degree 0.41 to Cluster 2 and 0.49 to Cluster 4), PU (with membership degree 0.44 to Cluster 1 and 0.42 to Cluster 2), and TA (with membership degree 0.46 to Cluster 1 and 0.38 to Cluster 2).

Fig. 4 Clusters composition and centroids for NUTS-3 Italian provinces (2020)

Table 14 Values of demographic indicators associated with fuzzy clusters, 2020

Figure 5 illustrates the composition of the clusters and the related centroids with reference to 2021. Cluster 1, composed of 35 provinces, was characterized by higher values than the national figures in all migration indicators (internal migration balance, international migration balance, total migration rate) and in total population growth (Table 15). The remaining indicators assumed values lower than (or similar to) the national figures. Cluster 2 covered 45 provinces; like Cluster 1, it showed higher values than the national ones for the migration indicators but, unlike Cluster 1, lower values than the national ones for the remaining indicators. In Cluster 3 (20 provinces mainly from Southern Italy), better values than the Italian ones were observed for the crude birth rate, crude death rate and crude marriage rate, and worse values for the internal migration balance, international migration balance, total migration rate, natural population change and total population growth; the remaining indicators were almost in line with the respective national figures. A total of seven provinces were classified as fuzzy: BA (with membership degree 0.35 to Cluster 1 and 0.53 to Cluster 3), BN (with membership degree 0.46 to Cluster 2 and 0.51 to Cluster 3), BZ (with membership degree 0.59 to Cluster 1 and 0.35 to Cluster 3), FI (with membership degree 0.53 to Cluster 1 and 0.46 to Cluster 2), GO (with membership degree 0.39 to Cluster 1 and 0.56 to Cluster 2), MT (with membership degree 0.32 to Cluster 2 and 0.47 to Cluster 3) and PV (with membership degree 0.47 to Cluster 1 and 0.52 to Cluster 2).

Fig. 5 Clusters composition and centroids for NUTS-3 Italian provinces (2021)

Table 15 Values of demographic indicators associated with fuzzy clusters, 2021

4 Discussion

The COVID-19 pandemic has been shown to affect population dynamics in various ways, exerting mostly negative impacts in both emerging and advanced economies, amplifying traditional socio-demographic divides while reducing communities’ wellbeing (Pomar et al. 2022). Impacts of COVID-19 on population dynamics were classified as short-term, medium-term, and long-term (Aassve et al. 2020). Short-term (direct) impacts on mortality rates were studied extensively (Aassve et al. 2021). Medium-term and long-term impacts need additional investigation (MacKellar and Friedman 2021).

Our study investigates how the COVID-19 pandemic has contributed to enlarging spatial divides in specific demographic processes and dimensions in Italy, illustrating the results of an exploratory multivariate analysis of ten indicators representative of fertility, mortality, nuptiality, internal and foreign migration, and the related outcomes (natural balance, migration balance, total population growth). Using metrics reflective of spatial divides, a descriptive analysis of the statistical distribution of these indicators across the Italian provinces examined shifts over time (2002–2021) in central tendency, dispersion, and distributional shape (Zambon et al. 2020).

Using official statistics, our study estimates the short-term (direct) impact of COVID-19 on fertility, nuptiality, childbearing propensity, and other processes dealing with mortality and migration (Zhang and Schwartz 2020). The complex interplay between fertility, mortality, and migration demonstrated how the short-term impact of COVID-19 on population growth and decline was rather intense, often cumulating the negative effects of mortality increase, fertility decrease, and immigration slowdown (Xu et al. 2022). As a result of such dynamics, the COVID-19 pandemic can be said to have altered short-term demographic trends in Italy and, possibly, in many other advanced economies (Gonzales-Leonardo et al. 2022). Whether this disturbance regime proves temporary or persistent over time is a matter for future studies, once longer time series become available to scholars, practitioners, and stakeholders (Wachter 2005; Myrskyla et al. 2009; Rees et al. 2017).

Earlier studies have clarified how complexity, multi-dimensionality of effects, and pervasiveness (understood as key features of the impact of the COVID-19 pandemic on socio-demographic dynamics) reflected the multiplicity of risks involved, of both direct and indirect nature, resulting both from the disease per se and from the enacted mitigation measures (Aassve et al. 2020). The results of the multivariate analyses (both Principal Component Analysis and Fuzzy Clustering) outline, for both 2020 and 2021, how multidimensionality, complexity, and pervasiveness are in turn reflected in the enhanced spatial heterogeneity of the demographic processes investigated in this study, as compared with pre-COVID dynamics (Gutierrez et al. 2022). Going well beyond the chief demographic topic in relation to any pandemic crisis (namely the magnitude, timing, and structure of mortality), our study indicates changes over time in fertility regimes as a basic issue whose variations before and after COVID-19 merit a specific focus (Aassve et al. 2021). Even less studied is the net (albeit indirect) impact of COVID-19 on migration (MacKellar and Friedman 2021).

Taken together, the empirical results of this study can be regarded as a novel contribution to regional demography (sensu Goldstein et al. 2009, 2013), indicating how the COVID-19 pandemic exerted a marked impact on the Italian population because of both intrinsic factors (e.g. a particularly old population age structure compared with other advanced economies) and extrinsic factors (the early start of the pandemic spread compared with neighboring European countries). At the same time, because they depend more or less intensively on the local context, the long-term demographic dynamics possibly associated with COVID-19 need more systematic analysis and more general (theoretical and empirical) investigation approaches (Egidi and Manfredi 2021). Medium-term and long-term (direct and indirect) impacts, in particular, require additional research efforts (Wolff et al. 2022). For such reasons, Italy was seen as a sort of ‘worst’ demographic scenario (e.g. Bernardi 2005) for other countries affected by COVID-19. With this perspective in mind, the results of our study are regarded as particularly informative when delineating policy measures (with both economic and social impact) able to mitigate the effect of pandemics on the demographic balance and improve the adaptation capacity of local societies to future pandemics.

5 Conclusions

A policy issue related to the COVID-19 pandemic in European countries concerns the extent to which, and the ways in which, population dynamics have been affected across regions and social groups, and whether and how the pandemic and its economic consequences will affect population dynamics in the future. Post-pandemic policy evaluations of the medium- and long-term impacts of COVID-19 should include a thorough analysis going beyond strictly health and economic indicators. Quantitative analyses, based on traditional or more sophisticated approaches, should assess the role played by key demographic processes and dimensions such as aging, ethnicity, space, family structures and mobility. Such results will help the design and application of reliable policies, since a high degree of uncertainty in decision-making processes characterized the early phases of the COVID-19 outbreak. In addition to the ad-hoc consultancy required during any type of crisis, a permanent monitoring system based on collected evidence of population issues and dynamics is advisable, highlighting the crucial importance of effectively tackling socio-demographic disparities in regions, countries, and continents.