Abstract
In 2021 ISTAT presented the Report on Equitable and Sustainable Well-being (BES 2020), a system of indicators that tracks the significant changes that have characterized Italian society over the last 10 years. New indicators, introduced in line with the guidelines of Next Generation EU, have enriched the information available on the country's health, education and training, and economic well-being. The 20 Italian regions, the 2 autonomous provinces of Bolzano and Trento, the 3 territorial divisions (North, Center, South) and Italy as a whole form a set of 26 territorial units, each described by 36 numerical indicators covering the areas of Health, Education and Training, and Economic Wellbeing. These areas are the most suitable for highlighting regional imbalances in Italy. In this paper the input data matrix, made up of 26 rows (one per territorial unit) and 36 columns (one per indicator), is analyzed by means of a factor analysis with the principal components method, in order to construct a regional taxonomy characterized by those indicators that are most correlated with each of the factors that emerge. Moreover, starting from the coordinates calculated for each of the 26 territorial units in the factor space, a cluster analysis of the units is carried out, using the connected graph method, in order to highlight the territorial similarities and differences in Italy.
1 Introduction
Equitable and Sustainable Well-being indicators (Bes) were introduced into the Italian legislative system (Article 14 of Law No. 163/2016, reforming the accounting law) as an economic planning tool. An ad hoc committee, chaired by the Minister of Economy and Finance, was appointed to select indicators useful for assessing well-being, based on the experience gained at national and international level. Ten years after the start of the project, the proposed indicators show that the profile of well-being in Italy has changed in many ways, both through progress and through the persistence of critical areas. As an effect of the budget cuts made continuously throughout the decade, the health system has fewer beds and, because of the block on staff turnover, doctors of a higher average age, resulting in greater inequality in access to care. Too few children are still enrolled in nursery school and too few young people graduate, so the gap with Europe on education continues to widen. At the same time, the number of young people who do not study, do not work and are not included in professional training programs (NEET) has increased. The quality of work in Italy is critical, and the incidence of absolute poverty, which in 2019 showed a slight decline for the first time, rose again in 2020. The Bes 2020 Report, presented by ISTAT at the end of 2020, highlights a situation that is complex, not merely one of emergency, and at the same time contradictory [3,4,5]. The extraordinary resources made available by the Next Generation EU Program represent an unprecedented opportunity to intervene substantially for economic recovery. The Bes is therefore a targeted, sensitive and reliable tool to guide decisions and to evaluate the results of the implemented policies.
In this work, 15 variables of the Health area, 13 variables of the Education and Training Area and 8 variables of the Economic Wellbeing Area, as specified below, were assumed as indicators (descriptors) of the 26 territorial units considered.
1.1 Description of the Indicators
Health Area - Description of the 36 indicators of each of the 26 territorial units.
ID | Indicators |
---|---|
1 | Life expectancy at birth |
2 | Healthy life expectancy at birth |
3 | Mental health index (SF36) |
4 | Avoidable mortality (0–74 years) |
5 | Multichronicity and severe limitations (75 years and over) |
6 | Child mortality |
7 | Mortality due to road accidents (15–34 years) |
8 | Mortality from cancer (20–64 years) |
9 | Mortality due to dementia and diseases of the nervous system |
10 | Unlimited life expectancy in activities at 65 |
11 | Overweight |
12 | Smoking |
13 | Alcohol |
14 | Sedentary lifestyle |
15 | Adequate nutrition |
Education and Training Area - Description of the 36 indicators of each of the 26 territorial units.
ID | Indicators |
---|---|
16 | Attendance in kindergarten |
17 | People with at least a diploma (25–64 years) |
18 | Transition to the University |
19 | Young people who do not work and do not study (NEET) |
20 | Participation in continuing education |
21 | Inadequate literacy skills |
22 | Inadequate numerical competence |
23 | High digital skills |
24 | 0–2 years old children enrolled in the nursery |
25 | Graduates in technical-scientific disciplines (STEM) |
26 | Cultural participation outside the home |
27 | Reading of books and newspapers |
28 | Usage of libraries |
Economic Wellbeing Area - Description of the 36 indicators of each of the 26 territorial units.
ID | Indicators |
---|---|
29 | Income available |
30 | Net income inequality (s80 / s20) |
31 | Risk of poverty |
32 | Severe material deprivation |
33 | Severe housing deprivation |
34 | Great difficulty to get to the end of the month |
35 | Low labor intensity |
36 | Overhead of the cost of housing |
The corresponding values of the 36 indicators detected by ISTAT during 2020 were associated with each of the 26 territorial units, as listed in Table 1 [1, 2]. Only the data from the areas of health, education and training, economic well-being, of the twelve available, were used, because they were considered more discriminating for the purposes of an analysis of regional imbalances.
Health represents a central element of life and an indispensable condition for individual well-being and the prosperity of populations, as recalled, at European level, by the Lisbon strategy for Development and Work, declared by the European Commission in 2000 in response to the challenges of globalization and aging.
Health has consequences that impact on all dimensions of the life of each individual, being able to modify the living conditions, behaviors, social relationships, opportunities and perspectives of individuals and, often, of their families.
Education, training and skill levels influence people's well-being and open up opportunities for social growth that would otherwise be precluded.
More highly educated people have a higher standard of living and live longer and better, because they have healthier lifestyles and more opportunities to find work in less risky environments.
Furthermore, a higher level of education usually corresponds to greater access to and enjoyment of cultural goods and services, and to active participation in the production of culture and creativity. Economic well-being is considered not a goal in itself but a means by which an individual can attain and sustain a certain standard of living.
Variables that contribute to economic well-being include income, wealth, spending on consumer goods, housing conditions, and ownership of durable goods [6,7,8].
Obviously, the judgment on the level of economic well-being of a society can vary depending on whether the same average overall income is equally distributed among citizens or is instead concentrated in the hands of a few wealthy people.
Consumer products carry the characteristic that we call “value”, which can be defined as the ability of a product to arouse in the individual the desire for its exclusive use, or at least for a use that fully satisfies their needs.
The 26 territorial units, each described by the 36 numerical variables of the three areas mentioned above, were treated as a matrix (input matrix X) of 26 “objects” in a 36-dimensional space [9,10,11,12]. We subjected this matrix to a factor analysis with the principal components (or Hotelling's) method, in order to obtain a simplified description of the 26 territorial units projected into a space of only three or four dimensions (factor space) [13, 14]. This synthetic description allows us to examine the distribution of the 26 territorial units, i.e. their possible mutual proximity, in the space of the principal components.
Furthermore, the 26 territorial units distributed in the space of the principal components were subjected to a cluster analysis in order to evaluate their similarities or their differences.
Since the factorial model is equivariant with respect to changes of scale of the variables contained in the input matrix X, it is possible to standardize the observed variables and to examine the correlation matrix R instead of the variance-covariance matrix [15, 16].
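The standardization step can be illustrated with a minimal sketch. The function names and the toy data below are assumptions made for illustration only; they are not the BES indicators nor the authors' code. The sketch computes z-scores column by column, builds R from them, and checks that rescaling one column leaves R unchanged.

```python
from math import sqrt

def standardize(X):
    """Return the column-wise z-scores of X (a list of rows)."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    # Sample standard deviation (n - 1 in the denominator).
    sds = [sqrt(sum((row[j] - means[j]) ** 2 for row in X) / (n - 1))
           for j in range(p)]
    return [[(row[j] - means[j]) / sds[j] for j in range(p)] for row in X]

def correlation_matrix(X):
    """Correlation matrix R = Z'Z / (n - 1), with Z the standardized data."""
    Z = standardize(X)
    n, p = len(Z), len(Z[0])
    return [[sum(Z[i][a] * Z[i][b] for i in range(n)) / (n - 1)
             for b in range(p)] for a in range(p)]

# Hypothetical toy data: 4 units, 2 indicators on very different scales.
X_toy = [[80.1, 12000.0], [81.5, 15000.0], [82.0, 21000.0], [83.2, 24000.0]]
R = correlation_matrix(X_toy)

# Rescaling a column (e.g. euros to thousands of euros) leaves R unchanged.
X_rescaled = [[a, b / 1000.0] for a, b in X_toy]
R_rescaled = correlation_matrix(X_rescaled)
```

Working on R rather than on the covariance matrix prevents indicators measured in large units from dominating the analysis.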
Below we briefly illustrate the results obtained with the factor analysis applied to the R correlation matrix of the 36 numerical indicators used for the description of the 26 territorial units.
The eigenvalues and corresponding eigenvectors of this matrix were calculated. The first four eigenvalues explained 77% of the total variance of the system.
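Cumulative proportion of total variance explained by the first four eigenvalues.
Eigenvalue | Cumulative proportion |
---|---|
1 | 0.51 |
2 | 0.62 |
3 | 0.72 |
4 | 0.77 |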
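As a hedged illustration of how such cumulative shares are obtained, the sketch below computes the eigenvalues of a correlation matrix and their cumulative proportion. The helper name and the random placeholder data are assumptions; only the four cumulative values in `reported` come from the text above.

```python
import numpy as np

def cumulative_explained_variance(R):
    """Eigenvalues of a correlation matrix (descending) and their cumulative share."""
    eigvals = np.linalg.eigvalsh(R)[::-1]   # eigvalsh returns ascending order
    share = eigvals / eigvals.sum()         # trace(R) equals the number of variables
    return eigvals, np.cumsum(share)

# Random placeholder data, NOT the BES indicators: 26 units, 5 variables.
rng = np.random.default_rng(0)
X_toy = rng.normal(size=(26, 5))
R_toy = np.corrcoef(X_toy, rowvar=False)
eigvals, cum = cumulative_explained_variance(R_toy)

# Individual contributions implied by the cumulative values reported in the text.
reported = [0.51, 0.62, 0.72, 0.77]
increments = [round(b - a, 2) for a, b in zip([0.0] + reported[:-1], reported)]
```

Read this way, the first factor alone accounts for about half of the total variance, while the next three add roughly 0.11, 0.10 and 0.05.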
After the rotation of the factor matrix, made up of the four eigenvectors corresponding to the first four eigenvalues, the 4 factor scores (coordinates in the factor space) were calculated for each of the 26 territorial units, with the significant (non-zero) weights (factor loading) of the input variables on each of the 4 factors.
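The text does not state which rotation criterion was used, nor how the factor scores were scaled. The sketch below therefore assumes an orthogonal varimax rotation and the simple convention scores = Z · L, and it runs on random placeholder data; all function names are hypothetical. The cut-off of 0.4 for "significant" loadings is also an assumption, chosen only because the smallest loading listed in the tables has absolute value 0.40.

```python
import numpy as np

def pca_loadings(R, k):
    """Loadings of the first k principal components of a correlation matrix R."""
    eigvals, eigvecs = np.linalg.eigh(R)        # ascending order
    order = np.argsort(eigvals)[::-1][:k]       # indices of the k largest eigenvalues
    return eigvecs[:, order] * np.sqrt(eigvals[order])

def varimax(L, max_iter=100, tol=1e-8):
    """Orthogonal varimax rotation of a loading matrix L (variables x factors)."""
    p, k = L.shape
    T = np.eye(k)                               # accumulated rotation matrix
    crit_old = 0.0
    for _ in range(max_iter):
        LR = L @ T
        grad = L.T @ (LR ** 3 - LR @ np.diag((LR ** 2).sum(axis=0)) / p)
        U, s, Vt = np.linalg.svd(grad)
        T = U @ Vt                              # closest orthogonal matrix
        crit = s.sum()
        if crit_old != 0.0 and crit / crit_old < 1.0 + tol:
            break
        crit_old = crit
    return L @ T, T

# Random placeholder data, NOT the BES indicators: 26 units, 6 variables.
rng = np.random.default_rng(1)
X_toy = rng.normal(size=(26, 6))
Z = (X_toy - X_toy.mean(axis=0)) / X_toy.std(axis=0, ddof=1)
R_toy = np.corrcoef(X_toy, rowvar=False)

L = pca_loadings(R_toy, k=3)
L_rot, T = varimax(L)
scores = Z @ L_rot                    # assumed convention for unit coordinates
significant = np.abs(L_rot) >= 0.4    # assumed cut-off for reporting loadings
```

An orthogonal rotation redistributes variance among the retained factors without changing the communality of any variable, which is why the rotated solution describes the same factor space.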
1.2 Factor Analysis
For each factor, the following lists show the number of the variable, the weight of the variable on the factor and the description of each of the variables divided by area. Moreover, for each factor, the coordinates on the factor of the 26 territorial units are listed in an ordered manner. The weights of the variables and the coordinates on each main component are given below (Figs. 1, 2, 3 and 4):
Health Area – Factor loading 1st Main Component.
ID | Weights of variables | Indicators |
---|---|---|
2 | 0.57 | Healthy life expectancy at birth |
4 | −0.81 | Avoidable mortality (0–74 years) |
5 | −0.63 | Multichronicity and severe limitations (75 years and over) |
6 | −0.54 | Child mortality |
8 | −0.65 | Mortality from cancer (20–64 years) |
10 | 0.80 | Unlimited life expectancy in activities at 65 |
11 | −0.80 | Overweight |
13 | 0.61 | Alcohol |
14 | −0.84 | Sedentary lifestyle |
15 | 0.70 | Adequate nutrition |
Education and Training Area - Factor loading 1st Main Component.
ID | Weights of variables | Indicators |
---|---|---|
17 | 0.90 | People with at least a diploma (25–64 years) |
19 | −0.88 | Young people who do not work and do not study (NEET) |
20 | 0.64 | Participation in continuing education |
21 | −0.78 | Inadequate literacy skills |
22 | −0.84 | Inadequate numerical competence |
Economic Wellbeing Area - Factor loading 1st Main Component.
ID | Weights of variables | Indicators |
---|---|---|
29 | 0.75 | Income available |
30 | −0.88 | Net income inequality (s80 / s20) |
31 | −0.92 | Risk of poverty |
32 | −0.84 | Severe material deprivation |
34 | −0.76 | Great difficulty to get to the end of the month |
35 | −0.91 | Low labor intensity |
36 | −0.82 | Overhead of the cost of housing |
Coordinates of Territorial Units on the 1st Main Component.
Territorial units | Value |
---|---|
Trento | 23.58 |
Trentino-Alto Adige | 23.35 |
Bolzano/Bozen | 17.34 |
Valle d'Aosta | 16.36 |
Veneto | 14.20 |
Friuli-Venezia Giulia | 13.30 |
Emilia-Romagna | 11.09 |
Toscana | 8.81 |
Marche | 7.18 |
Lombardia | 6.94 |
Umbria | 6.81 |
Piemonte | 4.50 |
Liguria | 4.48 |
Lazio | 0.44 |
Abruzzo | −4.73 |
Sardegna | −9.43 |
Molise | −10.96 |
Basilicata | −12.73 |
Puglia | −15.53 |
Calabria | −21.18 |
Sicilia | −33.70 |
Campania | −34.87 |
Health Area – Factor loading 2nd Main Component.
ID | Weights of variables | Indicators |
---|---|---|
3 | 0.52 | Mental health index (SF36) |
7 | 0.58 | Mortality due to road accidents (15–34 years) |
12 | −0.40 | Smoking |
Education and Training Area - Factor loading 2nd Main Component.
ID | Weights of variables | Indicators |
---|---|---|
18 | −0.83 | Transition to the University |
Coordinates of Territorial Units on the 2nd Main Component.
Territorial units | Value |
---|---|
Bolzano/Bozen | 13.49 |
Trentino-Alto Adige | 9.01 |
Friuli-Venezia Giulia | 1.69 |
Trento | 1.68 |
Veneto | 1.66 |
Lazio | 1.02 |
Puglia | 0.74 |
Calabria | 0.55 |
Sicilia | −0.39 |
Emilia-Romagna | −0.99 |
Valle d'Aosta | −1.04 |
Liguria | −1.17 |
Abruzzo | −1.41 |
Umbria | −1.50 |
Basilicata | −1.58 |
Lombardia | −1.88 |
Toscana | −2.10 |
Campania | −2.14 |
Sardegna | −2.35 |
Marche | −3.07 |
Piemonte | −3.28 |
Molise | −3.30 |
Education and Training Area - Factor loading 3rd Main Component.
ID | Weights of variables | Indicators |
---|---|---|
16 | 0.94 | Attendance in kindergarten |
23 | 0.81 | High digital skills |
24 | 0.62 | 0–2 years old children enrolled in the nursery |
26 | 0.66 | Cultural participation outside the home |
27 | 0.66 | Reading of books and newspapers |
Coordinates of Territorial Units on the 3rd Main Component.
Territorial units | Value |
---|---|
Trento | 8.73 |
Valle d'Aosta | 8.03 |
Trentino-Alto Adige | 5.31 |
Friuli-Venezia Giulia | 4.45 |
Veneto | 4.17 |
Emilia-Romagna | 3.92 |
Bolzano/Bozen | 3.89 |
Toscana | 3.63 |
Lombardia | 2.50 |
Liguria | 1.86 |
Piemonte | 1.80 |
Marche | 1.50 |
Umbria | 1.09 |
Sardegna | 0.77 |
Abruzzo | −1.32 |
Molise | −2.61 |
Puglia | −2.77 |
Basilicata | −3.09 |
Calabria | −5.10 |
Sicilia | −9.04 |
Campania | −9.09 |
Lazio | −16.81 |
Health Area – Factor loading 4th Main Component.
ID | Weights of variables | Indicators |
---|---|---|
1 | 0.54 | Life expectancy at birth |
9 | −0.72 | Mortality due to dementia and diseases of the nervous system |
Education and Training Area - Factor loading 4th Main Component.
ID | Weights of variables | Indicators |
---|---|---|
25 | 0.64 | Graduates in technical-scientific disciplines (STEM) |
28 | −0.67 | Usage of libraries |
Economic Wellbeing Area - Factor loading 4th Main Component.
ID | Weights of variables | Indicators |
---|---|---|
33 | 0.51 | Severe housing deprivation |
Coordinates of Territorial Units on the 4th Main Component.
Territorial units | Value |
---|---|
Campania | 12.84 |
Calabria | 11.62 |
Sicilia | 10.51 |
Basilicata | 8.96 |
Molise | 8.06 |
Abruzzo | 6.76 |
Puglia | 6.68 |
Lazio | 3.40 |
Umbria | 2.60 |
Sardegna | 1.35 |
Marche | −0.05 |
Liguria | −2.43 |
Toscana | −2.76 |
Piemonte | −3.88 |
Emilia-Romagna | −4.68 |
Friuli-Venezia Giulia | −4.72 |
Veneto | −4.80 |
Lombardia | −4.97 |
Trento | −10.36 |
Valle d'Aosta | −12.49 |
Bolzano/Bozen | −13.82 |
Trentino-Alto Adige | −14.08 |
2 Territorial Cluster Analysis
Once the new coordinates in factor space had been assigned to the 26 territorial units considered, it was possible to search for their proximity (similarity). It seems appropriate to clarify what is meant by similarity between regions. To this end, we introduce a similarity coefficient $S_{i,j}$ between two regions $R_i$ and $R_j$ such that, for each pair $(i, j)$ with $i, j = 1, 2, \dots, N$, we have:

$$0 \le S_{i,j} \le 1 \ \text{ with } i \ne j, \qquad S_{i,j} = 1 \ \text{ only if } i = j, \qquad (1)$$

$$S_{i,j} = S_{j,i} \ \text{ with } i \ne j \quad \text{(symmetry property)}.$$

If we denote by $D_i$ the modulus of the description vector of the spatial unit $R_i$, and by $D_{i,k}$ the k-th component of that vector, the similarity coefficient $S_{i,j}$ can be defined as the sum over $k$ of the ratio [18]

$$S_{i,j} = \sum_{k} \frac{D_{i,k}\, D_{j,k}}{D_i\, D_j},$$

which, as can easily be verified, satisfies the relations given in (1). The square matrix of order N constructed with the coefficients $S_{i,j}$ is called the similarity matrix S. It is usually transformed into a Boolean matrix B by introducing a threshold t, with 0 < t < 1, and setting $b_{i,j} = 1$ if $S_{i,j} > t$ and $b_{i,j} = 0$ otherwise [17, 23].
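The construction of S and B can be sketched as follows, assuming a cosine-type coefficient (the scalar product of two description vectors divided by the product of their moduli); the exact coefficient used by the authors may differ. The function names are hypothetical. The four rows are the factor coordinates of Trento, Lazio, Sicilia and Campania as listed in the tables of Section 1.2, and t = 0.960 is the threshold quoted in the text.

```python
from math import sqrt

def similarity_matrix(D):
    """Cosine-type similarity between description vectors (rows of D)."""
    norms = [sqrt(sum(x * x for x in row)) for row in D]
    n = len(D)
    return [[sum(a * b for a, b in zip(D[i], D[j])) / (norms[i] * norms[j])
             for j in range(n)] for i in range(n)]

def boolean_matrix(S, t):
    """b[i][j] = 1 if S[i][j] > t, else 0."""
    return [[1 if s > t else 0 for s in row] for row in S]

# Coordinates on the four main components, copied from the tables above.
names = ["Trento", "Lazio", "Sicilia", "Campania"]
scores = [
    [23.58, 1.68, 8.73, -10.36],
    [0.44, 1.02, -16.81, 3.40],
    [-33.70, -0.39, -9.04, 10.51],
    [-34.87, -2.14, -9.09, 12.84],
]

S = similarity_matrix(scores)
B = boolean_matrix(S, t=0.960)

# Pairs of distinct units whose similarity exceeds the threshold.
linked = [(names[i], names[j]) for i in range(4) for j in range(i + 1, 4) if B[i][j]]
```

With signed factor coordinates this coefficient can become negative for units lying on opposite sides of the origin; such pairs simply receive 0 in B. On this four-unit subset only Sicilia and Campania remain linked.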
The similarity matrix S, once thresholded, can be interpreted as the adjacency matrix associated with a digraph. Since the matrix is symmetric, if an arc exists from node i to node j, the arc with the same ends but directed in the opposite direction also exists [21]. One can then disregard the direction of the arcs and simply consider undirected graphs [20]. The existence of an arc between two nodes indicates that the similarity between the corresponding regions exceeds the adopted threshold t. The search for the connected components of a graph can be carried out by introducing the concept of a spanning tree associated with each of the components. Recall that a tree is a graph in which any two nodes are joined by one and only one path; this implies that a tree contains no cycles and that, if N is the number of nodes, the tree consists of exactly N-1 arcs [19, 22].
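A minimal sketch of the search for connected components is given below, using a breadth-first visit of a Boolean adjacency matrix. This is one standard way to do it and is not necessarily the algorithm used by the authors; the function name and the toy matrix are assumptions. The arcs traversed during the visit form a spanning tree of each component, so a component with N nodes yields exactly N-1 tree arcs.

```python
from collections import deque

def connected_components(B):
    """Return a list of (nodes, tree_edges) pairs, one per connected component."""
    n = len(B)
    seen = [False] * n
    components = []
    for start in range(n):
        if seen[start]:
            continue
        seen[start] = True
        nodes, tree_edges = [], []
        queue = deque([start])
        while queue:
            i = queue.popleft()
            nodes.append(i)
            for j in range(n):
                if j != i and B[i][j] and not seen[j]:
                    seen[j] = True
                    tree_edges.append((i, j))  # arc of the breadth-first spanning tree
                    queue.append(j)
        components.append((nodes, tree_edges))
    return components

# Hypothetical symmetric Boolean matrix: units 0-1-2 are chained, 3 and 4 are isolated.
B_toy = [
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 0, 1],
]
components = connected_components(B_toy)
clusters = [nodes for nodes, _ in components]
```

Note that units 0 and 2 fall in the same cluster although they are not directly linked: membership requires only a path of above-threshold similarities, which is the defining trait of clustering by connected components.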
To find homogeneous clusters of our territorial units, once the similarity matrix S had been calculated, the classes were identified by an algorithm that searches for the connected components of the graph associated with the Boolean adjacency matrix deduced from S, with the threshold value t = 0.960. The results obtained are shown in Table 2 (Fig. 5).
3 Conclusion
From the results obtained, and considering that a highly discriminating value of the threshold t, close to unity, was adopted, it can be deduced that Piemonte, Umbria, Lazio, Abruzzo and Marche present specific and peculiar regional profiles in relation to the 36 descriptive variables used. It should also be noted, however, that the Marche region has the profile most similar to that of the territorial division Center (cluster n. 7). Cluster n. 2 is formed by Trentino-Alto Adige and the Autonomous Province of Bolzano, while the Autonomous Province of Trento, which has the highest coordinate (23.58) on the first principal component and so lies far from the origin of the factorial axes, near which the Lazio region is located, has been included in cluster n. 3, to which all the other regions of the territorial division North also belong.
Finally, special attention should be paid to cluster n. 8, which includes the seven southern regions, the territorial division Mezzogiorno and Italy as a whole. The seven southern regions, Molise, Campania, Puglia, Basilicata, Calabria, Sicilia and Sardegna, described by their respective scores on the four factorial axes, all have negative coordinates on the first main component and, with the exception of Sardegna (0.77), on the third. The weights of the variables on factors I and III, as they emerge from the factor analysis, can be read directly from the tables above to identify the shortcomings, and therefore the regional imbalances, in the areas of health, education and training, and economic well-being.
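The signs of the coordinates of the seven southern regions can be checked directly against the tables of Section 1.2. The short sketch below is only a verification aid: the values are copied from the coordinate tables of the first and third main components, and the variable names are hypothetical.

```python
# (coordinate on the 1st main component, coordinate on the 3rd main component)
southern = {
    "Molise": (-10.96, -2.61),
    "Campania": (-34.87, -9.09),
    "Puglia": (-15.53, -2.77),
    "Basilicata": (-12.73, -3.09),
    "Calabria": (-21.18, -5.10),
    "Sicilia": (-33.70, -9.04),
    "Sardegna": (-9.43, 0.77),
}

# Regions with a negative coordinate on the first component.
negative_on_first = [name for name, (f1, _) in southern.items() if f1 < 0]

# Regions that are NOT negative on both components.
exceptions = [name for name, (f1, f3) in southern.items() if f1 >= 0 or f3 >= 0]
```

All seven regions are negative on the first component; Sardegna is the only one with a (slightly) positive coordinate on the third.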
References
Bimonte, S., Gensel, J., Bertolotto, M.: Enriching spatial olap with map generalization: a conceptual multidimensional model. In: Proceedings of the 2008 IEEE International Conference on Data Mining Workshops. ICDMW 2008, Washington, DC, USA, IEEE Computer Society, pp. 332–341 (2008)
Bimonte, S., Miquel, M.: When spatial analysis meets olap: multidimensional model and operators. IJDWM 6(4), 33–60 (2010)
Gamberini, M.: Data Warehouse – 1a parte: Introduzione e applicazioni nel mondo reale. Mokabyte 154, settembre (2010)
Golfarelli, M., Rizzi, S.: Data Warehouse: teoria e pratica della progettazione, 2a edizione. McGraw-Hill (2006)
Di Martino, S., Bimonte, S., Bertolotto, M., Ferrucci, F.: Integrating Google earth within OLAP tools for multidimensional exploration and analysis of spatial data. In: Filipe, J., Cordeiro, J. (eds.) ICEIS 2009. LNBIP, vol. 24, pp. 940–951. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-01347-8_78
Franklin, C.: An introduction to geographic information systems: linking maps to databases. Journal Database, April, p. 1321 (1992)
Malinowski, E., Zimányi, E.: Advanced Data Warehouse Design: From Conventional to Spatial and Temporal Application. Springer, Cham (2008). https://doi.org/10.1007/978-3-540-74405-4
Laurini, R., Thompson, D.: Fundamentals of Spatial Information Systems. Academic Press, London (1992)
Ahmed, T.O., Miquel, M.: Multidimensional structures dedicated to continuous spatiotemporal phenomena. In: Jackson, M., Nelson, D., Stirk, S. (eds.) BNCOD 2005. LNCS, vol. 3567, pp. 29–40. Springer, Heidelberg (2005). https://doi.org/10.1007/11511854_3
Bedard, Y., Merrett, T., Han, J.: Fundamentals of spatial datawarehousing for geographic knowledge discovery. In: Geographic Data Mining and Knowledge Discovery. Taylor and Francis, pp. 53–73. Research Monographs in GIS series edited by Peter Fisher and Jonathan Raper (2001)
Cembalo, A., Pisano, F.M., Romano, G.: Document Warehousing: l’analisi multidimensionale applicata a sorgenti testuali - I parte: panoramica e introduzione. MokaByte 162, maggio (2011)
Caranna, V.: Data Warehouse - III parte: Definiamo un modello progettuale per DWH. MokaByte 158, gennaio (2011)
Petrov, D.A., Stankova, E.N.: Integrated information system for verification of the models of convective clouds. In: Gervasi, O., et al. (eds.) ICCSA 2015. LNCS, vol. 9158, pp. 321–330. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-21410-8_25
Stankova, E.N., Balakshiy, A.V., Petrov, D.A., Shorov, A.V., Korkhov, V.V.: Using technologies of OLAP and machine learning for validation of the numerical models of convective clouds. In: Gervasi, O., et al. (eds.) ICCSA 2016. LNCS, vol. 9788, pp. 463–472. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-42111-7_36
Gankevich, I., et al.: Constructing virtual private supercomputer using virtualization and cloud technologies. In: Murgante, B., et al. (eds.) ICCSA 2014. LNCS, vol. 8584, pp. 341–354. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-09153-2_26
Kluge, M., Asche, H.: Validating a smartphone-based pedestrian navigation system prototype. In: Murgante, B., et al. (eds.) ICCSA 2012. LNCS, vol. 7334, pp. 386–396. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-31075-1_29
Mazzei, M., Palma, A.L.: A Decision support system for the analysis of mobility. In: Gervasi, O., et al. (eds.) ICCSA 2015. LNCS, vol. 9157, pp. 390–402. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-21470-2_28
Mazzei, M., Palma, A.L.: Spatial statistical models for the evaluation of the landscape. In: Murgante, B., et al. (eds.) ICCSA 2013. LNCS, vol. 7974, pp. 419–432. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-39649-6_30
Mazzei, M., Palma, A.L.: Comparative analysis of models of location and spatial interaction. In: Murgante, B., et al. (eds.) ICCSA 2014. LNCS, vol. 8582, pp. 253–267. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-09147-1_19
Mazzei, M., Palma, A.L.: Spatial multicriteria analysis approach for evaluation of mobility demand in urban areas. In: Gervasi, O., et al. (eds.) ICCSA 2017. LNCS, vol. 10407, pp. 451–468. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-62401-3_33
Mazzei, M.: An unsupervised machine learning approach in remote sensing data. In: Misra, S., et al. (eds.) ICCSA 2019. LNCS, vol. 11621, pp. 435–447. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-24302-9_31
Mazzei, M.: Software development for unsupervised approach to identification of a multi temporal spatial analysis model - Proceedings of the 2018 International Conference on Image Processing, Computer Vision, and Pattern Recognition, IPCV 2018, 2018, pp. 85–91 (2018)
Mazzei, M.: An unsupervised machine learning approach for medical image analysis. In: Arai, K. (ed.) FICC 2021. AISC, vol. 1364, pp. 813–830. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-73103-8_58
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
De Maria, M., Mazzei, M., Bik, O.V., Palma, A.L. (2021). Analysis of Regional Imbalances in Italy Based on Cluster Analysis. In: Gervasi, O., et al. Computational Science and Its Applications – ICCSA 2021. ICCSA 2021. Lecture Notes in Computer Science, vol 12954. Springer, Cham. https://doi.org/10.1007/978-3-030-86979-3_1
DOI: https://doi.org/10.1007/978-3-030-86979-3_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86978-6
Online ISBN: 978-3-030-86979-3
eBook Packages: Computer Science (R0)