1 Introduction

Composite indicators are one-dimensional measures that facilitate the interpretation and analysis of multidimensional phenomena in different areas of knowledge (Mazziotta and Pareto 2017; Kuc-Czarnecka et al. 2020; Bernardes et al. 2021; Ekel et al. 2022). Composite indicators are especially useful for geographic analysis because they represent the several sub-indicators of the multidimensional phenomenon on a single map (Carley et al. 2018; Marzi et al. 2019). However, building composite indicators involves significant difficulties in the normalization, weighting, and aggregation of sub-indicators (Saltelli 2007; Dialga and Le Giang 2017; Cinelli et al. 2021).

The weights of the sub-indicators can be assigned using the Equal-weight, Data-driven, Expert opinion, or Hybrid weighting schemes (Becker et al. 2017). There is no single best weighting scheme, as none of them are exempt from criticism, and each offers a different perspective of analysis (Greco et al. 2019). From a geographical point of view, the weighting scheme may or may not consider spatial heterogeneity.

Spatial heterogeneity is considered in methods that assign different weights to the same sub-indicator in different spatial units. In the Benefit-of-the-Doubt method, higher weights are assigned to higher performance sub-indicators of spatial units, offering a composite indicator that highlights the positive characteristics of each spatial unit (Fusco et al. 2018; Libório et al. 2022c). In the Geographically Weighted Principal Component Analysis method, weights maximize the variance extracted from the original data in the first component in each spatial unit, providing a composite indicator that captures locally as many sub-indicators of the multidimensional phenomenon as possible (Harris et al. 2015; Kallio et al. 2018; Cartone and Postiglione 2021). In the Ordered Weighted Averaging operator, the assignment of weights makes it possible to regulate the compensation levels between sub-indicators of poor and above-average performance, offering a composite indicator that highlights the positive or negative aspects of each spatial unit (Badea et al. 2011).

In weighting schemes that disregard spatial heterogeneity, the weights of a sub-indicator do not vary between spatial units, offering better comparability of composite indicator scores. Equal-weights is an objective weighting scheme that is easy to implement and appropriate when there is no theoretical framework that justifies the differentiation of weights, when the experts disagree on the weights, or when the statistical and empirical knowledge is inadequate (Greco et al. 2019). Expert opinion weighting schemes are based on the concept of the multidimensional phenomenon. Therefore, greater weights are assigned to sub-indicators with greater conceptual importance, offering a composite indicator closer to the conceptual structure of the multidimensional phenomenon (Nardo et al. 2005). Data-driven weighting schemes use the data to assign weights to sub-indicators. In principal component analysis, the weights of the sub-indicators are defined to maximize the variance extracted from the data in the first component (Libório et al. 2020a). This approach offers a composite indicator that explains most of the information contained in the original sub-indicators (Saisana and Tarantola 2002). Enhanced Scatter Search assigns weights to sub-indicators to reduce rank reversals in robustness analyses, offering a more stable composite indicator (Maricic et al. 2019). The weights can also be assigned to minimize the effects of correlations between sub-indicators on the weighting, offering a composite indicator that maximizes the transfer of information in the sub-indicators (Lindén et al. 2021).

Considering spatial heterogeneity or not, research on the weighting of sub-indicators does not address a relevant topic for the geographic area: the spatial dependence of the multidimensional phenomenon. Spatial dependence is a property in which observations from different geographic areas can be correlated. Thus, observations from neighboring areas tend to be more similar than observations from distant areas (Goodchild 1991). The property of spatial dependence is associated with the First Law of Geography: "everything is related to everything else, but near things are more related to each other" (Tobler 1970).

The Moran's (1950) Index and Geary's (1954) Coefficient are two popular spatial autocorrelation measures for measuring the overall spatial dependence of a set of observations. The Moran's Index is based on the covariance between associated locations, while Geary's Coefficient considers numerical differences between the associated locations (Getis 1999). These indices are calculated based on a spatial weighting matrix built by proximity, contiguity, or geographic distance criteria.

Several studies employ the Moran's Index and the Geary Coefficient to analyze the spatial dependence of multidimensional phenomena. These studies provide examples of spatial autocorrelation of multidimensional phenomena in urban and rural areas and at different geographic scales: district, municipal, state, and regional (Jha and Gundimeda 2019; Katumba et al. 2019; Aljoufie and Tiwari 2020; Chauhan et al. 2020; Adeleke and Alabede 2021; Mavhura et al. 2021). However, these studies ignore that sub-indicator weighting schemes directly affect the composite indicator's results (Greco et al. 2019) and, consequently, influence its spatial autocorrelation.

This research aims to examine the effect of sub-indicator weighting schemes on the spatial dependence of a multidimensional phenomenon. This analysis allows answering questions such as: How does spatial autocorrelation relate to the robustness and quality parameters of the composite indicator? What are the effects of the weighting scheme in representing the multidimensional phenomenon? What is the relationship between the spatial autocorrelation of the sub-indicators and the composite indicator? What is the effect of the spatial weighting matrix on spatial autocorrelation? How is this effect reflected in the robustness and quality parameters of the composite indicator?

The first innovation of this research is associated with the introduction of procedures that avoid judgment errors and evaluation biases in the Expert opinion weighting scheme. The first procedure allows the processing of different assessment formats, offering flexibility and psychological comfort that reduce judgment errors. The second procedure allows for measuring the extent to which individual and collective assessments differ by sub-indicator, signaling possible biased assessments. The second innovation of this research is associated with developing a flexible Data-driven sub-indicator weighting scheme that allows for emphasizing the property of spatial dependence of the composite indicator, taking into account, individually or simultaneously, parameters of robustness and quality.

2 Materials and methods

The details of the materials and methods used in the research are divided into six subsections. Section 2.1 presents the steps for building the distance and spatial weighting matrices and calculating Moran's Index. Sections 2.2 and 2.3 present the stages of building the composite indicators: the building method is presented in Sect. 2.2, and the Equal-weights, Expert opinion, Data-driven, and Hybrid weighting schemes are presented in Sect. 2.3. The robustness and quality of the composite indicator and the consistency of the results are verified in Sects. 2.4, 2.5, and 2.6.

2.1 Spatial model of the study area: weighting matrix and spatial autocorrelation

The study area of this research is an urban conurbation composed of three cities in the state of Paraná, Brazil. The cities of Maringá-Sarandi-Paiçandu have an estimated population of 577,611 inhabitants and are divided into 661 census tracts (IBGE, 2010). Figure 1 shows these census tracts and the location map of the study area.

Fig. 1 The study area location map: Maringá-Sarandi-Paiçandu, Paraná, Brazil

A spatial weighting matrix was built considering the 661 census tracts of the study area. This matrix reflects the arrangement of spatial interactions through spatial weights that represent the degree of connection between the census tracts. The degree of connection between the census tracts represented by the spatial weights was obtained using the non-binary geographic distance proximity criterion. This criterion has two advantages. First, it avoids the unbalanced connectivity that occurs due to census tracts with many neighbors and census tracts with few neighbors. Second, it eliminates subjectivity in defining the influence of distance from neighboring areas. The spatial weighting matrix was built as follows:

First, the Euclidean distances were calculated using the latitude and longitude of the centroids of the census tracts \(u,v \in U\), where \(U\) is the set of census tracts and the subscripts \(i\) and \(j\) denote the two centroid coordinates:

$$d_{uv} = \sqrt {\left( {u_{i} - v_{i} } \right)^{2} + \left( {u_{j} - v_{j} } \right)^{2} } ,\quad u,v \in U.$$
(1)

Then, the spatial weight matrix is built from the distance matrix \(d_{uv}\) and the distance sensitivity parameter \(b\):

$$w_{uv} = \exp \left( { - bd_{uv} } \right),\quad u,v \in U$$
(2)

where the distance sensitivity parameter \(b\) was set to 1.

Finally, the spatial weighting matrix was normalized as

$$w_{uv}^{*} = \frac{{w_{uv} }}{{\mathop \sum \nolimits_{v \in U} w_{uv} }},{\text{so}}\;{\text{that}}\quad \mathop \sum \limits_{v \in U} w_{uv}^{*} = 1,$$
(3)

where \(w_{uv}^{*}\) is the row-normalized form of \(w_{uv}\), obtained by dividing each weight by its row sum \(\sum\nolimits_{v \in U} {w_{uv} }\).
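The three steps of Eqs. (1) to (3) can be illustrated with a minimal sketch in plain Python. The function name, the toy coordinates, and the `include_self` flag are assumptions made for illustration. In particular, the text does not state whether the self-weight \(w_{uu}\) is kept, so the sketch excludes it by default, following the usual convention for spatial autocorrelation measures.

```python
import math


def spatial_weights(coords, b=1.0, include_self=False):
    """Row-normalized exponential-decay weights, following Eqs. (1)-(3)."""
    n = len(coords)
    w = [[0.0] * n for _ in range(n)]
    for u, (ux, uy) in enumerate(coords):
        for v, (vx, vy) in enumerate(coords):
            if u == v and not include_self:
                continue  # assumption: a tract is not its own neighbour
            d = math.hypot(ux - vx, uy - vy)  # Eq. (1): Euclidean distance
            w[u][v] = math.exp(-b * d)        # Eq. (2): exponential decay
    # Eq. (3): divide each row by its sum so that every row adds up to 1
    return [[x / sum(row) for x in row] for row in w]


# Hypothetical centroids of three tracts lying on a line
centroids = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]
w_star = spatial_weights(centroids, b=1.0)
```

In this toy case the first tract gives more weight to the tract at distance 1 than to the tract at distance 3, and each row sums to one.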

Based on \(w_{uv}^{*}\), the Moran's Index global spatial autocorrelation coefficient was calculated as

$${\text{Moran}}^{\prime}{\text{s}}\; I = \frac{N}{{\mathop \sum \nolimits_{u} \mathop \sum \nolimits_{v} w_{uv}^{*} }}\frac{{\mathop \sum \nolimits_{u} \mathop \sum \nolimits_{v} w_{uv}^{*} \left( {X_{u} - \overline{X}} \right)\left( {X_{v} - \overline{X}} \right)}}{{\mathop \sum \nolimits_{u} \left( {X_{u} - \overline{X}} \right)^{2} }},$$
(4)

where \(N\) is the number of census tracts indexed by \(u\) and \(v\), \(X\) corresponds to the Composite Indicator (CI) score, and \(\overline{X}\) corresponds to the average of the CI scores. The CI is explained further in Sect. 2.2.

After calculating the Moran's Index, the statistical significance test of spatial effects is performed. This test measures the model's pseudo significance (Anselin 1996).
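As a rough illustration of Eq. (4) and of the pseudo-significance test, the sketch below computes Moran's I and a permutation-based pseudo p-value. The random-relabeling procedure, the number of permutations, the one-sided comparison, and the toy neighbour matrix are assumptions for illustration, not details taken from the study.

```python
import random


def morans_i(x, w):
    """Global Moran's I of the values x under spatial weights w (Eq. 4)."""
    n = len(x)
    mean = sum(x) / n
    z = [xi - mean for xi in x]
    s0 = sum(sum(row) for row in w)  # sum of all spatial weights
    num = sum(w[u][v] * z[u] * z[v] for u in range(n) for v in range(n))
    den = sum(zi * zi for zi in z)
    return (n / s0) * num / den


def pseudo_p_value(x, w, permutations=999, seed=0):
    """Share of random relabelings whose Moran's I reaches the observed one."""
    rng = random.Random(seed)
    observed = morans_i(x, w)
    shuffled = list(x)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if morans_i(shuffled, w) >= observed:
            hits += 1
    return (hits + 1) / (permutations + 1)


# Toy weights: four tracts on a line, each linked to its immediate neighbours
w = [[0, 1, 0, 0], [0.5, 0, 0.5, 0], [0, 0.5, 0, 0.5], [0, 0, 1, 0]]
smooth = morans_i([1, 2, 3, 4], w)       # similar neighbours
alternating = morans_i([1, 4, 1, 4], w)  # dissimilar neighbours
p = pseudo_p_value([1, 2, 3, 4], w)
```

With these toy values the smooth gradient yields a positive index and the alternating pattern a negative one, which matches the interpretation of positive and negative spatial autocorrelation.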

2.2 Composite indicator of social vulnerability

Social vulnerability is a multidimensional social phenomenon that reflects the condition of a group of individuals on the margins of society. The condition of social vulnerability is not limited to the economic dimension. It includes dimensions that portray household conditions, quality of education, access to clean water and sanitation, and the environment (Davino et al. 2021). Multidimensional social phenomena such as social vulnerability are also characterized by the differentiation of weights of sub-indicators (Mazziotta and Pareto 2017; Libório et al. 2020a, b) and by the property of spatial dependence (Cutter and Finch 2008), which can vary by sub-indicator (Rufat et al. 2019).

The composite indicator of social vulnerability was built following these characteristics and aggregating the fifteen objective-quantitative sub-indicators of the five dimensions listed below:

  • Demographic Dm-1. Residents up to one year old, Dm-2. Residents per household, Dm-3. Dependents per head of household, and Dm-4. Heads of households between 10 and 19 years old.

  • Economic Ec-1. Heads of households without income, Ec-2. Heads of households with income of up to two minimum wages, and Ec-3. Heads of households with income above 20 minimum wages.

  • Educational Ed-1. Heads of households without literacy and Ed-2. Illiterate residents aged between 10 and 14 years.

  • Environmental En-1. Vegetation Coverage Index (Normalized Difference Vegetation Index).

  • Households Hd-1. Households rented or leased, Hd-2. Households connected to the water network, Hd-3. Households connected to the sewage network, Hd-4. Households without a bathroom, and Hd-5. Households with more than three bathrooms.

The data from the fifteen sub-indicators refer to the last census carried out in Brazil in 2010 (IBGE, 2010). The data were collected and then normalized as

$$I_{lu} = \frac{{x_{lu} - \min \left( {x_{l} } \right)}}{{\max \left( {x_{l} } \right) - \min \left( {x_{l} } \right)}},$$
(5)

where \(x_{lu}\) is the value of sub-indicator \(l \in L\) of census tract \(u \in U\).
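A minimal sketch of the min-max normalization of Eq. (5) follows. The function name and the sample values are hypothetical, and the guard against a constant sub-indicator is an added assumption, since Eq. (5) is undefined in that case.

```python
def min_max(values):
    """Rescale one sub-indicator to the [0, 1] interval, as in Eq. (5)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("constant sub-indicator: Eq. (5) is undefined")
    return [(x - lo) / (hi - lo) for x in values]


# Hypothetical raw values of one sub-indicator in three census tracts
normalized = min_max([10, 25, 40])
```

The tract with the lowest raw value receives 0, the tract with the highest receives 1, and the others fall proportionally in between.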

Simple Additive Weight (SAW) offers a more flexible and straightforward way of implementing different weighting schemes than other methods applied to build composite indicators (El Gibari et al. 2019). The aggregation of the sub-indicators in the SAW is performed in a compensatory way. Compensatory aggregation allows sub-indicators of poor performance to be offset by sub-indicators of above-average performance (Ekel et al. 2020). This compensation between poor and above-average performance sub-indicators increases the composite indicator scores, reducing its capability to represent multidimensional social phenomena (Libório et al. 2022a). The formulas used in non-compensatory aggregation schemes involve normalization adjustments for the calculation of geometric or harmonic means (Cinelli et al. 2021), data standardization with the inclusion of penalties (Mazziotta and Pareto 2018), or the ordered aggregation of sub-indicators (Libório et al. 2021). In turn, the building of the composite indicator by the SAW method is quite simple, facilitating the implementation of the weighting scheme proposed in this research through the expression

$${\text{CI}}_{u} = \mathop \sum \limits_{l \in L} w_{l} I_{lu} ,\quad u \in U,$$
(6)

where \(w_{l}\) is the weight of \(I_{l}\) so that:

$$\mathop \sum \limits_{l \in L}^{{}} w_{l} = 1$$
(7)

and

$$0 \le w_{l} \le 1, l \in L.$$
(8)

2.3 Weighting of sub-indicators

Assigning weights by Data-driven weighting schemes avoids judgment errors and evaluation biases in the Expert opinion weighting scheme. In turn, assigning weights based on expert opinion prevents weights from disagreeing with the importance of sub-indicators in the concept of the multidimensional phenomenon. Finally, assigning Equal-weights avoids judgment errors and evaluation biases while ignoring the differentiation of the relative importance of indicators (Greco et al. 2019). This research analyzes how the advantages and disadvantages of these four weighting schemes are reflected in the spatial dependence property of the composite indicator. For this, the following methods were used:

  • Equal-weights is an objective weighting scheme that assigns the same weight \(w_{l} = 1/N\) to every sub-indicator \(l \in L\), where \(N\) here denotes the number of sub-indicators.

  • Data-driven is an objective weighting scheme that assigns weights that maximize the value of the Moran's Index, subject to constraints (7) and (8), through the Generalized Reduced Gradient algorithm (GRG; Abadie 1969). The GRG algorithm is appropriate when the objective function and constraints are nonlinear, for example when they involve powers or exponential or trigonometric terms (Lasdon et al. 1974). The weighting of sub-indicators by the GRG was implemented in Microsoft Excel (Powell and Batt 2008).

  • Expert opinion is a subjective weighting scheme that assigns weights considering the conceptual importance of the sub-indicator in the phenomenon, taking as a reference the opinion of a group of experts. The assignment of weights to the sub-indicators was performed using the Fuzzy extension of Saaty's (1988) Analytic Hierarchy Process, the so-called Fuzzy-AHP (Van Laarhoven and Pedrycz 1983). The Fuzzy-AHP allows the processing of evaluations in different formats, offering psychological comfort to specialists when evaluating alternatives (sub-indicators) and avoiding judgment errors (Ekel et al. 2020). Assessments were performed by four experts using four different assessment formats (see “Appendix” for details). The degree of consensus between individual and collective opinions was calculated and served as a basis to verify possible evaluation biases (Pedrycz et al. 2011).

  • Hybrid is an objective-subjective weighting scheme that assigns weights by combining elements from the GRG and Fuzzy-AHP. This weighting scheme seeks an intermediate solution between the Data-driven and Expert opinion schemes. It assigns weights from the data without disregarding the sub-indicators' relative importance in the multidimensional phenomenon concept. The GRG algorithm was configured to maximize the value of the Moran's Index, taking as constraints the lowest and highest weight of the sub-indicator according to the group of experts (see Table 10 in “Appendix”).
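The Data-driven and Hybrid schemes listed above both search for weights that maximize the Moran's Index, without and with expert-derived bounds, respectively. The sketch below illustrates that idea only. It does not implement GRG or the Excel Solver: it swaps in a simple pairwise-transfer hill climbing on the weight simplex, which can stop at a local optimum. The function names, the step sizes, the toy data, and the bounds `[0.3, 0.7]` are all hypothetical.

```python
def morans_i(x, w):
    """Global Moran's I (Eq. 4) for scores x and spatial weights w."""
    n = len(x)
    mean = sum(x) / n
    z = [xi - mean for xi in x]
    s0 = sum(map(sum, w))
    num = sum(w[u][v] * z[u] * z[v] for u in range(n) for v in range(n))
    return (n / s0) * num / sum(zi * zi for zi in z)


def composite(indicators, weights):
    """SAW scores (Eq. 6): one weighted sum per census tract."""
    return [sum(wl * il for wl, il in zip(weights, row)) for row in indicators]


def maximize_moran(indicators, w_spatial, lower=None, upper=None,
                   step=0.1, min_step=1e-4):
    """Hill climbing on the weight simplex; a simple stand-in for GRG."""
    m = len(indicators[0])
    lower = lower or [0.0] * m
    upper = upper or [1.0] * m
    weights = [1.0 / m] * m  # assumption: Equal-weights is a feasible start
    best = morans_i(composite(indicators, weights), w_spatial)
    while step >= min_step:
        improved = False
        for i in range(m):          # sub-indicator that gives up weight
            for j in range(m):      # sub-indicator that receives it
                if i == j:
                    continue
                delta = min(step, weights[i] - lower[i], upper[j] - weights[j])
                if delta <= 0:
                    continue
                trial = list(weights)
                trial[i] -= delta   # transfers keep the sum of weights at 1
                trial[j] += delta
                score = morans_i(composite(indicators, trial), w_spatial)
                if score > best + 1e-12:
                    weights, best, improved = trial, score, True
        if not improved:
            step /= 2  # refine the search once no transfer helps
    return weights, best


# Toy data: four tracts on a line, each linked to its immediate neighbours
w_spatial = [[0, 1, 0, 0], [0.5, 0, 0.5, 0], [0, 0.5, 0, 0.5], [0, 0, 1, 0]]
# Column 0 varies smoothly in space; column 1 alternates between neighbours
indicators = [[0.0, 0.0], [1 / 3, 1.0], [2 / 3, 0.0], [1.0, 1.0]]

data_driven, i_data = maximize_moran(indicators, w_spatial)
hybrid, i_hybrid = maximize_moran(indicators, w_spatial,
                                  lower=[0.3, 0.3], upper=[0.7, 0.7])
```

In this toy case the unconstrained search moves almost all weight to the spatially smooth sub-indicator, while the bounded search stops at the upper bound. This mirrors the trade-off between spatial dependence and agreement with expert limits that the Hybrid scheme is meant to balance.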

2.4 Robustness check

The robustness check consists of analyzing how the different input factors propagate through the composite indicator structure and impact the positions of the spatial units (Nardo et al. 2005). The uncertainty analysis makes it possible to verify the robustness of the composite indicator through the absolute variation of the difference in the positions of the census tracts \(N\) with each change in the input variables (Becker et al. 2017). The uncertainty analysis performed in this research measures the changes in the positions of the spatial units with each change in the weighting scheme through the following expression

$$\overline{R}_{S} = \frac{1}{N}\mathop \sum \limits_{u \in U} \left| {{\text{Rank}}_{{{\text{ref}}}} \left( {{\text{CI}}_{u} } \right) - {\text{Rank}}\left( {{\text{CI}}_{u} } \right)} \right|,$$
(9)

where \({\text{Rank}}_{{{\text{ref}}}} \left( {{\text{CI}}_{u} } \right)\) is the position of census tract \(u\) in the ranking of the reference composite indicator, and \({\text{Rank}}\left( {{\text{CI}}_{u} } \right)\) is its position in the ranking of the composite indicator being compared.

The uncertainty analysis reveals whether the classifications of spatial units are stable, providing a measure of the composite indicator's robustness and capacity to represent the multidimensional phenomenon regardless of the variations in the weights of the sub-indicators (Saisana et al. 2005).
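The average rank shift of Eq. (9) can be sketched as below. The ranking direction (position 1 for the highest score) and the tie-breaking by original order are assumptions, as are the function names and the toy scores.

```python
def ranks(scores):
    """Position of each tract when scores are sorted from highest to lowest."""
    order = sorted(range(len(scores)), key=lambda u: -scores[u])
    position = [0] * len(scores)
    for rank, u in enumerate(order, start=1):
        position[u] = rank  # ties keep their original order (an assumption)
    return position


def average_rank_shift(reference_scores, scores):
    """Mean absolute difference in rank positions between two indicators (Eq. 9)."""
    ref, cur = ranks(reference_scores), ranks(scores)
    return sum(abs(r - c) for r, c in zip(ref, cur)) / len(ref)


# Hypothetical scores of four tracts under two weighting schemes
reference = [0.9, 0.5, 0.7, 0.1]
alternative = [0.6, 0.8, 0.7, 0.2]
shift = average_rank_shift(reference, alternative)
```

Here two tracts swap the first and third positions and two keep their positions, so the average shift is one position.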

2.5 Quality check

Two quality checks were performed. The first quality check measures how well the composite indicator captures the main concept of the multidimensional phenomenon. This verification is performed through Pearson's \(r\) correlation coefficient between the composite indicator and the main individual indicator in the concept of the multidimensional phenomenon.

The individual indicator Average Income was used in the quality check for two reasons. First, many researchers point to average income as the main individual indicator of the concept of multidimensional social phenomena, such as social vulnerability, poverty, inequality, and social exclusion (Libório et al. 2022a). Second, the individual indicator Average Income is not aggregated in the composite indicator of social vulnerability.
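The first quality check reduces to a Pearson correlation between two vectors, one value per census tract. The sketch below uses made-up numbers, and the variable names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Made-up composite scores and an external indicator for four tracts
ci_scores = [0.1, 0.4, 0.5, 0.9]
external = [1.0, 2.5, 3.0, 5.0]
r = pearson_r(ci_scores, external)
```

Because the external indicator is not one of the aggregated sub-indicators, a high absolute correlation is not produced mechanically by the aggregation itself.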

The second quality check measures how compatible the weights and relative importance of the sub-indicators in the concept of the multidimensional phenomenon are. This verification is carried out through the degree of consensus between the weights obtained by the different weighting schemes and the opinion of the experts, applying the following expression

$$C^{p} = 1 - \mathop \sum \limits_{l = 1}^{m} \frac{{\frac{{\left| {O^{G} \left( {x_{k} } \right) - O^{{E_{i} }} \left( {x_{k} } \right)} \right|}}{{\left( {n - 1} \right)}}}}{m}, \quad p \in P$$
(10)

where \(O^{G} \left( {x_{k} } \right)\) is the position of the \(k\)th alternative in the ranking given by the group's opinion, \(O^{{E_{i} }} \left( {x_{k} } \right)\) is the position of the \(k\)th alternative in the ranking given by expert \(E_{i}\), \(m\) is the number of experts, and \(n\) is the number of sub-indicators.

The decision-maker stipulates the threshold value of the degree of consensus (Ekel et al. 2020; Libório et al. 2022a, b, c). The weight of the sub-indicator is compatible with its relative importance in the concept of the multidimensional phenomenon when the consensus threshold is reached.

2.6 Consistency of results

The consistency analysis verifies how much the spatial weighting matrix influences the weights of the sub-indicators, the spatial dependence, and the robustness and quality parameters of the composite indicator. The analysis is performed using alternative spatial models to the base model. In particular, alternative models were created by changing the sensitivity parameter of distance \(b\) and recalculating the spatial weighting matrix

$$w_{uv} = d_{uv}^{ - b} .$$
(11)

Model 2 was built by applying Eq. (2) with \(b = 2\). Models 3 and 4 were built by applying Eq. (11). Model 3 uses \(b = 1,\) and Model 4 uses \(b = 2\).
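The four spatial models can be sketched with a single builder that switches between the exponential decay of Eq. (2) and the power decay of Eq. (11). Numbering the base model as Model 1 is an inference from the text. Excluding self-weights is an assumption that the power kernel requires, because \(d_{uu} = 0\) makes Eq. (11) undefined. The function name and coordinates are hypothetical.

```python
import math


def build_model(coords, kernel, b):
    """Row-normalized weights; kernel 'exp' is Eq. (2), 'power' is Eq. (11)."""
    n = len(coords)
    w = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            if u == v:
                continue  # assumption: no self-weight (d_uu = 0 breaks Eq. 11)
            d = math.dist(coords[u], coords[v])
            w[u][v] = math.exp(-b * d) if kernel == "exp" else d ** (-b)
    return [[x / sum(row) for x in row] for row in w]


# Model number -> (kernel, distance sensitivity b)
MODELS = {1: ("exp", 1), 2: ("exp", 2), 3: ("power", 1), 4: ("power", 2)}

centroids = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]  # hypothetical coordinates
matrices = {k: build_model(centroids, kernel, b)
            for k, (kernel, b) in MODELS.items()}
```

In each kernel, a larger \(b\) concentrates more of the row weight on the nearest tract, which is what the consistency analysis varies.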

3 Effect of the sub-indicator weighting scheme on the spatial dependence property, robustness, and quality of the composite indicator

Table 1 shows the sub-indicator weighting scheme's effect on the composite indicator's spatial dependence property. The intensity of this effect is greater in the Data-driven and Hybrid weighting schemes. On average, the Moran's Index in these weighting schemes is thirteen and nine percent higher than in the Equal-weights and Expert opinion weighting schemes. In particular, weighting the sub-indicators by the Data-driven scheme allows obtaining a Moran's Index fourteen percent higher than the Equal-weights scheme.

Table 1 The composite indicator's spatial dependence, robustness, and quality by weighting scheme

The results in Table 1 show that the composite indicator of social vulnerability has a positive and significant spatial dependence, regardless of the weighting scheme. However, the spatial dependence of the composite indicator of social vulnerability is more prominent when the sub-indicators are weighted using the Hybrid and Data-driven schemes.

The composite indicators obtained by the four weighting schemes have an uncertainty lower than 30 positions. This level of uncertainty means that changing the weights of the sub-indicators implies only a minor variation in the positions of the spatial units. The composite indicator with the highest level of uncertainty was obtained using the Equal-weights weighting scheme. The uncertainty associated with the positions of the spatial units of the composite indicator built with Equal-weights is, on average, twenty-nine positions. The uncertainty associated with the weighting scheme is lower when the composite indicators are built using the Hybrid and Data-driven schemes. The uncertainty of the composite indicators obtained in the Hybrid and Data-driven schemes is twenty-five percent lower than in the composite indicators obtained in the Equal-weights and Expert opinion schemes. These results suggest that composite indicators built in Hybrid and Data-driven weighting schemes offer a robust and representative measure of the multidimensional phenomenon (Saisana et al. 2005).

The correlation coefficients between the composite indicators and the individual average income indicator exceeded 0.60 in the four weighting schemes. These results indicate that the main individual indicator of the multidimensional phenomenon concept is captured independently of the weighting scheme.

The Expert opinion, Hybrid, and Data-driven weighting schemes generated composite indicators compatible with the expert group's opinion. The composite indicators built through these weighting schemes converge with the expert group's opinion above the threshold of 0.70. This high degree of consensus indicates that these composite indicators are compatible with the concept of the multidimensional phenomenon. These results reinforce the argument that the Equal-weights weighting scheme should be considered when there is no theory to justify weight differentiation and no consensus among experts (Greco et al. 2019).

However, several sub-indicators weighted through the Equal-weights and Data-driven weighting schemes have weights that fall outside the upper or lower limits defined by expert opinion. Figure 2 reveals that the Equal-weights scheme has three sub-indicators with weights outside the upper or lower limits defined by the experts. The number of sub-indicators outside these limits rises to 12 in the Data-driven weighting scheme.

Fig. 2 Agreement between the weights of the sub-indicators according to the opinion of the experts and the weighting schemes (for sub-indicator descriptions see Table 2)

The Hybrid weighting scheme assigns weights within the maximum and minimum limits to two sub-indicators and weights equal to the minimum and maximum limits to five and eight sub-indicators, respectively.

The joint analysis of the Moran's Index, uncertainty, correlation with average income, and degree of consensus suggests that Hybrid and Data-driven weighting yield a better composite indicator than the Equal-weights and Expert opinion weighting schemes. Table 2 shows that four sub-indicators were decisive for the best performance of the Hybrid and Data-driven weighting in the analyzed parameters. These sub-indicators are identified based on the variance criterion (\(\sigma^2\)).

Table 2 Weights of sub-indicators by weighting scheme

The results show that the sub-indicators “Heads of households with income above 20 minimum wages” and “Households connected to the sewage network” increase the spatial autocorrelation of social vulnerability in the study area. The weight of the sub-indicator "Heads of households with income above 20 minimum wages" is 0.37 greater in the Data-driven weighting scheme than in the Equal-weights and Expert opinion weighting schemes. In the sub-indicator "Households connected to the sewage network," the weight obtained in the Data-driven weighting scheme is 0.17 greater than in the other weighting schemes. These results suggest that experts cannot consider the spatial dependence of sub-indicators in the weighting process. The weighting by Equal-weights ignores that the spatial dependence property is not equal between the sub-indicators.

In the Data-driven weighting scheme, the composite indicator with the highest spatial autocorrelation reduced the weight of the sub-indicators "Heads of households without literacy" and "Households without a bathroom" by 0.10 and 0.11, respectively.

4 Influence of weighting schemes in the representation of social vulnerability

The influence of weighting schemes on the representation of social vulnerability in the study area can be seen in Fig. 3. There is a pattern in the distribution of social vulnerability regardless of the weighting scheme. The high social vulnerability is concentrated in the peripheral census tracts, especially in the eastern and western peripheries. Census tracts classified as having no social vulnerability are concentrated in the central region of the urban conurbation. However, it is possible to observe in the maps that the intensity of social vulnerability varies with the weighting scheme. The map shows that changes in the intensity of social vulnerability are accompanied by changes in the Moran's Index (Moran's I).

Fig. 3 Effect of sub-indicator weighting schemes on Social Vulnerability in the study area

The Equal-weights and Expert opinion weighting schemes have a lower Moran's Index and a higher number of census tracts classified as High and Medium social vulnerability. Table 3 shows how the Moran's Index relates to social vulnerability in the study area. First, it shows a positive relationship between the Moran's Index and the number of census tracts classified as having None social vulnerability. Second, it shows a negative relationship between the Moran's Index and the sum of census tracts classified as High and Medium social vulnerability. Third, it reveals that the Moran's Index is positively related to the average composite indicator score.

Table 3 Social vulnerability of census tracts by weighting scheme

These results show that disregarding the opinion of experts through Data-driven weighting increases the spatial autocorrelation, the composite indicator scores, and the number of census tracts classified as None and Low social vulnerability. At the same time, it reduces the number of census tracts classified as Medium and High social vulnerability. The number of census tracts of None and Low social vulnerability is 2.30 times and 1.21 times greater in the Data-driven weighting scheme than in the Expert opinion and Hybrid weighting schemes, respectively. The Data-driven weighting scheme presents 1.61 times and 1.19 times fewer census tracts classified as Medium and High social vulnerability than the Expert opinion and Hybrid weighting schemes, respectively.

These results indicate that positive spatial autocorrelation increases the composite indicator's average score due to the propagation of scores between census tracts. Therefore, disregarding the opinion of experts through Data-driven weighting can make it challenging to evaluate and formulate public policies to reduce social vulnerability, as this weighting scheme of sub-indicators tends to hide areas of Medium and High social vulnerability.

Furthermore, the composite indicator built by the Data-driven weighting scheme presents 26 census tracts with atypical positions. These census tracts show variations above 27.6 positions relative to the composite indicators built by the Hybrid, Equal-weights, and Expert opinion weighting schemes. The map in Fig. 4 shows that most of these atypical shifts occurred in census tracts located in the central part of the study area, which has a higher average income.

Fig. 4 Atypical shifts in the census tracts' position in the composite indicator ranking. Note: The three z-score rule was applied to identify the atypical shifts corresponding to 27.6 positions

The census tracts highlighted with dots on the map shifted an average of 94.7 positions in the ranking of the composite indicator. The average income in these census tracts is 8% higher than the average for the study area. This result is associated with the large variance, across weighting schemes, of the weight of the sub-indicator Heads of households with income above 20 minimum wages (\(\sigma^{2}\) = 0.024). The weight of this sub-indicator in the Data-driven and Hybrid weighting schemes is, on average, 5.1 times greater than in the Equal-weights and Expert opinion weighting schemes. The sub-indicator Heads of households with income above 20 minimum wages has a greater weight in the weighting schemes that have a greater impact on the Moran's Index, suggesting a relationship between these two elements.
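The three z-score rule mentioned in the note to Fig. 4 can be sketched as follows. The use of the population standard deviation, the one-sided cutoff, and the toy shifts are assumptions. The 27.6-position threshold reported above comes from the study's own data and is not reproduced here.

```python
import statistics


def atypical_shifts(shifts, z_threshold=3.0):
    """Return the cutoff and the indices of shifts more than z_threshold SDs above the mean."""
    mean = statistics.fmean(shifts)
    sd = statistics.pstdev(shifts)  # assumption: population standard deviation
    cutoff = mean + z_threshold * sd
    flagged = [u for u, s in enumerate(shifts) if s > cutoff]
    return cutoff, flagged


# Hypothetical rank shifts: nineteen small shifts and one very large one
shifts = [2] * 19 + [100]
cutoff, flagged = atypical_shifts(shifts)
```

Only the last tract exceeds the cutoff in this toy case, so it is the only one flagged as an atypical shift.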

5 Relationship between the spatial autocorrelation of the sub-indicators and the composite indicator

The direct relationship between the Moran's Index and the average score of the composite indicator suggests that the Data-driven weighting scheme assigns greater weights to sub-indicators with greater spatial dependence. This weighting logic suggests that the spatial autocorrelation of the sub-indicator determines its weight. If so, conceptually important sub-indicators with low spatial autocorrelation may be underrepresented. Table 4 refutes this assumption by showing a non-significant correlation between the weights obtained by the Data-driven weighting scheme and the Moran's Index of the sub-indicator.

Table 4 Weights and Moran's Index of sub-indicators

These results lead to a peculiar conclusion. The average Moran's Index of the sub-indicators is 0.280, far below the Moran's Index of the composite indicators. The Moran's Index of the composite indicator built by Data-driven weights is 1.96 times higher than the average Moran's Index of the sub-indicators. None of the sub-indicators presents greater spatial autocorrelation than the composite indicators built by the Expert opinion, Hybrid, and Data-driven weighting schemes. Furthermore, the results show that emphasizing spatial dependence does not distort the representation of the multidimensional phenomenon.
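
The following sketch illustrates how a composite indicator can show a higher Moran's Index than any of its sub-indicators. It is a toy example under stated assumptions: six spatial units arranged in a line, binary contiguity weights between immediate neighbors, two hypothetical sub-indicators, and equal weights. None of these values come from the study.

```python
def morans_i(values, w):
    """Global Moran's I for `values` under the spatial weights matrix `w`."""
    n = len(values)
    m = sum(values) / n
    z = [v - m for v in values]                      # deviations from the mean
    s0 = sum(sum(row) for row in w)                  # sum of all spatial weights
    num = sum(w[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    den = sum(zi * zi for zi in z)
    return (n / s0) * num / den

# Hypothetical layout: 6 units on a line, neighbors are adjacent units.
n = 6
w = [[1 if abs(i - j) == 1 else 0 for j in range(n)] for i in range(n)]

# Two hypothetical sub-indicators with a shared trend and opposing local noise.
sub_a = [1, 3, 2, 4, 3, 5]
sub_b = [2, 1, 3, 2, 4, 3]

# Equal-weight composite: the local noise cancels and the trend remains.
composite = [0.5 * a + 0.5 * b for a, b in zip(sub_a, sub_b)]

for name, values in [("sub_a", sub_a), ("sub_b", sub_b), ("composite", composite)]:
    print(name, round(morans_i(values, w), 3))
```

In this toy case the aggregation removes noise that is spatially uncorrelated, which is one plausible mechanism behind the pattern reported above; the sketch does not establish that this mechanism operates in the study data.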

The discrepancy in the Moran's Index between social indicators is not new and has been reported in previous research (Rufat et al. 2019). However, the results of this research indicate that the composite indicator's spatial dependence property is not related to the Moran's Index of its sub-indicators. The aggregation of sub-indicators has a positive and strong impact on the Moran's Index of the composite indicator, regardless of the weighting scheme. However, the strength of this impact is higher in the Data-driven weighting scheme and lower in the Equal-weights and Expert opinion weighting schemes.

Table 5 provides evidence that reinforces this conclusion. The Moran's Indices of composite social indicators are normally higher when built by the Data-driven weighting scheme than when built by the Equal-weights and Participatory weighting schemes (see Adeleke and Alabede (2021) for an exception).

Table 5 Relationship between the weighting scheme and Moran's Index in composite social indicators

The Hybrid weighting scheme developed in this research offers a solution that balances the property of spatial dependence, the compatibility of the composite indicator with the multidimensional phenomenon, and the agreement of the weights of the sub-indicators with the opinion of experts. This solution makes it possible to know the spatial association of the multidimensional social phenomenon between neighboring areas without overestimating the conditions of these areas or assigning the sub-indicators weights that disagree with the opinion of experts.

6 Effect of the spatial weighting matrix on the spatial autocorrelation, robustness, and quality of the composite indicator

Composite indicators built by Hybrid and Data-driven weighting schemes, which emphasize spatial dependence, are more robust (\(\overline{R}_{S}\) = 19 and \(\overline{R}_{S}\) = 23). Quality checks indicate that these composite indicators provide good representations of social vulnerability in the study area. The correlation of these composite indicators with the individual average income indicator was \({\varvec{r}}\) = 0.630 in the Hybrid weighting and \({\varvec{r}}\) = 0.657 in the Data-driven weighting. The degree of consensus was \(C^{G}\) = 0.710 in the Hybrid weighting and \(C^{G}\) = 0.745 in the Data-driven weighting, surpassing the acceptance threshold of \(C^{G}\) > 0.60.

Tables 6, 7, 8, and 9 show that changes in the spatial weighting matrices do not change the relationships between the weighting schemes, spatial dependence, and the robustness and quality parameters, indicating that the results are reliable. For example, the Moran's Index and uncertainty remain higher in composite indicators built with Hybrid and Data-driven weights.

Table 6 Moran’s Index of the composite indicator per model
Table 7 Uncertainty of the composite indicator per model
Table 8 Degree of consensus of the composite indicator per model
Table 9 Correlation with the average income of the composite indicator per model

These results show that the spatial weighting matrix does not change the relationship between the weighting scheme and the Moran's Index. However, the effect of the weighting scheme on spatial autocorrelation is greater in spatial weighting matrices that are less sensitive to distance (Models 2 and 4).

Tables 7, 8, and 9 show that the spatial weighting matrix does not change the uncertainty of the composite indicator or its correlation with the most important individual indicator in the multidimensional phenomenon. Finally, Table 8 shows that the weighting matrix changes the degree of consensus regarding the weights of the sub-indicators in the Hybrid and Data-driven schemes. The degree of consensus is lower in models based on spatial weighting matrices that are more sensitive to distance. In Models 2 and 4, the degree of consensus between the Data-driven and Expert group weights did not exceed the limit of \(C^{G}\) = 0.70.

On the one hand, the spatial weighting matrix influences the spatial autocorrelation of the composite indicator. On the other hand, the spatial weighting matrix does not change the effects of the weighting scheme on spatial autocorrelation, nor does it influence the robustness and quality parameters of the composite indicator.

7 Conclusions

The results gathered in this research reveal that the property of spatial dependence of the composite indicator is influenced by the weighting scheme of the sub-indicators and the spatial weighting matrix. It is possible to build composite indicators that emphasize the spatial aspects of the neighborhood by applying Hybrid and Data-driven weighting schemes. Robustness and quality checks indicate that emphasizing the property of spatial dependence does not diminish the ability of composite indicators built through these weighting schemes to represent the multidimensional phenomenon.

The results offered in this research are limited to Moran's Global Index. Therefore, the effects of sub-indicator weighting schemes on Moran's Local Index remain to be investigated. In addition, the Data-driven weighting scheme developed in this research has the flexibility to deal with other spatial dimension problems. It is possible to configure the GRG algorithm for at least three other purposes: first, to maximize the spatial autocorrelation between two composite indicators; second, to maximize the correlation coefficient of a spatial regression model; third, to maximize the spatial autocorrelation based on Geary's (1954) Coefficient. These models can be configured to individually or simultaneously incorporate robustness and quality parameters. The flexibility of this weighting scheme also allows finding weights that maximize the spatial dependence of the composite indicator and, at the same time, ensure that the correlation of the composite indicator with the main individual indicator of the concept of the multidimensional phenomenon reaches a certain threshold. For example, it is possible to emphasize the spatial dependence of the composite indicator and guarantee an \({\varvec{r}}\) of at least 0.65. The composite indicator obtained from the sub-indicators weighted with this GRG algorithm configuration reaches a Moran's Index equal to 0.635 and \({\varvec{r}}\) = 0.650. Finally, it is possible to adapt the Data-driven weighting scheme introduced here to build composite indicators based on other methods. Future research may adapt the GRG algorithm to implement the Mazziotta-Pareto Index to consider no compensation between sub-indicators, rather than the Ordered Weighted Average operator that simultaneously considers no compensation between sub-indicators, spatial dependency, and spatial heterogeneity.
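
The constrained configuration described above (maximize the Moran's Index subject to a minimum correlation with a reference indicator) can be sketched as follows. This is not the GRG algorithm: a coarse grid search over the weight simplex stands in for it so that the example needs only the standard library. The spatial layout, sub-indicators, reference income values, and function names are all hypothetical.

```python
from itertools import product

def morans_i(values, w):
    """Global Moran's I for `values` under the spatial weights matrix `w`."""
    n = len(values)
    m = sum(values) / n
    z = [v - m for v in values]
    s0 = sum(sum(row) for row in w)
    num = sum(w[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    return (n / s0) * num / sum(zi * zi for zi in z)

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def search_weights(subs, w, reference, r_min, ticks=20):
    """Grid search for non-negative weights summing to 1 that maximize
    Moran's I of the weighted-sum composite, subject to r >= r_min.

    Returns (moran, r, weights) or None if no grid point is feasible.
    """
    k, n = len(subs), len(subs[0])
    best = None
    for combo in product(range(ticks + 1), repeat=k - 1):
        last = ticks - sum(combo)
        if last < 0:
            continue                                  # outside the simplex
        weights = [c / ticks for c in combo] + [last / ticks]
        ci = [sum(wk * s[i] for wk, s in zip(weights, subs)) for i in range(n)]
        if max(ci) - min(ci) < 1e-12:
            continue                                  # constant composite: I undefined
        r = pearson_r(ci, reference)
        if r < r_min:
            continue                                  # quality constraint violated
        moran = morans_i(ci, w)
        if best is None or moran > best[0]:
            best = (moran, r, weights)
    return best

# Hypothetical data: 6 units on a line, 3 sub-indicators, 1 reference indicator.
n = 6
w = [[1 if abs(i - j) == 1 else 0 for j in range(n)] for i in range(n)]
subs = [[1, 3, 2, 4, 3, 5], [2, 1, 3, 2, 4, 3], [5, 1, 4, 2, 3, 1]]
income = [1, 2, 2, 3, 5, 4]

unconstrained = search_weights(subs, w, income, r_min=-2.0)
constrained = search_weights(subs, w, income, r_min=0.65)
print("unconstrained:", unconstrained)
print("constrained:  ", constrained)
```

Replacing `morans_i` with another spatial statistic, such as Geary's Coefficient, would change only the objective; the constraint handling stays the same. A gradient-based solver would be preferable for realistic numbers of sub-indicators, since the grid grows exponentially with their count.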