1 Introduction

The Human Development Index (HDI) was first published in the Human Development Report (HDR) in 1990. It was created to be a more realistic measure of national development than the Gross Domestic Product (GDP), as it is composed of three social dimensions: Long and Healthy Life (LHL), Access to Knowledge (ATK) and Decent Standard of Living (DSL). Since its creation, the index has been the most widely used benchmark for the development of nations and an aid to government decision makers. According to UNDP (2016), the HDI is a summary measure of achievements in key dimensions of human development.

Despite the widespread use of the HDI, much criticism has arisen about the dimensions and the methodology used by the United Nations Development Programme (UNDP). For this reason, many researchers have proposed alternative ways to measure and manage the HDI, such as Noorbakhsh (1998), Sagar and Najam (1998), Alkire (2002), Morse (2003), Dowrick et al. (2003), Berenger and Chouchane (2007), Grimm et al. (2008), Cherchye et al. (2008), Zhou et al. (2010), Hatefi and Torabi (2010), Martinez (2013) and Bluszcz (2015).

As reported by Klugman et al. (2011), the UNDP reviewed the main criticisms, suggestions and considerations published by researchers and explained the changes introduced to the HDI in response to them. The article highlights three criticisms that the UNDP considered constructive for the evolution of the index: (1) the addition of new dimensions to the index, such as political rights, human rights and sustainability, among others; (2) the simplistic approach of the method and concerns about the normalization of indicators, the asymmetric treatment of the income index and the choice of weights; and finally (3) the redundancy and robustness of the chosen dimensions.

Afterwards, UNDP changed some aspects of the index design, replacing the indicators for income and education, changing the method of aggregation from an arithmetic mean to a geometric mean, and redefining the upper and lower bounds used to normalize the index, eliminating the practice of capping dimensions that surpassed the upper bounds (Klugman et al. 2011).

Regarding the third criticism considered by the UNDP, the issue is that if there is a high correlation between the index and its dimensions, then any set of weights will result in the same ranking. However, if the correlation is low, the choice of weights becomes very important in determining the ranking of countries. To deal with this, the UNDP kept equal weights for all dimensions, arguing that the time series of the last forty years show a high correlation between the index and its dimensions. On the other hand, the classification of countries into three classes (low, medium and high development) did not prove robust under the adopted changes, and therefore the categorization of countries is done by quartiles of the HDI. Thus, one possible point for improving the design of the index after the changes concerns a still open criticism: the method used to classify countries.

The HDI is calculated as the geometric mean of the normalized indices of the three dimensions; the last dimension, DSL, is transformed by the natural logarithm. In this work we propose an alternative method to classify the countries in a robust way. A multicriteria method from the Elimination Et Choix Traduisant la Réalité (ELECTRE) family, named ELECTRE TRI, is used in conjunction with two statistical tools, Kernel Density Estimation (KDE) and Jenks Natural Breaks (JNB): the first helps to decide the most appropriate number of classes for the data and the second defines the profiles of each class. The ELECTRE TRI method produces an ordered classification of the alternatives (countries) given a set of dimensions (criteria). The result is also non-compensatory, so an alternative that excels in only a few dimensions will not be assigned to a high position (rank). For these reasons, our proposed methodology is an alternative to the geometric mean and to the definition of classes by quartiles used by the UNDP.

Our objective is to classify 188 countries from the HDR 2015 through three dimensions: Long and Healthy Life (LHL), Access to Knowledge (ATK) and Decent Standard of Living (DSL). The results of our proposed method are compared with the ones shown in the HDR 2015.

This paper is organized as follows: Sect. 2 reviews the literature, covering authors who criticize the HDI, authors who have proposed changes to the dimensions composing the index and authors who have proposed changes to its methodology. Section 3 presents the ELECTRE TRI method, KDE and the JNB algorithm. Section 4 presents the proposed model and compares its results with the HDI results published in the HDR 2015. Section 5 is reserved for the conclusions and proposals for future work.

2 Literature Review

To conduct the literature review, a search was made in scientific databases and 19 articles were found to address the main criticisms of the HDI, of which 3 are related to general criticism and 16 suggest new dimensions to calculate the indicators or alternative methods to calculate the index.

The articles related to general criticism discuss, respectively: (1) the relationship between the dimensions and the compensatory effect, where a low performance in one dimension can be compensated by a high performance in another (Sagar and Najam 1998); (2) the concept and selection method of dimensions (Alkire 2002); and finally (3) a structured way to evaluate and classify development indicators (Booysen 2002).

Regarding the compensatory effect, it is worth explaining that this effect can give the wrong impression of the development of a country, as the index value does not capture the individual results of the indicators that compose it. This effect is related to the weighting process and, depending on how the weights are assigned, may lead to a higher or lower index score. For example, a country may be poorly evaluated in the health and education indicators but very highly rated in the economic indicator, which would raise the overall index to a level that does not match the overall assessment of development.
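The compensatory effect described above can be illustrated with a minimal sketch. The two sets of normalized dimension indices below are hypothetical values chosen only for illustration, not data from the HDR, and the function names are assumptions of this example.

```python
from math import prod


def arithmetic_mean(indices):
    """Additive aggregation: allows perfect substitution between dimensions."""
    return sum(indices) / len(indices)


def geometric_mean(indices):
    """Multiplicative aggregation: penalizes unbalanced profiles."""
    return prod(indices) ** (1.0 / len(indices))


# Hypothetical normalized indices (health, education, income) in [0, 1].
balanced = (0.60, 0.60, 0.60)
unbalanced = (0.30, 0.50, 1.00)  # weak health and education, very high income

for name, country in (("balanced", balanced), ("unbalanced", unbalanced)):
    print(name, round(arithmetic_mean(country), 3), round(geometric_mean(country), 3))
```

Under the arithmetic mean both hypothetical countries receive the same score, because the high income index fully offsets the weak health and education indices. The geometric mean lowers the score of the unbalanced country, although some compensation between dimensions remains.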

On this subject, Sagar and Najam (1998) state,

Perhaps the single most powerful attribute of the human development concept is the centrality that it invests in the notion that each of its three dimensions are equally essential in determining the level of human development. In fact, the reports have made considerable effort to defend the decision of giving equal weight to the three variables… Our concern here, however, is about the conceptual implications of the current method for folding the three component indices into a single index. We believe that the scheme of arithmetic averaging of the dimensions runs counter to the notion of their being essential and, therefore, non-substitutable. After all, ‘‘additivity over the three variables implies perfect substitution which can hardly be appropriate’’ (Desai 1991). This scheme masks trade-offs between various dimensions since it suggests that you can make up in one dimension the deficiency in another. Such a reductionist view of human development is completely contrary to the UNDP’s own definition. (p. 251)

Adding to the statement above, Klugman et al. (2011) say,

The issue of substitutability has been raised by several authors, including Desai (1991), Palazzi and Lauri (1998), and Nathan et al. (2008). They saw the additive form of the HDI as problematic, because this implies perfect substitution across dimensions. There are constant marginal returns to improvements in each dimension, and therefore the marginal rate of substitution between dimensional achievements is also a constant. This would seem to run counter to the intuition that, the worse the deprivation in a particular dimension, the more urgent the efforts to improve achievements in that dimension should be regarded. (p. 12)

The additive aggregation method may distort the HDI ranking because of the perfect substitution mentioned by these authors. However, the multiplicative method may also lead to rankings that differ from reality if the indicator weights are poorly chosen by the judges and researchers. On this issue, according to Berenger and Chouchane (2007),

…The crucial problem is to assign suitable weights to the indicators. Information can be aggregated into a single measure in two main ways. One is by bringing into play the arbitrariness and beliefs of the researcher and may involve public and expert judgments.

The choice of aggregation function is also crucial as it affects the compensability of additive aggregations. The derived indices can be either additive or functional, depending on the context of analysis. (p. 1266–1267)

Examples of non-compensatory approaches can be seen in papers that evaluate well-being indexes. These indexes are composed of subjective dimensions such as physical, psychological/emotional, social, intellectual, spiritual, occupational and environmental dimensions, among others.

Hall (2013) studied the relation between subjective dimensions of Well Being Index and HDI. For him, “the two approaches can provide alternate, but complementary, pictures of—and improvements to—human lives. One of the insights that subjective data can bring to many aspects of life is the ability to compare people’s perceptions with the objective evidence.”

In this context, the author found that the components of the HDI all correlate strongly with life evaluations for a sample of 124–152 countries, measured by average national data from the 2010 to 2012 Gallup World Poll. The present paper also examines this correlation using the proposed multicriteria classification methodology for the HDI, as shown in Sect. 4.1.

Craveirinha and Clímaco (2012) propose a non-compensatory approach to evaluating quality of life/well-being based on multidimensional dashboards. They say that it is well known that quality of life is much more than what is evaluated through production or even standard of living metrics. The authors also call attention to the question of the construction of aggregated indices of quality of life. According to them,

There are various issues that clearly deserve a discussion in this context. That is: the choice of the dimensions/indicators to be integrated and the quality of the respective measures; the problem of setting of weights which although being attributed as importance coefficients, are in fact associated with trade-offs and lead to fully compensatory indices concerning the considered dimensions; the problem of the construction and normalization of scales for each dimension; the problem of the independence of the dimensions, essential when an additive model is used. These indices represent, in general, arithmetic means of what happens in a given analyzed universe and do not reflect the striking inequalities possibly occurring among parts of the whole under evaluation. (p. 6)

In their paper, they discuss the process of analyzing a dashboard for a multidimensional evaluation of quality of life (QoL) and illustrate their study by proposing a methodology for the analysis of the multidimensional evaluation matrix used by the OECD in the “Better Life Index”.

As can be seen, the authors present the following methodology:

…based on an interactive implementation of the conjunctive method, enabling the consideration of up to three performance thresholds, having in mind to classify the objects (countries) under evaluation into four classes: non-acceptable, acceptable, good and very good… On the proposed methodology, there is no inter-criterion aggregation, there is no need to reduce the various evaluation dimensions into the same scale thereby avoiding all possible associated distortions which we have referred to above. Furthermore, the described aggregation process is non-compensatory thence avoiding the problem that a weak performance in one dimension may always be compensated by a strong performance in another dimension, as in an additive model. Moreover, in our case, there is no need to assume additive independence among the various attributes, an adequate property because, as we have discussed, that is a requirement too strong in this context. Therefore, the main limitations of the additive model are surpassed. (p. 17)

However, Craveirinha and Clímaco (2012) also mention that,

The proposed methodology is to obtain just a classification of the objects under evaluation and not a ranking. Moreover and in spite of the great flexibility of the proposed tool and concerning the fixation/variation of the thresholds, the problem of knowing how to fix reference values for these thresholds has to be addressed, which is not an easy issue. At this point, we will recur to the framework of the capability approach proposed by Sen. (p. 17)

Mazziotta and Pareto (2015) compare two non-compensatory composite indices for measuring multidimensional phenomena and monitoring their changes over time: the Adjusted Mazziotta-Pareto Index (AMPI) and the Mean-Min Function (MMF).

The authors explain that,

The AMPI is a non-linear composite index which, starting from a linear aggregation, introduces a penalty for the units with unbalanced values of the indicators. It is composed of two parts (a measure of the mean level and a measure of the amount of unbalance) and, differently from other methods, may be used for constructing both ‘positive’ and ‘negative’ composite indices. The MMF is an intermediate case between arithmetic mean, according to which no unbalance is penalized, and min function, according to which the penalization is maximum, because the other values cannot increase the value of the index. It depends on two parameters that are respectively related to the intensity of penalization of unbalance and intensity of complementarity between indicators. (p. 95)

They illustrate the methodological comparison using a set of regional indicators of development in Italy between 2004 and 2011. To construct the index, they considered five dimensions: Health, Income, Work, Education and Environment. The comparative application results show that the AMPI is very similar to an ‘intermediate’ MMF. However, it respects both the constraint of time comparisons and the non-compensability by using an easier and more transparent methodology than the MMF.

Another conclusion of the authors is that “aside from the procedure used, composite indices provide an irreplaceable contribution to simplification, but they are based on methods that flatten the information and can lead to a myopic reading of reality, especially if they are not supported by an adequate selection and interpretation of the individual indicators.”

To evaluate and classify development indicators, judges and researchers may take into consideration eight issues, as can be seen in Table 1 (Booysen 2002).

Table 1 Development indicators (Booysen 2002)

Booysen (2002) also states that composite indices are, in general, of a cardinal nature but remain ordinal insofar as differences in index values cannot be interpreted meaningfully. The author adds that the multidimensionality of these indices represents one of their main advantages; however, the comparative application of indices of development over space and time remains problematic.

Regarding the proposal of new dimensions to calculate the indicators, Salas-Bourgoin (2014) developed a study proposing two new dimensions to the HDI: Employment Index (including employment-to-population ratio) and Democracy Index (as a way of gauging freedom). The author concludes that the modified HDI reveals that the weaknesses in countries with high overall HDI scores relate mainly to employment, while developing countries lag behind in the quality of employment.

Berenger and Chouchane (2007) focused their work on two welfare indexes: Standard of Living (SL) and Quality of Life (QL). To build the SL indicator they used nine variables separated into three areas (education, health and material well-being). The QL indicator also combines nine variables distributed among three areas (education, health and environment). Among their main conclusions, the authors found that both indicators are statistically significant and correlated. Bilbao-Ubillos (2012) constructed an index considering 8 dimensions (health, education, economic welfare, inequality, poverty, gender situation, sustainability and personal safety). The method was applied to 15 representative developing countries from each region of the world and the results were different from the original HDI.

Martinez (2013) proposes an alternative index, the Human Wellbeing Composite Index (WCI), to rank 42 countries in Europe, North Africa and the Middle East. The following dimensions compose the index: income per capita, environmental burden of disease, income inequality, gender gap, education, life expectancy at birth and government effectiveness. According to the author, the results highlight the distance that still separates the Southern Mediterranean countries from the benchmark levels established by some European countries. Bluszcz (2015) assesses the EU countries on three dimensions (social, economic and environmental), using the average of the normalized data and classifying countries according to their level of sustainability.

Grimm et al. (2008), Smet et al. (2012) and Dowrick et al. (2003) suggested alternative methods to classify countries at different levels of development. Morse (2003) with his approach solved the problem of comparison of indices over time. Cherchye et al. (2008), Zhou et al. (2010), Hatefi and Torabi (2010), Martinez (2013), Wu et al. (2013), Zheng and Zheng (2015) and Pinar et al. (2012) studied the subjectivity of weights and the compensation effects between the dimensions.

According to the literature review, bringing the ranking closer to reality may require adding new dimensions to the index as well as using alternative classification methods. With this concern in mind, we propose a mathematically robust methodology that can work with any set of dimensions and weights, improving the quality of the classification.

3 ELECTRE TRI Method

The ELECTRE TRI method was developed by Mousseau and Slowinski (1998) to solve classification problems by comparing each alternative (a country in our context) with reference profiles, which are the limits of each class (group or category). In other words, the alternatives of a set \(X = \{x_{1}, x_{2}, x_{3}, \ldots\}\) are assigned to a set of ordered classes \(C = \{C_{1}, C_{2}, \ldots, C_{h+1}\}\) through a comparison (outranking relationship) with the profiles \(b_{h}\) that bound the classes from above and below, according to the performance in each dimension (criterion) \(G = \{g_{1}, g_{2}, \ldots, g_{j}\}\). Figure 1 illustrates the ordering of classes, profiles and dimensions.

Fig. 1 Classes and Profiles (Costa et al. 2007)

As seen in Fig. 1, upper and lower profiles limit the classes. The exceptions are class \(C_{1}\), which has no lower limit, and class \(C_{h+1}\), which has no upper limit. The alternatives assigned to \(C_{h+1}\) are superior to those assigned to \(C_{h}\), and the same logic continues down to the worst class, \(C_{1}\).

The outranking relationship is built to make it possible to claim that \(x_{i} > b_{h}\), meaning that \(x_{i}\) is at least as good as \(b_{h}\). To validate the statement \(x_{i} > b_{h}\), two conditions must be checked:

  • Concordance: A sufficient majority of dimensions should be in favor of the statement,

  • Discordance: None of the dimensions in the minority should oppose the statement.

Two types of parameters associated with the dimensions also affect the statement \(x_{i} > b_{h}\):

  • A set of weights or importance coefficients \(W_{1}, W_{2}, \ldots, W_{j}\), one for each dimension, used in the Concordance test;

  • A set of veto limits \(v_{1} ,v_{2} , \ldots ,v_{j}\), used in the Discordance test.

The notation used is defined as follows:

  • \(x_{i}\) = Alternative i;

  • \(b_{h}\) = Limit of the class h;

  • \({\text{c}}_{\text{j}} \left( {{\text{x}}_{\text{i}} ,{\text{b}}_{\text{h}} } \right)\) = Partial Concordance Index between alternative and class;

  • \({\text{c}}_{\text{j}} \left( {{\text{b}}_{\text{h}} ,{\text{x}}_{\text{i}} } \right)\) = Partial Concordance Index between class and alternative;

  • \({\text{C}}\left( {{\text{x}}_{\text{i}} ,{\text{b}}_{\text{h}} } \right)\) = Global Concordance Index between alternative and class;

  • \({\text{C}}\left( {{\text{b}}_{\text{h}} ,{\text{x}}_{\text{i}} } \right)\) = Global Concordance Index between class and alternative;

  • \({\text{D}}_{\text{j}} \left( {{\text{x}}_{\text{i}} ,{\text{b}}_{\text{h}} } \right)\) = Partial Discordance Index between alternative and class;

  • \({\text{D}}_{\text{j}} \left( {{\text{b}}_{\text{h}} ,{\text{x}}_{\text{i}} } \right)\) = Partial Discordance Index between class and alternative;

  • \(\sigma \left( {{\text{x}}_{\text{i}} ,{\text{b}}_{\text{h}} } \right)\) = Credibility Index between alternative and class;

  • \(\sigma \left( {{\text{b}}_{\text{h}} ,{\text{x}}_{\text{i}} } \right)\) = Credibility Index between class and alternative;

  • \({\text{W}}_{\text{j}}\) = Weight of the dimension \({\text{j}}\);

  • \({\text{g}}_{\text{j}} \left( {{\text{x}}_{\text{i}} } \right)\) = Performance of the dimension j in relation with the alternative \({\text{i}}\);

  • \({\text{g}}_{\text{j}} \left( {{\text{b}}_{\text{h}} } \right)\) = Performance of the dimension j in relation with profile \({\text{b}}_{\text{h}}\);

  • \({\text{p}}_{\text{j}}\) = Strong Preference of the dimension j;

  • \({\text{q}}_{\text{j}}\) = Weak Preference of the dimension j;

  • \({\text{v}}_{\text{j}}\) = Veto of the dimension j;

  • \(\lambda\) = Cut off value, where \(0.5 \le \lambda \le 1\).

With the following constraints:

$$v_{j} \,\ge\, p_{j} \,\ge\, q_{j}$$
(1)
$$g_{j} \left( {b_{h} } \right) \le\, g_{j} \left( {b_{h + 1} } \right)$$
(2)

The comparisons in ELECTRE TRI were originally based on pseudo-criteria. As noted by Tervonen et al. (2007), a pseudo-criterion is a criterion associated with an indifference threshold \(q_{j}\) and a preference threshold \(p_{j}\). These thresholds allow hesitation and uncertainty in the performance evaluations of each alternative to be taken into account. Since the dimensions considered for the HDI are not pseudo-criteria, we assume:

$$q_{j} = p_{j} = 0\;\forall j$$
(3)

The construction of the outranking relationship in the ELECTRE TRI method follows these steps (Tervonen et al. 2007):

  1. Compute the Partial Concordance Indices \(c_{j}(x_{i}, b_{h})\) and \(c_{j}(b_{h}, x_{i})\).

    As shown by Tervonen et al. (2007), the Partial Concordance Index is calculated individually for each criterion j to support the statement \(x_{i} > b_{h}\). When j has a direction of increasing preference, the Partial Concordance Indices can be calculated as follows:

    $$c_{j}(x_{i},b_{h}) = \begin{cases} 0, & \text{if } g_{j}(b_{h}) - g_{j}(x_{i}) \ge p_{j} \\ 1, & \text{if } g_{j}(b_{h}) - g_{j}(x_{i}) < q_{j} \\ \dfrac{p_{j} - g_{j}(b_{h}) + g_{j}(x_{i})}{p_{j} - q_{j}}, & \text{if } q_{j} \le g_{j}(b_{h}) - g_{j}(x_{i}) < p_{j} \end{cases}$$
    (4)
    $$c_{j}(b_{h},x_{i}) = \begin{cases} 0, & \text{if } g_{j}(x_{i}) - g_{j}(b_{h}) \ge p_{j} \\ 1, & \text{if } g_{j}(x_{i}) - g_{j}(b_{h}) < q_{j} \\ \dfrac{p_{j} - g_{j}(x_{i}) + g_{j}(b_{h})}{p_{j} - q_{j}}, & \text{if } q_{j} \le g_{j}(x_{i}) - g_{j}(b_{h}) < p_{j} \end{cases}$$
    (5)
  2. Compute the Global Concordance Indices \(C(x_{i}, b_{h})\) and \(C(b_{h}, x_{i})\).

    They represent how much the evaluations in all dimensions support the statement \(x_{i} > b_{h}\), and are calculated as follows:

    $$C(x_{i},b_{h}) = \frac{\sum\nolimits_{j = 1}^{n} W_{j}\, c_{j}(x_{i},b_{h})}{\sum\nolimits_{j = 1}^{n} W_{j}}$$
    (6)
    $$C(b_{h},x_{i}) = \frac{\sum\nolimits_{j = 1}^{n} W_{j}\, c_{j}(b_{h},x_{i})}{\sum\nolimits_{j = 1}^{n} W_{j}}$$
    (7)
  3. Compute the Discordance Indices \(D_{j}(x_{i}, b_{h})\) and \(D_{j}(b_{h}, x_{i})\).

    As shown in Tervonen et al. (2007), they describe the effect of the veto, which rules against the statement \(x_{i} > b_{h}\). When j has a direction of increasing preference, the Discordance Indices can be calculated as follows:

    $$D_{j}(x_{i},b_{h}) = \begin{cases} 0, & \text{if } g_{j}(b_{h}) - g_{j}(x_{i}) < p_{j} \\ 1, & \text{if } g_{j}(b_{h}) - g_{j}(x_{i}) \ge v_{j} \\ \dfrac{-p_{j} - g_{j}(x_{i}) + g_{j}(b_{h})}{v_{j} - p_{j}}, & \text{if } p_{j} \le g_{j}(b_{h}) - g_{j}(x_{i}) < v_{j} \end{cases}$$
    (8)
    $$D_{j}(b_{h},x_{i}) = \begin{cases} 0, & \text{if } g_{j}(x_{i}) - g_{j}(b_{h}) < p_{j} \\ 1, & \text{if } g_{j}(x_{i}) - g_{j}(b_{h}) \ge v_{j} \\ \dfrac{-p_{j} - g_{j}(b_{h}) + g_{j}(x_{i})}{v_{j} - p_{j}}, & \text{if } p_{j} \le g_{j}(x_{i}) - g_{j}(b_{h}) < v_{j} \end{cases}$$
    (9)
  4. Compute the outranking relationship according to the Credibility Indices \(\sigma(x_{i}, b_{h})\) and \(\sigma(b_{h}, x_{i})\).

    The degree of credibility of the outranking relationship expresses the extent to which \(x_{i} > b_{h}\) according to the Global Concordance Indices and the Discordance Indices. The Credibility Index follows these principles:

    • When there is no discordant dimension, the Credibility Index is equal to the Global Concordance Index;

    • When a discordant dimension reaches the veto limit (i.e., \(D_{j}(x_{i}, b_{h}) = 1\)), the Credibility Index becomes zero (meaning that the statement \(x_{i} > b_{h}\) is not credible at all);

    • When a discordant dimension satisfies \(C(x_{i}, b_{h}) < D_{j}(x_{i}, b_{h}) < 1\), the Credibility Index becomes lower than the Global Concordance Index as an effect of this opposition.

    In short, the Credibility Index corresponds to the Global Concordance Index weakened by possible veto effects. The Credibility Indices are calculated as follows:

    $$\sigma(x_{i},b_{h}) = \begin{cases} C(x_{i},b_{h}) \times \prod\limits_{j:\,D_{j}(x_{i},b_{h}) > C(x_{i},b_{h})} \dfrac{1 - D_{j}(x_{i},b_{h})}{1 - C(x_{i},b_{h})}, & \text{if } D_{j}(x_{i},b_{h}) > C(x_{i},b_{h}) \text{ for some } j \\ C(x_{i},b_{h}), & \text{otherwise} \end{cases}$$
    (10)
    $$\sigma(b_{h},x_{i}) = \begin{cases} C(b_{h},x_{i}) \times \prod\limits_{j:\,D_{j}(b_{h},x_{i}) > C(b_{h},x_{i})} \dfrac{1 - D_{j}(b_{h},x_{i})}{1 - C(b_{h},x_{i})}, & \text{if } D_{j}(b_{h},x_{i}) > C(b_{h},x_{i}) \text{ for some } j \\ C(b_{h},x_{i}), & \text{otherwise} \end{cases}$$
    (11)
  5. Determine a cut-off level, λ, to obtain an ordered classification.

    The cut-off level is the lowest value of credibility compatible with the claim that \(x_{i}\) outranks \(b_{h}\): if \(\sigma(x_{i}, b_{h}) \ge \lambda\), then \(x_{i} > b_{h}\), written \(x_{i} S b_{h}\) below. When λ = 1, an alternative must overcome a profile limit in every dimension in order to belong to a class above this boundary. A sensitivity analysis can be made by varying λ and observing whether the classification is stable. Depending on which outranking statements hold, an alternative \(x_{i}\) can be strictly preferred to a profile \(b_{h}\) (\(x_{i} \succ b_{h}\)), the profile can be strictly preferred to the alternative (\(b_{h} \succ x_{i}\)), or the two can be indifferent (\(x_{i} I b_{h}\)) or incomparable (\(x_{i} R b_{h}\)). This fuzzy outranking relationship is described as follows:

    1. \(x_{i} I b_{h} \leftrightarrow x_{i} S b_{h} \wedge b_{h} S x_{i}\)

    2. \(x_{i} \succ b_{h} \leftrightarrow x_{i} S b_{h} \wedge \neg\, b_{h} S x_{i}\)

    3. \(b_{h} \succ x_{i} \leftrightarrow \neg\, x_{i} S b_{h} \wedge b_{h} S x_{i}\)

    4. \(x_{i} R b_{h} \leftrightarrow \neg\, x_{i} S b_{h} \wedge \neg\, b_{h} S x_{i}\)

  6. Classification procedure.

    The assignment of each alternative to a class can be obtained by two different procedures:

    • Pessimistic Procedure: An alternative is compared successively with each profile \(b_{h}\), for \(h = n, n - 1, \ldots , 0\). As soon as \(x_{i}\) outranks \(b_{h}\), the alternative is assigned to the class \(C_{h+1}\).

    • Optimistic Procedure: An alternative is compared successively with each profile \(b_{h}\), for \(h = 1, 2, \ldots , n\). As soon as \(b_{h}\) is preferred to \(x_{i}\), the alternative is assigned to the class \(C_{h}\).

    Note that these two procedures rest on distinct principles and may result in different classifications. However, the Pessimistic Procedure has the advantage of being non-compensatory; therefore, it is the only procedure used in this paper.
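The six steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in this paper: the function names, the equal weights, the veto values, the cut-off λ = 0.6, the three profiles and the three alternatives are all hypothetical. The thresholds follow Eq. (3), and the product in Eq. (10) is taken only over the dimensions whose discordance exceeds the Global Concordance Index.

```python
def partial_concordance(g_x, g_b, q, p):
    """Eq. (4): support of one dimension for 'x outranks b'."""
    diff = g_b - g_x
    if diff >= p:
        return 0.0
    if diff < q:
        return 1.0
    return (p - g_b + g_x) / (p - q)


def discordance(g_x, g_b, p, v):
    """Eq. (8): opposition of one dimension to 'x outranks b'."""
    diff = g_b - g_x
    if diff < p:
        return 0.0
    if diff >= v:
        return 1.0
    return (diff - p) / (v - p)


def credibility(x, b, w, q, p, v):
    """Eqs. (6) and (10): credibility of 'x outranks b'."""
    c = [partial_concordance(gx, gb, qj, pj) for gx, gb, qj, pj in zip(x, b, q, p)]
    big_c = sum(wj * cj for wj, cj in zip(w, c)) / sum(w)
    sigma = big_c
    for gx, gb, pj, vj in zip(x, b, p, v):
        d = discordance(gx, gb, pj, vj)
        if d > big_c:  # only dimensions with D_j > C weaken the credibility
            sigma *= (1.0 - d) / (1.0 - big_c)
    return sigma


def pessimistic_class(x, profiles, w, q, p, v, lam):
    """Pessimistic procedure; profiles[h - 1] is b_h, separating C_h from C_{h+1}."""
    for h in range(len(profiles), 0, -1):
        if credibility(x, profiles[h - 1], w, q, p, v) >= lam:
            return h + 1
    return 1  # no profile is outranked: worst class C_1


# Hypothetical parameters: three dimensions, three profiles (four classes).
weights = [1.0, 1.0, 1.0]
q = p = [0.0, 0.0, 0.0]  # Eq. (3)
veto = [0.3, 0.3, 0.3]   # assumed values, consistent with constraint (1)
lam = 0.6                # assumed cut-off level
profiles = [(0.4, 0.4, 0.4), (0.6, 0.6, 0.6), (0.8, 0.8, 0.8)]

alternatives = {
    "A": (0.85, 0.82, 0.81),  # good in every dimension
    "B": (0.95, 0.90, 0.45),  # one weak dimension
    "C": (0.95, 0.95, 0.35),  # one very weak dimension
}
for name, scores in alternatives.items():
    print(name, pessimistic_class(scores, profiles, weights, q, p, veto, lam))
```

In this sketch the alternatives B and C have very high scores in two dimensions, but the veto on the third dimension prevents them from reaching the top class, which is the non-compensatory behavior discussed above.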

3.1 Kernel Density Estimation

A kernel is a special type of probability density function (PDF): it is non-negative and its definite integral is equal to 1. Kernel Density Estimation (KDE) is a non-parametric way to estimate the PDF of a random variable; it is non-parametric because it does not assume any underlying distribution for the variable. For every data point, a kernel function centered at that point is created (forming a “bump”), and the PDF is estimated by adding these bumps. Each bump has a bandwidth (h) that determines its spread, analogous to the bin width of a histogram. Large values of h over-smooth the data, while small values under-smooth it (Trosset 2011; Everitt and Torsten 2006; Parzen 1962; Rosenblatt 1956). The dimensions must be standardized so that the same bandwidth value can be used for all of them, which avoids dealing with different smoothing problems.

Mathematically, the KDE can be calculated as:

$$f\left( x \right) = \frac{1}{nh}\mathop \sum \limits_{i = 1}^{n} K\left( {\frac{{x - x_{i} }}{h}} \right)$$
(12)

where n = number of data points considered; h = bandwidth; K = Gaussian kernel function. The estimate can be understood as an equally weighted sum of Gaussian (bell-shaped) functions, one centered at each data point.

For 1D data, the peaks (local maxima) of the estimated density can be used to indicate the number of classes, so a KDE plot was made for each dimension, all with the same bandwidth (h = 0.2). Figure 2 shows the KDE plot for the first dimension \(g_{1}\) (LHL). The horizontal axis shows that a total of 505 density points were sufficient to obtain a smooth probability density function of the real data. The vertical axis describes the relative likelihood of a value falling in a particular range.
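As a minimal sketch of this procedure, the code below evaluates Eq. (12) with a standard normal kernel on a grid of 505 points and counts the local maxima of the resulting curve. The sample values are synthetic and hypothetical (three tight clusters of standardized values), and the function names are assumptions of this example; only the bandwidth h = 0.2 and the number of grid points come from the text.

```python
import math


def gaussian_kde(samples, x, h):
    """Evaluate Eq. (12) at point x with a standard normal kernel K."""
    n = len(samples)
    total = sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in samples)
    return total / (n * h * math.sqrt(2.0 * math.pi))


def count_peaks(density):
    """Count the strict local maxima of a sampled density curve."""
    return sum(
        1
        for k in range(1, len(density) - 1)
        if density[k - 1] < density[k] > density[k + 1]
    )


# Synthetic standardized values forming three well-separated clusters.
samples = [-2.1, -2.0, -1.9, -0.1, 0.0, 0.1, 1.9, 2.0, 2.1]

points = 505
lo, hi = min(samples) - 1.0, max(samples) + 1.0
grid = [lo + k * (hi - lo) / (points - 1) for k in range(points)]
density = [gaussian_kde(samples, x, h=0.2) for x in grid]

print(count_peaks(density))
```

With these synthetic values the estimated density has one peak per cluster. On real data the count depends on the bandwidth, which is why a single value of h is applied to all standardized dimensions.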

Fig. 2 \(g_{1}\) KDE plot

Four peaks can be noted, so this dimension is represented by four classes.

Figure 3 shows the KDE plot for the second dimension \(g_{2}\) (ATK). Three peaks can be noted, so this dimension is represented by three classes. There is, however, a noticeable lump between the values 0.1 and 0.2, which may indicate a peak that was not fully formed because of the chosen bandwidth, so this dimension could also be represented by four classes.

Fig. 3 \(g_{2}\) KDE plot

Figure 4 shows the KDE plot for the third dimension \(g_{3}\) (DSL). Three peaks can be noted, so this dimension is represented by three classes.

Fig. 4 \(g_{3}\) KDE plot

The ELECTRE TRI method, however, requires all dimensions to have the same number of classes. Therefore, in the following section the data are tested for three and four classes, and the quality of each partition decides how the data are divided.

3.2 Jenks Natural Breaks

Jenks Natural Breaks is a data clustering method designed to determine the best arrangement of values into classes by minimizing the variance within classes and maximizing the variance between classes (Jenks and Caspall 1971). Smith et al. (2015) describe the algorithm as follows:

  1. Select a 1D variable (dimension) to be classified and specify the number of classes required, k.

  2. A set of k − 1 random or uniform values are generated in the range [min(g); max(g)]. These are used as initial class boundaries.

  3. The mean values for each initial class are computed and the sum of squared deviations of class members from the mean values is computed. The total sum of squared deviations (TSSD) is recorded.

  4. Individual values in each class are then systematically assigned to adjacent classes by adjusting the class boundaries to see if the TSSD can be reduced.

This is an iterative process, which ends when the improvement in TSSD falls below a threshold level (that is, when the within-class variance is as small as possible and the between-class variance is as large as possible).
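The objective of the steps above, a partition of the sorted values into k contiguous classes with minimum TSSD, can be illustrated with a short sketch. Note that it swaps the iterative boundary adjustment for an exhaustive dynamic programme over break positions, which reaches the same objective for small data sets; it is not the implementation used in this study, and the function names and sample values are hypothetical.

```python
def ssd(values):
    """Sum of squared deviations of `values` from their mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def natural_breaks(values, k):
    """Split `values` into k contiguous classes with minimum TSSD.

    Dynamic programming over break positions (a stand-in for the
    iterative procedure described in the text).
    Returns the list of classes and the minimum TSSD.
    """
    xs = sorted(values)
    n = len(xs)
    # cost[i][j]: sum of squared deviations of the class xs[i:j]
    cost = [
        [ssd(xs[i:j]) if j > i else 0.0 for j in range(n + 1)]
        for i in range(n + 1)
    ]
    inf = float("inf")
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for c in range(1, k + 1):          # number of classes used so far
        for j in range(c, n + 1):      # the first j values are covered
            for i in range(c - 1, j):  # the last class is xs[i:j]
                candidate = best[c - 1][i] + cost[i][j]
                if candidate < best[c][j]:
                    best[c][j] = candidate
                    cut[c][j] = i
    # Walk back through the stored cut positions to recover the classes.
    classes, j = [], n
    for c in range(k, 0, -1):
        i = cut[c][j]
        classes.append(xs[i:j])
        j = i
    classes.reverse()
    return classes, best[k][n]


# Hypothetical values with three obvious groups (assumption).
data = [1, 2, 3, 10, 11, 12, 20, 21, 22]
classes, tssd = natural_breaks(data, 3)
print(classes, tssd)
```

For the sample above the three groups are recovered and each contributes a squared deviation of 2 to the TSSD.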

The quality of the breaks is indicated by the goodness of variance fit (GVF). A value of 0 indicates no fit and a value of 1 indicates a perfect fit. The GVF is calculated as:

$$GVF = 1 - \frac{{d^{*} }}{d}$$
(13)

where d* = minimum TSSD; d = sum of squared deviations of the whole data set from its mean.
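Equation (13) can be checked on a small example. The sketch below assumes that the classes are already given as lists of values; the data are hypothetical and the helper names are illustrative.

```python
def ssd(values):
    """Sum of squared deviations of `values` from their mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def gvf(classes):
    """Goodness of variance fit, Eq. (13): 1 - d*/d."""
    d_star = sum(ssd(c) for c in classes)    # TSSD of the partition
    d = ssd([v for c in classes for v in c])  # deviation of the whole set
    return 1.0 - d_star / d


# Hypothetical partition into three classes (assumption).
partition = [[1, 2, 3], [10, 11, 12], [20, 21, 22]]
fit = gvf(partition)
print(round(fit, 4))
```

Here d* = 6 and d = 548, so the GVF is 1 − 6/548 ≈ 0.989, close to a perfect fit because the three groups are well separated.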

The GVF values for three and four classes are summarized in Table 2.

Table 2 GVF

Therefore, the GVF indicates that four is the most suitable number of classes for the ELECTRE TRI method. The four classes can be coded as A, B, C and D, where A > B > C > D. For each dimension, the Natural Breaks values for the four classes are shown in Table 3.

Table 3 Breaks

The first column indicates the classes for each dimension, the second column indicates the minimum value that an alternative must have to belong to a given class, the third column indicates the maximum value that an alternative may have to belong to a given class, and the fourth column indicates how many alternatives make up each class.

The profile separating two adjacent classes is calculated as the average of the minimum value of the superior class and the maximum value of the class immediately below. As there are four classes, three profiles are needed; their values are shown in Table 4:

Table 4 Profiles
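The profile calculation described above can be sketched as follows. The class bounds are hypothetical placeholders, not the values of Table 3, and the function name is illustrative.

```python
def class_profiles(bounds):
    """Profiles between adjacent classes.

    bounds: list of (minimum, maximum) per class, ordered from the best
    class to the worst. Each profile is the average of the minimum of the
    superior class and the maximum of the class immediately below.
    """
    return [
        (upper[0] + lower[1]) / 2.0
        for upper, lower in zip(bounds, bounds[1:])
    ]


# Hypothetical (min, max) bounds for classes A, B, C and D (assumption).
bounds = [(0.80, 0.95), (0.60, 0.78), (0.40, 0.58), (0.10, 0.36)]
profiles = class_profiles(bounds)
print([round(p, 2) for p in profiles])
```

Four classes yield three profiles, one for each pair of adjacent classes.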

4 Result Analysis and Discussion

Finally, the ELECTRE TRI method can be applied considering four classes (three profiles). The weights are the same as those adopted by the UNDP: all dimensions have a weight equal to 1. Since there are only three dimensions with the same weight, only two cut-off levels need to be analyzed: \(\lambda_{1} < \frac{2}{3}\) (relaxed view) and \(\lambda_{2} \ge \frac{2}{3}\) (restricted view). In the relaxed view, an alternative belongs to a class if at least two-thirds of its dimensions (two out of three) reach that class. In the restricted view, the alternative belongs to the class of its worst-performing dimension.
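The effect of the two cut-off levels can be illustrated with a simplified sketch. It assumes equal weights, no indifference, preference or veto thresholds, and an assignment that starts from the best class and stops at the first class whose concordance reaches the cut-off; the full ELECTRE TRI procedure is richer than this. The cut-off values 3/5 and 3/4 are illustrative choices on either side of 2/3, and the per-dimension classes (ordered LHL, ATK, DSL) are those reported in this article for Cuba and South Africa.

```python
from fractions import Fraction

CLASSES = "ABCD"  # A is the best class, D the worst


def assign(dim_classes, cutoff):
    """Assign an alternative to a class from its per-dimension classes.

    dim_classes: class reached by each dimension, e.g. "AAC".
    The concordance for class c is the share of (equally weighted)
    dimensions that are in class c or better.
    """
    n = len(dim_classes)
    for c in CLASSES[:-1]:
        concordance = Fraction(sum(d <= c for d in dim_classes), n)
        if concordance >= cutoff:
            return c
    return CLASSES[-1]


RELAXED = Fraction(3, 5)     # illustrative cut-off below 2/3
RESTRICTED = Fraction(3, 4)  # illustrative cut-off above 2/3

cuba = "AAC"          # LHL and ATK in class A, DSL in class C
south_africa = "DBB"  # LHL in class D, ATK and DSL in class B

print(assign(cuba, RELAXED), assign(cuba, RESTRICTED))
print(assign(south_africa, RELAXED), assign(south_africa, RESTRICTED))
```

With three equally weighted dimensions the concordance can only take the values 0, 1/3, 2/3 and 1, so any cut-off on the same side of 2/3 gives the same assignment. The relaxed cut-off returns the class reached by two of the three dimensions, and the restricted cut-off returns the class of the worst dimension.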

The “countries” columns in Table 5 list the countries considered in the UNDP ranking (first: Norway; last: Niger), the “HDI” columns give the classification of countries by quartiles (A, B, C and D), and the “λ1” and “λ2” columns give the relaxed and restricted classification views of the ELECTRE TRI method (also A, B, C and D).

Table 5 Results

Fig. 5 compares the HDI classification with the relaxed view classification. The horizontal axis represents the 188 countries being analyzed, in the same order as in Table 5 (1: Norway, 2: Australia, …, 188: Niger). The vertical axis is divided into four classes (A, B, C and D) and represents the HDI classification of each country based on quartiles (HDI columns in Table 5), while the relaxed view classification is represented by colors (A: blue, B: red, C: green, D: yellow). The differences between the two classifications can now be seen more clearly. For example, the HDI defines class B as the second quartile (from Belarus to Samoa), whereas the relaxed view assigns Cuba (blue line) to class A and Fiji and Belize (green lines) to class C. In addition, some adjacent countries considered as class A by the HDI (Estonia, Poland, Lithuania, Argentina, United Arab Emirates, Hungary, Bahrain, Latvia, Croatia, Kuwait and Montenegro) and as class C by the HDI (Botswana, Moldova, Egypt, Turkmenistan, Gabon, Indonesia, Paraguay, Palestine, South Africa, Viet Nam and Kyrgyzstan) are assigned to class B by the relaxed view.

Fig. 5 HDI and relaxed view

Fig. 6 compares the HDI classification with the restricted view classification. Again, the differences between the two classifications can be seen more clearly. For example, the HDI defines class D as the fourth quartile (from Kenya to Niger), whereas the restricted view assigns Kenya (green line) to class C. In addition, some adjacent countries considered as class C by the HDI (South Africa, Tajikistan, Kiribati, Equatorial Guinea and Zambia) are assigned to class D by the restricted view.

Fig. 6 HDI and restricted view

To summarize the results, Table 6 compares the HDI classification with both the relaxed and restricted views. The accuracy (ACC), which is the number of classifications made by our model that agree with the HDI divided by the total number of countries, is 76.60% for the relaxed view and 78.72% for the restricted view. The HDI and the relaxed view agree most frequently on the A, B and C group classifications (77.55, 94.64 and 71.69%, respectively), while the HDI and the restricted view agree most frequently on the B, C and D group classifications (76.79, 84.62 and 97.73%, respectively). The cases of disagreement are discussed below.

Table 6 Classification comparisons table
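The accuracy and the per-class agreement reported in Table 6 can be computed along the following lines. This is a sketch under the assumption that per-class agreement is measured relative to the reference (HDI) class; the label strings are hypothetical and do not reproduce Table 5.

```python
def agreement(reference, predicted, classes="ABCD"):
    """Overall accuracy and per-class agreement with the reference labels.

    reference, predicted: sequences of class labels of equal length.
    Per-class agreement is the share of alternatives of a reference class
    that the model places in the same class.
    """
    pairs = list(zip(reference, predicted))
    accuracy = sum(r == p for r, p in pairs) / len(pairs)
    per_class = {}
    for c in classes:
        members = [p for r, p in pairs if r == c]
        if members:
            per_class[c] = sum(p == c for p in members) / len(members)
    return accuracy, per_class


# Hypothetical labels for twelve alternatives (assumption).
hdi_labels = "AAABBBCCCDDD"
model_labels = "AABBBBBCCCDD"

acc, by_class = agreement(hdi_labels, model_labels)
print(acc, by_class)
```

In this toy example nine of the twelve labels agree, giving an accuracy of 0.75, and class B is the only class with full agreement.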

The classification made by the UNDP using quartiles has one main disadvantage: the misclassification of alternatives at the edge of a quartile boundary. In other words, these alternatives may be erroneously placed in contiguous classes. Fig. 7 demonstrates this effect.

Fig. 7 UNDP

Analyzing the results obtained by the UNDP for the first quartile (Class A), the following results are obtained:

  • Greece, Brunei Darussalam, Qatar, Andorra, Slovakia, Malta, Saudi Arabia, Chile and Portugal perform well in the LHL and ATK dimensions (blue line or Class A performance) and do not perform well in the DSL dimension (red line or Class B performance).

  • Estonia, Poland, Lithuania, Argentina, United Arab Emirates, Hungary, Bahrain, Latvia, Croatia, Kuwait and Montenegro do not perform well in ATK and DSL (red line or Class B performance).

Alternatives near the edges (those that do not perform well in all dimensions) are considered Class A alternatives by the quartile method used by the UNDP. This effect is caused by the compensatory nature of the UNDP classification: if only one dimension has a very high value, this dimension compensates for the others and the alternative is allocated to a higher class.

The classification made by ELECTRE TRI using the relaxed view (λ1) for class A can be seen in Fig. 8.

Fig. 8 ELECTRE TRI, relaxed view

Analyzing the results obtained by ELECTRE TRI (λ1) for Class A, the following results are obtained:

  • Greece, Brunei Darussalam, Qatar, Andorra, Slovakia, Malta, Saudi Arabia, Chile and Portugal perform well in the LHL and ATK dimensions (blue line or Class A performance) and do not perform well in the DSL dimension (red line or Class B performance).

  • Cuba performs well in the LHL and ATK dimensions (blue line or Class A performance) and does not perform well in the DSL dimension (green line or Class C performance).

Thus, at least two dimensions must be in the first class for the alternative to be considered Class A.

The classification made by ELECTRE TRI using the restricted view (λ2) for class A can be seen in Fig. 9.

Fig. 9 ELECTRE TRI, restricted view

Analyzing the results obtained by ELECTRE TRI (λ2) for Class A, the following result is obtained:

  • Only alternatives that perform well in all dimensions belong to this class.

Comparing the UNDP classification with the ELECTRE TRI classification presented in Table 5 and considering the relaxed view, 44 countries were assigned to a class different from that of the HDI ranking:

  • 11 countries, among them Poland, the United Arab Emirates, Argentina and Hungary, fell from Class A to B; Cuba rose from Class B to A; Fiji and Belize fell from Class B to C; 11 countries, such as Egypt, South Africa, Paraguay and Viet Nam, moved up from Class C to B; and 19 countries rose from Class D to C, among them Kenya, Cameroon, Angola, Myanmar, Sudan and Rwanda.

Comparing both cut-off values (λ1 and λ2):

  • In general, when the cut-off value is increased, 52 countries lose one position in the classification, with the exception of Cuba and South Africa, which lose two positions. For Cuba, this is due to the DSL dimension, which is in class C, while the other dimensions (LHL and ATK) are in class A. South Africa has the ATK and DSL dimensions in class B and the LHL dimension in class D.

  • In the restricted view, 20 countries fell from Class A to B, including Portugal, Greece and Chile; 13 countries fell from Class B to C, including the Russian Federation, Cuba and Ukraine; and 5 countries fell from Class C to D.

Generalizing the results (considering that all criteria possess the same weight), we can affirm that:

  • When comparing the results of the proposed method to the UNDP ranking, it was observed that, even with a more flexible allocation of countries, the method produces a more accurate solution than the separation of classes into quartiles.

  • In the proposed method without the relaxed view, all dimensions must perform above the class profile for an alternative to belong to a class. With the relaxed view, most of the dimensions (two out of three) must be above the profile for this allocation to occur.

The advantages of the proposed methodology are that it (1) eliminates the weight compensation problem, (2) allows the statistical definition of the classes, and (3) can generate two types of solutions.

4.1 Comparison with the World Happiness Report 2015

To compare both ELECTRE TRI classifications with the World Happiness Report 2015 (WHR 2015; Helliwell et al. 2015), we used the quartiles of the rank-ordered data given by the report, obtaining in this way four ranked groups. Spearman’s rank correlation coefficient (Spearman’s rho) could thus be applied to measure the strength and direction of the association between the ranked variables.

Table 7 shows Spearman’s rho between the WHR 2015 classification and both the relaxed and restricted views. The correlation coefficient is 0.735 for the relaxed view and 0.763 for the restricted view. Both correlations are significant at the 95% level (p value < 0.05), and both views show a strong positive correlation with the WHR 2015.

Table 7 WHR Association

However, the set of countries analyzed by the WHR 2015 differs from the one used by the HDR 2015 and a total of 150 countries were used to obtain the Spearman’s rho. The following countries were excluded from the calculations: Andorra, Antigua and Barbuda, Bahamas, Barbados, Belize, Brunei Darussalam, Cabo Verde, Côte d’Ivoire, Cuba, Dominica, Equatorial Guinea, Eritrea, Fiji, Gambia, Grenada, Guinea-Bissau, Guyana, Kiribati, Korea (Republic of), Lao People’s Democratic Republic, Liechtenstein, Maldives, Micronesia, Namibia, Palau, Papua New Guinea, Saint Kitts and Nevis, Saint Lucia, Saint Vincent and the Grenadines, Samoa, Sao Tome and Principe, Seychelles, Solomon Islands, South Sudan, The former Yugoslav Republic of Macedonia, Timor-Leste, Tonga and Vanuatu.
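Spearman’s rho for grouped (heavily tied) rankings such as these can be sketched as the Pearson correlation of average ranks. The group codes below are hypothetical (1 = best group, 4 = worst) and do not reproduce the WHR 2015 or Table 5 data; the function names are illustrative.

```python
def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        shared = (start + end) / 2.0 + 1.0
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical group codes for eight alternatives (assumption).
whr_groups = [1, 1, 2, 2, 3, 3, 4, 4]
model_groups = [1, 2, 2, 2, 3, 4, 4, 4]

rho = spearman_rho(whr_groups, model_groups)
print(round(rho, 3))
```

Using average ranks matters here because each of the four groups contains many tied alternatives; the significance test reported in Table 7 is not covered by this sketch.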

5 Conclusion

The criticisms made by experts and scholars about the HDI measurement method opened the way for new proposals and methods that could make the HDI a more robust and credible index of the actual development of a country. Three types of criticism have been highlighted in this article: General Criticisms (indicating problems in the connection between the index and reality), Proposition of Alternative Methods of Measurement (with a more solid mathematical foundation) and New Dimensions (such as Environment, Safety, Politics, etc.). In this context, this study addressed the method of classifying countries. The suggested method uses a multicriteria tool, the ELECTRE TRI method, combined with two statistical approaches: Kernel Density Estimation to estimate the number of classes and Jenks Natural Breaks to define each class profile. The accuracy of our approach, compared with the HDI, was 76.60% for the first model (an ELECTRE TRI model with a relaxed view) and 78.72% for the second model (an ELECTRE TRI model with a restricted view).

The main advantages of the proposed methodology relate to the elimination of the compensatory effect between the dimensions, the statistical definition of the number of classes and their profiles, and flexibility in the allocation of alternatives. The disadvantage of the specific case study concerns the number of dimensions used, which is too small to be representative when a realistic evaluation of the development of a country is required. Therefore, as future work, we suggest applying the proposed method to a classification that considers a larger set of dimensions.