1 Introduction

Composite indicators are increasingly popular; many international organizations propose their use in search of evidence-based policy (Saltelli 2007; Nardo et al. 2008). From a formal point of view, a composite indicator is an aggregate of all the dimensions, objectives, individual indicators and variables used for its construction. This implies that what defines a composite indicator is the set of properties underlying its aggregation convention. Although various functional forms for the underlying aggregation rules of a composite indicator have been developed in the literature, in standard practice a composite indicator is very often constructed by applying a weighted linear aggregation rule to a set of variables. However, Munda and Nardo (2009) analyse the case of aggregation rules in the framework of composite indicators and conclude that the use of non-linear/non-compensatory aggregation rules to construct composite indicators is compulsory, for reasons of theoretical consistency, when weights with the meaning of importance coefficients are used or when the assumption of preferential independenceFootnote 1 does not hold. Moreover, in standard linear composite indicators, compensability among the different individual indicators is always assumed; this implies complete substitutability among the various components considered. For example, in a hypothetical sustainability index, economic growth can always substitute for any environmental destruction, or, inside e.g. the environmental dimension, clean air can compensate for a loss of potable water. From a normative point of view, such complete compensability is often not desirable. A search for alternative mathematical aggregation rules is then needed. In this article, I try to revisit the theoretical debate on aggregation rules by looking at contributions from both voting theory and multi-criteria decision analysis. This cross-fertilization helps in clarifying many ambiguous issues still present in the literature and allows one to discuss the key assumptions that may change the evaluation of an aggregation rule when a composite indicator has to be constructed.

Borda and Condorcet consistent rules were originally developed for preference aggregation in the theory of social choice. Nowadays these rules are applied in a variety of fields, such as discrete multi-criteria analysis, composite indicators, artificial intelligence, queries in databases and multiple search engines on the Internet. The debate on the relative merits of Borda and Condorcet consistent voting rules is a very old one; indeed, according to McLean (1990), these rules were already known in the Middle Ages, when Ramon Lull (1235–1315) proposed a Condorcet method and Nicolaus Cusanus (1401–1464) proposed a Borda method.

In this article I will compare Borda consistent scoring methods with two particular Condorcet consistent rules, i.e., the so-called Kemeny’s method (Kemeny 1959) and the Arrow and Raynaud (1986) ranking procedure. This is because the former is the most general rule for dealing with the cycle issue, and the latter is the only algorithm fully respecting the axiom of independence of irrelevant alternatives. I will discuss the basic properties and assumptions of these decision rules inside the framework of composite indicators. One has to note that although the construction of a composite indicator presents the same characteristics as a voting problem (Arrow and Raynaud 1986), it is more general in nature, since indicator scores can be measured on interval or ratio scales too, and not only on an ordinal scale as in the case of voting theory.

Section 2 illustrates the basic characteristics of the composite indicator framework; in particular, the concepts of compensability, indicator weights and preference modelling are considered. Section 3 summarizes the main properties of Borda and Condorcet consistent rules. Section 4 deals with the cycle issue in Condorcet consistent rules; in particular, the so-called Kemeny’s method is illustrated in depth and historically reconstructed. Section 5 briefly describes a peculiar Condorcet consistent rule, i.e., the so-called Arrow and Raynaud ranking algorithm. Section 6 systematically compares scoring methods with Condorcet consistent voting rules according to their properties, and some conclusions about their possible use in different frameworks are drawn.

2 On the Equivalence between the Discrete Multi-Criterion Problem, Voting Theory and the Composite Indicator Framework

The discrete multi-criterion problem can be described in the following way: A is a finite set of N feasible actions (or alternatives); M is the number of different points of view, or evaluation criteria, \( g_m \), m = 1, 2,…, M, considered relevant in a policy problem, where an action a is evaluated to be better than an action b (both belonging to the set A) according to the m-th point of view if \( g_m(a) > g_m(b) \); W is a set of criterion weights \( W = \{w_m\} \), m = 1, 2,…, M, with \( \sum\nolimits_{m = 1}^{M} w_m = 1 \), which can be importance coefficients or trade-offs. It is evident that the discrete multi-criterion problem and the aggregation of individual indicators to build a composite are completely equivalent problems.
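To fix ideas, the following minimal sketch (in Python) shows one possible representation of this formulation; the objects, indicator scores and weights are purely hypothetical and serve only to illustrate the data structures involved:

```python
import numpy as np

# Hypothetical impact matrix: N = 3 objects (rows) evaluated on
# M = 4 individual indicators (columns); entries are indicator scores.
impact = np.array([
    [7.2, 0.43, 55.0, 3.1],   # object a
    [6.8, 0.51, 48.0, 2.9],   # object b
    [7.9, 0.38, 60.0, 3.4],   # object c
])

# One weight per indicator, summing to one, as in the definition above.
weights = np.array([0.4, 0.3, 0.2, 0.1])
assert abs(weights.sum() - 1.0) < 1e-12

# According to the m-th point of view, object a is better than object b
# whenever impact[a, m] > impact[b, m].
a, b, m = 0, 1, 0
print(impact[a, m] > impact[b, m])  # True: a beats b on indicator 0
```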

In synthesis, the information contained in the impact matrix needed for the construction of a composite indicator is:

  • Intensity of preference (when quantitative indicator scores are present).

  • Number of individual indicators in favour of a given object (country, region, city, etc.) to be ranked.

  • Weight attached to each single individual indicator.

  • Relationship (i.e., relative ordering) of each single object with all the other objects to be ranked.

Combinations of this information generate different aggregation conventions, i.e., manipulation rules of the available information to arrive at the preference structure generating the composite indicator. The aggregation of several individual indicators implies taking a position on the fundamental issue of compensability. Compensability refers to the existence of trade-offs, i.e., the possibility of offsetting a disadvantage on some indicators by a sufficiently large advantage on another indicator, whereas smaller advantages would not do the same. Thus a preference relation is non-compensatory if no trade-off occurs and compensatory otherwise. The use of weights together with intensity of preference gives rise to compensatory aggregation methods and gives the weights the meaning of trade-offs. On the contrary, the use of weights with ordinal indicator scores gives rise to non-compensatory aggregation procedures and gives the weights the meaning of importance coefficients (Bouyssou and Vansnick 1986; Keeney and Raiffa 1976; Podinovskii 1994; Roberts 1979).

Trade-offs can be evaluated only if one knows the quantitative scores of the indicators involved without any uncertainty. On the contrary, the concept of importance is connected to the criterion itself and NOT to its quantification. If protected species are considered more, equally or less important than GDP, this is a quality of the indicators which is independent of any measurement scale one may use. As clearly shown by Anderson and Zalinski (1988), when weights depend on the range of variable scores, as in the context of a linear aggregation rule, the interpretation of weights as a measurement of the psychological concept of importance is always completely inappropriate. The concept of importance I am using throughout this paper can be classified as symmetrical importance, that is, “if we have two non-equal numbers to construct a vector in \( R^2 \), then it is preferable to place the greatest number in the position corresponding to the most important criterion” (Podinovskii 1994, p. 241).

More formally, to use the compensatory approach in practice, such as the linear aggregation rule, one has to determine for each single indicator a mapping \( \phi_i : x_i \to \mathbb{R} \) which provides at least an interval scale of measurement, and to assess scaling constants (i.e., weights) in order to specify how compensation should be accomplished between the different indicators, given the scales \( \phi_i \) (Roberts 1979). Note that the scaling constants which appear in the compensatory approach depend on the scales \( \phi_i \); thus they do not characterize the intrinsic relative importance of individual indicators.
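As an illustration of this scale dependence, the following sketch (with hypothetical data) applies a weighted linear aggregation after a min-max mapping \( \phi_i \) of each indicator onto [0, 1]. Holding the weights fixed while changing the admissible range assumed for one indicator changes the composite scores, and possibly the ranking, which is exactly why such weights cannot express intrinsic importance:

```python
import numpy as np

def linear_composite(impact, weights, lo=None, hi=None):
    """Weighted linear aggregation after a min-max mapping phi_i of
    each indicator (column) onto the interval [0, 1]."""
    lo = impact.min(axis=0) if lo is None else lo
    hi = impact.max(axis=0) if hi is None else hi
    phi = (impact - lo) / (hi - lo)   # one scale mapping per indicator
    return phi @ weights              # composite score per object

# Hypothetical scores for 3 objects on 2 indicators, equal weights.
impact = np.array([[7.2, 0.43],
                   [6.8, 0.51],
                   [7.9, 0.38]])
weights = np.array([0.5, 0.5])

# Same weights, two different admissible ranges for the indicators:
print(linear_composite(impact, weights))
print(linear_composite(impact, weights,
                       lo=np.array([0.0, 0.38]),
                       hi=np.array([10.0, 0.51])))
# The composite scores (and here the resulting ranking) differ, so equal
# weights do not encode a scale-free notion of importance.
```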

Vansnick (1990) showed that the two main approaches in multi-criteria decision theory, i.e., the compensatory and the non-compensatory one, can be directly derived from the seminal works of Borda (1784) and Condorcet (1785).

In 1986 Kenneth Arrow and Hervé Raynaud published a very influential book titled “Social choice and multicriterion decision-making”, where the formal analogies between the discrete multi-criterion problem and the social choice problem are analysed in depth. This book is based on the assumption that, in the case where all criteria have ordinal impact scores, if one considers the evaluation criteria as voters, a multi-criteria impact matrix and a voting matrix are identical. As a consequence, all results of social choice also apply fully to multi-criteria decision theory (when no intensity of preference is used), and hence to the construction of composite indicators too.

When an interval or ratio scale of measurement is used, preference modelling becomes a richer mathematical structure than the usual one. Given a set of indicators \( G = \{g_m\} \), m = 1, 2,…, M, and a finite set \( A = \{a_n\} \), n = 1, 2,…, N of objects, let’s start with the simple assumption that the performance (i.e., the indicator score) of an object \( a_n \) with respect to an indicator \( g_m \) is based on an interval or ratio scale of measurement. For simplicity of exposition, the assumption is made here that a higher value of an indicator is preferred to a lower one (the higher, the better). The famous bald-man paradox in Greek philosophy (how many hairs must one remove to turn a person with hair into a bald one?), later Poincaré (1935, p. 69) and finally Luce (1956) made the point that the transitivity of the indifference relation is incompatible with the existence of a sensibility threshold below which an agent either does not sense the difference between two objects or refuses to declare a preference for one or the other. Luce was the first to discuss this issue formally in the framework of preference modelling. Mathematical characterizations of preference modelling with thresholds can be found in Roubens and Vincke (1985).

Introducing a positive constant indifference threshold q results in the so-called threshold model:

$$ \left\{ \begin{array}{l} a_j \; P \; a_k \Leftrightarrow g_m(a_j) > g_m(a_k) + q \\ a_j \; I \; a_k \Leftrightarrow \left| g_m(a_j) - g_m(a_k) \right| \le q \end{array} \right. $$
(1)

where \( a_j \) and \( a_k \) belong to the set A of objects and \( g_m \) to the set G of indicators.
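A minimal sketch of this threshold model, with an illustrative threshold value, also makes the loss of transitivity of the indifference relation concrete:

```python
def threshold_relation(gj, gk, q):
    """Threshold model of Eq. (1): strict preference P holds when the
    score difference exceeds the indifference threshold q; otherwise
    the two objects are indifferent (I)."""
    if gj > gk + q:
        return "P"      # a_j preferred to a_k
    if gk > gj + q:
        return "P^-1"   # a_k preferred to a_j
    return "I"          # |gj - gk| <= q: indifference

# Indifference is not transitive: with q = 1, scores 0 and 0.9 are
# indifferent, 0.9 and 1.8 are indifferent, yet 0 and 1.8 are not.
print(threshold_relation(0.0, 0.9, 1.0))  # I
print(threshold_relation(0.9, 1.8, 1.0))  # I
print(threshold_relation(0.0, 1.8, 1.0))  # P^-1: a_k is preferred
```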

Real-life experiments show that there is often an intermediary zone inside which an agent hesitates between indifference and preference. This observation led to the so-called double threshold model, where variable indifference and preference thresholds are introduced, that is

$$ \left\{ \begin{array}{l} a_j \; P \; a_k \Leftrightarrow g_m(a_j) > g_m(a_k) + p\left( g_m(a_k) \right) \\ a_j \; Q \; a_k \Leftrightarrow g_m(a_k) + p\left( g_m(a_k) \right) \ge g_m(a_j) > g_m(a_k) + q\left( g_m(a_k) \right) \\ a_j \; I \; a_k \Leftrightarrow \left\{ \begin{array}{l} g_m(a_k) + q\left( g_m(a_k) \right) \ge g_m(a_j) \\ g_m(a_j) + q\left( g_m(a_j) \right) \ge g_m(a_k) \end{array} \right. \end{array} \right. $$
(2)

for any m = 1, 2,…, M, where p denotes a positive preference threshold. The relation Q has been called “weak preference” by Roy (1985, 1996). It translates the decision-maker’s hesitation between indifference and preference, and not a “less strong” preference, as its name might lead one to believe. An indicator with both preference and indifference thresholds can be called a pseudo-indicator. A pseudo-order structure is a double threshold model upon which the following consistency condition is imposed

$$ g_m(a_j) > g_m(a_k) \Leftrightarrow \left\{ \begin{array}{l} g_m(a_j) + q\left( g_m(a_j) \right) > g_m(a_k) + q\left( g_m(a_k) \right) \\ g_m(a_j) + p\left( g_m(a_j) \right) > g_m(a_k) + p\left( g_m(a_k) \right) \end{array} \right. $$
(3)

More sophisticated versions of the double threshold model also include the treatment of uncertainty and risk (Fishburn 1970, 1973a; Kacprzyk and Roubens 1988; Ozturk et al. 2005; Roubens and Vincke 1985).
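The double threshold model can be sketched in the same way; constant thresholds q and p are a simplifying assumption here (in Eq. (2) they may depend on the indicator scores):

```python
def double_threshold_relation(gj, gk, q, p):
    """Double threshold model of Eq. (2) with constant thresholds
    q (indifference) and p (preference), 0 <= q <= p; returns the
    relation between a_j and a_k."""
    assert 0 <= q <= p
    if gj > gk + p:
        return "P"     # strict preference of a_j over a_k
    if gj > gk + q:
        return "Q"     # weak preference: hesitation between I and P
    if gk > gj + p:
        return "P^-1"
    if gk > gj + q:
        return "Q^-1"
    return "I"         # both differences within q: indifference

# With q = 1 and p = 3, a score gap of 2 falls in the hesitation zone.
print(double_threshold_relation(10.0, 8.0, q=1.0, p=3.0))  # Q
```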

A first topic to start with is Arrow’s impossibility theorem (Arrow 1963). A legitimate question arises: does this paradoxical result apply to the general discrete aggregation problem too? Arrow and Raynaud (1986, pp. 17–23) answer this question. Let’s assume that a mathematical aggregation convention to arrive at a total ranking of all objects needs at least to satisfy three axiomsFootnote 2:

Axiom 1: Unrestricted Domain

The values that can be taken by the indicators are unrestricted and the mathematical aggregation convention must respect unanimity.

Axiom 2: Independence of irrelevant alternatives

The ranking of the objects (alternatives) in A depends only on the objects (alternatives) belonging to A. “This means that it is of no importance for the decision if you have forgotten in the application of the method some (poorly ranked) alternatives: …. The complete set of alternatives is always very large and only a relatively small subset can be identified. It is thus essential that the result of the method on a small set of alternatives not vary if forgotten alternatives are taken into consideration” (Arrow and Raynaud 1986, p. 19).

Axiom 3: Positive Responsiveness

The degree of preference between two objects a and b is a strictly increasing function of the number of indicators (or weights) that rank a before b.

The following paradoxical result then applies: the only ranking respecting all these axioms must coincide with the ranking supplied by one of the indicators taken into consideration (in Arrow’s words, dictatorship is the only democratic solution!). A consequence of this theorem is that no perfect mathematical aggregation convention can exist; “reasonable” ranking procedures must then be found. In the framework of composite indicators, this consequence raises two questions: is it possible to find a ranking algorithm consistent with a set of desirable propertiesFootnote 3? And conversely, is it possible to assure that no essential property is lost? At this point, the question arises: in the framework of composite indicators, can we choose between Borda and Condorcet aggregation rules on some theoretical and/or practical grounds? A first partial answer to this question will be given in the next Section.

3 Borda Versus Condorcet in the Context of Composite Indicators

First of all, there is a need to generalize the Borda approach so that we can compare any Borda consistent rule with any Condorcet consistent rule. The Borda approach can be generalized by means of the concept of scoring voting rules, meaning that the Borda winner can always be found for any non-decreasing sequence of real numbers \( s_0 \le s_1 \le \cdots \le s_{p-1} \), with \( s_0 < s_{p-1} \), where \( s_0 \) points are given to the alternative ranked last, and so on, up to the alternative ranked first, which receives \( s_{p-1} \) points.Footnote 4
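The following sketch makes the definition concrete; the five-indicator profile is hypothetical, and the Borda rule and plurality are recovered as special cases of the score sequence:

```python
from collections import defaultdict

def scoring_totals(profile, scores):
    """Generic scoring rule. `profile` maps a ranking (a tuple, best
    first) to the number of indicators expressing it; `scores` is a
    non-decreasing sequence s_0 <= ... <= s_{p-1} with s_0 < s_{p-1},
    where s_0 goes to the last-ranked alternative."""
    assert list(scores) == sorted(scores) and scores[0] < scores[-1]
    totals = defaultdict(float)
    for ranking, count in profile.items():
        for position, obj in enumerate(ranking):  # position 0 = best
            totals[obj] += count * scores[len(ranking) - 1 - position]
    return dict(totals)

# Hypothetical profile: 3 indicators say a > b > c, 2 say b > c > a.
profile = {("a", "b", "c"): 3, ("b", "c", "a"): 2}
print(scoring_totals(profile, (0, 1, 2)))  # Borda: a=6, b=7, c=2
print(scoring_totals(profile, (0, 0, 1)))  # plurality: a=3, b=2, c=0
```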

Fishburn (1973b) proves the following theorem: there are profiles where the Condorcet winner exists but is never selected by any scoring method. Moulin (1988, p. 249) proves that “a Condorcet winner (loser) cannot be a Borda loser (winner)”. In other words, Condorcet consistent rules and scoring voting rules are deeply different in nature; their disagreement in practice is the normal situation. We therefore have to examine both approaches carefully and choose the one considered more adequate in a composite indicator framework.

Let’s consider a numerical example with 60 indicators and 3 objects; this example, shown in Table 1, is due to Condorcet himself.

Table 1 Condorcet’s original numerical example (Source: Condorcet 1785)

The corresponding frequency matrix is shown in Table 2 (each element indicates the number of times the corresponding object is ranked first, second or third by the individual indicators).

Table 2 A frequency matrix derived from Table 1

By applying Borda’s scoring rule, the following results are obtained: a = 58, b = 69, c = 53; thus object b is unambiguously selected.

Let’s now apply the Condorcet rule. The corresponding outranking matrixFootnote 5 is shown in Table 3 (each element in the matrix is obtained by pair-wise comparison of objects, counting the number of indicators in favour of each single object).

Table 3 Outranking matrix derived from Table 1

In this case, the concordance (i.e., majority) threshold is 31. We obtain a P b, b P c and c P a; transitivity would require a P c, so a cycle exists and no object can be selected. Let’s then try the application of Condorcet consistent rules.
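Before doing so, note that the computation just described can be reproduced from the outranking matrix alone. The entries below are those of Table 3 as reported in the text (ab = 33, ba = 27, bc = 42, cb = 18, ca = 35, ac = 25); this is a sketch, not code from the paper:

```python
import numpy as np

objects = ["a", "b", "c"]
# Outranking matrix of Table 3: entry (j, k) is the number of the 60
# indicators ranking object j above object k.
E = np.array([[ 0, 33, 25],
              [27,  0, 42],
              [35, 18,  0]])

majority = 31  # concordance threshold: more than half of 60 indicators
for j in range(3):
    for k in range(3):
        if j != k and E[j, k] >= majority:
            print(f"{objects[j]} P {objects[k]}  ({E[j, k]} indicators)")
# Output: a P b, b P c, c P a -- a cycle, so no object can be selected.
```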

From this example we might conclude that the Borda rule (or any scoring rule) is more effective, since an alternative is always selected, while the Condorcet one sometimes leads to an irreducible indecisiveness. It seems appropriate, then, to know more about the properties held by the Borda rule.

Let’s examine again the outranking matrix presented in Table 3. From this matrix we can see that, in the pair-wise comparison, 33 indicators are in favour of object a, while only 27 are in favour of object b. A legitimate question is then: why does the Borda rule rank b before a? This is mainly due to the fact that the Borda rule is based on the concept of intensity of preference, while the Condorcet rule only uses the number of indicators.

In the framework of the Borda rule, and of all scoring methods in general, the intensity of preference is measured by the scores given according to the rank positions. This implies that compensability is allowed. Moreover, the rank position of a given object depends on the number of objects considered; the mutual preference relation of a given pair of objects may thus change according to the other objects considered. As a consequence, preference reversal phenomena may easily occur and, of course, the axiom of independence of irrelevant alternatives is not respected. This problem has been extensively studied by Fishburn (1984).

Let’s examine the numerical example presented in Table 4.

Table 4 Fishburn’s numerical example on the Borda rule

By applying Borda’s scoring rule, the following results are obtained: a = 13, b = 12, c = 11, d = 6; thus object a is chosen. Let’s now suppose that object d is removed from the analysis. Since d was at the bottom of the ranking, nobody should have any reasonable doubt that object a is still the best alternative. By applying Borda’s scoring rule again, the following results are obtained: a = 6, b = 7, c = 8; thus object c is now chosen! Unfortunately, the Borda rule is fully dependent on irrelevant alternatives (objects), and preference reversals can happen with an extremely high frequency.
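The reversal can be checked mechanically. The profile below is hypothetical (it is not claimed to be the original profile of Table 4), but it reproduces exactly the Borda totals quoted above, both before and after object d is removed:

```python
def borda_totals(profile):
    """Borda totals for a profile mapping rankings (best first) to
    indicator counts; an alternative in position pos of a ranking of
    length p receives p - 1 - pos points."""
    totals = {}
    for ranking, count in profile.items():
        p = len(ranking)
        for pos, obj in enumerate(ranking):
            totals[obj] = totals.get(obj, 0) + count * (p - 1 - pos)
    return totals

# Hypothetical 7-indicator profile over four objects.
full = {("a", "d", "c", "b"): 2,
        ("b", "a", "d", "c"): 2,
        ("c", "b", "a", "d"): 3}
print(borda_totals(full))     # a = 13, b = 12, c = 11, d = 6: a wins

# Remove the bottom-ranked object d from every ballot:
reduced = {tuple(x for x in r if x != "d"): n for r, n in full.items()}
print(borda_totals(reduced))  # a = 6, b = 7, c = 8: now c wins!
```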

It is also interesting to remember that the Borda rule can sometimes lose a fundamental property of a ranking procedure, i.e., monotonicity. This happens in the framework of scoring rules based on successive elimination, that is, ascending procedures which first find the worst alternative, eliminate it, and then start again in search of the second worst, and so on. Fishburn (1982) proves that any rule based on successive elimination by scoring methods must violate monotonicity at some profiles.

At this point, we need to tackle the issue of when, in a composite indicator framework, it is better to use a Condorcet consistent rule or a scoring method. Given the consensus in the literature that Condorcet’s theory of voting is non-compensatory while Borda’s is fully compensatory, a first conclusion is that when one wishes to use weights as importance coefficients, there is a need for a Condorcet approachFootnote 6, while a Borda approach is desirable when weights are meaningful in the form of trade-offs. Moreover, a Condorcet approach is useful for generating a ranking of the available objects, while a Borda approach is more useful for isolating the one object considered best (Moulin 1988; Truchon 1995; Young 1988, 1995).

However, as we have seen, a basic problem inherent in Condorcet’s approach is the presence of cycles. This problem has been studied by various scientists (e.g., Fishburn 1973a, b; Kemeny 1959; Moulin 1985; Truchon 1995; Young and Levenglick 1978; Vidu 2002; Weber 2002). The probability π(N, M) of obtaining a cycle with N objects and M indicators increases with both N and M. Estimations of the probabilities of getting cycles according to N objects and M voters (indicators) can be found in Fishburn (1973b, p. 95). One should note that these probabilities are estimated under the so-called “impartial culture assumption”, i.e., voters’ opinions do not influence each other. While this assumption is unrealistic in a mass election, it is fully respected in a composite indicator context, since indicators are supposed to be non-redundant. In Fishburn et al. (1979), the issue of cycles was tackled specifically for the discrete mathematical aggregation problem, and indeed it was proved that the cycle issue is a serious problem for the use of Condorcet’s voting theory, since with many objects and indicators, cycles occur with an extremely high frequency. For this reason, mathematical aggregation conventions based on Condorcet’s ideas need rules of thumb to solve cycles. Unfortunately these rules of thumb normally imply the loss of neutrality (among tied objects, choose the first in alphabetical order) or of anonymity (among tied objects, choose the one most preferred by indicator 1). One of Condorcet’s original suggestions was to delete successively all the weakest pair-wise majorities until all cycles are eliminated. However, Young (1986) proved that this rule is not valid when the number of objects is greater than or equal to four.

Now the question is: Is it possible to tackle the cycle issue in a more general way? The answer to this question will be given in the next Section.

4 The Cycle Issue in Condorcet Consistent Rules: The So-Called Kemeny’s Method

Condorcet himself was aware of the problem of cycles in his approach; he built examples to explain it (such as the one shown in Table 1), and he even came close to finding a consistent rule able to rank any number of alternatives when cycles are present. However, attempts to fully understand this part of Condorcet’s voting theory have arrived at conclusions like “… the general rules for the case of any number of candidates as given by Condorcet are stated so briefly as to be hardly intelligible … and as no examples are given it is quite hopeless to find out what Condorcet meant” (E. J. Nanson, as quoted in Black 1958, p. 175), or “The obscurity and self-contradiction are without any parallel, so far as our experience of mathematical works extends … no amount of examples can convey an adequate impression of the evils” (Todhunter 1949, p. 352, as cited by Young 1988, p. 1234). Attempts at clarifying, fully understanding and axiomatizing Condorcet’s approach for solving cycles have been made mainly by Kemeny (1959), who gave the first intelligible description of the Condorcet approach, and by Young and Levenglick (1978), who gave its clearest exposition and complete axiomatization. For this reason I call this approach the Condorcet-Kemeny-Young-Levenglick ranking procedure, in short the C-K-Y-L ranking procedure.Footnote 7

Its main methodological foundation is the maximum likelihood concept; in fact, the C-K-Y-L ranking procedure may be considered one of its earliest applications. “Condorcet’s argument proceeds along the following lines. People differ in their opinions because they are imperfect judges of which decision really is best. If on balance each voter is more often right than wrong, however, then the majority view is very likely to identify the decision that is objectively best.” (Young 1988, p. 1232). The maximum likelihood principle selects as the final ranking the one with the maximum pair-wise support; this is also the ranking which involves the minimum number of pair-wise preference inversions. Since Kemeny (1959) proposes the number of pair-wise preference inversions as a distance to be minimized between the selected ranking and the individual profiles, the two approaches are perfectly equivalent. Formal proofs of this equivalence can be found in Truchon (1998, pp. 6–10) and Saari and Merlin (2000). The selected ranking is also a median ranking for those composing the profile (in multi-criteria terminology it is the “compromise ranking” among the various conflicting points of view); for this reason the corresponding ranking procedure is often known as the Kemeny median order.

The maximum likelihood ranking of alternatives, in a composite indicator framework, is the ranking supported by the maximum number of indicators over the pair-wise comparisons, summed over all pairs of objects. By applying the C-K-Y-L ranking procedure to the numerical example of Table 1, the following 6 possible rankings with the corresponding total values are obtained (where, e.g., the value 104 is obtained by summing the relevant elements of the outranking matrix shown in Table 3: bc = 42, ba = 27, ca = 35):

  • a → b → c: 100

  • b → c → a: 104

  • c → a → b: 86

  • b → a → c: 94

  • c → b → a: 80

  • a → c → b: 76

The ranking b → c → a is the final result. The original Condorcet problem has been solved in a satisfactory way.
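For small problems the C-K-Y-L ranking can be obtained by brute-force enumeration, as in the following sketch applied to the outranking matrix of Table 3; with N objects the loop visits all N! permutations, so this is only feasible for small N:

```python
from itertools import permutations

# Pair-wise supports from the outranking matrix of Table 3.
support = {("a", "b"): 33, ("b", "a"): 27,
           ("b", "c"): 42, ("c", "b"): 18,
           ("c", "a"): 35, ("a", "c"): 25}

def ckyl_rankings(objects, support):
    """C-K-Y-L (Kemeny) procedure by complete enumeration: score every
    ranking by its total pair-wise support and keep the maxima."""
    best_score, best = -1, []
    for ranking in permutations(objects):
        score = sum(support[(ranking[j], ranking[k])]
                    for j in range(len(ranking))
                    for k in range(j + 1, len(ranking)))
        if score > best_score:
            best_score, best = score, [ranking]
        elif score == best_score:
            best.append(ranking)  # keep ties: all median rankings
    return best_score, best

print(ckyl_rankings(["a", "b", "c"], support))
# -> (104, [('b', 'c', 'a')]): the maximum likelihood ranking.
```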

Condorcet made three basic assumptions:

  1. Voters’ opinions do not influence each other.

  2. The voters all have the same competence, i.e., each voter chooses his/her best candidate with a fixed probability p, where \( 1/2 < p < 1 \), and p is the same for all voters.

  3. Each voter’s judgement on any pair of candidates is independent of his/her judgement on any other pair.

Assumption 1 always applies in a composite indicator framework, since indicators are supposed to be non-redundant. Assumption 2 might be a serious limitation in our case, since indicator weights are in general different, while assumption 2 would imply equal indicator weighting. In social choice terms, the anonymity property (i.e., equal treatment of all voters) is then broken. Indeed, given that full decisiveness leads to dictatorship, Arrow’s impossibility theorem forces us to make a trade-off between decisiveness (an object (alternative) has to be chosen or a ranking has to be made) and, e.g., anonymity. As a consequence, the loss of anonymity in favour of decisiveness is in our case even a positive property.Footnote 8 In general, it is essential that no indicator weight exceeds 50% of the total weight; otherwise the aggregation procedure would become lexicographic in nature, and that indicator would become a dictator in Arrow’s terms.

The third assumption refers to the axiom of independence of irrelevant alternatives. Of course the C-K-Y-L ranking procedure does not respect this axiom. However, two considerations have to be made on this subject.

  1. A Condorcet consistent rule always presents smaller probabilities of the occurrence of a rank reversal than any Borda consistent rule (Moulin 1988; Young 1995). This is again a strong argument in favour of a Condorcet approach in a composite indicator framework.

  2. Young (1988, p. 1241) claims that the C-K-Y-L ranking procedure is the “only plausible ranking procedure that is locally stable”, where local stability means that the ranking of objects does not change if only an interval of the full ranking is considered. It is interesting to note that this property was also studied by Jacquet-Lagrèze (1969), one of the first researchers in multi-criteria analysis, who called it the median procedure.

Personally, I think that if the final ranking depends on the set of objects considered, this can even be a desirable property, provided of course that the ranking procedure is neutral.

Saari and Merlin (2000) explicitly state that the C-K-Y-L ranking procedure (which they call Kemeny’s rule) enjoys “remarkable properties”, one of these being “consistency in societal rankings when candidates are dropped …. To underscore this Kemeny’s rule property, recall how dropping candidates can cause the Borda count societal ranking to radically change … The unexpected, troubling fact is that Kemeny’s rule achieves its consistency by weakening the crucial assumption about the individual rationality of the voters” (Saari and Merlin 2000, p. 404). Thus they conclude that “The Kemeny’s rule structure and the consistency of the Kemeny’s rule words are impressive; the reasons why they occur are worrisome” (Saari and Merlin 2000, p. 431).

Indeed the argument given by Saari and Merlin against the C-K-Y-L ranking procedure is a serious one. Let’s then investigate it more deeply.

First of all, let’s understand what is meant by individual rationality. In Saari’s words (Saari 2000, p. 35): “Transitivity is a sequencing condition which requires the pair-wise rankings to mimic the ordering properties of points on the line. For instance, if a voter prefers X to Y and Y to Z, then the voter must prefer X to Z. A voter with transitive preferences is called rational; a voter with non-transitive preferences is called irrational”.

The underlying assumption of this definition is the identification of human rationality with consistency, and this can be criticized from many points of view.Footnote 9 In particular, in Sect. 2 we noted that a down-to-earth preference modelling should imply the use of indifference and preference thresholds; this implies exactly the loss of the transitivity property of at least the indifference relation. Surprisingly enough, we can conclude that an appropriate preference modelling should be based on “weakening the crucial assumption about the individual rationality”, and this is highly desirable! Moreover, one has to be clear about why a C-K-Y-L ranking procedure is needed: it answers a precise problem of the original Condorcet proposal, i.e., the issue of cycles. It is then clear that we have to evaluate this procedure in the framework of the cycle issue. As we know, cycles originate exactly from the transitivity of the preference relations; it is thus clear that any attempt to solve cycles has to weaken this property. The point is to do so with as little arbitrariness as possible, and this is exactly what the C-K-Y-L ranking procedure does.

Concluding, we can state that if

  1. one accepts high probabilities of the occurrence of a rank reversal,

  2. transitivity of preference and indifference relations is considered essential, and

  3. neutrality can be abandoned,

then using a Borda rule is simply the only option left. Otherwise, if one wishes to adopt a Condorcet-based approach (and thus to keep neutrality and to have far fewer rank reversal phenomena), the only acceptable rule to break cycles is the C-K-Y-L ranking procedure. The price to pay is of course the loss of the “rationality assumption”. Whether this is an acceptable price or not depends on the framework of application. In voting theory it may be a high price, but in the framework of composite indicators and multi-criteria decision analysis it is definitely an acceptable price, or even, under certain conditions, a desirable property.

Other properties of the C-K-Y-L ranking procedure are the following (Young and Levenglick 1978).

  • Neutrality: it does not depend on the name of any object; all objects are treated equally.

  • Unanimity (sometimes called Pareto optimality): if all indicators prefer object a to object b, then b should not be chosen.

  • Monotonicity: if object a is chosen in every pair-wise comparison and only the indicator scores of a are improved, then a should still be the winning object. Monotonicity is an essential property whenever dominated objects are not deleted from the analysis.

  • Reinforcement: if the set A of objects is ranked by two subsets \( G_1 \) and \( G_2 \) of the indicator set G, and the ranking is the same for both \( G_1 \) and \( G_2 \), then \( G_1 \cup G_2 = G \) should still supply the same ranking. This general consistency requirement is very important in a composite indicator framework, where one may wish to apply the indicators belonging to each single dimension first and then pool them in the general model.

As a conclusion, let’s examine the example of Table 5, with three objects and three indicators (this is the classical example of Condorcet paradox shown in many textbooks).

Table 5 Example of an unsolvable ranking problem

By applying the Borda rule, all objects receive a score equal to 3; no selection is possible. By applying the Condorcet rule, with a majority threshold of 2 out of 3, the cycle a P b, b P c and c P a is obtained. By applying the C-K-Y-L ranking procedure, three rankings have the biggest support, namely \( a \to b \to c \), \( b \to c \to a \) and \( c \to a \to b \): the cycle remains unsolved.

This example is a perfect materialization of Arrow’s theorem: no decisiveness is possible! To eliminate ties, there is a need for a larger number of indicators or for some indicator weights. This is the reason why I defend, when it is meaningful, the use of indicator weights: anonymity is lost, but decisiveness improves enormously.

5 Arrow-Raynaud’s Ranking Procedure

Arrow and Raynaud (1986) developed an original procedure explicitly designed to solve the discrete mathematical aggregation problem. Such a procedure is based on a set of axioms mainly built on previous research done by Köhler (1978). These axioms can be synthesized as follows:

  1. Objects are ranked through a step-by-step process.

  2. At each step the information used refers only to objects (alternatives) not yet ranked.

  3. The axiom of independence of irrelevant alternatives must apply; no preference reversal is possible. In this framework, it is called the “prudence” axiom by Arrow and Raynaud (1986, p. 95).

Clearly the prudence axiom discards both scoring methods and Condorcet consistent rules; thus Arrow and Raynaud proposed a new ranking algorithm. If no cycles exist, the ranking algorithm is the following. Given an outranking matrix, “Step r: Identify the maximum \( a_{ij} \) along each row of the current matrix. One at least from among these maxima is smaller than the others. If there are ties, one from among them is chosen arbitrarily. The row of this minimum corresponds to an alternative that will be ranked at the (n − r + 1)th rank in the multicriterion ranking. If r < n, delete the corresponding row and column of the outranking matrix, in order to obtain the current outranking matrix for the (r + 1)th step. The algorithm stops when the outranking matrix becomes void.” (Arrow and Raynaud 1986, p. 105).

To make the exposition clearer, let’s develop a numerical example starting with the outranking matrix presented in Table 6.

Table 6 Outranking matrix for the Arrow-Raynaud algorithm

For each row (from the first to the fifth) the maxima are: 2.5, 4, 2.5, 4, 4. Since there is a tie, the first row can be chosen arbitrarily, and object a is put in the last position of the ranking. The next one is obviously object c. At this stage the new outranking matrix is the one shown in Table 7.

Table 7 Outranking matrix after deleting objects a and c

Now the corresponding maxima are 1.5, 3.5 and 3.5. The object to be selected is unambiguously b. After b has been eliminated, the only remaining comparison is between d and e; obviously d is in the first position and e in the second one. In conclusion, by applying the Arrow-Raynaud algorithm two rankings are possible: \( d \to e \to b \to c \to a \) and \( d \to e \to b \to a \to c \).
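A sketch of this elimination algorithm follows (with the same arbitrary tie-break used in the worked example above, i.e., taking the first row among tied minima; other tie-breaks yield the other feasible rankings). Applied to the outranking matrix of Table 3, it reproduces the ranking mentioned in the next paragraph:

```python
import numpy as np

def arrow_raynaud(objects, E):
    """Arrow-Raynaud elimination: at each step the alternative whose
    largest row entry (its strongest pair-wise victory) is smallest is
    ranked last among those remaining; its row and column are then
    deleted and the step is repeated on the reduced matrix."""
    objs, M = list(objects), np.array(E, dtype=float)
    bottom_up = []
    while objs:
        np.fill_diagonal(M, -np.inf)           # ignore diagonal entries
        worst = int(np.argmin(M.max(axis=1)))  # minimal row maximum
        bottom_up.append(objs.pop(worst))      # rank it last (so far)
        M = np.delete(np.delete(M, worst, axis=0), worst, axis=1)
    return bottom_up[::-1]                     # best first

# Outranking matrix of Table 3 (the original Condorcet example):
E3 = [[0, 33, 25],
      [27, 0, 42],
      [35, 18, 0]]
print(arrow_raynaud(["a", "b", "c"], E3))  # ['b', 'c', 'a']
```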

If the Arrow-Raynaud algorithm is applied to the original Condorcet numerical example, one may easily see that the ranking obtained is \( b \to c \to a \), which is the same as that obtained through the C-K-Y-L procedure, but with much less computation. A question then arises: why not use the Arrow-Raynaud procedure, if even cycles can sometimes be solved so efficiently? To answer this question, let’s look at another numerical example.

Table 8 presents a numerical example of an outranking matrix with four objects and three indicators.

Table 8 Outranking matrix for a 3 indicators and 4 objects problem

By considering the 4! possible rankings, the C-K-Y-L procedure gives a clear-cut solution: the final ranking is a → c → b → d. One has to note, however, that this clear-cut solution comes at an analytical cost, i.e., the loss of a feasible ranking. In fact, as noted by Arrow and Raynaud (1986, p. 110): “Rows 1 and 3 on one hand and 2 and 4 on the other hand are identical up to a permutation of their coefficients. Hence, in both cases, there should be a tie in the choice of a first element, and Kemeny’s method should have given at least two solutions”.

On the other hand, Arrow and Raynaud do not mention that, in this case, their own procedure cannot supply any robust ranking at all, given the presence of so many ties. Unfortunately, ties are very common when cycles are present. This, in my opinion, diminishes the applicability of the Arrow-Raynaud algorithm significantly.

6 Conclusion

The following conclusions can be drawn. Scoring methods present the advantage of always selecting one final solution; thus their degree of decisiveness is very high. However, one has to accept that a scoring method always implies transforming (arbitrarily) an original ordinal scale of measurement into a quantitative one, which in turn always implies a compensatory aggregation rule. Compensability, which is based on the concept of intensity of preference, causes a high probability of preference reversal phenomena. Weights should always be in the form of trade-offs. Monotonicity is sometimes lost and neutrality can be relaxed. A strong argument in favour of a Borda scoring rule is that transitivity of the preference relation is never weakened; thus the assumption of individual rationality always applies.

Condorcet consistent rules are adequate for finding rankings of objects. They present a lower probability of rank reversal than any scoring method. They are not compensatory, thus weights can be treated as importance coefficients. A weak point is the high probability of cycles, whose solution normally implies ad hoc rules of thumb. By means of the C-K-Y-L approach, cycles can be tackled in a general way with no arbitrariness. Reinforcement is always respected by this ranking procedure. The independence of irrelevant alternatives axiom is not fulfilled by the C-K-Y-L rule; this rule is anyway much more stable than any Borda count. However, the cost of this stability is the weakening of the individual rationality assumption (this loss of the transitivity assumption might seem a wild approach from a purely social choice theoretical point of view; however, it can be justified on empirical grounds, as shown for example by Luce and Simon). Moreover, feasible rankings are sometimes lost. Neutrality cannot be relaxed, but anonymity can; this increases decisiveness a lot.

The Arrow-Raynaud method is the only ranking procedure respecting the independence of irrelevant alternatives axiom fully; no preference reversal can occur. It is useful for generating rankings. It is not compensatory. However, it does not respect reinforcement, which, as noted by Arrow and Raynaud themselves, is a very important characteristic when social decisions have to be made.Footnote 10 As a consequence, the Arrow-Raynaud method can be considered more useful in the framework of private business decisions, while the C-K-Y-L ranking procedure is more adequate in a social context (Munda 2004, 2008). When cycles are present, by using the Arrow-Raynaud approach, often no clear-cut solution can be found.

In the framework of composite indicators, compensability should sometimes be limited and rankings should be supplied; furthermore, the transitivity relation can be weakened, and neutrality should in principle always be kept. Scoring methods are then sometimes less adequate than Condorcet-based approaches for ranking feasible objects. Since reinforcement is very important in a social context, and since cycles are very likely to occur, the Arrow-Raynaud method looks slightly worse than the C-K-Y-L ranking procedure.

However, an important problem to be solved is the computation of the C-K-Y-L ranking when many objects are present. One should note that the number of permutations can easily become unmanageable; for example, with 10 objects it is 10! = 3,628,800. Moulin (1988, p. 312) clearly states that the Kemeny method (which I call the C-K-Y-L approach) is “the correct method” for ranking objects, and that the “only drawback of this aggregation method is the difficulty in computing it when the number of candidates grows”. Indeed this computational drawback is very serious, since the Kemeny median order is NP-hard to compute.Footnote 11 This NP-hardness has discouraged the development of algorithms searching for exact solutions; thus the majority of algorithms useful in the framework of composite indicators are heuristics based on artificial intelligence, branch and bound approaches and multi-stage techniques (see e.g., Barthelemy et al. 1989; Charon et al. 1997; Cohen et al. 1999; Davenport and Kalagnanam 2004; Dwork et al. 2001; Truchon 1998).
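To illustrate why heuristics are attractive despite their lack of guarantees, here is a minimal local-search sketch; it is one of many possible heuristics, not a method taken from the cited papers. Starting from the Borda order, adjacent objects are swapped whenever the swap increases the total pair-wise support:

```python
def kemeny_score(ranking, support):
    """Total pair-wise support of a ranking (best first)."""
    return sum(support[(ranking[j], ranking[k])]
               for j in range(len(ranking))
               for k in range(j + 1, len(ranking)))

def local_search_kemeny(start, support):
    """Swap adjacent objects while doing so raises the total pair-wise
    support; stops at a 'locally Kemenized' ranking, which need not be
    the global optimum."""
    ranking, improved = list(start), True
    while improved:
        improved = False
        for j in range(len(ranking) - 1):
            x, y = ranking[j], ranking[j + 1]
            if support[(y, x)] > support[(x, y)]:  # the swap gains support
                ranking[j], ranking[j + 1] = y, x
                improved = True
    return ranking

# Pair-wise supports of Table 3; Borda order b > a > c as the start.
support = {("a", "b"): 33, ("b", "a"): 27,
           ("b", "c"): 42, ("c", "b"): 18,
           ("c", "a"): 35, ("a", "c"): 25}
result = local_search_kemeny(["b", "a", "c"], support)
print(result, kemeny_score(result, support))
# -> ['a', 'b', 'c'] 100: only a local optimum; the exact C-K-Y-L
# ranking b -> c -> a scores 104. This gap is the price heuristics pay
# for avoiding the N! enumeration.
```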

The discussion on the relative pros and cons of the various aggregation rules described in this article is summarised in Table 9, where such aggregation rules are evaluated according to the formal properties identifying them.

Table 9 Properties identifying various aggregation rules