Keywords

1 Introduction

Despite its popularity, Saaty’s Analytic Hierarchy Process, AHP in the following, has been criticized by many authors since its introduction in 1977 [16]. Some researchers pointed out single drawbacks or weaknesses, whereas other researchers criticized and rejected the very foundations of the method. Among these criticisms, we mention only few popular ones. The interested reader may refer to [7] and [19] for a more extended debate. Barzilai [4], as an example, considers the eigenvector method and the normalization procedure in AHP as mathematical errors. Well-known criticisms came also from Bana e Costa and Vansnick [3] and concern the meaning of the priority vector derived from the principal eigenvalue method and the Saaty’s consistency ratio. Nevertheless, the best known and most cited drawback of AHP is certainly the rank reversal (RR) phenomenon. Rank reversal is the change of ranking of alternatives as a consequence, for example, of the addition or deletion of an alternative. This clearly contradicts the principle of the independence of irrelevant alternatives. Rank reversal in AHP was firstly evidenced in 1983 by Belton and Gear [6]. After that, numerous paper were published on this subject [5, 10, 15, 20]. Saaty regularly answered to the criticisms on AHP and, in particular, on RR, arguing on the validity of his method and on the legitimacy of RR [17]. A survey on this topic can be found in [13]. Numerous authors argued on the main causes/factors influencing RR, mainly focusing on the role of vector normalization, aggregation rule and inconsistency. The aim of this paper is not to enter the debate in favor or against AHP and/or RR. Our scope is, instead, to contribute to the understanding on how some relevant factors influence the probability of the RR phenomenon. The paper is organized as follows. In Sect. 2 we set the necessary notation and definitions in order to specify the framework of pairwise comparison and AHP. In the same section we also define the main issues on RR. In Sect. 3 we describe the plan and the results of our numerical study, both in the case of a single decision maker and in the case of group decision making. Finally, we discuss and comment our results.

2 Preliminaries

2.1 Pairwise Comparisons and AHP

We assume that the reader is familiar with AHP [12, 16, 18], so that we only briefly recall the main steps of the method. Let us consider a set of n alternatives \(X=\{ x_{1},\ldots ,x_{n} \}\). We first recall the definition of pairwise comparison matrix, PCM in the following, which is a positive and reciprocal square matrix \(\mathbf {A}\) of order n obtained by pairwise comparing the n alternatives. More precisely, it is \(\mathbf {A}=(a_{ij})_{n \times n}\), with \(a_{ij}>0,~a_{ij}a_{ji}=1,~\forall i,j\), where \(a_{ij}\) is an estimation of the degree of preference of \(x_i\) over \(x_j\). Given a PCM, the most relevant task is to derive a weight vector, that is a vector \(\mathbf {w}=(w_{1},\ldots ,w_{n})\), where \(w_j\) is a numerical value quantifying the priority of alternative \(x_j\) as estimated on the basis of the pairwise comparisons. Vector \(\mathbf {w}\) is defined up to a positive factor. Normalization is often applied to \(\mathbf {w}\), so that the components sum up to one.

In order to derive the vector \(\mathbf {w}\), Saaty [16] proposes the eigenvector method. Namely, \(\mathbf {w}\) is the solution of the following equation

$$\begin{aligned} \mathbf {A}\mathbf {w}=\lambda _{\max }\mathbf {w}\, . \end{aligned}$$
(1)

Note that the maximum eigenvalue of \(\mathbf {A}\), here denoted by \(\lambda _{\max }\), refers to the Perron–Frobenius theorem. Another popular method to derive \(\mathbf {w}=(w_{1},\ldots ,w_{n})\) is the geometric mean method [2, 9], where

$$\begin{aligned} w_{i}=\left( \prod _{j=1}^{n}a_{ij} \right) ^{\frac{1}{n}}~ \forall i . \end{aligned}$$
(2)

Several other methods were proposed for obtaining a weight vector \(\mathbf {w}\) from a PCM [8], but we do not consider them in this paper. In general, given a PCM, different methods lead to different vectors \(\mathbf {w}\), but in particular cases the result is the same, as pointed out below.

Beside reciprocity, which is a property required in the definition of a PCM, consistency is another relevant property. A pairwise comparison matrix is called consistent if and only if the following condition holds:

$$\begin{aligned} a_{ik}=a_{ij}a_{jk} \;\;\; \forall i,j,k. \end{aligned}$$
(3)

Property (3) can be considered as a cardinal transitivity condition and means that preferences are fully coherent. If and only if \(\mathbf {A}\) is consistent, then there exists a weight vector \(\mathbf {w}=(w_{1},\ldots ,w_{n})\) such that

$$\begin{aligned} a_{ij}=\frac{w_i}{w_j} \;\;\; \forall i,j. \end{aligned}$$
(4)

If \(\mathbf {A}\) is consistent, then the Saaty’s eigenvector method and the geometric mean method lead to the same weight vector \(\mathbf {w}\).

Having defined a PCM and the corresponding weight vector \(\mathbf {w}\), we can now apply these basic concepts to a hierarchy, which is the characterizing structure of AHP.

Let us consider a set of m criteria \(C=\{ c_{1},\ldots ,c_{m} \}\) and require that the n alternatives \(x_{1},\ldots ,x_{n}\) must be evaluated on the basis of all criteria. Similarly to the case of a single PCM, the aim of AHP is to provide a weight vector \(\mathbf {w}=(w_{1},\ldots ,w_{n})\), where \(w_j\) a numerical value quantifying the global priority of alternative \(x_j\) as estimated on the basis of all the m criteria. For each fixed criterion \(c_k\), Saaty proposed to construct a PCM, say \(\mathbf {A}_k\) where the n alternatives are pairwise compared on the basis of \(c_k\), \(k=1, \ldots , m\). In such a way, m PCMs of order n are obtained. Then, the corresponding ‘local weight vector’, say \(\mathbf {w}_k\), is derived for each \(\mathbf {A}_k\) by using the eigenvector method. Vectors \(\mathbf {w}_k\) are normalized so that for each vector the sum of the components equals one. Finally, by means of the so-called ‘hierarchical composition’, the m local weight vectors \(\mathbf {w}_1,\ldots ,\mathbf {w}_m\) are aggregated in order to obtain the global weight vector \(\mathbf {w}\). The AHP prescribes that the aggregation of the local weight vectors is done through the weighted arithmetic mean,

$$\begin{aligned} \mathbf {w} = v_1\mathbf {w}_1 + \cdots + v_m\mathbf {w}_m =\sum _{i=1}^{m} v_i\mathbf {w}_i \, , \end{aligned}$$
(5)

where \(v_1, \ldots , v_m\) are the weights, or priorities, of the criteria \(c_1, \ldots , c_m\) respectively. AHP requires that also the weights \(v_1, \ldots , v_m\) are computed as the components of the maximal eigenvector of the \(m \times m\) PCM obtained by pairwise comparing the m criteria. Nevertheless, this is not relevant for our study, so that, in the next section, we will determine \(v_1, \ldots , v_m\) more directly. In Fig. 1 we give an example of a hierarchy where the four alternatives constitute the lowest layer and the five criteria the layer immediately above.

Fig. 1
figure 1

Example of a hierarchy with five criteria and four alternatives

2.2 Rank Reversal

Few years after the introduction of AHP by T. Saaty in 1977 [16], Belton and Gear proved in 1983 [6] that AHP may suffer of a drawback that was then named Rank Reversal. They considered an example with three alternatives and three criteria. The three PCMs where consistent and the ranking of the alternatives were computed by means of AHP as described in the preceding subsection. Belton and Gear showed that, by adding a fourth alternative, the ranking of the original three alternatives changed even if the preferences among them remained unchanged and consistency was preserved for the new \(4 \times 4\) PCMs. Other types of RR were studied in the following years, evidencing, for example, that the phenomenon may occur also by replacing an alternative with a similar one. Other authors evidenced the role of the weight vectors normalization. It was also proved that RR is avoided by aggregating the local weight vectors using the weighted geometric mean instead of the weighted arithmetic meas prescribed by the original AHP. Moreover, it was proved by Saaty itself that RR may not occur in the case of a single PCM, that is for a single criterion, provided that the PCM is consistent. On the contrary, if consistency assumption is removed, RR may occur even for a single PCM. In the next section, we describe our numerical simulations in the case of multiple criteria and consistent PCMs. Our study is aimed at investigating the influence on RR of some relevant factors, as the aggregation method for local weight vectors and the entropy of the criteria weight distribution.

3 Numerical Study on Rank Reversal

3.1 The Effect of Criteria Weights Distribution on Rank Reversal

In this subsection, we study how the distribution of the normalized criteria weights \((v_1, \ldots , v_m)\), \(\sum _{i=1}^m v_i=1\) can influence the probability of RR. We start by observing that if \(m-1\) weights are null, being 1 the remaining weight, as for example in \((v_1, \ldots , v_m)=(1,0,\ldots , 0)\), then RR doesn’t occur. The simple reason is that this case leads back to the single criterion consistent case. As mentioned above, Saaty proved that this latter case is RR-free. By drifting away from this polarized case, RR can arise. In particular, our assumption was to consider the uniform case \((v_1, \ldots , v_m)=(\frac{1}{m},\frac{1}{m},\ldots , \frac{1}{m})\) as an ‘opposite’ case with respect to the previous polarized one. In our opinion, the most suitable quantity to describe the range between these two extreme cases is the entropy of \((v_1, \ldots , v_m)\),

$$\begin{aligned} H(v_1, \ldots , v_m) = -\sum _{i=1}^{m} v_i \ln (v_i) \, . \end{aligned}$$
(6)

Note that entropy of \((v_1, \ldots , v_m)\) measures the ‘closeness’ to the uniform case \((\frac{1}{m},\frac{1}{m},\ldots , \frac{1}{m})\). In fact, the minimum value of entropy (6) is reached in the fully polarized case, for example, \(H(1, 0, \ldots , 0) = 0\), whereas the maximum value of entropy (6) is reached in the case of uniformly distributed weights, \(H(\frac{1}{m},\frac{1}{m},\ldots , \frac{1}{m}) = \ln m\). We assume that, in the case of \(v_i=0\), the value of the corresponding term in (6) is taken to be 0, according to the limit \(\lim _{v_i \rightarrow 0^{+}} \left( v_i \ln (v_i)\right) =0\). We performed numerous simulations with different values of the number of alternatives and criteria. We anticipate that, for the sake of simplicity, Fig. 2a reports a single example, corresponding to \(n=4\) alternatives and \(m=5\) criteria. In the following, we briefly describe the plan of our numerical study.

  1. 1.

    We construct m consistent PCMs \(\mathbf {A}=(a_{ij})_{n \times n}\) by setting \(a_{ij}=\frac{w_i}{w_j} \), where \((w_{1},\ldots ,w_{n})\) is a randomly generated vector by uniformly sampling in the set \(\{1, 2, 3, 4, 5, 6, 7, 8, 9 \}\).

  2. 2.

    We associate to each PCM constructed in the previous point the corresponding normalized weight vector. Given that all the PCMs are consistent, it is clearly irrelevant wether to use the eigenvector method or the geometric mean method. Moreover, the obtained normalized vector will be proportional to that used for constructing the PCM. At this point, we have the m normalized local weight vectors \(\mathbf {w}_1,\ldots ,\mathbf {w}_m\).

  3. 3.

    The m local weight vectors \(\mathbf {w}_1,\ldots ,\mathbf {w}_m\) are aggregated in order to obtain the global weight vector \(\mathbf {w}\). The aggregation is performed using the standard weighted arithmetic mean (5) and weights \(v_1, \ldots , v_m\) of the criteria are randomly generated.

  4. 4.

    For each one of the m consistent PCMs \(\mathbf {A}=(a_{ij})_{n \times n}\) constructed at the point 1, we remove the last row and the last column, thus obtaining a new set of m consistent PCMs of order \(n-1\).

  5. 5.

    We repeat on the new set of m consistent PCMs of order \(n-1\) exactly the same computations as described at points 2 and 3 with the same weights \(v_1, \ldots , v_m\) of the criteria. So, we obtain the corresponding global weight vector with \(n-1\) components.

  6. 6.

    We compare, for each instance of m consistent PCMs, the rank obtained in the n alternatives case with that obtained in the \((n-1)\) alternatives case, in order to verify whether the RR occurred.

Fig. 2
figure 2

Estimated probability of rank reversal for a single DM

We repeat 100.000 times the points from 1 to 6, thus obtaining the output data set of our study. We can now better describe the graphical results shown in Fig. 2a, where the case with 4 alternatives and 5 criteria is reported. Each point in the plot is determined as follows. We report on the horizontal axis the entropy values, ranging from 0 to its maximum value \(\ln m = \ln 5\). This interval is then partitioned in k equally spaced subintervals. For each subinterval, we consider all the weight vectors \((v_1, v_2, v_3, v_4 , v_5)\) with entropy value belonging to this subinterval. Correspondingly, we compute the percentage of occurrence of RR for all \(4 \times 4\) PCMs, constructed as described above, provided that the aggregation is made using the criteria weight vectors with entropy values belonging to the fixed subinterval. Finally, this percentage is reported on the vertical axis, thus obtaining the second coordinate of the point. The first coordinate of the point is the center of the subinterval. As synthesized in Fig. 2a, it is apparent that the probability of RR increases when the entropy of \((v_1, v_2, v_3, v_4 , v_5)\) increases, thus evidencing the role of criteria weights distribution on RR. Figure 2b reports the outcome of a study which is very similar to the one reported in Fig. 2a. The only difference is that, instead of measuring the closeness of a criteria weight vector \((v_1, v_2, v_3, v_4 , v_5)\) to the uniform case \((\frac{1}{5},\frac{1}{5},\frac{1}{5},\frac{1}{5}, \frac{1}{5})\) with the entropy of \((v_1, v_2, v_3, v_4 , v_5)\), we use its standard deviation. Note that, actually, the standard deviation measures the distance from \((\frac{1}{5},\frac{1}{5},\frac{1}{5},\frac{1}{5}, \frac{1}{5})\), rather than the closeness. As pointed out above, the interpretation of Fig. 2a is straightforward. The percentage of cases in which RR occurred increases when the entropy (6) increases. In the case of maximum entropy, i.e. \(H(v_1, v_2, v_3, v_4, v_5)=H(\frac{1}{5},\frac{1}{5},\frac{1}{5},\frac{1}{5}, \frac{1}{5})= \ln 5 \approx 1.6\), we found that RR occurred approximately in 23% of cases. Figure 2b represents the same outcome as in Fig. 2a, but referring to the standard deviation of criteria weights. Coherently with the outcome in Figs. 2a, 2b shows that the percentage of cases in which RR occurred decreases when the standard deviation increases.

3.2 The Effect of Aggregation Methods on Rank Reversal

Let us consider again the question of the aggregation of the local weight vectors \(\mathbf {w}_1, \ldots , \mathbf {w}_m \). As it is known, Saaty’s AHP states that the aggregation is made using the weighted arithmetic mean (5). On the other hand, Barzilai and Golany [5] proposed to use the weighted geometric mean,

$$\begin{aligned} \mathbf {w} = \prod _{i=1}^{m} \mathbf {w}_i^{v_i} \, . \end{aligned}$$
(7)

Note that the exponents \(v_i\) in (7) act componentwise. This means that components \(w_j\) of vector \(\mathbf {w}\) are computed as

$$\begin{aligned} w_j = \prod _{i=1}^{m} (\mathbf {w}_i)_j^{v_i} \, . \end{aligned}$$
(8)

Barzilai and Golany proved that if the weighted geometric mean aggregation (7) is applied, RR cannot occur, thus evidencing that the weighted arithmetic mean can be considered as a relevant factor determining RR. In our study, we consider the weighted arithmetic mean and the weighted geometric mean as particular cases of the weighted power mean,

$$\begin{aligned} \mathbf {w} = \left( \sum _{i=1}^{m} v_i(\mathbf {w}_i)^p \right) ^\frac{1}{p} \, , \end{aligned}$$
(9)

where, again, the exponent p in (9) acts componentwise.

More precisely, it is known that the weighted arithmetic mean (5) is obtained from (9) for \(p=1\) and the weighted geometric mean (7) is obtained from (9) for \(p \rightarrow 0\). Justified by this latter result, definition (9) is often completed assuming that for \(p=0\) the weighted power mean is defined to be the weighted geometric mean. We assume this extended definition in the following. Clearly, each value of p in [0, 1] is associated with a different aggregation method which acts between the two extreme cases of the weighted geometric and arithmetic mean. By means of numerical simulations, we study how the relative frequency of RR varies by moving from the RR-free case of the geometric mean aggregation, corresponding to \(p=0\), to the weighted arithmetic mean case, corresponding to \(p=1\). As it might be expected, the outcome is a monotonically increasing behavior, as showed in Fig. 3. Similarly to the study described in the Sect. 3.1, we performed numerous simulations with different values of the number of alternatives and criteria but, in Fig. 3, for the sake of simplicity, we report a single example, corresponding to \(n=4\) alternatives and \(m=5\) criteria. The following description of the plan of our numerical study is quite similar to the one described in the previous subsection, as it differs only for what concerns the aggregation method at point 3.

  1. 1.

    We construct m consistent PCMs \(\mathbf {A}=(a_{ij})_{n \times n}\) by setting \(a_{ij}=\frac{w_i}{w_j} \), where \((w_{1},\ldots ,w_{n})\) is a randomly generated vector by uniformly sampling in the set \(\{1, 2, 3, 4, 5, 6, 7, 8, 9 \}\).

  2. 2.

    We associate to each PCM constructed in the previous point the corresponding normalized weight vector. Given that all the PCMs are consistent, it is clearly irrelevant whether to use the eigenvector method or the geometric mean method. Moreover, the obtained normalized vector will be proportional to that used for constructing the PCM. At this point, we have the m normalized local weight vectors \(\mathbf {w}_1,\ldots ,\mathbf {w}_m\).

  3. 3.

    The m local weight vectors \(\mathbf {w}_1,\ldots ,\mathbf {w}_m\) are aggregated in order to obtain the global weight vector \(\mathbf {w}\). The weights \(v_1, \ldots , v_m\) of the criteria are randomly generated and the aggregation is performed using the weighted power mean (9) with 100 different values of p uniformly spaced in [0, 1].

  4. 4.

    For each one of the m consistent PCMs \(\mathbf {A}=(a_{ij})_{n \times n}\) constructed at the point 1, we remove the last row and the last column, thus obtaining a new set of m consistent PCMs of order \(n-1\).

  5. 5.

    We repeat on the new set of m consistent PCMs of order \(n-1\) exactly the same computations as described at points 2 and 3 with the same weights \(v_1, \ldots , v_m\) of the criteria and the same 100 different values of p used in the \(n \times n\) case.

  6. 6.

    We compare, for each instance, the rank obtained in the \(n \times n\) case with that obtained in the \((n-1) \times (n-1)\) case, in order to verify whether the RR occurred.

Fig. 3
figure 3

Estimated probability of rank reversal as influenced by the aggregation method in the case of a single DM

Let us describe how the plot in Fig. 3 is obtained. Each one of the 100 points in the plot corresponds to a fixed value of p on the horizontal axis. The second coordinate of the point is obtained by reporting on the vertical axis the relative frequency of RR, as computed by performing the numerical simulations described above. This relative frequency was computed on 5.000 different instances of five \(4 \times 4\) PCMs. We use the same set of instances for generating all points in the plot, using different values of p in the aggregation method (9). As synthesized in Fig. 3, it is apparent that the probability of RR increases with p, thus evidencing how the RR is influenced by the aggregation method (9). For example, it can be noted that points corresponding to values of p close to zero represent aggregation methods close to weighted geometric mean (7). In these cases, RR occurs very rarely. Although we reported only few examples in Figs. 2 and 3, we can send the interested readers other outcomes corresponding to different values of n and m.

3.3 Group AHP and Rank Reversal

As mentioned in the introduction, we extended our study to the case of a group of Decision Makers, DMs in the following. We assume that N DMs express their preferences on n alternatives \(\{ x_{1},\ldots ,x_{n} \}\) through AHP exactly as described in the previous subsections. Then, m PCMs of order n are associated to each DM, corresponding to the m criteria. We denote by \(\mathbf {A}_k^l\) the PCM of DM l referring to criterion k, for \(l=1, \ldots , N\) and \(k=1, \ldots , m\). In order to derive a final weight vector \(\mathbf {w}=(w_{1},\ldots ,w_{n})\) quantifying the group priorities on the n alternatives, one has to aggregate the data provided by the N DMs. As it is known, there are two main aggregation procedures within AHP, i.e. the aggregation of individual priorities (AIP) and the aggregation of individual judgments (AIJ). In AIP procedure, each DM derives independently his/her individual priorities. Then, the N priority vectors are aggregated to a group priority vector using either a (weighted) arithmetic (WAMM) or a (weighted) geometric mean method. Conversely, in AIJ procedure, the group PCMs are first determined and the group priority vector is computer after that [11, 21]. In the of AIJ procedure, each entry of the group PCMs \(\mathbf {A}_k^G\) is obtained using the geometric mean method on the corresponding entries of the PCMs of all DMs,

$$\begin{aligned} \left( \mathbf {A}_k^G \right) _{ij} = \left[ \prod _{l=1}^{N} \left( \mathbf {A}_k^l \right) _{ij} \right] ^\frac{1}{N} \;\;\; k=1,\ldots , m \;\; i,j=1, \ldots , n \,. \end{aligned}$$
(10)

As pointed out by Aczél and Saaty [1], the geometric mean method must be used in AIJ procedure in order to preserve the reciprocity of the group PCMs (10). In the following, we use AIJ procedure, since we consider it as more relevant for our study. The interested reader can refer to [14] for an updated review on group aggregation techniques for AHP.

Fig. 4
figure 4

Estimated probability of rank reversal in a group of 5 DMs

Our numerical study on group RR is then performed as follows.

  1. 1.

    The number N of DMs constituting the group is fixed and the PCMs of the N DMs are constructed as described in Sect. 3.1.

  2. 2.

    The PCMs of the N DMs are aggregated using componentwise the geometric mean method, as in (10), in order to form the the group PCMs.

  3. 3.

    The same study and graphical representation described in Sect. 3.1 for a single DM is performed for the group. An illustrative example of the obtained outcomes is reported in Fig. 4, corresponding to the case with \(N=5\).

  4. 4.

    The study described in the preceding points is performed on the aggregation method too, thus extending to the group case the investigation carried out in Sect. 3.2. Again, we report an illustrative example with \(N=4\) in Fig. 5.

Similarly to the study described in Sects. 3.1 and 3.2, we considered different values of n, m and N. For the sake of simplicity, we reported only the illustrative examples in Fig. 4 and in Fig. 5. A first remark is that the monotone behavior of the estimated RR probability is confirmed for the group case too. Although Fig. 4 resembles Fig. 2, it can be observed that the values reported in the latter are approximately double that the values in the former. For different values of the parameters n, m and N we obtain results that are coherent with this last observation. We can therefore conclude that outcomes of our simulations support the conjecture that RR probability decreases when the number of experts in a group increases. The impact of the number of experts on rank reversal was studied from a different point of view in [10]. Comparison between Figs. 3 and 5 leads to very similar conclusions for the aggregation method too.

Fig. 5
figure 5

Estimated probability of group rank reversal as influenced by the aggregation method. Case with 4 DMs