
1 Introduction

In this paper we deal with voting procedures, maybe the most intuitively appealing examples of social choice functions, which are meant to determine the winner of an election as a function of the individual votes—cf., for a comprehensive exposition, in particular Pitt et al. [1], but also Pitt et al. [2, 3], Arrow, Sen and Suzumura [4], Kelly [5], Plott [6], Schwartz [7], etc.

Basically, we consider the following problem: we have n, \(n\ge 2\), individuals who present their testimonies over a set of m, \(m\ge 2\), options. The testimonies are exemplified here by individual preference relations, i.e. binary relations over the set of options—orderings of the options. We look for a social choice function, or—to be more specific—a voting procedure, that would select, as a function of the individual preference relations, a set of options that best reflects the opinion of the whole group.

A traditional line of research here has been whether, and to what extent, particular voting procedures do or do not satisfy some plausible and reasonable axioms and conditions, maybe best exemplified by the famous Arrow's theorem and the many paradoxes of voting. We will not deal with this here; for details, cf. Arrow [8], Gibbard [9], Kelly [10], May [11], Nurmi [12], Riker [13], Satterthwaite [14], etc.

We will deal with an equally important, and practically perhaps even more important, problem of how similar or dissimilar particular voting procedures are. This was discussed in Nurmi's [12] book, cf. also Baigent [15], Elkind, Faliszewski and Slinko [16], McCabe-Dansted and Slinko [17], Richelson [18], etc.

In this paper we will deal with the above mentioned problem of how to measure the similarity and dissimilarity of voting procedures. First, we will take into account only a subset of well known voting procedures. Then, we will employ the idea of a qualitative similarity (and its related dissimilarity) analysis of voting procedures proposed by Fedrizzi, Kacprzyk and Nurmi [19] in which Pawlak’s rough sets (cf. Pawlak [20, 21], cf. also Pawlak and Skowron [22]), have been used. Then, we will use the idea of the recent approach proposed by Kacprzyk, Nurmi and Zadrożny [23] in which the above mentioned more qualitative rough sets based analysis has been extended with a quantitative analysis by using the Hamming and Jaccard-Needham similarity indexes.

This paper is a further extension of Kacprzyk, Nurmi and Zadrożny [23]. Basically, we consider some other more popular similarity (and their related dissimilarity) measures:

  • Jaccard-Needham (to repeat, for completeness, the results already obtained for this measure in [23]),

  • Dice,

  • correlation,

  • Yule,

  • Russell–Rao,

  • Sokal–Michener,

  • Rogers–Tanimoto, and

  • Kulczyński—cf. Tubbs [24] for details.

Notice that these measures are just a small subset of the multitude of similarity measures known in the literature, cf. Choi, Cha and Tappert [25]. Moreover, in this paper we limit our attention to those similarity measures which, first of all, take values in [0, 1], and whose corresponding dissimilarity measures are dual in the sense that their values add up to 1, which is not the case for all measures.

Notice that this approach is different both conceptually and technically from the approach by Kacprzyk and Zadrożny [26, 27] in which some distinct classes of voting procedures are determined using the concept of Yager’s [28] ordered weighted averaging (OWA) aggregation operator (cf. Yager and Kacprzyk [29], Yager, Kacprzyk and Beliakov [30]), and the change of the order of variables to be aggregated and the type of weights (i.e. the aggregation behavior) determines various classes of voting procedures.

2 Foundations of the Theory of Rough Sets

Rough sets were proposed in the early 1980s by Pawlak [20], and then extensively developed by Pawlak [21], Polkowski (e.g., [31]), Skowron (e.g., [22, 32, 33]), Słowiński (e.g., [34]), etc., and their collaborators. Rough sets theory is a conceptually simple and intuitively appealing tool for the representation and processing of imprecise knowledge, when the classes into which the objects are to be classified are imprecise but can be approximated, from above and from below, by precise sets.

Here we will just briefly recall some basic concepts and properties of rough sets theory which may be useful for our purpose, and for more detail, cf. Pawlak [20, 21], Polkowski (e.g., [31]), Skowron (e.g., [22], Pawlak and Skowron [32, 35], Pawlak et al. [33]), and Greco et al. (e.g., [34]) etc. to just list a few.

Let \(U=\{u\}\) be a universe of discourse. It can usually be partitioned in various ways into a family R of partitionings, or equivalence relations defined on U. A knowledge base, denoted by K, is the pair \(K=(U, \mathbf{R})\). Let now P be a non-empty subset of R, \(\mathbf{P} \subset \mathbf{R}, \mathbf{P}\ne \emptyset \). Then, the intersection of all equivalence relations (or partitionings) in P, which is also an equivalence relation, is called an indiscernibility relation over P and is denoted by \(IND(\mathbf{P})\).

The family of its equivalence classes is termed the P-basic knowledge about U in K and it represents all that can be said about the elements of U under P. Therefore, one cannot classify the elements of U any deeper than to the equivalence classes of \(IND(\mathbf{P})\). For instance, if for some U, \(\mathbf{P}=\{R_1, R_2\}\) such that \(R_1\) partitions the objects into the classes labeled “heavy” and “lightweight”, and \(R_2\) partitions into the classes labeled “black” and “white”, then all that can be said about any element of U is that it belongs to one of: “heavy-and-black”, “heavy-and-white”,“lightweight-and-black”, “lightweight-and-white”.

Equivalence classes of \(IND(\mathbf{P})\) are called the basic categories (concepts) of knowledge P. If \(Q \in \mathbf{R}\), that is, Q is an equivalence relation on U, then its equivalence classes are called the Q-elementary categories (concepts) of knowledge R.

If \(X \subset U\), and R is an equivalence relation on U, then X is called R-definable or R-exact if it is a union of some R-elementary categories (R-basic categories); otherwise, it is called R-rough.

Rough sets can be approximately defined by associating with any \(X \subset U\) and any equivalence relation R on U the following two sets (U / R denotes the set of all equivalence classes of R):

  • a lower approximation of X:

    $$\begin{aligned} R_LX = \bigcup \{Y \in U/R \mid Y \subset X\} \end{aligned}$$
    (1)
  • an upper approximation of X:

    $$\begin{aligned} R_UX = \bigcup \{Y \in U/R \mid Y \cap X \ne \emptyset \} \end{aligned}$$
    (2)

and a rough set is defined as the pair \((R_LX, R_UX)\).

The lower approximation yields those classes of R which are subsets of X, i.e. it contains those elements of U which are necessarily also elements of X, while the upper approximation yields those classes of R which have at least one common element with X.
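To make the definitions (1) and (2) concrete, here is a minimal Python sketch; the function names and the toy partition are ours, not part of the formalism:

```python
# Minimal sketch of the lower (1) and upper (2) approximations.
# The partition U/R is given directly as a list of equivalence classes (sets).

def lower_approx(partition, X):
    """Union of the classes Y in U/R with Y a subset of X."""
    result = set()
    for Y in partition:
        if Y <= X:
            result |= Y
    return result

def upper_approx(partition, X):
    """Union of the classes Y in U/R having an element in common with X."""
    result = set()
    for Y in partition:
        if Y & X:
            result |= Y
    return result

partition = [{1, 2}, {3, 4}, {5, 6}]   # U/R for U = {1, ..., 6}
X = {2, 3, 4}                          # R-rough: not a union of classes

print(sorted(lower_approx(partition, X)))   # → [3, 4]
print(sorted(upper_approx(partition, X)))   # → [1, 2, 3, 4]
```

As expected, X lies between its two approximations: the lower one keeps only the class {3, 4} entirely inside X, while the upper one also picks up {1, 2}, which merely intersects X.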

For our purposes two concepts related to the reduction of knowledge are crucial. First, for a family of equivalence relations R on U, one of its elements, Z, is called dispensable in R if

$$\begin{aligned} IND(\mathbf{R}) = IND(\mathbf{R} \setminus \{Z\}) \end{aligned}$$
(3)

and otherwise it is called indispensable. If each Z in R is indispensable, then R is called independent.

For a family of equivalence relations, R, and its subfamily, \(\mathbf{Q} \subset \mathbf{R}\), if:

  • \(\mathbf{Q}\) is independent, and

  • \(IND(\mathbf{Q}) = IND(\mathbf{R})\),

then \(\mathbf{Q}\) is called a reduct of R; clearly, it need not be unique.

The core of R is the set of all indispensable equivalence relations in R, and is the intersection of all reducts of R—cf. Pawlak [20].

From the point of view of knowledge reduction, the core consists of those classifications (equivalence relations) which are the most essential in the knowledge available in that no equivalence relation that belongs to the core can be discarded in the knowledge reduction process without distorting the knowledge itself. A reduct yields a set of equivalence relations which is sufficient for the characterization of knowledge available without losing anything relevant.
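The notions of dispensability (3), reducts and the core can be sketched in a few lines of Python. The representation of each equivalence relation as a labeling of the elements of U, and the reuse of the "heavy/light", "black/white" example from above, are our illustrative choices:

```python
from itertools import combinations

# Sketch of dispensability (3), reducts and the core.
# Each equivalence relation is a labeling of U's elements; IND of a family
# is the partition induced by the tuples of labels.

def ind(family, U):
    blocks = {}
    for u in U:
        blocks.setdefault(tuple(r[u] for r in family), set()).add(u)
    return frozenset(frozenset(b) for b in blocks.values())

def core_and_reducts(family, U):
    full = ind(family, U)
    # indispensable relations: removing them changes IND
    core = [r for r in family
            if ind([s for s in family if s is not r], U) != full]
    # reducts: independent subfamilies with the same IND
    reducts = []
    for k in range(1, len(family) + 1):
        for sub in map(list, combinations(family, k)):
            if ind(sub, U) == full and all(
                    ind([s for s in sub if s is not r], U) != full for r in sub):
                reducts.append(sub)
    return core, reducts

U = [1, 2, 3, 4]
R1 = {1: "heavy", 2: "heavy", 3: "light", 4: "light"}
R2 = {1: "black", 2: "white", 3: "black", 4: "white"}
R3 = {1: "h", 2: "h", 3: "l", 4: "l"}      # same partition as R1
core, reducts = core_and_reducts([R1, R2, R3], U)
print(len(core), len(reducts))             # → 1 2
```

Here R1 and R3 induce the same partition, so each is dispensable; the core is {R2}, the reducts are {R1, R2} and {R2, R3}, and the core is indeed their intersection.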

In this paper our analysis is in terms of indiscernibility relations; for the concept of a discernibility relation, cf. Yao and Zhao [36].

3 A Comparison of Voting Procedures Using Rough Sets

The problem of comparison and evaluation of voting procedures (social choice functions) is very important and has been widely studied in the literature, cf. Richelson [18], Straffin [37], Nurmi [12], to name a few.

A simple, intuitively appealing, rough set based approach, was proposed by Fedrizzi, Kacprzyk and Nurmi [19]. It was more qualitative, and was extended to include more quantitative aspects by Kacprzyk, Nurmi and Zadrożny [23]. We will now briefly recall this approach since it will provide a point of departure for this paper.

We consider the following 13 popular voting procedures:

  1. Amendment: proposals (options) are paired (compared) with the status quo. If a variation on the proposal is introduced, it is paired with the proposal and voted on as an amendment prior to the final vote. If the amendment succeeds, the amended proposal is eventually paired with the status quo in the final vote; otherwise, the amendment is eliminated prior to the final vote.

  2. Copeland: selects the option with the largest so-called Copeland score, i.e. the number of times an option beats other options minus the number of times it loses to other options, both in pairwise comparisons.

  3. Dodgson: each voter gives a rank-ordered list of all options, from the best to the worst, and the winner is the option for which the minimum number of pairwise exchanges (summed over all voters) is needed for it to become the Condorcet winner, i.e. to defeat all other options in pairwise comparisons with a majority of votes.

  4. Schwartz: selects the set of options over which the collective majority preferences are cyclic and which, as an entire cycle, is preferred over the other options; this set consists of a single element if there is a Condorcet winner, and otherwise of several options.

  5. Max-min: selects the option for which the minimal support in all pairwise comparisons is the largest.

  6. Plurality: each voter selects one option (or none, in the case of abstention), and the option(s) with the most votes win(s).

  7. Borda: each voter provides a linear ordering of the options, which are assigned a score (the so-called Borda score) as follows: if there are m options, \(m-1\) points are given to the first-ranked option, \(m-2\) to the second-ranked, etc.; these numbers are summed up for each option to obtain its Borda score, and the option(s) with the highest Borda score win(s).

  8. Approval: each voter selects a subset of the options, and the option(s) with the most votes win(s).

  9. Black: selects the Condorcet winner, i.e. an option that beats or ties with all others in pairwise comparisons, when it exists, and the Borda count winner (as described above) otherwise.

  10. Runoff: the option ranked first by more than a half of the voters is chosen, if one exists. Otherwise, the two options ranked first by more voters than any other option are compared with each other, and the winner is the one ranked first (among these two) by more voters.

  11. Nanson: the Borda count is used iteratively, at each step dropping the option with the smallest Borda score; in fact, this is sometimes called a modified version of the Nanson rule, cf. Fishburn [38].

  12. Hare: the ballots are linear orders over the set of options, and the options which receive the lowest number of first places are repeatedly deleted; the option(s) that remain(s) is/are declared the winner(s).

  13. Coombs: each voter rank-orders all of the options; if one option is ranked first by an absolute majority of the voters, it is the winner. Otherwise, the option ranked last by the largest number of voters is eliminated, and the procedure is repeated.
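As a small illustration of one of these rules, here is a minimal sketch of the Borda count (item 7); the function name and ballot format are our assumptions:

```python
from collections import Counter

# Sketch of the Borda count (item 7): each ballot is a linear order of the
# m options, best first; the i-th ranked option scores m - 1 - i points.

def borda_winners(ballots):
    m = len(ballots[0])
    score = Counter()
    for ballot in ballots:
        for i, option in enumerate(ballot):
            score[option] += m - 1 - i
    best = max(score.values())
    return sorted(o for o in score if score[o] == best), dict(score)

ballots = [["a", "b", "c"], ["a", "c", "b"], ["b", "c", "a"]]
winners, scores = borda_winners(ballots)
print(winners, scores)   # → ['a'] {'a': 4, 'b': 3, 'c': 2}
```

With three options, the first, second and third places score 2, 1 and 0 points; option "a" collects 2 + 2 + 0 = 4 points and wins.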

As to the criteria against which the above mentioned voting procedures are compared, we use some basic and popular ones presented in Nurmi's classic book [12]. More specifically, we will consider 7 criteria the voting procedures are to satisfy:

  1. A—Condorcet winner,

  2. B—Condorcet loser,

  3. C—majority winner,

  4. D—monotonicity,

  5. E—weak Pareto winner,

  6. F—consistency, and

  7. G—heritage,

the essence of which can be summarized as:

  1. Condorcet winner: if an option beats each other option in pairwise comparisons, it should always win.

  2. Condorcet loser: if an option loses to each other option in pairwise comparisons, it should always lose.

  3. Majority winner: if there exists a majority (at least a half of the voters) that ranks a single option first, higher than all other options, that option should win.

  4. Monotonicity: it should be impossible to cause a winning option to lose by ranking it higher, or to cause a losing option to win by ranking it lower.

  5. Weak Pareto winner: whenever all voters rank an option higher than another option, the latter should never be chosen.

  6. Consistency: if the electorate is divided in two and an option wins in both parts, it should also win overall.

  7. Heritage: if an option is chosen from the entire set of options using a particular voting procedure, it should also be chosen, under the same preferences and the same voting procedure, from every subset of options to which it belongs.
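To make the first of these criteria concrete, here is a minimal sketch of a Condorcet winner test on ranked ballots; the ballot format and function name are our assumptions:

```python
# Sketch of the Condorcet winner criterion: an option that beats every
# other option in pairwise comparisons by a majority of votes.
# Ballots are linear orders over the options, best first.

def condorcet_winner(ballots):
    options = ballots[0]
    n = len(ballots)
    for a in options:
        # a must be ranked above every other option o on a majority of ballots
        if all(sum(b.index(a) < b.index(o) for b in ballots) > n / 2
               for o in options if o != a):
            return a
    return None   # no Condorcet winner exists (e.g. a majority cycle)

ballots = [["a", "b", "c"], ["a", "c", "b"], ["b", "a", "c"]]
print(condorcet_winner(ballots))   # → a
```

Here "a" beats "b" on two of the three ballots and "c" on all three, so it is the Condorcet winner; with a cyclic profile the function returns None.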

We start with an illustrative account of which voting procedure satisfies which criteria (“0” stands for “does not satisfy” and “1” stands for “satisfies”), presented in Table 1; the 13 voting procedures correspond to the rows, while the 7 criteria correspond to the columns, here and in the next tables.

Table 1 Satisfaction of 7 criteria by 13 voting procedures

Though the data shown in Table 1 can immediately be used for the comparison of the 13 voting procedures against the 7 criteria by a simple pairwise comparison of rows, a natural first step is to check whether, under the information available in that table, all the 13 voting procedures are really different, and whether all the 7 criteria are really needed for a meaningful comparison.

Quite a natural, simple and intuitively appealing approach was proposed in this respect by Fedrizzi et al. [19] using rough sets. We will present below its essence.

4 Simplification of Information on the Voting Procedures and Criteria to Be Fulfilled

We will now show the essence of the approach of Fedrizzi et al. [19], based on the application of some elements of rough sets theory briefly presented in Sect. 2, to simplify the information in the source Table 1. We will basically look for the crucial properties (attributes) of the voting procedures that make it possible to merge them into one class of voting procedures under the natural condition that they satisfy the same criteria.

First, one can see that the amendment procedure and Schwartz's choice function have identical properties in Table 1, so one of them can be deleted; the same holds for Copeland's and Black's choice functions, and for the runoff, Hare's and Coombs' choice functions. We therefore obtain Table 2.

Table 2 Satisfaction of 7 criteria by 9 equivalent (classes of) voting procedures

Thus, we have 9 “really different” (classes of) voting procedures:

  1. Amendment (which now stands for Amendment and Schwartz),

  2. Copeland (which now stands for Copeland and Black),

  3. Dodgson,

  4. Max-min,

  5. Plurality,

  6. Borda,

  7. Approval,

  8. Runoff (which now stands for Runoff, Hare and Coombs), and

  9. Nanson.

Now, we look for the indispensable criteria (cf. Sect. 2). Each attribute (criterion) generates an equivalence relation: the voting procedures that fulfill the criterion belong to one class, and those which do not to another. We can thus eliminate the criteria one by one and check whether the voting procedures can still be discerned from each other in terms of the remaining criteria.
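The elimination test just described—drop a criterion (column) and check whether any two voting procedures (rows) become indistinguishable—can be sketched as follows; since Table 2 is not reproduced here, the 0/1 matrix below is purely illustrative:

```python
# Sketch of the criterion-elimination test: a criterion (column) is
# dispensable iff all rows remain pairwise distinct after it is dropped.
# The matrix below is illustrative only, NOT the actual Table 2.

table = {
    "Amendment": (1, 1, 1, 0),
    "Copeland":  (1, 1, 1, 1),
    "Plurality": (0, 0, 1, 1),
    "Borda":     (0, 1, 0, 1),
}
criteria = ["A", "B", "D", "E"]

def dispensable(table, j):
    reduced = [tuple(v for k, v in enumerate(row) if k != j)
               for row in table.values()]
    return len(set(reduced)) == len(reduced)

for j, c in enumerate(criteria):
    print(c, "dispensable" if dispensable(table, j) else "indispensable")
```

In this toy matrix only E is indispensable: without that column, the Amendment and Copeland rows coincide.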

Therefore, if we start from Table 2, by eliminating criterion A we get Table 3.

Table 3 Elimination of criterion A from Table 2
Table 4 Elimination of criterion B from Table 2
Table 5 Elimination of criterion C from Table 2

The last two rows of Table 3, i.e. Runoff and Nanson, are identical; criterion A is thus necessary to distinguish them, i.e. criterion A is indispensable.

And, analogously, we delete criterion B and obtain Table 4.

The Copeland and Max-Min procedures become indistinguishable so that criterion B is indispensable.

Next, the elimination of criterion C leads to Table 5.

All rows in Table 5 are different, so criterion C is not needed to differentiate between the voting procedures, and we can conclude that C is dispensable.

Further, we delete criterion D and obtain Table 6. The Copeland and Nanson choice functions are now indistinguishable which means that criterion D is indispensable.

Table 6 Elimination of criterion D from Table 2

Now, we eliminate criterion E and get Table 7.

Table 7 Elimination of criterion E from Table 2

The two uppermost rows are now identical, so criterion E is needed, i.e. it is indispensable.

Next, criterion F is eliminated as shown in Table 8 in which no pair of rows is identical so that criterion F is dispensable.

Table 8 Elimination of criterion F from Table 2

Finally, criterion G is eliminated which is shown in Table 9. We can see that all rows are different so that we can conclude that criterion G is dispensable.

Table 9 Elimination of criterion G from Table 2

It is easy to notice that the core, i.e. the set of indispensable criteria, is {A, B, D, E}, and that the reduct is in this case unique and also equal to {A, B, D, E}. That is, we need just this set of criteria to distinguish the particular voting procedures from each other (naturally, under the set of criteria assumed).

Table 10 Satisfaction of the criteria belonging to the core by the particular voting procedures
Table 11 An economical characterization of the voting procedures shown in Table 10

We can then consider the reduct (or core). In Table 10 we show which criteria are indispensable in the sense that if we do not take them into account, two or more rows (corresponding to the respective voting procedures) become indistinguishable. For example, without criterion E, Amendment and Copeland would be indistinguishable; without D, Copeland and Nanson would be indistinguishable; without B, Copeland and Max-min would be indistinguishable, etc.

Table 10 expresses the most crucial properties or criteria of the voting procedures in the sense that the information it conveys would be sufficient to restore all information given in the source Table 2. Therefore, for an “economical” characterization of the voting procedures, we can use the values of the criteria given in Table 10 and present the results as in Table 11 where the subscripts of the particular criteria stand for the values they take on, for instance, to most economically characterize Amendment, the A, B and D should be 1 and E should be 0, etc.

This is, however, not yet the most economical characterization, but this issue will not be dealt with here; we refer the interested reader to Fedrizzi, Kacprzyk and Nurmi [19] or Kacprzyk, Nurmi and Zadrożny [23], where it is shown that the minimal (most economical) characterization of the voting procedures in terms of the information given in Table 2 can be portrayed as in Table 12.

Table 12 The minimal (most economical) characterization of the voting procedures shown in Table 10

This very compact representation demonstrates the power of rough sets theory.

5 Similarity and Dissimilarity of the Voting Procedures: A Quantitative Approach Based on Similarity and Dissimilarity Measures for Binary Patterns

As can be seen in the previous section, the rough sets based analysis has made it possible to find a smaller set of choice functions by merging those which can be deemed similar. This rather qualitative result is clearly only the first step. Further steps towards a more quantitative analysis can be made, using elements of rough sets theory, by means of some indiscernibility analyses. This was proposed by Fedrizzi, Kacprzyk and Nurmi [19], and then extended by Kacprzyk, Nurmi and Zadrożny [23]. We will not deal with this approach here and refer the interested reader to the above mentioned papers.

In this paper we will approach the problem of measuring the similarity and dissimilarity in a more quantitative way, using some similarity and dissimilarity measures, but going beyond the classic Hamming and Jaccard–Needham measures used in Kacprzyk, Nurmi and Zadrożny [23].

We take again as the point of departure the characterization of the voting procedures as shown in Table 2, that is, just after the reduction of identical rows in Table 1, but—to better show the generality of our approach—without all further reductions (or a representation size reduction) as proposed later on and presented in Tables 3–10.

The data sets involved are in fact binary patterns, and though there is a multitude of similarity/dissimilarity measures for binary patterns, we will concentrate here on the measures given by Tubbs [24], which are useful in matching binary patterns in pattern recognition. We will follow Tubbs' notation to a large extent.

A binary vector Z of dimension N is defined as:

$$\begin{aligned} Z = (z_1, z_2, \ldots , z_N) \end{aligned}$$
(4)

where \(z_i \in \{0,1\}\), \(\forall i \in \{1,2, \ldots , N\}\).

The set of all N-dimensional binary vectors is denoted by \(\varOmega \), the unit binary vector, \(I \in \varOmega \), is a binary vector such that \(z_i = 1, \forall i\in \{1,2, \ldots , N\}\), and the complement of a binary vector \(Z \in \varOmega \) is \(\overline{Z} = I - Z\).

The magnitude of a binary vector \(Z \in \varOmega \) is

$$\begin{aligned} \mid Z \mid = \sum _{i=1}^N z_i \end{aligned}$$
(5)

that is, the number of elements which are equal to 1.

If we have two binary vectors, \(X, Y \in \varOmega \), then we denote by \(S_{i,j}(X,Y)\) the number of matches of i in vector X and j in vector Y, \(i,j \in \{0,1\}\). That is, if we have two vectors:

$$ X = [0, 1, 1, 0, 1, 0, 0, 1, 1, 0] $$
$$ Y = [1, 1, 0, 0, 1, 1, 0, 0, 1, 0] $$

then we have:

$$\begin{aligned} S_{00}(X,Y)&= 3\\ S_{01}(X,Y)&= 2\\ S_{10}(X,Y)&= 2\\ S_{11}(X,Y)&= 3 \end{aligned}$$

Formally, these counts can be defined as follows. First, for vectors \(X=(x_1,x_2,\ldots ,x_N)\) and \(Y=(y_1,y_2,\ldots ,y_N)\):

$$\begin{aligned} v_{ij} = \left\{ \begin{array}{ll} 1 &{} \text { if } x_i = y_j\\ 0 &{} \text { otherwise} \end{array} \right. \end{aligned}$$
(6)
$$\begin{aligned} v_{ij}^k(X,Y) = \left\{ \begin{array}{ll} 1 &{} \text { if } x_k = i \text{ and } y_k = j\\ 0 &{} \text { otherwise} \end{array} \right. \end{aligned}$$
(7)

then

$$\begin{aligned} S_{ij}(X,Y) = \sum _{k=1}^N v_{ij}^k(X,Y) \end{aligned}$$
(8)

One can easily notice that

$$\begin{aligned} S_{00}(X,Y) = \overline{X} \times \overline{Y}^T \end{aligned}$$
(9)
$$\begin{aligned} S_{11}(X,Y) = X \times Y^T \end{aligned}$$
(10)

where “\(\times \)” denotes the matrix product, the binary vectors being treated as \(1 \times N\) matrices.
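For the worked example above, the counts \(S_{ij}\) can be computed directly from (7)–(8); a minimal sketch:

```python
# Direct computation of the matching counts S_ij of (7)-(8) for the
# example vectors X and Y given above.

def s(i, j, X, Y):
    # number of positions k with x_k = i and y_k = j
    return sum(1 for x, y in zip(X, Y) if (x, y) == (i, j))

X = [0, 1, 1, 0, 1, 0, 0, 1, 1, 0]
Y = [1, 1, 0, 0, 1, 1, 0, 0, 1, 0]

print([s(0, 0, X, Y), s(0, 1, X, Y), s(1, 0, X, Y), s(1, 1, X, Y)])
# → [3, 2, 2, 3], matching the worked example above
```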

Following the notation of Tubbs [24], the \(S_{ij}\)’s, \(i,j \in \{0,1\}\), can be used to define many well known measures of similarity and dissimilarity, and we will consider here the following ones (we follow here the source terminology from that paper but in the literature sometimes slightly different names are used):

  • Jaccard-Needham,

  • Dice,

  • correlation,

  • Yule,

  • Russell–Rao,

  • Sokal–Michener,

  • Rogers–Tanimoto, and

  • Kulczyński.

These measures, both of similarity, \(S_{.}(X, Y)\), and their corresponding measures of dissimilarity, \(D_{.}(X,Y)\), are defined in terms of \(S_{ij}(X,Y)\) as follows (we omit the arguments \((X,Y)\), for brevity):

  • Jaccard–Needham:

    $$\begin{aligned} S_{J-N} = \frac{S_{11}}{S_{11} + S_{10} + S_{01}} \end{aligned}$$
    (11)
    $$\begin{aligned} D_{J-N} = \frac{S_{10} + S_{01}}{S_{11} + S_{10} + S_{01}} \end{aligned}$$
    (12)
  • Dice

    $$\begin{aligned} S_{D}=\frac{2S_{11}}{2S_{11} + S_{10} + S_{01}} \end{aligned}$$
    (13)
    $$\begin{aligned} D_D = \frac{S_{10} + S_{01}}{2S_{11} + S_{10} + S_{01}} \end{aligned}$$
    (14)
  • Correlation

    $$\begin{aligned} S_C = \frac{1}{\sigma } (S_{11} S_{00} - S_{10} S_{01}) \end{aligned}$$
    (15)
    $$\begin{aligned} D_C = \frac{1}{2} - \frac{1}{2 \sigma } (S_{11}S_{00} - S_{10}S_{01}) \end{aligned}$$
    (16)

    where

    $$\begin{aligned} \sigma = \sqrt{(S_{10} + S_{11})(S_{01} + S_{00})(S_{11} + S_{01})(S_{00} + S_{10})}; \end{aligned}$$
    (17)
  • Yule

    $$\begin{aligned} S_Y = \frac{S_{11}S_{00} - S_{10}S_{01}}{S_{11} S_{00} + S_{10} S_{01}} \end{aligned}$$
    (18)
    $$\begin{aligned} D_Y = \frac{S_{10} S_{01}}{S_{11} S_{00} + S_{10} S_{01}} \end{aligned}$$
    (19)
  • Russell–Rao

    $$\begin{aligned} S_{R-R} = \frac{S_{11}}{N} \end{aligned}$$
    (20)
    $$\begin{aligned} D_{R-R} = \frac{N - S_{11}}{N} \end{aligned}$$
    (21)
  • Sokal–Michener

    $$\begin{aligned} S_{S-M} = \frac{S_{11} + S_{00}}{N} \end{aligned}$$
    (22)
    $$\begin{aligned} D_{S-M} = \frac{S_{10} + S_{01}}{N} \end{aligned}$$
    (23)
  • Rogers–Tanimoto

    $$\begin{aligned} S_{R-T} = \frac{S_{11} + S_{00}}{S_{11} + S_{00} + 2 S_{10} + 2 S_{01}} \end{aligned}$$
    (24)
    $$\begin{aligned} D_{R-T} = \frac{2 S_{10} + 2 S_{01}}{S_{11} + S_{00} + 2 S_{10} + 2 S_{01}} \end{aligned}$$
    (25)
  • Kulczyński

    $$\begin{aligned} S_K = \frac{S_{11}}{S_{10} + S_{01}} \end{aligned}$$
    (26)
    $$\begin{aligned} D_K = \frac{S_{10} + S_{01} - S_{11} + N}{S_{10} + S_{01} + N} \end{aligned}$$
    (27)

Notice that though not all the similarity measures employed are normalized, their respective dissimilarity measures are all normalized to the unit interval [0, 1], which is usually welcome in applications, also in our context. On the other hand, not all the measures exhibit the metric property, but this will not be discussed in this paper as the practical importance of this property is not clear.
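A minimal sketch computing two of the measure pairs, Jaccard–Needham (11)–(12) and Sokal–Michener (22)–(23), for the example vectors above; it confirms that each similarity and its dual dissimilarity add up to 1:

```python
# Sketch of the Jaccard-Needham (11)-(12) and Sokal-Michener (22)-(23)
# measure pairs for the example vectors X and Y used above.

def s(i, j, X, Y):
    return sum(1 for x, y in zip(X, Y) if (x, y) == (i, j))

X = [0, 1, 1, 0, 1, 0, 0, 1, 1, 0]
Y = [1, 1, 0, 0, 1, 1, 0, 0, 1, 0]
N = len(X)
s00, s01, s10, s11 = s(0, 0, X, Y), s(0, 1, X, Y), s(1, 0, X, Y), s(1, 1, X, Y)

S_jn = s11 / (s11 + s10 + s01)            # Jaccard-Needham similarity (11)
D_jn = (s10 + s01) / (s11 + s10 + s01)    # ... and dissimilarity (12)
S_sm = (s11 + s00) / N                    # Sokal-Michener similarity (22)
D_sm = (s10 + s01) / N                    # ... and dissimilarity (23)

print(round(S_jn + D_jn, 10), round(S_sm + D_sm, 10))   # → 1.0 1.0
```

With \(S_{00}=3\), \(S_{01}=2\), \(S_{10}=2\), \(S_{11}=3\) this gives \(S_{J\text{-}N}=3/7\), \(D_{J\text{-}N}=4/7\) and \(S_{S\text{-}M}=0.6\), \(D_{S\text{-}M}=0.4\).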

Now, we will use these measures to evaluate the similarity and dissimilarity of the voting procedures considered in this paper.

We will use as the point of departure the binary matrix given in Table 2, which shows the satisfaction (\(=1\)) or lack of satisfaction (\(=0\)) of criteria A–G by the 9 (classes of) voting procedures; this table is repeated for convenience as Table 13.

Table 13 Satisfaction of 7 criteria by 9 equivalent (classes of) voting procedures, cf. Table 2

Now, we will calculate \(S_{ij}\), \(i,j \in \{0,1\}\), according to (6)–(8), for the particular pairs of the 9 voting procedures; the results are presented in Table 14, whose entries are given as \([S_{00}, S_{01}, S_{10}, S_{11}]\) for each pair.

Table 14 Values of \([S_{00}, S_{01}, S_{10}, S_{11}]\) for the particular pairs of the voting procedures

Following (11)–(27) and taking as the point of departure the values of \([S_{00}, S_{01}, S_{10}, S_{11}]\) shown in Table 14, we can calculate the values of the particular similarity and dissimilarity indexes calculated using the methods of:

  • Jaccard-Needham,

  • Dice,

  • Correlation,

  • Yule,

  • Russell–Rao,

  • Sokal–Michener,

  • Rogers–Tanimoto, and

  • Kulczyński,

which are shown in the consecutive Tables 15, 16, 17, 18, 19, 20, 21 and 22.

Table 15 Values of [degree of similarity, degree of dissimilarity] for the particular pairs of voting procedures using the Jaccard–Needham measures of similarity (11) and dissimilarity (12)
Table 16 Values of [degree of similarity, degree of dissimilarity] for the particular pairs of voting procedures using the Dice measures of similarity (13) and dissimilarity (14)
Table 17 Values of [degree of similarity, degree of dissimilarity] for the particular pairs of voting procedures using the correlation measures of similarity (15) and dissimilarity (16)
Table 18 Values of [degree of similarity, degree of dissimilarity] for the particular pairs of voting procedures using the Yule measures of similarity (18) and dissimilarity (19)
Table 19 Values of the [degree of similarity, degree of dissimilarity] for the particular pairs of voting procedures using the Russell–Rao measures of similarity (20) and dissimilarity (21)
Table 20 Values of [degree of similarity, degree of dissimilarity] for the particular pairs of voting procedures using the Sokal–Michener measures of similarity (22) and dissimilarity (23)
Table 21 Values of [degree of similarity, degree of dissimilarity] for the particular pairs of voting procedures using the Rogers–Tanimoto measures of similarity (24) and dissimilarity (25)
Table 22 Values of [degree of similarity, degree of dissimilarity] for the particular pairs of voting procedures using the Kulczyński measures of similarity (26) and dissimilarity (27)

The results concerning the similarity and dissimilarity of the voting procedures with respect to the 7 widely accepted criteria, obtained by using a set of popular and highly recommended similarity and dissimilarity measures for binary patterns and presented in Tables 15–22, provide much insight that can be very useful for social choice and voting theorists. They can also be of relevance for people involved in the more practical task of choosing, or even developing, a proper voting system for a particular situation. Such an analysis would, however, be too specific for the purpose of this paper, in which a new method is proposed.

To briefly summarize the results obtained, we can say that the quantitative analysis of similarity and dissimilarity via the measures employed in this section, i.e. (11)–(27), does confirm the very essence of results obtained by employing the more qualitative approach proposed in Sect. 3.

Namely, one can notice again that, not surprisingly, Copeland, Max-min, Dodgson and Nanson form a group of voting procedures of high mutual similarity and low dissimilarity. Quite closely related to this group are Runoff and Amendment. Incidentally, except for Runoff, all these procedures are Condorcet extensions, i.e. they result in the choice of the Condorcet winner whenever it exists. The so-called positional methods, that is, Plurality, Borda and Approval, seem to be rather far away from the rest of the procedures; this holds particularly for Approval. It can also be noticed that it does not matter much which particular similarity and dissimilarity measure is actually used: the values obtained can be different, but the order and proportions are maintained.

6 Concluding Remarks

We have presented a more comprehensive approach to a quantitative analysis of the similarity and dissimilarity of voting procedures. We assumed a set of well known voting procedures and criteria which they should satisfy, known from political science (cf. Nurmi's [12] book). More specifically, we have considered the amendment, Copeland, Dodgson, max-min, plurality, Borda, approval, runoff and Nanson voting procedures, and the Condorcet winner, Condorcet loser, majority winner, monotonicity, weak Pareto winner, consistency and heritage criteria. The satisfaction or dissatisfaction of the particular criteria by the particular voting procedures is represented as binary vectors. We first used rough sets to obtain a smaller number of voting procedures (9 instead of 13), following Fedrizzi, Kacprzyk and Nurmi [19], and then used the idea of Kacprzyk, Nurmi and Zadrożny [23], in which the use of some measures of similarity and dissimilarity for binary patterns was proposed and the Jaccard–Needham measures were used. In this paper we have extended that approach by additionally using the similarity and dissimilarity measures of Dice, correlation, Yule, Russell–Rao, Sokal–Michener, Rogers–Tanimoto, and Kulczyński.