1 Introduction

We investigate the effectiveness of polynomial-time data reduction for computing optimal solutions of the NP-hard Kemeny Rank Aggregation problem. Kemeny’s corresponding voting scheme can be described as follows. An election \((V,C)\) consists of a set \(V\) of \(n\) votes and a set \(C\) of \(m\) candidates. A vote or a ranking is a total order of all candidates. For instance, in case of three candidates \(a,b,c\), the order \(c>b>a\) means that candidate \(c\) is the best-liked one and candidate \(a\) is the least-liked one. For each pair of votes \(v,w\), Kendall’s tau distance (also known as “inversion distance” since it measures the number of inversions between two permutations) between \(v\) and \(w\) is defined as

$$\begin{aligned} {{\mathrm{KT-distance}}}(v,w) = \sum _{\{c,d\} \subseteq C}d_{v,w}(c,d), \end{aligned}$$

where \(d_{v,w}(c,d)\) is set to \(0\) if \(v\) and \(w\) rank \(c\) and \(d\) in the same order, and is set to \(1\), otherwise. The score of a ranking \(r\) with respect to an election \((V,C)\) is defined as \(\sum _{v \in V} {{\mathrm{KT-distance}}}(r,v)\). A ranking \(r\) with a minimum score is called a Kemeny ranking of \((V,C)\) and its score is the Kemeny score of \((V,C)\). The central problem considered in this work is as follows:
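To make these definitions concrete, here is a minimal sketch (in Python; the helper names are ours, purely illustrative): it computes the KT-distance, the score of a ranking, and, by exhaustive search over all \(m!\) rankings, a Kemeny ranking — feasible only for very small candidate sets, consistent with the problem's NP-hardness.

```python
from itertools import combinations, permutations

def kt_distance(v, w):
    """KT-distance of two votes: number of candidate pairs ranked oppositely."""
    pos_w = {c: i for i, c in enumerate(w)}
    return sum(1 for c, d in combinations(v, 2) if pos_w[c] > pos_w[d])

def score(r, votes):
    """Score of ranking r w.r.t. the election: sum of KT-distances to all votes."""
    return sum(kt_distance(r, v) for v in votes)

def kemeny_ranking(votes):
    """A Kemeny ranking, found by exhaustive search over all m! rankings."""
    return min(permutations(votes[0]), key=lambda r: score(r, votes))
```

For instance, for the three votes \(a>b>c\), \(a>b>c\), \(b>a>c\), the sketch returns the ranking \(a>b>c\) with score \(1\).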

Kemeny Rank Aggregation 

Input: An election \((V,C)\).

Task: Find a Kemeny ranking of \((V,C)\).

Its decision variant Kemeny Score asks whether there is a Kemeny ranking of \((V,C)\) with score at most some additionally given positive integer \(k\). The Kemeny Rank Aggregation problem has numerous applications, ranging from building meta-search engines for the web and detecting spam [16] via databases [17] to the construction of genetic maps in bioinformatics [22]. Kemeny rankings are also desirable in classical voting scenarios such as the determination of a president (see, for example, http://www.votefair.org) or the selection of the best-qualified candidates for job openings. This wide range of applications is due to the fact that Kemeny rankings fulfill many properties that are desirable from the social choice point of view. For example, the Kemeny rule is the unique preference function that is neutral, consistent, and satisfies the Condorcet property [40]. These three properties are defined as follows.

Neutrality: The candidate names do not influence the Kemeny ranking, that is, given a Kemeny ranking \(r\) for an election \((\{v_1,v_2,\dots ,v_n\},C)\), for every permutation \(\pi : C \rightarrow C\), \(\pi (r)\) is a Kemeny ranking for \((\{\pi (v_1), \pi (v_2), \dots , \pi (v_n)\},C)\).

Consistency: Let \((V_1,C)\) and \((V_2,C)\) be two elections with disjoint vote sets, that is, \(V_1 \cap V_2 = \emptyset \). If \(r\) is a Kemeny ranking for \((V_1,C)\) and for \((V_2,C)\), then \(r\) is also a Kemeny ranking for \((V_1 \cup V_2,C)\).

Condorcet property: If there is a candidate (Condorcet winner) who is better than every other candidate in more than half of the votes, then this candidate is also ranked first in every Kemeny ranking.

Previous work.

The first computational complexity studies of Kemeny Score go back to Bartholdi et al. [3], showing its NP-completeness. Dwork et al. [16] showed that the problem remains NP-hard even in the case of four votes—small errors in their proof have been corrected by Biedl et al. [8]. Moreover, Dwork et al. [16] demonstrated the usefulness of Kemeny Score in aggregating web search results and provided several approximation and heuristic algorithms. In earlier work, Hemaspaandra et al. [21] derived computational complexity classifications for generalized versions of Kemeny Score. Recent papers showed constant-factor approximability [1, 41] and an (impractical) PTAS [24]. Schalekamp and van Zuylen [35] provided a thorough experimental study of approximation and heuristic algorithms. Another extensive experimental study compares 104 (mostly heuristic) algorithms solving Kemeny Rank Aggregation in different situations [2]. Due to the importance of computing provably optimal solutions, there have been experimental studies in this regard as well [13, 14]: An integer linear program and a branch-and-bound approach were applied to random instances generated under a noise model. Another line of research concerns the interpretation of Kemeny rankings as maximum likelihood estimators and several generalizations thereof [19, 29, 30, 39].

From a parameterized complexity perspective [15, 20, 31], the following is known. First fixed-parameter tractability results have been shown with respect to the single parameters number of candidates, Kemeny score, maximum range of candidate positions, and average KT-distance \(d_a\) [6]. The average KT-distance

$$\begin{aligned} d_a := \sum _{v,w \in V}{{\mathrm{KT-distance}}}(v,w) / ({n (n-1)}) \end{aligned}$$

will also play a central role in this work. The range of positions of some candidate \(x\) is defined as the difference between its maximum position and its minimum position, where the position of \(x\) in a vote is defined as the number of candidates that are ranked better than \(x\). Kemeny Score remains NP-hard when the average range of candidate positions is two [6], excluding hope for fixed-parameter tractability with respect to this parameterization. Simjour [36] further introduced the parameters “Kemeny score divided by the number of votes” (which is identical to \(d_a\) up to some constant factors [36]) and “maximum KT-distance between two input votes”, likewise showing fixed-parameter tractability, and improved the running times of the fixed-parameter algorithms corresponding to the parameterizations by average KT-distance and Kemeny score. Very recently, Nishimura and Simjour [32] developed a fixed-parameter algorithm with respect to the parameter “Kemeny score divided by the number of votes” that enumerates all Kemeny rankings. Karpinski and Schudy [23] and Fernau et al. [18] devised subexponential-time fixed-parameter algorithms for the parameters Kemeny score, average KT-distance, and Kemeny score divided by the number of votes. Mahajan et al. [26] studied above-guarantee parameterization with respect to the Kemeny score. Finally, introducing the new concept of partial kernelization, it has been shown that with respect to the average KT-distance \(d_a\) one can compute in polynomial time an equivalent instance where the number of candidates is at most \(162 d_a^2 + 9d_a\) [7].
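The average KT-distance can be computed straightforwardly from its definition; the following self-contained sketch (Python, illustrative only, with our own helper names) does so by summing over all unordered vote pairs.

```python
from itertools import combinations

def average_kt_distance(votes):
    """d_a: KT-distances summed over all ordered vote pairs, divided by n(n-1)."""
    def kt(v, w):
        pos = {c: i for i, c in enumerate(w)}
        return sum(1 for c, d in combinations(v, 2) if pos[c] > pos[d])
    n = len(votes)
    # KT-distance is symmetric, so summing over unordered pairs and doubling
    # equals the sum over ordered pairs in the definition above
    return 2 * sum(kt(v, w) for v, w in combinations(votes, 2)) / (n * (n - 1))
```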

Formally, the concept of partial kernelization is defined as follows. Let \((I,k)\) be an instance of a parameterized problem \(P\), where \(I \in \Sigma ^*\) denotes the input instance and \(k\) the parameter. Let \(d: \Sigma ^* \rightarrow \mathbb {N}\) be a computable function such that \(P\) is fixed-parameter tractable with respect to \(d(I)\). Then \(P\) admits a partial kernel if there is a polynomial-time algorithm that computes an instance \((I',k')\) of \(P\) such that \((I,k)\) is a yes-instance if and only if \((I',k')\) is a yes-instance, \(k' \le g_1(k)\), and \(d(I') \le g_2(k)\) for computable functions \(g_1\) and \(g_2\). A (partial) problem kernel is called enumerative if there is a one-to-one correspondence between the solutions of the original instance and the solutions of the kernel [37].

Our contributions.

Our central focus is on mathematically analyzing and empirically evaluating polynomial-time executable data reduction algorithms for Kemeny Rank Aggregation. These rules preserve the possibility to find optimal (not only approximate) solutions on the one hand, and provide an efficient and effective preprocessing with provable performance guarantee on the other hand. We remark that a similar data reduction has been used by Cohen et al. [12] to improve the approximation quality (but not running time) of a heuristic algorithm for a closely related (more general) problem. In our setting, however, data reduction is shown to be the key to significantly improve the running time of exact solution strategies.

On the theoretical side, we improve the previous partial kernel size [7] from \(162 d_a^2 + 9d_a\) candidates to fewer than \(16 d_a / 3\) candidates. More specifically, we investigate data reduction rules exploiting majorities, ending up with an enumerative partial kernel consisting of fewer than \(16 d_a / 3\) candidates. Next, we show that in case of dropping the quest for an enumerative kernel, our previous data reduction rules are subsumed by a data reduction rule based on the well-known Extended Condorcet Criterion [38]. We note that, very recently, Simjour [37] showed a further improvement for the enumerative kernel to \(4d_a\) candidates and for the relaxed case to even \((2+\epsilon )d_a\) candidates for \(\epsilon >0\).

On the practical side, we provide empirical evidence for the usefulness of the data reduction rules associated with the above-mentioned kernelization. An essential property of the data reduction rules is that they can break instances into several subinstances to be handled independently, that is, the relative order between the candidates of two different subinstances in a Kemeny ranking is already determined. This also means that for hard instances which we could not completely solve we were still able to compute “partial rankings” of the top and bottom ranked candidates. After exhaustive application of our data reduction rules, we employ some of the known fixed-parameter algorithms [6, 11, 36] and a known integer linear program formulation [13] to solve small parts of the instances remaining after data reduction. Our experiments showed that the data reduction rules allow for significantly decreased running times; they are crucial to solving larger instances. For example, data reduction decreased the running time of the Gurobi ILP solver on a test instance from winter sports with 69 candidates from 3.9 to 0.11 s. Furthermore, an instance generated from web search with 240 candidates, which could not be solved within 70 s without data reduction, was solved in 3 s.

2 Data reduction and a linear partial problem kernel

In this section, we examine two simple properties of Kemeny rankings which directly yield data reduction rules. In Sect. 2.2, we analyze natural extensions of a majority-based property including limitations of the corresponding data reduction rules. In Sect. 2.3, we show that the data reduction from Sect. 2.2 provides a linear partial kernel for the parameter average KT-distance that preserves all Kemeny rankings. In Sect. 2.4, we analyze a known extension of the Condorcet property and show how the corresponding data reduction rules relate to the data reduction from Sect. 2.2. In Sect. 2.5, we briefly discuss the tightness of our kernelization upper bound and highlight some practical consequences of our theoretical findings. We start in Sect. 2.1 with some definitions and sketch some relevant previous results [7].

2.1 Definitions and previous results

As to the notation of rankings, we write the list of candidates separated by the \(>\)-symbol in the corresponding preference order. For example, \(a>b>c\) indicates that \(a\) is preferred to \(b\) and \(c\), and \(b\) is preferred to \(c\). When we express partial information about the preference order we use quotation marks. For example, “\(a>b\)” means that candidate \(a\) is preferred to \(b\), without giving further information about other candidates. Furthermore, let \(\#_v(a>b)\) denote the number of votes with “\(a>b\)”. A candidate pair \(\{a,b\}\) is called a conflict pair if \(\#_v(a>b)>0\) and \(\#_v(b>a)>0\).

The data reduction framework from previous work [7] introduces a “dirtiness concept” and shows that one can delete some “non-dirty candidates” by a data reduction rule leading to a partial kernel with respect to the average KT-distance. The “dirtiness” of a pair of candidates is measured by the degree of agreement of the votes for this pair. To this end, we introduce the following notation. For an election \((V,C)\), two candidates \(c,c' \in C\), and a rational number \(s \in \ ]1/2,1]\), we write

$$\begin{aligned} c \ge _{s} c', \end{aligned}$$

if at least \(\lceil s \cdot |V| \rceil \) of the votes prefer \(c\) to \(c'\), that is, \(\#_v(c>c') \ge s \cdot |V|\). A candidate pair \(\{c,c'\}\) is dirty according to the \(\ge _s\)-majority if neither \(c\ge _{s} c'\) nor \(c'\ge _{s} c\). The remaining pairs are called non-dirty according to the \(\ge _s\)-majority. This directly leads to the parameter \(n_d^s\) denoting the number of dirty pairs according to the \(\ge _s\)-majority. To simplify matters, we write “\(\ge _{3/4}\)” instead of “\(\ge _s\) with \(s = 3/4\)”. In this sense, let \(n_d^{\max }\) denote the maximum number of dirty pairs according to \(\ge _s\)-majorities with \(s \in [3/4,1]\). When the value of \(s\) is clear from the context, then we speak of “dirty pairs” and omit “according to the \(\ge _s\)-majority”. Previous work only considered \(\ge _{2/3}\)-majorities and provided a data reduction rule such that the number of candidates in a reduced instance is at most quadratic in \(n_d^{2/3}\). In this work, we provide a linear partial kernel with respect to \(n_d^s\) according to the \(\ge _s\)-majority for \(s \in [3/4,1]\) and show that this leads to a linear partial kernel with respect to \(d_a\).

We say that \(c\) and \(c'\) are ordered according to the \(\ge _s\)-majority in a ranking \(l\) if \(c>c'\) in \(l\) and \(c\ge _{s}c'\). If all candidate pairs are non-dirty with respect to the \(\ge _s\)-majority for an \(s \in \ ]2/3,1]\), then there exists a \(\ge _s\)-majority order, that is, a ranking in which all candidate pairs are ordered according to the \(\ge _s\)-majority [7]. Furthermore, the corresponding \(\ge _s\)-majority order can be found in polynomial time and is a Kemeny ranking [7]. Candidates appearing only in non-dirty pairs according to a \(\ge _s\)-majority are called non-dirty candidates and all remaining candidates are dirty candidates according to the \(\ge _s\)-majority. Note that with this definition a non-dirty pair can also be formed by two dirty candidates. See Table 1 for an overview of partial kernelization results versus polynomial-time solvability.
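The dirtiness notions above translate directly into code; the following sketch (Python; the function names are ours) classifies the candidates of an election as dirty or non-dirty for a given \(s\). For the three cyclic votes \(a>b>c\), \(b>c>a\), \(c>a>b\), every pair is dirty for \(s=3/4\) but non-dirty for \(s=2/3\).

```python
from itertools import combinations
from math import ceil

def prefers(votes, a, b):
    """#_v(a > b): number of votes ranking a before b."""
    return sum(1 for v in votes if v.index(a) < v.index(b))

def classify(votes, candidates, s):
    """Split candidates into (non-dirty, dirty) w.r.t. the >=_s-majority."""
    need = ceil(s * len(votes))  # at least ceil(s*|V|) votes must agree
    dirty_pairs = {frozenset(p) for p in combinations(candidates, 2)
                   if prefers(votes, p[0], p[1]) < need
                   and prefers(votes, p[1], p[0]) < need}
    dirty = {c for p in dirty_pairs for c in p}
    return [c for c in candidates if c not in dirty], sorted(dirty)
```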

Table 1 Partial kernelization and polynomial-time solvability

We end with some notation needed to state our data reduction rules. For a candidate subset \(C' \subseteq C\), a ranking fulfills the condition \(C'> C {\setminus } C'\) if every candidate from \(C'\) is preferred to every candidate from \(C {\setminus } C'\). A subinstance of \((V,C)\) induced by a candidate subset \(C' \subseteq C\) is given by \((V',C')\) where every vote in \(V'\) one-to-one corresponds to a vote in \(V\) keeping the relative order of the candidates from \(C'\).

2.2 Exploiting \(\ge _{3/4}\)-majorities

We generalize a well-known property of Kemeny rankings to develop data reduction rules: If there is a candidate \(a\) such that, for every other candidate, all votes agree on the relative order of \(a\) and that candidate, then in every Kemeny ranking the relative order of \(a\) and any other candidate is consistent with the votes. Formally, we have the following observation.

Observation 1

Let \(a \in C\) be a non-dirty candidate with respect to the \(\ge _{1}\)-majority and \(b \in C{\setminus } \{a\}\). If \(a \ge _{1} b\), then in every Kemeny ranking “\( a > b \)”; if \(b \ge _{1} a\), then in every Kemeny ranking “\( b > a \)”.

This observation directly implies a simple data reduction rule which may split the instance into two subinstances: one containing all votes restricted to the candidates that are preferred to \(a\) by all votes and one containing all votes restricted to the remaining candidates (except \(a\)). In the remaining part of this subsection, we show how to extend this observation with respect to \(\ge _{s}\)-majorities, that is, to cases where only a \(\ge _{s}\)-majority of the voters instead of all voters agree on the relative orderings. We investigate to which \(\ge _s\)-majorities Observation 1 extends. An overview of properties for a Kemeny ranking for different values of \(s\) is provided in Table 2.

The \(\ge _{3/4}\)-Majority Rule. It is easy to see that Observation 1 cannot hold for arbitrary \(\ge _{s}\)-majorities. For example, with \(s=1/2\) the observation would already be contradictory for the election consisting of the two votes \(a>b\) and \(b>a\). The following lemma shows that Observation 1 can be generalized for all \(\ge _{s}\)-majorities with \(s \in [3/4,1]\).

Lemma 1

Let \(a \in C\) be a non-dirty candidate with respect to the \(\ge _{s}\)-majority and \(b \in C{\setminus } \{a\}\) with \(s \in [3/4,1]\). If \(a \ge _{s} b\), then in every Kemeny ranking “\( a > b \)”; if \(b \ge _{s} a\), then in every Kemeny ranking “\( b > a \)”.

Table 2 Properties “induced” by \(\ge _s\)-majorities for different values of \(s\)

Proof

Let the partial score of a candidate subset \(C'\) be the Kemeny score of the subinstance induced by \(C'\). We consider the case \(a \ge _{3/4} b\); the case \(b \ge _{3/4} a\) and the proof for \(\ge _{s}\)-majorities with \(s > 3/4\) follow in complete analogy. The proof is by contradiction. Assume that there is a Kemeny ranking \(l\) with “\(b > D > a\)” for some \(D \subseteq C {\setminus }\{a,b\}\). Since \(a\) is non-dirty, for every candidate \(d \in D\) it must hold that either \(a \ge _{3/4} d\) or \(d \ge _{3/4} a\). Let \(D_1 := \{d \in D \mid a \ge _{3/4} d\}\) and \(D_2 := \{d \in D \mid d \ge _{3/4} a\}\). Consider the ranking \(l'\) obtained from \(l\) through replacing

$$\begin{aligned} \text {``}b >D> a\text {''}\quad \text {by}\quad \text {``}D_2 >a>b>D_1\text {''}, \end{aligned}$$

where the positions of all other candidates remain unchanged and the candidates within \(D_1\) and \(D_2\) have the same relative order as within \(D\).

We show that the score of \(l\) is greater than the score of \(l'\), contradicting that \(l\) is a Kemeny ranking. The only pairs of candidates that have different orders in \(l\) and \(l'\) and thus can contribute with different partial scores to the scores of \(l\) and \(l'\) are \(\{a,d_1\}\) and \(\{d_2,d_1\}\) for all \(d_1 \in D_1 \cup \{b\}\) and all \(d_2 \in D_2\). Consider any \(d_1 \in D_1 \cup \{b\}\) and \(d_2 \in D_2\). Since \(d_2 \ge _{3/4} a\) and \(a \ge _{3/4} d_1\), at least \(3|V|/4\) votes contain “\(d_2>a\)” and at least \(3|V|/4\) votes contain “\(a>d_1\)”; hence, the intersection of these two sets of votes must contain at least \(|V|/2\) elements, that is, there must be at least \(|V|/2\) votes with “\(d_2>a>d_1\)”. Thus, the partial score of \(\{d_2,d_1\}\) in \(l\) is at least as high as its partial score in \(l'\). The partial score of every pair \(\{a,d_1\}\) with \(d_1 \in D_1 \cup \{b\}\) in \(l'\) is strictly less than the partial score in \(l\). Since \( |D_1 \cup \{b\}| \ge 1\), the score of \(l'\) is smaller than the score of \(l\) and thus \(l\) cannot be a Kemeny ranking, a contradiction. \(\square \)

As a direct consequence of Lemma 1 we can partition the candidates of an election \((V,C)\) as follows. Let \(N:=\{n_1, \dots , n_t\}\) denote the set of non-dirty candidates with respect to the \(\ge _{3/4}\)-majority such that \(n_i \ge _{3/4} n_{i+1}\) for \(1 \le i \le t-1\). Then, let \(D_0\) be the set of dirty candidates \(d\) with \(d \ge _{3/4} n_1\), let \(D_i\) for \(1 \le i \le t-1\) be the set of dirty candidates \(d\) with \(n_i \ge _{3/4} d\) and \(d \ge _{3/4} n_{i+1}\), and let \(D_t\) be the set of dirty candidates \(d\) with \(n_t \ge _{3/4} d\); by Lemma 1, every Kemeny ranking fulfills \(D_0> n_1> D_1> n_2> \dots> n_t > D_t\).

\(\ge _{3/4}\)-Majority Rule. Let \((V,C)\) be an election and \(N\) and \(D_0, \dots , D_t\) be the sets of non-dirty and dirty candidates as specified above. Replace the original instance by the \(t+1\) subinstances induced by \(D_i\) for \(i \in \{0, \dots , t\}\).

The soundness of the \(\ge _{3/4}\)-Majority Rule follows directly from Lemma 1. It remains to discuss the running time of one application of the data reduction rule. The application of the \(\ge _{3/4}\)-Majority Rule works as follows. In a first step, compute for each pair of candidates the majority ratio, that is, for each candidate pair \(\{a,b\}\) determine how many votes prefer \(a\) to \(b\) and vice versa. This takes \(O(|V| \cdot |C|^2)\) time and a table storing this information takes \(O(|C|^2)\) space. In a second step, decide with \(O(|C|)\) table lookups for each candidate whether it is dirty or non-dirty according to the \(\ge _{3/4}\)-majority. This gives the candidate set \(N\) as specified above. A third step is to compute the candidate sets \(D_0, \dots , D_t\). For each set, we need at most \(O(|C|)\) table lookups. Finally, computing the \(t+1\) new instances takes \(O(|V| \cdot |C|^2)\) time, because one just copies the original instance, leaving out the candidates that are not in the corresponding candidate sets. Since \(t \le |C|\), the overall running time is \(O(|V|\cdot |C|^2)\).
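The steps just described can be sketched as follows (Python; a simplified illustration with names of our choosing): it computes the pairwise table, identifies and orders the non-dirty candidates, buckets the dirty candidates into \(D_0, \dots , D_t\), and builds the induced subinstances.

```python
from itertools import combinations
from math import ceil

def majority_rule(votes, s=0.75):
    """Sketch of the >=_s-Majority Rule for s in [3/4, 1]: drop the non-dirty
    candidates and split the rest into independent subinstances."""
    candidates, need = list(votes[0]), ceil(s * len(votes))
    pref = {(a, b): sum(v.index(a) < v.index(b) for v in votes)
            for a, b in combinations(candidates, 2)}
    pref.update({(b, a): len(votes) - k for (a, b), k in list(pref.items())})
    def beats(a, b):                  # a >=_s b
        return pref[(a, b)] >= need
    nondirty = [c for c in candidates
                if all(beats(c, d) or beats(d, c) for d in candidates if d != c)]
    # order N as n_1 >=_s n_2 >=_s ...: sorting by the number of beaten
    # candidates works since earlier candidates beat strictly more
    nondirty.sort(key=lambda c: sum(beats(c, d) for d in candidates if d != c),
                  reverse=True)
    dirty = [c for c in candidates if c not in nondirty]
    buckets = [[] for _ in range(len(nondirty) + 1)]
    for d in dirty:                   # D_i: beaten by exactly n_1, ..., n_i
        buckets[sum(beats(n, d) for n in nondirty)].append(d)
    # each nonempty D_i yields the subinstance induced by its candidates
    return [[tuple(c for c in v if c in D) for v in votes]
            for D in buckets if D]
```

For example, four votes in which a unanimous candidate \(m\) separates the dirty pairs \(\{a,x\}\) and \(\{b,c\}\) are split into two independent two-candidate subinstances.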

Rules for weaker majorities.

The existence of a data reduction rule analogous to the \(\ge _{3/4}\)-Majority Rule for \(\ge _s\)-majorities with \(s < 3/4\) would be desirable since such a rule might be more effective: There are instances for which a candidate is dirty according to the \(\ge _{3/4}\)-majority but non-dirty according to a \(\ge _s\)-majority with \(s < 3/4\). Hence, for many instances, the number of dirty pairs according to the \(\ge _{3/4}\)-majority is higher than according to smaller values of \(s\). In the following, we discuss why an analogous \(\ge _s\)-Majority Rule with \(s < 3/4\) cannot work. The decisive point of the \(\ge _{3/4}\)-Majority Rule is that, in a Kemeny ranking, every non-dirty candidate must be ordered according to the \(\ge _{3/4}\)-majority with respect to every other candidate.

To this end, we start by introducing the concept of the \(\ge _s\)-majority order.

Definition 1

Let \((V,C)\) be an election, \(r\) be a ranking over \(C\), and \(s \in [0,1]\). We say \(r\) is a \(\ge _s\)-majority order if the following holds:

$$\begin{aligned} \forall \{a,b\} \subseteq C: a \ge _s b \Rightarrow \text {``}a > b\text {'' in }r. \end{aligned}$$

Informally speaking, a \(\ge _s\)-majority order fully respects all \(\ge _s\)-majority relations.

For \(\ge _s\)-majorities with \(s > 2/3\), instances without dirty candidates are polynomial-time solvable [7]. More specifically, the \(\ge _s\)-majority order is a Kemeny ranking. However, a simple example shows that, for any \(s \le 2/3\), a \(\ge _s\)-majority order does not always exist.

Example 1

Consider the election consisting of the three candidates \(a,b\), and \(c\) and the three votes \(a>b>c\), \(b>c>a\), and \(c>a>b\). Here, \(a \ge _{2/3} b\), \(b \ge _{2/3} c\), and \(c \ge _{2/3} a\). No linear order fulfills all three relations.
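The example can be checked mechanically; the following sketch (Python, with our own helper names) confirms that all three \(\ge _{2/3}\)-relations hold while no total order respects them.

```python
from itertools import permutations
from math import ceil

votes = [('a', 'b', 'c'), ('b', 'c', 'a'), ('c', 'a', 'b')]
need = ceil(2 / 3 * len(votes))                      # = 2 of the 3 votes

def geq(x, y):                                       # x >=_{2/3} y
    return sum(v.index(x) < v.index(y) for v in votes) >= need

cycle = [p for p in [('a', 'b'), ('b', 'c'), ('c', 'a')] if geq(*p)]
# all three relations hold, but no ranking can respect the cycle
orders = [r for r in permutations('abc')
          if all(r.index(x) < r.index(y) for x, y in cycle)]
```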

When the \(\ge _s\)-majority order does not exist, the ordering of the non-dirty candidates is not clear. Hence, it is not even possible to compute the sets \(D_0, \dots , D_t\) as described in the \(\ge _{3/4}\)-Majority Rule. It remains to study \(s\)-values with \(2/3 < s < 3/4\). We show that in such cases a pair consisting of a dirty and a non-dirty candidate need not be ordered according to the \(\ge _s\)-majority.

Proposition 1

Consider a \(\ge _s\)-majority for any rational \(s \in \ ] 2/3, 3/4 [ \). For a non-dirty candidate \(x\) and a dirty candidate \(y\), \(x \ge _s y\) does not imply \(x>y\) in a Kemeny ranking.

Proof

Let \(s_1\) and \(s_2\) be two positive integers such that \(s = s_1/s_2\). We construct an election such that there is a non-dirty candidate \(x\) and a dirty candidate \(y\) with \(x \ge _{s} y\) but “\(y>x\)” in every Kemeny ranking. The set of candidates is \(\{x,y\} \cup A\) with \(A:=\{a_1,a_2,\dots ,a_l\}\) being the set of dummy candidates. The number \(l\) of dummy candidates directly depends on \(s\) and will be discussed later. For now, let \(l\) be a fixed integer greater than one. We have the following votes, where \(A\) stands for \(a_1>a_2>\ldots >a_l\):

$$\begin{aligned} \begin{array}{llll} \bullet &{}\,s_1 \cdot s_2 - s_1^2 &{}\quad \text {votes of type~1:}\,\, x > y > A,\\ \bullet &{}\,2s_1^2- s_1\cdot s_2 &{}\quad \text {votes of type~2:}\,\, A > x > y,\\ \bullet &{}\,s_1 \cdot s_2 - s_1^2 &{}\quad \text {votes of type~3:}\,\, y > A > x. \end{array} \end{aligned}$$

We first show that there is a positive number of votes of every type and the total number of votes is \(s_1 \cdot s_2\): The total number of votes is

$$\begin{aligned} s_1 \cdot s_2 - s_1^2 + 2s_1^2- s_1\cdot s_2 + s_1 \cdot s_2 - s_1^2 = s_1 \cdot s_2. \end{aligned}$$

Considering the number of votes of types 1 and 3, recall that \(3/4 > s_1/s_2\) and thus \(s_2 > 4/3 \cdot s_1\). Hence, for both types the number of votes is

$$\begin{aligned} s_1 \cdot s_2 - s_1^2 > s_1 \cdot (4/3 \cdot s_1 -s_1) > 0. \end{aligned}$$

Regarding votes of type 2, we use the trivial bound that \(s_1/s_2 > 1/2\) and thus their number is

$$\begin{aligned} 2s_1^2- s_1\cdot s_2 > s_1 \cdot (2s_1 - 2s_1) = 0. \end{aligned}$$

Now, we show that \(x\) is non-dirty and \(x \ge _s y\): The number of votes with “\(a>x\)” for \(a \in A\) is

$$\begin{aligned} 2s_1^2- s_1\cdot s_2+s_1 \cdot s_2 - s_1^2 = s_1^2 = s \cdot |V| \end{aligned}$$

and the number of votes with “\(x>y\)” is

$$\begin{aligned} s_1 \cdot s_2 - s_1^2 + 2s_1^2- s_1\cdot s_2 = s_1^2 = s \cdot |V|. \end{aligned}$$

Thus, \(x\) is non-dirty according to the \(\ge _s\)-majority and \(x \ge _s y\).

In the following, we show that the score of “\(y>A>x\)” is smaller than the score of every other ranking and, hence, there is no Kemeny ranking in which \(x\) and \(y\) are ordered according to the \(\ge _s\)-majority.

Having “\(a_1>a_2>\ldots >a_l\)” in every vote, we also have “\(a_1>a_2>\ldots >a_l\)” in every Kemeny ranking (see, e.g., [6]). Distinguishing three cases, we first show that in every Kemeny ranking “\(a_i > x\)” if and only if “\(a_j> x\)”, and “\(a_i> y\)” if and only if “\(a_j > y\)” for every \(i,j \in \{1,\dots , l\}\). After this, we can treat \(A\) like one candidate of “weight” \(l\). Thus, there remain only six rankings for which the score has to be investigated in order to show that “\(y>A>x\)” is the only ranking with minimum score.

  1. Case 1:

    Consider the ranking \(\ldots >a_i>x>a_j>\ldots \), that is, candidate \(a_i\) is directly followed by candidate \(x\) and \(x\) is directly followed by candidate \(a_j\). Such a ranking cannot have minimum score: swapping \(x\) and \(a_j\) leads to a ranking with smaller score, since “\(a_j > x\)” in at least \(s \cdot |V| > 2/3 \cdot |V|\) votes.

  2. Case 2:

    Consider the ranking \(\ldots >a_i>y>a_j>\ldots \), that is, candidate \(a_i\) is directly followed by candidate \(y\) and \(y\) is directly followed by candidate \(a_j\). This ranking cannot have minimum score since swapping \(a_i\) and \(y\) leads to a ranking with smaller score. This can be seen as follows. Since \(s_1 < 3/4 \cdot s_2\), the number of votes with “\(y> a_i\)” is

    $$\begin{aligned} 2s_1s_2-2s_1^2> 2s_1 (s_2- 3/4 \cdot s_2)= 1/2 \cdot s_1s_2 = |V|/2. \end{aligned}$$
  3. Case 3:

    This case consists of ranking \(\ldots >a_i>x>y>a_j>\ldots \) and the same ranking with \(x\) and \(y\) swapped. The latter ranking would clearly have a larger score than the former so it suffices to analyze the former. We show that “\(a_i> a_j > x > y\)” has a smaller score than “\(a_i> x> y> a_j\)”. The only pairs that change the score are \(\{a_j,y\}\) and \(\{a_j,x\}\), contributing with

    $$\begin{aligned} \#_v(a_j>y) + \#_v(a_j > x) = 2s_1^2-s_1s_2 + 2s_1^2-s_1s_2+s_1s_2-s_1^2 = 3s_1^2 -s_1s_2, \end{aligned}$$

    to the old score and with \(2|V| - \#_v(a_j>y) - \#_v(a_j > x)\) to the “new” score. Hence, it remains to show that the difference between the old and the new score is positive, that is,

    $$\begin{aligned} 3s_1^2 -s_1s_2 -2s_1s_2 +3s_1^2 -s_1s_2= 6s_1^2 - 4s_1s_2 > 6 \cdot 2/3 \cdot s_1s_2 - 4s_1s_2 =0. \end{aligned}$$

Finally, consider the scores of all possible remaining six ranking types \(r_1, \dots , r_6\):

$$\begin{aligned} r_1: A>x>y \qquad \quad r_3: x>A>y \qquad \quad r_5:y>A>x,\\ r_2: A>y>x \qquad \quad r_4: x>y>A \qquad \quad r_6: y> x>A. \end{aligned}$$

Let \(t(r)\) denote the score of a ranking \(r\). It is easy to verify that \(t(r_1)< t(r_2)\), \(t(r_1) < t(r_3)\), and \(t(r_4) < t(r_6)\). Hence, to show that \(r_5\) has the smallest score, it remains to compare the score of \(r_5\) with the scores of \(r_1\) and \(r_4\). The following computations show that \(r_5\) has a smaller score than \(r_1\) and \(r_4\); since \(A\) represents \(l\) candidates, we count the corresponding pairs \(l\) times.

$$\begin{aligned} t(r_4)- t(r_5)&=~ \#_v(y>x) + l \cdot \#_v(A>x) +l \cdot \#_v(A>y)\\&\quad ~-l \cdot \#_v(A>y) -l \cdot \#_v(x>A) - \#_v(x>y), \\&=~ \#_v(y>x) + l \cdot \#_v(A>x) -l \cdot \#_v(x>A) - \#_v(x>y), \\&=~ s_1s_2-s_1^2 + l s_1^2 - l s_1s_2 + l s_1^2 -s_1^2, \\&=~ 2(l-1)s_1^2-(l-1)s_1s_2. \end{aligned}$$

In the following, we show that this term is positive. Note that \(s_1/s_2>2/3 \Leftrightarrow s_1>2/3 \cdot s_2\) and \(l>1\).

$$\begin{aligned}\begin{array}{ll} 0<2(l-1)s_1^2-(l-1)s_1s_2 &{}\Leftrightarrow \\ 0<2s_1^2-s_1s_2&{} \Leftarrow \\ 0<4/3 \cdot s_1s_2 - s_1s_2&{}\Leftrightarrow \\ 0 <1/3 \cdot s_1s_2. \end{array} \end{aligned}$$

Thus, \(t(r_5) < t(r_4)\) and, therefore, \(t(r_5)<t(r_6)\).

$$\begin{aligned} t(r_1) - t(r_5)&= ~l \cdot \#_v(x>A) + l \cdot \#_v(y>A) + \#_v(y>x) -l \cdot \#_v(A>y) -l \\&\cdot \#_v(x>A)- \#_v(x>y),\\&= ls_1s_2-ls_1^2 + 2ls_1s_2-2ls_1^2 + s_1s_2-s_1^2 - 2ls_1^2+ls_1s_2 \!- ls_1s_2 \!+ls_1^2 -s_1^2,\\&= (3l+1)s_1s_2 - (4l+2)s_1^2. \end{aligned}$$

Again, we show that this term is positive. Note that \(s_1/s_2<3/4\). Hence, there is a fixed and positive number \(\epsilon \) such that \(s_2/s_1=4/3+\epsilon \), that is, \(s_2=(4/3+\epsilon )s_1\).

$$\begin{aligned}\begin{array}{ll} 0<(3l+1)s_1s_2 - (4l+2)s_1^2 &{}\Leftrightarrow \\ 0<(4/3+\epsilon )(3l+1)s_1^2 - (4l+2)s_1^2 &{}\Leftrightarrow \\ 0<4l + 3l\epsilon + 4/3 + \epsilon - 4l - 2&{} \Leftrightarrow \\ 2<3l\epsilon + 4/3 + \epsilon .&{} \end{array} \end{aligned}$$

For \(l>1/\epsilon \), one has \(t(r_5)<t(r_1)\) and, hence, \(t(r_5)<t(r_2)\) and \(t(r_5)<t(r_3)\).

As the above terms show, it suffices to use a number \(l\) of dummy candidates that depends only on \(s\). Our counterexample works for

$$\begin{aligned} l > \frac{1}{\epsilon } = \frac{1}{\frac{1}{s} - \frac{4}{3}} = \frac{3s}{3-4s}. \end{aligned}$$

Altogether, we showed that \(r_5\) is the only Kemeny ranking. Thus, there is an election with \(x \ge _s y\) for every \(s \in \ ] 2/3, 3/4 [ \) such that every Kemeny ranking fulfills “\(y>x\)”. \(\square \)
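The construction can be instantiated and checked numerically. The following sketch (Python; the concrete parameter choices are ours) uses \(s = 5/7 \in \ ]2/3,3/4[\), hence \(\epsilon = 1/15\) and \(l = 16\) dummy candidates, and compares the six ranking types from the proof.

```python
from itertools import combinations

# s = s1/s2 = 5/7 lies in ]2/3, 3/4[; epsilon = 1/s - 4/3 = 1/15, so any
# l > 15 dummy candidates works; we take l = 16
s1, s2, l = 5, 7, 16
A = [f'a{i}' for i in range(l)]
votes  = (s1 * s2 - s1 * s1) * [tuple(['x', 'y'] + A)]      # type 1: x > y > A
votes += (2 * s1 * s1 - s1 * s2) * [tuple(A + ['x', 'y'])]  # type 2: A > x > y
votes += (s1 * s2 - s1 * s1) * [tuple(['y'] + A + ['x'])]   # type 3: y > A > x

def score(r, votes):
    """Kemeny score of ranking r: disagreeing pairs summed over all votes."""
    pos = {c: i for i, c in enumerate(r)}
    return sum(sum(1 for c, d in combinations(v, 2) if pos[c] > pos[d])
               for v in votes)

rankings = {name: tuple(r) for name, r in [
    ('r1', A + ['x', 'y']), ('r2', A + ['y', 'x']), ('r3', ['x'] + A + ['y']),
    ('r4', ['x', 'y'] + A), ('r5', ['y'] + A + ['x']), ('r6', ['y', 'x'] + A)]}
scores = {name: score(r, votes) for name, r in rankings.items()}
```

Consistent with the proof, \(r_5\) attains the strictly smallest of the six scores although \(x \ge _s y\) holds in exactly \(s \cdot |V| = 25\) of the \(35\) votes.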

2.3 A linear partial kernel preserving all Kemeny rankings

Lemma 1 shows that the \(\ge _{3/4}\)-Majority Rule preserves the possibility to find all optimal solutions. After exhaustive application (at most \(|C|\) times) of the \(\ge _{3/4}\)-Majority Rule the remaining instances contain only dirty candidates. Making use of a simple relation between the number of dirty candidates and the average KT-distance as also used previously [7], we achieve the following.

Theorem 1

Let \(d_a\) be the average KT-distance and \(n_d^s\) be the number of dirty pairs according to the \(\ge _s\)-majority with \(s \in [3/4,1]\). Kemeny Rank Aggregation admits a partial problem kernel with less than \(\min \{ 16 d_a / 3, 2 n_d^s \}\) candidates. Moreover, every Kemeny ranking of the original instance contains a Kemeny ranking of the reduced instance and every Kemeny ranking of the reduced instance can be extended in polynomial time to a Kemeny ranking of the original instance.

Proof

The partial kernel derives from applying the \(\ge _{3/4}\)-Majority Rule, which is (easily) adapted to work for the decision problem: An instance is reduced by deleting all candidates from \(N\), reordering every vote such that \(D_0>D_1> \dots >D_t\) where, inside each \(D_i\), \(0 \le i\le t\), the order of the candidates remains unchanged, and decreasing the Kemeny score appropriately. After having applied the adapted reduction rule, an instance containing only dirty candidates remains. Let their total number be \(i\). Since every dirty candidate must be involved in at least one candidate pair that is dirty according to the \(\ge _{3/4}\)-majority, there must be at least \(i/2\) candidate pairs that contribute to the average KT-distance. To analyze the contribution of such a pair, we take a look at the definition of the average KT-distance.

$$\begin{aligned} d_a&= \sum _{v,w \in V}{{\mathrm{KT-distance}}}(v,w) / ({|V| (|V|-1)}),\\&= \sum _{\{v,w\} \subseteq V}{{\mathrm{KT-distance}}}(v,w) \cdot 2 / ({|V| (|V|-1)}),\\&= \sum _{\{v,w\} \subseteq V} \sum _{\{c,d\} \subseteq C} d_{v,w}(c,d) \cdot 2 / ({|V| (|V|-1)}),\\&= \sum _{\{c,d\} \subseteq C} \sum _{\{v,w\} \subseteq V} d_{v,w}(c,d) \cdot 2 / ({|V| (|V|-1)}). \end{aligned}$$

That is, each candidate pair contributes to the average KT-distance with

$$\begin{aligned} \sum _{\{v,w\} \subseteq V} d_{v,w}(c,d), \end{aligned}$$
(1)

normalized by the factor \(2 / ({|V| (|V|-1)})\). The sum in Eq. (1) equals the number of votes that prefer \(c\) to \(d\) times the number of votes that prefer \(d\) to \(c\). In general, this value ranges from \(0\) (all votes agree) to \((|V|/2) \cdot (|V|/2)\) (as many votes prefer \(c\) to \(d\) as prefer \(d\) to \(c\)). However, for our considered candidate pairs the contribution is at least \((|V|/4) \cdot (3|V|/4)\), since they are not ordered according to the \(\ge _{3/4}\)-majority. By definition of the average KT-distance, it follows that

$$\begin{aligned} d_a \ge \frac{i}{2} \cdot \frac{2}{|V|(|V|-1)} \cdot \frac{|V|}{4} \cdot \frac{3|V|}{4} > \frac{3}{16} \cdot i \Rightarrow 16/3 \cdot d_a > i. \end{aligned}$$

As a consequence, the reduced instance has less than \(16 d_a /3\) candidates. The upper bound \(2n_d^s\) for the number of remaining (dirty) candidates is trivial.

The relation between Kemeny rankings of the original and of the reduced instance follows with Lemma 1 and the polynomial-time computability of the data reduction. \(\square \)
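The quantities used in this proof are easy to compute directly. The following minimal Python sketch (our illustration with our own function names, not the authors’ C++ implementation) computes the KT-distance, the average KT-distance \(d_a\), and the set of dirty candidates with respect to an \(s\)-majority:

```python
from itertools import combinations

def kt_distance(v, w):
    """Kendall's tau distance: the number of candidate pairs ordered differently."""
    pos_v = {c: i for i, c in enumerate(v)}
    pos_w = {c: i for i, c in enumerate(w)}
    return sum((pos_v[c] < pos_v[d]) != (pos_w[c] < pos_w[d])
               for c, d in combinations(v, 2))

def average_kt_distance(votes):
    """d_a: sum of KT-distances over ordered vote pairs, divided by |V|(|V|-1)."""
    n = len(votes)
    total = sum(kt_distance(v, w) for v, w in combinations(votes, 2))
    return 2 * total / (n * (n - 1))

def dirty_candidates(votes, s=0.75):
    """Candidates involved in a pair that no s-fraction of the votes orders alike."""
    n = len(votes)
    positions = [{c: i for i, c in enumerate(v)} for v in votes]
    dirty = set()
    for c, d in combinations(votes[0], 2):
        prefer_c = sum(p[c] < p[d] for p in positions)
        if prefer_c < s * n and n - prefer_c < s * n:
            dirty.update((c, d))
    return dirty
```

For the election of Example 2 below (votes \(a>x>b\), \(a>x>b\), \(b>a>x\), \(x>b>a\)), `dirty_candidates` reports \(a\) and \(b\) as dirty and \(x\) as non-dirty.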

2.4 Exploiting the Condorcet property

In this subsection, we analyze a well-known data reduction rule of practical relevance [12]. We show that it reduces an instance at least as much as the \(\ge _{3/4}\)-Majority Rule, but that it does not preserve all Kemeny rankings. The reduction rule is based on the following lemma, which is known as the Extended Condorcet Criterion [38]. We give a short proof for the sake of completeness.

Lemma 2

Let \(C' \subseteq C\) be a candidate subset such that, for every \(c' \in C'\) and every \(c \in C {\setminus } C'\), at least half of the votes prefer \(c'\) to \(c\). Then there is a Kemeny ranking fulfilling \(C'> C{\setminus } C'\).

Proof

If there is a Kemeny ranking \(l\) not fulfilling \(C' > C{\setminus } C'\), then replace it by the ranking \(l'\) with \(C' > C{\setminus } C'\) in which the relative order within \(C'\) and within \(C{\setminus } C'\) remains unchanged. Comparing the scores of both rankings, they can only differ for pairs of candidates \(c',c''\) with \(c' \in C'\) and \(c'' \in C {\setminus } C'\). However, every such pair contributes at most as many points to the score of \(l'\) as to the score of \(l\). Thus, \(l'\) must also be a Kemeny ranking. \(\square \)
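The exchange argument can be checked mechanically: move the candidates of \(C'\) to the front while keeping both internal orders, and the score cannot increase. A toy Python sketch (our illustration; the weak-majority premise of the lemma must hold for the chosen \(C'\)):

```python
from itertools import combinations

def kendall_tau(v, w):
    """Number of candidate pairs that v and w order differently."""
    pv = {c: i for i, c in enumerate(v)}
    pw = {c: i for i, c in enumerate(w)}
    return sum((pv[c] < pv[d]) != (pw[c] < pw[d]) for c, d in combinations(v, 2))

def score(r, votes):
    """Kemeny score of ranking r with respect to the votes."""
    return sum(kendall_tau(r, v) for v in votes)

def pull_front(ranking, front):
    """The exchange of Lemma 2: all candidates of `front` precede the rest,
    keeping the relative order inside both groups."""
    return tuple(c for c in ranking if c in front) + \
           tuple(c for c in ranking if c not in front)
```

For instance, with the three votes \(a>b>c\), \(a>c>b\), \(c>a>b\), candidate \(a\) is weakly preferred to every other candidate, and pulling \(C'=\{a\}\) to the front of any ranking never increases the score.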

A straightforward implementation of Lemma 2 already leads to a data reduction rule running in \(O(|V| \cdot |C|^3)\) time. However, this rule may be applicable several times. The following data reduction rule runs in \(O(|V| \cdot |C|^2)\) time and computes (in one application) the same output as the exhaustive application of the straightforward rule. To formulate the rule, we need the concept of a majority graph.

Definition 2

Let \((V,C)\) be an election. The strict (weak) majority graph of \((V,C)\) is a directed graph with \(C\) being the set of vertices that contains an arc from \(x\) to \(y\) if and only if “\(x>y\)” in more than half (at least half) of the votes.

Extended Condorcet Rule

Let \((V,C)\) be an election and let \(C_1, \dots , C_t \subseteq C\) be the vertex sets forming the topologically ordered strongly connected components in the strict majority graph of \((V,C)\). Replace the original instance by the subinstances induced by \(C_i\) with \(|C_i| \ge 2, i \in \{1, \dots , t\}\).

Constructing the strict majority graph takes \(O(|V| \cdot |C|^2)\) time. Computing the strongly connected components (in topological ordering) takes \(O(|C|^2)\) time. Computing the subinstances takes \(O(|V| \cdot |C|^2)\) time. Hence, the total running time of the Extended Condorcet Rule is \(O(|V| \cdot |C|^2)\). Alternatively, one can find the set \(C_1,\dots ,C_t\) as defined in the Extended Condorcet Rule by recursively applying Lemma 2. Formally, the Extended Condorcet Rule is sound due to the following lemma (in combination with Lemma 2).
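The rule composes two textbook steps; the following Python sketch (our illustration, not the authors’ C++ implementation) builds the strict majority graph and returns its strongly connected components via Kosaraju’s algorithm, whose second pass conveniently emits the components in topological order of the condensation:

```python
from itertools import permutations

def strict_majority_graph(votes):
    """Arc x -> y iff more than half of the votes rank x before y."""
    pos = [{c: i for i, c in enumerate(v)} for v in votes]
    arcs = {c: set() for c in votes[0]}
    for x, y in permutations(votes[0], 2):
        if 2 * sum(p[x] < p[y] for p in pos) > len(votes):
            arcs[x].add(y)
    return arcs

def sccs_topological(arcs):
    """Strongly connected components of `arcs` in topological order (Kosaraju)."""
    order, seen = [], set()
    def dfs(u):
        seen.add(u)
        for v in arcs[u]:
            if v not in seen:
                dfs(v)
        order.append(u)  # post-order: u finishes after all its descendants
    for u in arcs:
        if u not in seen:
            dfs(u)
    reverse = {u: set() for u in arcs}
    for u in arcs:
        for v in arcs[u]:
            reverse[v].add(u)
    components, assigned = [], set()
    for u in reversed(order):  # process vertices by decreasing finish time
        if u in assigned:
            continue
        component, stack = set(), [u]
        while stack:
            w = stack.pop()
            if w not in assigned:
                assigned.add(w)
                component.add(w)
                stack.extend(reverse[w] - assigned)
        components.append(component)
    return components
```

For the election of Example 2 below this yields the singleton components \(\{a\}, \{x\}, \{b\}\) in this order, whereas an election forming a Condorcet cycle yields one big component and is not reduced.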

Lemma 3

Let \((V,C)\) be an election and let \(C_1, \dots , C_t \subseteq C\) be the candidate sets as computed by the Extended Condorcet Rule. Then, at least half of the votes prefer \(c_i\) to \(c_j\) for every \(c_i \in C_i\) and every \(c_j \in C_j\) with \(1 \le i<j \le t\). Given a Kemeny ranking for each subinstance induced by \(C_i\), \(i \in \{1,\dots ,t\}\), one can compute a Kemeny ranking of \((V,C)\) in \(O(|V| \cdot |C|^2)\) time.

Proof

The sets \(C_1, \dots , C_t \subseteq C\) represent the strongly connected components of the strict majority graph in topological ordering, that is, by definition of the topological ordering there is no arc from some vertex in \(C_j\) to some vertex in \(C_i\) for \(1 \le i<j \le t\). By the definition of the strict majority graph, it follows that at least half of the votes prefer \(c_i\) to \(c_j\) for every \(c_i \in C_i\) and every \(c_j \in C_j\) with \(1 \le i<j \le t\). Hence, by iteratively applying Lemma 2, one obtains a Kemeny ranking for the original instance by concatenating the Kemeny rankings of the subinstances induced by the vertex sets \(C_1, \dots , C_t\). (This includes concatenating candidates represented by components consisting of a single vertex.) \(\square \)

Remarkably, already in 1999 Cohen et al. [12] basically used what we now call the Extended Condorcet Rule for improving the solution quality (but not running time) of a greedy algorithm providing approximate results.

Although not explicitly stated, the Extended Condorcet Rule or a similar rule has been used by Schalekamp and van Zuylen [35] as preprocessing. To the best of our knowledge, in the context of exact algorithms, neither its theoretically provable performance guarantees nor its practical effectiveness have been analyzed.

Note that Brandt et al. [10] used a similar preprocessing procedure for computing so-called tournament solutions. Similarly to the components in the Extended Condorcet Rule , they compute “dominating sets”, that is, subsets of alternatives such that all alternatives from this set have the same relationship to all alternatives not in the set.

The following proposition shows that the Extended Condorcet Rule is at least as powerful as the \(\ge _{3/4}\)-Majority Rule, implying that the Extended Condorcet Rule provides a partial kernel with less than \(16 d_a / 3\) candidates.

Proposition 2

An instance to which the Extended Condorcet Rule has been exhaustively applied cannot be further reduced by the \(\ge _{3/4}\)-Majority Rule.

Proof

The proof is by contradiction. Assume that there is an instance reduced by the Extended Condorcet Rule to which the \(\ge _{3/4}\)-Majority Rule successfully applies. Then, there must be a non-dirty candidate \(x\) and a subset \(C' \subseteq C\) such that at least three quarters of the votes prefer \(c'\) to \(x\) for every \(c' \in C'\) and at least three quarters of the votes prefer \(x\) to \(c''\) for every \(c'' \in C''\), where \(C'':=C {\setminus } (C' \cup \{x\})\). Clearly, in the strict majority graph there is no arc from \(x\) to a vertex in \(C'\) and no arc from a vertex in \(C''\) to \(x\).

We can assume that \(C'\) and \(C''\) are nonempty since otherwise the Extended Condorcet Rule would obviously reduce this instance, as \(x\) forms a strongly connected component in the strict majority graph. Since at least \(3|V|/4\) votes prefer \(c'\) to \(x\) and at least \(3|V|/4\) votes prefer \(x\) to \(c''\), the intersection of these two vote sets must contain at least \(|V|/2\) votes, that is, there must be at least \(|V|/2\) votes with “\(c'>\dots >x>\dots >c''\)” for every \(c' \in C'\) and \(c'' \in C''\). Hence, in the strict majority graph of the instance there is no arc from a vertex in \(C''\) to a vertex in \(C'\). Thus, \(x\) forms a strongly connected component in the strict majority graph, a contradiction to the fact that the Extended Condorcet Rule was exhaustively applied.\(\square \)

Note that the proof of Proposition 2 works analogously if one extends the \(\ge _{3/4}\)-Majority Rule to search for “subsets of non-dirty candidates”, that is, to search for a subset \(C' \subseteq C\) such that at least three quarters of the votes agree on the relative ordering of every two candidates \(c' \in C'\) and \(c'' \in (C {\setminus } C')\). That is, the Extended Condorcet Rule also subsumes this extended data reduction rule, which clearly reduces some instances that are not reducible by the \(\ge _{3/4}\)-Majority Rule.

Finally, we remark that the partial kernel obtained by the Extended Condorcet Rule as defined above does not preserve the possibility to compute all Kemeny rankings from the kernelized instance. Consider the simple election consisting of the vote \(a>b\) and the vote \(b>a\). When computing the partial kernel, the Extended Condorcet Rule has to fix some topological ordering. Moreover, candidates represented by strongly connected components of size one have to be removed in order to obtain the kernel size upper bound (see Theorem 1). Hence, the partial kernel obtained by the Extended Condorcet Rule cannot distinguish between the simple election above and the election consisting of the two votes \(a>b\) and \(a>b\), although the former has two Kemeny rankings and the latter has only one. Of course, by providing the kernelized instance with additional information one may find all Kemeny rankings, but the partial problem kernel itself (which must be an instance of the original problem) provided by the Extended Condorcet Rule does not preserve all Kemeny rankings. Note that this observation is not only of theoretical interest, since a (partial) problem kernel has the nice property that one can apply every algorithm that solves the original problem to the (partial) problem kernel. If one needs additional information to find all Kemeny rankings, then such an algorithm must be able to handle this additional information.

Replacing the strict majority graph by the weak majority graph in the Extended Condorcet Rule, one obtains a slightly modified reduction rule. Following an argument analogous to Lemmas 2 and 3, it is easy to verify that this modified Extended Condorcet Rule preserves all solutions. The following example shows that the \(\ge _{3/4}\)-Majority Rule reduces instances that are not reduced by this modified rule.

Example 2

Consider the election consisting of the three candidates \(a\), \(b\), and \(x\) and the following votes:

      \(a>x>b\),      \(a>x>b\),      \(b>a>x\),      \(x>b>a\).

The \(\ge _{3/4}\)-Majority Rule identifies and removes the non-dirty candidate \(x\) (preserving the only Kemeny ranking \(a>x>b\)). The modified Extended Condorcet Rule constructs the weak majority graph with one single strongly connected component consisting of \(a\), \(b\), and \(x\) and, hence, does not reduce anything.

Example 2 shows that Proposition 2 does not extend to this case, and it is unclear whether the upper bound for the size of the partial kernel preserving all Kemeny rankings transfers to this modified rule. A deeper analysis of kernels preserving all Kemeny rankings, so-called enumerative kernels, is a topic of independent interest [32, 37].

2.5 Tightness of the kernel bound and practical consequences

Tightness of the kernel bound.

As the main theoretical contribution of this section, we showed that Kemeny Rank Aggregation admits a partial problem kernel where the number of candidates is linear in the average KT-distance, even if we require that the partial kernel preserve all Kemeny rankings for a given election. The upper bound of \(16 d_a / 3\) for the number of candidates in the reduced instance, where \(d_a\) denotes the average KT-distance, already holds when applying the “weaker” \(\ge _{3/4}\)-Majority Rule. As the following example shows, this bound is tight for the \(\ge _{3/4}\)-Majority Rule.

Example 3

Consider an election with the candidates \(a\) and \(b\) such that \(\lceil 3|V|/4 \rceil - 1\) votes have “\(a>b\)” and the remaining votes have “\(b>a\)”. The \(\ge _{3/4}\)-Majority Rule does not reduce this instance. The average KT-distance of this election is asymptotically \(3/8\).

The number of candidates in this election is two, which meets the upper bound of \(16 d_a / 3\) for the partial kernel.

Proposition 2 shows that the Extended Condorcet Rule may yield a better upper bound for the size of the partial problem kernel since it may reduce instances that cannot be reduced by the \(\ge _{3/4}\)-Majority Rule. The following example shows that this bound cannot be smaller than \(2 d_a\).

Example 4

Consider an election with the candidates \(a\), \(b\), \(c\), and \(d\) and the following votes:

\(x\) votes: \(a>b>c>d\),

\(x\) votes: \(c>d>a>b\),

1 vote: \(b>c>d>a\).

We consider the contribution of each candidate pair to the average KT-distance. The pair \(\{a,b\}\) contributes \(2x\), the pair \(\{c,d\}\) zero, and every other pair \(x(x+1)\). Hence, for an arbitrarily large number \(x\) the average KT-distance of this election is asymptotically

$$\begin{aligned} \lim _{x\rightarrow \infty }(2x + 4x(x+1)) / ((2x+1)\cdot 2x/2) =\lim _{x\rightarrow \infty }\frac{4}{2x + 1}+2=2. \end{aligned}$$
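These contributions can also be checked numerically. The following Python sketch (ours, purely illustrative) evaluates the average KT-distance of this election for a concrete \(x\) and matches the closed form \(2 + 4/(2x+1)\) from the limit above:

```python
from itertools import combinations

def avg_kt(votes):
    """Average KT-distance d_a of an election given as a list of rankings."""
    def kt(v, w):
        pv = {c: i for i, c in enumerate(v)}
        pw = {c: i for i, c in enumerate(w)}
        return sum((pv[c] < pv[d]) != (pw[c] < pw[d]) for c, d in combinations(v, 2))
    n = len(votes)
    return 2 * sum(kt(v, w) for v, w in combinations(votes, 2)) / (n * (n - 1))

x = 200  # an arbitrary concrete choice for the parameter x of Example 4
votes = [('a','b','c','d')] * x + [('c','d','a','b')] * x + [('b','c','d','a')]
```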

Simjour’s recent PhD thesis [37] shows that slightly modifying the Extended Condorcet Rule yields an upper bound arbitrarily close to \(2 d_a\).

Practical consequences.

From a practical point of view, our kernelization results show that applying a relatively simple data reduction rule yields reduced instances in which the number of candidates is small whenever the average Kendall’s tau distance is small. This gives a theoretical explanation of why this kind of preprocessing is very effective for elections where the votes do not “differ too much on average”.

If one is only interested in finding one Kemeny ranking instead of all Kemeny rankings, the Extended Condorcet Rule subsumes a number of data reduction rules, including

  • a rule that detects and removes Condorcet winners (resp. Condorcet losers),

  • the \(\ge _{3/4}\)-Majority Rule, and

  • an extension of the \(\ge _{3/4}\)-Majority Rule to non-dirty candidate sets as discussed above.

It seems reasonable to assume that any preprocessing for Kemeny Rank Aggregation should at least handle Condorcet winners and Condorcet losers. Since the worst-case time complexity of \(O(|V| \cdot |C|^2)\) for finding Condorcet winners is the same as for applying the Extended Condorcet Rule, we suggest solely implementing the Extended Condorcet Rule for the scenario of finding one Kemeny ranking.

Since our practical focus is computing one optimal Kemeny ranking, in the next section, we analyze the effectiveness and the efficiency of the Extended Condorcet Rule in practice.

3 Experimental work

In this section, we first provide a detailed overview of the data sets, implementations, and test series used in our experiments. Then, we summarize interesting specific results for each data set. Finally, we discuss the suitability of our approach for exactly computing a Kemeny ranking with respect to the specific data sets. Further tables with our experimental results can be found in the Supplementary Material.

All experiments were carried out on a standard PC with a 3.4 GHz CPU (Intel(R) Core(TM) i3-2130) and 4 GB RAM running Debian 6.0 (64 bit) Linux. Source code and test data are freely available under the GPL Version 3 license at http://www.akt.tu-berlin.de/menue/software/.

3.1 Data sets

Our test data consist of four sets of real-world data and two sets of synthetic data. We used standard applications in sports competitions and web search to obtain real-world rankings and two statistical models for random permutations to generate synthetic rankings.

3.1.1 Winter sports

Given rankings in sports over several seasons, one may ask for an overall ranking. This can easily be interpreted as an election with individuals or teams as candidates and the seasonal rankings as votes. We generated one such election for ski jumping and one for ski cross. To this end, we considered the world cup rankings from the seasons 2005/2006 to 2008/2009, ignoring candidates not appearing in all four rankings. We ended up with an election consisting of four complete votes ranking 33 candidates for ski jumping and with an election consisting of four complete votes ranking 69 candidates for ski cross (see Table 1 in Supplementary Material).

3.1.2 Formula 1

Similarly to winter sports, determining the winner of a Formula 1 season can be interpreted as an election where the candidates are the drivers and the votes are the single races. We considered the seasons from 1970 till 2008. Since our current algorithms and their implementation can handle neither ties nor partial rankings, we only considered candidates that competed in all races. Drivers that dropped out of a race are ranked according to how long they participated in that race. The generated instances have around 16 votes and up to 28 candidates (see Table 5 in Supplementary Material).

3.1.3 Web search

A prominent application of Kemeny Rank Aggregation is to aggregate search result rankings obtained from different web search engines. To generate rankings, we queried the same 37 search terms as Dwork et al. [16], Schalekamp and van Zuylen [35], and Ali and Meilă [2]. We used the search engines Google, Lycos, MSN Live Search, and Yahoo to generate rankings of 1,000 websites. We considered two search results as identical if their URLs are identical up to some canonical form (cutting after the top-level domain). Results not appearing in all rankings were ignored. Ignoring the term “zen budism” with only 18 candidates, this results in 36 instances having between 55 and 163 candidates (see Table 9 in Supplementary Material).

3.1.4 Web impact

We generated rankings that measure the “impact in the web” of different search terms. For a search engine, a list of search terms is ranked according to the number of hits of each single term. We used Ask, Google, MSN Live Search, and Yahoo to generate rankings for all capitals (240 candidates), all nations (242 candidates; we obtained more nations than capitals because some capital names appeared twice), and the 103 richest people of the world (see Table 13 in Supplementary Material).

3.1.5 Mallows model

Our first source of synthetic instances are samples from the Mallows model [27]. This is an exponential model over rankings (permutations, in the original setting). The Mallows model consists of a central ranking \(r_0\) of \(m\) elements and a dispersion parameter \(\theta \in \mathbb {R}_+\) that quantifies the concentration of the rankings around the peak \(r_0\) with respect to some distance measure (Kendall’s tau distance in our case). The probability of a ranking \(r\) is

$$\begin{aligned} P_{r_0,\theta }(r) = \frac{e^{-\theta \cdot d(r,r_0)}}{Z} \quad \text {and}\quad Z = \sum _{r \in S_m} e^{-\theta \cdot d(r,r_0)}, \end{aligned}$$

where \(d\) is Kendall’s tau distance and \(S_m\) denotes the symmetric group of order \(m\), that is, the set of all rankings of \(m\) elements. Note that the normalization constant \(Z\) is independent of \(r_0\) and that Kemeny Rank Aggregation is a maximum likelihood estimator for \(r_0\) [28]. We remark that the expected KT-distance between a ranking generated by the Mallows model and its central ranking is linearly upper bounded in the number of candidates [19]. Since the KT-distance is a metric, this upper bound transfers to the average KT-distance.
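One standard way to sample from the Mallows model under Kendall’s tau distance is the repeated insertion model: the \(i\)th candidate of the central ranking is inserted at position \(j \in \{1,\dots ,i\}\) with probability proportional to \(e^{-\theta (i-j)}\). A minimal Python sketch of this sampler (our illustration; not necessarily the generator used in the experiments):

```python
import math
import random

def sample_mallows(center, theta, rng=random):
    """Draw one ranking from the Mallows model around `center` with dispersion
    `theta`, via the repeated insertion model: the i-th candidate of `center`
    is inserted at position j in {1, ..., i} with probability proportional to
    exp(-theta * (i - j)); inserting at the bottom adds no new inversions."""
    ranking = []
    for i, c in enumerate(center, start=1):
        weights = [math.exp(-theta * (i - j)) for j in range(1, i + 1)]
        j = rng.choices(range(i), weights=weights)[0]  # 0-indexed insert position
        ranking.insert(j, c)
    return ranking
```

For \(\theta \) close to zero the samples are nearly uniform permutations; for large \(\theta \) they concentrate sharply on the central ranking.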

We used the following parameters to create our test instances:

  • number of votes \(n\): \(4,5,10,20\);

  • number of candidates \(m\): \(10,20,\dots ,100,125,150,175,200\);

  • dispersion parameter \(\theta \): \(0.005,0.01,0.05,0.1,0.2,\dots ,0.9,0.95,0.99,0.995\).

For each parameter combination, we created ten instances.

3.1.6 Plackett–Luce model

Our second source of synthetic instances are samples from the Plackett–Luce model [25, 33]. The model parameter is a vector of weights \(S=(s_1,\dots ,s_m) \in \mathbb {R}_+^m\). The probability of a ranking \(r\) of \(m\) elements is

$$\begin{aligned} P_S(r) = \prod _{i=1}^m \frac{s_{r^{-1}(i)}}{Z_i}\quad \text {and}\quad Z_i = \sum _{j=i}^{m} s_{r^{-1}(j)}, \end{aligned}$$

where \(r^{-1}(j)\) denotes the \(j\)th element of ranking \(r\). In a sense, \(s_i\) denotes the “importance” of element \(i\). The Plackett–Luce model does not have as nice an interpretation as the Mallows model, but it is easy to see that the mode of this distribution ranks the candidates decreasingly with respect to their importance. The votes should be less divergent (and thus easier to solve) the faster the weights decrease.
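Sampling from the Plackett–Luce model is straightforward: the candidates are drawn one position at a time, each time selecting a remaining candidate with probability proportional to its weight, which reproduces the product formula above. A minimal Python sketch (our illustration; not necessarily the generator used in the experiments):

```python
import random

def sample_plackett_luce(weights, rng=random):
    """Draw one ranking from the Plackett-Luce model: repeatedly pick the next
    candidate with probability proportional to its weight among the remaining
    candidates. `weights` maps each candidate to a positive importance s_i."""
    remaining = dict(weights)
    ranking = []
    while remaining:
        candidates = list(remaining)
        chosen = rng.choices(candidates,
                             weights=[remaining[c] for c in candidates])[0]
        ranking.append(chosen)
        del remaining[chosen]
    return ranking
```

With steeply decreasing weights (such as the exp2 setting below), the samples concentrate on the ranking ordered by decreasing importance.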

We used the following parameters to create our test instances:

  • number of votes \(n\): \(4,5,10,20\);

  • number of candidates \(m\): \(10,20,\dots ,100,125,150,175,200\);

  • importance weights \(S\): poly1 = \(\{m,m-1,\dots ,1\}\), poly2 = \(\{m^2,(m-1)^2,\dots ,1\}\), poly3 =  \(\{m^3, (m-1)^3,\dots ,1\}\), exp2 = \(\{2^m,2^{m-1},\dots ,2\}\).

For each parameter combination, we created ten instances.

3.2 Implementation and test series

Our algorithms are implemented in C++ using several libraries of the Boost package. Our implementation consists of about 4,000 lines of code.

Data reduction.

For the sake of completeness, we implemented all data reduction rules discussed in this paper, including the \(\ge _{3/4}\)-Majority Rule extended to non-dirty candidate sets, the modified Extended Condorcet Rule (based on the weak majority graph), and a rule that removes Condorcet winners and Condorcet losers. Preliminary tests showed that it is most efficient to only run the Extended Condorcet Rule, confirming our theoretical findings. Thus, we restrict our evaluation to analyzing the efficiency and the effectiveness of the Extended Condorcet Rule compared to running no data reduction.

Implementation of the exact algorithms.

To solve the (hopefully small) parts of the instances remaining after the application of the data reduction, we implemented three exact algorithms. First, an extended version of the search tree algorithm showing fixed-parameter tractability with respect to the Kemeny score [6, 11]. Second, a dynamic programming algorithm running in \(O(2^m m + nm^2)\) time for \(m\) candidates and \(n\) votes [6, 34]. Third, the integer linear program (ILP) [13, Linear Program 3], which was the fastest exact algorithm in previous experimental studies [13, 35]. We used the freely available ILP-solver GLPK in version 4.43 as well as the two commercial ILP-solvers CPLEX in version 12 and Gurobi in version 4.6.1 to solve the ILP.
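For reference, the kind of ILP used here can be sketched as follows (a common formulation in the spirit of [13, Linear Program 3]; the exact program in [13] may differ in details). Let \(w_{ab}\) denote the number of votes preferring \(a\) to \(b\), and let the binary variable \(x_{ab}\) encode that \(a\) is ranked before \(b\):

$$\begin{aligned} \text {minimize} \quad&\sum _{a \ne b} w_{ba} \, x_{ab}\\ \text {subject to} \quad&x_{ab} + x_{ba} = 1 \quad \text {for all } a \ne b,\\&x_{ab} + x_{bc} + x_{ca} \ge 1 \quad \text {for all distinct } a, b, c,\\&x_{ab} \in \{0,1\}. \end{aligned}$$

The equality constraints enforce a complete order relation, the triangle constraints forbid cycles, so feasible solutions correspond exactly to rankings, and the objective counts the disagreements of the votes with the chosen ranking.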

Test series.

Generally speaking, the algorithms supported by high-end ILP-solvers turned out to be the fastest. Although the dynamic programming approach is provably fast for small instances of up to 23 candidates and the search tree algorithms beat the dynamic programming algorithm on several instances, we recommend using the commercial ILP-solvers: dynamic programming and search tree algorithms are only faster than the ILP-solvers on some small instances, which could be solved by all considered approaches in less than 1 s. The commercial solvers are significantly faster than GLPK for all considered instances. Since the corresponding results leave no doubt, we omit a detailed evaluation and restrict our analysis in the following to the ILP-solver approach with the two commercial ILP-solvers CPLEX and Gurobi. However, we will see that the Extended Condorcet Rule (see Sect. 2) significantly speeds up all approaches, thus confirming our theoretical findings.

In the next subsections, we discuss the results of our experiments. To this end, we computationally analyzed several properties of the input instances. As already discussed in Sect. 2, the data reduction is able to split input instances into smaller subinstances which can be solved independently. Hence, the effectiveness of the data reduction is reflected in the structure of these subinstances. To quantify this, after applying the data reduction we determined the profile of the reduced instances. Basically, a profile is a string (consisting of numbers separated by “\(>\)”) that expresses the sizes of the subinstances as well as the way of combining the solutions of the subinstances to get the overall solution. The profiles are to be read as follows. Every “1” stands for a trivial subinstance with one candidate, that is, the position of the candidate in the overall Kemeny ranking was determined by the data reduction. Higher numbers stand for groups of candidates whose “internal” order could not be determined by the data reduction, that is, non-trivial subinstances that can be solved independently. Sequences of \(i\) consecutive 1s are abbreviated by \(1^i\). The ordering of the numbers corresponds to the ordering in which one concatenates the Kemeny rankings of the subinstances to get the overall Kemeny ranking.

For example, the profile “\(1^{2}\)>\(3\)>\(1\)>\(2\)” expresses that we know the best and the second best candidate, we know the set of candidates that must assume positions 3-5 without knowledge of their relative orders, we know the candidate in position 6, and we know the two candidates that must assume positions 7 and 8 without knowledge of their relative order. Thus, in this toy example two non-trivial subinstances remain, one with three and one with two candidates.
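The profile strings can be generated mechanically from the ordered list of subinstance sizes. A small Python helper (our own, purely illustrative) reproduces the notation, abbreviating runs of 1s:

```python
def profile(sizes):
    """Render an ordered list of subinstance sizes as a profile string,
    e.g. [1, 1, 3, 1, 2] -> '1^2>3>1>2' (runs of 1s are abbreviated)."""
    parts, i = [], 0
    while i < len(sizes):
        if sizes[i] == 1:
            j = i
            while j < len(sizes) and sizes[j] == 1:
                j += 1               # extend the run of trivial subinstances
            run = j - i
            parts.append('1' if run == 1 else f'1^{run}')
            i = j
        else:
            parts.append(str(sizes[i]))
            i += 1
    return '>'.join(parts)
```

For the toy example above, `profile([1, 1, 3, 1, 2])` yields `1^2>3>1>2`.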

We performed the following test series on all instances in our data sets:

  1. Properties.

We computed basic properties of our instances: the number of candidates and votes, the percentage of conflict pairs, the percentage of candidate pairs ordered according to the \(\ge _{3/4}\)-majority, simple lower and upper bounds on the Kemeny score, the maximum position range of a candidate, the average KT-distance, and a maximum likelihood estimate for the dispersion parameter \(\theta \) of the Mallows model.

  2. Reduction.

We computed the profiles (as discussed above) of the reduced instances and other reduction quality indicators such as the number of subinstances after preprocessing, the maximum number of candidates among the subinstances, the average number of candidates among the subinstances, and the percentage of conflict pairs whose ordering in the consensus could be determined by the data reduction.

  3. Speedup.

We computed a Kemeny consensus for each instance, once without and once with data reduction as preprocessing. We compared both running times, computed the speedup gained by applying the data reduction, and recorded the Kemeny scores of the instances.

3.3 Experimental results

In this section, we summarize results from our test series applied to the six ranking data sets.

Winter sports.

Without data reduction, the ski cross instance, consisting of 69 candidates, was solved by the ILP-solver Gurobi in about 3.9 s. In contrast, the instance was solved in 0.11 s when using the Extended Condorcet Rule from Sect. 2 as preprocessing and then applying Gurobi for the unresolved subinstances. This speedup is possible because the data reduction solves the main part of the instance in milliseconds and leaves two relatively small subinstances containing 12 and 15 candidates. Clearly, these subinstances can be quickly solved by all ILP-solvers. Furthermore, the ski jumping instance, consisting of 33 candidates, could be fully solved in milliseconds by only applying the data reduction. The speedup through applying data reduction is at least 8 with respect to both instances and both ILP-solvers. Details concerning the experimental results of this data set can be found in Tables 1–4 in Supplementary Material. Speedup and running times are illustrated in Fig. 1.

Fig. 1
figure 1

Speedup (left) and running times (right) of the two winter sports instances (arbitrarily ordered)

Formula 1.

The data reduction was successful for most Formula 1 instances. Except for one instance, namely the one corresponding to the season 1983, the data reduction ended up with subinstances of at most 13 candidates each. Notably, even for 1983 the exact positions of more than 20 % of the candidates could already be determined by the data reduction. More details about the structure of the reduced instances can be found in the profiles table (Table 7 in Supplementary Material). As Fig. 2 and Table 8 in Supplementary Material show, applying the data reduction gives a significant speedup of the overall running time in most cases. However, all Formula 1 instances are still small enough to be solved without data reduction in less than 1 s.

Fig. 2
figure 2

Speedup (left) and running times (right) of the Formula 1 instances (seasonal ordering)

Currently, the winner determination in Formula 1 is based on a “scoring rule”: in a single race, every candidate gets points depending on the outcome, and the candidate with the highest total score wins. To compare with the official winners, we computed Kemeny winners for the seasons from 1970 till 2008.

The Kemeny winner in most of the considered seasons is the same as the candidate selected by the scoring rule used by the FIA. In 2008, however, Lewis Hamilton was elected world champion (beating Felipe Massa by only one point), whereas Massa was the “Condorcet driver” and thus the first candidate in every Kemeny ranking (see Fig. 3). Since, in contrast to Kemeny’s voting system, no scoring-based rule fulfills the Condorcet property [40], such a difference in outcomes is no real surprise.

Fig. 3
figure 3

Comparing the computed Kemeny consensus ranking with the official Formula 1 ranking of season 2008

Web search.

Applying the data reduction as preprocessing helps to significantly speed up the running times of solving the Kemeny Rank Aggregation problem by the ILP-solvers (illustrated in Fig. 4). All details concerning the experimental results for this data set can be found in Tables 9–12 in Supplementary Material.

Fig. 4
figure 4

Speedup (left) and running times (right) of the web search instances (alphabetical ordering according to the search terms)

For all instances with more than 100 candidates, the results of the data reduction are displayed in Table 3: the Extended Condorcet Rule is not only able to fix candidates at the top and bottom positions but also to partition some instances into several smaller subinstances. Out of the 36 instances, 22 were solved in less than five minutes by the reduction combined with one of the other algorithms. Herein, the data reduction always contributed less than 0.5 s to the running time. For all other instances, we could still compute the “top” and the “flop” candidates of an optimal ranking. For example, for the search term “telecommuting” a subinstance with 109 candidates remains, but we know the best nine candidates (and their order). The effectiveness in terms of candidate positions fixed by the reduction rules is illustrated in Fig. 5. For example, the data reduction was able to fix more than 50 % of the candidate positions for more than 70 % of the instances.

Fig. 5
figure 5

Effectiveness and speedup for web search instances with more than 50 candidates. Left we use two different measures of effectiveness: the percentage of candidate positions that could be determined by the data reduction (crosses) and the percentage of conflict pairs that could be resolved by the data reduction (boxes). The position on the x-axis gives the number of candidates and the position on the y-axis gives the value of the corresponding measure. Right a cross (box) at position \((x,y)\) means that at least \(x\) % of the candidate positions (conflict pairs) could be determined by the data reduction for at least \(y\) % of the instances. For example, this means that for 80 % of the instances with more than 50 candidates more than 65 % of the conflict pairs could be resolved by the data reduction. Furthermore, 100 % of the candidate positions (that is, the instance was solved) could be determined for more than 25 % of the web search instances with more than 50 candidates

Table 3 Web data instances with more than 100 candidates

Web impact.

As for the capitals, our algorithms (the data reduction and the ILP-solvers) computed a Kemeny ranking in less than 2 s. The data reduction performed very well; the remaining subinstances correspond to the profile

$$\begin{aligned} 1^{7}\!>{5}>1^{10}>{9}>1^{14}>{34}>1^{5}>{6}>1^{3}>{6}>1^{7}>{9}\!>1^{36}>{11}>1^{9}\!>{43}>1^{26}. \end{aligned}$$

Herein, \(1^{x}\) denotes \(x\) consecutive subinstances consisting of a single candidate each. Thus, only two harder subinstances of sizes 34 and 43 remained for the ILP-solvers. The final Kemeny ranking starts as follows: London \(>\) Paris \(>\) Madrid \(>\) Singapore \(>\) Berlin \(> \ldots \) For aggregating the nation rankings, our algorithms were less successful. However, we could still compute the top 6 and the flop 14 candidates with the data reduction. Surprisingly, the best-represented nation on the web seems to be Indonesia, followed by France, the United States, Canada, and Australia. The instance consisting of the 103 richest persons was solved exactly in milliseconds by applying the data reduction alone. Details concerning the experimental results of this data set can be found in Tables 13–16 in the Supplementary Material. Speedup and running times are illustrated in Fig. 6.

Fig. 6 Speedup (left) and running times (right) of the web impact instances (arbitrary ordering)

Mallows model.

Confirming our findings for the real-world data sets, the data reduction was also successful on this synthetic data set. For example, for instances with 100 candidates and 10 rankings, the data reduction broke the instances into at least 26 subinstances with a maximum of 36 candidates on average whenever the dispersion parameter \(\theta \) was \(0.1\) or higher.

Interestingly, the data reduction did not only determine some top and flop candidates; usually, it also fixed the exact positions of candidates in the middle of the final ranking. Examples can be found in Table 4.

Table 4 Typical structure of the reduced subinstances computed from the Mallows model instances

As expected, our test series also showed a correlation between the dispersion parameter \(\theta \) and the running time for solving Kemeny Rank Aggregation. When the number of candidates and rankings is fixed, the running time is highest when \(\theta \) is close to zero (that is, all possible rankings are almost equally likely) and lowest when \(\theta \) is close to one (that is, every ranking differing from the central ranking is very unlikely). Overall, the data reduction was quite successful for most test instances: For 100 candidates and 10 rankings, the data reduction enabled a speedup for \(\theta \ge 0.01\). The speedup factor for CPLEX is more than \(10\) for \(\theta \ge 0.2\). Similar effects can be observed for 4, 5, and 20 rankings. The speedup factor for Gurobi is slightly higher for most instances.
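The paper does not spell out its exact parameterization of the dispersion \(\theta \). The sketch below uses the repeated insertion method for the common parameterization \(P(r) \propto \exp (-\theta \cdot {{\mathrm{KT-distance}}}(r, r_0))\) around a central ranking \(r_0\), which matches the qualitative behavior described above (\(\theta \) near zero: almost uniform; large \(\theta \): concentrated on \(r_0\)); the function name and signature are illustrative, not from the paper's experimental setup.

```python
import math
import random

def sample_mallows(center, theta, rng=random):
    """Draw one ranking from a Mallows distribution around `center` using
    the repeated insertion method.  Assumes P(r) proportional to
    exp(-theta * KT(r, center)): theta = 0 yields the uniform distribution
    over rankings, large theta concentrates on `center`."""
    phi = math.exp(-theta)
    ranking = []
    for i, cand in enumerate(center):
        # Inserting the i-th candidate of `center` at position j creates
        # i - j inversions w.r.t. `center`, so position j gets weight
        # phi ** (i - j); position i (keeping the center order) has weight 1.
        weights = [phi ** (i - j) for j in range(i + 1)]
        r = rng.random() * sum(weights)
        for j, w in enumerate(weights):
            r -= w
            if r <= 0:
                break
        ranking.insert(j, cand)
    return ranking
```

With this parameterization, a very large \(\theta \) essentially always reproduces the central ranking, while \(\theta = 0\) returns a uniformly random permutation.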

The observed effects become stronger with increasing \(\theta \) as well as with a decreasing number of candidates and rankings, and correspondingly weaker in the opposite directions. We illustrate speedup and running times in Figs. 7 and 8.

Fig. 7 Speedup (left) and running times (right) of the Mallows model instances. Upper diagrams: instances with 50 candidates and 4 votes; lower diagrams: instances with 50 candidates and 20 votes

Fig. 8 Speedup (left) and running times (right) of the Mallows model instances. Upper diagrams: instances with 100 candidates and 10 votes; lower diagrams: instances with 200 candidates and 4 votes

However, even for 200 candidates, 20 rankings, and \(\theta \ge 0.1\), the data reduction yielded average speedup factors between \(14\) and \(20\) for CPLEX and between \(22\) and \(59\) for Gurobi. A detailed collection of our results for these instances can be found in the Supplementary Material.

Plackett–Luce model.

In contrast to the Mallows model, the data reduction was less successful for instances sampled from the Plackett–Luce model. Especially for instances with 80 and more candidates, the data reduction could often only determine some top and flop candidates. It could not break up the Plackett–Luce model instances into many subinstances as it did for the Mallows model instances. Examples can be found in Table 5.

However, computing some top and flop candidates also leads to some speedup. For example, consider the instances with 70 candidates, 5 votes, and exponentially decreasing weight vectors. The average speedup from computing some top and flop candidates was 1.6.

For polynomially decreasing weight vectors, the speedups tend to become stronger with increasing polynomial degree, as well as with a decreasing number of candidates and rankings. A detailed collection of our results for these instances can be found in the Supplementary Material.
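For reference, a Plackett–Luce sample is drawn by repeatedly selecting the next candidate among the remaining ones with probability proportional to its weight; the weight vector (for example, exponentially or polynomially decreasing) is the model parameter varied in the test series. A minimal sketch, with an illustrative function name and weights of our choosing:

```python
import random

def sample_plackett_luce(weights, rng=random):
    """Draw one ranking from a Plackett-Luce model: repeatedly select the
    next candidate among the remaining ones with probability proportional
    to its weight.  `weights` maps candidate -> positive weight."""
    remaining = list(weights)
    ranking = []
    while remaining:
        r = rng.random() * sum(weights[c] for c in remaining)
        for c in remaining:
            r -= weights[c]
            if r <= 0:
                break
        remaining.remove(c)
        ranking.append(c)
    return ranking
```

Flatter weight vectors make the pairwise majorities closer to ties, which is consistent with the data reduction being less effective here than on concentrated Mallows instances.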

Table 5 Typical structure of the reduced subinstances computed from Plackett–Luce model instances

4 Conclusion

Our experiments showed that the Extended Condorcet Rule allows for the computation of optimal Kemeny rankings for real-world instances of non-trivial size within seconds. We found exact solutions for large real-world instances with more than 100 candidates by means of data reduction within milliseconds, while the corresponding instances could not be solved directly within 10 s by the ILP alone (the fastest exact algorithm [13]). A key feature of the Extended Condorcet Rule is that it breaks instances into smaller, independent instances. In more general terms, algorithm engineering in combination with data reduction and kernelization for NP-hard voting problems seems to be an area with several promising challenges [5].

On the theoretical side, we improved the previous partial problem kernel [7] with respect to the parameter average KT-distance from quadratic to linear size. It remains open whether there is a linear partial kernel with respect to the \(\ge _s\)-majority for further values of \(s\), which would potentially give data reduction rules that are not subsumed by the Extended Condorcet Rule. A natural step towards answering this question is to investigate whether for two non-dirty candidates \(a,b\) there must be a Kemeny ranking with \(a > b\) whenever \(a \ge _s b\).

An important extension of Kemeny Rank Aggregation is to consider “constraint rankings”, that is, the problem input may additionally contain a prespecified order of some candidate pairs in the consensus list [41]. Here, the considered data reduction cannot be applied anymore. Extending the data reduction framework to rankings with ties or partial rankings would be of high interest as well.

Kemeny Rank Aggregation with Partial Rankings seems to be the most interesting extension. Unfortunately, considering majorities (as the \(\ge _s\)-majority rules and the Extended Condorcet Rule do) seems not to be promising for practically important instances where the majority of the pairwise relations between the candidates is left unspecified. Furthermore, the problem becomes more difficult: It is NP-hard already for two partial rankings [9] and even if the maximum KT-distance between two input rankings is zero [4], destroying any hope for fixed-parameter tractability or (partial) problem kernels for the parameters “maximum KT-distance” and “average KT-distance”. Hence, a different approach and different parameters seem necessary to obtain any provable guarantees for the effect of polynomial-time preprocessing.

In contrast to Kemeny Rank Aggregation with Partial Rankings, for Kemeny Rank Aggregation with Ties a quadratic partial kernel with respect to the parameter “average KT-distance” is known [7]. It looks promising to investigate whether this can also be improved to a linear partial kernel.