1 Introduction

The problem of reaching a consensus in a democracy has become more fraught as parties in many countries evince less and less appetite for compromise. This has been true not only in elections of a single leader, such as a president, but also in elections of councils and legislatures, whose members often are split ideologically and refuse to bargain in good faith.

This problem is ameliorated, to a degree, in countries with systems of proportional representation (PR), especially when centrist parties form alliances with parties on the left or right. In party-list systems, wherein parties’ parliamentary seats are approximately proportional to the numbers of votes they receive—at least if their vote shares exceed a threshold, often around 5%—coalitions tend to be more fluid than in two-party presidential systems.

But even in PR systems, political cleavages have been difficult to bridge. Because voters are restricted to voting for only one party, or for one candidate (e.g., in a nonpartisan election to a city council), they cannot indicate a desire that elected candidates cooperate with members of other parties or factions.

In this paper, we propose a procedure to elect multiple candidates using approval voting. Although we have no empirical evidence at this time (such a procedure has not yet been tried in actual multiwinner elections), we believe that it would encourage voters whose interests cut across ideological or party lines to support sets of candidates whose views reflect theirs.

Under this modification of approval voting, voters would, as in single-winner elections, be able to vote for multiple candidates who may be affiliated with different factions or parties. Their approvals would be aggregated so as to reduce the ability of a single faction or party to win a majority of seats and to encourage cross-cutting coalitions that transcend ideological and party divisions.

This is done by reducing the support that a voter gives to his or her approved candidates as more of them are elected. Hence, a 51% majority, if it votes as a bloc for all of its preferred candidates, will not be able to win all of the seats on a committee or council, leaving the 49% minority unrepresented. In the case of voting for parties, the aggregation is such that each party receives a number of seats in a legislature roughly proportional to the number of votes it receives.

In the case of single-winner elections, the properties of approval voting (AV)—whereby voters can cast one vote for each of the candidates they like, no matter how many, and the candidate with the most votes wins—have been studied extensively (Brams and Fishburn 2007; Brams 2008; Laslier and Sanver 2010). Although AV, as it is used in single-winner elections, has been and continues to be used in multiwinner elections (e.g., to elect members of the Council of the Game Theory Society and some local legislative bodies), such usage has been challenged because, as noted above, it may produce a tyranny of a majority, whereby one party or faction wins disproportionately many of the seats in a voting body.

We are not the first to propose alternative ways of aggregating approval votes to give proportional representation to different parties or factions, as well as candidates who bridge them, in the electorate. Kilgour (2010), Kilgour and Marshall (2012), Elkind et al. (2017) and Faliszewski et al. (2017) have reviewed and assessed a variety of methods for electing a fixed number of winners using a set of approval ballots. Kilgour (2018) has catalogued methods for conducting multiwinner elections with other ballot forms.

More specifically, Kilgour et al. (2006) and Brams et al. (2007) analyzed a “minimax procedure” that chooses the committee whose maximum Hamming distance from the voters’ ballots is a minimum, where Hamming distance is a metric for measuring the distance between two ballots. The distances may be weighted by the proximity of the ballot to other ballots. They applied this procedure to the 2003 election of the Game Theory Society Council and analyzed differences between the 12 candidates elected under AV and the 12 that would have been elected under the minimax procedure. For a generalization of this approach, see Sivarajan (2018).

Other ways of aggregating approval ballots have been analyzed, including “satisfaction approval voting” (Brams and Kilgour 2014), which maximizes the sum of voters’ satisfaction scores, defined as the fraction of the approved candidates who are elected. Another approach, which also may be based on approval votes, was first proposed by Monroe (1995) and generalized by Potthoff and Brams (1998); the latter uses integer programming to select the set of winning candidates that minimizes voters’ dissatisfaction. Brams (1990, 2008, ch. 4) analyzed “constrained approval voting”, in which winners are determined by both their vote shares and the categories of voters who approve of them, with constraints put on the numbers that can be elected from each category.

A number of scholars (Subiza and Peris 2014; Sánchez-Fernández et al. 2016; Aziz et al. 2017; Brill et al. 2018) have suggested using divisor methods of apportionment (more on these methods later) with approval ballots.Footnote 1 We adopt that general approach here, addressing among other topics the representativeness of elected candidates (to be defined), which previous studies have not analyzed.Footnote 2

We focus on the depreciation weights of the two most prominent standard divisor methods of apportionment (Balinski and Young 2001; Pukelsheim 2014), one of which was proposed independently by Thomas Jefferson and Viktor d’Hondt, the other independently by Daniel Webster and André Saint-Laguë. We refer to the two methods as “Jefferson” and “Webster”, after the proposers whose work preceded that of d’Hondt and Saint-Laguë.

The standard Jefferson and Webster methods are iterative procedures in which seats are allocated sequentially until a body of requisite size is obtained.Footnote 3 With approval ballots, if the candidates are individuals, then after a candidate is elected, the method is applied to the remaining unelected candidates until the body is complete. If the candidates are political parties, then those methods likewise determine each party’s number of seats in the legislature or other body.

Each of these methods, used with approval ballots, has a simultaneous (nonsequential) analogue, which may produce different—even disjoint—winners from those of the sequential version. We ask whether the nonsequential winners are more representative than sequential winners: For which set of winners do more voters approve of at least one candidate?Footnote 4 We also ask whether the Webster winners (both sequential and nonsequential) are more representative than the Jefferson winners.

We next turn from the election of individual candidates to the election of different numbers of candidates from political parties. We show that Webster tends to elect a member of a relatively small party before electing an additional member of a larger political party, thereby giving more voters at least one representative, whereas Jefferson has the opposite tendency.

If there are only two candidates or parties, we assume that voters prefer, and vote for, only one of them. This renders irrelevant the opportunity afforded by an approval ballot of voting for more than one candidate.

But in the case of parties in multiseat contests, the number of votes a party receives determines its seat share in a legislature, which puts the focus on the percentage thresholds that determine that share. In the two-party case, these thresholds are evenly spaced under Jefferson, coinciding with those for cumulative voting, whereas under Webster they are spaced unequally. Thus, Jefferson thresholds for two parties are more even-handed than those of Webster.

In proportional-representation elections today in which voters can vote for only one candidate, or one party in a party-list system, the standard Jefferson and Webster apportionment methods satisfy several desirable properties (Balinski and Young 2001), but they are not flawless. Like all divisor apportionment methods, they are vulnerable to manipulation when voters are strategic and consider how other voters may vote (see, e.g., Cox 1997, pp. 30–32). Furthermore, they may not give political parties the number of representatives to which they are entitled after rounding (either up or down), which is to say that their apportionments may not stay within the quota (Balinski and Young 2001).

Nevertheless, with our procedures these methods seem the best possible to elect multiple winners using approval ballots. By expressing their support for sets of candidates or parties that cross ideological lines, voters may better be able to diminish the gridlock one sees often in voting bodies, especially in the United States and increasingly in European countries.

Multiwinner elections seem an attractive way to combat partisan gerrymandering in the United States, although at the federal level their implementation would require repeal of the 1967 ban on multimember congressional districts. For example, one bill to lift this ban, introduced in the House of Representatives in June 2017, specifies that every congressional district is to elect 3, 4, or 5 representatives, except in states that are entitled to fewer than 3. But this bill proposes a form of (multiwinner) single transferable vote, a complex system with many shortcomings, rather than approval voting combined with an apportionment method, which we think would better ameliorate partisan gridlock in Congress.Footnote 5

In Sects. 2 and 3, we apply the Jefferson and Webster apportionment methods to the election of individual candidates to a committee or council using approval ballots. In Sect. 4, we show how these methods can be applied to political parties, in which parties win seats roughly in proportion to the approval votes they receive. The proofs of several of our propositions, which often depend on examples, are given in the “Appendix”.

2 The Jefferson and Webster methods applied to candidates

The development and use of apportionment methods has a rich history. It is recounted by Balinski and Young (2001) in the American case, where its best-known application has been to the apportionment of members of the House of Representatives to states according to their populations. In apportioning the House, because a voter can reside in only one state, he or she can be counted only for that state, whereas with approval voting, especially in multiwinner elections, voters would be able to support more than one candidate or party.

There are exactly five divisor methods of apportionment that are stable: No transfer of a seat from one state or party to another can produce less disparity, where “disparity” is measured in five different ways (other ways of measuring disparity are possible, but they do not produce stable apportionments using a divisor method). The Jefferson and Webster methods, which we describe next, provide two of the five ways of defining disparity.Footnote 6

Though originally devised for allocating seats to parties, based on votes, or to states, based on population, apportionment methods also can be used to elect multiple candidates based on approval ballots. In this role, they progressively reduce the value of a voter’s approvals, as more and more of his or her approved candidates are elected. More specifically, the sequential versions of these methods proceed round-by-round, allocating one seat to the candidate, i, who maximizes a deservingness function, denoted d(i).

Let β denote the set of all submitted ballots, and let B(i) ⊆ β denote the set of ballots that include an approval vote for candidate i. In any round, for any ballot b ∈ β, let r(b) denote the number of approved candidates on ballot b who already are elected. If i is a candidate not already elected, the deservingness scores of i according to the sequential versions of Jefferson (J) and Webster (W) are, respectively,Footnote 7

$$ d_{J} (i) = \sum\limits_{b\, \in \,B(i)} {\frac{1}{r(b) + 1}} $$

and

$$ d_{W} (i) = \sum\limits_{b\, \in \,B(i)} {\frac{1}{r(b) + 1 /2}.} $$

Simply put, on any round, each approval ballot supporting unelected candidate i is reduced by an amount that reflects the number of approved candidates on that ballot who already have been elected.

On the first round, no candidate has yet received a seat, so r(b) = 0 for every ballot; the Jefferson fraction equals 1 and the Webster fraction equals 2. Thus, the first candidate elected, according to both methods, will be the candidate who obtains the maximum number of approvals, or the AV winner. The following example shows that the two methods may produce different winners, beginning in the second round.

Example 1

Two of four candidates {A, B, C, D} to be elected. The numbers of voters who approve of different subsets of candidates are

$$ 2{:}\;A\quad 5 {:}\;AB\quad 3 {:}\;AC\quad 2 {:}\;BC\quad 4 {:}\;D. $$

A, B, C and D receive, respectively, 10, 7, 5 and 4 approvals, so A is the candidate elected first. For Jefferson on the second round, B’s ballots (the 5 supporting AB, and the 2 supporting BC) are counted differently, because r(AB) = 1 but r(BC) = 0 (each of the 5 AB ballots names one already-elected candidate, whereas the two BC ballots name none). Thus, on the second round, B’s deservingness score is

$$ d_{J} \left( B \right) = 5 \times \left( {1 /2} \right) + 2 \times \left( 1 \right) = 4\,\,1 /2. $$

Similarly, on the second round the deservingness scores of C and D are

$$ d_{J} \left( C \right) = 3 \times \left( {1 /2} \right) + 2 \times \left( 1 \right) = 3\,\,1 /2\quad d_{J} \left( D \right) = 4 \times \left( 1 \right) = 4; $$

so, under Jefferson, the second-round winner is B.

For Webster on the second round, B’s ballots (the 5 supporting AB, and the 2 supporting BC) are counted similarly, but using the Webster fraction, as follows:

$$ d_{W} \left( B \right) = 5 \times \left( {2 /3} \right) + 2 \times \left( 2 \right) = 7\,\,1 /3. $$

Similarly, on the second round the deservingness scores of C and D are

$$ d_{W} \left( C \right) = 3 \times \left( {2 /3} \right) + 2 \times \left( 2 \right) = 6\quad d_{W} \left( D \right) = 4 \times \left( 2 \right) = \, 8; $$

so, the second-round winner is D. To summarize Example 1, Jefferson elects AB (as would standard approval voting), and Webster elects AD.
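The round-by-round calculation in Example 1 can be checked with a short script. What follows is a minimal sketch rather than a reference implementation: the function name `sequential_winners`, the `(count, approved set)` ballot encoding, and the alphabetical tie-break are our own illustrative assumptions, and the parameter `h` stands for the constant added to r(b) in the denominator (1 for Jefferson, 1/2 for Webster).

```python
from fractions import Fraction

def sequential_winners(ballots, seats, h):
    """Sequential rule with divisor r(b) + h: h = 1 is Jefferson, h = 1/2 is Webster.

    ballots is a list of (number_of_voters, set_of_approved_candidates) pairs.
    """
    h = Fraction(h)
    candidates = sorted(set().union(*(approved for _, approved in ballots)))
    elected = []
    for _ in range(seats):
        scores = {}
        for cand in candidates:
            if cand in elected:
                continue
            total = Fraction(0)
            for count, approved in ballots:
                if cand in approved:
                    # r(b): approved candidates on this ballot already elected
                    r = sum(1 for e in elected if e in approved)
                    total += Fraction(count) / (r + h)
            scores[cand] = total
        # Assumption: ties go to the alphabetically first candidate.
        elected.append(max(sorted(scores), key=scores.get))
    return elected

EXAMPLE_1 = [(2, {"A"}), (5, {"A", "B"}), (3, {"A", "C"}), (2, {"B", "C"}), (4, {"D"})]

print(sequential_winners(EXAMPLE_1, 2, 1))               # ['A', 'B']
print(sequential_winners(EXAMPLE_1, 2, Fraction(1, 2)))  # ['A', 'D']
```

Exact fractions are used so that scores such as 7 1/3 are compared without rounding error.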

Notice the difference in the summands that determine deservingness scores under Jefferson and Webster. As r(b) increases, for Jefferson 1/[r(b) + 1] decreases according to the sequence

$$ 1, \, 1/2, \, 1/3, \, 1/4, \, 1/5,\, \ldots , $$

whereas for Webster 1/[r(b) + 1/2] decreases according to the sequence

$$ 2, \, 2/3, \, 2/5, \, 2/7, \, 2/9,\, \ldots , $$

or, equivalently,

$$ 1, \, 1/3, \, 1/5, \, 1/7, \, 1/9,\, \ldots $$

Later we generalize these sequences to the h-sequence, defined by

$$ \frac{1}{h + 0}, \frac{1}{h + 1}, \frac{1}{h + 2}, \frac{1}{h + 3}, \ldots , \quad {\text{or}},\,{\text{equivalently}}, 1, \frac{h}{h + 1}, \frac{h}{h + 2}, \frac{h}{h + 3}, \ldots , $$

where h ≥ 0. Note that setting h = 1 produces the Jefferson sequence and h = ½ produces the Webster sequence.
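As a small illustration of the normalized form of the h-sequence, the following sketch generates its first few terms. The function name `h_sequence` and its `length` argument are hypothetical choices made for this example only.

```python
from fractions import Fraction

def h_sequence(h, length):
    """First `length` terms of the normalized h-sequence 1, h/(h+1), h/(h+2), ..."""
    h = Fraction(h)
    return [Fraction(1)] + [h / (h + k) for k in range(1, length)]

print(h_sequence(1, 4))               # Jefferson: 1, 1/2, 1/3, 1/4
print(h_sequence(Fraction(1, 2), 4))  # Webster:   1, 1/3, 1/5, 1/7
```

With h = 0 the same function returns 1, 0, 0, …, so only a voter's first elected candidate counts.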

For both methods, the contributions of voters to deservingness scores are devalued more and more as candidates of whom they approve are elected. But as can be seen by comparing the corresponding fractions in the Jefferson and the Webster sequences (normalized to start at 1), the contributions of voters who approve of the AV winner, and of subsequent candidates who may be elected on later rounds, are reduced less under the Jefferson method than under the Webster method. This means that the Jefferson method, more than the Webster method, tends to favor candidates (e.g., B) whose voters have approved of a candidate already elected (e.g., A) over candidates (e.g., D) whose voters have not yet had an approved candidate elected.Footnote 8

Arguably, because the 4 D voters voted only for D, even though two candidates are to be elected, their preferences may be considered more “intense” than those of the other voters. This helps D get elected under Webster but not under Jefferson.

The foregoing decreasing sequences for Jefferson and Webster can be used as the basis for a nonsequential method of committee election. In a nonsequential method, each possible committee is assigned a score measuring the total satisfaction that it would deliver to voters; the committee with the maximum score wins (we assume that ties are broken randomly). Assuming that n candidates compete and a committee of size m < n is to be elected, there are \( \left( {\begin{array}{*{20}c} n \\ m \\ \end{array} } \right) \) possible committees to be compared, a number that may be very large, especially when m is large and n is about twice the size of m. Denote the set of all possible committees by Ω.

To construct a nonsequential rule from any h-sequence, we measure the satisfaction of electing one candidate as 1, the satisfaction of electing two candidates as \( 1 + \frac{h}{h + 1} \), the satisfaction from three candidates as \( 1 + \frac{h}{h + 1} + \frac{h}{h + 2} \), and so on. The Jefferson (J) and Webster (W) nonsequential scores for a committee C ∈ Ω are obtained by setting h = 1 and h = ½, respectively:

$$ s_{J} (C) = v_{1} (C) + \left( {1 + \frac{1}{2}} \right)v_{2} (C) + \left( {1 + \frac{1}{2} + \frac{1}{3}} \right)v_{3} (C) + \left( {1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4}} \right)v_{4} (C) + \cdots $$
$$ s_{W} (C) = v_{1} (C) + \left( {1 + \frac{1}{3}} \right)v_{2} (C) + \left( {1 + \frac{1}{3} + \frac{1}{5}} \right)v_{3} (C) + \left( {1 + \frac{1}{3} + \frac{1}{5} + \frac{1}{7}} \right)v_{4} (C) + \cdots , $$

where vk(C) is the number of voters who approve of exactly k members of C.Footnote 9 Formally,

$$ v_{k} \left( C \right) = \left| {\left\{ {j \in V{:}\left| {B_{j} \cap C} \right| = k} \right\}} \right|, $$

where V is the set of all voters and Bj is the ballot (set of approved candidates) of voter j. Of course, the Cs that maximize sJ(C) and sW(C) are the ones chosen by each method.

We illustrate the nonsequential methods of Jefferson and Webster by applying them to Example 1:

$$ 2 {:}\;A\quad 5 {:}\;AB\quad 3 {:}\;AC\quad 2 {:}\;BC\quad 4 {:}\;D. $$

Thus, for the subset AB, v2(AB) = 5 voters approve of both A and B and v1(AB) = 7 voters approve of exactly one of A and B, with 5 approving of A but not B and 2 approving of B but not A. Thus,

$$ s_{J} \left( {AB} \right) = 5 \times \left( {1 + 1 /2} \right) + 7 \times \left( 1 \right) = 14\,\,1 /2;\quad s_{W} \left( {AB} \right) = 5 \times \left( {1 + 1 /3} \right) + 7 \times \left( 1 \right) = 13\,\,2 /3. $$

We do this for each of the possible committees, first calculating the v1 and v2 counts and, from them, the committee’s scores according to Jefferson (J) (in the format v2 × (3/2) + v1 × (1)) and Webster (W) (in the format v2 × (4/3) + v1 × (1)), as shown below:

$$ \begin{array}{*{20}l} {J{:}s_{J} \left( {AB} \right) = 5 \times \left( {3 /2} \right) + 7 \times \left( 1 \right) = \underline{14\,\,1 /2} ;} \hfill & {s_{J} \left( {AC} \right) = 3 \times \left( {3 /2} \right) + 9 \times \left( 1 \right) = 13\,\,1 /2;} \hfill \\ {\quad s_{J} \left( {AD} \right) = 0 \times \left( {3 /2} \right) + 14 \times \left( 1 \right) = 14;} \hfill & {s_{J} \left( {BC} \right) = 2 \times \left( {3 /2} \right) + 8 \times \left( 1 \right) = 11;} \hfill \\ {\quad s_{J} \left( {BD} \right) = 0 \times \left( {3 /2} \right) + 11 \times \left( 1 \right) = \, 11;} \hfill & { s_{J} \left( {CD} \right) = 0 \times \left( {3 /2} \right) + 9 \times \left( 1 \right) = 9.} \hfill \\ \end{array} $$
$$ \begin{array}{*{20}l} {W{:}s_{W} \left( {AB} \right) = 5 \times \left( {4 /3} \right) + 7 \times \left( 1 \right) = 13\,\,2 /3; } \hfill & {s_{W} \left( {AC} \right) = 3 \times \left( {4 /3} \right) + 9 \times \left( 1 \right) = 13;} \hfill \\ {s_{W} \left( {AD} \right) = 0 \times \left( {4 /3} \right) + 14 \times \left( 1 \right) = \underline{14} ;} \hfill & {s_{W} \left( {BC} \right) = 2 \times \left( {4 /3} \right) + 8 \times \left( 1 \right) = 10\,\,2 /3;} \hfill \\ {s_{W} \left( {BD} \right) = 0 \times \left( {4 /3} \right) + 11 \times \left( 1 \right) = 11;} \hfill & {s_{W} \left( {CD} \right) = 0 \times \left( {4 /3} \right) + 9 \times \left( 1 \right) = 9.} \hfill \\ \end{array} $$

As the underscored maxima indicate, the nonsequential versions of Jefferson and Webster choose the same committees (AB for Jefferson, AD for Webster) as the sequential versions. But that is not always the case.
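The exhaustive comparison of committees can be sketched in a few lines. This is an illustrative brute-force enumeration under our own assumptions (the names `satisfaction` and `committee_scores`, single-letter candidate labels, and first-found tie-breaking are not from the text), and it is practical only when the number of possible committees is small.

```python
from fractions import Fraction
from itertools import combinations

def satisfaction(k, h):
    """Satisfaction of a voter with k approved winners: 1 + h/(h+1) + ... (k terms)."""
    if k == 0:
        return Fraction(0)
    return sum((h / (h + j) for j in range(1, k)), Fraction(1))

def committee_scores(ballots, candidates, size, h):
    """Score every committee of the given size; h = 1 is Jefferson, h = 1/2 is Webster."""
    h = Fraction(h)
    scores = {}
    for committee in combinations(sorted(candidates), size):
        members = set(committee)
        # Each group of voters contributes according to how many members it approves.
        scores["".join(committee)] = sum(
            count * satisfaction(len(approved & members), h)
            for count, approved in ballots
        )
    return scores

EXAMPLE_1 = [(2, {"A"}), (5, {"A", "B"}), (3, {"A", "C"}), (2, {"B", "C"}), (4, {"D"})]

jefferson = committee_scores(EXAMPLE_1, "ABCD", 2, 1)
webster = committee_scores(EXAMPLE_1, "ABCD", 2, Fraction(1, 2))
print(max(jefferson, key=jefferson.get), jefferson["AB"])  # AB 29/2
print(max(webster, key=webster.get), webster["AD"])        # AD 14
```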

Proposition 1a

The sequential and nonsequential versions of Jefferson, or of Webster, may elect different committees, which may not even overlap (i.e., may have no common members).

Proof

See “Appendix”.

The fact that the sequential apportionment methods start by choosing the AV winner is the reason why they may fail, as in Examples 2 and 3 in the “Appendix”, to find the committee that maximizes the satisfaction score when that committee excludes the AV winner. If they had started with one member of the maximizing committee, they would have found the other.

The sequential versions of Jefferson and Webster will always have at least some overlap, since both start out by choosing the AV winner. But that does not apply for the nonsequential versions.

Proposition 1b

The nonsequential versions of Jefferson and Webster may have no overlap. (The sequential versions always have overlap.)

Proof

See “Appendix”.

As noted in Sect. 1, the nonsequential versions of Jefferson and Webster are computationally complex. However, if the committee (or council) size is small, and the number of candidates is not much larger, then the calculation of satisfaction scores for all committees is certainly feasible with modern computers.

3 Representativeness of a voting body

While it is desirable that as many voters as possible be represented on a committee by at least one candidate of whom they approve, it also is desirable that voters who approve of the same or similar subsets of candidates get them elected in numbers roughly proportional to the numbers of voters who approve of them. Different election procedures, including sequential and nonsequential Jefferson and Webster, may clash on these criteria.

Examples 2, 3, and 4 in the proofs of Propositions 1a and 1b in the “Appendix” illustrate that clash between the Jefferson and Webster apportionment methods (both their sequential and nonsequential versions). In Example 2, both versions of these methods elect B, but the sequential version first elects A (the AV winner), and only then B, making AB the winning pair, whereas the nonsequential version elects BC. BC gives all 21 voters one approved member of the committee, whereas AB gives 7 voters two approved members and 10 voters one approved member (in total, 17 voters have at least one approved member), but 4 voters approve of no member.

The conflict between these criteria also is evident in Example 3, wherein the sequential versions of Jefferson and Webster choose AB and the nonsequential versions choose CD. CD provides 16 of the 18 voters with at least one approved committee member, whereas AB provides only 15 voters with at least one approved member.

Finally, Example 4 illustrates the clash between the nonsequential versions of the two methods: AB is the nonsequential Jefferson winner, and CD is the nonsequential Webster winner. CD provides all 26 voters with at least one approved committee member, whereas AB provides only 22 voters with at least one approved member.

Recall that the nonsequential version of each procedure compares all possible committees on the basis of voter satisfaction scores, which increase by smaller and smaller amounts as additional approved candidates are elected. The sequential version starts by electing the AV winner (A in both Examples 2 and 3), who is not even a member of the nonsequential winning pair in either case.

We define the representativeness of a committee to be the number of voters who approve of at least one member of that committee (we generalize and formalize this concept shortly). This makes BC more representative than AB or AC in Example 2, and CD more representative than AB in Examples 3 and 4. Although nonsequential Jefferson and Webster produce more representative committees than their sequential counterparts in Examples 2 and 3, that is not always the case, as we will show.

Representativeness is not a new concept, although the idea that electoral methods might have different tendencies toward representativeness is. Representativeness is the REP-1 scoring procedure proposed by Kilgour and Marshall (2012), which also is known as the Chamberlin and Courant (1983) procedure. In Generalized Approval Voting, the score of a subset is the sum over all voters of a measure of the worth of the subset to the voter, which depends only on the number of candidates in the subset that the voter supports (Kilgour and Marshall 2012). In fact, representativeness, nonsequential Jefferson scores, and nonsequential Webster scores all are generalized approval scores.

The Generalized Approval score of committee C ∈ Ω is

$$ S\left( C \right) = r_{1} v_{1} \left( C \right) + r_{2} v_{2} \left( C \right) + r_{3} v_{3} \left( C \right) + \ldots , $$

where r1, r2, r3, … is the so-called rep sequence that characterizes the procedure, and v1(C), v2(C), v3(C),…are as defined in Sect. 2. Thus, the score of subset C, S(C), is a sum of contributions from the voters: 0 for voters who did not support any candidate in C; r1 for each voter who supported one candidate in C; r2 for each voter who supported two candidates in C; and so on. In particular, an h-sequence corresponds to a rep sequence defined by r1 = 1 and, for J = 2, 3, 4, …,

$$ r_{J} = 1 + \sum\limits_{j = 2}^{J} {\left( {\frac{h}{h + j - 1}} \right)} . $$

Therefore, nonsequential Jefferson and Webster are Generalized Approval procedures, based on, respectively, the rep sequences

$$ {\text{Jefferson:}} \quad r_{1} = 1,\,r_{2} = 1 + 1 /2,\,r_{3} = 1 + 1 /2 + 1 /3,\, \ldots $$
$$ {\text{Webster:}} \quad r_{1} = 1,\,r_{2} = 1 + 1 /3,\,r_{3} = 1 + 1 /3 + 1 /5,\, \ldots $$

Because R(C) = v1(C) + v2(C) + v3(C) +··· gives the number of voters who approve of at least one candidate in C, representativeness is measured by the score under the rep sequence corresponding to h = 0,

$$ {\text{Representativeness:}} \quad r_{1} = 1,\,r_{2} = 1,\,r_{3} = 1,\, \ldots $$
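To make the correspondence between h-sequences and rep sequences concrete, the sketch below builds r1, r2, … from h and evaluates S(C) from the counts v1(C), v2(C), …. The function names and the list encoding of the counts are illustrative assumptions; the committee AB of Example 1, with v1 = 7 and v2 = 5, serves as a check.

```python
from fractions import Fraction

def rep_sequence(h, length):
    """r_1, ..., r_length with r_1 = 1 and r_J = 1 + sum_{j=2}^{J} h/(h + j - 1)."""
    h = Fraction(h)
    reps, total = [], Fraction(0)
    for j in range(1, length + 1):
        total += Fraction(1) if j == 1 else h / (h + j - 1)
        reps.append(total)
    return reps

def generalized_approval_score(v_counts, reps):
    """S(C) = r_1 v_1(C) + r_2 v_2(C) + ...; v_counts[k-1] holds v_k(C)."""
    return sum(r * v for r, v in zip(reps, v_counts))

v_AB = [7, 5]  # Example 1: 7 voters approve exactly one of A, B; 5 approve both
print(generalized_approval_score(v_AB, rep_sequence(1, 2)))               # 29/2 (Jefferson)
print(generalized_approval_score(v_AB, rep_sequence(Fraction(1, 2), 2)))  # 41/3 (Webster)
print(generalized_approval_score(v_AB, rep_sequence(0, 2)))               # 12 (representativeness)
```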

Proposition 2

If the sequential and nonsequential versions of Jefferson elect different committees, either the sequential or nonsequential committee may be more representative. The same is true for Webster.

Proof

See “Appendix”.

While the nonsequential version of each apportionment method produced the most representative two-candidate committees in Examples 2 and 3, it is the sequential version that does so in Example 5 in the “Appendix”. But when either the sequential or nonsequential version of Jefferson or Webster gives a more representative committee, is that version the one that should be chosen?

Not necessarily. One important principle is that the method of vote aggregation, sequential or nonsequential, should be specified in advance so that no ambiguity exists about the aggregation procedure being used. In general, our calculations suggest that the nonsequential outcome is likely to be more representative than the sequential outcome when the two differ.Footnote 10

We recommend nonsequential methods if feasible. By definition, they guarantee that one finds the committee that maximizes voter satisfaction; by contrast, the sequential committee must include the AV winner, a restriction that sometimes reduces representativeness. However, it is possible that none of the methods described so far yields the most representative committee.

Proposition 3

Neither the sequential nor the nonsequential version of Jefferson may elect the most representative committees. The same is true for Webster.

Proof

See “Appendix”.

Example 7 in the “Appendix” shows that not only do Jefferson and Webster give different outcomes, but two of the three outcomes given by sequential and nonsequential Webster (AC and BC) are more representative than the unique outcome (AB) given by sequential and nonsequential Jefferson.

Example 1 (see Sect. 2) illustrated that outcomes produced by Webster may be more representative than those produced by Jefferson: Sequential Jefferson elects AB, representing 12 of the 16 voters, whereas sequential Webster elects AD, representing 14 out of 16. Nonsequential versions of each method yield the same outcomes, suggesting that Webster gives outcomes at least as representative as, and sometimes more representative than, Jefferson for both the sequential and nonsequential versions of each method. But that is not always true.

Proposition 4

For committees of size 2 elected by the nonsequential versions of Jefferson and Webster, the Webster committee is equally or more representative. The same is true for the sequential versions of each method if one candidate is the unique approval-vote winner. But for committees larger than size 2 for both the sequential and nonsequential versions, either the Jefferson or the Webster committee may be more representative.

Proof

See “Appendix”.

To illustrate the proof of Proposition 4 as it pertains to committees of size 2 for the nonsequential method, we use Example 1 (see Sect. 2). Figure 1 shows the six possible committees in two dimensions—the horizontal dimension is R = v1 + v2, and the vertical dimension is v2. In Example 1, v1(AB) = 7 and v2(AB) = 5, so R(AB) = 12, giving AB at the point (12, 5).

Fig. 1 Properties (R, v2) of all possible committees in Example 1

To visualize the Jefferson maximization, observe that all six points lie on one side of the line J, which has slope −2 (and takes the form v2/2 + R = s, where s is the score). Imagine moving the line J parallel to itself until it touches one of the six points representing the committees, keeping the other five points on the same side. It is clear that the committee that comes first with respect to line J is AB.

For the Webster maximization, the process is similar, except that the initial line, labelled W, has slope −3 (and takes the form v2/3 + R = s). This time, the committee that the (extended) W line touches first is AD, which also happens to be the most representative, because its R is highest.
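The two slopes can be checked by rewriting the size-2 scores in terms of R and v2, using only the definitions of the nonsequential scores and of R = v1 + v2 given earlier:

$$ s_{J} (C) = v_{1} (C) + \tfrac{3}{2}v_{2} (C) = R(C) + \tfrac{1}{2}v_{2} (C),\quad s_{W} (C) = v_{1} (C) + \tfrac{4}{3}v_{2} (C) = R(C) + \tfrac{1}{3}v_{2} (C). $$

In Example 1, AB at (12, 5) therefore scores 12 + 5/2 = 14 1/2 under Jefferson and 12 + 5/3 = 13 2/3 under Webster, whereas AD at (14, 0) scores 14 under both. This is why line J reaches AB first and line W reaches AD first.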

In Sect. 4, we analyze the problem of apportioning different numbers of seats to parties. Unlike individuals who can fill only one seat on a committee, parties can fill multiple seats in a legislature.

4 The Jefferson and Webster methods applied to parties

States in the Balinski and Young (2001) model, which receive seats in the US House based on their populations, are akin to parties in our model, which receive seats based on the votes that they receive. Currently in the United States, voters can vote for only one party, but under AV, a voter can vote for as many parties as he or she likes. How do we apply the Jefferson and Webster methods to determine how many seats each party receives?

To calculate the numbers of seats that parties receive, we assume that each party nominates as many candidates, s, as will be elected to the legislature. Thus, party I nominates candidates i1, i2, …, is; if an apportionment method allocates k ≤ s seats to I, they go to candidates i1, i2, …, ik. We assume that a voter who votes for a party approves of all of its candidates.

The following example illustrates how the Jefferson method would allocate seats to parties when voters are not restricted to voting for one party but can vote for more than one:

Example 9

Two of six candidates {a1, a2, b1, b2, c1, c2} from parties {A, B, C} to be elected. The numbers of voters who approve of different parties are

$$ 7 {:}\;AB\quad 5 {:}\;AC \quad 2 {:}\;B \quad 3 {:}\;C, $$

which translates into votes for the following sets of candidates:

$$ 7 {:}\;a_{1} a_{2} b_{1} b_{2} \quad 5 {:}\;a_{1} a_{2} c_{1} c_{2} \quad 2 {:}\;b_{1} b_{2} \quad 3 {:}\;c_{1} c_{2} . $$

Each of the two candidates nominated by parties A, B, and C initially receives, respectively, 12, 9, and 8 approvals. Thus, candidate a1 is the first candidate elected under sequential Jefferson. On the second round, deservingness scores must be compared for a2 (since a1 already has been elected from party A), b1 (from party B), and c1 (from party C). We put the summations in the format of Example 8 in the “Appendix” but exclude from them subsets of voters who contribute 0 to a candidate’s approval score:

$$ {a_{2}}{:}\,7\left( {1 /2} \right) + 5\left( {1 /2} \right) = \underline{6} ; \quad {b_{1}}{:}\,7\left( {1 /2} \right) + 2\left( 1 \right) = 5\,\,1 /2; \quad {c_{1}}{:}\,5\left( {1 /2} \right) + 3\left( 1 \right) = 5\,\,1 /2; $$

so a2 is the second candidate elected, making the winning pair a1a2. Under nonsequential Jefferson, the satisfaction scores of the six possible winning pairs of candidates are

$$ \begin{array}{*{20}l} {a_{1} a_{2} {:}\;7\left( {3 /2} \right) + 5\left( {3 /2} \right) = \underline{18} ; } \hfill & {a_{1} b_{1} {:}\;7\left( {3 /2} \right) + 5\left( 1 \right) + 2\left( 1 \right) = 17 \,\,1 /2;} \hfill \\ {a_{1} c_{1} {:}\;7\left( 1 \right) + 5\left( {3 /2} \right) + 3(1) = 17 \,\,1 /2;} \hfill & {b_{1} b_{2} {:}\;7\left( {3 /2} \right) + 2\left( {3 /2} \right) = 13 \,\,1 /2;} \hfill \\ {b_{1} c_{1} {:}\;7\left( 1 \right) + 5\left( 1 \right) + 2(1) + 3(1) = 17;} \hfill & {c_{1} c_{2} {:}\;5\left( {3 /2} \right) + 3\left( {3 /2} \right) = 12;} \hfill \\ \end{array} $$

so a1a2 again is the winning pair. Observe that 7 + 5 = 12 of the 17 voters are represented by this pair.

By contrast, sequential Webster, after choosing a1, chooses c1 on the second round, because the deservingness scores are

$$ a_{2} {:}\;7\left( {1 /3} \right) + 5\left( {1 /3} \right) = 4; \quad b_{1} {:}\;7\left( {1 /3} \right) + 2\left( 1 \right) = 4\,\,1 /3;\quad c_{1} {:}\;5\left( {1 /3} \right) + 3\left( 1 \right) = \underline{4\,\,2 /3} ; $$

so a1c1 is the winning pair and represents 15 of the 17 voters. Under nonsequential Webster, the satisfaction scores of the six pairs of candidates are

$$ \begin{array}{*{20}l} {a_{1} a_{2} {:}\;7\left( {4 /3} \right) + 5\left( {4 /3} \right) = 16; } \hfill & {a_{1} b_{1} {:}\;7\left( {4 /3} \right) + 5\left( 1 \right) + 2\left( 1 \right) = 16\,\,1 /3;} \hfill \\ {a_{1} c_{1} {:}\;7\left( 1 \right) + 5\left( {4 /3} \right) + 3(1) = 16\,\,2 /3;} \hfill & {b_{1} b_{2} {:}\;7\left( {4 /3} \right) + 2\left( {4 /3} \right) = 12;} \hfill \\ {b_{1} c_{1} {:}\;7\left( 1 \right) + 5\left( 1 \right) + 2(1) + 3(1) = \underline{17} ;} \hfill & {c_{1} c_{2} {:}\;5\left( {4 /3} \right) + 3\left( {4 /3} \right) = 10\,\,2 /3;} \hfill \\ \end{array} $$

so b1c1 is the winning pair, which represents all 17 voters.
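The four outcomes of Example 9 can be reproduced with a short script. The following is a minimal sketch, not the authors' implementation: the function names (`expand`, `weight`, `satisfaction`, `sequential`, `represented`) are hypothetical, the weights are written as h/(h + j − 1) with h = 1 for Jefferson and h = 1/2 for Webster, and ties are assumed to be broken in favor of the candidate listed first (so that a1 precedes a2).

```python
from fractions import Fraction
from itertools import combinations

# Example 9: (number of voters, parties approved).
PARTY_BALLOTS = [(7, "AB"), (5, "AC"), (2, "B"), (3, "C")]
SEATS = 2

def expand(party_ballots, seats):
    """Replace each approved party by its candidates, e.g. A -> a1, a2."""
    return [
        (n, frozenset(f"{p.lower()}{i}" for p in parties for i in range(1, seats + 1)))
        for n, parties in party_ballots
    ]

def weight(j, h):
    """Weight of a voter's j-th elected approved candidate: 1, h/(h+1), ..."""
    return h / (h + j - 1)

def satisfaction(committee, ballots, h):
    """Nonsequential criterion: summed weights over all voters."""
    total = Fraction(0)
    for n, approved in ballots:
        hits = len(approved & set(committee))
        total += n * sum(weight(j, h) for j in range(1, hits + 1))
    return total

def sequential(ballots, candidates, seats, h):
    """Each round, elect the candidate with the highest deservingness score."""
    elected = []
    for _ in range(seats):
        def deservingness(c):
            return sum(
                n * weight(len(approved & set(elected)) + 1, h)
                for n, approved in ballots
                if c in approved
            )
        remaining = [c for c in candidates if c not in elected]
        # max() keeps the first of several tied candidates (assumed tie-break).
        elected.append(max(remaining, key=deservingness))
    return elected

def represented(committee, ballots):
    """Number of voters who approve of at least one committee member."""
    return sum(n for n, approved in ballots if approved & set(committee))

ballots = expand(PARTY_BALLOTS, SEATS)
candidates = sorted(set().union(*(approved for _, approved in ballots)))

for name, h in [("Jefferson", Fraction(1)), ("Webster", Fraction(1, 2))]:
    seq = sequential(ballots, candidates, SEATS, h)
    best = max(combinations(candidates, SEATS),
               key=lambda pair: satisfaction(pair, ballots, h))
    print(name, "sequential:", seq, "represents", represented(seq, ballots))
    print(name, "nonsequential:", best, "score", satisfaction(best, ballots, h),
          "represents", represented(best, ballots))
```

Exact fractions are used so that scores such as 16 2/3 are compared without rounding error. The nonsequential search enumerates every pair of candidates, which is feasible here but grows quickly with the number of candidates and seats.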

In applying apportionment methods to parties, we have assumed that more than one candidate can be elected from a party. In fact, as Example 9 illustrated for Jefferson, all of the winners may be from the same party.

Multiwinner approval voting rules are vulnerable to manipulation by strategic voting. To illustrate, consider the outcome, a1a2, under sequential and nonsequential Jefferson in Example 9. Assume that polls just before the election show that party A is a shoo-in to win one seat (a1) and possibly two (a1a2). If you are one of the 5 AC voters and would prefer a committee of a1c1 to a1a2, you might well consider voting for just C to boost the chances of c1 being the second winner, making the outcome a1c1.

More specifically, if you switch from AC to C, you increase the number of C voters from 3 to 4 and reduce the number of AC voters from 5 to 4. Then the outcome under sequential and nonsequential Jefferson changes from a1a2 to, respectively, a1c1 and a tie between a1c1 and b1c1, thus producing a more diverse committee.Footnote 11 Put another way, your sincere preference for a committee comprising members of parties A and C—or at least a more diverse committee than a1a2—is abetted by voting for just C, demonstrating that sincerity is not a Nash equilibrium for Jefferson in Example 9.
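The arithmetic behind this change can be laid out in the format used for Example 9. What follows is a recomputation added here as a check, for the altered profile 7: AB, 4: AC, 2: B, 4: C. The candidates of parties A, B, and C now receive 11, 9, and 8 approvals, so a1 is still elected first. The second-round Jefferson deservingness scores are

$$ a_{2} {:}\;7\left( {1 /2} \right) + 4\left( {1 /2} \right) = 5\,\,1 /2; \quad b_{1} {:}\;7\left( {1 /2} \right) + 2\left( 1 \right) = 5\,\,1 /2; \quad c_{1} {:}\;4\left( {1 /2} \right) + 4\left( 1 \right) = \underline{6} ; $$

so sequential Jefferson elects a1c1. The nonsequential Jefferson satisfaction scores are

$$ \begin{array}{*{20}l} {a_{1} a_{2} {:}\;7\left( {3 /2} \right) + 4\left( {3 /2} \right) = 16\,\,1 /2; } \hfill & {a_{1} b_{1} {:}\;7\left( {3 /2} \right) + 4\left( 1 \right) + 2\left( 1 \right) = 16\,\,1 /2;} \hfill \\ {a_{1} c_{1} {:}\;7\left( 1 \right) + 4\left( {3 /2} \right) + 4(1) = \underline{17} ;} \hfill & {b_{1} b_{2} {:}\;7\left( {3 /2} \right) + 2\left( {3 /2} \right) = 13\,\,1 /2;} \hfill \\ {b_{1} c_{1} {:}\;7\left( 1 \right) + 4\left( 1 \right) + 2(1) + 4(1) = \underline{17} ;} \hfill & {c_{1} c_{2} {:}\;4\left( {3 /2} \right) + 4\left( {3 /2} \right) = 12;} \hfill \\ \end{array} $$

so a1c1 and b1c1 tie at 17, ahead of a1a2 at 16 1/2.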

That strategic voting may be optimal is, of course, not surprising, because virtually all voting systems are vulnerable to manipulation. What complicates matters in the case of the apportionment methods is that the determination of winners, and therefore optimal strategies to produce a preferred outcome, is anything but straightforward. This makes it difficult to use information from polls or other sources to make optimal strategic choices, especially for nonsequential versions of the apportionment methods.

We next turn to the case of just two parties (e.g., Democratic and Republican) or, in nonpartisan elections, two factions, one liberal (e.g., change oriented) and one conservative (status quo oriented). Call the parties A and B, and assume that each voter votes for only one party. Let the fraction of voters who support A be f, so the fraction of B supporters is 1 − f.

If s seats are to be allocated, the question that the apportionment methods answer is how many seats are to be received by each party. Let k = 1, 2, …, s −1. Each apportionment method determines thresholds t(s, k) such that party A receives k seats ifFootnote 12

$$ t\left( {s,k - 1} \right) < f < t\left( {s,k} \right). $$

Note that party A receives no seats if f < t(s, 0) and s seats if t(s, s − 1) < f.

Recall from Sect. 2 that the weights used in the Jefferson deservingness function for electing 1, 2, 3, 4, … approved candidates are

$$ 1,\,1 /2,\,1 /3,\,1 /4,\, \ldots $$

and those used in the Webster deservingness function are

$$ 1,\,1 /3,\,1 /5,\,1 /7,\, \ldots $$

As noted earlier, these sequences are equivalent to

$$ {\text{either }}\quad 1, \frac{h}{h + 1}, \frac{h}{h + 2}, \frac{h}{h + 3},\, \ldots \quad {\text{or}}\quad \frac{1}{h + 0}, \frac{1}{h + 1}, \frac{1}{h + 2}, \frac{1}{h + 3}, \, \ldots , $$

where h = 1 for Jefferson and h = 1/2 for Webster (Proposition 5 holds for other values of h besides 1 and 1/2).
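The equivalence of the two forms can be checked directly: multiplying each term of the second sequence by h gives the first, and a common positive factor does not change which candidate or committee scores highest. The sketch below illustrates this for the first four weights; the function names `normalized_weights` and `reciprocal_weights` are hypothetical labels chosen for this illustration.

```python
from fractions import Fraction

def normalized_weights(h, count):
    """First `count` terms of 1, h/(h+1), h/(h+2), ..."""
    h = Fraction(h)
    return [h / (h + j) for j in range(count)]

def reciprocal_weights(h, count):
    """First `count` terms of 1/(h+0), 1/(h+1), 1/(h+2), ..."""
    h = Fraction(h)
    return [1 / (h + j) for j in range(count)]

for name, h in [("Jefferson", Fraction(1)), ("Webster", Fraction(1, 2))]:
    normalized = normalized_weights(h, 4)
    reciprocal = reciprocal_weights(h, 4)
    # Scaling the reciprocal form by h recovers the normalized form.
    assert [h * w for w in reciprocal] == normalized
    print(name, [str(w) for w in normalized], [str(w) for w in reciprocal])
```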

Proposition 5

Assume in a two-party election that s seats are to be filled, that each party has s candidates, and that every voter approves of every candidate of one party (but no candidates of the other party). Fix h > 0. In an apportionment method based on the weights

$$ \frac{1}{h + 0}, \frac{1}{h + 1}, \frac{1}{h + 2}, \frac{1}{h + 3}, \ldots , $$

the thresholds for k = 0, 1, 2, …, s − 1, are given by

$$ t\left( {h,s,k} \right) = \frac{h + k}{2h + s - 1}. $$

Proof

See “Appendix”.

The thresholds for our two apportionment methods are the following:

$$ {\text{Jefferson:}}\,t\left( {1,s,k} \right) = \frac{k + 1}{s + 1};\quad {\text{Webster:}}\,t\left( {1 /2,s,k} \right) = \frac{2k + 1}{2s}. $$

For example, if s = 5 and k varies from 0 to 4, the minima for winning 1–5 seats are

$$ {\text{Jefferson:}}\,1 /6,\,1 /3,\,1 /2,\,2 /3,\,5 /6;\quad {\text{Webster:}}\,1 /10,\,3 /10,\,1 /2,\,7 /10,\,9 /10. $$

Thus, to win one seat, a party needs to win at least 1/6th of the vote under Jefferson and 1/10th under Webster; to win all five seats requires 5/6ths of the vote under Jefferson and 9/10ths under Webster.

Note that the thresholds, together with the endpoints 0 and 1, are equally spaced under Jefferson (at intervals of 1/6) but not under Webster.Footnote 13 The Webster thresholds are equally spaced between 1/10 and 9/10, with a difference of 2/10; at the extremes, however, Webster requires a relatively small fraction (1/10) to win one seat, and a relatively large fraction (9/10) to win all five seats.

In the two-party case, call the thresholds of a divisor apportionment method even-handed if they render the number of votes a party needs for an additional seat independent of the number of seats it already holds. By this definition, Jefferson intervals are even-handed, whereas Webster intervals, which make attaining the first seat “easy” and the last seat “hard”, are not.Footnote 14
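The thresholds and their spacing can be tabulated from the formula in Proposition 5. The sketch below is an illustration under that formula only; the helper names `threshold`, `thresholds`, and `gaps` are hypothetical. The function `gaps` returns the widths of the vote-share intervals that yield 0, 1, …, s seats, so equal widths correspond to even-handedness as defined above.

```python
from fractions import Fraction

def threshold(h, s, k):
    """t(h, s, k) = (h + k) / (2h + s - 1), as stated in Proposition 5."""
    h = Fraction(h)
    return (h + k) / (2 * h + s - 1)

def thresholds(h, s):
    """Thresholds for k = 0, 1, ..., s - 1."""
    return [threshold(h, s, k) for k in range(s)]

def gaps(h, s):
    """Widths of the vote-share intervals that yield 0, 1, ..., s seats."""
    cuts = [Fraction(0)] + thresholds(h, s) + [Fraction(1)]
    return [b - a for a, b in zip(cuts, cuts[1:])]

for name, h in [("Jefferson", Fraction(1)), ("Webster", Fraction(1, 2))]:
    print(name, "thresholds:", [str(t) for t in thresholds(h, 5)])
    print(name, "interval widths:", [str(g) for g in gaps(h, 5)])
```

For s = 5 the Jefferson widths are all 1/6, whereas the Webster widths are 1/10 at both ends and 1/5 in between.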

Define the quota qi of party i as the fraction fi of the vote it receives times the number of seats, s, to be apportioned: qi = fi × s. For example, if a council has 5 seats, the quota of a party that receives 32% of the vote is 0.32 × 5 = 1.6. That is, that party is “entitled” to exactly 1.6 seats.

The number of seats that a party receives must be an integer;Footnote 15 recall from Sect. 1 that a party stays within the quota if that number equals its quota, rounded up or down. From the thresholds we gave above for a 5-seat council, Jefferson would give this party one seat (because 0.32 is less than 1/3), but Webster would give it two seats (because 0.32 is greater than 3/10). As this example illustrates in the two-party case, Webster favors the smaller party and Jefferson the larger party: the larger party's 68% of the vote gives it a quota of 3.4, so it would obtain four seats under Jefferson but only three seats under Webster.
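The 32%/68% example can also be checked without the threshold formula. When every voter approves of exactly one party, the deservingness score of a party's next candidate is its vote total times the next weight, so seats can be awarded one at a time to the party with the largest votes/(h + seats already held). The sketch below assumes this highest-averages form; the names `divisor_allocation` and `within_quota` and the vote counts of 32 and 68 are hypothetical choices for the illustration.

```python
from fractions import Fraction
from math import ceil, floor

def divisor_allocation(votes, seats, h):
    """Award each seat to the party with the largest votes / (h + seats held)."""
    h = Fraction(h)
    held = {party: 0 for party in votes}
    for _ in range(seats):
        winner = max(votes, key=lambda p: Fraction(votes[p]) / (h + held[p]))
        held[winner] += 1
    return held

def within_quota(votes, seats, allocation):
    """True if every party's seats equal its quota rounded up or down."""
    total = sum(votes.values())
    for party, n in votes.items():
        quota = Fraction(n, total) * seats
        if not floor(quota) <= allocation[party] <= ceil(quota):
            return False
    return True

# Hypothetical vote counts matching shares of 32% and 68%.
votes = {"smaller": 32, "larger": 68}
for name, h in [("Jefferson", 1), ("Webster", Fraction(1, 2))]:
    allocation = divisor_allocation(votes, 5, h)
    print(name, allocation, "within quota:", within_quota(votes, 5, allocation))
```

Both allocations stay within the quota here; they differ only in whether the quota of 1.6 is rounded down (Jefferson) or up (Webster).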

An apportionment method satisfies quota if every party always stays within the quota. If s = 1, it is clear that any method of allocating seats satisfies quota. If s ≥ 2, satisfying quota (disregarding ties) means

$$ \frac{k}{s} < t\left( {h, s, k} \right) < \frac{k + 1}{s} , \quad {\text{or}}\quad \frac{k}{s} < \frac{h + k}{2h + s - 1} < \frac{k + 1}{s} \quad \left( {k = 0, 1, 2, \ldots , s - 1} \right) . $$

Proposition 6 shows that, for the same two-party case to which Proposition 5 pertains, the thresholds t(h, s, k) satisfy quota provided that a simple condition on h holds when s > 2. Note that ties are again disregarded in the proof.

Proposition 6

If there are two parties, s ≥ 2, and the context is the same as in Proposition 5, then quota is satisfied for all positive values of h if s = 2, or, if s > 2, for any positive value of \( h < \frac{s - 1}{s - 2} \).

Proof

See “Appendix”.
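The condition in Proposition 6 can be spot-checked numerically by testing the quota inequalities for every k. The sketch below is a sanity check on particular values, not a substitute for the proof; the function names `threshold` and `satisfies_quota` are hypothetical, and ties (equalities) are counted as failures because the inequalities are strict.

```python
from fractions import Fraction

def threshold(h, s, k):
    """t(h, s, k) = (h + k) / (2h + s - 1), as stated in Proposition 5."""
    h = Fraction(h)
    return (h + k) / (2 * h + s - 1)

def satisfies_quota(h, s):
    """True if k/s < t(h, s, k) < (k + 1)/s for every k = 0, ..., s - 1."""
    return all(
        Fraction(k, s) < threshold(h, s, k) < Fraction(k + 1, s)
        for k in range(s)
    )

# Jefferson (h = 1) and Webster (h = 1/2) for councils of 2 to 20 seats.
print(all(satisfies_quota(1, s) for s in range(2, 21)))
print(all(satisfies_quota(Fraction(1, 2), s) for s in range(2, 21)))

# For s = 5 the bound is (s - 1)/(s - 2) = 4/3: h = 5/4 passes, h = 3/2 fails.
print(satisfies_quota(Fraction(5, 4), 5), satisfies_quota(Fraction(3, 2), 5))
```

Because h = 1 and h = 1/2 both lie below (s − 1)/(s − 2) for every s > 2, Jefferson and Webster pass the check for all of the council sizes tried.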

If more than two parties compete for seats, Proposition 6 no longer is true (Balinski and Young 2001, chap. 10). A prominent nondivisor method of apportionment, proposed by Alexander Hamilton, satisfies quota but is subject to certain nonmonotonicity problems—for example, the Alabama paradox, whereby the apportionment of a party may decline when the number of seats in a legislature increases (Balinski and Young 2001). It is possible, however, to marry Hamilton with Jefferson or Webster—or any of the other three divisor methods—and stay within the quota and avoid most paradoxes (Potthoff 2014).

Balinski and Young advocate Webster in the apportionment of representatives to states, because it is least biased, showing no systematic tendency to favor either large or small states, and it almost always satisfies quota. But in the apportionment of seats to parties in a legislature, they advocate Jefferson, because it discourages small parties.

Under Jefferson, a small party may win no seats, even when it would win one under Webster. Thus, under Jefferson, smaller parties have an incentive to merge in order better to ensure that they win some seats. With its tendency to deter the fractionalization of parliaments into many small parties, Jefferson also facilitates the formation of a governing coalition comprising a few large parties (e.g., center-left or center-right) that together hold a majority of seats.

We believe that both Jefferson and Webster are likely to foster more cooperation among political parties if voters, using an approval ballot, can approve of more than one party. In effect, voters would be able to support coalitions of parties that they prefer in a governing coalition rather than being restricted to singling out one party for exclusive support.

To be sure, some voters will prefer to approve of only one party if they consider it the only ideologically acceptable party. But other voters are likely to find more than one party—perhaps for different reasons—compatible with their views, even in divided societies like Northern Ireland and now, increasingly, in the United States and European countries.Footnote 17

5 Conclusions

The straightforward extension of approval voting to the election of multiple winners can create a tyranny of the majority. A majority faction or party can win all of the seats on a committee or in a legislature, or at least a disproportionate number of them, giving little or no voice to the views of minorities.

By devaluing the approval votes of voters who have one or more of their approved candidates elected, apportionment methods as we apply them enable different individuals or groups to gain representation, and the resulting voting body to reflect a wider range of viewpoints. We focused on two well-known divisor methods of apportionment, Jefferson and Webster, for devaluing approval votes in order to determine the winners in a multiwinner approval election. Both of these methods are widely used today, though not in combination with approval voting.

For use with approval voting, we distinguished sequential and nonsequential versions of each method. Although either version may elect a more representative set of candidates—in which more voters approve of at least one winning candidate—the nonsequential version is more likely to maximize representativeness when the two versions differ.

The nonsequential versions of the Jefferson and Webster methods are computationally complex, but they should be feasible, with adequate computer support, in many elections. The fact that sequential and nonsequential versions of each method can produce different, even disjoint, sets of winners shows that their impact on who is elected may be decidedly nontrivial. But the fact that the different sets of winners chosen by the two versions tend to yield similar outcomes makes the choice between them less consequential.

The main contribution of our paper has been to compare the Jefferson and Webster methods in the approval-voting context. They can produce sets of winners that not only differ but, for their nonsequential versions, also are disjoint. For both versions, Webster is generally, but not always, more representative than Jefferson.

We also showed that the Jefferson (h = 1) and Webster (h = 1/2) methods are special cases along a continuum defined by the parameter h. One implication, which we did not pursue, is that a compromise between them is available by choosing a value of h (e.g., 2/3) between 1/2 and 1. For a two-party election using approval voting with either the sequential or nonsequential version, we gave formulas for the vote thresholds as functions of h (and of the number of seats to be filled) and provided conditions for the satisfaction of quota (so each party obtains its exact entitlement, rounded either up or down).

Although the apportionment methods are vulnerable to strategic voting, determining optimal manipulation strategies appears to be hard. The Jefferson method, which in a two-party election has the same vote thresholds as cumulative voting for winning seats on a council, eliminates the need for a party to strategize about how many candidates to run to ensure proportional representation (also true of the Webster method, but with different thresholds).

In two-party competition, the vote thresholds for winning are spaced evenly by Jefferson but not by Webster, making the former even-handed. On the other hand, if no restriction is imposed on the number of parties and the methods produce different winners, more voters will tend to approve of the Webster winners than the Jefferson winners.

It seems fitting that in an election with multiple winners, voters should be able to support multiple candidates or parties. Approval ballots provide voters with such an enhanced ability to express themselves, which seems likely to foster more cooperation across ideological and party lines and attenuate the oft-observed gridlock that hamstrings many elected voting bodies today.