Abstract
Consider an election where N seats are distributed among parties with proportions p 1,…,p m of the votes. We study, for the common divisor and quota methods, the asymptotic distribution, and in particular the mean, of the seat excess of a party, i.e. the difference between the number of seats given to the party and the (real) number Np i that yields exact proportionality. Our approach is to keep p 1,…,p m fixed and let N→∞, with N random in a suitable way.
In particular, we give formulas showing the bias favouring large or small parties for the different election methods.
Similar content being viewed by others
Avoid common mistakes on your manuscript.
1 Introduction
The idea of a proportional election method is that each party gets a number of seats that is proportional to the number of votes. The same mathematical problem arises if seats are to be apportioned (before the election) between states in a union or between multi-member constituencies according to their populations. In particular, the problem of apportionment to the House of Representatives in the United States has been the source of much debate as well as much research since 1790, including the invention of several important election methods; see Balinski and Young (2001) for a detailed history. (The European Union has yet to agree on a similar procedure; see Grimmett (2012) for a recent attempt to replace the current political bartering with a “mathematical formula”, i.e. a well-defined method to allocate seats in the European parliament to the member states.) To fix the terminology, we talk in this paper about parties in an election; the case of apportionment between states, etc. differs mathematically only in language.
Of course, exact proportionality is in general not possible, since the number of seats has to be an integer, and the election method can be seen as a method to “round” the real numbers that yield exact proportionality to integers. The purpose of this paper is to study the resulting deviation from perfect proportionality, i.e. the “rounding errors” introduced by the method. In particular, we are interested in whether there is a systematic bias favouring large or small parties for various election methods.
This problem has received a considerable amount of attention going back to (at least) Pólya (1918a, 1918b, 1919). (Sainte-Laguë 1910b did a related comparison between two different methods already in 1910, see Example 3.5.) Some investigations have analysed data from real elections or from simulations (with party sizes chosen at random according to some predetermined distributions), see e.g. Balinski and Young (2001) and Schuster et al. (2003). The problem has also been studied theoretically by assuming that the parties have random sizes with a uniform distribution of the relative sizes over all possibilities, see Pólya (1918a, 1918b, 1919), Schuster et al. (2003), Heinrich et al. (2005) and Drton and Schwingenschlögl (2004, 2005, 2006). Random party sizes with more general distributions have been considered by Heinrich et al. (2004, 2005) and Schwingenschlögl (2008). (See also Sect. 6 below.)
The approach in the present paper is different. We consider a number of parties with fixed sizes and let the total number of seats N be random. More precisely, we let N be chosen uniformly at random in 1,2,…,N 0 for some large integer N 0, and then we take the limit as N 0→∞. (Limits obtained in this way, if they exist, can be regarded as results for “a random positive integer” N. See also Remark 3.1.) The same approach has been used in Janson and Linusson (2012) for a related problem (the Alabama paradox for Hamilton’s method).
We use this approach for several different election methods, and calculate, for given sizes of the parties (under a technical condition, see Sects. 3 and 11), the (asymptotic) distribution of the deviation from perfect proportionality as well as its mean, i.e. the systematic bias, and its variance. In particular, this shows how the bias depends on the sizes of the parties in a more detailed way than previous studies that consider random party sizes (see the references given above).
The election methods that we consider are divisor methods of the “linear” (or “stationary”) type (including Jefferson/D’Hondt and Webster/Sainte-Laguë) and quota methods (including Hamilton/Hare and Droop). See further Sect. 2, and Appendices A–B.
The main results are stated in Sect. 3 and proven in Sects. 4–5. Section 6 show that the results hold also in the more traditional approach with deterministic house size N and random party sizes. The following sections contain some simple applications of our main results: Sect. 7 discusses the probability of violating quota, Sect. 8 the expected gain or loss for parties forming an apparentement (coalition), Sect. 9 the Sainte-Laguë divergence studied by Heinrich et al. (2004, 2005) and Sect. 10 some related goodness-of-fit functionals. Section 11 discusses our technical condition and the case of rational sizes of the parties. The appendices contain some background material.
2 Notation
We assume throughout that we have m parties with v i votes for party i; we let \(V:=\sum_{i=1}^{m}v_{i}\) be the total number of votes and p i :=v i /V the proportions of votes for party i. (In real elections there may also be votes for parties that are disqualified because of a threshold rule, blank votes, and other invalid votes; these are ignored here so \(\sum_{i=1}^{m} p_{i}=1\). We consider only parties that have received at least one vote, so p i >0.) We further assume that N seats are to be distributed (the house size), and let s i be the number of seats given to party i; thus \(\sum_{i=1}^{m}s_{i}=N\). We occasionally write s i (N) when we need to specify the house size.
Strict proportionality would give
seats to party i. (This is usually not an integer.) We define the seat excess for party i to be the difference
Note that
We use the standard notations ⌊x⌋ and ⌈x⌉ for rounding down and up of a real number x, i.e. the largest integer ≤x and the smallest integer ≥x, respectively. We denote the fractional part of x by {x}:=x−⌊x⌋.
More generally, let α be a real number. We say that the α-rounding of a real number x is the integer [x] α such that
equivalently,
If x−α is an integer, we regard [x] α as multiple-valued, and accept both x−α and x−α+1 as values of [x] α . (This exceptional case typically occurs at ties in the election methods.) Consequently,
except the case when x−α is an integer and both x−α and x−α+1 are possible values. If 0≤α≤1, this means that x is rounded down if its fractional part is less than α and up if its fractional part is greater than α. In particular, \(\alpha=\frac{1}{2}\) yields standard rounding, α=0 yields rounding up and α=1 yields rounding down. (But note that we allow also α<0 or α>1, in which case |x−[x] α | may be greater than 1.)
The two types of election methods that we consider can be defined as follows; see further Appendices A–B where further details and equivalent definitions are given. Note that all methods that we consider are homogeneous, i.e., the result depends only on the proportions p i of votes, but not on the total number V of votes; hence we will mainly use p i rather than v i in the sequel.
-
The β-linear divisor method, or the divisor method with divisors d(n)=n+β−1 (where β is a real number): Let
$$ s_i:= \biggl[\frac{v_i}{D} \biggr]_{\beta}= \biggl[\frac{p_i}{D'} \biggr]_{\beta}, $$(2.7)where D (or D′=D/V) is chosen such that \(\sum_{i=1}^{m}s_{i}=N\). (If β<0 or β>1, there are some exceptions for small N, see Appendix A.1.) This includes several important methods, in particular β=1 (Jefferson, D’Hondt), β=1/2 (Webster, Sainte-Laguë), β=0 (Adams).
-
The γ-quota method (where γ is a real number): Let Q:=V/(N+γ) and let
$$ s_i:= \biggl[\frac{v_i}{Q} \biggr]_{\alpha}= \bigl[(N+\gamma)p_i \bigr]_{\alpha}, $$(2.8)where α is chosen such that \(\sum_{i=1}^{m}s_{i}=N\). In principle, γ can be any real number (provided N+γ>0); in practice only the cases γ=0 (Hamilton, Hare, method of largest remainder) and γ=1 (Droop) are important, although also γ=2 has some limited use.
Remark 2.1
In both divisor methods and quota methods ties may occur; then the α-rounding in (2.7) or (2.8) has two possible values (because x−α∈ℤ in (2.4)) for (at least) two parties, with (at least) one of them rounded to the larger value and (at least) one rounded to the smaller value. We assume that ties are resolved at random, but ties will usually not be a problem for us because of our assumptions below.
We let U(a,b) denote the uniform distribution on the interval (a,b). We will use U i and \(\widetilde{U}_{i}\) for random variables with U i ∼U(0,1) and \(\widetilde{U}_{i}\sim\mathrm{U}(-\frac{1}{2},\frac{1}{2})\), and we assume that such variables with different indices are independent. Recall that then \(\mathbb{E}U_{i}=1/2\), \(\mathbb{E}\widetilde{U}_{i}=0\) and \(\operatorname {Var}U_{i}=\operatorname{Var}\widetilde{U}_{i}=1/12\).
We denote convergence in distribution by \(\stackrel{\mathrm {d}}{\longrightarrow}\), convergence in probability by \(\stackrel{\mathrm{p}}{\longrightarrow}\) and equality in distribution by \(\stackrel{\mathrm{d}}{=}\).
3 Main results
As explained in the introduction, we assume that p 1,…,p m are fixed, and let the total number of seats (house size) N be random and uniformly distributed on {1,…,N 0} for some large integer N 0; we then consider limits as N 0→∞. For convenience, we denote this situation by \({{N\stackrel{\mathrm {p}^{*}}{\longrightarrow}\infty}}\). (Note that this implies \(N\stackrel{\mathrm{p}}{\longrightarrow }\infty\).)
Remark 3.1
We assume that N is distributed in this way for simplicity, but the results can easily be extended to more general sequence of random N∈ℕ. For example, we may take N uniformly distributed on an interval [N 1(k),N 2(k)] and let k→∞, for any two sequences N 1(k) and N 2(k) of positive integers with N 2(k)−N 1(k)→∞. In fact, we conjecture that the results hold for any N that satisfies the weak condition
(This implies \(N\stackrel{\mathrm{p}}{\longrightarrow}\infty\) and excludes parity restrictions.) Note that Weyl’s Lemma 4.1 extends to this situation. However, we have not verified all details for this case, and we leave further investigations to the interested reader.
In order to obtain simple asymptotic results, we will usually make one mathematical simplification: we assume that p 1,…,p m are linearly independent over ℚ, i.e., that there is no relation
with all coefficients a i rational and not all 0. (Equivalently, there is no relation (3.2) with integer coefficients a i , not all 0.) The case when this assumption does not hold is discussed in Sect. 11.
Remark 3.2
Mathematically, this assumption is reasonable, since if we choose p 1,…,p m at random (uniformly given that their sum is 1, as in Sect. 6), they will almost surely be linearly independent over ℚ. However, for real elections the assumption is clearly unreasonable since vote counts v i are integers and the p i thus rational numbers. Nevertheless, the results below are good approximations if the numbers p i have large denominators and there are no relations (3.2) with small integers a 1,…,a m , see Sect. 11. For practical purposes we thus can use the results below as good approximations for any p i and any N that is large enough. (However, we do not investigate the rate of convergence, and how large N has to be.)
3.1 Divisor methods
We begin with a simple deterministic bound, valid for all house sizes and numbers of votes, cf. Kopfermann (1991, Satz 6.2.11). It follows from the proof, or from Theorem 3.7 below, that the bounds are best possible. (Among bounds not depending on N; for a given N, the lower bound can be improved when N is so small that not all parties get seats, see Kopfermann (1991, p. 131).) Proofs of the results below are given in Sect. 4.
Theorem 3.3
Consider the β-linear divisor method. Then
Equivalently,
For the asymptotic bias, we have a simple formula.
Theorem 3.4
Consider the β-linear divisor method, and suppose that p 1,…,p m are linearly independent over ℚ. Then, as \({{N\stackrel{\mathrm{p}^{*}}{\longrightarrow}\infty}}\),
Note that the asymptotic bias in (3.5) for a party depends only on its size p i and the number of parties, but not on the distribution of sizes between the other parties. The limit in (3.5) agrees with the formula by Kopfermann (1991, (S) on p. 208) for the advantage (or disadvantage) for different parties, derived by geometric considerations.
In particular, Theorem 3.4 shows that the bias is 0 for every party when β=1/2 (Webster/Sainte-Laguë). It is well-known that this divisor method is (asymptotically) unbiased, see e.g. (Pólya 1919; Balinski and Young 2001; Schuster et al. 2003; Schwingenschlögl and Drton 2004; Drton and Schwingenschlögl 2005) for various justifications of this; it is satisfying that our approach confirms this, and shows that the method really is unbiased for a party of any size.
For β>1/2 (for example Jefferson/D’Hondt with β=1), the bias is positive for p i >1/m and negative for p i <1/m. It is well-known that these methods favour larger parties; we here see exactly how much parties of different sizes are favoured, and that the boundary is exactly at the average size 1/m. Note also that the bias grows with m, especially for large parties. For example, for β=1, a large party with p i ≈1/2 has bias ≈(m−2)/4, while a small party (p i ≪1/m) always has a bias ≈−1/2.
For β<1/2 (for example Adams with β=0), we have the opposite biases; the boundary is still at p i =1/m.
Example 3.5
Sainte-Laguë (1910b) compared in 1910 the results of his method (which he called the method of least squares) and D’Hondt’s method in the case of two parties (that he thought of as two of several parties in an election) with p 1≥p 2. He did not consider the biases in our sense, comparing the distribution of the seats to the distribution of the votes, but his approach means studying the difference of the two biases. (Since we know that Sainte-Laguë’s method (β=1/2) is unbiased, this is equivalent to calculating the bias of D’Hondt’s method (β=1).)
Sainte-Laguë found that if k:=p 2/p 1∈(0,1], then the largest party gets on the average (1−k)/(2(1+k)) seats more with D’Hondt’s method. This agrees with Theorem 3.4, with m=2 and p 1=1/(1+k).
Sainte-Laguë (1910b) also considered the case of random p i , assuming that the ratio p 2/p 1 is uniformly distributed over (0,1], and found by integration the average gain
for the larger party with D’Hondt’s method. (Note that this differs from the average bias 1/4=0.25 for the largest of two parties derived by Schuster et al. (2003) assuming that p 2 is uniformly distributed on (0,1/2), see Theorem 6.4 below with m=2. The reason is that the two probability distributions of the party sizes are different; they are uniform in different ways.)
Remark 3.6
Sainte-Laguë (1910b) seems to have thought that this calculation for two parties applies to any two of several parties, and that therefore, if party A is larger than B, then B would on the average gain 0.193 seats compared to A if the method is changed from D’Hondt’s to Sainte-Laguë’s, regardless of the number and sizes of the other parties. Theorem 3.4 shows that this is incorrect; in fact, if A and B both are small parties, they both have essentially the same bias −1/2 with D’Hondt’s method. It is true that for divisor methods, the distribution of seats to two parties A and B is the same as if the total number of seats for these two parties is distributed between them by the same divisor method, see Theorem A.19. However, if both A and B are small parties, they will by (3.5) get on the average almost 1 seat less together than their proportional share, and if their total number is redistributed between them, most of this loss will be borne by the larger party, which offsets the advantage of the larger party in the redistribution and explains the fallacy. More generally, parties 1 and 2 will by (3.5) on the average get together N′:=N(p 1+p 2)+(β−1/2)(mp 1+mp 2−2) seats. If we distribute this number of seats between the two parties, the same formula would yield that party 1 on the average gets, with \(p_{1}':=p_{1}/(p_{1}+p_{2})\),
seats, in accordance with Theorem 3.4.
We can describe asymptotically not only the mean but also the distribution of the seat excess. We also obtain a joint limit distribution, which is explicit although a little involved.
Theorem 3.7
Consider the β-linear divisor method, and suppose that p 1,…,p m are linearly independent over ℚ. Then, as \({{N\stackrel{\mathrm{p}^{*}}{\longrightarrow}\infty}}\), for each i,
where \(\widetilde{U}_{k}\sim\mathrm{U}(-\frac{1}{2},\frac{1}{2})\) are independent. Moreover, jointly for i=1,…,m,
where the limit random variables X i can be constructed as follows: Let U 1,…,U m ∼U(0,1) and let J∈{1,…,m} be random with ℙ(J=j)=p j , j=1,…,m, with all variables independent. Let
and let, finally,
Of course, as we shall verify in Sect. 4, \(X_{i}\stackrel {\mathrm{d}}{=}\bar{X}_{i}\).
The description in (3.9)–(3.10) of the joint distribution of X 1,…,X m is explicit but rather involved, and it is quite possible that there exists a simpler description; cf. Remark 10.3. For m=2, X 2=−X 1, and thus \((X_{1},X_{2})\stackrel{\mathrm {d}}{=}(\bar{X}_{1},-\bar{X}_{1})\) where \(\bar{X}_{1}:=(2\beta-\nobreak 1)p_{1}-\beta+U_{0}\) by (3.7). We leave it as an open problem to try to find alternative expressions for m≥3.
Corollary 3.8
Consider the β-linear divisor method, and suppose that p 1,…,p m are linearly independent over ℚ. Then, as \({{N\stackrel{\mathrm{p}^{*}}{\longrightarrow}\infty}}\),
and, when i≠j,
Note that the asymptotic distribution of the seat excess depends on β only through its mean \(\mathbb{E}X_{i}\) (i.e., the asymptotic bias in (3.5)); the centred random variable \(X_{i}-\mathbb{E}X_{i}\) does not depend on β. We see also, from (3.10) or (3.11), that this random part of the seat excess tends to be larger in absolute value for a large party.
In the limit as p i →0, i.e. for very small parties, (3.9) and (3.10) yield \(V_{i}\stackrel{\mathrm{d}}{\longrightarrow}U_{i}\) and
Remark 3.9
A simple heuristic motivation for the bias in Theorem 3.4 is that the β-rounding in (2.7) is unbiased if β=1/2, but otherwise, it costs on the average each party β−1/2 seats, compared to v i /D which is truly proportional. In total, this makes a loss of m(β−1/2) seats, which are redistributed proportionally (by shifting D), so party i gets back p i m(β−1/2) seats, on the average, giving a net gain of (mp i −1)(β−1/2), as shown by Theorem 3.4.
For a related, but rigorous, argument, let us compare the methods with parameters β and β+1, temporarily using a superscript to distinguish the methods. (For example, Adams’s method and Jefferson’s method.) By Theorem A.12 (assuming that N is large enough),
Hence,
Taking the limit as \({{N\stackrel{\mathrm{p}^{*}}{\longrightarrow }\infty}}\), we see that the asymptotic distributions also differ by mp i −1. Hence, it is not surprising that the bias in Theorem 3.4 is mp i −1 times a linear function of β.
3.2 Quota methods
We again begin with a simple deterministic bound, cf. Kopfermann (1991, Satz 6.2.3). It follows from the proof, or from Theorem 3.13 below, that the bounds are best possible. Proofs of the results below are given in Sect. 5.
Theorem 3.10
For the γ-quota method,
Again we have a simple formula for the asymptotic bias.
Theorem 3.11
Consider the γ-quota method and suppose that p 1,…,p m are linearly independent over ℚ. Then, as \({{N\stackrel{\mathrm{p}^{*}}{\longrightarrow }\infty}}\),
As for the divisor methods, the asymptotic bias in (3.16) for a party depends only on its size p i and the number of parties, but not on the distribution of sizes between the other parties. The limit in (3.16) agrees with the formula by Kopfermann (1991, (S) on p. 199) for the (dis)advantage for different parties, derived by geometric considerations.
In particular, Theorem 3.11 shows that the bias is 0 for every party when γ=0 (method of largest remainder/Hamilton/Hare). Again it is well-known that this quota method is (asymptotically) unbiased, see e.g. (Pólya 1919; Balinski and Young 2001; Schuster et al. 2003; Schwingenschlögl and Drton 2004; Drton and Schwingenschlögl 2005); our approach confirms this and shows that the method is unbiased for a party of any size.
For γ=1 (Droop), the bias is positive for p i >1/m and negative for p i <1/m, just as for Jefferson’s method discussed above; hence Droop’s method too favours larger parties. (See Kopfermann 1991, p. 118 for figures illustrating the cases m=2 and m=3.) Note, however, that the bias for Droop’s method does not grow with m. A comparison between (3.16) (with γ=1) and (3.5) (with β=1) shows that the bias for Jefferson’s method is m/2 times the bias for Droop’s method, for any party of any size. Hence Droop’s method is less biased than Jefferson’s for every m≥3. (For m=2 not only the biases are the same; in fact the methods coincide when there are only two parties, see Appendix B.1.)
Remark 3.12
There is a simple heuristic explanation of the formula (3.16) for the bias, if we accept the fact that Hamilton/Hare’s method (γ=0) is unbiased. Consider γ=1 (Droop); this method can be seen as doing Hamilton’s method with N+1 seats, but retracting the last seat. Since Hamilton’s method is unbiased, this first distributes on the average (N+1)p i seats to party i. The retracted seat is taken essentially uniformly at random, since it depends on the fractional parts of the numbers q i =Np i only; hence each party loses on the average 1/m seats in that step. This argument gives a bias of p i −1/m, in accordance with (3.16).
Again we have an explicit expression also for the asymptotic distribution of the seat excess.
Theorem 3.13
Consider the γ-quota method and suppose that p 1,…,p m are linearly independent over ℚ. Then, as \({{N\stackrel{\mathrm{p}^{*}}{\longrightarrow }\infty }}\), for each i,
where \(\widetilde{U}_{k}\sim\mathrm{U}(-\frac{1}{2},\frac{1}{2})\) are independent.
Corollary 3.14
Consider the γ-quota method and suppose that p 1,…,p m are linearly independent over ℚ. Then, as \({{N\stackrel{\mathrm{p}^{*}}{\longrightarrow }\infty}}\),
As for the divisor methods in Theorem 3.7, the asymptotic distribution of the seat excess depends on γ only through the mean in (3.16); \(Y_{i}-\mathbb{E}Y_{i}\) does not depend on γ. Moreover, unlike the case of divisor methods, this centred random variable does not depend on the party size p i . In particular, for Hamilton/Hare’s method (γ=0), the asymptotic distribution of the seat excess is the same for every party.
Remark 3.15
Let for simplicity γ=0. Note that the limit variable Y i in (3.17) is uniform \(\mathrm{U}(-\frac{1}{2},\frac{1}{2})\) for m=2; for m≥3 the distribution is more complicated, but as m→∞, the limit converges to \(\mathrm{U}(-\frac{1}{2},\frac{1}{2})\) (by (3.17) and the law of large numbers). Hence, for large m, Y i has essentially the same distribution as for m=2, and although the supremum and infimum of the range of Y i tend to 1 and −1 as m→∞, cf. (3.15), the probability \(\mathbb{P}(Y_{i}\notin[-\frac{1}{2},\frac{1}{2}])\to0\).
Similarly, it is seen from the variance formula (3.18) that the variance is 1/12≈0.08333 for m=2, increases to 5/54≈0.09259 for m=3 and to the maximum 3/32=0.09375 for m=4; the variance then decreases again and converges to 1/12 as m→∞.
The difference from a uniform variable is not very large, and it may in practice be a good approximation to regard the seat excess Δ i in Hamilton’s method as \(\mathrm{U}(-\frac{1}{2},\frac{1}{2})\) for any party and any reasonably large house size.
We have stated Theorem 3.13 only for individual parties. It is easy to see that a joint asymptotic distribution exists, but (unlike the case of divisor methods, cf. Theorem 3.7) we do not know any simple description of it. (There is a description implicit in the proof of Theorem 3.16, see (5.14).)
Theorem 3.16
Consider the γ-quota method and suppose that p 1,…,p m are linearly independent over ℚ. Then, as \({{N\stackrel{\mathrm{p}^{*}}{\longrightarrow }\infty}}\), \(\Delta_{i}\stackrel{\mathrm{d}}{\longrightarrow}\bar{Y}_{i}\) jointly for i=1,…,m for some random variables \(\bar{Y}_{i}\) such that the distribution of \((\bar{Y}_{1}-\gamma p_{1},\dots,\bar{Y}_{m}-\gamma p_{m})\) is symmetric (exchangeable) and independent of p 1,…,p m , and \(\bar{Y}_{i}\stackrel{\mathrm{d}}{=}Y_{i}\) for each i.
Corollary 3.17
Consider the γ-quota method and suppose that p 1,…,p m are linearly independent over ℚ. Then, as \({{N\stackrel{\mathrm{p}^{*}}{\longrightarrow }\infty}}\), when i≠j,
Problem 3.18
Find an explicit formula for the limit variables \(\bar{Y}_{1},\dots,\bar{Y}_{m}\) in Theorem 3.16.
4 Proofs for divisor methods
Proof of Theorems 3.3 and 3.7
For the β-linear divisor method, the seat distribution is given by (2.7) with D and D′=D/V such that \(\sum_{i=1}^{m}s_{i}=N\). We let D′=1/t; thus, using (2.7) and (2.6),
(with the usual possible exception if p i t−β is an integer). Let here t grow from 0 to ∞; think of t as time. The seats are assigned one by one (unless there is a tie).
Fix one party j and let t jr be the time when party j receives its r:th seat. By (4.1), r=p j t jr −β+1, and thus
At this time t jr , by (4.1) again, party i has s i seats with
Let, for i=1,…,m,
defining W i :=1 in the exceptional case when p i t jr −β∈ℤ and s i =p i t jr −β. (But W i :=0 when p i t jr −β∈ℤ and s i =p i t jr −β+1.) Then (4.3) can be written
Note that W j =0, since s j =r=p j t jr −β+1.
The total number of seats at the moment t jr is, by (4.5),
and thus the seat excess is, combining (4.5) and (4.6),
So far everything is deterministic. Theorem 3.3 follows from (4.7) since each W k ∈[0,1] and W j =0.
Let us now assume that p 1,…,p m are linearly independent over ℚ; in particular, this implies that p i /p j is irrational when i≠j. Note that there is a tie between parties i and j at time t if and only if both p i t−β and p j t−β are integers. If parties i and j tie at two different times t 1 and t 2, we thus have p i (t 1−t 2)∈ℤ and p j (t 1−t 2)∈ℤ, and taking the quotient we find p i /p j ∈ℚ, a contradiction. Hence there is at most one tie for each pair of parties, and thus at most \(\binom{m}{2}\) values of t (and thus of N) for which there is any tie. Since we consider asymptotics, we may ignore the ties and assume that no tie occurs.
Let now N be random in {1,…,N 0} and let J=J(N) be the party getting the N:th seat. Then Δ i is given by (4.7), where W i =W i (N) for each N is given by (4.4) with j=J and r=s J (N). Fix again j and consider first only N such that J=j. The number of such N≤N 0 is s j (N 0), the number of seats party j receives when the house size is N 0; note that, e.g. by Theorem 3.3, s j (N 0)=p j N 0+O(1).
We recall a well-known result by Weyl (1916). (The result is usually stated with a 1=⋯=a k =0; this case implies immediately the more general version given here.) We say that an infinite sequence (v n ) n≥1∈[0,1)k is uniformly distributed if the empirical distributions \(n_{0}^{-1}\sum_{n=1}^{n_{0}}\delta_{v_{n}}\) converge to the uniform distribution as n 0→∞, where \(\delta_{v_{n}}\) denotes the Dirac measure. (Recall that this means that if A⊆[0,1)k with λ(∂A)=0, then #{n≤n 0:v n ∈A}/n 0→λ(A), where λ is the usual Lebesgue measure, see e.g. Billingsley 1968.)
Lemma 4.1
(Weyl)
Suppose that y 1,…,y k and 1 are linearly independent over ℚ, and let a 1,…,a k be any real numbers. Then the sequence of vectors ({ny 1+a 1},…,{ny k +a k }) n≥1∈[0,1)k is uniformly distributed in [0,1)k. □
(The standard proof is by showing that the Fourier transform (characteristic function) \(\frac{1}{n_{0}}\sum_{n=1}^{n_{0}} \exp(2\pi\mathrm{i}\sum_{j=1}^{k} \ell_{j}\{ny_{j}\} )\to0\), as n 0→∞, for any fixed integers ℓ 1,…,ℓ k , not all 0, see for example Grafakos 2004, Exercises 3.4.2–3.)
Since p 1,…,p m are linearly independent over ℚ, so are p 1/p j ,…,p m /p j , i.e. {p i /p j :i≠j}∪{1}. Taking y i :=p i /p j and a i :=(y i −1)(β−1), we thus see from Lemma 4.1 that (W i ) i≠j , given by (4.4) for r=1,…,s j (N 0), are uniformly distributed vectors in [0,1)m−1. In probabilistic notation, regarding W i =W i (N) as random variables, conditioning on J=j and recalling W j =0 when J=j, as N 0→∞ and thus s j (N 0)→∞,
Since ℙ(J=j)=s j (N 0)/N 0→p j as N 0→∞, the random variable J has asymptotically the distribution given in Theorem 3.7, and it follows from (4.8) that, with V i defined as in Theorem 3.7,
Consequently, (4.7) yields \(\Delta_{i}\stackrel{\mathrm {d}}{\longrightarrow}X_{i}\), jointly for all i=1,…,m, with X i defined by (3.10).
To obtain the simpler form (3.7) for an individual Δ i , let
Then, if J=i,
and if J≠i,
Hence, if we define
then
with \(\widetilde{U}_{j}\) (1≤j≤m−2) and Z independent. By (4.11), conditioning on J, we have the distributions \((Z\mid J=i)\sim\mathrm{U}(\frac{1}{2}-p_{i},\frac{1}{2})\) and \((Z\mid J\neq i)\sim\mathrm{U}(-\frac{1}{2},\frac{1}{2}-p_{i})\); furthermore ℙ(J=i)=p i , and it follows that \(Z\sim\mathrm{U}(-\frac{1}{2},\frac{1}{2})\), so \(Z\stackrel{\mathrm{d}}{=} \widetilde{U}_{0}\) and thus
The definition (4.10) now shows that \(X_{i}\stackrel{\mathrm {d}}{=}\bar{X}_{i}\), with \(\bar{X}_{i}\) defined in (3.7). □
Proof of Theorem 3.4 and Corollary 3.8
Since Δ i is bounded, e.g. by Theorem 3.3 (or by the simpler Theorem A.17), the convergence in distribution \(\Delta_{i}\stackrel{\mathrm{d}}{\longrightarrow}X_{i}\) in Theorem 3.7 implies convergence of all moments and mixed moments, see Gut (2005, Theorem 5.5.9); hence it suffices to calculate the expectations, variances and covariances of X i .
We have \(\mathbb{E}U_{i}=1/2\) and thus by (3.9), since J and U i are independent,
Consequently, \(\mathbb{E}\sum_{i=1}^{m}V_{i}=\frac{1}{2}\sum _{i=1}^{m}(1-p_{i})=\frac{1}{2}(m-1)\) and, by (3.10),
This shows Theorem 3.4.
For the variance and covariances, we have for any i
and, when i≠j, by independence,
Consequently, using also (4.12),
We obtain from this also
Consequently, (3.10) yields
and, for i≠j,
Corollary 3.8 follows. □
Remark 4.2
We can also see when equality holds in Theorem 3.3. By (4.7), Δ i is maximal when W i =0 and W k =1 for k≠i. Thus all W k ∈{0,1}, which by (4.4) means that all p k t−β are integers for t=t jr ; hence all parties tie in the sense that there are two possible values of s i in the β-rounding in (4.1), or equivalently (2.7), for every party. Consequently Δ i equals the upper bound in (3.3) if and only if all parties tie and, moreover, party i is lucky and is rounded up while all others are rounded down. (In the sequential formulation D3 or D4 in Appendix A, this means that all parties tie for the last seat, which is taken by party i.)
Similarly, Δ i is minimal when W i =1 and W k =0 for k≠i, and it follows that Δ i equals the lower bound in (3.3) if and only if all parties tie and, moreover, party i is unlucky and is rounded down while all others parties are rounded up. (In the sequential formulation this means that all parties tie for the last m−1 seats, which are given to all parties except i.)
5 Proofs for quota methods
Proof of Theorems 3.10 and 3.13
The proof for quota methods uses arguments similar to the proof in Janson and Linusson (2012), and the reader might wish to compare the versions. (Only the case γ=0 is treated in Janson and Linusson 2012, but that is only a minor simplification. For completeness we repeat some results from Janson and Linusson 2012.) For notational convenience, we show the results for party 1, i.e. we take i=1 in the proofs.
By definition, s j =[(N+γ)p j ] α for an α that makes \(\sum_{j=1}^{m}s_{j}=N\). Explicitly, by (2.5),
Thus, using party 1 as a benchmark,
Since Δ j −Δ1=s j −s 1−N(p j −p 1), (5.2) yields
Thus, summing over all j≠1, recalling that \(\sum_{j=1}^{m}\Delta _{j}=0\) and \(\sum_{j=1}^{m} p_{j}=1\),
Hence,
which shows Theorem 3.10.
Suppose now that all p i −p j are irrational. A tie between parties i and j can occur only if (N+γ)p i −(N+γ)p j ∈ℤ. If this happens for two different house sizes N 1 and N 2, then both (N 1+γ)(p i −p j )∈ℤ and (N 2+γ)(p i −p j )∈ℤ, and thus by taking the difference (N 1−N 2)(p i −p j )∈ℤ, which is impossible if p i −p j ∉ℚ. Hence, our assumption implies that there is a tie between parties i and j for at most one value of N, and thus at most \(\binom{m}{2}\) values of N for which there is any tie. Since we consider asymptotics, we can ignore these N and assume that there are no ties. Hence we can choose α such that there are strict inequalities in (5.1), and thus there are strict inequalities in (5.2).
By (5.2) (with strict inequalities) ⌊(N+γ)(p j −p 1)⌋∈{s j −s 1−1,s j −s 1}. Let
Then I j ∈{0,1} and I 1=0. We have
Summing over j we obtain, again recalling \(\sum_{j=1}^{m}\Delta_{j}=0\),
Let \(L:=\sum_{j=1}^{m}I_{j}\). By (5.8), we then have the formula
Note that L is an integer with 0≤L≤m−1 and, by (5.6),
Let \(\operatorname{Mod}_{m}(x):=m\{x/m\}\), the remainder when x is divided by m. We thus have
Let y j :=p j −p 1 and a j :=γ(p j −p 1); thus (N+γ)(p j −p 1)=Ny j +a j . Let us now assume that p 1,…,p m are linearly independent over ℚ. It is easily seen that then y 2,…,y m and \(\sum_{1}^{m} p_{j}=1\) also are linearly independent over ℚ. (This can be seen as a change of basis, using a non-singular integer matrix, in a vector space of dimension m over ℚ.)
We need the following extension of Lemma 4.1, slightly generalising Janson and Linusson (2012, Lemma 4.3). The proof is given in Appendix C.
Lemma 5.1
Suppose that y 1,…,y k and 1 are linearly independent over ℚ, and let a 1,…,a k be any real numbers.
Let \(\ell_{n}= \operatorname{Mod}_{m} (n- \sum_{j=1}^{k} \lfloor ny_{j}+a_{j} \rfloor)\in \{0,\dots,m-1\}\). Then the sequence of vectors
is uniformly distributed in [0,1)k×{0,…,m−1}.
Note that L is (5.10) equals ℓ N in Lemma 5.1, with y j and a j as chosen above. Consequently, (5.9) and Lemma 5.1 show that
where U j ∼U(0,1) and \(\bar{L}\) is uniformly distributed on {0,…,m−1}, with all variable independent. Moreover, \(U_{2}\stackrel{\mathrm{d}}{=}1-U_{2}\) and \(\bar{L}+U_{2}\sim \mathrm{U}(0,m)\), and thus
Hence, (5.12) can be written
with \(\widetilde{U}_{j}:=U_{j}-\frac{1}{2}\) for j≥1, which by \(\widetilde{U}_{0}\stackrel{\mathrm{d}}{=}-\widetilde{U}_{0}\) proves Theorem 3.13. □
Proof of Theorem 3.11 and Corollary 3.14
Since Δ i is bounded, e.g. by Theorem 3.10, the convergence in distribution \(\Delta_{i}\stackrel{\mathrm{d}}{\longrightarrow}Y_{i}\) in Theorem 3.13 implies convergence of all moments and it suffices to calculate the expectation and variance of Y i . This is immediate since \(\mathbb{E}\widetilde{U}_{k}=0\) and \(\operatorname{Var}\widetilde {U}_{k}=1/12\); thus by (3.17),
Theorem 3.11 and Corollary 3.14 follow. □
Proof of Theorem 3.16
Let W i :={(N+γ)p i }. For any j, {p i } i≠j and \(1=\sum_{i=1}^{m}p_{i}\) are linearly independent over ℚ, and thus the vectors (W i ) i≠j are uniformly distributed in [0,1)m−1 by Lemma 4.1. Moreover,
so W j ≡γ−∑ i≠j W i (mod 1) and
Consequently, \((W_{1},\dots,W_{m})\stackrel{\mathrm{d}}{\longrightarrow }(V_{1},\dots,V_{m})\) where each V i ∼U(0,1), any m−1 of them are independent, and any such set determines the last variable by V j ={γ−∑ i≠j V i }. Obviously, the distribution of (V 1,…,V m ) is symmetric.
We have
By the construction of the γ-quota method, the numbers s i −⌊(N+γ)p i ⌋ depend only on (W 1,…,W m ), say s i −⌊(N+γ)p i ⌋=f i (W 1,…,W m ) for some function f i , and it follows from (5.13) that, jointly for all i,
where the distribution of (Z 1,…,Z m ) is symmetric. (The m functions f i differ only by permutations of the variables.) Now let \(\bar{Y}_{i}:=Z_{i}+\gamma p_{i}\). Then \(\Delta_{i}\stackrel{\mathrm{d}}{\longrightarrow}\bar{Y}_{i}\), and thus \(\bar{Y}_{i}\stackrel{\mathrm{d}}{=}Y_{i}\) by comparison with Theorem 3.13. □
Proof of Corollary 3.17
By Theorem 3.16 and the fact that each Δ i is bounded, \(\operatorname{Cov}(\Delta_{i},\Delta_{j})\allowbreak \to\operatorname{Cov}(\bar{Y}_{i},\bar{Y}_{j})\). Moreover, by symmetry, \(\operatorname{Cov}(\bar{Y}_{i},\bar{Y}_{j})\) is the same for all pairs (i,j) with i≠j. We have \(\sum_{i=1}^{m}\bar{Y}_{i}=0\) as a consequence of (2.3), and thus
hence by symmetry \(\operatorname{Cov}(\bar{Y}_{i},\bar{Y}_{j})=-\operatorname{Var}(\bar{Y}_{i})/(m-1)=-\operatorname{Var} (Y_{i})/(m-1)\) for i≠j. The final equality follows from (3.18). □
Remark 5.2
We can also see when equality holds in Theorem 3.10. The argument yielding (5.5) shows that Δ1 is maximal exactly when there is equality in the left inequality in (5.1) for party 1, and in the right inequality for all other parties; Δ1 is minimal when the opposite holds. It follows that, exactly as for Theorem 3.3 in Remark 4.2, Δ i equals the upper [lower] bound in (3.15) if and only if all parties tie in (2.8) and, moreover, party i is the only lucky [unlucky] party.
6 Random party sizes
As said in the introduction, there is a long tradition of studying the bias of election methods theoretically with a fixed house size and random party sizes. More precisely, in this approach one (usually) assumes that the vector (p 1,…,p m ) is random and uniformly distributed over the simplex for some given m. One then orders the party sizes p 1,…,p m as p [1]≥⋯≥p [m] and considers the seat excesses Δ[1],…,Δ[m] of the parties in the same order. (Thus, Δ[1] is the seat excess of the largest party, and so on.) Results for the mean \(\mathbb{E}\Delta_{[k]}\) of the seat excess for the k:th largest party (either exact formulas for a fixed N or asymptotic formulas as N→∞) have been obtained by Pólya (1918a, 1918b, 1919), Schuster et al. (2003), Heinrich et al. (2005) and Drton and Schwingenschlögl (2004, 2005); results on the variance are given by Schwingenschlögl and Drton (2006). (We follow these papers and order the parties in decreasing order, cf. Appendix D.)
Heinrich et al. (2004, 2005) and Schwingenschlögl (2008) have also, more generally, considered other distributions of p 1,…,p m (with an absolutely continuous distribution on \(\mathfrak{S}_{m}\)). Note also that Sainte-Laguë (1910b) considered a different distribution for two parties, see Example 3.5 (with p 2/p 1 uniformly distributed, which is not the same as p 2 uniform).
We cannot obtain these results for a given N directly from our results for random N, nor can we obtain our results for fixed p 1,…,p m from these results. However, by combining the methods, we can translate one type of results to the other and thus see that our main results transfer to the case of random p 1,…,p m . (Now with random p i in e.g. (3.7) and (3.17).)
Theorem 6.1
Let p 1,…,p m be random with an absolutely continuous distribution on the simplex \(\mathfrak{S}_{m}\), and let N→∞ (with N deterministic).
-
(i)
For the β-linear divisor method, the conclusions (3.7)–(3.10) of Theorem 3.7 hold, with \(\widetilde{U}_{i}\) and U i independent of p 1,…,p m and with ℙ(J=j∣p 1,…,p m )=p j .
-
(ii)
For the γ-quota method, the conclusions of Theorems 3.13 and 3.16 hold; in particular (3.17) holds, with \(\widetilde{U}_{k}\sim\mathrm{U}(-\frac{1}{2},\frac{1}{2})\) independent of each other and of p 1,…,p m .
Proof
Consider first the β-linear divisor method. The proof of Heinrich et al. (2005, Theorem 3.1(i)) shows that as N→∞,
for some random variables Z j ; with the notation in Heinrich et al. (2005),
Here W j is our p j , c is our m, q is our β, U j is our \(\widetilde{U}_{j}\) and \(\operatorname{sgn}(D_{c})m_{j}(D_{c})\) is an explicit but rather complicated function of U 1,…,U m and W 1,…,W m ; however, the point is that for us it suffices to know the existence of some variables Z j such that (6.1) holds; we can ignore the construction. The proof of (6.1) is based on the fact that for any (deterministic) sequence ν n →∞, the vectors of fractional parts ({ν n p 1},…,{ν n p m−1}) are uniformly distributed in [0,1)m−1, see Lemma C.1, which is the analogue of Lemma 4.1 in the present context.
Consider now N uniformly random on {1,…,N 0} and let N 0→∞, i.e., suppose \({{N\stackrel{\mathrm{p}^{*}}{\longrightarrow}\infty }}\) in the notation used in the previous sections. Then \(N\stackrel{\mathrm{p}}{\longrightarrow}\infty\), so by conditioning on N, it follows that (6.1) holds for such random N as well. On the other hand, since the distribution of p 1,…,p m is absolutely continuous, almost surely p 1,…,p m are linearly independent over ℚ. Thus, by conditioning on p 1,…,p m we may apply Theorem 3.7, which shows that
where X 1,…,X m are given by (3.9)–(3.10) (with our random p 1,…,p m ). In the case of random p 1,…,p m and random \({{N\stackrel {\mathrm{p}^{*}}{\longrightarrow}\infty }}\), we thus have both (6.1) and (6.3); consequently,
Consider again deterministic N→∞. We know that (6.1) holds, so (6.4) implies that (6.3) holds in this case too. Finally, \((\bar{X}_{i}\mid p_{i})\stackrel{\mathrm{d}}{=}(X_{i}\mid p_{i})\) by Theorem 3.7, and thus \(\bar{X}_{i}\stackrel{\mathrm{d}}{=} X_{i}\), so (3.7) too holds in the present setting.
For the γ-quota method, the proof of Theorem 3.16 applies to the present situation as well, provided we use Lemma C.1 instead of Lemma 4.1. In particular, (5.14) holds, with V 1,…,V m ∼U(0,1) independent of p 1,…,p m . Further. \((\bar{Y}_{i}\mid{p_{1},\dots,p_{m}})=(Y_{i}\mid{p_{1},\dots,p_{m}})\), and thus \(\bar{Y}_{i}\stackrel{\mathrm{d}}{=}Y_{i}\) in the case of random p 1,…,p m too. The results follow. □
Remark 6.2
By considering p 1,…,p m uniformly distributed on a small neighbourhood \(\max_{i}|p_{i}-p^{0}_{i}|<\varepsilon\) of some \((p_{1}^{0},\dots,p_{m}^{0})\in \mathfrak {S}_{m}\), and then letting ε→0, we recover the limit distributions in Sect. 3, with a somewhat different interpretation as a double limit. (In this version p 1,…,p m can be any numbers, including rationals.)
In particular, we can apply Theorem 6.1 to the case of uniformly distributed p 1,…,p m taken in decreasing order as above, since then (p [1],…,p [m]) has a uniform distribution on the subset of \(\mathfrak{S}_{m}\). This gives immediately the following theorem. (We leave the case of joint distributions to the reader.)
Theorem 6.3
Let p 1,…,p m be random and uniformly distributed on \(\mathfrak{S}_{m}\), and let N→∞.
-
(i)
For the β-linear divisor method, for each j≤m,
$$ \Delta_{[j]} \stackrel{\mathrm{d}}{\longrightarrow} \widehat{X}_j := \biggl(\beta-\frac{1}{2}\biggr) (mp_{[j]}-1) + \widetilde{U}_0+p_{[j]}\sum_{k=1}^{m-2} \widetilde{U}_k. $$(6.5) -
(ii)
For the γ-quota method, for each j≤m,
$$ \Delta_{[j]}\stackrel{\mathrm{d}}{\longrightarrow} \widehat{Y}_j:= \gamma\biggl(p_{[j]}-\frac{1}{m}\biggr) + \widetilde{U}_0+\frac{1}{m}\sum_{k=1}^{m-2} \widetilde{U}_k. $$(6.6)
Here \(\widetilde{U}_{k}\sim\mathrm{U}(-\frac{1}{2},\frac{1}{2})\) are independent of each other and of p 1,…,p m .
As before, this theorem implies moment convergence because Δ[j] is bounded. In particular, for the bias in this approach we obtain the following result. This was (except for the case γ≠0 in (ii)) conjectured by Schuster et al. (2003) and proven by Drton and Schwingenschlögl (2005) and Heinrich et al. (2005).
Theorem 6.4
Let p 1,…,p m be random and uniformly distributed on \(\mathfrak{S}_{m}\), and let N→∞.
-
(i)
For the β-linear divisor method, for each j≤m,
$$ \mathbb{E}\Delta_{[j]} \to\mathbb{E}\widehat{X}_j = \biggl(\beta-\frac{1}{2}\biggr) (m\mathbb{E}p_{[j]}-1) =\biggl(\beta- \frac{1}{2}\biggr) \Biggl(\,\sum_{i=j}^m\frac{1}{i}-1 \Biggr). $$(6.7) -
(ii)
For the γ-quota method, for each j≤m,
$$ \mathbb{E}\Delta_{[j]}\to\mathbb{E}\widehat{Y}_j= \gamma\biggl(\mathbb{E}p_{[j]}-\frac{1}{m}\biggr) = \frac{\gamma}{m} \Biggl(\,\sum_{i=j}^m \frac{1}{i}-1\Biggr). $$(6.8)
Proof
The convergence to \(\mathbb{E}\widehat{X}_{j}\) or \(\mathbb{E}\widehat{Y}_{j}\) follows, as just said, from Theorem 6.3. The first equalities in (6.7) and (6.8) follow immediately from (6.5) and (6.6) since \(\mathbb{E} \widetilde{U}_{k}=0\), cf. (3.5) and (3.16). It remains only to compute \(\mathbb{E}p_{[j]}\). This is a standard formula, and for completeness we give a proof in Appendix D, see (D.6). □
We similarly obtain results for the variance. This was found for m=2 and 3 (except the case γ≠0) by Schwingenschlögl and Drton (2006).
Theorem 6.5
Let p 1,…,p m be random and uniformly distributed on \(\mathfrak{S}_{m}\), and let N→∞.
-
(i)
For the β-linear divisor method, for each j≤m,
(6.9) -
(ii)
For the γ-quota method, for each j≤m,
(6.10)
Proof
Again, by Theorem 6.3, we only have to calculate \(\operatorname {Var}\widehat{X}_{j}\) and \(\operatorname{Var}\widehat{Y}_{j}\); the calculations are similar and we give the details only for \(\widehat{X}_{j}\). We use the standard decomposition
The conditional variance and mean are obtained immediately from (6.5), which yields, cf. (3.5) and (3.11),
The first equality in (6.9) follows from (6.11)–(6.13). It remains only to find \(\mathbb{E}p_{[j]}^{2}\) and \(\operatorname {Var}(p_{[j]})\); these are well-known and we give a computation in Appendix D for completeness, see (D.6)–(D.8). □
Example 6.6
For m=3 and Jefferson/D’Hondt’s method we obtain from (6.7) with β=1, \(\mathbb{E}(\Delta_{[1]},\Delta_{[2]},\Delta_{[3]})\to(\frac{5}{12},-\frac{1}{12},-\frac{4}{12})\), as found already by Pólya (1919).
For m=3 and Droop’s method we obtain from (6.8) with γ=1, \(\mathbb{E}(\Delta_{[1]},\Delta_{[2]},\Delta_{[3]})\to(\frac{5}{18},-\frac{1}{18},-\frac{4}{18})\).
Example 6.7
For m=3, we have for the β-linear divisor method
and for the γ-quota method
These were found by Schwingenschlögl and Drton (2006) (where, however, the case γ≠0 is not treated).
Asymptotic covariances can be computed in the same way, using (3.12), (3.19) and (D.10).
In the same way as in Theorems 6.4–6.5, we can obtain results for the mean and variance of Δ[j] for random p 1,…,p m with other distributions than the uniform one. In particular, this gives the formulas by Heinrich et al. (2005) and Schwingenschlögl (2008) for the asymptotic bias in terms of \(\mathbb {E}p_{[j]}=\mathbb{E} (p_{j}\mid p_{1}\ge\cdots\ge p_{m})\) (assuming that (p 1,…,p m ) has a symmetric distribution).
For example, Heinrich et al. (2005) consider, as a special case, the case of a threshold t (with 0<t<1/m) and uniform distribution on p 1≥⋯≥p m ≥t. In this case, one has (as in Heinrich et al. 2005) \((p_{1}-1/m,\dots,p_{m}-1/m)\stackrel{\mathrm {d}}{=}(1-mt)(p^{*}_{1}-1/m,\dots,p^{*}_{m}-1/m)\), where \((p^{*}_{1},\dots,p^{*}_{m})\) has a uniform distribution on \(\mathfrak {S}_{m,\ge}\), and the mean and variance are easily computed.
7 The probability of violating quota
We say that a seat assignment s i satisfies lower quota if s i ≥⌊q i ⌋ and satisfies upper quota if s i ≤⌈q i ⌉; it satisfies quota if both hold. In terms of the seat excess Δ i =s i −q i , the assignment satisfies lower [upper] quota if and only if Δ i >−1 [Δ i <1], and it satisfies quota if and only if |Δ i |<1.
As is well-known, Hamilton/Hare’s method always satisfies quota, while Jefferson’s and Droop’s methods satisfy lower quota and Adams method satisfies upper quota, see e.g. Balinski and Young (2001) or Theorems 3.3 and 3.10 above. It is also well-known that Webster/Sainte-Laguë does not always satisfy quota, but that violations are unusual in practice.
Theorems 3.7 and 3.13 enable us to calculate the (asymptotic) probabilities that quota is violated for various methods. We give two examples.
Example 7.1
Consider Jefferson/D’Hondt’s method. By Theorem 3.3 with β=1, p i −1≤Δ i ≤(m−1)p i ; hence lower quota is always satisfied (as is well-known Balinski and Young 2001), but upper quota can be violated for a party with p i ≥1/(m−1). Consider the case m=3, and suppose that p 1≥1/2. (Then p 2,p 3<1/2, so quota is always satisfied for the other two parties; thus at most one party can violate quota.) By Theorem 3.7 with β=1, letting \(\widetilde{U}_{i}=\frac{1}{2}-U_{i}\),
hence the asymptotic probability of violating (upper) quota is
(The set of allowed (U 0,U 1) is a right-angled triangle with sides 2p 1−1 and (2p 1−1)/p 1.)
If we let (p 1,p 2,p 3) be random and uniformly distributed on \(\mathfrak{S}_{3}\), as in Sect. 6, then p 1 has the density 2(1−p 1) and thus the asymptotic probability that party 1 violates (upper) quota is
Consequently, the asymptotic probability that some party violates (upper) quota is
as found by Niemeyer and Wolf (1984).
Example 7.2
Consider Jefferson/D’Hondt’s method, and a party i with three times the average size: p i =3/m. Then the bias \(\mathbb{E}\bar{X}_{i}=1\), by Theorem 3.4 with β=1. It follows by (3.7) and symmetry that \(\mathbb {P}(\bar{X}_{i}>1)=1/2\), so the (asymptotic) probability that the party violates quota is 1/2. For a larger party, the probability is even greater.
Example 7.3
The Swedish parliament contains after the general election in 2010 8 parties: two large with 30 % of the votes each and 6 small with 5–8 % each. The seats are in principle distributed by Sainte-Laguë’s method. (We ignore here complications due to the division into 29 constituencies and the system with adjustment seats, which in 2010 did not give complete adjustment because the number of adjustment seats was insufficient.)
We note from Theorem 3.3 (with \(\beta=\frac{1}{2}\)) that the small parties always satisfy quota. In fact, (3.4) shows that for Webster/Sainte-Laguë, only parties with p i ≥1/(m−2) can violate quota.
For the large parties we have (approximatively) p i =0.3, and thus by (3.7) (still with \(\beta=\frac{1}{2}\)) \(\bar{X}_{i}=\widetilde{U}_{0}+0.3\sum_{k=1}^{6}\widetilde{U}_{k}\). An integration (using Maple) yields \(\mathbb{P}(\bar{X}_{i}\ge1)=\mathbb{P}(\bar{X}_{i}\le-1)\approx0.00045\). Hence, for each of the two large parties the (asymptotic) probability of violating quota is 0.0009.
8 Apparentements
Suppose that two or more parties decide to form a coalition in the election, so that their votes are counted together. (In some election systems, parties can register such a coalition, called an apparentement, and continue to have separate names and lists. Otherwise, the parties can always appear on election day as one party with a common umbrella name. In any case we assume that this does not attract or repel any voters, so the coalition gets exactly as many votes as the parties would have had separately, and that all other parties remain unaffected with the same number of votes as before.)
It is well-known that for a β-linear divisor method with β≥1 (e.g. Jefferson/D’Hondt’s method) parties can never lose by forming an apparentement; they will get at least as many seats together as if they appear separately. When β≤0 (e.g. Adams’s method), the opposite is true, and parties will never gain by forming an apparentement (conversely, they may gain by splintering), and for 0<β<1 (e.g. Webster/Sainte-Laguë’s method), they may both gain and lose. For D’Hondt’s method, this effect is politically important in reality, especially in small constituencies, and apparentements are or have been a regular feature in many countries, see e.g. Carstairs (1980).
Leutgäb and Pukelsheim (2009) give a detailed study of the resulting gains in a set of real elections, including examples, statistics and theoretical values assuming that the party sizes are random (as in Sect. 6). We can now give similar theoretical results for given party sizes. For simplicity, we consider only the case of a single apparentement of two parties; larger apparentements can be treated similarly as well as several apparentements (their effects are, asymptotically at least, additive).
Let the parties be i and j, and consider the expectation of the gain s ij −(s i +s j ), where s ij is the number of seats the parties get together as an apparentement.
Theorem 8.1
Suppose that p 1,…,p m are linearly independent over ℚ. If two parties i and j form an apparentement in the election, the mean of their seat gain s ij −s i −s j satisfies, as \({{N\stackrel{\mathrm{p}^{*}}{\longrightarrow} \infty}}\),
for the β-linear divisor method, and
for the γ-quota method.
Proof
If Δ ij is the seat excess for the apparentement, then
so \(\mathbb{E}(s_{ij}-s_{i}-s_{j})=\mathbb{E}\Delta_{ij}-\mathbb{E}\Delta _{i}-\mathbb{E}\Delta_{j}\), which by Theorems 3.4 and 3.11 converges to, recalling that Δ ij is the seat excess in a contest with m−1 parties,
and
for the β-linear divisor method and γ-quota method, respectively. □
One sees in the same way that these expected gains (which are negative if β<1/2 or γ<0) are balanced by a loss for each other party k of \((\beta-\frac{1}{2})p_{k}\) for the β-linear divisor method and γ/(m(m−1)) for the γ-quota method.
We have seen that apparentements are favoured when β>1/2 or γ>0, but the effect is rather small, especially for quota methods. (Nevertheless, as said above, the effect is large enough to be politically important in reality for D’Hondt’s method.) Note that for the β-linear divisor method, the gain in (8.1) does not depend on the number (or sizes) of the other parties, while for the γ-quota method, the gain in (8.2) is largest for m=3 or 4 and decreases as O(1/m) for large m.
Example 8.2
For Jefferson/D’Hondt’s method, two small parties that form an apparentement gain at most 0.5 seats together, and two large parties gain less.
Example 8.3
For Droop’s method, any two parties that form an apparentement gain at most \(\frac{1}{6}\approx0.167\) seats together when there are 3 or 4 parties, 0.15 seats when there are 5 parties, and less if there are more.
Problem 8.4
What are, for the different methods, the asymptotic distributions of the seat gain s ij −s i −s j for an apparentement?
We have so far studied the gain of the two parties combined, but the parties are probably more interested in how the gain is split between them. Typically, the seats given to an apparentement are distributed, in a sub-apportionment, between the participating parties by the same election method as for the main distribution (the super-apportionment). It seems reasonable that the gain then, on the average, is split between the parties proportionally to their sizes, cf. Theorems A.11 and B.3. However, we shall see that this holds only for the divisor methods.
Consider first the β-linear divisor method. Theorem 3.4 shows that, asymptotically as \({{N\stackrel {\mathrm{p}^{*}}{\longrightarrow}\infty}}\), the number of seats the apparentement gets is on the average
If we use Theorem 3.4 again for the distribution of these seats between i and j, we obtain that i gets on the average
seats, which, comparing to Theorem 3.4 for an election without the apparentement, means an average gain for party i of
which indeed is the proportional share p i /(p i +p j ) of the joint gain in (8.1).
There is, however, a gap in this argument. In our model, the seat assignment is a deterministic function of N both in the super-apportionment and in the sub-apportionment, and the calculation above using expectations for random N assumes that there is no hidden correlation, where the numbers of seats given to the apparentement are seat numbers that tend to favour one of the two parties in the sub-apportionment. It seems very unlikely that there will be such a correlation under our assumption that p 1,…,p m are linearly independent over ℚ, but we have so far no rigorous proof, so (8.5) should only be regarded as a conjecture for the asymptotic average gain for party i.
Problem 8.5
Verify rigorously that the expected gain for party i converges to the value in (8.5) as \({{N\stackrel{\mathrm{p}^{*}}{\longrightarrow }\infty}}\).
Nevertheless, if we tentatively use (8.5), and regard it as a valid approximation for real elections with finite N (and arbitrary p 1,…,p m ), we can draw some practical conclusions for D’Hondt’s method (β=1). If the parties i and j both are small, then (8.5) shows that the gain for party i is almost p i /(p i +p j ). Thus most of the gain of the apparentement goes to the larger party, and a party has very little to gain by an apparentement with a party that is substantially larger. In fact, if party i can choose between several parties to form an apparentement, we see, perhaps surprisingly, that it gets the largest gain by choosing the smallest partner. (This is based on asymptotics as N→∞. For a real election with finite N there is a trade-off since a really small partner will not help significantly; the advantage can hardly be larger than Np j , so presumably only parties with q j =Np j being at least 1/2 or so will be useful partners. It would be interesting to study such effects for finite N, but that is beyond the scope of the present paper.)
Consider now instead the γ-quota method. One important complication is that quota methods are not uniform, in the sense of Appendix A.3. Thus, even if the apparentement gets the same number of seats as the two parties would have got together if they had appeared separately in the election, the sub-apportionment may give a different distribution of these seats between the parties.
We calculate as above, with the same reservation that a rigorous verification still is lacking. Theorem 3.11 shows that, asymptotically as \({{N\stackrel {\mathrm{p}^{*}}{\longrightarrow}\infty}}\), the number of seats the apparentement gets is on the average
If we use Theorem 3.11 again for the distribution of these seats between i and j, we obtain that i gets on the average
seats, which, compared to Theorem 3.11 for an election without the apparentement, means an average gain for party i of
This is quite different from the proportional share p i /(p i +p j ) of the joint gain in (8.1). In fact, we see that for γ>0 (e.g. Droop’s method), a party gains only if p i /(p i +p j )>(m−1)/(2m). In other words, a party will lose if it forms an apparentement with a party that is only a little larger. Typically, the redistribution within the apparentement by the sub-apportionment is more important than the collective gain.
9 The Sainte-Laguë divergence
Sainte-Laguë (1910a, 1910b) based his proposal of his method on the fact that it is the least squares method minimising the sum
multiplying this by V we obtain the equivalent quantity
which is called the Sainte-Laguë divergence by Heinrich et al. (2004, 2005), where the asymptotic distribution is studied under the assumption of random party sizes (see Sect. 6). We obtain corresponding result in our setting as corollaries of the results in Sect. 3. (Unfortunately, we do not get as explicit result for quota methods as for divisor methods because Theorem 3.16 is less explicit.) The most interesting case is the \(\frac{1}{2}\)-linear divisor method (Sainte-Laguë’s method) (Heinrich et al. 2004), since then S and \(\widetilde{S}\) are the minima over all allocations, but it is also interesting to see how much larger \(\widetilde{S}\) is for other methods (Heinrich et al. 2005).
Theorem 9.1
Suppose that p 1,…,p m are linearly independent over ℚ, and let \({{N\stackrel{\mathrm{p}^{*}}{\longrightarrow }\infty}}\).
-
(i)
For the β-linear divisor method,
$$ \widetilde{S}\stackrel{\mathrm{d}}{\longrightarrow} S_{\beta,m}:= \sum_{i=1}^m \frac{(V_i+\beta-1)^2}{p_i}-\Biggl(\sum_{i=1}^m(V_i+ \beta-1)\Biggr)^2, $$(9.3)where V 1,…,V m are as in Theorem 3.7.
-
(ii)
For the γ-quota method,
$$ \widetilde{S}\stackrel{\mathrm{d}}{\longrightarrow} \bar{S}_{\gamma,m}:= \sum_{i=1}^m \frac{\bar{Y}_i^2}{p_i}, $$(9.4)where \(\bar{Y}_{1},\dots,\bar{Y}_{m}\) are as in Theorem 3.16.
Proof
For the β-linear divisor method, we use Theorem 3.7 and obtain by the continuous mapping theorem (Billingsley 1968, Theorem 5.1) and (9.2) immediately \(\widetilde{S}\stackrel{\mathrm{d}}{\longrightarrow}\sum_{i=1}^{m}X_{i}^{2}/p_{i}\). We write (3.10) as X i =p i A−A i , where A i :=V i +β−1 and \(A:=\sum_{i=1}^{m}(V_{i}+\beta-1)=\sum_{i=1}^{m}A_{i}\). Thus
which is (9.3).
The result for the γ-quota method follows directly from Theorem 3.16. □
The expression in (9.3) for S β,m may be compared to the somewhat different expression in Heinrich et al. (2004, 2005) for the limit for fixed N and random p 1,…,p m ; by the argument in Sect. 6, the formulas have to be equivalent (for random p 1,…,p m ).
For the expectation we similarly obtain:
Theorem 9.2
Suppose that p 1,…,p m are linearly independent over ℚ, and let \({{N\stackrel{\mathrm{p}^{*}}{\longrightarrow }\infty}}\).
-
(i)
For the β-linear divisor method,
$$ \mathbb{E}\widetilde{S}\to\frac{1}{12}\Biggl(\,\sum _{i=1}^m\frac{1}{p_i}+m-2\Biggr) +\biggl(\beta- \frac{1}{2}\biggr)^2\Biggl(\,\sum_{i=1}^m \frac{1}{p_i}-m^2\Biggr). $$(9.5) -
(ii)
For the γ-quota method,
$$ \mathbb{E}\widetilde{S} \to\frac{(m+2)(m-1)}{12 m^2}\sum _{i=1}^m\frac{1}{p_i} +\frac{\gamma^2}{m^2} \Biggl(\,\sum_{i=1}^m\frac{1}{p_i}-m^2 \Biggr). $$(9.6)
Proof
This can be shown from Theorem 9.1, but it is easier to use
and Theorem 3.4, Corollary 3.8, Theorem 3.11, Corollary 3.14. □
Note that \(\sum_{i=1}^{m}1/p_{i}\ge m^{2}\), with equality if and only if all p i are equal. We know that \(\widetilde{S}\) always is minimised by the \(\frac{1}{2}\)-linear divisor method (Sainte-Laguë’s method). We see from Theorem 9.2 that also asymptotically, \(\mathbb{E} \widetilde {S}\) is strictly larger for any β-linear divisor method or γ-quota method, except when all parties have exactly the same size (a trivial case where all methods yield the same result; note that we have excluded this case by our assumptions).
Furthermore, Theorems 9.1–9.2 show that \(\widetilde{S}\) is typically of the order m 2 if all p i are of roughly the same order, but if some p i is very small, \(\widetilde {S}\) can be substantially larger. Heinrich et al. (2004, 2005) consider \(\widetilde{S}\) and the limit S β,m for random, uniformly distributed, p 1,…,p m , and show that, as m→∞,
where S ∗ is a certain 1-stable random variable (depending on β). We see that the logm term and the large tail of S ∗ (we have \(\mathbb{E}S^{*}=\infty\)) depend on the possibility of some very small p i ; the asymptotic behaviour for deterministic p 1,…,p m with, say, (minp i )−1=O(m) is quite different. To see this better, we first replace V i (which are dependent) by U i (which are i.i.d.) in (9.3); the following lemma shows that for large m, the difference is small. (Cf. similar approximations in Heinrich et al. 2004, 2005.)
Lemma 9.3
Let \(\widehat{U}_{i}:=U_{i}+\beta-1\), so \(\widehat{U}_{i}\) are i.i.d. U(β−1,β), and let
Then \(\mathbb{E}\vert S_{\beta,m}-S_{\beta,m}'\vert =O(m)\), uniformly in p 1,…,p m .
Proof
We have |U i +β−1|≤|β|+1=O(1) and similarly |V i +β−1|=O(1). Further, by (3.9), ℙ(V i ≠U i )=ℙ(J=i)=p i . Hence,
Similarly, \(\sum_{i=1}^{m}(V_{i}+\beta-1)-\sum_{i=1}^{m}(U_{i}+\beta-1)=-U_{J}=O(1)\) and thus
The result follows by summing (9.9) and (9.10). □
We can now state a normal limit theorem corresponding to (9.7), but for deterministic p 1,…,p m .
Theorem 9.4
Suppose that for each m, we are given p 1,…,p m such that m 3/2min i p i →∞ as m→∞. Let \(A_{m}:=\sum_{i=1}^{m}p_{i}^{-1}\), \(B_{m}:=\sum_{i=1}^{m}p_{i}^{-2}\) and
Then, as m→∞,
Proof
Note first that, e.g. by Jensen’s inequality, A m ≥m 2, B m ≥m 3 and C m ≥0.
Write, for convenience, b:=β−1/2. We have \(\widehat{U}_{i}=\widetilde{U}_{i}+b\) and \(\mathbb{E}\widehat{U}=b\). Thus, (9.8) can be expanded as
The last term has expectation \(\operatorname{Var}\sum _{i=1}^{m}\widetilde {U}_{i}=O(m)\), and can thus be ignored. Define
Then
and, by a straightforward calculation,
The random variables Z i are independent and each is bounded with \(|Z_{i}|=O(1/p_{i}+m)=o(m^{3/2})=o(B_{m}^{1/2})\). The result now follows by the central limit theorem, using either Lindeberg’s or Lyapounov’s condition, see e.g. Gut (2005, Sect. 7.2). □
Remark 9.5
Sainte-Laguë (1910b) also studied the least squares functional calculated per seat and not per voter:
and found that this is minimised by the divisor method with \(d(n)=\sqrt{n(n-1)}\), later suggested by Huntington, see Appendix A and Remark A.10. The asymptotics of this quantity is, after division by (V/N)3, the same as for S, because s i /N→p i .
10 Some other goodness-of-fit functionals
Many election methods, including the ones treated here, can be characterised as minimising various functionals, see e.g. Balinski and Young (2001), Kopfermann (1991) and Niemeyer and Niemeyer (2008). Theorems 3.7 and 3.16 yield by the continuous mapping theorem asymptotic distributions for many such functionals for the methods treated here, under our standard assumptions that p 1,…,p m are linearly independent over ℚ and \({{N\stackrel{\mathrm{p}^{*}}{\longrightarrow}\infty}}\), which we assume below. We give some examples, leaving others to the reader. Note, however, that our results for quota methods are less complete, cf. Problem 3.18.
10.1 Hamilton divergences
Hamilton/Hare’s method, the 0-quota method, is the unique method minimising max i |Δ i |. It also the method minimising max i Δ i . Moreover, it is, for any convex function φ:ℝ→[0,∞] with φ(0)=0 and φ(x)>0 for x≠0, the unique method minimising ∑ i φ(Δ i ), as was shown by Pólya (1918b).
For the β-linear divisor method, Theorem 3.7 implies
with X i given by (3.10). However, we do not know any really simple description of this limit variable.
For the γ-quota method, including Hamilton/Hare’s method and thus the minimum of max i |Δ i | over all allocations, Theorem 3.16 yields
but we have no explicit description of this limit variable.
Similar results hold for max i Δ i .
We obtain more complete results if we instead consider the least squares functional \(\sum_{i=1}^{m}\Delta_{i}^{2}\), which also yields Hamilton/Hare’s method by Polya’s theorem (Pólya 1918b). (This case was shown earlier by Sainte-Laguë (1910b), who attributed the result to Zivy.)
Theorem 10.1
With assumptions and notations as in Sect. 3, for the β-linear divisor method,
and for the γ-quota method,
Proof
By Theorems 3.7 and 3.16, calculating \(\mathbb {E}X_{i}^{2}\) and \(\mathbb{E}\bar{Y}_{i}^{2}\) by Theorem 3.4, Corollary 3.8, Theorem 3.11, Corollary 3.14 as in the proof of Theorem 9.2. □
10.2 Jefferson and Adams divergences
Heinrich and Schwingenschlögl (2006) consider the functionals
and
(We could as well use the maximum and minimum of s i /v i −N/V.) Since \(\sum_{i=1}^{m}\Delta_{i}=0\), we have S J≥0 but S A≤0, and we will therefore use |S A|=−S A.
It is easy to see that S J and |S A| are minimised, over all allocations of N seats, by the Jefferson and Adams methods, respectively (Sainte-Laguë 1910b; Balinski and Young 2001).
Theorem 3.7 yields the asymptotic distribution of S J and |S A| for any β-linear divisor method; we consider for simplicity only the optimal methods.
Theorem 10.2
With assumptions and notations as in Sect. 3, for the optimal allocations,
and
We have
Proof
S J is minimised by Jefferson’s method, so Theorem 3.7 with β=1 yields, noting that minV i /p i =0 by (3.9),
Similarly, |S A| is minimised by Adams’s method, so Theorem 3.7 with β=0 yields
We have \(\mathbb{E}S_{m}^{\mathrm{J}}=(m-1)/2\) directly from (10.10).
For \(|S_{m}^{\mathrm{A}}|\), we first note that if B:=min1≤i≤m 1/p i , then for 0<x<B,
while ℙ(min1≤i≤m (1−V i )/p i >x)=0 for x≥B. Hence,
Furthermore, \(\mathbb{E}\sum_{j=1}^{m}V_{j}=\sum _{j=1}^{m-1}U_{j}=(m-1)/2\), and (10.12) follows from (10.11). □
Remark 10.3
Note that \(S_{m}^{\mathrm{J}}\) in (10.10) does not depend on p 1,…,p m , but for \(|S_{m}^{\mathrm{A}}|\), the situation is more complicated; the representation in (10.11) depends on p 1,…,p m , but it is still possible that \(|S_{m}^{\mathrm{A}}|\) has the same distribution for all p 1,…,p m and that we just have found an unnecessarily complicated description of it. Note that \(\mathbb{E}|S_{m}^{\mathrm{A}}|\) does not depend on p 1,…,p m . In the case m=2, it is easy to see that \(|S_{m}^{\mathrm{A}}|\stackrel{\mathrm{d}}{=}U\stackrel {\mathrm{d}}{=}S_{m}^{\mathrm{J}}\) for every (p 1,p 2), but we do not know whether something similar is true for m≥3.
Problem 10.4
Is \(|S_{m}^{\mathrm{A}}|\stackrel{\mathrm{d}}{=}S_{m}^{\mathrm{J}}\) for every p 1,…,p m ?
The asymptotic results by Heinrich and Schwingenschlögl (2006), taking the limit of the limits as m→∞, follow easily in our setting too.
Theorem 10.5
For any p 1,…,p m depending on m, as m→∞,
Proof
The result for \(S_{m}^{\mathrm{J}}\) is immediate from (10.10) and the central limit theorem.
For \(|S_{m}^{\mathrm{A}}|\) we use (10.11). We have \({\sum_{j=1}^{m}(U_{j}-V_{j})}=U_{J}\) and consequently, by (10.11) and (10.13),
and the result follows by the central limit theorem applied to \(\sum _{j=1}^{m}(1-U_{j})\). □
11 Rational p i
In our main results we have assumed that p 1,…,p m are linearly independent over ℚ, since this is necessary for Lemma 4.1. We consider briefly what happens when this is not satisfied, i.e., when p 1,…,p m are linearly dependent over ℚ. In particular, we are for obvious practical reasons interested in the case when p 1,…,p m are rational. (See also the corresponding discussion in Janson and Linusson 2012.)
Note first that it may now happen that there are ties. We assume that any ties are resolved randomly; thus the quantities s i (N) and Δ i (N) in general may be random variables, also for a fixed N. This is no real problem, however, and when discussing the means we may replace them by their average over all solutions in case of a tie.
Lemma 4.1 does not apply when p 1,…,p m are linearly dependent over ℚ, but the proof of it sketched in Sect. 4 shows that the sequence ({ny 1+a 1},…,{ny k +a k }) n≥1 is uniformly distributed on a coset of a subgroup of [0,1)k; more precisely, if we for simplicity take all a i =0, the empirical distributions converge to the uniform probability measure μ y on a subgroup, with Fourier coefficients given by
The proofs of Theorems 3.7 and 3.16 now show, with minor modifications, that the seat excesses Δ i still converge jointly in distribution as \({{N\stackrel {\mathrm{p}^{*}}{\longrightarrow}\infty }}\), but the limit distribution is different, and depends on p 1,…,p m through some measures of the type (11.1). In particular, the bias \(\lim\mathbb{E}\Delta_{i}\) is well-defined, but we do not have an explicit formula for it.
However, if we consider a sequence of party size distributions (p 1k ,…,p mk ), k=1,2,… , (with a fixed m), such that p ik →p i for each i as k→∞ and further, for every integer vector (a 1,…,a m )≠0, we have \(\sum_{i=1}^{m} a_{i}p_{ik}\neq0\) for all large k, then it follows from (11.1) and an inspection of the proofs of Theorems 3.7 and 3.16 that the measures μ y that now appear in the proofs will converge to the uniform measure μ, and thus the limit variables X i and \(\bar{Y}_{i}\) will converge to the corresponding variables given in Sect. 3 for the case when p 1,…,p m are linearly independent over ℚ.
This shows that the results above are good approximations also for linearly dependent p 1,…,p m , as long as there is no linear relation (3.2) with small integers a i . In particular, for rational p 1,…,p m we typically have a good approximation unless some p i has a small denominator.
In the remainder of this section, we suppose that p 1,…,p m are rational, with a common denominator L. Theorems A.11 and B.3 then show that the sequence of seat excesses Δ i :=s i (N)−Np i , N≥1, has period L. (If β<0, β>1 or γ<0 we may have to except some small N.) This gives another (simpler) proof that the limit distribution of Δ i as \({{N\stackrel{\mathrm {p}^{*}}{\longrightarrow}\infty}}\) exists in this case, and shows that it is the same as the distribution if we take N uniformly in a period {K+1,…,K+L} (for any K that is large enough). In particular, the asymptotic distribution is discrete, and therefore not the same as in Sect. 3. The asymptotic bias \(\lim\mathbb{E}\Delta_{i}\) is easily calculated by taking the average of Δ i over a period.
Example 11.1
Let m=2 and take \((p_{1},p_{2})=(\frac{2}{3},\frac{1}{3})\). Then the β-linear divisor method equals the γ-quota method with γ=2β−1, see Appendix B.1. Assume, for simplicity, 0<β<1, i.e., −1<γ<1. A simple calculation shows that for N=1,2,3, we have \(\Delta_{1}=\frac{1}{3},-\frac{1}{3},0\). Hence the average is 0, so the methods are unbiased in this case, for any β∈(0,1) or γ∈(−1,1). (In fact, for this example the methods all coincide with Hamilton/Hare = Webster/Sainte-Laguë.) In particular, Theorems 3.4 and 3.11 do not hold for rational p 1,…,p m and \(\beta\in(0,\frac{1}{2})\cup(\frac{1}{2},1)\) and γ∈(−1,0)∪(0,1).
If we instead take β=1, i.e. γ=1, we have a tie for N=2, with \(\Delta_{1}=-\frac{1}{3}\) or \(\frac{2}{3}\). The average of Δ1 over a period now is \(\frac{1}{6}\).
For the methods with β=1/2 and γ=0, the result 0 in Example 11.1 agrees with Theorems 3.4 and 3.11. In fact, this is true for any rational p 1,…,p m .
Theorem 11.2
The \(\frac{1}{2}\)-linear divisor method (Webster/Sainte-Laguë) and the 0-quota method (largest remainder/Hamilton/Hare) are asymptotically unbiased for any rational p 1,…,p m .
Proof
Let, as above, L be a common denominator of p 1,…,p m . It is for both methods easily seen, arguing similarly to the proofs of Theorems A.11 and B.3, that s i (L−N)=Lp i −s i (N) for N=0,…,L. Hence, Δ i (L−N)=−Δ i (N), and the average over a period N=1,…,L vanishes. □
Also for β=1 and γ=1, the result in Example 11.1 agrees with Theorems 3.4 and 3.11. For divisor methods, we have the following general result. (By Example 11.1 and (3.14), the result does not hold for any other β.)
Theorem 11.3
Consider the β-linear divisor method and suppose that β=k/2 for some integer k. Then (3.5) holds for any rational p 1,…,p m .
Proof
By (3.14), the result holds for β if and only if it holds for β+1. Hence it suffices to consider β=1/2 and β=1. The case β=1/2 is part of Theorem 11.2.
For β=1, we argue similarly. (Cf. the proof of Theorem A.11.) Let 0≤N<L and let t≥0 be such that s i (N)=[p i t]1, see (2.7) and (A.3). Recall that [p i t]1=⌊p i t⌋ except that ⌊p i t⌋−1 also is possible when p i t is an integer.
Since p i L is an integer, if we first for simplicity assume that p i t is not,
where summing over i yields
Consequently, (A.3) again shows that these numbers yield the seat distribution for 2L−N−m seats. (Note that \(L=\sum_{i=1}^{m}p_{i}L\ge m\), and thus 2L−N−m>0.) Hence, using (11.2) again,
The argument works also, with a little care, if some p i t is an integer, and shows that (11.4) holds generally, and that ties in the distribution of s i (N) correspond to ties in the distribution of s i (2L−m−N). Hence, we have, regarding Δ i as a random variable when there is a tie,
Taking the expectations in case of a tie, and then the average over the period N=0,…,L−1, when 2L−m−N runs through the period L−m+1,…,2L−m, we see that if the average is x, then x=mp i −x−1; hence \(x=\frac{1}{2}(mp_{i}-1)\), and thus (3.5) holds when β=1. □
For the 1-quota method (Droop), we have seen that (3.16) holds in Example 11.1; in fact, when m=2, it holds for any (p 1,p 2) by Theorem 11.3, because the 1-quota method equals the 1-linear divisor method when m=2, see Appendix B.1. However, it does not hold for m=3.
Example 11.4
Let m=3 and \((p_{1},p_{2},p_{3})=(\frac{2}{5},\frac{2}{5},\frac{1}{5})\). Consider the 1-quota method (Droop). For N=1,2,3,4,5, the smallest party gets, taking the average when there is a tie, \(0,0,1,\frac{2}{3},1\) seats. (Note the non-monotonicity, an instance of the Alabama paradox which frequently occurs for all quota methods, cf. Janson and Linusson 2012.) The seat excesses are \(-\frac{1}{5},-\frac{2}{5},+\frac{2}{5},-\frac{2}{15},0\), with average \(-\frac{1}{15}\), while (3.16) would give \(p_{1}-\frac{1}{3}=-\frac{2}{15}\). Thus (3.16) does not hold.
Note that this example shows that the heuristic argument in Remark 3.12 is not always valid when p 1,…,p m are rational. In this example, the retracted seat in Remark 3.12 is not taken uniformly; a calculation shows that it is taken from the smallest party only \(\frac{4}{15}\) of the times.
We can ask the following for any β-linear divisor method or γ-quota method, where the cases γ=1 in Problem 11.5 and β=1/2, β=1 and γ=0 in Problem 11.6 are especially interesting. (By Theorem A.12, for the linear divisor methods, it suffices to consider 0<β≤1.)
Problem 11.5
Find a general formula for the asymptotic bias \(\lim\mathbb{E}\Delta _{i}\) for rational p 1,…,p m . (Theorems 11.2 and 11.3 yield the answers for β∈ℤ/2 and γ=0.)
Problem 11.6
Find a general formula for the asymptotic distribution of Δ i for rational p 1,…,p m .
References
Balinski, M. L., & Young, H. P. (2001). Fair representation (2nd ed.). Washington: Brookings Institution Press.
Belgium: Loi électorale communale du 4 août 1932. http://elections2006.wallonie.be/apps/spip/IMG/pdf/loi_electorale_1932-2.pdf. Example of official regulations (accessed 21 October 2011).
Billingsley, P. (1968). Convergence of probability measures. New York: Wiley.
Bingham, N. H., Goldie, C. M., & Teugels, J. L. (1987). Regular variation. Cambridge: Cambridge University Press.
Carstairs, A. M. (1980). A short history of electoral systems in western Europe. London: George Allen & Unwin.
D’Hondt, V. (1882) Système pratique et raisonné de représentation proportionnelle. Bruxelles.
Drton, M., & Schwingenschlögl, U. (2005). Asymptotic seat bias formulas. Metrika, 62(1), 23–31.
Estonia: Riigikogu Election Act. http://www.vvk.ee/public/dok/RKseadus_eng_2010.pdf. Example of official regulations (accessed 21 October 2011).
Farrell, D. M. (2011). Electoral systems (2nd ed.). Basingstoke: Palgrave Macmillan.
Gaffke, N., & Pukelsheim, F. (2008). Divisor methods for proportional representation systems: an optimization approach to vector and matrix apportionment problems. Mathematical Social Sciences, 56, 166–184.
Germany: Bundeswahlgesetz. www.bundestag.de/dokumente/rechtsgrundlagen/bwahlg_pdf.pdf. Example of official regulations (accessed 21 October 2011).
Grafakos, L. (2004). Classical and modern Fourier analysis. Upper Saddle River: Pearson.
Grimmett, G. R. (2012). European apportionment via the Cambridge compromise. Mathematical Social Sciences, 63, 68–73.
Gut, A. (2005) Probability: a graduate course. New York: Springer.
Heinrich, L., & Schwingenschlögl, U. (2006). Goodness-of-fit criteria for the Adams and Jefferson rounding methods and their limiting laws. Metrika, 64(2), 191–207.
Heinrich, L., Pukelsheim, F., & Schwingenschlögl, U. (2004). Sainte-Laguë’s chi-square divergence for the rounding of probabilities and its convergence to a stable law. Statistics & Decisions, 22, 43–59.
Heinrich, L., Pukelsheim, F., & Schwingenschlögl, U. (2005). On stationary multiplier methods for the rounding of probabilities and the limiting law of the Sainte-Laguë divergence. Statistics & Decisions, 23, 117–129.
Huntington, E. V. (1928). The apportionment of representatives in Congress. Transactions of the American Mathematical Society, 30(1), 85–110.
Janson, S. (2011). Another note on the Droop quota and rounding. Voting Matters, 29, 32–34.
Janson, S., & Linusson, S. (2012, to appear) The probability of the Alabama paradox. Journal of Applied Probability, 49.
Kopfermann, K. (1991). Mathematische Aspekte der Wahlverfahren. Mannheim: Wissenschaftsverlag.
Leutgäb, P., & Pukelsheim, F. (2009). List apparentements in local elections – a lottery. Homo Oeconomicus, 26(3/4), 489–500.
Macau: Lei Eleitoral No. 3/2001. http://bo.io.gov.mo/bo/i/2001/10/lei03.asp. Example of official regulations (accessed 21 October 2011).
Niemeyer, H. F., & Niemeyer, A. C. (2008). Apportionment methods. Mathematical Social Sciences, 56(2), 240–253.
Niemeyer, H., & Wolf, G. (1984). Über einige mathematische Aspekte bei Wahlverfahren. Zeitschrift für Angewandte Mathematik und Mechanik, 64(5), 340–343.
Oelbermann, K.-F., Pukelsheim, F., & Palomares, A. (2010). The 2009 European Parliament elections: from votes to seats in 27 ways. European Electoral Studies, 5(1), 148–182.
Pólya, G. (1918a). Über die Verteilungssysteme der Proportionalwahl. Zeitschrift für schweizerische Statistik und Volkswirtschaft, 54, 363–387.
Pólya, G. (1918b). Sur la représentation proportionnelle en matière électorale. L’Enseignement Mathématique, 20, 355–379.
Pólya, G. (1919). Proportionalwahl und Wahrscheinlichkeitsrechnung. Zeitschrift für die Gesamte Staatswissenschaft, 74, 297–322.
Sainte-Laguë, A. (1910a). La représentation proportionnelle et la méthode des moindres carrés. Comptes Rendus Hebdomadaires des Séances de L’Académie des Sciences, 151, 377–378.
Sainte-Laguë, A. (1910b). La représentation proportionnelle et la méthode des moindres carrés. Annales Scientifiques de L’Ecole Normale Supérieure (3), 27, 529–542.
Schuster, K., Pukelsheim, F., Drton, M., & Draper, N. R. (2003). Seat biases of apportionment methods for proportional representation. Electoral Studies, 22, 651–676.
Schwingenschlögl, U. (2008). Seat biases of apportionment methods under general distributional assumptions. Applied Mathematics Letters, 21(1), 1–3.
Schwingenschlögl, U., & Drton, M. (2004). Seat allocation distributions and seat biases of stationary apportionment methods for proportional representation. Metrika, 60(2), 191–202.
Schwingenschlögl, U., & Drton, M. (2006). Seat excess variances of apportionment methods for proportional representation. Statistics & Probability Letters, 76(16), 1723–1730.
Weyl, H. (1916). Über die Gleichverteilung von Zahlen modulo Eins. Mathematische Annalen, 77(3), 313–352.
Acknowledgement
I thank Friedrich Pukelsheim for many helpful comments.
Author information
Authors and Affiliations
Corresponding author
Appendices
Appendix A: Divisor methods
A divisor method is based on a given sequence of numbers d(n), n≥1, with 0≤d(1)<d(2)<d(3)<⋯ . Different choices of d(n) give different methods. A number of divisor methods that have been used or discussed are shown in Table 1. The most important ones are the widely used methods by Jefferson/D’Hondt and Webster/Sainte-Laguë. (Notation varies. In e.g. Balinski and Young (2001), the sequence is denoted d(0),d(1),… ; thus their d(n) is our d(n+1).)
Divisor methods can be described in several different ways that yield the same result. The methods have been invented and reinvented in different guises; tradition varies, and different types of formulations are used in, for example, election laws and other official and non-official descriptions from different countries. There are two main types of formulation, and we give each in two different versions. (The equivalence of the different types of formulations have been known for a long time. For example, D’Hondt proposed his method in D’Hondt (1882) using essentially our formulation D1 below, and later showed the equivalence to D4, see Kopfermann 1991, p. 125.)
In the first type of formulation, the method is seen as a way of rounding. Given a sequence d(1),d(2),… as above, we define the d-rounding of a real number x>0 by
where d(0)=0. Note that this is unambiguous only if d(n)<x<d(n+1); if x=d(n) for an integer n, then both [x] d =n and [x] d =n−1 are acceptable values. (This is important in the case of ties, see below.) In particular, \(d(n)=n-\frac{1}{2}\) (Webster’s method) yields standard rounding; d(n)=n (Jefferson’s method) yields rounding downwards; d(n)=n−1 (Adams’s method) yields rounding upwards. (Other choices of d(n) yield non-standard roundings. Note that “rounding” in general should be interpreted in a very weak sense, and that |[x] d −x|>1 may be possible, see Remark A.3.) In this formulation, the numbers d(n) are sometimes called signposts, see Balinski and Young (2001) and Gaffke and Pukelsheim (2008).
Using the concept of d-rounding, and the notation from Sect. 2, the divisor method may be defined as follows:
Divisor method, formulation D1
Let
where D is a (real) number chosen such that \(\sum_{i=1}^{m}s_{i}=N\).
The rationale for this formulation is that D is regarded as the price of a seat, i.e. the number of votes that a seat “costs”, which would give v i /D seats to party i; this real number is d-rounded to an integer to obtain s i . The price D is set by “the market” (i.e. the election officer, or computer) so that the desired total number of seats is distributed. This interpretation can be further combined with the definition (A.1) to say that the price of n seats is d(n)D votes.
Remark A.1
In particular, (A.1) yields [x] d =0 if and only if x<d(1), or perhaps x=d(1). Thus, if d(1)=0, then [x] d ≥1 for every x>0, while if d(1)>0, then [x] d =0 for small positive x.
Hence, if d(1)=0, then (A.2) yields s i ≥1, so every party (with at least one vote) is guaranteed at least one seat. This is unacceptable in general elections, where on the contrary there often are special threshold rules to prevent small parties from getting seats. However, this is acceptable, and may even be desirable, when distributing seats between states or constituencies, where typically there is a side condition of at least one seat each (and sometimes a minimum of two or more seats); a typical example is the United States Constitution (Balinski and Young 2001). (If d(1)=0, we assume that N≥m, so that there are enough seats to satisfy this minimum requirement.)
A divisor method is by Gaffke and Pukelsheim (2008) called impervious if d(1)=0 and pervious if d(1)>0.
If we move D continuously from ∞ down to 0, it is seen that the total number of seats \(\sum_{i=1}^{m}s_{i}\) given by (A.2) grows monotonously from 0 (when d(1)>0) or m (when d(1)=0) to ∞, and there is a value of D that yields the desired sum \(\sum_{i=1}^{m}s_{i}=N\). In general there is an interval of such D, all yielding the same seat distribution s i , and this determines all s i uniquely. In exceptional cases, there is only a single D that works; in this case we must have equalities v i /D=d(s i ) for some party i and v j /D=d(s j −1) for some other party j, see (A.2) and (A.1), and in this case the result \((s_{i})_{1}^{m}\) is not unique; we may move one seat from party j to party i and (A.2) still holds. Such cases are known as ties; they are usually resolved by drawing lots, but other special rules for ties are used in some countries. The formulation D1 thus gives a well-defined method in the sense that the numbers s i are uniquely determined, except possibly when there are ties. (Note that the number D is not uniquely determined.)
It is sometimes, as in Sect. 4, convenient to rewrite (A.2) as follows using x:=1/D or t:=V/D. A divisor method may thus also be called a multiplier method, as in Heinrich et al. (2005).
Divisor method, formulation D2
Let
where x or t is a (real) number chosen such that \(\sum_{i=1}^{m}s_{i}=N\).
Formulations of the type D1 of divisor methods are the standard in USA, where Thomas Jefferson formulated a method of this type (with rounding downwards) in 1792 for apportionment; Jefferson’s method was adopted by Congress in 1792 and used until 1832 (Balinski and Young 2001). Four other divisor methods (using formulation D1 with different d(n), see Table 1) have also been important in USA; Huntington’s method is used since 1941 and Webster’s method has been used earlier, while Adams’s and Dean’s methods have never been used but have frequently figured in discussions, see Huntington (1928) and Balinski and Young (2001).
In Europe, this type of formulation is unusual (although Grimmett 2012 and the current German election law (Germany, §6 (2)) are examples), and divisor methods are more commonly defined by formulations such as the two following (closely related) ones. Note that these formulations are algorithmic, in the sense that they directly yield workable methods, while with D1 one has to search for a suitable divisor. (Preferably by computer, where such searches are easy.) The European tradition of divisor methods is younger than the American one, see e.g. Carstairs (1980), Farrell (2011), Kopfermann (1991). D’Hondt’s method (which is the same as Jefferson’s) was proposed by D’Hondt (1882) in 1882 and first adopted in Belgium 1899; it is now used in many countries. Sainte-Laguë’s method (which is the same as Webster’s) was proposed by Sainte-Laguë (1910a, 1910b) in 1910, and is also used in several places.
Divisor method, formulation D3
The N seats are awarded sequentially, with each seat given to the party i that has the highest value of the quotient v i /d(s i +1) where s i is the number of seats that the party has received so far.
The quotient v i /d(s i +1) is called the comparative figure of the party. Note that the comparative figure is updated (decreased) each time the party receives a seat.
Divisor method, formulation D4
Divide each v i by the numbers d(1), d(2), … (as far as necessary). Assign the seats to the N largest quotients v i /d(j).
With formulation D4, the quotients v i /d(j) can be arranged in a table (matrix); alternatively, the sequences of quotients obtained for different parties can be merged into a single sequence in decreasing order. It is immediate that formulations D3 and D4 are equivalent; D3 picks the largest quotients in D4 in decreasing order.
Of course, ties can occur in formulations D3 and D4 as well, and again they are resolved by drawing lots, or possibly by some other rule. (In formulation D3, ties may also occur at intermediate stages, but they do not affect the final result.)
To see that the formulations D1–D4 are equivalent and yield the same method, note first that (A.2) can be written, using the definition (A.1),
or, equivalently (interpreting v i /0=+∞)
Consequently, \((s_{i})_{i=1}^{N}\) are such that
conversely, for any such s i we may choose D such that (A.5) holds. Thus formulation D1 is equivalent to:
Divisor method, formulation D5
s 1,…,s m are chosen such that \(\sum_{i=1}^{m}s_{i}=N\) and (A.6) holds.
Furthermore, it is easily seen that formulation D5 is equivalent to D4, and thus also to D3. Hence all formulations are equivalent. (It is easily verified that also ties appear simultaneously in the different formulations, except that in formulation D3 there may be additional intermediate ties that do not affect the result.) Further equivalent formulations are given by Gaffke and Pukelsheim (2008).
We see also, from (A.5), that the possible choices of D in formulation D1 that yield s 1,…,s m are D∈[D −,D +], where
Thus, D + equals the winning comparative figure when the last seat is awarded in formulation D3, and D − equals the winning comparative figure for the next seat (if there were one).
Remark A.2
It follows immediately from any of the formulations D1–D5 that the sequences d(n) and cd(n), for any constant c>0, define the same divisor method. For example, Sainte-Laguë’s method is given in Table 1 with \(d(n)=n-\frac{1}{2}\); it can as well be defined with d(n)=2n−1. (In fact, the method is traditionally defined with this sequence, see e.g. Sainte-Laguë (1910a, 1910b). It is therefore also called the odd-number method.) Similarly, the adjusted Sainte-Laguë’s method (used in Sweden and Norway) is traditionally defined with the sequence 1.4,3,5,7,… , and the Danish method with the sequence 1,4,7,10,… . (This method is used in Danish parliamentary elections for part of the distribution of adjustment seats between constituencies; the main distribution among parties is by the method of largest remainder.) It may be convenient to use integers; on the other hand, Imperiali’s method (used in Belgian local elections) is in the election law (Belgium 1932, Art. 56) described using \(1, 1\frac{1}{2}, 2, 2\frac{1}{2}, \dots\) , i.e., (n+1)/2. In Table 1, we have normalised d(n) to the form n+β whenever possible, cf. Sect. A.1.
Remark A.3
Several papers impose the condition that n−1≤d(n)≤n, but this is really not necessary. This condition is natural for the interpretation of the method as a kind of rounding in formulation D1, since it is equivalent to [n] d =n for any integer n∈ℕ, see (A.1). In particular, this means that if the exact proportions q i =p i N happen to be integers, then the divisor method gives s i =q i . (Take D=V/N in (A.2); cf. (2.1).) However, the method is well-defined also for other sequences d(n), and as seen in Table 1, there are some divisor methods currently in use for general elections that do not satisfy this condition (the Imperiali method (Belgium 1932, Art. 56) and the methods in Estonia (§62 (5)) and Macau (Artigo 17)). Formulation D1 remains formally valid in these cases too, but in practice these methods are described using e.g. formulation D3 where there are no problems of interpretation for any increasing sequence d(n). (Methods satisfying s i =q i when all q i are integers are called weakly proportional by Balinski and Young 2001.)
Remark A.4
We required above that the sequence d(n) is strictly increasing. In fact, it can be defined assuming only 0≤d(1)≤d(2)≤⋯ , with only a few minor modifications above. This extension is hardly used in practice, except that a minimum requirement of at least r seats for each party for some r>1 can be achieved by taking d(1)=d(2)=⋯=d(r)=0, cf. Remark A.1 for the case r=1. (We then need N≥rm.)
For example, the Cambridge Compromise (Grimmett 2012) proposed for apportionment of the European Parliament can be described as giving each state a “base” of 5 seats and distributing the rest by Adams’s method (subject to a maximum of 96 seats for each state); this is (e.g. by formulation D1, see also Theorem A.12 below) equivalent to giving each state 6 seats and distributing the rest by D’Hondt’s method (subject to the same maximum). If we ignore the restriction to at most 96 seats (as done in Table 1 for simplicity), this is the same as a divisor method with d(n)=0 for n≤6 and d(n)=n−6 for n>6, i.e. d(n)=(n−6)+. The maximum restriction is easily implemented if e.g. formulation D3 is used, cf. Grimmett (2012); formally we can also implement it by redefining d(n)=∞ for n>96.
Remark A.5
The name “divisor method” is used for methods of this type in any of the formulations above. However, the word divisor is usually used for the (variable) number D in formulation D1, but for the (fixed) numbers d(1),d(2),… in formulations D3–D4.
1.1 A.1 Linear divisor methods
We say that a divisor method is linear if d(n)=an+b for some a>0 and b∈ℝ. By Remark A.2, we may replace d(n) by d(n)/a=n+b/a. We define β:=b/a+1. In other words, we may assume that a=1 and
for some real β. The reason for our choice of β as parameter, and thus formula (A.8), is (A.9) below. Note that β=d(1).
We call the divisor method with d(n) given by (A.8) the β-linear divisor method. The parameter β is called proportionality index (Proportionalitätsindex) by Kopfermann (1991). The method is called q-stationary multiplier method (where q=β) by Heinrich et al. (2005).
With a linear divisor method (A.8), the definition (A.1) of d-rounding yields that if [x] d =n, then n−1+β≤x≤n+β, or equivalently x−β≤n≤x−β+1. Consequently, d-rounding equals the β-rounding defined in (2.4) and (2.5); i.e.
at least provided x>0 and x>β−1. Hence, the β-linear divisor method can, using formulation D1, be defined by (2.7) as we did in Sect. 2. If β>1, we here have to assume that N is so large that v i /D>β−1 and thus [v i /D] β ≥0.
Remark A.6
If β>1 and N is small, (2.7) may give a negative seat number s i for a very small party. Of course, this has to be replaced by 0, as given by (A.2). Hence, we may still use (2.7) if we ignore any party that would get a negative number of seats.
If [v i /D] β ≤−1 for some i, then v i /D≤β−1 by (2.5) and thus for any party j, using (2.4),
and summing over j we get N≤(1/p i −m)(β−1). Hence, if N>(1/min i p i −m)(β−1), then (2.7) holds without exception.
Remark A.7
The definition above requires β=d(1)≥0. We can extend it to β<0 by replacing (A.8) by
i.e., letting d(n)=0 for n≤1+⌊|β|⌋, cf. Remark A.4. Then (A.9) still holds for x>0, and (2.7) holds as before. By Remark A.4, every party now gets at least ⌊|β|⌋+1 seats, and thus we need N≥m(⌊|β|+1⌋).
The Cambridge Compromise (without maximum rule) is an example, with β=−5.
Remark A.8
Linear divisor methods are often considered only for the case 0≤β≤1, which is equivalent to the condition n−1≤d(n)≤n discussed in Remark A.3. However, we see that they are well-defined for arbitrary real β, although some care may be needed for small N. Note that at least the case β=2 is used in practice.
In the main part of the paper we consider only linear divisor methods. Note that many of the methods in Table 1 are linear: Jefferson/D’Hondt (β=1); Webster/Sainte-Laguë (β=1/2); Adams (β=0); Imperiali (β=2); Danish (β=1/3); Cambridge Compromise (β=−5).
Remark A.9
The adjusted Sainte-Laguë method also has a linear d(n) as in (A.8) (β=1/2) except for d(1), which does not affect asymptotic results. More precisely, the adjusted Sainte-Laguë method differs from Sainte-Laguë’s method only in the value of d(1), and [x] d is the same for both methods for all x≥0.7; hence they give exactly the same result as soon as every party gets at least one seat by the adjusted method. (But the adjustment makes it more difficult for a small party to get the first seat.) For asymptotic results as in the present paper, there is thus no difference between the adjusted Sainte-Laguë method and Sainte-Laguë’s method.
Remark A.10
Huntington’s and Dean’s methods are asymptotically linear in the sense that d(n)=n−1+β+o(1) as n→∞, in both cases with β=1/2. An argument similar to the proof in Sect. 4 shows that as \({{N\stackrel{\mathrm{p}^{*}}{\longrightarrow}\infty}}\), the probability that these methods yield the same result as Webster’s tends to 1. In particular, Theorem 3.7 (with β=1/2) holds for these methods too, and Theorem 3.4 shows that they are asymptotically unbiased.
Note that our asymptotic approach does not answer the controversial question whether Webster’s or Huntington’s method is the most fair and unbiased, see Balinski and Young (2001); this depends mainly on how one measures the bias for small parties (states), in particular for the ones obtaining just 1 seat.
Although linear divisor methods may be biased, see Sect. 3, they are perfectly proportional with respect to changes in the total number of seats. (If β<0 or β>1, we assume that N is not too small, see Remarks A.6–A.7.)
Theorem A.11
Consider a β-linear divisor method. If the number of seats is increased from N to N+L, and Lp 1,…,Lp m all are integers, then party i gets exactly Lp i seats more:
Proof
This is easiest seen using the multiplier formulation D2. If t yields s i (N)=[p i t] β with \(\sum_{i=1}^{m}s_{i}(N)=N\), then t+L yields
with \(\sum_{i=1}^{m}s_{i}=\sum_{i=1}^{m}s_{i}(N)+\sum_{i=1}^{m}p_{i}L=N+L\). □
There is also a simple relation between the methods for two different values of β that differ by an integer.
Theorem A.12
Consider the β-linear divisor method and suppose that N is so large that every party gets at least one seat. (Otherwise, ignore the remaining parties.) Then the β-linear divisor method yields the same result as first giving one seat to each party and then distributing the remaining N−m seats by the (β+1)-linear divisor method.
Proof
This follows immediately from formulation D4, noting that by (A.8), or more generally (A.10), d β+1(n)=d β (n+1) (using β as a subscript to denote the divisor sequences).
Alternatively, this follows from (2.7) and the relation [x] β+1=[x] β −1, which follows from (2.6). □
By repeated applications of this theorem, we can reduce any β-linear divisor method to the case 0<β≤1.
Example A.13
Adams’s method (β=0) is the same as giving each party one seat and distributing the rest by Jefferson’s method (β=1).
Example A.14
The Imperiali method (β=2) is equivalent to using D’Hondt’s method (β=1) but retracting one seat from each party and redistribute them (among the parties that had at least one seat) by continued application of D’Hondt’s method.
Example A.15
The Cambridge Compromise (β=−5), is as said in Remark A.4 the same as giving each party (state) 5 seats and distributing the rest by Adams’s method (β=0), or as giving each party (state) 6 seats and distributing the rest by Jefferson/D’Hondt’s method (β=1).
1.2 A.2 Divisor methods and proportionality
We make a formal definition of proportionality for election methods, using an asymptotic property, since no election method is exactly proportional except in exceptional cases. (Balinski and Young 2001 uses a different definition, cf. Remark A.3.)
Definition A.16
An election method is (asymptotically) proportional if, for any number m of parties and any proportions of votes p 1,…,p m , for each party i,
Many proportional methods that are used are divisor methods, but not all divisor methods are proportional. In order to characterise the proportional divisor methods, recall that a positive measurable function f on some interval (A,∞) is regularly varying with index ρ, where ρ is a real number, if
for every λ>0. In the special case ρ=0, i.e. when f(λx)/f(x)→1 as x→∞ for every λ>0, we say that f is slowly varying. A typical example of a slowly varying function is logx. A sequence c n is said to be regularly varying with index ρ (and slowly varying if ρ=0) if the function c ⌊x⌋ is; this is equivalent to c ⌊λn⌋/c n →λ ρ as n→∞, for every λ>0. (See Bingham et al. 1987, where many more results are given.)
Theorem A.17
For a divisor method defined by d(1),d(2),… , the following are equivalent.
-
(i)
The method is proportional.
-
(ii)
For any m and p 1,…,p m , and all i,j≤m,
$$ \frac{s_j(N)}{s_i(N)}\to\frac{p_j}{p_i} \quad\mbox{\textit{as} }{{N\to \infty}}. $$(A.14) -
(iii)
x↦[x] d is regularly varying with index 1.
-
(iv)
x↦[x] d /x is slowly varying.
-
(v)
The sequence d(n) is regularly varying with index 1.
-
(vi)
The sequence d(n)/n is slowly varying.
The proof shows that it suffices that (A.12), or (ii), holds when m=2.
Proof
(i)⇔(ii). If (A.12) holds for each i, then also (A.14) holds. On the other hand, if (A.14) holds, then summing (A.14) over all j we obtain N/s i (N)→1/p i , and thus (A.12).
(ii)⇔(iii). By (A.2), s i =[v i /D] d =[p i x] d with x:=V/D, and hence (A.14) can be written
since \(N=\sum_{i=1}^{m}s_{i}=\sum_{i=1}^{m} [p_{i}x ]_{d}\) and thus N→∞⇔x→∞. If [x] d is regularly varying with index 1, then (A.15) follows from (A.13) with λ=p j /p i . Conversely, if (ii) holds, and λ>0, consider the case m=2 and p 1=λ/(1+λ), p 2=1/(1+λ). Then (A.15) with (j,i)=(1,2) and y:=p 2 x yields [λy] d /[y] d →λ, which is (A.13) with ρ=1.
(iii)⇔(iv). Directly by the definition (A.13).
(iii)⇔(v). Let g(x) be the inverse of [x] d defined by g(x):=inf{y:[y] d >x}. Then (A.1) shows that g(x)=d(⌊x⌋+1). Bingham et al. (1987, Theorem 1.5.12) shows that if [x] d is regularly varying with index 1, then so is g(x). This implies d(n+1)/d(n)→1 as n→∞, and it follows that d(n) is regularly varying with index 1. The converse follows in the same way, now using the inverse of d(⌊x⌋) which is [x] d +1.
(v)⇔(vi). Directly by the definition. □
In particular, Theorem A.17 shows that any divisor method with
for some a>0, is proportional. Of the methods in Table 1, the only methods that do not satisfy (A.16), and thus are proportional, are the methods of Estonia and Macau. In fact, it is easily seen from formulation D3 or D4 that the Estonian method is the same as D’Hondt’s method applied to \(v_{i}^{1/0.9}=v_{i}^{1.111\dots}\); thus it makes s i proportional to \(v_{i}^{1.111\dots}\). This gives a small but clear advantage to larger parties. Macau’s method, on the contrary, favours small parties and encourages splintering. (With only 12 directly elected members of the Legislative Assembly of Macau, it essentially imposes a maximum of 2 seats per party.)
In fact, the proportional divisor methods in Table 1, and all linear divisor methods, satisfy the following stronger form of proportionality.
Theorem A.18
For a divisor method with d(n)=an+O(1) for some a>0, s i (N)=p i N+O(1). (The implicit constant may depend on m, but not on anything else.)
Proof
By Remark A.2, we may replace d(n) by d(n)/a, and thus assume d(n)=n+O(1).
If d(n)=n+O(1), then also d(n−1)=n+O(1) and thus (A.1) shows that if [x] d =n, then x=n+O(1); in other words,
Consequently, (A.2) yields, with y:=V/D,
Summing over i we find \(N=\sum_{i=1}^{m}p_{i} y+O(1)=y+O(1)\), and thus (A.18) yields s i =p i N+O(1). □
1.3 A.3 Uniformity
Divisor methods satisfy the following consistency property, called uniformity in Balinski and Young (2001), when we consider subsets of the set of all parties. This is obvious from any of the formulations D1–D5, and would perhaps not be worth mentioning, except for the fact that it does not hold for quota methods; in fact, under some weak extra conditions, it is satisfied only by divisor methods (Balinski and Young 2001).
Theorem A.19
Suppose that some set of parties A 1,…,A ℓ (where 1≤ℓ≤m) get together N′ seats in an election by a divisor method. Then the number of seats for each of these parties is the same as if N′ seats were distributed, by the same divisor method, in an election with only these parties participating (and obtaining the same numbers of votes v i ).
Appendix B: Quota methods
In a quota method one first determines a quota Q; different quota methods differ in the formula for Q, see below. (Q can be seen as the standard price of a seat, just as the divisor D in the formulation D1 of divisor methods.) The seats are then distributed as follows:
Quota method, formulation Q1
Divide the numbers of votes v i by Q, and give first each party as many seats as the integer part ⌊v i /Q⌋ of its fraction. Any remaining seats are given to the parties that have largest fractional part {v i /Q}=v i /Q−⌊v i /Q⌋. (Or, equivalently, largest remainder at the division v i /Q, since the remainder is v i −⌊v i /Q⌋Q, which is Q times the fractional part v i /Q−⌊v i /Q⌋.)
The most common quota method is the method of largest remainder, or Hamilton’s method, in Europe often called Hare’s method, which uses the simple quota, also called Hare quota, Q:=V/N, i.e. the average number of votes per seat. Note that then v i /Q=q i , the exactly proportional real allocation, see (2.1). (Alexander Hamilton suggested the method in 1792 for apportionment to the US Congress; it was approved by Congress but vetoed by president Washington; it made a comeback and was used (under the name Vinton’s method) 1850–1900, see Balinski and Young (2001). The name Hare’s method is also well-established but is a misnomer; Thomas Hare really advocated a different method, the Single Transferable Vote, STV, which is not based on party lists and is not treated here.)
Another version is Droop’s method which uses the Droop quota Q:=V/(N+1). Also the Imperiali quota Q:=V/(N+2) has occasionally been used (e.g. in Italy until 1993, Farrell 2011).
We define, more generally, for any real number γ, the γ-quota method to be the quota method with quota
(For γ<0, we assume that N>|γ|.) Thus Hamilton/Hare’s method is the case γ=0, Droop’s method is γ=1, and the Imperiali quota is γ=2. (Kopfermann 1991 calls the γ-quota method Rundungsverfahren mit dem Proportionalitätsindex ρ, where his ρ=(γ+1)/2, i.e., γ=2ρ−1.)
Remark B.20
The quotas above are sometimes rounded to integers (upwards, downwards or by standard rounding); see Oelbermann et al. (2010) for several different examples in current use. There is no mathematical reason to round the quota; on the contrary, there are reasons against. Kopfermann (1991) points out that quota methods involve rounding at the end, but that, as a general principle, intermediate values in a calculation should not be rounded. A specific problem is that rounding the quota means that the method is no longer homogeneous (see Sect. 2). Moreover, with a rounded quota, in some cases a party may lose a seat by getting an additional vote, which ought to be unacceptable; see Janson (2011) for a simple example with Droop’s method (although there formulated for STV, which in the example gives the same result). In a general election with a large number of votes, and therefore a large quota, rounding the quota will usually make no difference, but in the exceptional cases when it does make a difference, it can be harmful and lead to undesirable results.
In the present paper, we consider the methods with unrounded quotas.
Remark B.21
In formulation Q1, it is implicitly assumed that \(N-m\le\sum_{i=1}^{m} \lfloor v_{i}/Q \rfloor\le N\), so that the number of remaining seats is non-negative and not larger than m, the number of parties. For the γ-quota method, this is always satisfied if −1≤γ<1 (including Hamilton/Hare), and also if γ=1 (Droop) except in a very special case when all parties tie for the last seat. For γ<−1 or γ>1, formulation Q1 may have to be amended; in such cases we interpret it by using one of the equivalent formulations Q2–Q4 below. (These formulations are well-defined for all Q and N, up to the usual possibility of ties.)
The rule in Q1 that the remaining seats are distributed to the parties with largest remainders v i /Q−⌊v i /Q⌋ can also be expressed by saying that we find a suitable α and round v i /Q upwards if the fractional part is greater than α, and downwards if the fractional part is smaller. In other words, recalling the notion of α-rounding in Sect. 2, a quota method can equivalently be described as follows.
Quota method, formulation Q2
where α is chosen such that \(\sum_{i=1}^{m}s_{i}=N\).
For the γ-quota method we have Q=V/(N+γ), and thus v i /Q=p i V/Q=p i (N+γ). Hence we have also the following formula.
Quota method, formulation Q3
The γ-quota method is given by
where α is chosen such that \(\sum_{i=1}^{m}s_{i}=N\).
Formulations Q2–Q3 are the same as (2.8).
Quota methods can also be described using a sequential allocation of seats by comparative figures as in formulation D3 of divisor methods. (This is not the usual way to define quota methods, but it appears in proofs in Sainte-Laguë 1910b and Pólya 1918a.) It is easy to see that Q2 is equivalent to the following.
Quota method, formulation Q4
The N seats are awarded sequentially, with each seat given to the party i that has the highest value of the difference v i −s i Q where s i is the number of seats that the party has received so far.
It is easy to see that every γ-quota method satisfies s i (N)=Np i +O(1), uniformly in all m and p 1,…,p m , and hence is proportional, cf. Definition A.16 and Theorem A.18. We omit the easy proof since we show a precise version of this in Theorem 3.10.
Moreover, a γ-quota method is perfectly proportional with respect to changes in the total number of seats, just as a β-linear divisor method, cf. Theorem A.11.
Theorem B.22
Consider a γ-quota method. If the number of seats is increased from N to N+L, and Lp 1,…,Lp m all are integers, then party i gets exactly Lp i seats more:
Proof
Obvious from (B.3), since the same α works for N and N+L. □
2.1 B.1 Two parties
In the simple case m=2, when there are only two parties, the β-linear divisor method and the γ-quota method with γ=2β−1 coincide (Kopfermann 1991, Satz 6.2.6). (In particular, then Webster/Sainte-Laguë = Hamilton/Hare and Jefferson/D’Hondt = Droop.) In fact, it is easy to see that
gives s 1+s 2=N, (at least for some choice in case of a tie). Thus, for the β-linear divisor method we can take D′=1/(N+2β−1)=1/(N+γ) and D=D′V=V/(N+2β−1) in (2.7), and for the γ-quota method we can take α=β in (2.8). Hence, both methods yield the result given by (B.5).
Appendix C: Lemmas on uniform distribution
Proof of Lemma 5.1
First, by Lemma 4.1, the sequence
is uniformly distributed in [0,1)k. Thus, by the definition of \(\operatorname{Mod}_{m}\),
is uniformly distributed in [0,m)k. If a sequence (x n ) is uniformly distributed in [0,m), then the sequence of pairs ({x n },⌊x n ⌋) is uniformly distributed in [0,1)×{0,…,m−1}. This extends to k dimensions, and thus it follows from (C.2) that if \(\tilde{\ell}_{nj}:=\operatorname{Mod}_{m}( \lfloor ny_{j}+a_{j} \rfloor)= \lfloor\operatorname{Mod}_{m}(ny_{j}+a_{j}) \rfloor\), then
is uniformly distributed in [0,1)k×{0,…,m−1}k.
The argument just given shows that (C.3) is uniformly distributed also along any subsequence n=mν+n 0, ν≥1. We have \(\ell_{n}=\operatorname{Mod}_{m}(n-\sum_{j=1}^{k}\tilde{\ell}_{nj})\), and it follows that along any such subsequence, ℓ n is determined by \(\tilde{\ell}_{n1},\dots,\tilde{\ell}_{nk}\). Consequently, the vectors (5.11) are uniformly distributed along any such subsequence, and therefore also along the full sequence. □
The following lemma is essentially taken from Heinrich et al. (2004). (There Riemann integrability is assumed, but as the proof below shows, Lebesgue integrability suffices.)
Lemma C.23
Let X=(X 1,…,X k ) be a random variable with an absolutely continuous distribution in ℝk. Let ν n be any sequence of constants with ν n →∞. Then
where U 1,…,U k ∼U(0,1) are independent of each other and of X.
Proof
The assumption that X is absolutely continuous means that it has a density function f, which is a Lebesgue integrable function on ℝk.
We regard {ν n X j } and U j as elements of the circle group \(\mathbb{T}=\mathbb{R}/\mathbb{Z}\). Thus the random vectors in (C.4) are elements of the group \(\mathbb{T}^{k}\times\mathbb{R}^{k}\), and by standard Fourier analysis, it suffices to show that for any integers ℓ 1,…,ℓ k and real numbers t 1,…,t k ,
Since ν n X j −{ν n X j } is an integer, we have, with f as above,
If (ℓ 1,…,ℓ k )=(0,…,0), then (C.5) is trivial. If (ℓ 1,…,ℓ k )≠(0,…,0), then |(2πℓ 1 ν n +t 1,…,2πℓ k ν n +t k )|→∞, and thus (C.6) and the Riemann–Lebesgue lemma show that the left-hand side of (C.5) tends to 0, which verifies (C.5) in this case too, since
because \(\mathbb{E}e^{2\pi\mathrm{i}\ell_{j}U_{j}}=0\) when ℓ j ≠0. □
Appendix D: Moments of order statistics for random p i
Let p 1,…,p m be random and uniformly distributed on \(\mathfrak{S}_{m}\) as in Sect. 6, and let p (1)≤⋯≤p (m) be the order statistics. We give here, for completeness, a calculation of moments of p (k) by a well-known method. Note that we here follow standard convention for order statistics in probability theory and order p i in increasing order, in contrast to Sect. 6 where we order them in decreasing order; thus p [k]=p (m+1−k), which should be remembered when using the results below in Sect. 6.
Let T 1,…,T m be m independent identically distributed random variables with the exponential distribution \(\operatorname{Exp}(1)\), and let \(Z:= \sum_{i=1}^{m}T_{i}\). Then (T 1/Z,…,T m /Z) is uniformly distributed on the simplex \(\mathfrak{S}_{m} \), and moreover independent of Z. This means that we can take p i =T i /Z. For the corresponding order statistics we thus have p (k)=T (k)/Z; moreover p (k) and Z are independent and thus, for any k and ℓ,
so the moments of p (k) are given by
Furthermore, \(Z:=\sum_{i=1}^{m}T_{i}\) has the Gamma distribution Γ(m) with moments
To find the moments of the order statistics T (k), we regard T 1,…,T m as the times of the first events in m independent Poisson processes. (Alternatively, we may regard them as life-lengths of m identical radioactive atoms.) It is then well-known, and easy to see by the lack of memory in exponential distributions, that the increments (or waiting times) V j :=T (j)−T (j−1) (with T (0):=0) are independent exponential random variables with \(V_{j}\sim\operatorname {Exp}(1/(m+1-j))\). Hence \(T_{(k)}=\sum_{j=1}^{k}V_{j}\) with these V j , and moments are easily computed. In particular,
(This can be written as H m −H m−k , where \(H_{m}=\sum_{i=1}^{m}1/i\) is the m:th harmonic number.) Similarly,
Covariances are computed similarly. If 1≤k≤ℓ≤m, then
and, calculating as in (D.5) and (D.6),
Example D.24
For m=3, the covariance matrix of (p [1],p [2],p [3])=(p (3),p (2),p (1)) is
Rights and permissions
About this article
Cite this article
Janson, S. Asymptotic bias of some election methods. Ann Oper Res 215, 89–136 (2014). https://doi.org/10.1007/s10479-012-1141-2
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10479-012-1141-2