1 Introduction

The idea of a proportional election method is that each party gets a number of seats that is proportional to the number of votes. The same mathematical problem arises if seats are to be apportioned (before the election) between states in a union or between multi-member constituencies according to their populations. In particular, the problem of apportionment to the House of Representatives in the United States has been the source of much debate as well as much research since 1790, including the invention of several important election methods; see Balinski and Young (2001) for a detailed history. (The European Union has yet to agree on a similar procedure; see Grimmett (2012) for a recent attempt to replace the current political bartering with a “mathematical formula”, i.e. a well-defined method to allocate seats in the European parliament to the member states.) To fix the terminology, we talk in this paper about parties in an election; the case of apportionment between states, etc. differs mathematically only in language.

Of course, exact proportionality is in general not possible, since the number of seats has to be an integer, and the election method can be seen as a method to “round” the real numbers that yield exact proportionality to integers. The purpose of this paper is to study the resulting deviation from perfect proportionality, i.e. the “rounding errors” introduced by the method. In particular, we are interested in whether there is a systematic bias favouring large or small parties for various election methods.

This problem has received a considerable amount of attention going back to (at least) Pólya (1918a, 1918b, 1919). (Sainte-Laguë 1910b did a related comparison between two different methods already in 1910, see Example 3.5.) Some investigations have analysed data from real elections or from simulations (with party sizes chosen at random according to some predetermined distributions), see e.g. Balinski and Young (2001) and Schuster et al. (2003). The problem has also been studied theoretically by assuming that the parties have random sizes with a uniform distribution of the relative sizes over all possibilities, see Pólya (1918a, 1918b, 1919), Schuster et al. (2003), Heinrich et al. (2005) and Drton and Schwingenschlögl (2004, 2005, 2006). Random party sizes with more general distributions have been considered by Heinrich et al. (2004, 2005) and Schwingenschlögl (2008). (See also Sect. 6 below.)

The approach in the present paper is different. We consider a number of parties with fixed sizes and let the total number of seats N be random. More precisely, we let N be chosen uniformly at random in {1,2,…,N 0} for some large integer N 0, and then we take the limit as N 0→∞. (Limits obtained in this way, if they exist, can be regarded as results for “a random positive integer” N. See also Remark 3.1.) The same approach has been used in Janson and Linusson (2012) for a related problem (the Alabama paradox for Hamilton’s method).

We use this approach for several different election methods, and calculate, for given sizes of the parties (under a technical condition, see Sects. 3 and 11), the (asymptotic) distribution of the deviation from perfect proportionality as well as its mean, i.e. the systematic bias, and its variance. In particular, this shows how the bias depends on the sizes of the parties in a more detailed way than previous studies that consider random party sizes (see the references given above).

The election methods that we consider are divisor methods of the “linear” (or “stationary”) type (including Jefferson/D’Hondt and Webster/Sainte-Laguë) and quota methods (including Hamilton/Hare and Droop). See further Sect. 2, and Appendices A–B.

The main results are stated in Sect. 3 and proven in Sects. 4–5. Section 6 shows that the results hold also in the more traditional approach with deterministic house size N and random party sizes. The following sections contain some simple applications of our main results: Sect. 7 discusses the probability of violating quota, Sect. 8 the expected gain or loss for parties forming an apparentement (coalition), Sect. 9 the Sainte-Laguë divergence studied by Heinrich et al. (2004, 2005) and Sect. 10 some related goodness-of-fit functionals. Section 11 discusses our technical condition and the case of rational sizes of the parties. The appendices contain some background material.

2 Notation

We assume throughout that we have m parties with v i votes for party i; we let \(V:=\sum_{i=1}^{m}v_{i}\) be the total number of votes and p i :=v i /V the proportions of votes for party i. (In real elections there may also be votes for parties that are disqualified because of a threshold rule, blank votes, and other invalid votes; these are ignored here so \(\sum_{i=1}^{m} p_{i}=1\). We consider only parties that have received at least one vote, so p i >0.) We further assume that N seats are to be distributed (the house size), and let s i be the number of seats given to party i; thus \(\sum_{i=1}^{m}s_{i}=N\). We occasionally write s i (N) when we need to specify the house size.

Strict proportionality would give

$$ q_i:=\frac{v_i}{V}N=p_iN $$
(2.1)

seats to party i. (This is usually not an integer.) We define the seat excess for party i to be the difference

$$ \Delta_i:=s_i-q_i=s_i-p_iN. $$
(2.2)

Note that

$$ \sum_{i=1}^m \Delta_i=\sum_{i=1}^ms_i - N = 0. $$
(2.3)

We use the standard notations ⌊x⌋ and ⌈x⌉ for rounding down and up of a real number x, i.e. the largest integer ≤x and the smallest integer ≥x, respectively. We denote the fractional part of x by {x}:=x−⌊x⌋.

More generally, let α be a real number. We say that the α-rounding of a real number x is the integer [x] α such that

$$ x-\alpha\le[x ]_{\alpha} \le x-\alpha+1; $$
(2.4)

equivalently,

$$ [x ]_{\alpha} +\alpha-1 \le x \le[x ]_{\alpha}+\alpha. $$
(2.5)

If x−α is an integer, we regard [x] α as multiple-valued, and accept both x−α and x−α+1 as values of [x] α . (This exceptional case typically occurs at ties in the election methods.) Consequently,

$$ [x ]_{\alpha} = \lceil x-\alpha\rceil= \lfloor x+1-\alpha \rfloor, $$
(2.6)

except in the case when x−α is an integer and both x−α and x−α+1 are possible values. If 0≤α≤1, this means that x is rounded down if its fractional part is less than α and up if its fractional part is greater than α. In particular, \(\alpha=\frac{1}{2}\) yields standard rounding, α=0 yields rounding up and α=1 yields rounding down. (But note that we allow also α<0 or α>1, in which case |x−[x] α | may be greater than 1.)
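For concreteness, here is a minimal Python sketch of α-rounding (the function name alpha_round and the convention of returning both values at a tie are ours, not the paper's); exact Fractions avoid spurious floating-point ties.

```python
import math
from fractions import Fraction

def alpha_round(x: Fraction, alpha: Fraction) -> list[int]:
    """The alpha-rounding [x]_alpha of (2.4)-(2.6): x is rounded down if
    its fractional part is below alpha and up if it is above; when
    x - alpha is an integer, both values are returned (a tie)."""
    y = x - alpha
    f = math.floor(y)
    if y == f:                       # tie: [x]_alpha is two-valued
        return [f, f + 1]
    return [math.ceil(y)]            # = floor(x + 1 - alpha), cf. (2.6)

# alpha = 1/2 is standard rounding, alpha = 0 rounds up, alpha = 1 rounds down
print(alpha_round(Fraction(23, 10), Fraction(1, 2)))   # [2]
print(alpha_round(Fraction(23, 10), Fraction(0)))      # [3]
print(alpha_round(Fraction(5, 2), Fraction(1, 2)))     # tie: [2, 3]
```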

The two types of election methods that we consider can be defined as follows; see further Appendices A–B where further details and equivalent definitions are given. Note that all methods that we consider are homogeneous, i.e., the result depends only on the proportions p i of votes, but not on the total number V of votes; hence we will mainly use p i rather than v i in the sequel.

  • The β-linear divisor method, or the divisor method with divisors d(n)=n+β−1 (where β is a real number): Let

    $$ s_i:= \biggl[\frac{v_i}{D} \biggr]_{\beta}= \biggl[\frac{p_i}{D'} \biggr]_{\beta}, $$
    (2.7)

    where D (or D′=D/V) is chosen such that \(\sum_{i=1}^{m}s_{i}=N\). (If β<0 or β>1, there are some exceptions for small N, see Appendix A.1.) This includes several important methods, in particular β=1 (Jefferson, D’Hondt), β=1/2 (Webster, Sainte-Laguë), β=0 (Adams).

  • The γ-quota method (where γ is a real number): Let Q:=V/(N+γ) and let

    $$ s_i:= \biggl[\frac{v_i}{Q} \biggr]_{\alpha}= \bigl[(N+\gamma)p_i \bigr]_{\alpha}, $$
    (2.8)

    where α is chosen such that \(\sum_{i=1}^{m}s_{i}=N\). In principle, γ can be any real number (provided N+γ>0); in practice only the cases γ=0 (Hamilton, Hare, method of largest remainder) and γ=1 (Droop) are important, although also γ=2 has some limited use. (Both types of methods are illustrated in the code sketch after this list.)
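The following Python sketch implements both definitions in their standard operational forms (sequential highest averages for (2.7), largest remainders for (2.8)); the function names and the tie-breaking by lowest party index are our choices, whereas the paper resolves ties at random (Remark 2.1).

```python
import math
from fractions import Fraction

def divisor_method(votes, N, beta):
    """beta-linear divisor method (2.7) in its sequential highest-averages
    form: each of the N seats goes to the party maximising v_i / d(s_i + 1)
    with divisors d(n) = n + beta - 1."""
    seats = [0] * len(votes)
    for _ in range(N):
        def priority(i):
            d = seats[i] + beta        # d(s_i + 1) = (s_i + 1) + beta - 1
            # d <= 0 (possible when beta <= 0) gives absolute priority
            return (1, 0) if d <= 0 else (0, Fraction(votes[i]) / d)
        best = max(range(len(votes)), key=priority)
        seats[best] += 1
    return seats

def quota_method(votes, N, gamma):
    """gamma-quota method (2.8): give each party floor((N + gamma) p_i)
    seats, then assign the remaining seats by largest fractional part,
    which amounts to choosing alpha so that the seats sum to N."""
    V = sum(votes)
    q = [Fraction(v) * (N + gamma) / V for v in votes]   # (N + gamma) p_i
    seats = [math.floor(x) for x in q]
    order = sorted(range(len(votes)), key=lambda i: q[i] - seats[i],
                   reverse=True)
    for i in order[:N - sum(seats)]:                     # remaining seats
        seats[i] += 1
    return seats

votes = [53_000, 24_000, 23_000]                  # made-up vote counts
print(divisor_method(votes, 10, Fraction(1)))     # Jefferson/D'Hondt: [6, 2, 2]
print(divisor_method(votes, 10, Fraction(1, 2)))  # Webster/Sainte-Lague
print(quota_method(votes, 10, 0))                 # Hamilton/Hare: [5, 3, 2]
print(quota_method(votes, 10, 1))                 # Droop
```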

Remark 2.1

In both divisor methods and quota methods ties may occur; then the α-rounding in (2.7) or (2.8) has two possible values (because x−α∈ℤ in (2.4)) for (at least) two parties, with (at least) one of them rounded to the larger value and (at least) one rounded to the smaller value. We assume that ties are resolved at random, but ties will usually not be a problem for us because of our assumptions below.

We let U(a,b) denote the uniform distribution on the interval (a,b). We will use U i and \(\widetilde{U}_{i}\) for random variables with U i ∼U(0,1) and \(\widetilde{U}_{i}\sim\mathrm{U}(-\frac{1}{2},\frac{1}{2})\), and we assume that such variables with different indices are independent. Recall that then \(\mathbb{E}U_{i}=1/2\), \(\mathbb{E}\widetilde{U}_{i}=0\) and \(\operatorname {Var}U_{i}=\operatorname{Var}\widetilde{U}_{i}=1/12\).

We denote convergence in distribution by \(\stackrel{\mathrm {d}}{\longrightarrow}\), convergence in probability by \(\stackrel{\mathrm{p}}{\longrightarrow}\) and equality in distribution by \(\stackrel{\mathrm{d}}{=}\).

3 Main results

As explained in the introduction, we assume that p 1,…,p m are fixed, and let the total number of seats (house size) N be random and uniformly distributed on {1,…,N 0} for some large integer N 0; we then consider limits as N 0→∞. For convenience, we denote this situation by \({{N\stackrel{\mathrm {p}^{*}}{\longrightarrow}\infty}}\). (Note that this implies \(N\stackrel{\mathrm{p}}{\longrightarrow }\infty\).)

Remark 3.1

We assume that N is distributed in this way for simplicity, but the results can easily be extended to more general sequences of random N∈ℕ. For example, we may take N uniformly distributed on an interval [N 1(k),N 2(k)] and let k→∞, for any two sequences N 1(k) and N 2(k) of positive integers with N 2(k)−N 1(k)→∞. In fact, we conjecture that the results hold for any N that satisfies the weak condition

$$ \varphi(\theta) :=\mathbb{E}e^{\mathrm{i}N\theta}\to0 \quad \mbox{for any }\theta\in(0,2\pi). $$
(3.1)

(This implies \(N\stackrel{\mathrm{p}}{\longrightarrow}\infty\) and excludes parity restrictions.) Note that Weyl’s Lemma 4.1 extends to this situation. However, we have not verified all details for this case, and we leave further investigations to the interested reader.

In order to obtain simple asymptotic results, we will usually make one mathematical simplification: we assume that p 1,…,p m are linearly independent over ℚ, i.e., that there is no relation

$$ a_1p_1+\cdots+a_mp_m=0 $$
(3.2)

with all coefficients a i rational and not all 0. (Equivalently, there is no relation (3.2) with integer coefficients a i , not all 0.) The case when this assumption does not hold is discussed in Sect. 11.

Remark 3.2

Mathematically, this assumption is reasonable, since if we choose p 1,…,p m at random (uniformly given that their sum is 1, as in Sect. 6), they will almost surely be linearly independent over ℚ. However, for real elections the assumption is clearly unreasonable since vote counts v i are integers and the p i thus rational numbers. Nevertheless, the results below are good approximations if the numbers p i have large denominators and there are no relations (3.2) with small integers a 1,…,a m , see Sect. 11. For practical purposes we thus can use the results below as good approximations for any p i and any N that is large enough. (However, we do not investigate the rate of convergence, and how large N has to be.)

3.1 Divisor methods

We begin with a simple deterministic bound, valid for all house sizes and numbers of votes, cf. Kopfermann (1991, Satz 6.2.11). It follows from the proof, or from Theorem 3.7 below, that the bounds are best possible. (Among bounds not depending on N; for a given N, the lower bound can be improved when N is so small that not all parties get seats, see Kopfermann (1991, p. 131).) Proofs of the results below are given in Sect. 4.

Theorem 3.3

Consider the β-linear divisor method. Then

$$ p_i-1+(\beta-1) (mp_i-1) \le \Delta_i \le(\beta-1) (mp_i-1) + (m-1)p_i. $$
(3.3)

Equivalently,

$$ \biggl \vert \Delta_i-\biggl(\beta-\frac{1}{2}\biggr) (mp_i-1)\biggr \vert \le\frac{1}{2}+\frac{m-2}{2}p_i. $$
(3.4)

For the asymptotic bias, we have a simple formula.

Theorem 3.4

Consider the β-linear divisor method, and suppose that p 1,…,p m are linearly independent over ℚ. Then, as \({{N\stackrel{\mathrm{p}^{*}}{\longrightarrow}\infty}}\),

$$ \mathbb{E}\Delta_i \to\biggl(\beta-\frac{1}{2}\biggr) (mp_i-1). $$
(3.5)

Note that the asymptotic bias in (3.5) for a party depends only on its size p i and the number of parties, but not on the distribution of sizes between the other parties. The limit in (3.5) agrees with the formula by Kopfermann (1991, (S) on p. 208) for the advantage (or disadvantage) for different parties, derived by geometric considerations.

In particular, Theorem 3.4 shows that the bias is 0 for every party when β=1/2 (Webster/Sainte-Laguë). It is well-known that this divisor method is (asymptotically) unbiased, see e.g. (Pólya 1919; Balinski and Young 2001; Schuster et al. 2003; Schwingenschlögl and Drton 2004; Drton and Schwingenschlögl 2005) for various justifications of this; it is satisfying that our approach confirms this, and shows that the method really is unbiased for a party of any size.

For β>1/2 (for example Jefferson/D’Hondt with β=1), the bias is positive for p i >1/m and negative for p i <1/m. It is well-known that these methods favour larger parties; we here see exactly how much parties of different sizes are favoured, and that the boundary is exactly at the average size 1/m. Note also that the bias grows with m, especially for large parties. For example, for β=1, a large party with p i ≈1/2 has bias ≈(m−2)/4, while a small party (p i ≪1/m) always has a bias ≈−1/2.

For β<1/2 (for example Adams with β=0), we have the opposite biases; the boundary is still at p i =1/m.
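As a quick numerical illustration, the following Monte Carlo sketch estimates the bias on the left-hand side of (3.5) for D'Hondt and compares it with the limit; it reuses divisor_method from the sketch in Sect. 2 (assumed available), and the vote counts are hypothetical, chosen with large denominators so that Remark 3.2 applies.

```python
import random
from fractions import Fraction

votes = [52_123, 31_077, 16_800]        # hypothetical vote counts
beta = Fraction(1)                      # Jefferson/D'Hondt
m, V = len(votes), sum(votes)
N0, reps = 1_000, 400                   # N uniform on {1, ..., N0}
bias = [0.0] * m
for _ in range(reps):
    N = random.randint(1, N0)
    s = divisor_method(votes, N, beta)
    for i in range(m):
        bias[i] += (s[i] - N * votes[i] / V) / reps
for i, v in enumerate(votes):
    theory = (float(beta) - 0.5) * (m * v / V - 1)
    print(f"party {i}: empirical {bias[i]:+.3f}, theory (3.5) {theory:+.3f}")
```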

Example 3.5

Sainte-Laguë (1910b) compared in 1910 the results of his method (which he called the method of least squares) and D’Hondt’s method in the case of two parties (that he thought of as two of several parties in an election) with p 1≥p 2. He did not consider the biases in our sense, comparing the distribution of the seats to the distribution of the votes, but his approach means studying the difference of the two biases. (Since we know that Sainte-Laguë’s method (β=1/2) is unbiased, this is equivalent to calculating the bias of D’Hondt’s method (β=1).)

Sainte-Laguë found that if k:=p 2/p 1∈(0,1], then the largest party gets on the average (1−k)/(2(1+k)) seats more with D’Hondt’s method. This agrees with Theorem 3.4, with m=2 and p 1=1/(1+k).

Sainte-Laguë (1910b) also considered the case of random p i , assuming that the ratio p 2/p 1 is uniformly distributed over (0,1], and found by integration the average gain

$$ \int_0^1\frac{1-k}{2(1+k)}\,\mathrm{d}k = \log2 - \frac{1}{2} \approx0.193 $$
(3.6)

for the larger party with D’Hondt’s method. (Note that this differs from the average bias 1/4=0.25 for the largest of two parties derived by Schuster et al. (2003) assuming that p 2 is uniformly distributed on (0,1/2), see Theorem 6.4 below with m=2. The reason is that the two probability distributions of the party sizes are different; they are uniform in different ways.)

Remark 3.6

Sainte-Laguë (1910b) seems to have thought that this calculation for two parties applies to any two of several parties, and that therefore, if party A is larger than B, then B would on the average gain 0.193 seats compared to A if the method is changed from D’Hondt’s to Sainte-Laguë’s, regardless of the number and sizes of the other parties. Theorem 3.4 shows that this is incorrect; in fact, if A and B both are small parties, they both have essentially the same bias −1/2 with D’Hondt’s method. It is true that for divisor methods, the distribution of seats to two parties A and B is the same as if the total number of seats for these two parties is distributed between them by the same divisor method, see Theorem A.19. However, if both A and B are small parties, they will by (3.5) get on the average almost 1 seat less together than their proportional share, and if their total number is redistributed between them, most of this loss will be borne by the larger party, which offsets the advantage of the larger party in the redistribution and explains the fallacy. More generally, parties 1 and 2 will by (3.5) on the average get together N′:=N(p 1+p 2)+(β−1/2)(mp 1+mp 2−2) seats. If we distribute this number of seats between the two parties, the same formula would yield that party 1 on the average gets, with \(p_{1}':=p_{1}/(p_{1}+p_{2})\),

$$ N'p_1'+\biggl(\beta-\frac{1}{2}\biggr) \bigl(2p_1'-1\bigr) = Np_1+\biggl(\beta-\frac{1}{2}\biggr) (mp_1-1) $$

seats, in accordance with Theorem 3.4.

We can describe asymptotically not only the mean but also the distribution of the seat excess. We also obtain a joint limit distribution, which is explicit although a little involved.

Theorem 3.7

Consider the β-linear divisor method, and suppose that p 1,…,p m are linearly independent over ℚ. Then, as \({{N\stackrel{\mathrm{p}^{*}}{\longrightarrow}\infty}}\), for each i,

$$ \Delta_i\stackrel{\mathrm{d}}{\longrightarrow} \bar{X}_i:= \biggl(\beta-\frac{1}{2}\biggr) (mp_i-1) +\widetilde{U}_0+p_i \sum_{k=1}^{m-2} \widetilde{U}_k, $$
(3.7)

where \(\widetilde{U}_{k}\sim\mathrm{U}(-\frac{1}{2},\frac{1}{2})\) are independent. Moreover, jointly for i=1,…,m,

$$ \Delta_i\stackrel{\mathrm{d}}{\longrightarrow}X_i, $$
(3.8)

where the limit random variables X i can be constructed as follows: Let U 1,…,U m ∼U(0,1) and let J∈{1,…,m} be random with ℙ(J=j)=p j , j=1,…,m, with all variables independent. Let

$$ V_i:=\mathbf{1}\{J\neq i\}U_i= \begin{cases} U_i, & J\neq i,\\ 0, & J=i, \end{cases} \qquad i=1,\dots,m, $$
(3.9)

and let, finally,

$$ X_i:=p_i\sum_{j=1}^mV_j-V_i+( \beta-1) (mp_i-1). $$
(3.10)

Of course, as we shall verify in Sect. 4, \(X_{i}\stackrel {\mathrm{d}}{=}\bar{X}_{i}\).

The description in (3.9)–(3.10) of the joint distribution of X 1,…,X m is explicit but rather involved, and it is quite possible that there exists a simpler description; cf. Remark 10.3. For m=2, X 2=−X 1, and thus \((X_{1},X_{2})\stackrel{\mathrm {d}}{=}(\bar{X}_{1},-\bar{X}_{1})\) where \(\bar{X}_{1}:=(2\beta-1)p_{1}-\beta+U_{0}\) by (3.7). We leave it as an open problem to try to find alternative expressions for m≥3.

Corollary 3.8

Consider the β-linear divisor method, and suppose that p 1,…,p m are linearly independent over ℚ. Then, as \({{N\stackrel{\mathrm{p}^{*}}{\longrightarrow}\infty}}\),

$$ \operatorname{Var}\Delta_i \to\frac{1+(m-2)p_i^2}{12} $$
(3.11)

and, when i≠j,

$$ \operatorname{Cov}( \Delta_i,\Delta_j) \to\frac{(m-2)p_ip_j-p_i-p_j}{12}. $$
(3.12)

Note that the asymptotic distribution of the seat excess depends on β only through its mean \(\mathbb{E}X_{i}\) (i.e., the asymptotic bias in (3.5)); the centred random variable \(X_{i}-\mathbb{E}X_{i}\) does not depend on β. We see also, from (3.10) or (3.11), that this random part of the seat excess tends to be larger in absolute value for a large party.
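One can also sample directly from the limit construction; the sketch below (the function sample_X is ours) draws (X 1,…,X m ) via (3.9)–(3.10), with V i =U i unless i=J, and checks the mean (3.5) and variance (3.11) empirically.

```python
import random

def sample_X(p, beta):
    """One draw of (X_1, ..., X_m) from (3.9)-(3.10): J is chosen with
    P(J = j) = p_j, V_i = U_i for i != J and V_J = 0."""
    m = len(p)
    J = random.choices(range(m), weights=p)[0]
    V = [0.0 if i == J else random.random() for i in range(m)]
    S = sum(V)
    return [p[i] * S - V[i] + (beta - 1) * (m * p[i] - 1) for i in range(m)]

p, beta, reps = [0.52, 0.31, 0.17], 1.0, 200_000
m = len(p)
draws = [sample_X(p, beta) for _ in range(reps)]
for i in range(m):
    mean = sum(x[i] for x in draws) / reps
    var = sum((x[i] - mean) ** 2 for x in draws) / reps
    print(f"i={i}: mean {mean:+.4f} vs {(beta - 0.5) * (m * p[i] - 1):+.4f}, "
          f"var {var:.4f} vs {(1 + (m - 2) * p[i] ** 2) / 12:.4f}")
```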

In the limit as p i →0, i.e. for very small parties, (3.9) and (3.10) yield \(V_{i}\stackrel{\mathrm{d}}{\longrightarrow}U_{i}\) and

$$X_i\stackrel{\mathrm{d}}{\longrightarrow}-U_i -(\beta -1)=1-U_i-\beta\stackrel{\mathrm{d}}{=}U_i-\beta\sim \mathrm{U}(-\beta,1-\beta). $$

Remark 3.9

A simple heuristic motivation for the bias in Theorem 3.4 is that the β-rounding in (2.7) is unbiased if β=1/2, but otherwise, it costs on the average each party β−1/2 seats, compared to v i /D which is truly proportional. In total, this makes a loss of m(β−1/2) seats, which are redistributed proportionally (by shifting D), so party i gets back p i m(β−1/2) seats, on the average, giving a net gain of (mp i −1)(β−1/2), as shown by Theorem 3.4.

For a related, but rigorous, argument, let us compare the methods with parameters β and β+1, temporarily using a superscript to distinguish the methods. (For example, Adams’s method and Jefferson’s method.) By Theorem A.12 (assuming that N is large enough),

$$ s_i^{(\beta)}(N) = s_i^{(\beta+1)}(N-m)+1. $$
(3.13)

Hence,

$$ \Delta_i^{(\beta)}(N) = s_i^{(\beta)}(N)-Np_i= \Delta_i^{(\beta +1)}(N-m)-mp_i+1. $$
(3.14)

Taking the limit as \({{N\stackrel{\mathrm{p}^{*}}{\longrightarrow }\infty}}\), we see that the asymptotic distributions also differ by mp i −1. Hence, it is not surprising that the bias in Theorem 3.4 is mp i −1 times a linear function of β.

3.2 Quota methods

We again begin with a simple deterministic bound, cf. Kopfermann (1991, Satz 6.2.3). It follows from the proof, or from Theorem 3.13 below, that the bounds are best possible. Proofs of the results below are given in Sect. 5.

Theorem 3.10

For the γ-quota method,

$$ \gamma\biggl(p_i-\frac{1}{m}\biggr) - \frac{m-1}{m} \le\Delta_i \le\gamma\biggl(p_i-\frac{1}{m}\biggr) + \frac{m-1}{m}. $$
(3.15)

Again we have a simple formula for the asymptotic bias.

Theorem 3.11

Consider the γ-quota method and suppose that p 1,…,p m are linearly independent over ℚ. Then, as \({{N\stackrel{\mathrm{p}^{*}}{\longrightarrow }\infty}}\),

$$ \mathbb{E}\Delta_i \to\gamma\biggl(p_i- \frac{1}{m}\biggr). $$
(3.16)

As for the divisor methods, the asymptotic bias in (3.16) for a party depends only on its size p i and the number of parties, but not on the distribution of sizes between the other parties. The limit in (3.16) agrees with the formula by Kopfermann (1991, (S) on p. 199) for the (dis)advantage for different parties, derived by geometric considerations.

In particular, Theorem 3.11 shows that the bias is 0 for every party when γ=0 (method of largest remainder/Hamilton/Hare). Again it is well-known that this quota method is (asymptotically) unbiased, see e.g. (Pólya 1919; Balinski and Young 2001; Schuster et al. 2003; Schwingenschlögl and Drton 2004; Drton and Schwingenschlögl 2005); our approach confirms this and shows that the method is unbiased for a party of any size.

For γ=1 (Droop), the bias is positive for p i >1/m and negative for p i <1/m, just as for Jefferson’s method discussed above; hence Droop’s method too favours larger parties. (See Kopfermann 1991, p. 118 for figures illustrating the cases m=2 and m=3.) Note, however, that the bias for Droop’s method does not grow with m. A comparison between (3.16) (with γ=1) and (3.5) (with β=1) shows that the bias for Jefferson’s method is m/2 times the bias for Droop’s method, for any party of any size. Hence Droop’s method is less biased than Jefferson’s for every m≥3. (For m=2, not only are the biases the same; in fact the methods coincide when there are only two parties, see Appendix B.1.)

Remark 3.12

There is a simple heuristic explanation of the formula (3.16) for the bias, if we accept the fact that Hamilton/Hare’s method (γ=0) is unbiased. Consider γ=1 (Droop); this method can be seen as doing Hamilton’s method with N+1 seats, but retracting the last seat. Since Hamilton’s method is unbiased, this first distributes on the average (N+1)p i seats to party i. The retracted seat is taken essentially uniformly at random, since it depends on the fractional parts of the numbers q i =Np i only; hence each party loses on the average 1/m seats in that step. This argument gives a bias of p i −1/m, in accordance with (3.16).
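The analogous Monte Carlo check of (3.16) for Droop's method, reusing quota_method from the sketch in Sect. 2 (assumed available) and the same hypothetical vote counts as before:

```python
import random

votes = [52_123, 31_077, 16_800]        # hypothetical vote counts
gamma = 1                               # Droop
m, V = len(votes), sum(votes)
N0, reps = 1_000, 400                   # N uniform on {1, ..., N0}
bias = [0.0] * m
for _ in range(reps):
    N = random.randint(1, N0)
    s = quota_method(votes, N, gamma)
    for i in range(m):
        bias[i] += (s[i] - N * votes[i] / V) / reps
for i, v in enumerate(votes):
    print(f"party {i}: empirical {bias[i]:+.3f},"
          f" theory (3.16) {gamma * (v / V - 1 / m):+.3f}")
```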

Again we have an explicit expression also for the asymptotic distribution of the seat excess.

Theorem 3.13

Consider the γ-quota method and suppose that p 1,…,p m are linearly independent over ℚ. Then, as \({{N\stackrel{\mathrm{p}^{*}}{\longrightarrow }\infty }}\), for each i,

$$ \Delta_i\stackrel{\mathrm{d}}{\longrightarrow} Y_i:= \gamma\biggl(p_i-\frac{1}{m}\biggr) + \widetilde{U}_0+\frac{1}{m}\sum_{k=1}^{m-2} \widetilde{U}_k, $$
(3.17)

where \(\widetilde{U}_{k}\sim\mathrm{U}(-\frac{1}{2},\frac{1}{2})\) are independent.

Corollary 3.14

Consider the γ-quota method and suppose that p 1,…,p m are linearly independent over ℚ. Then, as \({{N\stackrel{\mathrm{p}^{*}}{\longrightarrow }\infty}}\),

$$ \operatorname{Var}\Delta_i\to \frac{1+(m-2)/m^2}{12} = \frac{(m+2)(m-1)}{12m^2}. $$
(3.18)

As for the divisor methods in Theorem 3.7, the asymptotic distribution of the seat excess depends on γ only through the mean in (3.16); \(Y_{i}-\mathbb{E}Y_{i}\) does not depend on γ. Moreover, unlike the case of divisor methods, this centred random variable does not depend on the party size p i . In particular, for Hamilton/Hare’s method (γ=0), the asymptotic distribution of the seat excess is the same for every party.

Remark 3.15

Let for simplicity γ=0. Note that the limit variable Y i in (3.17) is uniform \(\mathrm{U}(-\frac{1}{2},\frac{1}{2})\) for m=2; for m≥3 the distribution is more complicated, but as m→∞, the limit converges to \(\mathrm{U}(-\frac{1}{2},\frac{1}{2})\) (by (3.17) and the law of large numbers). Hence, for large m, Y i has essentially the same distribution as for m=2, and although the supremum and infimum of the range of Y i tend to 1 and −1 as m→∞, cf. (3.15), the probability \(\mathbb{P}(Y_{i}\notin[-\frac{1}{2},\frac{1}{2}])\to0\).

Similarly, it is seen from the variance formula (3.18) that the variance is 1/12≈0.08333 for m=2, increases to 5/54≈0.09259 for m=3 and to the maximum 3/32=0.09375 for m=4; the variance then decreases again and converges to 1/12 as m→∞.

The difference from a uniform variable is not very large, and it may in practice be a good approximation to regard the seat excess Δ i in Hamilton’s method as \(\mathrm{U}(-\frac{1}{2},\frac{1}{2})\) for any party and any reasonably large house size.

We have stated Theorem 3.13 only for individual parties. It is easy to see that a joint asymptotic distribution exists, but (unlike the case of divisor methods, cf. Theorem 3.7) we do not know any simple description of it. (There is a description implicit in the proof of Theorem 3.16, see (5.14).)

Theorem 3.16

Consider the γ-quota method and suppose that p 1,…,p m are linearly independent over ℚ. Then, as \({{N\stackrel{\mathrm{p}^{*}}{\longrightarrow }\infty}}\), \(\Delta_{i}\stackrel{\mathrm{d}}{\longrightarrow}\bar{Y}_{i}\) jointly for i=1,…,m for some random variables \(\bar{Y}_{i}\) such that the distribution of \((\bar{Y}_{1}-\gamma p_{1},\dots,\bar{Y}_{m}-\gamma p_{m})\) is symmetric (exchangeable) and independent of p 1,…,p m , and \(\bar{Y}_{i}\stackrel{\mathrm{d}}{=}Y_{i}\) for each i.

Corollary 3.17

Consider the γ-quota method and suppose that p 1,…,p m are linearly independent over ℚ. Then, as \({{N\stackrel{\mathrm{p}^{*}}{\longrightarrow }\infty}}\), when i≠j,

$$ \operatorname{Cov}(\Delta_i,\Delta_j) \to\operatorname{Cov}(\bar{Y}_i,\bar{Y}_j)= - \frac{\operatorname{Var}\bar{Y}_i}{m-1} =-\frac{m+2}{12m^2}. $$
(3.19)

Problem 3.18

Find an explicit formula for the limit variables \(\bar{Y}_{1},\dots,\bar{Y}_{m}\) in Theorem 3.16.

4 Proofs for divisor methods

Proof of Theorems 3.3 and 3.7

For the β-linear divisor method, the seat distribution is given by (2.7) with D and D′=D/V such that \(\sum_{i=1}^{m}s_{i}=N\). We let D′=1/t; thus, using (2.7) and (2.6),

$$ s_i= [p_it ]_{\beta}= \lfloor p_it-\beta+1 \rfloor $$
(4.1)

(with the usual possible exception if p i t−β is an integer). Let here t grow from 0 to ∞; think of t as time. The seats are assigned one by one (unless there is a tie).

Fix one party j and let t jr be the time when party j receives its r:th seat. By (4.1), r=p j t jr −β+1, and thus

$$ t_{jr}=\frac{r+\beta-1}{p_j}. $$
(4.2)

At this time t jr , by (4.1) again, party i has s i seats with

$$ s_i= \lfloor p_it_{jr}-\beta+1 \rfloor = \biggl\lfloor \frac{p_i}{p_j}(r+\beta-1)-\beta+1 \biggr\rfloor. $$
(4.3)

Let, for i=1,…,m,

$$ W_i :=\{p_it_{jr}-\beta+1\} = \biggl\{\frac{p_i}{p_j}r+\biggl(\frac{p_i}{p_j}-1\biggr) (\beta -1)\biggr\} \in[0,1], $$
(4.4)

defining W i :=1 in the exceptional case when p i t jr −β∈ℤ and s i =p i t jr −β. (But W i :=0 when p i t jr −β∈ℤ and s i =p i t jr −β+1.) Then (4.3) can be written

$$ s_i =\frac{p_i}{p_j}r+\biggl(\frac{p_i}{p_j}-1 \biggr) (\beta-1)-W_i, \quad i=1,\dots,m. $$
(4.5)

Note that W j =0, since s j =r=p j t jr −β+1.

The total number of seats at the moment t jr is, by (4.5),

$$ N=\sum_{i=1}^ms_i =\frac{1}{p_j}r+\biggl(\frac{1}{p_j}-m\biggr) (\beta-1)-\sum _{i=1}^mW_i, $$
(4.6)

and thus the seat excess is, combining (4.5) and (4.6),

$$ \Delta_i = s_i-p_iN = p_i\sum_{k=1}^mW_k-W_i+(\beta-1) (mp_i-1). $$
(4.7)

So far everything is deterministic. Theorem 3.3 follows from (4.7) since each W k ∈[0,1] and W j =0.

Let us now assume that p 1,…,p m are linearly independent over ℚ; in particular, this implies that p i /p j is irrational when i≠j. Note that there is a tie between parties i and j at time t if and only if both p i t−β and p j t−β are integers. If parties i and j tie at two different times t 1 and t 2, we thus have p i (t 1−t 2)∈ℤ and p j (t 1−t 2)∈ℤ, and taking the quotient we find p i /p j ∈ℚ, a contradiction. Hence there is at most one tie for each pair of parties, and thus at most \(\binom{m}{2}\) values of t (and thus of N) for which there is any tie. Since we consider asymptotics, we may ignore the ties and assume that no tie occurs.

Let now N be random in {1,…,N 0} and let J=J(N) be the party getting the N:th seat. Then Δ i is given by (4.7), where W i =W i (N) for each N is given by (4.4) with j=J and r=s J (N). Fix again j and consider first only N such that J=j. The number of such NN 0 is s j (N 0), the number of seats party j receives when the house size is N 0; note that, e.g. by Theorem 3.3, s j (N 0)=p j N 0+O(1).

We recall a well-known result by Weyl (1916). (The result is usually stated with a 1=⋯=a k =0; this case implies immediately the more general version given here.) We say that an infinite sequence (v n ) n≥1∈[0,1)k is uniformly distributed if the empirical distributions \(n_{0}^{-1}\sum_{n=1}^{n_{0}}\delta_{v_{n}}\) converge to the uniform distribution as n 0→∞, where \(\delta_{v_{n}}\) denotes the Dirac measure. (Recall that this means that if A⊆[0,1)k with λ(∂A)=0, then #{nn 0:v n A}/n 0λ(A), where λ is the usual Lebesgue measure, see e.g. Billingsley 1968.)

Lemma 4.1

(Weyl)

Suppose that y 1,…,y k and 1 are linearly independent over ℚ, and let a 1,…,a k be any real numbers. Then the sequence of vectors ({ny 1+a 1},…,{ny k +a k }) n≥1∈[0,1)k is uniformly distributed in [0,1)k. □

(The standard proof is by showing that the Fourier transform (characteristic function) \(\frac{1}{n_{0}}\sum_{n=1}^{n_{0}} \exp(2\pi\mathrm{i}\sum_{j=1}^{k} \ell_{j}\{ny_{j}\} )\to0\), as n 0→∞, for any fixed integers ℓ 1,…,ℓ k , not all 0, see for example Grafakos 2004, Exercises 3.4.2–3.)

Since p 1,…,p m are linearly independent over ℚ, so are p 1/p j ,…,p m /p j , i.e. {p i /p j :i≠j}∪{1}. Taking y i :=p i /p j and a i :=(y i −1)(β−1), we thus see from Lemma 4.1 that (W i ) i≠j , given by (4.4) for r=1,…,s j (N 0), are uniformly distributed vectors in [0,1)^{m−1}. In probabilistic notation, regarding W i =W i (N) as random variables, conditioning on J=j and recalling W j =0 when J=j, as N 0→∞ and thus s j (N 0)→∞,

$$ \bigl((W_1,\dots,W_m)\bigm| J=j \bigr) \stackrel{\mathrm{d}}{\longrightarrow} \bigl(\mathbf{1}\{i\neq j\}U_i \bigr)_{i=1}^m. $$
(4.8)

Since ℙ(J=j)=s j (N 0)/N 0→p j as N 0→∞, the random variable J has asymptotically the distribution given in Theorem 3.7, and it follows from (4.8) that, with V i defined as in Theorem 3.7,

$$ (W_1,\dots,W_m)\stackrel{\mathrm{d}}{ \longrightarrow}(V_1,\dots,V_m). $$
(4.9)

Consequently, (4.7) yields \(\Delta_{i}\stackrel{\mathrm {d}}{\longrightarrow}X_{i}\), jointly for all i=1,…,m, with X i defined by (3.10).

To obtain the simpler form (3.7) for an individual Δ i , let

$$ \widetilde{X}_i:=X_i-\biggl(\beta-\frac{1}{2}\biggr) (mp_i-1)=p_i\sum_{j=1}^mV_j-V_i- \frac{1}{2} mp_i+\frac{1}{2}. $$
(4.10)

Then, if J=i, we have V i =0 and V j =U j for j≠i, and (4.10) yields

$$ \widetilde{X}_i = p_i\sum_{j\neq i}\widetilde{U}_j+\frac{1-p_i}{2}, $$

and if J≠i,

$$ \widetilde{X}_i = p_i\sum_{j\neq i,J}\widetilde{U}_j+(p_i-1)\widetilde{U}_i-\frac{p_i}{2}. $$

Hence, if we define

$$ Z:= \begin{cases} p_i\widetilde{U}_{*}+\frac{1}{2}(1-p_i), & J=i,\\ (p_i-1)\widetilde{U}_i-\frac{1}{2}p_i, & J\neq i, \end{cases} $$
(4.11)

where, in the case J=i, \(\widetilde{U}_{*}\) denotes one fixed variable among the \(\widetilde{U}_{j}\), j≠i, then, relabelling the m−2 variables remaining in the sums above as \(\widetilde{U}_{1},\dots,\widetilde{U}_{m-2}\),

$$ \widetilde{X}_i \stackrel{\mathrm{d}}{=} p_i\sum_{j=1}^{m-2}\widetilde{U}_j+Z, $$
with \(\widetilde{U}_{j}\) (1≤jm−2) and Z independent. By (4.11), conditioning on J, we have the distributions \((Z\mid J=i)\sim\mathrm{U}(\frac{1}{2}-p_{i},\frac{1}{2})\) and \((Z\mid J\neq i)\sim\mathrm{U}(-\frac{1}{2},\frac{1}{2}-p_{i})\); furthermore ℙ(J=i)=p i , and it follows that \(Z\sim\mathrm{U}(-\frac{1}{2},\frac{1}{2})\), so \(Z\stackrel{\mathrm{d}}{=} \widetilde{U}_{0}\) and thus

The definition (4.10) now shows that \(X_{i}\stackrel{\mathrm {d}}{=}\bar{X}_{i}\), with \(\bar{X}_{i}\) defined in (3.7). □

Proof of Theorem 3.4 and Corollary 3.8

Since Δ i is bounded, e.g. by Theorem 3.3 (or by the simpler Theorem A.17), the convergence in distribution \(\Delta_{i}\stackrel{\mathrm{d}}{\longrightarrow}X_{i}\) in Theorem 3.7 implies convergence of all moments and mixed moments, see Gut (2005, Theorem 5.5.9); hence it suffices to calculate the expectations, variances and covariances of X i .

We have \(\mathbb{E}U_{i}=1/2\) and thus by (3.9), since J and U i are independent,

$$ \mathbb{E}V_i=\mathbb{P}(J\neq i)\mathbb{E}U_i=(1-p_i) \frac{1}{2}. $$
(4.12)

Consequently, \(\mathbb{E}\sum_{i=1}^{m}V_{i}=\frac{1}{2}\sum _{i=1}^{m}(1-p_{i})=\frac{1}{2}(m-1)\) and, by (3.10),

$$ \mathbb{E}X_i = p_i\,\frac{m-1}{2}-\frac{1-p_i}{2}+(\beta-1) (mp_i-1) = \biggl(\beta-\frac{1}{2}\biggr) (mp_i-1). $$

This shows Theorem 3.4.

For the variance and covariances, we have for any i

$$ \mathbb{E}V_i^2=\mathbb{P}(J\neq i)\,\mathbb{E}U_i^2=\frac{1-p_i}{3} $$

and, when i≠j, by independence,

$$ \mathbb{E}(V_iV_j)=\mathbb{P}(J\neq i,\ J\neq j)\,\mathbb{E}U_i\,\mathbb{E}U_j=\frac{1-p_i-p_j}{4}. $$

Consequently, using also (4.12),

$$ \operatorname{Var}V_i=\frac{1-p_i}{3}-\frac{(1-p_i)^2}{4}=\frac{(1-p_i)(1+3p_i)}{12}, \qquad \operatorname{Cov}(V_i,V_j)=-\frac{p_ip_j}{4} \quad(i\neq j). $$

We obtain from this also

$$ \operatorname{Var}\sum_{j=1}^mV_j=\frac{m-1}{12}, \qquad \operatorname{Cov}\biggl(V_i,\,\sum_{j=1}^mV_j\biggr)=\frac{1-p_i}{12}. $$

Consequently, (3.10) yields

$$ \operatorname{Var}X_i=p_i^2\,\frac{m-1}{12}-2p_i\,\frac{1-p_i}{12}+\frac{(1-p_i)(1+3p_i)}{12}=\frac{1+(m-2)p_i^2}{12} $$

and, for i≠j,

$$ \operatorname{Cov}(X_i,X_j)=p_ip_j\,\frac{m-1}{12}-p_i\,\frac{1-p_j}{12}-p_j\,\frac{1-p_i}{12}-\frac{p_ip_j}{4}=\frac{(m-2)p_ip_j-p_i-p_j}{12}. $$

Corollary 3.8 follows. □

Remark 4.2

We can also see when equality holds in Theorem 3.3. By (4.7), Δ i is maximal when W i =0 and W k =1 for ki. Thus all W k ∈{0,1}, which by (4.4) means that all p k tβ are integers for t=t jr ; hence all parties tie in the sense that there are two possible values of s i in the β-rounding in (4.1), or equivalently (2.7), for every party. Consequently Δ i equals the upper bound in (3.3) if and only if all parties tie and, moreover, party i is lucky and is rounded up while all others are rounded down. (In the sequential formulation D3 or D4 in Appendix A, this means that all parties tie for the last seat, which is taken by party i.)

Similarly, Δ i is minimal when W i =1 and W k =0 for k≠i, and it follows that Δ i equals the lower bound in (3.3) if and only if all parties tie and, moreover, party i is unlucky and is rounded down while all other parties are rounded up. (In the sequential formulation this means that all parties tie for the last m−1 seats, which are given to all parties except i.)

5 Proofs for quota methods

Proof of Theorems 3.10 and 3.13

The proof for quota methods uses arguments similar to the proof in Janson and Linusson (2012), and the reader might wish to compare the versions. (Only the case γ=0 is treated in Janson and Linusson 2012, but that is only a minor simplification. For completeness we repeat some results from Janson and Linusson 2012.) For notational convenience, we show the results for party 1, i.e. we take i=1 in the proofs.

By definition, s j =[(N+γ)p j ] α for an α that makes \(\sum_{j=1}^{m}s_{j}=N\). Explicitly, by (2.5),

$$ s_j+\alpha-1\le(N+\gamma)p_j\le s_j+\alpha. $$
(5.1)

Thus, using party 1 as a benchmark,

$$ s_j-s_1-1\le(N+\gamma) (p_j-p_1) \le s_j-s_1+1. $$
(5.2)

Since Δ j −Δ1=s j s 1N(p j p 1), (5.2) yields

$$ \Delta_j-\Delta_1-1\le\gamma(p_j-p_1) \le\Delta_j-\Delta_1+1. $$
(5.3)

Thus, summing over all j≠1, recalling that \(\sum_{j=1}^{m}\Delta _{j}=0\) and \(\sum_{j=1}^{m} p_{j}=1\),

$$ -m\Delta_1-(m-1)\le\gamma(1-mp_1)\le-m \Delta_1+(m-1). $$
(5.4)

Hence,

$$ \gamma\biggl(p_1-\frac{1}{m}\biggr) - \frac{m-1}{m} \le \Delta_1 \le\gamma\biggl(p_1-\frac{1}{m}\biggr) + \frac{m-1}{m}, $$
(5.5)

which shows Theorem 3.10.

Suppose now that all differences p i −p j , i≠j, are irrational. A tie between parties i and j can occur only if (N+γ)p i −(N+γ)p j ∈ℤ. If this happens for two different house sizes N 1 and N 2, then both (N 1+γ)(p i −p j )∈ℤ and (N 2+γ)(p i −p j )∈ℤ, and thus by taking the difference (N 1−N 2)(p i −p j )∈ℤ, which is impossible if p i −p j ∉ℚ. Hence, our assumption implies that there is a tie between parties i and j for at most one value of N, and thus at most \(\binom{m}{2}\) values of N for which there is any tie. Since we consider asymptotics, we can ignore these N and assume that there are no ties. Hence we can choose α such that there are strict inequalities in (5.1), and thus there are strict inequalities in (5.2).

By (5.2) (with strict inequalities) ⌊(N+γ)(p j p 1)⌋∈{s j s 1−1,s j s 1}. Let

$$ I_j:= s_j-s_1- \bigl\lfloor(N+ \gamma) (p_j-p_1) \bigr\rfloor. $$
(5.6)

Then I j ∈{0,1} and I 1=0. We have

$$ \Delta_j-\Delta_1 = s_j-s_1-N(p_j-p_1) = I_j-\bigl\{(N+\gamma) (p_j-p_1)\bigr\}+\gamma(p_j-p_1). $$
(5.7)

Summing over j we obtain, again recalling \(\sum_{j=1}^{m}\Delta_{j}=0\),

$$ -m\Delta_1= \sum _{j=1}^mI_j-\sum _{j=1}^m\bigl\{(N+\gamma) (p_j-p_1) \bigr\}+\gamma(1-mp_1). $$
(5.8)

Let \(L:=\sum_{j=1}^{m}I_{j}\). By (5.8), we then have the formula

$$ \Delta_1= \gamma\biggl(p_1-\frac{1}{m}\biggr) + \frac{1}{m}\Biggl(\,\sum_{j=2}^m\bigl\{(N+\gamma) (p_j-p_1)\bigr\}-L\Biggr). $$
(5.9)

Note that L is an integer with 0≤L≤m−1 and, by (5.6),

$$ L=\sum_{j=1}^m(s_j-s_1)-\sum_{j=2}^m\bigl\lfloor(N+\gamma) (p_j-p_1)\bigr\rfloor = N-ms_1-\sum_{j=2}^m\bigl\lfloor(N+\gamma) (p_j-p_1)\bigr\rfloor. $$
Let \(\operatorname{Mod}_{m}(x):=m\{x/m\}\), the remainder when x is divided by m. We thus have

$$ L= \operatorname{Mod}_m\Biggl(N-\sum _{j=2}^m \bigl\lfloor(N+\gamma) (p_j-p_1) \bigr\rfloor\Biggr). $$
(5.10)

Let y j :=p j p 1 and a j :=γ(p j p 1); thus (N+γ)(p j p 1)=Ny j +a j . Let us now assume that p 1,…,p m are linearly independent over ℚ. It is easily seen that then y 2,…,y m and \(\sum_{1}^{m} p_{j}=1\) also are linearly independent over ℚ. (This can be seen as a change of basis, using a non-singular integer matrix, in a vector space of dimension m over ℚ.)

We need the following extension of Lemma 4.1, slightly generalising Janson and Linusson (2012, Lemma 4.3). The proof is given in Appendix C.

Lemma 5.1

Suppose that y 1,…,y k and 1 are linearly independent over ℚ, and let a 1,…,a k be any real numbers.

Let \(\ell_{n}= \operatorname{Mod}_{m} (n- \sum_{j=1}^{k} \lfloor ny_{j}+a_{j} \rfloor)\in \{0,\dots,m-1\}\). Then the sequence of vectors

$$ \bigl(\{ny_1+a_1\}, \dots, \{ny_k+a_k\}, \ell_n\bigr) \in[0,1)^k\times\{0,\dots,m-1\} $$
(5.11)

is uniformly distributed in [0,1)k×{0,…,m−1}.

Note that L in (5.10) equals \(\ell_{N}\) in Lemma 5.1, with k=m−1 and y j and a j as chosen above. Consequently, (5.9) and Lemma 5.1 show that

$$ \Delta_1\stackrel{\mathrm{d}}{\longrightarrow} \gamma \biggl(p_1-\frac{1}{m}\biggr) +\frac{1}{m}\Biggl(\,\sum _{j=2}^mU_j-\bar{L}\Biggr), $$
(5.12)

where U j ∼U(0,1) and \(\bar{L}\) is uniformly distributed on {0,…,m−1}, with all variables independent. Moreover, \(U_{2}\stackrel{\mathrm{d}}{=}1-U_{2}\) and \(\bar{L}+U_{2}\sim \mathrm{U}(0,m)\), and thus

$$ \frac{1}{m}(U_2-\bar{L}) \stackrel{\mathrm{d}}{=} \frac{1}{m}\bigl(1-U_2-\bar{L}\bigr) = \frac{1}{m}-\frac{1}{2}-\widetilde{U}_0, \qquad\mbox{where } \widetilde{U}_0:=\frac{\bar{L}+U_2}{m}-\frac{1}{2}\sim\mathrm{U}\biggl(-\frac{1}{2},\frac{1}{2}\biggr). $$

Hence, (5.12) can be written

$$ \Delta_1 \stackrel{\mathrm{d}}{\longrightarrow} \gamma\biggl(p_1-\frac{1}{m}\biggr) -\widetilde{U}_0+\frac{1}{m}\sum_{j=3}^m\widetilde{U}_j, $$

with \(\widetilde{U}_{j}:=U_{j}-\frac{1}{2}\) for j≥1, which by \(\widetilde{U}_{0}\stackrel{\mathrm{d}}{=}-\widetilde{U}_{0}\) proves Theorem 3.13. □

Proof of Theorem 3.11 and Corollary 3.14

Since Δ i is bounded, e.g. by Theorem 3.10, the convergence in distribution \(\Delta_{i}\stackrel{\mathrm{d}}{\longrightarrow}Y_{i}\) in Theorem 3.13 implies convergence of all moments and it suffices to calculate the expectation and variance of Y i . This is immediate since \(\mathbb{E}\widetilde{U}_{k}=0\) and \(\operatorname{Var}\widetilde {U}_{k}=1/12\); thus by (3.17),

Theorem 3.11 and Corollary 3.14 follow. □

Proof of Theorem 3.16

Let W i :={(N+γ)p i }. For any j, {p i :i≠j} and \(1=\sum_{i=1}^{m}p_{i}\) are linearly independent over ℚ, and thus the vectors (W i ) i≠j are uniformly distributed in [0,1)^{m−1} by Lemma 4.1. Moreover,

$$ \sum_{i=1}^m W_i \equiv \sum_{i=1}^m (N+\gamma)p_i = N+\gamma\equiv\gamma \pmod 1, $$

so W j ≡γ−∑ i≠j W i (mod 1) and

$$ W_j=\biggl\{\gamma-\sum_{i\neq j}W_i\biggr\}. $$

Consequently, \((W_{1},\dots,W_{m})\stackrel{\mathrm{d}}{\longrightarrow }(V_{1},\dots,V_{m})\) where each V i ∼U(0,1), any m−1 of them are independent, and any such set determines the last variable by V j ={γ−∑ i≠j V i }. Obviously, the distribution of (V 1,…,V m ) is symmetric.

We have

$$ \Delta_i-\gamma p_i = s_i-(N+ \gamma)p_i = s_i- \bigl\lfloor(N+\gamma)p_i \bigr\rfloor-W_i. $$
(5.13)

By the construction of the γ-quota method, the numbers s i −⌊(N+γ)p i ⌋ depend only on (W 1,…,W m ), say s i −⌊(N+γ)p i ⌋=f i (W 1,…,W m ) for some function f i , and it follows from (5.13) that, jointly for all i,

$$ \Delta_i-\gamma p_i \stackrel{\mathrm{d}}{ \longrightarrow}Z_i:=f_i(V_1, \dots,V_m)-V_i, $$
(5.14)

where the distribution of (Z 1,…,Z m ) is symmetric. (The m functions f i differ only by permutations of the variables.) Now let \(\bar{Y}_{i}:=Z_{i}+\gamma p_{i}\). Then \(\Delta_{i}\stackrel{\mathrm{d}}{\longrightarrow}\bar{Y}_{i}\), and thus \(\bar{Y}_{i}\stackrel{\mathrm{d}}{=}Y_{i}\) by comparison with Theorem 3.13. □

Proof of Corollary 3.17

By Theorem 3.16 and the fact that each Δ i is bounded, \(\operatorname{Cov}(\Delta_{i},\Delta_{j})\allowbreak \to\operatorname{Cov}(\bar{Y}_{i},\bar{Y}_{j})\). Moreover, by symmetry, \(\operatorname{Cov}(\bar{Y}_{i},\bar{Y}_{j})\) is the same for all pairs (i,j) with ij. We have \(\sum_{i=1}^{m}\bar{Y}_{i}=0\) as a consequence of (2.3), and thus

$$ 0=\operatorname{Var}\Biggl(\,\sum_{i=1}^m\bar{Y}_i\Biggr) = m\operatorname{Var}\bar{Y}_i+m(m-1)\operatorname{Cov}(\bar{Y}_i,\bar{Y}_j); $$

hence by symmetry \(\operatorname{Cov}(\bar{Y}_{i},\bar{Y}_{j})=-\operatorname{Var}(\bar{Y}_{i})/(m-1)=-\operatorname{Var} (Y_{i})/(m-1)\) for i≠j. The final equality follows from (3.18). □

Remark 5.2

We can also see when equality holds in Theorem 3.10. The argument yielding (5.5) shows that Δ1 is maximal exactly when there is equality in the left inequality in (5.1) for party 1, and in the right inequality for all other parties; Δ1 is minimal when the opposite holds. It follows that, exactly as for Theorem 3.3 in Remark 4.2, Δ i equals the upper [lower] bound in (3.15) if and only if all parties tie in (2.8) and, moreover, party i is the only lucky [unlucky] party.

6 Random party sizes

As said in the introduction, there is a long tradition of studying the bias of election methods theoretically with a fixed house size and random party sizes. More precisely, in this approach one (usually) assumes that the vector (p 1,…,p m ) is random and uniformly distributed over the simplex \(\mathfrak{S}_{m}:=\{(p_{1},\dots,p_{m}):p_{i}\ge0,\ \sum_{i=1}^{m}p_{i}=1\}\) for some given m. One then orders the party sizes p 1,…,p m as p [1]≥⋯≥p [m] and considers the seat excesses Δ[1],…,Δ[m] of the parties in the same order. (Thus, Δ[1] is the seat excess of the largest party, and so on.) Results for the mean \(\mathbb{E}\Delta_{[k]}\) of the seat excess for the k:th largest party (either exact formulas for a fixed N or asymptotic formulas as N→∞) have been obtained by Pólya (1918a, 1918b, 1919), Schuster et al. (2003), Heinrich et al. (2005) and Drton and Schwingenschlögl (2004, 2005); results on the variance are given by Schwingenschlögl and Drton (2006). (We follow these papers and order the parties in decreasing order, cf. Appendix D.)

Heinrich et al. (2004, 2005) and Schwingenschlögl (2008) have also, more generally, considered other distributions of p 1,…,p m (with an absolutely continuous distribution on \(\mathfrak{S}_{m}\)). Note also that Sainte-Laguë (1910b) considered a different distribution for two parties, see Example 3.5 (with p 2/p 1 uniformly distributed, which is not the same as p 2 uniform).

We cannot obtain these results for a given N directly from our results for random N, nor can we obtain our results for fixed p 1,…,p m from these results. However, by combining the methods, we can translate one type of results to the other and thus see that our main results transfer to the case of random p 1,…,p m . (Now with random p i in e.g. (3.7) and (3.17).)

Theorem 6.1

Let p 1,…,p m be random with an absolutely continuous distribution on the simplex \(\mathfrak{S}_{m}\), and let N→∞ (with N deterministic).

  1. (i)

    For the β-linear divisor method, the conclusions (3.7)(3.10) of Theorem 3.7 hold, with \(\widetilde{U}_{i}\) and U i independent of p 1,…,p m and with ℙ(J=jp 1,…,p m )=p j .

  2. (ii)

    For the γ-quota method, the conclusions of Theorems 3.13 and 3.16 hold; in particular (3.17) holds, with \(\widetilde{U}_{k}\sim\mathrm{U}(-\frac{1}{2},\frac{1}{2})\) independent of each other and of p 1,…,p m .

Proof

Consider first the β-linear divisor method. The proof of Heinrich et al. (2005, Theorem 3.1(i)) shows that as N→∞,

$$ (\Delta_1,\dots,\Delta_m)\stackrel{ \mathrm{d}}{\longrightarrow}(Z_1,\dots,Z_m) $$
(6.1)

for some random variables Z j ; with the notation in Heinrich et al. (2005),

$$ Z_j=- \biggl(U_j+q-\frac{1}{2}+\operatorname{sgn}(D_c)m_j(D_c) \biggr)+c\biggl(q-\frac{1}{2}\biggr)W_j. $$
(6.2)

Here W j is our p j , c is our m, q is our β, U j is our \(\widetilde{U}_{j}\) and \(\operatorname{sgn}(D_{c})m_{j}(D_{c})\) is an explicit but rather complicated function of U 1,…,U m and W 1,…,W m ; however, the point is that for us it suffices to know the existence of some variables Z j such that (6.1) holds; we can ignore the construction. The proof of (6.1) is based on the fact that for any (deterministic) sequence ν n →∞, the vectors of fractional parts ({ν n p 1},…,{ν n p m−1}) are uniformly distributed in [0,1)m−1, see Lemma C.1, which is the analogue of Lemma 4.1 in the present context.

Consider now N uniformly random on {1,…,N 0} and let N 0→∞, i.e., suppose \({{N\stackrel{\mathrm{p}^{*}}{\longrightarrow}\infty }}\) in the notation used in the previous sections. Then \(N\stackrel{\mathrm{p}}{\longrightarrow}\infty\), so by conditioning on N, it follows that (6.1) holds for such random N as well. On the other hand, since the distribution of p 1,…,p m is absolutely continuous, almost surely p 1,…,p m are linearly independent over ℚ. Thus, by conditioning on p 1,…,p m we may apply Theorem 3.7, which shows that

$$ (\Delta_1,\dots,\Delta_m)\stackrel{ \mathrm{d}}{\longrightarrow}(X_1,\dots,X_m), $$
(6.3)

where X 1,…,X m are given by (3.9)–(3.10) (with our random p 1,…,p m ). In the case of random p 1,…,p m and random \({{N\stackrel {\mathrm{p}^{*}}{\longrightarrow}\infty }}\), we thus have both (6.1) and (6.3); consequently,

$$ (Z_1,\dots,Z_m)\stackrel{\mathrm{d}}{=}(X_1,\dots,X_m). $$
(6.4)

Consider again deterministic N→∞. We know that (6.1) holds, so (6.4) implies that (6.3) holds in this case too. Finally, \((\bar{X}_{i}\mid p_{i})\stackrel{\mathrm{d}}{=}(X_{i}\mid p_{i})\) by Theorem 3.7, and thus \(\bar{X}_{i}\stackrel{\mathrm{d}}{=} X_{i}\), so (3.7) too holds in the present setting.

For the γ-quota method, the proof of Theorem 3.16 applies to the present situation as well, provided we use Lemma C.1 instead of Lemma 4.1. In particular, (5.14) holds, with V 1,…,V m ∼U(0,1) independent of p 1,…,p m . Further, \((\bar{Y}_{i}\mid p_{1},\dots,p_{m})\stackrel{\mathrm{d}}{=}(Y_{i}\mid p_{1},\dots,p_{m})\), and thus \(\bar{Y}_{i}\stackrel{\mathrm{d}}{=}Y_{i}\) in the case of random p 1,…,p m too. The results follow. □

Remark 6.2

By considering p 1,…,p m uniformly distributed on a small neighbourhood \(\max_{i}|p_{i}-p^{0}_{i}|<\varepsilon\) of some \((p_{1}^{0},\dots,p_{m}^{0})\in \mathfrak {S}_{m}\), and then letting ε→0, we recover the limit distributions in Sect. 3, with a somewhat different interpretation as a double limit. (In this version p 1,…,p m can be any numbers, including rationals.)

In particular, we can apply Theorem 6.1 to the case of uniformly distributed p 1,…,p m taken in decreasing order as above, since then (p [1],…,p [m]) has a uniform distribution on the subset \(\mathfrak{S}_{m,\ge}:=\{(p_{1},\dots,p_{m})\in\mathfrak{S}_{m}:p_{1}\ge\cdots\ge p_{m}\}\) of \(\mathfrak{S}_{m}\). This gives immediately the following theorem. (We leave the case of joint distributions to the reader.)

Theorem 6.3

Let p 1,…,p m be random and uniformly distributed on \(\mathfrak{S}_{m}\), and let N→∞.

  1. (i)

    For the β-linear divisor method, for each jm,

    $$ \Delta_{[j]} \stackrel{\mathrm{d}}{\longrightarrow} \widehat{X}_j := \biggl(\beta-\frac{1}{2}\biggr) (mp_{[j]}-1) + \widetilde{U}_0+p_{[j]}\sum_{k=1}^{m-2} \widetilde{U}_k. $$
    (6.5)
  2. (ii)

    For the γ-quota method, for each jm,

    $$ \Delta_{[j]}\stackrel{\mathrm{d}}{\longrightarrow} \widehat{Y}_j:= \gamma\biggl(p_{[j]}-\frac{1}{m}\biggr) + \widetilde{U}_0+\frac{1}{m}\sum_{k=1}^{m-2} \widetilde{U}_k. $$
    (6.6)

Here \(\widetilde{U}_{k}\sim\mathrm{U}(-\frac{1}{2},\frac{1}{2})\) are independent of each other and of p 1,…,p m .

As before, this theorem implies moment convergence because Δ[j] is bounded. In particular, for the bias in this approach we obtain the following result. This was (except for the case γ≠0 in (ii)) conjectured by Schuster et al. (2003) and proven by Drton and Schwingenschlögl (2005) and Heinrich et al. (2005).

Theorem 6.4

Let p 1,…,p m be random and uniformly distributed on \(\mathfrak{S}_{m}\), and let N→∞.

  1. (i)

    For the β-linear divisor method, for each jm,

    $$ \mathbb{E}\Delta_{[j]} \to\mathbb{E}\widehat{X}_j = \biggl(\beta-\frac{1}{2}\biggr) (m\mathbb{E}p_{[j]}-1) =\biggl(\beta- \frac{1}{2}\biggr) \Biggl(\,\sum_{i=j}^m\frac{1}{i}-1 \Biggr). $$
    (6.7)
  2. (ii)

    For the γ-quota method, for each jm,

    $$ \mathbb{E}\Delta_{[j]}\to\mathbb{E}\widehat{Y}_j= \gamma\biggl(\mathbb{E}p_{[j]}-\frac{1}{m}\biggr) = \frac{\gamma}{m} \Biggl(\,\sum_{i=j}^m \frac{1}{i}-1\Biggr). $$
    (6.8)

Proof

The convergence to \(\mathbb{E}\widehat{X}_{j}\) or \(\mathbb{E}\widehat{Y}_{j}\) follows, as just said, from Theorem 6.3. The first equalities in (6.7) and (6.8) follow immediately from (6.5) and (6.6) since \(\mathbb{E} \widetilde{U}_{k}=0\), cf. (3.5) and (3.16). It remains only to compute \(\mathbb{E}p_{[j]}\). This is a standard formula, and for completeness we give a proof in Appendix D, see (D.6). □
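The moment \(\mathbb{E}p_{[j]}\) can also be checked numerically by simulating uniform points on the simplex via spacings; a small sketch (the choice m = 4 is arbitrary):

```python
import random

# Empirical check of E p_[j] = (1/m) * sum_{i=j}^m 1/i used in (6.7)-(6.8)
m, reps = 4, 200_000
tot = [0.0] * m
for _ in range(reps):
    cuts = sorted(random.random() for _ in range(m - 1))
    spac = [b - a for a, b in zip([0.0] + cuts, cuts + [1.0])]  # simplex point
    for j, x in enumerate(sorted(spac, reverse=True)):
        tot[j] += x / reps              # empirical E p_[j]
print([round(t, 4) for t in tot])
print([sum(1 / i for i in range(j, m + 1)) / m for j in range(1, m + 1)])
```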

We similarly obtain results for the variance. This was found for m=2 and 3 (except the case γ≠0) by Schwingenschlögl and Drton (2006).

Theorem 6.5

Let p 1,…,p m be random and uniformly distributed on \(\mathfrak{S}_{m}\), and let N→∞.

  1. (i)

    For the β-linear divisor method, for each jm,

    $$ \operatorname{Var}\Delta_{[j]} \to \operatorname{Var}\widehat{X}_j = \frac{1+(m-2)\mathbb{E}p_{[j]}^2}{12} + \biggl(\beta-\frac{1}{2}\biggr)^2m^2\operatorname{Var}(p_{[j]}). $$
    (6.9)
  2. (ii)

    For the γ-quota method, for each jm,

    $$ \operatorname{Var}\Delta_{[j]} \to \operatorname{Var}\widehat{Y}_j = \frac{(m+2)(m-1)}{12m^2} + \gamma^2\operatorname{Var}(p_{[j]}). $$
    (6.10)

Proof

Again, by Theorem 6.3, we only have to calculate \(\operatorname {Var}\widehat{X}_{j}\) and \(\operatorname{Var}\widehat{Y}_{j}\); the calculations are similar and we give the details only for \(\widehat{X}_{j}\). We use the standard decomposition

$$ \operatorname{Var}\widehat{X}_j = \mathbb{E} \bigl( \operatorname{Var}(\widehat{X}_j\mid{p_1,\dots ,p_m}) \bigr) + \operatorname{Var} \bigl(\mathbb{E}(\widehat{X}_j\mid{p_1,\dots,p_m}) \bigr). $$
(6.11)

The conditional variance and mean are obtained immediately from (6.5), which yields, cf. (3.5) and (3.11),

$$ \mathbb{E}(\widehat{X}_j\mid p_1,\dots,p_m) = \biggl(\beta-\frac{1}{2}\biggr) (mp_{[j]}-1), $$
(6.12)

$$ \operatorname{Var}(\widehat{X}_j\mid p_1,\dots,p_m) = \frac{1+(m-2)p_{[j]}^2}{12}. $$
(6.13)

The first equality in (6.9) follows from (6.11)–(6.13). It remains only to find \(\mathbb{E}p_{[j]}^{2}\) and \(\operatorname {Var}(p_{[j]})\); these are well-known and we give a computation in Appendix D for completeness, see (D.6)–(D.8). □

Example 6.6

For m=3 and Jefferson/D’Hondt’s method we obtain from (6.7) with β=1, \(\mathbb{E}(\Delta_{[1]},\Delta_{[2]},\Delta_{[3]})\to(\frac{5}{12},-\frac{1}{12},-\frac{4}{12})\), as found already by Pólya (1919).

For m=3 and Droop’s method we obtain from (6.8) with γ=1, \(\mathbb{E}(\Delta_{[1]},\Delta_{[2]},\Delta_{[3]})\to(\frac{5}{18},-\frac{1}{18},-\frac{4}{18})\).

Example 6.7

For m=3, we have for the β-linear divisor method

$$ \operatorname{Var}\Delta_{[1]}\to\frac{301}{2592}+\frac{13}{72}\biggl(\beta-\frac{1}{2}\biggr)^2, \qquad \operatorname{Var}\Delta_{[2]}\to\frac{235}{2592}+\frac{7}{72}\biggl(\beta-\frac{1}{2}\biggr)^2, \qquad \operatorname{Var}\Delta_{[3]}\to\frac{55}{648}+\frac{1}{18}\biggl(\beta-\frac{1}{2}\biggr)^2, $$

and for the γ-quota method

$$ \operatorname{Var}\Delta_{[1]}\to\frac{5}{54}+\frac{13}{648}\gamma^2, \qquad \operatorname{Var}\Delta_{[2]}\to\frac{5}{54}+\frac{7}{648}\gamma^2, \qquad \operatorname{Var}\Delta_{[3]}\to\frac{5}{54}+\frac{1}{162}\gamma^2. $$
These were found by Schwingenschlögl and Drton (2006) (where, however, the case γ≠0 is not treated).

Asymptotic covariances can be computed in the same way, using (3.12), (3.19) and (D.10).

In the same way as in Theorems 6.4–6.5, we can obtain results for the mean and variance of Δ[j] for random p 1,…,p m with other distributions than the uniform one. In particular, this gives the formulas by Heinrich et al. (2005) and Schwingenschlögl (2008) for the asymptotic bias in terms of \(\mathbb {E}p_{[j]}=\mathbb{E} (p_{j}\mid p_{1}\ge\cdots\ge p_{m})\) (assuming that (p 1,…,p m ) has a symmetric distribution).

For example, Heinrich et al. (2005) consider, as a special case, the case of a threshold t (with 0<t<1/m) and uniform distribution on p 1≥⋯≥p m t. In this case, one has (as in Heinrich et al. 2005) \((p_{1}-1/m,\dots,p_{m}-1/m)\stackrel{\mathrm {d}}{=}(1-mt)(p^{*}_{1}-1/m,\dots,p^{*}_{m}-1/m)\), where \((p^{*}_{1},\dots,p^{*}_{m})\) has a uniform distribution on \(\mathfrak {S}_{m,\ge}\), and the mean and variance are easily computed.

7 The probability of violating quota

We say that a seat assignment s i satisfies lower quota if s i ≥⌊q i ⌋ and satisfies upper quota if s i ≤⌈q i ⌉; it satisfies quota if both hold. In terms of the seat excess Δ i =s i q i , the assignment satisfies lower [upper] quota if and only if Δ i >−1 [Δ i <1], and it satisfies quota if and only if |Δ i |<1.

As is well-known, Hamilton/Hare’s method always satisfies quota, while Jefferson’s and Droop’s methods satisfy lower quota and Adams’s method satisfies upper quota, see e.g. Balinski and Young (2001) or Theorems 3.3 and 3.10 above. It is also well-known that Webster/Sainte-Laguë does not always satisfy quota, but that violations are unusual in practice.

Theorems 3.7 and 3.13 enable us to calculate the (asymptotic) probabilities that quota is violated for various methods. We give two examples.

Example 7.1

Consider Jefferson/D’Hondt’s method. By Theorem 3.3 with β=1, p i −1≤Δ i ≤(m−1)p i ; hence lower quota is always satisfied (as is well known, see Balinski and Young 2001), but upper quota can be violated for a party with p i ≥1/(m−1). Consider the case m=3, and suppose that p 1≥1/2. (Then p 2,p 3<1/2, so quota is always satisfied for the other two parties; thus at most one party can violate quota.) By Theorem 3.7 with β=1, letting \(\widetilde{U}_{i}=\frac{1}{2}-U_{i}\),

$$ \Delta_1\stackrel{\mathrm{d}}{\longrightarrow}\bar{X}_1= \frac{1}{2}(3p_1-1)+\widetilde{U}_0+p_1 \widetilde{U}_1 =2p_1-U_0-p_1U_1; $$
(7.1)

hence the asymptotic probability of violating (upper) quota is

$$ \mathbb{P}(\bar{X}_1\ge1)=\mathbb{P} (U_0+p_1U_1<2p_1-1 )=\frac{(2p_1-1)^2}{2p_1}. $$
(7.2)

(The set of allowed (U 0,U 1) is a right-angled triangle with sides 2p 1−1 and (2p 1−1)/p 1.)

If we let (p 1,p 2,p 3) be random and uniformly distributed on \(\mathfrak{S}_{3}\), as in Sect. 6, then p 1 has the density 2(1−p 1) and thus the asymptotic probability that party 1 violates (upper) quota is

$$ \int_{1/2}^1 \frac{(2p-1)^2}{2p} \cdot2(1-p) \, \mathrm{d}p =\ln2 -2/3 \approx0.026. $$
(7.3)

Consequently, the asymptotic probability that some party violates (upper) quota is

$$ 3\ln2 -2 \approx0.079, $$
(7.4)

as found by Niemeyer and Wolf (1984).
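Both (7.2) and (7.4) are easy to confirm by Monte Carlo; a sketch (the fixed value p 1 = 0.55 in the first check is an arbitrary choice):

```python
import math
import random

reps = 1_000_000
p1 = 0.55                               # a fixed large party, p_1 > 1/2
hits = sum(random.random() + p1 * random.random() < 2 * p1 - 1
           for _ in range(reps))
print(hits / reps, (2 * p1 - 1) ** 2 / (2 * p1))        # check (7.2)

viol = 0                                # (7.4): sizes uniform on the simplex
for _ in range(reps):
    a, b = sorted((random.random(), random.random()))
    p = max(a, b - a, 1 - b)            # only the largest party can violate
    if p > 0.5 and random.random() + p * random.random() < 2 * p - 1:
        viol += 1
print(viol / reps, 3 * math.log(2) - 2)                 # check (7.4)
```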

Example 7.2

Consider Jefferson/D’Hondt’s method, and a party i with three times the average size: p i =3/m. Then the bias \(\mathbb{E}\bar{X}_{i}=1\), by Theorem 3.4 with β=1. It follows by (3.7) and symmetry that \(\mathbb {P}(\bar{X}_{i}>1)=1/2\), so the (asymptotic) probability that the party violates quota is 1/2. For a larger party, the probability is even greater.

Example 7.3

After the general election in 2010, the Swedish parliament contains 8 parties: two large ones with 30 % of the votes each and six small ones with 5–8 % each. The seats are in principle distributed by Sainte-Laguë’s method. (We ignore here complications due to the division into 29 constituencies and the system with adjustment seats, which in 2010 did not give complete adjustment because the number of adjustment seats was insufficient.)

We note from Theorem 3.3 (with \(\beta=\frac{1}{2}\)) that the small parties always satisfy quota. In fact, (3.4) shows that for Webster/Sainte-Laguë, only parties with p i ≥1/(m−2) can violate quota.

For the large parties we have (approximately) p i =0.3, and thus by (3.7) (still with \(\beta=\frac{1}{2}\)) \(\bar{X}_{i}=\widetilde{U}_{0}+0.3\sum_{k=1}^{6}\widetilde{U}_{k}\). An integration (using Maple) yields \(\mathbb{P}(\bar{X}_{i}\ge1)=\mathbb{P}(\bar{X}_{i}\le-1)\approx0.00045\). Hence, for each of the two large parties the (asymptotic) probability of violating quota is 0.0009.
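The integration can also be replaced by a quick Monte Carlo estimate:

```python
import random

# P(X_i >= 1) for X_i = U~_0 + 0.3 * (sum of six U~_k), as in Example 7.3
reps = 2_000_000
hits = 0
for _ in range(reps):
    x = (random.random() - 0.5
         + 0.3 * sum(random.random() - 0.5 for _ in range(6)))
    hits += x >= 1
print(hits / reps)                      # approx 0.00045
```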

8 Apparentements

Suppose that two or more parties decide to form a coalition in the election, so that their votes are counted together. (In some election systems, parties can register such a coalition, called an apparentement, and continue to have separate names and lists. Otherwise, the parties can always appear on election day as one party with a common umbrella name. In any case we assume that this does not attract or repel any voters, so the coalition gets exactly as many votes as the parties would have had separately, and that all other parties remain unaffected with the same number of votes as before.)

It is well-known that for a β-linear divisor method with β≥1 (e.g. Jefferson/D’Hondt’s method) parties can never lose by forming an apparentement; they will get at least as many seats together as if they appear separately. When β≤0 (e.g. Adams’s method), the opposite is true, and parties will never gain by forming an apparentement (conversely, they may gain by splintering), and for 0<β<1 (e.g. Webster/Sainte-Laguë’s method), they may both gain and lose. For D’Hondt’s method, this effect is politically important in reality, especially in small constituencies, and apparentements are or have been a regular feature in many countries, see e.g. Carstairs (1980).

Leutgäb and Pukelsheim (2009) give a detailed study of the resulting gains in a set of real elections, including examples, statistics and theoretical values assuming that the party sizes are random (as in Sect. 6). We can now give similar theoretical results for given party sizes. For simplicity, we consider only the case of a single apparentement of two parties; larger apparentements can be treated similarly as well as several apparentements (their effects are, asymptotically at least, additive).

Let the parties be i and j, and consider the expectation of the gain s ij −(s i +s j ), where s ij is the number of seats the parties get together as an apparentement.

Theorem 8.1

Suppose that p 1,…,p m are linearly independent over ℚ. If two parties i and j form an apparentement in the election, the mean of their seat gain s ij −s i −s j satisfies, as \({{N\stackrel{\mathrm{p}^{*}}{\longrightarrow} \infty}}\),

$$ \mathbb{E}(s_{ij}-s_i-s_j) \to\biggl( \beta-\frac{1}{2}\biggr) (1-p_i-p_j ) $$
(8.1)

for the β-linear divisor method, and

$$ \mathbb{E}(s_{ij}-s_i-s_j) \to\gamma\biggl(\frac{2}{m}-\frac{1}{m-1}\biggr) = \gamma \frac{m-2}{m(m-1)} $$
(8.2)

for the γ-quota method.

Proof

If Δ ij is the seat excess for the apparentement, then

$$s_{ij}-s_i-s_j =N(p_i+p_j)+ \Delta_{ij}-(Np_i+\Delta_i)-(Np_j+ \Delta_j) =\Delta_{ij}-\Delta_i- \Delta_j, $$

so \(\mathbb{E}(s_{ij}-s_{i}-s_{j})=\mathbb{E}\Delta_{ij}-\mathbb{E}\Delta _{i}-\mathbb{E}\Delta_{j}\), which by Theorems 3.4 and 3.11 converges to, recalling that Δ ij is the seat excess in a contest with m−1 parties,

$$ \biggl(\beta-\frac{1}{2}\biggr) \bigl((m-1) (p_i+p_j)-1 \bigr) -\biggl(\beta-\frac{1}{2}\biggr) (mp_i-1) -\biggl(\beta-\frac{1}{2}\biggr) (mp_j-1) =\biggl(\beta-\frac{1}{2}\biggr) (1-p_i-p_j ) $$

and

$$ \gamma\biggl(p_i+p_j-\frac{1}{m-1}\biggr) -\gamma\biggl(p_i-\frac{1}{m}\biggr) -\gamma\biggl(p_j-\frac{1}{m}\biggr) =\gamma\biggl(\frac{2}{m}-\frac{1}{m-1}\biggr) $$

for the β-linear divisor method and γ-quota method, respectively. □

One sees in the same way that these expected gains (which are negative if β<1/2 or γ<0) are balanced by a loss for each other party k of \((\beta-\frac{1}{2})p_{k}\) for the β-linear divisor method and γ/(m(m−1)) for the γ-quota method.

We have seen that apparentements are favoured when β>1/2 or γ>0, but the effect is rather small, especially for quota methods. (Nevertheless, as said above, the effect is large enough to be politically important in reality for D’Hondt’s method.) Note that for the β-linear divisor method, the gain in (8.1) does not depend on the number (or sizes) of the other parties, while for the γ-quota method, the gain in (8.2) is largest for m=3 or 4 and decreases as O(1/m) for large m.

Example 8.2

For Jefferson/D’Hondt’s method, two small parties that form an apparentement gain on average almost 0.5 seats together, and two large parties gain less.
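This can be illustrated by simulation. The sketch below (our code; the party sizes and the range of N are arbitrary choices, and the rational sizes are only an approximation to the linear-independence assumption, cf. Sect. 11) implements D’Hondt as a highest-averages method and estimates the mean gain of an apparentement of the two smallest parties; the result is close to \((1-\frac{1}{2})(1-p_{i}-p_{j})=0.42\) from (8.1).

```python
import random

def dhondt(shares, seats):
    """Highest-averages allocation with divisors 1, 2, 3, ... (D'Hondt)."""
    s = [0] * len(shares)
    for _ in range(seats):
        # the next seat goes to the party maximising shares/(seats so far + 1)
        i = max(range(len(shares)), key=lambda k: shares[k] / (s[k] + 1))
        s[i] += 1
    return s

p = [0.07, 0.09, 0.24, 0.27, 0.33]   # parties 0 and 1 form the apparentement
random.seed(1)
gains = []
for _ in range(2000):
    N = random.randint(1000, 2000)   # crude stand-in for "a random N"
    sep = dhondt(p, N)
    joint = dhondt([p[0] + p[1]] + p[2:], N)
    gains.append(joint[0] - sep[0] - sep[1])
print(sum(gains) / len(gains))       # ~0.42 = (1/2)*(1 - 0.07 - 0.09), cf. (8.1)
```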

Example 8.3

For Droop’s method, any two parties that form an apparentement gain on average at most \(\frac{1}{6}\approx0.167\) seats together when there are 3 or 4 parties, 0.15 seats when there are 5 parties, and less if there are more.

Problem 8.4

What are, for the different methods, the asymptotic distributions of the seat gain s ij −s i −s j for an apparentement?

We have so far studied the gain of the two parties combined, but the parties are probably more interested in how the gain is split between them. Typically, the seats given to an apparentement are distributed, in a sub-apportionment, between the participating parties by the same election method as for the main distribution (the super-apportionment). It seems reasonable that the gain then, on the average, is split between the parties proportionally to their sizes, cf. Theorems A.11 and B.3. However, we shall see that this holds only for the divisor methods.

Consider first the β-linear divisor method. Theorem 3.4 shows that, asymptotically as \({{N\stackrel {\mathrm{p}^{*}}{\longrightarrow}\infty}}\), the number of seats the apparentement gets is on the average

$$ N(p_i+p_j)+\biggl(\beta-\frac{1}{2}\biggr) \bigl((m-1) (p_i+p_j)-1 \bigr). $$
(8.3)

If we use Theorem 3.4 again for the distribution of these seats between i and j, we obtain that i gets on the average

$$ Np_i+\biggl(\beta-\frac{1}{2}\biggr) \biggl((m-1)p_i+\frac{p_i}{p_i+p_j}-1\biggr) $$
(8.4)

seats, which, compared to Theorem 3.4 for an election without the apparentement, means an average gain for party i of

$$ \biggl(\beta-\frac{1}{2}\biggr) \biggl(\frac{p_i}{p_i+p_j}-p_i\biggr) = \frac{p_i}{p_i+p_j}\biggl(\beta-\frac{1}{2}\biggr) (1-p_i-p_j), $$
(8.5)

which indeed is the proportional share p i /(p i +p j ) of the joint gain in (8.1).

There is, however, a gap in this argument. In our model, the seat assignment is a deterministic function of N both in the super-apportionment and in the sub-apportionment, and the calculation above using expectations for random N assumes that there is no hidden correlation, i.e., that the numbers of seats given to the apparentement do not tend to be seat numbers that favour one of the two parties in the sub-apportionment. It seems very unlikely that there will be such a correlation under our assumption that p 1,…,p m are linearly independent over ℚ, but we have so far no rigorous proof, so (8.5) should only be regarded as a conjecture for the asymptotic average gain for party i.

Problem 8.5

Verify rigorously that the expected gain for party i converges to the value in (8.5) as \({{N\stackrel{\mathrm{p}^{*}}{\longrightarrow }\infty}}\).

Nevertheless, if we tentatively use (8.5), and regard it as a valid approximation for real elections with finite N (and arbitrary p 1,…,p m ), we can draw some practical conclusions for D’Hondt’s method (β=1). If the parties i and j are both small, then (8.5) shows that the gain for party i is almost \(\frac{1}{2}p_{i}/(p_{i}+p_{j})\). Thus most of the gain of the apparentement goes to the larger party, and a party has very little to gain by an apparentement with a party that is substantially larger. In fact, if party i can choose between several parties to form an apparentement, we see, perhaps surprisingly, that it gets the largest gain by choosing the smallest partner. (This is based on asymptotics as N→∞. For a real election with finite N there is a trade-off since a really small partner will not help significantly; the advantage can hardly be larger than Np j , so presumably only parties with q j =Np j being at least 1/2 or so will be useful partners. It would be interesting to study such effects for finite N, but that is beyond the scope of the present paper.)
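To illustrate (8.5) with concrete numbers (ours, for illustration only): take β=1 and p i =0.05. A partner of size p j =0.30 yields a gain of \(\frac{0.05}{0.35}\cdot\frac{1}{2}\cdot0.65\approx0.046\) seats for party i, whereas a much smaller partner of size p j =0.06 yields \(\frac{0.05}{0.11}\cdot\frac{1}{2}\cdot0.89\approx0.202\) seats, confirming that the smallest partner is the most profitable one.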

Consider now instead the γ-quota method. One important complication is that quota methods are not uniform, in the sense of Appendix A.3. Thus, even if the apparentement gets the same number of seats as the two parties would have got together if they had appeared separately in the election, the sub-apportionment may give a different distribution of these seats between the parties.

We calculate as above, with the same reservation that a rigorous verification is still lacking. Theorem 3.11 shows that, asymptotically as \({{N\stackrel {\mathrm{p}^{*}}{\longrightarrow}\infty}}\), the number of seats the apparentement gets is on the average

$$ N(p_i+p_j)+\gamma\biggl((p_i+p_j)- \frac{1}{m-1}\biggr). $$
(8.6)

If we use Theorem 3.11 again for the distribution of these seats between i and j, we obtain that i gets on the average

$$ Np_i+\gamma\biggl(p_i+\frac{m-2}{m-1}\,\frac{p_i}{p_i+p_j}-\frac{1}{2}\biggr) $$
(8.7)

seats, which, compared to Theorem 3.11 for an election without the apparentement, means an average gain for party i of

$$ \gamma\biggl(\frac{m-2}{m-1}\frac{p_i}{p_i+p_j}-\frac {m-2}{2m} \biggr). $$
(8.8)

This is quite different from the proportional share p i /(p i +p j ) of the joint gain in (8.1). In fact, we see that for γ>0 (e.g. Droop’s method), a party gains only if p i /(p i +p j )>(m−1)/(2m). In other words, a party will lose if it forms an apparentement with a party that is only a little larger. Typically, the redistribution within the apparentement by the sub-apportionment is more important than the collective gain.
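For a concrete illustration of (8.8) (the numbers are ours): take Droop’s method (γ=1) with m=20, p i =0.04 and p j =0.05. Then \(p_{i}/(p_{i}+p_{j})=4/9\approx0.444<(m-1)/(2m)=0.475\), so party i loses about \(\frac{18}{19}\cdot\frac{4}{9}-\frac{18}{40}\approx-0.029\) seats, while party j gains about 0.076; the sum 0.047 is the joint gain \(\gamma(m-2)/(m(m-1))=18/380\) from (8.2).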

9 The Sainte-Laguë divergence

Sainte-Laguë (1910a, 1910b) based his proposal of his method on the fact that it is the least squares method minimising the sum

$$ S:=\sum_{i=1}^mv_i\biggl( \frac{s_i}{v_i}-\frac{N}{V}\biggr)^2 = \sum _{i=1}^m\frac{(s_i-Np_i)^2}{v_i} = \sum _{i=1}^m\frac{\Delta_i^2}{v_i}; $$
(9.1)

multiplying this by V we obtain the equivalent quantity

$$ \widetilde{S}:=VS=\sum_{i=1}^m \frac{\Delta_i^2}{p_i}, $$
(9.2)

which is called the Sainte-Laguë divergence by Heinrich et al. (2004, 2005), where the asymptotic distribution is studied under the assumption of random party sizes (see Sect. 6). We obtain corresponding results in our setting as corollaries of the results in Sect. 3. (Unfortunately, we do not get as explicit results for quota methods as for divisor methods because Theorem 3.16 is less explicit.) The most interesting case is the \(\frac{1}{2}\)-linear divisor method (Sainte-Laguë’s method) (Heinrich et al. 2004), since then S and \(\widetilde{S}\) are the minima over all allocations, but it is also interesting to see how much larger \(\widetilde{S}\) is for other methods (Heinrich et al. 2005).

Theorem 9.1

Suppose that p 1,…,p m are linearly independent over ℚ, and let \({{N\stackrel{\mathrm{p}^{*}}{\longrightarrow }\infty}}\).

(i)

    For the β-linear divisor method,

    $$ \widetilde{S}\stackrel{\mathrm{d}}{\longrightarrow} S_{\beta,m}:= \sum_{i=1}^m \frac{(V_i+\beta-1)^2}{p_i}-\Biggl(\sum_{i=1}^m(V_i+ \beta-1)\Biggr)^2, $$
    (9.3)

    where V 1,…,V m are as in Theorem 3.7.

(ii)

    For the γ-quota method,

    $$ \widetilde{S}\stackrel{\mathrm{d}}{\longrightarrow} \bar{S}_{\gamma,m}:= \sum_{i=1}^m \frac{\bar{Y}_i^2}{p_i}, $$
    (9.4)

    where \(\bar{Y}_{1},\dots,\bar{Y}_{m}\) are as in Theorem 3.16.

Proof

For the β-linear divisor method, we use Theorem 3.7 and obtain by the continuous mapping theorem (Billingsley 1968, Theorem 5.1) and (9.2) immediately \(\widetilde{S}\stackrel{\mathrm{d}}{\longrightarrow}\sum_{i=1}^{m}X_{i}^{2}/p_{i}\). We write (3.10) as \(X_{i}=p_{i}A-A_{i}\), where A i :=V i +β−1 and \(A:=\sum_{i=1}^{m}(V_{i}+\beta-1)=\sum_{i=1}^{m}A_{i}\). Thus

$$ \sum_{i=1}^m\frac{X_i^2}{p_i} =\sum_{i=1}^m\frac{(p_iA-A_i)^2}{p_i} =A^2\sum_{i=1}^mp_i-2A\sum_{i=1}^mA_i+\sum_{i=1}^m\frac{A_i^2}{p_i} =\sum_{i=1}^m\frac{A_i^2}{p_i}-A^2, $$

which is (9.3).

The result for the γ-quota method follows directly from Theorem 3.16. □

The expression in (9.3) for S β,m may be compared to the somewhat different expression in Heinrich et al. (2004, 2005) for the limit for fixed N and random p 1,…,p m ; by the argument in Sect. 6, the formulas have to be equivalent (for random p 1,…,p m ).

For the expectation we similarly obtain:

Theorem 9.2

Suppose that p 1,…,p m are linearly independent over ℚ, and let \({{N\stackrel{\mathrm{p}^{*}}{\longrightarrow }\infty}}\).

(i)

    For the β-linear divisor method,

    $$ \mathbb{E}\widetilde{S}\to\frac{1}{12}\Biggl(\,\sum _{i=1}^m\frac{1}{p_i}+m-2\Biggr) +\biggl(\beta- \frac{1}{2}\biggr)^2\Biggl(\,\sum_{i=1}^m \frac{1}{p_i}-m^2\Biggr). $$
    (9.5)
(ii)

    For the γ-quota method,

    $$ \mathbb{E}\widetilde{S} \to\frac{(m+2)(m-1)}{12 m^2}\sum _{i=1}^m\frac{1}{p_i} +\frac{\gamma^2}{m^2} \Biggl(\,\sum_{i=1}^m\frac{1}{p_i}-m^2 \Biggr). $$
    (9.6)

Proof

This can be shown from Theorem 9.1, but it is easier to use

$$ \mathbb{E}\widetilde{S} =\sum_{i=1}^m\frac{\mathbb{E}\Delta_i^2}{p_i} =\sum_{i=1}^m\frac{\operatorname{Var}\Delta_i+(\mathbb{E}\Delta_i)^2}{p_i} $$

and Theorem 3.4, Corollary 3.8, Theorem 3.11, Corollary 3.14. □

Note that \(\sum_{i=1}^{m}1/p_{i}\ge m^{2}\), with equality if and only if all p i are equal. We know that \(\widetilde{S}\) is always minimised by the \(\frac{1}{2}\)-linear divisor method (Sainte-Laguë’s method). We see from Theorem 9.2 that, also asymptotically, \(\mathbb{E}\widetilde{S}\) is strictly larger for any other β-linear divisor method and for any γ-quota method, except when all parties have exactly the same size (a trivial case where all methods yield the same result; note that we have excluded this case by our assumptions).
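To get a feeling for the sizes involved, (9.5) and (9.6) are easily evaluated; the snippet below (ours, with an arbitrary illustrative size vector, not taken from the paper) shows that Sainte-Laguë indeed gives the smallest asymptotic mean divergence.

```python
# Evaluate the asymptotic means (9.5) and (9.6) for an illustrative size vector.
p = [0.33, 0.27, 0.24, 0.09, 0.07]
m = len(p)
A = sum(1 / q for q in p)            # sum of 1/p_i (>= m^2 = 25)

def mean_divisor(beta):              # (9.5), beta-linear divisor method
    return (A + m - 2) / 12 + (beta - 0.5) ** 2 * (A - m ** 2)

def mean_quota(gamma):               # (9.6), gamma-quota method
    return (m + 2) * (m - 1) / (12 * m ** 2) * A + gamma ** 2 / m ** 2 * (A - m ** 2)

print(mean_divisor(0.5))   # Sainte-Lague, ~3.27 (the minimum)
print(mean_divisor(1.0))   # D'Hondt, ~6.10
print(mean_quota(0.0))     # Hamilton/Hare, ~3.39
print(mean_quota(1.0))     # Droop, ~3.84
```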

Furthermore, Theorems 9.1–9.2 show that \(\widetilde{S}\) is typically of the order m 2 if all p i are of roughly the same order, but if some p i is very small, \(\widetilde {S}\) can be substantially larger. Heinrich et al. (2004, 2005) consider \(\widetilde{S}\) and the limit S β,m for random, uniformly distributed, p 1,…,p m , and show that, as m→∞,

$$ S_{\beta,m}/m^2 - \biggl( \frac{1}{12}+\biggl(\beta-\frac{1}{2}\biggr)^2 \biggr)\log m \stackrel{\mathrm{d}}{ \longrightarrow} S^*, $$
(9.7)

where \(S^{*}\) is a certain 1-stable random variable (depending on β). We see that the log m term and the large tail of \(S^{*}\) (we have \(\mathbb{E}S^{*}=\infty\)) depend on the possibility of some very small p i ; the asymptotic behaviour for deterministic p 1,…,p m with, say, \((\min_{i}p_{i})^{-1}=O(m)\) is quite different. To see this better, we first replace V i (which are dependent) by U i (which are i.i.d.) in (9.3); the following lemma shows that for large m, the difference is small. (Cf. similar approximations in Heinrich et al. 2004, 2005.)

Lemma 9.3

Let \(\widehat{U}_{i}:=U_{i}+\beta-1\), so \(\widehat{U}_{i}\) are i.i.d. U(β−1,β), and let

$$ S_{\beta,m}':= \sum_{i=1}^m \frac{{\widehat{U}_i}^2}{p_i}- \Biggl(\,\sum_{i=1}^m{ \widehat{U}_i}\Biggr)^2. $$
(9.8)

Then \(\mathbb{E}\vert S_{\beta,m}-S_{\beta,m}'\vert =O(m)\), uniformly in p 1,…,p m .

Proof

We have |U i +β−1|≤|β|+1=O(1) and similarly |V i +β−1|=O(1). Further, by (3.9), \(\mathbb{P}(V_{i}\neq U_{i})=\mathbb{P}(J=i)=p_{i}\). Hence,

$$ \mathbb{E}\frac{\vert (V_i+\beta-1)^2-(U_i+\beta -1)^2\vert }{p_i} =\frac{O(p_i)}{p_i}=O(1). $$
(9.9)

Similarly, \(\sum_{i=1}^{m}(V_{i}+\beta-1)-\sum_{i=1}^{m}(U_{i}+\beta-1)=-U_{J}=O(1)\) and thus

$$ \Biggl(\,\sum_{i=1}^m(V_i+ \beta-1)\Biggr)^2-\Biggl(\,\sum_{i=1}^m(U_i+ \beta-1)\Biggr)^2 =O(m). $$
(9.10)

The result follows by summing (9.9) over i and adding (9.10). □

We can now state a normal limit theorem corresponding to (9.7), but for deterministic p 1,…,p m .

Theorem 9.4

Suppose that for each m, we are given p 1,…,p m such that \(m^{3/2}\min_{i}p_{i}\to\infty\) as m→∞. Let \(A_{m}:=\sum_{i=1}^{m}p_{i}^{-1}\), \(B_{m}:=\sum_{i=1}^{m}p_{i}^{-2}\) and

$$ C_m:=\sum_{i=1}^m\bigl(p_i^{-1}-m\bigr)^2=B_m-2mA_m+m^3. $$

Then, as m→∞,

$$ \frac{S_{\beta,m}-\frac{1}{12}A_m-(\beta-\frac{1}{2})^2(A_m-m^2)}{\sqrt{\frac{1}{180}B_m+\frac{1}{3}(\beta-\frac{1}{2})^2C_m}} \stackrel{\mathrm{d}}{\longrightarrow}N(0,1). $$

Proof

Note first that, e.g. by Jensen’s inequality, \(A_{m}\ge m^{2}\), \(B_{m}\ge m^{3}\) and \(C_{m}\ge0\).

Write, for convenience, b:=β−1/2. We have \(\widehat{U}_{i}=\widetilde{U}_{i}+b\) and \(\mathbb{E}\widehat{U}_{i}=b\). Thus, (9.8) can be expanded as

$$ S_{\beta,m}' =\sum_{i=1}^m\frac{\widetilde{U}_i^2+2b\widetilde{U}_i}{p_i} +b^2\bigl(A_m-m^2\bigr) -2mb\sum_{i=1}^m\widetilde{U}_i -\Biggl(\,\sum_{i=1}^m\widetilde{U}_i\Biggr)^2. $$

The last term has expectation \(\operatorname{Var}\sum _{i=1}^{m}\widetilde {U}_{i}=O(m)\), and can thus be ignored. Define

$$ Z_i:=\frac{\widetilde{U}_i^2-\frac{1}{12}}{p_i} +2b\widetilde{U}_i\biggl(\frac{1}{p_i}-m\biggr). $$

Then

$$ S_{\beta,m}' =\sum_{i=1}^mZ_i+\frac{1}{12}A_m+b^2\bigl(A_m-m^2\bigr) -\Biggl(\,\sum_{i=1}^m\widetilde{U}_i\Biggr)^2 $$

and, by a straightforward calculation,

$$ \operatorname{Var}\Biggl(\,\sum_{i=1}^mZ_i\Biggr) =\frac{1}{180}B_m+\frac{b^2}{3}C_m. $$
The random variables Z i are independent and each is bounded with \(|Z_{i}|=O(1/p_{i}+m)=o(m^{3/2})=o(B_{m}^{1/2})\). The result now follows by the central limit theorem, using either Lindeberg’s or Lyapounov’s condition, see e.g. Gut (2005, Sect. 7.2). □

Remark 9.5

Sainte-Laguë (1910b) also studied the least squares functional calculated per seat and not per voter:

$$ \sum_{i=1}^ms_i\biggl( \frac{v_i}{s_i}-\frac{V}{N}\biggr)^2 =\frac{V^2}{N^2}\sum _{i=1}^m\frac{\Delta_i^2}{s_i}, $$
(9.11)

and found that this is minimised by the divisor method with \(d(n)=\sqrt{n(n-1)}\), later suggested by Huntington, see Appendix A and Remark A.10. The asymptotics of this quantity are, after division by \((V/N)^{3}\), the same as for S, because s i /N→p i .

10 Some other goodness-of-fit functionals

Many election methods, including the ones treated here, can be characterised as minimising various functionals, see e.g. Balinski and Young (2001), Kopfermann (1991) and Niemeyer and Niemeyer (2008). Theorems 3.7 and 3.16 yield by the continuous mapping theorem asymptotic distributions for many such functionals for the methods treated here, under our standard assumptions that p 1,…,p m are linearly independent over ℚ and \({{N\stackrel{\mathrm{p}^{*}}{\longrightarrow}\infty}}\), which we assume below. We give some examples, leaving others to the reader. Note, however, that our results for quota methods are less complete, cf. Problem 3.18.

10.1 Hamilton divergences

Hamilton/Hare’s method, the 0-quota method, is the unique method minimising max i |Δ i |. It is also the method minimising max i Δ i . Moreover, it is, for any convex function φ:ℝ→[0,∞] with φ(0)=0 and φ(x)>0 for x≠0, the unique method minimising \(\sum_{i}\varphi(\Delta_{i})\), as was shown by Pólya (1918b).

For the β-linear divisor method, Theorem 3.7 implies

$$ \max_{1\le i\le m}|\Delta_i|\stackrel{\mathrm{d}}{\longrightarrow }\max_{1\le i\le m}|X_i| $$
(10.1)

with X i given by (3.10). However, we do not know any really simple description of this limit variable.

For the γ-quota method, including Hamilton/Hare’s method and thus the minimum of max i i | over all allocations, Theorem 3.16 yields

$$ \max_{1\le i\le m}|\Delta_i|\stackrel{\mathrm{d}}{\longrightarrow }\max_{1\le i\le m}|Y_i|, $$
(10.2)

but we have no explicit description of this limit variable.

Similar results hold for max i Δ i .

We obtain more complete results if we instead consider the least squares functional \(\sum_{i=1}^{m}\Delta_{i}^{2}\), which also yields Hamilton/Hare’s method by Pólya’s theorem (Pólya 1918b). (This case was shown earlier by Sainte-Laguë (1910b), who attributed the result to Zivy.)

Theorem 10.1

With assumptions and notations as in Sect. 3, for the β-linear divisor method,

$$ \sum_{i=1}^m\Delta_i^2 \stackrel{\mathrm{d}}{\longrightarrow} \sum_{i=1}^mX_i^2, $$
(10.3)

$$ \mathbb{E}X_i^2 =\frac{1+(m-2)p_i^2}{12} +\biggl(\beta-\frac{1}{2}\biggr)^2(mp_i-1)^2, $$
(10.4)

$$ \mathbb{E}\sum_{i=1}^m\Delta_i^2 \to\frac{1}{12}\Biggl(m+(m-2)\sum_{i=1}^mp_i^2\Biggr) +\biggl(\beta-\frac{1}{2}\biggr)^2\Biggl(m^2\sum_{i=1}^mp_i^2-m\Biggr), $$
(10.5)

and for the γ-quota method,

$$ \sum_{i=1}^m\Delta_i^2 \stackrel{\mathrm{d}}{\longrightarrow} \sum_{i=1}^m\bar{Y}_i^2, $$
(10.6)

$$ \mathbb{E}\sum_{i=1}^m\Delta_i^2 \to\frac{(m-1)(m+2)}{12m} +\gamma^2\Biggl(\,\sum_{i=1}^mp_i^2-\frac{1}{m}\Biggr). $$
(10.7)

Proof

This follows from Theorems 3.7 and 3.16, calculating \(\mathbb {E}X_{i}^{2}\) and \(\mathbb{E}\bar{Y}_{i}^{2}\) by Theorem 3.4, Corollary 3.8, Theorem 3.11, Corollary 3.14 as in the proof of Theorem 9.2. □

10.2 Jefferson and Adams divergences

Heinrich and Schwingenschlögl (2006) consider the functionals

$$ S^{\mathrm{J}}:=\max_{1\le i\le m}\biggl(\frac{s_i}{p_i}-N\biggr)= \max_{1\le i\le m}\frac{\Delta_i}{p_i} $$
(10.8)

and

$$ S^{\mathrm{A}}:= \min_{1\le i\le m}\biggl(\frac{s_i}{p_i}-N\biggr)=\min_{1\le i\le m} \frac{\Delta_i}{p_i}. $$
(10.9)

(We could as well use the maximum and minimum of s i /v i −N/V.) Since \(\sum_{i=1}^{m}\Delta_{i}=0\), we have S J≥0 but S A≤0, and we will therefore use |S A|=−S A.

It is easy to see that S J and |S A| are minimised, over all allocations of N seats, by the Jefferson and Adams methods, respectively (Sainte-Laguë 1910b; Balinski and Young 2001).

Theorem 3.7 yields the asymptotic distribution of S J and |S A| for any β-linear divisor method; we consider for simplicity only the optimal methods.

Theorem 10.2

With assumptions and notations as in Sect. 3, for the optimal allocations,

$$ S^{\mathrm{J}}\stackrel{\mathrm{d}}{\longrightarrow}S_m^{\mathrm {J}}:= \sum_{j=1}^{m-1}U_j, $$
(10.10)

and

$$ \bigl|S^{\mathrm{A}}\bigr|\stackrel{\mathrm{d}}{\longrightarrow }\bigl|S_m^{\mathrm{A}}\bigr|:=m-\sum_{j=1}^mV_j -\min_{1\le i\le m}\frac{1-V_i}{p_i}. $$
(10.11)

We have

$$ \mathbb{E}S_m^{\mathrm{J}}=\mathbb{E}\bigl|S_m^{\mathrm{A}}\bigr|= \frac{m-1}{2}. $$
(10.12)

Proof

S J is minimised by Jefferson’s method, so Theorem 3.7 with β=1 yields, noting that min i V i /p i =0 by (3.9),

$$ S^{\mathrm{J}} \stackrel{\mathrm{d}}{\longrightarrow} \max_{1\le i\le m}\frac{X_i}{p_i} =\sum_{j=1}^mV_j-\min_{1\le i\le m}\frac{V_i}{p_i} =\sum_{j=1}^mV_j \stackrel{\mathrm{d}}{=}\sum_{j=1}^{m-1}U_j. $$

Similarly, |S A| is minimised by Adams’s method, so Theorem 3.7 with β=0 yields

$$ \bigl|S^{\mathrm{A}}\bigr| \stackrel{\mathrm{d}}{\longrightarrow} -\min_{1\le i\le m}\frac{X_i}{p_i} =m-\sum_{j=1}^mV_j-\min_{1\le i\le m}\frac{1-V_i}{p_i}. $$
We have \(\mathbb{E}S_{m}^{\mathrm{J}}=(m-1)/2\) directly from (10.10).

For \(|S_{m}^{\mathrm{A}}|\), we first note that if B:=min 1≤i≤m 1/p i , then for 0<x<B,

$$ \mathbb{P}\biggl(\min_{1\le i\le m}\frac{1-V_i}{p_i}>x\biggr) =\sum_{k=1}^mp_k\prod_{i\ne k}(1-p_ix) =-\frac{\mathrm{d}}{\mathrm{d}x}\prod_{i=1}^m(1-p_ix), $$

while ℙ(min 1≤i≤m (1−V i )/p i >x)=0 for x≥B. Hence,

$$ \mathbb{E}\min_{1\le i\le m}\frac{1-V_i}{p_i} =-\int_0^B\frac{\mathrm{d}}{\mathrm{d}x}\Biggl(\,\prod_{i=1}^m(1-p_ix)\Biggr)\,\mathrm{d}x =1-\prod_{i=1}^m(1-p_iB)=1. $$
(10.13)

Furthermore, \(\mathbb{E}\sum_{j=1}^{m}V_{j}=\mathbb{E}\sum_{j=1}^{m-1}U_{j}=(m-1)/2\), and (10.12) follows from (10.11). □

Remark 10.3

Note that \(S_{m}^{\mathrm{J}}\) in (10.10) does not depend on p 1,…,p m , but for \(|S_{m}^{\mathrm{A}}|\), the situation is more complicated; the representation in (10.11) depends on p 1,…,p m , but it is still possible that \(|S_{m}^{\mathrm{A}}|\) has the same distribution for all p 1,…,p m and that we just have found an unnecessarily complicated description of it. Note that \(\mathbb{E}|S_{m}^{\mathrm{A}}|\) does not depend on p 1,…,p m . In the case m=2, it is easy to see that \(|S_{m}^{\mathrm{A}}|\stackrel{\mathrm{d}}{=}U\stackrel {\mathrm{d}}{=}S_{m}^{\mathrm{J}}\) for every (p 1,p 2), but we do not know whether something similar is true for m≥3.

Problem 10.4

Is \(|S_{m}^{\mathrm{A}}|\stackrel{\mathrm{d}}{=}S_{m}^{\mathrm{J}}\) for every p 1,…,p m ?
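Problem 10.4 can at least be probed empirically. The sketch below (ours; the party sizes and seat ranges are arbitrary choices, and no proof is intended) computes S J and |S A| directly from Jefferson and Adams allocations for random N; both sample means come out close to (m−1)/2=2, in agreement with (10.12), and the two empirical distributions can be compared in the same way.

```python
import random

def highest_averages(shares, seats, d):
    """Allocate seats by highest averages with divisor sequence d(s)."""
    s = [0] * len(shares)
    for _ in range(seats):
        i = max(range(len(shares)),
                key=lambda k: shares[k] / d(s[k]) if d(s[k]) > 0 else float("inf"))
        s[i] += 1
    return s

p = [0.31, 0.24, 0.19, 0.15, 0.11]
m = len(p)
random.seed(1)
SJ, SA = [], []
for _ in range(500):
    N = random.randint(1000, 2000)
    sj = highest_averages(p, N, lambda s: s + 1)  # Jefferson/D'Hondt
    sa = highest_averages(p, N, lambda s: s)      # Adams (d(0)=0: first seat free)
    SJ.append(max(sj[i] / p[i] - N for i in range(m)))   # (10.8)
    SA.append(-min(sa[i] / p[i] - N for i in range(m)))  # (10.9)
print(sum(SJ) / len(SJ), sum(SA) / len(SA))  # both ~ (m-1)/2 = 2.0
```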

The asymptotic results by Heinrich and Schwingenschlögl (2006), taking the limit of the limits as m→∞, follow easily in our setting too.

Theorem 10.5

For any p 1,…,p m depending on m, as m→∞,

$$ \frac{S_m^{\mathrm{J}}-(m-1)/2}{\sqrt{m/12}} \stackrel{\mathrm{d}}{\longrightarrow}N(0,1) \quad\text{and}\quad \frac{|S_m^{\mathrm{A}}|-(m-1)/2}{\sqrt{m/12}} \stackrel{\mathrm{d}}{\longrightarrow}N(0,1). $$

Proof

The result for \(S_{m}^{\mathrm{J}}\) is immediate from (10.10) and the central limit theorem.

For \(|S_{m}^{\mathrm{A}}|\) we use (10.11). We have \({\sum_{j=1}^{m}(U_{j}-V_{j})}=U_{J}\) and consequently, by (10.11) and (10.13),

$$ \bigl|S_m^{\mathrm{A}}\bigr| =m-\sum_{j=1}^mV_j-\min_{1\le i\le m}\frac{1-V_i}{p_i} =\sum_{j=1}^m(1-U_j)+U_J-\min_{1\le i\le m}\frac{1-V_i}{p_i}, $$

where the two last terms are bounded in probability (the minimum is nonnegative with expectation 1 by (10.13)), and the result follows by the central limit theorem applied to \(\sum _{j=1}^{m}(1-U_{j})\). □

11 Rational p i

In our main results we have assumed that p 1,…,p m are linearly independent over ℚ, since this is necessary for Lemma 4.1. We consider briefly what happens when this is not satisfied, i.e., when p 1,…,p m are linearly dependent over ℚ. In particular, we are for obvious practical reasons interested in the case when p 1,…,p m are rational. (See also the corresponding discussion in Janson and Linusson 2012.)

Note first that it may now happen that there are ties. We assume that any ties are resolved randomly; thus the quantities s i (N) and Δ i (N) may in general be random variables, even for a fixed N. This is no real problem, however, and when discussing the means we may replace them by their average over all solutions in case of a tie.

Lemma 4.1 does not apply when p 1,…,p m are linearly dependent over ℚ, but the proof sketched in Sect. 4 shows that the sequence ({ny 1+a 1},…,{ny k +a k }) n≥1 is uniformly distributed on a coset of a subgroup of \([0,1)^{k}\); more precisely, if we for simplicity take all a i =0, the empirical distributions converge to the uniform probability measure μ y on a subgroup, with Fourier coefficients given by

$$ \hat{\mu}_{\mathbf{y}}(a_1,\dots,a_k)= \begin{cases} 1, & \text{if } \sum_{j=1}^ka_jy_j\in\mathbb{Z},\\ 0, & \text{otherwise}. \end{cases} $$
(11.1)

The proofs of Theorems 3.7 and 3.16 now show, with minor modifications, that the seat excesses Δ i still converge jointly in distribution as \({{N\stackrel {\mathrm{p}^{*}}{\longrightarrow}\infty }}\), but the limit distribution is different, and depends on p 1,…,p m through some measures of the type (11.1). In particular, the bias \(\lim\mathbb{E}\Delta_{i}\) is well-defined, but we do not have an explicit formula for it.

However, if we consider a sequence of party size distributions (p 1k ,…,p mk ), k=1,2,… (with a fixed m), such that p ik →p i for each i as k→∞ and further, for every integer vector (a 1,…,a m )≠0, we have \(\sum_{i=1}^{m} a_{i}p_{ik}\neq0\) for all large k, then it follows from (11.1) and an inspection of the proofs of Theorems 3.7 and 3.16 that the measures μ y that now appear in the proofs will converge to the uniform measure μ, and thus the limit variables X i and \(\bar{Y}_{i}\) will converge to the corresponding variables given in Sect. 3 for the case when p 1,…,p m are linearly independent over ℚ.

This shows that the results above are good approximations also for linearly dependent p 1,…,p m , as long as there is no linear relation (3.2) with small integers a i . In particular, for rational p 1,…,p m we typically have a good approximation unless some p i has a small denominator.

In the remainder of this section, we suppose that p 1,…,p m are rational, with a common denominator L. Theorems A.11 and B.3 then show that the sequence of seat excesses Δ i :=s i (N)−Np i , N≥1, has period L. (If β<0, β>1 or γ<0 we may have to exclude some small N.) This gives another (simpler) proof that the limit distribution of Δ i as \({{N\stackrel{\mathrm {p}^{*}}{\longrightarrow}\infty}}\) exists in this case, and shows that it is the same as the distribution obtained by taking N uniformly in a period {K+1,…,K+L} (for any K that is large enough). In particular, the asymptotic distribution is discrete, and therefore not the same as in Sect. 3. The asymptotic bias \(\lim\mathbb{E}\Delta_{i}\) is easily calculated by taking the average of Δ i over a period.

Example 11.1

Let m=2 and take \((p_{1},p_{2})=(\frac{2}{3},\frac{1}{3})\). Then the β-linear divisor method equals the γ-quota method with γ=2β−1, see Appendix B.1. Assume, for simplicity, 0<β<1, i.e., −1<γ<1. A simple calculation shows that for N=1,2,3, we have \(\Delta_{1}=\frac{1}{3},-\frac{1}{3},0\). Hence the average is 0, so the methods are unbiased in this case, for any β∈(0,1) or γ∈(−1,1). (In fact, for this example the methods all coincide with Hamilton/Hare = Webster/Sainte-Laguë.) In particular, Theorems 3.4 and 3.11 do not hold for rational p 1,…,p m when \(\beta\in(0,\frac{1}{2})\cup(\frac{1}{2},1)\) or γ∈(−1,0)∪(0,1).

If we instead take β=1, i.e. γ=1, we have a tie for N=2, with \(\Delta_{1}=-\frac{1}{3}\) or \(\frac{2}{3}\). The average of Δ1 over a period now is \(\frac{1}{6}\).

For the methods with β=1/2 and γ=0, the result 0 in Example 11.1 agrees with Theorems 3.4 and 3.11. In fact, this is true for any rational p 1,…,p m .

Theorem 11.2

The \(\frac{1}{2}\)-linear divisor method (Webster/Sainte-Laguë) and the 0-quota method (largest remainder/Hamilton/Hare) are asymptotically unbiased for any rational p 1,…,p m .

Proof

Let, as above, L be a common denominator of p 1,…,p m . It is for both methods easily seen, arguing similarly to the proofs of Theorems A.11 and B.3, that s i (L−N)=Lp i −s i (N) for N=0,…,L. Hence, Δ i (L−N)=−Δ i (N), and the average over a period N=1,…,L vanishes. □
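The period-averaging argument is easy to verify exactly with rational arithmetic; the following sketch (ours; the size vector with denominator L=7 is chosen so that no ties occur) implements Hamilton/Hare and confirms that the average of Δ i over a full period vanishes for every party.

```python
from fractions import Fraction

def hamilton(p, N):
    """Largest-remainder (Hamilton/Hare) allocation with exact fractions."""
    q = [pi * N for pi in p]           # exact quotas N*p_i
    s = [int(qi) for qi in q]          # lower quotas (floor)
    order = sorted(range(len(p)), key=lambda i: q[i] - s[i], reverse=True)
    for i in order[:N - sum(s)]:       # leftover seats to largest remainders
        s[i] += 1
    return s

p = [Fraction(4, 7), Fraction(2, 7), Fraction(1, 7)]  # common denominator L = 7
L = 7
for i in range(len(p)):
    avg = sum(hamilton(p, N)[i] - p[i] * N for N in range(1, L + 1)) / L
    print(avg)  # 0 for every party: no asymptotic bias, as in Theorem 11.2
```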

Also for β=1 and γ=1, the result in Example 11.1 agrees with Theorems 3.4 and 3.11. For divisor methods, we have the following general result. (By Example 11.1 and (3.14), the result does not hold for any other β.)

Theorem 11.3

Consider the β-linear divisor method and suppose that β=k/2 for some integer k. Then (3.5) holds for any rational p 1,…,p m .

Proof

By (3.14), the result holds for β if and only if it holds for β+1. Hence it suffices to consider β=1/2 and β=1. The case β=1/2 is part of Theorem 11.2.

For β=1, we argue similarly. (Cf. the proof of Theorem A.11.) Let 0≤N<L and let t≥0 be such that s i (N)=[p i t]1, see (2.7) and (A.3). Recall that [p i t]1=⌊p i t⌋ except that ⌊p i t⌋−1 also is possible when p i t is an integer.

Since p i L is an integer, we obtain, assuming first for simplicity that p i t is not an integer,

$$ \bigl[p_i(2L-t) \bigr]_{1}=2p_iL- [p_it ]_{1}-1=2p_iL-s_i(N)-1, $$
(11.2)

Summing over i yields

$$ \sum_{i=1}^m \bigl[p_i(2L-t) \bigr]_{1}=2L-\sum_{i=1}^ms_i(N)-m=2L-N-m. $$
(11.3)

Consequently, (A.3) again shows that these numbers yield the seat distribution for 2L−N−m seats. (Note that \(L=\sum_{i=1}^{m}p_{i}L\ge m\), and thus 2L−N−m>0.) Hence, using (11.2) again,

$$ s_i(2L-m-N)= \bigl[p_i(2L-t) \bigr]_{1} =2p_iL-s_i(N)-1. $$
(11.4)

The argument works also, with a little care, if some p i t is an integer, and shows that (11.4) holds generally, and that ties in the distribution of s i (N) correspond to ties in the distribution of s i (2L−m−N). Hence, we have, regarding Δ i as a random variable when there is a tie,

$$ \Delta_i(2L-m-N) \stackrel{\mathrm{d}}{=} mp_i-1-\Delta_i(N). $$
(11.5)

Taking the expectations in case of a tie, and then the average over the period N=0,…,L−1, when 2L−m−N runs through the period L−m+1,…,2L−m, we see that if the average is x, then x=mp i −1−x; hence \(x=\frac{1}{2}(mp_{i}-1)\), and thus (3.5) holds when β=1. □

For the 1-quota method (Droop), we have seen that (3.16) holds in Example 11.1; in fact, when m=2, it holds for any (p 1,p 2) by Theorem 11.3, because the 1-quota method equals the 1-linear divisor method when m=2, see Appendix B.1. However, it does not hold for m=3.

Example 11.4

Let m=3 and \((p_{1},p_{2},p_{3})=(\frac{2}{5},\frac{2}{5},\frac{1}{5})\). Consider the 1-quota method (Droop). For N=1,2,3,4,5, the smallest party gets, taking the average when there is a tie, \(0,0,1,\frac{2}{3},1\) seats. (Note the non-monotonicity, an instance of the Alabama paradox which frequently occurs for all quota methods, cf. Janson and Linusson 2012.) The seat excesses are \(-\frac{1}{5},-\frac{2}{5},+\frac{2}{5},-\frac{2}{15},0\), with average \(-\frac{1}{15}\), while (3.16) would give \(p_{3}-\frac{1}{3}=-\frac{2}{15}\). Thus (3.16) does not hold.

Note that this example shows that the heuristic argument in Remark 3.12 is not always valid when p 1,…,p m are rational. In this example, the retracted seat in Remark 3.12 is not taken uniformly; a calculation shows that it is taken from the smallest party only \(\frac{4}{15}\) of the time.

We can ask the following for any β-linear divisor method or γ-quota method, where the cases γ=1 in Problem 11.5 and β=1/2, β=1 and γ=0 in Problem 11.6 are especially interesting. (By Theorem A.12, for the linear divisor methods, it suffices to consider 0<β≤1.)

Problem 11.5

Find a general formula for the asymptotic bias \(\lim\mathbb{E}\Delta _{i}\) for rational p 1,…,p m . (Theorems 11.2 and 11.3 yield the answers for β∈ℤ/2 and γ=0.)

Problem 11.6

Find a general formula for the asymptotic distribution of Δ i for rational p 1,…,p m .