1 Introduction

Hotelling’s (1929) “Main Street” model of spatial competition between firms has enjoyed a significant presence in the voting literature, most notably thanks to its adaptation by Downs (1957) to ideological competition among political parties. In the classical model, there is a society of voters whose ideal policy platforms lie along the left-right political spectrum. A set of exogenously given political candidates or parties choose platforms to advocate so as to maximise their support from the voters, each of whom votes for the candidate whose platform is nearest to his or her own ideal platform.

Most such studies of Downsian competition have focused on situations in which elections are held under the voting system known as plurality rule. This is the simplest system where voters have one vote each, which they cast for their favourite candidate, and whoever gets the most votes wins. Under plurality, voters’ second, third and other preferences—most importantly for this paper, their last place preferences—do not matter. However, voting systems, both used in practice and studied theoretically, come in many varieties. Many of them do take into account voters’ partial or full ranking of candidates when producing a winner. These include, among others, approval voting, Borda count, and single transferable vote. When the preferences beyond first matter, candidates’ incentives change, and we expect equilibrium outcomes to vary as well. In this paper, we analyze the equilibrium properties of a largely overlooked class of voting rules, which combine positive and negative voting, and are referred to as best-worst rules (García-Lapresta et al. 2010). Under these rules, each voter casts one positive and one negative vote and a candidate’s total score is the weighted difference of the number of positive votes and the number of negative votes. We allow the weight of a negative vote to be different from that of a positive vote and, hence, this class of voting systems includes as special cases plurality, anti-plurality, and the system in which positive and negative votes are of equal importance.Footnote 1

The main result of this paper is that, in a simple Hotelling-Downs model with uniformly distributed,Footnote 2 sincere voters and no exit or entry, there is a close link between the pure-strategy equilibria of general best-worst rules and those of plurality, which is well known to admit divergent equilibria in which candidates adopt a range of ideologically diverse positions (Eaton and Lipsey 1975; Denzau et al. 1985). When the importance of a positive vote exceeds that of a negative vote, equilibria take the same general form as those of plurality, with divergent policy platforms advocated. However, the key difference is that, while differentiated, the equilibrium platforms for the best-worst rules exhibit less dispersion. Indeed, these rules present candidates with a clear centrifugal motive to seek first-place rankings, as occurs under plurality, but with the simultaneous incentive to avoid being the most unpopular candidate and receiving negative votes. This last property encourages a degree of policy moderation—adopting extreme platforms is discouraged as doing so is likely to single oneself out as a target for the negative votes of citizens at the opposite end of the ideological spectrum. As the importance of a negative vote increases relative to that of a positive vote, the equilibrium platforms move inwards towards the median voter’s ideal platform. Eventually all platforms merge at the median as a negative vote reaches parity with a positive vote (i.e., one negative vote cancels out one positive vote exactly). When a negative vote becomes more important than a positive vote, only convergent equilibria exist, with no policy differentiation.

Describing the equilibrium properties of different voting systems is an important task (Cox 1985, 1987; Grofman and Lijphart 1986; Myerson and Weber 1993; Myerson 1999; Cahan and Slinko 2017). When choosing between voting rules, first of all, we would like to know whether or not equilibria exist—their absence may lead to permanent instability and a lack of predictability of outcomes. Second, if they exist, an electoral designer would prefer a rule that admits equilibria with desirable properties. The main consideration here is a tradeoff between discouraging extremism and promoting fair representation—it is undesirable if candidates are incentivised to adopt extremist platforms rather than more centrist platforms, while at the same time the rationale for voting in the first place is to provide citizens with political representation of their varied interests. Besides the platforms that are advocated, which platforms are likely to receive the most support also matters for similar reasons.

Our results show that best-worst rules do well on all counts. They admit nonconvergent equilibria, offering voters a choice over distinct platforms and avoiding Hotelling’s “excessive sameness”. At the same time, the perhaps excessive extremism associated with plurality (Cox 1985, 1987; Myerson and Weber 1993; Laslier and Maniquet 2010) is moderated. Indeed, depending on the weight placed on a negative vote, we may have any level of dispersion of platforms between that of plurality, at one extreme, and full convergence of platforms, at the other. Moreover, the candidates that adopt the most extreme positions in equilibrium never obtain a strictly higher vote share than any other candidate; in fact, when there are at least five candidates, there always exist nonconvergent Nash equilibria (NCNE) in which the most extreme candidates receive a strictly smaller vote share than at least one less extreme candidate. Finally, best-worst voting rules have the additional advantage that they are simple and easily implementable, requiring only that voters list their first and last choices and not a tedious full ranking.

Best-worst voting itself has not been used in practice, but the idea of voting against candidates in one form or another has been around for some time. Boehm (1976) in an unpublished essay suggested that voters in an election be allowed either to cast a vote for or against a candidate, but not both. A candidate’s “negative” votes would be subtracted from his “positive” votes to determine his net vote, and the candidate with the highest net vote would win.Footnote 3 Boehm—and many others after him (see, e.g., Leef 2014)—argued that the introduction of negative votes in United States presidential elections would force the candidates to appeal to voters with positive programs, rather than just fill the airwaves with ads attacking other candidates, sowing doubt among their supporters. The rule suggested by Boehm is now known as negative voting (Brams 1983). Anti-plurality voting is a similar method in which each voter votes against a single candidate, and the candidate with the fewest votes against wins. In other words, anti-plurality determines who among the candidates is the least unpopular.Footnote 4

The use of some form of negative voting in elections is not so uncommon. For example, Nevada gives voters the option to vote against all candidates by having a “None of these candidates” option on the ballot. Prior to 2000, Lithuanian voters were allowed to express approval, neutrality or disapproval of candidates in the proportional representation part of their parliamentary elections (Renwick and Pilet 2016). Latvia does the same in allocating a party’s European Parliament seats to individual candidates from the party list.Footnote 5

A voting system in which voters cast both positive and negative votes, as occurs under best-worst rules, may be even more advantageous. It can give a fighting chance to major or minor centrist parties—it is not unthinkable that people on the extreme left will vote for a leftist candidate and against a right-wing one, while the right-wing voters will do the opposite. Their votes will cancel out and a centrist candidate will be elected.Footnote 6 Indeed, roughly speaking, this is directly in the spirit of our main results.Footnote 7

The rest of this paper is organised as follows: in Sect. 2 we outline some literature related to this work; in Sect. 3 we present the model; in Sect. 4 we present our main results; Sect. 5 discusses a few of the assumptions and the generalisability of the results; and, Sect. 6 provides our concluding remarks. A few minor and auxiliary results are presented in the Appendix.

2 Related literature

Best-worst rules specifically and notions related to them have been considered before in other contexts. The idea that the best and worst alternatives play a special role in the decision process has been prominent in decision theory. For example, the Arrow–Hurwicz (1972) criterion for choice under uncertainty takes a weighted average of the best and worst expected value/utility outcomes and does not take into account intermediate outcomes, and Marley and Louviere (2005) look at probabilistic discrete choice models through the best-worst lens.

García-Lapresta et al. (2010) provide an axiomatic characterisation of the class of best-worst voting rules considered in this paper. Alcantud and Laruelle (2014) characterise a related voting rule in which, for each candidate, voters may express approval, indifference, or disapproval. This rule is also studied in Felsenthal (1989) from the perspective of voter strategies. Joy and McMunigal (2016) believe that the current system of peremptory challenges in the criminal justice system of the United States makes it easy to exclude qualified African American jurors in the process of jury selection and propose that it be replaced with a system of peremptory strikes and peremptory inclusions. In other words, both the defense and the prosecution should be allowed to not just rule potential jurors out, but also “rule them in”.

Baujard et al. (2014), during the first round of the 2012 French presidential election, ran an experiment in which subjects were asked to vote for candidates using various “evaluative voting” methods, which bear many similarities to the best-worst voting rules considered in our paper. Voters “graded” candidates on a numerical scale: for example, under one system they could assign each candidate 1 point, 0 points, or \(-1\) points; under another, they could assign 2 points, 1 point or 0 points. They documented an interesting psychological effect: these systems were not treated the same, despite being mathematically equivalent (see also Igersheim et al. 2016).

None of these papers, however, looks at how the incentives created by these voting systems affect political competition. Given the very natural combination of negative and positive voting embodied in the best-worst rules, it is surprising that, to the best of our knowledge, they have been overlooked in the spatial competition literature. Plurality, a special case, has of course been extensively discussed, and its equilibrium properties are characterised in Eaton and Lipsey (1975) and Denzau et al. (1985). Anti-plurality is known to allow convergent equilibria in which all candidates adopt the same policy platform, but not to allow nonconvergent equilibria (Cox 1987).

The two most relevant papers to this research are Cox (1987) and Cahan and Slinko (2017). Both are concerned with Nash equilibria under the class of voting rules known as general scoring rules, of which the best-worst rules are a subclass. Cox (1987) characterised all scoring rules that have convergent Nash equilibria, which leads to a straightforward description of all best-worst rules allowing convergent equilibria, as we will describe in Sect. 4. However, Cox’s theorem says nothing about the possibility of divergent equilibria, which is the focus of Cahan and Slinko (2017), and also this paper.

Cahan and Slinko (2017) investigate the existence and properties of nonconvergent equilibria under general scoring rules. In some subclasses of scoring rules—in particular, those whose score vector is convex—they managed to characterise all rules that allow Nash equilibria. These rules appear to be truncated variants of the Borda rule. This result is, however, inapplicable to the best-worst rules, whose score vectors are neither convex nor concave. A general characterisation of scoring rules that allow equilibria remains an open question.

3 The model

There is a unit mass of voters with ideal positions distributed uniformly on the interval [0,1], the issue space.Footnote 8 There are m candidates—candidate i’s position is \(x_i\), and a strategy profile \(x=(x_1,\ldots ,x_m)\in [0,1]^m\) describes the platforms of all the candidates. A strategy profile implies a set of distinct occupied positions, \(x^1<x^2<\cdots <x^q\). We denote by \(n_i\) the number of candidates at occupied position \(x^i\) and we will sometimes use the alternative notation for a strategy profile, \(x=((x^1,n_1),\ldots ,(x^q,n_q))\), which gives the location and number of candidates at each occupied position rather than each individual candidate’s position.

We will use notation \([n]=\{1,\ldots ,n\}\) and if \(I=[a,b]\) is an interval, then \(\ell (I)=b-a\) is the length of the interval. We assume sincere voters with single-peaked, symmetric utility functions who, hence, rank candidates according to the distance between their advocated platform and the voter’s ideal position. Voters who are indifferent between candidates decide on a strict ranking by fair lottery.

A best-worst voting rule can be described as follows: a first-place ranking earns a candidate a normalised 1 point, while a last-place ranking earns the candidate \(-c\) points, where \(c\ge 0\). Being ranked anywhere other than first or last by a voter earns a candidate nothing. The magnitude of c describes the importance of a negative vote relative to a positive vote, which is the parameter of interest here. Thus, a rule can be described by a pair of numbers \(s=(c,m)\), where m is the number of candidates.Footnote 9

Candidate i’s score is the weighted difference between the number of positive votes and the number of negative votes received, denoted \(v_i(x)\). Candidates choose positions simultaneously so as to maximise \(v_i(x)\).Footnote 10 Our equilibrium concept is the Nash equilibrium in pure strategies. Profile \(x^*=(x_1^*,\ldots ,x_m^*)\) is an equilibrium if and only if \(v_i(x^*)\ge v_i(x_i,x_{-i}^*)\) for all \(i\in [m]\) and for all \(x_i\in [0,1]\), where \((x_i,x_{-i}^*)=(x_1^*,\ldots ,x_{i-1}^*, x_i,x_{i+1}^*,\ldots ,x_m^*)\). A convergent Nash equilibrium (CNE) is an equilibrium in which all candidates adopt the same platform, while in a non-convergent Nash equilibrium (NCNE), at least two of the platforms are distinct. The notations \(x_i^{+}\) and \(x_i^{-}\) refer to the points \(x_i+ \epsilon \) and \(x_i- \epsilon \), respectively, for vanishingly small \(\epsilon >0\).
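To make the definition of \(v_i(x)\) concrete, the following is a minimal sketch that approximates each candidate's score by placing sincere voters on an evenly spaced grid. The function name `best_worst_scores` and the accuracy parameter `n_voters` are illustrative assumptions, not part of the model; ties among candidates at the same platform are split equally, which matches the expected outcome of the fair lottery.

```python
def best_worst_scores(x, c, n_voters=20000):
    """Approximate v_i(x) for each candidate under the rule s = (c, m).

    x: list of platforms in [0, 1]; c: weight of a negative vote.
    Voters sit on an evenly spaced grid, an approximation of the uniform
    distribution (n_voters is a hypothetical accuracy knob).
    """
    m = len(x)
    positions = sorted(set(x))
    # Candidates sharing a platform are tied and split votes equally.
    occupants = {p: [i for i in range(m) if x[i] == p] for p in positions}
    scores = [0.0] * m
    weight = 1.0 / n_voters  # mass carried by each grid voter
    for k in range(n_voters):
        ideal = (k + 0.5) / n_voters
        nearest = min(positions, key=lambda p: abs(p - ideal))
        farthest = max(positions, key=lambda p: abs(p - ideal))
        for i in occupants[nearest]:      # one positive vote, worth 1
            scores[i] += weight / len(occupants[nearest])
        for i in occupants[farthest]:     # one negative vote, worth -c
            scores[i] -= c * weight / len(occupants[farthest])
    return scores


if __name__ == "__main__":
    print(best_worst_scores([0.3, 0.3, 0.7, 0.7], c=0.2))
```

With all candidates at a common platform, every voter is indifferent, so each candidate receives \(\frac{1-c}{m}\), the value used in the proof of Proposition 4.1.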

4 Results

Our main result is a general characterisation of NCNE for rules \(s=(c,m)\) in Theorem 4.3. Before we concentrate on NCNE, however, we should address the issue of CNE—equilibria in which all candidates adopt the same platform. In fact, their characterisation is straightforward, presented below in Proposition 4.1. This result follows directly from Cox (1987), who characterised CNE for general scoring rules, a broad class of voting rules to which best-worst rules belong.

Proposition 4.1

(Cox 1987) A rule \(s=(c,m)\) admits CNE if and only if \(c\ge 1\), in which case the profile \(x=((x^1,m))\) is a CNE for any \(x^1\in \big [\frac{m-1+c}{m(1+c)}, 1-\frac{m-1+c}{m(1+c)}\big ]\).

Proof

For \(x=((x^1,m))\) to be a CNE, it should not be beneficial to deviate just to the left or right of \(x^1\). That is, we have CNE if and only if: first, \(v_i(x^{1-},x_{-i}) =x^1-c(1-x^1)\le \frac{1-c}{m}=v_i(x)\); and, second, \(v_i(x^{1+},x_{-i})=1-x^1-cx^1\le v_i(x)\). Together, these two conditions yield the interval of possible CNE, which is nonempty if and only if \(c\ge 1\). \(\square \)
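The interval in Proposition 4.1 and the two deviation conditions in its proof can be checked numerically. The sketch below simply mirrors the statement and the proof; the function names are hypothetical and the tolerance `tol` is an assumption introduced to absorb floating-point error.

```python
def cne_interval(c, m):
    """Interval of convergent equilibria from Proposition 4.1, or None."""
    if c < 1:
        return None  # no CNE when a positive vote outweighs a negative one
    lower = (m - 1 + c) / (m * (1 + c))
    return lower, 1 - lower


def deviation_is_unprofitable(x1, c, m, tol=1e-12):
    """Check the two one-sided deviation conditions used in the proof."""
    stay = (1 - c) / m                 # v_i(x) at the common platform
    step_left = x1 - c * (1 - x1)      # v_i(x^{1-}, x_{-i})
    step_right = (1 - x1) - c * x1     # v_i(x^{1+}, x_{-i})
    return step_left <= stay + tol and step_right <= stay + tol


if __name__ == "__main__":
    print(cne_interval(2, 4))                    # endpoints 5/12 and 7/12
    print(deviation_is_unprofitable(0.5, 2, 4))  # median platform, c = 2
```

At \(c=1\) the interval collapses to the median, and for \(c<1\) a step to either side of any common platform is profitable.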

Proposition 4.1 tells us that CNE can only exist if the weight on a positive vote does not exceed that of a negative vote. In this case, a small deviation from the common platform differentiates a candidate in a positive way for one side of the electorate, and negatively for the other. The gain in terms of positive votes is not worth the damage due to the negative votes that the candidate will now receive, so candidates will stay put at the common platform. So CNE exist at any point of an interval centered at the median voter’s ideal position. As c increases, this interval expands, meaning that a wider range of CNE are possible.Footnote 11

While Proposition 4.1 tells us everything there is to know about CNE, it is silent about NCNE. We do know that NCNE exist for plurality (Eaton and Lipsey 1975), but not for anti-plurality (Cox 1987), both of which are examples of best-worst rules, so the picture is not at all clear in general.

It turns out that, for NCNE to exist, it must be that \(c<1\). In other words, the value of a positive vote must outweigh the value of a negative vote in order for the candidates to be induced to adopt divergent policies. Otherwise, the centripetal incentive to avoid being singled out as the worst candidate is too strong and only CNE can exist. This also implies that CNE and NCNE cannot exist simultaneously for the same rule.Footnote 12

Proposition 4.2

The rule \(s=(c,m)\) does not admit NCNE if \(c\ge 1\).

Proof

Consider candidate 1 at position \(x^1\), which is occupied by \(n_1\) candidates, where \(2\le n_1\le m-2\). Consider intervals \(I_1=[0,x^1]\) and \(I_2=[(x^1+x^q)/2,1]\). If 1 makes an infinitesimal move to the right of \(x^1\), then in the rankings of voters in \(I_1\), of which there is positive measure by Lemma A.1, she falls behind the other \(n_1-1\) candidates originally at \(x^1\), thus losing their positive votes. On the other hand, 1 rises ahead of these \(n_1-1\) candidates in the rankings of all other voters and, in particular, no longer receives a negative vote from any voter. Then, the score candidate 1 loses by making this move is \(v_{lost}=\frac{1}{n_1}\ell (I_1)\). On the other hand, 1’s gain from this move is \(v_{gained}=\frac{1}{n_1}c\ell (I_2).\)

For NCNE, it must be the case that \(v_{lost}\ge v_{gained}\), or \(\ell (I_1)\ge c\ell (I_2)\). Since we assume \(c\ge 1\), this implies that \(\ell (I_1)\ge \ell (I_2)\), or \(x^1\ge 1-(x^1+x^q)/2\). Similar considerations with respect to candidate q yield the requirement that \(\ell ([x^q,1])\ge \ell ([0,(x^1+x^q)/2])\), or \(1-x^q\ge (x^1+x^q)/2\). Adding these two inequalities gives \(x^1\ge x^q\), an impossibility for an NCNE. \(\square \)

Before we proceed to our characterisation, some additional notation. Let

  1. (i)

    \(I_1=[0,(x^1+x^2)/2]\),

  2. (ii)

    \(I_i=[(x^{i-1}+x^{i})/2,(x^{i}+x^{i+1})/2]\) for \(2\le i \le q-1\),

  3. (iii)

    \(I_q=[(x^{q-1}+x^{q})/2,1]\),

be the “full-electorates” around each of the occupied positions. A full-electorate \(I_i\) is the set of voters for whom a given occupied position \(x^i\) is the nearest, so that any candidates located there are ranked first equal for these voters. For each \(i\in [q]\) let \(I_i^L=\{y\in I_i: y\le x^i\}\) and \(I_i^R=\{y\in I_i: y\ge x^i\}\) be the left and right “half-electorates” whose union is the full-electorate \(I_i\). That is, we simply partition a full-electorate into those voters whose ideal positions lie to the left of the given occupied position and those who lie to the right. Note that \(\ell (I_i^R)=\ell (I_{i+1}^L)\) for \(i\in [q-1]\).
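As an illustration of this notation, the following sketch computes the left and right half-electorate lengths from a list of occupied positions. The function name `half_electorates` is a hypothetical helper, and positions are indexed from 0 in the code rather than from 1 as in the text.

```python
def half_electorates(occupied):
    """Return (left, right) lists of half-electorate lengths.

    occupied: strictly increasing occupied positions x^1 < ... < x^q in [0, 1].
    left[i] and right[i] are the lengths of I^L and I^R for position i + 1.
    """
    q = len(occupied)
    # Boundaries of the full-electorates: 0, the midpoints, and 1.
    midpoints = [(a + b) / 2 for a, b in zip(occupied, occupied[1:])]
    bounds = [0.0] + midpoints + [1.0]
    left = [occupied[i] - bounds[i] for i in range(q)]
    right = [bounds[i + 1] - occupied[i] for i in range(q)]
    return left, right


if __name__ == "__main__":
    left, right = half_electorates([0.2, 0.5, 0.9])
    print(left, right)
```

Because adjacent full-electorates meet at a midpoint, `right[i]` and `left[i + 1]` always coincide, which is the identity \(\ell (I_i^R)=\ell (I_{i+1}^L)\) noted above.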

We now present our characterisation of NCNE for best-worst rules, Theorem 4.3, which provides five necessary and sufficient conditions for a profile to be an NCNE for a given best-worst rule. Condition (i) states that the outermost occupied positions must be occupied by two candidates apiece. It is clear that they cannot be single candidates, but this condition also excludes the possibility of more than two candidates, as in the well-known case of plurality (see Eaton and Lipsey 1975). The second condition says that all paired candidates’ half-electorates are the same length, excluding end electorates, while (iii) relates these interior half-electorates to the outermost half-electorates. Conditions (iv) and (v) put restrictions on the lengths of various electorates: first, an unpaired candidate’s full-electorate cannot be smaller than any half-electorate (excluding end half-electorates); and, second, a paired candidate’s half-electorate cannot be smaller than an unpaired candidate’s half-electorate (excluding end half-electorates).

An important observation to make is that, with the exception of (iii), all the remaining conditions are identical for any rule—they do not depend directly on c, as long as \(c<1\). This implies that the equilibrium spacing will be affected by c, but not the configuration of the candidates, i.e., the number of occupied positions and how many candidates occupy them. Thus, if they exist (we will see shortly that they do for any rule with \(c<1\)), NCNE for best-worst rules will have the same general form as NCNE for plurality, the only difference being the exact location of the platforms \(x^1,\ldots ,x^q\).

Theorem 4.3

Given a rule \(s=(c,m)\), with \(c<1\), the following conditions are necessary and sufficient for a profile x to be an NCNE:

  1. (i)

    \(n_i\le 2\) for all \(i\in [q]\) and \(n_1=n_q=2\). That is, candidates at the most extreme occupied positions are paired.

  2. (ii)

    If \(n_i=2\) for \(1<i<q\), then \(\ell (I_i^L)=\ell (I_i^R)=\ell (I_1^R)=\ell (I_q^L)\). Let \(I^p\) denote this common measure. That is, all paired candidates’ half-electorates are the same length (except end half-electorates).

  3. (iii)

    \(\ell (I_1^L)=\ell (I_q^R)=I^p+\frac{c}{2}\).

  4. (iv)

    If \(n_i=1\), then both \( \ell (I_i)\ge \ell (I_k^L)\) for all \(k\ne 1\) and \( \ell (I_i)\ge \ell (I_k^R)\) for all \(k\ne q\). That is, any (unpaired) candidate’s full-electorate is no smaller than any other half-electorate (excluding end half-electorates).

  5. (v)

    \(I^p\ge \ell (I_k^L)\) for all \(k\ne 1\) and \(I^p\ge \ell (I_k^R)\) for all \(k\ne q\). That is, a paired candidate’s half-electorate (excluding end half-electorates) is no smaller than any other (unpaired) candidate’s half-electorate (excluding the end half-electorates).
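The five conditions translate directly into a mechanical test. The sketch below is one possible reading of Theorem 4.3, not a definitive implementation: the function name `is_ncne` and the tolerance `tol` are assumptions, and condition (ii) is read as requiring \(\ell (I_1^R)=\ell (I_q^L)=I^p\) even when no interior position is paired, as the necessity argument in the proof suggests.

```python
import math


def is_ncne(occupied, counts, c, tol=1e-9):
    """Test conditions (i)-(v) of Theorem 4.3 for a rule with c < 1.

    occupied: increasing occupied positions x^1 < ... < x^q;
    counts:   n_i, the number of candidates at each occupied position.
    """
    q = len(occupied)
    if q < 2 or not 0 <= c < 1:
        return False
    mids = [(a + b) / 2 for a, b in zip(occupied, occupied[1:])]
    bounds = [0.0] + mids + [1.0]
    left = [occupied[i] - bounds[i] for i in range(q)]
    right = [bounds[i + 1] - occupied[i] for i in range(q)]

    def close(a, b):
        return math.isclose(a, b, abs_tol=tol)

    # (i) at most two candidates per position, end positions paired.
    if any(n not in (1, 2) for n in counts) or counts[0] != 2 or counts[-1] != 2:
        return False
    # (ii) paired half-electorates (end ones excluded) share the length I^p.
    i_p = right[0]
    if not close(left[-1], i_p):
        return False
    for i in range(1, q - 1):
        if counts[i] == 2 and not (close(left[i], i_p) and close(right[i], i_p)):
            return False
    # (iii) end half-electorates exceed I^p by c/2.
    if not (close(left[0], i_p + c / 2) and close(right[-1], i_p + c / 2)):
        return False
    # Non-end half-electorates: left ones for k != 1, right ones for k != q.
    inner = left[1:] + right[:-1]
    # (iv) an unpaired candidate's full-electorate is no smaller than these.
    for i in range(q):
        if counts[i] == 1 and any(left[i] + right[i] < h - tol for h in inner):
            return False
    # (v) I^p is no smaller than any of them.
    return all(i_p >= h - tol for h in inner)


if __name__ == "__main__":
    print(is_ncne([0.3, 0.5, 0.7], [2, 1, 2], c=0.4))
```

For example, the five-candidate profile \(((0.3,2),(0.5,1),(0.7,2))\) passes for \(c=0.4\) but fails for plurality (\(c=0\)), where condition (iii) would require the end half-electorates to have length \(I^p\).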

Proof

That (i) is necessary follows from Lemma A.4, so we start by showing the necessity of (ii). Suppose candidate j is at \(x^i\), where \(n_i=2\) and suppose without loss of generality that \(\ell (I_i^L)>\ell (I_i^R)\). Then \( v_j(x^{i-},x_{-j})= \ell (I_i^L) > \ell (I_i^R) =v_j(x^{i+},x_{-j}), \) contradicting Lemma A.2. So \(\ell (I_i^L)=\ell (I_i^R)\). Let \(I^p\) denote this common measure. Moreover, note that \(v_1(x^{1+},x_{-1})=\ell (I_1^R)\). Using Lemmas A.2 and A.3, we know \(v_1(x^{1+},x_{-1})=v_1(x)=v_j(x)=I^p\), so that \(\ell (I_1^R)=I^p\). Similarly, \(\ell (I_q^L)=I^p\). Hence, condition (ii) is necessary.

Now condition (iii). Note that we must have \(\ell (I_1^L)=\ell (I_q^R)\). Otherwise, if \(\ell (I_1^L)>\ell (I_q^R)\), then, using Lemmas A.2 and A.3,

$$\begin{aligned} v_q(x)=v_1(x)=v_1(x^{1-},x_{-1})>v_q(x^{q+},x_{-q})=v_q(x), \end{aligned}$$

a contradiction. Thus, \((x^1+x^q)/2=1/2\). Hence, \( v_1(x)=\frac{1}{2}( \ell (I_1^L)+I^p)-\frac{c}{4}\), which, by Lemma A.2, is equal to \(v_1(x^{1+},x_{-1})= I^p \), so \(\ell (I_1^L)=I^p+\frac{c}{2}\).

Now conditions (iv) and (v). Let candidate l be at \(x^i\). Then, if \(n_i=1\), we have \(v_l(x)= \ell (I_i) \). Suppose there is some \(k>1\) such that \(\ell (I_i)<\ell (I_k^L)\). Clearly \(I_k^L\) cannot be a half-electorate adjacent to \(x^i\), i.e. \(k\ne i\) and \(k\ne i+1\), since \(\ell (I_i)\ge \ell (I_i^L)\) and \(\ell (I_i)\ge \ell (I_i^R)=\ell (I_{i+1}^L)\). So we have

$$\begin{aligned} v_l(x^{k-},x_{-l})= \ell (I_k^L) > \ell (I_i) =v_l(x), \end{aligned}$$

so this is not an NCNE. So we must have \(\ell (I_k^L)\le \ell (I_i)\). For (v), if \(n_i=2\), then (noting that i can be 1 or q since all paired candidates receive the same score by Lemma A.3) \(v_l(x)= I^p \), which, to avoid contradiction, implies \(I^p\ge \ell (I_k^L)\) for all \(k\ne 1\). Similarly for right electorates.

Now sufficiency. We need to check that no candidate can deviate profitably. Consider candidate i at \(x^j\), where \(n_j=2\) (i could be an end candidate). We know that all paired candidates get the same score, \(v_i(x)=I^p\), and that \(v_i(x^{1-},x_{-i})=v_i(x)\), so i would not want to deviate to \(x^{1-}\) or \(x^{q+}\). Also, if \(t\in (x^k,x^{k+1})\) for some \(k<q\), then \(v_i(t,x_{-i})=\ell (I_k^R)\le I^p =v_i(x)\) by condition (v). Candidate i would also not deviate to an occupied position \(x^k\), \(k\ne j\). Doing so would yield a score of \(v_i(x^k,x_{-i})=\frac{2}{3}I^p<v_i(x)\) if \(n_k=2\) or a score of \(v_i(x^k,x_{-i})=\frac{1}{2}\ell (I_k)=\frac{1}{2}(\ell (I_k^L)+\ell (I_k^R))\le I^p=v_i(x)\) if \(n_k=1\), by (v). So no paired candidates would deviate.

Consider an unpaired candidate i at position \(x^j\). Then \(v_i(x)= \ell (I_j) \). Clearly any moves within the interval \((x^{j-1},x^{j+1})\) do not change i’s score. Suppose \(t\in (x^k,x^{k+1})\) for some \(k\notin \{j-1,j,q\}\). Then \(v_i(t,x_{-i})= \ell (I_k^R)\le \ell (I_j)=v_i(x)\), so i will not move to any unoccupied position. Suppose \(n_k=2\) and \(k\notin \{j-1,j+1\}\). Then \(v_i(x^k, x_{-i})=\frac{2}{3} I^p < I^p \le \ell (I_j) =v_i(x)\), by (iv). Suppose \(n_k=1\) and \(k\notin \{j-1,j+1\}\). Then \(v_i(x^k,x_{-i})=\frac{1}{2} (\ell (I_k^L)+\ell (I_k^R)) \le \ell (I_j) =v_i(x)\). So no unpaired candidate wants to deviate to any occupied position that is not adjacent to the candidate’s current position.

Finally, we check that no unpaired candidate would move to an adjacent occupied position. If \(n_{j-1}=2\), \(j-1\ne 1\), then \(v_i(x^{j-1},x_{-i})=\frac{1}{3} (I^p+\ell (I_j)) \le \frac{2}{3} \ell (I_j) <\ell (I_j)=v_i(x)\). If \(j-1=1\) then \(v_i(x^{j-1},x_{-i})=\frac{1}{3}(\ell (I_1^L)+\ell (I_j))-\frac{c}{6}=\frac{1}{3} \big ( I^p+\frac{c}{2}+\ell (I_j)\big )-\frac{c}{6}\le \frac{2}{3}\ell (I_j)<\ell (I_j)=v_i(x)\). If \(n_{j-1}=1\), then \(v_i(x^{j-1},x_{-i}) =\frac{1}{2}(\ell (I_{j-1}^L)+\ell (I_j))\le \ell (I_j)=v_i(x)\). So no unpaired candidate wants to move to the next left occupied position or, by similar arguments, to the next right occupied position. We have checked all possible deviations, so x is an NCNE. \(\square \)

While Theorem 4.3 gives necessary and sufficient conditions for an NCNE, it is not yet clear that these conditions can be satisfied for an arbitrary number m of candidates and any \(c<1\). For \(m=2\) only convergent equilibria may exist. For \(m=3\) and \(c<1\), no equilibria can exist whatsoever by the familiar argument (Eaton and Lipsey 1975) that one of the candidates would have to be alone at an outermost occupied position, and would have an incentive to move inwards. The next result shows that NCNE do indeed exist for any \(m\ge 4\).

Corollary 4.4

For all \(m\ge 4\) NCNE exist for rules \(s=(c,m)\) with \(c<1\), and for \(m\ge 6\) there are infinitely many NCNE for a given rule. Moreover, all NCNE take the same general form as plurality (they have NCNE with the same number of occupied positions, q, with the same number of candidates, \(n_i\), at each one, but perhaps different locations).

Proof

Consider \(m\ge 4\). Suppose candidates are positioned so that all half-electorates have the same length, except for end electorates, and \(n_1=n_q=2\). That is, \(\ell (I_k^L)=\ell (I_j^R)=I^p\) for all \(k\ne 1\), \(j\ne q\). Then, we place \(x^1\) and \(x^q\) so that (iii) is satisfied, from which it follows that \(I^p=\frac{1}{2q}(1-c)\). By construction, then, (iv) and (v) are satisfied, and we have an NCNE.

Next, we show that there are infinitely many NCNE for \(m\ge 6\). If m is even, construct a profile as above, but with \(q=(m+2)/2\) occupied positions, all of them occupied by two candidates except for the two innermost positions, \(x^k\) and \(x^{k+1}\), which are occupied by only one candidate each, and all half-electorates except for the outermost of the same length. This will be an equilibrium by the argument of the previous paragraph, with \(x^1\) and \(x^q\) chosen to satisfy (iii). Let us increase the length of each half-electorate except for \(I_k^R\) and \(I_{k+1}^L\) by \(\epsilon >0\), so that \(I^{p'}=I^p+\epsilon \) (that is, we are moving all positions inwards at the expense of the interior two candidates). This maintains (i) and (iii). Condition (iv) will still be satisfied since the only unpaired candidates are those at \(x^k\) and \(x^{k+1}\), who have full-electorates of length \(\ell (I_k)=I^{p'}+\ell (I_k^R)>\max \{I^{p'}, \ell (I_k^R)\}\). Clearly, (v) will still be satisfied, since we are increasing the length of \(I^p\) and decreasing the length \(I_k^R\) and \(I_{k+1}^L\).

If there is an odd number of candidates \(m\ge 7\) we can do a similar thing. We start with \(q=(m+3)/2\) occupied positions, symmetric about the median, which is occupied by a single candidate. The two innermost occupied positions to the left and right of the median are also occupied by single candidates. Label the occupied position at the median as \(x^k\). All occupied positions other than these three have two candidates apiece, and we place them so that all half-electorates except the outermost are of the same length. Again, choose \(x^1\) and \(x^q\) so that (iii) is satisfied. Now, increase the length of all half-electorates except for \(I_k^L\) and \(I_k^R\) by \(\epsilon >0\). As above, this maintains (i) and (iii). Condition (iv) will clearly still be satisfied for all \(i\ne k\). For \(I_k\), the full electorate is getting smaller, but \(\ell (I_k)=2I^p-J\epsilon >I^p+\epsilon \) for small \(\epsilon \), where J is the number of half intervals to one side of the median that increase in length. Condition (v) will still be satisfied, since the paired candidates’ half-electorates \(I^p\) are increasing in length, while the unpaired candidates’ half-electorates are either increasing at the same rate, or getting smaller in the case of \(I_k^L\) and \(I_k^R\). \(\square \)
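The even-case construction can be made explicit for \(m=6\). The sketch below parametrises the family by `b`, the common length of \(I_k^R\) and \(I_{k+1}^L\) between the two unpaired candidates. The parameter `b`, the function name, and the bound \(0<b\le (1-c)/8\) are derived here from conditions (iii) and (v) of Theorem 4.3 under the stated construction and are assumptions of this illustration, not formulas quoted from the proof.

```python
def six_candidate_family(c, b):
    """Occupied positions of a six-candidate profile from the even-m construction.

    c: weight of a negative vote (0 <= c < 1).
    b: length of the two half-electorates between the unpaired, innermost
       candidates (hypothetical parameter; needs 0 < b <= (1 - c) / 8).
    Returns (positions, counts) with counts = [2, 1, 1, 2].
    """
    if not 0 <= c < 1:
        raise ValueError("requires 0 <= c < 1")
    if not 0 < b <= (1 - c) / 8 + 1e-12:
        raise ValueError("requires 0 < b <= (1 - c) / 8")
    # The issue space splits as 6 * I^p + 2 * b + c = 1.
    i_p = (1 - c - 2 * b) / 6
    x1 = i_p + c / 2          # condition (iii): end half-electorate
    x2 = x1 + 2 * i_p         # two half-electorates of length I^p
    x3 = x2 + 2 * b           # inner gap between the unpaired candidates
    x4 = x3 + 2 * i_p
    return [x1, x2, x3, x4], [2, 1, 1, 2]


if __name__ == "__main__":
    for b in (0.1, 0.07, 0.04):
        print(b, six_candidate_family(0.2, b)[0])
```

Setting `b` to its upper bound recovers the equally spaced profile from the first paragraph of the proof. Each smaller admissible `b` shrinks the inner gap and moves the paired positions inwards, giving a continuum of distinct profiles for the same rule.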

An important consequence of Theorem 4.3 is that plurality rule produces the most dispersed equilibria, while incorporating a negative vote pulls the platforms inward. Essentially, the correct choice of c allows an election designer to pick any level of equilibrium dispersion between that of plurality and full convergence, an important result given the tradeoff between moderation and representation discussed in the Introduction. This is stated in Corollary 4.5.

Corollary 4.5

For a given configuration of candidates, i.e., fixing q and \(n_i\) for \(i\in [q]\), but allowing \(x^i\) to vary, the most extreme equilibria occur under plurality, and increasing c lowers the attainable levels of dispersion.

Proof

Given a number of occupied positions q and the number of candidates \(n_i\) at each of them, maximising dispersion consists, essentially, in minimising the location of \(x^1\). By (iii), then, we want to minimise \(I^p\). Looking at condition (v), we can see that we will want to have \(\ell (I_k^L)=\ell (I_k^R)=I^p\), which will then imply that (iv) is satisfied. This will partition the issue space into 2q intervals \(I^p\) and two intervals of length \(c/2\). Thus, \(2qI^p+c=1\), so that \(I^p=\frac{1}{2q}(1-c)\) and, hence, \(x^1=\frac{1}{2q}(1+c(q-1))\). Increasing c, then, increases \(x^1\), leading to less dispersed equilibria. \(\square \)
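The closed form for \(x^1\) is easy to tabulate. The following sketch evaluates it for a few values of c with \(q=3\); the function name `most_dispersed_x1` is a hypothetical label for the expression derived in the proof.

```python
def most_dispersed_x1(c, q):
    """Leftmost platform of the most dispersed NCNE with q occupied positions."""
    if not 0 <= c < 1 or q < 2:
        raise ValueError("requires 0 <= c < 1 and q >= 2")
    i_p = (1 - c) / (2 * q)      # from 2 * q * I^p + c = 1
    return i_p + c / 2           # equals (1 + c * (q - 1)) / (2 * q)


if __name__ == "__main__":
    for c in (0.0, 0.25, 0.5, 0.75, 0.99):
        print(c, round(most_dispersed_x1(c, q=3), 4))
```

Under plurality (\(c=0\)) the leftmost platform sits at \(\frac{1}{2q}\), and it moves monotonically towards the median as c approaches 1.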

It is also important to consider which candidates receive the most support in equilibrium, as this may determine which platform will be implemented or the distribution of power in parliament, depending on the context. Corollary 4.6 shows that, for any \(m>4\), the candidates that adopt the most extreme platforms never win a strictly larger share of the vote than any other candidate and, thus, can never outperform more centrist candidates. Additionally, there always exist NCNE in which the candidates that adopt the most extreme platforms receive strictly less than some other candidate. In these NCNE, one or more less extreme (and unpaired) candidates have a strictly larger vote share.

Corollary 4.6

In NCNE, all paired candidates receive the same vote share, which may not strictly exceed an unpaired candidate’s share. Thus, candidates at \(x^1\) and \(x^q\) cannot do strictly better than any less extreme candidate. Moreover, for any \(m>4\), there always exists an NCNE in which candidates at \(x^1\) and \(x^q\) receive strictly fewer votes than at least one less extreme (unpaired) candidate.

Proof

The first statement follows from Lemma A.3 and Theorem 4.3(iv). For the second statement, the case \(m = 5\) follows from Corollary 4.7. For \(m\ge 6\), such NCNE were constructed in the second and third paragraphs of the proof of Corollary 4.4. Namely, we position the candidates so that: there is at least one unpaired candidate; \(n_1=n_q=2\); \(x^1\) and \(x^q\) are positioned so that Theorem 4.3(iii) is satisfied; and all half-electorates are the same length, except for end electorates. In this NCNE, an unpaired candidate obtains twice the score of a paired candidate, since they have the same full-electorate, but the unpaired candidate does not have to share it. \(\square \)

In the case of four or five candidates, there is a unique NCNE.

Corollary 4.7

If \(m=4\), then there is a unique NCNE, given by profile \(x=((x^1,2),(1-x^1,2))\), where \(x^1=\frac{1}{4}(1+c).\) If \(m=5\), then there is a unique NCNE, given by profile \(x=((x^1,2), (1/2,1),(1-x^1,2))\), where \(x^1=\frac{1}{6}(1+2c).\)

In the four- and five-candidate cases, as expected from Corollary 4.5, the amount of dispersion observed in the candidates’ positions depends on c and is maximal when the rule is plurality. As c grows towards 1, the positions converge at the median voter position. As c increases beyond 1, by Proposition 4.1, we know that infinitely many CNE are possible in an interval that becomes increasingly wide. Hence, there is a bifurcation point at \(c=1\) that divides CNE from NCNE. Moving away from this point, more extreme positions are possible: on one side they take the form of CNE, and on the other side they are NCNE.

The six-candidate case admits infinitely many equilibria for a given rule, but the pattern is similar.

Example. With six candidates, there are two possible configurations in an NCNE: we can have three occupied positions with two candidates apiece; or, we can have four occupied positions where the inner two positions are occupied by single candidates. Consider the latter profile first—the former will turn out to be a limiting case of the latter. If (i) and (iii) of Theorem 4.3 are satisfied, condition (iv) will always be true, since the unpaired candidate at \(x^2\) has full-electorate length \(\ell (I_2)=I^p+\ell (I_2^R)\), which is clearly larger than all other half-electorates excluding end electorates, which are of length either \(I^p\) or \(\ell (I_2^R)=\ell (I_3^L)\). Thus, the only restriction is condition (v).

To get a maximally dispersed equilibrium, we want \(I^p\) to be as small as possible, which means setting \(I^p=\ell (I_2^R)=\ell (I_3^L)\). All eight half-electorates then have length \(I^p=\frac{1}{8}(1-c)\), which gives equilibrium profile \(x=((x^1,2),(x^2,1), (1-x^2,1),(1-x^1,2))\) where \(x^1=\frac{1}{8}(1+3c)\) and \(x^2=x^1+2I^p=\frac{1}{8}(3+c)\). A number of these maximally dispersed equilibria are pictured in Fig. 1 for a few different values of c.

Fig. 1 Maximally dispersed NCNE for different choices of c

To obtain a minimally dispersed equilibrium, we want \(I^p\) to be as large as possible. Condition (iv) will always be satisfied, while condition (v) will still be satisfied if the lengths of the half-electorates \(I_2^R\) and \(I_3^L\) go to zero. In the limit, the interior two candidates converge at the median and we are left with minimally dispersed equilibrium profile \(x=((x^1, 2), (1/2,2), (1-x^1,2))\) where \(x^1=\frac{1}{6}(1+2c)\). Thus, the unique equilibrium with three occupied positions is the limiting case of the equilibria with four occupied positions. These equilibria are depicted in Fig. 2.

Fig. 2 Minimally dispersed NCNE for different choices of c
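The six-candidate family in the example can be traced numerically. The sketch below is an illustration under stated assumptions, not part of the formal analysis: the function name `six_candidate_profile` and the parameter `h`, standing for the common length \(\ell (I_2^R)=\ell (I_3^L)\), are hypothetical. It assumes \(0\le c<1\), takes \(x^1=c/2+I^p\), and reads condition (v) as requiring \(h\le I^p\), as suggested by the discussion above.

```python
from fractions import Fraction


def six_candidate_profile(c, h):
    """Positions (x1, x2, x3, x4) with two candidates at x1 and x4
    and a single candidate at each of x2 and x3.

    c : weight of a negative vote (assumed 0 <= c < 1)
    h : length of the inner half-electorates I_2^R = I_3^L
    """
    c, h = Fraction(c), Fraction(h)
    if not (0 <= c < 1):
        raise ValueError("sketch assumes 0 <= c < 1")
    # [0, 1] splits into two end pieces of length c/2, six intervals
    # of length I_p and two of length h: c + 6*I_p + 2*h = 1.
    i_p = (1 - c - 2 * h) / 6
    if h < 0 or h > i_p:
        raise ValueError("sketch assumes 0 <= h <= I_p")
    x1 = c / 2 + i_p
    x2 = x1 + 2 * i_p
    x3 = x2 + 2 * h
    x4 = x3 + 2 * i_p
    return x1, x2, x3, x4


c = Fraction(1, 2)
# Maximal dispersion: h = I_p, i.e. h = (1 - c) / 8.
print(six_candidate_profile(c, (1 - c) / 8))
# Limiting case h = 0: the inner candidates meet at the median.
print(six_candidate_profile(c, 0))
```

Varying `h` between 0 and \(\frac{1}{8}(1-c)\) moves \(x^1\) between \(\frac{1}{6}(1+2c)\) and \(\frac{1}{8}(1+3c)\), the two extremes described in the example.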

5 Discussion and extensions

Here we discuss a number of the assumptions underlying our model and the extent to which the results extend to more general settings.

5.1 Uniform distribution

Eaton and Lipsey (1975) showed that when the assumption of a uniform distribution is relaxed, equilibria seldom exist under plurality. Cox (1990) conjectures that the same is true for scoring rules, and Osborne (1993) elaborates on and extends the generality of the arguments (see also Osborne 1995; Bol et al. 2016, and Xefteris 2016, for discussions pertaining to a range of settings, including ours). Our results are subject to the same critique, and small deviations from uniformity would normally lead to nonexistence of NCNE.

While the assumption of uniformity may appear quite restrictive, it has been widely used in the literature, and there are a number of justifications aside from its simplicity. First, it has been noted (Aragonès and Xefteris 2012; Cahan and Slinko 2017) that the distribution of voter ideal points does not literally need to be uniform—all we need is that the candidates believe the distribution to be uniform or assume it as a simplifying assumption in their calculations, which is already substantially more realistic.

Second, our results also extend to distributions that may be arbitrarily non-uniform in the tails. That is, if the profile \(x=((x^1,n_1),\ldots ,(x^q,n_q))\) is an NCNE for a uniform distribution, it will still be an NCNE if we distort the shape of the distribution outside of the interval \((x^1,x^q)\), while keeping the mass in each tail the same.Footnote 13 This observation is implicit in the result of Eaton and Lipsey (1975) and helps somewhat to alleviate concerns about the implausible step-function nature of the uniform distribution. Perhaps more importantly, combining this argument with our results for best-worst rules leads to a stronger converse of sorts—for any distribution that is uniform on some open interval containing the median voter’s ideal point, NCNE exist for \(c<1\), when c is close enough to 1. This is because, as c tends towards 1, the amount of possible dispersion is reduced to the point of full convergence at the median position—at some point all the candidates’ adopted positions will lie within this uniform part of the distribution. It is not at all unlikely that the candidates assume the central part of the distribution to be uniform. Indeed, any smooth distribution will approximate a uniform distribution when we zoom in enough and, in many cases, this central part of the distribution could realistically be quite large.

5.2 Candidate objectives

We focus on candidates who aim to maximise their vote share. This assumption is natural in a proportional representation setting, where seats are assigned to parties according to the share of the vote obtained. It is perhaps less natural in settings where the winner takes all and the losers end up with nothing.Footnote 14

When candidates seek only to win (i.e., they are indifferent between any outcomes that give them the same probability of being ranked first among the candidates), the results differ substantially from the well-known results for vote maximisation (Eaton and Lipsey 1975; Denzau et al. 1985; Cox 1987). In particular, Chisik and Lemke (2006) show that, with plurality and three candidates, NCNE exist (they do not exist for vote maximisers) in which one candidate wins outright, and two candidates tie for second (acting as “spoilers” for each other). In Proposition A.5 in the Appendix, we extend this result to the case of best-worst rules. We find that NCNE for best-worst rules take a similar form to NCNE under plurality. In addition, and highly reminiscent of the results under vote maximisation, the importance of the negative vote should not exceed that of a positive vote and, moreover, the negative vote acts as a moderating force on possible equilibrium positions.

Cox (1987) studies a few more plausible objectives. Under plurality maximisation, candidates seek to maximise their margin with respect to the best of their competitors: \(v_i(x)-\max _{j\ne i}v_j(x)\). Complete plurality maximisation is similar but candidates care about their margins with respect to all other candidates in the race. Cox’s characterisation of CNE easily extends to plurality and complete plurality maximising candidates. With CNE, there is only one occupied position and the calculation is straightforward. With NCNE, considering margins rather than vote shares increases the complexity of the calculations significantly, and we do not know whether our results generalise. We leave this as an open question for future research.
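To make the difference between these objectives concrete, the following Python sketch computes a candidate's margin over the best competitor, \(v_i(x)-\max _{j\ne i}v_j(x)\), from a list of scores. The function names are hypothetical. Since the text does not specify how the margins against all other candidates are aggregated under complete plurality maximisation, the second function simply returns the list of pairwise margins and should be read as an assumption.

```python
def plurality_margin(scores, i):
    """Margin of candidate i over the best of the competitors."""
    rivals = [s for j, s in enumerate(scores) if j != i]
    return scores[i] - max(rivals)


def all_margins(scores, i):
    """Pairwise margins of candidate i against every other candidate.

    How these are combined into a single payoff is left unspecified here.
    """
    return [scores[i] - s for j, s in enumerate(scores) if j != i]


# Hypothetical scores for three candidates (in percentage points).
scores = [40, 35, 25]
for i in range(len(scores)):
    print(i, plurality_margin(scores, i), all_margins(scores, i))
```

A vote maximiser cares only about `scores[i]`, whereas under either margin objective a move that leaves the candidate's own score unchanged but lowers a rival's score can still be profitable.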

5.3 Multiple positive and negative votes

We have considered rules in which voters are endowed with a single negative vote and a single positive vote. A natural generalisation would be a case in which voters have \(d_1\) positive votes and \(d_2\) negative votes. As before, a positive vote earns a candidate 1 point while a negative vote is worth \(-c\) points. A rule of this kind can be described by a four-tuple \(s=(c, d_1,d_2,m)\).Footnote 15 We refer to this class of rules as generalised best-worst voting rules as opposed to the standard best-worst rules where \(d_1=d_2=1\).
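The scoring under a rule \(s=(c,d_1,d_2,m)\) can be sketched as follows. This is a minimal illustration with hypothetical names. It assumes that each voter submits a strict ranking of the m candidates, gives the positive votes to the top \(d_1\) candidates and the negative votes to the bottom \(d_2\), and that \(d_1+d_2\le m\). Ties between candidates at the same position, which matter in the spatial model, are not handled.

```python
from fractions import Fraction


def generalised_best_worst_scores(rankings, c, d1=1, d2=1):
    """Total scores under a generalised best-worst rule.

    rankings : list of ballots, each a list of candidates from best to worst
    c        : weight of a negative vote
    d1, d2   : number of positive and negative votes per voter
    """
    c = Fraction(c)
    m = len(rankings[0])
    if d1 < 0 or d2 < 0 or d1 + d2 > m:
        raise ValueError("sketch assumes d1, d2 >= 0 and d1 + d2 <= m")
    scores = {cand: Fraction(0) for cand in rankings[0]}
    for ballot in rankings:
        for cand in ballot[:d1]:        # top d1 candidates earn 1 point each
            scores[cand] += 1
        for cand in ballot[m - d2:]:    # bottom d2 candidates lose c points each
            scores[cand] -= c
    return scores


# Three hypothetical ballots over candidates A, B, C.
ballots = [["A", "B", "C"], ["A", "C", "B"], ["B", "A", "C"]]
print(generalised_best_worst_scores(ballots, Fraction(1, 2)))
print(generalised_best_worst_scores(ballots, 0))  # c = 0 reduces to plurality
```

With \(d_1=d_2=1\) the function reproduces the standard best-worst score, and setting \(c=0\) recovers plurality counts.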

As the number of candidates grows, the combinatorics of generalised best-worst rules quickly become daunting, as illustrated by the six-candidate example below. A complete characterisation of NCNE is not straightforward, though we are able to make some progress for the special cases of four, five and six candidates. In the first two cases, only standard best-worst rules allow NCNE.

Proposition 5.1

For \(m=4\) or \(m=5\), the rule \(s=(c,d_1,d_2,m)\) allows NCNE only if \(d_1=d_2=1\), in which case NCNE are described by Corollary 4.7.

Proof

This can be verified through straightforward but tedious calculations. \(\square \)

For six candidates, on the other hand, NCNE do exist more broadly. We investigate NCNE of the form \(x=((x^1,3),(1-x^1,3))\) in Example A.6 in the Appendix. We find that rich equilibrium behaviour may be observed for generalised best-worst voting rules. In many cases, properties reminiscent of the standard case carry through—the weight placed on negative votes is bounded above and increasing c reduces the amount of dispersion that may be observed in NCNE. This behaviour, however, is no longer the only show in town, and in one case we see quite the opposite properties. Investigating generalised best-worst rules further would take us outside the scope of this paper, but would be a fruitful avenue for future research.

6 Conclusion

Different voting systems provide political candidates with different incentives and, hence, lead to different outcomes, not all of which are socially desirable. One would usually want a voting system in which adopting extremist positions is not encouraged while, at the same time, voters are presented with some choice over the policies advocated by the candidates. One might also prefer that the candidates that choose the most extreme positions do not win the greatest electoral support. We have shown that the class of best-worst rules offers a solid middle ground when voters have one positive vote and one negative vote of relatively less importance, i.e., so that one negative vote does not cancel out one positive vote. In particular, nonconvergent equilibria exist, and candidates adopt different platforms much as they do under plurality. Importantly, however, the strong best-rewarding incentives of plurality are tempered by the need to avoid negative votes and, indeed, any degree of dispersion between the extreme cases of plurality and full convergence of anti-plurality can be obtained for the correct relative importance of the negative vote. The need to avoid negative votes leads candidates to moderate their platforms, but without sacrificing diversity entirely. Moreover, when there are at least five candidates, there always exist equilibria in which the most extreme candidates do not receive the most support.

Though natural, best-worst rules have not been used in practice, as is the case for many of the voting rules studied in the social choice literature. However, our results provide evidence that this system is worthy of consideration and presents several desirable properties.

Future research should investigate the properties of best-worst voting rules in more realistic spatial models with, for example, strategic or probabilistic voting, or endogenous candidacy. It would also be useful to comprehensively study generalised best-worst voting rules, as well as alternative candidate objectives.