
1 The Issue

This article may be seen as a small contribution to a larger project, an attempt to link the theory of social choice to more traditional normative political philosophy (Footnote 1). Although the subject of this chapter is applied social choice, I am writing as a philosopher: I try to reflect on some well-established results rather than to prove new ones. More specifically, I shall discuss the consequences of the common idea that our decision-making methods should take the intensities of preference into account. In the theory of social choice, the possibility of making interpersonal intensity comparisons is often seen as the way out of Arrow’s problem. In ethics, the relevance of such comparisons has been defended in terms of utilitarianism as well as in terms of fairness. Finally, in political science it is connected to the discussion of various decision-making mechanisms. These discussions overlap, but there are very few attempts to bring them together in a systematic way.

The theory of social choice can be applied to different contexts. Here are some examples: evaluation in ethics and welfare economics, voting in democratic bodies and in elections, decision-making in courts and panels of experts, multiple-criteria decision-making in planning, engineering, and quality assessment, choosing the winners in various contests of skill, aggregating information in opinion measurements and marketing research. All processes, rules and theories that are purported to select an alternative or several alternatives or to produce an overall ranking by using individual rankings as the starting-point may be interpreted in terms of the theory of social choice. In this article, I focus on the context of democratic decision-making, taking the fundamental democratic values (voter equality, voters’ effective influence or ‘popular sovereignty’, and freedom of choice) for granted (Footnote 2). The critical question is, then, whether it is possible to find an institutional method which would deliver the required information about voters’ preference intensities while satisfying the requirements of democracy. Numerous theoretical proposals have been made (see, for example, Hillinger 2004, 2005), but the only method which has actually been used in political contexts is the Borda rule (or Borda count). I discuss the two most sophisticated defences of that method, those presented by Michael Dummett and by Donald Saari. While the arguments put forth by Dummett and especially by Saari are theoretically convincing, I shall argue that matters tend to be more complex when Borda-like systems are actually applied in democratic decision-making. I try to show that the arguments for the Borda rule are partly dependent on the view that voting rules are means to acquire information about voters’ preferences. Voting rules, however, do not aggregate preferences.
They aggregate votes which are more or less truthful expressions of voters’ preferences between those alternatives which happen to be on the agenda.

One of the important aspects of the formal theory of social choice is that it can be applied to different contexts. However, the fundamental problem often neglected by the theorists of social choice is that while the formal apparatus may be applied to all kinds of aggregation processes, different considerations may be relevant in different contexts. In political contexts, there are two aspects which are not equally relevant in some other cases: the requirements of democracy, and the interaction between the choice of an aggregation method (voting rule) and the input of aggregation (votes cast). Moreover, contrary to many theorists, I do not think that the interaction problem is solved by supposing that all voters are fully rational, always having complete preference rankings and acting in the strategically optimal way. Instead of seeing voting as one process of information-aggregation among others, we should, perhaps, see it primarily as an exercise of power. This power should, like all power, be constrained by normative rules. The theory of social choice is able to capture some, but only some, part of the normative aspect of voting. While the arguments for the use of Borda-like rules may be convincing, for example, in multi-criteria decision-making, they need not be equally convincing in voting contexts.

2 Arguments for and Against Intensity Comparisons

If we could compare different decision alternatives in terms of the intensity by which they were supported or opposed, our collective decisions would not need to be based solely on ordinal rankings. There are at least three possible reasons for regarding intensity comparisons as relevant. First, their relevance follows from the general utilitarian programme. Second, most notions of fairness presuppose some forms of interpersonal comparisons at some level. In democratic theory, the problem of “intense minorities” is usually seen as a problem of fairness, not of maximization (for overviews of the problem, see Dahl 1956, pp. 48–50, 90–102; Kendall and Carey 1968; Jones 1988; Karvonen 2004). Third, intensity comparisons seem to provide an escape-route from Kenneth Arrow’s famous impossibility result. One possible way of interpreting Arrow’s Theorem is that an interest-based political theory like utilitarianism cannot be based on ordinal comparisons. In order to define the common good or general interest, we need some additional information. If we are utilitarians, we either have to reject the whole idea that decisions should be based on individual preferences, or we have to endorse full-blown utilitarianism with an interpersonally applicable measure of intensities (Ng 1979). We should be able to say, in a truly utilitarian fashion, that an alternative is so-and-so many units better than another alternative when measured on some absolute scale. The question is how to get reliable information about these differences.

Conversely, it is possible to distinguish at least three reasons for rejecting interpersonal intensity comparisons in voting contexts. First, some theorists, following the famous critique made by the economist Lionel Robbins in 1932, regard such comparisons as conceptually meaningless. Even if voters were allowed to express their preferences in cardinal terms, the numbers would not measure intensities. Intensities are not observable; judgments about intensities are necessarily based on value judgments, while judgments about ordinal preferences could be based on people’s actual choices. Of course, this argument excludes even weaker forms of comparability (Sen 1982, pp. 264–281). In democratic theory, this position is adopted by Tännsjö (1992, pp. 31–2) and by Riker and Ordeshook (1973, p. 112). It also seems to be Arrow’s own position. For this reason, he has been labelled a “positivist” by some authors (Harsanyi 1979, p. 302). Lehtinen (2007) remarks that, in voting contexts, when there are more than two alternatives ordinal preferences are no more observable than cardinal preferences. If choices have strategic aspects, even ordinal preferences cannot be inferred directly from voters’ observable choices. A strictly verificationist criterion of meaning may rule out even judgments about ordinal preferences as “meaningless”. And, as a general philosophical programme, verificationism seems to be out of business in any case.

Second, some others see interpersonal comparisons as ethically irrelevant even if they were available. According to Schwartz,

There are worthier and more likely purposes served by instituting collective-choice processes than satisfying participants’ preferences to the greatest possible degree: such purposes are to distribute power widely, minimizing the abuse of power, to broaden the pool of ideas by which choices are informed, to enhance people’s sense of participation in institutions, and to institutionalize orderly shifting of power. To favor people with intense preferences is to favor people who are bigoted, greedy, meddlesome, etc. (Schwartz 1986, pp. 30–1; cf. also Rawls 1971, pp. 30–1, 361; Saward 1998, p. 78).

The validity of the utilitarian principle—“satisfying preferences to the greatest possible degree”—is disputable. But, as we saw, the intensity comparisons are not only in the interest of the maximizing utilitarian. For example, most principles of fairness presuppose interpersonally applicable measures of satisfaction which go beyond ordinal comparisons.

Third, some theorists think that interpersonal intensity comparisons are useless in democratic theory, as—although they may be conceptually meaningful—there is no effective and ethically acceptable way to make the comparisons needed in collective decisions. If the first and the second criticisms could be ignored, a utilitarian theory of social good or welfare would, in principle, make sense (for a defence of an essentially Benthamite system, see Ng 1979; for a sophisticated Millian alternative, see Riley 1988). But the problem of creating an institutional system to collect the necessary information would remain. What is needed is an institutional method of making the required intensity comparisons en masse—it would not be helpful if such comparisons could be made, say, in laboratory conditions or by using “extended sympathy” in personal interaction (MacKay 1980, pp. 73–6). Moreover, even if there were an effective way of making interpersonal intensity comparisons, any such method would necessarily be undemocratic. The most plausible conception of democracy contains at least the following normative components: the voters’ voting power is (roughly) equal; their choices determine (directly or indirectly) the outcomes; and the choices are free, not coerced or manipulated. We may have different ways of arguing that a million spent on the health care of poor children is, in terms of justice or human welfare or happiness, better used than a million spent on tax cuts for wealthy people. Public organizations, such as welfare agencies, do make such comparisons, and in making them, they may use scientific information as well as everyday knowledge, empathy and imagination. But the information they use is not inferred from valuations consciously given by citizens, nor are they aggregated by using a method that would ensure procedural equality between the respondents.

Roughly, many normative theorists of democracy see the intensity problem as irrelevant for the second reason, and many empirically oriented political scientists see it as relevant but irresolvable for the third reason, while many theorists of public choice and of social choice see the problem both as relevant and solvable. The obvious response to the third critique would be to construct a democratic method which could make systematic intensity comparisons possible. The rest of this chapter is mainly about the most popular proposal.

3 The Borda Rule and Intense Minorities

When a choice is made between two alternatives, majority rule satisfies Arrow’s independence condition. Moreover, as Kenneth May has shown in his classical article, majority rule is the only rule which satisfies the further conditions of decisiveness (Arrow’s “universal domain”), anonymity (which implies Arrow’s “non-dictatorship”), and strong responsiveness (which implies the Pareto condition). However, this positive result cannot be extended to cases with more than two alternatives. If there are more than two alternatives and none of them is the most-preferred alternative for more than a half of the voters, there are several options. We may either drop the “more than half” requirement and be satisfied with mere plurality, or drop “the most-preferred” requirement and try to reduce the choice to a series of pair-wise majority comparisons. The latter is the basis of the well-known Condorcet criterion. For many theorists, the Condorcet criterion is the most plausible extension of the majority principle in voting contexts, or even the only criterion compatible with democracy. Iain McLean makes the argument explicit:

What is so special about a Condorcet winner? Let us go two steps backwards. What is democracy? Majority rule. Majority rule is necessary, though doubtless not sufficient, to any definition of democracy. What is majority rule? The rule that the vote of each voter counts for one and only one; and that the option which wins a majority is chosen and acted on. Indeed, the second requirement is little more than a special case of the first. For if an option which is not a majority winner is chosen, then the votes of those who supported it turn out to have counted for more than the votes of those who would have supported the majority winner. And that is exactly what happens when a Condorcet winner exists but is not chosen (McLean 1991, p. 177).

Although Condorcet-effective rules do not satisfy Arrow’s independence condition, they satisfy it more often than other weakly neutral and anonymous rules, for they are bound to violate it only in the cyclical cases. This follows from their basic logic: they reduce complex choices to a series of pair-wise majority choices. Indeed, Michael Dummett (1984)—who does not himself unqualifiedly support the Condorcet criterion—thinks that anyone who sincerely adheres to the absolute-majority principle in dichotomous choice-situations must also adhere to Condorcet’s principle when there are more alternatives than two. What really matters for a majoritarian is the number of people satisfied with the result, not the relative degrees of satisfaction.

According to Dummett, however, the number of satisfied voters cannot be relevant as such. Ultimately, even the majority principle derives its normative force from “total satisfactions”. As he says:

The question turns on whether it be thought more important to please as many people as possible or to please everyone collectively as much as possible. The latter is surely more reasonable. The rule to do as the majority wishes does not appear to have any better justification as a rough-and-ready test for what will secure the maximum total satisfaction: to accord it greater importance is to fall victim to the mystique of the majority (Dummett 1984, p. 142).

In this interpretation, all voting rules are imperfect measures of the maximum total satisfaction. However, majority rule is not a particularly good measure of total satisfaction unless we have reasons to believe that the intensities are equal (cf. Riley 1990). To make the matter more clear, let us consider the following case:

Example 1

51 voters   49 voters
a           b
b           c
d           d
c           a

In the example a is the majority winner, and therefore a Condorcet winner too. One might, however, argue that there would be a good case for selecting b instead of a. Although a slight majority favours a, for a large minority a is the worst alternative, while b does not offend anyone. It is possible that, by selecting b instead of a, we may increase the “total satisfaction”. Various point-counting rules, of which the Borda count is the best known, would select b. If the voters are allowed to give three points for their favourite, two points for their second choice, etc., b would receive a total of 249 points against a’s 153 points. In the example, b is the Borda winner. The Borda count seems to be the most promising way to institutionalize intensity comparisons in voting contexts. It is the rule which has enjoyed the continuous support of specialists since Nicolaus Cusanus (Footnote 3), and one which has also been applied in practice. Plurality, Condorcet, and Borda are commonly conceived as being the three main competing criteria for democratic decision-rules (see, for example, Budge 2000). It may be argued that all the other electoral principles are either imperfect substitutes for, or compromises between, these three principles.
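The point totals quoted for Example 1 can be checked with a few lines of Python. This is only a minimal sketch: the function names and the data layout (a list of voter counts paired with rankings) are illustrative assumptions, not part of any established library.

```python
from collections import defaultdict

# Preference profile of Example 1: (number of voters, ranking from best to worst).
EXAMPLE_1 = [
    (51, ["a", "b", "d", "c"]),
    (49, ["b", "c", "d", "a"]),
]


def borda_scores(profile):
    """With m alternatives, a voter's top choice gets m - 1 points,
    the next one m - 2, and so on down to 0 for the last."""
    scores = defaultdict(int)
    for count, ranking in profile:
        m = len(ranking)
        for position, alternative in enumerate(ranking):
            scores[alternative] += count * (m - 1 - position)
    return dict(scores)


def plurality_scores(profile):
    """Count first preferences only."""
    scores = defaultdict(int)
    for count, ranking in profile:
        scores[ranking[0]] += count
    return dict(scores)


borda = borda_scores(EXAMPLE_1)
plurality = plurality_scores(EXAMPLE_1)

print("Borda scores:", borda)
print("Borda winner:", max(borda, key=borda.get))
print("Plurality (and majority) winner:", max(plurality, key=plurality.get))
```

Under these assumptions the sketch gives b 249 points and a 153 points, while a has 51 of the 100 first preferences.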

Example 1 also shows how intensity considerations may be justified in terms of fairness (rather than in terms of “total satisfaction”). Suppose that we want to avoid “majority tyranny” (or, less dramatically, the problem of “permanent minorities”; cf. Jones 1988; Karvonen 2004) by giving the minorities some real power over the outcomes. If any minority smaller than a half of voters had the power to determine some outcomes, the system would be indecisive, for obviously there could be more than one minority making the claim at the same time. If only some nameable minorities had the power, the resulting quasi-corporativist rule would violate anonymity. Finally, a general minority-veto would favour conservative minorities. In contrast, an intensity-measuring rule like the Borda count would give more power to the minorities without violating the requirement of voter equality. For example, with four alternatives, the Borda count guarantees that a majority cannot dictate the outcome in all possible choice-situations, unless it is larger than three-fourths (Nurmi 2007, pp. 116–7). A comparison with approval voting, which is sometimes considered a “utilitarian” rule (see Hillinger 2005), is illustrative. Approval voting allows a narrow majority to guarantee the selection of its favoured outcome under sincere or coordinated strategic voting (Baharad and Nitzan 2005). Consider Example 1 again. If the 51 voters strategically approve only the alternative a, it is selected in spite of the strong and intense opposition. This problem can be mitigated by requiring that the voters should vote (at least) for two alternatives. But this solution would make the rule less sensitive to intensity considerations. Even voters who sincerely reject all but one alternative would be forced to give an equally weighty vote for some of the rejected alternatives.

Because the Borda count possesses the relatively rare strong responsiveness property, it guarantees that all changes in voters’ preference orderings are reflected by the final choice. For this reason, the results of the Borda count actually agree with the Condorcet-criterion more often than the results produced by other positional or semi-positional rules in general use. Thus, the notion of a Borda winner may look like an attractive alternative to the Condorcet criterion. It partly agrees with our majoritarian intuitions while leaving some room for other considerations.

However, these results do not show that the Borda rule actually provides a practicable way to measure intensity differences. In his book Voting Procedures (1984), Michael Dummett recognizes that many arguments for and against various voting rules are based on suppositions about the typical preference structures. He criticizes the plurality criterion because it looks only at the first preferences. As he remarks, one ground upon which it can be defended is the supposition

that the gap in any voter’s preference scale between any outcome other than his first choice and the next outcome on his scale is not merely small, but infinitesimal, in comparison with the gap between his first choice and his second (p. 132).

This supposition concerns intensity differences, and although it may hold in some cases, it is just one possibility among many. To see Dummett’s point, consider a case in which the plurality rule is used to produce a full ranking rather than just choosing the best alternative:

Example 2

99 voters   1 voter
a           c
b           b
c           a

According to the plurality criterion, c is the second-best alternative, for c, unlike b, appears as the first in the preferences of at least one voter. Nevertheless, all voters except one rank it lower than b; the plurality ranking looks acceptable only if the voters put no weight on their lower preferences. “Certain gaps”, says Dummett, “between consecutive outcomes on an individual voter’s preference scale may be small, others large; but there can be no general rule for determining which”. This is plausible; there seems to be no universal reason why voters themselves would put all the weight on their first preferences. In some cases the distance between the best and the second best may be negligible. Dummett’s general conclusion, however, is less plausible: “the only general rule we can reasonably adopt is that all the gaps are not merely comparable, but equal” (p. 133). This sounds like an application of the Principle of Insufficient Reason. Dummett’s argument seems to be this: if we do not know what the actual differences are, we have to treat them as equal. But the principle itself is a problematic one. Consider the following possibility: The 51 voters in Example 1 above are actually almost indifferent as between alternatives b, d and c, but they all agree that these alternatives are much worse than a. To make the case more dramatic, let us suppose that the consequences of all the other alternatives than a would be perceived as catastrophic by the 51 voters. The 49 voters who favour b have no intense preferences over the issue. They could almost as well accept some other result. The measured ‘intensities’ are, in this case, products of the instrument of measurement; the plurality rule would measure them more accurately. The problem is that all such general suppositions, including Dummett’s equal distance supposition, are necessarily ad hoc. 
According to one early proposal, voters might give one vote for their favorite and a half vote for the second-best (Dabagh 1934). As a sort of compromise between the plurality and the Borda rule, Dummett (1997, pp. 167–73) recommends a modified Borda rule which awards six points to a party standing highest in a voter’s ranking, two points to the second highest preference, and one to the third. However, there are infinitely many ways to assign the weights. Without a general argument, the problem of social preferences has not been solved but only thrust back onto the choice of weights (Feldman 1980, p. 194). Indeed, Sugden (1981, p. 143) admits that his intensity-based argument does not pick Borda as the uniquely best “neo-utilitarian” rule.
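The dependence of the outcome on the chosen weights can be illustrated by applying several positional weight vectors to one and the same profile. The following Python sketch uses the profile of Example 1; applying Dabagh’s and Dummett’s weights to that particular profile, and padding each vector with zeros for the lower ranks, are assumptions made purely for illustration.

```python
# Example 1 profile: (number of voters, ranking from best to worst).
PROFILE = [
    (51, ["a", "b", "d", "c"]),
    (49, ["b", "c", "d", "a"]),
]

# Points awarded by rank position. The rules are those named in the text;
# padding them with zeros to four positions is an illustrative assumption.
WEIGHTS = {
    "plurality": [1, 0, 0, 0],
    "Dabagh 1934": [1, 0.5, 0, 0],
    "Dummett 1997": [6, 2, 1, 0],
    "Borda": [3, 2, 1, 0],
}


def positional_scores(profile, weights):
    """Sum, for each alternative, the weight of every position it occupies."""
    scores = {}
    for count, ranking in profile:
        for position, alternative in enumerate(ranking):
            scores[alternative] = scores.get(alternative, 0) + count * weights[position]
    return scores


def winner(scores):
    """Alternative with the highest score (ties are not handled in this sketch)."""
    return max(scores, key=scores.get)


winners = {
    name: winner(positional_scores(PROFILE, weights))
    for name, weights in WEIGHTS.items()
}

for name, alternative in winners.items():
    print(f"{name}: {alternative}")
```

On this profile only the plurality weights select a; the other three vectors select b. Nothing in the procedure itself tells us which vector tracks the voters’ actual intensities.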

4 Saari’s Argument and the Interaction Problem

In spite of the problem presented above, many defenders of Borda, including Dummett (1984), Saari (1995) and Sugden (1981, p. 144), see it essentially as an imperfect but practicable intensity-measuring device. Saari has, however, provided an extremely interesting argument which is, as such, independent of the intensity considerations. Here, I try to present a short sketch of the basic argument. Consider, first, the following situation:

Example 3

5 voters   3 voters
a          b
b          a
c          c

Here, a is both the Borda and the Condorcet winner. Now, let us add nine new voters whose preferences exhibit the familiar Condorcet paradox:

Example 4

5 voters   3 voters   3 voters   3 voters   3 voters
a          b          b          a          c
b          a          a          c          b
c          c          c          b          a

According to Saari, these nine additional voters are tied; hence their votes should not be able to change the initial outcome. (Analogously, if we add three voters who prefer a to b and three with the opposite preference, this group of six is tied, and should not change the outcome!) However, in Example 4, b becomes the Condorcet winner. Alternative b beats alternative a 9–8 and alternative c 11–6. In contrast, a remains the Borda winner in both examples, even after the invasion of the nine new voters. Their votes (three first places, three second places and three third places for each alternative) cancel each other out. According to Saari, this phenomenon accounts for the whole Arrowian indeterminacy problem.
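The contrast between Examples 3 and 4 can be reproduced with a short Python sketch. The helper names and the encoding of rankings as strings are illustrative assumptions; ties for the Borda winner are not handled, since none occur in these two profiles.

```python
# Profiles as (number of voters, ranking from best to worst).
EXAMPLE_3 = [(5, "abc"), (3, "bac")]
# The nine additional voters, forming a Condorcet cycle.
CYCLE = [(3, "bac"), (3, "acb"), (3, "cba")]
EXAMPLE_4 = EXAMPLE_3 + CYCLE


def pairwise_votes(profile, x, y):
    """Number of voters who rank x above y."""
    return sum(n for n, ranking in profile if ranking.index(x) < ranking.index(y))


def condorcet_winner(profile):
    """Alternative that beats every other in pairwise majority contests, or None."""
    alternatives = sorted(profile[0][1])
    for x in alternatives:
        if all(
            pairwise_votes(profile, x, y) > pairwise_votes(profile, y, x)
            for y in alternatives
            if y != x
        ):
            return x
    return None


def borda_scores(profile):
    """Top position earns m - 1 points, last position earns 0."""
    scores = {}
    for n, ranking in profile:
        top = len(ranking) - 1
        for position, alternative in enumerate(ranking):
            scores[alternative] = scores.get(alternative, 0) + n * (top - position)
    return scores


def borda_winner(profile):
    scores = borda_scores(profile)
    return max(scores, key=scores.get)


for name, profile in [("Example 3", EXAMPLE_3), ("Example 4", EXAMPLE_4)]:
    print(name, "Condorcet:", condorcet_winner(profile), "Borda:", borda_winner(profile))

print("Borda scores of the nine added voters alone:", borda_scores(CYCLE))
```

Taken alone, the nine added voters give every alternative the same Borda score, so adding them shifts all totals equally and leaves the Borda ranking intact. Their pairwise margins, by contrast, are not zero, which is why the Condorcet winner changes.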

Saari formulates two symmetry requirements:

The Neutral Reversal Requirement: When two rankings reverse one another, say a > b > c and c > b > a, they are tied and do not change the outcome.

The Neutral Condorcet Requirement: When n rankings over n alternatives form a complete cycle, say a > b > c, b > c > a and c > a > b, they are tied and do not change the outcome.

Majority rule respects the Neutral Reversal Requirement but not the Neutral Condorcet Requirement. In contrast, all positional rules (including the plurality and the Borda rules) respect the Neutral Condorcet Requirement, but only the Borda rule also respects the Neutral Reversal Requirement. Thus, the Borda rule is the best voting rule. According to Saari, this conclusion can be challenged only by showing that the Neutral Condorcet Requirement is not relevant, in other words, that a symmetric cycle between alternatives should not be treated as a tie.
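The behaviour of the three kinds of rule under the two requirements can be checked on the smallest possible cases. The sketch below is a hypothetical illustration with three alternatives: it treats a set of ballots as “tied” under a positional rule when every alternative receives the same score, and as “tied” under pairwise majority when every pairwise margin is zero. These operational definitions are my assumptions, not Saari’s own formulation.

```python
from itertools import combinations

ALTERNATIVES = "abc"
REVERSAL_PAIR = ["abc", "cba"]            # two rankings that reverse one another
CONDORCET_TRIPLE = ["abc", "bca", "cab"]  # a complete cycle

PLURALITY = [1, 0, 0]
BORDA = [2, 1, 0]


def positional_scores(rankings, weights):
    """Score of each alternative under a positional rule with the given weights."""
    scores = {alternative: 0 for alternative in ALTERNATIVES}
    for ranking in rankings:
        for position, alternative in enumerate(ranking):
            scores[alternative] += weights[position]
    return scores


def pairwise_margins(rankings):
    """For each pair (x, y): voters ranking x above y minus voters ranking y above x."""
    margins = {}
    for x, y in combinations(ALTERNATIVES, 2):
        x_above_y = sum(1 for r in rankings if r.index(x) < r.index(y))
        margins[(x, y)] = x_above_y - (len(rankings) - x_above_y)
    return margins


def positional_tie(rankings, weights):
    return len(set(positional_scores(rankings, weights).values())) == 1


def pairwise_tie(rankings):
    return all(margin == 0 for margin in pairwise_margins(rankings).values())


for name, ballots in [("reversal pair", REVERSAL_PAIR), ("Condorcet triple", CONDORCET_TRIPLE)]:
    print(
        name,
        "| plurality tie:", positional_tie(ballots, PLURALITY),
        "| Borda tie:", positional_tie(ballots, BORDA),
        "| pairwise tie:", pairwise_tie(ballots),
    )
```

In this small case, plurality treats the Condorcet triple as a tie but not the reversal pair, pairwise majority does the opposite, and the Borda weights treat both as ties.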

The real defect of the Condorcet criterion is that pair-wise comparisons mandated by the independence condition do, according to Saari, disregard some important information about the preferences of the voters. Consider a voting cycle: a defeats b, b defeats c and c defeats a in a series of majority contests. This may result from an underlying Condorcetian cycle of majority preferences. But it might also result from intransitive individual preferences: some voters have simply voted in an irrational way. We cannot tell the source of intransitivity by looking at the pair-wise voting results. Saari’s point is not that such a situation is likely to occur, or that a voting rule should be able to deal with it; the point is that a good rule should be able to distinguish between the two sources of intransitivity. By excluding all information not related to the ordinal preference rankings, Arrow’s independence condition also excludes essential information about the nature of these rankings. As Saari puts it, “losing the intensity information corresponds to dropping the critical assumption that voters have transitive preferences”. While the Borda rule does not satisfy Arrow’s independence condition, it is the only rule that satisfies the binary intensity independence condition which requires that the relative ranking of each pair of alternatives be determined by voters’ relative rankings of that pair, and that the intensity of this ranking is determined by the number of candidates ranked between them (Saari 1995, pp. 201–2).

Saari’s writings are not only mathematically innovative but also philosophically sophisticated. He sees the Arrow theorem as one instance of a general problem of information aggregation, and finds analogical problems in sports, statistics, law, engineering, and economics. All his examples illustrate the problems which appear when we try to understand or evaluate a whole by aggregating information achieved from its parts. He warns: “Expect paradoxical phenomena whenever there is a potential discrepancy between the actual unified whole and the various ways to interpret the totality of disconnected parts” (Saari 2001, p. 104). The great merit of Saari’s approach is that several apparently unrelated but somehow “paradoxical-looking” phenomena are shown to be instances of a single general problem. It does not follow, however, that there exists a corresponding single solution, applicable in all contexts. My thesis is that voting in political contexts has specific properties which are not present in the other cases discussed by Saari.

Consider the following example:

Example 5

3 voters   2 voters   2 voters
a          b          c
b          c          a
c          a          b

In this example the introduction of a Pareto-dominated alternative c* reverses the ordering of alternatives based on the Borda criterion. Without it, a gets 8, b gets 7 and c gets 6 points. When it is introduced, the Borda scores are: 6 for c*, 11 for a, 12 for b and 13 for c.

3 voters   2 voters   2 voters
a          b          c
b          c          c*
c          c*         a
c*         a          b

Other preference counting rules—STV, the Bucklin rule, and the supplementary vote—produce similar if somewhat less dramatic anomalies. (On this anomaly in STV, see Doron 1979). These effects cannot plausibly be interpreted in terms of intensity differences. Suppose, for example, that c* is in all essential aspects identical with c, but contains some technical defect and is therefore considered worse than c by all the voters. For an informed decision, its presence on the agenda is totally irrelevant, for it does not contain any new aspect not already contained in c.
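The reversal in Example 5 is easy to verify mechanically. The following Python sketch computes the Borda scores with and without c*; the helper names are illustrative assumptions.

```python
# Example 5 with the Pareto-dominated alternative c* on the agenda:
# (number of voters, ranking from best to worst).
EXAMPLE_5_EXTENDED = [
    (3, ["a", "b", "c", "c*"]),
    (2, ["b", "c", "c*", "a"]),
    (2, ["c", "c*", "a", "b"]),
]


def remove_alternative(profile, removed):
    """Delete one alternative from every ballot, keeping the rest in order."""
    return [(n, [x for x in ranking if x != removed]) for n, ranking in profile]


def borda_scores(profile):
    """Top position earns m - 1 points, last position earns 0."""
    scores = {}
    for n, ranking in profile:
        top = len(ranking) - 1
        for position, alternative in enumerate(ranking):
            scores[alternative] = scores.get(alternative, 0) + n * (top - position)
    return scores


def social_ranking(scores):
    """Alternatives ordered from highest to lowest score."""
    return sorted(scores, key=scores.get, reverse=True)


# c* is Pareto-dominated by c: every voter ranks c above c*.
dominated = all(r.index("c") < r.index("c*") for _, r in EXAMPLE_5_EXTENDED)

scores_without = borda_scores(remove_alternative(EXAMPLE_5_EXTENDED, "c*"))
scores_with = borda_scores(EXAMPLE_5_EXTENDED)

print("c* dominated by c:", dominated)
print("Without c*:", scores_without, social_ranking(scores_without))
print("With c*:   ", scores_with, social_ranking(scores_with))
```

Without c* the Borda ranking is a, b, c; with c* on the agenda it becomes c, b, a, c*, although no voter has changed his or her ordering of a, b and c.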

Example 5 shows that the agenda-setting process is crucial for the Borda rule. The problem presented above is, of course, well known by the proponents of the rule. Some of them (for example, Dummett) have argued that agenda manipulation is less likely to cause troubles in real elections, for it may be difficult to produce suitable “dummy” alternatives (like c* in the example above). Mackie (2003, pp. 153–155) claims that someone who tries to manipulate a voting rule by addition or subtraction of alternatives needs to know voters’ exact preference rankings, including their rankings over manipulative alternatives (like c*).

In order to assess these empirical claims, let us consider almost the only example of the use of the Borda rule in politically important decision-making: the choice of the candidates for the office of Beretitenti, or president, in the island state of Kiribati. According to the constitution of Kiribati, the legislature (Maneaba) chooses three or four candidates, and one of them is elected by the people to the office. The candidates are selected in the Maneaba by using a limited version of the Borda count. There can be many candidates, but members of the Maneaba are allowed to rank four of them. The four with the largest scores are allowed to continue to the final (popular) contest. In 1991, there were eight candidates presented to the Maneaba. According to Ben Reilly (2002, pp. 367–9), there was extensive strategic voting in which two of the most popular candidates were manoeuvred out of the final election. Two of the running candidates were “dummies”. Their role was exactly the same as that of the alternative c* in our example: by voting for a “dummy” alternative the voters could avoid giving any lower-preference support to the most serious challengers of their favourite candidates. In the only politically relevant real-life case described in the literature, the Borda count worked exactly as its critics expected it to work. Reilly quotes another commentator on the Kiribati election: “It remains to be seen just how long such a system will be tolerated which has the effect of eliminating popular candidates through backroom political manoeuvring” (p. 368).

This form of manipulation is particularly attractive when the Borda count is used. According to Serais (2008, p. 8), in three-candidate Borda elections the a priori probability of situations which can be manipulated by “cloning” alternatives is always over 40 %, and rapidly approaches 62 % as the number of voters increases. Pace Mackie, the manipulators need not know the exact preference rankings; it is sufficient for their purposes if they can produce alternatives which are generally perceived as ‘clones’ of their preferred alternative. The resulting multiplication of the Borda scores guarantees that some among the essentially similar alternatives will be selected, unless, of course, the other groups are able to use the same strategy. If, for example, the Borda rule were used for allocating seats between parties in an assembly, a party might increase its share of seats by splitting itself up into two essentially similar but nominally different parties. The point is nicely illustrated in Sverker Härd’s study on seat allocation rules in the Riksdag of Sweden (Härd 1999, 2000). Using opinion measurements, Härd simulated the distribution of seats in the Swedish Parliament under different voting rules. One of the rules tested by Härd was a version of Borda. In this application, a party’s proportion of the seats in the Riksdag was the same as its proportion of the total amount of the Borda points. The result was a massive shift of power from the Social Democrats to the small non-Socialist party groups. The obvious reason for this shift, not discussed by Härd, is that in Sweden the non-Socialist party groups are numerous, while on the Left the only alternatives are the Social Democratic party and the small Leftist (ex-Communist) party. The number of ideologically close parties multiplied their compound Borda scores. If the Borda rule were actually used in the Swedish elections, the Left could regain its power simply by creating more, nominally independent groups.
A general result proved by van der Hout et al. (2006, pp. 465–7) shows how problems of this type can be avoided only by using first preference information as the sole basis for seat allocation.
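The cloning effect on Borda-proportional seat shares can be seen in a minimal sketch. The electorate, the party labels (L, R, R2) and the scoring convention (m − 1 points for a first place among m parties, down to 0 for the last) are assumptions chosen for illustration; they are not Härd’s data or his exact rule.

```python
def borda_scores(ballots):
    """Standard Borda scores for complete rankings.

    `ballots` is a list of (number_of_voters, ranking) pairs; with m
    alternatives, a first place is worth m - 1 points and a last place 0.
    """
    scores = {}
    for count, ranking in ballots:
        m = len(ranking)
        for position, party in enumerate(ranking):
            scores[party] = scores.get(party, 0) + count * (m - 1 - position)
    return scores


def shares(scores):
    """Each party's proportion of the total number of Borda points."""
    total = sum(scores.values())
    return {party: points / total for party, points in scores.items()}


# Hypothetical electorate of 100: 60 voters prefer L to R, 40 prefer R to L.
before = borda_scores([(60, ["L", "R"]), (40, ["R", "L"])])

# R splits off a nominally independent clone R2, which every voter
# ranks immediately below R.
after = borda_scores([(60, ["L", "R", "R2"]), (40, ["R", "R2", "L"])])

share_before = shares(before)
share_after = shares(after)
right_bloc_after = share_after["R"] + share_after["R2"]

print("L share before split:", share_before["L"])
print("L share after split: ", share_after["L"])
print("R + R2 share after:  ", right_bloc_after)
```

In this sketch the majority party L falls from 60 % to 40 % of the Borda points merely because its opponent has split in two, although no voter has changed his or her opinion of L and R.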

There is a further problem. The Borda rule is likely to produce a larger set of candidates than, say, the plurality rule. Intuitively, the reason is that candidates who do not have much first-preference support still have some hope of getting elected. Any rule that takes some of the lower preferences into account tends to have this effect, even without any conscious attempts to manipulate the agenda. Ordinary voters are not necessarily able to produce strict and complete preference orderings when the number of alternatives becomes large (say, over five). It is reasonable to expect that voters are generally able to submit transitive preference orderings, as Saari says. It is, however, less obvious that the rankings submitted by them would always satisfy the strictness or completeness requirements. If voters are nevertheless required to submit strict and complete rankings (as in the Australian alternative-vote elections), an election result may actually be determined by voters who, when unable to rank all the candidates, fill in their ballot papers in a random way. Therefore, a reasonable voting system should either limit the number of candidates or allow incomplete ballots.

However, while modified versions of the Borda count can handle incomplete rankings, there are inevitable costs (Nurmi 2007). First, such modifications are vulnerable to strategic truncation of preferences. In many voting situations, it is rational not to submit one’s complete preference ordering (for such truncation strategies, see Lagerspetz 2004). Second, all attempts to modify the Borda rule are likely to undo some of the most attractive properties of the rule. Most notably, the modified versions may elect a candidate who is considered the worst by a majority of voters. Given the effect exemplified in Example 5, these results are to be expected: if the removal of a candidate from the contest (c* in the example) may change the outcome, his removal from sufficiently many ballot papers may have a similar effect. If these costs are unacceptable, the remaining solution is to limit the number of alternatives beforehand. While this may be reasonable in some contexts, for example in multi-alternative referendums, in general elections it is clearly incompatible with the principle of democratic freedom.
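Strategic truncation can be sketched as follows. The modification assumed here is only one of several possible ones and is not taken from the sources cited: with m candidates, a ranked candidate in position i (counting from 0) receives m − 1 − i points, and unranked candidates receive nothing. The profile is hypothetical.

```python
def truncated_borda(ballots, candidates):
    """Borda scores when ballots may rank only some of the candidates.

    Assumed convention: with m candidates, the i-th ranked candidate
    (i = 0 for the first) gets m - 1 - i points; unranked candidates get 0.
    `ballots` is a list of (number_of_voters, ranking) pairs.
    """
    m = len(candidates)
    scores = {candidate: 0 for candidate in candidates}
    for count, ranking in ballots:
        for position, candidate in enumerate(ranking):
            scores[candidate] += count * (m - 1 - position)
    return scores


def winner(scores):
    """Candidate with the highest score (the example has no ties)."""
    return max(scores, key=scores.get)


CANDIDATES = ["A", "B", "C"]

# Sincere, complete ballots: 5 voters hold A > B > C, 4 hold B > C > A.
sincere = [(5, ["A", "B", "C"]), (4, ["B", "C", "A"])]

# The five A-supporters truncate: they list A only and so withhold
# the second-place points they would otherwise give to B.
truncated = [(5, ["A"]), (4, ["B", "C", "A"])]

sincere_scores = truncated_borda(sincere, CANDIDATES)
truncated_scores = truncated_borda(truncated, CANDIDATES)
print("sincere:  ", sincere_scores, "->", winner(sincere_scores))
print("truncated:", truncated_scores, "->", winner(truncated_scores))
```

Under this convention the A-supporters gain by submitting incomplete ballots: B wins when everybody submits a full ranking, and A wins when A’s supporters rank A alone.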

Thus, there is an important difference between voting and the other aggregation contexts analysed by Saari. Only in the context of voting may the choice of the method of aggregation change the input of aggregation. This reflects a general problem shared by many attempts to “apply” the results of social sciences. In engineering, statistics etc., the reality itself does not react to the choice of method of acquiring information about it [Footnote 4]. Hence, the manipulative aspects of the Borda rule may well be irrelevant in such contexts (on engineering contexts, see Scott and Zivikovic 2003). In contrast, voters’ strategies, the composition of agendas, the supply of candidates etc. may vary with the chosen voting rule. This adds to voting situations an additional element of arbitrariness not present in Saari’s other examples. The question is not just which method would reflect the objects (voters’ opinions) in the most accurate way, but rather which would be the best method given the unavoidable interaction between the aggregation process and the objects of aggregation.

Saari’s argument for the Borda rule, brilliant as it is, should be balanced against the defects of the Borda rule discussed above. A Condorcet-effective rule is sensitive to the addition of new (tied) voter groups, but, as we saw, the Borda rule is sensitive to the addition of new (Pareto-dominated) alternatives. If the Condorcet criterion loses some information about the transitivity of the rankings, the Borda rule lets in some questionable information. The normative interpretation of the Borda rule is, even for Saari, that it is able to take preference intensities into account. But if the number of candidates between a and b in someone’s expressed preference ordering may reflect factors other than preference intensities, it is difficult to argue that this information should have an effect on the final choice. While Arrow’s independence condition is too strong (it may leave out some relevant information), Saari’s alternative condition is too weak (it allows irrelevant information to determine the outcome). Personally, I am unable to decide which form of arbitrariness disturbs me more.

5 For and by the People

In many works informed by the theory of social choice, the underlying supposition is that the main purpose of voting rules is to aggregate information. A voting rule is, indeed, a means of aggregation. The fundamental issue is how the results of aggregation are to be interpreted. We may distinguish two different ways to interpret voting, and, correspondingly, two partly different perspectives from which a voting rule may be evaluated. According to one view, the task of the voting rule is to provide information about some independently existing properties of the world, basically about voters’ preferences. Thus, voting is a kind of measurement, and the aggregation problems appearing in political contexts are largely analogous to those appearing in statistics, multi-criteria decision-making etc. A voting rule should be as reliable and exact an instrument of measurement as possible. For example, Claude Hillinger (2004) compares voting to measurements in sociology, psychology, market research etc., and remarks that in these contexts cardinal scales are always used. “It is only in voting and particularly in political voting, that the scales are restricted. For this there is no apparent reason, nor, as far as I know, has any argument in defence of this practice been advanced”. Thus Hillinger (2004, 2005), like Ian Budge (1996, pp. 164–5), argues for cardinal scoring rules. Budge defends his proposal with the same analogy: “similar procedures are used in psychological tests and opinion polls with results which are widely accepted” (p. 165). He comments on the possibility of strategic behaviour: “Voters in the mass are also likely to assign scores that reflect their true feelings, unless urged to engage strategic misrepresentation by political parties. But these can, if necessary be legally forbidden to do so” (idem, emphasis EL). The last sentence reveals one difficulty in the measurement interpretation of voting. Is it compatible with democratic freedom that people, with or without party affiliations, are not allowed to give voting recommendations to their fellow citizens?

The problem of strategic behaviour reveals an interesting difference between voting and measurement. As Sager (2002, p. 185) remarks, strategic behaviour may be a problem even in social measurement if the subjects expect that the results are utilized in decision-making. Consequently, questionnaires are often designed in a way that makes it hard for informants to see how their answers can influence future policy decisions. In voting contexts, the democratic ideal requires that the connection between the answers given and the future policy decisions is as clear as possible. Indeed, various institutions (for example, proportional representation, coalition governments, bicameralism, representative institutions in general) are often criticized for the lack of a visible connection between votes and future policy decisions.

A further argument against the measurement interpretation is that it does not provide any justification for democratic equality. Suppose that, in order to save election costs, we select 1/10 of the adult population as the demos. Only those belonging to this selected group are entitled to participate in referenda or in general elections. If we use the modern techniques of random sampling in choosing the demos, the distribution of opinions and interests in the demos will mirror that of the general population very accurately. Consider normal opinion measurements. By using small random samples (much smaller than 1/10 of the electorate), the pollsters are able to predict the choices of the total population with a great degree of precision. With an enormous sample of 10 % of the total population, the deviation would be negligible. The randomly composed demos would elect the same candidates and vote for the same parties in nearly the same proportions as the entire population. If the main purpose of voting were to provide information, recording everybody’s preferences would seem to be just a waste of time and money [Footnote 5].
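The claim about precision can be checked with a rough calculation. The sketch below uses the textbook normal approximation for the margin of error of an estimated vote share under simple random sampling, with the finite population correction. The electorate of five million and the poll of 1,000 respondents are hypothetical figures chosen for illustration.

```python
import math


def margin_of_error(sample_size, population_size, p=0.5, z=1.96):
    """Approximate 95 % margin of error for an estimated vote share.

    Assumes simple random sampling without replacement and the normal
    approximation; p = 0.5 is the worst case for a single proportion.
    """
    standard_error = math.sqrt(p * (1 - p) / sample_size)
    # Finite population correction: matters once the sample is a
    # sizeable fraction of the population.
    correction = math.sqrt((population_size - sample_size) / (population_size - 1))
    return z * standard_error * correction


ELECTORATE = 5_000_000  # hypothetical electorate size

poll_margin = margin_of_error(1_000, ELECTORATE)              # ordinary opinion poll
demos_margin = margin_of_error(ELECTORATE // 10, ELECTORATE)  # demos of 1/10

print(f"poll of 1,000:   +/- {100 * poll_margin:.2f} percentage points")
print(f"demos of 10 %:   +/- {100 * demos_margin:.2f} percentage points")
```

On these assumptions an ordinary poll is accurate to within roughly three percentage points, while the randomly selected demos would reproduce a party’s share of the vote to within a small fraction of one percentage point.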

There is, however, another possible interpretation of voting. It should not be seen mainly as a means to get information. It is primarily an exercise of power. To take the obvious case, when voting in a parliament, the MPs are not providing information about their opinions. They are making binding decisions based on those opinions. Elections can be interpreted in the same way. It is, of course, plausible to say that an election result usually provides information, mostly about the relative popularity of parties and candidates but also about other issues (for example, the turnout rates may measure political alienation). The main purpose of elections, however, is not to provide information but to choose the most popular candidates. A good voting rule should produce outcomes which are recognized as legitimate. In order to produce legitimate results the rule must be compatible with the background values; in democracies these values include equality, liberty, and effective voter influence. Because voting is also an exercise of power, voters are, and should be, moved by motives which are not operative when the same people are filling in questionnaires or answering questions in an opinion poll. As Saward (1998, p. 35) says, an opinion poll can gather expressions of preference, but they are not preferences which reflect the fact that people are aware that their expressions will decide anything [Footnote 6]. Because of the power aspect, elections are taken seriously; and this unavoidably provides incentives both for rational deliberation and for strategic behaviour. This does not, however, mean that there are no normative problems related to strategic behaviour in democratic elections. The social choice results tell us that strategic manipulation is possible in all democratic systems. A realistic aim is to minimize the role of certain forms of strategic behaviour.

One possible counterargument [Footnote 7] to my analysis is this. In mass elections the probability that an individual voter would be decisive is extremely small. If power is measured in terms of decisiveness, the power exercised by an individual voter is almost zero. This creates a collective action problem. A candidate may win only if a sufficient number of citizens vote for him or her; and, more generally, democracy can work and produce legitimate decisions if sufficiently many citizens are willing to participate. But nevertheless, a single citizen has no convincing instrumental reason to cast an informed vote, for his or her personal contribution to the outcome is likely to be negligible in any case. Perhaps voting should be interpreted as a purely expressive act like cheering in a soccer match (Brennan and Lomasky 1993).

This argument certainly points out a real problem (first discussed by G. W. F. Hegel in his famous article on the Estates of Württemberg) for the view that voting acts could be interpreted as purposive exercises of power. However, it does not work as an argument for the measurement view of voting. If voting is an expression of feelings, voting results do not measure voters’ preferences over outcomes in a reliable way. The expressive interpretation implies that voters are actually choosing between alternative voting acts (“How do I feel if I cast my vote in this way?”) rather than between competing candidates or policies.

There is not enough space for a convincing answer, but some observations can be made. A purely expressive model cannot explain the fact that when there are competing acceptable candidates or parties, people are more willing to give their support to those that have realistic chances to succeed, given the expected choices of the others. In practice, people tend to vote for an acceptable candidate who has realistic prospects of being elected, rather than for the candidate who might be their absolute favourite. All electoral systems tend to constrain political competition to a contest between a limited number of realistic candidates or parties. The most plausible explanation of this (Cox 1997) appeals to voters’ instrumental rationality. But we have already admitted that instrumental rationality cannot explain why people vote at all! A solution to this dilemma is, I think, that a voting act is (at least sometimes) seen as a contribution to a collective action. In mass elections, voters are (at least sometimes) motivated by a “consequentialist generalization”. In other words, they ask themselves: “What would happen if all (or most, or very many) people like me chose in this way?” Voters tend to portray themselves as participants in collective actions. They try to evaluate the consequences of those actions rather than the consequences of their individual voting acts.

There is a more general philosophical lesson in the distinction between measurement and voting. Real-life rules of social choice do not connect voters’ preferences directly to outcomes. Instead, they connect expressions of preferences (votes) to outcomes. Suppose that we had a measurement device that would connect (ordinal or cardinal) preferences directly to outcomes, say, by measuring people’s neural states. Suppose, moreover, that the officials (a benevolent autocrat or a central planning agency) would then implement the outcomes that were picked by the aggregated measurement results. Would that constitute a democratic arrangement? The answer is, I think, no. Why? In the thought experiment, there would be no element of popular choice or authorization by the citizens. The citizens’ role would be a purely passive one. The system would constitute a government for the people, not by the people. It would give people what they desired, not what they would have desired when knowing that a public expression of their desires causally contributes to, and therefore makes them responsible for, the resulting outcomes. These are likely to be different things: the authoritative nature of the voting process forces voters to consider their preferences and the way their votes are connected to the outcomes.

6 Conclusion

Voting rules are used for different purposes. Votes are taken in representative bodies, general elections and referendums, as well as in multi-member courts, panels of experts, collegial bodies, and public contests. The rules cannot be evaluated without taking wider institutional and social contexts into account. More specifically, it is not possible to find “the best” rule simply by comparing the performance of various voting rules with respect to the pre-given criteria of social choice (Lagerspetz 2004, pp. 218–20). When, for example, we have the luxury of choosing between the Borda rule and some Condorcet-effective procedure, we should appeal to pragmatic and context-dependent considerations for and against both alternatives. In many contexts, the Borda rule may be preferable. If the set of alternatives is fixed, the effects discussed above cannot occur. For example, when we are pooling experts’ judgments or the popular judgments on the performance of competing contestants (for example, in the Eurovision song contests), the “agendas” are exogenously given. The Borda rule may well be the most plausible method to aggregate information in such contexts. But in such contexts, “intensities”, conceived in the utilitarian way, are not relevant. If we decide to use the Borda rule, our reasons should not be related to intensity considerations.

To quote Sartori (1987, p. 225): “the intensity criterion cannot establish a workable rule”. However, democratic processes are not insensitive to varying intensities of preference. Pressure group activities (Dahl 1956), vote trading (Buchanan and Tullock 1962), decentralized decision-making (Karvonen 2004) and public argumentation can all be seen as informal ways to cope with the intensity problem. They are necessarily unsystematic, imperfect and partial solutions. But, given the nature of the problem, this necessity may actually be a virtue.