1 Introduction

Consequentialism is the approach to normative evaluation in which only the consequences that are obtained with an alternative are of ethical significance. That is, goodness only depends on consequences. Here, I consider the more specific form of consequentialism used in social choice theory and welfare economics when the objective is to provide a normative ranking of a set of alternatives. This ranking is consequentialist if the comparison of any two alternatives only depends on a ranking of the associated consequences.

Alternatives and consequences can take many forms. An alternative could be an act or set of acts, with the corresponding consequences being the resulting outcomes. For example, in a Prisoners’ Dilemma, the acts are the decisions of the two prisoners whether to confess or not and the consequences are the prison sentences they receive as a result. A consequentialist ranks the four possible alternatives solely in terms of the prison sentences taking no account of the description of the acts that lead to these outcomes.

In many applications, an alternative is an allocation of goods to individuals and the consequences are the utilities that individuals obtain with this allocation. Utility consequentialism requires that when ordering two social alternatives, all non-utility features that differ between them are ignored. This approach to normative evaluation is better known as welfarism. Thus, the ranking of two alternatives depends on the individual utilities obtained with the two alternatives, but not, say, on the reasons why the individuals have these utilities. This is not to say that non-utility information can play no evaluative role. For example, if weighted utilitarianism is used to compare alternatives, the weights could depend on the heights of the individuals’ parents because these heights have fixed predetermined values.Footnote 1

It is sometimes claimed that any nonconsequentialist theory can be “consequentialized” by including the purported nonconsequentialist features of an alternative into the description of the consequences and modifying the conception of goodness so as to take account of this redescription. Such a move would render the distinction between consequentialist and nonconsequentialist theories vacuous. However, Brown (2011) has shown that there are in fact limits to this kind of consequentialization. Moreover, the concrete structure of a problem may provide guidance as to what is a consequence and what is not. For example, a utilitarian evaluation of allocations of goods to individuals would not consider liberal rights or the fairness of the allocations as being consequences if these are features of the alternatives that the individuals do not value. When alternatives are acts, Broome (1991, pp. 3–4) argues that consequentialism “relies on a division between an act and its consequences that cannot be maintained” because an act can be thought of as being one of the consequences.Footnote 2 Broome is, in effect, suggesting a particular form of consequentialization. But, as the Prisoners’ Dilemma example illustrates, it is sometimes possible to distinguish acts from consequences in a useful way. In what follows, I shall suppose that in each of the problems that I consider, the distinction between what is a consequence and what is not is clear-cut. This is not always the case. Indeed, in some applications, there may well be disagreement about what should be included in the description of a consequence. In such cases, subjective considerations play a role in determining what a consequence is.

Welfarism precludes taking account of many important values, such as liberal rights and the fairness of social allocations, when these values do not have intrinsic value to the individuals being considered. Because of this limitation, there has been a great deal of interest in non-welfarist or, more generally, nonconsequentialist approaches to normative evaluation in recent decades. However, nonconsequentialist principles often conflict with other cherished values. The most well-known example of such a conflict is provided by Amartya Sen’s liberal paradox (Sen 1970b). This paradox shows that if individuals are given the right to determine the social ranking on some pairs of alternatives, then there are configurations of individual preferences for which it is not possible to also satisfy the Weak Pareto Principle (which requires the social ranking to respect unanimous strict preferences) provided that the social ranking is acyclic. The various versions of the Pareto Principle are unanimity principles in that they require collective rankings to respect unanimous agreement.

Kaplow and Shavell (2001) contend that the conflict between non-welfarism and the Weak Pareto Principle is more widespread than the one identified by Sen. They argue that any non-welfarist approach to social evaluation, not just a respect for liberal rights, must violate this form of the Pareto Principle given some weak regularity conditions. Their proof of this claim utilizes the single-profile framework employed in traditional welfare economics. That is, there is a single list of utility functions, one for each individual. Typically, these are the individuals’ actual utility functions. As we shall see, Kaplow and Shavell’s theorem is a consequence of the equivalence of welfarism and Pareto Indifference (which requires unanimous indifference to be respected) in a single-profile setting when the social ranking of the alternatives is a quasiordering (Blackorby et al. 1990).

When consequences are multidimensional, appeal is often made to some form of a dominance principle that requires one alternative to be preferred to a second if the former vector dominates the latter in the space of consequences. In other words, goodness has a number of dimensions and one alternative is preferred to a second if it is better in all dimensions. Thus, dominance principles respect unanimous rankings in all of the dimensions of goodness. The Weak Pareto Principle is a dominance principle applied to vectors of individual utilities. In what follows, I shall generally speak of unanimity principles when the consequences are utilities and speak of dominance principles when they are not. However, in both cases, they are formally unanimity principles.

There are a number of single-profile impossibility theorems that demonstrate the incompatibility of unanimity/dominance criteria with various nonconsequentialist principles given some rationality restrictions on the rankings being considered. Each of these impossibility theorems provides nonconsequentialists with a difficult conundrum: How should the fundamental incompatibilities between the desiderata of the theorem be resolved?

In this article, I consider some of these theorems and examine what they have in common and how they differ. In particular, I identify groups of results that have similar formal structures and are established using similar proof strategies. In order to highlight the underlying structural similarities and dissimilarities of the theorems that I discuss, I sometimes present variants of the theorems that have been established in the literature, rather than the original results themselves. Furthermore, for the same reason, I sometimes use stronger axioms than are necessary, rather than provide the most general form of an impossibility theorem. The theorems that I have chosen to consider in some detail illustrate the main kinds of impossibility results that have appeared in the literature. Many other nonconsequentialist impossibility theorems have been established, some of which are mentioned in subsequent sections.Footnote 3

I begin in Sect. 2 with some notation and definitions. In Sect. 3, I present the single-profile characterization of welfarism in terms of Pareto Indifference. The Kaplow–Shavell Theorem is considered in Sect. 4. In Sect. 5, I discuss the conflicts identified by Sen (1970a) and Brun and Tungodden (2004) between the Pareto Principle and dominance principles that involve permuting the positions of individuals. Pareto conflicts with inequality aversion principles identified by Gibbard (1979) and Fleurbaey and Trannoy (2003) are considered in Sect. 6. The consequences in Sects. 3–6 are all utilities. In Sect. 7, I consider non-welfarist forms of nonconsequentialism. Specifically, I discuss an impossibility theorem about the measurement of standards of living considered by Pattanaik and Xu (2007) and an abstract theorem due to Hare (2007) that he has applied to the problem of whether taking account of proximity is morally justified when deciding whether to aid the needy. Finally, in Sect. 8, I offer some concluding remarks about how one might resolve a conflict between nonconsequentialist and unanimity/dominance principles without abandoning social rationality.

2 Preliminaries

Let X be the set of alternatives. Depending on the application, members of X could be, for example, social alternatives, states of the world, or actions. The set of individuals is \(N = \{1, \ldots , n\}\), where \(n \ge 2\). Alternatives in X are mapped into a set of consequences C. What these consequences are and how X is mapped into C differs in each of the problems considered here. However, in each case, C is a subset of an m-dimensional Euclidean space \(\mathbb {R}^m\).Footnote 4

Let A be a set which, depending on the context, shall be either X or C. A weak preference relation is a binary relation R on A that is interpreted as meaning “is weakly preferred to”. The corresponding strict preference relation P (“is strictly preferred to”) and indifference relation I (“is indifferent to”) are defined by setting, for all \(a, b \in A\), (i) aPb if and only if aRb and \(\lnot (bRa)\) and (ii) aIb if and only if aRb and bRa.

The relation R is reflexive if aRa for all \(a \in A\), complete if aRb or bRa for all distinct \(a, b \in A\), transitive if aRb and bRc imply aRc for all \(a,b,c \in A\), and acyclic if \(a_1Pa_2, \ldots , a_{s-1}Pa_s\) imply \(\lnot (a_sRa_1)\) for all \(a_1, \ldots , a_s \in A\). The relation R is a quasiordering if it is reflexive and transitive and it is an ordering if it is a complete quasiordering.
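
For a finite set A, the properties just defined can be checked mechanically. The following Python sketch is illustrative only: it assumes that R is stored as a set of ordered pairs, and the helper names (`strict_part`, `is_quasiordering`, and so on) are hypothetical labels introduced here, not notation from the literature.

```python
from itertools import product

def strict_part(R):
    """aPb if and only if aRb and not bRa."""
    return {(a, b) for (a, b) in R if (b, a) not in R}

def indifference_part(R):
    """aIb if and only if aRb and bRa."""
    return {(a, b) for (a, b) in R if (b, a) in R}

def is_reflexive(A, R):
    return all((a, a) in R for a in A)

def is_complete(A, R):
    # Only distinct pairs are tested, as in the definition in the text.
    return all((a, b) in R or (b, a) in R
               for a, b in product(A, repeat=2) if a != b)

def is_transitive(A, R):
    return all((a, c) in R
               for a, b, c in product(A, repeat=3)
               if (a, b) in R and (b, c) in R)

def is_quasiordering(A, R):
    return is_reflexive(A, R) and is_transitive(A, R)

def is_ordering(A, R):
    return is_quasiordering(A, R) and is_complete(A, R)

# Assumed example: a and b are tied, and both are ranked above c.
A = {"a", "b", "c"}
R = {(s, s) for s in A} | {("a", "b"), ("b", "a"), ("a", "c"), ("b", "c")}

# A quasiordering that is not complete: b is not compared with a or c.
R_partial = {(s, s) for s in A} | {("a", "c")}
```

Here `R` is an ordering, whereas `R_partial` is a quasiordering but not an ordering because it leaves b unranked against the other two elements.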

3 Single-profile welfarism

In this section, I consider a single-profile version of utility consequentialism. Each individual \(i \in N\) has a utility function \(U_i :X \rightarrow \mathbb {R}\), interpreted as being a comprehensive measure of well-being.Footnote 5 There is a fixed profile of utility functions \(U = (U_1, \ldots , U_n)\). For each \(x \in X\), U determines a vector of individual utilities \(U(x) = (U_1(x), \ldots , U_n(x))\). The set of utility vectors that are achievable with the profile U and set of alternatives X is \(U(X) = \{U(x) \mid x \in X \}\). Here, the set of consequences C is U(X) and, hence, \(m = n\).

Let \(R_U\) denote a weak social preference relation on X. The use of the subscript indicates that this preference is conditional on the profile of utility functions U being considered. The corresponding strict preference and indifference relations are \(P_U\) and \(I_U\), respectively. It is supposed that \(R_U\) is either an ordering or a quasiordering. A weak social preference relation \(R^*_U\) on the set of consequences U(X) is called a social welfare ranking. The corresponding strict preference and indifference relations are \(P^*_U\) and \(I^*_U\), respectively.

In the single-profile setting being considered here, utility consequentialism requires that the social ranking of the alternatives in X be determined by a social welfare ranking of the achievable utility vectors U(X), a property known as Single-Profile Welfarism.

Single-Profile Welfarism

There exists a social welfare ranking \(R^*_U\) on U(X) such that for all \(x,y \in X\),

$$\begin{aligned} xR_Uy \leftrightarrow U(x)R^*_UU(y) . \end{aligned} \tag{1}$$

Pareto Indifference requires two alternatives to be socially indifferent if everybody is indifferent between them.

Pareto Indifference

For all \(x,y \in X\), if \(U(x) = U(y)\), then \(xI_Uy\).

In general, these two conditions are not equivalent. For example, suppose that \(n = 2\), \(X = \{x, y, z\}\), \(U(x) = U(y)\), \(U_1(z) > U_1(x)\), and \(U_2(z) < U_2(x)\). Define \(R_U\) by setting \(xI_Uy\), \(xP_Uz\), and \(zP_Uy\). Pareto Indifference is satisfied, but there is no binary relation \(R^*_U\) on U(X) for which (1) holds, so Single-Profile Welfarism is not.
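
The example can be verified by brute force. In the sketch below, the particular utility numbers are assumptions chosen only to satisfy the stated (in)equalities, and the function names are hypothetical. The check for Single-Profile Welfarism uses the fact that (1) holds for some \(R^*_U\) exactly when whether \(aR_Ub\) depends only on the pair \((U(a), U(b))\).

```python
from itertools import product

# Utility vectors for n = 2 with U(x) = U(y), U_1(z) > U_1(x), U_2(z) < U_2(x).
U = {"x": (1, 1), "y": (1, 1), "z": (2, 0)}

# R_U with x I y, x P z, and z P y, together with the reflexive pairs.
R_U = {(a, a) for a in U} | {("x", "y"), ("y", "x"), ("x", "z"), ("z", "y")}

def pareto_indifference(U, R):
    """Alternatives with equal utility vectors must be socially indifferent."""
    return all((a, b) in R and (b, a) in R
               for a, b in product(U, repeat=2) if U[a] == U[b])

def single_profile_welfarism(U, R):
    """True if whether a R b depends only on the pair (U(a), U(b))."""
    seen = {}
    for a, b in product(U, repeat=2):
        key, val = (U[a], U[b]), (a, b) in R
        if seen.setdefault(key, val) != val:
            return False
    return True

def transitive(U, R):
    return all((a, c) in R for a, b, c in product(U, repeat=3)
               if (a, b) in R and (b, c) in R)

print(pareto_indifference(U, R_U),
      single_profile_welfarism(U, R_U),
      transitive(U, R_U))
```

Welfarism fails here because x and y have the same utility vector but are ranked differently against z.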

In the preceding example, \(R_U\) is not transitive. Theorem 1 demonstrates that Pareto Indifference is a necessary and sufficient condition for Single-Profile Welfarism provided that \(R_U\) is a quasiordering. Moreover, in this case, \(R^*_U\) is also a quasiordering. Furthermore, if \(R_U\) is complete, then so is \(R^*_U\).Footnote 6

Theorem 1

For the triple \(\langle X,U,R_U \rangle \), if \(|X| \ge 3\) and \(R_U\) is a quasiordering (resp. ordering) of X, then Pareto Indifference is satisfied if and only if Single-Profile Welfarism is satisfied with \(R^*_U\) a quasiordering (resp. ordering) of U(X).

Proof

(i) Suppose that \(R_U\) is a quasiordering and that Pareto Indifference is satisfied. I first show that if \(xR_Uy\) for some \(x,y \in X\) and there exist \(w,z \in X\) such that \(U(w) = U(x)\) and \(U(z) = U(y)\), then \(wR_Uz\).Footnote 7 By Pareto Indifference, \(wI_Ux\) and \(yI_Uz\). Together with the assumption that \(xR_Uy\), transitivity of \(R_U\) implies that \(wR_Uz\).

Define \(R^*_U\) on U(X) as follows. Consider any \(u, v \in U(X)\). By the definition of U(X), there exist \(x, y \in X\) such that \(U(x) = u\) and \(U(y) = v\). Let \(uR^*_Uv\) if and only if \(xR_Uy\) and \(vR^*_Uu\) if and only if \(yR_Ux\). By the preceding argument, the ranking of u and v defined in this way is independent of the alternatives in X used to generate u and v. Thus, Single-Profile Welfarism is satisfied.

If \(u = v\), then y can be chosen to be x. It then follows from the reflexivity of \(R_U\) that \(R^*_U\) is also reflexive. Consider any \(t, u, v \in U(X)\) for which \(tR^*_Uu\) and \(uR^*_Uv\). By construction, there exist \(x, y, z \in X\) with \(U(x) = t\), \(U(y) = u\), and \(U(z) = v\) such that \(xR_Uy\) and \(yR_Uz\). Transitivity of \(R_U\) implies that \(xR_Uz\) and, hence, by the definition of \(R^*_U\), that \(tR^*_Uv\). Thus, \(R^*_U\) is transitive. If \(R_U\) is complete, then by the definition of \(R^*_U\), so is \(R^*_U\).

(ii) If Single-Profile Welfarism is satisfied and \(R^*_U\) is reflexive, it follows immediately from (1) that Pareto Indifference is satisfied. \(\square \)
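
The construction in part (i) of the proof can be mimicked on a finite example. The sketch below is an illustration under stated assumptions: the alternatives, the utility numbers, and the choice of the utilitarian ordering for \(R_U\) are all hypothetical, as is the function name `induced_welfare_ranking`.

```python
from itertools import product

def induced_welfare_ranking(U, R):
    """Build R* on U(X): u R* v iff x R y for alternatives with U(x) = u and
    U(y) = v. Raises ValueError if the answer depends on which alternatives
    are used to generate u and v."""
    ranking = {}
    for x, y in product(U, repeat=2):
        key, val = (U[x], U[y]), (x, y) in R
        if ranking.setdefault(key, val) != val:
            raise ValueError(f"ranking of {key} depends on representatives")
    return {key for key, val in ranking.items() if val}

# Assumed example with n = 2; w and x generate the same utility vector.
U = {"w": (2, 1), "x": (2, 1), "y": (0, 2), "z": (1, 1)}

# Utilitarian ordering: a R b iff the utility sum at a is at least that at b.
R_U = {(a, b) for a, b in product(U, repeat=2) if sum(U[a]) >= sum(U[b])}

R_star = induced_welfare_ranking(U, R_U)
```

Because the utilitarian ordering is transitive and satisfies Pareto Indifference, no `ValueError` is raised, and `R_star` ranks utility vectors exactly as `R_U` ranks the alternatives that generate them.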

As the proof of Theorem 1 shows, only reflexivity of \(R^*_U\) is needed to conclude that Single-Profile Welfarism implies Pareto Indifference. However, the reverse implication utilizes the full force of the transitivity of \(R_U\).

Theorem 1 demonstrates that a commitment to Single-Profile Welfarism amounts to endorsing Pareto Indifference. Thus, a utility nonconsequentialist in the single-profile context being considered here must reject Pareto Indifference, at least if \(R_U\) is a quasiordering. Blackorby et al. (1990) have argued that there may be good reasons for doing so. For example, as Sen’s Liberal Paradox (Sen 1970b) demonstrates, for some configurations of non-self-centered individual utilities, Pareto Indifference is incompatible with granting an individual the right to choose between alternatives that only differ in some feature that falls within his protected private sphere (e.g., the colour of his bedroom walls) independent of the utility consequences. Furthermore, Pareto Indifference does not permit the motivations an individual has for assigning utilities to alternatives (e.g., the pleasure a sadist obtains from torturing someone) to play any role in determining the social ranking.Footnote 8

4 The Kaplow–Shavell Theorem

Kaplow and Shavell (2001) have claimed that any non-welfarist criterion that is used when evaluating social policies, such as one that incorporates considerations of justice or rights, must violate the Weak Pareto Principle.Footnote 9 That is, in some circumstances, taking account of a non-welfarist criterion requires overriding a unanimous strict ranking of the alternatives by the individuals. This conclusion makes use of some auxiliary assumptions, notably a continuity assumption. The implications of this view are developed at great length in Kaplow and Shavell (2002). For reasons discussed in Sect. 8, Fleurbaey et al. (2003) argue that Kaplow and Shavell’s formal theorem establishes a less far-reaching result than their informal statement suggests.

In this section, I present a version of the Kaplow–Shavell Theorem for a social welfare ordering \(R_U\) on the set of alternatives X and its corresponding social welfare ranking \(R^*_U\) of U(X).Footnote 10 I also discuss how their theorem relates to Theorem 1. The role that the continuity assumption plays in their analysis is given particular attention because it is different from the role that continuity assumptions play in some of the theorems discussed in subsequent sections.

Kaplow and Shavell assume that there is at least one divisible private good, but do not require that any of the other features of an alternative exhibit any special structure. Accordingly, in this section, it is assumed that \(X = X^1 \times X^2\), where \(X^1 = \mathbb {R}^n_+\). A vector \(x^1 = (x^1_1, \ldots , x^1_n)\) in \(X^1\) specifies, for each person i, the amount \(x^1_i\) of some divisible private good that i is allocated. As in Sect. 3, there is a single profile of utility functions U and the set of consequences C is U(X). Furthermore, the social preference relation \(R_U\) is assumed to be an ordering.

Weak Pareto regards any change that makes everybody better off as being a social improvement.

Weak Pareto

For all \(x,y \in X\), if \(U(x) \gg U(y)\), then \(xP_Uy\).

Kaplow and Shavell impose a relatively weak monotonicity condition on the individual preferences, what I call Common Monotonicity. Common Monotonicity says that if everybody’s allocation of the divisible private good is increased by a common amount holding the other features of the alternatives fixed, then everybody is made better off. This assumption ensures that Weak Pareto is not vacuous.

Common Monotonicity

For all \(x,y \in X\) with \(x^2 = y^2\), if \(x^1_i = y^1_i + \delta \) for all \(i \in N\) for some \(\delta >0\), then \(U(x) \gg U(y)\).

Kaplow and Shavell also impose a continuity condition on the social ordering \(R_U\). It requires \(R_U\) to be continuous on \(X^1\) for any fixed \(x^2\) in \(X^2\), a property I call Continuity on \(X^1\).

Continuity on \(X^1\)

For all \(x \in X\), the sets \(\{z^1 \in \mathbb {R}^n_+ \mid (z^1,x^2) R_U x \}\) and \(\{z^1 \in \mathbb {R}^n_+ \mid x R_U (z^1,x^2)\}\) are closed.

The Kaplow–Shavell Theorem shows that if \(R_U\) is an ordering and both Common Monotonicity and Continuity on \(X^1\) are satisfied, then it is not possible to take account of non-welfare information in this single-profile setting without violating the Weak Pareto Principle.

Theorem 2

For the triple \(\langle X,U,R_U \rangle \), if \(X = X^1 \times X^2\) with \(X^1 = \mathbb {R}^n_+\), \(R_U\) is an ordering, and both Common Monotonicity and Continuity on \(X^1\) are satisfied but Single-Profile Welfarism is not, then Weak Pareto is violated.

Proof

The theorem is established by showing that if Common Monotonicity, Continuity on \(X^1\), and Weak Pareto are satisfied, then so is Single-Profile Welfarism when \(R_U\) is an ordering. I first show that Pareto Indifference is satisfied. On the contrary, suppose that it is not. Thus, there exist \(x, y \in X\) with \(U(x) = U(y)\) for which \(xP_Uy\).Footnote 11 Let z be such that \(z^2 = y^2\) and \(z^1_i = y^1_i + \delta \) for all \(i \in N\) for some \(\delta > 0\). By choosing \(\delta \) sufficiently small, Continuity on \(X^1\) implies that \(xP_Uz\). However, by Common Monotonicity, \(U(z) \gg U(y)\) and, hence, \(U(z) \gg U(x)\). Therefore, Weak Pareto is violated, a contradiction. Hence, Pareto Indifference is satisfied and, by Theorem 1, so is Single-Profile Welfarism. \(\square \)
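
The steps of this proof can be traced on a concrete non-welfarist ordering. Everything in the following sketch is an assumption made for illustration: there are two individuals, \(X^2 = \{0, 1\}\) records a single non-utility feature, each person's utility equals own consumption, and \(R_U\) is represented by a hypothetical function `W` that adds a bonus when the feature is present.

```python
def U(alt):
    """Utility profile: each person's utility is own consumption of the
    divisible good, so Common Monotonicity holds."""
    x1, _ = alt
    return tuple(x1)

def W(alt, bonus=1.0):
    """Assumed non-welfarist evaluation: total consumption plus a bonus when
    x2 = 1. The social ordering is 'a R b iff W(a) >= W(b)', which is
    continuous in x1 for fixed x2."""
    x1, x2 = alt
    return sum(x1) + bonus * x2

def strictly_better_for_all(u, v):
    """u >> v: every component of u exceeds the corresponding one of v."""
    return all(a > b for a, b in zip(u, v))

x = ((1.0, 1.0), 1)
y = ((1.0, 1.0), 0)     # same utilities as x, but a different x2
delta = 0.25            # small common increase in the divisible good
z = (tuple(c + delta for c in y[0]), y[1])

# U(x) = U(y) and yet x P y, so Pareto Indifference fails.
pareto_indifference_fails = U(x) == U(y) and W(x) > W(y)

# U(z) >> U(x) and yet x P z, so Weak Pareto is violated.
weak_pareto_violated = strictly_better_for_all(U(z), U(x)) and W(x) > W(z)
```

With these numbers, `W` takes the values 3.0, 2.0, and 2.5 at x, y, and z, so the strict preference for x over y survives the small improvement from y to z, exactly as in the proof.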

As my proof of Theorem 2 demonstrates, what Kaplow and Shavell have done is to identify restrictions for which Weak Pareto implies Pareto Indifference. Specifically, provided that \(R_U\) is an ordering, Weak Pareto implies Pareto Indifference if Common Monotonicity and Continuity on \(X^1\) are satisfied. Using somewhat different continuity and monotonicity assumptions about the profile U and the social preference \(R_U\), Suzumura (2001) has shown that if it is always possible to reverse a social preference \(xP_Uy\) by increasing any individual’s consumption of the divisible private good in y sufficiently, then the standard versions of the Pareto Principle, including Weak Pareto and Pareto Indifference, are all mutually equivalent. Thus, in the single-profile setting considered here, the conflict between permitting non-welfare information to play a role in the social evaluation and the Weak Pareto Principle identified by Kaplow and Shavell is simply an implication of the fact established in Theorem 1 that Pareto Indifference and Single-Profile Welfarism are equivalent conditions when \(R_U\) is an ordering.Footnote 12

Requiring \(R_U\) to be an ordering and the profile U to satisfy Common Monotonicity are relatively uncontroversial assumptions. Given these two assumptions, the only role of the continuity assumption is to show that by adopting Weak Pareto, one is also committed to Pareto Indifference, which in turn commits one to Single-Profile Welfarism. Chang (2000, Section IV.A) argues that Kaplow and Shavell’s continuity condition is not compelling. He provides examples that violate this condition in which the ordering \(R_U\) is obtained by serially applying different criteria, including the Weak Pareto Principle. If one rejects Continuity on \(X^1\), then it is possible to be a non-welfarist and satisfy Weak Pareto when only a single profile is considered. However, in view of Theorem 1, one must nevertheless abandon Pareto Indifference.

5 Pareto conflicts with permutation dominance principles

Sen (1970a, Theorem \(9^*2\)) has shown that the Suppes (1966) Grading Principle is inconsistent with Weak Pareto for some possible profiles of preferences. The Suppes Grading Principle is a non-welfaristic principle for socially ranking alternatives based on utility dominance after possibly permuting the positions of the individuals in one of the alternatives. For a fixed profile, Brun and Tungodden (2004) consider a more structured economic environment in which alternatives consist of a commodity bundle for each individual, individuals only care about what they receive, and the individual preferences for own consumption satisfy the standard assumptions of microeconomic theory. In this framework, the Suppes Grading Principle compares alternatives by applying a utility dominance criterion after the commodity bundles in one of the alternatives have been permuted. Brun and Tungodden consider a related permutation dominance principle in which dominance is applied to commodity bundles, not utilities.Footnote 13 In their Observation, they show that their principle is incompatible with the Strong Pareto Principle provided that the preferences for own consumption are not all the same. Although the principles considered by Suppes and by Brun and Tungodden make use of utility dominance comparisons, they are not Pareto criteria; that is, they are not utility dominance criteria in the sense used here, which require that alternatives be compared person by person in terms of their own utilities, not the utilities they might have in some counterfactual situation.

In this section, I use the framework and proof strategy employed by Brun and Tungodden to show that Strong Pareto is incompatible with both a slight strengthening of their dominance principle and the Suppes Grading Principle. This result illustrates the basic conflict identified by Sen and by Brun and Tungodden. I also show that similar reasoning can be used to establish a Pareto Indifference version of this result. This permits me to relate the incompatibilities between any single-profile non-welfarist principle and various versions of the Pareto Principle discussed in the preceding two sections with the more specific conflicts considered in this section.

I now suppose that \(X = \prod _{i=1}^n X^i\), where \(X^i = \mathbb {R}^k_+\) with \(k \ge 2\). An alternative has the form \(x = (x^1, \ldots , x^n)\), where for each \(i \in N\), \(x^i\) is the commodity bundle that specifies the quantities of k divisible private goods for person i. Brun and Tungodden (2004) interpret these goods as being either functionings as in Sen (1985) or primary goods as in Rawls (1971). Functionings are achievements, what an individual does or becomes. Primary goods are goods that facilitate the achievement of a good life whatever one’s conception of a good life turns out to be. The vector x is sometimes written as \((x^i, x^{-i})\), where \(x^{-i}\) is the vector of commodity bundles of everyone but i. As in the preceding sections, there is a single profile of utility functions U and the set of consequences C is U(X).Footnote 14

The utility function \(U^i\) is self-regarding if \(U^i(x^i, x^{-i}) = U^i(x^i, y^{-i})\) for all \(x, y \in X\). If \(U^i\) is self-regarding, \(i\)’s utility only depends on his own consumption and, hence, \(U^i\) can be equivalently expressed using a utility function for own consumption \(\widetilde{U}^i\) defined on \(X^i\). \(U^i\) is a classical private goods utility function if it is self-regarding and \(\widetilde{U}^i\) is continuous, increasing in each of its arguments, and strictly quasiconcave. It is assumed that each person’s utility function satisfies these restrictions.

Classical Private Goods Profile

For each \(i \in N\), \(U^i\) is a classical private goods utility function.

The Pareto conflicts considered in this section presuppose that not everybody has the same preferences for own consumption, what I call Nonidentical Preferences. Individual i has the same preferences for own consumption as individual j if \(\widetilde{U}^i\) is an increasing transform of \(\widetilde{U}^j\).

Nonidentical Preferences

There exist \(i, j \in N\) who do not have the same preferences for own consumption.

Strong Pareto regards any change that makes at least one person better off without harming anyone else as being a social improvement.

Strong Pareto

For all \(x,y \in X\), if \(U(x) > U(y)\), then \(xP_Uy\).

Note that this definition of Strong Pareto differs from the standard one which also stipulates that Pareto Indifference is satisfied. Excluding the Pareto Indifference part of the standard definition of Strong Pareto allows us to conclude that the impossibility established in Theorem 3 is not a corollary to our finding in Theorem 1 that Single-Profile Welfarism is equivalent to Pareto Indifference, at least if the social preference is a quasiordering.

I now consider a dominance condition that combines anonymity and dominance properties for commodity bundles. Permutation Dominance regards alternative x to be socially preferred to y if there is a permutation of the commodity bundles in x that provides someone with more of every good than in y and no less of any good for everybody else.

Permutation Dominance

For all \(x,y \in X\), if there exists a permutation \(\pi :N \rightarrow N\) such that \(x^{\pi (i)} \ge y^i\) for all \(i \in N\) with \(x^{\pi (j)} \gg y^j\) for some \(j \in N\), then \(xP_Uy\).

Permutation Dominance is a strengthening of the Strong Dominance condition considered by Brun and Tungodden (2004). Strong Dominance modifies the antecedent in Permutation Dominance by requiring that \(x^{\pi (i)} \gg y^i\) for all \(i \in N\).

Permutation Dominance is closely related to the Suppes Grading Principle (Suppes 1966). While, in general, the latter principle does not presuppose that the individual utility functions are self-regarding, in order to compare it with Permutation Dominance, I shall suppose that they are. With this proviso, the Suppes Grading Principle says that if it is possible to permute the individual commodity bundles in x in such a way that the permuted alternative Pareto dominates y, then x is socially preferred to y.

Suppes Grading Principle

For all \(x,y \in X\), if there exists a permutation \(\pi :N \rightarrow N\) such that \(\widetilde{U}^i(x^{\pi (i)}) \ge \widetilde{U}^i(y^i)\) for all \(i \in N\) with a strict inequality for some \(j \in N\), then \(xP_Uy\).

If U is a classical private goods profile, then the Suppes Grading Principle is a more stringent requirement than Permutation Dominance because the antecedent in the Suppes Grading Principle is implied by the antecedent in Permutation Dominance.

Theorem 3 illustrates the conflict identified by Sen and by Brun and Tungodden between the Pareto Principle and their non-welfaristic dominance principles using the economic structure employed by Brun and Tungodden, but with Permutation Dominance used instead of Strong Dominance.

Theorem 3

For the triple \(\langle X,U,R_U \rangle \), if \(X = \prod _{i=1}^n X^i\), where \(X^i = \mathbb {R}^k_+\) and \(k \ge 2\), and U is a classical private goods profile with nonidentical preferences, then Strong Pareto is incompatible with Permutation Dominance and with the Suppes Grading Principle.

Proof

Suppose that both Strong Pareto and Permutation Dominance are satisfied. Because U is a classical private goods profile with nonidentical preferences, there exist \(i, j \in N\) for which two indifference curves for own consumption cross. Hence, it is possible to choose alternatives x and y such that (i) \(\widetilde{U}^i(x^i) > \widetilde{U}^i(y^i)\), (ii) \(\widetilde{U}^j(x^j) > \widetilde{U}^j(y^j)\), (iii) \(y^i \gg x^j\), (iv) \(y^j\gg x^i\), and (v) \(x^h = y^h\) for all \(h \ne i, j\). The commodity bundles for i and j are illustrated in Fig. 1 for the case in which there are two goods. By Strong Pareto, \(xP_Uy\). Let \(\hat{y}\) denote the alternative that is obtained by permuting the commodity bundles of i and j in y. Because \(\hat{y}^i \gg x^i\), \(\hat{y}^j \gg x^j\), and \(\hat{y}^h = x^h\) for all \(h \ne i,j\), Permutation Dominance implies that \(yP_Ux\), a contradiction.

The same argument shows that Strong Pareto and the Suppes Grading Principle are inconsistent because \(\widetilde{U}^i(\hat{y}^i) > \widetilde{U}^i(x^i)\), \(\widetilde{U}^j(\hat{y}^j) > \widetilde{U}^j(x^j)\), and \(\widetilde{U}^h(\hat{y}^h) = \widetilde{U}^h(x^h)\) for all \(h \ne i,j\).Footnote 15 \(\square \)
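
A numerical instance of this construction may help fix ideas. The sketch below assumes two individuals, two goods, and Cobb–Douglas-type utilities for own consumption. These functional forms, the bundles, and the function names are illustrative assumptions and are not taken from Brun and Tungodden (2004). Persons i and j in the proof correspond to indices 0 and 1.

```python
from itertools import permutations

# Assumed utilities for own consumption: person 0 cares mostly about good 1,
# person 1 mostly about good 2, so their indifference curves cross.
U_TILDE = [lambda b: b[0] ** 3 * b[1],
           lambda b: b[0] * b[1] ** 3]

def utilities(alt):
    return tuple(u(bundle) for u, bundle in zip(U_TILDE, alt))

def strong_pareto_prefers(x, y):
    """U(x) > U(y): nobody is worse off and somebody is better off."""
    ux, uy = utilities(x), utilities(y)
    return all(a >= b for a, b in zip(ux, uy)) and ux != uy

def permutation_dominance_prefers(x, y):
    """Some permutation of the bundles in x gives everyone at least as much
    of every good as in y and someone strictly more of every good."""
    n = len(x)
    for pi in permutations(range(n)):
        weak = all(a >= b for i in range(n) for a, b in zip(x[pi[i]], y[i]))
        strict = any(all(a > b for a, b in zip(x[pi[i]], y[i]))
                     for i in range(n))
        if weak and strict:
            return True
    return False

def suppes_prefers(x, y):
    """Some permutation of the bundles in x Pareto dominates y when each
    person evaluates the bundle received with his own utility function."""
    n = len(x)
    uy = utilities(y)
    for pi in permutations(range(n)):
        ux = tuple(U_TILDE[i](x[pi[i]]) for i in range(n))
        if all(a >= b for a, b in zip(ux, uy)) and ux != uy:
            return True
    return False

x = ((4, 1), (1, 4))   # utilities (64, 64)
y = ((2, 5), (5, 2))   # utilities (40, 40)
```

Here `strong_pareto_prefers(x, y)` holds, whereas both `permutation_dominance_prefers(y, x)` and `suppes_prefers(y, x)` hold once the two bundles in y are swapped, which reproduces the contradiction in the proof.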

Fig. 1 Illustration of the proofs of Theorems 3 and 4

A notable feature of Theorem 3 is that \(R_U\) is not assumed to satisfy any rationality condition. The proof merely exploits the fact that it is logically impossible for one alternative to be both socially preferred to and socially worse than a second. Moreover, unlike the Kaplow–Shavell Theorem, the impossibility is not a consequence of Pareto Indifference being satisfied when Strong Pareto is. Hence, even if \(R_U\) is assumed to be an ordering, Theorem 3 is not a corollary to Theorem 1.

A simple modification of the argument used to prove Theorem 3 shows that Pareto Indifference is inconsistent with both Permutation Dominance and the Suppes Grading Principle if U is a classical private goods profile with nonidentical preferences.

Theorem 4

For the triple \(\langle X,U,R_U \rangle \), if \(X = \prod _{i=1}^n X^i\), where \(X^i = \mathbb {R}^k_+\) and \(k \ge 2\), and U is a classical private goods profile with nonidentical preferences, then Pareto Indifference is incompatible with Permutation Dominance and with the Suppes Grading Principle.

Proof

As in the proof of Theorem 3, suppose that i and j have different preferences. The assumptions on U imply that it is possible to choose alternatives x and \(\bar{y}\) such that (i) \(U^i(x^i) = U^i(\bar{y}^i)\), (ii) \(U^j(x^j) = U^j(\bar{y}^j)\), (iii) \(\bar{y}^i \gg x^j\), (iv) \(\bar{y}^j\gg x^i\), and (v) \(x^h = \bar{y}^h\) for all \(h \ne i,j\). For i and j, see Fig. 1 for the two-good case. By Pareto Indifference, \(xI_U\bar{y}\). By either Permutation Dominance or the Suppes Grading Principle, \(\bar{y}P_Ux\), a contradiction. \(\square \)
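The two-alternative construction used in this proof can be checked on a concrete case. The Python sketch below is illustrative only and is not part of the formal argument. The utility functions `u_i` and `u_j` and the four bundles are hypothetical choices, not taken from the article, picked so that the two persons' indifference curves cross. Whether these Cobb-Douglas-type functions meet every requirement of a classical private goods profile on the boundary of the orthant is not verified here.

```python
# Hypothetical utilities for own consumption (assumptions, two goods).
def u_i(bundle):
    a, b = bundle
    return a * b**2      # person i: relative taste for good 2

def u_j(bundle):
    a, b = bundle
    return a**2 * b      # person j: relative taste for good 1

def strictly_dominates(s, t):
    """Vector dominance s >> t: strictly more of every good."""
    return all(si > ti for si, ti in zip(s, t))

# Alternatives as dicts from person to bundle (hypothetical numbers).
x = {"i": (2, 16), "j": (6, 6)}
y_bar = {"i": (8, 8), "j": (3, 24)}

# Conditions (i)-(ii): each person is indifferent between x and y_bar,
# so Pareto Indifference declares the two alternatives socially indifferent.
pareto_indifferent = (u_i(x["i"]) == u_i(y_bar["i"])
                      and u_j(x["j"]) == u_j(y_bar["j"]))

# Conditions (iii)-(iv): swap the two bundles in y_bar and compare with x.
y_hat = {"i": y_bar["j"], "j": y_bar["i"]}
permutation_dominates = (strictly_dominates(y_hat["i"], x["i"])
                         and strictly_dominates(y_hat["j"], x["j"]))

# Utility version of the same comparison, as used for the Suppes Grading Principle.
suppes_dominates = (u_i(y_hat["i"]) > u_i(x["i"])
                    and u_j(y_hat["j"]) > u_j(x["j"]))

print(pareto_indifferent, permutation_dominates, suppes_dominates)
```

In this example all three checks hold at once, so the pair of alternatives is ranked as indifferent by one principle and strictly by the other two, which is the conflict the proof describes.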

Theorem 1 applies to any non-welfarist principle and to any set of alternatives, not just to the specific non-welfarist criteria and structured set of alternatives considered in this section. Nevertheless, Theorem 4 does not follow from the conclusion of Theorem 1 that violating Single-Profile Welfarism is inconsistent with Pareto Indifference because, unlike Theorem 1, Theorem 4 does not presuppose that \(R_U\) is a quasiordering. What Theorem 4 demonstrates is that with a more structured set of alternatives, there can be a conflict between a specific non-welfarist criterion and Pareto Indifference without any social rationality restriction whatsoever.

The assumption that preferences for own consumption are not identical ensures that the Permutation Dominance and Suppes Grading Principles are incompatible with welfarism. Suppose that everybody has the same utility function defined on own consumption and that \(R_U\) is the utilitarian ordering of the alternatives. The utilitarian ordering is clearly welfarist and satisfies Pareto Indifference. Because permuting commodity bundles between individuals also permutes their utilities when everybody has the same utility function for own consumption, the utilitarian ordering also satisfies the Permutation Dominance and Suppes Grading Principles. Hence, in this special case, these principles do not conflict with Single-Profile Welfarism. However, when \(R_U\) is assumed to be a quasiordering, by Theorem 1, Pareto Indifference is equivalent to Single-Profile Welfarism. Thus, with nonidentical preferences, it follows from Theorems 1 and 4 that the Permutation Dominance and Suppes Grading Principles are not welfarist.

It is also noteworthy that Theorems 3 and 4 make no use of any continuity assumption for \(R_U\). In contrast, the Kaplow–Shavell Theorem makes essential use of such an assumption in order to ensure that Weak Pareto implies Pareto Indifference. Theorem 3 does not imply Theorem 4, nor does the reverse implication hold. Rather, it is the geometric structure of the problem that permits us to tweak the constructions used to prove the Strong Pareto version of the impossibility theorem in order to prove its Pareto Indifference counterpart.

6 Pareto conflicts with inequality aversion principles

I now turn to conflicts between the Pareto Principle and two inequality aversion principles. The first is a version of the Rawlsian Difference Principle and the second is a multidimensional generalization of the Pigou–Dalton Transfer Principle. I present Pareto Indifference versions of the Weak Pareto impossibility results established by Gibbard (1979) for the former principle and by Fleurbaey and Trannoy (2003) for the latter. I also consider how these impossibility theorems are related to the results discussed in the preceding sections.

6.1 The minimal difference principle

The Rawlsian Difference Principle (Rawls 1971) advocates designing social institutions so as to make the least advantaged as well off as possible, where advantage is determined by an index of primary goods. Rawls identified a number of primary goods (such as rights and liberties, power and opportunities, self-respect, and income and wealth), but did not specify how these goods are to be aggregated into an index. The Difference Principle provides a non-welfarist criterion for ranking social alternatives. However, even if everybody has the same amount of all primary goods except for income, in order to make this principle operational, there remains the difficulty of identifying who is the least advantaged because individuals have different preferences for commodities, with the consequence that how well income advances their interests depends on commodity prices. This problem does not arise if prices are held fixed because then the least advantaged is simply the person with the smallest income. Gibbard (1979) has shown that even if the Difference Principle is restricted to such fixed-price comparisons, there is a conflict with the Weak Pareto Principle provided that not all preferences are identical.

For Gibbard, an alternative is a price vector \(p \in \mathbb {R}^k_{++}\) and a vector of incomes \(\mu = (\mu ^1, \ldots , \mu ^n) \in \mathbb {R}^n_{++}\) for the n individuals. These are dual variables to the commodity bundles considered in Sect. 5. To facilitate the comparison of the impossibility result in this section with the other impossibility results considered in this article, I employ a primal approach. Specifically, I assume that the set of alternatives X is the same as in Sect. 5. Furthermore, it is assumed that U is a classical private goods profile with nonidentical preferences. The set of consequences C is again U(X).

The assumptions employed here imply that for each price-income pair \((p, \mu ^i) \in \mathbb {R}^{k+1}_{++}\), person i has a unique demand vector \(d^i(p, \mu ^i)\). This is the commodity bundle that maximizes \(i\)’s utility function subject to his budget constraint. The primal form of Gibbard’s Minimal Difference Principle says that alternative x is socially preferred to y if x and y are vectors of demands for two price-income situations in which the prices are the same in both situations and the smallest income in the first situation is larger than the smallest income in the second situation.

Minimal Difference Principle

For all \(x,y \in X\), if there exist \((p, \mu ), (p, \bar{\mu }) \in \mathbb {R}^{k+n}_{++}\) such that (i) \(x^i = d^i(p, \mu ^i)\) and \(y^i = d^i(p, \bar{\mu }^i)\) for all \(i \in N\) and (ii) \(\min _i \mu ^i > \min _i \bar{\mu }^i\), then \(xP_Uy\).
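To make the primal form of the principle concrete, the following Python sketch computes demand allocations at a common price vector and checks condition (ii). It is a minimal illustration under stated assumptions: each person is given a Cobb-Douglas utility, for which demand has the familiar closed form of constant expenditure shares, and the taste parameters, prices, and incomes are hypothetical numbers that do not come from the article. The function only checks the antecedent for the supplied price-income pairs; it does not search over all pairs that might rationalize a given allocation.

```python
from fractions import Fraction as F

def cobb_douglas_demand(share_good1, prices, income):
    """Demand for the utility a**s * b**(1 - s): a share s of income goes to good 1."""
    p1, p2 = prices
    return (share_good1 * income / p1, (1 - share_good1) * income / p2)

def antecedent_holds(incomes_x, incomes_y):
    """Condition (ii): the smallest income is strictly larger in the first situation."""
    return min(incomes_x) > min(incomes_y)

shares = [F(2, 3), F(1, 3)]      # hypothetical taste parameters for persons 1 and 2
prices = (F(1), F(2))            # the same price vector p in both situations
mu = [F(6), F(9)]                # incomes in the first situation
mu_bar = [F(3), F(12)]           # incomes in the second situation

# Condition (i): x and y are the demand allocations at (p, mu) and (p, mu_bar).
x = [cobb_douglas_demand(s, prices, m) for s, m in zip(shares, mu)]
y = [cobb_douglas_demand(s, prices, m) for s, m in zip(shares, mu_bar)]

# If True, the Minimal Difference Principle ranks x strictly above y.
x_ranked_above_y = antecedent_holds(mu, mu_bar)
print(x, y, x_ranked_above_y)
```

Exact rational arithmetic is used so that the bundles can be compared without rounding error. In this example the smallest income falls from 6 to 3, so the principle ranks x above y even though the second person is richer in y.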

Theorem 5 demonstrates that this principle is inconsistent with Pareto Indifference provided that U is a classical private goods profile with nonidentical preferences and the social ranking \(R_U\) is a quasiordering.

Theorem 5

For the triple \(\langle X,U,R_U \rangle \), if \(X = \prod _{i=1}^n X^i\), where \(X^i = \mathbb {R}^k_+\) and \(k \ge 2\), U is a classical private goods profile with nonidentical preferences, and \(R_U\) is a quasiordering, then Pareto Indifference is incompatible with the Minimal Difference Principle.

Proof

Suppose that both Pareto Indifference and the Minimal Difference Principle are satisfied. Because U is a classical private goods profile with nonidentical preferences, there exist \(i, j \in N\) with nonidentical indifference curves for own consumption. Without loss of generality, suppose that for fixed values of goods 3 through k, there exists an indifference curve for i that intersects an indifference curve for j from above. It is then possible to choose alternatives x, y, \(\bar{x}\), and \(\bar{y}\) and price-income situations \((p,\mu _1)\), \((p, \mu _2)\), \((q,\mu _3)\), and \((q, \mu _4)\) with \(p_1/p_2 < q_1/q_2\) such that (i) \(x^h = d^h(p, \mu ^h_1)\), \(y^h = d^h(p, \mu ^h_2)\), \(\bar{x}^h = d^h(q, \mu ^h_3)\), and \(\bar{y}^h = d^h(q, \mu ^h_4)\) for all \(h \in N\), (ii) \(U^h(x^h) = U^h(\bar{x}^h)\) and \(U^h(y^h) = U^h(\bar{y}^h)\) for all \(h \in N\), (iii) \(\mu ^h_1 > \mu ^i_1\) and \(\mu ^h_2 > \mu ^i_2\) for all \(h \ne i\), (iv) \(\mu ^h_3 > \mu ^j_3\) and \(\mu ^h_4 > \mu ^j_4\) for all \(h \ne j\), (v) \(\mu ^i_2 > \mu ^i_1\), and (vi) \(\mu ^j_3 > \mu ^j_4\). This construction is illustrated in Fig. 2 when there are two goods and two individuals.Footnote 16

By the Minimal Difference Principle, \(yP_Ux\) and \(\bar{x}P_U\bar{y}\). By Pareto Indifference, \(xI_U\bar{x}\) and \(\bar{y}I_Uy\). Because \(yP_Ux\), \(xI_U\bar{x}\), \(\bar{x}P_U\bar{y}\), and \(\bar{y}I_Uy\), the transitivity of \(R_U\) implies that \(yP_Uy\), which contradicts the reflexivity of \(R_U\). \(\square \)
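The final step of the proof is a purely combinatorial one: four stated rankings cannot all belong to a single transitive relation. The Python sketch below illustrates that step only. The function name `violates_quasiordering` and the string labels for the alternatives are hypothetical, and the check (take the transitive closure of the implied weak relation, then look for a strict preference whose reverse is in the closure) is one straightforward way to test consistency rather than a procedure used in the article.

```python
from itertools import product

def violates_quasiordering(strict, indifferent):
    """True if the stated rankings cannot all belong to one transitive relation.

    strict:      pairs (a, b) meaning a is strictly preferred to b
    indifferent: pairs (a, b) meaning a and b are socially indifferent
    """
    # Weak relation implied by the statements: a P b and a I b both give a R b,
    # and indifference also gives b R a.
    weak = set(strict) | set(indifferent) | {(b, a) for a, b in indifferent}
    items = {a for pair in weak for a in pair}
    # Transitive closure, Warshall style: the intermediate element k is the
    # slowest-moving index of the product, as the algorithm requires.
    for k, a, b in product(items, repeat=3):
        if (a, k) in weak and (k, b) in weak:
            weak.add((a, b))
    # a P b requires that b R a fails, so (b, a) must not be in the closure.
    return any((b, a) in weak for a, b in strict)

# The four rankings obtained in the proof.
strict = [("y", "x"), ("xbar", "ybar")]          # from the Minimal Difference Principle
indifferent = [("x", "xbar"), ("ybar", "y")]     # from Pareto Indifference

print(violates_quasiordering(strict, indifferent))
```

Dropping either indifference statement removes the inconsistency in this sketch, which reflects the fact that all four rankings are needed to close the cycle.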

Fig. 2. Illustration of the proof of Theorem 5

The strategy used to prove Theorem 5 is more transparent when there are only two goods and two individuals. In Fig. 2, person i has a relative taste for good 1, whereas person j has a relative taste for good 2. The commodity bundles in x and y are chosen when the price vector p is such that good 1 is relatively cheap. For these bundles to be demand vectors, i must have the smallest income in both of these cases and he must have more income when he chooses \(y^i\) than when he chooses \(x^i\). Hence, by the Minimal Difference Principle, \(yP_Ux\). For \(\bar{x}\) and \(\bar{y}\), similar reasoning using a price vector q in which good 1 is relatively expensive shows that j is the least advantaged in both cases and that \(\bar{x}P_U\bar{y}\). Pareto Indifference implies that \(xI_U\bar{x}\) and \(\bar{y}I_Uy\). These four rankings are inconsistent with \(R_U\) being a quasiordering.

6.2 The multidimensional transfer principle

For unidimensional distributions of income, the Pigou–Dalton Transfer Principle expresses an aversion to inequality. This principle regards a transfer of income from a richer to a poorer person that does not reverse their ranking in the income distribution as being a social improvement. When there is more than one good, this principle needs to be reformulated so as to take account of differences in the individual preferences. Fleurbaey and Trannoy (2003) have introduced a natural multidimensional version of this transfer principle and shown that it conflicts with the Weak Pareto Principle when not all preferences are identical.Footnote 17 I present a Pareto Indifference version of their result.

As above, U is a classical private goods profile with nonidentical preferences for the set of alternatives X used in Sect. 5 and in Theorem 5 and the set of consequences C is U(X). Consider implementing a Pigou–Dalton transfer between individuals i and j for each good separately. If i has more of one good than j initially, but the reverse is true for some other good, it is unclear if inequality has been reduced by such a transfer. However, if one of these individuals has at least as much of every good as the other and strictly more of at least one of the goods, then this multidimensional transfer is unambiguously inequality reducing. Fleurbaey and Trannoy’s Multidimensional Transfer Principle regards such a transfer as being a social improvement.

Multidimensional Transfer Principle

For all \(x,y \in X\), if there exist \(i, j \in N\) and \(\delta > \mathbf {0}_k\) such that \(x^i + \delta = y^i \le y^j = x^j - \delta \) and \(x^h = y^h\) for all \(h \ne i,j\), then \(yP_Ux\).
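The antecedent of this principle can be written as a simple predicate on pairs of allocations. The Python sketch below is a hedged illustration: the function name `is_multidimensional_transfer` and the example allocations are hypothetical, and \(\delta > \mathbf {0}_k\) is read as meaning that no component of \(\delta \) is negative and at least one is positive, which is an assumption about the vector inequality notation. Integer quantities are used so that the equality tests are exact.

```python
def is_multidimensional_transfer(x, y, i, j):
    """True if y is obtained from x by a transfer delta from j to i as in the principle.

    x, y: lists of bundles (tuples of equal length), one bundle per individual.
    """
    delta = tuple(yi - xi for yi, xi in zip(y[i], x[i]))
    # delta > 0: nothing is taken from i and something is given (assumed convention).
    gives_to_i = all(d >= 0 for d in delta) and any(d > 0 for d in delta)
    # Exactly the same amounts are taken from j.
    takes_from_j = all(xj - d == yj for xj, d, yj in zip(x[j], delta, y[j]))
    # After the transfer, i still has no more of any good than j.
    ranking_kept = all(a <= b for a, b in zip(y[i], y[j]))
    # Nobody else's bundle changes.
    others_fixed = all(x[h] == y[h] for h in range(len(x)) if h not in (i, j))
    return gives_to_i and takes_from_j and ranking_kept and others_fixed

x = [(1, 2), (5, 6), (3, 3)]
y = [(2, 3), (4, 5), (3, 3)]           # delta = (1, 1) moves from person 1 to person 0
overshoot = [(4, 5), (2, 3), (3, 3)]   # delta = (3, 3) reverses the two positions

print(is_multidimensional_transfer(x, y, 0, 1))          # principle applies: y ranked above x
print(is_multidimensional_transfer(x, overshoot, 0, 1))  # principle is silent
```

When the predicate holds, the principle requires \(yP_Ux\); when it fails, the principle places no restriction on how the pair is ranked.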

Theorem 6 shows that the impossibility result established in Theorem 5 is also valid if the Multidimensional Transfer Principle is substituted for the Minimal Difference Principle.

Fig. 3. Illustration of the proof of Theorem 6

Theorem 6

For the triple \(\langle X,U,R_U \rangle \), if \(X = \prod _{i=1}^n X^i\), where \(X^i = \mathbb {R}^k_+\) and \(k \ge 2\), U is a classical private goods profile with nonidentical preferences, and \(R_U\) is a quasiordering, then Pareto Indifference is incompatible with the Multidimensional Transfer Principle.

Proof

Suppose that both Pareto Indifference and the Multidimensional Transfer Principle are satisfied. Because U is a classical private goods profile with nonidentical preferences, there exist \(i, j \in N\) with nonidentical indifference curves for own consumption. Hence, it is possible to choose alternatives x, y, \(\bar{x}\), and \(\bar{y}\) and vectors \(\delta > \mathbf {0}_k\) and \(\bar{\delta } > \mathbf {0}_k\) such that (i) \(x^i + \delta = y^i \le y^j = x^j - \delta \), (ii) \(x^h = y^h\) for all \(h \ne i,j\), (iii) \(\bar{y}^j + \bar{\delta } = \bar{x}^j \le \bar{x}^i = \bar{y}^i - \bar{\delta }\), (iv) \(\bar{x}^h = \bar{y}^h\) for all \(h \ne i,j\), and (v) \(U^h(x^h) = U^h(\bar{x}^h)\) and \(U^h(y^h) = U^h(\bar{y}^h)\) for \(h = i,j\).Footnote 18 For individuals i and j, this construction is illustrated in Fig. 3 for the two-good case.

By the Multidimensional Transfer Principle, \(yP_Ux\) and \(\bar{x}P_U\bar{y}\). By Pareto Indifference, \(xI_U\bar{x}\) and \(\bar{y}I_Uy\). Because \(yP_Ux\), \(xI_U\bar{x}\), \(\bar{x}P_U\bar{y}\), and \(\bar{y}I_Uy\), the transitivity of \(R_U\) implies that \(yP_Uy\), which contradicts the reflexivity of \(R_U\). \(\square \)

The proofs of Theorems 5 and 6 are remarkably similar. For each of the focal individuals i and j, their four consumption bundles are chosen from two pairs of intersecting indifference curves and the relative positions of these bundles are similar in the two cases. The precise locations of the bundles differ in the two proofs, with budget dominance and commonality of marginal rates of substitution used in the proof of Theorem 5 to pin these locations down so as to appeal to the Minimal Difference Principle, whereas vector dominance and commonality of the size of transfers are used in the proof of Theorem 6 so as to appeal to the Multidimensional Transfer Principle. The same preference cycle is generated in both cases. Unlike with the Kaplow–Shavell Theorem, no continuity assumption for \(R_U\) is required.

As is the case with the Permutation Dominance and Suppes Grading Principles, the assumption that preferences for own consumption are not identical ensures that the Minimal Difference and Multidimensional Transfer Principles are incompatible with welfarism. It is straightforward to verify that these principles and Pareto Indifference are satisfied if \(R_U\) is the leximin ordering of utility vectors when everybody has the same utility function for own consumption. However, when preferences for own consumption are not identical, Theorems 1, 5, and 6 imply that the Minimal Difference and Multidimensional Transfer Principles are non-welfarist.
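The leximin claim for the identical-preferences case can be illustrated on a single example. The Python sketch below is not a proof of the general statement: the common utility function `common_utility` (the product of the two quantities) and the allocations are hypothetical, and only one multidimensional transfer is examined. Leximin is implemented by sorting each utility vector from the worst off upward and comparing the sorted lists lexicographically.

```python
def common_utility(bundle):
    """Hypothetical utility function shared by everybody (product of the two quantities)."""
    a, b = bundle
    return a * b

def leximin_strictly_better(u, v):
    """u beats v if, ranking people from worst off upward, u is higher at the first difference."""
    return sorted(u) > sorted(v)     # Python compares lists lexicographically

def leximin_indifferent(u, v):
    """Utility vectors that are permutations of each other are leximin indifferent."""
    return sorted(u) == sorted(v)

# y is obtained from x by transferring delta = (1, 1) from the richer person
# to the poorer one without reversing their positions.
x = [(1, 2), (5, 6)]
y = [(2, 3), (4, 5)]

ux = [common_utility(b) for b in x]   # utilities 2 and 30
uy = [common_utility(b) for b in y]   # utilities 6 and 20

print(leximin_strictly_better(uy, ux))        # the transfer is a leximin improvement
print(leximin_indifferent([2, 30], [30, 2]))  # only the utility levels matter
```

Because the ranking depends only on the sorted utility levels, two allocations that give everybody the same utility are always indifferent, which is how Pareto Indifference is satisfied in this sketch.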

The Weak Pareto versions of Theorems 5 and 6 established by Gibbard (1979) and Fleurbaey and Trannoy (2003) can be established using proofs that are very similar to the ones used here to prove their Pareto Indifference counterparts by slightly modifying the alternatives so that Weak Pareto can be invoked instead of Pareto Indifference. Similar tweaking of the alternatives is what permits Theorems 3 and 4 to be established using similar proofs.

The role that having nonidentical preferences plays in the proofs of Theorems 5 and 6 is similar to the role that it plays in the proof of Theorem 4. However, the latter result differs in some fundamental respects from the other two theorems. First, its proof only requires considering two alternatives, whereas the proofs of Theorems 5 and 6 require considering four. Second, Theorem 4 does not impose any rationality restriction on \(R_U\).

7 Non-welfarist impossibility theorems

In the preceding sections, the consequences have been utilities. I now turn to two results in which the consequences need not have this interpretation. The first is a theorem due to Pattanaik and Xu (2007) about standard of living measurement. The second is an abstract theorem due to Hare (2007) that has a number of concrete applications. In both theorems, the ranking of the consequences depends on some conditioning variables, what Pattanaik and Xu (2012) call context dependence. They distinguish between two types of context dependence, of which the kind considered in this section is type 1 dependence. Pattanaik and Xu (2012) have established a quite general theorem about type 1 dependence from which slight variants of the theorems considered in this section follow as special cases.Footnote 19

7.1 The Pattanaik–Xu Theorem

Now assume that \(X = \mathbb {R}^m_+\), with \(m \ge 2\). Pattanaik and Xu (2007) interpret a vector \(x \in X\) as being the quantities of m divisible functionings for some individual, but it can also be interpreted as being a commodity bundle listing this individual’s quantities of m divisible goods. I shall use the latter interpretation. Pattanaik and Xu permit the quantity of each good to have a finite upper bound; for simplicity, here it is supposed that X is unbounded from above. As in previous sections, \(N = \{1, \ldots , n\}\) is the set of individuals, with \(n \ge 2\).

The objective is to compare commodity bundles in terms of their standards of living both intrapersonally and interpersonally. This is done using a standard of living relation \(\succeq \) on \(N \times X\), with corresponding asymmetric factor \(\succ \) and symmetric factor \(\sim \), respectively. For all \(i,j \in N\) and all \(x,y \in X\), \((i,x) \succeq (j,y)\) is interpreted as meaning that i has a standard of living with the commodity bundle x at least as high as j’s standard of living with the commodity bundle y. For each \(i \in N\), \(\succeq \) defines a conditional ranking \(\succeq _i\) on X that ranks commodity bundles in terms of \(i\)’s standard of living.

It is commonplace to ask: What is the standard of living obtained with a particular commodity bundle? This question implicitly assumes that X is the set of consequences C, and that is what is supposed here. With this interpretation, the identity of who has a commodity bundle is not part of the description of the consequences of an alternative \((i,x)\). Thus, on this view, a consequentialist would require the conditional rankings \(\succeq _i\), \(i = 1, \ldots , n\), to be identical.

In contrast, Pattanaik and Xu suppose that intrapersonal standard of living comparisons should not be invariant across individuals, but should instead respect differences in individual values and the cultural norms in the societies in which they live. A very minimal version of this requirement is provided by their Minimal Relativism condition, which requires that there exist at least two individuals and two commodity bundles for which the standard of living comparison differs.

Minimal Relativism

There exist \(i,j \in N\) and \(x,y \in X\) such that \((i,x) \succ (i,y)\) and \((j,y) \succ (j,x)\).

Thus, the conditional rankings \(\succeq _i\) on X cannot be the same for all \(i \in N\). Hence, in the view being explored, Minimal Relativism is a nonconsequentialist principle because the intrapersonal standard of living comparisons are permitted to depend on the nonconsequentialist information in N. In other words, the ranking of X depends on the context, here provided by the identity of the individual being considered. Minimal Relativism plays a role in the analysis that is similar to the one played by the Nonidentical Preferences condition in the preceding two sections.Footnote 20

PX Dominance

For all \(i,j \in N\) and all \(x,y \in X\) for which \(x \gg y\), \((i,x) \succ (j,y)\).

PX Dominance uses vector dominance in the space of consequences to make standard of living comparisons for individual-commodity bundle pairs.Footnote 21 Specifically, if one commodity bundle x strictly dominates a second commodity bundle y, then no matter who has x and y, the person with x has a higher standard of living than the person with y.

Conditional Continuity

For all \(i \in N\) and all \(x \in X\), the sets \(\{z \in X \mid (i,z) \succeq (i,x)\}\) and \(\{z \in X \mid (i,x) \succeq (i,z)\}\) are closed.

Conditional Continuity requires each of the conditional rankings \(\succeq _i\) to be continuous.

Theorem 7 is the Pattanaik–Xu Theorem. It is a variant of Proposition 1 in Pattanaik and Xu (2007).Footnote 22

Theorem 7

For the triple \(\langle X,U, \succeq \rangle \), if \(X = \mathbb {R}^m_+\) with \(m \ge 2\) and \(\succeq \) is acyclic, then Minimal Relativism, PX Dominance, and Conditional Continuity are incompatible.

Proof

By Minimal Relativism and PX Dominance, there exist \(i, j \in N\) and \(x, y \in X\) with \(\lnot (x \gg y)\) and \(\lnot (y \gg x)\) such that \((i,x) \succ (i,y)\) and \((j,y) \succ (j,x)\). By Conditional Continuity, (a) \((i,x) \succ (i,z)\) for any z sufficiently close to y for which \(z \gg y\) and (b) \((j,y) \succ (j,w)\) for any w sufficiently close to x for which \(w \gg x\). By PX Dominance, \((j,w) \succ (i,x)\) and \((i,z) \succ (j,y)\). However, \((j,y) \succ (j,w)\), \((j,w) \succ (i,x)\), \((i,x) \succ (i,z)\), and \((i,z) \succ (j,y)\) contradict acyclicity. \(\square \)
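The cycle in this proof can be reproduced on a numerical example. The Python sketch below is illustrative only. The functions `v_i` and `v_j`, which stand in for the two conditional standard of living rankings, and the four bundles are hypothetical choices that do not come from the article; they are continuous and disagree about x and y, so they play the roles of Conditional Continuity and Minimal Relativism. The strict comparisons required by the conditional rankings and by PX Dominance are collected as directed edges, and a depth-first search then looks for a cycle.

```python
def v_i(bundle):
    """Hypothetical index representing i's conditional ranking (taste for good 1)."""
    return bundle[0] ** 2 * bundle[1]

def v_j(bundle):
    """Hypothetical index representing j's conditional ranking (taste for good 2)."""
    return bundle[0] * bundle[1] ** 2

def strictly_dominates(s, t):
    return all(a > b for a, b in zip(s, t))

x, y = (40, 10), (10, 40)   # neither bundle vector dominates the other
z, w = (11, 41), (41, 11)   # z >> y and w >> x, both nearby

strict = []   # edges (a, b) meaning a has a strictly higher standard of living than b
if v_i(x) > v_i(z):
    strict.append((("i", x), ("i", z)))   # i's own ranking
if v_j(y) > v_j(w):
    strict.append((("j", y), ("j", w)))   # j's own ranking
if strictly_dominates(w, x):
    strict.append((("j", w), ("i", x)))   # PX Dominance
if strictly_dominates(z, y):
    strict.append((("i", z), ("j", y)))   # PX Dominance

def has_cycle(edges):
    """Depth-first search for a directed cycle among the strict comparisons."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, []).append(b)
        graph.setdefault(b, [])
    state = {}   # node -> "active" while on the current path, "done" afterwards

    def visit(node):
        state[node] = "active"
        for nxt in graph[node]:
            if state.get(nxt) == "active":
                return True
            if nxt not in state and visit(nxt):
                return True
        state[node] = "done"
        return False

    return any(node not in state and visit(node) for node in graph)

print(len(strict), has_cycle(strict))
```

All four comparisons hold for these numbers and together they form a cycle, so no acyclic relation can respect them simultaneously; removing any one of them breaks the cycle in this example.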

Fig. 4. Illustration of the proof of Theorem 7

The proof of Theorem 7 is illustrated in Fig. 4 for the two-good case. The relative positions of w, x, y, and z with respect to each other and with respect to the i and j indifference curves in Fig. 4 are the same as the relative positions of \(y^j\), \(x^i\), \(x^j\), and \(y^i\) in Fig. 1. The similarity between these two figures might lead one to believe that Theorems 3 and 7 are fundamentally the same, but this is not the case. While both proofs exploit some of the same ideas, in the former case, the comparisons are between allocations of commodity bundles to every individual, whereas in the latter case, the comparisons are between commodity bundles for particular individuals. Furthermore, in the proof of Theorem 7, because PX Dominance only takes account of two individuals’ commodity bundles, it does not matter what commodity bundle anybody else has. In contrast, in the proof of Theorem 3, in order to apply Strong Pareto, it matters what everybody consumes. Moreover, Theorem 7 makes use of a social rationality condition on \(\succeq \) (i.e., acyclicity), whereas Theorem 3 does not.

Proposition 1 in Pattanaik and Xu (2007) employs a slightly different dominance condition, what they call Weak Dominance, in which \(x \gg y\) and \((i,x) \succ (j,y)\) are replaced by \(x > y\) and \((i,x) \succeq (j,y)\), respectively, in the statement of PX Dominance. With this alternative dominance principle, it is necessary to strengthen acyclicity to transitivity in order to obtain a contradiction between Minimal Relativism, Weak Dominance, and Conditional Continuity. Both the acyclic and transitive cases are covered by Proposition 1 in Pattanaik and Xu (2012).

7.2 The Hare Theorem

Hare (2007) has established an abstract theorem in which a context-dependent nonconsequentialist principle is shown to be in conflict with a dominance condition in the space of consequences when the ranking of alternatives is acyclic. Hare illustrates his theorem with a number of applications. Most notably, he applies his theorem to the issue of what duties of assistance are owed to needy strangers. Here, I present a version of Hare’s Theorem and discuss how Hare applies it to his aid example.Footnote 23 Hare’s Theorem and this application are considered at greater length in Weymark (2014).

In Hare’s Theorem, it is assumed that the set of alternatives X is the union of two disjoint sets \(X^1\) and \(X^2\). No other structure is imposed on X. The set of consequences C is an \(m\)-dimensional subset of \(\mathbb {R}^m\) for some \(m \ge 2\). There is a function \(f :X \rightarrow C\) that uniquely determines a consequence in C for each \(x \in X\). An evaluator has a preference binary relation R on X with corresponding asymmetric and symmetric factors P and I, respectively. Note that this preference is on the set of alternatives X, not the set of consequences C.

Following Hare (2007), I illustrate this formalism by applying it to the ethical issue of whether it is morally legitimate for a prosperous individual to condition assistance given to someone in great need based on his proximity (either in terms of distance or kinship) when the sacrifice required is moderate. Hare models this problem as one in which the set of individuals is \(N = \{1,2\}\), where individual 1 is the person in the position of offering assistance and individual 2 is the person needing it. Let \(\omega > 0\) be a small number that is much less than individual 1’s wealth. Individual 1 contemplates sacrificing any amount of money in \([0, \omega ]\) to aid individual 2. Let \(\alpha \) denote that the second individual is nearby and \(\beta \) that he is distant. The set of alternatives is \(X = X^1 \cup X^2\), where \(X^1 = [0, \omega ] \times \{\alpha \}\) and \(X^2 = [0,\omega ] \times \{\beta \}\). In this application, consequences are the individuals’ utilities, so \(m = 2\). For simplicity, let the set of consequences C be \(\mathbb {R}^2_+\) and identify f with a profile of utility functions \(U :X \rightarrow \mathbb {R}^2_+\). The preference relation R on X is now interpreted as being the moral preferences of the first individual.

Hare’s dominance principle is H Dominance.

H Dominance

For all \(x, y \in X\), if \(f(x) \gg f(y)\), then xPy.

As is the case with PX Dominance, vector dominance in the space of consequences is used to make inferences about the binary relation of interest (the evaluator’s preference R on X in the case of H Dominance and the standard of living relation \(\succeq \) on \(N \times X\) in the case of PX Dominance). H Dominance requires an evaluator to prefer one alternative to another if the former results in consequences that vector dominate the consequences obtained with the latter.

Hare’s context-dependent nonconsequentialist principle is Variable Trade-Offs.

Variable Trade-Offs

There exist open (relative to \(\mathbb {R}^m\)) subsets A and B of C for which \(\lnot (u \gg v)\) and \(\lnot (v \gg u)\) for all \(u \in A\) and all \(v \in B\) such that \((A \cup B) \subseteq f(X^1)\) and \((A \cup B) \subseteq f(X^2)\). Moreover, for all \(x, y \in X^1\), if \(f(x) \in A\) and \(f(y) \in B\), then xPy, whereas for all \(x, y \in X^2\), if \(f(x) \in A\) and \(f(y) \in B\), then yPx.

The first part of this axiom is a domain richness condition. The set of consequences that can be obtained by alternatives in \(X^1\) need not be the same as the set of consequences that can be obtained by alternatives in \(X^2\). Nevertheless, these two sets are required to have two open sets of consequences in common that do not vector dominate each other. Openness ensures that for any alternative x in \(X^j\), \(j = 1,2\), whose consequence f(x) is in either A or B, it is possible to find a different alternative in the same set whose consequence vector is arbitrarily close to f(x) that vector dominates f(x).Footnote 24 The requirement that no consequence in A vector dominates any consequence in B and vice versa ensures that H Dominance does not apply to a comparison of alternatives that result in two such consequences.

As with Minimal Relativism, the second part of Variable Trade-Offs requires the ranking of the alternatives to be sensitive to nonconsequentialist information. Specifically, the ranking of consequences in A relative to those in B is reversed if these consequences are obtained from alternatives in \(X^1\) rather than from alternatives in \(X^2\).

When interpreted in terms of Hare’s aid example, the first part of Variable Trade-Offs implies that any utility vector in either region A or region B can be obtained by a small amount of assistance regardless of whether the needy individual is nearby or distant. Hare supposes that it is possible to make intrapersonal comparisons of utility differences and that the utilities of individual 2 are much greater in region A of the consequence space than in region B, whereas the utilities of individual 1 in region A are only slightly smaller than in region B. With this interpretation of Hare’s abstract framework, Variable Trade-Offs is a non-welfarist principle that he regards as capturing the “morally undemanding” view that a moderately prosperous person is obligated to make small sacrifices for the nearby needy, but not for the distant needy, even if the benefits from doing so for the beneficiary are substantial. This is not a view that Hare himself subscribes to. Rather, he describes someone who holds this view as being an “ogre”.

Fig. 5. Illustration of the proof of Theorem 8

The constructions in Variable Trade-Offs are illustrated in Fig. 5. The interiors of the two circles are the consequence sets A and B. Although the preference relation R is defined on the set of alternatives X, Variable Trade-Offs allows us to make some inferences about how consequence vectors in A and B are ranked conditional on whether the alternative is in \(X^1\) or \(X^2\). The two “indifference curves” shown in Fig. 5 indicate (i) that any alternative in \(X^1\) that generates a consequence in A is preferred to any alternative in \(X^1\) that generates a consequence in B and (ii) that any alternative in \(X^2\) that generates a consequence in B is preferred to any alternative in \(X^2\) that generates a consequence in A.Footnote 25

Theorem 8 is Hare’s Theorem. It is a formal statement of a result stated somewhat informally in Hare (2007) with his transitivity assumption weakened to acyclicity.

Theorem 8

For the triple \(\langle X,U, R \rangle \), if \(X = X^1 \cup X^2\) with \(X^1 \cap X^2 = \varnothing \), \(C \subseteq \mathbb {R}^m\) with \(m \ge 2\), and R is acyclic, then H Dominance and Variable Trade-Offs are incompatible.

Proof

By Variable Trade-Offs, there exist (i) \(w, x \in X^1\) such that \(f(w) = u \in A\) and \(f(x) = v \in B\) and (ii) \(y, z \in X^2\) such that \(f(y) = \bar{u} \in A\) and \(f(z) = \bar{v} \in B\), where \(\bar{u} \gg u\) and \(v \gg \bar{v}\). By Variable Trade-Offs, wPx and zPy. By H Dominance, yPw and xPz. Thus, R is cyclic. \(\square \)

Figure 5 illustrates the constructions used in this proof. Even though the proof of Theorem 8 is somewhat more indirect than that of Theorem 7 because of the need to consider the alternatives that generate the consequences, the logic underlying them is essentially the same. Theorem 7 makes explicit use of a continuity condition. The analogue in Theorem 8 is the assumption that small vector dominating variations in the consequence vectors in A and B are possible.

When applied to his aid example, Hare regards his theorem as saying that if a potential donor wants to be rational (here, requiring R to be acyclic), then he must reject either H Dominance or Variable Trade-Offs. As I have noted, Hare rejects Variable Trade-Offs, describing someone who holds this view about duties of assistance as being an ogre. In Weymark (2014), I question Hare’s application of his theorem to his aid example and his characterization of someone who subscribes to Variable Trade-Offs as being morally deficient. Conditioning assistance on whether the recipient is a member of one’s own community need not indicate that one is morally deficient. Rather, it demonstrates a concern for what are known as associative duties.Footnote 26 More fundamentally, I have argued that Hare’s Theorem does not apply to his aid scenario because Hare has illegitimately treated the nearby and distant needy as being the same person, which is a physical impossibility. Once they are distinguished, utility consequences are three dimensional and, hence, there are no situations in which H Dominance applies.

8 Concluding remarks

The impossibility theorems discussed in the preceding sections present a nonconsequentialist with the conundrum of how to resolve the fundamental incompatibility of what a priori appear to be appealing principles. In these concluding remarks, I describe three kinds of solutions to this dilemma that have been considered in the literature. The first shifts the focus to a multi-profile setting and only allows nonconsequentialist principles to play a role in inter-profile comparisons. The other two solutions retain the single-profile setting but either reject one or more of the principles or restrict their scope.

Fleurbaey et al. (2003) take issue with the claim made by Kaplow and Shavell (2001) that Theorem 2 establishes that any non-welfarist method of policy assessment must violate the Weak Pareto Principle. They argue that welfarism is generally understood to be the claim that a single social welfare ordering of utility vectors is used to determine the social ordering of the alternatives for each possible profile of utility functions. In this multi-profile setting, in addition to Pareto Indifference, welfarism requires the social ranking of a pair of alternatives to be the same for both profiles if these profiles agree on the individual utilities assigned to them.Footnote 27 Even if one is willing to accept some form of the Pareto Principle, this independence condition is controversial and, therefore, one may well be a non-welfarist in the multi-profile sense without running afoul of any Pareto principle. Fleurbaey et al. (2003) provide an example of a Paretian social welfare functional based on fairness principles that is non-welfarist because it violates the independence described above (see also, Chang 2000, Section IV.B). Nevertheless, their example satisfies Single-Profile Welfarism for each profile considered separately.Footnote 28

In the single-profile setting that has been the focus of this article, one possible way of dealing with incompatible principles is to reject one or more of them. Hare (2007), for example, rejects Variable Trade-Offs, which is his context-dependence axiom. He argues that proximity is not a relevant consideration when deciding whether to aid the nearby or distant needy. Pattanaik and Xu (2012) also suggest dropping one of the axioms, but argue that which one this should be depends on the application. For the applications that they consider, they argue that the dominance principle is the one that should be rejected. For example, with PX Dominance, they note that if individual i has more of every good with the commodity bundle x than individual j has with the commodity bundle y, then it cannot be inferred that i is better off than j; other factors may be relevant when making interpersonal welfare comparisons even when there is commodity bundle dominance. Further grounds for rejecting dominance conditions when the comparisons are between vectors of individual utilities have already been provided in the discussion of Pareto principles in Sect. 3.

Rather than reject a desideratum outright, its scope can be restricted. For example, Fleurbaey and Trannoy (2003) and Fleurbaey (2006) suggest restricting the Multidimensional Transfer Principle considered in Sect. 6.2. One way of doing so is to only apply this principle when the transfers are between individuals with the same preferences. Alternatively, this principle could be applied only when all of the commodity bundles are proportional to each other. With either of these restrictions, not only is there no conflict with standard Pareto principles and an appropriate social rationality condition, it is also possible to satisfy other desirable criteria, such as ones that incorporate equity considerations. Similarly, the set of alternatives to which a dominance principle applies may be restricted, as in Fleurbaey (2007, 2011) and Decancq et al. (2015), who restrict vector dominance comparisons to a single monotonic path in the set of alternatives.Footnote 29

None of these resolutions to the nonconsequentialist’s conundrums is completely satisfactory. But they may be the best we can do given the fundamental incompatibility of nonconsequentialism, dominance, and social rationality in the applications considered here.