1 Introduction

Experiments provide robust evidence of randomness in choice behaviour, especially in choices between risky or uncertain prospects (Loomes 2005; Wilcox 2008; Hey 2014; Hollard et al. 2016). Subjects are often observed to make different choices in successive presentations of the same choice problem. Since the early 1990s increasing attention has been paid to this phenomenon, and to the way in which “noise” is modelled in the analysis of experimental data on choice behaviour.

Such lines of inquiry have prompted revisionist thinking about the descriptive merits of expected utility (EU). By 1995 John Hey was prepared to advance the following tentative hypothesis:

“[O]ne can explain experimental analyses of decision-making under risk better (and simpler) as EU plus noise – rather than through some higher level functional – as long as one specifies the noise appropriately.” (Hey 1995, p. 640)

Numerous subsequent papers have put Hey’s hypothesis to the test. Contributions such as Buschena and Zilberman (2000) and Schmidt and Neugebauer (2007) lend confirmatory evidence, though contrary evidence has also been found (e.g. Loomes and Sugden 1998; Loomes and Pogrebna 2014).

For uncertain prospects, the descriptive merits of subjective expected utility (SEU) have also become the subject of renewed debate. The experiments of Ellsberg (1961) originally discredited SEU by suggesting the prevalence of uncertainty (or ambiguity) aversion. The latter entails the possibility of a strict preference for betting on a given “risky” event—one with an objectively determined probability of occurrence—over betting on an alternative “uncertain” event—one of unknown probability—together with a strict preference for betting against the risky event over betting against the uncertain event. Various generalisations of SEU that can accommodate uncertainty aversion have been proposed, the best known of which are Choquet expected utility (Schmeidler 1989) and maxmin expected utility (Gilboa and Schmeidler 1989).

More recently, new experimental designs have cast doubt on the prevalence of uncertainty aversion and have restored some respectability to SEU. Halevy (2007), for example, found that evidence against SEU all but disappears when attention is restricted to subjects whose behaviour is consistent with the reduction of compound lotteries assumption. Approximately 18% of Halevy’s subjects behave in conformity with this assumption, and 96% of these exhibit behaviour that is consistent with SEU. Abdellaoui et al. (2016) obtain similar, though less dramatic, results. The experiments of Binmore et al. (2012) find substantial support for the principle of insufficient reason (that is, subjective expected utility maximisation with equal subjective probabilities assigned to each state), while “[t]heories that postulate a large level of ambiguity aversion all perform badly” (ibid. p. 233). The results in Hey et al. (2010) are more equivocal, but SEU still performs respectably in describing their aggregate data, and better, in the sense of minimising the prediction (i.e. out-of-sample) log-likelihood, than Choquet expected utility (see their Table 1).

This recent experimental literature, which is ably surveyed by Hey (2014), has prompted experimentalists to think more carefully about the noise term when analysing their data. Amongst theorists, it has revived interest in probabilistic models of choice. These models characterise decision-makers through choice probabilities rather than preference relations. The theoretical challenge is to devise parsimonious, but descriptively accurate, representations for choice probabilities, validated by plausible sets of axioms.

For binary choices, Fechnerian representations are common. Consider pairs of alternatives drawn from a set A. Let \(P(a,b)\) denote the probability with which a particular decision-maker chooses a from the choice set \(\left\{ a,b\right\} \subseteq A\). We call P the decision-maker’s binary choice probability function. The function P has a Fechner model, or Fechner representation, if there exists a utility function \(u:A\rightarrow {\mathbb {R}}\) such that \(P(a,b)\) is a non-decreasing function of the utility difference, \( u(a) -u(b) \). If \(P(a,b)\) is a strictly increasing function of this utility difference, we have a strong Fechner (or strong utility) model. In the latter case, we also say that u is a strong utility for P.

If P has a Fechner model, then binary choice behaviour can be described as “noisy” utility maximisation, with the probability of “error” being inversely related to the absolute utility difference between the two options.
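As a concrete illustration, the following minimal Python sketch implements one strong Fechner model. The logistic link and the `scale` parameter are assumptions made purely for illustration; the definition above requires only that the choice probability be strictly increasing in the utility difference.

```python
import math

def fechner_prob(u_a: float, u_b: float, scale: float = 1.0) -> float:
    """Probability of choosing a over b under a logistic Fechner link.

    Assumption: F(x) = 1 / (1 + exp(-x / scale)). Any strictly increasing F
    with F(x) + F(-x) = 1 would serve equally well.
    """
    return 1.0 / (1.0 + math.exp(-(u_a - u_b) / scale))

# The "error" (choosing the lower-utility option) becomes less likely
# as the absolute utility difference grows.
p_small_gap = fechner_prob(1.0, 0.9)   # close call
p_large_gap = fechner_prob(1.0, 0.0)   # clear-cut choice
```

Here `p_small_gap` lies only slightly above one half, while `p_large_gap` is closer to one, which is the sense in which the probability of error falls as the utility gap widens.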

There is a small literature on the axiomatic foundations of Fechner models for probabilistic choice between risky prospects, and an even smaller literature on Fechner models for choice under uncertainty. The relevant portions of this literature are briefly surveyed in Sect. 2. The present paper adds to this literature, with a particular focus on choice under uncertainty. Since our formal analysis requires that A has a mixture set structure (Herstein and Milnor 1953), we adopt the framework of Anscombe and Aumann (1963) to describe uncertain prospects. An Anscombe–Aumann act is a mapping from states to lotteries. We provide axiomatic foundations for strong utility models in which the utility function u may take (respectively) the subjective expected utility, Choquet expected utility (CEU) or maxmin expected utility (MEU) form. We also axiomatise a strong utility model in which u represents invariant biseparable preferences (Ghirardato et al. 2004, 2005). The latter is a rather broad class of utility functions, which contains SEU, MEU and CEU as special cases.

Our results expose the axiomatic foundations beneath “noisy” versions of the most popular models of choice under uncertainty. These are the models amongst which recent experimental work seeks to adjudicate. When assessing the descriptive accuracy of SEU, experiments such as those of Hey et al. (2010) are really assessing the descriptive credentials of SEU maximisation with Fechnerian noise. It is therefore important to understand the axioms that underpin “noisy” SEU maximisation, and those that underpin its noisy competitors. The present paper provides these axiomatic foundations.

All of our representation theorems are corollaries of a more general result (Theorem 1) which, though somewhat abstract, may be of independent interest. We say that a is weakly stochastically preferred to b (denoted \(a\succsim ^{P}b\)) if the decision-maker is at least as likely to choose a as to choose b from the set \(\left\{ a,b\right\} \). That is,

$$\begin{aligned} a\succsim ^{P}b\ \ \ \text {iff}\ \ \ P(a,b) \ge \frac{1}{2} \end{aligned}$$
(1)

Suppose \(u:A\rightarrow {\mathbb {R}}\) is a utility function that represents \( \succsim ^{P}\), in the usual sense that \(a\succsim ^{P}b\) if and only if \( u(a) \ge u(b) \). A natural question arises:

What are sufficient conditions for P to be a strictly increasing function of utility differences measured by u?

Theorem 1 provides an answer to this question. The required conditions are joint restrictions on P and u. The restriction on u is a restricted form of linearity that we call M-linearity (Definition 3). It is partnered with a corresponding restriction on P called Strong M-Independence (Axiom 3). The latter is a weakened form of the Strong Independence axiom of Dagsvik (2008).

Many interesting classes of utility functions are M-linear (for suitable choice of M). In an Anscombe–Aumann environment, the SEU, MEU and CEU classes are all M-linear when M is the set of “constant” acts—constant mappings from states to lotteries—as we explain in Sect. 5.

To see the practical significance of Theorem 1, fix some M-linear class of utility functions, \({\mathcal {U}}\). For example, \({\mathcal {U}}\) might be the class of CEU functions in an Anscombe–Aumann environment. Theorem 1 provides a three-step “recipe” for assembling a set of conditions on P that are sufficient for the existence of a strong utility contained in \({\mathcal {U}}\):

Step 1. Identify axioms on preferences which are sufficient for a utility representation within the class \({\mathcal {U}}\). Such axioms will usually be available for utility classes of interest. (In the case of CEU, we could use the axioms of Schmeidler (1989), for example.)

Step 2. Impose these axioms on the weak stochastic preference relation, \(\succsim ^{P}\), and identify the corresponding restrictions on P by using (1). The resulting restrictions on P ensure that \( \succsim ^{P}\) has a utility representation \(u\in {\mathcal {U}}\).

Step 3. Add the Strong M-Independence axiom (plus any further restrictions on P identified in Theorem 1).

This recipe allows us to leverage any axiomatically grounded model of deterministic choice, described by a family \({\mathcal {U}}\) of M-linear utility functions, into an axiomatically grounded model of stochastic choice in which a utility function from \({\mathcal {U}}\) is maximised with Fechnerian error.

The remainder of the paper is organised as follows. The next section reviews the related literature on Fechner representation theorems. Section 3 contains our main result (Theorem 1). The remaining sections present applications of Theorem 1. In Sect. 4, A is a set of lotteries (the domain of risk) and we axiomatise a strong Fechner model in which u has the EU form (Proposition 1). Dagsvik (2008, 2015) already provided axiomatic foundations for such a model, but our axioms differ from his. “Appendix 2” presents a variant of our representation theorem which shows that one of Dagsvik’s (2008) axioms can be significantly weakened without jeopardising the representation (Theorem 3). In Sect. 5, A is a set of Anscombe–Aumann acts (the domain of uncertainty). We provide four strong Fechner representation theorems, one for each of the following utility classes: invariant biseparable (Proposition 2), subjective expected utility (Proposition 3), maxmin expected utility (Proposition 4) and Choquet expected utility (Proposition 5). Section 6 provides some discussion of our results. “Appendix 1” contains proofs omitted from the text.

2 Related literature

There are few results on the axiomatics of Fechner, or Fechner-like, models for choice between risky or uncertain prospects. We review the key contributions here, as well as some related work on random utility models.

2.1 Risk

For the domain of risk, Blavatskyy (2008) provides sufficient conditions for the existence of a Fechner representation with u of the expected utility form. Dagsvik (2008, 2015) provides sufficient conditions for a strong Fechner representation with u of the expected utility form. In terms of axioms, the critical difference between the results of Dagsvik and Blavatskyy is the stochastic version of the independence property that is used: Strong Independence (Dagsvik 2008, Axiom 5) versus Common Consequence Independence (Blavatskyy 2008, Axiom 4).

We focus on strong Fechner models in the present paper. For the domain of risk, we obtain two sets of sufficient conditions for the existence of a strong Fechner model with u of the EU form—Proposition 1 and Theorem 3—each of which differs significantly from the axiomatisations of Dagsvik.

Following the lead of Wilcox (2008, 2011), recent experimental literature often employs “contextual” Fechner models, in which choice probabilities depend on utility differences that are “normalised” to reflect different choice contexts. In these models, the same utility difference may vary in its impact on choice depending on the particular alternatives being compared, that is, on the “context” in which the utilities arise. A standard Fechnerian logic applies within any fixed context.

For the domain of risk, Wilcox (2008, 2011) advocates a contextual model in which the relevant context consists of the best and worst possible outcomes across the two alternative lotteries under consideration. The term “contextual Fechner model” is often used to refer specifically to Wilcox’s model.

Blavatskyy (2011, 2012) suggests a different specification of context, one based on a dominance relation—first-order stochastic dominance for choice under risk (Blavatskyy 2011) or statewise dominance for choice under uncertainty (Blavatskyy 2012). If A is a set of lotteries with outcomes in \({\mathbb {R}}_{+}\) (a risky domain), with each lottery described by its associated distribution function, then the first-order stochastic dominance relation induces a lattice on A. Blavatskyy’s proposed “context” for a choice between a and b is the least upper bound and greatest lower bound for \(\left\{ a,b\right\} \) within this lattice. Blavatskyy (2011) provides an axiomatic foundation for a contextual Fechner model of this sort.

All representation theorems in the present paper are for standard (i.e. context-independent) Fechner models. Our purpose is to describe a new approach to the construction of such theorems—one embodied in the “recipe” based on Theorem 1—and to use this approach to prove some new representation results, especially for the domain of uncertainty.

That said, the axiomatisation of contextual Fechner models is certainly an important task for future research, and the results of Blavatskyy (2011, 2012) are valuable first steps in this direction.

2.2 Uncertainty

For choice between uncertain prospects, the aforementioned paper by Blavatskyy (2012) is the only axiomatisation of a Fechner(-like) model of which we are aware. In Blavatskyy (2012), A is a set of Savage acts—functions from a given state space, \({\mathcal {S}}\), to an arbitrary outcome space, X (outcomes need not be lotteries)—and Blavatskyy axiomatises a model of SEU maximisation with “contextual” Fechnerian error. The context is based on a statewise dominance relation. Given a utility function \(u:A\rightarrow {\mathbb {R}}\), one induces a utility function on X via the utilities of constant acts (i.e. acts that assign the same outcome to every state). We use u to denote this utility function on X also. Given Savage acts, a and b, let \(a\vee b\) denote a Savage act that gives, in each state s, an outcome with utility

$$\begin{aligned} \max \left\{ u\left( a(s) \right) ,u\left( b(s) \right) \right\} \end{aligned}$$

and let \(a\wedge b\) be a Savage act that gives, in each state s, an outcome with utility

$$\begin{aligned} \min \left\{ u\left( a(s) \right) ,u\left( b(s) \right) \right\} \text {.} \end{aligned}$$

Thus, a and b are mutually statewise dominated by \(a\vee b\) and they mutually statewise dominate \(a\wedge b\). The acts \(a\vee b\) and \(a\wedge b\) provide the “context” in which a and b are evaluated. In Blavatskyy’s (2012) representation, \(P(a,b)\) is a function of the normalised utility difference

$$\begin{aligned} \frac{u(a) -u(b)}{u\left( a\vee b\right) -u\left( a\wedge b\right) } \end{aligned}$$
(2)

where u has the SEU form. Blavatskyy provides a set of necessary and sufficient conditions for the existence of a representation of this form.
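The normalised difference (2) can be computed directly for a finite state space. The sketch below is illustrative only: it assumes that each act is represented by its tuple of statewise outcome utilities, and the prior and the two acts are hypothetical numbers, not taken from Blavatskyy (2012).

```python
from fractions import Fraction

def seu(profile, prior):
    """Subjective expected utility of an act given as a tuple of statewise utilities."""
    return sum(p * x for p, x in zip(prior, profile))

def normalised_difference(a, b, prior):
    """Utility difference (2): (u(a) - u(b)) / (u(a v b) - u(a ^ b)).

    Acts are represented by their statewise outcome utilities (an assumption
    of this sketch), so a v b and a ^ b are the statewise max and min.
    Returns None when the denominator is zero, where (2) is undefined.
    """
    join = tuple(max(x, y) for x, y in zip(a, b))
    meet = tuple(min(x, y) for x, y in zip(a, b))
    spread = seu(join, prior) - seu(meet, prior)
    if spread == 0:
        return None
    return (seu(a, prior) - seu(b, prior)) / spread

# Hypothetical three-state example.
prior = (Fraction(1, 2), Fraction(1, 4), Fraction(1, 4))
a = (Fraction(4), Fraction(0), Fraction(2))
b = (Fraction(1), Fraction(3), Fraction(2))
ratio = normalised_difference(a, b, prior)   # Fraction(1, 3)
```

In this example u(a) - u(b) = 3/4 and u(a ∨ b) - u(a ∧ b) = 9/4, so the ratio is 1/3. When a statewise dominates b, the ratio equals 1, its largest possible value.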

The present paper is complementary to Blavatskyy’s work. We consider Anscombe–Aumann acts, rather than Savage acts, and we obtain a conventional (i.e. context-independent) strong Fechner representation. In addition to providing an axiomatic foundation for a strong utility of the SEU form, we also axiomatise strong Fechner models in which u takes the CEU or MEU form, as well as a more general model in which u may represent any invariant biseparable preference ordering.

2.3 Random utility models

The main rivals to the Fechner models are the random utility models. In a random utility model, the decision-maker has a set of utility functions, one of which is randomly drawn according to a fixed probability measure whenever the decision-maker has a decision to make. The randomly selected utility function is then maximised without error. In a random utility model, \(P(a,b)\) is the probability that a utility function is selected for which the utility of a exceeds that of b.

Gul and Pesendorfer (2006) axiomatise a random expected utility model, which requires all possible utility functions to have the expected utility form. Lu (2014) provides axiomatic foundations for a random utility model for choice under uncertainty in an Anscombe–Aumann environment. In Lu’s model, utility functions are randomly selected from a subset of the MEU class.

Random utility models have the advantage that they exclude the possibility of dominated alternatives being chosen. They are also suitable for studying multinomial choice. However, they suffer from their own drawbacks. Most notably, they admit violations of Weak Stochastic Transitivity, which requires that the weak stochastic preference relation \(\succsim ^{P}\) be transitive (see Luce and Suppes 1965, Theorem 43). The preponderance of experimental evidence suggests that violations of Weak Stochastic Transitivity are rare (Rieskamp et al. 2006; Loomes et al. 2015).

3 A general result for mixture set domains

The basic object of analysis is a binary choice probability function (BCPF) defined over a set, A, of alternatives. This is a mapping

$$\begin{aligned} P:A\times A\rightarrow \left[ 0,1\right] \end{aligned}$$

that satisfies

$$\begin{aligned} P(a,b) =1-P\left( b,a\right) \end{aligned}$$
(3)

for all \(a,b\in A\). The quantity \(P(a,b)\) is interpreted as the probability with which the decision-maker selects a when given the choice of a or b. Abstention is not an option (choices are “forced”), so BCPFs must satisfy the completeness (or balance) condition (3).

Our interpretation of \(P(a,b)\) is behaviourally meaningful only if \(a\ne b\), but it is traditional to define P on the entire Cartesian product \(A\times A\) for convenience. An immediate implication of (3) is that

$$\begin{aligned} P\left( a,a\right) =\ \frac{1}{2} \end{aligned}$$

for any \(a\in A\).

Given a BCPF, P, we may construct the following binary relation on A: for any \(a,b\in A\),

$$\begin{aligned} a\succsim ^{P}b~~~\Leftrightarrow ~~~P(a,b) \ge P\left( b,a\right) ~~~\Leftrightarrow ~~~P(a,b) \ge \frac{1}{2} \end{aligned}$$
(4)

where the second equivalence follows from (3). When \(a\succsim ^{P}b\) we say that a is weakly stochastically preferred to b. That is, a decision-maker weakly stochastically prefers a over b if she is at least as likely to choose a from \(\left\{ a,b\right\} \) as she is to choose b. It is natural to think of P as a “noisy” expression of these preferences. Note, however, that while \(\succsim ^{P}\) is complete by construction, it need not be transitive. In particular, there may not exist any utility representation for \(\succsim ^{P}\). The asymmetric and symmetric parts of \(\succsim ^{P}\) are denoted \(\succ ^{P}\) and \(\sim ^{P}\), respectively:

$$\begin{aligned} a\succ ^{P}b~~~\Leftrightarrow ~~~P(a,b) >\frac{1}{2} \end{aligned}$$

and

$$\begin{aligned} a\sim ^{P}b~~~\Leftrightarrow ~~~P(a,b) =\frac{1}{2}\text {.} \end{aligned}$$
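For a finite set of alternatives, these objects are easy to compute. The following sketch stores a BCPF as a square matrix of exact fractions (a representation assumed here for illustration) and tests whether the induced relation \(\succsim ^{P}\) is transitive. The cyclic matrix is a hypothetical example of a BCPF for which it is not.

```python
from fractions import Fraction
from itertools import product

def is_bcpf(P):
    """Check the balance condition (3): P(a, b) = 1 - P(b, a) for all a, b."""
    n = len(P)
    return all(P[i][j] + P[j][i] == 1 for i, j in product(range(n), repeat=2))

def weakly_preferred(P, i, j):
    """a_i is weakly stochastically preferred to a_j iff P(a_i, a_j) >= 1/2."""
    return P[i][j] >= Fraction(1, 2)

def is_transitive(P):
    """Weak Stochastic Transitivity: the relation induced by P is transitive."""
    n = len(P)
    return all(
        weakly_preferred(P, i, k)
        for i, j, k in product(range(n), repeat=3)
        if weakly_preferred(P, i, j) and weakly_preferred(P, j, k)
    )

h, hi, lo = Fraction(1, 2), Fraction(2, 3), Fraction(1, 3)
# A hypothetical cycle: a beats b, b beats c, c beats a (each with probability 2/3).
cyclic = [[h, hi, lo],
          [lo, h, hi],
          [hi, lo, h]]
```

The matrix `cyclic` satisfies (3), yet a is stochastically preferred to b, b to c, and c to a, so no utility function can represent the induced relation.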

Following Marschak (1960) we introduce the following definitions:

Definition 1

We call \(u:A\rightarrow {\mathbb {R}}\) a weak utility for P if u represents \(\succsim ^{P}\); that is, if the following holds for any \(a,b\in A \):

$$\begin{aligned} a\succsim ^{P}b~~~\text {iff}~~~u(a) \ge u(b) \end{aligned}$$
(5)

Definition 2

We call \(u:A\rightarrow {\mathbb {R}}\) a strong utility for P if

$$\begin{aligned} P(a,b) \ge P\left( c,d\right) ~~~\text {iff}~~~u(a) -u(b) \ge u\left( c\right) -u(d) \end{aligned}$$
(6)

for any \(a,b,c,d\in A\).

This terminology is motivated by the following observation.

Lemma 1

If u is a strong utility for P, then u is a weak utility for P.

Proof

If u is a strong utility for P, then

$$\begin{aligned} P(a,b) \ge P\left( b,a\right) ~~~\Leftrightarrow ~~~u(a) \ge u(b) \end{aligned}$$

for any \(a,b\in A\). Hence, using (4), we see that u is a weak utility for P. \(\square \)
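The converse of Lemma 1 fails in general, as a small finite example shows. The sketch below checks conditions (5) and (6) by brute force. The link function `link` and the utility vectors `u` and `v` are hypothetical choices made for illustration, and exact rational arithmetic is used so that equal utility differences compare as equal.

```python
from fractions import Fraction
from itertools import product

def link(x):
    """Hypothetical strictly increasing F with F(x) + F(-x) = 1 (exact arithmetic)."""
    x = Fraction(x)
    return Fraction(1, 2) + x / (2 * (1 + abs(x)))

def is_weak_utility(u, P):
    """Condition (5): P(a, b) >= 1/2 iff u(a) >= u(b)."""
    idx = range(len(u))
    return all((P[i][j] >= Fraction(1, 2)) == (u[i] >= u[j])
               for i, j in product(idx, repeat=2))

def is_strong_utility(u, P):
    """Condition (6): P(a, b) >= P(c, d) iff u(a) - u(b) >= u(c) - u(d)."""
    idx = range(len(u))
    return all((P[a][b] >= P[c][d]) == (u[a] - u[b] >= u[c] - u[d])
               for a, b, c, d in product(idx, repeat=4))

u = [0, 1, 3]                      # illustrative utilities on a three-element A
P = [[link(x - y) for y in u] for x in u]
v = [0, 1, 2]                      # same ordering as u, different differences
```

Here `u` is both a weak and a strong utility for `P`. The vector `v` orders the alternatives in the same way, so it is a weak utility for `P`, but it is not a strong utility because it equalises two utility differences that `P` distinguishes.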

When P possesses a strong utility function, choice probabilities are determined by utility differences in the Fechnerian tradition of psychophysics (Falmagne 2002).

Suppose u is a strong utility for P and let

$$\begin{aligned} \varGamma _{u}=\left\{ u(a) -u(b) \ \left| \ a,b\in A\right. \right\} \end{aligned}$$
(7)

be the set of utility differences generated by u. Note that \(\varGamma _{u}\) is symmetric about 0: if \(x\in \varGamma _{u}\) then \(-x\in \varGamma _{u}\). It follows that there exists a strictly increasing function \(F:\varGamma _{u}\rightarrow \left[ 0,1\right] \) such that

$$\begin{aligned} P(a,b) =F\left( u(a) -u(b) \right) \end{aligned}$$
(8)

for any \(a,b\in A\). The completeness condition (3) implies that F must also satisfy

$$\begin{aligned} F(x) +F\left( -x\right) =1 \end{aligned}$$
(9)

for any \(x\in \varGamma _{u}\). Suppose F is continuous. Then we may interpret F as a distribution function (or rather, the restriction to \(\varGamma _{u}\) of a distribution function) for some zero-mean, symmetrically distributed random variable, \({\tilde{\varepsilon }}\), so that

$$\begin{aligned} P(a,b) =F\left( u(a) -u(b) \right) =\Pr \left[ u(a) -u(b) \ge {\tilde{\varepsilon }}\right] \end{aligned}$$
(10)

for any \(a,b\in A\). The representation (10) is the guise in which Fechnerian models are usually encountered by economists.

In this paper, we study conditions under which a BCPF has a strong utility representation within specific classes of utility functions. Throughout, we assume that A is a mixture set (Herstein and Milnor 1953). Mixture sets generalise the Euclidean notion of a convex set. Given \(a,b\in A\) and \( \lambda \in \left[ 0,1\right] \), we write \(a\lambda b\) for the \(\lambda \)-mixture of a and b. In particular, \(a1b=a\) and \(a0b=b\). For example, if A is a convex subset of \({\mathbb {R}}^{n}\) then

$$\begin{aligned} a\lambda b=\lambda a+\left( 1-\lambda \right) b \end{aligned}$$

under the standard mixture operation on \({\mathbb {R}}^{n}\).

Mixture sets are very familiar in the domain of risk. The unit simplex in \({\mathbb {R}}^{n}\), denoted by \(\varDelta ^{n}\), may be used to describe the set of all lotteries over a fixed set of n possible outcomes. This is a mixture (indeed, convex) set under the usual mixing operation for \({\mathbb {R}}^{n}\). Similarly, spaces of distribution functions on a given interval are mixture sets under the usual mixing operation on real-valued functions. For the domain of uncertainty, mixture sets appear in the framework of Anscombe and Aumann (1963). Given a set \({\mathcal {S}}\) of states and a mixture set \({\mathcal {C}}\) of consequences, an Anscombe–Aumann act is a function from \({\mathcal {S}}\) to \({\mathcal {C}}\). Anscombe and Aumann (1963) assume that \({\mathcal {C}}\) is a set of lotteries but none of our formal results relies on this interpretation. If A is the set of Anscombe–Aumann acts, the mixture operation on \({\mathcal {C}}\) induces a mixture operation on A as follows: given \(a,b\in A\) and \(\lambda \in \left[ 0,1\right] \), \(a\lambda b\) is the Anscombe–Aumann act that maps state \(s\in {\mathcal {S}}\) to the consequence \(a(s) \lambda b(s) \in {\mathcal {C}}\).
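The induced mixture operation on acts can be sketched in a few lines. The example assumes, for illustration only, a two-state space with hypothetical state labels and consequences that are lotteries over two outcomes, stored as tuples of exact probabilities.

```python
from fractions import Fraction

def mix_lotteries(p, q, lam):
    """lam-mixture of two lotteries over the same finite outcome set."""
    return tuple(lam * x + (1 - lam) * y for x, y in zip(p, q))

def mix_acts(a, b, lam):
    """Statewise mixture: state s is mapped to a(s) lam b(s)."""
    return {s: mix_lotteries(a[s], b[s], lam) for s in a}

half = Fraction(1, 2)
# Two hypothetical acts on states {"rain", "sun"} with lotteries over two outcomes.
a = {"rain": (Fraction(1), Fraction(0)), "sun": (Fraction(0), Fraction(1))}
b = {"rain": (half, half), "sun": (half, half)}   # a constant act
c = mix_acts(a, b, Fraction(1, 4))
```

The act `c` assigns the lottery (5/8, 3/8) to the first state and (3/8, 5/8) to the second. Setting the weight to 1 or 0 returns `a` or `b`, in line with the mixture set conventions \(a1b=a\) and \(a0b=b\).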

When A is a mixture set, we say that \(u:A\rightarrow {\mathbb {R}}\) is mixture linear if

$$\begin{aligned} u\left( a\lambda b\right) =\lambda u(a) +\left( 1-\lambda \right) u(b) \end{aligned}$$

for any \(a,b\in A\) and any \(\lambda \in \left[ 0,1\right] \). For example, if \(A=\varDelta ^{n}\) then EU functions are mixture linear. The following definition generalises the notion of mixture linearity:

Definition 3

Given some \(M\subseteq A\) we say that \(u:A\rightarrow {\mathbb {R}}\) is M-linear if \(u\left( M\right) =u\left( A\right) \) and

$$\begin{aligned} u\left( a\lambda b\right) =\lambda u(a) +\left( 1-\lambda \right) u(b) \end{aligned}$$

for any \(a\in A\), any \(b\in M\) and any \(\lambda \in \left[ 0,1\right] \).

When \(M=A\) the notion of M-linearity coincides with ordinary mixture linearity. For future reference, it will be useful to note that the range of any M-linear function is convex.

Lemma 2

If u is M-linear, then \(u\left( A\right) \) is an interval (i.e. a convex subset of \({\mathbb {R}}\)).

Proof

Suppose \(x,y\in u\left( A\right) \) and \(\lambda \in \left[ 0,1\right] \). If u is M-linear, then \(u\left( M\right) =u\left( A\right) \), so there exist \(a,b\in M\) such that \(u(a) =x\) and \(u(b) =y\). Moreover, since \(b\in M\):

$$\begin{aligned} u\left( a\lambda b\right) =\lambda x+\left( 1-\lambda \right) y \end{aligned}$$

so \(\lambda x+\left( 1-\lambda \right) y\in u\left( A\right) \). \(\square \)

To the best of our knowledge, the notion of M-linearity has not been defined elsewhere in the literature. We introduce it here for two reasons. First, we can identify simple conditions on P which ensure that any M-linear weak utility for P is also a strong utility for P (Theorem 1). This is the central result of the paper. Second, many familiar classes of utility functions are M-linear for suitable choice of M. These include the SEU, MEU and CEU classes in the Anscombe–Aumann domain.

Example 1

Let A be the Anscombe–Aumann domain with finite state space \( {\mathcal {S}}=\left\{ 1,\ldots ,S\right\} \) and consequence set \({\mathcal {C}}\). Let

$$\begin{aligned} \varTheta =\ \left\{ \theta :{\mathcal {S}}\rightarrow \left[ 0,1\right] \ \left| \ \sum _{s=1}^{S}\theta (s) =1\right. \right\} \end{aligned}$$

be the set of all probability distributions over states. A utility function \(u:A\rightarrow {\mathbb {R}}\) has the maxmin expected utility form if there exists a non-empty, closed and convex set \({\mathcal {P}}\subseteq \varTheta \) and a mixture linear function \(v:{\mathcal {C}}\rightarrow {\mathbb {R}}\) such that

$$\begin{aligned} u(a) =\min _{\theta \in {\mathcal {P}}}\sum _{s\in {\mathcal {S}}}\theta (s) v\left( a(s) \right) \end{aligned}$$
(11)

for all \(a\in A\). Let

$$\begin{aligned} {\overline{A}}=\left\{ a\in A\ \left| \ a(s) =a\left( s^{\prime } \right) \text { for each } s,s^{\prime } \in {\mathcal {S}}\right. \right\} \end{aligned}$$

be the set of constant acts. Since

$$\begin{aligned} u\left( A\right) \supseteq u\left( {\overline{A}}\right) =v\left( {\mathcal {C}} \right) \supseteq u\left( A\right) \end{aligned}$$

(where the final inclusion follows from the mixture linearity of v) we have \(u\left( A\right) =u\left( {\overline{A}}\right) \). Moreover, if \(a\in A\) and \(b\in {\overline{A}}\) with \(b(s) =x\in {\mathcal {C}}\) for every \( s\in {\mathcal {S}}\), then

$$\begin{aligned} u\left( a\lambda b\right)= & {} \ \min _{\theta \in {\mathcal {P}}}\sum _{s\in {\mathcal {S}}}\theta (s) \left[ \lambda v\left( a(s) \right) +\left( 1-\lambda \right) v(x) \right] \\= & {} \ \lambda \left[ \min _{\theta \in {\mathcal {P}}}\sum _{s\in {\mathcal {S}}} \theta (s) v\left( a(s) \right) \right] \ +\ \left( 1-\lambda \right) v(x) \\= & {} \ \lambda u(a) +\left( 1-\lambda \right) u(b) \text {.} \end{aligned}$$

It follows that (11) is \({\overline{A}}\)-linear.
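Example 1 can be checked numerically. The sketch below makes two simplifying assumptions: acts are represented by their statewise utilities \(v(a(s))\), and the set of priors is the convex hull of two hypothetical extreme points, so the minimum in (11) is attained at one of them. It shows that linearity holds when mixing with a constant act, and that it can fail when mixing two non-constant acts.

```python
from fractions import Fraction

def meu(profile, priors):
    """Maxmin expected utility of an act given by its statewise utilities v(a(s)).

    `priors` lists the extreme points of the (hypothetical) set of priors;
    a linear function attains its minimum over the convex hull at one of them.
    """
    return min(sum(t * x for t, x in zip(theta, profile)) for theta in priors)

def mix(a, b, lam):
    """Utility profile of the mixture; valid because v is mixture linear."""
    return tuple(lam * x + (1 - lam) * y for x, y in zip(a, b))

F = Fraction
priors = [(F(1, 4), F(3, 4)), (F(3, 4), F(1, 4))]
lam = F(1, 2)
a = (F(0), F(4))        # non-constant act
b = (F(2), F(2))        # constant act: same utility in both states
c = (F(4), F(0))        # non-constant act that hedges a

lhs_const = meu(mix(a, b, lam), priors)
rhs_const = lam * meu(a, priors) + (1 - lam) * meu(b, priors)
lhs_hedge = meu(mix(a, c, lam), priors)
rhs_hedge = lam * meu(a, priors) + (1 - lam) * meu(c, priors)
```

With these numbers, `lhs_const` and `rhs_const` both equal 3/2, as \({\overline{A}}\)-linearity requires. By contrast, `lhs_hedge` is 2 while `rhs_hedge` is 1, so (11) is not mixture linear on all of A.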

In order to state our main result, we need three axioms.

Axiom 1

(Strong Stochastic Transitivity) For all \(a,b,c\in A\), if

$$\begin{aligned} \min \left\{ P(a,b) ,\ P\left( b,c\right) \right\} \ \ge \ \frac{1}{2} \end{aligned}$$

then

$$\begin{aligned} P(a,c) \ \ge \ \max \left\{ P(a,b) ,\ P(b,c) \right\} \text {.} \end{aligned}$$

Strong Stochastic Transitivity is a standard assumption in the literature on binary stochastic choice. It implies (but is not implied by) the transitivity of \(\succsim ^{P}\) (i.e. the Weak Stochastic Transitivity of P)—see Fishburn (1973).
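On a finite set of alternatives, Axiom 1 can be verified by enumeration. The following sketch assumes that P is stored as a square matrix of exact fractions. The matrix `weak_only` is a hypothetical BCPF for which \(\succsim ^{P}\) is transitive although Strong Stochastic Transitivity fails.

```python
from fractions import Fraction
from itertools import product

def satisfies_sst(P):
    """Axiom 1: if min{P(a,b), P(b,c)} >= 1/2 then P(a,c) >= max{P(a,b), P(b,c)}."""
    n = len(P)
    half = Fraction(1, 2)
    return all(
        P[a][c] >= max(P[a][b], P[b][c])
        for a, b, c in product(range(n), repeat=3)
        if min(P[a][b], P[b][c]) >= half
    )

F = Fraction
# Hypothetical BCPF whose induced preference is transitive (a over b over c)
# but where P(a, c) falls short of P(a, b).
weak_only = [[F(1, 2), F(4, 5), F(3, 5)],
             [F(1, 5), F(1, 2), F(7, 10)],
             [F(2, 5), F(3, 10), F(1, 2)]]
```

In `weak_only`, a is chosen over c with probability 3/5, which is below both P(a,b) = 4/5 and P(b,c) = 7/10, so Axiom 1 is violated even though the induced ranking of a, b and c is transitive.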

Axiom 2

(Solvability) For all \(a,b,c\in A\) and all \(\rho \in \left( 0,1\right) \)

$$\begin{aligned} P(a,b) \ge \rho \ge P\left( a,c\right) ~~~\Rightarrow ~~~P\left( a,e\right) =\rho ~~\text {for}\, \text {some}~\, e\in A \end{aligned}$$

This condition was introduced by Debreu (1958). One consequence of Axiom 2 is that A must be a sufficiently rich domain.

Our third axiom is actually a family of axioms, indexed by \( M\subseteq A\). When interpreting references to “Axiom 3,” context must be used to determine which member of the family is intended.

Axiom 3

(Strong M-Independence) For any \(a,b,c,d\in A\), any \(e\in M\) and any \(\lambda \in \left( 0,1\right) \),

$$\begin{aligned} P(a,b) \ge P\left( c,d\right) ~~~\Rightarrow ~~~P\left( a\lambda e,b\lambda e\right) \ge P\left( c\lambda e,d\lambda e\right) \end{aligned}$$
(12)

This notion generalises Dagsvik’s (2008) Strong Independence axiom, which is equivalent to Strong A-Independence. Note that if \(M^{\prime } \subseteq M\), then Strong M-Independence implies Strong \(M^{\prime } \)-Independence. In other words, Dagsvik’s axiom is the “strongest” member of this family of axioms. We will use the terms Strong Independence and Strong A-Independence interchangeably.
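As a quick sanity check (a sketch only, not part of the formal development), one can verify that (12) holds whenever P has a strong utility u that is M-linear. For any \(a,b\in A\), any \(e\in M\) and any \(\lambda \in \left( 0,1\right) \), Definition 3 gives

$$\begin{aligned} u\left( a\lambda e\right) -u\left( b\lambda e\right)= & {} \ \left[ \lambda u(a) +\left( 1-\lambda \right) u(e) \right] -\left[ \lambda u(b) +\left( 1-\lambda \right) u(e) \right] \\= & {} \ \lambda \left[ u(a) -u(b) \right] \end{aligned}$$

and likewise \(u\left( c\lambda e\right) -u\left( d\lambda e\right) =\lambda \left[ u\left( c\right) -u(d) \right] \). Since \(\lambda >0\), the inequality \(u(a) -u(b) \ge u\left( c\right) -u(d) \) is preserved when both sides are multiplied by \(\lambda \), so applying (6) twice yields \(P\left( a\lambda e,b\lambda e\right) \ge P\left( c\lambda e,d\lambda e\right) \) whenever \(P(a,b) \ge P\left( c,d\right) \).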

Theorem 1

Let \(M\subseteq A\) be given and let P satisfy Axioms 1–3. Suppose that \(u:A\rightarrow {\mathbb {R}}\) is an M-linear weak utility for P. Then u is a strong utility for P.

Theorem 1 is our main result. The rest of the paper explores various applications of Theorem 1. To understand how this theorem may be applied, suppose we wish to establish sufficient conditions for the existence of a strong utility for P within a given M-linear class, \({\mathcal {U}}\). Suppose further that we know a set of sufficient conditions for preferences on A to have a utility representation in \({\mathcal {U}}\). Then we can use this knowledge, together with the three-step recipe from the Introduction, to obtain the desired conditions on P. (At Step 3, we add Strong Stochastic Transitivity and Solvability, in addition to Strong M-Independence.)

We end this section by noting a useful corollary to Theorem 1.

Corollary 1

Let \(M\subseteq A\) be given and let P satisfy Axioms 1–3. Suppose that \(u:A\rightarrow {\mathbb {R}}\) is an M-linear weak utility for P and let \(I\subseteq {\mathbb {R}}\) be a closed interval that is symmetric about 0 and contains the set \(\varGamma _{u}\) defined by (7). Then there exists a zero-mean, symmetrically and continuously distributed random variable, \({\tilde{\varepsilon }}\), with support contained in I, such that

$$\begin{aligned} P(a,b) =\Pr \left[ u(a) -u(b) \ge {\tilde{\varepsilon }}\right] \end{aligned}$$

for all \(a,b\in A\).

It follows from Corollary 1 that all of the models presented in this paper can be re-expressed in the familiar form (10) for some zero-mean random variable, \({\tilde{\varepsilon }}\), with continuous and strictly increasing distribution function, F. When expressed in this form, the axioms in our various representation theorems are also necessary, as is easily verified.

4 Application I: choice between risky prospects

In this section, we prove a strong “expected utility” representation theorem (Proposition 1). That is, we obtain sufficient conditions for P to possess a strong utility that is mixture linear. Our result requires only that A is a mixture set, but the Strong Independence axiom (which underpins the result) is motivated by scenarios in which A is a set of lotteries—a risky domain.

To state our result, we first introduce a strengthening of Axiom 2:

Axiom 4

(Mixture Solvability) For all \(a,b,c\in A\) and all \(\rho \in \left( 0,1\right) \)

$$\begin{aligned} P(a,b) \ge \rho \ge P\left( a,c\right) ~~~\Rightarrow ~~~P\left( a,b\lambda c\right) =\rho ~~\text {for}\,\text {some}~\,\lambda \in \left[ 0,1 \right] \end{aligned}$$

Proposition 1

Let A be a mixture set. If P satisfies Axioms 1, 4 and Strong Independence, then P has a strong utility that is mixture linear.

Proposition 1 is closely related to Theorem 4 in Dagsvik (2008), which also establishes sufficient conditions for a strong expected utility representation. Dagsvik’s result is less general, in the sense that he assumes \(A=\varDelta ^{n}\), and also uses a different set of axioms. In place of Strong Stochastic Transitivity (Axiom 1), Dagsvik assumes the following:

Axiom 5

(Quadruple Condition) For all \(a,b,a^{\prime } ,b^{\prime } \in A\):

$$\begin{aligned} P(a,b) \ge P\left( a^{\prime } ,b^{\prime } \right) ~~~\Rightarrow ~~ ~P\left( a,a^{\prime } \right) \ge P\left( b,b^{\prime } \right) \end{aligned}$$
(13)

Like Strong Stochastic Transitivity, the Quadruple Condition has a long history in the literature on stochastic choice. It appears in Debreu (1958), who attributes it to Davidson and Marschak (1959). It is well known that the Quadruple Condition implies Strong Stochastic Transitivity, but not conversely (Luce and Suppes 1965, Theorem 39). The extra strength of the Quadruple Condition comes at the cost of some intuitive appeal. Strong Stochastic Transitivity has a familiar and transparent logic,Footnote 19 while the Quadruple Condition appears less compelling from a normative point of view.
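The gap between the two conditions can be seen on a three-element domain. The Python sketch below checks both properties by brute force. It assumes the standard statement of Strong Stochastic Transitivity (if \(P(a,b)\ge \frac{1}{2}\) and \(P(b,c)\ge \frac{1}{2}\) then \(P(a,c)\ge \max \{P(a,b),P(b,c)\}\)); the function names and numerical values are hypothetical.

```python
import math
from itertools import product

def complete_bcpf(upper):
    """Extend listed P(a,b) via P(b,a) = 1 - P(a,b) and P(a,a) = 1/2."""
    P = dict(upper)
    for (a, b), p in upper.items():
        P[b, a] = 1.0 - p
        P[a, a] = P[b, b] = 0.5
    return P

def satisfies_sst(P, items, tol=1e-12):
    """Strong Stochastic Transitivity, checked over all triples."""
    for a, b, c in product(items, repeat=3):
        if P[a, b] >= 0.5 and P[b, c] >= 0.5:
            if P[a, c] < max(P[a, b], P[b, c]) - tol:
                return False
    return True

def satisfies_quadruple(P, items, tol=1e-12):
    """Quadruple Condition (13), checked over all quadruples."""
    for a, b, a2, b2 in product(items, repeat=4):
        if P[a, b] >= P[a2, b2] and P[a, a2] < P[b, b2] - tol:
            return False
    return True

items = ["x", "y", "z"]

# A BCPF in which every "downhill" comparison has probability 0.6.
P_flat = complete_bcpf({("x", "y"): 0.6, ("y", "z"): 0.6, ("x", "z"): 0.6})

# A strong utility model with logistic errors (assumed for illustration).
u = {"x": 3, "y": 1, "z": 0}
P_strong = {(a, b): 1.0 / (1.0 + math.exp(-(u[a] - u[b])))
            for a in items for b in items}

print(satisfies_sst(P_flat, items), satisfies_quadruple(P_flat, items))
print(satisfies_sst(P_strong, items), satisfies_quadruple(P_strong, items))
```

The first BCPF satisfies Strong Stochastic Transitivity but violates (13): \(P(x,y)\ge P(x,z)\) holds, yet \(P(x,x)=\frac{1}{2}<P(y,z)\). The second, being a strong utility model, satisfies both.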

In place of Mixture Solvability (Axiom 4), Dagsvik assumes two continuity conditions: Axiom 2 (Solvability) and the following:

Axiom 6

(Archimedean Property) For all \(a,b,c\in A\), if

$$\begin{aligned} P(a,b) \ >\ \frac{1}{2}\ >\ P\left( c,b\right) \end{aligned}$$

then there exist \(\alpha ,\beta \in \left( 0,1\right) \) such that

$$\begin{aligned} P\left( a\alpha c,b\right) \ >\ \frac{1}{2}\ >\ P\left( a\beta c,b\right) \text {.} \end{aligned}$$

Theorem 2

(Dagsvik 2008) Let \(A=\varDelta ^{n}\). If P satisfies Axioms 2, 5, 6 and Strong Independence, then P has a strong utility that is mixture linear.

In a recent paper, Dagsvik (2015, Theorem 4) shows that some of the conditions in Theorem 2 can be relaxed without jeopardising the result: Solvability (Axiom 2) can be dropped and Strong Independence weakened by requiring that (12) holds only for degenerate lotteries (i.e. vertices of the simplex). In “Appendix 2” we establish that the axioms in Theorem 2 can be weakened in another direction, again without jeopardising the result (Theorem 3). We show that the Quadruple Condition (Axiom 5) can be replaced with the weaker (and more intuitively appealing) Strong Stochastic Transitivity condition. Of course, this raises the interesting question as to whether both sets of relaxations—those of our Theorem 3 and those of Dagsvik (2015, Theorem 4)—can be simultaneously made while preserving the existence of a mixture linear strong utility. The answer is obscured by the fact that the two theorems use very different proof strategies. The question remains open at this time.

5 Application II: choice between uncertain prospects

For this section (and its subsections), A will be a set of Anscombe–Aumann acts—a domain of uncertainty. For expositional convenience, we confine attention to a finite state space, \({\mathcal {S}}=\left\{ 1,\ldots ,S\right\} \), but our results generalise straightforwardly to richer state spaces. We use \({\overline{A}}\) to denote the set of constant acts (recall Example 1) and we take the usual notational liberty of identifying \({\overline{A}}\) with \({\mathcal {C}}\): if \(x\in \mathcal {C }\) we also treat x as an element of \({\overline{A}}\), relying on context to indicate the intended meaning. Note, in particular, that \({\overline{A}}\) is a mixture set. As discussed in Sect. 3, it is conventional to choose \({\mathcal {C}}\) to be a space of lotteries (such as \(\varDelta ^{n}\)) but our formal results require only that \({\mathcal {C}}\) is a mixture set.

In Example 1 we showed that utility functions of the MEU form are \({\overline{A}}\)-linear. This arises because of the Certainty independence (or C-independence) property of MEU preferences:Footnote 20 for any \(a,b\in A\), any \(x\in {\overline{A}}\) and any \(\lambda \in \left( 0,1\right) \),

$$\begin{aligned} a\succsim b\ \ \ \ \text {iff}\ \ \ \ a\lambda x\succsim b\lambda x\text {.} \end{aligned}$$

Preferences of the CEU variety also enjoy this property, as do SEU preferences, which are contained in the intersection of the MEU and CEU classes. All these classes are special cases of invariant biseparable preferences, which were introduced by Ghirardato et al. (2004). Invariant biseparable preferences are characterised by C-independence plus another four standard axioms (ibid., p. 141). Ghirardato, Maccheroni and Marinacci (2004, Theorem 11) describe the utility functions that represent such preferences. We say that a utility function is “invariant biseparable” if it represents invariant biseparable preferences.

We refer the reader to Ghirardato et al. (2004) for a detailed description of invariant biseparable utility functions, which is somewhat involved. For our purposes, invariant biseparable utilities are interesting because they are all \( {\overline{A}}\)-linear (ibid., Lemma 1) and they include many familiar classes of utility functions for the Anscombe–Aumann domain, such as SEU, MEU and CEU.

Proposition 2 (below) identifies sufficient conditions for a BCPF to possess a strong utility of the invariant biseparable form. In Sects. 5.1, 5.2 and 5.3 we refine this result to obtain sufficient conditions for a strong utility of the SEU, MEU and CEU form, respectively.

To state these representation theorems, it will be convenient to exclude the uninteresting case in which the decision-maker is stochastically indifferent between any two alternatives. We say that a BCPF is non-trivial if there exist \(a,b\in A\) such that \(P(a,b) \ne \frac{1}{2}\). The following axiom will also be needed for each representation theorem. (In reading the statement of this axiom, recall that we identify the consequence \(x\in {\mathcal {C}}\) with the constant act that maps each state to x.)

Axiom 7

(Stochastic Monotonicity) For any \(a,b\in A\), if

$$\begin{aligned} P\left( a(s) ,b(s) \right) \ \ge \ \frac{1}{2} \end{aligned}$$

for every \(s\in {\mathcal {S}}\) then \(P(a,b) \ge \frac{1}{2}\).

Proposition 2

If P is a non-trivial BCPF that satisfies Axioms 1, 4, 7 and Strong \({\overline{A}}\)-Independence, then \(\succsim ^{P}\) are invariant biseparable preferences and P has a strong utility of the invariant biseparable form.

5.1 Strong SEU

Defining

$$\begin{aligned} \varTheta =\ \left\{ \theta :{\mathcal {S}}\rightarrow \left[ 0,1\right] \ \left| \ \sum _{s=1}^{S}\theta (s) =1\right. \right\} \end{aligned}$$

to be the set of all probability distributions over states, the utility function \(u:A\rightarrow {\mathbb {R}}\) has the SEU form if there exists a mixture linear function \(v:{\mathcal {C}}\rightarrow {\mathbb {R}}\) and a probability \(\theta \in \varTheta \) such that

$$\begin{aligned} u(a) =\ \sum _{s=1}^{S}\theta (s) v\left( a(s) \right) \end{aligned}$$
(14)

for all \(a\in A\). Not surprisingly, a strong utility of the form (14) is obtained by strengthening Strong \({\overline{A}}\)-Independence to Strong A-Independence in Proposition 2. This gives the following stochastic analogue of the Anscombe and Aumann (1963) representation theorem.

Proposition 3

If P is a non-trivial BCPF that satisfies Axioms 1, 4, 7 and Strong A-Independence, then P has a strong utility of the SEU form (14).

5.2 Strong MEU

Recall the MEU specification (11) from Example 1. To obtain this functional form, we need to add a stochastic analogue of Gilboa and Schmeidler’s (1989) uncertainty aversion axiom:

Axiom 8

(Stochastic Uncertainty Aversion) For any \(a,b\in A\) and any \(\lambda \in \left( 0,1\right) \),

$$\begin{aligned} P(a,b) =\frac{1}{2}~~~\Rightarrow ~~~P\left( a\lambda b,b\right) \ge \frac{1}{2}\text {.} \end{aligned}$$

Axiom 8 may be written

$$\begin{aligned} a\sim ^{P}b~~~\Rightarrow ~~~a\lambda b\succsim ^{P}b \end{aligned}$$

for any \(a,b\in A\) and any \(\lambda \in \left( 0,1\right) \). In other words, Stochastic Uncertainty Aversion is the property of P implied by the requirement that \(\succsim ^{P}\) satisfy uncertainty aversion—Axiom A.5 in Gilboa and Schmeidler (1989).

Proposition 4

If P is a non-trivial BCPF that satisfies Axioms 1, 4, 7, 8 and Strong \({\overline{A}}\)-Independence, then P has a strong utility of the MEU form (11).
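The following Python sketch illustrates the SEU form (14) and the MEU form, together with Stochastic Uncertainty Aversion in a strong MEU model. It rests on several assumptions made purely for illustration: acts are represented by their state-wise utilities \(v(a(s))\), the MEU utility is the minimum of SEU values over a finite set of priors, mixtures are taken state by state (which is legitimate because v is mixture linear), and the error distribution is logistic.

```python
import math

def seu(act_utils, prior):
    """SEU form (14): sum over states of theta(s) * v(a(s))."""
    return sum(p * x for p, x in zip(prior, act_utils))

def meu(act_utils, priors):
    """MEU utility: the minimum SEU value over a set of priors."""
    return min(seu(act_utils, prior) for prior in priors)

def mix(a, b, lam):
    """State-wise utilities of the mixture a lam b (weight lam on a)."""
    return [lam * x + (1.0 - lam) * y for x, y in zip(a, b)]

def choice_prob(u_a, u_b):
    """Strong utility with an assumed logistic error distribution."""
    return 1.0 / (1.0 + math.exp(-(u_a - u_b)))

# Two states, two hypothetical priors, and two "opposite" bets.
priors = [(0.3, 0.7), (0.7, 0.3)]
a = [1.0, 0.0]
b = [0.0, 1.0]

p_ab = choice_prob(meu(a, priors), meu(b, priors))
hedge = mix(a, b, 0.5)
p_hedge_b = choice_prob(meu(hedge, priors), meu(b, priors))
print(p_ab, p_hedge_b)
```

Here the two bets have the same MEU value, so \(P(a,b)=\frac{1}{2}\), while the even mixture removes exposure to the unknown prior and is chosen over b with probability above one half, as Axiom 8 requires.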

5.3 Strong CEU

For our final application, we develop a stochastic version of Choquet expected utility (Schmeidler 1989). CEU is arguably the most frequently applied alternative to SEU. Unlike MEU, the Choquet expected utility model does not impose uncertainty aversion—it can accommodate a wide variety of attitudes to uncertainty. However, it does require that preferences satisfy comonotonic independence (Schmeidler 1989, p. 575), which is a stronger independence property than C-independence.

Definition 4

Let \(\succsim \) be a weak order (complete and transitive binary relation) on A. Acts \(a,b\in A\) are \(\succsim \)-comonotonic if there do not exist states \(s,s^{\prime } \in {\mathcal {S}}\) with \(a(s) \succ a\left( s^{\prime } \right) \) and \(b\left( s^{\prime } \right) \succ b(s) \), where \(\succ \) is the asymmetric part of \(\succsim \) and we identify \({\mathcal {C}}\) with \({\overline{A}}\) in the usual manner.
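Definition 4 can be checked mechanically on a finite state space. The Python sketch below assumes that the ranking of consequences is represented by a real-valued utility, so that each act is given as a list of state-wise utility values; the function name and the example acts are hypothetical.

```python
from itertools import combinations

def comonotonic(a, b):
    """True if no pair of states is ranked strictly oppositely by a and b.

    a and b are lists of state-wise utilities v(a(s)), v(b(s)).
    A violation is a pair of states s, t with a(s) > a(t) and b(t) > b(s),
    i.e. the two utility differences have strictly opposite signs.
    """
    for s, t in combinations(range(len(a)), 2):
        if (a[s] - a[t]) * (b[s] - b[t]) < 0:
            return False
    return True

a = [3.0, 2.0, 1.0]
b = [5.0, 5.0, 0.0]      # weakly the same ordering of states as a
c = [1.0, 2.0, 3.0]      # reverses the ordering of states
x = [2.0, 2.0, 2.0]      # a constant act

print(comonotonic(a, b), comonotonic(a, c), comonotonic(a, x))
```

Ties are harmless under the definition, so a constant act is comonotonic with every act.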

Axiom 9

(Stochastic Comonotonic Independence) For any pairwise \(\succsim ^{P}\)-comonotonic \(a,b,c\in A\) and any \(\lambda \in \left( 0,1\right) \),

$$\begin{aligned} P(a,b)>\frac{1}{2}~~~\Rightarrow ~~~P\left( a\lambda c,b\lambda c\right) >\frac{1}{2}\text {.} \end{aligned}$$

Axiom 9 says that \(\succsim ^{P}\) satisfies the comonotonic independence property (Schmeidler 1989).

A utility of the CEU class has the same form as (14), but the probability \(\theta \) is replaced by a capacity and summation by Choquet integration.

Definition 5

A capacity on \({\mathcal {S}}\) is a mapping \(\omega :2^{{\mathcal {S}}} \rightarrow \left[ 0,1\right] \) that satisfies \(\omega \left( \emptyset \right) =0\), \(\omega \left( {\mathcal {S}}\right) =1\) and \(\omega \left( E\right) \le \omega \left( F\right) \) whenever \(E\subseteq F\subseteq {\mathcal {S}}\).

Capacities are non-additive generalisations of probabilities. Choquet integration allows us to take expectations of real-valued functions with respect to capacities.Footnote 21 Given a function \(f:{\mathcal {S}}\rightarrow {\mathbb {R}}\), let

$$\begin{aligned} f\left( {\mathcal {S}}\right) =\left\{ x_{1},x_{2},\ldots ,x_{k}\right\} \end{aligned}$$

with \(x_{1}>x_{2}>\cdots >x_{k}\) and let

$$\begin{aligned} E_{i}=\left\{ s\in {\mathcal {S}}\ |\ f(s) =x_{i}\right\} \text {.} \end{aligned}$$

Then

$$\begin{aligned} f =\ \sum _{i=1}^{k}x_{i}E_{i}^{*} \end{aligned}$$
(15)

where \(E_{i}^{*} :{\mathcal {S}}\rightarrow \left\{ 0,1\right\} \) is the indicator function for \(E_{i}\). If \(\omega \) is a capacity on \({\mathcal {S}}\), the Choquet expectation of (15) with respect to \(\omega \) is defined as follows:

$$\begin{aligned} \int f\ d\omega \ \equiv \ \sum _{i=1}^{k}\left( x_{i}-x_{i+1}\right) \omega \left( \bigcup \limits _{j=1}^{i}E_{j}\right) \end{aligned}$$
(16)

where \(x_{k+1}=0\). When \(\omega \) is additive (that is, a probability), (16) is just the usual expected value of f with respect to \(\omega \).
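Formula (16) translates directly into code. The Python sketch below is a minimal implementation for a finite state space; the representation of a capacity as a function on frozensets, and the two example capacities, are assumptions made for illustration.

```python
def choquet(f, capacity):
    """Choquet expectation (16) of f with respect to a capacity.

    f        : dict mapping each state to a real value
    capacity : function from frozensets of states to [0, 1]
    """
    values = sorted(set(f.values()), reverse=True)    # x_1 > x_2 > ... > x_k
    total = 0.0
    upper = set()                                     # E_1 u ... u E_i
    for i, x in enumerate(values):
        upper |= {s for s, fx in f.items() if fx == x}
        x_next = values[i + 1] if i + 1 < len(values) else 0.0   # x_{k+1} = 0
        total += (x - x_next) * capacity(frozenset(upper))
    return total

states = {1, 2, 3}

def uniform(event):
    """Additive capacity: the uniform probability on three states."""
    return len(event) / len(states)

def convex(event):
    """A non-additive (convex) capacity: the square of the uniform probability."""
    return (len(event) / len(states)) ** 2

f = {1: 3.0, 2: 1.0, 3: 1.0}
print(choquet(f, uniform))   # ordinary expected value, 5/3
print(choquet(f, convex))    # 2 * (1/9) + 1 * 1 = 11/9
```

With the additive capacity the result coincides with the ordinary mean of f, while the convex capacity gives a lower value because it places less weight on the event where f is high.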

The utility function \(u:A\rightarrow {\mathbb {R}}\) has the CEU form if there exists a mixture linear function \(v:{\mathcal {C}}\rightarrow {\mathbb {R}}\) and a capacity \(\omega \) on \({\mathcal {S}}\) such that

$$\begin{aligned} u(a) =\ \int \left( v\circ a\right) \ d\omega \end{aligned}$$
(17)

Proposition 5

If P is a non-trivial BCPF that satisfies Axioms 1, 4, 7, 9 and Strong \({\overline{A}}\)-Independence, then P has a strong utility of the CEU form (17).

6 Discussion

In their recent, and comprehensive, survey of Fechnerian representation theorems, Marley and Regenwetter (2015, p. 59) observe: “Clearly, much more research is needed in this area for uncertain (or ambiguous) gambles and representations other than expected utility”. Our paper fills some of this gap. It also provides theoretical background for recent experimental work on binary choice between uncertain prospects.

Hey et al. (2010) tested the “descriptive and predictive adequacy” of eight preference-based models of decision-making under uncertainty.Footnote 22 These included SEU, MEU, CEU and three other models based on preferences within the invariant biseparable class. Each utility model was embedded in a strong utility structure (10), with Fechnerian errors drawn from a zero-mean Normal distribution.Footnote 23

In terms of aggregate predictive accuracy (summarised in Table 1 of Hey, Lotito and Maffioletti 2010), the strong SEU model out-performed the strong CEU model but was marginally inferior to the strong MEU model. The best performing models were those based on the “maximax” dual to MEU and on \(\alpha \)-MEU (Ghirardato et al. 2004). Both of these are within the invariant biseparable class; the latter is axiomatised in Section 6 of Ghirardato et al. (2004).

Our paper provides explicit axiomatic foundations for the strong SEU, strong CEU and strong MEU models, as well as a “recipe” for constructing axiomatisations of strong utility models based on maximax utility and \(\alpha \)-MEU. Corollary 1 shows that all of our strong utility representations can be re-expressed in the form (10).

Importantly, our results provide axiomatic demarcations between the various models. Proposition 2 reveals that Strong \({\overline{A}}\)-Independence is a central pillar in the foundation of any strong utility model within the invariant biseparable class. It is a prime candidate for further testing. Propositions 4 and 5 indicate the additional axiomatic refinements necessary to confine strong utility to the MEU and CEU classes, respectively: Stochastic Uncertainty Aversion in the case of MEU and Stochastic Comonotonic Independence in the case of CEU. If both axiomatic restrictions are supported by the data, there exists a strong utility within the convex CEU class (Schmeidler 1989, Proposition), which is commonly encountered in applied work.

Theorem 1 could also be used to axiomatise strong utility models beyond those considered here, such as a probabilistic analogue of Jaffray’s (1989) linear utility theory for (the mixture set of) belief functions over a given set of outcomes.

Of course, one limitation of our models, like all Fechner models, is the restriction to binary choices. It is obviously desirable to have an extension to multinomial choice, and preferably an extension with an axiomatic foundation. One such extension is provided by Blavatskyy (2012, Sect. 4). Let P be a BCPF defined on an arbitrary domain of alternatives, A. Given a finite choice set \(C\subseteq A\), let \({\mathbf {P}} \left( a|C\right) \) denote the probability of choosing \(a\in C\) from C. Consider the following formula for constructing \({\mathbf {P}}\left( a|C\right) \) from P:

$$\begin{aligned} {\mathbf {P}}\left( a|C\right) =\ \frac{\prod \limits _{b\in C}P(a,b)}{\sum _{a^{\prime } \in C}\left[ \prod \limits _{b^{\prime }\in C}P\left( a^{\prime } ,b^{\prime } \right) \right] } \end{aligned}$$
(18)

Blavatskyy (2012, p. 49) provides an axiomatic foundation for (18). One may think of

$$\begin{aligned} \prod \limits _{b\in C}P(a,b) \end{aligned}$$
(19)

as the probability that \(a\in C\) is a Condorcet winner amongst the “candidates” in C—the probability that a would be chosen over every other element of C in a sequence of binary choices. Let us therefore call (19) the Condorcet probability of choosing a from C. The expression (18) says that the relative choice probability

$$\begin{aligned} \frac{{\mathbf {P}}\left( a|C\right) }{{\mathbf {P}}\left( b|C\right) } \end{aligned}$$

(where \(a,b\in C\)) matches the relative Condorcet probability. This gives a rather natural extension of binary choice probabilities to multinomial choice probabilities. In particular, it could be applied to any of the models in the present paper.Footnote 24
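A minimal Python sketch of formula (18) follows. The BCPF values are hypothetical, and the product in (19) is taken over all \(b\in C\), including \(b=a\) with \(P(a,a)=\frac{1}{2}\), exactly as the formula is written; this common factor cancels in the ratio.

```python
import math

def complete_bcpf(upper):
    """Extend listed P(a,b) via P(b,a) = 1 - P(a,b) and P(a,a) = 1/2."""
    P = dict(upper)
    for (a, b), p in upper.items():
        P[b, a] = 1.0 - p
        P[a, a] = P[b, b] = 0.5
    return P

def multinomial_probs(P, C):
    """Choice probabilities (18): normalised Condorcet probabilities (19)."""
    condorcet = {a: math.prod(P[a, b] for b in C) for a in C}
    total = sum(condorcet.values())
    return {a: w / total for a, w in condorcet.items()}

# Hypothetical binary choice probabilities on three alternatives.
P = complete_bcpf({("x", "y"): 0.7, ("x", "z"): 0.8, ("y", "z"): 0.6})

probs3 = multinomial_probs(P, ["x", "y", "z"])
probs2 = multinomial_probs(P, ["x", "y"])
print(probs3)
print(probs2)
```

On a two-element choice set the formula returns the original binary probabilities, since \(\frac{1}{2}P(a,b)/\left[ \frac{1}{2}P(a,b)+\frac{1}{2}P(b,a)\right] =P(a,b)\).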