1 Introduction

The independence axiom (IA) plays a central role in most formal statements of expected utility theory (EUT), as well as popular alternative models of decision-making under objective or subjective risk. One such alternative is rank-dependent utility (RDU) theory (Quiggin 1982), which assumes that the IA is invalid in a certain way. The axiom also plays a central role in virtually every experiment used to characterize the way in which risk preferences deviate from EUT, through the use of the random lottery incentive mechanism (RLIM). For example, if someone claims that individuals behave as if they “probability weight” outcomes, and hence violate the IA, it is almost always on the basis of experiments and theories that assume the IA if the incentives are to be taken seriously. But there is an obvious inconsistency with saying that individuals behave as if they violate the IA on the basis of evidence collected under the maintained assumption that the axiom is magically valid.

This inconsistency has long bothered theorists confronted with experimental data, and there have been responses from theorists and experimentalists. The primary theoretical response has been to argue that there is a way to write out a model of decision-making under risk that allows one to relax the IA but to still allow risk preferences to deviate from EUT predictions and for RLIM to be valid. In effect, to argue an existence proof: even though the inconsistency is real for the most popular alternatives to EUT, there exists a formal alternative to EUT where there is no inconsistency. We discuss this theoretical response later in Sect. 6. The primary experimental response has been to develop some direct tests and some ingenious designs intended to trap the IA under some circumstances.Footnote 1 But these direct and indirect experimental tests of the IA have been inconclusive. This is frustrating: either the axiom applies or it does not.

The uneasy state of the literature has evolved to assuming the axiom for the purposes of making the payment protocol of an experiment valid, but rejecting it when characterizing the risk preferences exhibited in the same experiment using the standard alternatives to EUT.Footnote 2 Those characterizations seem to show evidence of rank-dependent probability weighting, when that very evidence calls into question a maintained assumption of the payment protocol used to generate the evidence. We refer to someone who holds this view as a BipolarFootnote 3 Behaviorist, exhibiting pessimism about the IA when it comes to characterizing how individuals directly evaluate two lotteries in a binary choice task, but optimism about the IA when characterizing how individuals evaluate multiple lotteries that make up the incentive structure for a multiple-task experiment.

The standard payment protocol in individual risky choice experiments involves a subject making \(K>1\) binary choices over objective lotteries, and then selecting one choice at random for payment. We call this protocol 1-in-K. Following Conlisk (1989), Starmer and Sugden (1991), Beattie and Loomes (1997), Cubitt et al. (1998), and Cox et al. (2011), an alternative payment protocol, which we call 1-in-1, involves a subject making only one choice, and then being paid with certainty for the single choice.Footnote 4 The IA can have no role to play in the validity of the 1-in-1 protocol per se if we restrict choice to simple lotteries, but plays a defining role in the 1-in-K protocol. And the role that the IA plays in the theoretical and behavioral validity of the experimental payment protocol is quite distinct from the role that it might play in evaluating the actual binary choice or choices. Even with the 1-in-1 protocol being used, it is possible to ask if behavior is better characterized by violations of IA or not. Indeed, the whole point of our design is to highlight the dual role of the IA in 1-in-K protocols that seek to test violations of IA.

Testing the manner in which the IA interacts with payment protocols used to collect data on observed choice behavior is complicated by the possibility that the reduction of compound lotteries (ROCL) axiom may be invalid behaviorally. This possibility lies at the heart of the theoretical response to the hypothesis about subject behavior underlying the stance of the Bipolar Behaviorist. If the objects of choice are themselves compound lotteries, as is the case in some famous experimental tasks such as the “preference reversal” experiments of Grether and Plott (1979), then one has to take a stand on the validity of ROCL anyway.Footnote 5 But if the objects of choice are simple lotteries, as here, then one can test the implications of RLIM for the standard alternatives to EUT without taking a position on the validity of ROCL.Footnote 6

We offer direct tests of the effect of payment protocols on preferences for risk in general, and the evidence for probability weighting in particular. We do find statistically significant evidence for a difference in estimated risk preferences deriving from the use of different payment protocols and experimental tasks.

Using choices over simple lotteries, we find evidence of RDU probability weighting with the 1-in-1 protocol that does not rely on the validity of the IA. So this result establishes that there is theoretical and behavioral “cause for concern” when one assumes the validity of the IA for the 1-in-K protocol. We then find that this theoretical concern is empirically relevant. Estimated RDU risk preferences are different depending on whether one infers them from data collected with the 1-in-1 payment protocol or the 1-in-K payment protocol. It is not the existence of evidence for probability weighting that is the issue, it is the fact that the nature of probability weighting differs in the 1-in-1 versus 1-in-K protocol.

In order to justify the use of the 1-in-K payment protocol, many studies appeal to the “isolation effect.” This effect is often presented as a behavioral assertion that a subject views each choice in an experiment as independent of other choices in the experiment. When stated formally, the isolation effect is often expressed the same as the IA, and is indeed exactly the same as the IA in our choice context. We recognize that the isolation effect is often invoked informally as “an empirical matter,” with either an appeal to prior evidenceFootnote 7 or simply a conjecture that the isolation effect is a reasonable description human behavior. Given limited empirical support from prior studies and the tautology of support via conjecture, we present an experiment which provides a new test of the isolation effect.

In Sect. 2 we describe the theoretical constructs needed for our design, in particular the various axioms that are at issue. In Sect. 3 we present our experimental design, which allows comparison of risk preferences obtained from choice tasks over simple lotteries that do not require the IA with risk preferences obtained from tasks that do require the assumption. We also explain why we focus on differences in estimated preferences across treatments rather than just examine raw choice patterns. In Sect. 4 we develop the econometric model used to estimate preferences. We pay particular attention to the manner in which between-subject heterogeneity is modeled. The reason for this attention is that the simplest way of avoiding reliance on the IA is to give some individuals only one choice to make, necessitating the pooling of choices across different individuals. In the absence of an assumption of homogeneity of risk preferences, or samples of sufficient power to allow randomization to mitigate the need for that assumption, we must address the econometric modeling of heterogeneity. In Sect. 5 we examine the data from our experiments, and present the econometric analysis of hypotheses. In Sect. 6 we draw some general implications of our results, and in Sect. 7 offer general conclusions.

2 Theory

2.1 Basic axioms

Following Segal (1987, 1988, 1990, 1992), we distinguish between three axioms defined over objective lotteries.Footnote 8 In words, the ROCL axiom states that a decision-maker is indifferent between a compound lottery and the actuarially-equivalent simple lottery in which the probabilities of the two stages of the compound lottery have been multiplied out. To use the language of Samuelson (1952, p. 671), the former generates a compound income-probability-situation, and the latter defines an associated income-probability-situation, and that “...only algebra, not human behavior, is involved in this definition.”

To state this more explicitly, with notation to be used to state all axioms, let X, Y, and Z denote simple lotteries, A and B denote compound lotteries, \(\succ \) express strict preference, and \(\sim \) express indifference. Then the ROCL axiom says that \(A \sim X\) if the probabilities and prizes in X are the actuarially-equivalent probabilities and prizes from A. Thus if A is the compound lottery that pays “double or nothing” from the outcome of the lottery that pays $10 if a coin flip is a head and $2 if the coin flip is a tail, then X would be the lottery that pays $20 with probability \(1/2\times 1/2 = 1/4\), $4 with probability \(1/2\times 1/2 = 1/4\), and nothing with probability 1/2. From an observational perspective, one would have to see choices between compound lotteries and the actuarially-equivalent simple lottery to test ROCL.

The compound independence axiom (CIA) states that a compound lottery formed from two simple lotteries by adding a positive common lottery with the same probability to each of the simple lotteries will exhibit the same preference ordering as the simple lotteries. So this is a statement that the preference ordering of the two constructed compound lotteries will be the same as the preference ordering of the different simple lotteries that distinguish the compound lotteries, provided that the common prize in the compound lotteries is the same and has the same (compound lottery) probability. It says nothing about how the compound lotteries are to be evaluated, and in particular it does not assume ROCL. It only restricts the preference ordering of the two constructed compound lotteries to match the preference ordering of the original simple lotteries.

The CIA says that if A is the compound lottery giving the simple lottery X with probability \(\alpha \) and the simple lottery Z with probability (\(1-\alpha \)), and B is the compound lottery giving the simple lottery Y with probability \(\alpha \) and the simple lottery Z with probability \((1-\alpha )\), then \(A \succ B \;\mathrm{{ iff}}\; X\succ Y\,\forall \,\alpha \in (0,1)\). So the construction of the two compound lotteries A and B has the “independence axiom” cadence of the common prize Z with a common probability \((1-\alpha )\), but the implication is only that the ordering of the compound and constituent simple lotteries are the same. For example, Segal (1992, p. 170) defines the CIA by assuming that the second-stage lotteries are replaced by their certainty-equivalent according to some (possibly non-EUT) model, “throwing away” information about the second-stage probabilities before one examines the first-stage probabilities at all. Hence one cannot then define the actuarially-equivalent simple lottery and hence state the ROCL axiom, by construction, since the informational bridge to that calculation has been burnt.

Finally, the mixture independence axiom (MIA) says that the preference ordering of two simple lotteries must be the same as the two actuarially-equivalent simple lotteries derived from the two compound lotteries formed by combining a common outcome with one of the original simple lotteries, where the common outcome has the same (compound lottery) probability. That is, \(X \succ Y\) iff the actuarially-equivalent simple lottery of \(\alpha X +(1-\alpha )Z\) is strictly preferred to the actuarially-equivalent simple lottery of \(\alpha Y + (1-\alpha ) Z, \forall \,\alpha \in \,(0,1]\). So stated, it is clear that the MIA strengthens the CIA by making a definite statement that the constructed compound lotteries are to be evaluated in a way that is ROCL-consistent. Construction of the compound lottery in the MIA is actually implicit: the axiom only makes observable statements about two pairs of simple lotteries. To restate Samuelson’s point about the definition of ROCL, the experimenter testing the MIA could have constructed the associated income-probability-situation without knowing the risk preferences of the individual (although the experimenter would need to know how to multiply).

The reason these three axioms are important for the evaluation of alternatives to EUT is that the failure of MIA does not imply the failure of CIA and ROCL. It does imply the failure of one or the other, but it is far from obvious which one. Indeed, one could imagine some individuals or task domains where only CIA might fail, only ROCL might fail, or both might fail. Moreover, specific types of failures of ROCL lie at the heart of many important models of decision-making under uncertainty and ambiguity. We use the acronym IA when we mean “CIA or MIA” and the acronyms CIA or MIA directly when the difference matters.

2.2 Experimental payment protocols

Turning now to experimental procedures, the most popular payment protocol used in individual choice experiments assumes the validity of the CIA. This payment protocol is called the RLIM. It entails the subject making K choices and then one of the K choices being selected at random to be played out. Typically, and without loss of generality, assume that the selection of the kth task to be played out uses a discrete uniform distribution over the K tasks. Since the other \(K-1\) tasks will generate a payoff of zero, the payment protocol can be seen as a compound lottery that assigns probability \(\alpha = 1/k\) to the selected task and \((1-\alpha ) = (1-(1/k))\) to the other \(K-1\) tasks as a whole. If the experiment consists of binary choices between simple lotteries X and Y, then the RLIM can be immediately seen to entail an application of the CIA, where Z \(=\) U($0) and \((1-\alpha ) = (1-(1/k))\), for the utility function \(U(\cdot )\). Hence, under the CIA, the preference ordering of X and Y is independent of all of the choices in the other tasks (Holt 1986).

The CIA can be avoided by setting \(K=1\), and asking each subject to answer one binary choice task for payment. Unfortunately, this comes at the cost of another assumption if one wants to compare choice patterns over two simple lottery pairs, as in most of the popular tests of EUT such as the Allais paradox and common ratio test: the assumption that risk preferences across subjects are the same. This is a strong assumption, obviously, and one that leads to inferential tradeoffs in terms of the “power” of tests of EUT relying on randomization that will vary with sample size. Further, experimenters in this area are wont to ignore the implications this assumption has on power calculations, and so the results from small-sample studies which assume homogeneity of preferences can be questionable due to low power.

The assumption of homogeneous preferences can be diluted, however, by changing it to a conditional form: that risk preferences are homogeneous conditional on a finite set of observable characteristics. Although this sounds like an econometric assumption, and it certainly has statistical implications, it is as much a matter of (operationally meaningful) theory as formal statements of the CIA, ROCL, and MIA.

3 Experiment

3.1 Basic design issues

Our experimental design focuses directly on the risk preferences that one can infer from binary choices over pairs of simple lotteries. This task is canonical, in terms of testing EUT against alternatives such as RDU, as well as for estimating risk preferences. Our design builds on a comparison of the risk preferences implied by 1-in-1 and 1-in-K choice tasks. We let K equal 30, to match the typical risky choice experiment in which there are many choices (e.g., Hey and Orme (1994)). Figure 1 shows the interface given to our subjects. A standard, fixed show-up fee, in our case $7.50, was paid to every subject independently of their lottery choices.

Fig. 1
figure 1

Bipolar probability weighting functions for rank-dependent utility models

3.2 Specific design

Table 1 summarizes our experimental design. In treatment A, each subject undertakes one 1-in-1 binary choice, where each subject’s single lottery pair is drawn at random from a set of 69 lottery pairs shown in online Appendix A. These lottery pairs span five monetary prize amounts, $5, $10, $20, $35, and $70, and five probabilities, 0, 1/4, 1/2, 3/4, and 1. The prizes are combined in ten “contexts,” defined as a particular triple of prizes.Footnote 9 They are based on a battery of lottery pairs developed by Wilcox (2010) for the purpose of robust estimation of EUT and RDU models.Footnote 10 These lotteries also contain a number of pairs in which the “EUT-safe” lottery has a higher EV than the “EUT-risky” lottery: this is designed deliberately to evaluate the extent of risk premia deriving from probability pessimism rather than diminishing marginal utility.

Table 1 Experimental design

In treatment A we do not have to assume the CIA in order for observed choices to reflect risk preferences under EUT or RDU. In effect, it represents the behavioral Gold Standard benchmark in terms of internal validity, against which the other payment protocols are to be evaluated.

In treatment B we move to the 1-in-30 case, which is typical of the usual risk elicitation setting. In some cases there was an additional task after the 30 lottery choices, and in some cases there was no other task. In the former case the instructions raised the possibility of a future task for payment.Footnote 11 In the latter case we explicitly told subjects that there were no further salient tasks affecting their earnings after the risky lottery task, to avoid them even tacitly thinking of forming a portfolio over the risky lottery tasks and any future tasks. We later test for an effect of there being an extra task and find none, so pool these minor variants into one treatment B.

Every random event determining payouts was generated by the rolling of one or more dice. These dice were illustrated visually during the reading of the instructions,Footnote 12 and each subject rolled their own dice.

3.3 Why not just look at raw choice patterns?

Prior tests looked directly at choice patterns. In contrast, we focus on the risk preferences implied by the observed choice data, and do not examine the choice patterns themselves. The reason is that there are limits on what can be inferred by just looking at choice patterns. Since much of the literature on the evaluation of the axioms of EUT has done precisely that, we explain why we believe this to be less informative than trying to make inferences about the underlying latent preferences. This may be particularly important because one might wonder how they could differ: after all, if preferences are just rationalizing observed choices, and if observed choices appear to violate the predictions of EUT or IA, how can it be that the implied preferences might not?

3.3.1 Behavioral errors

In an important sense, our task would be easier if humans never made mistakes. This would allow us to test deterministic theories of choice, and any deviation from the predictions of the theory would provide prima facie evidence of a failure of the theory. However, humans do make errors in behavior, and so our task is more complex. The canonical evidence for behavioral errors is the fraction of “switching behavior” observed when subjects are given literally the same lottery pair at different points in a session (e.g., Wilcox (1993)). Any analysis of individual choice ought to account for such behavioral errors. Indeed, some previous analyses of choice patterns have attempted to account for “mistakes” by implementing “trembles” (e.g., Conlisk (1989, Appendix I) and Harless and Camerer (1994)). Such trembles are agnostic about the way any behavioral error might affect the latent components of the choice. A more satisfactory approach would incorporate behavioral errors into the choice process in a more structural manner (Wilcox 2008).

It is worth emphasizing that behavioral errors are quite distinct conceptually from sampling errors. The former refer to some latent component of the theoretical structure generating a predicted choice. The latter refer to the properties of an estimate of the parameters of that theoretical structure. To see the difference, and assuming a consistent estimator, if the sample size gets larger and larger the sampling errors must get smaller and smaller, but the (point estimate of the) behavioral error need not.Footnote 13 In the first instance behavioral errors are the business of theorists, not econometricians, as stressed by Wilcox (2008, 2011).

3.3.2 Do choice patterns use all available information?

Once we recognize that there can be some imprecision in the manner in which preferences translate into observed choices, we obtain another informational advantage from making inferences about preferences estimated from a structural model: a theory about how the intensity of a preference for one lottery over another matters. For any given utility function and set of parameter values, and assuming EUT for exposition, a larger difference in the EU of two lotteries matters more for the likelihood of the presumed preferences than a difference in the EU that is close to zero. To see this, assume some parameter values characterizing preferences, and two lottery pairs. One lottery pair, evaluated at those parameter values, implies an EU for the left lottery that is greater, by \(\varepsilon \), than the EU for the right lottery. Another lottery pair, similarly evaluated at those same parameter values, implies an EU for the left lottery that is greater, by much more than \(\varepsilon \), than the EU for the right lottery. An observed choice that is inconsistent with the predicted choice for the second lottery pair matters more for the validity of the assumed parameter values than an inconsistent observed choice for the first lottery pair. This is not the case when one simply looks at the number of consistent and inconsistent choice pairs, as all inconsistent choices are treated as informationally equivalent.

Of course, one has to define the term “intensity” for a given utility representation, and there are theoretical and econometric subtleties involved in normalizing EU differences over different choice contexts, discussed later and in Wilcox (2008, 2011). Structural estimation also typically entails some parametric assumptions, also discussed later, that are not involved with the usual analysis of choice patterns. But there is simply more information used when one evaluates estimated preferences with a structural model. The difference is akin to limited-information inference versus full-information inference in statistics: ceteris paribus, it is always better to use more information than less, and the only (statistically) efficient estimators use all information. We admit immediately that things are not all equal, and that some parametric assumptions will be needed to undertake what we call the full-information approach here. But we argue that the preference estimation approach is complementary to studying choice patterns, and not an inferior and less direct method of conducting the same analysis.

3.3.3 Are the stimuli representative?

Comparison of choice patterns from a paradox test with two pairs of lotteries may support or refute the theory under consideration, but how confident are we that the result is representative of choices over all possible lottery pairs? What if multiple tests using distinct choice patterns are conducted and only a single test pattern suggests a failure of the theory? Perhaps some theorists are content with a single case of falsification, but others may be concerned that the single failure is a rare exception. For example, it is well-known that violations of EUT tend to occur less frequently when lotteries are in the “interior” of the Marschak-Machina triangle (e.g., Starmer (2000, p. 358)). Hence one might draw one negative set of qualitative conclusions about EUT from one battery of stimuli and a different, positive set of qualitative conclusions about EUT from a different battery of stimuli [e.g., Camerer (1989, 1992) on the importance of “boundary effects”]. As a general model for all sets of stimuli, EUT is still in trouble in this case, to be sure, but inferences about the validity of EUT then need to be nuanced and conditional.

Model estimation can address this “representativeness” issue by presenting subjects with a wide range of lottery pairs, a point first stressed in the experimental economics literature by Hey and Orme (1994). Of course, there is a tradeoff in doing this: with the 1-in-1 protocol we cannot conduct choice pattern comparisons due to low sample sizes for any given lottery pair.

3.3.4 The homogeneity assumption

Another theoretical reason one might want to estimate a structural model of preferences, rather than examine choice data alone, is to better account for heterogeneity of preferences in the 1-in-1 treatment. The analysis of choice patterns must assume preference homogeneity, or perhaps minimally condition on an observable characteristic or two, such as assuming homogeneity within samples of men and women. Some might appeal to large-sample randomization in an attempt to avoid the assumption of homogeneity, but rarely does anyone conduct appropriate power analyses to justify that appeal. By using structural model estimation, observed preference heterogeneity can be ameliorated through the use of demographics controls (e.g., Harrison and Rutström 2008), and unobserved preference heterogeneity can be ameliorated through the use of random coefficient models (e.g., Andersen et al. 2012).

3.4 Data

A total of 283 subjects were recruited to participate in experiments at Georgia State University between February 2011 and April 2011. The general recruitment message did not mention the show-up fee or any specific range of possible earnings, and subjects were undergraduate students recruited from across the campus. Table 1 shows the allocations of subjects over our treatments. Instructions for all treatments are presented in online Appendix B. Every subject received a copy of the instructions, printed in color, and had time to read them after being seated in the lab. The instructions were then projected on-screen and read out word-for-word by the same experimenter in every session. Every subject also completed a demographic survey covering standard characteristics. All subjects were paid in cash at the end of each session.

4 Econometrics

Our interest is in making inferences about the latent risk preferences underlying observed choice behavior. The estimation approach is to write out a structural model of decision-making, assuming some functional forms if necessary. We focus initially on EUT as the appropriate null, but also consider RDU models of decision-making under risk. The lottery parameters in our design also allow us to estimate the structural model assuming non-parametric specifications of the utility and probability weighting functions, and these non-parametric estimations will be the main focus of inferences whenever possible.

Online Appendix D provides the specifications of our econometric model, and generally follows Harrison and Rutström (2008).

5 Results

We consider behavior observed under treatments A and B, and evaluate the hypothesis that risk preferences are the same across the treatments. The initial estimates assume preference homogeneity across subjects, to be able to focus on the interpretation of non-parametric estimates of the utility and probability weighting functions. We then allow for preference heterogeneity using observable demographic characteristics. Finally, we consider parametric estimates, which may be more intuitive and familiar.

Online Appendix D contains the detailed estimates of all structural models. If we use a non-parametric specification, we cannot find any statistically significant effect of the payment protocols when we assume EUT. This is true where we assume preference homogeneity (p value \(=\) 0.78) or allow for preference heterogeneity (p value \(=\) 0.81). But we do find significant differences when we assume RDU. When we assume preference homogeneity we observe a statistically significant effect at a p value of 0.078. With preference heterogeneity the p value across all parameters, those characterizing utility and probability weighting, is only 0.15; but if we evaluate the effect for just the probability weighting parameters the p value is 0.07.

Turning to parametric estimates, since they are familiar and easier to visualize, we again find no effect of payment protocols under EUT (p value \(=\) 0.98). In this case we use the Expo-Power utility function proposed by Saha (1993), and implemented by Holt and Laury (2002), since it allows for non-constant relative risk aversion. We find a p value of 0.06 with the RDU model, across all parameters, and note again that the culprit appears to be the probability weighting behavior (the p value on just those parameters is 0.07). In this case we again use the Expo-Power utility function, as well as the flexible two-parameter Prelec (1998) probability weighting function These results assume preference heterogeneity across individuals Fig. 1 shows the difference between RDU probability weighting functions when moving from the 1-in-1 payment protocol to the 1-in-K payment protocol. The differences are striking, quantitatively and qualitatively.

6 Implications

A first implication of our results is to encourage theorists to come up with payment protocols that allow one to elicit multiple choices but do not require that one violate an assumption required for the coherent specification of the particular decision model. This challenge has been directly addressed, and partially met, by Cox et al. (2011). There are no known, or obvious, payment protocols that can be used for RDU and cumulative prospect theory (CPT).

A second implication of our results is to question inferences made about specific alternative hypotheses to EUT when the 1-in-K protocol has been employed. That is, in literally every test of specific alternatives to EUT of which we are aware. This is not to say that EUT is valid, just that tests of the validity of specific alternatives rest on a maintained assumption that appears to be false.Footnote 14

A third, costly implication of our results, then, is to place a premium on collecting choice data in smaller doses, using 1-in-1 payment protocols. Anyone proposing new or robust anomalies should be encouraged to do this, and demonstrate that the alleged misbehavior persists when one removes the obvious theoretical confound.

A fourth, modeling implication of the need for 1-in-1 choice data is to place greater urgency on the use of rigorous econometric methods to flexibly characterize heterogeneous preferences.

A fifth implication is to consider more rigorously the learning behavior that might change behavior towards lottery choices such as these.Footnote 15 The argument is that one would expect 1-in-1 behavior to differ from 1-in-30 behavior since the latter reflects some learning behavior. The problem with this line of argument is that it is silent as to what should be compared to what, and does not provide a metric for defining when learning is finished.Footnote 16

A sixth implication is to formally model the effects of treating behavior as if generated by portfolio formation for the experiment as a whole. In effect, this is the implication of the theoretical “Recursive RDU” response to the problems posed by the RLIM payment procedure.Footnote 17 Even though nobody has ever estimated or empirically implemented such a model, the fact that one can formally write one out is regarded as enough to validate the use of the RLIM. The Recursive RDU model is not the standard RDU model, and it is the latter model that is the one that appears in all empirical work. It is apparent that the estimation of a Recursive RDU model for \(K \ggg 0\), as in our experimental design, is infeasible.

A final implication is to just be honest when presenting experimental findings on RDU and CPT models about the assumed neutrality of the experimental payment protocol. In effect this is just saying that there might be two independence axioms at work: one for the evaluation of a given lottery in a binary choice setting, and another one for an evaluation of sets of choices in 1-in-K settings. If one estimates RDU and CPT models with a 1-in-K protocol one might claim to be allowing the first axiom to be relaxed while maintaining the second. It is logically possible for the latter axiom to be empirically false while the former axiom is empirically true. In the absence of better alternatives, we do this in our own ongoing research using 1-in-K protocols.

7 Conclusions

We have demonstrated that estimated non-EUT preferences are sensitive to whether behavior is elicited with the RLIM or instead with a “one-shot” design. We do so by considering a simple and popular alternative to EUT: a standard RDU model that relaxes the CIA. Of course, this RDU specification is identical to the “gain frame” part of CPT, so CPT inherits all of the issues we have raised with respect to inferring risk preferences with RDU. Experimental economists should take this result seriously, and recognize the apparent problem of inferring preferences through use of the RLIM and treating these results “as if” they are the same as those from a 1-in-1 scenario.