1 Introduction

The abundance of uncertainty in life has prompted a great many investigations into humans’ response to risk. The interest in understanding risk preferences has created a latent demand for effective, easy-to-use risk preference elicitation devices. Following a long line of previous research by Becker et al. (1964), Binswanger (1980) and Binswanger (1981), and many others, in 2002 Holt and Laury (H&L) introduced a risk preference elicitation method that has subsequently become a mainstay. In a testament to the general interest in risk preference elicitation and to the appeal of the specific approach introduced by H&L, their work has been cited more than 3,700 times to date according to Google Scholar and is the second most highly cited paper published by the American Economic Review since 2002 according to ISI’s Web of Knowledge. The approach used by H&L has subsequently come to be referred to as a type of multiple price list (MPL) (Andersen et al. 2006; Harrison and Rutström 2008), an approach thought to have been first used by Miller et al. (1969).Footnote 1 The key advantage of the MPL is its ease of use. Respondents make a series of consecutive choices between two outcomes, where the expected value of one outcome increases at a higher rate than the other. The point at which an individual switches from choosing one outcome over the other is often used as a measure of risk aversion.

Despite the fact that MPLs are easy to use and relatively easy for participants to understand, the approach has some weaknesses. Harrison et al. (2005) pointed out that inferences from MPLs can be influenced by order effects (see also Holt and Laury 2005), and Andersen et al. (2006) discussed the potential for choices in MPLs to be influenced by the ranges of values used. Here, we point to a more fundamental problem with MPLs that seems to have been overlooked by practitioners. In particular, the H&L approach is subject to Wakker and Deneffe’s (1996) critique that many risk preference elicitation methods confound estimates of the curvature of the utility function (i.e., the traditional notion of risk preference) with an estimate of the extent to which an individual weights probabilities non-linearly. These are two conceptually different constructs that have different implications for individuals’ behavior under risk, and without controlling for one, biased estimates of the other are obtained.

This observation about MPLs is well known to experts in the field of risk preference elicitation, and yet in our experience, it is not well known to newcomers or those outside the field. The purpose of this paper is to further elucidate some of these issues and more widely disseminate this knowledge among the (apparently large) audience of individuals interested in risk preference elicitation. Moreover, while we agree that the use of a single “choice list” or MPL may not perform well in fully capturing the multidimensional aspects of risk preferences, it must be acknowledged that their popularity results from ease of use. Accordingly, in this paper, we show that different types of MPLs are better able to capture some risk dimensions than others and that by using two (or more) easy to use MPLs, a researcher might achieve a more balanced picture of risk preferences, and thus might attain improved predictive validity.Footnote 2

In what follows, we show that H&L’s original MPL is, perhaps ironically, not particularly well suited to measuring the traditional notion of risk preferences — the curvature of the utility function. Rather, it is likely to provide a better approximation of the curvature of the probability weighting function. We then introduce an alternative MPL that has exactly the opposite property. By combining the information gained from both types of MPLs, we show that greater prediction performance can be attained.

2 Effect of probability weighting in MPLs

In the baseline MPL used by H&L, individuals are asked to make a series of 10 decisions between two options (see Table 1). In option A, the high payoff amount is fixed at €2 and the low payoff amount is fixed at €1.60 across all 10 decision tasks. In option B, the high payoff amount is fixed at €3.85 and the low payoff amount is fixed at €0.10. The only thing changing across the 10 decisions are the probabilities assigned to the high and low payoffs. Initially the probability of receiving the high payoff is 0.10 but by the tenth decision task, the probability is 1.0.

Table 1 The H&L multiple price list

The expected value of lottery A exceeds the expected value of lottery B for the first four decision tasks. Thus, someone who prefers lottery A for the first four decision tasks and then switches and prefers lottery B for the remainder is often said to have near risk-neutral preferences. Analysts often use the number of “safe choices” (the number of times option A was chosen) or the A-B switching point to describe risk preferences and to infer the shape of an assumed utility function (Bellemare and Shearer 2010; Bruner et al. 2008; Eckel and Wilson 2004; Glockner and Hochman 2011; Lusk and Coble 2005).

Perhaps the first thing that should be noted about the original H&L MPL is that it entails choices made over only four monetary amounts (0.10, 1.60, 2.00 and 3.85). Because a utility function is unique only up to an affine transformation, one must fix two of these points and can only identify the relative difference implied by the other two. Stated differently, the original H&L MPL reveals little information about the curvature of the utility function.Footnote 3 By contrast, the H&L MPL entails choices over 11 different probability amounts (from 0 to 1 in increments of 0.1). Thus, the approach contains much more information about the potential shape of the probability weighting function over the entire probability domain.

To more formally address these issues, assume people’s preferences are represented by rank-dependent utility theory introduced by Quiggin (1982) and incorporated into cumulative prospect theory by Tversky and Kahneman (1992). Applying the theory to the H&L MPL, the rank-dependent utility of option A is R D U A = w(p)U(2) + (1 − w(p))U(1.6) and the rank-dependent utility of option B is R D U B = w(p)U(3.85) + (1 − w(p))U(0.1), where p is the probability of receiving the higher payoff amount in each option. A person chooses option A over B when R D U A > R D U B or when w(p)U(2)+(1−w(p))U(1.6) > w(p)U(3.85) + (1 − w(p))U(0.1). Re-arranging, one can see that option A is chosen when:

$$ \frac{w(p)}{1-w(p)}<\frac{U(1.6)-U(0.1)}{U(3.85)-U(2)} $$
(1)

Equation 1 reveals two important facts. First, the choice between option A and B in the H&L task is driven both by the shape of w(p) and the shape of U(x) — i.e., it does not separately identify only the curvature of the utility function or the coefficient of relative risk aversion as is often presumed. Second, Eq. 1 shows that, at most, one can identify only two utility differences U(1.6)−U(0.1) and U(3.85)−U(2), which is clearly a small amount of information to be gleaned about the shape of U(x).

To illustrate the first point, note that many experimental studies have estimated the shape of w(p) using functional forms such as w(p) = p γ/[p γ + (1 − p)γ]1/γ. Estimates of γ typically fall in the range of 0.56 to 0.71 (e.g., see Camerer and Ho 1994; Tversky and Kahneman 1992; Wu and Gonzalez 1996), which implies an S-shaped probability weighting function that over-weights low probability events and under-weights high probability events.

Now, consider a simple example where individuals have a linear utility function (i.e., they are risk neutral in the traditional sense), U(x) = x. With the traditional H&L task, a risk neutral person with U(x) = x and γ = 1 would switch from option A to B at the fifth decision task. However, if the person weights probabilities non-linearly, say with a value of γ = 0.6, then they would instead switch from option A to B at the sixth decision task. Thus, in the original H&L decision task, an individual with γ = 0.6 will appear to have a concave utility function (if one ignores probability weighting) even though they have a linear utility function, U(x) = x. The problem is further exasperated as γ diverges from one. Of course in reality, people may weight probabilities non-linearly and exhibit diminishing marginal utility of earnings, but the point remains: simply observing the A-B switching point in the H&L decision task is insufficient to identify the shape of U(x) and the shape of w(p). The two are confounded. While it is possible to use data from the H&L technique to estimate these two constructs, U(x) and w(p), ex post, we argue that more information is contained about w(p) than U(x) in the original H&L MPL.

In addition to the above arguments that choices in the H&L MPL are likely to provide more information on the shape of w(p) than U(x), there is an argument that relates to the moderate level of payoffs used in many experiments using MPLs. Several authors have argued that the utility function should be linear over relatively low payoff amounts (Selten et al. 1999; Wakker 2010).Footnote 4 If true, this would suggest that the risk averse behavior previously observed in H&L tasks may well relate to w(p) rather than to U(x). A final piece of evidence suggesting that the original H&L task is more likely to elicit probability weights than utility curvature are the findings that in repeated choice tasks people are more likely to pay attention to the factors changing across the tasks (which in the case of H&L are the probabilities). Because probabilities are changing in the original H&L task, people are more likely to pay attention to this dimension of choice (Bleichrodt 2002).

2.1 A payoff-varying MPL

Given the preceding discussion, one might ask if there is a simple way to use a MPL that yields more information about U(x) and, at least in some special cases, avoids the confound between w(p) and U(x)? One can indeed achieve such an outcome by following an approach like the one used by Wakker and Deneffe (1996) in which probabilities are held constant. Using this insight, we modify the H&L task such that probabilities remain constant across the ten decision tasks and instead change the monetary payoffs down the ten tasks. Our approach is similar to that used in prior research such as that whereby certainty equivalents are elicited from subjects by using repeated choices with varying payoff amounts (Cohen et al. 1987).Footnote 5

Table 2 shows a payoff-varying MPL. In this MPL, the probabilities of all payouts are held constant at 0.5. We constructed the payoff-varying MPL shown in Table 2 so that it matched the original H&L MPL in terms of the coefficient of relative risk aversion (CRRA) implied by a switch between choosing option A and option B under the assumption of EUT preferences.Footnote 6

Table 2 Payoff-varying MPL with Constant Probabilities

What are the advantages and disadvantages of the payoff-varying MPL compared to the H&L MPL? At the onset, one can see that because the payoff-varying MPL only utilizes one probability level, 0.5, it cannot reveal much about the shape of the probability weighting function. However, the payoff-varying MPL entails choices over 22 different monetary payouts. To consider these ideas more formally, again assume individuals have rank-dependent preferences and note that option A will be chosen over option B if w(0.5)U(A H) + (1 − w(0.5))U(1.6) > w(0.5)U(B H) + (1 − w(0.5))U(1), where A H and B H are the higher payoffs for options A and B, respectively (values which change over the 10 decision tasks), and where A H > 1.6, B H > 1, and B H > A H. Re-arranging terms, one can see that option A is chosen if:

$$ \frac{w(0.5)}{1-w(0.5)} < \frac{U(1.6)-U(1)}{U(B^{H} )-U(A^{H})} $$
(2)

Comparing Eq. 2 with Eq. 1, one can see that the original H&L task can utilize 10 points to estimate the function for w(p) but by contrast, the payoff-varying task can only estimate a single point, w(0.5). In contrast, whereas the original H&L task can only estimate two utility differences, the payoff-varying task can estimate 11. Thus, the payoff-varying MPL reveals more information about the shape of U(x) than the original H&L MPL, but the original H&L MPL reveals more information about the shape of w(p) than does the payoff-varying task.Footnote 7

3 Experiment

To investigate some of the issues discussed above, a laboratory experiment was conducted to compare behavior in the original H&L MPL and our payoff-varying MPL. Moreover, the experiment was designed to see which MPL (or whether a combination of the two) could better predict a hold-out sample of choices. The next sub-section describes the subjects, recruiting, and experimental environment. Then, we describe the different treatments used in the study.

3.1 Description of the experiment set-up

A lab experiment was conducted using the z-Tree software (Fischbacher 2007). Subjects consisted of undergraduate students at the University of Ioannina, Greece and were recruited using the ORSEE recruiting system (Greiner 2004). During the recruitment, subjects were told that they would be given the chance to make more money during the experiment.Footnote 8 Harrison et al. (2009) have shown that stochastic and non-stochastic fees can significantly affect self-selection of subjects with respect to risk attitudes.

Subjects participated in sessions of group sizes that varied from 9 to 11 subjects per session (all but two sessions involved groups of 10 subjects). In total, 100 subjects participated in 10 sessions that were conducted between December 2011 and January 2012. Each session lasted about 45 minutes and subjects were paid a €10 participation fee. Subjects were given a power point presentation explaining the risk preferences tasks as well as printed copies of instructions. They were also initially given a five-choice training task to familiarize them with the choice screens that would appear in the real task. Subjects were told that choices in the training phase would not count toward their earnings and that this phase was purely hypothetical.

Full anonymity was ensured by asking subjects to choose a unique three-digit code from a jar. The code was then entered at an input stage once the computerized experiment started. The experimenter only knew correspondence between digit codes and profits. Profits and participation fees were put in sealed envelopes (the digit code was written on the outside) and were exchanged with printed digit codes at the end of the experiment. No names were asked at any point of the experiment. Subjects were told that their decisions were independent from other subjects, and that they could finish the experiment at their own convenience. Average total payouts including lottery earnings were €15.2(S.D.=4.56).

3.2 Risk preference elicitation

Our experiment entailed a 3 × 2 within-subject design, where each subject completed three different multiple price lists (MPL) at two payout (low vs. high) amounts. As shown in Table 3, the baseline (or control) involved the original H&L task at low payoff amounts (a task we refer to as H&L1).

Table 3 Treatments in experiment

The baseline H&L MPL presented subjects with a choice between two lotteries, A or B. For each lottery choice shown in Table 1, a subject chose A, B or could state indifference between A and B. The last choice shown in Table 1 is a simple test of whether subjects understood the instructions correctly.Footnote 9 The second treatment (H&L5) is identical to the first (H&L1) except that all payouts are scaled up by a magnitude of five.

In addition to the choices in treatments H&L1 and H&L5, subjects also completed the payoff-varying MPL shown in Table 2 (pvMPL1) and another set of choices identical to the ones shown in Table 2 except that all payoffs were scaled up by a magnitude of five (pvMPL5).

Instead of providing a table of choices arrayed in an ordered manner all appearing on the same page as in H&L, each choice was presented separately showing probabilities and prizes (as in Andersen et al. 2014; Hey and Orme 1994). Subjects could move back and forth between screens in a given table but not between tables. Once all ten choices in a table were made, the table was effectively inaccessible. The order of appearance of the treatments for each subject was completely randomized in order to randomize any order effects (Harrison et al. 2005). An example of one of the decision tasks is shown in Fig. 1.

Fig. 1
figure 1

Example Decision Task

One of the implicit arguments made thus far is that the original H&L task can better estimate the probability weighting function and the payoff-varying MPL can better estimate the curvature of the utility function. As such, a combination of the insights attained by the two approaches might result in a better overall model. To determine whether this combination is indeed “better” than either used alone, we used out-of-sample prediction as our measure of performance. Thus, as shown in Table 3, the study also included two hold-out tasks which we use as the basis of measuring prediction performance. We constructed these hold-out tasks by creating yet another MPL that modified the original H&L design such that the probability of receiving the higher payout option increased nonlinearly down the list (see Table 4). The MPL is constructed so that it matched the original H&L task in terms of the coefficient of relative risk aversion (CRRA) implied by a switch between choosing option A and option B under the assumption that subjects have prospect-theory preferences where they weigh probabilities nonlinearly with w(p) = p 0.6/[p 0.6 + (1 − p)0.6]1/0.6.

Table 4 Hold-out task

Because each subject completed three MPLs (with 10 choices each) at two payouts, they each made 60 binary choices. For each subject, one of the 60 choices was randomly chosen and paid out.

4 Data analysis and results

4.1 Descriptive analysis

Figure 2 illustrates the proportion of subjects choosing option A for the original H&L task and the payoff-varying task for small and large payoff amounts. Note that all four tasks were designed to elicit the same switching point for a given risk aversion coefficient under the assumption of expected utility preferences (an assumption we will show later to be descriptively invalid).

Fig. 2
figure 2

Percentage of respondents choosing option A for each decision task

The two H&L tasks and the pvMPL5 tasks imply significant risk averse behavior as subjects switch, on average, far after the fourth choice. However, for the low payoff payoff-varying task, pvMPL1, where probabilities are held constant, a different picture emerges. Subjects exhibit what appears to be risk loving behavior in the constant-probability task while they show risk averse behavior in the conventional H&L task.

One striking difference in the pvMPL1 task is the fact that the percent choosing option A remains at about 50% for the first five decision tasks, and in actuality slightly increases over this range. One explanation for this trend is that the payoff-varying task generated more multiple switching points than the standard H&L task.Footnote 10 If we calculate the number of choices that violate monotonicity (i.e., number of times a subject switched from preferring the right option to preferring the left option), we find that on average subjects made 0.21 and 0.11 such violations in the original H&L task at low and high payouts, respectively. By contrast, in our payoff-varying MPL tasks with constant probabilities, on average subjects made 0.85 and 0.69 such violations in the low and high payout tasks, respectively. Over the first few choices in the payoff-varying decision task at low payoffs (pvMPL1), the difference in the expected values between lottery options A and B were relatively small, and this might partially explain why the task generated more switching behavior. However, it should be noted that such small differences in expected values were required to generate the same implied CRRA intervals as the original H&L task given the overall payout magnitudes. Thus, this is not a feature of the payoff-varying task per se but rather a feature of constant relative risk aversion and expected utility theory applied to lotteries with payouts of the magnitude considered in the original H&L task but with constant probabilities. It is therefore possible to construct other payoff-varying tasks with larger differences in expected value that would mitigate multiple switching behavior. Importantly, we have analyzed our data removing individuals that significantly violated monotonicity (i.e., made three or more inconsistent choices), and our econometric estimates (discussed momentarily) are virtually unchanged.

Figure 2 also illustrates the effects of scaling off payoffs. For the traditional H&L task, increasing payoffs had very little effect on the percentage of times option A was chosen. However, increasing payoffs had a much larger effect on our payoff-varying MPL. The issue of monotonicity does not appear as problematic in the payoff-varying MPL when payoffs are scaled up by a factor of five. This might be because the expected value differences between options A and B (shown in Table 2) are also scaled up by a factor of five in this task.

4.2 Econometric modeling approach

To explore the results in terms of the curvature of the utility and probability weighting functions, we utilize the random utility approach also used by Andersen et al. (2008) and use the rank-dependent utility model as the base-line model of analysis. We let the random rank-dependent utility of option k experienced by individual i in choice j be: \(V_{ij}^{k}=Z_{ij}^{k}+\varepsilon _{ij}^{k}\) for k = A,B where ε i j is a stochastic error term assumed to be known to the individual but unobservable to the analyst. \( Z_{ij}^{A} \) and \( Z_{ij}^{B} \) are the systematic portions of the utility functions assumed to follow rank-dependent preferences, i.e., \( Z_{ij}^{A}=w(p_{j} )U({A_{j}^{H}} )+(1-w(p_{j} ))U({A_{j}^{L}} ) \), where \( {A_{j}^{H}} \) is the high payoff and \( {A_{j}^{L}} \) is the low payoff for option A in choice j (and likewise for \( Z_{ij}^{B}\)).

The probability of option A being chosen over option B can be given by \(P_{ij}^{A}={\Phi }((Z_{ij}^{A}-Z_{ij}^{B})/\sigma )\), where the difference in the error terms is distributed i.i.d. normal with standard deviation equal to σ. Thus, a log likelihood function can be defined for estimation: \( LF={\sum }_{i=1}^{N}{\sum }_{j=1}^{J}[y_{ij}ln(P_{ij}^{A})+(1-y_{ij})ln(1-P_{ij}^{A})] \) where y i j = 1 if option A is chosen, y i j = 0 if option B is chosen and y i j = 0.5 if an individual indicates indifference to A and B.Footnote 11

In the analyses that follow, we consider several specifications for w(p) and U(x). The baseline specification for the utility function is the constant relative risk aversion specification: \( U(x) = \frac {x^{1-r}}{1-r} \), where r is the coefficient of relative risk aversion.Footnote 12 In the payoff-varying MPLs we have many more points on the utility function and can also estimate an expo-power utility function (Saha 1993): U(x) = (1−e x p(− α x 1−r))/α.

For the probability weighting function in the H&L MPLs, we consider the function used by Tversky and Kahneman (1992) and others: w(p) = p γ/[p γ + (1 − p)γ]1/γ. We also estimated the probability weighting function proposed by Prelec (1998): w(p) = e x p(− β(− l n(p)τ)). For the payoff-varying MPL, there is only a single probability point and thus we need only estimate a single parameter, θ, representing the weight placed on the 0.5 probability, i.e., w(0.5) = θ.

We estimated each of these competing specifications separately for high and low payoffs (likelihood ratio tests reject the hypothesis of equality of parameters across high and low payoffs for each specification). Moreover, we used the Akaike and Bayesian information criteria (AIC and BIC, respectively) to determine the best fitting model for each dataset.

4.3 Econometric results

For the traditional H&L MPLs, each of the aforementioned model variations was estimated (see the Electronic Supplementary Material Table B.2).Footnote 13 For both high and low payoffs, the AIC and BIC model selection criteria indicate a preference for the rank-dependent models over the prospect-theory models. Within the rank-dependent models, the Tversky and Kahneman (1992) probability weighting function is preferred to the Prelec (1998) weighting function according to the AIC and BIC.

The preferred model for both the H&L1 and H&L5 treatments is the CRRA model with the Tversky and Kahneman (1992) probability weighting function. For low payoffs, the estimate of the coefficient of relative risk aversion (r = 0.004) was not statistically different from zero, but the estimated parameter on the probability weighting function, γ = 0.501, was statistically different from one, indicating a rejection of the expected utility model in favor of the rank-dependent model. Similarly, for high payoffs, the coefficient of relative risk aversion (r = −0.135) was not statistically different from zero, but the estimated parameter on the probability weighting function, γ = 0.45, was statistically different from one. Thus, at least for our subjects, the apparent risk averse behavior shown in Fig. 2 is solely a result of probability weighting rather than utility function curvature for the conventional H&L tasks. The implication is that practitioners using the H&L task to infer curvature of the utility function would have arrived at erroneous conclusions had they not also jointly estimated the extent to which people weigh probabilities non-linearly.

Figure 3 plots the estimated probability weighting functions for the low payoff H&L1 MPL. In addition to the two aforementioned functional forms, we also show the results of a non-parametric estimation in which a single parameter is estimated for each of the 11 probability points available in the H&L task (with the lowest normalized to zero and the highest normalized to one). Although the non-parametric form is not preferred to the parametric forms according to AIC and BIC, the results reveal the level of information about probability weights obtainable from the H&L task. In all specifications, the results reveal significant over-weighting of low probability events and under-weighting of high probability events.

Fig. 3
figure 3

Non-linear probability weighting implied by three different models for the H&L1 MPL

Turning to the payoff-varying MPLs, the AIC and BIC indicate that the most preferred models (see full results in the Electronic Supplementary Material Table B.1) are the models assuming constant relative risk aversion with the error variance normalized to one for both the low and high payoff tasks. The results reveal that at low payoffs, the coefficient of relative risk aversion (r = 0.116) was not statistically different from zero and the estimated weight applied to probability 0.50 was w(0.5) = 0.581, a difference (0.581 − 0.50 = 0.081) which is not statistically different from zero, implying linear probability weighting in the vicinity of p = 0.5. Taken together, for low payoffs, the estimates imply near risk-neutral behavior for the payoff-varying MPL.

At high payoffs, however, a different picture emerges. For our payoff-varying MPL which utilizes much more variation over payoff amounts than the H&L task, we find that for high payoffs, a statistically significant estimate for the coefficient of relative risk aversion (r = 0.233) emerges. Moreover, we find that the estimated weight applied to probability 0.50 was w(0.5) = 0.366. This estimate of probability weighing is very similar to that implied by the high payoff H&L task (with γ = 0.45, the H&L5 task implies w(0.5) = 0.313).

4.4 Prediction performance

Because of the larger variation in probabilities in the H&L task, we have argued that this task should yield better estimates of the curvature of the probability weighting function. By contrast, because of the larger variation in the monetary amounts in our payoff-varying MPL, we have argued that this task should yield better estimates of the curvature of the utility function. To put these conjectures to the test, we now see how well the aforementioned estimates are able to predict the hold-out tasks at low (H1) and high (H2) payoff amounts.

In particular, we compare the predictive performance of three models: i) a model based on the estimate of r and γ from the H&L MPL, ii) a model based on the estimate of r (and for lack of a better choice assuming γ = 1) from the pvMPL, and finally iii) a composite model in which we use the estimate of r from our payoff-varying MPL and the estimate of γ from the H&L MPL.Footnote 14 To judge predictive fit, we use two criteria: 1) the percent of correct predictions and 2) the value of the likelihood function observed at out-of-sample values — the out-of-sample log-likelihood function (OSLLF). The out-of-sample log-likelihood function approach has long been used in the marketing literature for model selection (Erdem 1996; Roy et al. 1996) and further elucidated in the economics literature (Norwood et al. 2004a, b; Drichoutis and Lusk 2014). The OSLLF has desirable properties in judging the predictive fit of discrete choice models and it is our preferred selection criteria.

Table 5 shows the performance of the three models in predicting the out-of-sample hold-out choices. For low payoffs, the composite model generates the same percentage of correct predictions but has a lower OSLLF than the H&L MPL. Although a paired-test indicates no significant difference in the composite-model and H&L1 OSLLF values, the non-parametric sign-rank test indicates the two are significantly different (p-value < 0.01). The composite model outperforms the pvMPL both in terms of percent of correct predictions and in terms of OSLLF (both the t-test and signed-rank test indicate OSLLFs are significantly different at p < 0.01 level).

Table 5 Out-of-sample prediction performance of three competing models

A similar result is obtained for the high payoff values. Although all three models generate similar performance in terms of the percentage of correct predictions, the composite model far outperforms the H&L5 and pvMPL5 tasks in isolation according to the OSLLF values (the composite model yields significantly different OSLLF values as compared to the H&L5 and pvMPL5 tasks according to t-tests and signed-rank tests at the p < 0.01 level).

Taken together, the results in Table 5 largely confirm our intuition that better predictions can be made by using the H&L task to infer the curvature of the probability weighting function and the payoff-varying MPL to infer the curvature of the utility function.

5 Conclusion

Although H&L introduced a useful tool for characterizing risk taking behavior, their approach is limited in being able to identify why a particular behavior under risk was observed. Risk averse behavior could result from curvature of the utility function, curvature of the probability weighting function, or both. The obvious implication is that caution should be taken in directly using a single number like “number of safe choices” from H&L’s risk preference elicitation method to infer curvature of the utility function, the theoretical concept that is often of interest, because risk averse behavior may be driven by probability weighting. In fact, we show that, if anything, the H&L task is probably best suited to measuring the curvature of the probability weighting function.

We constructed a modified version of the H&L task which held probabilities constant at 0.50 and provided much more variation in the payoff amounts. By providing more variation in payoff amounts, we hoped to obtain better estimates of the curvature of the utility function. By and large, that’s what our experimental results imply. At both low and high payoff amounts, econometric estimates suggest that behavior is almost totally driven by the curvature of the probability weighting function (the estimated CRRA is not significantly different from zero in either case). Only with our payoff-varying MPL under high payoffs did we observe significant curvature in the utility function. Moreover, a recent examination of various MPL risk elicitation methods (Csermely and Rabas 2016) finds that our high payoff-varying MPL performs best (among other MPL tasks) in terms of predictive power in games where risk is relevant in explaining behavior. In the same study, our high payoff-varying MPL showed remarkably high individual and aggregate stability within a 30-minute time frame and outperformed all other MPL tasks.

To test our intuition about the relative merits of the two elicitation approaches, we sought to determine whether a composite model that combined the estimate of the curvature of the utility function from our payoff-varying MPL with the estimate of the curvature of the probability weighting function from the H&L task would exhibit better out-of-sample prediction performance with a hold-out task than either model used in isolation. Our results implied that the composite model did indeed generate lower OSLLF values than the estimates from the conventional H&L task or the MPL used alone.