1 Introduction

Many economic decisions are made under uncertainty that cannot be readily quantified by objective probabilities. Consider saving decisions, that is, investing money in bonds or stocks in the presence of inflation and general uncertainty about the economic future. Even in the absence of wars and pandemics, most people find it hard to attach probabilities to the relevant possible events, let alone agree on such an assessment.

The classical paradigm of rational decision making under such uncertainty is subjective expected utility (SEU), underpinned by the axiomatic foundation of Savage (1954). The experimental designs of Ellsberg (1961), later implemented in many studies (see, e.g., the survey of Trautmann and van de Kuilen, 2015), have challenged the SEU paradigm. This challenge was mostly on positive, that is, empirical, grounds: these experiments found that many people behave in a way that is inconsistent with SEU. However, the fact that this behavior has been so robust in lab experiments, and that many researchers have developed axiomatic foundations for alternative models of decision making that admit such behavior (see the references below), may be read as posing a normative challenge to SEU as well, by postulating that a broader set of preference models could be considered normatively appealing.

In this paper we aim to test the normative appeal of these alternative models of decision making under uncertainty. We use the term “normative” in the sense of Ellsberg’s 1962 PhD thesis, see Ellsberg (2001, pages 22–26), and as in the “subjective” definition of rationality given by Gilboa (2012, page 5): We consider a decision normatively appealing (to the decision maker) if the decision maker (still) makes this choice after thorough reflection. In all of our treatments, subjects are provided with a complete, and fairly standard, description of all payoff-relevant aspects of the environment. We operationalize this reflection in the lab by providing subjects with supplementary descriptions that, while not payoff-relevant, emphasize certain ways to think about the environment. The descriptions are provided in the form of short videos that subjects watch before the elicitation of their choice. The experimental design has been pre-registered at the AEA RCT Registry, see Kuzmics et al. (2018).Footnote 1

Our findings are relevant for all models of ambiguity aversion that are monotone. That is, if one act is better than another act in all states, then the inferior act cannot be chosen when the superior act is available. We call such models classical ambiguity aversion (CAA) models.Footnote 2

The experimental environment is directly inspired by the hedging argument of Raiffa (1961) in the context of the Ellsberg (1961) two-color urn experiment. A risky urn contains 49 White and 51 Red balls.Footnote 3 An ambiguous urn also contains 100 balls, each of which is either Green or Yellow, but nothing more is known about the composition of the ambiguous urn. The decision-maker (DM) must choose one of three actions, after which the experimenter draws one ball from each urn. Together, these determine the consequence for the DM, which is either “Win” or “Lose”, with Win being strictly preferred to Lose. We call the choices bets for White, Green, and Yellow. Each bet Wins if the experimenter draws a ball of the corresponding color, and loses otherwise. Ellsberg’s key insight is that Bet White, while commonly chosen, is incompatible with SEU.Footnote 4

In this context, the Raiffa (1961) hedge against ambiguity works as follows. The DM flips a fair coin and bets Green if the coin lands on Heads and bets Yellow if the coin lands on Tails. This randomized action provides an (objective) winning probability of 50%, regardless of the color of the drawn ball from the ambiguous urn. As this is higher than the 49% winning probability from betting White, this action strictly dominates the Ellsberg choice.Footnote 5
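The arithmetic behind the hedge can be written out in one line. As a sketch, let \(p\) denote the (unknown) probability that the ball drawn from the ambiguous urn is Green; the symbol \(p\) is our notation for this illustration and is not part of the experimental description.

```latex
\[
\Pr(\text{Win} \mid \text{hedge})
  = \tfrac{1}{2}\,p + \tfrac{1}{2}\,(1 - p)
  = \tfrac{1}{2}
  > \tfrac{49}{100}
  = \Pr(\text{Win} \mid \text{Bet White})
  \qquad \text{for every } p \in [0, 1].
\]
```

The unknown \(p\) cancels, which is why the 50% winning probability holds for any composition of the ambiguous urn.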

In light of the Raiffa (1961) argument, what are the possible explanations for a classically ambiguity-averse DM to nonetheless bet on White in this experiment even though it is dominated? First, it is possible, even likely, that designing a random choice does not occur to subjects as an option. Second, even if a subject recognizes such a possibility, it is possible that they cannot, or choose not to, go through the required construction and reasoning that would allow them to see that betting on White is dominated.Footnote 6 Suppose that a subject does recognize the possibility and understands the argument. A third possible explanation is that the subject has no access to a suitable randomization device and does not think they can simulate one. So suppose the subject does have a fair coin. The fourth and final explanation is that the subject lacks the ability to commit to the randomized action. Once the coin flip is realized, the subject could revisit their choice, and if they are ambiguity averse they will want to flip the coin again, and again ad infinitum, with one possible outcome being that they bet on White in the end after all.

Classical ambiguity aversion models do not allow us to delve into the reasons behind the choice of White in the presence of the Raiffa (1961) argument, as these preference models are axiomatized for preferences over the space of all pure acts, see e.g., Seo (2009), and not over the set of all mixed acts as in Anscombe and Aumann (1963). This means, however, that whether or not a classically ambiguity averse subject who understands the environment may choose to bet on White depends simply on whether the Raiffa (1961) hedge (including the commitment to its outcome) is provided as a pure choice or not. If it is given as a pure choice the subject cannot choose to bet on White. If it is not given, they can.
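The dependence on the menu can be illustrated with a small sketch. It uses maxmin expected utility as one example of a monotone ambiguity aversion model; this choice of model, the normalization u(Win) = 1 and u(Lose) = 0, and the set of priors (every composition of the ambiguous urn from 0 to 100 Green balls) are assumptions made for illustration only, and all names in the code are hypothetical.

```python
from fractions import Fraction

P_WHITE = Fraction(49, 100)  # known share of White balls in the risky urn

# ASSUMPTION: the DM considers every composition of the ambiguous urn possible.
GREEN_SHARES = [Fraction(g, 100) for g in range(101)]


def win_prob(act, green_share):
    """Winning probability of an act for a given share of Green balls."""
    if act == "White":
        return P_WHITE
    if act == "Green":
        return green_share
    if act == "Yellow":
        return 1 - green_share
    if act == "Hedge":  # fair coin decides between Green and Yellow
        half = Fraction(1, 2)
        return half * green_share + half * (1 - green_share)
    raise ValueError(f"unknown act: {act}")


def maxmin_value(act):
    """Worst-case winning probability over all priors (u(Win)=1, u(Lose)=0)."""
    return min(win_prob(act, g) for g in GREEN_SHARES)


def maxmin_choice(menu):
    """Act with the highest worst-case winning probability in the menu."""
    return max(menu, key=maxmin_value)


no_coin_menu = ["White", "Green", "Yellow"]
coin_menu = no_coin_menu + ["Hedge"]

print(maxmin_choice(no_coin_menu))  # White
print(maxmin_choice(coin_menu))     # Hedge
```

Under these assumptions the same preference picks Bet White when only the three pure bets are offered and picks the hedge once it is offered as a pure choice, which is the menu dependence described above.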

Motivated by these considerations, our experimental treatments provide differential supplementary observations that focus on the hedging argument. All of the supplementary observations are given after a (common to all treatments) standard and complete description of the environment, and before the elicitation of the subject’s choice. If the standard and complete description of the environment suffices to impart a full understanding of the consequences of each possible action, then the treatment effects of the supplementary observations will be null. Indeed, there is no marginal payoff-relevant information in the supplementary observations.

To the contrary, our first main finding is that the commentary significantly changes behavior. Since the commentary does not affect a subject’s underlying preferences, it changes behavior by modifying the subject’s understanding of the environment. This is only possible if the subject’s understanding after the standard description was incomplete or erroneous. We conclude that many subjects indeed have an incomplete or erroneous understanding after being provided with a standard and complete description of the classical Ellsberg two-urn environment, so that directly applying the revealed preference toolkit to such data may not be appropriate. Instead, a given choice may reflect an incomplete comprehension of the environment, and may therefore be viewed as a mistake, or the result of confusion, rather than as a manifestation of the subject’s preference.

Our second main finding concerns the direction of the effects of the supplemental observations. In this regard, and to the extent we can infer preferences from the treatments with commentary, our data supports the broader normative appeal of ambiguity aversion models over SEU in the following sense: When the Ellsberg choice (Bet White) is compatible with ambiguity averse preferences but not with SEU, the supplemental hedging observations increase the frequency of Ellsberg behavior; when the Ellsberg choice is incompatible with ambiguity aversion (and so also with SEU), the same observations decrease the frequency of Ellsberg behavior.

The paper proceeds as follows: The experimental design is given in Section 2. The results are given in Section 3. Section 4 offers a discussion of these results and Section 5 provides a brief survey of related literature before Section 6 concludes. Experimental instructions are presented in the Appendix.

2 Experimental design

The slight variation of the Ellsberg two-color urn experiment, outlined above, serves as the baseline and control. We then vary the supplemental observations that subjects receive about the decision-making environment. Our treatments differ from the baseline along two dimensions. First, in addition to the standard options of betting on White, Yellow, or Green, in some treatments subjects are presented with an additional fourth option, which executes a bet on either Green or Yellow according to the outcome of a fair coin to be tossed by a third party after the balls are drawn from the urns—the Raiffa hedge. Note that this option also serves as a commitment device for randomization, as the bet will be executed by the experimenter on the subject’s behalf. Recall that the compatibility of the Ellsberg choice (bet White) with classical ambiguity aversion models hinges on the presence or absence of this option.

Second, after the environment is fully described as transparently as possible, in some treatments subjects watch short videos before making their (single) choice. The videos, while all factually correct, emphasize different aspects of the consequences of using the randomization device, in ways that we hypothesize may change some subjects’ understanding of the merits of the Raiffa hedge.

In the main treatment, the coin flip bet is included as an option, and subjects are presented with a single video (denoted V1) containing supplemental observations that describe the hedging argument of Raiffa (1961).Footnote 7 It describes the outcome of betting on the coin conditional on the outcome of the ball drawn from the ambiguous urn. It states that the winning probability using that option is 50% in either case (Green ball or Yellow ball drawn), and concludes by reminding the subject that betting on White wins with probability 49%.

This video, and each one of our other videos, does not explicitly advocate for any particular choice, so that it contains observations rather than advice. The transcripts of the videos are read by an anonymous (to subjects) third party to avoid a perception that the experimenters are giving implicit advice.

Partly to control for a possible experimenter demand effect, we designed a second video (denoted V2), in which the structure of the argument and the language is symmetric to the first video. It describes the outcome of betting on the coin conditional on the outcome of the coin flip. It states that no known winning probability can be specified in either case (Heads or Tails), and concludes by reminding the subject that betting on White wins with probability 49%. Again, it does not advocate for any particular choice.

We ran treatments utilizing exclusively this second video, as well as treatments in which subjects were presented with both videos before eliciting their choice (in both orders; there were no order effects).Footnote 8

As we want to understand the effect of the supplementary observations independently from the effect of presenting the hedging device as an explicit option, we ran a parallel set of treatments with similar videos but where the available options were simply bets for White, Green, or Yellow, as in the baseline case, without the coin flip option. In these treatments, the Ellsberg choice remains compatible with CAA models. We varied the videos slightly to accommodate the different choice set. First, as there was no explicit coin, before showing either V1 or V2, we showed a preliminary video (denoted V0) in which it was explained that a subject could imagine a virtual coin toss, and then bet on Green/Yellow according to the outcome. Second, in videos V1 and V2 the coin toss was referred to as a virtual coin toss. We refer to the treatment with the explicit hedge/commitment option as “Coin” and those without it as “No Coin” treatments.

3 Results

Table 1 summarizes the main findings.

Table 1 Summary of data (left: without a randomization device; right: explicit randomization option)

We summarize the key findings from this data in the following three results.

Result 1

If preferences alone dictate choices, the supplementary observations contained in the videos should have no effect on behavior. For the No Coin (Coin, resp.) treatments, pooling the data for Green and Yellow (being conservative), the p-value for the null hypothesis that choice frequencies are the same in the baseline and the V1 video treatment is 0.067 (\(< 0.001\), resp.),Footnote 9 and for the null hypothesis that choice frequencies are the same in the baseline and the neutral \(V1+V2\) video treatment is 0.142 (0.007, resp.).

Result 2

The p-value for the null hypothesis that the neutral video observations (\(V1+V2\)) do not decrease the choice of White relative to the baseline is \(\ge 0.5\) for the No Coin treatments, and 0.057 for the Coin treatments.

Result 3

Footnote 10 Without supplemental observations there is no significant difference in the frequency of Bet White between the Coin and No Coin treatment, p-value 0.402. With supplemental observations (\(V1 + V2\)) there is a significant difference in the frequency of Bet White between the Coin and No Coin treatment, p-value \(< 0.001\).

4 Discussion

There is firm evidence that behavior across treatments is not a pure consequence of underlying preferences combined with a complete understanding of the environment. The observations contained in the videos cannot change preferences as they do not change any of the payoff-relevant considerations. Rather, any differences in behavior must come from differences in the subjects’ understanding.

Suppose we adopt the view that choices after studying both videos indeed reveal preferences, since subjects may have a more complete understanding of the environment and the consequences of their choices after considering the observations contained therein. Even then, 23% of subjects, i.e., those who chose Bet White in the Coin treatment, have preferences different from any CAA model. The remaining 77% of subjects make choices in the Coin treatment that can be explained by CAA as well as SEU.

However, the No Coin treatments provide an interesting contrast. In these treatments, Bet White is undominated and the supplemental observations increase the frequency of Bet White relative to the baseline description. If, again, we view the choices after studying both videos as revealing preferences, the choice of Bet White made by 58% of subjects is inconsistent with SEU but is consistent with CAA models.

Together, these findings suggest that CAA models have broader normative appeal than (the narrower theory of) SEU despite their descriptive problems in some environments. Of note, we asked subjects (in a non-incentivized post-experiment questionnaire) if their preference became more or less clear after watching the videos. In the No Coin treatments 17% (27 out of 162) of subjects reported that their preferences became “less clear” after watching the videos, which is much higher than the 3% (6 out of 202) reporting the same in the Coin treatment (\(p<0.001\)), calling into question the presumption that behavior directly reveals preferences, especially in the No Coin treatments. One interpretation is that many subjects find monotonicity to be a normatively appealing property, yet lack the sophistication to identify its consequences.
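The comparison of the two “less clear” shares can be reproduced from the counts reported above. The sketch below uses a two-sided Fisher exact test; which test underlies the reported \(p<0.001\) is not stated in this section, so the choice of test is an assumption, and the function and variable names are hypothetical.

```python
from fractions import Fraction
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, total = a + b, a + c, a + b + c + d
    denom = comb(total, row1)

    def pmf(k):
        # Hypergeometric probability of k "less clear" answers in row 1.
        return Fraction(comb(col1, k) * comb(total - col1, row1 - k), denom)

    observed = pmf(a)
    lo = max(0, row1 - (total - col1))
    hi = min(col1, row1)
    # Sum the probabilities of all tables at most as likely as the observed one.
    return float(sum(p for p in map(pmf, range(lo, hi + 1)) if p <= observed))


# Counts reported in the text: subjects answering "less clear" out of all subjects.
no_coin_less_clear, no_coin_n = 27, 162
coin_less_clear, coin_n = 6, 202

p_value = fisher_exact_two_sided(
    no_coin_less_clear, no_coin_n - no_coin_less_clear,
    coin_less_clear, coin_n - coin_less_clear,
)

print(f"No Coin: {no_coin_less_clear / no_coin_n:.1%}")
print(f"Coin:    {coin_less_clear / coin_n:.1%}")
print(f"two-sided Fisher exact p-value: {p_value:.2e}")
```

Exact rational arithmetic is used for the hypergeometric probabilities so that the comparison with the observed table does not depend on floating-point rounding.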

We conclude this discussion by considering preferences that may depend on the timing of the resolution of uncertainty, as in the models of Seo (2009), Saito (2015), and Ke and Zhang (2020). Such models are not classical, as they are not monotone.Footnote 11 In the Coin treatments subjects were (truthfully) told, as part of the baseline description of the environment, that the coin flip would be executed after the balls were drawn from the urns and revealed. Thus, the choice of Bet White in the Coin treatments (37% in the baseline and 23% after both videos) is inconsistent even with these more flexible models.

Roughly, one could categorize our subjects as follows. There is one group of subjects (\(\ge 23\%\)) who make choices inconsistent with all models of ambiguity aversion. There is a second group of subjects (\(\le 58\% - 23\% = 35\%\)) who make choices consistent with ambiguity aversion but not with SEU. The remainder make choices consistent with SEU. Subjects in the second group would have a demand for randomization devices as they cannot, to their satisfaction, create and commit to randomized choices themselves.

5 Related literature

A number of papers have studied the consistency of subjects’ choices across decision problems. These include Binmore et al. (2012), Stahl (2014), Voorhoeve et al. (2016), and Crockett et al. (2019). This literature finds, on the whole, that relatively few subjects make consistent choices, and those who do tend to be ambiguity-neutral. The lack of consistency can be interpreted as evidence against people choosing according to a clear preference. However, ambiguity-averse DMs may find inconsistent choices to be a useful hedge against ambiguity, see e.g., Kuzmics (2017) and Azrieli et al. (2018) for more general arguments.Footnote 12 Our single choice design is immune to such problems. This is why we constrained our design to a single incentivized elicitation per subject, even at the cost of forgoing the ability to conduct within-subject analyses across treatments.

Spears (2009), Dominiak and Schnedler (2011), and Oechssler et al. (2016) study experiments in which subjects are given the Raiffa hedge as an option, similarly to our baseline Coin treatment (without supplemental commentary). Generally, they find very few subjects choosing this option, with more subjects instead choosing a dominated option. This too is evidence against CAA models. They also find that subjects do not care about the timing of the resolution of uncertainty, which is evidence even against the non-CAA models of Seo (2009), Saito (2015), and Ke and Zhang (2020). Relatedly, Baillon et al. (2022b) study how subjects respond to two different forms of randomized incentive mechanisms used to elicit ambiguity aversion preferences. Somewhat in contrast to the above-mentioned literature, they find that, while about 50% of subjects make an Ellsberg-like choice in a treatment without randomization (akin to a treatment without a coin toss), only 25% to 29% of subjects make this choice when there is randomization (akin to a treatment with a coin toss). Interestingly, however, and consistent with the above-mentioned literature, subjects again do not seem to care about the timing of the resolution of uncertainty.

We add to this literature by focusing on the possible effects on subjects’ choices of providing explicit descriptions of the Raiffa hedge in the form of video commentaries. We find that such observations significantly influence behavior, and do so in directions that support the appeal of CAA models.

Nielsen and Rehbeck (2022) study the normative appeal of key axioms of rational decision-making under risk. They ask subjects two things: whether they agree that their choices should satisfy a certain axiom, and to make a choice from a set of lotteries. When a subject’s two answers are inconsistent, they point this out and ask whether the subject would like to reconsider. They find that, of the subjects whose answers are inconsistent, 47% revise their choice to be consistent with the axiom, while 13% revise their wish to be consistent with the axiom. Our approach has similarities with that of Nielsen and Rehbeck (2022). We also show subjects an axiom, if one can call it that, by giving them video commentary about the consequences of the Raiffa (1961) hedge, i.e., of making choices based on a coin flip. We also have a similar goal: we are interested in whether subjects make choices by mistake because they do not fully understand the consequences of their choices. The main difference is that we study decision-making under ambiguity rather than under risk. Our “axiom” of interest, the Raiffa (1961) hedge, is therefore quite different. We think of the Raiffa (1961) hedge not so much as an axiom but more as an argument. Therefore, we do not just ask subjects if they agree with the argument, but provide two neutrally phrased commentaries about the consequences of the coin toss, one implicitly advocating for and one against the Raiffa (1961) hedge. Some of our findings, while in a different domain of choice problems, are nevertheless consistent with those of Nielsen and Rehbeck (2022): in both decision-making under risk (Nielsen & Rehbeck, 2022) and decision-making under ambiguity (this paper), there is evidence of people making choices that they see as mistakes after induced contemplation. To this finding we add evidence, for the domain of decision-making under ambiguity, of how these mistakes are made and how they are corrected.

Finally, several studies test, in various other ways, the normative appeal of ambiguity aversion preference models.Footnote 13 The closest to our design is that of Slovic and Tversky (1974), who give subjects written advice for and against Allais (1953) and Ellsberg (1961) choices. However, their advice is built around the independence axiom, and so concerns a quite distinct domain. Jabarian and Lazarus (2022) also study the effect of a form of advice on subjects’ decisions in a framework with ambiguity aversion. Their framework involves two independent draws from the same two-color ambiguous urn (and two draws from a 50-50 risky urn) in which many subjects make dominated choices, similarly to Yang and Yao (2017) and Kuzmics et al. (2022). Subjects win if they draw two balls of the same color from the urn that they choose, which makes betting on the ambiguous urn a (weakly) dominant choice: the more extreme the ball distribution in the ambiguous urn, the higher the chance of drawing two balls of the same color. Jabarian and Lazarus (2022) have treatments in which subjects are given additional decision problems that should help them understand the mechanism by which a choice is dominated. They find that, while subjects do seem to understand the mechanism, they do not seem to transfer this knowledge to the original problem, in which they make dominated choices regardless. Finally, Keller et al. (2007), Trautmann et al. (2008), Charness et al. (2013), and Keck et al. (2014) study decision problems with ambiguity in groups (or under peer observation) and find, on the whole, that group discussion and related phenomena tend to lead to more ambiguity-neutral choices.

6 Conclusion

We have subjected classical models of ambiguity aversion to tests of their normative appeal with experiments that stay close to the original Ellsberg (two-color urn) design.

We find that subjects’ choices are affected by payoff-irrelevant commentary. This implies that at least one of the two treatments, without or with commentary, does not allow the full revelation of subjects’ preferences.

At least some of our subjects do seem to see a certain normative appeal in the kind of behavior prescribed by classical models of ambiguity aversion and, in particular, the monotonicity axiom. Giving subjects access to additional commentary, in the form of short video clips, results in behavior that is more consistent with these models.

The nature of this normative appeal suggests that people, after sufficient reflection, would have a demand for the ability to commit to randomized choices, a demand which one would surmise should be observable. It would be interesting to identify instruments outside the lab, in the various areas of application of ambiguity aversion models, that could serve to satisfy this demand.

We also find that our subjects lack a complete and perfect understanding of their decision environment and how their choices map into final outcomes, in spite of the fact that we did our best to describe the environment completely and accurately. If this is the case, then there is room for further descriptions to influence behavior. We have shown that this is indeed readily observable, using the relatively weak instrument of short video clips providing commentary on the hedging argument of Raiffa. This means that in classical designs, it may be necessary to view a given choice as arising from a combination of preferences and how the subject understands the environment, where the second channel is non-trivial. Accordingly, a given choice may not provide direct evidence for or against any given preference model.