1 Introduction

In daily life people have to make many important decisions that involve risk. Buying a house, moving to take a new job opportunity, and investing family savings are clear examples. In recessionary periods, like those experienced in Europe recently, these decisions are often exposed to elevated risk. Typically, it is not only the decision maker who is affected by the resulting outcomes, but other people as well. In the examples above all members of the household feel the consequences. Moreover, sometimes decisions like these are made not by sole individuals, but by groups of people. For example, in a household, a husband and a wife might make the choices together. Similarly, managers act on behalf of shareholders. Decision makers in such situations have responsibility since their choice affects others to the same degree, and, in some cases, they are involved in mutual decision making when several people make choices jointly for themselves and a group of others.

We put forward a hypothesis that decisions made under responsibility are influenced by blame avoidance: a change in behavior that is triggered by the desire to preclude others from forming an unfavorable belief about the decision maker when her choice affects them. In the psychological literature two primary reasons for blame are distinguished: causal (e.g., Darley and Shultz 1990) and intentional (e.g., Baron and Ritov 2004). In the former case, the decision maker is blamed for a bad outcome even if she did not cause it directly. In the latter case, the decision maker is blamed for choosing a bad action even if this choice did not cause direct harm. Cushman (2008) shows that, depending on the situation, either or both reasons can influence the amount of blame. In economics, the two sources of blame can be conceptualized in the framework of decision making under uncertainty. Indeed, a decision maker can be blamed for a choice among uncertain prospects and/or for a realized or counterfactual outcome when uncertainty is resolved. Gurdal et al. (2013) provide evidence for both causal and intentional blame. In their experiment, an agent chooses between a lottery and a safe asset. The outcome of the chosen option goes to a principal, who then decides how much to allocate between the agent and a third party. They find that principals blame agents for low realized outcomes of the chosen lottery (causal blame), for high counterfactual outcomes of the unchosen lottery (causal blame), and for choosing an option with low expected utility (intentional blame).Footnote 1

In this paper we do not focus on how blame can be expressed by the individuals for whom the choice is made, but rather on how blame avoidance manifests itself in the choices of individuals who are responsible for others. To be in line with the examples of decisions mentioned above and to reduce the number of possible interpretations of our results, we design a choice environment where causal blame does not play a leading role so that any effects we find can be attributed to intentional blame. In order to understand how the decision maker should estimate the amount of blame generated by her choices we use the intuition of Çelen et al. (2017), who propose and provide experimental support for a model in which people blame others for choices that they themselves would not make (in a non-risky setting). In their study blame is directly connected to the preferences of the individuals for whom the choice is made.Footnote 2 In our risky setting, we conjecture that blame avoidance should shift choices in a way that makes them more consistent with what the affected others would have chosen, so that the amount of intentional blame is minimized. Put differently, under responsibility the choice of a decision maker should move towards what she believes is consistent with the risk preferences of the majority (modal preferences).

To find out if choices under responsibility are in line with the blame avoidance hypothesis, we conduct an experiment in which we compare risky choices that influence only the decision maker herself and the choices that, in addition, affect one other person. We also test two accompanying hypotheses. The first one is related to the level of responsibility by which we mean the number of people that are affected by the decision. Intuitively, there should be “more” blame when more people feel the consequences of the decision. This should push the choices even more towards consistency with the preferences of the others than in the case when only one other person is involved. The second hypothesis is concerned with mutual decision making when the choice between lotteries is made by two individuals who should agree on a choice that impacts themselves and a group of others. In this context, the two decision makers have more information available to them than when they choose alone. Namely, when discussing their choice, the decision makers reveal their personal risk preferences, which provides them with more information about the distribution of preferences in the population. This should allow them to make a more informed guess about the modal preferences and, thus, shift the mutual choice closer to it as compared to the situation when each of them is choosing individually.

The blame avoidance hypothesis predicts that, in situations where others are affected, the choices should shift in the direction of the modal preferences in the population.Footnote 3 In particular, risk loving individuals should choose more cautiously and risk averse individuals should choose riskier options. In order to detect such movements we use a novel completely randomized within-subjects design, which allows us to see how heterogeneity in subjects’ risk preferences leads to differential behavior in responsibility situations. Subjects make choices in four stages. Each stage is a different Holt and Laury task (Holt and Laury 2002), which involves ten decisions between two lotteries. Subjects are told that the payoff that they get from each of the four HL tasks is also delivered to 0, 1, 3, or 4 others. Moreover, in one of the four stages two subjects have to make a decision in a HL task together. Each subject makes choices in (1) individual choice condition, where only he is affected by his choice; (2) responsibility for one other condition, where the payoff is delivered to the subject who makes the choice and one other subject in the session; (3) responsibility for three others condition, where the payoff is delivered to the subject who makes the choice and three others; (4) mutual responsibility condition, where two subjects have to agree on a choice and the payoff is delivered to them and three others. The presentation order of the conditions and HL tasks is randomized. This design allows us to directly observe the changes in choices under risk depending on the four conditions and on the individual heterogeneity in risk preferences. A within-subjects design is necessary for testing the predictions of the blame avoidance hypothesis, since the choices of subjects should change in the direction of the modal risk preference, which would be hard to detect if measured between-subjects.

Our design makes it possible to rule out several alternative explanations. First, the role of causal blame is minimized since subjects never see the realizations of the lotteries during the main task and are only informed about the total payoff they receive from others at the end of the experiment. Second, since in all conditions the decision maker and others affected by her choice receive the same payoff, the difference in behavior between conditions cannot be explained by any type of social preferences sensitive to inequality. Third, the shift in behavior could potentially be attributed to preferences for efficiency (maximizing the sum of payoffs of all parties involved). This should push the behavior more towards risk neutrality (highest expected value) the higher the number of others is. Since all four HL tasks are calibrated so that risk neutrality is manifested in the same behavior, the tendency to move towards risk neutrality can be easily distinguished from the shifts towards modal risk preferences.

We report the following results. After controlling for the regression to the mean that is inherent in our design, we find a weak effect of responsibility for one other person on the choices of our subjects as compared to the condition where they choose only for themselves.Footnote 4 When we compare the choices with responsibility for one and three others the effect becomes stronger: risk averse subjects, when responsible for three others, choose riskier options than when responsible for one other, and risk loving subjects choose less risky options. In the mutual responsibility condition, where pairs of subjects choose for themselves and three others, we also observe the same predicted shift as compared to the individual choices for three others. Moreover, the choices in the mutual responsibility condition seem to be more concentrated around the modal risk preferences than the choices in the other three conditions. In addition, we find some evidence that the information about preferences of others obtained in the mutual responsibility condition helps subjects to better match modal preferences in subsequent conditions. All this supports blame avoidance as an explanation of behavior under responsibility.

2 Related literature

One strand of literature on decision making under risk which is of importance for this study is the literature on group decision making (when choices are made mutually by the group members for themselves). Stoner (1961) laid the foundation. He found that people, as members of a group, took on significantly more risk, compared to the situation in which they chose individually. This change was referred to as a “risky shift.” The same effect was found in various other studies, for example, in Gardner and Steinberg (2005).

More recent investigations report somewhat contrasting evidence. For example, Shupp and Williams (2008), who used lotteries to explore differences in choices under risk between individuals and groups, found that groups were significantly more risk averse than individuals when choosing between lotteries that involved high levels of risk. Masclet et al. (2009) and Baker et al. (2008) also found a “cautious shift”: they observed that groups made safer choices than individuals.

The findings that people deciding within a group context are less risk averse can be explained by the fact that risk is shared by the group members as conjectured in Wallach et al. (1964). When choosing alone, all the results of risky decisions are attributed to a single decision maker. Consequently, in case of non-desirable outcomes, this person is the only one who can be seen as “responsible” for the outcome. The authors suggest that guilt and shame aversion (aversion to having negative feelings for being responsible for bad outcome not expected by others) force the decision makers to choose more cautiously.Footnote 5 This reasoning can also be applied in reverse: being solely responsible for successful events results in a higher utility level, compared to achieving the same with a group (Eliaz et al. 2006). As a result, utility levels of the decision makers have a smaller spread when decisions are made in groups rather than on an individual basis. This implies that individuals who make decisions together in groups are exposed to less risk, which could explain riskier choices.

The papers cited above use various methods to test their hypotheses regarding group versus individual choices. However, none of the papers uses a within-subjects design and therefore only the averages between conditions can be compared. This does not allow one to examine the deeper determinants of the changes in risky choices, which our design permits. In particular, a within-subjects design allows us to see how individual risk preferences determine the shifts in choice under responsibility. Between-subjects designs do not allow for this possibility since only averages between conditions can be compared and the differential effects of risk preferences (cautious shift for risk loving subjects and risky shift for risk averse subjects) are hard to detect. In addition, we can directly compare the influence of the number of others on the choices and the influence of the mutual decision making, which is, again, not possible with a between-subjects design.Footnote 6 As an additional advantage, in our design mutual decision making involves passive group members for whom the choice is made, which adds a new dimension never studied before. Thus, with our design, we aim at clarifying the mixed results reported in the studies above.

Some studies concentrate on the dynamics of group decision making. According to Pruitt (1971), leadership theory predicts that the choice shifts towards the most confident group member. Gardner and Steinberg (2005) studied risk preferences of groups of adolescents and found that peer pressure plays an important role in their behavior. In our design subjects in the mutual choice condition can freely chat with each other in order to reach a mutual decision for the group. Given that we also have subjects choose for the group individually, we can see the effect mutual responsibility has on choice.

All studies above compared group decision making to individual choice for the group. It is also interesting to compare an individual choosing just for himself and choosing for himself and a group (Davis 1992; Bolton et al. 2015). In this case responsibility is not shared by several individuals. The decision maker is the only one who decides and is, therefore, the only person responsible. Thus, the mechanism that worked for the group decision making cannot play any role here. Bolton et al. (2015) find that risk loving subjects (as measured by the risky choices when only the subject herself is affected) become more cautious when they are responsible for the payoff of one other subject, but they do not find any shifts towards riskier choices under responsibility among risk averse subjects. The authors attribute this behavior to (causal) blame avoidance and possibly other effects. The individual condition and the choice for one other condition in our experiment are almost the same as the IR and SR+ conditions in Bolton et al. (2015). However, we do not find the same effect after controlling for the regression to the mean.Footnote 7

3 Experimental design

The experiment consists of four stages and a questionnaire. In each of the four stages subjects make choices in a Holt-Laury task, or HL (Holt and Laury 2002). Namely, in each HL task subjects make ten choices between two lotteries, A and B, which have fixed monetary outcomes but varying probabilities.Footnote 8 The monetary outcomes differ across the four HL tasks, but the probabilities attached to each of the ten choices are the same in all tasks (see Table 9 in online Appendix B for details). The payoffs were chosen to satisfy two conditions: (1) the expected payoffs from the lotteries across the four HL tasks are approximately the same; (2) assuming the expected utility function \(u(x) = x^{1-r}\), the optimal switch from choosing lottery A to lottery B happens at the same place for all four HL tasks for given intervals of r.Footnote 9 The main idea of our experiment is to have a complete within-subjects design with four different conditions but with the same kind of task. Therefore, we cannot use one HL task for all four conditions since this can create several types of biases in subjects’ choices if they are exposed to the same task four times in a row. For example, subjects might ignore the differences in conditions and choose the same switching point, thinking that they have to be consistent, or there can be an experimenter demand effect: subjects might deliberate about how exactly they are supposed to change their switching point depending on the condition. We minimize the effects of these biases by having different payoffs in the four HL tasks. In addition, to avoid order effects, each subject was exposed to a different (randomized) sequence of HL tasks.Footnote 10
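The link between the CRRA parameter r and the switching point can be illustrated with a short sketch. The payoffs below are an assumption made for illustration: they are the low-payoff lotteries of the original Holt and Laury (2002) menu, not the calibrated payoffs of the four tasks used here (those are in Table 9, online Appendix B). The function names are hypothetical.

```python
# Illustrative only: payoffs are the low-payoff lotteries of Holt and Laury (2002),
# NOT the four calibrated tasks of this experiment (see Table 9, online Appendix B).
SAFE = (2.00, 1.60)    # lottery A: (high outcome, low outcome)
RISKY = (3.85, 0.10)   # lottery B: (high outcome, low outcome)


def crra(x, r):
    """Utility u(x) = x^(1-r), as in the text; increasing in x only for r < 1."""
    if r >= 1:
        raise ValueError("this sketch assumes r < 1")
    return x ** (1.0 - r)


def expected_utility(lottery, p_high, r):
    """Expected utility of a two-outcome lottery with P(high outcome) = p_high."""
    high, low = lottery
    return p_high * crra(high, r) + (1.0 - p_high) * crra(low, r)


def safe_choices(r):
    """Number of the ten rows in which lottery A has the higher expected utility."""
    count = 0
    for row in range(1, 11):
        p_high = row / 10.0  # probability of the high outcome in this row
        if expected_utility(SAFE, p_high, r) > expected_utility(RISKY, p_high, r):
            count += 1
    return count


for r in (-0.3, 0.0, 0.5):
    print(r, safe_choices(r))  # 3, 4 and 6 safe choices, respectively
```

Under these assumed payoffs a risk neutral agent (r = 0) makes four safe choices, a risk averse agent with r = 0.5 makes six, and a risk loving agent with r = -0.3 makes three. Calibrating several tasks so that they share the same switching intervals amounts to choosing payoffs for which `safe_choices` returns the same value over the same ranges of r.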

There are two treatments: Main and Control. In the Control treatment subjects choose for themselves in the four HL tasks (in randomized order). No information about the payoffs is communicated to subjects between the HL tasks (see online Appendix C for instructions). At the end of the experiment the software chooses one of the ten choices at random for each of the HL tasks, and “plays” the lottery. Each subject receives final earnings that are equal to the sum of the four payoffs realized from the four chosen lotteries. Subjects are informed about this procedure and are told that they will see the payoffs they obtain from each task at the end of the experiment. We choose to pay subjects for all four HL tasks instead of just one because in the four analogous HL tasks of the Main treatment subjects choose for themselves and others. To make choices for others more salient, subjects and others receive a payment for each HL task. Therefore, we opt for paying subjects for each HL task in the Control treatment in order to keep it as similar to the Main treatment as possible.

Paying for all four lotteries might raise concerns about the possibility of the portfolio effect. Harrison and Rutström (2008) directly address this issue in a setting where subjects choose in several HL tasks one after the other (all draws are uncorrelated). They are paid for all HL tasks and the goal of the experiment is to elicit subjects’ risk preferences. They point out that, in this specific case, which is exactly like our treatments, more information about the risk preferences of the subjects can be obtained. For example, when a subject makes six safe choices in one HL task and five in the other, this means that her CRRA risk coefficient lies somewhere in the intersection of the intervals of coefficients that are consistent with making these choices, which makes the risk preference estimation more precise. In addition, Harrison and Rutström (2008) report the results of an experiment where they evaluate risk preferences in this setting but vary the number of HL tasks that are paid (1, 2, or 3). No difference is found in the estimations of risk coefficients. The authors conclude that in the sequential HL tasks setting the portfolio effect does not influence the choices. The same should be true in our experiment. In fact, in order to estimate the strength of regression to the mean, we use the average number of safe choices for each subject in the Control treatment as a measure of their risk preferences.

In the Main treatment we expose subjects to different conditions that vary in the level of responsibility and the number of decision makers (individual vs. mutual choice). This treatment is the same as the Control treatment but with one difference. The subjects are told that the payoff they obtain in each HL task (as described above) will also be paid to other subject(s) in the session. So, if the software generated some payoff x for a subject, the same amount x is paid to other individual(s) depending on the condition. Therefore, the payments that each subject receives in the Main treatment are: 1) payments from own choices (the sum of four payoffs from the HL tasks) and 2) payments from the choices of other subjects in the session when a subject is acting as the individual affected by their decisions. There are four conditions which correspond to the four HL tasks.

  • Individual choice (IC) Subjects choose just for themselves as in the Control treatment.

  • Responsibility for one other person (R1) Subjects choose for themselves and one other subject in the session. So, the payoff generated by the software goes to the subject who made the choice and someone else in the session. The identity of the other was kept anonymous.

  • Responsibility for three other people (R3) Subjects choose for themselves and three other subjects in the session. So, the payoff generated by the software goes to the subject who made the choice and three other people in the session. The identities of the others were kept anonymous.

  • Mutual responsibility (MR) Subjects are matched in pairs. Each pair of subjects (the negotiators) sees the same HL task and should reach an agreement on how to choose in each of the ten cases. The subjects can communicate via online chat using natural language. In order to make a mutual choice both subjects should choose the same lotteries in all ten cases and press the “Validate” button. After both subjects press the button, the software checks if the choices of both subjects are the same and only then allows them to proceed. After the mutual choice was made, the payoff that was generated from these choices was received by the negotiators and three others in the session. This allows us to make direct comparisons between Condition R3 and Condition MR.

The conditions were not presented to the subjects in a fixed order. In each time period all four conditions were played by some subjects in the session. The order was randomized. Table 12 in online Appendix B reports the exact sequence of conditions for each subject and Table 11 shows how subjects were matched. Thus, in our design, subjects were presented with a random sequence of HL tasks and a random sequence of conditions.

The questionnaire elicited standard demographic data and asked two questions related to risk aversion. The first question was “How risk averse/loving are you?” on the scale from 1 = “very risk loving” to 7 = “very risk averse.” The second question was “How risk averse/loving are you in comparison with other people?” on the scale from 1 = “most risk loving” to 7 = “most risk averse.”

The experiments were conducted at Maastricht University, School of Business and Economics (BEELAB), in April 2014, June 2017, and February 2018. 72 subjects participated in the three sessions of the Control treatment and 120 subjects in the five sessions of the Main treatment. Subjects in the Control treatment earned € 11.89 on average, and subjects in the Main treatment earned € 27.89 (both including € 2 show-up fee). No other data were collected in any form: there were no pilots or discarded sessions. For the analysis the subjects who switched more than once in at least one of the HL tasks were dropped. Such choices are inconsistent with the expected utility maximization and might also indicate that subjects did not understand the task. This left us with 54 subjects in the Control treatment and 102 subjects in the Main treatment. The experiment was programmed in z-Tree (Fischbacher 2007).

4 Results

4.1 Preliminary results

Before we get to the main results we would like to ensure that in the four HL tasks of the Control treatment subjects chose (mostly) the same switching point, as should be the case if they are CRRA utility maximizers. If this is true, then, in the Main treatment, we can interpret the movements of the switching point between the HL tasks as the result of the influence of responsibility. We look at the number of safe choices of an individual, or threshold, which is defined as the number of safe options (lotteries A) chosen before switching to the risky option (lotteries B). Thus, a high threshold indicates a high degree of risk aversion. For the HL tasks 1, 2, 3, and 4 we define the variables \(hl_1\), \(hl_2\), \(hl_3\), and \(hl_4\), which are equal to the number of safe choices.
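A minimal sketch of how the threshold and the multiple-switching exclusion rule could be computed from one subject's ten choices is given below. The encoding of choices as the letters "A" and "B" and all function names are assumptions made for illustration; they do not describe the z-Tree implementation.

```python
def count_switches(choices):
    """Number of times two consecutive choices differ ('A' = safe, 'B' = risky)."""
    return sum(1 for prev, cur in zip(choices, choices[1:]) if prev != cur)


def threshold(choices):
    """Number of safe lotteries (A) chosen before the first risky choice (B)."""
    n = 0
    for c in choices:
        if c != "A":
            break
        n += 1
    return n


def is_single_switch(choices):
    """True if the subject switched at most once within the task."""
    return count_switches(choices) <= 1


# Hypothetical subjects, for illustration only.
consistent = list("AAAAAABBBB")     # six safe choices, one switch
inconsistent = list("AABABBBBBB")   # switches three times

print(threshold(consistent), is_single_switch(consistent))   # 6 True
print(count_switches(inconsistent), is_single_switch(inconsistent))  # 3 False
```

In this sketch a subject is kept for the analysis only if `is_single_switch` holds in every one of the four HL tasks.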

Table 1 Summary of the number of safe choices in the four HL tasks in the Control treatment

Table 1 shows the average number of safe choices in the four HL tasks of the Control treatment. Figure 3 in online Appendix A presents these data graphically, and Fig. 5 (online Appendix A) shows the histograms of the number of safe choices in the four HL tasks. All four tasks produce similar average thresholds between 5 and 6, which indicates that our subjects are, on average, risk averse (risk neutral subjects should switch after 4 safe choices in all HL tasks). Nevertheless, the mean number of safe choices in the first HL task is two standard errors away from that of tasks 2 and 4. This might look problematic; however, the difference is rather small and the signed-rank tests do not show a significant difference between any pair of HL tasks at the 5% level.Footnote 11 Still, it might seem that the higher average threshold in the first HL task can bias our results. Notice, however, that we analyze the within-subjects differences in thresholds between conditions. Moreover, in each condition there are subjects who choose in all four HL tasks. Thus, the possible bias of the first HL task should cancel out as there is approximately the same number of subjects choosing in it in all conditions. Overall, we conclude that we can use the choices in the four HL tasks to analyze the condition effects in the Main treatment.

Fig. 1 Fitted values of the OLS regression of the answer to the question “How risk averse/loving are you in comparison with other people?” on the average threshold in the Control treatment. Errors are robust. Data points are jittered

The blame avoidance hypothesis predicts that subjects, when responsible for others, should make choices consistent with the modal risk preferences in the population. This prediction rests on an assumption that subjects have correct beliefs about which number of safe choices corresponds to the modal risk preferences. To test if this is true on average we consider the data from the Control treatment and check if the per subject average threshold \(m = \sum _{k=1..4} hl_k/4\) predicts the answer to the question “How risk averse/loving are you in comparison with other people?” that subjects gave after the experiment.Footnote 12

We run an OLS regression where the dependent variable is the answer to this question (from 1 to 7 on a Likert scale) and the independent variable is m. Figure 1 shows the results. We find that the correlation is significant: intercept \(2.45^{**}, p = 0.010\) and \(\beta\)-coefficient \(0.35^{**}, p = 0.036\). As was reported in Table 1, the average threshold in the Control treatment is 5.45 and the modal choice is 6 (Fig. 5, online Appendix A). The fitted values of the regression do not exactly correspond to these average/modal preferences: subjects with the thresholds around 5 consider themselves to be average/modal. However, the discrepancy is not dramatic. It is important that the very risk averse (loving) subjects know they are more risk averse (loving) than the majority. So, the assumption that subjects on average have correct beliefs about the modal preferences approximately holds.
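The calculation behind this check can be sketched as follows. The helper names are hypothetical, the plain least-squares fit below does not compute the robust errors used in the text, and the only empirical numbers used are the intercept (2.45) and the coefficient (0.35) reported above.

```python
def average_threshold(hl):
    """Per-subject average m of the thresholds hl_1..hl_4."""
    return sum(hl) / len(hl)


def ols_fit(x, y):
    """Intercept and slope of a one-regressor least-squares fit."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Coefficients reported in the text for the Control treatment.
INTERCEPT, BETA = 2.45, 0.35


def fitted_answer(m):
    """Fitted self-assessment (1 = most risk loving, 7 = most risk averse)."""
    return INTERCEPT + BETA * m


for m in (5.0, 5.45, 6.0):
    print(m, round(fitted_answer(m), 2))  # 4.2, 4.36, 4.55
```

If the midpoint 4 of the seven-point scale is read as "about as risk averse as others" (an interpretation assumed here, not stated in the questionnaire), the fitted answer equals 4 at a threshold of roughly 4.4 and rises to about 4.2 at a threshold of 5, which is the sense in which subjects with thresholds around 5 place themselves near the middle.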

Next, we provide some summary statistics for the Main treatment. Each subject made four choices: (1) a choice in an HL task for herself only (Condition IC); (2) a choice for herself and one other person (Condition R1); (3) a choice for herself and three others (Condition R3); and (4) a mutual choice with a partner for three others (Condition MR). In what follows, we will denote the thresholds in the four conditions by \(t_{IC}\), \(t_{R1}\), \(t_{R3}\), and \(t_{MR}\).Footnote 13

Table 2 Summary of the number of safe choices in the four conditions of the Main treatment

Table 2 shows the average numbers of safe choices in the four conditions (see Figs. 4 and 6 in online Appendix A for the graph of averages and the histograms). The averages are very close. The signed-rank tests show no significant differences between conditions (Holm-Bonferroni corrected \(p > 0.186\) for the six comparisons performed). This result does not necessarily mean that there is no influence of responsibility on choices. As we hypothesize, it is possible that subjects change their choices in different directions depending on their individual risk preferences (cautious and risky shifts).Footnote 14
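For readers unfamiliar with the Holm-Bonferroni correction, a minimal sketch of the standard step-down adjustment follows. The p-values in the example are made up for illustration and are not the p-values of the six signed-rank tests.

```python
def holm_adjust(p_values):
    """Holm-Bonferroni adjusted p-values, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is multiplied by m - k + 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Adjusted p-values must not decrease as raw p-values increase.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Made-up p-values, for illustration only.
print(holm_adjust([0.01, 0.04, 0.03]))
```

A comparison is declared significant at level 0.05 when its adjusted p-value is below 0.05; with six comparisons the smallest raw p-value is multiplied by six.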

4.2 Responsibility

In this section we analyze the effects of the four conditions on the choices of our subjects. The goal is to measure the influence of responsibility for other people on the number of safe choices made. We look at the within-subjects differences in thresholds \(t_{IC}\), \(t_{R1}\), \(t_{R3}\), and \(t_{MR}\) and make pair-wise comparisons of the conditions as the level of responsibility increases. Thus, we are interested in how the choices change between Conditions R1 and IC (responsibility for one other vs. no one); Conditions R3 and R1 (responsibility for three others vs. one other); and Conditions MR and R3 (mutual responsibility for three others vs. individual responsibility for three others).

In order to show that responsibility for others influences choices in the HL tasks, we need to find a relationship between thresholds in Conditions i and j. The simplest way to do this is to run a regression with dependent variable \(t_i\) and independent variable \(t_j\). However, this approach cannot be used if we want to compare the observations in the Main and Control treatments (which we do want to do, see below). The latter does not have the same conditions as the Main treatment, but only the four independent HL tasks, which makes it unclear which of the four tasks should be the analogs of \(t_i\) and \(t_j\) for Conditions i and j. To circumvent this problem we define the variables \(t_{i-j}\) that, for each subject, denote the differences in the number of safe choices between Conditions i and j, or, in other words, \(t_i - t_j\). For example, \(t_{R1-IC}\) is the difference between the number of safe choices in Condition R1 and Condition IC (\(t_{R1} - t_{IC}\)). For HL task j in the Control treatment we define \(t_{i-j}\) as the difference between the average threshold and \(hl_j\): \(t_{i-j} = hl_{m-j} = m - hl_j = \sum _{k=1..4} hl_k/4 - hl_j\), which is the deviation of each threshold from subject’s average risk preferences estimated from the four HL tasks. This definition of \(t_{i-j}\) is symmetric with respect to all HL tasks and allows us to use \(t_{i-j}\) as a dependent variable in the regressions that combine all the data from both treatments.
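The two definitions can be written out as a short sketch. The dictionary keys, function names, and the example thresholds are hypothetical and serve only to illustrate the arithmetic.

```python
def main_differences(t):
    """Threshold differences t_{i-j} = t_i - t_j for the three comparisons
    in the Main treatment; t maps condition name to threshold."""
    return {
        "R1-IC": t["R1"] - t["IC"],
        "R3-R1": t["R3"] - t["R1"],
        "MR-R3": t["MR"] - t["R3"],
    }


def control_deviations(hl):
    """hl_{m-j} = m - hl_j for each HL task j in the Control treatment,
    where m is the subject's average threshold over the four tasks."""
    m = sum(hl) / len(hl)
    return [m - hl_j for hl_j in hl]


# Hypothetical subjects, for illustration only.
print(main_differences({"IC": 7, "R1": 6, "R3": 6, "MR": 5}))
print(control_deviations([6, 5, 5, 6]))  # [-0.5, 0.5, 0.5, -0.5]
```

By construction the four Control deviations of a subject sum to zero, so they measure only how far each task's threshold lies from that subject's own average.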

We can run an OLS regression with the dependent variable \(t_{i-j}\) and the independent variable \(t_j\) and see whether we find any evidence for the shift of choices towards the modal risk preference, as predicted by the blame avoidance hypothesis. It should manifest itself as a negative \(t_{i-j}\) for risk averse subjects with high values of \(t_{j}\) and a positive \(t_{i-j}\) for risk loving subjects with low \(t_{j}\). Thus, according to this logic, a positive intercept and a sufficiently negative coefficient on \(t_{j}\) should provide the evidence for the predicted movement. However, this is incorrect since such estimation does not take into account the effect of regression to the mean. Risk loving subjects with very low \(t_{j}\) will tend to have positive \(t_{i-j}\) just because the probability of their choosing \(t_{i}\) closer to the mean is much higher than the probability of choosing a more extreme threshold. The same is true for very risk averse subjects: very high \(t_{j}\) will be likely associated with negative \(t_{i-j}\). Both forces—our hypothesized treatment effect and the regression to the mean—are, thus, operating in the same direction. Therefore, to detect the influence of responsibility we need to show that condition effects are significantly larger than the regression to the mean.
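The regression-to-the-mean problem can be made concrete with a small simulation in which there is no condition effect at all. Everything in the sketch is an assumption made for illustration: each subject has a stable preference observed twice with independent noise, thresholds are treated as continuous, and the means and standard deviations are arbitrary.

```python
import random


def ols_fit(x, y):
    """Intercept and slope of a one-regressor least-squares fit."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def simulate(n_subjects=5000, sd_pref=1.5, sd_noise=1.0, seed=0):
    """Regress t_i - t_j on t_j when both thresholds are noisy readings
    of the same underlying preference (no treatment effect)."""
    rng = random.Random(seed)
    t_j, diff = [], []
    for _ in range(n_subjects):
        pref = rng.gauss(5.5, sd_pref)             # stable underlying preference
        first = pref + rng.gauss(0.0, sd_noise)    # threshold in condition j
        second = pref + rng.gauss(0.0, sd_noise)   # threshold in condition i
        t_j.append(first)
        diff.append(second - first)
    return ols_fit(t_j, diff)


intercept, slope = simulate()
print(intercept, slope)  # positive intercept, negative slope
```

In this setup the population slope equals minus the noise variance divided by the total variance of the threshold, about -0.31 for the values assumed here, even though behavior does not differ between the two conditions. A positive intercept and a negative slope therefore cannot, on their own, be read as evidence of a shift towards modal preferences.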

We use the data from the Control treatment in order to estimate the strength of regression to the mean assuming that it is the same in both treatments. Thus, we pool the data from the Main and Control treatments (four observations per subject) and run a random effects GLS regression with errors clustered by subject. For the data in the Main treatment, the dependent variable is \(t_{s,i-j}\) for subject s and the difference in thresholds between Conditions i and j. We look at the threshold differences between comparable conditions. Thus, for the thresholds in Condition IC (independent variable \(t_{s,IC}\)) the associated dependent variable is \(t_{s,R1-IC}\); for the independent variable \(t_{s,R1}\) in Condition R1 it is \(t_{s,R3-R1}\); and for the independent variable \(t_{s,R3}\) in Condition R3 it is \(t_{s,MR-R3}\). For the data in the Control treatment the dependent variable is \(hl_{s,m-j}\) equal to the difference between the subject-wise average threshold and the threshold in HL task j as described above. The independent variables are: \(t_{s,j}\), the threshold of subject s in Condition j in the Main treatment and in HL task j in the Control treatment; three dummies (\(Cond_k\) for \(k \in \{IC, R1, R3\}\)) that are equal to 1 for Conditions IC, R1, and R3 of the Main treatment and zero otherwise (in particular, all three dummies are zero for the Control treatment data); and their interactions. Thus, the baseline of the regression is all the data from the Control treatment. The condition effects beyond that of the estimated regression to the mean are given by the dummies \(Cond_k\) and the interaction terms \(Cond_k\cdot t_{s,j}\). Thus, we estimate the following model:

$$\begin{aligned} t_{s,i-j} = \alpha + \sum _{k \in \{IC, R1, R3\}}\beta _{k} Cond_{k} + \gamma t_{s,j} + \sum _{k \in \{IC, R1, R3\}} \delta _{k} Cond_{k} \cdot t_{s,j} + \eta _{s} + \varepsilon _{s,j} \end{aligned}$$

where \(\eta _s\) is a subject-specific random effect. In this formulation coefficients \(\alpha\) and \(\gamma\) represent the effect of regression to the mean in the Control treatment. \(\alpha\) should be positive, which would indicate the tendency for the thresholds to increase when they are low or zero, and \(\gamma\) should be sufficiently negative, which would indicate the tendency for the thresholds to decrease when they are high. The coefficients \(\beta _k\) and \(\delta _k\) describe the changes between conditions (R1 and IC; R3 and R1; MR and R3). Thus, in order to establish a significant shift of choices between conditions beyond regression to the mean, both \(\beta _k\) and \(\delta _k\) should be significantly different from zero with \(\beta _k\) positive and \(\delta _k\) negative. This would show that the effect of a condition on differences in thresholds is significantly higher than that in the Control treatment.
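To make the structure of the pooled data concrete, the following sketch shows one way the dependent variable and the regressors could be assembled for a single subject. It is an illustration under stated assumptions, not the code used for the estimation: the function names, the dictionary layout, and the example thresholds are hypothetical, and the random effect and clustered errors are left to whatever estimation routine is applied afterwards.

```python
CONDITIONS = ("IC", "R1", "R3")
# Each condition j is compared with the next, comparable condition i.
NEXT_CONDITION = {"IC": "R1", "R1": "R3", "R3": "MR"}

def design_row(threshold, condition=None):
    """Regressors [1, Cond_IC, Cond_R1, Cond_R3, t, Cond_IC*t, Cond_R1*t, Cond_R3*t].

    condition=None marks a Control treatment observation (all dummies zero).
    """
    dummies = [1.0 if condition == k else 0.0 for k in CONDITIONS]
    return [1.0] + dummies + [threshold] + [d * threshold for d in dummies]

def main_rows(thresholds):
    """Rows (t_{i-j}, regressors) for one Main treatment subject.

    thresholds maps 'IC', 'R1', 'R3', 'MR' to that subject's threshold.
    """
    rows = []
    for j in CONDITIONS:
        i = NEXT_CONDITION[j]
        rows.append((thresholds[i] - thresholds[j], design_row(thresholds[j], j)))
    return rows

def control_rows(hl_thresholds):
    """Rows (hl_{m-j}, regressors) for one Control treatment subject."""
    mean_t = sum(hl_thresholds) / len(hl_thresholds)
    return [(mean_t - t, design_row(t)) for t in hl_thresholds]

# Hypothetical subjects, for illustration only.
for y, x in main_rows({"IC": 8, "R1": 7, "R3": 6, "MR": 6}):
    print(y, x)
for y, x in control_rows([4, 6, 5, 5]):
    print(y, x)
```

In this layout the Control rows identify \(\alpha\) and \(\gamma\), while the rows of condition k additionally load on \(\beta_k\) and \(\delta_k\), which is why those coefficients measure the shift beyond regression to the mean.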

Table 3 Random effects GLS regression of differences in thresholds between conditions in the Main and Control treatments

Table 3 shows the results. The negative and significant coefficient on \(t_{s,j}\) and the positive intercept represent the regression to the mean in the Control treatment. The coefficients on \(Cond_{IC}\) and the interaction term \(Cond_{IC} \cdot t_{s,j}\) are not significant. This implies that we cannot conclude that there is an effect of responsibility for one other person as compared to choosing for oneself only. However, the coefficients on \(Cond_{R1}\) and \(Cond_{R1} \cdot t_{s,j}\) are significant, which means that we do find an effect of responsibility for three others (Condition R3) as compared to one other (Condition R1). We also observe an effect of mutual responsibility as compared to individual responsibility: the coefficients on \(Cond_{R3}\) and \(Cond_{R3} \cdot t_{s,j}\) are significant. This provides evidence that the level of responsibility and mutual decision making affect choices as predicted by the blame avoidance hypothesis. As a robustness check, we run the same regressions separately for each pair of conditions in the Main treatment. Columns (1–3) in Table 7 in online Appendix A show that the results are unchanged.

Fig. 2 Fitted values of the regression in Table 3 in the Control treatment and in Conditions IC, R1, and R3 of the Main treatment. The red lines show the fitted values of the Control treatment and the blue lines the fitted values of the regression. Vertical dashed lines show where the fitted values cross zero (red line in the Control treatment and blue lines in the Main treatment). (Color figure online)

To better understand the effects of responsibility on choices we present the fitted values of the regression in Table 3 graphically. Figure 2a shows the fitted values of the data in the Control treatment. The red line represents the effect of regression to the mean (coefficients \(\alpha\) and \(\gamma\)). Figure 2b–d shows the fitted values of the regression between conditions (blue lines, coefficients \(\alpha + \beta _k\) and \(\gamma + \delta _k\), \(k \in \{IC, R1, R3\}\)). The red lines in these figures are the same as in Fig. 2a and are shown to compare the effect of responsibility with the effect of regression to the mean. Comparing the changes in thresholds between Conditions R1 and R3, when the level of responsibility increases from one to three others, we observe many risk averse subjects who show a risky shift (Fig. 2c, the data points with a threshold greater than 5). This goes against the evidence provided in many studies cited in Sect. 2, which find only cautious shifts. We suggest that this is due to our randomized design, which mitigates possible biases that can emerge in a within-subjects experiment and, thus, provides more reliable measures of shifts in behavior. The effect of responsibility is most pronounced when we compare mutual choices for three others (Condition MR) to individual choices for three others (Condition R3). Here, both cautious and risky shifts are clearly noticeable. We attribute this effect to the increase in the quality of information about the modal risk preferences in the mutual responsibility condition as compared to the individual choice conditions.

To support the blame avoidance hypothesis further, notice that the fitted values of the regression in Fig. 2b–d (blue lines) cross zero between thresholds 5 and 6, as indicated by the vertical dashed lines. This means that subjects with thresholds 5 and 6, which are the average and the modal preferences (see histograms in Fig. 6, online Appendix A), do not change their behavior between conditions, exactly as the blame avoidance hypothesis suggests. Notice, as well, that we can rule out the explanation related to the preferences for efficiency discussed in the Introduction, as in this case we would see the crossing at threshold 4 (risk neutrality).
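The location of the crossing follows directly from the estimated model. Setting the fitted difference for condition \(k\) to zero and solving for the threshold gives the expression below, where \(t^{*}_{k}\) and \(t^{*}_{C}\) are notation introduced here only for this illustration and the slopes are assumed to be nonzero:

$$\begin{aligned} (\alpha + \beta _k) + (\gamma + \delta _k)\, t^{*}_{k} = 0 \quad \Longleftrightarrow \quad t^{*}_{k} = -\frac{\alpha + \beta _k}{\gamma + \delta _k}, \qquad t^{*}_{C} = -\frac{\alpha }{\gamma } \end{aligned}$$

With a positive intercept and a negative slope the crossing point is positive. Subjects with \(t_{s,j} < t^{*}_{k}\) are predicted to raise their threshold and those with \(t_{s,j} > t^{*}_{k}\) to lower it, so the blame avoidance hypothesis corresponds to \(t^{*}_{k}\) lying at the modal preferences.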

Table 4 Random effects GLS regression of differences in thresholds between conditions in the Main and Control treatments

To show the strength of the effect of each condition we make analogous comparisons between other pairs. Table 4 shows the same regression as Table 3, but with the dependent variable for the Main treatment defined as \(t_{MR-IC}\), \(t_{MR-R1}\), and \(t_{MR-R3}\) for Conditions IC, R1, and R3 respectively. Thus, this regression describes the effect of the mutual responsibility condition (MR) on the threshold choices as compared to all other conditions. The regression shows a significant convergence to modal preferences, beyond regression to the mean, as is evident from the significance of all coefficients \(\beta _k\) and \(\delta _k\), \(k\in \{IC, R1, R3\}\). As before, the fitted values for Conditions IC, R1, and R3 cross zero between thresholds 5 and 6 (modal preferences), which implies that subjects with these thresholds do not change their choices, as predicted by the blame avoidance hypothesis.Footnote 15 The fact that we detect a significant change in choices in Condition MR as compared to all other conditions demonstrates that subjects are much more confident in their decision when they choose together than when they choose individually. This emphasizes the importance of better information about the preferences of others for choices under responsibility.

In order to provide additional evidence that it is indeed information about preferences that creates a strong effect in Condition MR, and not, say, the negotiators’ desire to simply agree on an intermediate threshold, we compare the choices of subjects who made decisions in Conditions IC and R1 before and after Condition MR. Intuitively, if a subject has been exposed to Condition MR, where she learns about the preferences of another negotiator, before Condition IC or R1, then she should be better at matching modal preferences when choosing individually afterwards, as compared to the situation in which Condition MR comes after IC and R1.

Table 5 OLS regressions of differences in thresholds between conditions in the Main and Control treatments

Table 5 shows regressions (like those in Table 7 in online Appendix A) of changes in thresholds between Conditions R1 and IC, and R3 and R1, separately for subjects who had Condition MR before and after Conditions IC and R1.Footnote 16 We see that subjects who experience Condition IC after MR do show a significant reaction to Condition R1 as compared to IC: the coefficients on \(Cond_{IC}\) and \(Cond_{IC}\cdot t_{s,j}\) in the “After” regression are significant at the 10% level (p values are 0.066 and 0.065). This is in contrast to subjects who choose in Condition IC before MR (the same coefficients in the “Before” regression are insignificant and much closer to zero). Similarly, for the comparison of R3 and R1, the “Before” regression shows a less significant and smaller reaction than the “After” regression (coefficients on \(Cond_{R1}\) and \(Cond_{R1}\cdot t_{s,j}\)). This pattern is consistent with the idea that subjects learn new information about preferences during Condition MR and then use it to better match the modal preferences. Interestingly, the fitted values for the two “After” regressions cross zero between thresholds 5 and 6, exactly where modal preferences are (bottom row of Table 5), while those for the two “Before” regressions do not.Footnote 17 This suggests that subjects become much better at matching modal preferences after experiencing Condition MR.

The last comparison left unchecked is the change in thresholds between Conditions IC and R3. When we run a regression similar to the one in Table 4 but with the dependent variable defined as \(t_{R3-IC}\) and \(t_{R3 - R1}\), we do not find a significant difference between Conditions IC and R3 (see Table 8 in online Appendix A and column (6) in Table 7 for a separate OLS regression). As we discuss in Sect. 5 below, the reason for this might be the imprecision of individual information about the risk preferences in the population.

4.3 Choice in mutual responsibility condition

In this section we analyze the influence that the choices of the two subjects in Condition R3 have on the number of safe choices they end up agreeing on in Condition MR. The question is, given the number of safe choices of the decision makers in Condition R3, which number of safe choices do they end up choosing in Condition MR? We hypothesize that the negotiation process should lead to a consensus so that each pair’s choice lies in between the individual choices in Condition R3. Indeed, 81% of the pairs choose the threshold in Condition MR in between their individual choices in Condition R3. Notice that this is true even though half the pairs face Condition MR before Condition R3.

We analyze the data further to support this finding. We consider three variables: \(t_{MR}\), the agreed threshold of each pair; and \(Lowt_{R3}\) and \(Hight_{R3}\), the lowest and the highest choices in Condition R3 in the pair. For example, if in Condition R3 one partner in a pair made 4 safe choices and the other, say, 7, then \(Lowt_{R3}\) is equal to 4 and \(Hight_{R3}\) is equal to 7. We would like to know how the choice of the pair depends on the partners’ choices when they choose alone.

Table 6 Column 1: Means of the choices in Condition MR and the pair of negotiators’ highest and lowest number of safe choices in Condition R3. Column 2: OLS regression of the choice under mutual responsibility (\(t_{MR}\)) as dependent on the pair’s highest and lowest number of safe choices in Condition R3

Column 1 in Table 6 reports the means of these variables. The average choice in Condition MR lies in between the negotiators’ choices in Condition R3. Column 2 of Table 6 reports the regression of \(t_{MR}\) on \(Lowt_{R3}\) and \(Hight_{R3}\). Notice that the coefficients add up to 0.933, which is very close to 1, suggesting that the choice in Condition MR is a convex combination of the two choices in Condition R3. This implies that the negotiators in the pair in most cases (81% of the data) agree on a number of safe choices that lies in between their number of safe choices in Condition R3. Moreover, this choice is influenced more by the cautious negotiator since the coefficient on \(Hight_{R3}\) is around 0.6 (60% weight) and the significance of this coefficient is much higher than that of \(Lowt_{R3}\) (\(p < 0.001\) for \(Hight_{R3}\) vs. \(p = 0.030\) for \(Lowt_{R3}\)). This suggests that it is “simpler” for the risk averse subjects to talk risk loving subjects into choosing closer to their risk preference than the other way round. Overall, the negotiators choose in a way consistent with the blame avoidance hypothesis. If we assume that the choices in Condition R3 represent each negotiator’s guess (given individual beliefs) about the modal preferences in the population, then the choice in Condition MR, which is a convex combination of the two choices in Condition R3, is a better guess that takes into account the information from both subjects.
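The "in between" statistic used in this section can be computed with a few lines of code. The sketch below assumes that a pair's choice counts as lying in between when it is weakly between the two individual choices (endpoints included), which is an assumption about the definition; the function names and the toy pairs are hypothetical and are not data from the experiment.

```python
def lies_between(t_mr, t_r3_a, t_r3_b):
    """True if the pair's MR threshold is weakly between the two R3 thresholds."""
    low_t_r3 = min(t_r3_a, t_r3_b)    # Lowt_R3
    high_t_r3 = max(t_r3_a, t_r3_b)   # Hight_R3
    return low_t_r3 <= t_mr <= high_t_r3

def share_between(pairs):
    """Fraction of pairs (t_mr, t_r3_a, t_r3_b) whose MR choice lies in between."""
    hits = sum(lies_between(*pair) for pair in pairs)
    return hits / len(pairs)

# Hypothetical pairs: (agreed MR threshold, partner A in R3, partner B in R3).
toy_pairs = [(5, 4, 7), (6, 6, 6), (8, 4, 7), (5, 3, 5)]
print(share_between(toy_pairs))  # 0.75
```

With a strict definition (endpoints excluded) the share would be lower whenever pairs settle on one partner's individual choice, so the convention matters when comparing this statistic across studies.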

5 Discussion

Our results can be summarized as follows. We find a weak effect of responsibility for one other person on the choices in the Holt and Laury task, as compared with the individual choice, once the regression to the mean is controlled for. We do find strong support for the hypothesis that the number of others affected by the choice matters: when it increases from one to three, risk averse subjects choose significantly riskier options and risk loving subjects choose more cautiously. Importantly, we find a strong effect of mutual responsibility. When two subjects communicate in order to jointly choose for themselves and three others, they demonstrate strong cautious and risky shifts towards the population modal risk preferences as compared to the three conditions where the choice is made individually. All these findings are consistent with the blame avoidance hypothesis, which states that people try to minimize intentional blame (blame for taking a wrong action) when choosing for others. In other words, decisions under responsibility gravitate to the individual preferences of the majority.

The strongest effect of blame avoidance that we find involves mutual responsibility, when two subjects communicate with each other in order to jointly choose for themselves and three others. Two subjects together possess more refined information about the modal preferences in the population than each subject individually. Therefore, according to our hypothesis, if the subjects’ purpose is to match these preferences, their mutual choice should be more consistent with the modal preferences than the individual choices. Moreover, the information obtained in the mutual responsibility condition should help subjects choose individually in subsequent conditions. This is exactly what we find. The strong significance of the differences in choices between the mutual responsibility condition and the three individual responsibility conditions (Table 4) suggests that subjects in our experiment have imprecise information about the distribution of risk preferences in the population, and that the quality of this information improves considerably when two subjects choose together. This conclusion is supported by a weak relationship between the individual risk preferences and the answer to the question “How risk averse/loving are you in comparison with other people?” in the Control treatment data (Fig. 1). Though we do find that extremely risk averse (loving) subjects judge their preferences to be above (below) average, the magnitude of this effect is relatively small. This means that very risk averse/loving subjects realize that modal preferences are less extreme than theirs, but, at the same time, underestimate how far away their preferences are from the majority. Therefore, when two subjects learn the preferences of each other in the mutual responsibility condition, they get a better idea about the modal preferences, which makes their choices more consistent with them.

The finding that subjects underestimate how extreme their preferences are in comparison with the majority can also explain why we do not find a strong effect of responsibility for one other person. Subjects who were not exposed to the preferences of others think that the preferences of the majority are not that different from their own, so, when responsible for one other, they might still move towards the modal preferences (the coefficients \(\beta _{IC}\) and \(\delta _{IC}\) in Table 3 have the correct signs), but this movement is too small to be statistically distinguished from the regression to the mean. Conversely, subjects with extreme preferences who interacted with another person in the mutual responsibility condition do change their choices significantly (albeit only at the 10% level) when responsible for one other, since they realize how extreme their preferences actually are.

Our results demonstrate that people do have a tendency to act responsibly. However, when estimating what others might prefer or like, they seem to use their own preferences as a benchmark, which can lead to a biased choice when few others are affected.

6 Conclusion

The goal of our experiment was to investigate risky decisions under responsibility and to reconcile the mixed results reported in previous studies. We showed that the blame avoidance hypothesis can organize the behavior rather well. Notably, our results reveal the importance of mutual decision making for the quality of choices that affect others. Two individuals who share private information are able to make better decisions for others than when they choose alone. This finding might have important implications for managerial practices and other real-life situations where others are affected by the decisions of individuals or small groups of people. In the future we plan to conduct a more detailed exploration of the connection between responsibility for others and the information about preferences available to the decision makers.