1 Introduction

Expenditures on public goods are often the result of a bargaining process between legislative parties that differ in terms of their bargaining power and in the utility they derive from public good expenditures. In addition, there is often a dynamic element to such negotiations, in that a considerable fraction of public expenditures is mandated by law and is not subject to discretionary renegotiation from one period to the next. For example, in the U.S., mandatory expenditures (e.g., on Social Security, Medicare, Medicaid, Veteran’s Benefits and other sources of “Income Security”, the latter including Disability Assistance, Food and Nutrition Assistance, Supplemental Security Income, Earned Income Tax Credits, Child Tax Credits, Unemployment Insurance, Student Loans, and Deposit Insurance) accounted for over two thirds of total federal government spending in FY 2020, while discretionary spending that is subject to renegotiation each year (e.g., on military and non-defense cabinet offices) accounted for about one quarter of FY 2020 spending; see Fig. 1. (The remaining amount is interest on government debt, which is not included in the figure.)

Fig. 1 Breakdown of U.S. Federal Government Spending in 2020. Source: Congressional Budget Office Data

According to the Congressional Budget Office (Footnote 1), U.S. federal mandatory spending has steadily increased over time from an average of 12.7 percent of GDP over the years 2010-2019 to more than 20 percent of GDP during the pandemic years of 2020-21 and is projected to be 15.2 percent of GDP by 2030. By contrast, discretionary outlays have steadily declined from 7 percent of GDP in 2013 to 6.4 percent in 2020 and are projected to be 5.6 percent of GDP in 2030.

In this paper we explore the process by which two parties bargain over public good expenditures under two distinct budgeting rules. Specifically, we experimentally test a model of this bargaining process due to Bowen et al. (2014), henceforth “BCE”. Under a purely discretionary bargaining rule, BCE assume that the status quo allocation to the public good in the event of a bargaining disagreement is always zero. However, under mandatory bargaining rules, the status quo default public good expenditure in the event that there is no bargaining agreement is assumed to be equal to the level of public good expenditure that was last agreed upon by the two parties. BCE show that under this mandatory bargaining rule, allocations to the public good are higher and, under certain conditions, can Pareto dominate allocations under the discretionary rule. BCE thus provide a simple dynamic mechanism that enables efficient provision of public goods to be attained and rationalizes the steady growth of mandatory spending in the historical U.S. federal budget. Our aim in this paper is to test the predictions of the BCE model in a laboratory experiment with paid human subjects. We implement a version of their model in the laboratory and we find strong, though imperfect, support for the model predictions in our experimental data.

Our paper is most closely related to the experimental literature on coalitional and legislative bargaining; see, e.g. Palfrey (2015) and Baranski and Morton (2022) for surveys. John Kagel has made many pioneering contributions to this literature including Fréchette et al. (2003, 2005a, b, c, 2012) and Baranski and Kagel (2015). Of these papers, the one that is most closely related to this paper is Fréchette et al. (2012). They consider a version of the Baron and Ferejohn (1989) model of majoritarian coalitional bargaining where 5 players must make and vote on allocations to both private (particularistic) and public goods as in our study. The difference between their paper and ours is that we consider dynamic budgeting rules that depend on the status quo level of previously agreed upon public good expenditures and we have only 2 players who make or agree to proposed allocations, so that our decision rule amounts to unanimity. Further, in our setting, following BCE, the public good yields a nonlinear payoff (implying an interior optimum for the public good amount) that varies with the player type—the high (low) type gets a higher (lower) utility from the public good.

Our paper is also related to the recent literature on dynamic bargaining experiments, possibly with an endogenous status quo, in legislative and/or multilateral settings (e.g., Battaglini et al. (2012) and Battaglini and Palfrey (2012)). Those papers allow for an endogenous status quo, but focus either on purely distributive politics without a public good element or on the provision of durable public goods under different voting rules (majority/unanimity).

The main question we address in this paper is whether dynamic, mandatory budgeting rules matter for the achievement of the efficient level of the public good relative to discretionary budgeting rules when the public good allocation is the result of a dynamic bargaining process by two parties with different interests.

Our experimental data clearly show that Pareto improvements in public good allocations are possible under dynamic, mandatory budget rules, as opposed to discretionary rules. These improvements result from private negotiations between interested parties and occur in the absence of transaction costs. In this sense, the dynamic public good bargaining game provides a mechanism to obtain efficient public good provision in line with the Coase Theorem. To preview our results, we find that in the discretionary treatment, participants tend to allocate more to the public good than the static equilibrium predicts. In the mandatory treatments, participants allocate even more to the public good and come very close to achieving the Pareto efficient level of public good provision. They fall just short of that level, however, and we attribute this shortfall to fairness concerns. In order to convince responders to accept proposals, proposers are not able to exercise full proposer power. Instead, proposers must award responders some private points, despite the equilibrium prediction that proposers allocate zero private points to responders in most cases unless the responders have a strong outside option. These private points awarded to responders reduce both the public good allocation and the proposer’s own private allocations in the mandatory treatments, and that is why the Pareto optimum is not quite achieved.

Our results provide some support for the key insight of BCE that the endogenous status quo level of public good provision works as an outside option for responders in bargaining under the mandatory budget rules. Proposers have an incentive to maintain or increase allocations to the public good over the current status quo as insurance against the possibility that they lose their status as a proposer in the future (the roles of proposer and responder change with a fixed probability in each round of our dynamic bargaining supergames). This is why public good provision can grow close to the Pareto efficient level in the steady states of dynamic bargaining games under the mandatory budget rules.

2 Model and experimental design

The model we implement in the laboratory was originally proposed as a dynamic game of public good bargaining by Bowen et al. (2014) (BCE). It involves bargaining between two parties about the allocation of an endowment across both public and private accounts under alternative budget rules and over an indefinite horizon.

Specifically, two parties repeatedly bargain with one another in an indefinite sequence of rounds over how to allocate a fixed endowment—in our experiment 100 points in each round—across a group account (public good) and two private accounts, one for each of the two parties. The points assigned to the group account contribute to the earnings of both members of the pair, while the points assigned to each of the two private accounts only accrue to the earnings of the individual parties who own those private accounts.

At the start of each new sequence (supergame) of rounds, the two members of each party are randomly paired and are equally likely to be chosen to be the proposer (the other player is the responder). Following the first round of the sequence, if the game continues, the current proposer continues to be the proposer in the next round with probability p, and with probability \(1-p\), the proposer and responder switch roles. We chose to set \(p=.60\) throughout all treatments of the experiment so that there is some persistence to players remaining in the same proposer/responder roles from one round to the next of each sequence, but also allowing for political change (i.e., changes in the majority party which monopolizes the proposer power) to occur, here with probability \(1-p=.40\).

The players in each pair are also randomly assigned to be either a high or a low type player, which refers to how they value the public good (as explained below). Each pair has one high and one low type player and this designation does not change over the course of the supergame.

In each pair, the proposer chooses an allocation of the 100 points (endowed anew each round) to the “group” account, the “private” account of the proposer and the private account of the responder in each round. The allocations across all three accounts must sum to exactly 100 points. If the responder (matched with the proposer) accepts this proposal, the round payoffs for the proposer and responder in the pair are realized according to the agreed upon proposal. A player with X points in their own private account and Y points in the group account would have round earnings calculated as follows:

$$\begin{aligned} \text {Player points earned} = X + \theta _i\ln {Y},\,\,\,\,i\in \{L,H\}. \end{aligned}$$

Here, if the player is a “low” type, then \(\theta _L =25\) in all treatments. If the player is a “high” type, then \(\theta _H =40\) or \(\theta _H =55\) depending on the treatment conditions. ln refers to the natural logarithm.
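As an illustration, the round earnings formula can be sketched in Python as follows. The function name and the example allocation are hypothetical choices made for this sketch and are not taken from the experimental software; the parameter values are those stated in the text.

```python
import math

THETA_LOW = 25  # low type's public good weight in all treatments


def round_points(private_points, group_points, theta):
    """Round earnings X + theta * ln(Y) for one player."""
    if group_points < 1:
        # The design requires at least 1 point in the group account,
        # which keeps the logarithmic term non-negative.
        raise ValueError("group account must hold at least 1 point")
    return private_points + theta * math.log(group_points)


# Hypothetical accepted proposal (illustration only): 40 points to the
# group account, 60 to a high-type proposer, 0 to a low-type responder.
high_proposer = round_points(60, 40, theta=40)
low_responder = round_points(0, 40, theta=THETA_LOW)
print(round(high_proposer, 2), round(low_responder, 2))
```

Note that with Y = 1 the logarithmic term vanishes, so a player with no private points earns zero in that round.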

Once a sequence ended, subjects were randomly rematched into new pairs and their types (high or low) and initial assignments as either the proposer or responder were newly and randomly assigned at the start of the next sequence. In this way, subjects in each experimental session played multiple indefinitely repeated games (supergames) of public and private good bargaining as either high or low types (in terms of their valuation for the public good) and also traded off proposer and responder roles according to the Markovian switching probability p as described above.

Our experiment consists of two treatment variables, (1) the budget rule, which is either “discretionary” or “mandatory” as discussed in further detail below and (2) for the mandatory treatments only, the degree of political polarization which is measured by the difference, \(\theta _H-\theta _L\). As noted earlier, we always have \(\theta _L=25\). In the baseline “aligned” treatment the high type has \(\theta _H =40\) and in the “polarized” treatment, the high type has \(\theta _H =55\) (accordingly \(\theta _H-\theta _L\) is larger in the polarized treatment). Note that the discretionary treatment uses the same \(\theta\) values as the baseline aligned treatment, \((\theta _H, \theta _L) = (40, 25)\).

At the start of each new sequence, the default number of points, Y, in the group account is set to one in both the mandatory and discretionary treatments. Thus, under the logarithmic specification for the public good component for the stage utility function, there will be a zero public good payoff from this default level. The default number of points in the two private accounts (which enter utility linearly) are both 0.

Following BCE, we distinguish alternative budget rules from one another according to whether the public good levels that are agreed upon in previous rounds of a given sequence are persistent or not. If there is disagreement under the discretionary budget rules, then the points allocated to the group account are reset to one and both private accounts are reset to zero. In other words, there is no persistence to public good levels in the discretionary treatment.

By contrast, under the mandatory budget rules, if a proposal is accepted, the accepted amount in the group account becomes the new status quo default public good amount for future rounds of that same sequence. In the event that there is disagreement about future proposals within that same sequence (supergame), then, under the mandatory rule, both the proposer’s and the responder’s private accounts default to zero points but the group account defaults to the status quo level (the most recently agreed-to public good allocation in the sequence), so that both players may still receive some positive utility benefit from a disagreement outcome.
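The two disagreement rules can be summarized in a short sketch. The function name, the rule labels, and the example status quo of 50 points are hypothetical; only the default values (1 point in the group account, 0 in each private account) and the carry-over logic come from the description above.

```python
import math


def disagreement_outcome(rule, last_agreed_group_points=None):
    """Return (group, proposer_private, responder_private) after a rejection.

    rule: "discretionary" or "mandatory" (labels chosen for this sketch).
    last_agreed_group_points: the most recently accepted group allocation
    in the current sequence, or None if nothing has been accepted yet.
    """
    if rule not in ("discretionary", "mandatory"):
        raise ValueError("unknown budget rule: " + str(rule))
    if rule == "mandatory" and last_agreed_group_points is not None:
        group = last_agreed_group_points  # status quo persists
    else:
        group = 1  # default level at the start of a sequence
    return group, 0, 0


# Hypothetical status quo of 50 points: the low type's disagreement payoff
# (theta_L = 25) under each rule.
for rule in ("discretionary", "mandatory"):
    group, _, _ = disagreement_outcome(rule, last_agreed_group_points=50)
    print(rule, group, round(25 * math.log(group), 2))
```

Under the discretionary rule the disagreement payoff is always zero, whereas under the mandatory rule it grows with the status quo, which is the source of the responder's outside option discussed in the text.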

Thus, we implement three treatments. In the discretionary, aligned (D) treatment (just called the “discretionary treatment” hereafter), \((\theta _{H},\theta _{L})=(40, 25)\) and the discretionary budgeting rule is in place. In the mandatory-aligned treatment (Ma) \((\theta _{H},\theta _{L})=(40, 25)\) and in the mandatory-polarized treatment (Mp) \((\theta _{H},\theta _{L})=(55,25)\), and in these two treatments, the mandatory budgeting rule is in place (Footnote 2). These parameter choices imply that the Pareto efficient level for the public good allocation is \(\theta _{H}+\theta _{L}=65\) in the aligned treatment and 80 in the polarized treatment (Footnote 3). We required the proposer to allocate at least 1 point to the group account to prevent outcomes with negative payoffs (\(\ln {Y}\) for \(0\le Y<1\) can result in a large negative number). Note that one difference between this setting and standard public good or voluntary contribution games is that the utility from the public good allocation is nonlinear (logarithmic), which enables unique interior solutions; utility from private point allocations is linear, as is more typical in those games (Footnote 4). Finally, our design is between-subjects; each session consisted of 10 subjects who participated in multiple supergames or “sequences”, all conducted under the same treatment conditions (i.e., the budget rule, discretionary or mandatory, and the values of the public good weighting parameters \((\theta _{H},\theta _{L})\) are held fixed in every session of the treatments).
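One way to see where the Pareto efficient level comes from is the following short derivation. It is a sketch that assumes efficiency is judged by the sum of the two players' stage payoffs; because private points enter utility linearly, only the total number of private points, \(100-Y\), matters for that sum.

$$\begin{aligned} \max _{1\le Y\le 100}\; (100-Y) + (\theta _H+\theta _L)\ln {Y} \quad \Rightarrow \quad -1+\frac{\theta _H+\theta _L}{Y}=0 \quad \Rightarrow \quad Y=\theta _H+\theta _L. \end{aligned}$$

Since the objective is strictly concave in Y, this first order condition identifies the unique maximizer, which is 65 for \((\theta _H,\theta _L)=(40,25)\) and 80 for \((\theta _H,\theta _L)=(55,25)\).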

At the end of each round of a sequence there is a one-fifth chance that the current sequence does not continue on with another round. We thus implement bargaining over an indefinite horizon with a discount factor of \(\delta =.80\) using the method of random termination. After learning whether the most recent proposal was accepted or not, subjects were shown a randomly drawn integer from 1-5 inclusive at the end of each round. They were instructed that if a 5 was drawn then the sequence would end; otherwise the sequence would continue with another round and in that case, the status quo level for the public good in the mandatory treatments would carry forward as well.
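The role-switching and random termination rules described in this section can be sketched as a small simulation. This is an illustration only: the experiment used random draws prepared in advance, whereas the sketch draws them on the fly, and the player labels "A" and "B" and the function name are hypothetical.

```python
import random

P_STAY = 0.60  # probability that the current proposer keeps the role
DELTA = 0.80   # continuation probability (discount factor)


def simulate_sequence(rng):
    """Return the proposer label ("A" or "B") for each round of one sequence."""
    proposer = rng.choice(["A", "B"])  # equally likely in the first round
    roles = [proposer]
    while True:
        if rng.randint(1, 5) == 5:  # a draw of 5 ends the sequence
            return roles
        if rng.random() >= P_STAY:  # roles switch with probability 0.40
            proposer = "B" if proposer == "A" else "A"
        roles.append(proposer)


rng = random.Random(0)
lengths = [len(simulate_sequence(rng)) for _ in range(100_000)]
mean_length = sum(lengths) / len(lengths)
print(mean_length)  # should be close to 1 / (1 - DELTA) = 5 rounds
```

With a one-fifth termination chance after every round, sequence lengths are geometrically distributed, so the expected number of rounds per sequence is 5.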

We drew the random numbers in advance and we used several different sequences of random number draws across both the discretionary and mandatory treatment sessions. This design ensures that the lengths of sequences are the same between mandatory and discretionary sessions so that we can more readily compare the dynamic data between the different treatments. The realized numbers of rounds for our sessions are shown in Table 1.

Table 1 Number of realized rounds

For instance, for the first two sessions, 1-2 of treatments D, Ma, and Mp, we had 7 sequences (supergames) lasting various numbers of rounds that summed to 39 rounds in total.

At the end of a session, we randomly chose two sequences from all sequences played in a session and we paid subjects according to the points they earned in the final rounds of the two chosen sequences (Footnote 5). The points subjects earned in those two final rounds were converted into money at the fixed and known rate of 15 points = US$1, and the resulting amounts were paid together with a $7 show-up fee.

The experiment was computerized and programmed using oTree (Chen et al., 2016). On the relevant decision screens, we reminded subjects of the history of all group (public good) and private points in the previous rounds as well as the status quo public good levels to aid them in making decisions. They also had access to online calculators.

All sessions were conducted in the Experimental Social Science Laboratory (ESSL) at UC Irvine. Prior to making any decisions on the networked computer workstations of the laboratory, subjects were given written instructions which were also read aloud. See Appendix A for a copy of these instructions for the aligned treatment (both the Discretionary and Mandatory versions). After the instructions were read, subjects completed a quiz (which can be found at the end of the instructions given in Appendix A). Subjects’ quiz answers were reviewed by the experimenter; if a subject got a quiz question wrong, the experimenter went over the correct answer with the subject before the experiment began.

Subjects were undergraduate students at UC Irvine pursuing a variety of different major programs of study. They were recruited using the Sona systems software. Each subject participated in just one session. Total average earnings (including the show-up payment) were $24.03 for a two-hour experimental session.

3 Equilibrium and hypotheses

In the dynamic bargaining game described in Sect. 2, subjects should maximize their discounted payoffs (discount factor \(\delta =.80\)) over an indefinite sequence of allocation decisions with the induced stage utility given by \(u_t = X_t + \theta _i\ln Y_t\). Assuming they do so, Fig. 2 shows the resulting Markov perfect equilibrium public good allocations. These equilibrium allocations are plotted as a function of the status quo level for the public good (on the horizontal axis) for the two mandatory treatments (aligned, Ma and polarized, Mp) for the parameter (\(\theta\)) values that we used in the experiment. Figure 2 shows the predicted public good allocations, Y, proposed by both high and low type proposers. For most status quo levels, the proposer should allocate the remainder of the endowment to his own private account, giving zero to the responder, thereby exercising full proposer power. Figure B.1 in the Appendix presents predicted private point allocations X for both proposers and responders in the two mandatory treatments.

These are, of course, the rational actor model predictions under standard, money maximizing preferences as specified above. In Appendix C, we show that a modified version of the discretionary model with other-regarding preferences results in a slight increase in public good allocations and less than full exercise of proposer power.

In the equilibrium for the discretionary (aligned) treatment (which is not shown in any figure), each type of player proposes his static equilibrium level for the public good, \(Y^{*}=\theta _i\). The logic here directly follows from the first order condition from the static, one-shot maximization problem which yields \(-1 + \frac{\theta _i}{Y}=0\), or \(Y=\theta _i\). Thus, in the discretionary aligned treatment \(Y^{*}=40\) or 25 depending on whether the proposer is a high or low type (the interior optimum for the stage utility with the assumption of full proposer power \(X_{t,proposer}=100-Y_t\)), and all remaining points go to the proposer’s own private account (See Proposition 1 of Bowen et al. (2014)).
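The first order condition above can be checked numerically with a brute-force search over integer allocations. This is a minimal sketch under the stated assumption of full proposer power (the proposer keeps all 100 - Y remaining points); the function name is hypothetical.

```python
import math


def static_best_public_good(theta, endowment=100):
    """Integer Y in [1, endowment] maximizing (endowment - Y) + theta * ln(Y)."""
    return max(
        range(1, endowment + 1),
        key=lambda y: (endowment - y) + theta * math.log(y),
    )


# High type (theta = 40) and low type (theta = 25) in the aligned treatment.
print(static_best_public_good(40), static_best_public_good(25))  # 40 25
```

Because the stage utility is strictly concave in Y and the continuous optimum is an integer, the grid search recovers \(Y^{*}=\theta _i\) exactly.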

Fig. 2 Markov perfect equilibrium predictions for public good allocations in the two mandatory treatments, as a function of the status quo level. Left panel: mandatory-aligned (Ma); right panel: mandatory-polarized (Mp)

Under the mandatory rule, the two parties have an incentive to maintain the level of public good provision at least as high as the status quo level (the so-called status quo effect), as illustrated in Fig. 2. Maintenance of the current public good allocations provides proposers with some insurance against the future possibility of losing proposer power (which happens with probability \(1-p\) in the next round if one was a proposer in the current round). Once in the role of a responder, a higher status quo level for public goods reinforces the player’s bargaining power, and anticipating this, proposers operating under the mandatory rule have an incentive to push up or maintain public good allocations, relative to those under the discretionary rule.

This dynamic incentive results in the eventual growth of public good amounts up to the Pareto efficient level (\(\theta _H+\theta _L\)) under the mandatory regimes. The achievement of the Pareto efficient level of public good provision stands in contrast to the perpetual oscillation between each type’s static equilibrium levels of public good provision (\(\theta _H\) and \(\theta _L\)) that is predicted to occur under the discretionary budget rule.

To clarify these differences, Bowen et al. (2014) introduce the notion of the “dynamic optimum” (\(Y_{DO}^{*}\)), which is roughly the public good allocation that maximizes the dynamic payoff of each proposer type (high or low) under full proposer power:

$$\begin{aligned}Y_{DO}^{*}(high)=\frac{1+\delta -2\delta p}{1-\delta p}\theta _H, \quad Y_{DO}^{*}(low)=\theta _L\end{aligned}$$

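where \(\delta\) is the discount factor (the probability that a sequence continues to the next round under random termination in our experiment) and p is the Markov probability that the current proposer retains the proposer role from one round to the next, so that roles switch with probability \(1-p\). It is the case that \(Y^{*}_{DO}(high)>\theta _H\) whenever \(\delta >0\) and \(p<1\), since the ratio multiplying \(\theta _H\) exceeds one exactly when \(\delta (1-p)>0\) (Footnote 6).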

Basically, if the status quo level of the public good is below the (type-specific) dynamic optimum, each type has an incentive to raise public goods to their own dynamic optimum level immediately; if the status quo level is above the dynamic optimum but below the Pareto efficient level, then each type will maintain the current status quo; finally, for a status quo level above Pareto efficiency, both types propose the efficient level in equilibrium (see Fig. 2). These patterns for equilibrium public good offers largely hold without exception in the mandatory-aligned (Ma) case. However, the high type’s equilibrium public good proposals may overshoot the Pareto efficient level when the status quo is lower (below half of the endowment) or is above the efficient level in the mandatory-polarized (Mp) case (and there is a small region of irregularity in the low type’s proposals for status quo levels close to the full endowment of 100). With the initial status quo being 1 point in the group account at the start of each sequence (which is the case for all treatments of our experiment), the equilibrium dynamics predict that the public good allocations in the steady states are equal to the high type’s dynamic optimum, \(Y_{SS}^{*}=Y_{DO}^{*}(high)\approx 64.615\), in the Ma treatment and to the Pareto efficient level, \(Y_{SS}^{*}=\theta _H+\theta _L=80\), in the Mp treatment, for our parameter choices. The Markov perfect equilibrium is formally characterized in Proposition 3 (Ma or low-polarization) and Proposition 4 (Mp or high-polarization), and equilibrium steady states are characterized in Proposition 5 of Bowen et al. (2014).
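The dynamic optimum formula can be evaluated directly for the experimental parameters. The sketch below takes p to be the probability that the proposer keeps the role (0.60) and \(\delta =.80\), as in Sect. 2; the function name is hypothetical, and the comparison with the Pareto efficient level is only a numerical observation, not a restatement of the propositions in Bowen et al. (2014).

```python
def dynamic_optimum_high(theta_high, delta=0.80, p=0.60):
    """High type's dynamic optimum: (1 + d - 2dp) / (1 - dp) * theta_H."""
    return (1 + delta - 2 * delta * p) / (1 - delta * p) * theta_high


for label, theta_high, theta_low in [("Ma", 40, 25), ("Mp", 55, 25)]:
    y_do = dynamic_optimum_high(theta_high)
    pareto = theta_high + theta_low
    print(label, round(y_do, 3), pareto, y_do < pareto)
```

For Ma the formula gives approximately 64.615, just below the Pareto efficient level of 65. For Mp it gives approximately 88.846, which exceeds the Pareto efficient level of 80; this is consistent with the predicted steady state in that treatment being the efficient level rather than the high type's dynamic optimum.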

Thus, our experiment is designed to test the status quo effect of mandatory budget rules that institutionalize a relationship between current decisions and future payoffs, which theoretically leads to efficiency gains. In particular, we propose to test hypotheses that are informed by the equilibrium theory. Since in equilibrium all proposals are accepted, our data analysis will mainly focus on accepted proposals, though we will also examine factors affecting the acceptance of proposals (Footnote 7).

Given our design and research questions, we have the following testable hypotheses:

Hypothesis 1

For fixed \(\theta _H\) and \(\theta _L\), public good provision is higher under mandatory budget rules than under discretionary budget rules.

Specifically, under our parameterization, starting from the status quo level of \(Y=1\), public good provision is predicted to grow close to or to achieve the Pareto efficient public good levels under the mandatory budget rules but will remain below this level under the discretionary budget rules.

Hypothesis 2

An increase in the efficient public good provision amount results in an increased steady state allocation to the public good under the mandatory rules.

Under our parameterization, an increase in political polarization (\(\theta _H\) increasing from 40 to 55) results in a higher level of efficient public good provision. It follows that, as we move from Ma (\(\theta _H=40\)) to Mp (\(\theta _H=55\)) we should observe greater allocations to the public good by both proposer types eventually. Specifically, even the proposer type whose importance parameter for public good utility doesn’t change (the low type in our experiment) has an incentive to increase their public good allocations according to a change in the importance parameter of their opponent type (if the change in the latter parameter results in a change in the Pareto efficient public good amount).

Hypothesis 3

In all treatments, proposers exercise proposer power by generally proposing 0 private points to responders and keeping all points in excess of the public good allocation for their own private accounts.

The testable private point prediction of the model is summarized in the above Hypothesis 3. The theory predicts that in all settings, proposers exercise full proposer power, which means that when the status quo public good allocation is below the Pareto efficient level, any points not allocated to the public good are primarily, if not exclusively, allocated to the proposer’s own private account and not to the responder’s private account. Figure B.1 in the Appendix shows predictions for private point allocations as a function of the status quo public good level in our parameterization of the two mandatory treatments. When this status quo amount is below the Pareto efficient level, private points allocated to the responder are always zero in the Ma treatment, and sometimes marginally different from zero for certain status quo levels in the Mp treatment (see Figure B.1 in the Appendix for details). Similarly, in the discretionary treatment there is never any allocation of points to the responder’s private account. While other bargaining games (e.g., ultimatum bargaining or legislative bargaining) also predict the full exercise of proposer power, a difference here is that there is also a public good component to players’ payoffs that benefits both players and thus it is of interest to understand whether or not the presence of this public good component works to strengthen the use of proposer power.

Hypothesis 4

Under discretionary budget rules, in equilibrium both high and low types propose distinct levels for the public good (which are their static equilibrium amounts, \(\theta _H=40\) and \(\theta _L=25\)) no matter how many rounds are played in a supergame of the public and private good bargaining task.

As mentioned before, the logic here follows directly from the first order condition for the static optimization. Since an agreement about public good allocations in the current round has no implications for future rounds, the incentives for proposing the static equilibrium levels \(\theta _i\) for the group account (public good) are still maintained in the dynamic games of the discretionary treatment.

The theory also has predictions regarding the dynamics of behavior and the convergence of public good allocations to steady states over time for the mandatory treatments, which are summarized in the following hypothesis:

Hypothesis 5

Under mandatory rules, starting from the initial status quo level of \(Y=1\) (out of an endowment of 100), both types will propose public good amounts that should converge over time to the steady state levels, that is, the high type’s dynamic optimum (\(Y_{SS}^{*}\approx 64.615\)) in the Ma treatment and the Pareto efficient level (\(Y_{SS}^{*}=80\)) in the Mp treatment.

Finally, we consider some efficiency measures that can be used to evaluate the performance of the different budget rules. As a measure of aggregate efficiency, we look at the ratio of actual payoffs earned from accepted allocations to payoffs that would have been obtained at Pareto optimal allocations. Given the predicted public good allocations across treatments, we have the following:

Hypothesis 6

Aggregate efficiency will be higher in the two mandatory treatments as compared with the discretionary treatment.

The difference in efficiency between the two mandatory treatments is more ambiguous, and we will address this topic later in Sect. 4.9 when we evaluate Hypothesis 6.
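The aggregate efficiency measure can be sketched for a single accepted allocation as follows. This sketch assumes that efficiency is the sum of both players' stage payoffs divided by the same sum at the Pareto efficient public good level; how the ratio is aggregated over rounds and pairs is not specified here, and the function names are hypothetical.

```python
import math


def joint_payoff(group, proposer_private, responder_private, theta_high, theta_low):
    """Sum of both players' stage payoffs for one allocation."""
    return proposer_private + responder_private + (theta_high + theta_low) * math.log(group)


def aggregate_efficiency(group, proposer_private, responder_private,
                         theta_high, theta_low, endowment=100):
    """Joint payoff relative to the joint payoff at Y = theta_H + theta_L."""
    y_star = theta_high + theta_low
    best = joint_payoff(y_star, endowment - y_star, 0, theta_high, theta_low)
    actual = joint_payoff(group, proposer_private, responder_private,
                          theta_high, theta_low)
    return actual / best


# Static equilibrium of a high-type proposer in the aligned treatment:
# Y = 40 and the remaining 60 points in the proposer's private account.
static_eff = aggregate_efficiency(40, 60, 0, theta_high=40, theta_low=25)
print(round(static_eff, 3))
```

Under these assumptions even the static equilibrium allocation yields an efficiency ratio of roughly 0.98, because the logarithmic payoff is fairly flat near the optimum; differences in this measure across treatments can therefore be small in absolute terms.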

In the next section we evaluate each of these six hypotheses using the data from our experiment.

4 Experimental results

We report on results from 5 sessions of each of our three treatments, 15 sessions in total. As noted, there are 10 subjects per session; thus we report on data from 150 subjects.

Recall that the discretionary (D) and mandatory-aligned (Ma) treatments only involve a change in the status quo rule for the public good; the values of \((\theta _H, \theta _L)=(40, 25)\), are kept constant between these two treatments. By contrast, the mandatory-polarized (Mp) treatment involves both the mandatory rule for the status quo level of the public good and a greater difference between \(\theta _{H}\) and \(\theta _{L}\) (i.e., greater polarization) namely \((\theta _{H}, \theta _{L})=(55,25)\) and thus a higher level for efficient public good provision.

As noted earlier, we focus here and throughout the paper on accepted proposal amounts in keeping with the theory and since acceptance rates are generally high.

4.1 Overview

We begin with an overview of the main outcome variables from our experiment. Table 2 reports for each treatment, mean values for each of five main outcome variables: (1) proposal acceptance rates, (2) accepted amounts allocated to the public good, disaggregated by high or low type proposers; (3) accepted amounts allocated to the proposers’ own private account, disaggregated by high or low type proposers; (4) accepted amounts allocated to the responders’ private account, disaggregated by high or low type responders, and (5) aggregate efficiency levels achieved. Means are reported for all rounds of all supergames. The table also shows in square brackets the (Markov perfect) equilibrium predictions based on actual realizations for the proposer types and given the actual status quo levels for the public good that were realized in the experimental games, which is most relevant for the mandatory treatments.

Table 2 Overview of main outcome variables, all rounds

Indeed, as Table 2 reveals, proposals are accepted on average more than 75% of the time. A general finding observed across all treatments is that amounts allocated to the public good by low proposer types are greater than equilibrium predictions, while amounts allocated by high proposer types are lower than equilibrium predictions in the mandatory treatments. In addition, relative to the equilibrium predictions, both types of proposers allocate less, on average, to their own private accounts and more, on average, to the private accounts of their matched responders. Despite these differences, efficiency, as measured by the ratio of total payoffs earned to the Pareto optimum payoff level, is generally quite high, in excess of 95%.

4.2 Acceptance rates

We begin by discussing responders’ acceptance rates of proposals made by proposers. As Table 2 reveals, acceptance rates differ from the equilibrium prediction of 100%, and are highest for the discretionary treatment and lower for the two mandatory treatments. Details on acceptance rates by treatment and session are found in Table B.1 in the Appendix.

Table 3 Mann–Whitney tests of differences in acceptance rates across treatments

The difference in acceptance rates between the discretionary and mandatory treatments likely reflects the consequences of rejection under each rule. Under the discretionary rule, the rejection of a proposal means that earnings are zero. Under the mandatory rule, if the status quo level of the public good satisfies \(Y>1\), rejection still results in a positive payoff to both players, and this status quo payoff level can grow large over time, i.e., the status quo is endogenous. Thus, the mandatory rule gives responders greater bargaining power that strengthens as the status quo level rises, and empirically this results in higher rates of rejection (lower acceptance rates) under the two mandatory rules than under the discretionary rule. Indeed, the difference in acceptance rates between the discretionary treatment and either of the two mandatory treatments (Ma or Mp) is significant at the 5% level, as revealed in Table 3, which reports Mann–Whitney tests using session level mean data (5 sessions per treatment) over various sub-intervals. We observe in Table 2 that the difference in acceptance rates between the discretionary and mandatory treatments is around 10 percentage points, on average. Table 3 also reveals that there is no significant difference in acceptance rates between the two mandatory treatments, Ma and Mp. Summarizing this discussion we have:

Result 1

Acceptance rates across all treatments are less than 100%. Acceptance rates are significantly higher in the discretionary treatment as compared with the two mandatory treatments. There are no significant differences in acceptance rates between the two mandatory treatments.
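With only five session-level means per treatment, a Mann–Whitney comparison of the kind reported in Table 3 can be computed exactly by enumerating all relabelings of the pooled sessions. The following Python sketch illustrates the idea. The function name and the session means are hypothetical placeholders (they are not the values in Table B.1), and the paper does not state which software or variant of the test was actually used.

```python
from itertools import combinations


def mann_whitney_exact(x, y):
    """Exact one-sided Mann-Whitney test of H1: values in x tend to exceed those in y.

    Returns (U, p), where U counts pairs (xi, yj) with xi > yj (ties count 0.5)
    and p is the share of relabelings of the pooled sample with U at least as large.
    """
    def u_stat(a, b):
        return sum(1.0 if ai > bj else 0.5 if ai == bj else 0.0
                   for ai in a for bj in b)

    pooled = list(x) + list(y)
    n, m = len(x), len(pooled)
    u_obs = u_stat(x, y)
    as_extreme = total = 0
    for idx in combinations(range(m), n):
        chosen = set(idx)
        a = [pooled[i] for i in idx]
        b = [pooled[i] for i in range(m) if i not in chosen]
        total += 1
        if u_stat(a, b) >= u_obs:
            as_extreme += 1
    return u_obs, as_extreme / total


# HYPOTHETICAL session-level mean acceptance rates (5 sessions per treatment),
# used only to illustrate the mechanics of the test.
d_sessions = [0.88, 0.90, 0.85, 0.91, 0.87]
ma_sessions = [0.78, 0.80, 0.76, 0.82, 0.79]

u, p_one_sided = mann_whitney_exact(d_sessions, ma_sessions)
print(u, round(p_one_sided, 4))
```

With five sessions per group there are 252 possible relabelings, so the smallest attainable one-sided p-value is 1/252 (about 0.004), which occurs when the two groups are completely separated as in this placeholder example.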

4.3 Determinants of responder acceptance decisions

Table 4 Responders’ acceptance decisions, all treatments
Table 5 Responders’ acceptance decisions, mandatory treatments only

We next examine the determinants of responder acceptance decisions using a Probit regression analysis. Here the binary dependent variable is equal to 1 if the proposal was accepted and 0 otherwise. The results from our analysis are reported in Table 4 for all treatments and in Table 5 for the two mandatory treatments only.

In Table 4 we observe that responders are more likely to accept offers the higher is the proposer’s allocation to the public good and the higher is the proposer’s allocation to the responder’s private point balance. We further observe that responder acceptance decisions are decreasing with increases in the status quo amount of the public good in the two mandatory treatments, Ma and Mp. The latter result follows from the fact that in the discretionary treatment the status quo level is not changing but it generally rises over time in the mandatory treatments. Intuitively, as the status quo level rises, it is easier for responders to reject proposals as the positive status quo level of the public good guarantees that they will get some positive payoffs from the public good (upon rejection). Finally, we observe that controlling for public and private good allocations, the status quo public good level and treatment effects, proposals made by high type players (those who value the public good more) are significantly more likely to be rejected by their opponent low type responders. The latter finding is our first indication that fairness concerns may play a role in responders’ acceptance decisions. We explore such concerns in more detail later in Sect. 4.6 as well as in Appendix C.
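To make the probit specification concrete, the sketch below maps a linear index of the regressors discussed above into an acceptance probability through the standard normal CDF. The coefficient values are hypothetical and were chosen only to reproduce the signs described in the text; they are not the estimates in Table 4, and the variable names are illustrative.

```python
from statistics import NormalDist

# HYPOTHETICAL coefficients: signs follow the discussion of Table 4,
# magnitudes are placeholders and NOT the reported estimates.
COEF = {
    "const": 0.2,
    "public_good": 0.02,         # positive: more public good, more acceptance
    "responder_private": 0.03,   # positive: more private points to responder
    "status_quo": -0.01,         # negative: higher status quo, easier to reject
    "high_type_proposer": -0.3,  # negative: high type proposals rejected more
}


def accept_probability(public_good, responder_private, status_quo, high_type_proposer):
    """Probit acceptance probability: Phi(linear index of the regressors)."""
    index = (COEF["const"]
             + COEF["public_good"] * public_good
             + COEF["responder_private"] * responder_private
             + COEF["status_quo"] * status_quo
             + COEF["high_type_proposer"] * high_type_proposer)
    return NormalDist().cdf(index)


base = accept_probability(30, 10, 20, 0)
more_public = accept_probability(50, 10, 20, 0)
from_high_type = accept_probability(30, 10, 20, 1)
print(base, more_public, from_high_type)
```

Under these placeholder values, raising the public good allocation raises the predicted acceptance probability, while the same proposal coming from a high type proposer is less likely to be accepted, which is the qualitative pattern described above.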

These results remain largely robust if we restrict attention to the two mandatory treatments only, as reported in Table 5. In the analysis of Table 5 we further explore whether acceptance decisions depend on whether the status quo level of the public good is below the efficient level (low SQ) or above the efficient level (high SQ). We find that, consistent with the theoretical predictions, when the status quo level is below the efficient level, a higher allocation to the public good leads to a greater likelihood of acceptance by responders, and this effect is highly significant. When the status quo level is above the efficient level, higher private points awarded to the responder lead to a small but significant reduction in acceptance rates by responders.

We further consider differences in mean allocations to public good and private points between accepted and rejected proposals - see Table B.2 and Figure B.2 in the Appendix. There we show that public good and responder private points are significantly greater in accepted proposals as compared with rejected proposals while proposer private points are significantly lower in accepted proposals as compared with rejected proposals. This evidence further confirms that responders consider both public good levels and their own private points in deciding whether to accept proposals, as was already shown in the probit regression results.

4.4 Effect of mandatory rules on public good levels

Fig. 3 Mean Accepted Allocation to Public Good, All Sequences. Left panel: all rounds; Right panel: the first round.

We now consider the main treatment effect of adopting mandatory budget rules for public good provision relative to the discretionary budget rule case. We first focus on a comparison of the mandatory-aligned (Ma) and discretionary (D) treatments as they are most comparable (have the same \(\theta\) values). We report the following finding.

Result 2

Consistent with Hypothesis 1, public good provision is higher in the mandatory-aligned treatment than in the discretionary treatment.

Table 6 Nonparametric tests for differences in (accepted) public good allocations

Support for Result 2 comes from Fig. 3, which reports mean public good allocations by treatment and proposer type over all rounds and for the first rounds of sequences. Further, Table 6 provides results from nonparametric tests on accepted public good amounts by treatment over all rounds or round 1 only, using session level averages (session level data on public good allocations are reported in Tables B.3-B.5 of the Appendix). In Table 6, and in the nonparametric test tables that follow, we use one-sided tests whenever we have specific directional predictions from the theory and two-sided tests otherwise.

We first compare allocations to the public good in the discretionary treatment (D) with the mandatory-aligned treatment (Ma). Since the \(\theta\) values do not change between these two treatments, the comparison of D vs. Ma provides the cleanest test of the effect of changing the bargaining rules. We observe that over all rounds of all sequences, the mean agreed upon public good allocation proposed by both high and low types in the Ma treatment is around 10 points higher than the mean for the discretionary treatment and this difference is significant at the 5% level using the Mann–Whitney test on session level data as revealed in Table 6.

Using a random-effects Tobit regression analysis (to account for data censoring) and all data on accepted public good amounts from all treatments, Table 7 confirms that public good provision is, on average, significantly higher in the Ma treatment as compared with the baseline D treatment in most specifications including those with round and sequence numbers.

Table 7 Tobit regression analysis of accepted public good allocations, all treatments
Table 8 Tobit regression analysis of accepted public good allocations, mandatory treatments only

We note further that in the discretionary treatment, Table 2 reveals that there is considerable over-allocation to the public good in that mean accepted public good amounts are greater than theoretical predictions.Footnote 8 We see considerably less over-allocation in the mandatory-aligned (Ma) treatment.

We further observe in Fig. 3 and in Table 7 that, consistent with the theory, average allocations to the public good in the mandatory-polarized treatment (Mp) are also significantly greater than in the discretionary treatment (D), by between 12 and 20 points, though in this case the additional change in the degree of polarization between the D and Mp treatments is a confounding factor.

We next consider the impact on allocations to the public good under the two mandatory rules when there is an increase in polarization, that is, we make a comparison between mean allocations by low and by high types in the mandatory-aligned (Ma) and the mandatory-polarized (Mp) treatment. We have:

Result 3

Consistent with Hypothesis 2, when there is an increase in polarization (hence a change in treatments from Ma to Mp in the experiment), then under the mandatory rules, both types increase their allocations to the public good.

Support for Result 3 comes from Fig. 3 and Tables 6 and 8. From Fig. 3 we observe that average allocations to the public good are higher for both high and low types in the mandatory-polarized (Mp) treatment relative to the allocations of these same types in the mandatory-aligned (Ma) treatment. Using Mann–Whitney tests on session level averages over all rounds, as reported in Table 6, we find that this difference is significant for either type at the 5% or 10% level of significance. That is, for each type (high or low), we reject the null hypothesis that mean allocations to the public good are the same in treatments Ma and Mp in favor of the alternative that mean allocations are higher in Mp as compared with Ma. Finally, Table 8 reports another Tobit regression analysis for accepted public good allocations, but for the mandatory treatments only, and confirms this finding. Across several different specifications, we see that accepted public good allocations are significantly higher in the Mp treatment as compared with the baseline Ma treatment at the 1% or 5% significance level.Footnote 9

Finally, we also examined mean first round choices over all sequences, since in the first round, the status quo level is the same across the three treatments - see the right panel of Fig. 3. There we see that high types across the two mandatory treatments proposed higher allocations to the public good in round 1 than did low types which suggests that high types understood and acted upon the insurance role. Further, high types were especially responsive to changes in the bargaining rule, by monotonically increasing their round 1 allocation to the public good as the bargaining rule changed from D to Ma to Mp. Finally, as Table 6 shows, these differences in round 1 behavior are often significant.

4.5 Accepted allocations to private accounts of proposer and responder

Thus far, we have focused on accepted allocations to the public good benefiting both players. However, it is also of interest to consider the amounts allocated to the proposer and responder’s private accounts. Here again we focus on accepted proposals made by the proposer to his/her own private account and to the opponent’s (responder’s) private account. Mean amounts for both allocations are reported on in Table 2 and illustrated in Fig. 4.

Fig. 4 Mean Accepted Allocation to Private Account, All Sequences. Left panel: proposer; Right panel: responder.

Table 9 Nonparametric tests for differences in accepted private points to proposer and responder

Figure 4 and the nonparametric tests reported in Table 9Footnote 10 reveal that, consistent with theoretical predictions, private points allocated to proposers are significantly higher in the discretionary treatment than in the Ma treatment, where they are in turn significantly higher than in the Mp treatment. We further note that accepted private points to low type responders are significantly greater than accepted private points to high type responders.

However, in all cases accepted points allocated to the proposer’s private accounts lie below the predicted amounts based on the realized status quo level (solid line) or using the Pareto efficient equilibrium benchmark (dashed line) in Fig. 4. That is, proposers of both types under-allocate to their own private accounts on average and they over-allocate to the responder’s private account on average, relative to theoretical predictions.

While the equilibrium suggests that a proposal will be accepted so long as there is sufficient provision of public goods but zero private points to responders, especially given a status quo below the Pareto efficient level, such proposals are typically rejected in our laboratory experiment and proposers had to offer private points as well to their responders at most realized status quo levels. Indeed, as Tables B.15-B.16 in the Appendix show, mean proposed amounts to the proposer’s private account and to the responder’s private account, independent of the acceptance decision are, respectively, slightly greater and lower than are the same amounts conditional on acceptance of the proposal in Appendix Tables B.8 and B.11.Footnote 11 We understand this as proposers not being able to fully exercise their proposer power, a widely observed phenomenon in the empirical bargaining literature.

4.6 Allocations within the 2D simplex

Fig. 5 Bubble plots of allocation vectors by proposer types and treatments, accepted proposals only

Figure 5 summarizes, using a 2-dimensional simplex, the frequency of accepted allocation vectors made by high and low proposer types. The advantage of this approach relative to our analysis thus far is that, instead of analyzing public or private points one dimension at a time, here we can consider allocation vectors in the full multidimensional (bargaining) choice space. In each panel of the figure, the first coordinate, on the horizontal axis, is the amount allocated to the public good and the second coordinate, on the vertical axis, is the proposer’s allocation to his own private account. The allocation to the responder’s private account is the residual amount. Thus, the coordinate pair (50,25) corresponds to the case where the proposer allocated 50 points to the public good, 25 points to his own private account and the remaining 25 points went to the responder’s private account.Footnote 12 The size of a bubble centered at each observed coordinate pair is proportional to the count of observations with that allocation vector; the smallest bubble corresponds to a single observation. These figures also show the mean accepted allocation from the experimental data (Data Mean, indicated by the solid triangles) along with the static equilibrium prediction in the discretionary treatment and the Pareto optimum allocation in the two mandatory treatments (the solid squares) for reference purposes.

One key finding from these simplex figures is that most accepted proposals do not lie on the hypotenuse of the simplex triangle where the proposer exercises full proposer power, keeping all points not allocated to the public good for himself. Instead, the most frequently observed accepted proposals involve an equal division of private points between the proposer and the responder. Such “equal split” allocations are defined as those for which the proposer divides the amount not allocated to the public good equally between himself and the responder. These equal split allocations are found along the dashed line labeled “equal split” in Fig. 5. While we observe these equal split outcomes in all three treatments, the frequency of such equal split allocations is greater in the two mandatory treatments.
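The geometry of the simplex described above can be sketched in a few lines of Python. The total of 100 points is inferred from the (50,25) example in the text (50 + 25 + 25); the function names and category labels are illustrative assumptions rather than terminology from the paper.

```python
ENDOWMENT = 100  # inferred from the (50, 25) example: 50 + 25 + 25 = 100


def responder_points(public_good, proposer_private, endowment=ENDOWMENT):
    """Responder's private points are the residual of the endowment."""
    residual = endowment - public_good - proposer_private
    if public_good < 0 or proposer_private < 0 or residual < 0:
        raise ValueError("allocation lies outside the simplex")
    return residual


def classify(public_good, proposer_private, endowment=ENDOWMENT):
    """Locate an allocation relative to the reference lines of the simplex."""
    responder = responder_points(public_good, proposer_private, endowment)
    if responder == 0:
        # Hypotenuse: the proposer keeps everything not given to the public good.
        return "full proposer power"
    if proposer_private == responder:
        # Dashed line: private points are divided equally.
        return "equal split"
    if proposer_private == 0:
        # Horizontal leg: all private points go to the responder.
        return "all private points to responder"
    return "other"


print(responder_points(50, 25), classify(50, 25))
print(classify(40, 60))
print(classify(60, 0))
```

An allocation that puts the entire endowment into the public good leaves no private points to divide and is labeled "full proposer power" here only by convention.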

A second key finding concerns the mean accepted allocations relative to theoretical predictions. We see a large difference in public good allocations between the discretionary treatment and the two mandatory treatments, which is consistent with theoretical predictions (and earlier findings). In the discretionary treatment, the upper row of Fig. 5, we observe a large mass of observations around the static equilibrium allocation. By contrast, as we move from the discretionary to the two mandatory treatments, the middle and bottom rows, we see an increase in the accepted amounts allocated to the public good, as indicated by the rightward movement of the data mean allocation. This movement is away from the static equilibrium and toward the Pareto efficient outcome. The shift is particularly pronounced for high proposer types and less so for low proposer types. In Appendix Figures B.3-B.5, we provide further evidence of similar movements over time, between the first and second halves of sessions, in the mean accepted allocations in all three treatments. In the discretionary treatment there is a very slight movement toward the static equilibrium while in the two mandatory treatments there is a more pronounced movement toward the Pareto optimum allocation over time.

A further observation from Fig. 5 is that in the two mandatory treatments, low type proposers’ accepted allocations assign more points to their own private accounts than do high type proposers’ accepted allocations. Low type proposers’ accepted allocations generally lie on or above the equal split line, while high type proposers’ accepted allocations generally lie on or below the equal split line. The difference is largely due to some high types proposing allocations at the lower bound of the simplex (the horizontal leg of the triangle), where high type proposers give all of the endowment points in excess of the public good to the responder. We seldom see this type of allocation behavior in the case of low type proposers, suggesting that fairness motivations are playing an important role, since low type responders (matched with high type proposers) do not get the same benefit from the public good as do high type responders (matched with low type proposers).

In the two mandatory treatments, it may be puzzling to find that low type proposers are proposing public good amounts that are substantially below the Pareto optimum level, as is also revealed in Fig. 3. Figure 5 suggests that low types in the two mandatory treatments appear to be allocating too much, on average, to their own private accounts, particularly in the Mp treatment (see also Fig. 4 for low types’ over-allocation to their own private accounts, relative to the Pareto benchmark - drawn as the dashed line). However, Figs. 3 and 4 also reveal that low type proposers have actually over-allocated to public goods and under-allocated to their own private account in the two mandatory treatments, taking into account the predictions based on the realized status quo levels of the public good (i.e., the predictions depicted as the solid line). Thus, it seems that a reason for the under-allocation to the public good by the low proposer types is that the status quo level of the public good is not high enough in the time frame of our experiment to justify their allocating the Pareto efficient amount to the public good.Footnote 13 Of course, the status quo level depends on the behavior of both high and low type proposers. The high proposer types in the mandatory treatments are generally closer to the Pareto optimal public good levels, particularly in the Ma treatment.

Based on Fig. 5, we summarize our findings regarding proposer power in relation to Hypothesis 3 as follows:

Result 4

Inconsistent with Hypothesis 3, proposers do not exercise full proposer power by allocating zero private points to responders. This is particularly evident in the mandatory treatments, where sizeable fractions of proposers divide points net of the public good allocation equally between their own private accounts and those of responders.

Given the observed heterogeneity in the extent to which proposers exercise their proposer power, we further explore, in Appendix C, the possibility of behavioral model explanations, using either the Fehr and Schmidt (1999) specification of other-regarding preferences or a finite mixture model (FMM; see Moffatt (2016)). The latter model gives, for instance, the estimated proportion of equal splitters.

4.7 Evolution of accepted public good amounts over time

Fig. 6 Mean (Accepted) Public Good Allocations over Rounds.

In this section we explore in further detail the evolution of agreed upon public good allocations over time. Figure 6 shows mean public good allocations over all rounds of a supergame, with standard error bars (see also Table B.17 in the Appendix). Note that due to our use of random termination to implement a discount factor of \(\delta =.80\), earlier rounds in a supergame will have more observations than later rounds (and, accordingly, standard errors are larger in later rounds) and that the longest supergame in any session was 12 rounds. The top panel of Fig. 6 shows mean (accepted) public good allocations by proposer type, high or low, across all three treatments. The two middle panels and the bottom left panel show mean public good allocations by high and low types for each of the three treatments, Ma, Mp and D. Finally, the bottom right panel shows mean public good allocations for both proposer types combined, across all three treatments.
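The thinning of observations in later rounds follows directly from random termination. The Python sketch below assumes the standard implementation in which, after each round, the supergame continues with probability \(\delta = .80\); the exact procedure used in the sessions is not described in this section, and the function names are illustrative.

```python
import random

DELTA = 0.80  # continuation probability, taken from the text


def prob_reach_round(t, delta=DELTA):
    """Probability that a supergame lasts at least t rounds (round 1 is certain)."""
    return delta ** (t - 1)


def expected_length(delta=DELTA):
    """Expected number of rounds under geometric termination: 1 / (1 - delta)."""
    return 1.0 / (1.0 - delta)


def simulate_length(rng, delta=DELTA):
    """Draw one supergame length under the assumed termination rule."""
    rounds = 1
    while rng.random() < delta:
        rounds += 1
    return rounds


rng = random.Random(0)  # fixed seed so the simulation is reproducible
lengths = [simulate_length(rng) for _ in range(100_000)]
mean_length = sum(lengths) / len(lengths)

print(expected_length(), mean_length)
print(prob_reach_round(12))
```

Under this assumption the expected supergame length is 5 rounds, and fewer than one in ten supergames would reach a twelfth round, which is in line with the small number of observations in the later rounds of Fig. 6.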

The clear impression given by Fig. 6 is that, over the course of a supergame, there is on average good separation across treatments in the mean public good amounts offered by the high and low proposer types. Further, in both of the mandatory treatments we observe an upward trend in public good allocations while in the discretionary treatment we see a constant or even a declining trend in mean public good amounts.

Fig. 7 Mean Accepted Allocation to Public Good by Proposer Type, 1st vs. 2nd Half of Session. Left panel: high type; Right panel: low type.

We next consider each treatment in turn, beginning with the discretionary treatment. Recall from Hypothesis 4 that for the discretionary treatment, high and low types should simply propose their static equilibrium amounts in each round that they serve as the proposer, specifically \(Y=\theta _H=40\) for high types and \(Y=\theta _L=25\) for low types, ignoring any dynamic aspects of the repeated game. As Table 2 and Fig. 3 reveal, in the discretionary (D) treatment, accepted proposals made by high types average 50.81 while those made by low types average 43.3 over all rounds. These levels are greater than the static equilibrium levels of 40 and 25 respectively. Moreover, as Table 6 reveals, we can reject the null hypothesis that accepted public good offers by low types are the same as accepted public good offers by high types in favor of the alternative that offers by high types are greater than offers by low types at the 10% level of significance (\(p=.059\)). Further, as Fig. 7 reveals, there is not much change in the accepted public good allocations proposed by high and low types over the first and second halves of each session; that is, while accepted amounts proposed by both types are greater than the static equilibrium levels, they are not increasing or decreasing by much over time. A similar observation follows from the bottom left panel of Fig. 6. We summarize this finding as follows.

Result 5

Consistent with Hypothesis 4 in the discretionary treatment, accepted public good proposals by high types are greater than accepted public good proposals by low types, and do not change much over time. Both types’ accepted public good proposal amounts are greater than the static equilibrium levels \((\theta _H, \theta _L) = (40, 25)\).

We next go back to Table 7 and compare the evolution of accepted public good amounts over time in the mandatory treatments relative to the discretionary treatment using Tobit regressions that account for data censoring. Specifically, we report on random-effects Tobit regressions and we include round and sequence numbers to consider behavior over time. The dependent variable is the accepted public good amount, which lies within the implemented limits of 1 and 100.
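To illustrate how a Tobit model handles censoring at the limits of 1 and 100, the sketch below computes the log-likelihood contribution of a single observation in a two-limit Tobit. It is a simplified, pooled version that omits the random effects used in Table 7; the function name and the mean and scale values in the example are hypothetical and not estimates from the paper.

```python
from math import log
from statistics import NormalDist

LOWER, UPPER = 1.0, 100.0  # implemented limits on the public good allocation
STD_NORMAL = NormalDist()


def tobit_loglik_obs(y, mean, sigma, lower=LOWER, upper=UPPER):
    """Log-likelihood of one observation in a two-limit Tobit model.

    The latent outcome is assumed normal with the given mean and standard
    deviation; the observed y equals the latent value clipped to [lower, upper].
    """
    if y <= lower:
        # Censored from below: probability that the latent value is <= lower.
        return log(STD_NORMAL.cdf((lower - mean) / sigma))
    if y >= upper:
        # Censored from above: probability that the latent value is >= upper.
        return log(1.0 - STD_NORMAL.cdf((upper - mean) / sigma))
    # Uncensored: normal density of the observed value.
    z = (y - mean) / sigma
    return log(STD_NORMAL.pdf(z) / sigma)


# HYPOTHETICAL latent mean and scale, for illustration only.
interior = tobit_loglik_obs(50.0, mean=50.0, sigma=10.0)
at_upper = tobit_loglik_obs(100.0, mean=100.0, sigma=10.0)
print(interior, at_upper)
```

Observations at a limit contribute a probability mass rather than a density, which is what distinguishes the Tobit likelihood from ordinary least squares on the same data.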

In the regressions reported in Table 7, the discretionary treatment serves as the baseline. We see in specification (1) that the baseline accepted public good amount is about 40 and is increasing in the mandatory-aligned (Ma) treatment by about 13 and increasing further in the mandatory-polarized (Mp) treatment by around 20. Further, the inclusion of sequence and round numbers in specification (1) suggests that allocations to the public good are growing over time. However, disaggregating this effect further using interaction variables, Ma\(\times\)Round and Mp\(\times\)Round in specification (2) and Ma\(\times\)Sequence and Mp\(\times\)Sequence in specification (3) we see that the growth in public good allocations over time is owing to the two mandatory treatments (Ma and Mp); including the interactive terms, the coefficients on the round or sequence number variables for the baseline discretionary treatment are no longer significantly different from zero. This finding is consistent with the theory which predicts that accepted public good amounts should be growing over time in the mandatory treatments due to the role played by the status quo public good level in dynamic bargaining under mandatory rules. These same results generally continue to hold if we disaggregate accepted public good allocations by the type of player (high or low) who made the proposal as reported on in Appendix Table B.6 (with stronger effects of learning by high types).

Figure 7 also clearly shows that proposers of both types learn to increase public good allocations significantly in the two mandatory treatments as they move from the first to the second half of sessions.

4.8 Adjustment in public good allocations conditional on status quo level

We next look for more explicit evidence of dynamic adjustment within a sequence (indefinitely repeated game) in the two mandatory treatments, since in those treatments, each new sequence starts with a status quo level for the public good reset to the initial condition, \(Y=1\), and then, depending on whether proposals are accepted or not, the status quo level for the public good can increase over time.

Looking at the data from the first and second half of sessions (see Table B.1), we observe that acceptance rates remain roughly constant over time and consistently below the 100% equilibrium prediction across treatments (the 95% confidence intervals of mean acceptance rates - viewed as sample proportions - were overlapping between the first and the second half of sessions for all three treatments).

We notice further in Fig. 7 that both high and low types tend to increase their public good allocations over time in the two mandatory treatments, while as previously noted, there is not much change in public good allocations over time in the discretionary treatment (the 95% confidence intervals of mean public good allocations in accepted proposals were non-overlapping between the first and the second half of sessions for the two mandatory, Ma and Mp, treatments, for each of the two types, high and low, of proposers, while the same intervals were indeed overlapping between the two halves for either type of proposers in the discretionary treatment). The latter observation is consistent with the notion that players are learning to play according to the dynamic equilibrium predictions of the theory with greater experience.

Fig. 8 Scatter plot of (accepted) public good allocations by status quo default level, mandatory treatments, along with lowess filter and MPE prediction

Figure 8 shows scatter plots of accepted public good amounts as a function of the status quo level at the time the proposal was made. In addition, we show the fit of Lowess filters to these data. These fitted lines can be compared with the Markov Perfect Equilibrium (MPE) predictions shown alongside them (which repeat those in Fig. 2). The top panels are for the high and low proposer types in the mandatory-aligned (Ma) treatment, the middle panels are for the high and low types in the mandatory-polarized (Mp) treatment, and the bottom panels consider both proposer types combined in Ma (left) and Mp (right). By comparison with the theoretical predictions we find both differences and similarities. On the one hand, we observe that for the low types (the right columns of the top and middle panels), the pattern of public good allocations as a function of the status quo level is qualitatively similar to the MPE path. On the other hand, for the high types (the left columns of the top and middle panels) public good allocation levels should be more or less constant according to the equilibrium, while the data show a clearly increasing pattern as status quo levels increase.

By way of an explanation, we note that the lowest status quo value (\(Y=1\)) is more likely to be observed, as it is the initial state of all supergames, and there is a wide variance of public good allocation amounts for this status quo level for both high and low proposer types. This large initial variance reflects some initial learning/coordination that the theory does not address. Further, as the probit regressions in Tables 4 and 5 revealed, proposals by high type proposers are significantly more likely to be rejected by low type responders across all treatments (while the MPE, as depicted in Fig. 2 and Appendix Figure B.1, assumes no rejection). The greater rejection of high type proposals may cause these high type players to increase their allocations to the public good in order to gain acceptance.

To better understand the repeated game dynamics within a supergame (or sequence), we consider a simple first order autoregressive model of the convergence behavior of different outcome variables, which we label y. Specifically, we consider the model

$$\begin{aligned} y_{j,t} = \lambda y_{j,t-1} + \mu _{j} + \epsilon _{j,t} \end{aligned} \tag{1}$$

where \(y_{j,t}\) denotes the time t value of variable j. We are particularly interested in two main outcome variables, namely the accepted amount of the public good in period t, \(PG_t\), and the status quo level for the public good in period t, \(SQ_t\), in the two mandatory treatments, as both of these variables are expected to converge to steady states over time.

Provided that estimates of \(\lambda\) are less than 1, the steady state public good and status quo amounts over a supergame are well approximated by estimates of the limiting value of equation (1), namely, by \(\frac{\mu }{1-\lambda }\). Estimates of equation (1) for the accepted public good amount (PG) are reported in Appendix Table B.18, while estimates for the status quo level of the public good (SQ) are reported in Appendix Table B.19.
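The limiting value follows from setting \(y_{t} = y_{t-1} = y^{*}\) in the deterministic part of equation (1), which gives \(y^{*} = \lambda y^{*} + \mu\) and hence \(y^{*} = \frac{\mu }{1-\lambda }\). The Python sketch below checks this numerically by iterating the recursion without the error term. The parameter values are hypothetical and are not the estimates in Appendix Tables B.18-B.19; the starting value of 1 mirrors the initial status quo level.

```python
def limiting_value(mu, lam):
    """Steady state of y_t = lam * y_{t-1} + mu, defined when |lam| < 1."""
    if abs(lam) >= 1:
        raise ValueError("the recursion does not converge when |lambda| >= 1")
    return mu / (1.0 - lam)


def iterate(y0, mu, lam, periods):
    """Iterate the deterministic part of the AR(1) recursion from y0."""
    path = [y0]
    for _ in range(periods):
        path.append(lam * path[-1] + mu)
    return path


# HYPOTHETICAL parameters, chosen only to illustrate convergence.
mu, lam = 30.0, 0.5
path = iterate(1.0, mu, lam, 30)

print(limiting_value(mu, lam), path[-1])
```

The gap between the path and the limit shrinks by the factor \(\lambda\) each period, so convergence is faster the smaller \(\lambda\) is.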

These tables reveal several things. First, estimates for \(\lambda\) are generally less than 1, providing evidence of weak convergence over time across all treatments. Second, the limiting estimated values for \(\frac{\mu }{1-\lambda }\) in the two mandatory treatments are greater than in the discretionary treatment and are close to, but often fall just short of, the predicted steady state values for these two mandatory treatments. For instance, considering both proposer types in treatment Ma (Ma-both), a 95% confidence interval for the estimated limiting value of the public good allocation or the status quo level does not include the steady state level of 64.615, though the data are very close to this level. For high type proposers in treatment Ma (Ma-high), the 95% confidence interval for the estimated limiting value of the public good allocation can even lie above the predicted steady state level of 64.615. Similarly, for the Mp treatment, considering both proposer types (Mp-both), a 95% confidence interval for the estimated limiting value of the public good allocation or the status quo level does not include the steady state level of 80, though again the data are very close to this level. For high proposer types in Mp (Mp-high), the 95% confidence interval for the estimated limiting value of the public good allocation does include the predicted steady state level of 80. A third finding from Appendix Tables B.18-B.19 is that the 95% confidence intervals for the estimated limiting values of accepted public good allocations and status quo levels are non-overlapping across all three treatments (when both proposer types are combined, in the second half and over all rounds of sessions); that is, there is good separation in these limiting values as we move from the discretionary treatment to treatment Ma and then to treatment Mp. For instance, considering both proposer types and all rounds, Appendix Table B.18 reveals that the estimated limiting accepted public good allocation in the discretionary treatment (D-both) is 46.688 with a 95% confidence interval of [44.455, 48.921]; for the Ma treatment the estimated limiting accepted public good allocation for both proposer types (Ma-both) is 59.749 with a 95% confidence interval of [56.780, 62.718]; and finally for the Mp treatment the estimated limiting accepted public good allocation for both proposer types (Mp-both) is 69.537 with a 95% confidence interval of [66.756, 72.318]. We summarize these findings as follows:

Result 6

Regarding Hypothesis 5, in the mandatory treatments, accepted public good allocations are converging toward steady state levels, but estimated limits often fall just short of steady state predictions. Still, there is good separation of the long-run mean public good allocations across treatments in terms of 95% confidence intervals. A similar pattern obtains for the convergence of the status quo public good level across treatments.
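
The separation claim can be checked directly from the three intervals quoted above for Appendix Table B.18 (both proposer types, all rounds). The short Python sketch below uses only those reported numbers and the predicted steady state levels of 64.615 (Ma) and 80 (Mp); the helper names `overlaps`, `contains` and `midpoint` are our own illustrative choices and are not part of the original analysis.

```python
# 95% confidence intervals for the estimated limiting accepted public good
# allocation (both proposer types, all rounds), as quoted from Appendix Table B.18.
intervals = {
    "D-both": (44.455, 48.921),
    "Ma-both": (56.780, 62.718),
    "Mp-both": (66.756, 72.318),
}

# Predicted steady state levels for the two mandatory treatments (from the text).
steady_state = {"Ma-both": 64.615, "Mp-both": 80.0}


def overlaps(a, b):
    """Return True if the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


def contains(interval, value):
    """Return True if value lies inside the closed interval."""
    return interval[0] <= value <= interval[1]


def midpoint(interval):
    """Midpoint of an interval; equals the point estimate for a symmetric CI."""
    return (interval[0] + interval[1]) / 2


labels = list(intervals)

# Pairwise overlap check across the three treatments.
pairwise_overlap = {
    (x, y): overlaps(intervals[x], intervals[y])
    for i, x in enumerate(labels)
    for y in labels[i + 1:]
}

# Does each mandatory-treatment interval include its predicted steady state?
includes_prediction = {
    name: contains(intervals[name], level) for name, level in steady_state.items()
}

if __name__ == "__main__":
    for pair, result in pairwise_overlap.items():
        print(pair, "overlap:", result)
    for name, result in includes_prediction.items():
        print(name, "interval includes predicted steady state:", result)
```

The interval midpoints also reproduce the reported point estimates (46.688, 59.749 and 69.537), which is what one would expect if the intervals are symmetric around the estimated limiting value.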

4.9 Efficiency

In this section we consider efficiency across all three treatments. To enable a consistent comparison, we calculate efficiency as the ratio of actual payoffs from accepted allocations to the payoffs that would have been obtained at the Pareto optimal allocations, as was already done in the overview Table 2. While the Pareto optimal allocation is not an equilibrium under the discretionary rules, using the Pareto optimum payoff levels as a benchmark enables comparisons across all three treatments. Here we report both aggregate efficiency measures, defined as the sum of proposers’ and responders’ actual payoffs relative to the Pareto optimum payoff level, and individual type-specific measures: proposer/responder and high/low type payoffs relative to the Pareto optimum. To be precise, the denominator of both the aggregate and individual efficiency measures is the same (hypothetical) aggregate payoff at the Pareto optimum, since we are interested in how aggregate efficiency is decomposed between high and low type proposers and responders. The results are shown in Table 10 and nonparametric tests for differences in these efficiency measures are reported in Table 11 (see Footnote 14).

Table 10 Aggregate and Individual Efficiency

As Table 10 reveals, aggregate efficiency is very high, in excess of 90%, across all treatments. The rows for each treatment (D, Ma, Mp) labeled “Aggregate-both p.types” repeat the efficiency measures reported earlier in Table 2. These aggregate numbers are also recalculated according to whether the proposer was a high or low type. The final 12 rows of Table 10 show proposers’ or responders’ average share of the Pareto optimal payoff, and these percentages are further distinguished by the player’s type, high or low. We further report the Pareto Optimum (PO) share predictions for comparison purposes. Note that the actual efficiency shares for the Proposer and the Responder add up to the aggregate actual efficiency percentage in the first part of the table. For example, the numbers in the row “D: Proposer-high p.type” and in the row “D: Responder-low r.type” sum to the numbers in the row “D: Aggregate-high p.type.”
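
The adding-up property follows from the use of a common denominator. As a sketch in notation introduced here only for illustration, let \(\pi _P\) and \(\pi _R\) denote the actual payoffs of the proposer and the responder from an accepted allocation, and let \(\Pi ^{PO}\) denote the (hypothetical) aggregate payoff of the pair at the Pareto optimum. Then

\[
E^{agg} = \frac{\pi _P + \pi _R}{\Pi ^{PO}}, \qquad E_P = \frac{\pi _P}{\Pi ^{PO}}, \qquad E_R = \frac{\pi _R}{\Pi ^{PO}}, \qquad \text {so that} \qquad E_P + E_R = E^{agg}.
\]

Because averaging is linear, the identity also holds for average shares computed over the same set of observations, as in the rows of Table 10 that condition on the proposer being a given type.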

Table 11 Nonparametric tests for differences in efficiency

As Table 10 reveals, we observe slightly higher aggregate efficiency in the mandatory treatments relative to the discretionary treatment, but as the Mann–Whitney test results using session level averages in Table 11 reveal, these differences are significant only when the proposer is a high type. This finding is surprising since the mandatory treatments should lead to higher efficiency due to the role played by the endogenous status quo default. However, as we have already noted, there is over-allocation to the public good in the discretionary treatment and this behavior raises payoffs to levels that are not far from the Pareto optimal benchmark. At the same time, in the two mandatory treatments, the desire for equal sharing is more pronounced (see the finite mixture model results and the allocations in the 2D simplex) and these fairness concerns reduce allocations to the public good. The net effect of these two behaviors is to bring payoffs in all treatments closer together, so that efficiency differences across treatments are minimal. Further, we do not find efficiency differences between the two mandatory treatments.

While there are no large differences in aggregate efficiency across treatments, Tables 10 and 11 reveal an interesting difference in individual efficiency by player type. Specifically, we find strong evidence that high type proposers or high type responders achieve a greater share of total payoffs (greater efficiency) than do (their matched) low type responders or proposers. The differences are consistent with what would be predicted in the Pareto optimum (PO) as also reported in Table 10 (under PO predictions), and mainly reflect the fact that high types get more utility value from public good allocations than do low types. We summarize these findings as follows:

Result 7

Efficiency (actual payoffs achieved relative to the Pareto optimum) is high, in excess of 90%, across all treatments. The evidence for Hypothesis 6 is mixed, with aggregate efficiency being marginally significantly higher in the Mp treatment but not in the Ma treatment as compared with the D treatment. Further, high types achieve a significantly larger individual share of the efficient aggregate payoff level regardless of whether they are in the proposer or the responder role.

5 Conclusions and suggestions for future research

We have reported on an experimental test of a model of public good bargaining due to Bowen et al. (2014). The main innovation of this model is the consideration of mandatory versus discretionary bargaining rules for public good provision. Under mandatory rules, the status quo level of public good provision becomes endogenous; once parties agree on a public good provision level, that level becomes the new status quo level. Thus, in the event of a breakdown in bargaining between the two political parties, public good provision defaults to the status quo level, which may be positive, unlike in the discretionary case where the status quo level or disagreement value is always zero. Theoretically, the problem of underprovision of the public good in the discretionary environment can be eliminated in the mandatory setting because the mandatory bargaining rules raise the bargaining power of the out-of-power party. Indeed, under mandatory rules, efficient public good provision becomes possible. The aim of our experiment is to test this important insight.

We consider both discretionary and mandatory bargaining rules and in the latter case, we further consider the degree of political polarization of the two parties as measured by differences in the weights that they attach to public good provision.

Consistent with the theory, we find that public good allocations are significantly higher under mandatory budget rules than under discretionary rules and that under the mandatory rules, pairs of players are very close to achieving the efficient level of public good provision. Still, they fall just short. What can explain this behavior? As we have seen, acceptance rates are increasing in both the public good amount and the private points offered to responders. The latter result is inconsistent with the theoretical prediction that proposers exercise full proposer power, but it is consistent with findings from many ultimatum bargaining experiments. At the same time, low type proposers are not offering as large an allocation to the public good as high type proposers are (in the mandatory treatments, relative to the equilibrium predictions conditional on the realized status quo levels of the public good, low types over-allocate and high types under-allocate on average), and this behavior by the low types largely accounts for the shortfall in public good provision relative to the Pareto efficiency benchmark. This bias by the low types may reflect the low type’s smaller payoff from the public good relative to the high types in combination with fairness concerns. Further, many proposers in the mandatory treatments, particularly the high types, are choosing to reduce their own private points from equilibrium levels to fund the private points allocated to the responder. That is, proposers are heterogeneous in their exercise of proposer power, consistent with prior legislative bargaining experiments. The main difference, of course, is that we are considering bargaining over public good provision, which benefits all players. Here we observe that while our subjects fall short of achieving the efficient outcome, they do come tantalizingly close to reaching that benchmark.

Still, consistent with theoretical predictions, we find that as political polarization increases, both proposer types increase their allocations to the public good under the mandatory rules since the change in polarization leads to a higher efficient public good level. By contrast, under the discretionary rules, each proposer type makes public good allocations that are higher than the predicted static equilibrium levels, but that are further away from the Pareto efficient levels than under the mandatory rules. A main takeaway from our findings is that they help to rationalize the use of mandatory rather than discretionary budget rules in bargaining over public good expenditures between political parties.

We see several directions for future research on this topic. First, it would be useful to consider longer indefinite sequence lengths than in our study, as that would allow more time for the status quo bargaining mechanism in the mandatory treatments to enable subjects to possibly achieve convergence to the Pareto optimum. This change might be achieved by increasing the discount factor or by using the block random termination method of Fréchette and Yuksel (2017). Alternatively, we could consider changing the initial status quo level for the public good, e.g., to be at the Pareto optimum level. Second, it would be of interest to give subjects some pilot experience with several indefinite sequences involving both discretionary and mandatory bargaining rules and then ask them to choose which set of bargaining rules they would like to operate under; Bowen et al. (2017) suggest an interesting theory along this line. Third, it would be of interest to vary the duration of proposer power; we currently only consider a single value for p, the probability that a proposer remains in power. Changing p can affect the insurance motivation without changing the Pareto optimum public good levels, which is different from changing the degree of polarization (\(\theta _H-\theta _L\)). Finally, it would be of interest to connect our experimental design more closely with the Baron-Ferejohn legislative bargaining experiments, for example, those by Fréchette et al. (2012), that involve three or more parties as well as both public and private goods in a dynamic bargaining game, thereby enabling the study of majority rule rather than unanimous consent for implementation of bargaining outcomes. We leave all of these interesting extensions to future research.