Abstract
Grim trigger strategies can support any set of control paths as a cooperative equilibrium, if they yield at least the value of the noncooperative Nash equilibrium. We introduce the recursive Nash bargaining solution as an equilibrium selection device and study its properties by means of an analytically tractable n-person differential game. The idea is that the agents bargain over a tuple of stationary Markovian strategies, before the game has started. It is shown that under symmetry the bargaining solution yields efficient controls.
Keywords
- Nash Bargaining Solution
- Grim Trigger Strategy
- Noncooperative Equilibrium Strategies
- Equilibrium Selection Device
- Markov Strategies
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
2.1 Introduction
Most noncooperative differential games lack Pareto efficiency. That is, all agents can increase their individual payoffs if they agree to coordinate controls. However, in order to attain the socially optimal outcome at least two conditions must be fulfilled: (1) the agents form the grand coalition to derive the efficient controls and (2) payoffs must be transferable and distributed in such a way that every agent benefits from cooperation (see Note 1).
Here we study a mechanism which implements the Pareto efficient outcome as a bargaining solution. The crucial difference from the classic cooperative approach is that agents do not mutually agree to maximize overall payoffs and distribute them appropriately, but bargain over the controls. In order to support the resulting controls as an equilibrium we fix grim trigger strategies: if an agent defects from the agreement, all agents switch to their noncooperative Nash equilibrium strategies [7].
Sorger [6] proposed the recursive Nash [4] bargaining solution for difference games. We introduce a continuous time analogue and apply it to the differential game of public good provision of Fershtman and Nitzan [2]. Considering the noncooperative equilibrium, they showed that the public good is underprovided with respect to the efficient solution. The result, however, crucially depends on the linearity of the Markovian strategies. This simplification makes the game analytically tractable and yields a unique steady state (see Note 2).
This note contributes to the literature on cooperative agreements in noncooperative differential games. It is well known that grim trigger strategies can support a set of control paths as equilibria, if they payoff dominate the noncooperative Nash equilibrium. The Nash bargaining solution can then be used as an equilibrium selection device. Since bargaining problems are defined in the payoff space we need to construct a value under agreement. In games with transferable payoffs one can simply fix the efficient value of the grand coalition and define an imputation. Here, however, we do not assume that the grand coalition forms and jointly maximizes payoffs. But we can define the agreement value in terms of a stationary Hamilton-Jacobi-Bellman equation (HJBe), if the agents stick to the agreement strategies over the entire time interval. The agreement strategies are then determined by the Nash bargaining solution.
The remainder of the paper is organized as follows: Sect. 2.2 presents the problem, Sect. 2.3 the solution concept and Sect. 2.4 concludes.
2.2 Problem Statement
The model is essentially the most rudimentary version of Fershtman and Nitzan [2] (see Note 3). Let \(x(t) \in X := [0, \frac {1}{2}]\) denote the stock of a pure public good at time \(t \in \mathbb{R}_+\). One can think of x as the total contribution to some joint project carried out by n agents. Each agent i ∈ N := {1, 2, …, n} can partially control the evolution of the state according to the state equation
\[\dot x(t) = \sum_{i \in N} u_i(t) - \delta x(t) \qquad (2.1)\]
where \(u(t) := (u_i(t))_{i \in N} \in \times_{i \in N} U_i =: U \subset \mathbb{R}^n\) denotes the investment (control) vector and δ ∈ (0, 1] is the depreciation rate. In the context of the joint project, \(u_i(t)\) then denotes the contribution rate of any agent i ∈ N. We consider quadratic payoffs of the form
such that the game is linear quadratic and thus possesses a closed form solution [1, Ch. 7.1]. Note that the instantaneous payoff function is monotonically increasing in the state, \(\frac {\partial F_i(x,u_i)}{\partial x} > 0\) for all x ∈ X. The state is thus a pure public good and each agent benefits from its provision. With costly investment, however, there exists a trade-off between increasing the stock and minimizing costs. This trade-off defines a public good game. Each agent wants the others to invest, so that he can free ride on their effort. This behavior results in an inefficiently low overall investment level. The objective functional for each agent i ∈ N is then given by the stream of discounted payoffs
where r > 0 denotes the time preference rate.
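As an illustration of what a stream of discounted payoffs amounts to numerically, the following minimal sketch approximates a discounted integral of a flow payoff. It assumes standard exponential discounting at rate r over an infinite horizon, truncated at a finite `horizon`; the function name `discounted_payoff`, the constant flow 0.2 and the value r = 0.05 are hypothetical choices for illustration and are not taken from the model.

```python
import math

def discounted_payoff(flow, r, horizon=200.0, steps=200_000):
    """Approximate the integral of exp(-r*s) * flow(s) over [0, horizon].

    Uses the trapezoidal rule; `horizon` truncates the infinite horizon
    (an assumption that is harmless when exp(-r*horizon) is tiny).
    """
    h = horizon / steps
    total = 0.5 * (flow(0.0) + math.exp(-r * horizon) * flow(horizon))
    for k in range(1, steps):
        s = k * h
        total += math.exp(-r * s) * flow(s)
    return total * h

# Hypothetical numbers: a constant flow payoff of 0.2 discounted at r = 0.05.
r = 0.05
approx = discounted_payoff(lambda s: 0.2, r)
exact = 0.2 / r  # closed form for a constant flow over an infinite horizon
print(approx, exact)
```

For a constant flow the discounted value is simply the flow divided by r, which gives a quick sanity check for the numerical routine before it is applied to state-dependent payoffs.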
2.3 Solution Concepts
In what follows we consider a stationary setup and hence frequently suppress the time argument t. First we derive the efficient collusive solution of joint payoff maximization. The efficient value is an upper bound on the agreement value. We then derive the noncooperative Nash equilibrium, which serves as the disagreement value for the bargaining solution. The noncooperative equilibrium value is a lower bound on the agreement value. Any cooperative agreement lies in the set of strategies which support payoffs between the noncooperative Nash value and the efficient value. The noncooperative equilibrium strategies also serve as threats against deviations from the agreed upon bargaining solution.
2.3.1 Collusive Solution
Assume all agents agree to cooperate and jointly maximize overall payoffs. The value function for the efficient solution then reads
The optimal controls must satisfy the stationary HJBe
The maximizers of the right-hand side of (2.6) are \(u_i = C'(x)\) for all i ∈ N. Substituting the maximizers into the HJBe yields
Theorem 2.1
If we consider symmetric stationary linear strategies of the form \(\hat u_i = \alpha x + \beta \) for all i ∈ N where α and β are constants, then there exists a unique quadratic solution to (2.7)
with
Proof
Substitute the guess (2.8) and thus C′(x) = αx + β into (2.7)
This optimality condition must hold at any x ∈ X. Evaluate (2.12) at x = 0, which yields γ
Taking the derivative of (2.12) gives
Again, at x = 0 we have
Resubstituting β in (2.14) and solving for α yields
Note that the state dynamics become \(\dot x(t) = (n\alpha - \delta )x(t) + n\beta \). There exists a unique and globally asymptotically stable steady state at x = −nβ∕(nα − δ) if nα − δ < 0 holds, which is ensured for the negative root of (2.16).
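The stability claim can be checked numerically. The sketch below takes the closed-loop dynamics \(\dot x(t) = (n\alpha - \delta )x(t) + n\beta \) stated above, computes the steady state and evaluates the closed-form solution of this linear differential equation. The coefficient values are hypothetical placeholders chosen only so that nα − δ < 0; they are not the coefficients derived in Theorem 2.1, and the function names are illustrative.

```python
import math

def steady_state(n, alpha, beta, delta):
    """Steady state of dx/dt = (n*alpha - delta)*x + n*beta.

    Requires n*alpha - delta < 0, the stability condition from the text.
    """
    slope = n * alpha - delta
    if slope >= 0:
        raise ValueError("stability requires n*alpha - delta < 0")
    return -n * beta / slope

def state_path(x0, t, n, alpha, beta, delta):
    """Closed-form solution of the linear closed-loop dynamics at time t."""
    slope = n * alpha - delta
    x_star = steady_state(n, alpha, beta, delta)
    return x_star + (x0 - x_star) * math.exp(slope * t)

# Hypothetical coefficients (NOT the values from Theorem 2.1).
n, alpha, beta, delta = 3, -0.1, 0.05, 0.2
x_star = steady_state(n, alpha, beta, delta)

# Paths starting at both ends of X = [0, 1/2] approach the same steady state.
for x0 in (0.0, 0.5):
    path = [round(state_path(x0, t, n, alpha, beta, delta), 4) for t in (0, 2, 10, 50)]
    print(x0, path)
```

Because the deviation from the steady state decays at the rate nα − δ, any initial stock in X converges to the same point, which is what global asymptotic stability means here.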
2.3.2 Noncooperative Equilibrium
The collusive solution rests on two restrictive assumptions: the grand coalition must form, and payoffs must be transferable in order to split the total payoff (see Note 4). Let us assume that the collusive solution is not feasible. In this case we consider a noncooperative differential game in which each agent maximizes his individual payoff. The noncooperative Markovian strategies are denoted by \(\phi_i : X \to U_i\) and satisfy the following (see Note 5)
where \(\phi_{-i} := (\phi_j)_{j \in N \setminus \{i\}}\). A noncooperative Nash equilibrium is then defined as follows.
Definition 2.1
The strategy tuple \(\phi(x(s)) := (\phi_i(x(s)))_{i \in N} \in U\) is a noncooperative Nash equilibrium if the following holds
Denote by
the noncooperative disagreement value.
Theorem 2.2
If we consider symmetric stationary linear strategies of the form \(\phi_i(x) = \omega x + \lambda\) for all i ∈ N where ω and λ are constants, then there exists a unique quadratic solution to (2.19)
with
Proof
The proof follows the same steps as Theorem 2.1.
The noncooperative equilibrium, however, is generally not efficient. It can be shown that the collusive solution yields a cooperation dividend such that the value under cooperation always exceeds the noncooperative value, i.e., \(C(x) > \sum_{i \in N} D_i(x)\) for all x ∈ X. The investment levels and thus the provision of the public good are inefficiently low. This result is standard in public good games and is due to free riding. It is reasonable to assume that the agents do not want to stick to the fully noncooperative equilibrium, but prefer to increase overall efficiency by exploiting the cooperation dividend.
2.3.3 Bargaining Solution
It was shown by Tolwinski et al. [7] (see Note 6) that any control path \(\tilde u_i^t := (\tilde u_i(s))_{s \geq t}\), i ∈ N can be supported as an equilibrium if the control profiles are agreeable and defection from the agreement is punished (see Note 7). Let \(\sigma_i : X \to U_i\) denote a Markovian strategy that generates \(\tilde u_i\). Suppose the agents agree on some strategy profile \(\sigma(x) := (\sigma_i(x))_{i \in N}\) at \( \underline t < 0\) before the game has started. If the agents agree from t onwards, the agreement value is defined as
Definition 2.2
A strategy tuple σ(x) is agreeable at \( \underline t\) if
such that every agent benefits from the agreement in comparison to the noncooperative equilibrium.
If this inequality did not hold, there would exist an agent who would rather switch to the noncooperative equilibrium, because it payoff-dominates the agreement. The condition, also referred to as dynamic individual rationality, is necessary but not sufficient for dynamic stability of an agreement. An agent might still deviate from the agreement if he benefits from deviating.
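The agreeability condition can be screened numerically on a grid of states. The sketch below assumes the condition requires the agreement value of every agent to be at least his disagreement value at every state in X = [0, 1/2], in line with the description in Note 7. The quadratic value functions and the helper name `is_agreeable` are hypothetical and serve only as an illustration; a grid check is a heuristic, not a proof.

```python
def is_agreeable(agreement_values, disagreement_values, grid):
    """Check A_i(x) >= D_i(x) for every agent i and every grid point x."""
    return all(
        a(x) >= d(x)
        for a, d in zip(agreement_values, disagreement_values)
        for x in grid
    )

# Grid over the state space X = [0, 1/2].
grid = [0.5 * k / 100 for k in range(101)]

# Hypothetical quadratic value functions for a symmetric two-agent example.
def disagreement(x):
    return 1.0 + 0.5 * x - 0.2 * x ** 2

def good_agreement(x):
    return 1.2 + 0.6 * x - 0.2 * x ** 2   # lies above `disagreement` on X

def bad_agreement(x):
    return 0.9 + 0.6 * x - 0.2 * x ** 2   # lies below `disagreement` on X

agreeable = is_agreeable([good_agreement] * 2, [disagreement] * 2, grid)
not_agreeable = is_agreeable([bad_agreement] * 2, [disagreement] * 2, grid)
print(agreeable, not_agreeable)
```

The check runs over all states rather than only along the agreement trajectory, which reflects the distinction between agreeability and time consistency drawn in Note 7.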
Now we construct the history dependent non-Markovian grim trigger strategies \(\tau_i : [0, \infty) \to U_i\) that support \(\sigma_i(x)\) as an equilibrium. Given some agreement strategy profile σ(x) the agents can solve the differential equation (2.1) for the agreement trajectory of the state
Suppose the agents perfectly observe the state and can recall the history of the state \((x(s))_{s \in [0,t]}\). If they observe that an agent deviates at t, they can impose punishment with a delay \(\epsilon\), i.e., from \(t + \epsilon\) onwards. Now the grim strategies read
That is, if the agents observe that another player deviated from the agreement at t, they implement their noncooperative equilibrium strategies from \(t + \epsilon\) onwards. Let d ∈ N denote a potential defector who deviates from σ(x) at t. In the interval \(s \in [t, t + \epsilon]\) he maximizes his payoff against the agreement strategies of the opponents. From \(t + \epsilon\) onwards he receives the discounted disagreement payoff. Let \(V_d(x(t);\epsilon)\) denote the value of the defector defined as
The threat is effective if
holds, i.e., every agent prefers sticking to the agreement over defecting from it. We can always fix an \(\epsilon \in (0, \overline \epsilon ]\) such that (2.29) holds. To see this, suppose punishment can be implemented instantly, i.e., \(\epsilon = 0\). Equation (2.29) then becomes
which is true by the definition of individually rational agreements. Let \(\overline \epsilon \) denote a threshold such that (2.29) holds with equality
Then the threat is effective for all \(\epsilon \in (0, \overline \epsilon ]\). The threat is also credible, because after defection occurs all agents switch to their noncooperative equilibrium strategies and thus have no unilateral incentive to deviate from the punishment by the definition of an equilibrium. The grim trigger strategies and a sufficiently small punishment delay guarantee that the agents stick to the initial agreement over the entire time horizon.
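The mechanics of the grim trigger and of the delay threshold can be sketched as follows. The first function switches from an agreement strategy to a noncooperative strategy once a deviation observed at some time has been followed by the delay ε. The second part is a stylized simplification that is not part of the model above: it assumes the defector earns a constant flow payoff during the delay and then receives a fixed disagreement value, ignoring any movement of the state in between. All names and numbers are hypothetical.

```python
import math

def grim_trigger(sigma_i, phi_i, epsilon):
    """Build a trigger strategy: play sigma_i until a deviation observed at
    `deviation_time` has aged by epsilon, then play phi_i forever."""
    def tau_i(s, x, deviation_time=None):
        punished = deviation_time is not None and s >= deviation_time + epsilon
        return phi_i(x) if punished else sigma_i(x)
    return tau_i

def defector_value(epsilon, r, deviation_flow, disagreement_value):
    """Stylized defector value: constant flow for epsilon time units,
    then the discounted disagreement value (state movement ignored)."""
    weight = math.exp(-r * epsilon)
    return (1.0 - weight) * deviation_flow / r + weight * disagreement_value

def delay_threshold(r, deviation_flow, agreement_value, disagreement_value):
    """Delay at which the stylized defector value equals the agreement value."""
    best = deviation_flow / r
    if not (best > agreement_value >= disagreement_value):
        raise ValueError("need deviation_flow / r > agreement >= disagreement")
    return math.log((best - disagreement_value) / (best - agreement_value)) / r

# Hypothetical linear strategies and payoff numbers.
sigma = lambda x: 0.3 - 0.2 * x   # agreement strategy
phi = lambda x: 0.1 - 0.1 * x     # noncooperative strategy
tau = grim_trigger(sigma, phi, epsilon=0.5)

r, flow, A, D = 0.05, 0.3, 5.0, 4.0
eps_bar = delay_threshold(r, flow, A, D)
print(eps_bar, defector_value(eps_bar, r, flow, D))
```

In this stylized version the defector's value equals the disagreement value at zero delay and rises with the delay, so defection only pays once the delay exceeds the threshold. This mirrors the argument that a sufficiently small punishment delay keeps the threat effective.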
Differentiating (2.24) w.r.t. time yields a representation of the agreement value in terms of the stationary HJBe
This gives us a stationary definition for the agreement value. Next we want to determine a particular strategy profile σ(x) by the Nash bargaining solution. Fix the excess demand function as follows
That is, each agent claims an amount which exceeds his disagreement value. Since each agent will only agree on some bargaining strategy if it gives him at least his disagreement value, we must restrict the control set. The set of individually rational strategies is then defined as
Note that these are all stationary representations. That is, what matters is not the time instant t itself but the state x(t). Since the relation holds for all t, we suppress the time argument. We are now in the position to state our main result and show how to solve for the bargaining strategy σ(x).
Theorem 2.3
For the fully symmetric case the agreement strategies that solve the Nash bargaining product
yield the Pareto optimal controls.
Proof
The first-order condition of (2.36) for j ∈ N is given by
Under symmetry, we must have \(E_i(\cdot ) =: \overline E(\cdot )\), \(A^{\prime }_i(\cdot ) =: \overline A'(\cdot )\) and \(\sigma _i(\cdot ) =: \overline \sigma (\cdot )\) for all i ∈ N. The first order condition then becomes
Since \(\overline E(\cdot ) = 0 \Leftrightarrow \overline A(\cdot ) = \overline D(\cdot )\) implies that all agents stick to the disagreement strategy we can neglect this case here. Now substitute the maximizer \(\overline \sigma (x) = n\overline A'(x)\) into (2.33) which gives
Take the derivative with respect to x
We claimed that the agreement strategies satisfy the efficient solution and are thus given by \(\overline \sigma (x) = \alpha x + \beta \) with \(\overline \sigma '(x) = \alpha \). Equation (2.40) becomes
This relation must hold at any x ∈ X. At x = 0, the equation simplifies to
Now substitute β into (2.41) and solve for α, which is then identical to (2.9). Since the controls and thus the dynamics are identical under the collusive and the bargaining solution, the values must be identical as well.
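The symmetric first-order condition can be checked numerically at a fixed state. The sketch below rests on several labeled assumptions that go beyond the text: the excess demand of agent i is taken to be his flow payoff plus \(A'(x)\) times the state drift minus r times his disagreement value, the control cost is taken to be quadratic, \(\frac{1}{2}\sigma_i^2\) (consistent with the maximizer \(u_i = C'(x)\) in Sect. 2.3.1), and \(A'(x)\) is treated as a given number. All numerical values and names are hypothetical.

```python
def make_nash_product(x, a_prime, delta, state_payoff, r_times_d):
    """Return (excess, product) for a symmetric game at a fixed state x.

    Assumed form: E_i = state_payoff - sigma_i**2 / 2
                        + a_prime * (sum(sigma) - delta * x) - r_times_d
    """
    def excess(i, sigma):
        drift = sum(sigma) - delta * x
        return state_payoff - 0.5 * sigma[i] ** 2 + a_prime * drift - r_times_d

    def product(sigma):
        value = 1.0
        for i in range(len(sigma)):
            value *= excess(i, sigma)
        return value

    return excess, product

def numerical_gradient(f, sigma, h=1e-6):
    """Central finite differences of f with respect to each component."""
    grad = []
    for j in range(len(sigma)):
        up, down = list(sigma), list(sigma)
        up[j] += h
        down[j] -= h
        grad.append((f(up) - f(down)) / (2 * h))
    return grad

# Hypothetical numbers for n = 3 agents at state x = 0.2.
n, x, a_prime, delta = 3, 0.2, 0.1, 0.2
excess, product = make_nash_product(x, a_prime, delta, state_payoff=0.16, r_times_d=0.05)

sigma_bar = [n * a_prime] * n          # candidate from the symmetric condition
grad = numerical_gradient(product, sigma_bar)
print(grad, excess(0, sigma_bar))
```

Under these assumptions the gradient of the Nash product vanishes at the symmetric profile in which every agent plays n times the marginal agreement value, while the excess demand stays positive. Each agent thus internalizes the effect of his contribution on all n agents, which is the source of the efficiency result.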
2.4 Conclusion
We studied the recursive Nash bargaining solution for symmetric differential games. It was shown by an analytically tractable example that the bargaining solution yields the Pareto efficient outcome of full cooperation. In an accompanying paper the author also wants to investigate asymmetric games and compare different solution concepts (e.g. Kalai-Smorodinsky and Egalitarian solution). Especially for the case of asymmetric discounting the recursive bargaining solution can be useful, because then efficient controls are not derivable in the standard way by joint payoff maximization.
Notes
- 1. See Yeung and Petrosyan [9] for a recent treatment on subgame consistent cooperation in differential games.
- 2. Wirl [8] showed that within the set of nonlinear Markovian strategies the Nash equilibrium is nonunique and that the efficient steady state is potentially reachable.
- 3. See also Dockner et al. [1, Ch. 9.5] for a textbook treatment.
- 4. The latter assumption is not too prohibitive. If payoffs were not transferable the individual cooperative value is simply given by \(C_i(x_t) = J_i(\hat u(s), t)\) where \(\hat u(s)\) are the Pareto efficient controls. It turns out that in the symmetric setup \(C_i(x_t) = \frac {C(x_t)}{n}\) which would also be the result under an equal sharing rule with transferable payoffs.
- 5. See e.g. Dockner et al. [1, Ch. 4] for the theory on noncooperative differential games.
- 6. See also Dockner et al. [1, Ch. 6].
- 7. Agreeability is a stronger notion than time consistency. In the former the agreement payoff dominates the noncooperative play for any state while in the latter only along the cooperative path. Time consistency was introduced by Petrosjan [5] (originally 1977) and agreeability by Kaitala and Pohjola [3]. See also Zaccour [10] for a tutorial on cooperative differential games.
References
1. Dockner, E.J., Jørgensen, S., Long, N.V., Sorger, G.: Differential Games in Economics and Management Science. Cambridge University Press, Cambridge (2000)
2. Fershtman, C., Nitzan, S.: Dynamic voluntary provision of public goods. Eur. Econ. Rev. 35, 1057–1067 (1991)
3. Kaitala, V., Pohjola, M.: Economic development and agreeable redistribution in capitalism: efficient game equilibria in a two-class neoclassical growth model. Int. Econ. Rev. 31(2), 421–438 (1990)
4. Nash, J.F., Jr.: The bargaining problem. Econometrica 18(2), 155–162 (1950)
5. Petrosjan, L.A.: Agreeable solutions in differential games. Int. J. Math. Game Theory Algebra 2–3, 165–177 (1997)
6. Sorger, G.: Recursive Nash bargaining over a productive asset. J. Econ. Dyn. Control 30(12), 2637–2659 (2006)
7. Tolwinski, B., Haurie, A., Leitmann, G.: Cooperative equilibria in differential games. J. Math. Anal. Appl. 119(1–2), 182–202 (1986)
8. Wirl, F.: Dynamic voluntary provision of public goods: extension to nonlinear strategies. Eur. J. Polit. Econ. 12, 555–560 (1996)
9. Yeung, D.W.K., Petrosyan, L.A.: Subgame Consistent Cooperation. Springer, Singapore (2016)
10. Zaccour, G.: Time consistency in cooperative differential games: a tutorial. Inf. Syst. Oper. Res. (INFOR) 46(1), 81–92 (2008)
Acknowledgements
I thank Mark Schopf and participants of the Doktorandenworkshop der Fakultät für Wirtschaftswissenschaften for valuable comments. This work was partially supported by the German Research Foundation (DFG) within the Collaborative Research Center “On-The-Fly Computing” (SFB 901).
© 2018 Springer International Publishing AG, part of Springer Nature
Cite this chapter
Hoof, S. (2018). Dynamic Voluntary Provision of Public Goods: The Recursive Nash Bargaining Solution. In: Petrosyan, L., Mazalov, V., Zenkevich, N. (eds) Frontiers of Dynamic Games. Static & Dynamic Game Theory: Foundations & Applications. Birkhäuser, Cham. https://doi.org/10.1007/978-3-319-92988-0_2
DOI: https://doi.org/10.1007/978-3-319-92988-0_2
Publisher Name: Birkhäuser, Cham
Print ISBN: 978-3-319-92987-3
Online ISBN: 978-3-319-92988-0
eBook Packages: Mathematics and Statistics (R0)