1 Introduction

In the study of intertemporal choices it is customary in economics to consider the so-called Discounted Utility (DU) Model, introduced in Samuelson (1937). In the DU model, the agent's time preferences are time-consistent and are characterized by a single parameter, the constant discount rate of time preference. However, empirical observations seem to show that the predictions of the DU model sometimes disagree with the actual behavior of decision makers (we refer to Frederick et al. 2002 for a review of the topic). In addition, when there are several players, although it is typically assumed that all economic agents have the same rate of time preference, there is no reason to believe that consumers, firms or countries have identical time preferences over utility streams. For instance, in a non-cooperative setting, Van Long et al. (1999) studied feedback Nash equilibria for the problem of extraction of exhaustible resources under common access in the case of different but constant discount rates. There are also problems (for instance, in the analysis of international trade agreements, climate change policies, or the exploitation of common property natural resources; we refer to Jørgensen et al. 2010 and Van Long 2011 for two recent surveys on dynamic games in these topics) in which it is natural to assume that players can communicate and coordinate their strategies in order to optimize their collective payoff. In this cooperative framework, if the time preferences of the players are time-inconsistent, or if they apply different discount rates (constant or not), the notion of Pareto efficiency is lost. For the case of constant but different discount rates in a discrete time setting, Sorger (2006) proposed a recursive Nash bargaining solution. Also in a discrete time setting, Breton and Keoula (2014) studied the stability of coalitions in a resource economics model. In a continuous time setting, De Paz et al. (2013) (see also Marín-Solano and Shevkoplyas 2011) studied the problem of asymmetric players under two fundamental assumptions: all players commit themselves to cooperate at every instant of time t, and the different t-coalitions (with different time preferences) lack precommitment power. Equilibria were computed by finding subgame perfect equilibria in a noncooperative sequential game whose agents are the different t-coalitions (representing, for instance, different generations). Hence, this solution to the problem is time-consistent provided that all players commit to cooperate at every instant of time t.

The objectives of this chapter are the following. First, results derived in a continuous time setting in Karp (2007) and Ekeland and Lazrak (2010) for the case of time-distance and nonconstant discount functions are extended to a noncooperative differential game with more general discount functions. This is in fact a straightforward generalization of the results in Marín-Solano and Shevkoplyas (2011). Then attention turns to extending the setting of partial cooperation among players studied in De Paz et al. (2013). In order to guarantee the stability of the grand coalition, nonconstant weights are introduced, so that players can bargain their weight in the grand coalition at every instant of time. Strictly speaking, although the proposed solution assumes cooperation among players at every instant of time t, it is a noncooperative Markovian Nash equilibrium for the non-cooperative sequential game defined by these infinitely many t-coalitions. In this sense we call this solution a time-consistent equilibrium with partial cooperation. It is important to realize that, in the standard case with a common and constant discount rate for all players, if weights are constant, standard dynamic optimization techniques (Pontryagin's maximum principle, or the Hamilton–Jacobi–Bellman equation) provide time-consistent solutions. However, there are cases in which no constant weights exist that guarantee the sustainability of cooperation over time (see e.g. Yeung and Petrosyan 2006, and references therein). For this standard problem, the introduction of nonconstant weights provides a way to construct dynamically consistent solutions guaranteeing the stability of the grand coalition. The price to be paid is that the proposed solution with nonconstant weights is found for a problem with time-inconsistent preferences, which makes the problem less computationally tractable. Perhaps more relevant is to examine the effects of introducing time-inconsistent preferences in economic models. First, a simple common access resource game solved in Clemhout and Wan (1985) is studied by introducing heterogeneity and nonconstancy in the discount rates. Finally, a linear state pollution differential game with the same kind of time preferences is also studied. Throughout the chapter we assume that players are rational, in the sense that they are aware of their changing preferences and look for time-consistent solutions.

The chapter is organized as follows. In Sect. 2, we describe the noncooperative problem. The problem with partial cooperation and nonconstant weights is studied in Sect. 3. Finally, Sect. 4 analyzes the two above mentioned models coming from the field of environmental and resource economics.

2 Markovian Nash Equilibria in Noncooperative Differential Games with Time-Inconsistent Preferences

Within the framework of the (β,δ)-preferences introduced in Phelps and Pollak (1968), differential games with time-inconsistent preferences were already studied in Alj and Haurie (1983). In that paper the authors analyzed intergenerational equilibria, extending previous definitions and results to stochastic games and intragenerational conflicts. More recent references on the topic are Haurie (2005), Nowak (2006) and Balbus and Nowak (2008). In this section we study Markovian Nash equilibria in differential games for a rather general model with time-inconsistent preferences.

First, let us review the problem in the case of just one decision maker. Let \(x=(x_1,\ldots,x_n)\in X\subseteq\mathbf{R}^n\) be the vector of state variables, \(u=(u_1,\ldots,u_m)\in U\subseteq\mathbf{R}^m\) the vector of control (or decision) variables, L(x(s),u(s),s) the instantaneous utility function at time s, T the planning horizon (terminal time) and S(x(T),T) the final (scrap or bequest) function. Let d(s,t) be an arbitrary discount function representing how the agent at time t (the so-called t-agent in the hyperbolic discounting literature) discounts utilities enjoyed at time s. For instance, if \(d(s,t)=e^{-r(s-t)}\) we recover the standard problem with a constant instantaneous discount rate of time preference. In the case of time-distance discounting with a nonconstant discount rate, \(d(s,t)=\theta(s-t)= \exp ( -\int_{0}^{s-t} r(\tau) d\tau )\). For our general problem, an agent taking decisions at time t (the t-agent) aims to maximize

$$ J(x,u,t)=\int_t^T d(s,t) L \bigl(x(s),u(s),s\bigr) ds + d(T,t) S\bigl(x(T),T\bigr), $$
(1)

with

$$\dot{x}^i(s)=g^i\bigl(x(s),u(s),s\bigr),\quad x^i(t)=x^i_t, \mbox{for } i=1,\ldots,n. $$

In Problem (1) we assume that the functions L(x,u,s), S(x,T) and \(g^i(x,u,s)\), i=1,…,n, are continuously differentiable in all their arguments. In the following we will also assume that d(s,t) is continuously differentiable in both arguments. In general, unless the discount function is multiplicatively separable in the time s and the planning date t, i.e. \(d(s,t)=d_1(s)d_2(t)\) for all t∈[0,T], s∈[t,T], the optimal solution from the viewpoint of the agent at time t will no longer be optimal for future s-agents. Hence, the solution provided by standard optimal control techniques (such as Pontryagin's maximum principle, or the Hamilton–Jacobi–Bellman equation) is time-inconsistent. In this chapter we focus on the search for time-consistent solutions (agents are sophisticated, in the terminology of the hyperbolic discounting literature).
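
To make the separability condition concrete, the following sketch (with parameter values of our own choosing) compares an exponential discount function, which factorizes as d(s,t)=d_1(s)d_2(t), with a hyperbolic one, which does not. Only in the separable case do all t-agents agree on the relative value of two fixed future payoffs, which is why the optimal plan is then never revised.

```python
import numpy as np

# Illustrative parameters (our own choices, not from the chapter).
r, k = 0.05, 0.25
d_exp = lambda s, t: np.exp(-r * (s - t))       # separable: e^{-rs} e^{rt}
d_hyp = lambda s, t: 1.0 / (1.0 + k * (s - t))  # not separable in s and t

def ratio(d, s1, s2, t):
    """Relative weight of utility at s2 versus s1, as seen by the t-agent."""
    return d(s2, t) / d(s1, t)

for t in (0.0, 5.0, 9.0):
    print(f"t={t}: exponential {ratio(d_exp, 10, 20, t):.4f}, "
          f"hyperbolic {ratio(d_hyp, 10, 20, t):.4f}")
# The exponential ratio is constant in t, so every t-agent ranks the two
# payoffs identically; the hyperbolic ratio falls as t approaches s1 = 10,
# which is exactly the preference reversal discussed above.
```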

In order to solve Problem (1), an intuitive way to derive a dynamic programming equation is to discretize it, then find the Markov perfect equilibrium of the corresponding sequential game, and finally define the equilibrium rule of the original problem by passing to the continuous-time limit (provided that it exists). This is the approach followed in Karp (2007) in the derivation of a dynamic programming equation extending the classical Hamilton–Jacobi–Bellman equation for the problem of time-distance discounting with a nonconstant discount rate of time preference. Alternatively, we can follow the approach introduced in Ekeland and Lazrak (2010) (later extended to a stochastic setting in Ekeland and Pirvu 2008) for the same problem. Next we briefly describe the latter procedure and the corresponding results derived in Marín-Solano and Shevkoplyas (2011).
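
For intuition, here is a minimal discrete-time sketch of that construction, assuming quasi-hyperbolic (β,δ)-preferences and a cake-eating problem with logarithmic utility (a toy example of our own, not the model in Karp 2007). Each t-agent best-responds by backward induction to the rules of all later agents, which is precisely the sequential-game reading used in the text.

```python
import numpy as np

# Sophisticated equilibrium of a discretized cake-eating problem with
# (beta, delta)-discounting; all parameter and grid choices are illustrative.
beta, delta, T = 0.7, 0.95, 30
grid = np.linspace(1e-3, 1.0, 400)            # cake sizes
W = np.log(grid)                              # the period-T agent eats all
shares = np.linspace(1e-3, 1.0, 200)          # candidate consumption shares
policy = [None] * T

for t in range(T - 1, -1, -1):                # backward over the t-agents
    m_t = np.zeros_like(grid)
    W_new = np.zeros_like(grid)
    for j, x in enumerate(grid):
        c = shares * x
        cont = np.interp(x - c, grid, W)       # continuation along equilibrium
        val = np.log(c) + beta * delta * cont  # the t-agent's payoff
        kbest = np.argmax(val)
        m_t[j] = shares[kbest]
        # Exponentially discounted continuation value used by earlier agents:
        W_new[j] = np.log(c[kbest]) + delta * np.interp(x - c[kbest], grid, W)
    W, policy[t] = W_new, m_t

print("equilibrium share consumed at t=0 from a unit cake:", policy[0][-1])
```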

If \(u^*(s)=\phi(x(s),s)\) is the equilibrium rule, then the value function is given by

$$ V(x,t)= \int_t^T d(s,t) L \bigl(x(s),\phi\bigl(x(s),s\bigr),s\bigr) ds + d(T,t) S\bigl(x(T),T\bigr) $$
(2)

where \(\dot{x}(s)=g(x(s),\phi(x(s),s),s)\), \(x(t)=x_t\). Next, for ε>0, let us consider the variations

$$u_\varepsilon (s)= \begin{cases} v(s) & \mbox{if } s\in[t,t+\varepsilon ], \\ \phi(x,s) & \mbox{if } s>t+\varepsilon . \end{cases} $$

If the t-agent has the ability to precommit her behavior during the period [t,t+ε], the value function for the perturbed control path u ε is given by

$$\begin{aligned} V_\varepsilon (x,t) = & \max_{\{ v(s), s\in[t,t+\varepsilon ]\}} \biggl\{ \int _t^{t+\varepsilon } d(s,t) L\bigl(x(s),v(s),s\bigr) ds \\ &{} + \int_{t+\varepsilon }^T d(s,t) L\bigl(x(s), \phi\bigl(x(s),s\bigr),s\bigr) ds + d(T,t) S\bigl(x(T),T\bigr) \biggr\} . \end{aligned}$$
(3)

Definition 1

A decision rule \(u^*(s)=\phi(x(s),s)\) is called an equilibrium rule if

$$\lim_{\varepsilon \to0^+}\frac{V(x,t)-V_{\varepsilon }(x,t)}{\varepsilon } \geq0. $$

This definition of equilibrium rule is rather weak, as explained, e.g., in Ekeland et al. (2012), and in particular it is satisfied by the optimal solutions of a classical optimal control problem. Concerning regularity conditions, in Karp (2007) and Ekeland and Lazrak (2010) it was assumed that decision rules were differentiable. This condition was not assumed in Ekeland and Pirvu (2008). In fact, the differentiability of the decision rule is not needed in the derivation of the following result (see Marín-Solano and Shevkoplyas 2011 for a proof): if the value function is of class \(C^1\), then the solution u=ϕ(x,t) to the integral equation (2) with

$$u^*=\phi(x,t)=\arg\max_{u} \bigl[ L(x,u,t) + \nabla_x V(x,t) g(x,u,t) \bigr] $$

is an equilibrium rule, in the sense that it satisfies Definition 1.

If there is no final function and T=∞, it was proved in Marín-Solano and Shevkoplyas (2011) that, if there exists a bounded value function of class \(C^1\) solving the integral equation

$$ V(x,t) = \int_t^\infty d(s,t) L \bigl(x(s),\phi\bigl(x(s),s\bigr),s\bigr) ds $$
(4)

where

$$ u^*=\phi(x,t)=\arg\max_{u} \bigl[ L(x,u,t) + \nabla_x V(x,t) g(x,u,t) \bigr], $$
(5)

then \(u^*=\phi(x,t)\) is an equilibrium rule, in the sense that it satisfies Definition 1.

In order to guarantee the finiteness of the integral in (4), Ekeland and Lazrak (2010) restrict their attention to convergent policies (i.e. equilibrium rules such that the corresponding state variables converge to a stationary state).
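
As a concrete check of this characterization (a toy example of our own, not from the chapter), consider cake-eating with \(\dot{x}=-c\), L=ln c and a time-distance discount function θ. Guessing a linear rule c=mx, the integral equation (4) gives \(V(x,t)=A\ln x + B\) with \(A=\int_0^\infty \theta(s) ds\), and the max condition (5) then returns c=x/A, so m=1/A is a fixed point; moreover \(\dot{x}=-x/A\) is a convergent policy. The following sketch verifies the fixed point numerically:

```python
import numpy as np
from scipy.integrate import quad

# Assumed hyperbolic discount function; it decays fast enough that both
# integrals below converge.
theta = lambda s: (1.0 + 0.3 * s) ** (-3.0)

A = quad(theta, 0, np.inf)[0]     # A = int_0^infty theta(s) ds
m = 1.0 / A                       # candidate linear equilibrium rule c = m x

# Integral equation (4): with c(s) = m x(s) and xdot = -c, the state path is
# x(s) = x e^{-m(s-t)}, so V(x) = A ln x + B with the constant B below.
B = quad(lambda s: theta(s) * (np.log(m) - m * s), 0, np.inf)[0]
print("A =", A, " B =", B)

# Max condition (5): argmax_c [ln c + (A/x)(-c)] = x / A, the candidate rule.
x = 2.0
print("rule from (5):", x / A, "  candidate rule m*x:", m * x)  # equal
```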

We can easily generalize the previous results to multi-agent problems. Let us consider a differential game defined on [0,T]. The state of the game at time t is described by a vector \(x\in X\subseteq\mathbf{R}^n\). The initial state is fixed, \(x(0)=x_0\). There are N players. Let \(u_{i}(t)\in U_{i}\subseteq\mathbf{R}^{m_{i}}\) be the control variables of player i. Each agent i at time t seeks to maximize in \(u_i\) her objective functional

$$\begin{aligned} J_{i}\bigl(x,t,u_1(s),\ldots,u_N(s)\bigr) = & \int_t^T d_i(s,t) L_i\bigl(x(s),u_1(s),\ldots,u_N(s),s\bigr) ds \\ &{} + d_i(T,t) S_i\bigl(x(T),T\bigr) \end{aligned}$$

subject to

$$ \dot{x}(s)=g\bigl(x(s),u_1(s),\ldots,u_N(s),s \bigr),\quad x(t)=x_t. $$
(6)

In a noncooperative setting with simultaneous play, we restrict our attention to the case when players apply Markovian strategies, \(u_i(t)=\phi_i(x,t)\), for i=1,…,N. Note that open-loop strategies are not appropriate for our problem, since time-consistent players with time-inconsistent preferences decide at each time t according to their new time preferences, taking into account the value of the state variable at time t, \(x_t\). Time-consistent Markovian Nash equilibria in a noncooperative differential game can be obtained as a generalization of the results for one decision maker. Let \((\phi_{1}^{nc},\ldots,\phi_{N}^{nc})\) be an N-tuple of functions \(\phi _{i}^{nc}:X\times[0,T]\to\mathbf{R}^{m_{i}}\), i=1,…,N, such that the following assumptions are satisfied:

  1.

    There exists a unique absolutely continuous curve x:[0,T]→X solution to

    $$\dot{x}(t)=g\bigl(x(t),\phi_1^{nc}\bigl(x(t),t\bigr),\ldots, \phi_N^{nc}\bigl(x(t),t\bigr)\bigr), \quad x(0)=x_0. $$
  2.

    For all i=1,2,…,N, there exists a continuously differentiable function \(V_{i}^{nc}:X\times[0,T]\to\mathbf{R}\) verifying the integral equation

    $$\begin{aligned} V_{i}^{nc}(x,t) =&\int_t^T d_i(s,t) L_i\bigl(x(s),\phi_1^{nc}\bigl(x(s),s \bigr),\ldots ,\phi_N^{nc}\bigl(x(s),s\bigr),s\bigr) ds \\ &{}+ d_i(T,t) S_i\bigl(x(T),T\bigr), \quad V_i^{nc}(x,T)=S_i(x,T), \end{aligned}$$

    where

    $$\begin{aligned} u_i^{nc} =&\phi_i^{nc}(x,t) \\ =& \arg\max_{\{ u_i\}} \bigl\{ L_i\bigl(x, \phi_1^{nc}(x,t),\ldots,\phi_{i-1}^{nc}(x,t),u_i, \phi _{i+1}^{nc}(x,t),\ldots,\phi_N^{nc}(x,t),t \bigr) \\ &{} + \nabla_x V_i^{nc}(x,t) \\ &{}\times g \bigl(x,\phi_1^{nc}(x,t),\ldots,\phi _{i-1}^{nc}(x,t),u_i, \phi_{i+1}^{nc}(x,t),\ldots,\phi _N^{nc}(x,t),t \bigr) \bigr\} . \end{aligned}$$
    (7)

Then the strategy \((\phi_{1}^{nc}(x,t),\ldots,\phi_{N}^{nc}(x,t))\) is a time-consistent Markov Nash equilibrium, and \(V_{i}^{nc}(x,t)\), i=1,…,N, are the corresponding value functions for all players in the noncooperative differential game.

In an infinite horizon setting (T=∞ and there is no final function), equations (4) and (5) generalize as follows. Let \((\phi _{1}^{nc},\ldots,\phi_{N}^{nc})\) be an N-tuple of functions \(\phi _{i}^{nc}:X\times[0,\infty)\to\mathbf{R}^{m_{i}}\) such that the following assumptions are satisfied:

  1.

    There exists a unique absolutely continuous curve x:[0,∞)→X solution to

    $$\dot{x}(t)=g\bigl(x(t),\phi_1^{nc}\bigl(x(t),t\bigr), \ldots,\phi_N^{nc}\bigl(x(t),t\bigr)\bigr), \quad x(0)=x_0, $$
  2.

    For all i=1,2,…,N, there exists a bounded continuously differentiable function \(V_i^{nc}:X\times[0,\infty)\to\mathbf{R}\) verifying the integral equation

    $$V_{i}^{nc}(x,t)=\int_t^\infty d_i(s,t) L_i\bigl(x(s),\phi _1^{nc} \bigl(x(s),s\bigr),\ldots,\phi_N^{nc}\bigl(x(s),s\bigr),s \bigr) ds, $$

    where \(u_{i}^{nc}=\phi_{i}^{nc}(x,t)\) solves (7).

Then the strategy \((\phi_{1}^{nc}(x,t),\ldots,\phi_{N}^{nc}(x,t))\) is a time-consistent Markov Nash equilibrium, and \(V_{i}^{nc}(x,t)\), i=1,…,N, are the corresponding value functions.

3 Time-Consistent Solutions in a Differential Game with Asymmetric Players Under Partial Cooperation

In the analysis of intertemporal decision problems with several agents, when players can communicate and coordinate their strategies in order to optimize their collective payoff, cooperative solutions are introduced. If there is a unique and constant discount rate of time preference for all agents, the Pareto efficient solution is easily obtained by solving a standard optimal control problem. However, in the case of different discount rates or time-inconsistent preferences, standard dynamic optimization techniques fail when looking for time-consistent cooperative solutions. In these cases, when agents lack commitment power but decide to cooperate at every instant of time, the coalitions acting at different times t behave as a sequence of independent decision makers (the t-coalitions). The solution we propose in this chapter, which is an extension of that in De Paz et al. (2013) (see also Marín-Solano and Shevkoplyas 2011), assumes cooperation among players at every time t, but it is a non-cooperative (Markovian Nash) equilibrium for the non-cooperative sequential game defined by these infinitely many t-coalitions.

In this section, we tackle the problem of maximizing

$$ J^c= \sum_{i=1}^{N} \lambda_i(x_t,t) \int_t^T d_i(s,t) L_i \bigl(x(s),u_1(s), \ldots,u_N(s),s \bigr) ds $$
(8)

subject to (6), where \(\lambda_i(x_t,t)\geq 0\) for every i=1,…,N, and \(\sum_{i=1}^{N} \lambda _{i}(x_{t},t)=N\). The coefficients \(\lambda_i(x_t,t)\) represent the bargaining power of agent i at time t.

Note that, in general, there are two sources of time-inconsistency in Problem (8). First, there is the time-consistency problem related to the changing time preferences of the different t-coalitions, as discussed in the previous paragraphs. In addition, if players have not committed themselves to cooperate at every instant of time t, a problem of dynamic inconsistency or time-inconsistency can arise, independently of the form of the discount function: players may initially agree on a cooperative solution that is advantageous for all of them, yet later on it may become profitable for some of them to deviate from the cooperative behavior. Haurie (1976) proved that the extension of the Nash bargaining solution to differential games is typically not dynamically consistent. We refer to Zaccour (2008) for a recent review of the topic. For the case of transferable utilities, if the agents can redistribute the joint payoffs in any period, Petrosyan proposed in a series of papers a payoff distribution procedure in order to solve this problem of dynamic inconsistency (see e.g. Yeung and Petrosyan 2006; Petrosyan and Zaccour 2003, and references therein).

In De Paz et al. (2013) this issue of dynamic consistency (related to the stability of the whole coalition) was not considered. In that paper it was assumed that weights are given and constant, and that agents commit themselves to cooperate at every instant of time t. There are several problems in which this seems a rather reasonable assumption, since players necessarily cooperate. Consider, for instance, the intrapersonal problem of a decision maker who must allocate her money among different goods that she values in different ways (different utility functions and different degrees of impatience, i.e. discount rates). Similarly, there is the problem of a family whose members take consumption decisions according to different preferences. There are also problems in which it is always profitable to cooperate, because players who do not cooperate obtain nothing. For this kind of problem, in which cooperation is guaranteed, equilibria were computed by finding subgame perfect equilibria in a noncooperative sequential game whose players are the different t-coalitions (representing, for instance, different generations). In general, however, the sustainability of cooperation cannot be ensured. For instance, in a discrete time setting, Breton and Keoula (2014) illustrated how, in a simple model of management of a renewable natural resource, if players apply different discount rates and have equal weights, the sustainability of cooperation is lost. If utilities are transferable, payoff (imputation) distribution procedures can be introduced in order to guarantee the stability of the whole coalition; this method extends easily to the problem with asymmetric players and time-inconsistent preferences, as in the case of differential games with time-distance nonconstant discounting (see Marín-Solano and Shevkoplyas 2011).

If utilities are not transferable, we refer to Yeung and Petrosyan (2006) for a study, in some models, of constant weights guaranteeing the dynamic consistency of the whole coalition. Not surprisingly, they found that there are problems in which such constant weights do not exist. Sorger (2006) proposed, in a multiperiod (discrete time) setting with two asymmetric players, a recursive Nash bargaining solution which gives rise to a dynamically consistent equilibrium, by assuming that weights are bargained at each period of time and are therefore state-dependent. In this chapter we depart from the model in De Paz et al. (2013) and consider the possibility that weights depend, in general, on the moment t at which the decision is taken, and also on the current state \(x_t\). Hence, at time t, given the initial state \(x_t\), and knowing what the equilibrium decision rule of future s-agents, s>t, will be, the members of the coalition choose their decision rule and bargain their current weights in the coalition. Since the equilibrium rule of future s-agents depends on the changing preferences and also on the changing weights, the members of the coalition take this information into account when deciding at time t. Not surprisingly, in our model, in order to guarantee the sustainability of cooperation, the weights \(\lambda_i\) of the players in the whole coalition should in general be nonconstant, emerging instead from a bargaining procedure at every time t. This applies also to the problem with constant and equal discount rates of time preference. As noted in the Introduction, the price to pay if weights are assumed to be of the form \(\lambda_i(x,t)\) is that the solution obtained by applying standard optimal control techniques is time-inconsistent even in the case of equal and constant discount rates; hence the problem must always be solved as one with time-inconsistent preferences.

Let us first briefly analyze the problem in which the weights \(\lambda_i\) of all players in the whole coalition are constant. The objective of the whole coalition is then to find a time-consistent solution to the problem of maximizing

$$ J^c=\sum_{i=1}^N \lambda_i \int_t^T d_i(s,t) L_i \bigl(x(s),u_1(s), \ldots,u_N(s),s \bigr) ds $$
(9)

subject to (6). As we prove later, for this problem, if \(V_{i}^{c}(x,t)\), i=1,…,N, is a set of functions, continuously differentiable in (x,t), characterizing the value functions of the agents in the problem, then the decision rule \((u_{1}^{c},\ldots ,u_{N}^{c})=(\phi_{1}^{c}(x,t),\ldots,\phi_{N}^{c}(x,t))\) solving

$$\max_{\{u_1,\ldots,u_N\}} \Biggl\{ \sum_{i=1}^N \lambda_i L_i (x,u_1,\ldots,u_N,t)+ \sum_{i=1}^N \lambda_i \nabla_x V_i^c(x,t) g(x,u_1, \ldots,u_N,t) \Biggr\} $$

with

$$ V_i^c(x,t)= \int_t^T d_i(s,t) L_i\bigl(x(s),\phi_1^c \bigl(x(s),s\bigr),\ldots,\phi _N^c\bigl(x(s),s\bigr),s\bigr) ds, $$
(10)

for every i=1,…,N, is a (time-consistent) Markov Perfect Equilibrium for the problem with partial cooperation (9). The extension to the infinite horizon problem is straightforward.

Next, let us consider Problem (8). If \(u_{i}^{c}(s)=\phi_{i}^{c}(x(s),s)\), i=1,…,N, is the equilibrium rule, then the joint value function is

$$ V^c(x,t)= \sum_{i=1}^N \lambda_i(x,t) \int_t^\tau d_i(s,t) L_i\bigl(x(s),\phi _1^c \bigl(x(s),s\bigr),\ldots,\phi_N^c\bigl(x(s),s\bigr),s\bigr) ds. $$
(11)

The planning horizon τ can be finite or infinite. We assume that, if τ=∞, along the equilibrium rule, the value function (11) is finite (i.e. the integral converges). This is guaranteed if we restrict our attention to convergent policies (along the equilibrium rule the state variables converge to a stationary state). Hence we have:

Theorem 1

If there exists a set of value functions \(V_i^c\) of class \(C^1\) solving the N integral equations (10), where

$$\begin{aligned} \bigl(u_1^c,\ldots,u_N^c \bigr) =&\bigl(\phi_1^c(x,t),\ldots,\phi_N^c(x,t) \bigr) \\ =& \arg\max_{\{ u_1,\ldots,u_N\} } \Biggl\{ \sum_{i=1}^N \lambda _i(x,t) \bigl( L_i(x,u_1, \ldots,u_N,t) \\ &{}+ \nabla_x V_i^c(x,t) g(x,u_1,\ldots,u_N,t) \bigr) \Biggr\} , \end{aligned}$$
(12)

and there exists a unique absolutely continuous curve x:[0,τ]→X solution to \(\dot{x}(t) = g(x(t),\phi_{1}^{c}(x(t),t),\ldots,\phi_{N}^{c}(x(t),t))\), x(0)=x 0, then \((u_{1}^{c},\ldots,u_{N}^{c})=(\phi_{1}^{c}(x,t),\ldots , \phi _{N}^{c}(x,t))\) is an equilibrium rule for Problem (8), in the sense that it satisfies Definition 1.

Proof

According to Definition 1, for ε>0, let us consider the variations

$$u_i^\varepsilon (s)= \begin{cases} v_i(s) & \mbox{if } s\in[t,t+\varepsilon ], \\ \phi_i^c(x,s) & \mbox{if } s>t+\varepsilon , \end{cases} $$

for i=1,…,N. In the following, we denote u=(u 1,…,u N ), \(u^{\varepsilon }=(u_{1}^{\varepsilon },\ldots,u_{N}^{\varepsilon })\), v=(v 1,…,v N ) and \(\phi^{c}(x,t)=(\phi^{c}_{1}(x,t),\ldots,\phi^{c}_{N}(x,t))\). Let

$$V_i^\varepsilon (x,t) = \int_t^{\tau} d_i(s,t)L_i\bigl(x^\varepsilon (s),u^\varepsilon (s),s\bigr) ds, $$

where x ε(s) denotes the state trajectory obtained from equation (6) when the decision rule u ε(s) is applied. By definition,

$$\begin{aligned} &V^c(x,t)-V^\varepsilon (x,t) \\ &\quad= \sum_{i=1}^N \lambda_i(x,t) \bigl[ V_i^c(x,t)-V_i^\varepsilon (x,t) \bigr] \\ &\quad= \sum_{i=1}^N \lambda_i(x,t) \biggl[ \int_t^{t+\varepsilon } d_i(s,t) \bigl[ L_i\bigl(x(s),\phi^c\bigl(x(s),s\bigr),s\bigr) - L_i\bigl(x^\varepsilon (s),v(s),s\bigr) \bigr] ds \\ &\qquad{}+ \int_{t+\varepsilon }^\tau d_i(s,t) \bigl[ L_i\bigl(x(s),\phi^c\bigl(x(s),s\bigr),s\bigr) - L_i\bigl(x^\varepsilon (s),\phi^c\bigl(x^\varepsilon (s),s \bigr),s\bigr) \bigr] ds \biggr]. \end{aligned}$$

Note that

$$\begin{aligned} &\int_{t+\varepsilon }^\tau d_i(s,t) L_i\bigl(x(s),\phi^c\bigl(x(s),s\bigr),s\bigr) ds \\ &\quad = V_i^c\bigl(x(t+\varepsilon ),t+\varepsilon \bigr) \\ &\qquad{}- \int_{t+\varepsilon }^\tau \bigl[ d_i(s, t+\varepsilon ) - d_i(s,t) \bigr] L_i \bigl(x(s),\phi^c\bigl(x(s),s\bigr),s\bigr) ds. \end{aligned}$$

In a similar way,

$$\begin{aligned} &\int_{t+\varepsilon }^\tau d_i(s,t) L_i\bigl(x^\varepsilon (s),\phi^c\bigl(x^\varepsilon (s),s\bigr),s\bigr) ds \\ &\quad= V_i^c\bigl(x^\varepsilon (t+\varepsilon ),t+ \varepsilon \bigr) \\ &\qquad{}- \int_{t+\varepsilon }^\tau \bigl[ d_i(s, t+\varepsilon ) - d_i(s,t) \bigr] L_i \bigl(x^\varepsilon (s),\phi^c\bigl(x^\varepsilon (s),s\bigr),s \bigr) ds. \end{aligned}$$

Therefore,

$$\begin{aligned} &V^c(x,t)-V^\varepsilon (x,t) \\ &\quad= \sum_{i=1}^N \lambda_i(x,t) \biggl[ \int_t^{t+\varepsilon } d_i(s,t) \bigl[ L_i\bigl(x(s),\phi^c \bigl(x(s),s\bigr),s\bigr) - L_i\bigl(x^\varepsilon (s),v(s),s\bigr) \bigr] ds \\ &\qquad{} + V_i^c\bigl(x(t+\varepsilon ),t+\varepsilon \bigr) - V_i^c\bigl(x^\varepsilon (t+\varepsilon ),t+\varepsilon \bigr) \\ &\qquad{}+ \int_{t+\varepsilon }^\tau \bigl[d_i(s,t)- d_i(s,t+\varepsilon ) \bigr] \\ &\qquad{}\times\bigl[ L_i\bigl(x(s),\phi^c\bigl(x(s),s\bigr),s\bigr) - L_i\bigl(x^\varepsilon (s),\phi^c\bigl(x^\varepsilon (s),s\bigr),s\bigr) \bigr] ds \biggr]. \end{aligned}$$

Hence,

$$\lim_{\varepsilon \to0^+} \frac{V^c(x,t)-V^\varepsilon (x,t)}{\varepsilon } = (A) + (B) + (C), $$

where

$$\begin{aligned} (A) = & \lim_{\varepsilon \to0^+} \frac{1}{\varepsilon } \Biggl( \sum _{i=1}^N \lambda_i(x,t) \int _t^{t+\varepsilon } d_i(s,t) \\ &{}\times \bigl[ L_i\bigl(x(s),\phi^c\bigl(x(s),s \bigr),s\bigr) - L_i\bigl(x^\varepsilon (s),v(s),s\bigr) \bigr] ds \Biggr) \\ = & \sum_{i=1}^N \lambda_i(x,t) \bigl[ L_i\bigl(x,\phi^c(x,t),t\bigr) - L_i(x,v,t) \bigr], \\ (B) = & \lim_{\varepsilon \to0^+} \frac{1}{\varepsilon } \Biggl( \sum _{i=1}^N \lambda_i(x,t) \bigl[ V_i^c \bigl(x(t+\varepsilon ),t+\varepsilon \bigr) - V_i^c \bigl(x^\varepsilon (t+\varepsilon ),t+\varepsilon \bigr) \bigr] \Biggr) \\ = & \sum_{i=1}^N \lambda_i(x,t) \bigl[ \nabla_x V_i^c(x,t) \bigl( g\bigl(x, \phi^c(x,t),t\bigr) - g(x,v,t) \bigr) \bigr], \end{aligned}$$

and

$$\begin{aligned} (C) = & \lim_{\varepsilon \to0^+} \frac{1}{\varepsilon } \Biggl( \sum _{i=1}^N \lambda_i(x,t) \biggl[ \int_{t+\varepsilon }^\tau \bigl[d_i(s,t)- d_i(s,t+\varepsilon ) \bigr] \\ &{}\times \bigl[ L_i\bigl(x(s),\phi^c\bigl(x(s),s \bigr),s\bigr) - L_i\bigl(x^\varepsilon (s),\phi^c \bigl(x^\varepsilon (s),s\bigr),s\bigr) \bigr] ds \biggr] \Biggr) \\ = & 0. \end{aligned}$$

Summarizing,

$$\begin{aligned} &\lim_{\varepsilon \to0^+} \frac{V^c(x,t)-V^\varepsilon (x,t)}{\varepsilon } \\ &\quad = \sum _{i=1}^N \lambda_i(x,t) \bigl[ \bigl( L_i\bigl(x,\phi^c(x,t),t\bigr) + \nabla _x V_i^c(x,t) g\bigl(x,\phi^c(x,t),t\bigr) \bigr) \\ & \qquad{} - \bigl( L_i(x,v,t)+\nabla_x V_i^c(x,t) g(x,v,t) \bigr) \bigr] \\ &\quad\geq 0 \end{aligned}$$

and the result follows: by (12), \(\phi^c(x,t)\) maximizes \(\sum_{i=1}^N \lambda_i(x,t) [ L_i(x,v,t)+\nabla_x V_i^c(x,t) g(x,v,t) ]\) over v, so the last expression is nonnegative. □

Remark 1

It is important to realize that, unless \(d_i(s,t)=\alpha(t)\beta_i(s)\) (or, in particular, \(d_i(s,t)=e^{-\rho(s-t)}\), i.e. all players discount the future by using the same constant discount rate of time preference) and the weights \(\lambda_i\) are constant, for i=1,…,N, the time-consistent solution provided by condition (12) in Theorem 1 differs from that obtained by applying the classical Pontryagin maximum principle (or the Hamilton–Jacobi–Bellman equation) to the problem of maximizing (8) subject to (6) from the viewpoint of the time preferences of all players at time t=0.

4 Examples

In this section we illustrate our results with two simple models coming from the field of environmental and resource economics. In the first example we solve a common property resource model studied in Clemhout and Wan (1985) with time-distance nonconstant discounting. For this model we compute both the Markovian Nash equilibria and the time-consistent equilibria with partial cooperation, restricting our attention to the particular case of constant weights. The second example is a linear state pollution differential game whose equilibria are state-independent. Although this is not an appealing property from an economic viewpoint, it has the advantage that, when computing time-consistent equilibria with partial cooperation and nonconstant weights for the different players, it is rather natural to restrict attention to time-dependent but state-independent weights. In this case we are able to derive explicit formulae for time-consistent equilibria with arbitrary weights.

4.1 A Common Property Resource Game

Let us consider the problem of exploitation of a renewable natural resource in which, if x(t) represents the stock of natural resource at time t, and h i (t) is the harvest rate at time t of player i, for i=1,…,N, the state dynamics is described by the equation

$$ \dot{x}(s) = x(s) \bigl(a-b \ln x(s)\bigr) - \sum _{i=1}^N h_i(s), \quad x(t)=x_t. $$
(13)

Players have logarithmic instantaneous utility functions depending just on their harvest rates, and they discount the future according to different distance-based nonconstant discount rates of time preference. Hence, the intertemporal utility function for player i is given by

$$J_i=\int_t^\infty \theta_i(s-t) \ln h_i(s) ds. $$

4.1.1 Noncooperative Markovian Nash Equilibrium

If players do not cooperate, let us look for stationary strategies. According to the results in Sect. 2, player i looks for the solution to

$$\max_{\{ h_i\}} \biggl\{ \ln h_i + \bigl(V_i^{nc}(x)\bigr)^{\prime} \biggl[x(a-b \ln x)-h_i-\sum_{j\neq i} \phi_j^{nc}(x) \biggr] \biggr\} , $$

where \(\phi_{j}^{nc}(x)\), j=1,…,N, denotes the equilibrium strategy of player j in feedback form. Hence \({h_{i}=((V_{i}^{nc}(x))^{\prime})^{-1}}\). We look for a value function of the form \(V_{i}^{nc}(x)=\alpha_{i}^{nc} \ln x + \beta_{i}^{nc}\), for i=1,…,N. Then \({h_{i}^{nc}=\phi _{i}^{nc}(x)=(\alpha_{i}^{nc})^{-1} x}\). By substituting in equation (13) and solving we obtain

$$x(s)=\exp \biggl[ \biggl( \ln x_t + \frac{\sum_{j=1}^N {1}/{\alpha _j^{nc}}-a}{b} \biggr) e^{-b(s-t)} + \frac{a-\sum_{j=1}^N {1}/{\alpha _j^{nc}}}{b} \biggr]. $$

Hence,

$$\ln \phi_i\bigl(x(s)\bigr) = e^{-b(s-t)} \ln x_t + \frac{a-\sum_{j=1}^N {1}/{\alpha_j^{nc}}}{b} \bigl( 1-e^{-b(s-t)} \bigr) - \ln \alpha _i^{nc}, $$

for every i=1,…,N. Therefore, since

$$V_i^{nc}(x)= \int_t^\infty \theta_i(s-t) \ln \phi_i^{nc}\bigl(x(s)\bigr) ds, $$

then

$$\begin{aligned} \alpha_i^{nc} \ln x+\beta_i^{nc} = & \biggl[\int_t^\infty\theta _i(s-t) e^{-b(s-t)} ds \biggr] \ln x \\ &{}+\frac{a-\sum_{j=1}^N {1}/{\alpha_j^{nc}}}{b}\int_t^\infty\theta _i(s-t)\bigl[1-e^{-b(s-t)}\bigr] ds \\ &{}- \ln \alpha_i^{nc} \int_t^\infty \theta_i(s-t) ds. \end{aligned}$$

By simplifying we obtain

$$\begin{aligned} \alpha_i^{nc} = & \int_0^\infty \theta_i(s) e^{-bs} ds, \\ \beta_i^{nc} = & \frac{1}{b} \Biggl( a-\sum _{j=1}^N\frac{1}{\int_0^\infty\theta_j (s) e^{-bs} ds} \Biggr) \int _0^\infty\theta _i(s) \bigl[1-e^{-bs}\bigr] ds \\ &{}- \ln \biggl( \int_0^\infty \theta_i(s) e^{-bs} ds \biggr) \int_0^\infty \theta_i(s) ds \end{aligned}$$

and

$$h_i^{nc}(x)=\frac{x}{\int_0^\infty\theta_i(s) e^{-bs} ds}, $$

for i=1,…,N.
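
The two formulas above are easy to evaluate and cross-check numerically. The sketch below (with growth parameters and generalized hyperbolic discount functions \(\theta_i(s)=(1+k_is)^{-m_i/k_i}\) chosen purely for illustration) computes the coefficients \(\alpha_i^{nc}\) by quadrature and verifies the closed-form stock path against a direct integration of (13) under the feedback rules:

```python
import numpy as np
from scipy.integrate import quad, solve_ivp

# Illustrative inputs (our own choices): growth parameters of (13) and
# generalized hyperbolic discount functions theta_i(s) = (1+k s)^(-m/k).
a, b = 1.0, 0.5
def theta(k, m):
    return lambda s: (1.0 + k * s) ** (-m / k)
thetas = [theta(0.2, 0.3), theta(0.5, 1.0)]

# alpha_i^{nc} = int_0^infty theta_i(s) e^{-bs} ds, as derived above.
alphas = np.array([quad(lambda s: th(s) * np.exp(-b * s), 0, np.inf)[0]
                   for th in thetas])
print("alpha^nc:", alphas, "-> harvest rules h_i(x) = x / alpha_i")

# Cross-check the closed-form path of x(s) against direct integration
# of (13) with the feedback rules h_i = x / alpha_i substituted in.
x_t = 1.5
rhs = lambda s, x: x * (a - b * np.log(x)) - x * np.sum(1.0 / alphas)
sol = solve_ivp(rhs, (0.0, 10.0), [x_t], dense_output=True,
                rtol=1e-10, atol=1e-12)
c = (np.sum(1.0 / alphas) - a) / b
x_closed = lambda s: np.exp((np.log(x_t) + c) * np.exp(-b * s) - c)
for s in (1.0, 5.0, 10.0):
    print(s, sol.sol(s)[0], x_closed(s))   # the two columns should agree
```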

4.1.2 Time-Consistent Equilibrium with Partial Cooperation

Next, let us compute the time-consistent equilibria in the case where the players at every time t decide to cooperate among themselves, but coalitions taking decisions at different times do not cooperate. We restrict our attention to stationary strategies, and weights are assumed to be constant. According to Theorem 1, we look for the solution to

$$\begin{aligned} &\max_{\{ h_1,\ldots,h_N\}} \Biggl\{ \sum_{j=1}^N \lambda_j \ln h_j \\ &\quad{}+ \Biggl( \sum _{i=1}^N \lambda_i \bigl(V_i^{c}(x) \bigr)^{\prime} \Biggr) \Biggl[x(a-b \ln x)-\sum_{j=1}^N h_j \Biggr] \Biggr\} . \end{aligned}$$

Therefore, \({h_{j}^{c}=\lambda_{j} (\sum_{i=1}^{N} \lambda_{i} (V_{i}^{c}(x))^{\prime})^{-1}}\). We look for a set of value functions of the form \(V_{i}^{c}(x)=\alpha_{i}^{c} \ln x + \beta_{i}^{c}\), for i=1,…,N. Then

$$h_j^c=\phi_j^c(x)=\frac{\lambda_j x}{\sum_{i=1}^N \lambda_i\alpha_i^c}. $$

By substituting in equation (13) and solving we obtain

$$\begin{aligned} x(s) =&\exp \biggl[ \biggl( \ln x_t + \frac{\sum_{j=1}^N\lambda_j - a\sum_{j=1}^N\lambda_j\alpha_j^c}{b\sum_{j=1}^N\lambda_j\alpha _j^c} \biggr) e^{-b(s-t)} \\ &{}-\frac{\sum_{j=1}^N\lambda_j - a\sum_{j=1}^N\lambda_j\alpha_j^c}{b\sum_{j=1}^N\lambda_j\alpha_j^c} \biggr]. \end{aligned}$$

Hence, proceeding as in the noncooperative case, we easily obtain

$$ \begin{aligned} \alpha_i^c \ln x+\beta_i^c ={} & \biggl[\int_t^\infty\theta_i(s-t) e^{-b(s-t)} ds \biggr] \ln x \\ &{}+ \frac{a\sum_{j=1}^N\lambda_j\alpha_j^c-\sum_{j=1}^N\lambda_j}{b\sum_{j=1}^N\lambda_j\alpha_j^c}\int_t^\infty \theta_i(s-t)\bigl[1-e^{-b(s-t)}\bigr] ds \\ &{}- \ln \biggl( \frac{\sum_{j=1}^N \lambda_j \alpha_j^c}{\lambda_i} \biggr) \int_0^\infty \theta_i(s) ds. \end{aligned} $$

By simplifying we obtain

$$\begin{aligned} \alpha_i^c = & \int_0^\infty \theta_i(s) e^{-bs} ds, \\ \beta_i^c = & \frac{a\sum_{j=1}^N\lambda_j\int_0^\infty\theta_j(s) e^{-bs} ds - \sum_{j=1}^N\lambda_j}{b\sum_{j=1}^N\lambda_j\int_0^\infty\theta_j(s) e^{-bs} ds} \int _0^\infty\theta_i(s) \bigl[1-e^{-bs}\bigr] ds \\ &{} - \ln \biggl( \frac{\sum_{j=1}^N \lambda_j \int_0^\infty\theta _j(s) e^{-bs} ds}{\lambda_i} \biggr) \int_0^\infty \theta_i(s) ds \end{aligned}$$
(14)

and

$$h_i^c(x)=\frac{\lambda_i x}{\sum_{j=1}^N \lambda_j \int_0^\infty\theta _j(s) e^{-bs} ds}, $$

for i=1,…,N. Note that \(\alpha_{i}^{nc}=\alpha_{i}^{c}\), as in the case of constant and equal discount rates.
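
A quick numerical comparison of the two regimes, under the same assumed discount functions as in the previous sketch and equal constant weights, illustrates the conservationist effect of cooperation: by the AM-HM inequality, the cooperative total \(Nx/\sum_j\alpha_j\) never exceeds the noncooperative total \(x\sum_j 1/\alpha_j\).

```python
import numpy as np
from scipy.integrate import quad

# Same illustrative discount functions as in the noncooperative sketch.
b = 0.5
def theta(k, m):
    return lambda s: (1.0 + k * s) ** (-m / k)
thetas = [theta(0.2, 0.3), theta(0.5, 1.0)]
alphas = np.array([quad(lambda s: th(s) * np.exp(-b * s), 0, np.inf)[0]
                   for th in thetas])

lam = np.array([1.0, 1.0])      # equal constant weights, summing to N = 2
x = 1.0
h_nc = x / alphas                        # noncooperative harvest rates
h_c = lam * x / lam.dot(alphas)          # partial-cooperation harvest rates
print("total harvest, noncooperative:", h_nc.sum())
print("total harvest, cooperative:   ", h_c.sum())
```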

4.2 A Linear State Differential Game of Pollution Control

As a second example, we consider the environmental problem studied in Jørgensen et al. (2003) where N countries (the players of the game) coordinate their pollution strategies to optimize their joint payoff. Let us denote by E i (t), for i=1,…,N, the emissions of country i at time t. The evolution of the stock of pollution S(t) is described by the differential equation

$$ \dot{S}(\tau) = \sum_{i=1}^N E_i(\tau)-\delta S(\tau), \quad S(0)=S_0, $$
(15)

where δ>0 represents the natural absorption rate of pollution. The emissions are assumed to be proportional to production, and hence the revenue from production can be expressed as a function of the emissions. In particular, the revenue function of country i is assumed to be logarithmic. The damage cost is a linear function of the stock of pollution. The intertemporal utility function for player i is given by

$$J_i=\int_t^\infty \theta_i(\tau-t) \bigl( \ln E_i(\tau) - \varphi_i S(\tau)\bigr) d\tau. $$

Next we compute both the noncooperative Markovian Nash equilibrium and the time-consistent equilibrium with partial cooperation.

4.2.1 Noncooperative Markovian Nash Equilibrium

In this case, player i solves

$$\max_{\{ E_i\}} \biggl\{ \ln E_i - \varphi_i S + \bigl(V_i^{nc}(S) \bigr)^{\prime}\biggl(E_i+\sum_{j\neq i} \phi_j^{nc}(S)-\delta S\biggr) \biggr\} , $$

where \(E_{j}^{nc}=\phi_{j}^{nc}(S)\) is the equilibrium rule. Then \(E_{i}^{nc}=(-(V_{i}^{nc})^{\prime}(S))^{-1}\). We look for a value function of the form \(V_{i}^{nc}(S)=\alpha_{i}^{nc} S+\beta_{i}^{nc}\). Then \(E_{i}^{nc}=\phi _{i}^{nc}=(-\alpha_{i}^{nc})^{-1}\). By substituting in (15) we obtain \(\dot{S}(\tau)= \sum_{j=1}^{N} (-\alpha^{nc}_{j})^{-1} - \delta S(\tau)\), whose solution with the initial condition S(t)=S t is

$$S(\tau) = e^{-\delta(\tau-t)} S_t-\sum_{j=1}^N \frac{1}{\delta\alpha _j^{nc}} \bigl( 1-e^{-\delta(\tau-t)} \bigr). $$

By identifying the value functions we obtain

$$\begin{aligned} \alpha_i^{nc} S + \beta_i^{nc} ={}& \int_t^\infty\theta_i( \tau-t) \bigl[ \ln \phi_i^{nc}\bigl(S(\tau )\bigr)- \varphi_i S(\tau) \bigr] d\tau \\ ={}& \int_t^\infty\theta_i( \tau-t) \Biggl[-\ln \bigl(-\alpha _i^{nc}\bigr) \\ &{}- \varphi_i \Biggl( e^{-\delta(\tau-t)}S - \sum _{j=1}^N\frac {1}{\delta\alpha_j^{nc}}\bigl(1-e^{-\delta(\tau-t)} \bigr) \Biggr) \Biggr] d\tau. \end{aligned}$$

By simplifying we obtain

$$\begin{aligned} \alpha_i^{nc} = & -\varphi_i\int _0^\infty\theta_i(\tau) e^{-\delta\tau } d\tau, \\ \beta_i^{nc} = & -\ln \biggl(\varphi_i\int _0^\infty\theta_i(\tau) e^{-\delta\tau} d\tau \biggr)\int_0^\infty \theta_i(\tau) d\tau \\ &{}- \frac{\varphi_i}{\delta}\sum_{j=1}^N \frac{1}{\varphi_j\int_0^\infty \theta_j(\tau) e^{-\delta\tau} d\tau} \biggl(\int_0^\infty \theta_i(\tau ) \bigl(1-e^{-\delta\tau}\bigr) d\tau \biggr) \end{aligned}$$

and the emission rule becomes

$$E_i^{nc}=\frac{1}{\varphi_i\int_0^\infty\theta_i(\tau)e^{-\delta\tau} d\tau}. $$
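
As in the resource game, the emission rule is a simple quadrature of the discount function; the sketch below evaluates it for two countries under assumed inputs (the decay rate, damage slopes and discount functions are all our own choices):

```python
import numpy as np
from scipy.integrate import quad

delta = 0.1                       # natural absorption rate (illustrative)
phis = np.array([0.5, 1.5])       # damage-cost slopes phi_i (illustrative)
def theta(k, m):                  # assumed hyperbolic discounting, as before
    return lambda s: (1.0 + k * s) ** (-m / k)
thetas = [theta(0.2, 0.3), theta(0.5, 1.0)]

# E_i^{nc} = 1 / (phi_i int_0^infty theta_i(tau) e^{-delta tau} dtau)
E_nc = np.array([1.0 / (phi * quad(lambda s: th(s) * np.exp(-delta * s),
                                   0, np.inf)[0])
                 for phi, th in zip(phis, thetas)])
print("noncooperative emissions:", E_nc)
# A steeper damage slope phi_i, or more patience (a larger value of the
# integral), lowers a country's stationary emission level.
```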

4.2.2 Time-Consistent Equilibrium with Partial Cooperation

Finally, let us compute the time-consistent equilibrium for the problem with partial cooperation. In contrast with the previous example on the management of a common property resource, we consider the case of nonconstant weights for this problem. Since the decision rule in linear state games is typically independent of the state variable (the pollution stock), it seems natural to restrict our attention to weights λ i (t), i.e. independent of the state variable. This simplification allows us to solve the model. The payoff of the grand coalition is given by

$$J^c=\sum_{j=1}^N \lambda_j^c(t) \int_t^\infty \theta_j(\tau-t) \bigl( \ln E_j(\tau)- \varphi_j S(\tau)\bigr) d\tau. $$

According to Theorem 1, we must solve

$$\max_{\{ E_1,\ldots,E_N\}} \Biggl\{ \sum_{j=1}^N \lambda_j^c(t) \Biggl[ \ln E_j - \varphi_j S + \bigl(V_j^c(S) \bigr)^\prime \Biggl(\sum_{i=1}^N E_i-\delta S \Biggr) \Biggr] \Biggr\} . $$

The equilibrium rule is given by

$$E_i=-\frac{\lambda_i(t)}{\sum_{j=1}^N\lambda_j(t) (V^c_j(S))^\prime}. $$

We look for a family of value functions of the form \(V_{j}(S)=\alpha _{j}^{c}(t) S + \beta_{j}^{c}(t)\), for j=1,…,N. Then the emission rules become

$$ E_i=-\frac{\lambda_i(t)}{\sum_{j=1}^N\lambda_j(t) \alpha_j^c(t)}. $$
(16)

By substituting (16) in (15) we obtain the linear differential equation

$$\dot{S}(\tau)= -\frac{\sum_{j=1}^N\lambda_j(\tau)}{\sum_{i=1}^N\lambda _i(\tau)\alpha_i^c(\tau)} - \delta S(\tau) = -\frac{N}{\sum_{i=1}^N\lambda _i(\tau)\alpha_i^c(\tau)} - \delta S(\tau), $$

whose solution with the initial condition S(t)=S t is given by

$$S(\tau) = e^{-\delta(\tau-t)} S_t - \int_t^\tau \frac{N e^{-\delta(\tau -z)}}{\sum_{i=1}^N \lambda_i(z)\alpha^c_i(z)} dz. $$

In order to compute the values of the (nonconstant) coefficients \(\alpha _{i}^{c}(t)\), \(\beta_{i}^{c}(t)\), proceeding as in the previous example, note that

$$\begin{aligned} \alpha_i^c(t) S + \beta_i^c(t) = & \int _t^\infty\theta_i(\tau-t) \bigl(\ln E_i(\tau)-\varphi_i S(\tau)\bigr) d\tau \\ =& \int_t^\infty\theta_i(\tau-t) \biggl[ \ln \biggl( -\frac {\lambda_i(\tau)}{\sum_{j=1}^N \lambda_j(\tau)\alpha_j^c(\tau)} \biggr) \\ &{}- \varphi_i \biggl( e^{-\delta(\tau-t)} S_t - \int_t^\tau\frac {N e^{-\delta(\tau-z)}}{\sum_{j=1}^N \lambda_j(z)\alpha_j^c(z)} dz \biggr) \biggr] d\tau. \end{aligned}$$

By identifying terms we obtain

$$\alpha_i^c(t)=-\varphi_i\int _0^\infty\theta_i(\tau) e^{-\delta\tau} d\tau $$

and

$$\begin{aligned} \beta_i^c(t) = & \int_t^\infty \theta_i(\tau-t) \biggl[ \ln \frac {\lambda_i(\tau)}{\sum_{j=1}^N\lambda_j(\tau)\varphi_j\int_0^\infty\theta_j(z) e^{-\delta z} dz} \\ &{} - \varphi_i\int_t^\tau \frac{N e^{-\delta(\tau -z)}}{\sum_{j=1}^N\lambda_j(z)\varphi_j\int_0^\infty\theta_j(s)e^{-\delta s} ds} dz \biggr] d\tau. \end{aligned}$$

From (16), the emission rule of country i is given by

$$E_i^c(t)=\frac{\lambda_i(t)}{\sum_{j=1}^N\lambda_j(t)\varphi_j\int_0^\infty\theta_j(\tau)e^{-\delta\tau} d\tau}. $$

For instance, in the case of a constant and common discount rate for all players but nonconstant weights, \(\theta_i(\tau)=e^{-\rho\tau}\), the emissions of country i become

$$E_i^c(t)=\frac{(\rho+\delta)\lambda_i(t)}{\sum_{j=1}^N\varphi_j\lambda _j(t)}. $$
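
To see how bargained weights reallocate emissions in this special case, the following sketch evaluates the formula above along a hypothetical weight path \(\lambda_i(t)\) (any nonnegative path summing to N would do; the parameter values are also assumptions of ours). Note that the damage-weighted total \(\sum_j \varphi_j E_j^c(t)=\rho+\delta\) is invariant to the weights; bargaining only redistributes emissions across countries.

```python
import numpy as np

rho, delta = 0.03, 0.1            # common discount and decay rates (assumed)
phis = np.array([0.5, 1.5])       # damage-cost slopes (assumed)

def weights(t):
    w1 = 1.0 + 0.5 * np.exp(-0.2 * t)  # player 1's weight decays toward 1
    return np.array([w1, 2.0 - w1])    # N = 2, weights always sum to 2

for t in (0.0, 5.0, 20.0):
    lam = weights(t)
    E = (rho + delta) * lam / phis.dot(lam)
    print(f"t={t}: emissions {E}, damage-weighted total {phis.dot(E):.3f}")
```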

In Jørgensen et al. (2003), parameter conditions ensuring the time consistency of the coalition (so that the payoffs obtained under cooperation remain higher than the payoffs under noncooperation) were established for asymmetric players. By introducing nonconstant weights obtained from a bargaining procedure at every instant t (using, e.g., the Nash bargaining solution), a time-consistent solution guaranteeing the stability of the coalition can be found.

5 Conclusions

In this chapter, differential games with time-inconsistent preferences generated by the introduction of general (not necessarily time-distance) discount functions have been studied. Both the noncooperative setting and a framework with partial cooperation are analyzed, and the corresponding dynamic equations for the derivation of time-consistent equilibria are obtained. In order to guarantee the stability of the grand coalition, nonconstant weights are introduced, so that players can bargain their weight in the grand coalition at every instant of time. In particular, the introduction of nonconstant weights provides a way to construct dynamically consistent solutions guaranteeing the stability of the grand coalition in problems in which players discount the future by using constant (and not necessarily different) discount rates. The price to be paid is that the use of nonconstant (time and/or state dependent) weights induces time-inconsistent preferences. The results in the chapter are illustrated with two simple examples coming from the field of environmental and resource economics. In the first example, a simple common access resource game is studied by introducing heterogeneous and time-distance nonconstant discount rates; weights of players in the problem with partial cooperation are assumed to be constant. The second example analyzes a linear state pollution differential game with the same kind of time preferences; for this problem, nonconstant weights in the problem with partial cooperation are introduced.