
1 Introduction

Differential game theory extends optimal control theory to situations with more than one decision maker. The controls of each decision maker affect the development of the state of the system, given by a set of differential equations. The objectives of the decision makers are integrals of functions that depend on the state of the system, so that the controls of each decision maker affect the objectives of the other decision makers and the problem turns into a game. A differential game is a natural framework of analysis for many problems in environmental and resource economics. Usually these problems extend over time and have externalities in the sense that the actions of one agent affect welfare of the other agents. For example, emissions of all agents accumulate into a pollution stock and this pollution stock is damaging to all agents. Or, resource extractions by all agents decrease a resource stock and the availability of the resource affects welfare of all agents.

It is logical to identify the pollution stock and the resource stock with the state of the system but one needs to be careful here. In systems theory, the state of the system is a memory concept, comparable to a sufficient statistic. It contains sufficient information from the history of the system, at some point in time, to be able to predict the future of the system, under given inputs. This indeed applies to the pollution stock and the resource stock. However, the controls in the Nash equilibrium of the differential game usually differ when they are conditioned on time only, on the current state of the system or on the history of the state of the system. These possible interactions between the controls and the state complicate the conceptual framework. This chapter will first clarify this issue that was already presented in the early stages of the development of differential game theory (Starr and Ho 1969; Başar and Olsder 1982).

Most theory and applications are restricted to linear-quadratic differential games where the state transition is linear and the functions under the objective integral are quadratic. In this case the controls are linear and the value functions are quadratic but again one needs to be careful here. It can happen that non-linear equilibria of the game exist as well (Dockner and Van Long 1993). Moreover, the assumption that the state transition is linear is fine for most standard economic problems with capital accumulation, for example, or for the simple environmental and resource problems mentioned above. However, ecological systems are usually non-linear and if the resource is an ecological system, a differential game with a non-linear constraint has to be solved. For example, a lake reacts in a non-linear way to the release of phosphorus on the lake, so that in case of many users a stock externality occurs that develops in a non-linear way.

An important reason for analyzing differential games in environmental and resource economics is to assess the “tragedy of the commons” (Hardin 1968). The tragedy of the commons means that if a common resource is accessible to all agents and is not jointly managed, it will be overused and joint welfare will not be optimal. It would be collectively rational to use less but in that situation there are individual incentives to use more. One can say that the non-cooperative Nash equilibrium of the differential game characterizes the situation without optimal management. By comparing this to the cooperative outcome of the game, the welfare losses can be assessed. Note, however, that the non-cooperative Nash equilibrium of the differential game is not unique. It depends on the information structure or, more specifically, on whether the controls can be conditioned on the state of the system or not. It also depends on the level of commitment the agents can make with respect to their future strategies. This positions the analysis in what is usually called the “Nash program”: a fundamental question in game theory is whether equilibria exist that mimic the cooperative outcome because this combines the highest joint welfare with the stability properties of the equilibrium in the sense that no agent has an incentive to deviate. In that case the players would only have to coordinate on the proper Nash equilibrium.

If equilibrium behavior is not jointly optimal, it can be corrected by taxes or other policies. In principle, a fully corrective mechanism usually exists but this mechanism may be too complicated or too costly to implement. The general question becomes how close one can get to the cooperative outcome with a non-cooperative Nash equilibrium that is based on a realistic set of strategies for the individual agents and a realistic policy, mandated to a policy maker. The main purpose of this chapter is to show how far we can go in case of two typical examples, with the available techniques in differential games.

This chapter will start with an overview of the most important concepts and techniques of differential games. Then two examples in environmental and resource economics will be elaborated. The first one is the game of international pollution control where countries emit, for example, greenhouse gases into a concentration level of pollutants. With some linear natural degradation, the state transition is linear. There are benefits of emissions in a one-to-one relationship with production and there is damage of the stock of pollutants. If these costs and benefits are approximated by quadratic functions, the differential game is a linear-quadratic one. The second example is the lake game with many users. The state is again the stock of pollutants which is now the total amount of phosphorus sequestered in algae. This pollution stock, however, develops in a non-linear way which will be explained below. Such a non-linear differential game with a one-dimensional state can still be solved but requires numerical techniques.

Section 2 will give an overview of the most important concepts and techniques of differential games. Section 3 will analyze the game of international pollution control as an example of a linear-quadratic differential game. Section 4 will analyze the lake game as an example of a non-linear differential game. Section 5 concludes.

2 Concepts and Techniques of Differential Games

It may be fair to say that Bellman’s principle of optimality is one of the most important concepts in dynamic optimization theory. It simply states that at each point in time the remainder of an optimal control path is still optimal when starting at the state of the system that is reached by implementing the optimal control path up to that point in time. The proof is easy. If another future control path were better, replacing the remainder of the original optimal control path with this one would lead to a control path that is better than the original one, so that the original one cannot be optimal. The principle of optimality is simple but powerful because it allows a dynamic optimization problem to be solved backwards in time, and it yields the technique of dynamic programming and, more specifically, the Hamilton–Jacobi–Bellman equation for the value function of the optimal control problem. Therefore, it was remarkable to find that the equivalent does not hold in differential games.

2.1 Example from 1969

This issue was first put forward by Starr and Ho (1969) in an instructive example. The example is not a differential game: it has only two periods and two actions for each player, but it provides insights that are relevant for the sequel. There are two players, each with controls 0 and 1. The game starts in the state \(x_{0}\) and the state transition is such that the control pair (1,0) leads to the state \(x_{1}\), with costs −1 for player 1 and costs 1 for player 2. The control pairs (0,0) and (1,1) both lead to \(x_{2}\), with respective costs (2,2) and (5,0), and the control pair (0,1) leads to \(x_{3}\), with costs (2,2). In the second period one of three bi-matrix games is played, depending on whether the state \(x_{1}\), \(x_{2}\) or \(x_{3}\) has been reached. These games are given by Table 1. The actions of player 1 are represented by the rows of the matrices and the actions of player 2 by the columns. The entries represent the respective costs of player 1 and player 2, and the objective of the players is to minimize costs.

The game over both periods can be solved in two ways. First, the game can be written as one bi-matrix game (see Table 2) in the four possible strategies of the players over two periods (00, 01, 10 and 11), by adding up the costs in the first period (given in the text above) and the costs in the second period (given by the three bi-matrices in Table 1). The Nash equilibrium of this game is (11,00) with total costs (3,2).

Second, the game can be solved backwards in time. The Nash equilibria in the second period are (1,0) in \(x_{1}\) with costs (4,1), (1,1) in \(x_{2}\) with costs (0,3), and (0,0) in \(x_{3}\) with costs (2,2). By adding up the costs in the first period and the Nash equilibrium costs in the second period, we get a bi-matrix game in the first period that is shown in Table 3. The Nash equilibrium of this game is (0,1) with total costs (4,4).

Table 1 The bi-matrix games in the second period
Table 2 One bi-matrix game in the four possible strategies
Table 3 The bi-matrix game in the first period

The two Nash equilibria are different. The Nash equilibrium of the game in strategies over the two periods is (11,00). The continuation of these strategies (1,0), starting in the state \(x_{1}\) that is reached by implementing the controls (1,0) in the first period, is also a Nash equilibrium of the remainder of the game (time-consistency). However, this does not imply that this Nash equilibrium can be found by dynamic programming, backwards in time! The reason is that if the players realize that the controls are reconsidered in the second period, conditioned on the state that has been reached, they have incentives to deviate. Player 1 wants to play 0 in the first period, leading to state \(x_{2}\) and Nash equilibrium (1,1) with total costs 2 for player 1 that are lower than 3. However, then the total costs of player 2 are 5 and therefore this player wants to play 1 in the first period, leading to state \(x_{3}\) and Nash equilibrium (0,0) with total costs 4 for player 2 that are lower than 5. This results in the other Nash equilibrium with total costs (4,4). There are two mechanisms at work here: one is that the players are not committed to a control in the second period and the other one is that the players can condition their controls on the state of the system. Note that these additional options lead to higher total costs in the Nash equilibrium!
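The backward solution can be reproduced with a few lines of code. The following is a minimal sketch that uses only the numbers quoted in the text (the first-period costs and the second-period equilibrium costs per state). The first-period game it builds is computed from those numbers rather than copied from Table 3, and the function names are illustrative, not taken from Starr and Ho (1969).

```python
from itertools import product

# First-period control pair (u1, u2) -> (state reached, (cost player 1, cost player 2)),
# as stated in the text.
first_period = {
    (1, 0): ("x1", (-1, 1)),
    (0, 0): ("x2", (2, 2)),
    (1, 1): ("x2", (5, 0)),
    (0, 1): ("x3", (2, 2)),
}

# Costs of the second-period Nash equilibrium in each state, as quoted in the text.
continuation = {"x1": (4, 1), "x2": (0, 3), "x3": (2, 2)}


def total_costs(u1, u2):
    """First-period costs plus equilibrium costs-to-go in the state that is reached."""
    state, (c1, c2) = first_period[(u1, u2)]
    v1, v2 = continuation[state]
    return (c1 + v1, c2 + v2)


def pure_nash(costs):
    """Pure-strategy Nash equilibria of a 2x2 cost-minimization bi-matrix game."""
    equilibria = []
    for u1, u2 in product((0, 1), repeat=2):
        c1, c2 = costs[(u1, u2)]
        best_for_1 = all(c1 <= costs[(d, u2)][0] for d in (0, 1))
        best_for_2 = all(c2 <= costs[(u1, d)][1] for d in (0, 1))
        if best_for_1 and best_for_2:
            equilibria.append((u1, u2))
    return equilibria


first_period_game = {pair: total_costs(*pair) for pair in product((0, 1), repeat=2)}
equilibria = pure_nash(first_period_game)

for pair in sorted(first_period_game):
    print(pair, first_period_game[pair])
print("feedback equilibrium in the first period:", equilibria)
```

The control pair (1,0), which is the first-period part of the equilibrium (11,00), has total costs (3,2) in this reduced game but is not an equilibrium of it, because player 1 gains by switching to control 0.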

2.2 Open-Loop and Feedback

The theory of differential games was developed in the engineering literature and therefore the concepts of open-loop and closed-loop were used to describe the findings in the example above. A controlled system has closed loops if the controller uses observations on the state of the system. One can also say that the observations are fed back into the controller and therefore the concept of feedback is also used, instead of closed-loop. The conclusion of the example above is that the open-loop Nash equilibrium differs from the feedback or closed-loop Nash equilibrium, where the controls are conditioned on the state of the system. Later it was found that more Nash equilibria exist where the controls are conditioned on the history of the state of the system. These are called closed-loop memory Nash equilibria. If these equilibria are considered, the concept of state has to change. If the current and future controls are known, the state contains sufficient information to be able to predict the future of the system but if the current controls depend on the history of the state, this information is not sufficient. The state has to be augmented with the relevant history of the system. For example, the current pollution stock has to be augmented with previous pollution stocks. Başar and Olsder (1982) label open-loop, closed-loop and closed-loop memory as possible information structures and focus on the informational non-uniqueness of the Nash equilibrium of differential games.

Most of the economic applications of differential games, however, are restricted to the open-loop and the feedback Nash equilibrium. This is mainly driven by the available solution techniques. A differential game is basically a set of optimal control problems and therefore the well-known solution techniques of optimal control problems are also at the heart of differential games. When Pontryagin’s maximum principle is used, the control is a function of time and the open-loop Nash equilibrium is found. This can be compared to the first Nash equilibrium in the example above. When dynamic programming is used or, more specifically, the Hamilton–Jacobi–Bellman equations in the value functions, the feedback Nash equilibrium is found. In the example above, the value functions give the costs-to-go of the two players in the Nash equilibrium of the second period as a function of the state. These are added to the possible costs in the first period and in this way the feedback Nash equilibrium is found.

2.3 Commitment

It is quite common in economic applications of differential games to label the open-loop Nash equilibrium as the pre-commitment solution, in contrast with the feedback Nash equilibrium. It is clear that the players in the feedback Nash equilibrium are not committed to their future controls and wait with choosing their actions until they observe the state of the system. It is also clear that the players in the open-loop Nash equilibrium are committed to their future controls, simply because they do not get new information. However, it is a bit confusing to distinguish these two equilibria on the basis of commitment because the comparison is not made ceteris paribus. When moving from open-loop to feedback, the information structure is changed and commitment is lost because the dynamic programming framework is used. A comparison on the basis of commitment only would require keeping the information structure fixed, but then one cannot rely on a standard solution technique from optimal control theory.

 

|               | Open-loop         | Feedback            |
|---------------|-------------------|---------------------|
| Commitment    | Maximum Principle | ?                   |
| No commitment |                   | Dynamic Programming |

This combination of commitment and feedback or closed-loop information structures is simply not very well developed in economic applications of differential games. This is in contrast with another area in dynamic games, namely repeated games. Repeated games do not have a state but strategies can be conditioned on the strategies of the other players. In this way, with commitment, the cooperative outcome can be realized because players can threaten to punish other players sufficiently severely if they deviate from the cooperative outcome. A similar approach can be chosen for differential games (Tolwinski 1982). Another approach is to solve for closed-loop memory Nash equilibria. Tolwinski et al. (1986) show that for a class of differential games, closed-loop memory Nash equilibria can sustain efficient outcomes. Gaitsgory and Nitzan (1994) consider difference games, which are differential games with discrete time. They develop a folk theorem in the sense that closed-loop memory Nash equilibria sustain individually rational outcomes with respect to the open-loop Nash equilibrium. Since these approaches have seen few applications in economics, they will not be further pursued in this chapter. However, the idea of a folk theorem will pop up again below when the possible multiplicity of feedback Nash equilibria is discussed. It will be shown that the steady state of the cooperative outcome can be approached with feedback Nash equilibria if the discount rate goes to zero (see the next section).

If commitment is attached to the open-loop Nash equilibrium, the commitment device is to refrain from observing the state of the system. Commitment in this way may pay, as we have seen in the example above where the open-loop Nash equilibrium has lower costs than the feedback Nash equilibrium. However, if we do not think that commitment is treated properly in this way and if we do not think that it is realistic to deliberately refrain from observing the state of the system, this is not an interesting way to go. Moreover, it is not generally true that the feedback Nash equilibrium has higher costs than the open-loop Nash equilibrium, as we will see in the applications below. One may say in general that the feedback Nash equilibrium is the most realistic solution concept. It usually confirms some tragedy of the commons but more optimistic results are also feasible, as we will see in the applications below. Most economic applications of differential games focus on comparing the open-loop Nash equilibrium and the feedback Nash equilibrium. Since the feedback Nash equilibrium is derived with dynamic programming and since the controls are conditioned on the state, it is also called the Markov–Perfect Nash equilibrium.

2.4 Formal Model

It is time to introduce some formalities. An important class of differential games is given by

$$ \max_{u_{i}(\cdot)}W_{i}=\int_{0}^{\infty}F_{i} \bigl[x(t),u_{i}(t)\bigr]\exp (-rt)dt,\quad i=1,2,\ldots,n $$
(1)

subject to

$$ \dot{x}(t)=f\bigl[x(t),u_{1}(t),u_{2}(t), \ldots,u_{n}(t)\bigr],\quad x(0)=x_{0}, $$
(2)

where i indexes the n players, x denotes the state of the system, u the controls, r the discount rate, W total welfare, F welfare at time t, and f the state transition. Note that the players only interact through the state dynamics. The problem has an infinite horizon and the welfare function and the state transition do not explicitly depend on time, except for the discount rate. This implies that the problem is stationary. Note also that the objective of the players is to maximize total welfare, whereas in the example in Sect. 2 players were minimizing costs.

In the open-loop Nash equilibrium the controls only depend on time: \(u_{i}(t)\). This implies that player i solves an optimal control problem with Pontryagin’s maximum principle and the strategies of the other players as exogenous inputs. This results in an optimal control strategy for player i that is a function of the strategies of the other players. This is, in fact, the rational reaction or the best response of player i. The open-loop Nash equilibrium simply requires consistency of these best responses. Pontryagin’s maximum principle gives a necessary condition in terms of a differential equation for the co-state \(\lambda_{i}\). If the optimal solution for player i can be characterized by the set of differential equations in x and \(\lambda_{i}\), then the open-loop Nash equilibrium can be characterized by the set of differential equations in x and \(\lambda_{1},\lambda_{2},\ldots,\lambda_{n}\). This is usually the best way to find the open-loop Nash equilibrium. The necessary conditions for player i in terms of the current-value Hamiltonian function

$$ H_{i}(x,u_{i},t,\lambda_{i})=F_{i}(x,u_{i})+ \lambda _{i}f\bigl[x,u_{1}(t),\ldots,u_{i}, \ldots,u_{n}(t)\bigr] $$
(3)

are that the optimal \(u_{i}^{\ast}(t)\) maximizes \(H_{i}\) and that the state x and the co-state \(\lambda_{i}\) satisfy the set of differential equations

$$\begin{aligned} \dot{x}(t) =&f\bigl[x(t),u_{1}(t),\ldots,u_{i}^{\ast}(t), \ldots,u_{n}(t)\bigr],\quad x(0)=x_{0}, \end{aligned}$$
(4)
$$\begin{aligned} \dot{\lambda}_{i}(t) =&r\lambda_{i}(t)-H_{ix} \bigl[x(t),u_{i}^{\ast }(t),t,\lambda_{i}(t)\bigr], \end{aligned}$$
(5)

with a transversality condition on \(\lambda_{i}\), where \(H_{ix}\) denotes the partial derivative of \(H_{i}\) with respect to x. Note that the actions of the other players \(u_{j}(t)\), \(j\neq i\), are exogenous to player i. If sufficiency conditions are satisfied and if \(u_{i}^{\ast}(t)\) can be explicitly solved from the first-order conditions of optimization, the open-loop Nash equilibrium can be found by solving the set of differential equations for x and \(\lambda_{1},\lambda_{2},\ldots,\lambda_{n}\) given by

$$\begin{aligned} \dot{x}(t) =&f\bigl[x(t),u_{1}^{\ast}(t),u_{2}^{\ast}(t), \ldots,u_{n}^{\ast }(t)\bigr],\quad x(0)=x_{0}, \end{aligned}$$
(6)
$$\begin{aligned} \dot{\lambda}_{i}(t) =&r\lambda_{i}(t)-H_{ix} \bigl[x(t),u_{i}^{\ast }(t),t,\lambda_{i}(t)\bigr], \quad i=1,2,\ldots,n, \end{aligned}$$
(7)

with transversality conditions on \(\lambda_{1},\lambda_{2},\ldots,\lambda_{n}\).

In the feedback Nash equilibrium the controls depend on the current state of the system and since the problem is stationary, they do not depend explicitly on time: \(u_{i}(x)\). The Hamilton–Jacobi–Bellman equations for the current value functions \(V_{i}\) are given by

$$\begin{aligned} rV_{i}(x) =&\max_{u_{i}}\bigl\{ F_{i}(x,u_{i}) \\ &{}+V_{i}^{\prime }(x)f \bigl[x,u_{1}(x),\ldots,u_{i},\ldots,u_{n}(x) \bigr]\bigr\} ,\quad i=1,2,\ldots,n. \end{aligned}$$
(8)

If sufficiency conditions are satisfied and if \(u_{i}^{\ast}(x)\) can be explicitly solved from the first-order conditions of optimization, the feedback Nash equilibrium can be found by solving the set of equations in the current value functions \(V_{i}\) given by

$$\begin{aligned} rV_{i}(x) =&F_{i}\bigl(x,u_{i}^{\ast}(x) \bigr) \\ &{}+V_{i}^{\prime}(x)f\bigl[x,u_{1}^{\ast }(x),u_{2}^{\ast}(x), \ldots,u_{n}^{\ast}(x)\bigr],\quad i=1,2,\ldots,n. \end{aligned}$$
(9)

How this works in specific problems will follow below.

This is only a small part of the theory of differential games. The first textbook is Başar and Olsder (1982) which was written from an engineering perspective. A more recent textbook with many economic applications is Dockner et al. (2000). A nice and concise introduction is Van Long (2013).

3 International Pollution Control

This section is strongly based on van der Ploeg and de Zeeuw (1992). The game of international pollution control, as it is formulated in this paper and in other papers such as Dockner and Van Long (1993), is an example of a linear-quadratic differential game where the state transition f is linear in the state and in the controls and where the objective function F is quadratic in the state and in the control. It is interesting to compare the open-loop Nash equilibrium and the linear feedback Nash equilibrium and to interpret the difference. It is shown that the players are worse off in the linear feedback Nash equilibrium. It follows that the additional information does not pay or, to put it differently, that it pays to stick to the open-loop controls. The reason is that, at the margin, players emit more knowing that the other players will partly offset this when they observe the resulting higher stock of pollution. Therefore, in equilibrium, emissions are higher and the difference in terms of welfare with the cooperative outcome is higher. However, it will be shown that also non-linear feedback Nash equilibria exist and that the steady state of the best non-linear feedback Nash equilibrium converges to the steady state of the cooperative outcome when the discount rate converges to zero. Apparently, with non-linear strategies some sort of folk theorem can be achieved.

3.1 The Model

Pollution P is an inevitable by-product of production Y and the stock of pollution damages the environment. In the case of global environmental problems, pollution P crosses national borders but in the absence of a supra-national government that is mandated to implement policies worldwide, these transboundary externalities cannot be internalized in a standard way. For example, climate change affects many countries and is caused by worldwide emissions of greenhouse gases but emissions can only be controlled by policies at the national level. At the international level a game is played between the countries. In the non-cooperative Nash equilibrium of this game the countries do not take the transboundary externalities into account but only focus on the damage by their own emissions in their own country. Of course, they could coordinate their policies and correct the transboundary externalities as well but then incentives to deviate arise and this is the heart of the problem. An important question is how much the countries would gain from cooperation, but this depends on which Nash equilibrium is to be expected.

The relationship between pollution P and production Y is simply modeled by a fixed emission-output ratio α. Pollution P accumulates into a stock of pollution which partly degrades through natural processes. Damage is caused by the concentration level of pollution S. Its development over time (the state transition), in the case of n countries, is simply modeled as

$$ \dot{S}(t)=\frac{\alpha}{n}\sum_{i=1}^{n}Y_{i}(t)- \delta S(t),\quad S(0)=S_{0}. $$
(10)

This is a linear equation in the state S and in the controls \(Y_{i}\). Note, however, that the linear natural degradation δS of the concentration level may be too simple. Usually processes in the natural system are more complicated but if the state transition is modeled in a non-linear way, the analysis of the differential game becomes much more complicated. We leave this to the next section on lakes.

The objectives are simply modeled as

$$\begin{aligned} \max_{Y_{i}(\cdot)}W_{i} =&\int_{0}^{\infty} \biggl[\beta Y_{i}(t)-{ \frac{1}{2}}Y_{i}^{2}(t)-{\frac{1}{2}}\gamma S^{2}(t)\biggr] \\ &{}\times\exp(-rt)dt,\quad i=1,2,\ldots,n. \end{aligned}$$
(11)

The objectives are quadratic in the state S and in the control \(Y_{i}\). Again, quadratic damage costs \({\frac{1}{2}}\gamma S^{2}\) may be too simple. Climate change, for example, is better modeled by some tipping point where damage is sharply increasing but again, the analysis of the differential game would become much more complicated. Note also that the countries are assumed to be the same. We will only consider symmetric Nash equilibria so that the index i will at some point disappear.
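To get a feeling for how (10) and (11) interact, the model can be simulated for a given production path. The sketch below is only an illustration under several assumptions that are not part of the chapter: all countries choose the same constant production level Y (so that the sum in (10) reduces to αY), the state equation is discretized with a simple Euler scheme, the infinite horizon is truncated at a finite T, and the parameter values are hypothetical.

```python
import math


def simulate_constant_output(Y, alpha, beta, gamma, delta, r, S0, T=200.0, dt=0.01):
    """Euler simulation of the stock (10) and a Riemann sum for welfare (11).

    Assumes every country produces the same constant amount Y, so the
    state equation becomes dS/dt = alpha * Y - delta * S.
    """
    S, welfare, t = S0, 0.0, 0.0
    for _ in range(round(T / dt)):
        flow = beta * Y - 0.5 * Y**2 - 0.5 * gamma * S**2   # integrand of (11)
        welfare += flow * math.exp(-r * t) * dt             # discounted contribution
        S += (alpha * Y - delta * S) * dt                   # Euler step of (10)
        t += dt
    return S, welfare


# Hypothetical parameter values, chosen only for illustration.
params = dict(alpha=1.0, beta=1.0, gamma=1.0, delta=0.1, r=0.05)
S_end, welfare = simulate_constant_output(Y=0.5, S0=0.0, **params)
print("stock at the end of the horizon:", S_end)    # approaches alpha * Y / delta
print("discounted welfare per country:", welfare)
```

With a constant production level the stock converges to αY/δ, which makes the trade-off visible: a higher Y raises the benefit term in (11) immediately but also raises the long-run stock and the damage term.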

3.2 Open-Loop Nash Equilibrium

Using Pontryagin’s maximum principle, the current-value Hamiltonian functions become

$$ H_{i}(S,Y_{i},t,\lambda_{i})=\beta Y_{i}-{\frac{1}{2}}Y_{i}^{2}-{ \frac{1}{2}}\gamma S^{2}+ \lambda_{i} \Biggl(\frac{\alpha}{n}Y_{i}+ \frac{\alpha}{n}\sum_{j\neq i}^{n}Y_{j}(t)- \delta S \Biggr) $$
(12)

and since sufficiency conditions are satisfied, the open-loop Nash equilibrium conditions become

$$\begin{aligned} Y_{i}(t) =&\beta+\frac{\alpha}{n}\lambda_{i}(t),\quad i=1,2,\ldots,n, \end{aligned}$$
(13)
$$\begin{aligned} \dot{S}(t) =&\frac{\alpha}{n}\sum_{i=1}^{n}Y_{i}(t)- \delta S(t),\quad S(0)=S_{0}, \end{aligned}$$
(14)
$$\begin{aligned} \dot{\lambda}_{i}(t) =&(r+\delta)\lambda_{i}(t)+\gamma S(t),\quad i=1,2,\ldots,n, \end{aligned}$$
(15)

with transversality conditions on \(\lambda_{1},\lambda_{2},\ldots,\lambda_{n}\). The symmetric open-loop Nash equilibrium can therefore be characterized by the set of differential equations

$$\begin{aligned} \dot{S}_{OL}(t) =&\alpha \biggl(\beta+\frac{\alpha}{n}\lambda _{OL}(t) \biggr)-\delta S_{OL}(t),\quad S_{OL}(0)=S_{0}, \end{aligned}$$
(16)
$$\begin{aligned} \dot{\lambda}_{OL}(t) =&(r+\delta)\lambda_{OL}(t)+\gamma S_{OL}(t), \end{aligned}$$
(17)

with a transversality condition on \(\lambda_{OL}\), where OL denotes open-loop. This yields a standard phase diagram in the state/co-state plane for an optimal control problem with a stable manifold and a saddle-point-stable steady state, given by

$$ S_{OL}^{\ast}=\frac{\alpha\beta(r+\delta)n}{\delta(r+\delta )n+\alpha ^{2}\gamma}. $$
(18)

The negative of the shadow value \(-\lambda_{OL}\) can be interpreted as the tax on emissions that is required in each country to implement the open-loop Nash equilibrium. Note that this tax only internalizes the externalities within the countries but not the transboundary externalities. Internalizing the latter would require a higher tax, which can be found from the cooperative outcome of the game.
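The steady state (18) and the saddle-point property can be checked numerically. The following sketch solves the stationary version of (16) and (17) as a linear system and computes the eigenvalues of the corresponding 2x2 Jacobian from its trace and determinant. The parameter values are hypothetical and the function name is illustrative.

```python
import math


def open_loop_steady_state(alpha, beta, gamma, delta, r, n):
    """Solve 0 = alpha*beta + (alpha**2/n)*lam - delta*S and 0 = (r+delta)*lam + gamma*S."""
    # Jacobian of (16)-(17) with respect to (S, lambda).
    a11, a12 = -delta, alpha**2 / n
    a21, a22 = gamma, r + delta
    det = a11 * a22 - a12 * a21            # negative for positive parameters
    # Cramer's rule for the system A (S, lam)' = (-alpha*beta, 0)'.
    S_star = (-alpha * beta * a22) / det
    lam_star = (a21 * alpha * beta) / det
    # Eigenvalues of the Jacobian from trace and determinant.
    trace = a11 + a22
    root = math.sqrt(trace**2 - 4.0 * det)
    eigenvalues = ((trace - root) / 2.0, (trace + root) / 2.0)
    return S_star, lam_star, eigenvalues


# Hypothetical parameter values, chosen only for illustration.
alpha, beta, gamma, delta, r, n = 1.0, 1.0, 1.0, 0.1, 0.05, 2
S_star, lam_star, eigenvalues = open_loop_steady_state(alpha, beta, gamma, delta, r, n)

# Closed form (18) for comparison.
closed_form = alpha * beta * (r + delta) * n / (delta * (r + delta) * n + alpha**2 * gamma)

print("steady state from the linear system:", S_star)
print("steady state from (18):            ", closed_form)
print("steady-state emission tax -lambda: ", -lam_star)
print("eigenvalues:", eigenvalues)
```

Because the determinant of the Jacobian is negative for positive parameters, the two eigenvalues have opposite signs, which is the saddle-point structure mentioned in the text.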

In the cooperative outcome the countries maximize their joint welfare. This is a standard optimal control problem with objective

$$ \max_{Y_{1}(\cdot),\ldots, Y_{n}(\cdot)}\sum_{i=1}^{n}W_{i}. $$
(19)

Using Pontryagin’s maximum principle, the current-value Hamiltonian function becomes

$$ H(S,Y_{1},Y_{2},\ldots,Y_{n},\lambda)=\sum _{i=1}^{n} \biggl(\beta Y_{i}- {\frac{1}{2}}Y_{i}^{2} \biggr)-{ \frac{1}{2}}\gamma n S^{2}+\lambda \Biggl( \frac{\alpha}{n}\sum_{i=1}^{n} Y_{i}-\delta S \Biggr) $$
(20)

and since sufficiency conditions are satisfied, the optimality conditions become

$$\begin{aligned} Y_{i}(t) =&\beta+\frac{\alpha}{n}\lambda(t),\quad i=1,2,\ldots,n, \end{aligned}$$
(21)
$$\begin{aligned} \dot{S}(t) =&\frac{\alpha}{n}\sum_{i=1}^{n}Y_{i}(t)- \delta S(t),\quad S(0)=S_{0}, \end{aligned}$$
(22)
$$\begin{aligned} \dot{\lambda}(t) =&(r+\delta)\lambda(t)+\gamma nS(t), \end{aligned}$$
(23)

with a transversality condition on λ. The cooperative outcome can therefore be characterized by the set of differential equations

$$\begin{aligned} \dot{S}_{C}(t) =&\alpha \biggl(\beta+\frac{\alpha}{n}\lambda _{C}(t) \biggr)-\delta S_{C}(t),\quad S_{C}(0)=S_{0}, \end{aligned}$$
(24)
$$\begin{aligned} \dot{\lambda}_{C}(t) =&(r+\delta)\lambda_{C}(t)+\gamma nS_{C}(t), \end{aligned}$$
(25)

with a transversality condition on \(\lambda_{C}\), where C denotes cooperative. This yields a standard phase diagram in the state/co-state plane for an optimal control problem with a stable manifold and a saddle-point-stable steady state, given by

$$ S_{C}^{\ast}=\frac{\alpha\beta(r+\delta)}{\delta(r+\delta)+\alpha ^{2}\gamma}<S_{OL}^{\ast}. $$
(26)
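The inequality in (26) can be verified directly. As a short worked step, assuming positive parameters and \(n\geq 2\), divide the numerator and the denominator of (18) by n and compare with (26):

$$ S_{OL}^{\ast}=\frac{\alpha\beta(r+\delta)}{\delta(r+\delta)+\alpha^{2}\gamma/n},\qquad S_{C}^{\ast}=\frac{\alpha\beta(r+\delta)}{\delta(r+\delta)+\alpha^{2}\gamma}. $$

The numerators are identical and the denominators differ only in the damage term. Since \(\alpha^{2}\gamma/n<\alpha^{2}\gamma\) for \(n\geq 2\), the denominator of \(S_{OL}^{\ast}\) is smaller and therefore \(S_{C}^{\ast}<S_{OL}^{\ast}\). For \(n=1\) the two expressions coincide, as there is no transboundary externality in that case.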

The negative of the shadow value \(-\lambda_{C}\) can be interpreted as the tax on emissions that is required in each country to implement the cooperative outcome and this tax is higher than the tax in the open-loop Nash equilibrium because now the transboundary externalities are internalized as well. The steady state of the cooperative outcome is lower than the steady state of the open-loop Nash equilibrium, as is to be expected. This implies that welfare is lower in the open-loop Nash equilibrium than in the cooperative outcome. These results are straightforward. In the next section we will consider the feedback Nash equilibrium.

3.3 Feedback Nash Equilibrium

The Hamilton–Jacobi–Bellman equations in the current value functions \(V_{i}\) become

$$\begin{aligned} rV_{i}(S) =& \max_{Y_{i}} \Biggl\{ \beta Y_{i}-{\frac{1}{2}}Y_{i}^{2}-{ \frac{1}{2}}\gamma S^{2} \\ &{}+V_{i}^{\prime}(S) \Biggl(\frac{\alpha}{n}Y_{i}+\frac {\alpha}{n}\sum _{j\neq i}^{n}Y_{j}(S)-\delta S \Biggr) \Biggr\} , \quad i=1,2,\ldots,n, \end{aligned}$$
(27)

with first-order conditions

$$ Y_{i}^{\ast}(S)=\beta+\frac{\alpha}{n}V_{i}^{\prime}(S), \quad i=1,2,\ldots,n. $$
(28)

Since sufficiency conditions are satisfied, the symmetric feedback Nash equilibrium can be found by solving the differential equation in \(V=V_{i}\), \(i=1,2,\ldots,n\),

$$\begin{aligned} rV(S) =& \beta \biggl(\beta+\frac{\alpha}{n}V^{\prime}(S) \biggr)-{\frac{1}{2}}\biggl(\beta+\frac{\alpha}{n}V^{\prime}(S) \biggr)^{2}- {\frac{1}{2}}\gamma S^{2} \\ &{} +V^{\prime}(S) \biggl[\alpha \biggl(\beta+ \frac{\alpha }{n}V^{\prime }(S) \biggr)-\delta S \biggr]. \end{aligned}$$
(29)

The usual way to solve this equation is to assume that the current value function V is quadratic with the general form

$$ V(S)=\sigma_{0}-\sigma_{1}S-{\frac{1}{2}}\sigma_{2}S^{2}, \quad \sigma_{2}>0, $$
(30)

so that a quadratic equation in the state S results. Since this equation has to hold for all S, the coefficients of \(S^{2}\) and S on the left-hand side and the right-hand side have to be equal. It follows that

$$\begin{aligned} \sigma_{2} =&\frac{-(r+2\delta)n^{2}+n\sqrt{(r+2\delta )^{2}n^{2}+4\alpha ^{2}\gamma(2n-1)}}{2\alpha^{2}(2n-1)}, \end{aligned}$$
(31)
$$\begin{aligned} \sigma_{1} =&\frac{\alpha\beta n^{2}\sigma_{2}}{(r+\delta )n^{2}+\alpha ^{2}(2n-1)\sigma_{2}}. \end{aligned}$$
(32)
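The coefficients (31) and (32) can be checked by substituting the quadratic value function (30) back into (29). The sketch below does this numerically. The constant \(\sigma_{0}\) is not stated in the text. Here it is obtained from matching the constant terms in (29), which is an assumption of this sketch, and the parameter values are hypothetical.

```python
import math


def feedback_coefficients(alpha, beta, gamma, delta, r, n):
    """Coefficients of V(S) = sigma0 - sigma1*S - 0.5*sigma2*S**2."""
    k = alpha**2 * (2 * n - 1)
    disc = (r + 2 * delta) ** 2 * n**2 + 4 * alpha**2 * gamma * (2 * n - 1)
    sigma2 = (-(r + 2 * delta) * n**2 + n * math.sqrt(disc)) / (2 * k)      # (31)
    sigma1 = alpha * beta * n**2 * sigma2 / ((r + delta) * n**2 + k * sigma2)  # (32)
    # Constant term of (29); derived for this sketch, not given in the text.
    sigma0 = (0.5 * beta**2 - alpha * beta * sigma1 + k * sigma1**2 / (2 * n**2)) / r
    return sigma0, sigma1, sigma2


def hjb_residual(S, alpha, beta, gamma, delta, r, n, coeffs):
    """Left-hand side minus right-hand side of (29) at the state S."""
    sigma0, sigma1, sigma2 = coeffs
    V = sigma0 - sigma1 * S - 0.5 * sigma2 * S**2
    dV = -sigma1 - sigma2 * S
    Y = beta + alpha / n * dV                       # first-order condition (28)
    rhs = beta * Y - 0.5 * Y**2 - 0.5 * gamma * S**2 + dV * (alpha * Y - delta * S)
    return r * V - rhs


# Hypothetical parameter values, chosen only for illustration.
alpha, beta, gamma, delta, r, n = 1.0, 1.0, 1.0, 0.1, 0.05, 2
coeffs = feedback_coefficients(alpha, beta, gamma, delta, r, n)
residuals = [hjb_residual(S, alpha, beta, gamma, delta, r, n, coeffs)
             for S in (0.0, 0.5, 1.0, 2.0)]

print("sigma0, sigma1, sigma2:", coeffs)
print("largest residual of (29):", max(abs(x) for x in residuals))
```

A residual that is zero up to rounding error at several values of S indicates that the quadratic form (30) with these coefficients solves (29) for the chosen parameters.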

The feedback Nash equilibrium becomes

$$ Y_{i}^{\ast}(S)=\beta+\frac{\alpha}{n}(-\sigma_{1}- \sigma _{2}S),\quad i=1,2,\ldots,n, $$
(33)

and the controlled state transition becomes

$$ \dot{S}(t)=\alpha \biggl(\beta+\frac{\alpha}{n}\bigl(-\sigma_{1}-\sigma _{2}S(t) \bigr) \biggr)-\delta S(t), $$
(34)

which is stable and yields the steady state

$$ S_{FB}^{\ast}=\frac{\alpha\beta n-\alpha^{2}\sigma_{1}}{\delta n+\alpha ^{2}\sigma_{2}}, $$
(35)

where FB denotes feedback.

It is tedious but straightforward to show that

$$ S_{C}^{\ast}<S_{OL}^{\ast}<S_{FB}^{\ast}. $$
(36)

This implies that in the feedback Nash equilibrium the countries are worse off than in the open-loop Nash equilibrium or, to put it differently, that the gains of cooperation are higher when the non-cooperative model is the feedback model. Since the feedback model, where countries observe the state of the system and are not committed to future actions, is the more realistic model, the tragedy of the commons is more severe than one would think when the open-loop model is used to assess the gains of cooperation. The intuition for this pessimistic result is as follows. A country argues that when it increases emissions, this will increase the concentration level of pollution and this will induce the other countries to lower their emissions, so that part of the increase in emissions will be offset by the other countries. Each country argues the same way so that in equilibrium emissions will be higher than in the case where the concentration level is not observed. With the open-loop model, the gains of cooperation are in fact underestimated.
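The ordering in (36) and the quadratic solution can be checked numerically. The sketch below evaluates (31), (32) and (35) and verifies that the quadratic value function (30) satisfies (29). The parameter values are illustrative assumptions, not taken from the chapter. The constant \(\sigma_{0}\) and the open-loop and cooperative steady states are derived here from the objective and state transition visible in (27), so they should be read as a reconstruction; the cooperative formula reduces to (43) for α=2.

```python
import math


def linear_feedback(alpha, beta, gamma, delta, r, n):
    """Return (sigma1, sigma2, S_FB) from equations (31), (32) and (35)."""
    k = alpha ** 2 * (2 * n - 1)
    root = math.sqrt((r + 2 * delta) ** 2 * n ** 2
                     + 4 * alpha ** 2 * gamma * (2 * n - 1))
    sigma2 = (-(r + 2 * delta) * n ** 2 + n * root) / (2 * k)
    sigma1 = alpha * beta * n ** 2 * sigma2 / ((r + delta) * n ** 2 + k * sigma2)
    s_fb = (alpha * beta * n - alpha ** 2 * sigma1) / (delta * n + alpha ** 2 * sigma2)
    return sigma1, sigma2, s_fb


def hjb_residual(S, alpha, beta, gamma, delta, r, n):
    """Right-hand side minus left-hand side of (29) for the quadratic V in (30)."""
    sigma1, sigma2, _ = linear_feedback(alpha, beta, gamma, delta, r, n)
    k = alpha ** 2 * (2 * n - 1) / (2 * n ** 2)
    # sigma0 follows from the constant term of (29); derived here, not in the source
    sigma0 = (0.5 * beta ** 2 + k * sigma1 ** 2 - alpha * beta * sigma1) / r
    value = sigma0 - sigma1 * S - 0.5 * sigma2 * S ** 2
    slope = -sigma1 - sigma2 * S
    y = beta + alpha / n * slope  # feedback control (33)
    rhs = (beta * y - 0.5 * y ** 2 - 0.5 * gamma * S ** 2
           + slope * (alpha * y - delta * S))
    return rhs - r * value


def open_loop_steady_state(alpha, beta, gamma, delta, r, n):
    """Open-loop steady state, reconstructed from the model in (27)."""
    return alpha * beta * n * (r + delta) / (n * delta * (r + delta) + alpha ** 2 * gamma)


def cooperative_steady_state(alpha, beta, gamma, delta, r, n):
    """Cooperative steady state; n drops out and alpha=2 gives (43)."""
    return alpha * beta * (r + delta) / (delta * (r + delta) + alpha ** 2 * gamma)


# Illustrative parameter values (assumptions)
params = dict(alpha=2.0, beta=1.0, gamma=1.0, delta=0.1, r=0.05, n=2)
sigma1, sigma2, s_fb = linear_feedback(**params)
print("sigma1, sigma2:", sigma1, sigma2)
print("S_C, S_OL, S_FB:", cooperative_steady_state(**params),
      open_loop_steady_state(**params), s_fb)
print("HJB residual at S=1:", hjb_residual(1.0, **params))
```

A residual close to machine precision at several values of S indicates that the coefficients solve the equation for all S, which is the point of the coefficient-matching argument.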

The bulk of the literature on economic applications of differential games has this type of result. However, a different approach to the issue is possible. Dockner and Van Long (1993) show that non-linear feedback Nash equilibria (with non-quadratic current value functions) for this problem exist which may be better than the open-loop Nash equilibrium. This will be shown in the next section.

3.4 Non-linear Feedback Nash Equilibria

The symmetric feedback Nash equilibrium is given by

$$ Y_{i}^{\ast}(S)=\beta+\frac{\alpha}{n}V^{\prime}(S):=h(S), \quad i=1,2,\ldots,n, $$
(37)

and using these equations for substituting V′(S), the Hamilton–Jacobi–Bellman equation can be written as

$$ rV(S)=\beta h(S)-{\frac{1}{2}}\bigl(h(S)\bigr)^{2}-{ \frac{1}{2}}\gamma S^{2}+ \biggl[ \frac{n}{\alpha}\bigl(h(S)-\beta\bigr) \biggr] \bigl[\alpha h(S)-\delta S\bigr]. $$
(38)

Assuming that h is differentiable, differentiating this equation with respect to S and substituting V′(S) again yields an ordinary differential equation in the feedback equilibrium control h given by

$$ \bigl[(2n-1)\alpha h(S)+(1-n)\alpha\beta-n\delta S\bigr]h^{\prime }(S)=n(r+ \delta)h(S)+\alpha\gamma S-n(r+\delta)\beta. $$
(39)

This is in fact the Euler–Lagrange equation for this problem. This differential equation may have multiple solutions because the boundary condition is not specified. The steady-state condition yields a boundary condition but the steady state is not determined in a differential game. One can also say that the multiplicity of non-linear feedback Nash equilibria results from the indeterminacy of the steady state in differential games. Dockner and Van Long (1993) have the same model but with n=α=2 which yields

$$ \bigl[3h(S)-\beta-\delta S\bigr]h^{\prime}(S)=(r+\delta)h(S)+\gamma S-(r+ \delta)\beta, $$
(40)

with the boundary condition in the steady state \(S^{\ast}\) given by

$$ h\bigl(S^{\ast}\bigr)={ \frac{1}{2}}\delta S^{\ast}. $$
(41)

The solutions of this differential equation in the feedback control h must lead to a stable system where the state S converges to the steady state \(S^{\ast}\). The stable solutions form a set of hyperbolas in the (S,h)-plane that cut the steady-state line \({\frac{1}{2}}\delta S\) in the interval

$$ \frac{2\beta(2r+\delta)}{\delta(2r+\delta)+4\gamma}\leq S^{\ast }<\frac{2\beta}{\delta}. $$
(42)

Rubio and Casino (2002) correct this result by showing that it only holds for a certain set of initial states \(S_{0}\). The right-hand side of the interval represents the situation where the countries do not have a concern for the environment and choose Y=β. The left-hand side is the lowest steady state that can be achieved with a feedback Nash equilibrium. In this equilibrium the hyperbola h(S) is tangent to the steady-state line, so that \(h(S^{\ast})={\frac{1}{2}}\delta S^{\ast}\) and \(h^{\prime}(S^{\ast})={\frac{1}{2}}\delta\).

The steady state in the cooperative outcome (for α=2) is still lower,

$$ S_{C}^{\ast}=\frac{2\beta(r+\delta)}{\delta(r+\delta)+4\gamma }<\frac{2\beta(2r+\delta)}{\delta(2r+\delta)+4\gamma}, $$
(43)

but it is interesting to note that the best steady state in a feedback Nash equilibrium converges to the steady state in the cooperative outcome when the discount rate r converges to zero. This does not imply, however, that welfare in this feedback Nash equilibrium also converges to welfare in the cooperative outcome when the discount rate r converges to zero. We will come back to this in the next section on lakes.
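The lower bound in (42) and its limit can be checked with a few lines of code. The sketch below confirms that the tangency conditions \(h(S^{\ast})={\frac{1}{2}}\delta S^{\ast}\) and \(h^{\prime}(S^{\ast})={\frac{1}{2}}\delta\) satisfy (40) at the lower bound, and shows the gap to the cooperative steady state (43) shrinking with r. The parameter values β=1, γ=1, δ=0.1 and the function names are illustrative assumptions.

```python
def best_feedback_steady_state(beta, gamma, delta, r):
    """Left-hand side of the interval (42): the lowest feedback steady state."""
    return 2 * beta * (2 * r + delta) / (delta * (2 * r + delta) + 4 * gamma)


def cooperative_steady_state(beta, gamma, delta, r):
    """Cooperative steady state (43), for alpha = 2."""
    return 2 * beta * (r + delta) / (delta * (r + delta) + 4 * gamma)


def tangency_residual(beta, gamma, delta, r):
    """Left minus right side of (40) with h = delta*S/2 and h' = delta/2."""
    S = best_feedback_steady_state(beta, gamma, delta, r)
    h = 0.5 * delta * S
    dh = 0.5 * delta
    lhs = (3 * h - beta - delta * S) * dh
    rhs = (r + delta) * h + gamma * S - (r + delta) * beta
    return lhs - rhs


# Illustrative parameter values (assumptions)
beta, gamma, delta = 1.0, 1.0, 0.1
rates = [0.05, 0.01, 0.001, 1e-6]
gaps = [best_feedback_steady_state(beta, gamma, delta, r)
        - cooperative_steady_state(beta, gamma, delta, r) for r in rates]
for r, gap in zip(rates, gaps):
    print(f"r = {r:g}: gap between best feedback and cooperative steady state = {gap:.3e}")
print("tangency residual at r = 0.05:", tangency_residual(beta, gamma, delta, 0.05))
```

The gap stays positive for every r>0, which matches the strict inequality in (43), and it vanishes only in the limit.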

This result in Dockner and Van Long (1993) is important. It shows that when we allow for non-linear equilibria in this linear-quadratic framework, the feedback Nash equilibrium can be better than the open-loop Nash equilibrium, i.e. in terms of the steady states. Apparently, the feedback information structure can also be beneficial for the countries, by keeping each other targeted on a better steady state with a different set of feedback equilibrium controls. These feedback controls are not offsetting part of the earlier extra emissions but are threatening to emit even more. This implies that even if a linear feedback Nash equilibrium exists, the countries can decide to coordinate on another, non-linear, feedback Nash equilibrium because this one leads to a steady state with higher welfare. Note that this result is achieved in a dynamic programming framework and that the equilibrium is Markov perfect. If the discount rate r approaches zero, the steady state approaches the steady state of the cooperative outcome which can be interpreted as some sort of folk theorem in a differential game (see also Rowat 2007). This analysis has only been developed for one-dimensional systems and we have to wait and see how it works out in higher-dimensional systems. In the next section on lakes, we will apply the same technique but now in a model with a non-linear state transition.

4 The Lake Game

This section is strongly based on Mäler et al. (2003) and Kossioris et al. (2008, 2011). The lake game, as it is formulated in these papers and in other papers such as Brock and Starrett (2003), is an example of a differential game where the state transition f is non-linear in the state. Since the linear-quadratic structure is lost anyway, the objective function F is chosen to be logarithmic in the control because this implies that the cooperative outcome is independent of the number of players, which is convenient in the analysis. We will compare the cooperative outcome with the best feedback Nash equilibrium that is derived with the technique in the last sub-section. Furthermore, we will introduce a tax rate on pollution in order to see if that can internalize the externality in this case. Because of the complexity of the problem, at some point we need to resort to numerical solutions.

4.1 The Model

It can be shown that the essential dynamics of eutrophication of lakes can be described by the differential equation

$$ \dot{x}(t)=\sum_{i=1}^{n}a_{i}(t)-bx(t)+ \frac{x^{2}(t)}{x^{2}(t)+1},\quad x(0)=x_{0}, $$
(44)

where x denotes the amount of phosphorus sequestered in algae, \(a_{i}\) the loading of phosphorus on the lake by agent i, i=1,2,…,n, and b the parameter for the rate of loss (which differs across lakes). The last non-linear term reflects an internal positive release of phosphorus, that has been sequestered in sediments and submerged vegetation, due to changes in the condition of the lake (Carpenter and Cottingham 1997). Note that this equation has one or more steady states, depending on the value of b and on the level of total loading a. If \(b<3\sqrt{3}/8\approx0.65\), which is the maximum slope of the release term, then for a certain range of a the equation has three steady states: two stable ones and an unstable one in between. A low x is usually referred to as an oligotrophic state and a high x is usually referred to as a eutrophic state. With these multiple steady states, a hysteresis effect can occur. Increasing total loading a from a low level, with a low steady state, will at some point lead to a sudden flip to a high steady state (tipping point). Trying to return to a low steady state, by decreasing total loading a again, requires a substantial decrease in a before the lake flips back. If b≤0.5, it is even impossible to flip back since total loading a cannot become negative. In this case, the flip is irreversible and the lake is trapped in a eutrophic state. We will assume that \(0.5<b<3\sqrt{3}/8\), so that hysteresis can occur but a flip to a eutrophic state is reversible. This type of model is also relevant for other natural systems such as coral reefs, rangelands and climate, so that it can be seen as a metaphor for many environmental problems facing us today.
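The multiplicity of steady states of (44) under constant total loading can be illustrated with a small root search. The sketch below scans a grid for sign changes of the right-hand side and refines each one by bisection. The loading levels, the grid settings and the function names are illustrative assumptions; b=0.6 is the value used later in the chapter, and b=0.7 is included because it lies above the maximum slope of the release term (about 0.65), where only one steady state remains.

```python
def net_inflow(x, a, b):
    """Right-hand side of (44) for a constant total loading a."""
    return a - b * x + x * x / (x * x + 1.0)


def slope(x, b):
    """Derivative of the right-hand side of (44) with respect to x."""
    return -b + 2.0 * x / (x * x + 1.0) ** 2


def steady_states(a, b, x_max=5.0, steps=5000):
    """Zeros of (44) on [0, x_max]: grid scan for sign changes, then bisection.

    Zeros that fall exactly on a grid point are ignored in this sketch.
    """
    roots = []
    dx = x_max / steps
    for k in range(steps):
        lo, hi = k * dx, (k + 1) * dx
        if net_inflow(lo, a, b) * net_inflow(hi, a, b) < 0.0:
            for _ in range(60):
                mid = 0.5 * (lo + hi)
                if net_inflow(lo, a, b) * net_inflow(mid, a, b) <= 0.0:
                    hi = mid
                else:
                    lo = mid
            roots.append(0.5 * (lo + hi))
    return roots


for a in (0.05, 0.095, 0.15):
    for x in steady_states(a, 0.6):
        kind = "stable" if slope(x, 0.6) < 0.0 else "unstable"
        print(f"b = 0.6, a = {a}: steady state x = {x:.3f} ({kind})")
```

For b=0.6 the intermediate loading a=0.095 produces a stable oligotrophic state, an unstable state and a stable eutrophic state, while the lower and higher loadings each leave a single steady state; this is the fold structure behind the hysteresis effect.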

Damage to the lake is caused by the amount of phosphorus sequestered in algae x. We take a simple increasing quadratic form. However, the release of phosphorus on the lake is a by-product of agriculture and in that sense also beneficial (value as a waste sink). The agents can be seen as communities around the lake, which is common property to them. We take a logarithmic form for the benefits of loadings \(a_{i}\) of phosphorus on the lake, because this form has the property that the cooperative outcome is independent of the number of agents, as we will see below. The objectives are simply modeled as

$$ \max_{a_{i}(\cdot)}W_{i}=\int_{0}^{\infty} \bigl[\ln a_{i}(t)-cx^{2}(t)\bigr]\exp (-rt)dt,\quad i=1,2, \ldots,n, $$
(45)

where c denotes the relative weight in the objective between the value of the lake as a waste sink and the damage to the lake. For a high c it is to be expected that the resulting state will be oligotrophic. Note that the communities are assumed to be the same. We will only consider symmetric Nash equilibria again so that the index i will at some point disappear.

4.2 Open-Loop Nash Equilibrium

Using Pontryagin’s maximum principle, the current-value Hamiltonian functions become

$$ H_{i}(x,a_{i},t,\lambda_{i})=\ln a_{i}-cx^{2}+\lambda _{i} \Biggl(a_{i}+ \sum_{j\neq i}^{n}a_{j}(t)-bx+ \frac {x^{2}}{x^{2}+1} \Biggr) $$
(46)

and since sufficiency conditions are satisfied, the open-loop Nash equilibrium conditions become

$$\begin{aligned} a_{i}(t) =&-\frac{1}{\lambda_{i}(t)},\quad i=1,2,\ldots,n, \end{aligned}$$
(47)
$$\begin{aligned} \dot{x}(t) =&a(t)-bx(t)+\frac{x^{2}(t)}{x^{2}(t)+1},\quad a:=\sum _{i=1}^{n}a_{i},x(0)=x_{0}, \end{aligned}$$
(48)
$$\begin{aligned} \dot{\lambda}_{i}(t) =& \biggl(r+b-\frac{2x(t)}{(x^{2}(t)+1)^{2}} \biggr)\lambda _{i}(t)+2cx(t),\quad i=1,2,\ldots,n, \end{aligned}$$
(49)

with transversality conditions on \(\lambda_{1},\lambda_{2},\ldots,\lambda_{n}\). The symmetric open-loop Nash equilibrium can therefore be characterized by the set of differential equations in the state x and in the total loadings a, given by

$$\begin{aligned} \dot{x}_{OL}(t) =&a_{OL}(t)-bx_{OL}(t)+ \frac {x_{OL}^{2}(t)}{x_{OL}^{2}(t)+1},\quad x_{OL}(0)=x_{0}, \end{aligned}$$
(50)
$$\begin{aligned} \dot{a}_{OL}(t) =&- \biggl(r+b-\frac {2x_{OL}(t)}{(x_{OL}^{2}(t)+1)^{2}} \biggr)a_{OL}(t)+2\frac{c}{n}x_{OL}(t)a_{OL}^{2}(t), \end{aligned}$$
(51)

with a transversality condition on \(a_{OL}\), where OL denotes open-loop. This system may have multiple steady states, depending on the value of the parameters. We return to this issue below.

In the cooperative outcome the communities maximize their joint welfare. This is a standard optimal control problem with objective

$$ \max_{a_{1}(\cdot),\ldots, a_{n}(\cdot)} \sum_{i=1}^{n}W_{i}. $$
(52)

Using Pontryagin’s maximum principle, the current-value Hamiltonian function becomes

$$ H(x,a_{1},a_{2},\ldots,a_{n},\lambda)=\sum _{i=1}^{n}\ln a_{i}-ncx^{2}+ \lambda \Biggl(\sum_{i=1}^{n}a_{i}-bx+ \frac{x^{2}}{x^{2}+1} \Biggr) $$
(53)

and since sufficiency conditions are satisfied, the optimality conditions become

$$\begin{aligned} a_{i}(t) =&-\frac{1}{\lambda(t)},\quad i=1,2,\ldots,n, \end{aligned}$$
(54)
$$\begin{aligned} \dot{x}(t) =&a(t)-bx(t)+\frac{x^{2}(t)}{x^{2}(t)+1},\quad a:=\sum _{i=1}^{n}a_{i}, x(0)=x_{0}, \end{aligned}$$
(55)
$$\begin{aligned} \dot{\lambda}(t) =&\biggl(r+b-\frac{2x(t)}{(x^{2}(t)+1)^{2}}\biggr)\lambda(t)+2ncx(t), \end{aligned}$$
(56)

with a transversality condition on λ. The cooperative outcome can therefore be characterized by the set of differential equations in the state x and in the total loadings a, given by

$$\begin{aligned} \dot{x}_{C}(t) =&a_{C}(t)-bx_{C}(t)+ \frac{x_{C}^{2}(t)}{x_{C}^{2}(t)+1} ,\quad x_{C}(0)=x_{0}, \end{aligned}$$
(57)
$$\begin{aligned} \dot{a}_{C}(t) =&- \biggl(r+b-\frac{2x_{C}(t)}{(x_{C}^{2}(t)+1)^{2}}\biggr)a_{C}(t)+2cx_{C}(t)a_{C}^{2}(t), \end{aligned}$$
(58)

with a transversality condition on \(a_{C}\), where C denotes cooperative.

For b=0.6, c=1 and r=0.03, the phase diagram in the (x,a)-plane for the cooperative outcome is given in Fig. 1.

Fig. 1 Phase diagram in the (x,a)-plane for the cooperative outcome

For these parameter values, the controlled system has one oligotrophic steady state that is saddle-point stable. The phase diagram has a stable and an unstable manifold. The situation is essentially not different from the linear-quadratic case in the previous section. It would be different if we increase r, for example, but it is more interesting to increase n and thus move to the open-loop Nash equilibrium. Note that the open-loop Nash equilibrium can be found by solving the optimal control problem with parameter c/n (a game with this property is called a potential game). For n=2, the phase diagram in the (x,a)-plane for the open-loop Nash equilibrium is given in Fig. 2.

Fig. 2 Phase diagram in the (x,a)-plane for the open-loop Nash equilibrium

Now we get three steady states: saddle-point stable ones to the left and to the right and an unstable one in between. Stable manifolds curl out from the unstable steady state and go either to the left or to the right steady state. The outcome depends on the initial state. A point \(x_{S}\) exists with the property that if \(x_{0}<x_{S}\), the open-loop Nash equilibrium moves towards the oligotrophic steady state at the left and if \(x_{0}>x_{S}\), the open-loop Nash equilibrium moves towards the eutrophic steady state at the right. Such a point is called a Skiba point since it was first presented by Skiba (1978) in an optimal growth model with a convex-concave production function. The intuition is clear: if the lake is already heavily polluted, it does not pay anymore to move to an oligotrophic state.
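The number and location of the steady states of the systems (50)-(51) and (57)-(58) can be reproduced with a short root search. Setting both time derivatives to zero with a>0 gives \(a=bx-x^{2}/(x^{2}+1)\) and \(r+b-2x/(x^{2}+1)^{2}=2\tilde{c}xa\), where \(\tilde{c}=c\) in the cooperative outcome and \(\tilde{c}=c/n\) in the open-loop Nash equilibrium. The sketch below, with hypothetical function names and an assumed search grid, solves this condition for b=0.6, c=1 and r=0.03.

```python
def g(x, b):
    """Total loading that keeps the state x constant (x-dot = 0)."""
    return b * x - x * x / (x * x + 1.0)


def steady_state_condition(x, b, c_eff, r):
    """a-dot = 0 with a = g(x) > 0; c_eff is c (cooperative) or c/n (open-loop)."""
    return r + b - 2.0 * x / (x * x + 1.0) ** 2 - 2.0 * c_eff * x * g(x, b)


def steady_states(b, c_eff, r, x_max=3.0, steps=3000):
    """Grid scan for sign changes on [0, x_max], refined by bisection.

    For b > 0.5 the loading g(x) is positive for all x > 0, so every root
    corresponds to a steady state with positive loading.
    """
    roots = []
    dx = x_max / steps
    for k in range(steps):
        lo, hi = k * dx, (k + 1) * dx
        f_lo = steady_state_condition(lo, b, c_eff, r)
        if f_lo * steady_state_condition(hi, b, c_eff, r) < 0.0:
            for _ in range(60):
                mid = 0.5 * (lo + hi)
                if f_lo * steady_state_condition(mid, b, c_eff, r) <= 0.0:
                    hi = mid
                else:
                    lo = mid
                    f_lo = steady_state_condition(lo, b, c_eff, r)
            roots.append(0.5 * (lo + hi))
    return roots


b, c, r = 0.6, 1.0, 0.03
cooperative = steady_states(b, c, r)
open_loop = steady_states(b, c / 2, r)  # n = 2
print("cooperative steady states x:", [round(x, 3) for x in cooperative])
print("open-loop steady states x (n = 2):", [round(x, 3) for x in open_loop])
```

The search locates steady states only; it does not classify them as saddle points or compute the Skiba point, which requires comparing welfare along the stable manifolds.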

The policy question is whether a properly chosen tax τ on phosphorus loadings \(a_{i}\), with an extra cost \(\tau(t)a_{i}(t)\) under the integral in the objective, can regulate the system on the optimal path towards the optimal (oligotrophic) steady state. The answer is yes, because the tax τ should simply bridge the gap between the negatives of the shadow values \(-\lambda_{i}\) and \(-\lambda\):

$$ \tau(t)-\lambda_{i}(t)=-\lambda(t). $$
(59)

However, such a tax τ is time-dependent, since these shadow values are constantly changing on the optimal path. Such a tax would be very difficult to implement because it would require a regulating institution to continuously change the tax rate. Therefore, the more realistic policy question is what a fixed tax rate can do. By comparing steady-state equations, it is easy to see that in the optimal steady state \((a_{C},x_{C})\), this (fixed) tax must be equal to

$$ \tau^{\ast}=\frac{n-1}{a_{C}}. $$
(60)
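The comparison of steady-state equations behind (60) can be sketched as follows. The sketch assumes that the tax enters the Hamiltonian (46) of community i as the extra term \(-\tau a_{i}\), which changes only the first-order condition (47), and it uses \(\rho(x):=r+b-2x/(x^{2}+1)^{2}\) as shorthand that is not the chapter's notation. In a steady state, (49) and (56) give

$$\begin{aligned} \text{open-loop with tax:}\quad & \frac{1}{a_{i}}=\tau-\lambda_{i}, \qquad \lambda_{i}=-\frac{2cx}{\rho(x)}, \\ \text{cooperative:}\quad & \frac{1}{a_{i}}=-\lambda, \qquad \lambda=-\frac{2ncx}{\rho(x)}=n\lambda_{i}. \end{aligned}$$

Requiring the same state \(x_{C}\) and the same individual loading \(a_{i}=a_{C}/n\) in both cases implies \(\lambda=-n/a_{C}\), so that

$$ \tau^{\ast}=\lambda_{i}-\lambda=-\frac{n-1}{n}\lambda=\frac{n-1}{a_{C}}. $$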

This implies that the steady state of the open-loop Nash equilibrium under the fixed tax rate τ is the same as the steady state of the cooperative outcome. However, the locus of steady states in the resulting phase diagram for the total loading a differs, and the trajectory may differ as well, of course. In Mäler et al. (2003) it is shown that for a small number of communities n (n≤7), the phase diagram for the open-loop Nash equilibrium under the fixed tax rate τ is qualitatively the same as the phase diagram for the cooperative outcome. It follows that for small n, the optimal steady state can be achieved, although welfare may be lower because of changes in the trajectory. However, for a large number of communities n (n>7), the phase diagram for the open-loop Nash equilibrium under the fixed tax rate τ is complicated and irregular, and multiple steady states may occur. It follows that for large n, it may not even be possible to achieve the optimal steady state, depending on the initial state \(x_{0}\). We may conclude that this regulation works fine for a small number of communities n but in general does not work for a large number of communities. This situation can be improved in the feedback Nash equilibrium, which we will consider in the next section.

4.3 Feedback Nash Equilibria

The Hamilton–Jacobi–Bellman equations in the current value functions \(V_{i}\) become

$$\begin{aligned} rV_{i}(x) =&\max_{a_{i}} \Biggl\{ \ln a_{i}-cx^{2} \\ &{}+V_{i}^{\prime }(x) \Biggl(a_{i}+\sum_{j\neq i}^{n}a_{j}(x)-bx+ \frac{x^{2}}{x^{2}+1}\Biggr) \Biggr\} ,\quad i=1,2,\ldots,n. \end{aligned}$$
(61)

Since sufficiency conditions are satisfied, the symmetric feedback Nash equilibrium with \(V=V_{i}\), i=1,2,…,n, is given by

$$ a_{i}^{\ast}(x)=-\frac{1}{V^{\prime}(x)}:=h(x),\quad i=1,2, \ldots,n, $$
(62)

and using these equations for substituting V′(x), the Hamilton–Jacobi–Bellman equation in the current value function V can be written as

$$ rV(x)=\ln h(x)-cx^{2}-\frac{1}{h(x)} \biggl(nh(x)-bx+ \frac {x^{2}}{x^{2}+1} \biggr). $$
(63)

Assuming that h is differentiable, differentiating this equation with respect to x and substituting V′(x) again yields an ordinary differential equation in the feedback equilibrium control h given by

$$ \biggl[-h(x)+bx-\frac{x^{2}}{x^{2}+1} \biggr]h^{\prime}(x)= \biggl(r+b-2cxh(x)-\frac{2x}{(x^{2}+1)^{2}} \biggr)h(x). $$
(64)

This is, in fact, the Euler–Lagrange equation for this problem. It is a so-called Abel differential equation of the second kind (Murphy 1960) which cannot be solved analytically, but we can solve it numerically with the ode15s solver in Matlab. As in the previous section, this differential equation may have multiple solutions because the boundary condition is not specified. The steady-state condition gives the boundary condition

$$ h(x_{FB})=\frac{1}{n} \biggl(bx_{FB}- \frac{x_{FB}^{2}}{x_{FB}^{2}+1} \biggr), $$
(65)

where FB denotes feedback, but the steady state \(x_{FB}\) is not determined in a differential game, which yields the multiplicity of feedback Nash equilibria. We use the same values for the parameters as in the previous section: b=0.6, c=1, r=0.03 and n=2. The solutions of this differential equation in the feedback equilibrium control h must lead to a stable system where the state x converges to the steady state \(x_{FB}\). The results are shown in Fig. 3.

Fig. 3 Phase diagram in the (x,a)-plane for the feedback equilibrium

Figure 3 depicts both the locus of steady states for the state x with total loading a on the y-axis (\(y=g_{S}(x)\)) and with individual loading \(a_{i}\) on the y-axis (\(y=g_{S}(x)/n\)), where

$$ g_{S}(x)=bx-\frac{x^{2}}{x^{2}+1}. $$
(66)

Furthermore, solutions h(x) of the differential equation are depicted. These solutions must have an intersection point \(x_{FB}\) with \(y=g_{S}(x)/n\) because of the boundary condition. If the derivative h′(x) is negative in this intersection point \(x_{FB}\), the solution h(x) yields a stable system with steady state \(x_{FB}\), at least in a neighborhood of \(x_{FB}\).
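One solution curve of (64) can be traced numerically from the boundary condition (65). The sketch below is a simplified stand-in for the stiff solver mentioned above: it uses a fixed-step fourth-order Runge-Kutta scheme, which is an assumption of this example and is only adequate away from the singular curve \(h(x)=g_{S}(x)\). The candidate steady state \(x_{FB}=0.5\), the step count and the function names are likewise illustrative. Local stability is checked directly through the sign of the controlled state transition \(nh(x)-g_{S}(x)\) on either side of \(x_{FB}\).

```python
def g(x, b):
    """Steady-state locus g_S(x) from (66)."""
    return b * x - x * x / (x * x + 1.0)


def dg(x, b):
    """Derivative of g_S(x)."""
    return b - 2.0 * x / (x * x + 1.0) ** 2


def h_prime(x, h, b, c, r):
    """Equation (64) solved for h'(x); singular where h = g_S(x)."""
    return (r + dg(x, b) - 2.0 * c * x * h) * h / (g(x, b) - h)


def integrate(x_fb, x_end, n, b, c, r, steps=200):
    """Fixed-step RK4 from the boundary point (x_fb, g_S(x_fb)/n) to x_end."""
    x = x_fb
    h = g(x_fb, b) / n  # boundary condition (65)
    dx = (x_end - x_fb) / steps  # negative when integrating to the left
    for _ in range(steps):
        k1 = h_prime(x, h, b, c, r)
        k2 = h_prime(x + 0.5 * dx, h + 0.5 * dx * k1, b, c, r)
        k3 = h_prime(x + 0.5 * dx, h + 0.5 * dx * k2, b, c, r)
        k4 = h_prime(x + dx, h + dx * k3, b, c, r)
        h += dx * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        x += dx
    return h


n, b, c, r = 2, 0.6, 1.0, 0.03
x_fb = 0.5  # illustrative candidate steady state (assumption)
for x_end in (0.45, 0.55):
    h_end = integrate(x_fb, x_end, n, b, c, r)
    drift = n * h_end - g(x_end, b)  # sign of x-dot under the feedback control
    print(f"x = {x_end}: h = {h_end:.5f}, x-dot = {drift:+.5f}")
```

A positive drift to the left of \(x_{FB}\) and a negative drift to the right mean that the state is pushed back towards \(x_{FB}\), so this particular solution yields a locally stable system.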

A number of conclusions can be drawn from Fig. 3 (see Kossioris et al. 2008). The benchmark is the curve h(x) that is tangent to the steady-state curve \(y=g_{S}(x)/n\), which occurs at the steady state \(x_{FB}^{\ast}=0.38\). For this feedback Nash equilibrium, the steady state can be reached from any initial state \(x_{0}>0.38\) but not from an initial state \(x_{0}<0.38\). It is also possible to reach any steady state \(x_{FB}>0.38\) but welfare is lower in the corresponding feedback Nash equilibria. Low steady states (\(x_{FB}<0.17\)) cannot be reached because for the corresponding feedback Nash equilibria, the resulting system is not stable. It is not possible to get a stable solution for initial states \(x_{0}<0.17\). For the initial states \(0.17<x_{0}<0.38\), the situation is more complicated. At such an initial state \(x_{0}\), the feedback equilibrium control h(x) that starts just above the steady-state curve \(y=g_{S}(x)/n\) will steer the system to a stable steady state to the right, in the eutrophic area of the lake (a similar observation was made in Rubio and Casino (2002) for the non-linear feedback Nash equilibria in the game of international pollution control in the previous section). This leads to a type of time-inconsistency: as soon as the system has moved a bit, the incentive occurs to adjust the equilibrium and to jump down to a lower feedback equilibrium control h(x), just above the steady-state curve \(y=g_{S}(x)/n\) at the higher level of x. Moreover, as soon as the system has moved beyond the state x=0.38, the incentive occurs to jump down all the way to the tangent feedback equilibrium control h(x) that steers the system to the steady state \(x_{FB}^{\ast}=0.38\). A full picture of possible equilibria for the lake and for similar models can be found in Dockner and Wagener (2008).

For any initial state \(x_{0}>0.38\), the best the communities can do, in terms of welfare, is to coordinate on the tangent feedback equilibrium control h(x) that steers the system to the steady state \(x_{FB}^{\ast }=0.38\). The steady state of this best feedback Nash equilibrium (0.38) is closer to the steady state of the cooperative outcome (0.353) than the steady states of the two open-loop Nash equilibria (0.393 and 1.58). This is not generally true. If the number of communities is increased to n=3, the steady state of the best feedback Nash equilibrium becomes 0.417 and the steady state of the best open-loop Nash equilibrium becomes 0.412 whereas the steady state of the cooperative outcome remains 0.353. More details can be found in Kossioris et al. (2008). More importantly, however, the feedback Nash equilibrium allows the communities to move to an oligotrophic steady state and they are not trapped in a eutrophic steady state, as in the open-loop Nash equilibrium, when the initial state is above the Skiba point. Furthermore, the result of Dockner and Van Long (1993) for the game of international pollution control also holds for the lake: the best steady state of a feedback Nash equilibrium converges to the steady state of the cooperative outcome when the discount rate r converges to zero. However, this does not imply that welfare is the same. Kossioris et al. (2008) show that the best feedback Nash equilibrium generally performs worse, in terms of welfare, than the open-loop Nash equilibrium and therefore a fortiori worse than the cooperative outcome, but differences are small, of course, when the initial state is close to the steady states.
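The steady states quoted in this paragraph can be approximated without solving the differential equation. Imposing tangency, \(h=g_{S}/n\) and \(h^{\prime}=g_{S}^{\prime}/n\), in (64) reduces to the scalar condition \(g_{S}^{\prime}(x)+nr-2cxg_{S}(x)=0\); this reduction is derived here and is not stated in the source. The cooperative steady state solves the same condition with n=1, which is why the two coincide as r goes to zero. The sketch below, with hypothetical function names and an assumed search interval, evaluates both conditions for b=0.6 and c=1.

```python
def g(x, b):
    """Steady-state locus g_S(x) from (66)."""
    return b * x - x * x / (x * x + 1.0)


def dg(x, b):
    """Derivative of g_S(x)."""
    return b - 2.0 * x / (x * x + 1.0) ** 2


def tangency_condition(x, n, b, c, r):
    """Tangency of h(x) and g_S(x)/n imposed on (64); derived for this sketch."""
    return dg(x, b) + n * r - 2.0 * c * x * g(x, b)


def cooperative_condition(x, b, c, r):
    """Steady state of (57)-(58) with positive loading a = g_S(x)."""
    return r + dg(x, b) - 2.0 * c * x * g(x, b)


def lowest_root(f, x_max=1.0, steps=1000):
    """First sign change of f on [0, x_max], refined by bisection."""
    dx = x_max / steps
    for k in range(steps):
        lo, hi = k * dx, (k + 1) * dx
        if f(lo) * f(hi) < 0.0:
            for _ in range(60):
                mid = 0.5 * (lo + hi)
                if f(lo) * f(mid) <= 0.0:
                    hi = mid
                else:
                    lo = mid
            return 0.5 * (lo + hi)
    return None


b, c, r = 0.6, 1.0, 0.03
x_c = lowest_root(lambda x: cooperative_condition(x, b, c, r))
x_fb_2 = lowest_root(lambda x: tangency_condition(x, 2, b, c, r))
x_fb_3 = lowest_root(lambda x: tangency_condition(x, 3, b, c, r))
print(f"cooperative: {x_c:.3f}, best feedback n=2: {x_fb_2:.3f}, n=3: {x_fb_3:.3f}")

small_r = 1e-6
gap = (lowest_root(lambda x: tangency_condition(x, 2, b, c, small_r))
       - lowest_root(lambda x: cooperative_condition(x, b, c, small_r)))
print(f"gap between best feedback and cooperative steady state at r = {small_r}: {gap:.2e}")
```

The calculation concerns steady states only and says nothing about welfare along the transition paths, which is where the feedback equilibrium can lose against the open-loop equilibrium.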

Kossioris et al. (2011) study what a tax can achieve in the feedback information structure. More specifically, they focus on a stationary tax rate τ(x) that depends on the state of the system. The Hamilton–Jacobi–Bellman equations in the current value functions \(V_{i}\) become

$$\begin{aligned} rV_{i}(x) =&\max_{a_{i}} \Biggl\{ \ln a_{i}-cx^{2}-\tau(x)a_{i} \\ &{}+V_{i}^{\prime }(x) \Biggl(a_{i}+\sum_{j\neq i}^{n}a_{j}(x)-bx+ \frac{x^{2}}{x^{2}+1}\Biggr) \Biggr\} , \quad i=1,2,\ldots,n. \end{aligned}$$
(67)

A similar derivation as above yields the differential equation

$$\begin{aligned} & \biggl[-h(x) -(n-1)\tau(x)h^{2}(x)+bx-\frac {x^{2}}{x^{2}+1} \biggr] h^{\prime }(x) \\ &\quad = \biggl[ \biggl(r+b-\frac{2x}{(x^{2}+1)^{2}} \biggr) \bigl(1-\tau (x)h(x) \bigr)-2cxh(x) \biggr] h(x) \\ &\qquad{} + \biggl((n-1)h(x)-bx+\frac{x^{2}}{x^{2}+1} \biggr)\tau ^{\prime}(x)h^{2}(x) \end{aligned}$$
(68)

with the two unknown functions h(x) and τ(x). Kossioris et al. (2011) take different polynomial functional forms for τ(x) and fix the parameters of the functional form such that the (tangent) best feedback Nash equilibrium h(x) steers the system to the steady state of the cooperative outcome \(x_{C}=0.353\), starting at higher initial states. More specifically, they focus on a fixed tax rate, a linear tax rate and a quadratic tax rate and they compare the resulting welfare with the welfare in the cooperative outcome. The welfare differences get smaller for higher-order polynomials, because the trajectories move closer to the trajectory of the cooperative outcome, but it is not possible to mimic the cooperative outcome with these relatively simple tax rates. The “Nash program” cannot be solved with simple tax rates in this context.

5 Conclusion

The purpose of this chapter is twofold. First, it provides an introduction to some concepts and techniques of differential games that are widely used for economic applications. Second, it analyzes two famous models in environmental and resource economics, the game of international pollution control and the lake game. The analysis fits in what is called the “Nash program.” Since Nash equilibria in differential games are usually not unique, the question is which one comes closest to the cooperative outcome or can even mimic it. If the last option is not available, the question is whether some realistic tax rate can regulate the Nash equilibrium in order to cover the remaining welfare gap. The chapter is mainly based on a number of articles that have already appeared in journals but it puts the main conclusions in this general framework.

The basics of differential games go back more than 40 years and result from the observation that the equivalent of Bellman’s principle of optimality does not hold in games. This implies that Nash equilibria in strategies that only depend on time and are derived with Pontryagin’s maximum principle are different from Nash equilibria that also depend on the state and are derived with dynamic programming. This leads to the general question whether it is possible to characterize the full set of Nash equilibria that depend on all possible information structures regarding the state and that have different levels of commitment, but this issue is far from solved. However, the restriction to Nash equilibria that either result from Pontryagin’s maximum principle or from the Hamilton–Jacobi–Bellman equations of dynamic programming already gives interesting results and insights. The first set of equilibria is referred to as open-loop Nash equilibria and the second set is referred to as feedback Nash equilibria.

The game of international pollution control is an example of a differential game where the objective is quadratic and the state transition is linear in the state and in the controls. The open-loop Nash equilibrium is linear. A linear feedback Nash equilibrium exists with a steady state stock of pollution that is higher than in the open-loop Nash equilibrium, and with lower welfare. However, non-linear feedback Nash equilibria also exist. The most favorable one has a steady state stock of pollution that converges to the steady state stock of pollution in the cooperative outcome when the discount rate converges to zero. The conclusion whether the open-loop or the feedback Nash equilibrium is better is therefore mixed. On the one hand, feedback equilibria can push up the stock of pollution because countries know that extra emissions will be partly offset by the other countries. On the other hand, feedback Nash equilibria can push down the stock of pollution if countries are threatening to emit even more as a reaction to extra emissions.

The lake game is an example of a differential game where the state transition is non-linear in the state. Assuming that the steady state in the cooperative outcome is oligotrophic (good), increasing the number of communities using the lake will at some point lead to the situation that the open-loop Nash equilibrium has both an oligotrophic and a eutrophic (bad) steady state. It depends on the initial condition (below or above the Skiba point) where the equilibrium trajectory will end up. Regulation by means of a fixed tax rate works for a low number of communities but it does not work for a large number of communities, in which case the system can get trapped in the eutrophic area of the lake. Feedback Nash equilibria (that are non-linear, of course) have the same properties as in the game of international pollution control. In addition, with feedback Nash equilibria the system cannot get trapped in the eutrophic area of the lake. Regulation by means of a polynomial state-dependent tax rate can steer the system to the steady state of the cooperative outcome and can move the system closer to the trajectory of the cooperative outcome, but it cannot mimic the cooperative outcome and therefore yields lower welfare.

Differential games are the natural framework of analysis for many problems in environmental and resource economics. The existing solution techniques can cover some of the gap between non-cooperative Nash equilibria and the cooperative outcome but still not all the way. Regulation by means of realistic tax rates can cover some of the remaining gap but also still not all the way. Further research has to show what is feasible here.