1 Introduction

In recent years, politicians and scholars have been focusing on the negative impact of rapid technological advancement and economic growth, as different types of pollution and environmental degradation have reached alarming levels. At the same time, transboundary (or cross-border) pollution has also received special attention in the literature (see, for example, Benchekroun and Chaudhuri [3], Hall [14], Youssef [28], and references therein). As the name implies, transboundary pollution is pollution that originates in one region and can affect the surrounding regions via water and air.

To understand the relationship between the accumulation of pollution and the regions’ strategies, we model the underlying problem as a differential game. The goals of the players, that is, the neighboring regions with symmetric information in the transboundary pollution game, are to maximize their respective profits by choosing their optimal allocations of emission permits and amounts of pollution abatement. Dockner and Long [11] examine the optimal behavior of two neighboring regions in a game setting and conclude that when the two regions use only linear strategies, both regions suffer a loss caused by their competitive actions. However, when nonlinear Markov-perfect strategies are used, and the discount rate is small enough, then a subgame-perfect equilibrium can be reached. Yeung [25] studies a pollution management problem in a cooperative stochastic differential game framework, in which he proposes a payment distribution mechanism that would yield a time-consistent solution. List and Mason [19] conclude that the cooperative payoff is larger than the noncooperative one in a decentralized control setting, if the individual payoffs are sufficiently heterogeneous and the initial pollution stocks are sufficiently small. Kossioris et al. [16] use a numerical method to obtain a nonlinear feedback Nash equilibrium (FBNE) in a shallow lake pollution context and show that the equilibrium trajectory of the best FBNE is, in general, worse than the open-loop and optimal management solutions. Maler and Zeeuw [21] study open-loop and feedback Nash equilibria in an acid rain differential game and find that the depositions always converge to the critical loads. Benchekroun and Martín-Herrán [4] consider both farsighted and myopic behaviors in a transboundary pollution game. Their results suggest that it is necessary to design incentive mechanisms to induce a country to decide in a farsighted way and thereby to increase the number of farsighted countries. Huang et al. [15] present a cooperative differential game of transboundary industrial pollution, which involves a Stackelberg game between the industrial firms and their local governments. They provide a payment distribution mechanism which guarantees that cooperation results in a higher total payoff.

With increasing environmental awareness, countries increasingly recognize the necessity of investing in abatement technologies. Besides, environmental policies also give polluting firms an incentive to do R&D and invest in cleaner ways of production, to reduce their compliance costs. Therefore, some research has been done to examine the abatement decision in the process of environmental degradation. For example, Lundgren [20] presents a real option framework to explain why producers voluntarily take abatement actions and to examine the relationship between the abatement decision and regulations, competitors’ strategies, uncertainty about future green goodwill, etc. Farzin and Kort [13] study the effect of a higher pollution tax rate on abatement investment both under full certainty and when the timing and/or the size of the tax increase is uncertain. Bertinelli et al. [5] propose a differential game framework for transboundary \(\hbox {CO}_2\) pollution to study the strategic behaviors of the players. In their proposed framework, the reduction of \(\hbox {CO}_2\) concentration can be realized by taking advantage of the carbon capture and storage techniques. They find that if there is a high enough level of \(\hbox {CO}_2\) at the beginning, then the optimal feedback strategies can lead to a higher overall environmental quality.

Furthermore, it is widely known that most pollution types are caused by over-emissions of industrial waste. To reduce greenhouse gas emissions, many regions have established and developed emission permits trading markets in recent years. In the cap-and-trade scheme, each emitter is allocated a total amount of emission, i.e., an initial quota. An emitter can sell its unused quota in the market to other permit seekers, or buy emission permits from the market if its emission level exceeds its assigned quota. This scheme makes emission reduction possible by reallocating emission permits among the emitters through market means. Emission permits trading schemes have been studied by researchers including Chang et al. [6], Daskalakis et al. [12] and Seifert et al. [23], and their effects on the equilibrium of differential games have also been examined in Chang et al. [7] and Li [17].

Motivated by the above discussions, this article studies the two regions’ dynamic optimal strategies in transboundary pollution differential games in which both abatement policy and emission trading are involved. By solving the derived Hamilton–Jacobi–Bellman (HJB) equations satisfied by the value functions, we work out the players’ optimal emission levels and optimal pollution abatement strategies in a finite-horizon game. Moreover, we solve a stochastic extension by numerical means. Our results show that emission trading has a great impact on the optimal strategies and profits of the two regions. We show that cooperation between the players leads to greater abatement, fewer emissions, and a smaller accumulated stock of pollution. Apart from the fact that cooperation is better than noncooperation, which can also be readily shown from a simple static prisoner’s dilemma, the optimal trajectory of the pollution stock is obtained from the complex dynamic game framework. Compared with the similar models without abatement, due to Chang et al. [7] and Li [17], our results demonstrate that the introduction of abatement can reduce the pollution stock. In particular, owing to the presence of the terminal point besides the stable states, our results also describe the evolution of the players’ emissions, abatements and revenues, and show that a higher unit salvage cost incentivizes the players to reduce emissions, increase abatement, and thereby further reduce costs from the emission permits trading scheme. This leads to an increase in net revenue that is absent in Li [17]. Additionally, in the stochastic case, in contrast to the results in Chang et al. [7], our work shows that it is always beneficial for the players to choose cooperation. For the theory of differential games, we refer to Basar and Olsder [2], Dockner et al. [10] and Sethi and Thompson [24].

The remainder of the paper is organized as follows. In Sect. 2, the basic model is developed. Noncooperative and cooperative games are formulated in Sects. 3 and 4, respectively. Some discussion is provided with a few numerical examples in Sect. 5. A stochastic extension is treated in detail in Sect. 6. Lastly, Sect. 7 concludes the paper.

2 Basic Model

Similar to Li [17], we suppose that there are two nonidentical players (nations or regions) in our game framework. Also, the relationship between production revenue \(R_i\) and emission \(E_i(t)\) is as follows:

$$\begin{aligned} R_i(E_i(t))=A_i E_i(t)-\frac{1}{2}E_i^2(t), \end{aligned}$$
(1)

\(i=1,2\), where \(A_i>0\) is a constant, \(E_i(t)\) denotes the emission of region i at time t, and \(E_i(t) \in [0,A_i]\). This guarantees that \(R_i\) is an increasing concave function. Following Chang et al. [7], we set \(A_1=A\) and \(A_2=\alpha A\), where \(\alpha \) is a positive parameter that measures the gap between the two players’ abilities in obtaining benefits from production.

We denote by P(t) the accumulated amount of pollution in the air at time t. The evolution of P(t) can then be expressed as follows:

$$\begin{aligned} {\dot{P}}(t)=E_1(t)+E_2(t)-\theta P(t), \quad P(0)=P_0,\quad P(t)>0, \end{aligned}$$
(2)

where \(E_1(t)\ge 0\) and \(E_2(t)\ge 0\) denote the two players’ emission rates at time t, and \(\theta \) represents the exponential decay rate of pollution. Following Chang et al. [7], we suppose that the loss suffered by player i is a linear function of P(t), namely \(D_iP(t)\), where \(D_i\) is a strictly positive parameter. Without loss of generality, we let \(D_1=D\) and \(D_2=\beta D\), where \(\beta \) is a positive constant that measures the gap between the damages the two players suffer from pollution.

In recent years, with the signing of climate documents such as the Kyoto Protocol and the Paris Agreement, more and more countries are beginning to regard abatement as a necessary issue in the process of development. Similar to Bertinelli et al. [5], we assume that by employing an abatement strategy, the regions can reduce the accumulated amount of pollution, that is

$$\begin{aligned} {\dot{P}}(t)=E_1(t)-a_1(t)+E_2(t)-a_2(t)-\theta P(t), \quad P(0)=P_0,\quad P(t)>0, \end{aligned}$$
(3)

where \(a_1(t)\) and \(a_2(t)\) are region 1’s and region 2’s amounts of pollution abatement, respectively, so that the two regions suffer less pollution damage. However, the two regions also face abatement costs, which reduce the flow of net revenues. As is standard in economics, we assume the abatement cost to be convex. Specifically, we assume it to be \(\frac{1}{2}C_ia_i^2(t)\), \(i=1,2\), where \(C_i\) is a positive constant that can be regarded as the abatement cost coefficient. Similarly, we set \(C_1=C\) and \(C_2=\gamma C\), where \(\gamma \) measures the gap between the two regions’ abilities in mastering the abatement technology.

In addition, region i’s cost from the emission permits trading scheme at time t is given by \( S(E_i(t)-E_{i0}(t)-a_i(t)) \), where \(S\ge 0\) is a given constant permit price and \(E_{i0}(t)\) is the instantaneous permits quota allocated by the emission regulatory authorities. Specifically, \(E_i(t)-E_{i0}(t)-a_i(t)>0\) means that region i buys permits from others who have unused permits, and \(E_i(t)-E_{i0}(t)-a_i(t)<0\) means that region i sells its unused permits to other permit seekers in the market. We assume that the quota assigned to region i decreases exponentially over time at the rate \(\rho _i>0\), i.e., \(E_{i0}(t)=E_{i0}e^{-\rho _i t}\). This assumption is consistent with what is observed in many emission trading schemes. Additionally, it encourages the players to reduce emissions over time. We should also mention that the assumption of the instantaneous quota has been widely used in the environmental management and operations literature; see, e.g., Dobos [8], Dobos [9], Li [17], Chang et al. [7], Li [18], Zhang et al. [29].
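As a minimal illustration of the trading rule described above, the sketch below computes the decaying quota \(E_{i0}e^{-\rho _i t}\) and the trading cost \(S(E_i(t)-E_{i0}(t)-a_i(t))\). The function names and the numerical values in the usage lines are hypothetical and serve only as an example.

```python
import math


def quota(E_i0: float, rho_i: float, t: float) -> float:
    """Instantaneous quota E_{i0}(t) = E_{i0} * exp(-rho_i * t)."""
    return E_i0 * math.exp(-rho_i * t)


def net_demand(E: float, a: float, E_i0: float, rho_i: float, t: float) -> float:
    """Net permit demand E_i(t) - E_{i0}(t) - a_i(t)."""
    return E - quota(E_i0, rho_i, t) - a


def trading_cost(S: float, E: float, a: float, E_i0: float, rho_i: float, t: float) -> float:
    """Cost S * (E_i - E_{i0}(t) - a_i); a negative value is revenue from selling."""
    return S * net_demand(E, a, E_i0, rho_i, t)


def position(E: float, a: float, E_i0: float, rho_i: float, t: float) -> str:
    """Classify the region as a buyer, a seller, or neutral at time t."""
    net = net_demand(E, a, E_i0, rho_i, t)
    if net > 0:
        return "buy"
    if net < 0:
        return "sell"
    return "neutral"


# Hypothetical values: emission 5, abatement 1, initial quota 3, decay rate 0.02.
print(position(5.0, 1.0, 3.0, 0.02, 0.0), trading_cost(1.0, 5.0, 1.0, 3.0, 0.02, 0.0))
print(position(2.0, 1.0, 3.0, 0.02, 0.0), trading_cost(1.0, 2.0, 1.0, 3.0, 0.02, 0.0))
```

Because the quota shrinks over time, a region that holds its emission and abatement fixed gradually moves from selling toward buying.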

The assumption of the exogenous emission permit price S, on the other hand, is a strong one. However, it is appropriate in a variety of cases. In our model, the two regions are neighbors such as countries, states, or cities and the pollutant can be gas, liquid or solid. Hence, the two players, of any size, are the contributors to the local pollution dynamics. If they are, say, located in Europe, they may not have much influence on the permit price in the large European Union permits market. In the case when the two regions are large, such as continents, we can, in view of rapid economic development and increasing globalization, envision a future global emission permit trading market in which the continental polluters have little influence on the emission permit price. A particular case (suggested by a referee) is that of carbon pricing mechanisms that work a little like feebates. If a region exceeds its target, it pays a fixed tax per unit exceeding the target. If it emits less than its target, it gets a fixed subsidy per unit below the target. In our model, this target for region i can be thought of as its quota \(E_{i0}(t)\), while S is the fixed per unit fee or rebate. A good instance of this situation is the “Carbon Competitiveness Incentive Regulation” in Canada’s Alberta Province. Finally, we note that our way of modeling emission trading is an extension of Li [17] in that we allow for both the terminal salvage cost and the abatement policy.

The goal of region i is to choose the emission level and the abatement strategy that maximize its discounted stream of net revenue over a finite horizon T, comprising production revenue, costs from emission permits trading, pollution damages, and abatement costs. Additionally, following Yeung and Petrosyan [26], we assume that at the end of the game, region i’s salvage cost of dealing with the excess pollution stock is \(g_i(P(T)-{\bar{P}}_i)e^{-rT}\), where \(g_i\) is the unit cost, \({\bar{P}}_i\) denotes the datum pollution stock, and \(P(T)-{\bar{P}}_i\) is the excess pollution stock which needs to be treated. Then, the optimal control problem of region i is given by

$$\begin{aligned}&\max _{E_i(t)\ge 0,a_i(t)}\int _0^{\mathrm{T}}e^{-rt}\Bigg [(A_i-S)E_i(t)-\frac{1}{2}E_i^2(t) +SE_{i0}e^{-\rho _i t}+Sa_{i}(t)-D_iP(t)\\&\qquad \qquad -\frac{1}{2}C_ia_i^2(t)\Bigg ]\hbox {d}t-g_i(P(T)-{\bar{P}}_i)e^{-rT},\\ \end{aligned}$$

subject to

$$\begin{aligned} {\dot{P}}(t)&=E_1(t)-a_1(t)+E_2(t)-a_2(t)-\theta P(t), ~~\quad ~~P(0)=P_0,\\ P(t)&\ge 0, t \in [0,T], \end{aligned}$$

where \(r>0\) is the risk-free discount rate.

In the next two sections, we use optimal control theory and HJB equations to find the optimal emission levels and optimal abatement strategies that maximize the two players’ discounted flows of profits under the noncooperative and the cooperative games, respectively.

3 Noncooperative Game

Each player in a noncooperative game makes his own decision to maximize his own profits. Thus, the players seek a Nash equilibrium to obtain their optimal emission levels and optimal abatement strategies. That is, the problem of region 1 is described by

$$\begin{aligned}&\max _{E_{N1}(t)\ge 0,a_{N1}(t)}\int _0^{\mathrm{T}}e^{-rt}\Bigg [(A-S)E_{N1}(t)-\frac{1}{2}E_{N1}^2(t) +SE_{10}e^{-\rho _1 t}+Sa_{N1}(t)-DP(t)\nonumber \\&\qquad \quad -\frac{1}{2}Ca_{N1}^2(t)\Bigg ]\hbox {d}t-g_1(P(T)-{\bar{P}}_1)e^{-rT},\nonumber \\&\text{ subject } \text{ to } \quad {\dot{P}}(t)=E_{N1}(t)-a_{N1}(t)+E_{N2}(t)-a_{N2}(t)-\theta P(t),\quad P(0)=P_0,\quad P(t)\ge 0, \end{aligned}$$
(4)

and that of region 2 is governed by

$$\begin{aligned}&\max _{E_{N2}(t)\ge 0,a_{N2}(t)}\int _0^{\mathrm{T}}e^{-rt}\Bigg [(\alpha A-S)E_{N2}(t)-\frac{1}{2}E_{N2}^2(t) +SE_{20}e^{-\rho _2 t}+Sa_{N2}(t)-\beta DP(t)\\&\qquad \quad -\frac{1}{2}\gamma Ca_{N2}^2(t)\Bigg ]\hbox {d}t-g_2(P(T)-{\bar{P}}_2)e^{-rT},\\&\text{ subject } \text{ to } \quad {\dot{P}}(t)=E_{N1}(t)-a_{N1}(t)+E_{N2}(t)-a_{N2}(t)-\theta P(t), \quad P(0)=P_0,\quad P(t)\ge 0, \end{aligned}$$

where \(E_{N1}(t)\) and \(E_{N2}(t)\) denote the emission levels of regions 1 and 2, and \(a_{N1}(t)\) and \(a_{N2}(t)\) denote the abatement levels of regions 1 and 2 in the noncooperative game, respectively.

Using differential game theory, we obtain the system of HJB equations satisfied by the value functions \(V_{N1}\) and \(V_{N2}\) for regions 1 and 2 as follows:

$$\begin{aligned} \left\{ \begin{aligned} \max _{E_{N1}(t)\ge 0,a_{N1}(t)}\Bigg \{\frac{\partial V_{N1}}{\partial t}+\Big (E_{N1}(t)-a_{N1}(t)+E_{N2}(t)-a_{N2}(t)-\theta P(t)\Big )\frac{\partial V_{N1}}{\partial P}-rV_{N1}+F_{N1}\Bigg \}=0, \\ \max _{E_{N2}(t)\ge 0,a_{N2}(t)}\Bigg \{\frac{\partial V_{N2}}{\partial t}+\Big (E_{N1}(t)-a_{N1}(t)+E_{N2}(t)-a_{N2}(t)-\theta P(t)\Big )\frac{\partial V_{N2}}{\partial P}-rV_{N2}+F_{N2}\Bigg \}=0, \end{aligned} \right. \end{aligned}$$
(5)

with the terminal conditions

$$\begin{aligned} V_{N1}(P,T)=-g_1\left( P-{\bar{P}}_1\right) \quad \text{ and }\quad V_{N2}(P,T)=-g_2\left( P-{\bar{P}}_2\right) , \end{aligned}$$
(5a)

where

$$\begin{aligned} F_{N1}=(A-S)E_{N1}(t)-\frac{1}{2}E_{N1}^2(t)+SE_{10}e^{-\rho _1 t}+Sa_{N1}(t)-DP(t)-\frac{1}{2}Ca_{N1}^2(t), \end{aligned}$$

and

$$\begin{aligned} F_{N2}=(\alpha A-S)E_{N2}(t)-\frac{1}{2}E_{N2}^2(t)+SE_{20}e^{-\rho _2 t}+Sa_{N2}(t)-\beta DP(t)-\frac{1}{2}\gamma Ca_{N2}^2(t). \end{aligned}$$

According to the first-order optimality condition, we know that the optimal emission levels \(E^*_{N1}\) and \(E^*_{N2}\), and the optimal abatement levels \(a^*_{N1}\) and \(a^*_{N2}\) can be given by the following equations:

$$\begin{aligned} E^*_{N1}= & {} A-S+\frac{\partial V_{N1}}{\partial P}, \quad a^*_{N1}=\frac{1}{C}\Big (S-\frac{\partial V_{N1}}{\partial P}\Big ), \end{aligned}$$
(6a)
$$\begin{aligned} E^*_{N2}= & {} \alpha A-S+\frac{\partial V_{N2}}{\partial P}, \quad a^*_{N2}=\frac{1}{\gamma C}\Big (S-\frac{\partial V_{N2}}{\partial P}\Big ). \end{aligned}$$
(6b)
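
As a brief sketch of where these conditions come from (written for region 1; region 2 is analogous), differentiating the maximand of the first equation in (5) with respect to \(E_{N1}\) and \(a_{N1}\) and setting the derivatives to zero gives

$$\begin{aligned} \frac{\partial V_{N1}}{\partial P}+(A-S)-E_{N1}&=0,\\ -\frac{\partial V_{N1}}{\partial P}+S-Ca_{N1}&=0. \end{aligned}$$

Since the maximand is strictly concave in each control (the second derivatives are \(-1\) and \(-C\)), these stationary points are maximizers whenever the constraint \(E_{N1}\ge 0\) is not binding.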

Proposition 1

In the noncooperative game, let \(\{V_{Ni}(P,t),E^*_{Ni}(t),a^*_{Ni}(t)\}\), \(i=1,2\), denote the Nash equilibrium solutions of the value functions and the control variables. Then, we have

$$\begin{aligned} V_{N1}(P,t)&=l_1(t)P + k_1(t), \end{aligned}$$
(7a)
$$\begin{aligned} E^*_{N1}(t)&={\left\{ \begin{array}{ll} A-S+l_1(t)&{} S<A+\left( \frac{ D}{r+\theta }-g_1\right) e^{-(r+\theta )(T-t)}-\frac{ D}{r+\theta },\\ 0&{} S\ge A+\left( \frac{ D}{r+\theta }-g_1\right) e^{-(r+\theta )(T-t)}-\frac{ D}{r+\theta }, \end{array}\right. } \end{aligned}$$
(7b)
$$\begin{aligned} a^*_{N1}(t)&=\frac{1}{C}\left( S-l_1(t)\right) , \end{aligned}$$
(7c)
$$\begin{aligned} V_{N2}(P,t)&=l_2(t)P + k_2(t), \end{aligned}$$
(7d)
$$\begin{aligned} E^*_{N2}(t)&={\left\{ \begin{array}{ll} \alpha A-S+l_2(t)&{} S<\alpha A+\left( \frac{ \beta D}{r+\theta }-g_2\right) e^{-(r+\theta )(T-t)}-\frac{\beta D}{r+\theta },\\ 0&{} S\ge \alpha A+\left( \frac{\beta D}{r+\theta }-g_2\right) e^{-(r+\theta )(T-t)}-\frac{ \beta D}{r+\theta }, \end{array}\right. } \end{aligned}$$
(7e)
$$\begin{aligned} a^*_{N2}(t)&=\frac{1}{\gamma C}\left( S-l_2(t)\right) , \end{aligned}$$
(7f)

where \(l_1(t)\), \(l_2(t)\), \(k_1(t)\), \(k_2(t)\) are given in “Appendix A” section.

Proof

See “Appendix A” section.

We can clearly see that the value functions are linear functions with respect to the state variable, namely the pollution stock P.
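
The linear structure can be checked numerically. The sketch below assumes that the slope is \(l_i(t)=\left( \frac{D_i}{r+\theta }-g_i\right) e^{-(r+\theta )(T-t)}-\frac{D_i}{r+\theta }\), which is inferred here from the thresholds in (7b) and (7e) rather than copied from Appendix A. It verifies that this expression satisfies the terminal condition (5a) and the linear ordinary differential equation \({\dot{l}}_i=(r+\theta )l_i+D_i\) obtained by matching the coefficients of P in (5). All parameter values and function names are hypothetical.

```python
import math


def slope(t, D_i, g_i, r, theta, T):
    """Assumed slope l_i(t) of the value function V_Ni = l_i(t) * P + k_i(t)."""
    steady = D_i / (r + theta)
    return (steady - g_i) * math.exp(-(r + theta) * (T - t)) - steady


def emission(t, A_i, S, D_i, g_i, r, theta, T):
    """Emission as in (7b)/(7e): interior value, or 0 when S is too high."""
    return max(0.0, A_i - S + slope(t, D_i, g_i, r, theta, T))


def abatement(t, C_i, S, D_i, g_i, r, theta, T):
    """Abatement as in (7c)/(7f)."""
    return (S - slope(t, D_i, g_i, r, theta, T)) / C_i


def ode_residual(t, D_i, g_i, r, theta, T, h=1e-5):
    """Central-difference estimate of l' - (r + theta) * l - D_i (should be near 0)."""
    dl = (slope(t + h, D_i, g_i, r, theta, T) - slope(t - h, D_i, g_i, r, theta, T)) / (2 * h)
    return dl - (r + theta) * slope(t, D_i, g_i, r, theta, T) - D_i


# Hypothetical parameter values for region 1.
A, S, D, C, g1, r, theta, T = 10.0, 1.0, 0.5, 2.0, 0.5, 0.05, 0.1, 10.0
print(slope(T, D, g1, r, theta, T))            # terminal slope, approximately -g1
print(ode_residual(4.0, D, g1, r, theta, T))   # approximately zero
print(emission(4.0, A, S, D, g1, r, theta, T), abatement(4.0, C, S, D, g1, r, theta, T))
```

Writing the emission rule as `max(0, A_i - S + l_i(t))` is equivalent to the two cases in (7b) and (7e), because the threshold on S is exactly the condition \(A_i-S+l_i(t)>0\).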

If we let

$$\begin{aligned} \eta= & {} \min \left\{ A+\left( \frac{ D}{r+\theta }-g_1\right) e^{-(r+\theta )(T-t)}-\frac{ D}{r+\theta },\right. \\&\quad \left. \alpha A+\left( \frac{\beta D}{r+\theta }-g_2\right) e^{-(r+\theta )(T-t)}-\frac{\beta D}{r+\theta }\right\} ,\\ \phi= & {} \frac{1-e^{-\theta (T-t)}}{1-e^{-r(T-t)}}\frac{r(2\gamma C+\gamma +1)}{\gamma \theta (C+1)}\left( \frac{ D}{r+\theta }-g_1\right) e^{-r(T-t)}\\&+\,\frac{AC}{C+1}-\frac{D(2\gamma C+\gamma +1)}{\gamma (C+1)(r+\theta )},\\ \psi= & {} \frac{1-e^{-\theta (T-t)}}{1-e^{-r(T-t)}}\frac{r(2\gamma C+\gamma +1)}{\theta (\gamma C+1)}\left( \frac{ \beta D}{r+\theta }-g_2\right) e^{-r(T-t)}\\&+\,\frac{A\gamma C}{\gamma C+1}-\frac{\beta D(2\gamma C+\gamma +1)}{(\gamma C+1)(r+\theta )},\\ \tau _1(t)= & {} \frac{rCe^{-\rho _1T}\left( e^{\rho _1(T-t)}-e^{-r(T-t)}\right) }{(C+1)(1+\rho _1)(1-e^{-r(T-t)})}, \end{aligned}$$

and

$$\begin{aligned} \tau _2(t)=\frac{r\gamma Ce^{-\rho _2T}\left( e^{\rho _2(T-t)}-e^{-r(T-t)}\right) }{(\gamma C+1)(1+\rho _2)(1-e^{-r(T-t)})}, \end{aligned}$$

then, some useful results are presented in the following corollaries. \(\square \)

Corollary 1

  1. (i)

    When \(A-S+\left( \frac{ D}{r+\theta }-g_1\right) e^{-(r+\theta )(T-t)}-\frac{ D}{r+\theta }>E_{10}e^{-\rho _1 t}\), region 1 should buy emission permits at time t; when \(A-S+\left( \frac{ D}{r+\theta }-g_1\right) e^{-(r+\theta )(T-t)}-\frac{ D}{r+\theta }<E_{10}e^{-\rho _1 t}\), region 1 should sell emission permits at time t; and when \(A-S+\left( \frac{ D}{r+\theta }-g_1\right) e^{-(r+\theta )(T-t)}-\frac{ D}{r+\theta }=E_{10}e^{-\rho _1 t}\), region 1 neither buys nor sells emission permits at time t. Similarly, when \(\alpha A-S+\left( \frac{\beta D}{r+\theta }-g_2\right) e^{-(r+\theta )(T-t)}-\frac{\beta D}{r+\theta }>E_{20}e^{-\rho _2 t}\), region 2 should buy emission permits at time t; when \(\alpha A-S+\left( \frac{\beta D}{r+\theta }-g_2\right) e^{-(r+\theta )(T-t)}-\frac{\beta D}{r+\theta }<E_{20}e^{-\rho _2 t}\), region 2 should sell emission permits at time t; and when \(\alpha A-S+\left( \frac{\beta D}{r+\theta }-g_2\right) e^{-(r+\theta )(T-t)}-\frac{\beta D}{r+\theta }=E_{20}e^{-\rho _2 t}\), region 2 neither buys nor sells emission permits at time t.

  2. (ii)

    The profits of the two regions increase as the pollution stock P decreases and as their instantaneous emission permits quotas increase, at any time before T.

  3. (iii)

    If \(\tau _1(t)E_{10}\ge \phi \), region 1’s profit increases monotonically with the permit price at time t; if \(\tau _1(t)E_{10}\le \phi -\eta \), region 1’s profit decreases monotonically with the permit price at time t; if \(\phi -\eta \le \tau _1(t)E_{10} \le \phi \), region 1’s profit decreases monotonically with the permit price at time t when \(0<S<\phi -\tau _1(t)E_{10}\) and increases monotonically with the permit price at time t when \(\phi -\tau _1(t)E_{10}<S<\eta \). Similarly, if \(\tau _2(t)E_{20}\ge \psi \), region 2’s profit increases monotonically with the permit price at time t; if \(\tau _2(t)E_{20}\le \psi -\eta \), region 2’s profit decreases monotonically with the permit price at time t; if \(\psi -\eta \le \tau _2(t)E_{20} \le \psi \), region 2’s profit decreases monotonically with the permit price at time t when \(0<S<\psi -\tau _2(t)E_{20}\), and increases monotonically with the permit price at time t when \(\psi -\tau _2(t)E_{20}<S<\eta \).

Proof

See “Appendix B” section.

From the results in (iii) of Corollary 1, we can see that the instantaneous quotas play an essential role in determining the effect of the emission permit price on the two regions’ profits. If the instantaneous quotas are very large, then the emission levels will not exceed them, so the players can sell their excess emission permits in the market and benefit from increases in the permit price. If the two regions receive small instantaneous quotas, then the emission levels can easily exceed them. In this case, the two regions have to spend money to buy emission permits, and their profits decrease as the permit price increases. If the instantaneous quotas are of intermediate size, the two regions’ profits first decrease and then increase as the permit price increases.

The following corollary gives the trajectory of the pollution stock along the optimal emission path. Moreover, the evolution of the pollution stock is shown to depend strongly on the initial pollution stock. \(\square \)

Corollary 2

Let

$$\begin{aligned} X_{N}=\frac{1}{\theta }\left( (1+\alpha )A-\frac{2\gamma C+\gamma +1}{\gamma C}S-\frac{C+1}{C}\frac{D}{r+\theta }-\frac{\gamma C+1}{\gamma C}\frac{\beta D}{r+\theta }\right) \end{aligned}$$

and

$$\begin{aligned} Y_{N}=\frac{1}{r+2\theta }\left( \frac{C+1}{C}\left( \frac{D}{r+\theta }-g_1\right) +\frac{\gamma C+1}{\gamma C}\left( \frac{\beta D}{r+\theta }-g_2\right) \right) . \end{aligned}$$

Then we can obtain the trajectory of the pollution stock under the noncooperative game as follows:

$$\begin{aligned} P_N(t)=X_{N}+Y_{N}e^{-(r+\theta )(T-t)}+\left( P_{0}-X_{N}-Y_{N}e^{-(r+\theta )T}\right) e^{-\theta t}. \end{aligned}$$
(8)

In addition, when \(P_0<X_{N}+Y_{N}e^{-(r+\theta )T}(1+\frac{1}{\theta }(r+\theta )e^{(r+2\theta )t})\), the evolution of the pollution stock is an accumulative process; when \(P_0>X_{N}+Y_{N}e^{-(r+\theta )T}(1+\frac{1}{\theta }(r+\theta )e^{(r+2\theta )t})\), the evolution of the pollution stock is a dissipative process; when \(P_0=X_{N}+Y_{N}e^{-(r+\theta )T}(1+\frac{1}{\theta }(r+\theta )e^{(r+2\theta )t})\), the pollution stock is a constant.

Proof

See “Appendix C” section. \(\square \)
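
The closed form (8) can be cross-checked by integrating (3) numerically along the equilibrium strategies (7b), (7c), (7e) and (7f). The sketch below does this with a classical fourth-order Runge–Kutta scheme. The slopes \(l_i(t)\) are inferred from the thresholds in Proposition 1, and every parameter value is a hypothetical choice made so that emissions stay strictly positive.

```python
import math

# Hypothetical parameter values (not taken from the paper's numerical examples).
A, alpha = 10.0, 0.9          # A_1 = A, A_2 = alpha * A
D, beta = 0.5, 1.2            # D_1 = D, D_2 = beta * D
C, gamma = 2.0, 1.5           # C_1 = C, C_2 = gamma * C
S, r, theta = 1.0, 0.05, 0.1
g1, g2 = 0.5, 0.5
T, P0 = 10.0, 20.0


def slope(t, D_i, g_i):
    """Assumed slope l_i(t), inferred from the thresholds in (7b) and (7e)."""
    steady = D_i / (r + theta)
    return (steady - g_i) * math.exp(-(r + theta) * (T - t)) - steady


def net_inflow(t):
    """E_1 - a_1 + E_2 - a_2 along the interior equilibrium strategies."""
    l1, l2 = slope(t, D, g1), slope(t, beta * D, g2)
    e1, a1 = A - S + l1, (S - l1) / C
    e2, a2 = alpha * A - S + l2, (S - l2) / (gamma * C)
    return e1 - a1 + e2 - a2


X_N = ((1 + alpha) * A
       - (2 * gamma * C + gamma + 1) / (gamma * C) * S
       - (C + 1) / C * D / (r + theta)
       - (gamma * C + 1) / (gamma * C) * beta * D / (r + theta)) / theta
Y_N = ((C + 1) / C * (D / (r + theta) - g1)
       + (gamma * C + 1) / (gamma * C) * (beta * D / (r + theta) - g2)) / (r + 2 * theta)


def P_closed(t):
    """Closed-form trajectory (8)."""
    return (X_N + Y_N * math.exp(-(r + theta) * (T - t))
            + (P0 - X_N - Y_N * math.exp(-(r + theta) * T)) * math.exp(-theta * t))


def P_rk4(t_end, n=2000):
    """Integrate dP/dt = net_inflow(t) - theta * P from 0 to t_end with RK4."""
    f = lambda t, P: net_inflow(t) - theta * P
    h, t, P = t_end / n, 0.0, P0
    for _ in range(n):
        k1 = f(t, P)
        k2 = f(t + h / 2, P + h * k1 / 2)
        k3 = f(t + h / 2, P + h * k2 / 2)
        k4 = f(t + h, P + h * k3)
        P += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return P


print(P_closed(T), P_rk4(T))  # the two values should agree closely
```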

4 Cooperative Game

A cooperative game means that the players come to a binding agreement to make the coalition reach the optimum. In our cooperative game, the two regions find the optimal emission levels and optimal abatement strategies to maximize their joint net profit. The resulting problem can be written as follows:

$$\begin{aligned}&\max _{{\mathop {E_{C2}(t)\ge 0,a_{C2}(t)}\limits ^{E_{C1}(t)\ge 0,a_{C1}(t),}}}\int _0^{\mathrm{T}}e^{-rt} \left[ (A-S)E_{C1}(t)+(\alpha A-S)E_{C2}(t)+S(a_{C1}(t)+a_{C2}(t))\right. \\&\quad \quad -\,\frac{E_{C1}^2(t)+E_{C2}^2(t)}{2} +(E_{10}e^{-\rho _1t}+E_{20}e^{-\rho _2t})S-(1+\beta )D P(t)\\&\quad \quad \left. -\,\frac{1}{2}Ca_{C1}^2(t)-\frac{1}{2}\gamma Ca_{C2}^2(t)\right] \hbox {d}t-\sum _{i=1,2}{g_i(P(T)-{\bar{P}}_i)e^{-rT}}, \\&\text{ subject } \text{ to } \quad {\dot{P}}(t)=E_{C1}(t)-a_{C1}(t)+E_{C2}(t)-a_{C2}(t)-\theta P(t), \quad P(0)=P_0,\quad P(t)\ge 0, \end{aligned}$$

where \(E_{C1}(t)\) and \(E_{C2}(t)\) denote the emission levels of regions 1 and 2, and \(a_{C1}(t)\) and \(a_{C2}(t)\) denote the abatement levels of regions 1 and 2 in the cooperative game, respectively.

The corresponding HJB equation for the joint value function \(V_C\) can be written as follows:

$$\begin{aligned}&\max _{{\mathop {E_{C2}(t)\ge 0,a_{C2}(t)}\limits ^{E_{C1}(t)\ge 0,a_{C1}(t),}}}\Bigg \{\frac{\partial V_{C}}{\partial t}+(E_{C1}(t)-a_{C1}(t)+E_{C2}(t)-a_{C2}(t)-\theta P(t))\frac{\partial V_{C}}{\partial P}\nonumber \\&\quad \quad \qquad \qquad \qquad -\,rV_{C}+F_{C}\Bigg \}=0, \end{aligned}$$
(9)

with the terminal condition

$$\begin{aligned} V_{C}(P,T)=-g_1\left( P-{\bar{P}}_1\right) -g_2\left( P-{\bar{P}}_2\right) , \end{aligned}$$
(9a)

where

$$\begin{aligned} F_{C}&=(A-S)E_{C1}(t)+(\alpha A-S)E_{C2}(t)-\frac{E_{C1}^2(t)+E_{C2}^2(t)}{2} +\left( E_{10}e^{-\rho _1 t}+E_{20}e^{-\rho _2 t}\right) S\\ {}&\quad +S\left( a_{C1}(t)+a_{C2}(t)\right) -(1+\beta )D P(t)-\frac{1}{2}Ca_{C1}^2(t)-\frac{1}{2}\gamma Ca_{C2}^2(t). \end{aligned}$$

From the first-order optimality condition, we know that the two regions’ optimal emission levels \(E^*_{C1}\) and \(E^*_{C2}\), and the optimal abatement levels \(a^*_{C1}\) and \(a^*_{C2}\) can be given by the following equations:

$$\begin{aligned} E^*_{C1}= & {} A-S+\frac{\partial V_{C}}{\partial P}, \quad a^*_{C1}=\frac{1}{C}\Big (S-\frac{\partial V_{C}}{\partial P}\Big ), \end{aligned}$$
(10a)
$$\begin{aligned} E^*_{C2}= & {} \alpha A-S+\frac{\partial V_{C}}{\partial P}, \quad a^*_{C2}=\frac{1}{\gamma C}\Big (S-\frac{\partial V_{C}}{\partial P}\Big ). \end{aligned}$$
(10b)

Proposition 2

In the cooperative game, let \(\{V_{C}(P,t),E^*_{Ci}(t),a^*_{Ci}(t)\}\), \(i=1,2\), denote the solutions of the value function and the control variables. Then,

$$\begin{aligned}&V_{C}(P,t)=l(t)P + k(t), \end{aligned}$$
(11a)
$$\begin{aligned}&E^*_{C1}(t) ={\left\{ \begin{array}{ll} A-S+l(t) &{} S<A+\left( \frac{(1+\beta )D}{r+\theta }-g_1-g_2\right) e^{-(r+\theta )(T-t)}-\frac{ (1+\beta )D}{r+\theta },\\ 0 &{} S\ge A+\left( \frac{(1+\beta )D}{r+\theta }-g_1-g_2\right) e^{-(r+\theta )(T-t)}-\frac{ (1+\beta )D}{r+\theta },\end{array}\right. } \end{aligned}$$
(11b)
$$\begin{aligned}&a^*_{C1}(t)=\frac{1}{C}\left( S-l(t)\right) , \end{aligned}$$
(11c)
$$\begin{aligned}&E^*_{C2}(t) ={\left\{ \begin{array}{ll} \alpha A-S+l(t) &{} S<\alpha A+\left( \frac{(1+\beta )D}{r+\theta }-g_1-g_2\right) e^{-(r+\theta )(T-t)}-\frac{ (1+\beta )D}{r+\theta },\\ 0 &{} S\ge \alpha A+\left( \frac{(1+\beta )D}{r+\theta }-g_1-g_2\right) e^{-(r+\theta )(T-t)}-\frac{ (1+\beta )D}{r+\theta },\end{array}\right. } \end{aligned}$$
(11d)
$$\begin{aligned}&a^*_{C2}(t)=\frac{1}{\gamma C}\left( S-l(t)\right) , \end{aligned}$$
(11e)

where l(t) and k(t) are given in “Appendix D” section.

Proof

See “Appendix D” section.
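
A short sketch of why the cooperative strategies take this form, offered as an informal check rather than as a substitute for the proof: substituting \(V_C=l(t)P+k(t)\) into (9) and matching the coefficients of P, together with the terminal condition (9a), gives

$$\begin{aligned} {\dot{l}}(t)=(r+\theta )l(t)+(1+\beta )D,\quad l(T)=-g_1-g_2, \end{aligned}$$

whose solution is

$$\begin{aligned} l(t)=\left( \frac{(1+\beta )D}{r+\theta }-g_1-g_2\right) e^{-(r+\theta )(T-t)}-\frac{(1+\beta )D}{r+\theta }. \end{aligned}$$

This agrees with the thresholds in (11b) and (11d), which are exactly the conditions \(A-S+l(t)>0\) and \(\alpha A-S+l(t)>0\).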

Letting

$$\begin{aligned}&{\bar{\eta }}=\min \left\{ A+\left( \frac{(1+\beta )D}{r+\theta }-g_1-g_2\right) e^{-(r+\theta )(T-t)}-\frac{ (1+\beta )D}{r+\theta },\right. \\&\quad \quad \left. \,\alpha A+\left( \frac{(1+\beta )D}{r+\theta }-g_1-g_2\right) e^{-(r+\theta )(T-t)}-\frac{ (1+\beta )D}{r+\theta }\right\} ,\\&{\bar{\phi }}=\frac{1-e^{-\theta (T-t)}}{1-e^{-r(T-t)}}\frac{r}{\theta }\left( \frac{ (1+\beta )D}{r+\theta }-g_1-g_2\right) e^{-r(T-t)}\\&\quad \quad +\frac{2\gamma C}{2\gamma C+\gamma +1}(1+\alpha )A-\frac{(1+\beta )D}{r+\theta },\\&\tau _{C1}(t)=\frac{2\gamma Cre^{-\rho _1T}\left( e^{\rho _1(T-t)}-e^{-r(T-t)}\right) }{(r+\rho _1)(1-e^{-r(T-t)})(2\gamma C+\gamma +1)}, \end{aligned}$$

and

$$\begin{aligned} \tau _{C2}(t)=\frac{2\gamma Cre^{-\rho _2T}\left( e^{\rho _2(T-t)}-e^{-r(T-t)}\right) }{(r+\rho _2)\left( 1-e^{-r(T-t)}\right) (2\gamma C+\gamma +1)}, \end{aligned}$$

we have the following corollary. \(\square \)

Corollary 3

  1. (i)

    When \(A-S+\left( \frac{(1+\beta )D}{r+\theta }-g_1-g_2\right) e^{-(r+\theta )(T-t)}-\frac{ (1+\beta )D}{r+\theta }>E_{10}e^{-\rho _1 t}\), region 1 buys emission permits at time t; when \(A-S +\left( \frac{(1+\beta )D}{r+\theta }-g_1-g_2\right) e^{-(r+\theta )(T-t)}-\frac{ (1+\beta )D}{r+\theta }<E_{10}e^{-\rho _1 t}\), region 1 sells emission permits at time t; and when \(A-S+ \left( \frac{(1+\beta )D}{r+\theta }-g_1-g_2\right) e^{-(r+\theta )(T-t)}-\frac{ (1+\beta )D}{r+\theta }=E_{10}e^{-\rho _1 t}\), region 1 neither buys nor sells the emission permits at time t. Similarly, when \(\alpha A-S+ \left( \frac{(1+\beta )D}{r+\theta }-g_1-g_2\right) e^{-(r+\theta )(T-t)}-\frac{ (1+\beta )D}{r+\theta }>E_{20}e^{-\rho _2 t}\), region 2 buys emission permits at time t; when \(\alpha A-S+ \left( \frac{(1+\beta )D}{r+\theta }-g_1-g_2\right) e^{-(r+\theta )(T-t)}-\frac{ (1+\beta )D}{r+\theta }<E_{20}e^{-\rho _2 t}\), region 2 sells emission permits at time t; and when \(\alpha A-S +\left( \frac{(1+\beta )D}{r+\theta }-g_1-g_2\right) e^{-(r+\theta )(T-t)}-\frac{ (1+\beta )D}{r+\theta }=E_{20}e^{-\rho _2 t}\), region 2 neither buys nor sells emission permits at time t.

  2. (ii)

    The joint profit of the two regions increases when the pollution stock P decreases or when their instantaneous emission permits quotas increase at any time before T.

  3. (iii)

    If \(\tau _{C1}(t)E_{10}+\tau _{C2}(t)E_{20}\ge {\bar{\phi }}\), then the two regions’ total profit increases monotonically with the permit price at time t; if \(\tau _{C1}(t)E_{10}+\tau _{C2}(t)E_{20}\le {\bar{\phi }}-2{\bar{\eta }}\), the two regions’ total profit decreases monotonically with the permit price at time t; if \({\bar{\phi }}-2{\bar{\eta }}< \tau _{C1}(t)E_{10}+\tau _{C2}(t)E_{20} < {\bar{\phi }}\), the two regions’ total profit decreases monotonically with the permit price at time t when \(0<S<\frac{1}{2}({\bar{\phi }}-\tau _{C1}(t)E_{10}-\tau _{C2}(t)E_{20})\) and increases monotonically with the permit price at time t when \(\frac{1}{2}({\bar{\phi }}-\tau _{C1}(t)E_{10}-\tau _{C2}(t)E_{20})<S<{\bar{\eta }}\).

Proof

See “Appendix E” section.

In the cooperative game, if the sum of the instantaneous quotas is large, the joint profit increases with the permit price; with small instantaneous quotas, the joint profit decreases with the permit price. If the sum of the instantaneous quotas is of intermediate size, the joint profit first decreases and then increases with the permit price. \(\square \)

Corollary 4

Let

$$\begin{aligned} X_{C}=\frac{1}{\theta }\left( (1+\alpha )A-\frac{2\gamma C+\gamma +1}{\gamma C}S-\frac{2\gamma C+\gamma +1}{\gamma C}\frac{(1+\beta )D}{r+\theta }\right) \end{aligned}$$

and

$$\begin{aligned} Y_{C}=\frac{1}{r+2\theta }\frac{2\gamma C+\gamma +1}{\gamma C}\left( \frac{(1+\beta ) D}{r+\theta }-g_1-g_2\right) . \end{aligned}$$

Then, the trajectory of the pollution stock in the cooperative game is:

$$\begin{aligned} P_C(t)=X_{C}+Y_{C}e^{-(r+\theta )(T-t)}+ \left( P_{0}-X_{C}-Y_{C}e^{-(r+\theta )T}\right) e^{-\theta t}. \end{aligned}$$
(12)

In addition, when \(P_0<X_{C}+Y_{C}e^{-(r+\theta )T} \left( 1+\frac{1}{\theta }(r+\theta )e^{(r+2\theta )t}\right) \), the evolution of the pollution stock is an accumulative process; when \(P_0>X_{C}+Y_{C}e^{-(r+\theta )T}\left( 1+\frac{1}{\theta }(r+\theta )e^{(r+2\theta )t}\right) \), the evolution of pollution stock is a dissipative process; when \(P_0=X_{C}+Y_{C}e^{-(r+\theta )T} \left( 1+\frac{1}{\theta }(r+\theta )e^{(r+2\theta )t}\right) \), the pollution stock is a constant.

Proof

See “Appendix F” section. \(\square \)
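As a minimal sketch of how Corollary 4 can be evaluated, the Python code below computes \(X_C\), \(Y_C\), the trajectory (12), and the threshold that separates the accumulative and dissipative cases. The parameter values are those of the numerical example in Sect. 5.4 and serve only as an illustration; the helper names are assumptions of this sketch.

```python
import math

# Illustrative values from the numerical example in Sect. 5.4.
PARAMS = dict(A=20.0, alpha=0.9, S=1.0, D=3.0, beta=1.2, C=2.0, gamma=1.5,
              r=0.08, theta=0.6, g1=8.0, g2=9.0, T=10.0, P0=200.0)


def coefficients(p=PARAMS):
    """Return (X_C, Y_C) as defined in Corollary 4."""
    k = (2 * p["gamma"] * p["C"] + p["gamma"] + 1) / (p["gamma"] * p["C"])
    d = (1 + p["beta"]) * p["D"] / (p["r"] + p["theta"])
    x_c = ((1 + p["alpha"]) * p["A"] - k * p["S"] - k * d) / p["theta"]
    y_c = k * (d - p["g1"] - p["g2"]) / (p["r"] + 2 * p["theta"])
    return x_c, y_c


def stock(t, p=PARAMS):
    """Pollution stock P_C(t) from Eq. (12)."""
    x_c, y_c = coefficients(p)
    rate = p["r"] + p["theta"]
    transient = p["P0"] - x_c - y_c * math.exp(-rate * p["T"])
    return (x_c + y_c * math.exp(-rate * (p["T"] - t))
            + transient * math.exp(-p["theta"] * t))


def regime(t, p=PARAMS):
    """Classify the evolution at time t by comparing P_0 with the threshold."""
    x_c, y_c = coefficients(p)
    rate = p["r"] + p["theta"]
    growth = math.exp((p["r"] + 2 * p["theta"]) * t)
    threshold = x_c + y_c * math.exp(-rate * p["T"]) * (
        1 + rate / p["theta"] * growth)
    if p["P0"] < threshold:
        return "accumulative"
    if p["P0"] > threshold:
        return "dissipative"
    return "constant"
```

By construction, `stock(0.0)` returns the initial stock \(P_0\), which gives a quick consistency check on the coefficients.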

5 Discussion

In this section, we compare the optimal emission paths, the optimal abatement strategies and the value functions obtained in the two games.

5.1 Difference in the Optimal Strategies

First, we examine the difference in the optimal emission paths and the optimal abatement strategies. From Propositions 1 and 2, we can obtain

$$\begin{aligned} E^*_{C1}-E^*_{N1}= & {} \left( \frac{\beta D}{r+\theta }-g_2\right) e^{-(r+\theta )(T-t)}-\frac{ \beta D}{r+\theta }<0,\\ a^*_{C1}-a^*_{N1}= & {} -\frac{1}{C}\Bigg (\left( \frac{\beta D}{r+\theta }-g_2\right) e^{-(r+\theta )(T-t)}-\frac{ \beta D}{r+\theta }\Bigg )>0,\\ E^*_{C2}-E^*_{N2}= & {} \left( \frac{ D}{r+\theta }-g_1\right) e^{-(r+\theta )(T-t)}-\frac{ D}{r+\theta }<0,\\ a^*_{C2}-a^*_{N2}= & {} -\frac{1}{\gamma C}\Bigg (\left( \frac{ D}{r+\theta }-g_1\right) e^{-(r+\theta )(T-t)}-\frac{ D}{r+\theta }\Bigg )>0, \end{aligned}$$

which imply that in both regions, the optimal emission levels under the cooperative game are lower than those under the noncooperative game, and the optimal abatement levels under the cooperative game are higher than those under the noncooperative game. The obvious reason is that under the noncooperative game, the two regions make decisions to maximize their respective profits. This creates a conflicting situation in which one region may try to reduce its pollution damage by reducing emissions or implementing pollution abatement only to find that its effort is offset by the other region increasing the emission level to obtain more production profit. This would not happen in the cooperative game because of their binding contract to cooperate.
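The signs of these four differences can be verified numerically. The sketch below is illustrative: the default parameter values come from the numerical example in Sect. 5.4, and the function name `strategy_gaps` is our own. At \(t=T\) the exponential equals one, so the emission gaps reduce to \(-g_2\) and \(-g_1\), which gives a simple check.

```python
import math


def strategy_gaps(t, D=3.0, beta=1.2, C=2.0, gamma=1.5,
                  r=0.08, theta=0.6, g1=8.0, g2=9.0, T=10.0):
    """Cooperative minus noncooperative strategies at time t."""
    decay = math.exp(-(r + theta) * (T - t))
    d1 = D / (r + theta)
    d2 = beta * D / (r + theta)
    gap_e1 = (d2 - g2) * decay - d2      # E*_C1 - E*_N1
    gap_e2 = (d1 - g1) * decay - d1      # E*_C2 - E*_N2
    return {
        "E1": gap_e1,
        "a1": -gap_e1 / C,               # a*_C1 - a*_N1
        "E2": gap_e2,
        "a2": -gap_e2 / (gamma * C),     # a*_C2 - a*_N2
    }


if __name__ == "__main__":
    for t in (0.0, 5.0, 10.0):
        print(t, strategy_gaps(t))
```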

Moreover, we also know from Propositions 1 and 2 that if the salvage cost coefficients \(g_1\) and \(g_2\) are sufficiently large, the optimal emission paths of the two regions are both decreasing over time and the optimal abatement levels of the two regions are both increasing over time in the noncooperative as well as in the cooperative game. This implies that when the two regions have to suffer from a high salvage cost, they are more willing to reduce emissions and implement pollution abatements to lower the pollution stock level.

5.2 Difference in the Optimal Trajectories of the Pollution Stock

According to (8) and (12), we can calculate the difference in the optimal trajectories of the pollution stocks as follows:

$$\begin{aligned} P_{C}(t)-P_{N}(t)&=(X_C-X_N)+(Y_C-Y_N)e^{-(r+\theta )(T-t)}\\&\quad -\Big ((X_C-X_N)+(Y_C-Y_N)e^{-(r+\theta )T}\Big )e^{-\theta t}\\&=(X_C-X_N)\left( 1-e^{-\theta t}\right) + \left( Y_C-Y_N\right) e^{-(r+\theta )T}\left( e^{(r+\theta )t}-e^{-\theta t}\right) , \end{aligned}$$

where

$$\begin{aligned} X_C-X_N=-\frac{1}{C}\frac{1}{\theta }\frac{D}{r+\theta } \left( C+\frac{1}{\gamma }+\beta C+\beta \right) <0 \end{aligned}$$

and

$$\begin{aligned} Y_C-Y_N=\frac{1}{r+2\theta }\Bigg (\Big (\frac{\gamma C+1}{\gamma C}+\frac{\beta (\gamma C+\gamma )}{\gamma C}\Big )\frac{D}{r+\theta }-\frac{\gamma C+1}{\gamma C}g_1-\frac{\gamma C+\gamma }{\gamma C}g_2\Bigg ). \end{aligned}$$

Then, we can find a sufficiently large pair \((g_1^{*},g_2^{*})\) to make \(P_C(t)-P_N(t)<0\) at each time t, since \(P_C-P_N\) is a decreasing function of \(g_1\) and \(g_2\). This means that the pollution stock under the cooperative game will be lower than that under the noncooperative game. It naturally follows that any reduction in emission and increase in abatement can diminish the accumulation of the pollution stock.
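For a concrete check, the sketch below evaluates \(X_C-X_N\), \(Y_C-Y_N\) and the first (unsimplified) line of the expression for \(P_C(t)-P_N(t)\). The default parameters are taken from the numerical example in Sect. 5.4 and are only illustrative; the function name `stock_gap` is an assumption of this sketch.

```python
import math


def stock_gap(t, D=3.0, beta=1.2, C=2.0, gamma=1.5,
              r=0.08, theta=0.6, g1=8.0, g2=9.0, T=10.0):
    """P_C(t) - P_N(t), using the displayed formulas for X_C-X_N and Y_C-Y_N."""
    d = D / (r + theta)
    gc = gamma * C
    dx = -(1 / C) * (1 / theta) * d * (C + 1 / gamma + beta * C + beta)
    dy = (1 / (r + 2 * theta)) * (
        ((gc + 1) / gc + beta * (gc + gamma) / gc) * d
        - (gc + 1) / gc * g1
        - (gc + gamma) / gc * g2
    )
    rate = r + theta
    return (dx + dy * math.exp(-rate * (T - t))
            - (dx + dy * math.exp(-rate * T)) * math.exp(-theta * t))


if __name__ == "__main__":
    for t in (0.0, 2.0, 5.0, 10.0):
        print(t, stock_gap(t))
```

Both trajectories start from the same \(P_0\), so the gap vanishes at \(t=0\); for these parameter values it is negative at every later time.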

5.3 Difference in the Optimal Net Revenues

From (7a), (7d) and (11a), we can calculate the difference in the optimal profits as follows:

$$\begin{aligned} V_C-V_{N1}-V_{N2}&=(G_C-G_{N1}-G_{N2})+(H_C-H_{N1}-H_{N2})e^{-(r+\theta )(T-t)}\\&\quad +(I_C-I_{N1}-I_{N2})e^{-2(r+\theta )(T-t)} -\Big ((G_C-G_{N1}-G_{N2})\\&\quad +(H_C-H_{N1}-H_{N2})+(I_C-I_{N1}-I_{N2})\Big )e^{-r(T- t)}\\&=(G_C-G_{N1}-G_{N2})(1-e^{-r(T-t)})\\&\quad +(H_C-H_{N1}-H_{N2})e^{-r(T-t)}\left( e^{-\theta (T-t)}-1\right) \\&\quad +(I_C-I_{N1}-I_{N2})e^{-r(T-t)}\left( e^{-(r+2\theta )(T-t)}-1\right) , \end{aligned}$$

where

$$\begin{aligned} G_C-G_{N1}-G_{N2}= & {} \frac{1}{2\gamma C}\frac{1}{r}\left( \frac{D}{r+\theta }\right) ^2\Big (\beta ^2\gamma (1+C)+\gamma C+1\Big )>0, \\ H_C-H_{N1}-H_{N2}= & {} \frac{1}{\theta }\frac{D}{r+\theta }\Bigg (\frac{\beta ^2(2\gamma C+\gamma +1)+(1-\beta )(\gamma C+1)}{\gamma C}\frac{D}{r+\theta }\\&\quad -\,\frac{\gamma C+1}{\gamma C}g_1-\frac{\beta (\gamma C+\gamma )}{\gamma C}g_2\Bigg ), \end{aligned}$$

and

$$\begin{aligned} I_C-I_{N1}-I_{N2}=-\frac{1}{r+2\theta }\Bigg (\frac{\gamma C+1}{2\gamma C}\Big (\frac{D}{r+\theta }-g_1\Big )^2+\frac{\gamma C+\gamma }{2\gamma C}\Big (\frac{\beta D}{r+\theta }-g_2\Big )^2\Bigg )<0. \end{aligned}$$

Note that the coefficients \(e^{-\theta (T-t)}-1\) and \(e^{-(r+2\theta )(T-t)}-1\) are negative, which makes \(V_C-V_{N1}-V_{N2}\) an increasing function of \(g_1\) and \(g_2\) when \(g_1>\frac{D}{r+\theta }\) and \(g_2>\frac{\beta D}{r+\theta }\). Thus, we can also find a sufficiently large pair \((g_1^{**},g_2^{**})\) to make \(V_C-V_{N1}-V_{N2}>0\). Therefore, we can say that when the salvage costs are large enough, it is better for the two regions to cooperate.
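The sign of \(V_C-V_{N1}-V_{N2}\) can be explored with a short script. The sketch below transcribes the three coefficient differences displayed above and evaluates the final form of the gap; the default parameter values are those of the numerical example in Sect. 5.4, and the function name `value_gap` is our own.

```python
import math


def value_gap(t, D=3.0, beta=1.2, C=2.0, gamma=1.5,
              r=0.08, theta=0.6, g1=8.0, g2=9.0, T=10.0):
    """V_C - V_N1 - V_N2 at time t, from the displayed G, H and I differences."""
    d = D / (r + theta)
    gc = gamma * C
    g_diff = (1 / (2 * gc)) * (1 / r) * d ** 2 * (
        beta ** 2 * gamma * (1 + C) + gc + 1)
    h_diff = (1 / theta) * d * (
        (beta ** 2 * (2 * gc + gamma + 1) + (1 - beta) * (gc + 1)) / gc * d
        - (gc + 1) / gc * g1
        - beta * (gc + gamma) / gc * g2
    )
    i_diff = -(1 / (r + 2 * theta)) * (
        (gc + 1) / (2 * gc) * (d - g1) ** 2
        + (gc + gamma) / (2 * gc) * (beta * d - g2) ** 2
    )
    tau = T - t                      # time to go
    disc = math.exp(-r * tau)
    return (g_diff * (1 - disc)
            + h_diff * disc * (math.exp(-theta * tau) - 1)
            + i_diff * disc * (math.exp(-(r + 2 * theta) * tau) - 1))


if __name__ == "__main__":
    for t in (0.0, 2.5, 5.0, 7.5, 10.0):
        print(t, value_gap(t))
```

The gap is zero at \(t=T\), where both games share the same terminal payoff, and for these parameter values it is positive before \(T\).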

5.4 Numerical Example

In this subsection, we illustrate our results by some numerical examples. The parameters are: \(T=10\), \(A=20\), \(\alpha =0.9\), \(E_{10}=5\), \(E_{20}=6\), \(\rho _1=0.05\), \(\rho _2=0.04\), \(\theta =0.6\), \(P_{0}=200\), \(S=1\), \(D=3\), \(\beta =1.2\), \(C=2\), \(\gamma =1.5\), \(r=0.08\), \(g_1=8\), \(g_2=9\), \(\bar{P_1}=1100\), and \(\bar{P_2}=1200\) (Chang et al. [7]).

5.4.1 Basic Results

The results of Propositions 1 and 2 are sketched in Fig. 1. The two regions’ optimal emission levels and optimal abatement levels under the noncooperative and the cooperative games are shown in Fig. 1a, b, respectively. We can see that the emission levels and the abatement levels are relatively stable at the start, while the former show a decreasing trend and the latter show an increasing trend closer to the terminal time point. From Propositions 1 and 2, we know that the emission and abatement paths evolve in opposite directions. This implies that in a dynamic transboundary pollution game with a terminal salvage cost, the players adopt relatively stable emission and abatement strategies most of the time, and then rapidly decrease emissions and increase abatement close to the terminal time to reduce the pollution stock and to avoid a high salvage cost. This behavior captures the early revenues while limiting the loss due to pollution as much as possible.

Fig. 1 Numerical results: a emission levels, b abatement levels, c pollution stocks, d value functions

In addition, in each region, the noncooperative emission level is higher than the cooperative emission level, and the noncooperative abatement level is lower than the cooperative abatement level. Figure 1c also demonstrates that the pollution stock under the noncooperative game is higher than that under the cooperative game. All these results imply that cooperation plays an important role in improving the environmental quality.

It is necessary to focus on the issue of how to distribute the aggregate profit to each player in the cooperative game. The payment distribution mechanism that we use in this paper is to share the aggregate profit by the proportion of their noncooperative profits. This can be expressed mathematically as

$$\begin{aligned} V_{\mathrm{Ci}}=\frac{V_{\mathrm{Ni}}}{V_{N1}+V_{N2}}V_C, \end{aligned}$$

\(i=1,2\), where \(V_{\mathrm{Ci}}\) denotes region i’s value function in the cooperative game. According to Yeung [25], Yeung and Petrosyan [26] and Chang et al. [7], this payment distribution mechanism supports the subgame consistent solution, and in this case the two players should be committed to cooperation throughout the game. The two regions will cooperate when \(V_C>V_{N1}+V_{N2}\), in which case the profit \(V_{\mathrm{Ci}}\) in the cooperative game is higher than the profit \(V_{\mathrm{Ni}}\) in the noncooperative game. Based on this, we can plot the two regions’ value functions under the noncooperative and the cooperative games in Fig. 1d, which shows that the two regions would obtain higher profits when they cooperate. Therefore, we conclude that, for these parameter values, cooperation is better than noncooperation in both economic and environmental terms.
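The proportional sharing rule is straightforward to implement. The following sketch uses hypothetical profit figures chosen only for illustration, and the function name `share_cooperative_profit` is our own. It assumes positive noncooperative profits, so that each share exceeds the corresponding noncooperative profit whenever \(V_C>V_{N1}+V_{N2}\).

```python
def share_cooperative_profit(v_c, v_n1, v_n2):
    """Split the joint profit v_c in proportion to the noncooperative profits."""
    total = v_n1 + v_n2
    if total == 0:
        raise ValueError("noncooperative profits must not sum to zero")
    return v_c * v_n1 / total, v_c * v_n2 / total


if __name__ == "__main__":
    # Hypothetical numbers, not taken from the paper's results.
    v_c1, v_c2 = share_cooperative_profit(120.0, 40.0, 60.0)
    print(v_c1, v_c2)
```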

5.4.2 The Effects of the Salvage Cost on the Results

It is easy to see from our theoretical results that the datum pollution stock \({\bar{P}}_i\) influences only the revenues of the players: the larger \({\bar{P}}_i\) is, the higher the revenues. We therefore examine only the effects of the unit salvage cost \(g_i\). Since the results are similar for the two players under the cooperative and noncooperative games, for brevity we present only the effects of \(g_1\) on region 1’s optimal emission and abatement strategies, optimal net revenues, and the optimal trajectory of the pollution stock under the noncooperative game.

Figure 2, in which \(g_1\) is set at 6, 8, and 10, shows the effects of \(g_1\) on the results. We can see from the figure that a higher unit salvage cost results in a lower emission level, a higher abatement level and more revenues, while the optimal trajectory of the pollution stock is insensitive to \(g_1\). This implies that a higher unit salvage cost should incentivize the players to reduce emission, increase abatement, and then further reduce costs from the emission permits trading scheme, leading to increases in net revenues.

In Fig. 3, we plot the emission levels and abatement levels for a longer horizon \(T=100\). Here we see that the results are primarily driven by the salvage value function. Specifically, the emission and abatement levels are at a steady state until close to the terminal time when the levels deviate from the steady state because of the influence of the salvage cost. Such results are common in the economics literature; see, e.g., Sects. 7.2 and 13.3 in Sethi and Thompson [24].

Fig. 2 The effects of \(g_1\) on the results: a emission levels, b abatement levels, c pollution stocks, d value functions

Fig. 3 The effects of time horizon on the results: a emission levels, c abatement levels

6 Stochastic Extension

6.1 Game Framework

Our model can be extended to a stochastic multi-dimensional version by considering stochastic emission permit prices and a stochastic evolution of the pollution stock. Several studies, such as Chang et al. [6] and Daskalakis et al. [12], have demonstrated that the emission permit price should be stochastic, as it is driven by the scarcity of emission permits and market discipline. Moreover, the dynamic process of the pollution stock may be affected by extreme weather events such as gales, nature’s ability to refresh the air, and other stochastic disturbances. In recent years, several stochastic optimal control models have been proposed to deal with the uncertainty in pollution stock, such as Athanassoglou and Xepapadeas [1], Masoudi et al. [22], and Yi et al. [27]. Based on the above, we consider the following dynamics of the emission permit price S(t) and the pollution stock P(t):

$$\begin{aligned} \hbox {d}S(t)=\mu _SS(t)\hbox {d}t+\sigma _SS(t)\hbox {d}W_S, \end{aligned}$$
(14)

and

$$\begin{aligned} \hbox {d}P(t)=(E_1(t)-a_1(t)+E_2(t)-a_2(t)-\theta P(t))\hbox {d}t + \sigma _PP(t)\hbox {d}W_P, \end{aligned}$$
(15)

where \(\mu _S\) is the drift rate of the emission permit price, \(\sigma _S\) and \(\sigma _P\) are the volatilities of the emission permit price and the pollution stock, respectively, and \(W_S\) and \(W_P\) are two correlated Brownian motions with the correlation coefficient \(\rho >0\), whereas \(\rho =0\) means they are independent of each other.
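To illustrate the dynamics (14) and (15), the sketch below simulates one sample path with the Euler–Maruyama scheme. It is not the fitted finite volume method used later for the HJB equations. The function `simulate_paths` and its argument `net_emission`, a user-supplied function returning \(E_1-a_1+E_2-a_2\) at time t, are assumptions of this sketch. The correlated increments are built from two independent standard normal draws.

```python
import math
import random


def simulate_paths(s0, p0, mu_s, sigma_s, sigma_p, theta, rho,
                   net_emission, horizon, n_steps, seed=0):
    """Euler-Maruyama discretization of Eqs. (14) and (15).

    net_emission(t) is a hypothetical stand-in for E_1 - a_1 + E_2 - a_2.
    Returns two lists (S, P) of length n_steps + 1.
    """
    rng = random.Random(seed)
    dt = horizon / n_steps
    s_path, p_path = [s0], [p0]
    for k in range(n_steps):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        dw_s = math.sqrt(dt) * z1
        # Increment of W_P with correlation rho to W_S.
        dw_p = math.sqrt(dt) * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
        s, p = s_path[-1], p_path[-1]
        s_path.append(s + mu_s * s * dt + sigma_s * s * dw_s)
        p_path.append(p + (net_emission(k * dt) - theta * p) * dt
                      + sigma_p * p * dw_p)
    return s_path, p_path


if __name__ == "__main__":
    # Constant net emission of 6 units is an arbitrary illustrative choice.
    S, P = simulate_paths(1.0, 200.0, 0.2, 0.3, 0.3, 0.6, 0.5,
                          lambda t: 6.0, 10.0, 1000)
    print(S[-1], P[-1])
```

Setting both volatilities to zero recovers the explicit Euler discretization of the deterministic dynamics, which is a convenient sanity check.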

For the cooperative game with \({\mathbb {E}}\) denoting the expectation, the objective function and the corresponding HJB equation are

$$\begin{aligned}&\max _{\begin{array}{c} E_{C1}\ge 0,a_{C1},\\ E_{C2}\ge 0,a_{C2} \end{array}}{\mathbb {E}}\Bigg \{\int _0^{\mathrm{T}}e^{-rt}\Bigg [(A-S(t))E_{C1}+(\alpha A-S(t))E_{C2}-\frac{E_{C1}^2+E_{C2}^2}{2}\\&\quad \quad +S(t)(a_{C1}+a_{C2})+S(t)(E_{10}e^{-\rho _1t}+E_{20}e^{-\rho _2t}) -(1+\beta )D P(t)-\frac{1}{2}Ca_{C1}^2\\&\quad \quad -\,\frac{1}{2}\gamma Ca_{C2}^2\Bigg ]\hbox {d}t\Bigg \}-\sum _{i=1,2}{g_i(P(T)-{\bar{P}}_i)e^{-rT}},\\&\text{ subject } \text{ to } \quad {\left\{ \begin{array}{ll} \hbox {d}S(t)=\mu _S S(t)\hbox {d}t+\sigma _S S(t)\hbox {d}W_{S}, S(0)=S_0, \\ \hbox {d}P(t)=(E_{C1}(t)-a_{C1}(t)+E_{C2}(t)-a_{C2}(t)-\theta P(t))\hbox {d}t\\ \quad \qquad \qquad +\,\sigma _P P(t)\hbox {d}W_P, P(0)=P_0, \end{array}\right. } \end{aligned}$$

and

$$\begin{aligned}&\max _{\begin{array}{c} E_{C1}(t)\ge 0,a_{C1}(t),\\ E_{C2}(t)\ge 0,a_{C2}(t) \end{array}}\Bigg \{-\frac{\partial V_{C}}{\partial t}-\Big (E_{C1}(t)-a_{C1}(t)+E_{C2}(t)-a_{C2}(t)-\theta P(t)\Big )\frac{\partial V_{C}}{\partial P}\nonumber \\&\quad \quad -\,\frac{1}{2}\sigma _P^2P^2\frac{\partial ^2 V_C}{\partial P^2} -\mu _SS\frac{\partial V_C}{\partial S}-\frac{1}{2}\sigma _S^2S^2\frac{\partial ^2 V_C}{\partial S^2}-\rho \sigma _P\sigma _SPS\frac{\partial ^2 V_C}{\partial P\partial S}+rV_{C}-F_{C}\Bigg \}=0,\nonumber \\ \end{aligned}$$
(16)

with the terminal condition

$$\begin{aligned} V_{C}(P,T)=-g_1\left( P-{\bar{P}}_1\right) -g_2\left( P-{\bar{P}}_2\right) , \end{aligned}$$

where

$$\begin{aligned} F_{C}&=(A-S)E_{C1}(t)+(\alpha A-S)E_{C2}(t)-\frac{1}{2}\left( E_{C1}^2(t)+E_{C2}^2(t)\right) +S\left( E_{10}e^{-\rho _1t}+E_{20}e^{-\rho _2t}\right) \\&\quad +S(t)(a_{C1}(t)+a_{C2}(t))-(1+\beta )D P(t)-\frac{1}{2}Ca_{C1}^2(t)-\frac{1}{2}\gamma Ca_{C2}^2(t). \end{aligned}$$

For the noncooperative game, the objective functions for regions 1 and 2, and the corresponding HJB system are, respectively,

$$\begin{aligned}&\max _{\begin{array}{c} E_{N1}(t)\ge 0,\\ a_{N1}(t) \end{array}}{\mathbb {E}}\int _0^{\mathrm{T}}e^{-rt}\Bigg [(A-S(t))E_{N1}(t)-\frac{E_{N1}^2(t)}{2} \nonumber \\&\quad \qquad \quad +\,S(t)E_{10}e^{-\rho _1t}+S(t)a_{N1}(t) -DP(t)-\frac{1}{2}Ca_{N1}^2(t)\Bigg ]\hbox {d}t-g_1(P(T)-{\bar{P}}_1)e^{-rT},\\&\quad \text{ subject } \text{ to } \quad {\left\{ \begin{array}{ll} \hbox {d}S(t)=\mu _S S(t)\hbox {d}t+\sigma _S S(t)\hbox {d}W_{S}, S(0)=S_0, \\ \hbox {d}P(t)=(E_{N1}(t)-a_{N1}(t)+E_{N2}(t)-a_{N2}(t)-\theta P(t))\hbox {d}t\\ \quad \qquad \qquad +\,\sigma _P P(t)\hbox {d}W_P, P(0)=P_0, \end{array}\right. }\nonumber \end{aligned}$$
(17)

and for region 2,

$$\begin{aligned}&\max _{ \begin{array}{c} E_{N2}(t)\ge 0,\\ a_{N2}(t) \end{array}}{\mathbb {E}}\int _0^{\mathrm{T}}e^{-rt}\Bigg [(\alpha A-S(t))E_{N2}(t)-\frac{E_{N2}^2(t)}{2} +S(t)E_{20}e^{-\rho _2t}+S(t)a_{N2}(t)\nonumber \\ {}&\quad \qquad \qquad -\,\beta D P(t)-\frac{1}{2}\gamma Ca_{N2}^2(t)\Bigg ]\hbox {d}t-g_2(P(T)-{\bar{P}}_2)e^{-rT},\\&\quad \text{ subject } \text{ to } \quad {\left\{ \begin{array}{ll} \hbox {d}S(t)=\mu _S S(t)\hbox {d}t+\sigma _S S(t)\hbox {d}W_{S}, S(0)=S_0, \\ \hbox {d}P(t)=(E_{N1}(t)-a_{N1}(t)+E_{N2}(t)-a_{N2}(t)-\theta P(t))\hbox {d}t\\ \quad \qquad \qquad +\,\sigma _P P(t)\hbox {d}W_P, P(0)=P_0, \end{array}\right. }\nonumber \end{aligned}$$
(18)

and

$$\begin{aligned} \left\{ \begin{aligned} \max _{E_{N1}(t)\ge 0,a_{N1}(t)}\Bigg \{-\frac{\partial V_{N1}}{\partial t}-\Big (E_{N1}(t)-a_{N1}(t)+E_{N2}(t)-a_{N2}(t)-\theta P(t)\Big )\frac{\partial V_{N1}}{\partial P}-\frac{1}{2}\sigma _P^2P^2\frac{\partial ^2 V_{N1}}{\partial P^2}\\ -\,\mu _SS\frac{\partial V_{N1}}{\partial S}-\frac{1}{2}\sigma _S^2S^2\frac{\partial ^2 V_{N1}}{\partial S^2}-\rho \sigma _P\sigma _SPS\frac{\partial ^2 V_{N1}}{\partial P\partial S}+rV_{N1}-F_{N1}\Bigg \}=0, \\ \max _{E_{N2}(t)\ge 0,a_{N2}(t)}\Bigg \{-\frac{\partial V_{N2}}{\partial t}-\Big (E_{N1}(t)-a_{N1}(t)+E_{N2}(t)-a_{N2}(t)-\theta P(t)\Big )\frac{\partial V_{N2}}{\partial P}-\frac{1}{2}\sigma _P^2P^2\frac{\partial ^2 V_{N2}}{\partial P^2}\\ -\,\mu _SS\frac{\partial V_{N2}}{\partial S}-\frac{1}{2}\sigma _S^2S^2\frac{\partial ^2 V_{N2}}{\partial S^2}-\rho \sigma _P\sigma _SPS\frac{\partial ^2 V_{N2}}{\partial P\partial S}+rV_{N2}-F_{N2}\Bigg \}=0, \end{aligned} \right. \nonumber \\ \end{aligned}$$
(19)

with the terminal conditions

$$\begin{aligned} V_{N1}(P,T)=-g_1\left( P-{\bar{P}}_1\right) \quad \text{ and }\quad V_{N2}(P,T)=-g_2\left( P-{\bar{P}}_2\right) , \end{aligned}$$
(19a)

where

$$\begin{aligned} F_{N1}=(A-S)E_{N1}(t)-\frac{1}{2}E_{N1}^2(t)+SE_{10}e^{-\rho _1 t}+Sa_{N1}(t)-DP(t)-\frac{1}{2}Ca_{N1}^2(t), \end{aligned}$$

and

$$\begin{aligned} F_{N2}=(\alpha A-S)E_{N2}(t)-\frac{1}{2}E_{N2}^2(t)+SE_{20}e^{-\rho _2 t}+Sa_{N2}(t)-\beta DP(t)-\frac{1}{2}\gamma Ca_{N2}^2(t). \end{aligned}$$

Since the HJB Eqs. (19) and (19a) cannot be solved analytically, we propose a so-called fitted finite volume method to discretize the equations. This numerical scheme is presented in “Appendix G” section. The theoretical results about the method will be presented in another work. The values of parameters used in this subsection are the same as those in Chang et al. [7]: \(T=10\), \(A=20\), \(\alpha =0.9\), \(P_{\min }=200\), \(P_{\max }=1000\), \(S_{\min }=0\), \(S_{\max }=2\), \(E_{10}=5\), \(E_{20}=6\), \(\rho _1=0.05\), \(\rho _2=0.04\), \(\theta =0.6\), \(\sigma _P=0.3\), \(\sigma _S=0.3\), \(\mu _S=0.2\), \(\rho =0.5\), \(D=0.1\), \(\beta =1.2\), \(r=0.08\), \(g_1=2\), \(g_2=3\), \({\bar{P}}_1=1100\), \({\bar{P}}_2=1200\), where \([P_{\min },P_{\max }]\times [S_{\min },S_{\max }]\) is the computational region of our problems.

Fig. 4 Numerical results for the stochastic model: a emission levels, b abatement levels, c value functions

6.2 Numerical Results

In Fig. 4, we plot the numerical results of the stochastic model over time by fixing \(P=800\) and \(S=1\). Once again we see that both regions will benefit from cooperation. Further, cooperation also leads to lower emission levels and higher abatement levels for each region.

It is interesting to note that the emission levels decrease over time and the abatement levels increase over time, as in the deterministic case, whereas the profits first increase with time and then decrease sharply until the terminal time, unlike in the deterministic case. We attribute this difference to the randomness in the pollution stock and the emission permit price. The risk due to randomness accumulates gradually as time progresses, and once it reaches a threshold value, the players in the game suffer losses, which corresponds to the decreasing phase in Fig. 4c.

Fig. 5 The effects of \(\sigma _P\) in the cooperative game: a \(V_{C1}\), b \(V_{C2}\), c \(E_{C1}\), d \(E_{C2}\), e \(a_{C1}\), f \(a_{C2}\)

Fig. 6 The effects of \(\sigma _P\) in the noncooperative game: a \(V_{N1}\), b \(V_{N2}\), c \(E_{N1}\), d \(E_{N2}\), e \(a_{N1}\), f \(a_{N2}\)

6.3 Sensitivity Analysis

In this subsection, we present the sensitivity analysis of the numerical results concerning the parameters of the Brownian motions. The effects of the other parameters on the game can be obtained from the above analytical results. Since the parameters \(\mu _S\) and \(\sigma _S\) have little influence on the results, we will only examine the effects of \(\sigma _P\) on the optimal profit, the optimal emission paths as well as the optimal abatement strategies under the cooperative and noncooperative games, respectively. In Figs. 5 and 6, we fix \(P=675\), \(S=1\), and set \(\sigma _P\) to be 0.25, 0.3 and 0.35, respectively.

Figures 5 and 6 show the effects of the parameter \(\sigma _P\) on the cooperative game and the noncooperative game, respectively. On the one hand, we can see from Figs. 5a, b and 6a, b that the optimal profits of both regions decrease with \(\sigma _P\). A higher volatility \(\sigma _P\) implies that the two regions will bear a higher risk of the pollution stock. To control this risk, more efforts, such as investments in strategic portfolios and infrastructure construction, should be made. This implies that higher volatility of the pollution stock will reduce the two regions’ revenues in the game.

On the other hand, the higher the volatility of the pollution stock, the lower the emission levels and the higher the abatement levels of the two regions under both types of games. This can be explained as follows. The regions may suffer more pollution damage if the volatility \(\sigma _P\) increases, so they should do their best to reduce the pollution stock to avoid the climate risk as far as possible. Reducing emissions and increasing abatement are therefore reasonable choices for both players. Moreover, another advantage of emission reduction is that the players can save unused emission permits and sell them in the market, thus offsetting the reduction in production revenues.

7 Concluding Remarks

In this paper, we investigate the regions’ dynamic optimal strategies in finite-horizon differential games of transboundary industrial pollution. In particular, the emission trading scheme and the abatement policy are involved in our differential game models. Through solving the HJB equations satisfied by the value functions, we can obtain the two regions’ optimal emission paths, optimal abatement levels, as well as the optimal trajectories of the pollution stocks under the noncooperative and the cooperative games. Moreover, a stochastic extension is also discussed.

Our results show that the two regions’ instantaneous quotas affect their profits to a great extent. Additionally, cooperation leads to increased amounts of abatement, fewer emissions, and a lower pollution stock. This emphasizes the fact that cooperation is a better choice in both economic and environmental terms.