1 Introduction

Nowadays, there is no doubt that transboundary pollution has become an issue of growing concern. The creation of different policies to face transboundary pollution problems, such as global warming, is on the political agenda. A dynamic game is a natural framework of analysis for transboundary pollution problems, in particular, for the problem of global warming. The problem extends over time and has externalities in the sense that emissions of all countries accumulate in a common stock of pollution and this stock damages all agents’ welfare. The formulation of transboundary pollution problems as dynamic games allows us to understand the dynamic trade-offs and agents’ behavior. Dynamic game models of transboundary pollution were originally proposed by Ploeg and Zeeuw [28], Long [26], and Dockner and Long [17]. These seminal papers have been extended in different ways. Jørgensen et al. [23], Long [27] and De Zeeuw [15] surveyed this literature.

The present paper contributes to this literature and studies the strategic impact of investment in cleaner technologies on equilibrium outcomes in a transboundary pollution dynamic game. The literature has already emphasized that a key factor in environmental pollution control is the adoption of more environmentally friendly technologies by firms. Different approaches have been used to model technical change in the environmental economics literature (Baker et al. [1]). In this paper we consider that the principal source of the incentive to invest in cleaner technology relies on the fact that emissions per unit of output are assumed to diminish as larger stocks of cleaner technology are accumulated. The decision to invest in cleaner technology is inherently dynamic and there are costs of adjusting the stock of clean technology. As far as we know, these assumptions were proposed for the first time in a dynamic game model of transboundary pollution in Ploeg and Zeeuw [28]. These authors assume that the cleaner technology is public knowledge and compare the outcome under policy coordination and the noncooperative precommitment outcome, when the players make their emission and investment decisions following the open-loop Nash equilibrium. In this paper we follow the approach presented in Ploeg and Zeeuw [28] but focus on subgame-perfect Nash equilibria. This equilibrium concept is considered more realistic because the strategies supporting this equilibrium do not require precommitment to a course of action over time. The disadvantage of this more realistic equilibrium concept is that the subgame-perfect Nash equilibrium is difficult to compute. Jørgensen and Zaccour [24, 25] and recently De Frutos and Martín-Herrán [13] assume the same type of cleaner technology, but in these cases it is region specific. 
In these papers, special functional forms for the instantaneous benefit, the emission-output ratio and the pollution damage are proposed such that the differential game belongs to the class of linear-state differential games. Although the subgame-perfect Nash equilibria of this class of differential games can be easily characterized analytically, these equilibria are constant over time. By contrast, with our model specification, subgame-perfect Nash equilibria are not constant over time, but depend on the state variables.

As far as we know, there is no previous study in the literature that has introduced the possibility of investment in clean technology in order to reduce the emission-output ratio, nor analyzed how the availability of new technology could affect the subgame-perfect Nash equilibrium emission and investment strategies dependent on the stocks of pollution and clean technology. Our paper intends to fill this gap.

Our model allows us to consider the interplay of two dynamic processes: the process of environmental degradation or improvement, and the process of developing clean technology. The pollutant emissions accumulate into a global stock of pollutants, and hence, stock externalities between the players occur. Because the investment in cleaner technologies accumulates into a global or public knowledge stock of clean technology, stock externalities between the players also occur. All countries benefit from the investment in clean technology of any individual country. Therefore, our model presents both environmental and innovation externalities. The pollution externality is negative, while the cleaner technology externality is positive. One of the distinguishing characteristics of this paper is the presence of two externalities, one positive and one negative, in a problem that takes into account the strategic interactions between the players who make their decisions on emission and investment without any prior commitment.

Specifically, in this paper we study the strategic behavior of two countries facing transboundary pollution under a noncooperative infinite-horizon differential game framework. Emissions accumulate in a common pollution stock and cause environmental damage in both regions. In our model, the countries invest in cleaner technologies to reduce the emission-output ratio and hence aim to reduce the environmental damage caused by the pollution stock. Both countries invest in a common cleaner technology that is assumed to be public knowledge. Making the emission-output ratio endogenous greatly increases the difficulty for the characterization of subgame-perfect Nash equilibria of the differential game, since it loses its linear-quadratic formulation.

The class of linear-quadratic differential games belongs to the analytically tractable game structures that allow the analytical characterization of subgame-perfect Nash equilibria. Most of the transboundary pollution dynamic games proposed in the literature belong to this class. For example, the recent works by Bertinelli et al. [6], Benchekroun and Martín-Herrán [4], Bréchet et al. [7], Chang et al. [8] and Vardar and Zaccour [29] all formulate linear-quadratic differential games to analyze different questions related to transboundary pollution. Richer formulations of these transboundary pollution dynamic games lead to nonlinear-quadratic differential games. For these games numerical algorithms and methods are needed to characterize the feedback Nash equilibria. Recently, this numerical approach has been developed in De Frutos and Martín-Herrán [11, 12, 14], Jaakkola and Ploeg [22], El Ouardighi et al. [18], De Frutos et al. [9] for the analysis of environmental policies in transboundary pollution differential games.

In this paper we apply a numerical algorithm that allows us to numerically characterize subgame-perfect Nash equilibria of the transboundary dynamic game. This game presents two state variables, the stocks of pollution and cleaner technology, and two control variables for each player, the emission rate and the investment in cleaner technology. The numerical algorithm we use to carry out this analysis essentially consists in solving an approximate discrete-time dynamic game. The dynamic programming equations associated with the discrete-time dynamic game are solved using a tensorial Chebyshev method.

The main objective of this work is to analyze the strategic impact of a country’s investment in a cleaner technology on the equilibrium levels of the countries’ emissions, on the level of pollution and on the countries’ welfare. As a first step, more specifically, we want to check whether or not our richer formulation still preserves the main conclusions in Benchekroun and Ray Chaudhuri [5]: that the adoption of cleaner technology could lead to an increase in countries’ emissions, thereby increasing the pollution stock which might be detrimental to welfare. In Benchekroun and Ray Chaudhuri [5] (hereafter B&RC for short), the emission-output ratio is taken as given and the focus is on the analysis of exogenous changes in technology. In our paper, we extend the model in the direction of making the emission-output ratio endogenous. Specifically, the ratio of emissions per output can be reduced through investment in cleaner technology. Because clean technology is assumed to be public knowledge, both countries benefit from the investment in clean technology of an individual country, and therefore, an additional (positive) externality is introduced in the model. The endogenization of the clean technology dynamics and the feedback information structure allow the players to determine their optimal emission and investment strategies depending on the current states of the stocks of pollution and cleaner technology.

In a second step, with the aim of complementing the previous study and deepening our understanding of the strategic impact of investing in cleaner technology, we analyze the transition paths of the decision and state variables toward their steady-state values. At this point we are completely departing from B&RC, which does not study the transitional dynamics. However, because in this last paper the transboundary pollution dynamic game is formulated as a linear-quadratic differential game with a single state variable and Nash equilibria in linear strategies are characterized, the transition paths are monotonically increasing or decreasing time-paths converging toward the steady-state values, depending on whether the initial pollution stock is lower or greater than its long-run value. In our richer formulation, the endogenization of the cleaner technology adds a second state variable to the problem, and therefore, the path dynamics may lose their monotonic character. We aim to check whether the equilibrium control and state trajectories monotonically approach their long-run values or if some of the variables may overshoot or undershoot the long-run equilibrium before converging. This analysis of the transitional dynamics toward the long-run equilibrium will allow us to show, among other things, whether the cleaner technology is used to mitigate an immediate environmental problem or to prevent a future problem, depending on the initial state of the environment. This question has been previously analyzed by Fischer et al. [20] but in the context of a unique decision-maker, and hence, without the strategic interactions between the two countries studied in this paper. Fischer et al. [20] considered two dynamic processes describing the time evolution of the pollution stock and of the cleaner technology and showed (under the assumption that the initial stock of the cleaner technology equals zero) that the optimal transition paths toward the steady state are quite different depending on whether the initial environment is clean or dirty. These paths can involve overshooting or undershooting of the pollution or cleaner technology stock targets before converging.

Our numerical results allow us to qualify the main conclusions in B&RC. The adoption of a cleaner technology, in our framework represented by a greater value of the stock of clean technology, results in an increase in emissions when the stock of pollution is above a certain level. Furthermore, our results show that this last threshold is defined by the long-run pollution level and that this behavior only emerges when the stock of cleaner technology is below a bound. Concerning the effect of the adoption of a cleaner technology on welfare, B&RC shows that when the damage caused by the stock of pollution is large enough, adopting a cleaner technology reduces welfare throughout the transition phase from an initial pollution stock to the steady state. Our results show that this somewhat counterintuitive result applies in our context for large values of the initial pollution stock and for values of the initial cleaner technology stock that are bounded above. Thus, endogenizing the dynamics of clean technology and therefore introducing a new externality in the model (in this case positive) lead to results that qualify those obtained in B&RC. More specifically, in general terms our results show that the main conclusions obtained when technological improvements are formulated as exogenous changes in technology, as considered in B&RC, are only valid in our framework, where the dynamics of the clean technology stock has been endogenized, when this stock is below a threshold. We show that these conclusions are reversed when this threshold is exceeded.

Concerning the equilibrium trajectories, our numerical results show that depending on the initial value of the stocks of pollution and cleaner technology, the equilibrium trajectories can monotonically approach their long-run values, or they may present non-monotonic behavior. In some cases, they can overshoot/undershoot the long-run equilibrium before converging. Our numerical simulations show that the non-monotonic behavior can emerge for any of the state and control variables.

Some of the results derived in this paper appear in the single decision-maker version of the model, while others are exclusive to the formulation of dynamic games. Specifically, in the case of a single decision-maker, it may be that greater cleaner technology implies higher emissions. However, cleaner technology is always associated with greater welfare, which does not always happen when the strategic interaction between the players is considered. The non-monotonicities of the optimal paths also appear in the optimal control formulation of the model, although they are more frequent and more pronounced when each player behaves strategically with respect to his competitor.

The paper is organized as follows: in Sect. 2, we present and recall the transboundary pollution dynamic game formulated for the first time in the seminal paper by Ploeg and Zeeuw [28]. Section 3 presents the characterization of the approximate Markov-perfect equilibrium strategies and value functions. Section 4 analyzes the equilibrium trajectories. Section 5 summarizes the results of the robustness analysis. Section 6 presents our concluding remarks.

2 The Model

Consider two countries. Each country produces a single consumption good. We denote by \(Y_i(t)\) the production of good i at time \(t\ge 0\). The instantaneous net social benefits of production of country i are given by

$$\begin{aligned} B_i(Y_i(t))=Y_i(t) \left( A-\frac{Y_i(t)}{2}\right) , \end{aligned}$$
(1)

with A being a positive parameter. This functional form has been proposed in seminal works in this area (Ploeg and Zeeuw [28] and Dockner and Long [17]) and used in many other studies since then, as surveyed in Jørgensen et al. [23]. The specification implies decreasing marginal benefits of production in an interior solution.

The production of \(Y_i(t)\) generates pollution emissions. \(E_i(t)\) denotes the emission rate of country i at time t. Most of the dynamic game models used to analyze transboundary pollution problems considered a constant emission-output ratio [23]. One main feature of our model is that we consider the case where the ratio of emissions to output is endogenous and a decreasing function of the level of the stock of cleaner technology. This assumption was first introduced by Ploeg and Zeeuw [28] and later on by Jørgensen and Zaccour [24, 25] and De Frutos and Martín-Herrán [13]. By investing in cleaner technology, each country can reduce its emission-output ratio. Ploeg and Zeeuw [28] considered that the stock of cleaner technology is public knowledge, while [24, 25] and De Frutos and Martín-Herrán [13] assumed that the stock of cleaner technology is country specific. We follow [28] and assume that the emission rate \(E_i(t)\) resulting from production of country i is given by

$$\begin{aligned} E_i(t)=\alpha (K(t)) Y_i(t), \end{aligned}$$
(2)

where K(t) denotes the stock of cleaner technology at time t. Function \(\alpha \) is a decreasing and strictly convex function of the stock of cleaner technology to account for decreasing returns on the investment activities in cleaner technology.

Because the cleaner technology is assumed to be public knowledge, both countries benefit from the investment in cleaner technology of the other country. Therefore, the dynamics of the stock of cleaner technology is described by the following differential equation:

$$\begin{aligned} \dot{K}(t)=I_1(t)+I_2(t)-\mu K(t), \quad K(0)=K_{0}, \end{aligned}$$
(3)

where \(I_i(t)\) denotes the investment in clean technology in country i, \(\mu \) is the (constant) rate of depreciation of the common stock of clean technology and \(K_{0}\) is the initial stock of this technology. Adjustment costs associated with investment in clean technology are represented by:

$$\begin{aligned} C(I_i(t))=\frac{c}{2} I_i^2(t), \quad c>0, \end{aligned}$$
(4)

to account for increasing investment marginal costs.

The stock of pollution p accumulates according to:

$$\begin{aligned} \dot{p}(t)=E_1(t)+E_2(t)-\delta p(t), \quad p(0)=p_{0}, \end{aligned}$$
(5)

where \(\delta >0\) describes the natural rate of decay of the pollutant and \(p_0\) is the initial value of the stock of pollution. The accumulated stock of pollution causes damage in each country given by:

$$\begin{aligned} D(p(t))=\frac{\varphi }{2} p^2(t), \end{aligned}$$
(6)

where \(\varphi >0\) is a damage parameter.

The objective of country i is to choose the rate of pollutant emissions as well as the level of investment in cleaner technology to maximize its own payoff. Alternatively, country i could choose the production strategy \(Y_i(t)\) rather than the rate of pollution emissions. Due to (2), the two options are mathematically equivalent. Treating the emission rate, \(E_i\), as a control variable, as we have assumed, implies that the instantaneous objective function of each player is not linear-quadratic, while the dynamics of the two state variables are described by linear ordinary differential equations. In the alternative case, in which the production \(Y_i(t)\) is treated as a control variable of player i, the objective function would be quadratic, but the dynamics of the pollution stock would be described by a nonlinear differential equation. The standard assumption in dynamic pollution games considers that the instantaneous payoff of each country is given by a benefit from consumption measured by \(B_i(Y_i(t))\) in (1), minus the cost of the investment in cleaner technology \(C(I_i(t))\) in (4), and the damage caused by the stock of pollution D(p(t)) in (6). Taking into account the relationship between production, emissions and clean technology described in (2), the objective of player i is to maximize the following discounted payoff:

$$\begin{aligned} J_i(E_i, I_i, p_0, K_0)\!=\!\int _0^\infty \! \!\left( \widetilde{B}_i(E_i, K) \!-\!\frac{c}{2} I_i^2(t)\!-\!\frac{\varphi }{2} p^2(t)\!\right) \! e^{-\rho t}\,dt, \end{aligned}$$
(7)

with

$$\begin{aligned} {{\widetilde{B}}}_i(E_i, K)=\alpha ^{-1} (K(t)) E_i(t) \!\left( \!\!A\!-\!\frac{\alpha ^{-1} (K(t)) E_i(t)}{2}\!\right) , \end{aligned}$$

subject to the dynamics of the stocks of cleaner technology and pollution given in (3) and (5), respectively. Parameter \(\rho \) is the constant time discount rate. Therefore, the differential game considers two players (countries) and each player has pollution emissions and investment in cleaner technology as control variables. The differential game presents two state variables (the stocks of cleaner technology and pollution) and is played noncooperatively over an infinite time horizon.

As already noted in the introduction, one of the distinguishing characteristics of our study is that our analysis is focused on stationary Markov-perfect Nash equilibria. On the one hand, contrary to the strategies that support open-loop Nash equilibria as assumed in Ploeg and Zeeuw [28], the strategies supporting Markov-perfect Nash equilibria do not require precommitment to a course of action over time and have been assumed to be a good description of realistic behavior (see, for example, Basar and Olsder [3], Haurie et al. [21] and Jørgensen et al. [23]). On the other hand, with the functional forms for instantaneous benefits and emission-output ratio considered in (1) and (2), the dynamic game does not belong to the class of state separable or linear-state differential games, as was the case in Jørgensen and Zaccour [24]. For this class of games, it is well known (see Dockner et al. [16]) that the feedback Nash equilibria can be analytically characterized, although they are degenerate in the sense that they are constant over time. In our specification, the two countries play a noncooperative game using a feedback information structure with non-degenerate feedback Nash equilibria such that the emission and investment decisions of a country are state-dependent: that is, they depend at any point in time on the state of the stocks of pollution and cleaner technology at that moment.

3 Feedback Equilibrium Strategies

The formulation of the dynamic game in the preceding section does not allow for the analytical characterization of the emission and investment feedback Nash equilibrium strategies. Therefore, we resort to numerical algorithms to carry out this analysis.

3.1 Discrete-Time Approximation

We look for an approximation to a Nash equilibrium of the problem using a numerical method inspired by a procedure well-known in the case of optimal control problems (see Bardi and Capuzzo-Dolcetta [2], Ch. 6). This method has been previously used in De Frutos and Martín-Herrán [11, 14] to analyze differential game problems. Essentially the procedure we use in this paper consists in substituting the continuous-time game by a discrete-time approximation and solving this last game by dynamic programming in a discrete grid in state space using a tensorial Chebyshev approximation. In De Frutos and Martín-Herrán [11, 14], a finite element method was employed.

Let \(h>0\) be the time step and let \(t_n=nh\), \(n=0,1,\dots ,\) be the discrete times. We define the discrete discount factor as \(\beta _h=1-\rho h\). We consider the discrete-time infinite horizon game defined as follows. The objective function for player \(i=1,2\), is:

$$\begin{aligned} J_{h,i}({\overline{E}}_i,{\overline{I}}_i,p_0,K_0)=h\sum _{n=0}^\infty \beta _h^n G(E_{i, n},I_{i, n},p_n,K_n) \end{aligned}$$
(8)

where

$$\begin{aligned} G(E,I,p,K)=\alpha ^{-1} (K) E \left( A-\frac{\alpha ^{-1} (K) E}{2}\right) -\frac{c}{2} I^2-\frac{\varphi }{2} p^2, \end{aligned}$$

and \({\overline{E}}_i\) and \({\overline{I}}_i\) denote sequences of nonnegative real numbers \({\overline{E}}_i=\{E_{i,n}\}_{n=0}^\infty \), \({\overline{I}}_i=\{I_{i,n}\}_{n=0}^\infty \). The dynamics are:

$$\begin{aligned} \begin{aligned} p_{n+1}&=p_n+h(E_{1,n}+E_{2,n}-\delta p_n),\\ K_{n+1}&=K_n+h(I_{1,n}+I_{2,n}-\mu K_n), \end{aligned} \end{aligned}$$
(9)

where \(p_0\) and \(K_0\) are the initial conditions in (5) and (3), respectively.
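As an illustration only, the payoff G and one step of the dynamics (9) can be sketched in Python. The sketch assumes the exponential form \(\alpha (K)=e^{-\gamma K}\) and the benchmark parameter values adopted for the numerical examples in Sect. 3.3; the function and variable names are hypothetical and are not taken from the authors' implementation.

```python
import math

# Benchmark parameter values from Sect. 3.3; the names are illustrative only.
A, PHI, C, DELTA, MU, GAMMA = 0.5, 1.0, 1.0, 0.5, 0.5, 1.0

def alpha(K):
    """Emission-output ratio, assuming alpha(K) = exp(-gamma * K)."""
    return math.exp(-GAMMA * K)

def G(E, I, p, K):
    """Instantaneous payoff: benefit of output Y = E / alpha(K)
    minus the investment cost and the pollution damage."""
    Y = E / alpha(K)
    return Y * (A - Y / 2.0) - 0.5 * C * I**2 - 0.5 * PHI * p**2

def step(p, K, E1, E2, I1, I2, h):
    """One explicit Euler step of the state dynamics (9)."""
    p_next = p + h * (E1 + E2 - DELTA * p)
    K_next = K + h * (I1 + I2 - MU * K)
    return p_next, K_next
```

With no emissions and no investment, `step` simply lets both stocks decay at the rates \(\delta \) and \(\mu \), which is a convenient sanity check of the signs in (9).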

We look for Nash equilibria of the discrete-time game. The discrete-time value functions \(V_{h,i}(p,K)\), \(i=1,2\), are computed as solutions of the system of Bellman equations:

$$\begin{aligned} V_{h,i}(p,K)=\max _{E_i\ge 0,I_i\ge 0}\left( hG(E_i,I_i,p,K)+\beta _hV_{h,i}({\widetilde{p}},{\widetilde{K}})\right) ,\quad i=1,2, \end{aligned}$$
(10)

with

$$\begin{aligned} \begin{aligned} {\widetilde{p}}&=p+h(E_{1}+E_{2}-\delta p),\\ {\widetilde{K}}&=K+h(I_{1}+I_{2}-\mu K). \end{aligned} \end{aligned}$$
(11)

This type of discretization is well known for optimal control problems (see Bardi and Capuzzo-Dolcetta [2], Ch. 6 and Falcone [19]). It has been used in De Frutos and Martín-Herrán [11, 14] in the context of differential games. It can be shown (see De Frutos et al. [9]) that a feedback Nash equilibrium for the discrete-time game (8)–(9) is an \(\varepsilon \)-Nash equilibrium of the differential game defined by Eqs. (3), (5), and (7), where \(\varepsilon \) can be made arbitrarily small taking h small enough. This guarantees that a feedback Nash equilibrium of the discrete-time game, although suboptimal, is a good approximation to the feedback Nash equilibrium of the differential game for h small enough.
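To give a feeling for the role of the discrete discount factor \(\beta _h=1-\rho h\), the following Python sketch compares the discrete payoff (8) with its continuous-time counterpart for a constant instantaneous payoff g. The constant payoff and the truncation length are assumptions made purely for illustration; for a constant payoff the infinite sum \(h\sum _n \beta _h^n g=g/\rho \) coincides with \(\int _0^\infty g e^{-\rho t}\,dt\).

```python
import math

def discounted_sum(g, rho, h, n_terms):
    """Truncated discrete payoff h * sum_{n < n_terms} beta_h**n * g
    for a constant instantaneous payoff g (illustrative assumption)."""
    beta_h = 1.0 - rho * h
    return h * g * sum(beta_h**n for n in range(n_terms))

rho, h, g = 0.1, 1e-3, 1.0
continuous = g / rho                       # integral of g * exp(-rho * t)
discrete = discounted_sum(g, rho, h, 300_000)

# Per-step discount factors: 1 - rho*h versus exp(-rho*h).
gap = abs((1.0 - rho * h) - math.exp(-rho * h))
```

The per-step gap between \(1-\rho h\) and \(e^{-\rho h}\) is of order \((\rho h)^2\), which is one way to see why small time steps are needed.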

3.2 The Numerical Method

System (10) is approximated using collocation with a basis of tensorial products of Chebyshev polynomials. We choose \(p_L>0\), and \(K_L>0\) big enough, and for given positive integers N and M, we define the polynomials:

$$\begin{aligned} \phi _n(p)&=T_n\bigl (2\frac{p}{p_L}-1\bigr ), \quad p\in [0,p_L], \quad n=0,\dots , N,\\ \psi _n(K)&=T_n\bigl (2\frac{K}{K_L}-1\bigr ), \quad K\in [0,K_L],\quad n=0,\dots , M, \end{aligned}$$

where \(T_n\) is the Chebyshev polynomial in \([-1,1]\) of degree n. Let us denote by \({\mathbb {P}}_{N,M}\) the space of bivariate polynomials defined as the tensorial product of the space of polynomials of degree N in the p-variable and degree M in the K-variable. A generic polynomial \(Q\in {\mathbb {P}}_{N,M}\) can be written as:

$$\begin{aligned} Q(p,K)=\sum _{n=0,m=0}^{N,M}q_{n,m}\phi _n(p)\psi _m(K), \end{aligned}$$

where \(q_{n,m}\) are the Chebyshev coefficients.

Let us consider:

$$\begin{aligned}\begin{aligned} p_n&=p_L\frac{x_n+1}{2},\quad n=0,\dots , N, \\ K_m&=K_L\frac{y_m+1}{2},\quad m=0,\dots ,M, \end{aligned} \end{aligned}$$

the Chebyshev nodes in \([0,p_L]\) and \([0,K_L]\). Here, \(x_n=-\cos (n\pi /N)\), \(n=0,\dots , N\), and \(y_m=-\cos (m\pi /M)\), \(m=0,\dots , M\).
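The basis functions, the nodes and the evaluation of a polynomial \(Q\in {\mathbb {P}}_{N,M}\) can be sketched as follows. This is a minimal pure-Python illustration with hypothetical function names; it is not the authors' code, and a production implementation would typically rely on a numerical library.

```python
import math

def cheb_T(n, x):
    """Chebyshev polynomial T_n(x) on [-1, 1] via the three-term recurrence."""
    t_prev, t_curr = 1.0, x
    if n == 0:
        return t_prev
    for _ in range(n - 1):
        t_prev, t_curr = t_curr, 2.0 * x * t_curr - t_prev
    return t_curr

def nodes(N, L):
    """Nodes L * (x_n + 1) / 2 with x_n = -cos(n * pi / N), n = 0..N."""
    return [L * (1.0 - math.cos(n * math.pi / N)) / 2.0 for n in range(N + 1)]

def eval_Q(q, p, K, p_L, K_L):
    """Evaluate Q(p, K) = sum_{n,m} q[n][m] * phi_n(p) * psi_m(K)."""
    x = 2.0 * p / p_L - 1.0   # map [0, p_L] onto [-1, 1]
    y = 2.0 * K / K_L - 1.0   # map [0, K_L] onto [-1, 1]
    return sum(q[n][m] * cheb_T(n, x) * cheb_T(m, y)
               for n in range(len(q)) for m in range(len(q[0])))
```

For example, `nodes(4, 2.0)` returns five points in \([0,2]\) that include both endpoints and cluster toward them, as is characteristic of these nodes.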

We compute an approximation \(V_{h,i}^{N,M}\in {\mathbb {P}}_{N,M}\) solving the collocation equations for every \((p_n,K_m)\), \(n=0,\dots ,N\), \(m=0,\dots ,M\),

$$\begin{aligned} V_{h,i}^{N,M}(p_n,K_m)\!=\!\max _{E_i\ge 0,I_i\ge 0}\!\left( \! hG(E_i,I_i,p_n,K_m)\!+\!\beta _hV_{h,i}^{N,M}({\widetilde{p}}_n,{\widetilde{K}}_m)\!\right) ,\, i=1,2, \end{aligned}$$
(12)

where \({\widetilde{p}}_n\) and \({\widetilde{K}}_m\) are defined as in (11).

The collocation equations (12) are solved by a fixed-point iteration (policy iteration). Let \(E_i^{n,m,[0]}\) and \(I_i^{n,m,[0]}\), \(n=0,\dots , N\), \(m=0,\dots , M\), \(i=1,2\), be arbitrary initial approximations. For \(r\ge 0\), we compute the \((r+1)\)-th iterate by the following rule: for \(i=1,2\), \(j=3-i\), and \(n=0,\dots , N\), \(m=0,\dots ,M\), solve the collocation equations:

$$\begin{aligned} V_{h,i}^{N,M,[r+1]}(p_n,K_m)\!=\!\left( \! hG(E_i^{n,m,[r]},I_i^{n,m,[r]},p_n,K_m) \!+\!\beta _hV_{h,i}^{N,M,[r+1]}({\widetilde{p}}_n,{\widetilde{K}}_m)\!\right) , \end{aligned}$$

where

$$\begin{aligned} {\widetilde{p}}_n&=p_n+h(E_{i}^{n,m,[r]}+E_{j}^{n,m,[r]}-\delta p_n),\\ {\widetilde{K}}_m&=K_m+h(I_{i}^{n,m,[r]}+I_{j}^{n,m,[r]}-\mu K_m). \end{aligned}$$

Then compute

$$\begin{aligned} \bigl (E_{i}^{n,m,[r+1]},I_{i}^{n,m,[r+1]}\bigr )=\underset{{E_i\ge 0,I_i\ge 0}}{\arg \max }\left( hG(E_i,I_i,p_n,K_m) \!+\!\beta _hV_{h,i}^{N,M,[r+1]}({\widehat{p}}_n,{\widehat{K}}_m)\right) , \end{aligned}$$

where

$$\begin{aligned} {\widehat{p}}_n&=p_n+h(E_{i}+E_{j}^{n,m,[r]}-\delta p_n),\\ {\widehat{K}}_m&=K_m+h(I_{i}+I_{j}^{n,m,[r]}-\mu K_m). \end{aligned}$$

The iteration is stopped when the difference between two consecutive iterates is small enough. Once the iteration is stopped we define the approximate value function \(V_{h,i}^{N,M}\) as the polynomial in \({\mathbb {P}}_{N,M}\) defined by the values \(V_{h,i}^{N,M,[r+1]}(p_n,K_m)\), \(n=0,\dots , N\), \(m=0,\dots , M\). The approximate discrete-time optimal policies are the polynomials \(E_{h,i}^*\in {\mathbb {P}}_{N,M}\), \(I_{h,i}^*\in {\mathbb {P}}_{N,M}\) defined by \(E_{h,i}^*(p_n,K_m)=E_{i}^{n,m,[r+1]}\), \(I_{h,i}^*(p_n,K_m)=I_{i}^{n,m,[r+1]}\), for all \(n=0,\dots , N\), \(m=0,\dots , M\) and \(i=1,2\). The approximate discrete-time optimal trajectories are computed by:

$$\begin{aligned} p_{\nu +1}^{*}&=p_\nu ^{*}+h\bigl (E_{h,1}^*(p_\nu ^{*},K_\nu ^{*})+E_{h,2}^*(p_\nu ^{*},K_\nu ^{*})-\delta p_\nu ^{*}\bigr ),\\ K_{\nu +1}^{*}&=K_\nu ^{*}+h\bigl (I_{h,1}^*(p_\nu ^{*},K_\nu ^{*})+I_{h,2}^*(p_\nu ^{*},K_\nu ^{*})-\mu K_\nu ^{*}\bigr ). \end{aligned}$$

The optimal trajectories are initialized with \(p_0^*=p_0\) and \(K_0^*=K_0\) with \(p_0\) and \(K_0\) being the initial conditions in (5) and (3), respectively.
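The trajectory recursion above can be sketched in Python once feedback rules are available. In the sketch below the two players are assumed to use the same rule, and the rules `E_rule` and `I_rule` are hypothetical placeholders chosen only to make the example run; they are not the equilibrium strategies computed in this paper.

```python
def simulate(E_policy, I_policy, p0, K0, h, n_steps, delta=0.5, mu=0.5):
    """Iterate the closed-loop recursion for two symmetric players.

    E_policy and I_policy are callables (p, K) -> control. Using the same
    rule for both players is an assumption of this sketch.
    """
    p, K = p0, K0
    path = [(p, K)]
    for _ in range(n_steps):
        E = E_policy(p, K)
        I = I_policy(p, K)
        # Both stocks are updated from the values at the current step.
        p, K = p + h * (2.0 * E - delta * p), K + h * (2.0 * I - mu * K)
        path.append((p, K))
    return path

# Hypothetical placeholder rules, NOT the equilibrium strategies of the paper.
E_rule = lambda p, K: max(0.0, 0.2 - 0.3 * p + 0.1 * K)
I_rule = lambda p, K: max(0.0, 0.05 + 0.05 * p - 0.1 * K)

path = simulate(E_rule, I_rule, p0=1.0, K0=0.0, h=1e-3, n_steps=50_000)
p_end, K_end = path[-1]
```

For these placeholder rules the rest point of the recursion solves \(2E=\delta p\) and \(2I=\mu K\), that is, \((p,K)=(0.4,0.2)\). In the actual computation the callables would evaluate the Chebyshev polynomials \(E_{h,i}^*\) and \(I_{h,i}^*\).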

3.3 Approximate Feedback Equilibrium Strategies

As already noted in the introduction, one of the main objectives of this paper is to analyze the strategic impact of a country’s investment in a cleaner technology on the equilibrium levels of the countries’ emissions, on pollution levels, and on the countries’ welfare. In this subsection we focus on the effect that the adoption of a cleaner technology has on emissions. We are interested in exploring whether the counterintuitive effects of implementing a cleaner technology on pollution emissions presented in Benchekroun and Ray Chaudhuri [5] remain valid in our more general setting. More specifically, we want to explore whether the effects in B&RC come as a result of their modeling of exogenous changes in technology, which decrease the emission-output ratio, where the new technology is readily available and free. In our framework the ratio of emissions to output is endogenous and a decreasing function of the level of the stock of cleaner technology. In our model, the technology is no longer a parameter, but it is optimally determined by the players through their costly investments in cleaner technology. We are interested in showing whether and how the endogenous determination of the technology affects the impact of the adoption of a cleaner technology on emissions. Clean technology is assumed to be public knowledge, and consequently, endogenizing the clean technology dynamics introduces a positive externality into the problem, which could enhance the effects of the strategic interactions between the players. Because we characterize feedback equilibrium strategies, once the cleaner technology dynamics is endogenized, the players determine their optimal emission and investment strategies depending on the current states of the stocks of pollution and cleaner technology.

Hence, in this section, we study the strategic responses of each country to a change in the level of (initial) pollution and the (initial) level of the stock of cleaner technology. For the numerical examples, we particularize function \(\alpha (K)\) in (2) as follows: \(\alpha (K)=e^{- \gamma K}\) with \(\gamma \) being a positive parameter. This choice is made for analytical convenience, as it is a function that presents the needed features and is smooth enough to avoid problems in the numerical simulations. This functional form has already been used in previous papers that have assumed that the ratio of emissions to output can be reduced through the investment in a stock of cleaner technology, which instead of being public knowledge, is country or region specific (Jørgensen and Zaccour [24, 25] and De Frutos and Martín-Herrán [13]). As a benchmark case we retain the following parameter values: \(A=0.5, \varphi =1, c=1, \delta =0.5, \mu =0.5, \rho =0.1, \gamma =1, h=10^{-3}, N_p=N_K=50\). Figures 1, 2, 3, 4 and the results collected in Conjectures 1–5 have been derived using these values. However, we have carried out a thorough robustness analysis of the results in Conjectures 1–5 with respect to changes in all the model parameters. We have run new numerical simulations, changing in each case the value of each of the parameters of the model. A short summary of this robustness analysis is presented in Sect. 5. All the simulations lead to qualitatively similar results, meaning that optimal strategies and welfare satisfy the properties described at the different points in each conjecture.
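As a quick check that the exponential form \(\alpha (K)=e^{-\gamma K}\) has the features required of \(\alpha \) in Sect. 2, differentiating twice gives

$$\begin{aligned} \alpha '(K)=-\gamma e^{-\gamma K}<0, \qquad \alpha ''(K)=\gamma ^2 e^{-\gamma K}>0, \end{aligned}$$

so that \(\alpha \) is decreasing and strictly convex for any \(\gamma >0\). Moreover \(\alpha (0)=1\), so emissions equal output when no clean technology has been accumulated, and \(\alpha ^{-1}(K)=e^{\gamma K}\) is the factor that converts emissions into output in (7).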

For the benchmark case, the steady-state values of the state variables, stocks of pollution and cleaner technology, and the control variables, emission and investment in cleaner technology, are \(p^{ss}=0.3595, \, K^{ss}=0.1592, \, E^{ss}=0.0898, \, I^{ss}=0.0397\). The value function at these values is \(V(p^{ss}, K^{ss})=-0.1834\).
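The reported steady-state values can be checked for internal consistency against the state equations. Assuming that both countries emit and invest the same amounts at the steady state, setting \(\dot{p}=0\) in (5) and \(\dot{K}=0\) in (3) gives \(2E^{ss}=\delta p^{ss}\) and \(2I^{ss}=\mu K^{ss}\). The Python sketch below (variable names are illustrative) computes the corresponding residuals and the implied output per country.

```python
import math

# Benchmark parameters and reported steady-state values (Sect. 3.3).
delta, mu, gamma = 0.5, 0.5, 1.0
p_ss, K_ss, E_ss, I_ss = 0.3595, 0.1592, 0.0898, 0.0397

# With two symmetric countries, (5) and (3) at rest give
# 2E = delta * p and 2I = mu * K; the residuals should be near zero.
pollution_residual = 2.0 * E_ss - delta * p_ss
technology_residual = 2.0 * I_ss - mu * K_ss

# Implied steady-state output per country, Y = E / alpha(K),
# with alpha(K) = exp(-gamma * K).
Y_ss = E_ss * math.exp(gamma * K_ss)
```

Both residuals are zero up to the four-digit rounding of the reported values.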

Fig. 1 Optimal output function

Conjecture 1 presents the results derived from the analysis of the level curves of the optimal output along the optimal emission and investment in cleaner technology feedback strategies. Figure 1 presents the optimal output \(Y(p, K)\) as a function of the state variables, p and K.

Conjecture 1

  1. Keeping constant the stock of cleaner technology at level \(K_f\),

     (a) Output \(Y(p, K_f)\) is a non-increasing function of the pollution stock.

     (b) Output \(Y(p, K_f)\) is strictly positive for any level of the pollution stock below a threshold, \({{\tilde{p}}}_Y\). This threshold increases with \(K_f\) and is always larger than the steady-state pollution stock, \(p^{ss}\).

  2. Keeping constant the stock of pollution at level \(p_f\),

     (a) Output \(Y(p_f, K)\) is a non-decreasing function of the stock of cleaner technology.

     (b) For low or intermediate fixed values of the stock of pollution, \(p_f\), output \(Y(p_f, K)\) is strictly positive for any level of the stock of cleaner technology. As \(p_f\) increases, output \(Y(p_f, K)\) is zero for any value of the stock of cleaner technology below a threshold, \({{\tilde{K}}}_Y\), and positive above \({{\tilde{K}}}_Y\). The threshold \({{\tilde{K}}}_Y\) increases with \(p_f\) and is always greater than the steady-state value of the stock of cleaner technology, \(K^{ss}\).

Result 1.a in Conjecture 1 reproduces the results obtained in the standard transboundary pollution dynamic game formulation, where investment in clean technology is not an option. The countries restrict their emissions as the stock of pollution increases. Keeping constant the stock of clean technology, from (6) a rise in the level of accumulated pollution leads to an increase in the marginal damage cost. Hence, the country cuts its emissions to the level at which marginal benefit equals marginal damage. In the standard framework without investment in clean technology, there is a direct relationship between output and emissions. Therefore, the countries reduce their output as the stock of pollution increases. The lower the fixed level of the stock of clean technology, \(K_f\), the more pronounced the decrease in output \(Y(p, K_f)\) as \(p\) increases. Result 1.b in Conjecture 1 shows that when the pollution stock is large enough, marginal emissions are too costly for the countries, and consequently they cease production and hence emissions. The greater the fixed level of the stock of clean technology, \(K_f\), the wider the range of values of the pollution stock for which production is worthwhile.

Result 2.a in Conjecture 1 establishes that for any fixed level of the stock of pollution, \(p_f\), the cleaner the technology (the greater the stock of clean technology), the greater the countries’ output, \(Y(p_f, K)\). From (1), benefits increase with output provided that output remains below the upper bound A. Because the stock of pollution is assumed to be fixed at level \(p_f\), the larger the stock of clean technology, the larger output \(Y(p_f, K)\), and, hence, the greater the benefits. These greater benefits more than compensate for the cost of the investment in clean technology needed to boost its stock. The decision makers want to take advantage of lower investment costs when the stock of cleaner technology is lower. Result 2.b in Conjecture 1 shows that production \(Y(p_f, K)\) is worthwhile regardless of the stock of cleaner technology if the environment is clean (low values of \(p_f\)). However, for a dirtier environment (greater values of \(p_f\)), production \(Y(p_f, K)\) is only worthwhile if the stock of clean technology is large enough to compensate for the damage coming from the greater pollution.

Figures 2 and 3 present the optimal emission and investment in cleaner technology feedback strategies as functions of the state variables, the stocks of pollution, p, and cleaner technology, K. Both figures show that for large values of the pollution stock, if the stock of clean technology is small, the optimal emission rate is null, and so is the investment in clean technology. When either the pollution stock is smaller or the stock of clean technology is greater, first the country starts to emit at a positive rate, and this positive emission is followed by a positive investment in clean technology.

Fig. 2 Emission feedback strategy

Fig. 3 Investment feedback strategy

The strategic interaction between the two players implies that while emissions are zero, and therefore so is production, neither player has an interest in investing in cleaner technology: in the short term such investment is an additional cost which, because the technology is public, could benefit the other player. By contrast, in the case of a single decision-maker (the optimal control problem) with the same parameter values, we have checked that investment in cleaner technology remains positive even when emissions are zero (see Fig. 4). The optimal investment policy of the single decision-maker always prescribes positive investment. Once the possibility of free-riding on the investment in clean technology by the other country has been eliminated, the single decision-maker behaves less myopically and is more interested in raising its stock of cleaner technology in order to take advantage of it in the future. Furthermore, in the case of two decision-makers, the strategic effect that reduces incentives to invest in cleaner technology can be reinforced by the fact that, under some circumstances, increasing the stock of cleaner technology might induce the other player to produce and emit more, which leads to a larger pollution stock and greater environmental damage.

Conjecture 2 presents the patterns that can be deduced from the analysis of the level curves of the emission feedback strategy plotted in Fig. 2.

Conjecture 2

  1. For any fixed value of the stock of cleaner technology, \(K_f\),

     (a) The emission rate \(E(p, K_f)\) is a non-increasing function of the pollution stock.

     (b) The lower the fixed value of the stock of cleaner technology, \(K_f\), the more pronounced the decrease in emissions as the pollution stock grows. The emissions are almost unchanged for large values of \(K_f\).

     (c) The emission rate \(E(p, K_f)\) is strictly positive for any level of the pollution stock below a threshold, \({{\tilde{p}}}_E={{\tilde{p}}}_Y\). This threshold increases with \(K_f\) and is always larger than the steady-state pollution stock, \(p^{ss}\).

  2. For any fixed value of the pollution stock, \(p_f\),

     (a) If \(p_f\) is lower than the steady-state pollution stock, \(p_f<p^{ss}\), then the emission rate, \(E(p_f, K)\), is a non-decreasing function of the stock of cleaner technology.

     (b) If \(p_f\) is greater than the steady-state pollution stock, \(p_f>p^{ss}\), then there exists a threshold \({{\widehat{K}}}\) such that if \(K<{{\widehat{K}}}\), then the emission rate \(E(p_f, K)\) is a non-decreasing function of the stock of cleaner technology, while if \(K>{{\widehat{K}}}\), then \(E(p_f, K)\) is non-increasing. The greater \(p_f\), the larger this threshold.

     (c) For low fixed values of the stock of pollution, \(p_f\), the optimal emission rate \(E(p_f, K)\) is strictly positive for any level of the stock of clean technology. As \(p_f\) increases, the optimal emission rate \(E(p_f, K)\) is zero for any value of the stock of clean technology below a threshold, \({{\tilde{K}}}_E={{\tilde{K}}}_Y\), and positive above \({{\tilde{K}}}_E\). This threshold increases with \(p_f\).

As expected, the results in Points 1.a, 1.b, and 1.c in Conjecture 2 reproduce those obtained in the standard transboundary pollution dynamic game formulation where investment in clean technology is not an option. Keeping constant the stock of cleaner technology, from (2) there is a direct relationship between emissions and output. As already commented, output \(Y(p, K_f)\) decreases with the stock of pollution, as do the emissions, \(E(p, K_f)\). Point 1.c in Conjecture 2 reproduces in terms of emissions the result in Point 1.b in Conjecture 1.

Although one could expect the result in Point 2.a in Conjecture 2 to hold for any value of the pollution stock, Point 2.b in Conjecture 2 shows that this is not always the case. The adoption of a cleaner technology, represented in our framework by a greater value of the stock of clean technology, \(K\), does not necessarily lead to a decrease in emissions. Point 2.a in Conjecture 2 qualifies the following result in B&RC: The adoption of a cleaner technology results in a decrease in emissions in the short run only when the stock of pollution is below a certain level. As in B&RC, we show that the use of a cleaner technology results in an increase in emissions when the stock of pollution is greater than a threshold. Furthermore, our numerical results show that in our case this threshold corresponds to the long-run pollution level and that this behavior only appears when the stock of cleaner technology is below a bound. The endogenization of the dynamics of the clean technology stock allows us to show that the conclusion derived under the assumption that technological improvement is exogenously given and costless carries over to the endogenous and costly case, with investment in cleaner technology, only if the stock of cleaner technology is below a threshold. The conclusion is reversed when this threshold is exceeded. The intuition behind this result is as follows. Keeping constant the stock of pollution, a cleaner technology reduces the marginal damage from pollution and the country has an incentive to emit more. When the fixed pollution stock level is large and the stock of cleaner technology is below a threshold, the damage caused by this stock is large, so any increase in the stock of clean technology greatly reduces the marginal pollution damage and, as a consequence, the country increases its emissions until marginal benefit and marginal damage are equal. When the stock of cleaner technology surpasses a threshold, there is a saturation effect: the regions are not interested in increasing production above the maximum A, and hence they are interested in reducing emissions in order to reduce the damage coming from the pollution stock while maintaining production below its maximum value A.
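The two regimes can be made explicit with a short decomposition. This is a sketch under the assumption that expression (2) takes the form \(E=\alpha (K)Y\) with \(\alpha (K)=e^{-\gamma K}\), and that \(Y(p_f, K)\) is differentiable in \(K\) at the point considered:

```latex
\frac{\partial E(p_f,K)}{\partial K}
  = \alpha'(K)\,Y(p_f,K) + \alpha(K)\,\frac{\partial Y(p_f,K)}{\partial K}
  = e^{-\gamma K}\left(\frac{\partial Y(p_f,K)}{\partial K} - \gamma\,Y(p_f,K)\right).
```

Emissions therefore rise with \(K\) only where output responds strongly enough, \(\partial Y/\partial K>\gamma Y\). As output approaches its upper bound A and \(\partial Y/\partial K\) becomes small, the term \(-\gamma Y e^{-\gamma K}\) dominates and emissions fall with \(K\), in line with the saturation effect described above.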

Point 2.c in Conjecture 2 is a direct consequence of the relationship between output and emissions as described by expression (2) and Point 2.b in Conjecture 1.

Conjecture 3 presents the patterns that can be deduced from the analysis of the level curves of the investment feedback strategy plotted in Fig. 3.

Conjecture 3

  1. Keeping the stock of cleaner technology fixed at \(K_f\), there exists a threshold \(\breve{K}\) (which is greater than the steady-state value of the stock of cleaner technology, \(K^{ss}\)) such that

     (a) If \(K_f\) is lower than \(\breve{K}\), then the investment in clean technology, \(I(p, K_f)\), is a non-increasing function of the pollution stock.

     (b) If \(K_f\) is greater than \(\breve{K}\), then the optimal investment in cleaner technology \(I(p, K_f)\) presents an inverted U-shaped behavior with respect to the pollution stock, i.e., any increase in the pollution stock leads first to an increase in the optimal investment in clean technology, \(I(p, K_f)\), followed by a decrease after the stock of pollution exceeds a threshold.

     (c) The investment is strictly positive for any level of the pollution stock below a threshold, \({{\tilde{p}}}_I\). This threshold increases with \(K_f\) and is always larger than the steady-state pollution stock. Furthermore, \({{\tilde{p}}}_I<{{\tilde{p}}}_E\).

  2. For any fixed value of the pollution stock, \(p_f\),

     (a) There exists a threshold \({{\bar{K}}}\) such that the investment in cleaner technology, \(I(p_f, K)\), is a non-decreasing function of the stock of cleaner technology if \(K<{{\bar{K}}}\), while it is non-increasing if \(K>{{\bar{K}}}\). The greater \(p_f\), the larger the threshold \({{\bar{K}}}\).

     (b) For low fixed values of the stock of pollution, \(p_f\), the optimal investment, \(I(p_f, K)\), is strictly positive for any level of the stock of cleaner technology. As \(p_f\) increases, the optimal investment, \(I(p_f, K)\), is zero for any value of the stock of clean technology below a threshold, \({{\tilde{K}}}_I\), and positive above \({{\tilde{K}}}_I\). This threshold increases with \(p_f\). Furthermore, \({{\tilde{K}}}_I>{{\tilde{K}}}_E\).

The forward-looking and strategic behavior of the players is behind the result in Point 1.a in Conjecture 3. If the constant level of the cleaner technology, \(K_f\), is small, for large values of the pollution stock, output will become zero, and hence, the investment in cleaner technology becomes worthless. Therefore, for low values of \(K_f\), the investment in cleaner technology is a non-increasing function of the pollution stock. For larger values of \(K_f\), the result in Point 1.b applies. From Points 1.a and 1.b in Conjecture 2, we know that for a given constant level of the stock of cleaner technology, \(K_f\), the optimal emission rate decreases as the pollution stock increases, and that this decrease in emissions is stronger the lower \(K_f\). From the expression of the benefit function in (1) and the function showing the relationship between emission and production (2), one can easily conclude that for a fixed \(K_f\), the lower the emission rate, the lower the benefits from production. The larger the value of \(K_f\), the less pronounced the decrease of production benefits. In this case, on the one hand, if the pollution stock is below an upper bound, the corresponding optimal emission rate is quite large and decreases as the pollution stock increases. The decrease in the emission rate can be compensated by increasing the level of cleaner technology and hence boosting the investment in cleaner technology. On the other hand, if the pollution stock surpasses the upper bound, the corresponding optimal emission rate is smaller, and the decrease in the production benefits cannot be compensated by an increase in the stock of cleaner technology, and hence, the optimal strategy prescribes a decrease in the investment of this technology to slow down the cost of this investment.

Point 1.c in Conjecture 3 is a direct consequence of Point 1.c in Conjecture 2.

Keeping constant the pollution stock at level \(p_f\), Point 2.a in Conjecture 3 establishes that the investment strategy increases with the stock of cleaner technology in order to boost this stock if the latter is below a given level. However, once this level is surpassed, the investment strategy prescribes a decrease in investment as the stock of clean technology increases. Investments are greater at low levels of the stock of clean technology due to the increasing investment marginal costs. Again Point 2.b in Conjecture 3 is a direct consequence of Point 2.c in Conjecture 2. The optimal strategy prescribes zero investment while the optimal emission rate is null. Once the optimal emission rate is strictly positive, the countries are interested in investing in cleaner technology.

It is worth recalling that, as noted above, in the dynamic game formulation a zero optimal emission rate is linked to zero optimal investment, because the strategic interaction between the two players and the public nature of the cleaner technology imply that no player is interested in investing in this technology while emissions and production are zero. Each player wants to prevent its competitor from free-riding on its investment in clean technology. Furthermore, the incentive to reduce the investment in cleaner technology comes from the fact that, under some circumstances, a larger stock of cleaner technology might induce the other player to increase its production and emissions, leading to a larger pollution stock, which is costly in terms of pollution damage. However, when the problem is formulated with a single decision-maker for the same values of the parameters, optimal investment is always positive when optimal emissions are zero, as illustrated in Fig. 4. Once the strategic behavior between the players is removed, and the possibility of free-riding on the investment in cleaner technology by the other country has been eliminated, the forward-looking decision-maker is more interested in investing in cleaner technology to increase this stock and to take advantage of it in the future.

Fig. 4 Optimal control problem. Optimal emission (left) and investment (right) policies

3.4 Approximate Value Functions

In this subsection we analyze the effect that the adoption of a cleaner technology has on welfare. We are interested in exploring whether the counterintuitive effects on welfare of implementing a cleaner technology presented in B&RC (2014) apply in our richer setup, where the dynamics of the cleaner technology stock are endogenized.

Fig. 5 Optimal value function

Figure 5 presents the optimal value function. As expected, keeping constant the stock of cleaner technology, \(K_f\), the value function \(V(p, K_f)\) decreases as the stock of pollution increases. By contrast, analyzing the level curves of the value function (see Table 1), one can deduce that, keeping constant the stock of pollution, \(p_f\), a cleaner technology does not unequivocally result in greater welfare; that is, the value function \(V(p_f, K)\) does not necessarily increase with the stock of clean technology. Specifically:

Conjecture 4

There exists a threshold \(\breve{p}\) such that, for any fixed value of the stock of pollution, \(p_f\),

  1. If \(p_f<\breve{p}\), then any increase in the stock of cleaner technology leads to an increase in the optimal value function, \(V(p_f, K)\).

  2. If \(p_f>\breve{p}\), then there exists a threshold \({\underline{K}}\) such that if \(K<{\underline{K}}\) any increase in the stock of cleaner technology leads to a decrease in the optimal value function, \(V(p_f, K)\), while \(V(p_f, K)\) increases with the stock of cleaner technology if \(K>{\underline{K}}\). The greater \(p_f\), the larger this threshold \({\underline{K}}\).

The somewhat counterintuitive result in Point 2 in Conjecture 4 shows that a decrease in the emissions per output ratio (represented in our framework by a greater value of the stock of cleaner technology) does not necessarily lead to an increase in welfare. This result is in the same vein as the following one obtained in B&RC: When the damage from pollution is large enough, adopting a cleaner technology (a decrease in the exogenously given emissions per output ratio) reduces welfare throughout the transition phase from an initial pollution stock to the steady state. In our case, we show that the use of a cleaner technology results in a reduction of welfare when the initial stock of pollution is sufficiently large and the initial cleaner technology is below an upper bound. The endogenization of the dynamics of the clean technology stock allows us to derive our conclusion in terms of the initial size of the stock of cleaner technology. The intuition behind this result is as follows.

From the optimality conditions, we know that this result can only occur when the investment in cleaner technology is zero. As shown in Figs. 2 and 3, when the initial conditions reflect a situation with small values of the stock of cleaner technology and large values of the pollution stock, both the emission and investment equilibrium strategies are zero, and therefore both pollution and clean technology stocks decrease over time. Consider a fixed initial value of the pollution stock, \(p_f\), and two different initial values of the stock of cleaner technology, \(K_1^0, K_2^0\), with \(K_1^0< K_2^0\). On the one hand, as long as the equilibrium investment strategy remains zero, it is true that \(K_1(t)< K_2(t)\), where \(K_i(t)=K_i^0 e^{-\mu t}\) denotes the time evolution of the stock of cleaner technology starting from the initial condition \(K_i^0\). On the other hand, as long as the equilibrium emission strategy remains zero, it is true that \(p_1(t)= p_2(t)=p_f e^{-\delta t}\), denoting the time evolution of the pollution stock starting from the initial condition \(p_f\). Consequently, the optimal path starting at \((p_f, K_2^0)\) leaves the zero-emission and zero-investment region in a shorter period of time than the optimal path starting at \((p_f, K_1^0)\), due to the curvature of the boundary of these regions. Hence, there is a period of time in which both optimal investment paths are zero, although the optimal path of the emission rate associated with the greater initial cleaner technology becomes positive, while the optimal path of the emission rate associated with the lower initial cleaner technology remains zero. The effect on welfare of a greater (positive) emission rate is twofold. First, the benefits from production increase; second, a greater emission rate leads to a greater pollution stock and pushes up the damage from pollution. Therefore, the two effects move welfare in opposite directions: the first positively and the second negatively. In all the numerical simulations carried out, we found that the negative effect outweighs the positive effect. The environmental damage caused by greater emissions is not compensated by the greater benefits associated with greater production. One possible explanation is the following. The initial stock of technology is low and, because no investment in clean technology is made, it depreciates at rate \(\mu \) throughout this initial period. When the regions start emitting, their emission rates are very large in order to increase production, and as a by-product they generate a large pollution stock. This short-term behavior carries over to the long term when welfare is accumulated along the optimal paths in their transition toward their steady-state values, which do not depend on the initial value of the cleaner technology stock.
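The zero-emission, zero-investment phase described above can be illustrated with the closed-form decay paths \(p(t)=p_f e^{-\delta t}\) and \(K_i(t)=K_i^0 e^{-\mu t}\). The sketch below uses the benchmark rates; the initial technology stocks and the sampling times are hypothetical, and the time at which each path leaves the zero-control region is not computed because it depends on the equilibrium boundary shown in Figs. 2 and 3.

```python
import math

delta, mu = 0.5, 0.5  # benchmark decay rates from the text


def zero_control_path(p0, K0, t):
    """State at time t while both emissions and investment are zero."""
    return p0 * math.exp(-delta * t), K0 * math.exp(-mu * t)


p_f = 1.32                # a large initial pollution stock
K1_0, K2_0 = 0.05, 0.10   # hypothetical initial stocks with K1_0 < K2_0
times = [0.0, 0.5, 1.0, 2.0]

path1 = [zero_control_path(p_f, K1_0, t) for t in times]
path2 = [zero_control_path(p_f, K2_0, t) for t in times]

for t, (p1, K1), (p2, K2) in zip(times, path1, path2):
    print(f"t={t:.1f}  p={p1:.4f}  K1={K1:.4f}  K2={K2:.4f}")
```

Both paths share the same pollution stock while the ordering \(K_1(t)<K_2(t)\) is preserved. Because \(\delta =\mu \) in the benchmark, each path also keeps the ratio \(K/p\) constant, so in the \((p, K)\) plane it moves along a ray toward the origin until it reaches the boundary of the zero-control region.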

The result of Point 2 in Conjecture 4 is motivated by the existence of strategic interaction between the players. In the case of a single decision-maker, we have checked that this behavior is not present, and in this last formulation, welfare always grows as the stock of cleaner technology increases. The reason stems from the fact that, as already pointed out for the same values of the parameters, we have checked that the optimal investment policy in cleaner technology in the optimal control framework is always positive, even when emissions are zero. The unique forward-looking decision-maker, once the competitive aspect disappears and the positive externality associated with the clean technology is removed, is interested in investing in cleaner technology to increase the stock which will allow greater production benefits together with less environmental damage in the future.

Table 1 presents the optimal value function (the optimal welfare) for different initial values of both stocks, pollution and cleaner technology. Each entry in Table 1 corresponds to \(V(p_0, K_0)\). Recall that the steady-state values are \((p^{ss}, K^{ss})=(0.3595, 0.1592)\), and hence, the table presents the results for initial values of the stocks as the following: much lower, lower, around, greater, and much greater than their long-run values.

Table 1 Welfare for different initial values of the stock of pollution, \(p_0\), and clean technology, \(K_0\)

Table 1 corroborates our previous result which stated that regardless of the initial value of the stock of cleaner technology, for any \(K_0\) fixed, the value function \(V(p_0, K_0)\) decreases as the initial value of the pollution stock (\(p_0\)) increases. Furthermore, the table illustrates the result in Conjecture 4 which established that the effect of an increase of the initial stock of cleaner technology on the value function depends on the initial value of the pollution stock. For small or intermediate initial values of \(p_0\) (\(p_0=0, 0.12, 0.38, 0.45\) in Table 1) the value function \(V(p_0, K_0)\) increases with the initial value of the stock of cleaner technology (\(K_0\)). However, as the value of \(p_0\) becomes larger (\(p_0=0.66, 1.32, 2\) in Table 1), the value function \(V(p_0, K_0)\) either decreases with the initial value of the stock of cleaner technology if \(K_0\) is lower than a threshold \({\underline{K}}\) or increases with the initial value of the stock of cleaner technology if \(K_0\) is larger than \({\underline{K}}\). The threshold \({\underline{K}}\) increases with \(p_0\).
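Reading the threshold \({\underline{K}}\) off a row of welfare values can be sketched as follows. The helper name and the two rows are purely illustrative: the numbers are synthetic and are not the entries of Table 1. The helper returns the grid point where welfare is lowest, or `None` when welfare rises with \(K_0\) from the first grid point on.

```python
def turning_point(K_grid, V_row):
    """Grid value of K_0 where V_row is smallest, or None if the minimum
    sits at the first grid point (welfare never falls with K_0).

    If the minimum sits at the last grid point, the turning point lies at
    or beyond the end of the grid.
    """
    i_min = min(range(len(V_row)), key=V_row.__getitem__)
    return None if i_min == 0 else K_grid[i_min]


# Synthetic rows for illustration only (NOT taken from Table 1).
K_grid = [0.0, 0.1, 0.2, 0.4]
row_clean_start = [-0.30, -0.25, -0.20, -0.10]   # welfare rises with K_0
row_dirty_start = [-1.00, -1.05, -1.02, -0.90]   # welfare falls, then rises

print(turning_point(K_grid, row_clean_start))
print(turning_point(K_grid, row_dirty_start))
```

Applied row by row to a welfare table, a helper of this kind would locate the approximate threshold on the grid, with a resolution limited by the spacing of the \(K_0\) values.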

In the rest of this section we collect some results derived from the comparison of different pairs of initial conditions \((p_0, K_0)\), where the initial value of the pollution stock and the initial value of the clean technology are different.

Let us consider two pairs of initial conditions \((p_0^{(1)}, K_0^{(1)})\) and \((p_0^{(2)}, K_0^{(2)})\) such that \(p_0^{(1)}<p_0^{(2)}\) and \(K_0^{(1)}>K_0^{(2)}\); that is, the first pair represents a situation with a lower initial pollution stock and a greater initial stock of clean technology. Table 1 shows that, as expected in this situation, welfare in the first case surpasses welfare in the second case, i.e., \(V(p_0^{(1)}, K_0^{(1)})>V(p_0^{(2)}, K_0^{(2)})\). Each entry in Table 1 is higher than any other entry that is located to its left (lower initial stock of cleaner technology) and in some lower row (greater initial pollution stock). Furthermore, if the environment is initially very polluted such that \(p_0^{(1)}\) and \(p_0^{(2)}\) are very large, and \(p_0^{(1)}<p_0^{(2)}\), then \(V(p_0^{(1)}, K_0^{(1)})>V(p_0^{(2)}, K_0^{(2)})\) whatever the initial values of the stock of clean technology are. In this case, for a very polluted initial environment (\(p_0=0.66, 1.32, 2\)), a dirtier initial situation (a greater initial value of the pollution stock) cannot be compensated by a greater initial value of the cleaner technology stock, resulting in lower welfare.

If the environment is not initially very polluted, and \(p_0^{(1)}<p_0^{(2)}\), then the optimal value functions satisfy \(V(p_0^{(1)}, K_0^{(1)})>V(p_0^{(2)}, K_0^{(2)})\) whenever the initial values of the stocks of clean technology are small or moderate. However, if \(p_0^{(1)}<p_0^{(2)}\) and \(K_0^{(1)}<K_0^{(2)}\) with \(K_0^{(2)}\) being large, then \(V(p_0^{(1)}, K_0^{(1)})<V(p_0^{(2)}, K_0^{(2)})\).

From the previous results we can conclude the following:

Conjecture 5

For identical or greater initial values of the stock of pollution, an initial cleaner technology (a larger initial value of the stock of clean technology) could result in lower welfare.

Conjecture 5 shows a result in the same vein as that obtained in B&RC: When the damage from pollution is large enough, a decrease in the emissions per output ratio reduces welfare. The endogenization of the dynamics of the clean technology in our setup allows us to corroborate that cleaner technology does not necessarily lead to greater welfare. More specifically, as presented in Conjecture 5, we show that cleaner technology results in lower welfare when the initial pollution stock is high, that is, when a dirty initial environment is considered, or when the initial pollution stock takes intermediate values and the stock of cleaner technology remains bounded above.

4 Equilibrium Trajectories

To complement the previous study of the strategic impact of investing in cleaner technology, in this section we characterize the equilibrium control and state paths to the steady state. Because the endogenization of the cleaner technology introduces a second state variable in the transboundary pollution dynamic game, we know that the equilibrium control and state paths may lose the monotonic character they have in the standard linear-quadratic game formulation with only one state variable. Here, we are interested in determining whether the equilibrium trajectories approach their long-run values monotonically or whether some of them overshoot or undershoot these values before converging. It is interesting to know which optimal time-paths, and under which circumstances, present a behavior that at first sight could be seen as not optimal, because the paths do not follow the most direct route toward their steady-state values. These non-monotonic behaviors do not appear exclusively when the strategic interaction between the players is considered, but also arise in scenarios with a single decision-maker when models with more than one state variable are formulated. For example, within an optimal control framework with two state variables, the pollution and cleaner technology stocks, Fischer et al. [20] show that the optimal time trajectories toward the steady state can present non-monotonicities depending on whether the initial environment is clean or dirty. For the single decision-maker formulation of our model we have shown that the optimal paths can be non-monotonic and can involve overshooting or undershooting of the long-run targets before converging. However, the comparison of the optimal time-paths for the game and optimal control formulations allows us to conclude that the non-monotonicities are more frequent and more pronounced when each player behaves strategically with respect to his competitor. The positive and negative externalities present in the game formulation promote these non-monotonic behaviors.
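The distinction between monotonic convergence, overshooting and undershooting can be made operational with a small classifier applied to a sampled trajectory. This is an illustrative sketch: the function name, the labels and the sample paths are hypothetical, and only the steady-state values \(K^{ss}=0.1592\) and \(p^{ss}=0.3595\) are taken from the text.

```python
def classify_path(path, steady_state, tol=1e-9):
    """Label a sampled trajectory relative to its long-run value."""
    diffs = [b - a for a, b in zip(path, path[1:])]
    if all(d >= -tol for d in diffs) or all(d <= tol for d in diffs):
        return "monotonic"
    start_below = path[0] < steady_state
    # Overshoot: starts below the long-run value and rises above it.
    if start_below and max(path) > steady_state + tol:
        return "overshoot"
    # Undershoot: starts at or above the long-run value and falls below it.
    if not start_below and min(path) < steady_state - tol:
        return "undershoot"
    # Changes direction without crossing the long-run value (e.g. U-shaped).
    return "non-monotonic"


K_ss, p_ss = 0.1592, 0.3595

# Hypothetical sampled paths, for illustration only.
print(classify_path([0.12, 0.15, 0.159], K_ss))
print(classify_path([0.12, 0.17, 0.18, 0.165, 0.16], K_ss))
print(classify_path([0.12, 0.10, 0.09, 0.12, 0.15], K_ss))
print(classify_path([0.66, 0.30, 0.33, 0.35], p_ss))
```

The third sample mimics a U-shaped path that dips before recovering toward its long-run value without crossing it, so it is labelled non-monotonic rather than overshooting.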

For the dynamic game formulation, we present two sets of figures. In the first set, in each figure we fix an initial level of the stock of cleaner technology and plot the equilibrium paths for different values of the initial pollution stock (\(p_0=0, p_0=0.15, p_0=0.45, p_0=0.66\), with the steady-state pollution stock being \(p^{ss}=0.3595\)). In the second set, in each figure we fix an initial value for the pollution stock and plot the equilibrium paths for different values of the initial level of the stock of cleaner technology (\(K_0=0, K_0=0.12, K_0=0.22, K_0=0.44\), with the steady-state stock of technology being \(K^{ss}=0.1592\)). In each figure, we have four subplots. The two subplots in the upper part of the figure collect the equilibrium trajectories of the control variables, the emission rate (left) and the investment in cleaner technology (right). The two subplots in the lower part of the figure present the equilibrium trajectories of the state variables, the stock of pollution (left) and the stock of cleaner technology (right). The complete sets of figures are collected in Appendix A. Here, we only present a selection of the most representative cases, which show interesting non-monotonic behavior.

As a benchmark case, we first assume that initially there is no stock of clean technology, hence, \(K_0=0\) (Figure 1 in Appendix A) or that this initial stock is very small relative to its steady-state level. Under this assumption all the optimal paths increase or decrease monotonically toward their long-run values. Hence, the pollution stock increases or decreases monotonically toward its steady-state value depending on whether the initial value is lower or greater than the long-run value. Correspondingly, the optimal emission path decreases/increases toward its long-run value when the pollution stock increases/decreases. To some extent, the behavior of the pollution and emission trajectories mimics the behavior of the trajectories in the standard transboundary pollution dynamic game where investment in cleaner technology is not an option. The stock of cleaner technology increases monotonically toward its long-run level. The investment trajectory presents the same qualitative behavior as the emission trajectory. For initial values of the pollution stock lower than its long-run value, the pollution stock increases as time goes on, leading to an increase in the damage from pollution. In this case, the greater environmental damage is compensated with a decrease in the investment cost, and hence, the investment in cleaner technology decreases as time goes by. For initial values of the pollution stock greater than its long-run value, just the opposite reasoning applies, and the investment increases with time. Note that for a dirty initial environment (\(p_0=0.66\)), there is an initial period of time for which the investment in cleaner technology is not worthwhile, because emissions (and therefore production) are almost nil.

Fig. 6 Optimal paths for \(K_0=0.12\)

Figure 6 considers \(K_0=0.12\), an intermediate initial value of the stock of clean technology that is lower than the long-run value (\(K^{SS}\) is around 0.16). The subplots for the control trajectories (emissions and investment) as well as the pollution stock trajectory present qualitatively the same behavior as previously described for the case of a null or very small initial stock of cleaner technology. The main difference appears in the equilibrium trajectory of the stock of cleaner technology. In the previous scenario the stock of cleaner technology monotonously increased toward its long-run value. However, in Fig. 6, with a greater initial value of the stock of cleaner technology (\(K_0=0.12\)), this monotonous behavior exclusively appears when \(p_0\) is quite close to its long-run value (\(p_0=0.375, p_0=0.45\)). For \(p_0\) much lower than its steady-state value (\(p_0=0, p_0=0.12\)), the stock of cleaner technology overshoots its long-run value before converging. In this case, the environmental problem is not initially important and hence, at the beginning the emission rates are very large, leading to a large production which allows for a large initial investment in cleaner technology. At the beginning, this large initial investment makes it possible for the stock of clean technology to increase rapidly and surpass its long-run value. As the pollution stock grows over time, the emissions and investment decrease, and as a consequence, the stock of cleaner technology decreases toward its steady-state value. When \(p_0\) is much larger than its steady-state value (\(p_0=0.66\)), the equilibrium trajectory of the stock of cleaner technology is U-shaped, losing the monotonous behavior. In this situation, the initial state of the environment is dirty. As a consequence, both emissions and investment are initially very small (almost null), and hence, the stock of clean technology diminishes because the investment does not compensate for the depreciation of the technology. As the pollution stock decreases, the emissions and investment increase, and consequently, the stock of cleaner technology augments toward its steady-state value.

When an intermediate initial value of the stock of cleaner technology (\(K_0=0.22\)) is greater than the long-run value, emissions, investment, and pollution stock trajectories present qualitatively the same behavior as in the previous scenarios (Figure 3 in Appendix A). In this case, the equilibrium trajectory of the stock of cleaner technology again is only monotonous when \(p_0\) is quite close to its long-run value (\(p_0=0.375, p_0=0.45\)), although in the present case the stock of technology decreases with time. When \(p_0\) is much lower than its steady-state value (\(p_0=0, p_0=0.12\)) the equilibrium trajectory of the stock of clean technology presents an inverted U-shape. When \(p_0\) is much larger than its steady-state value (\(p_0=0.66\)), the equilibrium trajectory of the stock of clean technology undershoots its long-run value before converging.
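The four non-monotonous patterns distinguished so far (overshoot and U-shape for paths starting below the long-run value, undershoot and inverted U-shape for paths starting above it) can be told apart mechanically. The following is a minimal Python sketch of such a classification. The function name, the tolerance, and the sample paths are hypothetical and serve only as illustration; they are not output of the model.

```python
def classify_path(path, steady_state, tol=1e-9):
    """Label the shape of a sampled trajectory relative to its long-run value.

    path         : list of values sampled over time (first entry = initial value)
    steady_state : long-run value the trajectory converges to
    """
    start, hi, lo = path[0], max(path), min(path)
    increasing = all(b >= a - tol for a, b in zip(path, path[1:]))
    decreasing = all(b <= a + tol for a, b in zip(path, path[1:]))

    if start < steady_state - tol:           # path starts below the long-run value
        if hi > steady_state + tol:
            return "overshoot"               # rises above the long-run value first
        if lo < start - tol:
            return "U-shaped"                # falls first, then recovers
        return "monotone increasing" if increasing else "other non-monotone"
    if start > steady_state + tol:           # path starts above the long-run value
        if lo < steady_state - tol:
            return "undershoot"              # falls below the long-run value first
        if hi > start + tol:
            return "inverted U-shaped"       # rises first, then declines
        return "monotone decreasing" if decreasing else "other non-monotone"
    return "starts at steady state"


# Made-up sample paths around a long-run value of 0.1592 (illustration only).
K_SS = 0.1592
samples = {
    "low start, clean environment": [0.12, 0.15, 0.17, 0.165, 0.16],
    "low start, dirty environment": [0.12, 0.11, 0.10, 0.13, 0.15, 0.159],
    "high start, clean environment": [0.22, 0.24, 0.23, 0.18, 0.16],
    "high start, dirty environment": [0.22, 0.17, 0.15, 0.155, 0.159],
}
for label, path in samples.items():
    print(label, "->", classify_path(path, K_SS))
```

The classification only compares the extremes of the path with the initial and long-run values, so it is a coarse label rather than a full description of the dynamics.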

Moving to a very large initial value of the stock of cleaner technology (\(K_0=0.44\), Fig. 4 in Appendix A), the trajectories of emission, investment, and pollution stock are qualitatively similar to the previous cases. Furthermore, the monotonous behavior of the stock of cleaner technology is recovered, although in this case the stock decreases in its convergence to the long-run value.

The second set of figures is described below. In each figure, we fix an initial value of the pollution stock (\(p_0=0, p_0=0.15, p_0=0.45, p_0=0.66\)) and plot the equilibrium paths for different values of the initial level of the stock of cleaner technology. In each of these figures, the stock of clean technology is fixed at five different levels: \(K_0=0, K_0=0.12, K_0=0.17, K_0=0.22\), and the long-run value is \(K^{SS}=0.1592\). All these figures show the somewhat counterintuitive result that the greater the initial stock of clean technology, the greater the pollution stock, either along the entire time trajectory or at least after a short period of time.

Fig. 7 Optimal paths for \(p_0=0\)

Figure 7 assumes that initially there is no stock of pollution, and hence that there is a very clean initial environment, \(p_0=0\). The pollution stock monotonously increases toward its long-run value, and in response to the rise in the pollution stock, the emission trajectory decreases with time. The greater the initial stock of clean technology, the lower the emission rate for an initial time period, although this behavior reverses at a certain time. In this latter case, a larger initial stock of cleaner technology leads to greater emissions. The investment trajectory also decreases with time, while the trajectory of the stock of cleaner technology presents different behaviors depending on its initial value. If \(K_0\) is null, the stock of clean technology increases monotonously toward its long-run value. For positive initial values of \(K_0\) lower than or very similar to the steady-state value (\(K_0=0.12, 0.17\)) the stock of clean technology overshoots its long-run value before converging. For greater initial \(K_0\) values, the clean technology decreases monotonously toward the steady state. For a quite clean initial environment (\(p_0=0.15\), Figure 6 in Appendix A) the optimal paths present a behavior which is qualitatively similar to those shown in Fig. 7 for \(p_0=0\).

Fig. 8 Optimal paths for \(p_0=0.45\)

Figure 8 presents the equilibrium trajectories when \(p_0=0.45\), which is an initial value of the pollution stock greater than the long-run value. The subplots for the emission, investment, and pollution stock trajectories are qualitatively the opposite of those presented in the previous cases where the initial value of the pollution stock was lower than its long-run value (\(p_0=0, p_0=0.15\)). The pollution stock monotonously decreases toward its long-run value for intermediate and large initial values of the stock of clean technology, but undershoots this long-run value for small initial values. The emission rate monotonously increases toward its long-run value. The investment in cleaner technology monotonously increases toward its long-run value, except when the initial stock of cleaner technology is large enough, in which case it decreases.

Moving to a larger initial value of the stock of pollution (\(p_0=0.66\), Fig. 8 in Appendix A), the same qualitative behavior of the trajectories of emission, investment, and pollution stock as in the previous figures is reproduced. In this case, if the initial value of the stock of cleaner technology is very large, the optimal investment time-path overshoots its long-run value when converging. Concerning the stock of cleaner technology the only difference is that in this last case, the cleaner technology stock undershoots its long-run value before converging if the initial value is moderately greater than the long-run value.

5 Robustness Analysis

This section is devoted to the study of the robustness of the results presented in Sect. 3. We run new numerical simulations, in each case changing the value of one of the model parameters. Considering the benchmark case defined in Sect. 3, we change each parameter by 20% while keeping all other parameters fixed and check that the qualitative results collected in Conjectures 1–5 remain unchanged. By “qualitatively similar results” we mean that optimal strategies and welfare satisfy the properties described at the different points in each conjecture. In Appendix B, we present the results of our exhaustive analysis. For each of the seven parameters of the model, we plot the emission and investment feedback strategies corresponding to three different values of the parameter at hand: the benchmark value, a 20% increment, and a 20% decrement. Furthermore, for each of these three cases we compute the steady-state values of the state variables (pollution and cleaner technology stocks), the control variables (emissions and investment), and the value function evaluated at the steady state. All the figures presented in Appendix B clearly show that the results presented in Conjectures 1–5 are robust when each of the parameters is changed by 20%.
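The one-parameter-at-a-time design described above can be sketched in a few lines of Python. This is only an illustration of how the scenarios might be enumerated, not the code used for the paper. The function name is hypothetical, and the dictionary uses, purely as placeholders, the seven parameter values of the second example reported later in this section; we assume here that these are the seven model parameters, with \(h\), \(N_p\), and \(N_K\) being settings of the numerical method. Each resulting parameter set would then be passed to the equilibrium solver, which is not shown.

```python
def one_at_a_time(benchmark, rel_change=0.20):
    """Yield (name, label, params): one parameter moved by -/+ rel_change, others fixed."""
    for name, value in benchmark.items():
        for change in (-rel_change, 0.0, rel_change):
            params = dict(benchmark)              # copy, so the benchmark stays intact
            params[name] = value * (1.0 + change)
            yield name, f"{change:+.0%}", params


# Placeholder values (assumption: the seven economic parameters of the second example).
example = {"A": 1.0, "phi": 0.003, "c": 0.005, "delta": 0.01,
           "mu": 0.1, "rho": 0.025, "gamma": 0.05}

scenarios = list(one_at_a_time(example))
for name, label, params in scenarios:
    print(f"{name:>5} {label:>4}: {params[name]:.6g}")
```

Each parameter appears three times (decrement, benchmark, increment), which mirrors the three curves plotted per parameter in Appendix B.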

In order to underline the robustness of the results on the characteristics of the optimal strategies and the value function presented in Conjectures 1–5, we analyze here a new example in which the values of the parameters are very different from those of the benchmark case. The new values of the parameters are inspired by those used in Vardar and Zaccour [29]. This last paper analyzes the strategic impact of adaptation measures to prevent the adverse effects of accumulated pollution through a transboundary pollution dynamic game, also inspired by the original model of Ploeg and Zeeuw (1992). The values of the parameters that are related to pollution and emissions, and not to investment and clean technology, are similar to those used in Benchekroun and Ray Chaudhuri [5] in their numerical example, where they fix values of the parameters based on empirical evidence. These new values of the parameters are as follows: \(A=1, \varphi =0.003, c=0.005, \delta =0.01, \mu =0.1, \rho =0.025, \gamma =0.05, h=10^{-3}/2\), and \(N_p=N_K=70\). For these parameter values the steady-state values of the state variables (stocks of pollution and cleaner technology) and of the control variables (emission and investment in cleaner technology) are \(p^{SS}= 10.043, K^{SS}=5.1348, E^{SS}=0.0502\), and \(I^{SS}=0.2567\). The value function at these values is \(V(p^{SS}, K^{SS})=-3.5441\).

Fig. 9 The optimal output function (left-up), emission (right-up), and investment (left-down) feedback strategies and the optimal value function (right-down)

Figure 9 presents the optimal output function (left-up), emission (right-up), and investment (left-down) feedback strategies and the optimal value function (right-down). Comparing the plots in this figure with those in the corresponding figures for the benchmark case, it can be easily checked that the results in Conjectures 1–5 also apply to this new example. The confirmation that the qualitative results remain valid in this new example supports their robustness and generality.

Table 2 Welfare for different initial values of the stock of pollution, \(p_0\), and clean technology, \(K_0\)

We present the optimal value function (the optimal welfare) for different initial values of both stocks, pollution and cleaner technology, in Table 2. Each entry corresponds to \(V (p_0, K_0)\) and the table presents the results for the initial values of the stocks as follows: much lower, lower, around, greater, and much greater than their long-run values. The values in Table 2 confirm the results, collected in Conjectures 4 and 5, obtained in the benchmark case on how the value function changes as the initial conditions of the pollution and cleaner technology stocks are modified.

For this new example, we also characterize the equilibrium control and state paths to the steady state for different initial values of the stocks of pollution and cleaner technology. The complete set of figures is collected in Appendix C. The comparison of the optimal paths in this example with the optimal paths in the benchmark case for the initial values of the stock variables (below, near, or above the corresponding steady-state value) easily shows that these paths present similar trends, except the investment trajectory when the pollution stock is zero; that is, when the environmental problem is not important. In this case, for an initially completely clean environment, the cleaner technology is used to prevent a future problem, and regardless of the initial value of the stock of clean technology, the investment presents an inverted U-shape over time. Here, we present some optimal paths which present non-monotonous behavior, and we estimate the magnitude of the non-monotonicities.

Fig. 10 Optimal paths for \(K_0=4\)

Figures 10 and 11 present the optimal paths for \(K_0=4\) and \(K_0=8\), that is, an initial stock of cleaner technology lower and greater than the long-run value (\(K^{SS}=5.1340\)), respectively. Both figures show the inverted U-shaped investment as time goes by when \(p_0=0\), as discussed above. In this case, if we measure the over-investment with respect to a hypothetical monotonous path as the difference between the maximum and the long-run value, this excess amounts to about 5% when \(K_0=4\) (Fig. 10) and about 7% when \(K_0=8\) (Fig. 11). If \(K_0=4\), for any initial value of the pollution stock other than zero, the investment trajectory shows a monotonous behavior. However, this is not the case for \(K_0=8\), and non-monotonicities appear for \(p_0=4\) and especially for \(p_0=12\). In the latter case, for a very polluted initial environment (\(p_0=12\)), the U-shape of the investment at the beginning leads to an under-investment of 7.5% compared to what would be done if the investment grew monotonously toward its long-term value.
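The excess and shortfall percentages quoted here can be computed directly from a sampled trajectory. The short Python sketch below shows one way to do so, under the assumption that the difference between the extreme value and the long-run value is expressed relative to the long-run value. The function names and the sample paths are hypothetical and are not taken from the simulations.

```python
def excess_pct(path, steady_state):
    """Largest overshoot above the long-run value, in percent of that value."""
    return 100.0 * (max(path) - steady_state) / steady_state


def shortfall_pct(path, steady_state):
    """Largest shortfall below the long-run value, in percent of that value."""
    return 100.0 * (steady_state - min(path)) / steady_state


# Made-up paths converging to a long-run value of 5.0 (illustration only).
long_run = 5.0
inverted_u_path = [4.6, 5.1, 5.25, 5.1, 5.0]     # peaks above the long-run value
undershooting_path = [5.6, 5.0, 4.7, 4.9, 5.0]   # dips below the long-run value

print(f"excess:    {excess_pct(inverted_u_path, long_run):.1f}%")
print(f"shortfall: {shortfall_pct(undershooting_path, long_run):.1f}%")
```

A monotonous path that never crosses the long-run value yields a non-positive excess or shortfall, so a positive value flags an overshoot or undershoot.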

Fig. 11 Optimal paths for \(K_0=8\)

Regarding the trajectory of the stock of clean technology, non-monotonous behaviors are more pronounced when \(K_0=4\). In this case, for \(p_0=0, 4, 8\), the trajectory of the stock of clean technology overshoots its long-run value before converging. The excess of the accumulated stock of clean technology represents about 5% if \(p_0=0, 4\) and about 2.2% if \(p_0=8\). Furthermore, for an initial very polluted environment (\(p_0=12\)), the initial U-shape of the stock of technology corresponds to an initial decrease in this stock which represents a decrease of 3.24% with respect to its steady-state value. For \(K_0=8\) and an initial very polluted environment (\(p_0=12\)), the stock of cleaner technology undershoots its long-run value. With respect to a hypothetical monotonous path, the stock of cleaner technology decreases by 7.5% compared to its long-run value. For greater initial values of the stock of cleaner technology (\(K_0=12\)), Fig. 4 in Appendix C shows that, in addition to the non-monotonous behavior of investment in technology, emissions show an inverted U-shape when the initial pollution stock is less than the stationary value (\(p_0 = 0, 4, 8\)). Non-monotonicities also appear for the investment, the emission, and the stock of cleaner technology in the second set of figures collected in Appendix C (Figs. 5–8), where in each figure we fix an initial value for the pollution stock and plot the equilibrium paths for different values of the initial level of the stock of cleaner technology.

6 Concluding Remarks

This paper analyzes the strategic behavior of two countries facing transboundary pollution. We have analyzed a transboundary pollution noncooperative differential game played over an infinite horizon. Emissions accumulate in a common pollution stock and cause environmental damage in both regions. In addition to the choice of optimal emissions as in the standard model, in our model the countries invest in cleaner technologies to reduce the emission-output ratio, aiming to reduce the environmental damage caused by the pollution stock. The countries invest in a common cleaner technology that is assumed to be public knowledge. Our model allows us to consider the interplay of two dynamic processes, the process of environmental degradation or improvement, and the process of developing cleaner technology. There are two types of externalities between the players: The pollution externality is negative, while the cleaner technology externality is positive.

The investment in cleaner technology as a key factor in environmental pollution control has already been emphasized in the literature in the context of a single decision-maker as well as in the context of multiple decision makers. However, as far as we know, the transboundary pollution dynamic games which assumed that a country can reduce its emission-output ratio by investing in the stock of clean technology have focused either on open-loop strategies (the open-loop equilibrium is not strongly time-consistent or subgame-perfect), or on special functional forms such that the differential game is linear state and the subgame-perfect Nash equilibria are degenerate in the sense that they are constant over time. In our model the subgame-perfect Nash equilibria are not constant over time, but depend on the state variables. As far as we know, this is the first study to have introduced the possibility of investment in cleaner technology in order to reduce the emission-output ratio and analyzed how the availability of new technology could affect the subgame-perfect Nash equilibrium emissions and investment strategies dependent on the pollution stock and the stock of clean technology.

Our paper extends the model in Benchekroun and Ray Chaudhuri [5] (B&RC) in the direction of making the emission-output ratio endogenous, which these authors took as given. B&RC focused on the analysis of exogenous changes in technology and concluded that, faced with the adoption of a cleaner technology, countries may respond by increasing their emissions, which results in an increase in the stock of pollution that may be detrimental to welfare. Because in our framework cleaner technology is assumed to be public knowledge, an additional positive externality is introduced in the model. One of the main objectives of our paper was to check whether or not our richer formulation, where the ratio of emissions per output can be reduced through investment in cleaner technology, still preserves the main conclusions in B&RC. The richer formulation of the transboundary pollution dynamic game leads to a nonlinear quadratic differential game. We have used a numerical method that allows us to characterize the feedback subgame-perfect Nash equilibria of the transboundary dynamic game. The endogenization of the clean technology dynamics introduces a new state variable in the model, and hence, the game presents two state variables, the stock of pollution and the stock of cleaner technology, and two control variables for each player, the emission rate and the investment in cleaner technology. The numerical algorithm we have used to carry out the analysis essentially consists of solving the programming equations associated with the discrete-time dynamic game by means of a tensorial Chebyshev approximation method.
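To give a flavor of the approximation step mentioned above, the following self-contained Python sketch interpolates a function of the two state variables with a tensor product of Chebyshev polynomials. It illustrates only this building block under simplifying assumptions and is not the algorithm used to compute the equilibria: the function `toy_value`, the state ranges, and the number of nodes are hypothetical, and no dynamic programming iteration is performed.

```python
import math


def cheb_nodes(n):
    """The n Chebyshev-Gauss nodes on [-1, 1]."""
    return [math.cos(math.pi * (k + 0.5) / n) for k in range(n)]


def to_interval(x, lo, hi):
    """Map x in [-1, 1] to the interval [lo, hi]."""
    return 0.5 * (lo + hi) + 0.5 * (hi - lo) * x


def to_reference(y, lo, hi):
    """Map y in [lo, hi] back to [-1, 1]."""
    return (2.0 * y - lo - hi) / (hi - lo)


def cheb_T(j, x):
    """Chebyshev polynomial T_j(x) for x in [-1, 1]."""
    return math.cos(j * math.acos(max(-1.0, min(1.0, x))))


def fit_tensor(f, n_p, n_k, p_range, k_range):
    """Coefficients c[i][j] of sum_ij c_ij T_i(x) T_j(y) interpolating f on the node grid."""
    xs, ys = cheb_nodes(n_p), cheb_nodes(n_k)
    vals = [[f(to_interval(x, *p_range), to_interval(y, *k_range)) for y in ys]
            for x in xs]
    coef = [[0.0] * n_k for _ in range(n_p)]
    for i in range(n_p):
        for j in range(n_k):
            s = sum(vals[a][b] * cheb_T(i, xs[a]) * cheb_T(j, ys[b])
                    for a in range(n_p) for b in range(n_k))
            # Discrete orthogonality: weight 1/n for degree 0, 2/n otherwise.
            scale = (1.0 if i == 0 else 2.0) * (1.0 if j == 0 else 2.0)
            coef[i][j] = scale * s / (n_p * n_k)
    return coef


def evaluate(coef, p, k, p_range, k_range):
    """Evaluate the tensor Chebyshev approximation at the state (p, k)."""
    x, y = to_reference(p, *p_range), to_reference(k, *k_range)
    return sum(c * cheb_T(i, x) * cheb_T(j, y)
               for i, row in enumerate(coef) for j, c in enumerate(row))


def toy_value(p, k):
    """Hypothetical smooth stand-in for a value function (not the model's)."""
    return -0.5 * p * p + 0.3 * p * k - k * k


P_RANGE, K_RANGE = (0.0, 1.0), (0.0, 0.5)       # hypothetical state ranges
coef = fit_tensor(toy_value, 6, 6, P_RANGE, K_RANGE)
print(evaluate(coef, 0.3, 0.2, P_RANGE, K_RANGE), toy_value(0.3, 0.2))
```

In a full solver, the interpolated object would be updated repeatedly, with the coefficients refitted from the values obtained at the tensor grid of nodes in each iteration.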

Our numerical results allow us to qualify the main results in B&RC. Specifically, we have shown that the main conclusions obtained when technological improvements are formulated as exogenous changes in technology, as considered in B&RC, are only valid for large values of the pollution stock and when the stock of cleaner technology is below a certain threshold. Under these circumstances, the adoption of a cleaner technology may lead to greater emissions and a greater pollution stock, which implies lower welfare. We have shown that these conclusions are reversed when the threshold of the stock of cleaner technology is exceeded.

In order to deepen the understanding of the strategic impact of investing in cleaner technology, we have analyzed the transition paths of the decision and state variables toward their steady-state values. Concerning the equilibrium trajectories, we have shown that depending on the initial value of the stocks of pollution and cleaner technology, the equilibrium trajectories can approach their long-run values monotonously or they can lose the monotonous behavior and may even overshoot/undershoot the long-run equilibrium before converging. In our numerical examples, we have shown that the non-monotonous behavior can emerge for any of the state and control variables.

We have carried out an exhaustive robustness analysis of the results, running different numerical simulations, augmenting/diminishing the value of each of the model parameters by 20% with respect to the benchmark case and checking that the qualitative results collected in Conjectures 1–5 remain unchanged. To highlight the robustness of the results on the characteristics of the optimal strategies and the value function we have also analyzed another example considering parameter values that are very different from those in the benchmark case. We have checked that all the results also apply to this new example.

Future research can extend our work in at least three ways. First, our approach has relied on complete symmetry between the two countries. An interesting issue for further research would be to look at asymmetries in emissions. The differential game with two regions facing a pure downstream pollution problem is undoubtedly worth studying. Second, it would be interesting to consider that each country develops its own cleaner technology to reduce the emission-output ratio, instead of assuming that this technology is public knowledge, as supposed in the present work. In this case, the positive externality associated with the public nature of the cleaner technology stock in the present formulation would be eliminated. It would be interesting to see how the elimination of this positive externality could affect the results obtained in the present work. Third, it would also be worthwhile to further the analysis of the strategic impact of the investment in cleaner technology by considering a setting where pollution diffuses over time and across space. The linear-state differential game with spatial pollution diffusion analyzed in De Frutos & Martín-Herrán [13] constitutes a first step in this direction. The analysis of a richer formulation is on our research agenda.