1 Introduction

Although it is widely acknowledged that greenhouse gas (GHG) concentrations are responsible for global climate change (Footnote 1), it is also accepted that the evolution of the climate system is far from deterministic. Rather, it is subject to considerable ecological uncertainty. To illustrate, the scientific community still lacks information about randomness in, e.g., the ability of oceans and forests to absorb pollution, the exact effect of greenhouse gases on climate change, and the precise change in temperature we may face in the future. On the economic side, we also face many uncertainties. For example, our knowledge of how production and abatement technologies will develop and improve over time is imperfect (see Pindyck 2007 for a review). These features clearly invite us to adopt models that account for (at least some of) these uncertainties, as in, e.g., Pindyck (2000, 2007), Yeung and Petrosyan (2008) and de Zeeuw and Zemel (2012). Further, as the environmental battle is necessarily a long-term one (it is indeed hard to believe that technology and consumption habits can be changed overnight), decision-makers will have the opportunity to learn about the unknowns further down the road.

Several contributions have addressed the issues of uncertainty and learning in climate change and environmental management problems (Footnote 2). Ulph and Maddison (1997) consider a two-period, two-player pollution emissions game, where the damages can take several possible values with given probabilities. They compare the cases of no learning and learning, with irreversible emissions, and find that information can decrease welfare when the two countries choose their emissions non-cooperatively. Other studies focus on the role of uncertainty and learning in the formation of an International Environmental Agreement (IEA). Kolstad (2007) considers uncertainty in environmental costs and benefits, as well as learning about these costs and benefits. He finds that learning tends to increase the size of the cooperating coalition in an IEA. However, with partial learning, several stable coalitions emerge and one of them is smaller than it would have been under no-learning. Kolstad and Ulph (2008) synthesize and extend their earlier analysis of the formation of an IEA under uncertainty about damages and with different models of learning. As in Dellink and Finus (2012), they consider a two-stage game model, with three scenarios for learning, namely, no-learning (the values of the stochastic parameters are not known before making decisions), partial learning (the players learn the values of these parameters before having to make their second-stage decisions) and full learning (basically, no uncertainty is involved). We also consider three scenarios, namely full information, learning and no-learning, but from a very different perspective than in Kolstad (2007), Kolstad and Ulph (2008) and Dellink and Finus (2012). Here, we have a fully dynamic model in which general Bayesian learning takes place at each stage. Breton and Sbragia (2011) introduce a model of uncertainty and learning where the stochastic variable is normally distributed. They assume that the players are myopic, and then numerically determine the emissions rule and the size of stable IEAs.

In this paper, we focus on an international pollution problem in which neighboring countries emit a pollutant, e.g., \(\hbox {CO}_{2}\), which accumulates over time and damages the shared environment. We assume that this accumulation process is subject to ecological uncertainty, and allow the players (countries) to learn about the unknown parameter over time. Also, we suppose that the players face different marginal damage costs and have heterogeneous prior beliefs, which is consistent with the fact that countries are asymmetric in terms of their vulnerability to climate change. Our model has its roots in two literature streams, i.e., uncertainty and learning, and dynamic games of pollution emissions. More precisely, our uncertainty and learning setup is in line with the framework considered in Koulovatianos et al. (2009), Agbo (2014) and Mirman and Santugini (2014). Koulovatianos et al. (2009) is an optimal-growth model, that is, no strategic interactions are involved. Agbo (2014) and Mirman and Santugini (2014) extend the Great Fish War dynamic game to a learning environment. Specifically, Agbo (2014) considers the effect of learning in the context of the strategic exploitation of a natural resource, whereas Mirman and Santugini (2014) study the effect of multiple sources of learning when agents not only extract a resource for consumption, but also invest in technology to improve the future stock. A common thread in this literature is the presence of an externality only in the dynamics. In this paper, we add another layer of complexity by considering a learning environment in a dynamic game in which the players interact not only in the dynamics but also in the market. In other words, our model accounts not only for the externality in the dynamics (as in the previous literature) but also for the externality in the market through the instantaneous payoff function.

Our paper also belongs to the significant literature in dynamic games dealing with strategic emissions decisions; see the early contributions in, e.g., Van der Ploeg and de Zeeuw (1992), Long (1992) and Dockner and Long (1993), and the survey in Jørgensen et al. (2010). Our main contribution here is twofold. First, we introduce learning and ecological uncertainty in a dynamic international pollution problem. Second, whereas this literature has systematically assumed that each player’s revenues depend only on her own emissions (or production), we extend the model to a setting where there is a cross effect of emissions in the revenue functions. This implies that the players face both environmental and economic interdependencies, and are not only linked through the pollution stock and the damage cost.

Taking into account uncertainty and learning in a dynamic model of pollution emissions, our objective is to answer the following question: does ecological uncertainty alleviate the emissions problem or exacerbate it? In order to address this research question, we cast it into two sub-questions:

  1. In the presence of uncertainty, how do players’ emissions strategies compare under different information and learning assumptions? In other words, how do different sources of uncertainty affect these emissions strategies?

  2. How do changes in beliefs, e.g., becoming more pessimistic or optimistic, or feeling a heightened sense of risk, affect emissions strategies?

To deal with these questions, we characterize and compare equilibrium strategies under two different assumptions about the information structure. In a benchmark scenario, we assume that all players are fully informed about the distributions of the random variables. This assumption is adopted in the environmental economics literature analyzing the impact of uncertainty on decisions (see, e.g., Bramoullé and Treich 2009; de Zeeuw and Zemel 2012; Arrow and Fisher 1974; Pindyck 2012). The alternative assumption is that players do not know the exact values of some of the model parameters, but hold beliefs about these unknown parameters. These beliefs are based on the available information and are updated when the players receive new information. We characterize and contrast non-cooperative feedback strategies under these two information assumptions. As in the optimal-growth model of Koulovatianos et al. (2009), we present all our analytical results for any general probability distribution function in which the information can be summarized by a finite-dimensional vector of sufficient statistics.

Our main results can be summarized as follows:

  1. Uncertainty due to anticipation of learning results in a decrease in total emissions. A surprising result, however, is that an individual player might increase her emissions. This is unexpected as uncertainty would normally lead to precautionary behavior, i.e., lower emissions.

  2. Depending on the belief bias, and on the slope and curvature of the distribution function, structural uncertainty can increase, decrease, or leave unchanged the emissions of individual players as well as total emissions.

  3. Under the learning assumption, if one player changes her beliefs in a way that makes her feel more optimistic or less at-risk, while others’ beliefs remain unchanged, then: (i) this player increases her emissions; (ii) all other players decrease theirs; and (iii) total emissions increase. Conversely, pessimistic views or a heightened sense of risk alleviate the pollution problem.

The rest of the paper is organized as follows. In Sect. 2, we present the model. In Sect. 3, we characterize the equilibrium strategies under different learning assumptions and compare them in Sect. 4. In Sect. 5, we assess the impact of changes in players’ beliefs on their emissions strategies. In Sect. 6, we evaluate the effect of changes in the sense of peril on emissions, and briefly conclude in Sect. 7.

2 The Model

We consider \(N\) countries, indexed by \(i=1,\ldots ,N\), each producing quantity \(q_{i,t}\) of a representative good at time \(t=1,\ldots ,\infty \). Production generates revenues and, as a by-product, emissions, e.g., \(\hbox {CO}_{2}\). Denote by \(e_{i,t}\) the emissions of country \(i\) at time \(t\). As in Benchekroun and Long (1998), we make the simplifying assumption that the ratio of production to emissions is equal to one, that is, one unit of production emits one unit of pollution. For \(i=1,2,\ldots ,N,\,t=1,\ldots ,\infty \), the revenues of country \(i\) at time \(t\) are \(y_{i,t}\), which can be expressed as a quadratic function of emissions, i.e.,

$$\begin{aligned} y_{i,t}=e_{i,t}\left( \alpha _{i}-e_{i,t}-\gamma \sum _{j\ne i}^{N}e_{j,t}\right) \end{aligned}$$
(1)

where \(\alpha _{i}\) and \(\gamma \) are constants and we assume \(0\le \gamma <2\). (As will become apparent later on, the upper bound on \(\gamma \) is required to have non-negative emissions.) In the literature (see, e.g., Dockner and Long 1993; Long 1992 and the survey in Jørgensen et al. 2010), a common assumption is that the countries are only related through the environmental damage and not through their revenues, that is, \(\gamma =0\). We depart from this assumption and consider that countries also interact at an economic level through their revenues. For instance, such a scenario may be applicable if countries are engaged in international trade and firms have market power in the world market.

Emissions accumulate over time and damage the environment. Denote by \(S_{t}\) the pollution stock whose evolution is described by the following difference equation:

$$\begin{aligned} S_{t+1}=\eta \left( \sum _{i=1}^{N}e_{i,t}+dS_{t}\right) , \end{aligned}$$
(2)

where \(0<d<1\), such that \(1-d\) is the natural decay rate of pollution and \(\eta \) is a shock variable. We introduce \(\eta \) to account for ecological uncertainty, which can be due to, among other things, lack of information about Mother Nature’s capacity to absorb emissions. To illustrate, the players could have only partial knowledge about the precise motion of pollution due to, e.g., weather conditions (speed and direction of wind, temperature and humidity, etc.), or about the actual and future rate of pollution absorption by carbon sinks such as oceans and forests. Finally, players could be uncertain about future improvements in mitigation technologies. In brief, our specification in (2) is meant to account, in the most parsimonious way, for all these possible random ecological effects, which were typically ignored in the literature (see, e.g., Jørgensen et al. 2010).
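As a rough illustration of the dynamics in (2), the following sketch simulates the pollution stock for constant emissions. The numerical values, the Beta(4, 1) shock, and the function name `simulate_stock` are assumptions chosen purely for illustration; they do not come from the paper.

```python
import random

def simulate_stock(emissions, d, s0, periods, draw_eta, seed=0):
    """Iterate S_{t+1} = eta_t * (sum_i e_i + d * S_t) with constant emissions."""
    rng = random.Random(seed)          # seeded so the path is reproducible
    total = sum(emissions)
    path = [s0]
    for _ in range(periods):
        eta = draw_eta(rng)            # one realization of the shock in [0, 1]
        path.append(eta * (total + d * path[-1]))
    return path

# Hypothetical numbers: two countries and Beta(4, 1) shocks (mean 0.8).
random_path = simulate_stock([1.0, 0.8], d=0.9, s0=5.0, periods=50,
                             draw_eta=lambda rng: rng.betavariate(4.0, 1.0))

# With the shock fixed at 1, the stock converges to sum(e) / (1 - d).
no_shock_path = simulate_stock([1.0, 0.8], d=0.9, s0=5.0, periods=500,
                               draw_eta=lambda rng: 1.0)
```

Because the shock never exceeds one, the simulated stock stays below the no-shock steady state whenever it starts below it, which is one way to read \(\eta \) as a random absorption capacity.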

The pollution stock imposes an environmental damage cost to all players. We assume that this cost can be well approximated by the following linear function:

$$\begin{aligned} D_{i}\left( S_{t}\right) =\beta _{i}S_{t}, \end{aligned}$$
(3)

where \(\beta _{i}\) is the (positive) marginal cost of the pollution stock. This assumption, which is not uncommon in the literature (see, e.g., Hoel and Schneider 1997; Breton and Sbragia 2011), is motivated by mathematical tractability and by clarity in the exposition of results. Considering a non-linear damage cost is clearly of interest and is left for future research.

Assuming a welfare maximization behavior over an infinite horizon, the optimization problem of player \(i\) is then stated as follows:

$$\begin{aligned} \max _{\{e_{i,t}\}_{t=1}^{\infty }}{\mathbb {E}}\left[ \sum _{t=0}^{\infty } \delta ^{t}\left( e_{i,t}\left( \alpha _{i}-e_{i,t}-\gamma \sum _{j\ne i}^{N}e_{j,t}\right) -\beta _{i}S_{t}\right) \right] \end{aligned}$$
(4)

subject to the pollution dynamics (2), where \(\delta \, (0<\delta <1)\) is the common discount factor and \({\mathbb {E}}\) is the expectation operator with respect to all stochastic variables in the model.

Having described the model, we now introduce the information available to the players. That is, we define what the players know about the variable \(\eta \) and how they update their knowledge. We assume that \(\eta \) is a realization of a random variable \({\tilde{\eta }}\), whose conditional probability distribution is given by the function \(\phi (\eta |\theta ^{*})\), where \(\theta ^{*}\in \Theta \subset {\mathbb {R}}^{k}\) is the vector of sufficient parameters of the probability density function (p.d.f.) \(\phi \). Let the support of this p.d.f. be given by \({\mathcal {H}}\), such that \(\eta \in {\mathcal {H}}\subseteq (0,1]\). The players do not know the future realizations of the random variable \({\tilde{\eta }}\). There are two cases of interest. The first is where players know the distribution of the random variable. The second is where they do not know it, but learn about it. Specifically, we consider the following scenarios:

  1. Full information. In this scenario, the assumption is that the players know all the functional forms and parameters involved in the model, and in particular, \(\theta ^{*}\). Consequently, there is no need for learning to take place. This is the simplest and most common informational structure when dealing with a problem where uncertainty is present. This is our benchmark scenario, and we shall refer, in this context, to the players as informed players.

  2. Learning. Here, the players do not know the exact value of the parameter \(\theta ^{*}\), and hence do not know the distribution of the ecological shock. They hold prior beliefs about this unknown parameter, based on the available information, and update them as new information arrives: they observe past and present levels of the pollution stock, which are informative about \(\theta ^{*}\). Anticipation of learning is a natural consequence of an optimization problem that takes future learning into account; see, e.g., Koulovatianos et al. (2009) and Agbo (2014).

Compared to the full-information case, the learning case adds two layers of uncertainty. The first is simply that the objective distribution is replaced by the subjective distribution (due to structural uncertainty). The second is that a learner anticipates learning, i.e., future beliefs are random from today’s point of view, which makes future payoffs more random. In addition to these two cases, we study an intermediate case that separates these two layers of uncertainty: the case of no-learning. That is, the player does not know the true distribution of the random shock, has beliefs about it, but does not expect beliefs to evolve over time, i.e., there is structural uncertainty but no anticipation of learning (Footnote 3).

Our main motivation for retaining the intermediate case of no-learning lies in its methodological interest (in an experimental design sense). Indeed, whereas the learning scenario embeds both the structural uncertainty and the uncertainty related to the anticipation of learning with respect to the full information case, the no-learning scenario only accounts for the structural uncertainty with respect to the full information case. Therefore, by contrasting the results obtained in the three scenarios, we are able to separate the effects of the two uncertainties.

To wrap up, we have introduced an infinite-horizon \(N\)-player dynamic game with one control variable for each player (emissions), and a pollution stock whose evolution is governed by a stochastic difference equation. We assume throughout the paper that the players use feedback strategies, that is, strategies that are functions of the state of the system (pollution stock and relevant information about the stochastic process); we also assume that the relevant solution concept is Nash equilibrium.

3 Equilibria

In this section, we characterize the non-cooperative equilibria under the different scenarios.

3.1 Full Information

We recall that in this benchmark case, the players know the distribution of \({\tilde{\eta }}\) and the actual value of \(\theta \), namely \(\theta ^{*}\). Therefore, no learning takes place, that is, the players do not change their beliefs during the planning horizon. Put differently, despite the ecological randomness embedded in \(\eta \), there is no structural uncertainty in this scenario. Denoting by \({v_{i}^{I}}\left( S;\theta ^{*}\right) \) the value function of player \(i\), where the superscript \(I\) stands for informed player, the Hamilton–Jacobi–Bellman (HJB) equation of this player is given by

$$\begin{aligned} {v_{i}^{I}}\left( S;\theta ^{*}\right)&= \max _{e_{i}}\left\{ e_{i} \left( \alpha _{i}-e_{i}-\gamma \sum _{j\ne i}^{N}e_{j}\right) -\beta _{i}S\right. \nonumber \\&\left. +\,\delta \int _{\mathcal {H}}v_{i}^{I}\left( \eta \left( \sum _{j=1}^{N}e_{j}+dS\right) ; \theta ^{*}\right) \phi (\eta |\theta ^{*}){\mathbf {d}}\eta \right\} . \end{aligned}$$
(5)

Proposition 1 characterizes the feedback-Nash equilibrium strategies.

Proposition 1

The feedback-Nash equilibrium emissions of informed player \(i,\,i\in \{ 1,\ldots ,N\}\) are given by

$$\begin{aligned} {e_{i}^{I}}\left( S;\theta ^{*}\right) =\frac{1}{2-\gamma } \left( \alpha _{i}-\frac{\gamma \sum _{j=1}^{N}\alpha _{j}}{2+\left( N-1\right) \gamma }-\frac{\delta \mu (\theta ^{*})}{1-\delta d\mu (\theta ^{*})} \left( \beta _{i}-\frac{\gamma \sum _{j=1}^{N}\beta _{j}}{2+\left( N-1\right) \gamma }\right) \right) , \end{aligned}$$
(6)

where

$$\begin{aligned} \mu (\theta ^{*})=\int _{\mathcal {H}}\eta \phi (\eta | \theta ^{*}){\mathbf {d}}\eta . \end{aligned}$$
(7)

Proof

See “Appendix”. \(\square \)
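The closed form in (6) can be checked numerically. The sketch below evaluates (6) and verifies that it satisfies each player's first-order condition. That condition is written here under the assumption that the value function is linear in \(S\) with slope \(-\beta _{i}/(1-\delta d\mu (\theta ^{*}))\), which is the natural guess with linear damages; the Appendix proof is not reproduced here. All parameter values and function names are hypothetical.

```python
def informed_emissions(alpha, beta, gamma, delta, d, mu):
    """Evaluate the closed-form emissions (6) for every player."""
    n = len(alpha)
    k = 2.0 + (n - 1) * gamma
    shadow = delta * mu / (1.0 - delta * d * mu)   # discounted marginal damage factor
    sum_alpha, sum_beta = sum(alpha), sum(beta)
    return [(a - gamma * sum_alpha / k
             - shadow * (b - gamma * sum_beta / k)) / (2.0 - gamma)
            for a, b in zip(alpha, beta)]

def foc_residuals(e, alpha, beta, gamma, delta, d, mu):
    """Residuals of alpha_i - 2 e_i - gamma * sum_{j != i} e_j - shadow * beta_i = 0.

    Assumption: the continuation value is linear in S, so the marginal
    future cost of one more unit of emissions is shadow * beta_i.
    """
    shadow = delta * mu / (1.0 - delta * d * mu)
    total = sum(e)
    return [a - 2.0 * ei - gamma * (total - ei) - shadow * b
            for ei, a, b in zip(e, alpha, beta)]

# Hypothetical three-country example.
alpha = [10.0, 12.0, 9.0]
beta = [1.0, 0.5, 2.0]
e_informed = informed_emissions(alpha, beta, gamma=0.5, delta=0.95, d=0.9, mu=0.8)
residuals = foc_residuals(e_informed, alpha, beta, gamma=0.5, delta=0.95, d=0.9, mu=0.8)
```

Note that the emissions do not depend on \(S\), so the equilibrium strategies are constant in the state under this linear-damage specification.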

3.2 Learning

The alternative scenario to full information is that the players do not know some parameter values (here \(\theta ^{*}\)). We suppose that each player has her own prior beliefs, denoted by \(\xi _{i}(\theta )\), about the possible values of \(\theta ^{*}\). To be as general as possible, we allow these prior beliefs to be heterogeneous. Further, we assume that: (i) the beliefs are common knowledge; (ii) the players only observe the actual value of \(\eta \) at each period, and each player uses this new information to update her own beliefs about the value of \(\theta \); and finally (iii) the players are Bayesian learners, that is, they update their beliefs according to Bayes’ rule:

$$\begin{aligned} \hat{\xi }_{i}(\theta |\eta )=\frac{\phi (\eta |\theta )\xi _{i}(\theta )}{\int _{\Theta }\phi (\eta |x)\xi _{i}(x){\mathbf {d}}x}. \end{aligned}$$
(8)
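To make the updating rule (8) concrete, here is a minimal sketch on a discrete grid of parameter values. It assumes, purely for illustration, that \(\phi (\eta |\theta )\) is a Beta\((\theta ,1)\) density, so that \(\mu (\theta )=\theta /(1+\theta )\); the grid, the prior, and the function names are hypothetical.

```python
def bayes_update(thetas, prior, eta, density):
    """Discrete analogue of (8): posterior weight is likelihood times prior, normalized."""
    weights = [density(eta, theta) * p for theta, p in zip(thetas, prior)]
    norm = sum(weights)                # plays the role of the integral in the denominator
    return [w / norm for w in weights]

def beta_density(eta, theta):
    """Density of a Beta(theta, 1) variable on (0, 1]: theta * eta**(theta - 1)."""
    return theta * eta ** (theta - 1.0)

# Hypothetical grid of candidate parameters with a uniform prior.
thetas = [1.0, 2.0, 4.0]
prior = [1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0]

# A high realization of the shock shifts weight toward large theta (high mean).
posterior_high = bayes_update(thetas, prior, eta=0.9, density=beta_density)

# A low realization shifts weight toward small theta instead.
posterior_low = bayes_update(thetas, prior, eta=0.2, density=beta_density)
```

Applying `bayes_update` repeatedly, with each posterior used as the next prior, gives the period-by-period learning process described above.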

In the learning case, the value function \({v_{i}^{L}}(S;\xi _{i},\xi _{-i})\) of player \(i\) must satisfy the following HJB equation:

$$\begin{aligned} {v_{i}^{L}}(S;\xi _{i},\xi _{-i})&= \max _{e_{i}}\left\{ e_{i}\left( \alpha _{i}-e_{i}-\gamma \sum _{j\ne i}^{N}e_{j}\right) -\beta _{i}S\right. \nonumber \\&\left. +\,\delta \int _{\mathcal {H}}\left( v_{i}^{L} \left( \eta \left( \sum _{j=1}^{N}e_{j}+dS\right) ;\hat{\xi }_{i}(\theta |\eta ), \hat{\xi }_{-i}(\theta |\eta )\right) \right. \right. \nonumber \\&\left. \left. \times \left( \int _{\Theta }\phi \left( \eta |\theta \right) \xi _{i}(\theta ){\mathbf {d}} \theta \right) \right) {\mathbf {d}}\eta \right\} , \end{aligned}$$
(9)

where the superscript \(L\) refers to the learning player and \(\xi _{-i}=\{\xi _{j}\}_{j=1,j\ne i}^{N}\) collects the beliefs of the other players. Hence, each player anticipates not only changes in the future stock of pollution but also changes in all beliefs via the continuation value function.

Proposition 2 characterizes the feedback-Nash-equilibrium emissions strategies.

Proposition 2

The feedback-Nash equilibrium emissions of an anticipative-learning player \(i,\,i\in \{ 1,\ldots ,N\}\) are given by

$$\begin{aligned} {e_{i}^{L}}(S;\xi _{i},\xi _{-i})&= \frac{1}{2-\gamma }\left( \alpha _{i}- \frac{\gamma \sum _{j=1}^{N}\alpha _{j}}{2+(N-1)\gamma }\right) \nonumber \\&- \frac{\delta }{2-\gamma }\left( \beta _{i}\int _{\Theta }\frac{\mu (\theta ) \xi _{i}(\theta ){\mathbf {d}}\theta }{1-\delta d\mu (\theta )}-\frac{ \gamma \sum _{j=1}^{N}\beta _{j}\int _{\Theta }\frac{\mu (\theta ) \xi _{j}(\theta ){\mathbf {d}}\theta }{1-\delta d\mu (\theta )}}{ 2+\left( N-1\right) \gamma }\right) ,\quad \end{aligned}$$
(10)

where

$$\begin{aligned} \mu (\theta )=\int _{\mathcal {H}}\eta \phi (\eta |\theta ) {\mathbf {d}}\eta . \end{aligned}$$
(11)

Proof

See “Appendix”. \(\square \)

3.3 Intermediate Case: No-learning

As noted, a non-learner player does not know \(\theta ^{*}\) and uses prior beliefs to form expectations about future outcomes. However, a non-learner player does not anticipate updating her beliefs. Put differently, a non-learner player uses today’s beliefs to assess her expected future payoffs. In this case, the value function \({v_{i}^{ NL}}(S;\xi _{i},\xi _{-i})\) of player \(i\) must satisfy the following HJB equation:

$$\begin{aligned} {v_{i}^{ NL}}(S;\xi _{i},\xi _{-i})&= \max _{e_{i}}\left\{ e_{i}\left( \alpha _{i}-e_{i}-\gamma \sum _{j\ne i}^{N}e_{j}\right) -\beta _{i}S\right. \end{aligned}$$
(12)
$$\begin{aligned}&+\,\left. \delta \int _{\mathcal {H}}{v_{i}^{ NL}}\left( \eta \left( \sum _{j=1}^{N}e_{j}+dS\right) ;\xi _{i},\xi _{-i}\right) \left( \int _{\Theta } \phi (\eta |\theta )\xi _{i}(\theta ){\mathbf {d}}\theta \right) {\mathbf {d}} \eta \right\} ,\nonumber \\ \end{aligned}$$
(13)

where the superscript NL refers to the non-learner.

The difference between the value functions in (9) and (12) lies in the difference in the beliefs that player \(i\) uses to compute the expected value of future payoffs. Whereas a learner uses her updated beliefs \({\hat{\xi }}_{i}\), a non-learner uses her beliefs of today \(\xi _{i}\), i.e., a non-learner understands that she faces uncertainty but does not anticipate that she will learn in the future. Proposition 3 presents the emissions equilibrium strategy of a non-learner.

Proposition 3

The feedback-Nash equilibrium emissions of a non-learner player \(i\), \(i\in \{ 1,\ldots ,N\}\) are given by

$$\begin{aligned} {e_{i}^{ NL}}\left( S;\xi _{i},\xi _{-i}\right)&= \frac{1}{2-\gamma }\left( \alpha _{i}- \frac{\gamma \sum _{j=1}^{N}\alpha _{j}}{2+\left( N-1\right) \gamma }\right) \nonumber \\&-\frac{\delta }{2-\gamma }\left( \beta _{i}\frac{\int _{\Theta }\xi _{i}(\theta )\mu \left( \theta \right) \mathbf {d}\theta }{1-\delta d\int _{\Theta }\mu \left( \theta \right) \xi _{i}(\theta )\mathbf {d}\theta }-\frac{ \gamma \sum _{j=1}^{N}\beta _{j}\frac{\int _{\Theta }\xi _{j}(\theta )\mu \left( \theta \right) \mathbf {d}\theta }{1-\delta d\int _{\Theta }\mu \left( \theta \right) \xi _{j}(\theta )\mathbf {d}\theta }}{ 2+\left( N-1\right) \gamma }\right) .\nonumber \\ \end{aligned}$$
(14)

Proof

See “Appendix”. \(\square \)
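The only difference between (10) and (14) is where the expectation over \(\theta \) is taken: over the whole ratio \(\mu /(1-\delta d\mu )\) under learning, and over \(\mu \) alone under no-learning. The following sketch evaluates both formulas for discrete beliefs. The belief grids, parameter values, and function names are assumptions made for illustration only.

```python
def learning_term(mus, xi, delta, d):
    """Expectation of mu / (1 - delta*d*mu) under beliefs xi, as in (10)."""
    return sum(p * m / (1.0 - delta * d * m) for m, p in zip(mus, xi))

def no_learning_term(mus, xi, delta, d):
    """The same ratio evaluated at the expected mean, as in (14)."""
    mean = sum(p * m for m, p in zip(mus, xi))
    return mean / (1.0 - delta * d * mean)

def emissions(alpha, beta, gamma, delta, terms):
    """Common structure of (10) and (14); terms[i] is player i's belief term."""
    n = len(alpha)
    k = 2.0 + (n - 1) * gamma
    sum_alpha = sum(alpha)
    sum_damage = sum(b * t for b, t in zip(beta, terms))
    return [((a - gamma * sum_alpha / k)
             - delta * (b * t - gamma * sum_damage / k)) / (2.0 - gamma)
            for a, b, t in zip(alpha, beta, terms)]

# Hypothetical two-player example: mu(theta) takes two values on the grid.
mus = [0.5, 0.9]
beliefs = [[0.5, 0.5], [0.2, 0.8]]     # heterogeneous priors over the grid
alpha, beta = [10.0, 12.0], [1.0, 1.5]
gamma, delta, d = 0.5, 0.95, 0.9

e_learning = emissions(alpha, beta, gamma, delta,
                       [learning_term(mus, xi, delta, d) for xi in beliefs])
e_no_learning = emissions(alpha, beta, gamma, delta,
                          [no_learning_term(mus, xi, delta, d) for xi in beliefs])
```

With degenerate beliefs (all weight on one grid point) the two terms coincide, so both formulas collapse to the full-information expression evaluated at that point.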

4 Comparison

In this section, we compare the emissions strategies under the different scenarios. Proposition 4 shows that total emissions under no-learning exceed those under learning. This result indicates that the additional uncertainty generated by the anticipation of learning leads to more cautious behavior by the players.

Proposition 4

Total emissions are lower under learning than under no-learning, i.e.,

$$\begin{aligned} \sum _{i=1}^{N}{e_{i}^{L}}(S;\xi _{i},\xi _{-i})< \sum _{i=1}^{N}{e_{i}^{ NL}}(S;\xi _{i},\xi _{-i}). \end{aligned}$$
(15)

Proof

See “Proof of Proposition 4” section of Appendix. \(\square \)
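One way to see why the inequality in (15) holds, offered here only as a sketch based on (10) and (14) and not as the Appendix proof, is to sum both expressions over \(i\). The shorthand \(f\) and \(\Delta _{j}\) is introduced for this sketch and is not the paper's notation.

```latex
\begin{aligned}
\sum_{i=1}^{N} e_{i}^{L}-\sum_{i=1}^{N} e_{i}^{NL}
  &= -\frac{\delta}{2+(N-1)\gamma}\sum_{j=1}^{N}\beta_{j}\,\Delta_{j},\\
\Delta_{j}
  &= \int_{\Theta} f\bigl(\mu(\theta)\bigr)\,\xi_{j}(\theta)\,\mathbf{d}\theta
     - f\!\left(\int_{\Theta}\mu(\theta)\,\xi_{j}(\theta)\,\mathbf{d}\theta\right),
  \qquad f(x)=\frac{x}{1-\delta d x},\\
f''(x) &= \frac{2\delta d}{(1-\delta d x)^{3}} > 0
  \quad\text{for } x\in(0,1].
\end{aligned}
```

Since \(f\) is strictly convex, Jensen's inequality gives \(\Delta _{j}\ge 0\), with strict inequality whenever player \(j\)'s beliefs put weight on more than one value of \(\mu (\theta )\). As \(\beta _{j}>0\), total emissions are then lower under learning.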

Remark 1

Here and in the rest of the paper, the ordering of pollution stocks under different scenarios is the same as the ordering of emissions.

If we make the additional assumption that the players are homogeneous, we obtain the result in Proposition 5, where individual emissions are compared.

Proposition 5

If the players have homogeneous beliefs and face the same marginal damage cost, i.e., \(\xi _{i}(\theta )=\xi (\theta )\) and \(\beta _{i}=\beta ,\,\forall i=1,\ldots ,N\), then

$$\begin{aligned} {e_{i}^{L}}(S;\xi )<{e_{i}^{ NL}}(S;\xi ). \end{aligned}$$
(16)

Proof

See “Proof of Proposition 5” section of Appendix. \(\square \)

Proposition 4 is in line with the results in the literature, where it is shown that the risk generated from anticipated learning induces precautionary behavior, which in our context translates into emissions reductions; see, e.g., Agbo (2014). Further, it has been shown that uncertainty leads to emissions reduction due to risk considerations (see, e.g., Baker 2005; Bramoullé and Treich 2009). However, the result stated in Proposition 5 is due to the fact that players are homogeneous. When there is heterogeneity among the players, the ordering of individual emissions might be reversed, i.e., \({e_{i}^{L}}(S,\xi _{i},\xi _{-i}) > {e_{i}^{ NL}}( S,\xi _{i},\xi _{-i})\). Indeed, from (10) and (14), we have the following equivalence:

$$\begin{aligned}&{e_{i}^{L}}( S,\xi _{i},\xi _{-i}) >{e_{i}^{ NL}}( S,\xi _{i},\xi _{-i})\nonumber \\&\quad \Leftrightarrow \frac{\left( 2+(N-1)\gamma \right) \beta _{i}\left( \int _{\Theta }\frac{\mu \left( \theta \right) \xi _{i}\left( \theta \right) {\mathbf {d}}\theta }{1-\delta d\mu \left( \theta \right) }- \frac{\int _{\Theta }\mu \left( \theta \right) \xi _{i}\left( \theta \right) {\mathbf {d}}\theta }{1-\delta d\int _{\Theta }\mu \left( \theta \right) \xi _{i}\left( \theta \right) {\mathbf {d}}\theta }\right) }{\gamma \sum _{j=1}^{N}\beta _{j}\left( \int _{\Theta }\frac{\mu \left( \theta \right) \xi _{j}\left( \theta \right) {\mathbf {d}}\theta }{1-\delta d\mu \left( \theta \right) }-\frac{\int _{\Theta }\mu \left( \theta \right) \xi _{j}\left( \theta \right) {\mathbf {d}}\theta }{1-\delta d\int _{\Theta }\mu \left( \theta \right) \xi _{j}\left( \theta \right) {\mathbf {d}}\theta }\right) }<1, \end{aligned}$$
(17)

where the numerator is the weighted self-precautionary effect due to the additional uncertainty generated by anticipated learning, and the denominator measures the same effect for all other players. This effect for all other players is referred to as differential informational externality in Agbo (2014). The inequality in (17) can be rearranged as

$$\begin{aligned} \gamma >\frac{2\Psi }{1-(N-1)\Psi }, \end{aligned}$$
(18)

where

$$\begin{aligned} \Psi =\frac{\beta _{i}\left( \int _{\Theta }\frac{\mu \left( \theta \right) \xi _{i}\left( \theta \right) {\mathbf {d}}\theta }{1-\delta d\mu \left( \theta \right) }-\frac{\int _{\Theta }\mu \left( \theta \right) \xi _{i}\left( \theta \right) {\mathbf {d}}\theta }{1-\delta d\int _{\Theta }\mu \left( \theta \right) \xi _{i}\left( \theta \right) {\mathbf {d}}\theta }\right) }{ \sum _{j=1}^{N}\beta _{j}\left( \int _{\Theta }\frac{\mu \left( \theta \right) \xi _{j}\left( \theta \right) {\mathbf {d}}\theta }{1-\delta d\mu \left( \theta \right) }-\frac{\int _{\Theta }\mu \left( \theta \right) \xi _{j}\left( \theta \right) {\mathbf {d}}\theta }{1-\delta d\int _{\Theta }\mu \left( \theta \right) \xi _{j}\left( \theta \right) {\mathbf {d}}\theta }\right) }, \end{aligned}$$

meaning that, provided \(1-(N-1)\Psi >0\), learning increases (decreases) the emissions of player \(i\) when \(\gamma \) is large (small). In other words, our results show that the sign of the difference in emissions under learning and no-learning depends on the value of the market externality as captured by \(\gamma \). Note that with homogeneous players condition (18) becomes \(\gamma >2\), which falls outside the range of the parameter \(\gamma \in [0,2)\). However, when there is heterogeneity in \(\beta _{i}\) and/or \(\xi _{i}\), then it is possible for the right-hand side of (18) to be strictly less than 2, which implies that there exist values of \(\gamma \in (0,2)\) such that condition (18) holds. For instance, if player \(i\)’s marginal cost of pollution, \(\beta _{i}\), is sufficiently lower than the average marginal cost across the players, then \({e_{i}^{L}}( S,\xi _{i},\xi _{-i}) >{e_{i}^{ NL}}( S,\xi _{i},\xi _{-i})\). Similarly, if player \(i\)’s prior beliefs are sufficiently less risky than the other players’ beliefs, then \({e_{i}^{L}}( S,\xi _{i},\xi _{-i}) >{e_{i}^{ NL}}( S,\xi _{i},\xi _{-i})\).
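Condition (18) can be illustrated with a small numerical sketch. It assumes two players with identical beliefs but very different marginal damage costs, so that player 1 has a low \(\Psi \). All numbers and function names are hypothetical, and the difference \(e_{i}^{L}-e_{i}^{NL}\) is obtained by subtracting (14) from (10).

```python
def gap(mus, xi, delta, d):
    """Learning term minus no-learning term for one player's beliefs."""
    learn = sum(p * m / (1.0 - delta * d * m) for m, p in zip(mus, xi))
    mean = sum(p * m for m, p in zip(mus, xi))
    return learn - mean / (1.0 - delta * d * mean)

def gamma_threshold(i, beta, gaps):
    """Right-hand side of (18) for player i; requires 1 - (N-1)*Psi > 0."""
    n = len(beta)
    psi = beta[i] * gaps[i] / sum(b * g for b, g in zip(beta, gaps))
    return 2.0 * psi / (1.0 - (n - 1) * psi)

def learning_minus_no_learning(i, beta, gaps, gamma, delta):
    """e_i^L - e_i^NL, obtained by subtracting (14) from (10)."""
    n = len(beta)
    k = 2.0 + (n - 1) * gamma
    weighted = sum(b * g for b, g in zip(beta, gaps))
    return -delta / (2.0 - gamma) * (beta[i] * gaps[i] - gamma * weighted / k)

# Hypothetical example: same beliefs, but player 1 (index 0) has a low beta.
delta, d = 0.95, 0.9
beta = [0.2, 2.0]
common_gap = gap([0.5, 0.9], [0.5, 0.5], delta, d)
gaps = [common_gap, common_gap]

threshold = gamma_threshold(0, beta, gaps)
diff_above = learning_minus_no_learning(0, beta, gaps, gamma=0.5, delta=delta)
diff_below = learning_minus_no_learning(0, beta, gaps, gamma=0.1, delta=delta)
```

In this example the threshold lies inside \((0,2)\): for a value of \(\gamma \) above it, player 1 emits more under learning than under no-learning, while for a value below it she emits less.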

To compare emissions under full information and no-learning cases, we look at whether beliefs about the mean of the stochastic parameter are unbiased or biased. These comparisons provide information about the effect of structural uncertainty, i.e., the difference between knowing the structure (i.e., knowing \(\theta ^{*}\)) and not knowing the structure (i.e., \(\theta ^{*}\) is unknown). The results, which are summarized in Proposition 6, convey a different message than the one in Bramoullé and Treich (2009). Indeed, we obtain that depending on the belief bias, considering structural uncertainty may increase, decrease or even not affect the emissions strategies.

Proposition 6

  1. If the beliefs about the mean of the ecological shock, \(\tilde{\eta }\), are unbiased, i.e., \(\mu (\theta ^{*})=\int _{\Theta }\mu (\theta )\xi _{i}(\theta ) {\mathbf {d}}\theta \), then \({e_{i}^{I}}={e_{i}^{ NL}}\);

  2. If, for player \(i\), \(\mu (\theta ^{*})<\int _{\Theta }\mu (\theta )\xi _{i}(\theta ){\mathbf {d}}\theta \), and, \(\forall j\ne i\), \(\mu (\theta ^{*})=\int _{\Theta }\mu \left( \theta \right) \xi _{j}(\theta ){\mathbf {d}}\theta \), then

    $$\begin{aligned} {e_{i}^{I}}\left( S,\xi _{i},\xi _{-i}\right) >{e_{i}^{ NL}} \left( S,\xi _{i},\xi _{-i}\right) \end{aligned}$$
    (19)

  3. If, for player \(i\), \(\mu (\theta ^{*})>\int _{\Theta }\mu (\theta )\xi _{i}(\theta ){\mathbf {d}}\theta \), and, \(\forall j\ne i\), \(\mu (\theta ^{*})=\int _{\Theta }\mu \left( \theta \right) \xi _{j} (\theta ){\mathbf {d}}\theta \), then

    $$\begin{aligned} {e_{i}^{I}}\left( S,\xi _{i},\xi _{-i}\right) <{e_{i}^{ NL}} \left( S,\xi _{i},\xi _{-i}\right) \end{aligned}$$
    (20)

Proof

It suffices to compare (14) and (6) to obtain the results. \(\square \)

Introducing the intermediate case of no-learning allows us not only to distinguish between the effect of structural uncertainty and that of uncertainty due to anticipation on the emissions strategies, but also to see that these two sources of uncertainty can even have opposite effects on these strategies.

To illustrate, we have drawn in Fig. 1 the emissions trajectories for two players with different beliefs, assuming a beta distribution (Footnote 4). When the structural uncertainty causes a decrease in emissions with respect to the full information case (solid straight line in the figures), the uncertainty due to anticipation goes in the same direction and decreases emissions further. However, when the structural uncertainty increases the emissions, the uncertainty due to anticipation moderates this effect. Put differently, whereas the uncertainty due to anticipation may mitigate (at least the total) emissions, the effect of structural uncertainty depends on the model assumptions, and more specifically, on the slope and curvature of the distribution function. Proposition 7 gives a more precise statement in the special case of unbiased and homogeneous beliefs, i.e., \(\xi _{i}(\theta )=\xi (\theta ),\,\forall i=1,\ldots ,N.\)

Fig. 1 Decomposing the effects of structural uncertainty and uncertainty due to anticipation for a Beta distribution \(B(a,b)\) with parameters \(a\) and \(b\) (mean \(\frac{a}{a+b}\)) in a two-player game with heterogeneous beliefs

Proposition 7

Assume that the beliefs are homogeneous and unbiased, i.e., \(\theta ^{*}=\int _{\Theta }\theta \xi _{i}(\theta ){\mathbf {d}} \theta ,\,\forall i=1,\ldots ,N\), then:

  • If \(\mu ^{\prime \prime }>0\), then \(\sum {e_{i}^{I}}\left( S;\theta ^{*}\right) >\sum {e_{i}^{ NL}}\left( S,\xi _{i},\xi _{-i}\right) \);

  • If \(\mu ^{\prime \prime }=0\), then \(\sum {e_{i}^{I}}\left( S;\theta ^{*}\right) =\sum {e_{i}^{ NL}}\left( S,\xi _{i},\xi _{-i}\right) \);

  • If \(\mu ^{\prime \prime }<0\), then \(\sum {e_{i}^{I}}\left( S;\theta ^{*}\right) <\sum {e_{i}^{ NL}}\left( S,\xi _{i},\xi _{-i}\right) \).

Proof

Since \(\mu (\theta ^{*})=\mu ({\mathbb {E}}(\theta )),\) if \(\mu ^{\prime \prime }>0\, (\mu ^{\prime \prime }<0)\), i.e., \(\mu \) is convex (concave), then by Jensen’s inequality, we have \(\mu ({\mathbb {E}}(\theta ))>{\mathbb {E}}(\mu (\theta )) (\mu ({\mathbb {E}}(\theta ))<{\mathbb {E}}(\mu (\theta )))\), which completes the proof. \(\square \)

Note that the above result also holds when the players are heterogeneous in the model’s parameters.

Recall that in the case of full information (i.e., \(\theta ^{*}\) is known), uncertainty enters the optimization problem through the mean of the shock, i.e., \(\mu (\theta ^{*})\). Hence, in the no-learning case, replacing \(\mu (\theta ^{*})\) by \(\int \nolimits _{\Theta }\mu (\theta )\xi (\theta )d\theta \) in (6) yields (14). It follows that the uncertainty in the no-learning case is also displayed through the conditional mean \(\mu (\theta )\), which implies that any change in prior beliefs about \(\theta \) has an effect on behavior only through the conditional mean. When comparing the case of full information with the no-learning case under unbiased beliefs about the unknown parameter, it is then the curvature of the conditional mean that determines the ordering of total emissions under full information and no-learning. Specifically, if the conditional mean is convex in \(\theta \), i.e., \(\mu ^{\prime \prime }(\theta )>0\), then a mean-preserving increase in risk for unbiased prior beliefs implies a lower anticipated level of pollution (i.e., \(\mu (\theta ^{*})>\int \nolimits _{\Theta }\mu (\theta )\xi (\theta )d\theta \)), which discourages total emissions. The opposite result occurs when the conditional mean is concave in \(\theta \). Finally, if the conditional mean is linear in \(\theta \) (i.e., \(\mu ^{\prime \prime }(\theta )=0\)), then a mean-preserving increase in risk has no effect on total emissions.

In the next two propositions we state some welfare comparative results. Denote by \({v_{i}^{x}}(S;\cdot )\) the value function (welfare) under learning scenario \(x\in \{ NL,L,I\}\), that is,

$$\begin{aligned} {v_{i}^{x}}(S;\cdot )={\kappa _{i,1}^{x}}S+{\kappa _{i,2}^{x}}. \end{aligned}$$
(21)

Unfortunately, it is not possible to compare the intercept terms \({\kappa _{i,2}^{x}}\) across scenarios, but we can compare the slopes of the value functions. Proposition 8 states that the anticipation of learning increases the negative effect of increasing pollution on welfare.

Proposition 8

Under learning and no-learning, the slopes of the value functions compare as follows:

$$\begin{aligned} {\kappa _{i,1}^{L}}<{\kappa _{i,1}^{ NL}}<0. \end{aligned}$$
(22)

Proof

See “Proof of Proposition 8” section of Appendix. \(\square \)

The next proposition shows that the beliefs effect (captured by the difference between the full-information and no-learning cases) may reduce or enhance the negative effect of increasing pollution on welfare.

Proposition 9

The ordering of \({\kappa _{i,1}^{I}}\) and \(\kappa _{i,1}^{ NL}\) depends on the bias in beliefs as follows:

  1.

    For \(\mu (\theta ^{*})>\int _{\Theta }\mu (\theta ) \xi _{i}(\theta )d\theta \),

    $$\begin{aligned} {\kappa _{i,1}^{I}}<\kappa _{i,1}^{ NL}<0. \end{aligned}$$
    (23)
  2.

    For \(\mu (\theta ^{*})<\int _{\Theta }\mu (\theta ) \xi _{i}(\theta )d\theta \),

    $$\begin{aligned} \kappa _{i,1}^{ NL}<{\kappa _{i,1}^{I}}<0. \end{aligned}$$
    (24)

5 More Optimistic/Pessimistic Beliefs

The players form and change their beliefs according to the information they receive from different sources over time, e.g., lobbyists and scientific literature. In this section, we analyze the impact of changes in the players’ beliefs on their emissions. To do so, we first introduce the concept of first-order strict stochastic dominance, which is necessary to compare better and worse situations.

Definition 1

Consider two probability density functions \({\xi _{i}^{1}}\) and \({\xi _{i}^{2}}\). We say that \({\xi _{i}^{1}}\) first-order strict-stochastically dominates \({\xi _{i}^{2}},\,{\xi _{i}^{1}}\succ _{1} {\xi _{i}^{2}}\), if for any increasing function \(u:{\mathbb {R}}\rightarrow {\mathbb {R}}\), we have \(\int u\left( x\right) {\xi _{i}^{1}}\left( x\right) {\mathbf {d}}x>\int u\left( x\right) {\xi _{i}^{2}}\left( x\right) {\mathbf {d}}x\).

Next, we clarify what is meant in our context by a change in beliefs. Since beliefs are about \(\theta \) and not \(\eta \), to interpret the meaning of a change in beliefs, we need to know how the conditional mean of \(\eta \), that is, \(\mu (\theta )\), varies with \(\theta \). Keeping in mind Definition 1, suppose that the beliefs of player \(i\) change from \({\xi _{i}^{1}}\) to \({\xi _{i}^{2}}\), with \({\xi _{i}^{1}}\succ _{1}{\xi _{i}^{2}}\) and \( \mu ^{\prime }(\theta )>0\). Intuitively, this player’s beliefs have changed in such a way that she expects lower values of the unknown parameter \(\theta \), and since \(\mu \) is increasing, she also expects lower values of the unknown variable \(\eta \). Therefore, under \({\xi _{i}^{2}}\), this player expects lower levels of pollution accumulation, and consequently, a lower environmental cost. Put differently, by believing in \({\xi _{i}^{2}}\) instead of \({\xi _{i}^{1}}\), this player becomes more optimistic about the environment. If \(\mu ^{\prime }(\theta )<0\), then the result is the opposite, and a change from \({\xi _{i}^{1}}\) to \({\xi _{i}^{2}}\) means that player \(i\) is now more pessimistic. Proposition 10 presents the impact of becoming more optimistic on emissions strategies.

Proposition 10

Assume that player \(i\) becomes more optimistic while all the other players’ priors remain unchanged, i.e., we have the two \(N\)-tuples \( \xi ^{1}=\left\{ \xi _{1}^{1},\ldots ,{\xi _{i}^{1}},\ldots ,\xi _{N}^{1}\right\} \) and \(\xi ^{2}=\left\{ {\xi _{1}^{1}},\ldots ,{\xi _{i-1}^{1}},{\xi _{i}^{2}}, {\xi _{i+1}^{1}},\ldots ,{\xi _{N}^{1}}\right\} \). If \({\xi _{i}^{1}}\succ _{1}{\xi _{i}^{2}}\) and \(\mu ^{\prime }( \theta )>0\), then for \(x\in \{L,NL\}\) and \(\gamma \ge 0\) we have

  • \(e_{i}^{x}\left( S,\xi ^{2}\right) >e_{i}^{x}(S,\xi ^{1})\);

  • \(e_{j}^{x}\left( S,\xi ^{2}\right) \le e_{j}^{x}(S,\xi ^{1}); \,\forall j\ne i\), (equality holds for \(\gamma =0\));

  • \(\sum _{z=1}^{N}e_{z}^{x}\left( S,\xi ^{2}\right) > \sum _{z=1}^{N}e_{z}^{x}(S,\xi ^{1})\).

Proof

See “Proof of Proposition 10” section of Appendix. \(\square \)

Proposition 10 shows that if player \(i\) becomes more optimistic, then she increases her emissions, while all the other players (weakly) decrease theirs. This shows that emissions are strategic substitutes. More importantly, we obtain that the reduction by all other players is not sufficient to compensate for the increase in player \(i\)’s emissions. Optimism clearly has a negative effect on the environment; this may explain why environmentalists prefer to highlight poor prospects and play down positive news. To illustrate the results in Proposition 10, we consider the following simple example: assume that \(\eta \) has a uniform distribution with unknown support \([0,\theta ]\) and that the beliefs \(\xi (\theta )\) are defined over \(\theta \in [0,1]\). Then \( \mu (\theta )=\frac{\theta }{2}\), so \(\mu ^{\prime }\left( \theta \right) >0\). By Proposition 10, if \({\xi _{i}^{1}}\succ _{1} {\xi _{i}^{2}}\), then \({e_{i}^{L}}\left( S,\xi ^{2}\right) >{e_{i}^{L}}(S,\xi ^{1})\).
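The uniform example can be checked numerically. The following is a minimal sketch under assumptions that are not in the text: the two priors are taken to be the Beta densities \(B(3,2)\) for \(\xi ^{1}\) and \(B(2,3)\) for \(\xi ^{2}\), and all function names are hypothetical. The code verifies the dominance \(\xi ^{1}\succ _{1}\xi ^{2}\) on a grid through the standard c.d.f. characterization of first-order dominance, and then computes the prior expectation of \(\mu (\theta )=\frac{\theta }{2}\) under each belief.

```python
from math import gamma


def beta_pdf(theta, a, b):
    """Density of a Beta(a, b) belief over theta in [0, 1]."""
    norm = gamma(a + b) / (gamma(a) * gamma(b))
    return norm * theta ** (a - 1) * (1.0 - theta) ** (b - 1)


def integrate(f, lo=0.0, hi=1.0, n=20000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (k + 0.5) * h) for k in range(n))


def mu(theta):
    """Conditional mean of eta when eta is uniform on [0, theta]."""
    return theta / 2.0


def expected_mu(a, b):
    """Prior expectation of mu(theta) under a Beta(a, b) belief."""
    return integrate(lambda t: mu(t) * beta_pdf(t, a, b))


def cdf(x, a, b):
    """Beta(a, b) c.d.f. at x, obtained by numerical integration."""
    return integrate(lambda t: beta_pdf(t, a, b), 0.0, x, n=2000)


# Illustrative beliefs (an assumption, not taken from the paper).
XI_1 = (3, 2)  # dominating prior: more weight on high theta
XI_2 = (2, 3)  # dominated prior: more weight on low theta

# First-order dominance: the c.d.f. of xi^1 lies below that of xi^2.
grid = [k / 20 for k in range(1, 20)]
dominates = all(cdf(x, *XI_1) < cdf(x, *XI_2) for x in grid)

m1, m2 = expected_mu(*XI_1), expected_mu(*XI_2)
print(dominates)
print(round(m1, 4), round(m2, 4))
```

Under these assumed priors the expected conditional mean is lower under \(\xi ^{2}\), so a player holding \(\xi ^{2}\) anticipates less pollution accumulation. This is the sense in which she is more optimistic, and Proposition 10 then gives the ordering of her emissions.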

The following corollary shows that if all players become more optimistic, then the total emissions increase.

Corollary 1

If all countries become more optimistic about the environment, i.e., \({\xi _{i}^{1}}\succ _{1}{\xi _{i}^{2}},\,\forall i\in \{1,\ldots ,N\}\), and \(\mu ^{\prime }(\theta )>0\), then \(\forall x\in \left\{ L,NL\right\} \), we have

$$\begin{aligned} \sum _{z=1}^{N}e_{z}^{x}\left( S,\xi ^{2}\right) >\sum _{z=1}^{N}e_{z}^{x}(S, \xi ^{1}). \end{aligned}$$
(25)

We saw that emissions are strategic substitutes. In other words, while a player’s more optimistic beliefs lead to an increase in her own emissions, more optimistic outsider beliefs have the opposite effect on this player. Corollary 1 states that, in total, the effects on oneself dominate those of outsiders, and if everyone becomes more optimistic, the total emissions will increase. However, the effect of a change in everyone’s beliefs on individual emissions is ambiguous and depends on the weight of self-induced versus outsider effects.Footnote 5 Further, as our model allows for heterogeneity in beliefs, we can compare the emissions strategies of players with different beliefs. Indeed, if players \(i\) and \(j\) are similar in their parameters but not in their beliefs, such that player \(i\) is more optimistic than player \(j\), then

$$\begin{aligned} e_{i}^{x}\left( S,\xi \right) >e_{j}^{x}(S,\xi ), \,\,x\in \left\{ L,NL\right\} , \end{aligned}$$
(26)

that is, player \(i\) will emit more than player \(j\) under both the learning and no-learning cases. Now, if players change their beliefs about the functional form of \(\phi \left( \eta \right) \), then we can characterize more directly the impact of changes in optimism/pessimism. Recalling that a player becomes more pessimistic when she gives higher values on average to \(\eta \), Proposition 11 presents the effect of such a change on the players’ emissions.

Proposition 11

If \(\phi _{i}^{1}(\eta )\succ _{1}\phi _{i}^{2}(\eta )\), i.e., player \(i\) assigns a higher mean \(\mu (\theta )\) to \(\eta \) under \( \phi _{i}^{1}(\eta )\) than under \(\phi _{i}^{2}(\eta )\), then

$$\begin{aligned} e_{i}^{x2}\left( S,\xi \right) >e_{i}^{x1}(S,\xi ),x\in \left\{ L,NL\right\} . \end{aligned}$$
(27)

Proof

Since \(\mu (\theta )=\int _{\mathcal {H}}\eta \phi (\eta | \theta ){\mathbf {d}}\eta \) and the identity \(u(\eta )=\eta \) is an increasing function, if \(\phi _{i}^{1}(\eta )\succ _{1}\phi _{i}^{2}(\eta )\), then by Definition 1 we have \(\mu _{i}^{1}(\theta )>\mu _{i}^{2}(\theta )\). Given Proposition 2 and Proposition 3, the proof is complete. \(\square \)

In a nutshell, the main takeaway of this section is that increased optimism leads to increased total emissions.

6 Belief in Increased Peril

To represent optimism/pessimism, we used first-order stochastic dominance since it deals with “better” vs. “worse” situations. To compare the relative riskiness (or volatility) of two distributions, we introduce and use second-order stochastic dominance.

Definition 2

Given two distributions with the same mean, the distribution with p.d.f. \(\phi ^{1}(\theta )\) second-order stochastically dominates the distribution with p.d.f. \(\phi ^{2}\left( \theta \right) \), denoted \(\phi ^{1}(\theta )\succ _{2}\phi ^{2}(\theta )\), if for every non-decreasing concave function \(u:{\mathbb {R}}\rightarrow \mathbb {R}\), we have \(\int _{\mathbb {R}}u\left( x\right) \phi ^{1}\left( x\right) {\mathbf {d}}x\ge \int _{\mathbb {R}}u\left( x\right) \phi ^{2}\left( x\right) {\mathbf {d}}x\).

By definition, if \(\phi ^{1}(\theta )\succ _{2} \phi ^{2}(\theta )\), then \(\phi ^{1}\) is less volatile or less risky. The equality of the distributions’ means is crucial in the above definition as it allows us to distinguish the effects of more optimistic/pessimistic views from the effects related to the riskiness of the situation. Proposition 12 presents the effects of a change in beliefs toward a riskier situation.

Proposition 12

Suppose \(\mu ^{\prime }(\theta )>0\), and consider the two \(N\)-tuples \(\xi ^{1}=\left\{ \xi _{1}^{1},\ldots ,{\xi _{i}^{1}},\ldots ,\xi _{N}^{1}\right\} \) and \(\xi ^{2}=\left\{ \xi _{1}^{1},\ldots ,\xi _{i-1}^{1}, {\xi _{i}^{2}},\xi _{i+1}^{1},\ldots ,\xi _{N}^{1} \right\} \). If country \(i\) feels more at-risk under \({\xi _{i}^{2}}\) than under \({\xi _{i}^{1}}\), i.e., \({\xi _{i}^{1}}(\theta )\succ _{2}{\xi _{i}^{2}} (\theta )\), then for \(\gamma \ge 0\) we have

  1.

    \({e_{i}^{L}}\left( S,\xi ^{2}\right) \le {e_{i}^{L}}(S,\xi ^{1})\);

  2.

    \(e_{j}^{L}\left( S,\xi ^{2}\right) \ge e_{j}^{L}(S,\xi ^{1}),\,\forall j\ne i\), (equality holds for \(\gamma =0\));

  3.

    \(\sum _{z=1}^{N}e_{z}^{L}\left( S,\xi ^{2}\right) \le \sum _{z=1}^{N}e_{z}^{L}\left( S,\xi ^{1}\right) \).

Proof

See “Proof of Proposition 12” section of Appendix. \(\square \)

According to Proposition 12, given that the unknown variable’s mean is increasing in \(\theta \), if a player who learns perceives a higher-risk situation while all other players’ beliefs remain unchanged, then she will decrease her emissions, while the other players will increase their own (by strategic substitution). As in Proposition 10, their reactions are not strong enough to overcome the direct effect, and the total emissions will be lower. If everyone feels more at-risk, the net effect on player \(i\)’s emissions is ambiguous, since the increase in one’s own sense of risk tends to decrease the emissions while the increase in the opponents’ sense of peril is an incentive to increase emissions.
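The riskiness ordering in Definition 2 can be illustrated with a short numerical sketch. It assumes, purely for illustration, two symmetric Beta beliefs with the same mean \(\frac{1}{2}\), namely \(B(4,4)\) and \(B(2,2)\); these parameters and all function names are hypothetical and do not come from the model. The code checks that the means coincide, that the more concentrated belief has the lower variance, and that it yields the higher expectation of the non-decreasing concave test function \(u(\theta )=\sqrt{\theta }\). A single test function does not establish dominance; it only shows the direction that Definition 2 requires.

```python
from math import gamma, sqrt


def beta_pdf(theta, a, b):
    """Density of a Beta(a, b) belief over theta in [0, 1]."""
    norm = gamma(a + b) / (gamma(a) * gamma(b))
    return norm * theta ** (a - 1) * (1.0 - theta) ** (b - 1)


def integrate(f, lo=0.0, hi=1.0, n=20000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (k + 0.5) * h) for k in range(n))


def expect(u, a, b):
    """Expectation of u(theta) under a Beta(a, b) belief."""
    return integrate(lambda t: u(t) * beta_pdf(t, a, b))


# Illustrative beliefs with equal means (an assumption, not from the paper).
LESS_RISKY = (4, 4)  # more concentrated around 1/2
MORE_RISKY = (2, 2)  # more spread out around 1/2

mean_1 = expect(lambda t: t, *LESS_RISKY)
mean_2 = expect(lambda t: t, *MORE_RISKY)
var_1 = expect(lambda t: (t - mean_1) ** 2, *LESS_RISKY)
var_2 = expect(lambda t: (t - mean_2) ** 2, *MORE_RISKY)

# Expectation of a non-decreasing concave test function, u = sqrt.
eu_1 = expect(sqrt, *LESS_RISKY)
eu_2 = expect(sqrt, *MORE_RISKY)

print("means:", round(mean_1, 4), round(mean_2, 4))
print("variances ordered:", var_1 < var_2)
print("E[u] ordered:", eu_1 > eu_2)
```

In the terms of Proposition 12, a player whose belief moves from the first to the second of these assumed distributions keeps the same expected \(\theta \) but feels more at-risk.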

7 Concluding Remarks

We introduced two important features, ecological uncertainty and learning, into a dynamic game of pollution emissions, with the objective of answering the question of whether uncertainty decreases emissions or not. While one might expect uncertainty to alleviate the commons problem, we obtained that, depending on the model setup, the effect of uncertainty can go in both directions. By decomposing the different sources of uncertainty, we were able to separate the effects of structural uncertainty from those due to learning. We also studied the impacts of changes in players’ beliefs, either toward greater optimism/pessimism or a greater sense of peril. Our results suggest that more pessimistic beliefs about the pollution stock and a greater sense of peril both decrease the total emissions.

A number of extensions of this work would be of interest to enhance our understanding of the impact of different sources of uncertainty on the results. First, we could add economic uncertainty in the form of a random marginal cost of pollution, or in the parameters of the revenue function, that is, \(\alpha \) and \(\beta \). Note, however, that introducing uncertainty in terms of \(\alpha \) or \(\beta \) does not bring a profound effect of learning on behavior if we keep the assumption of a linear damage cost. Indeed, for these cases, the policy functions under learning are equivalent to the policy functions under no-learning. That is, the anticipation of learning is nonexistent. Hence, learning has a different effect depending on which parameter is being learned about.Footnote 6 Second, we fully acknowledge that some of our results depend on our assumption of a linear damage cost, and considering a non-linear damage cost is clearly an extension worth analyzing. Third, we could envision linking the realized values of \(\eta \) to the values of the revenue parameters \(\alpha \) and \(\beta \). For instance, if \(\eta \) is very high, this may cause a catastrophe leading to a discrete fall in the value of \(\alpha \). The possibility of catastrophes may affect players’ behavior, either by enhancing the precautionary motive or by encouraging countries to emit more before the catastrophe hits. Also, different behavioral assumptions, such as loss aversion, may lead to outcomes other than those derived in this model.Footnote 7 Finally, a very challenging extension is to add abatement technology as a state variable and let the emissions levels be a decreasing function of the available stock of this technology. With this addition, a player can respond to the uncertainty by reducing production, by reducing pollution emissions per production unit, or by a mix of the two options.