1 Introduction

Most environmental damage does not depend solely on the flows of pollutants or on their accumulated stocks. Investments dedicated to the management of these pollutants can alter their environmental impact. In the case of climate change, the adaptation effort can be seen as such an investment; in the case of nuclear waste, it can be the technology used to store the radioactive material. In this context, the need to directly tax the environmental damage emerges. By providing incentives to mitigate pollution as well as incentives to invest in the management of accumulated pollutants, this type of policy instrument can indeed restore the social optimum, provided that the only market failure is the environmental one.

The aim of the present paper is first to study the design of a first-best policy in such a context. Its implementation may however prove difficult in the real world. We thus also consider more easily implementable policies, such as taxes on resource use (see e.g., [7, 24, 25] or [11]) or on pollutant flows or stocks (e.g., carbon taxes; see Footnote 1). Such policy tools can restore the social optimum when the environmental damage only stems from the accumulation of the pollutant. However, if this damage results from the combination of the stock of pollutant with investments made in its management, these tools cannot yield a first-best outcome. Indeed, by only ascribing a cost to the production of pollutants, they do not provide incentives to invest in the management of their accumulated stock (e.g., adaptation to climate change). We nevertheless show how they can constitute efficient substitutes for the first-best policy when they are implemented together with tools that provide these incentives.

In the context of climate change, the adaptation effort can be considered as the way the accumulated stock of greenhouse gases (hereafter GHG) is handled: for a given level of atmospheric GHG concentration, and thereby a given amount of radiative forcing and subsequent warming, the physical consequences of the changing climate can be partially reduced through investments in, say, inundation barriers for flood-prone areas, or greater efficiency of water consumption in drought-stricken areas. One can consider that this is also the case for (long-term) radioactive waste, rare earths, heavy metals or asbestos for instance. In these cases, the use of natural resources yields polluting by-products that have to be correctly handled so as to limit their environmental impact. Here, the environmental disutility depends not only on the accumulated stock of pollutants, but crucially also on the effort put into the management of these stocks. For a given flow of pollutant, and thus a given increase in the existing stock, the rise in the environmental damage depends on the methods and technologies used to handle it and, in some cases, store it. Simply put, the production of an identical quantity of plutonium has a very different environmental impact depending on whether it is illegally dumped in the ocean or geologically stored by the military.

Here, the environmental damage must be seen as the harmful effects of pollutants that remain despite the investments put into their management. For example, it can be the negative impact of climate change remaining after adaptation; it can also be the residual impact on public health of the pollution of groundwater after decontamination processes, or the inconvenience of living in the presence of radioactive waste stored in deep geological repositories (Footnote 2).

In such a context, a tax on the environmental damage, which results both from the stock of pollution and the way it is managed, provides incentives to economic agents to reduce pollution as well as incentives to invest in the management of accumulated pollutants. By doing so, it can restore the social optimum. We discuss below several reasons why public authorities may not be able to implement such a policy instrument, not to mention setting the tax at its first-best level. This environmental policy must therefore be seen as a benchmark. That is why we consider second-best tools that are more conventional and easier to use: a tax on pollutant flows, a tax on the accumulated pollutant stock, and a subsidy to the research sector dedicated to the pollution management technology. We show that usual environmental policies need to be accompanied by green R&D policies in order to produce first-best outcomes (Footnote 3).

We employ an endogenous growth model with Romer-type horizontal differentiation in which the use of a non-renewable resource yields pollution flows that contribute to an existing stock. There are two kinds of knowledge. The first is “standard” knowledge: its accumulation increases the productivity of the consumption good sector. The second stock of knowledge, which we refer to as “green,” is dedicated to reducing the negative impact of the accumulated stock of pollutant. Indeed, this pollution stock can be managed in order to limit its environmental impact: what (negatively) affects households’ utility is a combination of this stock with the stock of “green” knowledge. Here, for a given pollutant stock (or concentration), the higher the level of green knowledge is, the lower the environmental damage; in other words, we partially endogenize the resultant environmental damage.

Formally, our framework is close to [13], with two main differences. First, there is directed technical change—in the sense that there are two types of knowledge. Secondly, it is not the flow of non-renewable resource that affects households’ utility but a joint function of the existing accumulated stock of pollution and the stock of green knowledge.

Note that, contrary to Acemoglu-type models of growth with directed technical change and polluting resources (see, e.g., [1]), the second R&D sector does not improve the productivity of a backstop technology: it is dedicated to limiting the environmental impact of the accumulated pollutant stock. For this reason, the two research sectors do not have symmetric effects on output and growth.

We first analyze the social optimum and we present its main characteristic conditions. We then study the decentralized economy. In order to focus on environmental issues, that is, the time profile of pollutant flows and the direction of R&D, we rule out market imperfections in the R&D sector: we assume that once an innovation has occurred, the government pays to the innovator a sum equal to the willingnesses to pay of the sectors using it. We show that without environmental policy, pollutant flows are not optimal and there is no green R&D activity—which is sub-optimal. The relevant policy consists of a tax on the environmental damage itself. We characterize it and study its first-best properties. Since such a policy is difficult to implement, we consider alternative—second-best—policies that are more easily implementable: an environmental tax (levied on pollutant flows or accumulated stock) and a subsidy to green research. We analyze these policies and show how, when used together, they can restore the social optimum.

Many contributions already employ dynamic frameworks to study the impact of carbon taxes in different contexts (from, e.g., [15] to more recent contributions like [10] or [5]). However, these studies generally do not consider adaptation possibilities, that is, the fact that for a given accumulated stock of carbon, it is possible to invest in the management of this stock in order to limit its negative effect. Similarly, the relative effects of taxes on resource use and research subsidies have already been studied in endogenous growth frameworks ([19] for instance), but frequently, in contexts where pollution is not formally considered. [1] consider an endogenous growth model in which output is produced from one clean input and one polluting input. There is directed technical change in the sense that innovations can improve either the productivity of the clean input or the productivity of the dirty input. As mentioned above, this is a first difference with the present paper. The second major difference is that, in their model, the externality only stems from the accumulated stock of carbon: it is not possible to reduce the negative impact of a given carbon concentration by investing in adaptation. Furthermore, in the major part of the analysis, the source of the negative externality is the activity in the polluting sector, which includes the dedicated stock of knowledge. In the present paper, the source of pollutant accumulation is only the use of the non-renewable resource. More recently, [2] analyze both theoretically and empirically how carbon taxes and research subsidies can induce a successful transition to a clean economy. Here too, the authors assume away the possibility of reducing the environmental damage for a given stock of carbon.

We present the model and the socially optimal outcome in Section 2. Then, in Section 3, we study the decentralized economy and the equilibrium conditions. We characterize the first-best environmental policy in Section 4, and Section 5 is devoted to alternative (second-best) policies. The final section provides concluding remarks.

2 Model and Social Welfare

2.1 The Model

At each date t ∈ [0, +∞), a quantity Yt of consumption good is produced according to the following technology:

$$ Y_{t}=F(A_{t},R_{t})\text{.} $$
(1)

At is a stock of “standard” knowledge, not to be confused with “green” knowledge, introduced below. Rt is a flow of non-renewable natural resource (Footnote 4). We will denote by FA(.) and FR(.) the marginal productivities; both are strictly positive.

Technology for the production of standard knowledge—à la Romer—is:

$$ \dot{A}_{t}=\delta_{A}L_{At}A_{t}, $$
(2)

where δA is an exogenous parameter characterizing the efficiency of this research sector, and LAt is the amount of labor put into it.

The resource flow Rt is extracted from a finite stock St, according to the standard law of motion:

$$ \dot{S}_{t}=-R_{t}. $$
(3)

As is often done in the literature on growth and non-renewable resources, extraction costs are omitted here.

The use of the natural resource yields a flow of pollutant. We consider that this flow is equal to hRt, where h is an exogenous, strictly positive parameter. This simple relationship means that any flow of resource use entails a given flow of pollutant: there is no abatement possibility, such as carbon sequestration. This pollution flow adds to the existing stock: \(W_{t}=W_{0}+{{\int }_{0}^{t}}hR_{s}ds\). We thus have the following relationship:

$$ \dot{W}_{t}=hR_{t}. $$
(4)

We do not consider pollution decay. Tahvonen [23] points out that this simplifies the relation between the stock of resource and the stock of pollutant: at each date t, we have Wt = W0 + h(S0 − St). We assume here that the pollutant stock is nil at date 0, thus we simply have Wt = h(S0 − St).
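As a check on this relation, Eqs. 3 and 4 can be integrated from date 0 to date t (a short sketch that uses only the definitions above):

$$ S_{t}=S_{0}-{\int}_{0}^{t}R_{s}ds\quad \Longrightarrow \quad {\int}_{0}^{t}R_{s}ds=S_{0}-S_{t}, $$

$$ W_{t}=W_{0}+h{\int}_{0}^{t}R_{s}ds=W_{0}+h(S_{0}-S_{t}). $$

With W0 = 0, the pollutant stock is thus proportional to cumulative extraction, and exhausting the resource would bring it to its upper bound hS0.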

Any positive amount of pollutant stock is stored and managed. We consider that, at any time t, there is one (average) unique technique for pollution management, which depends on the current state of dedicated knowledge: Bt. For global pollution like GHG, Bt can be seen as the knowledge related to climate change adaptation that the economy has accumulated until date t. For local pollution, (extremely) different management technologies can coexist at the same date: radioactive waste management for instance, depending on the site, varies from illegal dumping to highly managed and capital-intensive geologic disposal. We nevertheless assume here that, at time t, the technology is unified. The production of knowledge in the management of pollutants, which we will henceforth refer to as “green” knowledge, is given by:

$$ \dot{B}_{t}=\delta_{B}L_{Bt}B_{t}, $$
(5)

where δB is an efficiency parameter, and LBt is the flow of labor dedicated to this specific research. Note here that we assume away cross-sector knowledge spillovers. In other words, we consider here that the stock of standard knowledge, At, does not benefit from green innovations, that is, increases in Bt; similarly, we assume that green knowledge does not benefit from innovations in the standard R&D sector (see Eq. 2). Taking into account such spillovers would complicate the analysis and in particular the study of research policies. An important part of the literature on directed technical change (see, e.g., [1] or [17]) makes a similar assumption. In our framework, it means that improvements in technologies dedicated to climate change adaptation or nuclear waste confinement are strictly disconnected from innovations in standard technologies.

We distinguish between the actual stock of pollutant and its environmental impact. The environmental damage consists of the local or global effects on welfare of pollutant stocks of long duration. This damage aggregates various types of nuisance. In the case of GHG, it can be seen as the remaining negative impact of global warming for a given level of adaptation (e.g., sea level increases overcoming barriers and floodwalls). In the case of nuclear waste, it is the risk of leaks or the risk of accidental or purposeful unearthing of these dangerous stockpiles, as well as the discomfort of living in the presence of such a hazard. Obviously, the more advanced the technique of maintenance of the stock (represented here by the level of green knowledge, Bt), the lower the remaining damage for a given pollutant stock Wt.

This damage is denoted by Ωt. It thus depends on the stock of pollutant, Wt, and on the technology used to manage it, Bt, according to the following functional relation:

$$ {\Omega}_{t}={\Omega} (W_{t},B_{t}). $$
(6)

ΩW is the marginal environmental damage caused by pollution, and ΩB is the marginal environmental benefit from green knowledge. We thus assume ΩW(.) > 0 and ΩB(.) < 0.

The representative household is endowed with a constant flow L of labor, which we normalize to one. Labor has two competing uses: research in the general purpose sector (LAt), and research in the pollution management sector (LBt):

$$ 1=L_{At}+L_{Bt}. $$
(7)

The representative household’s instantaneous utility depends positively on the current level of consumption Ct, which is equal to the entire production of good Yt (Ct = Yt = F(At, Rt)), and negatively on the environmental impact of the stock of pollutant, Ωt. We denote by u(.) the instantaneous utility function and by uC and uΩ the marginal utility of consumption and the marginal disutility of the environmental damage, with uC(.) > 0 and uΩ(.) < 0. The intertemporal utility function is

$$ U_{0}={\int}_{0}^{+\infty }u(C_{t},{\Omega}_{t})e^{-\rho t}dt, $$
(8)

where ρ is the psychological discount rate.

In the rest of the paper, we drop time subscripts when they are not necessary for notational convenience. We also denote by \(g_{X}\equiv \frac {\dot {X}_{t}}{X_{t}}\) the growth rate of any variable Xt.

2.2 Social Welfare

We consider here the socially optimal outcome for the economy. As we shall see, two main arbitrages drive the economy: the intertemporal allocation of consumption and thus of resource use, and the arbitrage between standard and green research. This section characterizes the general conditions that govern the socially optimal arbitrages.

The social planner maximizes the intertemporal utility function (8) subject to Eqs. 2–7. The Hamiltonian associated with this program is: \(\mathcal{H}=u[F(A_{t},R_{t}),{\Omega}(h(S_{0}-S_{t}),B_{t})]+\mu_{At}\delta_{A}L_{At}A_{t}+\mu_{Bt}\delta_{B}(1-L_{At})B_{t}-\mu_{St}R_{t}\), where μAt, μBt and μSt are the costate variables associated with constraints (2), (5) and (3), respectively (Footnote 5).

The first-order condition for LAt yields μAt/μBt = δBBt/δAAt. Log-differentiating this equation gives \(g_{\mu _{A}}-g_{\mu _{B}}=\delta _{B}L_{Bt}-\delta _{A}L_{At}\). One can eliminate μAt, μBt, \(g_{\mu _{A}}\) and \(g_{\mu _{B}}\) from these two equations by using the first-order conditions for At and Bt, which are \(g_{\mu _{A}}=\rho -u_{C}F_{A}/\mu _{At}-\delta _{A}L_{A}\) and \(g_{\mu _{B}}=\rho -u_{{\Omega } }{\Omega }_{B}/\mu _{Bt}-\delta _{B}(1-L_{A}) \). By rearranging, one gets

$$ u_{C}F_{A}\delta_{A}A_{t}=u_{{\Omega} }{\Omega}_{B}\delta_{B}B_{t}\text{.} $$
(9)

This condition establishes the equality between the marginal utilities of labor in the two research sectors. At date t, suppose a marginal decrease in the flow of labor dedicated to standard research, LAt. This reduction has an impact on the accumulation of standard knowledge (see Eq. 2), in turn on output production (Eq. 1), and subsequently on consumption and utility (Eq. 8). This decline in the instantaneous utility is described by the left-hand side of the equation. The amount of labor that has been saved is accordingly transferred to the green research sector (see Eq. 7): LBt marginally increases. Consequently, the stock of green knowledge, Bt, increases (Eq. 5), which, for a given non-renewable resource extraction path and, in turn, a given path of pollutant production, diminishes the environmental damage (Eq. 6), and thus increases instantaneous utility (Eq. 8). This rise in utility is given by the right-hand side of the equation. Condition (9), which equalizes these two variations of utility, thus characterizes the socially optimal arbitrage between the two research sectors: at the social optimum, it is not possible to increase utility by reallocating labor from green (respectively standard) to standard (resp. green) research.
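The elimination step leading to condition (9) can be spelled out as follows; this is a sketch that relies only on the Hamiltonian and the first-order conditions stated above. The first-order condition for LAt and its log-differentiation, using gA = δALAt and gB = δBLBt from Eqs. 2 and 5, read

$$ \mu_{At}\delta_{A}A_{t}=\mu_{Bt}\delta_{B}B_{t},\qquad g_{\mu_{A}}+\delta_{A}L_{At}=g_{\mu_{B}}+\delta_{B}L_{Bt}. $$

Substituting the first-order conditions for At and Bt (with LBt = 1 − LAt), the terms in ρ, δALAt and δBLBt cancel, which leaves

$$ \frac{u_{C}F_{A}}{\mu_{At}}=\frac{u_{{\Omega}}{\Omega}_{B}}{\mu_{Bt}}. $$

Multiplying both sides by μAt and replacing μAt/μBt by δBBt/δAAt then gives condition (9).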

The first-order condition for St yields \(g_{\mu _{S}}=\rho +hu_{{\Omega } }{\Omega }_{W}/\mu _{St}\). In this condition, μSt can be replaced by its expression derived from the first-order condition for Rt: μSt = uCFR. Similarly, \(g_{\mu _{S}}\) can be replaced by its expression derived from the log-differentiation of the first-order condition for Rt, which gives \(g_{\mu _{S}}=(u_{CC}\dot {C}+u_{C{\Omega } }\dot {{\Omega }})/u_{C}+(F_{RA}\dot {A}_{t}+F_{RR}\dot {R}_{t})/F_{R}\). By rearranging, one obtains

$$ \rho -\frac{u_{CC}\dot{C}+u_{C{\Omega} }\dot{{\Omega}}}{u_{C}}+\frac{h}{F_{R}} \left( \frac{u_{{\Omega} }}{u_{C}}\right) {\Omega}_{W}=\frac{F_{RA}\dot{A} _{t}+F_{RR}\dot{R}_{t}}{F_{R}}. $$
(10)

This corresponds to the Ramsey-Keynes condition in the particular context of this model. As in simple growth models, that is, without resources or pollution, this condition equates two marginal rates of substitution: one for consumer utility, the other for production. It basically states that if the firm marginally decreases production at date t, then the benefit at date t + Δt has to be equal to the quantity of good that compensates consumers at date t + Δt for the marginal loss of consumption at t.

The right-hand side of condition (10) represents the growth rate of the marginal productivity of the resource: a marginal decrease of output production at date t allows saving a quantity of resource 1/FRt (Eq. 1), which, used at date t + Δt, allows an increase in output Yt by \(\dot {F}_{Rt}/F_{Rt}\). The marginal decrease in production yields a marginal decrease in consumption at date t which in turn entails a decrease in instantaneous utility (Eq. 8). This decrease can be compensated in terms of discounted utility by a rise in consumption at date t + Δt. The left-hand side of condition (10) measures this amount of good; the term \(\frac {h}{F_{R}}\left (\frac {u_{{\Omega } }}{u_{C}}\right ) {\Omega }_{W}\), negative by definition, shows that this amount is lower than that in the standard case. Indeed, the reduction in consumption at t is due to a decrease in Rt, which, for a given level of knowledge Bt, corresponds to a decrease in the environmental damage, and thus an increase in utility.
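For reference, the steps behind condition (10) can be laid out explicitly (a sketch based on the Hamiltonian of this section). The first-order conditions for Rt and St are

$$ u_{C}F_{R}=\mu_{St},\qquad \dot{\mu}_{St}=\rho \mu_{St}+hu_{{\Omega}}{\Omega}_{W}, $$

where the second condition uses the fact that the derivative of the Hamiltonian with respect to St is −huΩΩW. Dividing the second condition by μSt = uCFR and equating the result with the log-derivative of the first gives

$$ \rho +\frac{hu_{{\Omega}}{\Omega}_{W}}{u_{C}F_{R}}=\frac{u_{CC}\dot{C}+u_{C{\Omega}}\dot{{\Omega}}}{u_{C}}+\frac{F_{RA}\dot{A}_{t}+F_{RR}\dot{R}_{t}}{F_{R}}. $$

Moving the utility term to the left-hand side yields condition (10).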

3 Equilibrium of the Decentralized Economy

3.1 Agents’ Behavior

We now study the equilibrium of the decentralized economy. The price of good Y is normalized to one, and wt, \({p_{t}^{R}}\) and rt are, respectively, the wage, the resource price, and the interest rate on a perfect financial market.

3.1.1 Consumption Good Sector

In the laissez-faire case, at each time t, the firm’s instantaneous profit is

$$ {\pi_{t}^{Y}}=F(A_{t},R_{t})-{p_{t}^{R}}R_{t}. $$
(11)

The firm maximizes (11) with respect to resource use Rt. The first-order condition is

$$ F_{R}=p_{Rt}. $$
(12)

For the environmental externality stemming from Ω(Wt, Bt) (see Eq. 8), we consider a policy scheme consisting of directly taxing the environmental damage: we assume that a unit tax τt is levied at each date t on Ωt (Footnote 6). As shown in what follows, the introduction of this tax has two main effects. First, it assigns a cost to the production of pollutant, and thus to the use of the resource. Second, as a consequence, it gives a value to the stock of green knowledge and hence provides incentives to invest in the management of the accumulated stock of pollutant. For given levels of tax and stock of pollutant, the firm producing the consumption good will pay more or less depending on whether the state of knowledge in the field of pollution management is low or high.

Because of the type of tax we consider, an intertemporal dimension is added to the usual maximization program of the firm. At each date t, the chosen resource use generates a pollution flow which affects the time path of the stock of pollutant, and consequently the time profile of environmental tax payments from date t onwards. Indeed, using a flow R of resource at date ṯ means increasing the stock of pollutant W. For a given time profile of the stock of green knowledge, the sums τtΩ(Wt, Bt) paid at any date t > ṯ are increased. In other words, costs rise ad infinitum. Formally, profit function \({\pi _{t}^{Y}}\) features the control variable Rt and, because of the tax τt, the state variable Wt which depends on Rs for all s ∈ (0, t), as stated in Eq. 4. The program of the firm is thus:

$$\begin{array}{@{}rcl@{}} &&\!\!\underset{R}{\max }{\int}_{0}^{+\infty }[F(A_{t},R_{t})-{p_{t}^{R}}R_{t}-\tau_{t}{\Omega} (W_{t},B_{t})]e^{-{{\int}_{0}^{t}}r_{u}du}dt\\ &&\!\!\text{subject to }\dot{W}_{t} = hR_{t}\text{ for all }t. \end{array} $$
(13)

From the maximum principle, we obtain two first-order conditions with respect to Rt and Wt. We denote by μWt the costate variable associated with the constraint. The first-order condition for Rt yields μWt = (pRt − FR)/h. Differentiating this expression with respect to time gives: \(\dot {\mu }_{Wt}=\left [ \dot {p}_{Rt}-(F_{RA}\dot {A}_{t}+F_{RR}\dot {R}_{t})\right ] /h\). One can use these expressions to eliminate μWt and \(\dot {\mu }_{Wt}\) in the first-order condition for Wt, which is \(\dot {\mu }_{Wt}=r_{t}\mu _{Wt}+\tau _{t}{\Omega }_{W}\). This yields the following condition:

$$ \tau_{t}h{\Omega}_{W}=(F_{R}-p_{Rt})r_{t}-(F_{RA}\dot{A}_{t}+F_{RR}\dot{R}_{t}-\dot{p}_{Rt}). $$
(14)

This condition describes how the profit-maximizing firm uses the resource and thereby manages the accumulation of pollutant. At each date t, a marginal increase in resource use yields an additional profit equal to FR − pRt. Investing it in the financial market generates an instantaneous income equal to (FR − pRt)rt. Meanwhile, not keeping this resource in situ yields a potential loss due to the evolutions of the resource’s marginal productivity and its price: \(\dot {F}_{R}-\dot {p}_{Rt}\).

Hence, the right-hand side (hereafter RHS) of Eq. 14 stands for the net profitability of this marginal increase in resource use. Here, this additional resource use leads to an increase in the stock of pollutant of h units (see Eq. 4), and consequently an increase in the environmental damage of hΩW(Wt, Bt). Hence, at each date t, because of the environmental policy, the cost for the firm is τthΩW(Ws, Bs), that is, the left-hand side of condition (14). This condition thus states that the cost of extracting more resource must be equal to its benefit.
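Condition (14) can be recovered step by step from program (13). The following is a sketch; the symbol \(\mathcal{H}^{Y}\) for the firm's current-value Hamiltonian is notation introduced here for illustration only:

$$ \mathcal{H}^{Y}=F(A_{t},R_{t})-p_{Rt}R_{t}-\tau_{t}{\Omega}(W_{t},B_{t})+\mu_{Wt}hR_{t}. $$

The conditions with respect to Rt and Wt are

$$ F_{R}-p_{Rt}+h\mu_{Wt}=0,\qquad \dot{\mu}_{Wt}=r_{t}\mu_{Wt}+\tau_{t}{\Omega}_{W}. $$

Substituting μWt = (pRt − FR)/h and its time derivative into the second condition gives

$$ \frac{\dot{p}_{Rt}-(F_{RA}\dot{A}_{t}+F_{RR}\dot{R}_{t})}{h}=r_{t}\frac{p_{Rt}-F_{R}}{h}+\tau_{t}{\Omega}_{W}. $$

Multiplying by h and isolating the tax term on the left-hand side yields condition (14).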

3.1.2 Non-renewable Resource Sector

On the competitive natural resource market, the maximization of the profit function \({\int }_{t}^{+\infty }{p_{s}^{R}} R_{s}e^{-{{\int }_{t}^{s}}r_{u}du}ds\) subject to \(\dot {S}_{s}=-R_{s}\), Ss ≥ 0, Rs ≥ 0, s ≥ t, yields the standard Hotelling rule in the decentralized equilibrium:

$$ \dot{p}_{t}^{R}/{p_{t}^{R}}=r_{t}\text{, for all }t. $$
(15)

3.1.3 Representative Household

At each date t, the representative household maximizes the utility function (8) subject to the following budget constraint: \(\dot {b}_{t}=r_{t}b_{t}+w_{t}+{p_{t}^{R}}R_{t}-T_{t}-C_{t}\), where bt is the stock of bonds at date t, and Tt is the lump-sum tax levied by the government to finance research. This maximization leads to the usual decentralized-equilibrium condition:

$$ r_{t}=\rho -\frac{u_{CC}\dot{C}+u_{C{\Omega} }\dot{{\Omega}}}{u_{C}}. $$
(16)

3.1.4 R&D Sector

As previously mentioned, the paper focuses on the environmental externality and the direction of R&D. To do so, we rule out market imperfections in the research sectors. The basic structure of each R&D sector is identical to the one in [13]. Knowledge is directly financed (there are no intermediate goods) and we assume that once an innovation has occurred, the government pays to the innovator a sum equal to the willingnesses to pay of the sectors using it (Footnote 7).

We denote by \(v_{it}^{Y}\) and \(v_{it}^{RD}\) for i = A, B, the willingnesses to pay of the consumption good sector and the R&D sectors (respectively) for an innovation occurring at date t in sector i. In both (standard and green) R&D sectors, the price of an innovation occurring at date t is, for i = A, B: \(v_{it}=v_{it}^{Y}+v_{it}^{RD}\) if \(v_{it}^{Y}>0\), and vit = 0 if \(v_{it}^{Y}= 0\). In other words, innovation is not financed if it has no value in the consumption good sector. In such a case, the economy remains in a corner equilibrium where the whole effort in R&D is made in the standard sector: LAt = 1 and LBt = 0.

For i = A, B, we have \(v_{it}^{Y}=\frac {\partial {\pi _{t}^{Y}}}{\partial i_{t}} \) and \(v_{it}^{RD}=\frac {\partial \pi _{t}^{RD_{i}}}{\partial i_{t}}\) where \(\pi _{t}^{RD_{i}}\) is the profit on innovations produced at date t in sector i. The value of one innovation in sector i at date t is thus \(V_{it}\equiv {\int }_{t}^{+\infty }v_{is}e^{-{{\int }_{t}^{s}}r_{u}du}ds\). Therefore, \(\pi _{t}^{RD_{i}}\) is given by

$$ \pi_{t}^{RD_{i}}=\delta_{i}L_{it}i_{t}V_{it}-w_{t}L_{it},\text{with }i=A,B. $$
(17)

The maximization of this profit function with respect to Lit leads to the following first-order condition:

$$ \delta_{i}i_{t}V_{it}=w_{t},\text{with }i=A,B. $$
(18)

Then, log-differentiating (18) with respect to time yields:

$$ \frac{\dot{w}_{t}}{w_{t}}-r_{t}=-\frac{v_{it}}{V_{it}}+\delta_{i}L_{it}, \text{ for all }t,\text{ with }i=A,B. $$
(19)
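The step from (18) to (19) can be made explicit. This sketch assumes that Vit is the discounted sum of the payments vis from date t onward, that is, \(V_{it}={\int}_{t}^{+\infty}v_{is}e^{-{\int}_{t}^{s}r_{u}du}ds\). Differentiating this expression with respect to t (Leibniz rule) gives

$$ \dot{V}_{it}=r_{t}V_{it}-v_{it}. $$

Log-differentiating (18), with the growth rate of knowledge stock i equal to δiLit from Eqs. 2 and 5, gives

$$ \delta_{i}L_{it}+\frac{\dot{V}_{it}}{V_{it}}=\frac{\dot{w}_{t}}{w_{t}}\quad \Longrightarrow \quad \frac{\dot{w}_{t}}{w_{t}}=\delta_{i}L_{it}+r_{t}-\frac{v_{it}}{V_{it}}, $$

which is condition (19) once rt is moved to the left-hand side.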

3.2 Equilibrium Conditions

We now characterize and interpret the fundamental equilibrium conditions. We first focus on the intertemporal arbitrage made on resource use, and then on the arbitrage made between the two types of R&D.

3.2.1 Ramsey-Keynes Condition and Resource Use

Equation 15 allows us to replace \(\dot {p}_{Rt}\) by pRtrt in Eq. 14: one obtains \(h\tau _{t}{\Omega }_{W}=F_{R}r_{t}-(F_{RA}\dot {A}_{t}+F_{RR}\dot {R}_{t})\). Then, eliminating rt between this equation and (16), one gets

$$ \rho -\frac{u_{CC}\dot{C}+u_{C{\Omega} }\dot{{\Omega}}}{u_{C}}=\frac{F_{RA}\dot{A}_{t}+F_{RR}\dot{R}_{t}}{F_{R}}+\frac{h\tau_{t}{\Omega}_{W}}{F_{R}}. $$
(20)

Condition (20) in the decentralized economy is the counterpart of social optimality condition (10). It is a Ramsey-Keynes condition which states that if the firm marginally decreases production at date t, then the induced increase in production at date t + Δt is equal to the quantity of good that compensates consumers at date t + Δt for the marginal loss of consumption at t. Without any environmental policy, this condition is clearly non-optimal. Indeed, the last term of the left-hand side in condition (10), which stands for the increase in utility derived from not producing and thus not polluting at date t, has no counterpart in condition (20) when τt = 0.

To illustrate condition (20), consider a given growth path of the economy, and suppose that the firm producing the consumption good marginally reduces its production at date t. The left-hand side of condition (20) commonly characterizes the value of the amount of consumption good that compensates households at date t + Δt for the marginal loss of consumption good at date t. This marginal decrease in consumption good production at date t allows the firm to save a quantity 1/FRt of resource. The right-hand side of Eq. 20 represents the firm’s benefit from keeping this resource quantity in situ. It is composed of two separate elements. The first benefit is a higher productivity of the resource—represented by the first term in the right-hand side. The second benefit stems from the fact that, for a given path of the stock of green knowledge, Bt, forgoing this flow of resource, and hence not increasing the stock of environmental damage, means smaller payments of environmental taxes: this is represented by the second term in the right-hand side. By equating the sum of these two benefits to the amount of good that allows keeping households’ intertemporal utility unchanged, equilibrium condition (20) characterizes the equilibrium arbitrage.

3.2.2 R&D Arbitrage

By differentiating profit functions (11) and (17) with respect to At and Bt, one obtains vAt = FA + δALAtVAt and vBt = −τtΩB + δBLBtVBt. Note that the value of green research is only induced by the environmental policy. Indeed, the final use of knowledge is either an input to the production of consumption good (At) or an input to the management of the stock of pollutant (Bt). Without environmental policy, that is when τt = 0, the consumption good firm has no incentive to manage the stock of pollutant: \(v_{Bt}^{Y}=\frac {\partial {\pi _{t}^{Y}}}{\partial B_{t}}= 0\). Hence, green knowledge has no use and it is not valued: vBt = 0 (see Section 3.1.4). In other words, there is only standard research if the tax is not implemented.

Replacing vAt and vBt by these expressions in Eq. 19 and using Condition (18), one gets:

$$ \frac{\dot{w}_{t}}{w_{t}}-r_{t}=-\frac{F_{A}\delta_{A}A_{t}}{w_{t}}, $$
(21)

and

$$ \frac{\dot{w}_{t}}{w_{t}}-r_{t}=\frac{\delta_{B}\tau_{t}{\Omega}_{B}B_{t}}{w_{t}}. $$
(22)

Conditions (21) and (22) express the marginal return of labor in the standard and the green R&D sectors, respectively.

Equations 21 and 22 together yield

$$ F_{A}\delta_{A}A_{t}=-\tau_{t}{\Omega}_{B}\delta_{B}B_{t}\text{.} $$
(23)

This is the equilibrium no-arbitrage condition between the R&D sectors. This condition, which is the counterpart of optimality condition (9), simply states that the rate of return must be the same in both R&D sectors in the (interior, that is, both research sectors are active) equilibrium of the decentralized economy.
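The passage from (19) to conditions (21), (22) and (23) can be sketched as follows, using only the expressions for vAt and vBt given above and condition (18) written as VAt = wt/(δAAt) and VBt = wt/(δBBt). For standard research, the right-hand side of (19) becomes

$$ -\frac{F_{A}+\delta_{A}L_{At}V_{At}}{V_{At}}+\delta_{A}L_{At}=-\frac{F_{A}}{V_{At}}=-\frac{F_{A}\delta_{A}A_{t}}{w_{t}}, $$

and for green research

$$ -\frac{-\tau_{t}{\Omega}_{B}+\delta_{B}L_{Bt}V_{Bt}}{V_{Bt}}+\delta_{B}L_{Bt}=\frac{\tau_{t}{\Omega}_{B}}{V_{Bt}}=\frac{\delta_{B}\tau_{t}{\Omega}_{B}B_{t}}{w_{t}}. $$

Equating the two right-hand sides and multiplying by −wt yields (23). Since ΩB < 0, both sides of (23) are positive whenever τt > 0.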

Since we rule out externalities in the R&D sector by assuming that the price paid for innovations is equal to the willingnesses to pay of its users, the only externality comes from the environmental damage Ωt produced by the use of the non-renewable resource. Ωt is a function of the accumulated stock of pollutant Wt and the stock of green knowledge Bt (see Eq. 6). Its time profile is therefore determined by the time profiles of Wt—and thus the time profile of resource extraction Rt (see Eq. 4)—and the time profile of Bt—and thus the time profile of LBt (see Eq. 5).

In the decentralized economy, the time profile of Rt is derived from the equilibrium Ramsey-Keynes condition (20), which is non-optimal if no environmental policy is implemented (as shown above). Furthermore, we have just seen that the equilibrium allocation of labor between green and standard R&D (LAt and LBt) is not optimal either without environmental policy. Hence, the equilibrium time profile of Ωt is not optimal. To correct such a trajectory, one needs to modify the time profile of the stock of pollutant Wt and/or the stock of green knowledge Bt. This means changing the time profile of resource extraction (Rt) and/or the time profile of the effort in green R&D (LBt), that is, modifying conditions (20) and (23). This is what the environmental tax τ and the policy mix studied in the two following sections do.

4 First-best Environmental Tax

In order to characterize the first-best environmental policy, we need to compare the preceding decentralized equilibrium conditions (23) and (20) with their socially optimal counterparts, (9) and (10). It follows directly that the socially optimal level of the environmental tax isFootnote 8 (henceforth, the superscript o is used to denote socially optimal values): \({\tau _{t}^{o}}=-u_{{\Omega } }/u_{C}\).

Proposition 1

A tax \({\tau _{t}^{o}}=-\frac {u_{{\Omega } }}{u_{C}}\) levied on the level of environmental damage Ωt at each date t allows achieving the economy’s first-best outcome.

Since uC(.) is positive and uΩ(.) is negative, \({\tau _{t}^{o}}>0\).
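
As a purely illustrative special case (the functional form and the parameter \(\phi\) are assumptions made here, not part of the model), suppose that utility is additively separable with \(u(C,{\Omega})=\ln C-\phi\,{\Omega}\) and \(\phi>0\). Then \(u_{C}=1/C_{t}\) and \(u_{{\Omega}}=-\phi\), so that

$$ {\tau_{t}^{o}}=-\frac{u_{{\Omega}}}{u_{C}}=\phi\,C_{t}>0. $$

Under this assumption, the tax per unit of environmental damage is proportional to consumption at date t: it depends only on current variables, not on a discounted sum of future disutilities.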

The socially optimal level of the environmental tax is thus equal to the marginal disutility of the environmental damage divided by the marginal utility of consumption; in other words, it basically is a measure of the social cost of the stock of environmental damage. This contrasts with the standard result obtained when the environmental policy consists of a tax on resource use or on the flow of pollution itself, such as a carbon tax. In such a case, it is not the environmental disutility at date t that appears in the expression of the optimal tax but the discounted sum of its instantaneous disutilities from t to infinity—see for instance [11] or [12]. In contrast, the present model directly taxes the environmental externality: therefore, the numerator of \({\tau _{t}^{o}}\) only features the disutility of the stock of environmental damage at date t. Note that, contrary to the standard result in the literature dealing with growth and non-renewable resources—where the dynamics of the tax alone can restore the social optimum—the level of the tax matters here. Indeed, the tax modifies the time profile of resource extraction, and provides incentives to green R&D at each date. A similar result can be found in [14], where the tax changes the dynamics of resource extraction and provides incentives to carbon sequestrationFootnote 9.

Policy makers are likely to encounter difficulties in implementing such a policy scheme, since it requires assessing the environmental damage, that is, as previously mentioned, the harmful effects of pollutants that remain despite the investments put into their management (e.g., the negative impact of climate change after adaptation).

First, depending on the type of pollutant, measuring this environmental damage can be a thorny issue. While researchers or agencies do manage, to some extent, to evaluate the damage caused by global warming, other kinds of damage are more difficult to fathom. The disutility caused to households by the proximity of a stock of nuclear waste (for a given storage technology) may indeed be difficult to assess. Methods like contingent valuation can of course be considered (see for instance [16] or [18]), but it is doubtful that the measurement of such variables by a public authority can provide the accurate data required to implement the tax.

Second, even when the environmental damage is correctly measured, it may be impossible to fully identify all the agents that cause it. In the case of climate change, if one wants to tax the damage caused by accumulated emissions of CO2, who will bear such a tax? It seems difficult in practice to assign precisely this or that part of the stock of CO2 to each of the agents who have produced it. However, if the taxpayers were countries (or even long-lived organizations) whose long-term emissions can be tracked, this type of tax could be considered more realisticallyFootnote 10. Furthermore, one advantage of such a policy design is that it does not require public authorities to accurately forecast and discount future emissions, which is undoubtedly a difficult task with controversial results; the same type of argument is also put forward by [4]. In the case of the nuclear industry, and more generally in the case of local pollution, this policy scheme is more conceivable in this respect than for climate change. Indeed, it seems technically easier to observe radioactive waste at the time of its production and to identify the producer of each stock.

5 Second-best Environmental Policies

As previously discussed, the economic policy considered above may prove difficult to implement in the real world. The regulator has to measure the environmental damage, given by Ω(.), that is, to evaluate the damage resulting from given stocks of pollutant and knowledge, which can be a very complex task. For this reason, we study here alternative policies that are more commonly discussed in the public debate: a tax on resource use, a tax on the stock of pollutant and a subsidy to green R&DFootnote 11. As we shall see, these policies are second-best since none of them alone can restore the first-best social optimum. However, these tools are easier for public authorities to implement, since they only require identifying variables that are more readily observable.

5.1 Second-best Environmental Taxes

As previously explained, knowledge can have two uses for the firm producing the consumption good: either as an input to the production of this good (standard knowledge) or as an input to the management of the stock of pollutant entailed by the production activity (green knowledge). Consequently, each type of knowledge is also used within its associated R&D sector, since it is an input of its own production (see Eqs. 2 and 5).

If the consumption good sector does not value one type of knowledge, the corresponding R&D activity is not undertaken. This is what happens for green knowledge if no environmental policy is implemented, or, as we show below, if the environmental tax is levied on the flow of pollutant or resource useFootnote 12, or on the accumulated stock of pollutant. In these cases, green knowledge has no value. This outcome is non-optimal because both types of research are active in the social optimum.

5.1.1 Tax on Resource Use

Since there is no abatement here (that is, we assume away carbon sequestration), taxing resource use is equivalent to taxing pollutant flows (see Eq. 4).

Here, we consider an ad valoremFootnote 13 tax ξt on resource use. At each date t, the instantaneous profit of the consumption-good sector is then given by \({\pi _{t}^{Y}}=F(A_{t},R_{t})-{p_{t}^{R}}R_{t}-\xi _{t}{p_{t}^{R}}R_{t}\). For computational convenience, we will denote θt ≡ 1 + ξt.

For all ξt ≥ 0, the marginal profitabilities of the two inputs, standard knowledge and the resource flow, are respectively FA(.) and \(F_{R}(.)-{p_{t}^{R}}\theta _{t}\). Both are independent of the level of green knowledge (Bt). It follows that this policy instrument does not yield any incentive to produce green knowledge, despite the fact that such knowledge is necessary in order to attain the socially optimal level of environmental damage.

Maximizing \({\pi _{t}^{Y}}\) with respect to Rt yields \(F_{R}(A_{t},R_{t})=\theta _{t}{p_{t}^{R}}\). Log-differentiating this condition and rearranging, one obtains: \(\dot {p}_{Rt}/p_{Rt}=(F_{RA}\dot {A}_{t}+F_{RR}\dot {R}_{t})/F_{R}(A_{t},R_{t})-\dot {\theta }_{t}/\theta _{t}\). By the same method as in Section 3, we use Eq. 15, which is unchanged here, to replace \(\dot {p}_{Rt}\) by pRtrt and we obtain \(r_{t}=(F_{RA}\dot {A}_{t}+F_{RR}\dot {R}_{t})/F_{R}(A_{t},R_{t})-\dot {\theta }_{t}/\theta _{t}\). Then, using the expression of rt given by the Ramsey-Keynes condition (16) (which is also unchanged here), we get to the equilibrium condition:

$$ \rho -\frac{u_{CC}\dot{C}+u_{C{\Omega} }\dot{{\Omega}}}{u_{C}}=\frac{F_{RA}\dot{A}_{t}+F_{RR}\dot{R}_{t}}{F_{R}}-\frac{\dot{\theta}_{t}}{\theta_{t}}. $$
(24)

This condition corresponds to condition (20) in the context of a tax on resource use instead of a tax on the environmental damage (τt). The difference is that the second term of the right-hand side in Eq. 20 is replaced by the growth rate of the tax, \(\dot {\theta }_{t}/\theta _{t}\). At time t, a marginal decrease in production entails a marginal decrease in consumption, which is compensated by an increase in consumption at a later date t + dt given by the left-hand side of condition (24). The decrease in production at time t allows saving a flow 1/FR of non-renewable resource. Since the marginal productivity of the resource grows over time, keeping the resource in situ yields a first benefit, which is represented by the first term of the RHS of condition (24). Furthermore, if the tax θt increases (resp. decreases) over time, postponing resource extraction yields an extra cost (resp. benefit) represented by the second term of the RHS of Eq. 24.
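
To see that only the growth rate of the tax enters condition (24), consider as an illustration an exponential tax path \(\theta_{t}=\theta_{0}e^{gt}\), with \(\theta_{0}\geq 1\) and a constant rate g. This path and the symbols \(\theta_{0}\) and g are assumptions made for the example, not restrictions of the model. Then \(\dot{\theta}_{t}/\theta_{t}=g\) and condition (24) becomes

$$ \rho -\frac{u_{CC}\dot{C}+u_{C{\Omega} }\dot{{\Omega}}}{u_{C}}=\frac{F_{RA}\dot{A}_{t}+F_{RR}\dot{R}_{t}}{F_{R}}-g. $$

The level \(\theta_{0}\) does not appear. In particular, for g = 0 (a constant ad valorem tax), the tax term vanishes from condition (24) whatever the value of \(\theta_{0}\), so that this condition is the same as with no tax on resource use.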

Since the consumption-good sector’s profit \({\pi _{t}^{Y}}\) does not feature the level of green knowledge, Bt, this type of knowledge has no value in this economy, as in the laissez-faire case. Thus, the R&D activities are the same as when no environmental policy is implemented: the economy remains in a corner solution where the sole standard research sector is active, that is LAt = 1 and LBt = 0.

This means that the tax ξt improves welfare by modifying the time profile of the environmental damage Ωt(Wt, Bt) only through a change in the time profile of resource use Rt (and thus the time profile of the pollutant stock Wt).

5.1.2 Tax on the Stock of Pollutant

Here, we consider a tax λt on the accumulated stock of pollutant, which can be seen as more easily implemented in the case of nuclear waste for instance—see also [4] for the study of a similar policy in the context of climate change.

The instantaneous profit of the consumption-good sector is then given by:

$$ {\pi_{t}^{Y}}=F(A_{t},R_{t})-{p_{t}^{R}}R_{t}-\lambda_{t}W_{t}. $$
(25)

Here also, the marginal profitabilities of the two inputs are independent of green knowledge. This tool thus cannot trigger any activity in the green R&D sector either.

The program of this sector, which was given by Eq. 13 in the case of the first-best policy, is now:

$$\begin{array}{@{}rcl@{}} &&\underset{R}{\max }{\int}_{0}^{+\infty}[F(A_{t},R_{t})-{p_{t}^{R}}R_{t}-\lambda_{t}W_{t}]e^{-{{\int}_{0}^{t}}r_{u}du}dt\\ &&\text{subject to }\dot{W}_{t} =hR_{t}\text{ for all }t. \end{array} $$

Here also, we obtain, from the maximum principle, two first-order conditions with respect to Rt and Wt. After elimination of the costate variable, we have the condition:

$$ \lambda_{t}h=(F_{R}-p_{Rt})r_{t}-(F_{RA}\dot{A}_{t}+F_{RR}\dot{R}_{t}-\dot{p}_{Rt}). $$
(26)

This condition states how the tax alters the time profile of the firm’s resource use and thus the time profile of the pollutant stock. It must be related to condition (14), obtained with the first-best tax on the environmental damage.
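
The elimination of the costate variable leading to (26) can be sketched as follows. The current-value Hamiltonian and the costate symbol \(\mu_{t}\) are notation introduced here for illustration. With

$$ H_{t}=F(A_{t},R_{t})-p_{Rt}R_{t}-\lambda_{t}W_{t}+\mu_{t}hR_{t}, $$

the first-order conditions with respect to Rt and Wt are

$$ F_{R}-p_{Rt}+\mu_{t}h=0\qquad \text{and}\qquad \dot{\mu}_{t}=r_{t}\mu_{t}+\lambda_{t}. $$

Differentiating the first condition with respect to time gives \(h\dot{\mu}_{t}=-(F_{RA}\dot{A}_{t}+F_{RR}\dot{R}_{t}-\dot{p}_{Rt})\). Multiplying the second condition by h and substituting \(h\mu_{t}=-(F_{R}-p_{Rt})\) then yields

$$ -(F_{RA}\dot{A}_{t}+F_{RR}\dot{R}_{t}-\dot{p}_{Rt})=-(F_{R}-p_{Rt})r_{t}+\lambda_{t}h, $$

which rearranges to condition (26).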

As in the preceding section, we replace \(\dot {p}_{Rt}\) in Eq. 26 by its expression derived from Eq. 15. Then, eliminating rt between this equation and (16), we obtain the following equilibrium condition:

$$ \rho -\frac{u_{CC}\dot{C}+u_{C{\Omega} }\dot{{\Omega}}}{u_{C}}=\frac{F_{RA}\dot{A}_{t}+F_{RR}\dot{R}_{t}}{F_{R}}+\frac{h\lambda_{t}}{F_{R}}. $$
(27)

This condition must be related to condition (20), which characterizes the equilibrium of the decentralized economy with a tax on the stock of environmental damage. The only difference with condition (20) is that the second term of the right-hand side in Eq. 27, which represents the tax payment avoided by keeping one unit of resource in situ, no longer features ΩW. Here, forgoing a quantity of resource 1/FRt makes it possible to avoid an amount of tax payment given by the second term of the RHS of Eq. 27. This latter term is independent of the marginal disutility of the environmental damage since the tax only bears on the stock of pollutant.
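
For completeness, the step from (26) to (27) can be written out. Using \(\dot{p}_{Rt}=p_{Rt}r_{t}\) from Eq. 15, the price terms in (26) cancel:

$$ \lambda_{t}h=F_{R}r_{t}-p_{Rt}r_{t}-(F_{RA}\dot{A}_{t}+F_{RR}\dot{R}_{t})+p_{Rt}r_{t}=F_{R}r_{t}-(F_{RA}\dot{A}_{t}+F_{RR}\dot{R}_{t}). $$

Solving for the interest rate gives

$$ r_{t}=\frac{F_{RA}\dot{A}_{t}+F_{RR}\dot{R}_{t}}{F_{R}}+\frac{h\lambda_{t}}{F_{R}}, $$

and replacing rt by its expression from the Ramsey-Keynes condition (16), which is the left-hand side of (24), yields condition (27).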

As is the case with a tax on resource use, green knowledge, Bt, is not valued. The environmental policy provides no incentives to invest in green R&D, and only affects the economy by altering the time path of resource use.

5.2 Subsidy to Green R&D

The two second-best environmental taxes studied above do not trigger any activity in green research: the economy remains in a corner solution where only standard research is done. An obvious way to provide incentives to green research is to directly subsidize it.

So we now suppose that there is no environmental policy (τt = ξt = λt = 0) and we consider a subsidy σt to green R&D such that \(v_{Bt}^{Y}=\frac {\partial {\pi _{t}^{Y}}}{\partial B_{t}}+\sigma _{t}\). Here, green research is financed even if it is not valued by the consumption good sector.

Such a policy tool neither affects the program of the firm producing the consumption good nor the programs of the firm exploiting the resource and the representative household. Hence, conditions (12), (15) and (16), characterizing the equilibrium without environmental policy, hold here. As a result, the first equilibrium condition is \(\rho -\frac {u_{CC}\dot {C}+u_{C{\Omega } }\dot {{\Omega }}}{u_{C}}=\frac {F_{RA}\dot {A}_{t}+F_{RR}\dot {R}_{t}}{F_{R}},\) which is condition (20) if τt = 0. The subsidy thus has no impact on the arbitrages made in resource use and pollutant accumulation.

Conversely, the arbitrages made in the R&D sector are altered. Indeed, while the price of an innovation in the standard sector is still given by \(v_{At}=v_{At}^{Y}+v_{At}^{RD}=\frac {\partial {\pi _{t}^{Y}}}{\partial A_{t}}+\frac {\partial \pi _{t}^{RD_{A}}}{\partial A_{t}}\), it is now given by \(v_{Bt}=v_{Bt}^{Y}+v_{Bt}^{RD}=\frac {\partial {\pi _{t}^{Y}}}{\partial B_{t}}+\sigma _{t}+\frac {\partial \pi _{t}^{RD_{B}}}{\partial B_{t}}\) in the green sector. As a consequence, we have (by differentiating profit functions (11) and (17) with respect to At and Bt): vAt = FA + δALAtVAt and vBt = σt + δBLBtVBt.

Replacing vAt and vBt by these expressions in Eq. 19 and using Condition (18), one gets: \(\frac {\dot {w}_{t}}{w_{t}}-r_{t}=-\frac {F_{A}\delta _{A}A_{t}}{w_{t}}\) and \(\frac {\dot {w}_{t}}{w_{t}}-r_{t}=-\frac {\sigma _{t}\delta _{B}B_{t}}{w_{t}}\). Combining both equations gives the following equilibrium condition:

$$ F_{A}\delta_{A}A_{t}=\sigma_{t}\delta_{B}B_{t}. $$
(28)

This is the no-arbitrage condition between the two R&D sectors. It must be related to condition (23), which is the no-arbitrage condition in the case of the first-best tax, and to condition (9), governing the social optimum.
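
The link between (28) and (23) can be made explicit. The sketch below assumes that the green-sector wage condition keeps the sign convention of (22), with the subsidy \(\sigma_{t}\) taking the place of the marginal tax saving \(-\tau_{t}{\Omega}_{B}\):

$$ \frac{\dot{w}_{t}}{w_{t}}-r_{t}=-\frac{F_{A}\delta_{A}A_{t}}{w_{t}}\quad \text{and}\quad \frac{\dot{w}_{t}}{w_{t}}-r_{t}=-\frac{\sigma_{t}\delta_{B}B_{t}}{w_{t}}\quad \Longrightarrow \quad F_{A}\delta_{A}A_{t}=\sigma_{t}\delta_{B}B_{t}. $$

Comparing with (23), the subsidy reproduces the no-arbitrage condition obtained under the tax on environmental damage whenever \(\sigma_{t}=-\tau_{t}{\Omega}_{B}\). Evaluated at the first-best tax \({\tau_{t}^{o}}=-u_{{\Omega}}/u_{C}\), this gives \(\sigma_{t}=u_{{\Omega}}{\Omega}_{B}/u_{C}\).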

Here, one can see that the subsidy modifies the time profile of the environmental damage only by modifying the arbitrages made between the two R&D sectors—contrary to the tax on the environmental damage τt, which modifies both the arbitrages made on resource extraction and on the two R&D sectors (see Eqs. 20 and 23).

5.3 Combination of the Second-best Tools

As previously shown, the second-best environmental taxes cannot trigger activity in the green-R&D sector whereas the subsidy to green R&D does not modify the time profile of resource use (and production of pollutant). Combining both types of tools thus seems an obvious way to fully correct the environmental externality. Comparing conditions (24) and (27) with condition (10) on the one hand, and condition (28) with condition (9) on the other hand, gives the pairs (θt, σt) and (λt, σt) that achieve the economy’s first-best outcome. We present these two pairs in the following proposition.

Proposition 2

Combining a subsidy to the effort in green research \(\sigma _{t}=\frac {u_{{\Omega } }{\Omega }_{B}}{u_{C}}\) with either a tax on resource use growing at rate \(g_{\theta }=\frac {h}{F_{R}}\left (\frac {u_{{\Omega } }}{u_{C}}\right ){\Omega }_{W}\) or a tax on the stock of pollutant \(\lambda _{t}=-\frac {u_{{\Omega } }{\Omega }_{W}}{u_{C}}\) allows achieving the social optimum.
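
The formulas of Propositions 1 and 2 can be evaluated numerically at a given date. The sketch below is only an illustration: the damage function Ω = W/B, the utility function ln C − φΩ, the parameter `PHI` and all numerical values are hypothetical choices made here, not specifications taken from the model. Derivatives are approximated by central finite differences so that other functional forms can be substituted.

```python
import math
from dataclasses import dataclass

PHI = 0.5  # hypothetical weight of environmental damage in utility


@dataclass
class State:
    """Point values at date t (all numbers used with it are illustrative)."""
    C: float    # consumption
    W: float    # accumulated stock of pollutant
    B: float    # stock of green knowledge
    F_R: float  # marginal productivity of the resource
    h: float    # pollutant emitted per unit of resource (W_dot = h * R)


def damage(W, B):
    """Hypothetical damage function: increasing in W, decreasing in B."""
    return W / B


def utility(C, omega):
    """Hypothetical utility with u_C > 0 and u_Omega < 0."""
    return math.log(C) - PHI * omega


def partial(f, args, i, eps=1e-6):
    """Central finite-difference approximation of df/d(args[i])."""
    up = list(args)
    down = list(args)
    up[i] += eps
    down[i] -= eps
    return (f(*up) - f(*down)) / (2 * eps)


def instruments(s):
    """Evaluate the tax and subsidy formulas of Propositions 1 and 2."""
    omega = damage(s.W, s.B)
    u_C = partial(utility, (s.C, omega), 0)
    u_Om = partial(utility, (s.C, omega), 1)
    Om_W = partial(damage, (s.W, s.B), 0)
    Om_B = partial(damage, (s.W, s.B), 1)
    return {
        "tau": -u_Om / u_C,                              # tax on damage
        "sigma": u_Om * Om_B / u_C,                      # subsidy to green R&D
        "lambda": -u_Om * Om_W / u_C,                    # tax on pollutant stock
        "g_theta": (s.h / s.F_R) * (u_Om / u_C) * Om_W,  # growth rate of theta
        "Om_W": Om_W,
        "Om_B": Om_B,
    }


if __name__ == "__main__":
    print(instruments(State(C=2.0, W=4.0, B=2.0, F_R=0.5, h=1.0)))
```

For the illustrative state above, the sketch returns approximately τ = 1, σ = 1, λ = 0.5 and gθ = −1. It also satisfies σ = −τΩB and λ = τΩW: under these assumptions, the second-best pair reproduces the two margins on which the first-best tax acts.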

6 Concluding Remarks

We have used a simple endogenous growth model with directed technical change in which an environmental damage stems from the accumulation of pollutant through the use of a non-renewable resource. This damage can be reduced by improving the technology used to manage the stock of pollutant. We have characterized the social optimum of this economy and studied its decentralized equilibrium. Here, the first-best environmental policy consists in taxing the environmental damage itself. We have analyzed the properties of such a tax, which must be considered as a benchmark. Usual environmental policies like a tax on resource use (or pollutant flows) or a tax on the pollutant stock do not provide incentives to invest in pollution management; they can yield first-best results only if they are accompanied by complementary policies like incentives to green R&D. We have characterized two pairs of second-best, and more easily implementable, instruments that allow achieving the economy’s social optimum.

The environmental policies that are more frequently implemented or proposed today consist in ascribing a cost, whether directly or indirectly, to polluting activities. This is notable in the case of climate change (or carbon control) initiatives, where policies such as the European Union Emissions Trading Scheme (EU-ETS), the US’s Regional Greenhouse Gas Initiative (RGGI), California’s cap and trade program or the UK’s Climate Change Levy have been squarely aimed at increasing the private cost of greenhouse gas emissions. Such policy instruments are now relatively mature and are increasingly being rolled out across a growing number of sectors and jurisdictions [26]. Moreover, the Paris Agreement, adopted in 2015, set a global objective of limiting temperature increases to no more than 2 degrees Celsius, and individual countries have submitted national action plans aimed at delivering that collective goalFootnote 14. Given that several of the climate change policies mentioned above are, explicitly or otherwise, linked to attaining a given jurisdiction’s international climate targets (see, e.g., European CommissionFootnote 15 or countries’ “Nationally Determined Contributions” towards the Paris AgreementFootnote 16), it is not far-fetched to say that there is at least a degree of quantitative connection between carbon control policies and environmental damage. However, internalizing the environmental damage of a polluting resource flow through a tax is only one side of what this paper argues to be an efficient policy mix.

Relative to the progress made on taxing polluting flows, policies aimed at fostering innovation in reducing environmental damage (such as climate change adaptation) are much less widespread. Among the existing initiatives on climate change for example, most are aimed at financing climate change adaptation, either at the global (e.g., the UN’s Adaptation FundFootnote 17) or the national and regional levels (e.g., EU financing for adaptationFootnote 18). While it can be argued that providing state sponsored low-cost capital to adaptation efforts is a type of subsidy, explicit (let alone marginal) incentives for adaptation innovation are very rare and, contrary to emissions taxes, are not typically linked to specific levels of environmental damage.

One of our paper’s key messages is that, in a context where directly taxing environmental damage is technically unfeasible, an effective environmental policy should combine taxes on the pollution flows with incentives towards green R&D, with the two instruments calibrated to quantified levels of damage and technology effectiveness. Within the specific context of climate change, an area where significant progress has been made on developing and implementing control mechanisms, this implies that the existing and growing body of policies seeking to internalize carbon costs and deliver certain levels of maximum environmental damage should be more closely and explicitly accompanied by policies fostering adaptation that, where possible, are also quantitatively linked to specific levels of environmental damage.