The publication in 1972 of “The Limits to Growth” by the Club of Rome marked the emergence of public awareness of the collective perils associated with the sustainability of our development. Since then, citizens and politicians have been confronted with a never-ending list of environmental problems: nuclear waste, genetically modified organisms, climate change, biodiversity, ... This debate has recently culminated with the publication of three reports. On one side, the Copenhagen Consensus (Lomborg 2004) put top priority on public programs yielding immediate benefits (fighting malaria and AIDS, improving water supply, ...), and rejected the idea of investing to prevent global warming. On the other side, the Stern Review (Stern 2006) and the fourth report of the IPCC (IPCC 2008) exerted tremendous pressure to act quickly and forcefully against global warming.

An important source of heterogeneity in environmental policy recommendations comes from the selection of the discount rate. Behind this selection lies a crucial question: how much should current generations be willing to pay to improve the future? We all agree that one euro obtained immediately is better than one euro obtained next year, mostly because of the positive return we can get by investing this euro. This arbitrage argument implies that costs and benefits occurring in the future should be discounted at a rate equal to the rate of return of capital over the corresponding period. Because it is hard to predict the rate of return of capital for several centuries, one should follow an alternative approach to select the discount rate, which consists in evaluating explicitly the welfare effect of the environmental policy under consideration for each future generation. Since this approach compares consumption paths in which costs are redistributed across generations, it is important to make explicit the ethical and economic assumptions on which these comparisons are made. Most environmental policies will generate winners and losers, but one evaluates the welfare gain by defining an intergenerational welfare function which is a (discounted) sum of the welfare of each generation.

The welfare approach to discounting is based on the additional assumption that future generations will be richer than current ones. In a nutshell, one should not be ready to pay one euro to reduce the loss borne by future generations by one euro, given that these future generations will be so much wealthier than us. Suppose for example, as in the Stern Review, that the real growth rate of the world GDP per capita will be 1.3% per year over the next 200 years, which implies that people will enjoy a real GDP per capita 11 times larger in 2200 than it is today. Suppose also, as in the Stern Review, that the representative agent has a logarithmic utility function, which means that doubling the GDP per capita halves the marginal utility of wealth. Combining these two assumptions implies that one more unit of consumption now has a marginal impact on social welfare that is 11 times larger than the same increment of consumption in 2200. This wealth effect alone corresponds to a discount rate of 1.3% per year. More generally, the so-called Ramsey rule states that the socially efficient discount rate is the product of the real growth rate of consumption and the elasticity of the marginal utility of consumption. The Ramsey rule, and its underlying wealth effect, is the cornerstone of the current debate about the discount rate.

The Ramsey rule is subject to many criticisms in spite of the fact that it provides a sound economic basis for discounting. The main problem with the Ramsey rule is the difficulty of predicting the growth rate of consumption. Estimating the growth rate for the coming year is already a difficult task. No doubt, any estimation of growth for the next century or millennium is subject to potentially enormous errors. The history of the western world before the industrial revolution is full of important economic slumps (invasion of the Roman Empire, Black Death, worldwide wars, ...). Those who emphasize the effects of natural resource scarcity will forecast lower growth rates in the future. Some even suggest a negative growth of the GDP per head in the future, due to the deterioration of the environment, population growth and decreasing returns to scale. They claim that the wealth effect goes in the other direction, yielding a negative discount rate. On the contrary, optimistic people will argue that the effects of the improvements in information technology have yet to be realized, that there are still many scientific and technological discoveries to be made, and that the world will face a long period of prosperity. The rational attitude would be to recognize that there is a lot of uncertainty about the economic environment in the distant future. This uncertainty casts some doubt on the relevance of the wealth effect to justify the use of a large discount rate.

It is thus crucial to put growth uncertainty into the picture, as done in Section 2. This is required for the sake of realism, and it is essential if one wants to provide a credible economic approach to the notion of sustainable development. A first attempt in that direction has been provided by Gollier (2002a, b, 2007), and more recently by Weitzman (2007). Instrumental to this analysis is the concept of prudence, which refers to the consumer’s willingness to save more in the face of an increase in his future income risk, or, equivalently, the willingness to sacrifice the present in the face of a more uncertain future. Technically, an agent is prudent if the third derivative of his utility function is positive. Macroeconomists have been measuring the precautionary saving motive, and this literature tells us much about how much sacrifice people should make when the economic environment of future generations is uncertain.

Thus, the socially efficient discount rate has two main determinants, a wealth effect and a precautionary effect. The larger the expected future consumption or the smaller the future uncertainty, the larger is the discount rate. We can apply this analysis for different time horizons to determine the term structure of discount rates. In this paper, we discuss the importance of the persistence of shocks on the growth of the economy, hence on the shape of this structure. The high persistence of future shocks tends to magnify the long-term risk, and the associated negative precautionary effect on the long-term discount rate. Thus, as explained in Gollier (2007), it justifies using a smaller rate to discount cash flows occurring in a more distant future.

This paper reviews the standard consumption-based theory of the term structure of discount rates. The benchmark model is based on a power utility function for the representative agent and an arithmetic Brownian process for log consumption. We show in Section 2 that this implies that the socially efficient discount rate is independent of the time horizon. We then show that the persistence of shocks to the growth rate of the economy can justify using a decreasing term structure. Several models exhibiting this property are presented and discussed in the core of the paper.

1 The extended Ramsey rule

We consider a standard utilitarian welfare function

$$ W=\sum\limits_{t=0}^{\infty }e^{-\delta t}Eu(c_{t}). $$
(1)

The preferences of the representative agent in the economy are represented by her utility function u and by her rate of pure preference for the present δ. In a model with multiple generations, the current generation integrates the welfare of the next generation as if it were its own. Thus, there is pure altruism in model Eq. 1. The utility function u on consumption is assumed to be three times differentiable, increasing and concave. Let \(c_{t}\) denote consumption at date t. Consider a marginal risk-free investment at date 0 which generates a single benefit \(e^{rt}\) at date t per euro invested at date 0. At the margin, investing in this project has the following impact on welfare:

$$ \Delta W=e^{-\delta t}e^{rt}Eu^{\prime }\left( c_{t}\right) -u^{\prime }(c_{0}). $$
(2)

The first term on the right-hand side is the welfare benefit that such an investment yields. Consumption at date t is increased by \(e^{rt}\), which yields an increase in expected utility by \(Eu^{\prime }\left( c_{t}\right) e^{rt}\), which must be discounted at rate δ to take account of the delay. The second term, \(u^{\prime }(c_{0}),\) is the welfare cost of reducing consumption today. Because ΔW is increasing in r, there exists a critical rate of return, denoted \(r_{t}\), such that ΔW = 0 for \(r=r_{t}\). Obviously, \(r_{t}\) is the socially efficient discount rate, which satisfies the following standard pricing formula:

$$ r_{t}=\delta -\frac{1}{t}\ln \frac{Eu^{\prime }(c_{t})}{u^{\prime }(c_{0})}. $$
(3)

If financial markets were frictionless and efficient, \(r_{t}\) would be the equilibrium interest rate associated with maturity t. This formula is the standard asset pricing formula for risk-free bonds (see for example Cochrane 2001).

Suppose that \(u^{\prime }(c)=c^{-\gamma }\), where γ represents the constant relative risk aversion of the representative agent (Footnote 1). Suppose also that

$$ X_{t}=\ln c_{t}-\ln c_{0} $$

is normally distributed. As is well known (Footnote 2), the Arrow-Pratt approximation is exact for an exponential function and a normally distributed risk. This implies that

$$ \frac{Eu^{\prime }(c_{t})}{u^{\prime }(c_{0})}=E\exp (-\gamma X_{t})=\exp (-\gamma (EX_{t}-0.5\gamma Var(X_{t}))). $$

It implies in turn that

$$ r_{t}=\delta +\gamma \frac{EX_{t}}{t}-0.5\gamma ^{2}\frac{Var(X_{t})}{t}, $$
(4)

or equivalently, that

$$ r_{t}=\delta +\gamma g_{t}-0.5\gamma (\gamma +1)\frac{Var(X_{t})}{t}, $$
(5)

where \(g_{t}=t^{-1}\ln (Ec_{t}/c_{0})=t^{-1}(EX_{t}+0.5Var(X_{t}))\) is the annualized growth rate of mean consumption (Footnote 3). This formula states that the socially efficient discount rate has three determinants. The first one is the rate of pure preference for the present, δ, which we put equal to zero in this paper. The second one is the wealth effect, which is measured by \(\gamma g_{t}\), the product of relative risk aversion and the annualized growth rate of mean consumption between 0 and t. The third determinant is the precautionary effect. We see that it has an effect that is equivalent to a sure reduction of the growth rate of consumption by \(0.5(\gamma +1)t^{-1}Var(X_{t}),\) which is the precautionary premium defined by Kimball (1990). Indeed, γ + 1 is the index of relative prudence \(-cu^{\prime \prime \prime }(c)/u^{\prime \prime }(c)\). The uncertainty on the wealth available to the generation living at date t tends to reduce the discount rate associated with that date, which implies that more sacrifice must be endured today to increase wealth at that date.
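The equivalence of Eqs. 3, 4 and 5 can be checked numerically. The sketch below is only an illustration: the function names and the inputs (δ = 0, γ = 2, and \(X_{10}\) normal with mean 10 × 1.8% and variance 10 × (3.6%)²) are assumptions chosen for the example, and the expectation in Eq. 3 is approximated with a plain trapezoid rule.

```python
import math

def rate_eq4(delta, gamma, mean_x, var_x, t):
    """Eq. 4: r_t = delta + gamma*E[X_t]/t - 0.5*gamma^2*Var(X_t)/t."""
    return delta + gamma * mean_x / t - 0.5 * gamma ** 2 * var_x / t

def rate_eq5(delta, gamma, mean_x, var_x, t):
    """Eq. 5: same rate, written with g_t, the growth rate of mean consumption."""
    g_t = (mean_x + 0.5 * var_x) / t
    return delta + gamma * g_t - 0.5 * gamma * (gamma + 1) * var_x / t

def rate_eq3_numeric(delta, gamma, mean_x, var_x, t, steps=4000, width=10.0):
    """Eq. 3 with E[exp(-gamma*X_t)] integrated numerically for a normal X_t."""
    sd = math.sqrt(var_x)
    lower = mean_x - width * sd
    h = 2.0 * width * sd / steps
    total = 0.0
    for i in range(steps + 1):
        x = lower + i * h
        density = math.exp(-0.5 * ((x - mean_x) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))
        weight = 0.5 if i in (0, steps) else 1.0  # trapezoid end-point weights
        total += weight * math.exp(-gamma * x) * density * h
    return delta - math.log(total) / t

# Illustrative inputs (assumptions for this example only).
DELTA, GAMMA, T = 0.0, 2.0, 10
MEAN_X, VAR_X = 0.018 * T, 0.036 ** 2 * T

r4 = rate_eq4(DELTA, GAMMA, MEAN_X, VAR_X, T)
r5 = rate_eq5(DELTA, GAMMA, MEAN_X, VAR_X, T)
r3 = rate_eq3_numeric(DELTA, GAMMA, MEAN_X, VAR_X, T)
print(f"Eq. 3 (numeric): {r3:.6f}  Eq. 4: {r4:.6f}  Eq. 5: {r5:.6f}")
```

The three numbers coincide up to integration error, which illustrates that the wealth and precautionary terms of Eq. 5 are just a regrouping of the mean and variance terms of Eq. 4.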

Stern (2006) considers the following specification: δ = 0.1%, \(g_{t}=1.3\%\), \(Var(X_{t})=0\) and γ = 1. Equation 5 applied with these parameter values implies that \(r_{t}=1.4\%\) per year. Actually, Stern does not explicitly use a discount rate, because the investment project is not marginal. Therefore, the marginalist approach presented in this section cannot be used in his context. Rather, Stern estimates the sure immediate and permanent loss in consumption that yields the same effect on welfare W as the impacts of climate change. However, in Stern (2006), the certainty equivalent loss does not exceed 15% of GDP in 2200, which is not far from being “marginal”. In order to evaluate this point, let us compute in a non-marginal way the maximum share y of current GDP that one should be ready to sacrifice to eliminate a sure loss of 15% of GDP in 200 years, where y is the solution of the following iso-welfare condition (with t = 200):

$$ u(c_{0})+e^{-\delta t}u\big(0.85c_{0}e^{200g}\big)=u(c_{0}(1-y))+e^{-\delta t}u\big(c_{0}e^{200g}\big). $$

Under the calibration of Stern, we obtain y = 12.46%. This corresponds to discounting the future loss of \(0.15\exp (200g)\) at a rate of 1.39% per year. Using a non-marginalist approach to valuing efforts to mitigate global warming does not noticeably change the conclusion compared to using a standard cost-benefit analysis of certainty equivalent impacts with a discount rate of 1.4%.
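The two numbers above can be reproduced directly. With logarithmic utility (γ = 1), the iso-welfare condition reduces to \(\ln (1-y)=e^{-\delta t}\ln 0.85\), and the implied discount rate r solves \(y=0.15\exp (200g)\exp (-200r)\). The following sketch uses function names that are ours, not the paper's.

```python
import math

def sacrifice_share_log_utility(delta, loss_share, t):
    """Solve the iso-welfare condition for y when u = ln:
    ln(1 - y) = exp(-delta*t) * ln(1 - loss_share)."""
    return 1.0 - math.exp(math.exp(-delta * t) * math.log(1.0 - loss_share))

def implied_discount_rate(y, loss_share, g, t):
    """Rate r such that y = loss_share * exp(g*t) * exp(-r*t)."""
    return (math.log(loss_share) + g * t - math.log(y)) / t

# Calibration quoted in the text: delta = 0.1%, g = 1.3%, loss of 15% of GDP in 200 years.
y = sacrifice_share_log_utility(delta=0.001, loss_share=0.15, t=200)
r = implied_discount_rate(y, loss_share=0.15, g=0.013, t=200)
print(f"y = {100 * y:.2f}% of current GDP, implied rate = {100 * r:.2f}% per year")
```

The initial consumption level \(c_{0}\) and the growth factor \(e^{200g}\) cancel out of the log-utility condition, which is why g only enters through the implied discount rate.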

2 The term structure of discount rates

In the previous section, we determined the socially efficient rate to discount a cash-flow occurring at date t, as a function of the expectations about GDP per capita at that date. The Ramsey rule Eq. 4 holds for all t, which means that it characterizes the term structure of the socially efficient discount rate. It depends upon how \(EX_{t}/t\) and \(Var(X_{t})/t\) evolve with t. The simplest and most classical case has \(\ln c_{t}\) follow an arithmetic Brownian motion with trend μ and volatility σ: \(d\ln c_{t}=\mu dt+\sigma dz\). This is equivalent to \(dc_{t}/c_{t}=gdt+\sigma dz\), with \(g=\mu +0.5\sigma ^{2}\). In that case, \(EX_{t}=E(\ln c_{t}-\ln c_{0})=\mu t\) and \(Var(X_{t})=\sigma ^{2}t\). This implies that

$$ r_{t}=\delta +\gamma \mu -0.5\gamma ^{2}\sigma ^{2}\text{ \ or \ } r_{t}=\delta +\gamma g-0.5\gamma (\gamma +1)\sigma ^{2}. $$
(6)

This so-called extended Ramsey rule is equivalent to those obtained by Mankiw (1981), Hansen and Singleton (1983), Breeden (1986) and Campbell (1986). It implies in particular that the socially efficient discount rate is independent of the time horizon. The independence of the discount rate with respect to the maturity is an important feature of the standard specification with a power utility function and an arithmetic Brownian motion for log consumption. In the absence of uncertainty (σ = 0), this constancy of the discount rate is due to the constancy of the growth rate g of consumption. It implies a constant rate of reduction of marginal utility by γμ. In order to maintain utility unchanged, this must be compensated by a constant growth rate γμ of the cash-flow. When uncertainty is introduced, the same story holds, because it raises expected marginal utility at the constant rate \(0.5\gamma ^{2}\sigma ^{2}\). This is a consequence of the fact that the stochastic process is i.i.d., implying that the variance of log consumption in t years is proportional to t.

What do we know about the growth process of consumption? Using annual data from 1889 to 1978, Kocherlakota (1996) estimated g = 1.8% and σ = 3.6%. Assuming δ = 0, this yields a discount rate equaling

$$ r=0.018\gamma -0.000635\gamma ^{2}. $$

In Table 1, we decompose the socially efficient discount rate into its wealth and precautionary components for different degrees of risk aversion. We observe that the precautionary component is a second order effect compared to the wealth effect, at least for reasonable values of γ.

Table 1 Decomposing the discount rate into the wealth and precautionary components
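The decomposition can be sketched from the displayed formula \(r=0.018\gamma -0.000635\gamma ^{2}\), reading the first term as the wealth component and the second as the precautionary component. The grid of γ values below is an assumption made for illustration; it is not taken from Table 1, and the printed numbers should not be read as a reproduction of that table.

```python
def decompose_rate(gamma, wealth_coef=0.018, precaution_coef=0.000635):
    """Split r = 0.018*gamma - 0.000635*gamma^2 into its two components."""
    wealth = wealth_coef * gamma
    precaution = -precaution_coef * gamma ** 2
    return wealth, precaution, wealth + precaution

# Hypothetical grid of relative risk aversion (assumed, not from the paper).
GAMMAS = (0.5, 1.0, 2.0, 4.0, 10.0)

print("gamma  wealth   precaution  total")
for gamma in GAMMAS:
    wealth, precaution, total = decompose_rate(gamma)
    print(f"{gamma:5.1f}  {100 * wealth:6.2f}%  {100 * precaution:9.2f}%  {100 * total:5.2f}%")
```

Because the precautionary term is quadratic in γ while the wealth term is linear, the former stays small for low γ and only catches up with the latter for large degrees of risk aversion.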

We have seen that the constancy of the discount rate mostly relies on the constancy of three economic variables: the trend of growth, the volatility of growth, and relative risk aversion. Gollier (2002a, b) relaxed the third assumption. He showed that the term structure of the discount rate should be decreasing if relative risk aversion is decreasing. Relaxing the constancy of the expected growth rate has obvious consequences on the discount rate, which are made explicit in Eq. 5. In particular, in an economy with diminishing expectations (\(\partial g_{t}/\partial t<0\)), the term structure should be decreasing because the wealth effect vanishes. We hereafter examine the consequences of relaxing the constancy of volatility.

2.1 Persistent shocks on growth

The benchmark model examined above is based on the assumption that aggregate log consumption follows a pure random walk, or that the growth rate of consumption has no serial correlation. But Cochrane (1988) and Cogley (1990) have shown that this hypothesis is rejected by the data in most countries. Let us thus alternatively assume that the change in log consumption exhibits some persistence that takes the form of an AR(1) process:

$$\begin{array}{rcl} \ln c_{t+1} &=&\ln c_{t}+x_{t} \\ x_{t} &=&\phi x_{t-1}+(1-\phi )\mu +\varepsilon _{t} \end{array}$$

where \(\varepsilon _{0},\varepsilon _{1},...\) are normal i.i.d. with mean zero and variance \(\sigma ^{2}\). We get the benchmark specification as a special case of this AR(1) with ϕ = 0. When ϕ is positive and less than unity, the change in log consumption exhibits some persistence. It is straightforward to check that

$$ X_{t}=\ln c_{t}-\ln c_{0}=\mu t+(x_{-1}-\mu )\frac{\phi (1-\phi ^{t})}{ 1-\phi }+\sum_{\tau =1}^{t}\frac{1-\phi ^{\tau }}{1-\phi }\varepsilon _{t-\tau }. $$

It implies that

$$ \frac{EX_{t}}{t}=\mu +(x_{-1}-\mu )\frac{\phi (1-\phi ^{t})}{t(1-\phi )}. $$

When the state variable \(x_{-1}\) at date 0 is larger than μ, the persistence of the past positive shock generates a positive wealth effect that vanishes over time. It justifies a decreasing term structure. The opposite configuration arises when \(x_{-1}\) is smaller than μ.

We then get that

$$ \frac{Var(X_{t})}{t}=\frac{\sigma ^{2}}{(1-\phi )^{2}}+\sigma ^{2}\frac{\phi (1-\phi ^{t})}{t(1-\phi )^{3}}\left[ \frac{\phi (1+\phi ^{t})}{1+\phi }-2 \right] . $$

This is increasing in t when there is some persistence (ϕ > 0). The positive serial correlation of the growth process tends to magnify the long-term risk relative to the short-term one. From Eq. 4, the increasing precautionary effect that this generates justifies a decreasing term structure. Because \(Var(X_{t})/t\) converges to \(\sigma ^{2}/(1-\phi )^{2}\) when t tends to infinity whereas \(Var(X_{1})=\sigma ^{2},\) the precautionary effect is multiplied by a factor \((1-\phi )^{-2}\) when going from the very short term to infinity. For example, when ϕ = 0.3, the long-term precautionary effect is two times larger than the short-term effect.
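As a sanity check, the closed-form expression for \(Var(X_{t})/t\) can be compared with the direct sum of squared shock loadings \(\left[ (1-\phi ^{\tau })/(1-\phi )\right] ^{2}\) implied by the expression for \(X_{t}\). The sketch below uses ϕ = 0.3 from the text and σ = 3.6% from the benchmark; the function names are ours.

```python
def var_ratio_closed_form(phi, sigma, t):
    """Closed-form Var(X_t)/t for the AR(1) growth process."""
    long_run = sigma ** 2 / (1 - phi) ** 2
    correction = (sigma ** 2 * phi * (1 - phi ** t) / (t * (1 - phi) ** 3)
                  * (phi * (1 + phi ** t) / (1 + phi) - 2))
    return long_run + correction

def var_ratio_direct(phi, sigma, t):
    """Var(X_t)/t as the sum of squared loadings on the i.i.d. shocks."""
    loadings = ((1 - phi ** tau) / (1 - phi) for tau in range(1, t + 1))
    return sigma ** 2 * sum(load ** 2 for load in loadings) / t

PHI, SIGMA = 0.3, 0.036
for t in (1, 2, 5, 20, 100):
    closed = var_ratio_closed_form(PHI, SIGMA, t)
    direct = var_ratio_direct(PHI, SIGMA, t)
    print(f"t={t:4d}  closed={closed:.6e}  direct={direct:.6e}  ratio to sigma^2={closed / SIGMA ** 2:.4f}")
```

The ratio to \(\sigma ^{2}\) starts at 1 for t = 1 and rises towards \((1-0.3)^{-2}\approx 2.04\), the factor of two mentioned in the text.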

Notice that the short-term interest rate in this model also follows an AR(1) process since, using Eq. 4, the short-term interest rate at date t equals \(\delta -0.5\gamma ^{2}\sigma ^{2}+\gamma (1-\phi )\mu +\gamma \phi x_{t-1}\). This indicates that this model is a discrete-time consumption-based version of the Vasicek (1977) model of the term structure of interest rates. Moreover, because changes in the short-term interest rate are proportional to changes in past log consumption, the degree ϕ of persistence of the latter equals the degree of persistence of the former, which has been well documented in the literature on the term structure of interest rates. For example, Backus et al. (1998) consider \(\phi =0.024\) month\(^{-1}\), which corresponds to \(\phi =0.3\) year\(^{-1}\).

Because ϕ = 0.3 per year yields a half-life of 2.3 years, this model is useful to justify differences in discount rates for maturities expressed in years, but not really for maturities expressed in decades or centuries.

2.2 A model with parameter uncertainty

As invoked in the so-called peso problem, the absence of a sufficiently large data set to estimate the long-term growth process of the economy implies that the parameters controlling the growth process are uncertain and subject to learning in the future. To take this observation into account, we extend the benchmark model presented in Section 2 by introducing some parametric uncertainty, as in Gollier (2007). Suppose that log consumption follows an arithmetic Brownian motion with trend μ(θ) and volatility σ(θ). These values depend upon parameter θ, which is unknown at date 0. This uncertainty is characterized by random variable \(\widetilde{\theta }.\) By the law of iterated expectations, Eq. 3 can be rewritten as

$$ r_{t}=\delta -\frac{1}{t}\ln E_{\theta }\left[ \frac{E\left[ \left. u^{\prime }(c_{t})\right\vert \theta \right] }{u^{\prime }(c_{0})}\right] =\delta -\frac{1}{t}\ln E_{\theta }\left[ \exp \left[ -\gamma t(\mu ( \widetilde{\theta })-0.5\gamma \sigma (\widetilde{\theta })^{2})\right] \right] , $$

where the second equality is obtained by using again that the Arrow-Pratt approximation is exact in the case of an exponential function and a normally distributed random variable (conditional on θ). We conclude that \(r_{t}\) equals \(\delta +\gamma M_{t}\), where \(M_{t}\) is the certainty equivalent of \(\mu (\widetilde{\theta })-0.5\gamma \sigma (\widetilde{\theta })^{2}\) under function \(v_{t}(x)=-\exp (-\gamma tx):\)

$$ \exp \left[ -\gamma tM_{t}\right] =E\exp \left[ -\gamma t(\mu (\widetilde{ \theta })-0.5\gamma \sigma (\widetilde{\theta })^{2})\right] $$
(7)

Because an increase in t makes function \(v_{t}\) more concave, we directly obtain that \(M_{t}\) is decreasing in t. It equals \(M_{0}=E\left[ \mu ( \widetilde{\theta })-0.5\gamma \sigma (\widetilde{\theta })^{2}\right] \) for the instantaneous rate, and it tends to the lowest possible value \(M_{\infty }=\min_{\theta }\big[ \mu (\theta )- 0.5\gamma \sigma (\theta )^{2}\big] \). This implies that the socially efficient discount rate decreases with the time horizon.

Gollier (2007) provides an intuition for why the uncertainty surrounding the drift of the growth process justifies selecting a smaller long-term discount rate. Indeed, the observation of a high (low) growth in the short run induces the representative agent to revise her expectations about the distribution of growth upwards (downwards). Thus, Bayesian learning generates a positive correlation in the perceived growth process. This magnifies the long-term risk, thereby inducing the prudent representative agent to make more effort for the distant future. This risk magnification effect is described in Figs. 1 and 2, where we assume that σ = 3.6% and μ equals either 0.8% or 2.8% with equal probabilities. In Fig. 1, we compare the densities of \(X_{1}\), i.e. the change in log consumption over one year, under this parametric uncertainty, or when we assume that μ = 1.8% for sure. In Fig. 2, we do the same comparison for a time horizon of 100 years. For both time horizons t = 1 and t = 100, the parametric uncertainty makes the tails fatter, thereby raising the precautionary effect and reducing the discount rate. But the intensity of the phenomenon is much stronger for the long-term rate. This explains why the term structure of the socially efficient discount rates, which is drawn in Fig. 3, should be decreasing when the drift of economic growth is subject to parametric uncertainty.

Fig. 1 The density of the log of consumption in 1 year with (plain curve) and without (dashed curve) parametric uncertainty on the drift of economic growth

Fig. 2 The density of the log of consumption in 100 years with (plain curve) and without (dashed curve) parametric uncertainty on the drift of economic growth

Fig. 3 The term structure of discount rates when \(\gamma =2,\delta =0,\) \(\sigma =3.6\%\) and \(\mu \sim (0.8\%, 1/2; 2.8\%, 1/2)\)
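A term structure of this kind can be computed from Eq. 7 with a few lines of code. The sketch below uses the two-point distribution for μ stated above with γ = 2, δ = 0 and σ = 3.6%; the function names and the selected horizons are ours, the expectation is evaluated with a log-sum-exp step to avoid underflow at long horizons, and the output is an illustration rather than a reproduction of the published figure.

```python
import math

def certainty_equivalent_growth(t, gamma, scenarios):
    """M_t from Eq. 7; scenarios is a list of (probability, mu, sigma)."""
    adjusted = [(p, mu - 0.5 * gamma * sigma ** 2) for p, mu, sigma in scenarios]
    if t == 0:
        return sum(p * m for p, m in adjusted)  # M_0 is the plain expectation
    exponents = [math.log(p) - gamma * t * m for p, m in adjusted]
    top = max(exponents)
    log_expectation = top + math.log(sum(math.exp(e - top) for e in exponents))
    return -log_expectation / (gamma * t)

def discount_rate(t, gamma, delta, scenarios):
    """r_t = delta + gamma * M_t."""
    return delta + gamma * certainty_equivalent_growth(t, gamma, scenarios)

GAMMA, DELTA = 2.0, 0.0
SCENARIOS = [(0.5, 0.008, 0.036), (0.5, 0.028, 0.036)]

for t in (0, 1, 10, 50, 100, 1000):
    print(f"t={t:5d}  r_t={100 * discount_rate(t, GAMMA, DELTA, SCENARIOS):.2f}%")
```

The rate starts from \(\gamma M_{0}\) and declines towards \(\gamma M_{\infty }\), the value obtained under the lowest possible drift.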

Weitzman (2007) considers a special case of parametric uncertainty where μ is known, but σ has an inverted Gamma probability distribution. Because the inverted Gamma distribution has an unbounded support, we directly conclude that the socially efficient discount rate \(r_{t}\) tends to − ∞ when t tends to infinity. This is because the representative agents are upset about the possibility of an arbitrarily large volatility of the growth of consumption, which makes them extremely prudent. This model is linked to the notion of fat tails, since a normal distribution combined with an inverted Gamma distribution for the volatility yields a Student-t distribution, which has tails fatter than the normal. As shown by Barro (2006) for example, thickening the lower tail of the aggregate risk of the economy can have devastating effects on the socially efficient discount rate. Weitzman (2007) shows that if one replaces the normal distribution of log consumption by a Student-t distribution, which has fatter tails, then the socially efficient discount rate goes to minus infinity. Of course, because this specification is unrealistic, it leads to unrealistic policy recommendations. The existence of catastrophic risks raises a difficult challenge for the econometrician, since the rarer an event, the more uncertain is the estimate of its probability. Similarly, we don’t know much about the shape of marginal utility at very low levels of consumption. The critical question is whether it tends to infinity when consumption goes to zero, which implies that one would be ready to give up one’s entire wealth to reduce the probability of ending up with zero consumption.

In fact, the result that \(r_{t}\) tends to minus infinity holds in the CRRA case whenever the support of \(\mu -0.5\gamma \sigma ^{2}\) is unbounded below. Suppose alternatively that \(\sigma ^{2}\) can take value v/2 or 10v, respectively with probabilities 18/19 and 1/19, which implies that the mean variance equals \(v=(3.6\%)^{2}\), as in the benchmark case. It yields

$$ r_{t}=\delta +\gamma \mu -\frac{1}{t}\ln \left[ \frac{18}{19}e^{0.25\gamma ^{2}vt}+\frac{1}{19}e^{5\gamma ^{2}vt}\right] . $$

In Fig. 4, we represent the term structure of discount rates for γ = 2, δ = 0, μ = 1.8%, \(v=(3.6\%)^{2}\). We see that the short-term discount rate equals 3.34% as in the benchmark case with σ = 3.6%, but converges to 1.01% for very large time horizons. This asymptotic discount rate is the one that one would obtain by using the extended Ramsey rule Eq. 6 with the large volatility \(\sigma =\sqrt{ 10v},\) which is \(\sqrt{10}\simeq 3.2\) times the standard deviation of the growth rate of the aggregate consumption in the US from 1889 to 1978. Weitzman’s extreme conclusion comes from the fact that the observed volatility of growth over this period does not exclude in theory the possibility that the true volatility can be much larger than \(\sqrt{10v}\simeq 11.4\%\).
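The two limiting values quoted above follow directly from the displayed formula for \(r_{t}\). The sketch below evaluates it for γ = 2, δ = 0, μ = 1.8% and \(v=(3.6\%)^{2}\); the function name is ours, and the larger exponential is factored out so that the expression remains computable at very long horizons.

```python
import math

def rate_uncertain_variance(t, gamma=2.0, delta=0.0, mu=0.018, v=0.036 ** 2):
    """r_t when sigma^2 is v/2 (prob. 18/19) or 10v (prob. 1/19); requires t > 0."""
    low = 0.25 * gamma ** 2 * v * t   # exponent in the low-variance scenario
    high = 5.0 * gamma ** 2 * v * t   # exponent in the high-variance scenario
    # ln[(18/19) e^low + (1/19) e^high], with e^high factored out for stability.
    log_expectation = high + math.log((18.0 / 19.0) * math.exp(low - high) + 1.0 / 19.0)
    return delta + gamma * mu - log_expectation / t

for t in (0.01, 1, 10, 50, 100, 500, 100000):
    print(f"t={t:>9}  r_t={100 * rate_uncertain_variance(t):.2f}%")
```

At very short horizons the rate is close to \(\gamma \mu -0.5\gamma ^{2}v\), and at very long horizons it approaches \(\gamma \mu -5\gamma ^{2}v\), the value under the high-variance scenario alone.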

Fig. 4 The term structure of discount rates (in %) when the variance of the growth rate of the economy is uncertain

2.3 Risk of catastrophic recessions

Another way to introduce some persistence in shocks on growth is to consider a Markov process. Suppose that there are two states, s = g and s = b, yielding different expected changes in log consumption \(\mu (s_{g})=\mu _{g}>\mu _{b}=\mu (s_{b})\). In each period, there is some probability \(\pi _{s}\) that the state will reverse. This stochastic process is described as follows:

$$\begin{array}{rcl} \ln c_{t+1} &=&\ln c_{t}+x_{t} \\ x_{t} &=&\mu (s_{t})+\varepsilon _{t} \\ P[s_{t+1}=b\mid s_{t}=g] &=&\pi _{g}\text{ and }P[s_{t+1}=g\mid s_{t}=b]=\pi _{b}, \end{array}$$

where \(\varepsilon _{t}\) is i.i.d. normal with mean zero and variance \(\sigma ^{2}\). Cecchetti et al. (2000) estimated such a two-state regime-switching process for the US economy using the annual per capita consumption data covering the period 1890–1994. Table 2 reproduces their estimates. It reveals that the low-growth state is moderately persistent but very bad, with consumption growth in that state being \(\mu _{b}=-6.78\%\). On the contrary, the high-growth state is highly persistent, with consumption growth in that state equaling 2.25%. The economy spends most of the time in this state, with the unconditional probability of being in state g equaling \(\pi _{b}/(\pi _{g}+\pi _{b})=96\%\).

Table 2 Estimates of the regime-switching consumption process

One can solve this problem by recursion. Let us define \(m_{t}^{s}\) in such a way that

$$ \exp \left( -\gamma m_{t+1}^{s}\right) =E\left[ \exp \left( -\gamma (x_{\tau }+...+x_{\tau +t})\right) \left\vert s_{\tau }=s\right. \right] . $$

It implies that

$$\begin{array}{rcl} \exp \left( -\gamma m_{t+1}^{s}\right) &=&(1-\pi _{s})\exp \left( -\gamma (\mu _{s}-0.5\gamma \sigma ^{2}+m_{t}^{s})\right)\\ &&+\pi _{s}\exp \left( -\gamma (\mu _{-s}-0.5\gamma \sigma ^{2}+m_{t}^{-s})\right) \end{array}$$

which, together with \(m_{0}^{s}=0\), allows us to compute \(m_{t}^{s}\) by recursion. We can then derive the term structure since

$$ r_{t}\left[ s_{0}=s\right] =\delta +\gamma \frac{m_{t}^{s}}{t}. $$

In Figs. 5 and 6, we draw the term structures that prevail respectively in the good state and in the bad state. The two rates converge towards \(r_{\infty }=3.26\%\). In the good state, the term structure is decreasing because of the persistence of the shock, implying that the long-term risk is much larger than the short-term one. In the bad state, things can only get better in the future, and the expectation of recovery generates a strong wealth effect that yields an increasing term structure.
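The recursion on \(m_{t}^{s}\) is easy to iterate numerically. The sketch below implements it exactly as displayed above, starting from \(m_{0}^{s}=0\). Only \(\mu _{g}=2.25\%\) and \(\mu _{b}=-6.78\%\) come from the text; the switching probabilities (chosen so that \(\pi _{b}/(\pi _{g}+\pi _{b})=96\%\)), σ, γ and δ are placeholder assumptions, not the Table 2 estimates, so the output will not match Figs. 5 and 6 numerically.

```python
import math

def regime_switching_term_structure(horizon, gamma, delta, mu, pi, sigma):
    """Iterate the recursion on m_t^s and return r_t = delta + gamma*m_t^s/t
    for t = 1..horizon, as a dict keyed by the initial state ('g' or 'b')."""
    other = {"g": "b", "b": "g"}
    adj = 0.5 * gamma * sigma ** 2
    m = {"g": 0.0, "b": 0.0}          # m_0^s = 0
    rates = {"g": [], "b": []}
    for t in range(1, horizon + 1):
        new_m = {}
        for s in ("g", "b"):
            o = other[s]
            stay = (1.0 - pi[s]) * math.exp(-gamma * (mu[s] - adj + m[s]))
            switch = pi[s] * math.exp(-gamma * (mu[o] - adj + m[o]))
            new_m[s] = -math.log(stay + switch) / gamma
        m = new_m
        for s in ("g", "b"):
            rates[s].append(delta + gamma * m[s] / t)
    return rates

MU = {"g": 0.0225, "b": -0.0678}   # growth in each state, from the text
PI = {"g": 0.02, "b": 0.48}        # hypothetical switching probabilities
SIGMA, GAMMA, DELTA = 0.03, 2.0, 0.0  # hypothetical values

rates = regime_switching_term_structure(500, GAMMA, DELTA, MU, PI, SIGMA)
for t in (1, 5, 20, 100, 500):
    print(f"t={t:4d}  good: {100 * rates['g'][t - 1]:6.2f}%   bad: {100 * rates['b'][t - 1]:6.2f}%")
```

Under these placeholder values the good-state curve declines and the bad-state curve rises, and the two meet at long horizons, which is the qualitative pattern described in the text.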

Fig. 5 The term structure in the regime-switching model: Good state

Fig. 6 The term structure in the regime-switching model: Bad state

2.4 Initially uncertain return of capital

Weitzman (1998, 2001) provides a very simple argument in favour of a decreasing term structure. Let θ denote the annualized return of capital in the economy. By a simple arbitrage argument, it must be that the discount rate equals θ. This means that an investment project with payoff B at date t per euro invested at date 0 is efficient if and only if its NPV \(-1+Be^{-\theta t}\) is positive. Suppose now that θ is uncertain at the time of the investment decision. Following Weitzman, suppose that the optimal criterion in this environment is to invest in any project with a positive expected NPV. Obviously, this would mean using a discount rate \(R_{t}\) such that

$$ f_{t}(R_{t})=Ef_{t}(\widetilde{\theta }), $$
(8)

with \(f_{t}(x)=e^{-tx}\). Because \(f_{t}\) is decreasing and convex, we directly derive that \(R_{t}\) is less than \(E\widetilde{\theta }\). Moreover, because an increase in t makes \(f_{t}\) more convex in the sense of Arrow-Pratt, this reduces the “certainty equivalent” \(R_{t}\) of \(\widetilde{\theta }.\) Weitzman concludes that the uncertainty on the rate of return of capital justifies using the smallest possible rate (i.e., the lower bound of the support of \(\widetilde{\theta }\)) to discount very distant cash flows. Gollier (2004) criticized this simple argument on the basis that there is no theoretical justification for using the expected net present value criterion when the interest rate is uncertain (Footnote 4). In this section, we explore a very stylized model that allows us to discuss Weitzman’s argument with a fully fledged economic model.
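Weitzman's certainty-equivalent rate in Eq. 8 is \(R_{t}=-t^{-1}\ln Ee^{-t\widetilde{\theta }}\). The sketch below evaluates it for a hypothetical two-point distribution (θ equal to 1% or 5% with equal probabilities), which is an assumption made purely for illustration; the function name is ours.

```python
import math

def weitzman_rate(t, scenarios):
    """R_t solving exp(-t*R_t) = E[exp(-t*theta)]; scenarios = [(prob, theta)], t > 0."""
    exponents = [math.log(p) - t * theta for p, theta in scenarios]
    top = max(exponents)  # log-sum-exp keeps long horizons computable
    log_expectation = top + math.log(sum(math.exp(e - top) for e in exponents))
    return -log_expectation / t

# Hypothetical distribution of the return of capital (assumed for this example).
THETA_SCENARIOS = [(0.5, 0.01), (0.5, 0.05)]
MEAN_THETA = sum(p * theta for p, theta in THETA_SCENARIOS)

for t in (1, 10, 50, 100, 500, 100000):
    print(f"t={t:>6}  R_t={100 * weitzman_rate(t, THETA_SCENARIOS):.2f}%  (mean theta = {100 * MEAN_THETA:.0f}%)")
```

The rate lies below the mean return at every horizon and drifts towards the lowest return in the support as t grows, which is the content of Weitzman's argument.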

For the sake of simplicity, let us depart from the infinite horizon approach that we used in this paper, and suppose that the representative investor has a finite lifetime [0,T]. Suppose also that his lifetime wealth is w, and that the return θ of wealth is revealed at date \(\varepsilon =0^{+}\). After observing θ, his consumption-saving problem can be solved under certainty:

$$ \max_{c}\text{ \ }\int\limits_{0}^{T}e^{-\delta t}u(c(t))dt\text{ \ s.t. \ } \int\limits_{0}^{T}e^{-\theta t}c(t)dt=w. $$

The solution of this problem is a function c(t,θ) that solves the first-order condition \(e^{-\delta t}u^{\prime }(c(t,\theta ))=\lambda (\theta )e^{-\theta t}\), or equivalently,

$$ e^{-\theta t}=\frac{e^{-\delta t}u^{\prime }(c(t,\theta ))}{u^{\prime }(c_{0}(\theta ))}, $$
(9)

with \(c_{0}(\theta )=c(0,\theta )\). Under constant relative risk aversion γ, the growth rate of consumption will be a constant g(θ) = (θ − δ)/γ. This means that the initial productivity shock on the growth rate of consumption is infinitely persistent in this model.

Now, consider a marginal investment project that yields a payoff \( B=e^{R_{t}t}\) at date t per euro invested at time 0. This investment decision must be made prior to the resolution of the uncertainty on \( \widetilde{\theta }\) (Footnote 5). This marginal project has no effect on the expected lifetime utility of the agent if and only if

$$ -Eu^{\prime }(c_{0}(\widetilde{\theta }))+e^{R_{t}t}e^{-\delta t}Eu^{\prime }(c(t,\widetilde{\theta }))=0. $$

This determines the socially efficient discount rate for maturity t. It is defined by

$$ e^{-R_{t}t}=\frac{Ee^{-\delta t}u^{\prime }(c(t,\widetilde{\theta }))}{ Eu^{\prime }(c_{0}(\widetilde{\theta }))}=\frac{Ee^{-\widetilde{\theta } t}u^{\prime }(c_{0}(\widetilde{\theta }))}{Eu^{\prime }(c_{0}(\widetilde{ \theta }))}, $$
(10)

where the second equality is obtained by using Eq. 9. This equation can be rewritten as \(f_{t}(R_{t})=\widehat{E}f_{t}(\widetilde{ \theta }),\) where \(f_{t}(x)=e^{-tx}\) and \(\widehat{E}\) is the risk-neutral expectation operator \(\widehat{E}f_{t}(\widetilde{\theta })=Ef_{t}( \widetilde{\theta })u^{\prime }(c_{0}(\widetilde{\theta }))/Eu^{\prime }(c_{0}(\widetilde{\theta })).\) The same argument as the one provided by Weitzman (1998, 2001) can thus be used, but with the important correction of using risk-neutral probabilities. We conclude that the term structure of the socially efficient discount rate in this environment is decreasing and that it converges to the lowest possible rate of return of capital.
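To see how the risk-neutral correction changes Weitzman's rate, one can work out \(c_{0}(\theta )\) under CRRA. With consumption growing at g(θ) = (θ − δ)/γ, the lifetime budget constraint gives \(c_{0}(\theta )=wk/(1-e^{-kT})\) with k = θ − g(θ); this closed form is our own derivation from the model above, not a formula stated in the paper. The parameter values (γ = 2, δ = 0, T = 100, w = 1, θ equal to 1% or 5% with equal probabilities) are hypothetical.

```python
import math

def initial_consumption(theta, gamma, delta, wealth, lifetime):
    """c_0(theta) from the budget constraint when consumption grows at (theta-delta)/gamma."""
    k = theta - (theta - delta) / gamma
    if abs(k) < 1e-12:
        return wealth / lifetime
    return wealth * k / (1.0 - math.exp(-k * lifetime))

def risk_neutral_probabilities(scenarios, gamma, delta, wealth, lifetime):
    """Reweight each probability by marginal utility u'(c_0(theta)) = c_0(theta)^(-gamma)."""
    weights = [p * initial_consumption(th, gamma, delta, wealth, lifetime) ** (-gamma)
               for p, th in scenarios]
    total = sum(weights)
    return [w / total for w in weights]

def certainty_equivalent_rate(t, thetas, probs):
    """R_t solving exp(-t*R_t) = sum_i probs_i * exp(-t*theta_i), for t > 0."""
    return -math.log(sum(q * math.exp(-th * t) for q, th in zip(probs, thetas))) / t

# Hypothetical calibration (assumptions for this example only).
GAMMA, DELTA, WEALTH, LIFETIME = 2.0, 0.0, 1.0, 100.0
SCENARIOS = [(0.5, 0.01), (0.5, 0.05)]
THETAS = [th for _, th in SCENARIOS]
PHYSICAL = [p for p, _ in SCENARIOS]
RISK_NEUTRAL = risk_neutral_probabilities(SCENARIOS, GAMMA, DELTA, WEALTH, LIFETIME)

for t in (1, 10, 50, 100):
    naive = certainty_equivalent_rate(t, THETAS, PHYSICAL)
    efficient = certainty_equivalent_rate(t, THETAS, RISK_NEUTRAL)
    print(f"t={t:4d}  expected-NPV rate={100 * naive:.2f}%  risk-neutral rate={100 * efficient:.2f}%")
```

In this calibration the low-return scenario receives a risk-neutral weight above its physical probability, because initial consumption is lower there, so the efficient rate is below the expected-NPV rate at every maturity while remaining decreasing in t.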

Observe that this model is not far from the one developed in Section 2.2 with parametric uncertainty. In this model, the uncertain parameter θ endogenously determines the growth of the economy. The assumed high persistence of the initial shock on capital returns is transmitted to the consumption process. As for the other models presented above, this persistence explains the decreasing nature of the term structure, which is estimated by Newell and Pizer (2003), Groom et al. (2007), and Gollier et al. (2008).

3 Conclusion

We have provided various arguments in favour of using a smaller discount rate for more distant maturities. The central argument is based on some persistence of shocks on aggregate consumption. This persistence magnifies the long-term risk relative to the short-term risk. If the representative agent is prudent, there is then a precautionary argument in favour of increasing the value of more distant benefits. This justifies a decreasing term structure for the discount rate. We examine various models exhibiting these features. Two of them are particularly compelling in the context of sustainable development. The first one relies on parametric uncertainty. If the growth process is sensitive to some parameter whose value is currently not perfectly known, the successive updated beliefs will exhibit persistence. This persistence due to learning should be treated just the same way as would a model with no learning, but with a persistence pattern in growth rates due to a drift over time. Simple but reasonable specifications of this model yield a term structure going from 3.5% in the short term to 1% in a millennium. The other model is based on a two-state regime-switching growth process. Because of the high persistence of Markov shocks, the term structure is also decreasing in this model. The intuition is that, when considering more distant time horizons, there is a cumulative effect of the uncertainty about the duration of the time spent in the bad state. Using an econometric estimation of this process on US data by Cecchetti et al. (2000), and assuming that we are currently in the good state, we obtained a term structure that goes from 4.3% to 3.4% when maturities go from zero to 100 years.