1 Introduction

What is the impact of international trade on environmental degradation? The urgency of this question is underscored by the robust growth of global trade: in the last decade alone, merchandise trade ballooned from $6 trillion in 1999 to over $15 trillion in 2010.Footnote 1

The literature on the subject divides into two opposing camps. According to the first hypothesis, trade allows an efficient allocation of resources, which reduces the pressure on scarce inputs and may facilitate the diffusion of cleaner, more efficient technologies. According to the second, market failures lead to imbalanced trade patterns in which scarce resources are overused, and pollution increases because the negative externalities of production are not internalized. In this framework, the production of pollution-intensive goods is first outsourced to countries with laxer environmental regulations, the so-called pollution havens, and the goods are then re-imported by countries with more demanding laws. Scholars have conjectured that the environmental Kuznets curve (EKC), the empirical finding that per-capita pollution follows an inverted-U trajectory as per-capita income grows, is the product of such an outsourcing effect. However, current empirical evidence, reviewed below, is inconclusive.

I evaluate the empirical support for each hypothesis in the case of carbon dioxide \((\hbox {CO}_2)\) emissions per capita, a potent pollutant and a source of climate change (Solomon et al. 2007). I argue that the reason for previous studies’ inconclusive results is that the extant literature suffers from omitted variable bias due to the lack of proper accounting of spatial correlation. If trade shifts production across countries, then it should be modeled as a mechanism of pollution diffusion. Trade does not by itself cause pollution, but it is a channel through which pollution is moved across borders.Footnote 2 Then, the appropriate econometric model specifies trade as a spatial weight that correlates carbon emissions between countries.

Examining per-capita \(\hbox {CO}_2\) trajectories, I find that the EKC is not robust to estimations that account for spatial effects. I find that trade significantly shifts pollution across countries, and that once I account for this, there is no evidence that per-capita \(\hbox {CO}_2\) emissions follow an inverted-U shaped trajectory. Instead, the relationship is linear, in contrast to previous studies. Hence, environmental gains made in industrialized countries appear to be illusory.

Next, I attempt to relate these gains to the development of pollution havens. First, I show that the effects of trade are even stronger when I only consider imports from non-OECD countries, suggesting a shift of pollution to developing countries. Then, I compare the carbon trajectories of developing and industrialized countries. If the efficiency argument holds, the pollution trajectories of developing countries should lie below those of industrialized countries. Under the pollution haven hypothesis, the reverse should be true. I test these predictions using a propensity score matching design that alleviates concerns about non-linearity and improves the comparability of developing and industrialized countries.

Empirically, I identify a second-mover disadvantage: developing countries emit more \(\hbox {CO}_2\) per capita than industrialized countries at similar stages of socio-economic development. The estimates suggest that developing countries emit between 0.29 and 0.7 tons of carbon dioxide per capita more than industrialized countries, holding everything else constant. This compares with a sample mean of roughly 1.1 tons. The existence of such excess carbon emissions, which are likely due to outsourced industries, provides further evidence that developing countries were the main targets for outsourcing.

These findings are critical because they highlight the limitations of domestic policymaking when the costs of trade are low. National policies have external consequences, a finding that mirrors debates on taxation policies, for instance (Plümper et al. 2009). Restrictive environmental regulations are clearly less effective if firms can avoid them by shifting their production across countries. This paper thus contributes to the larger literature on the weaknesses of states in the face of decreasing trade barriers and questions more recent optimistic findings on the positive effects of trade on the environment (Evans 1997; Prakash and Potoski 2006; Vogel 1995).

The debate surrounding trade and the environment is sensitive. Trade is an important determinant of public and private income. This is particularly true for developing countries, which often rely on trade to raise income but which also increasingly are major polluters (World Resources Institute 2010). Moreover, the trade regime is perceived as part of the solution to these problems: the pressure to use trade-based incentives to enforce good environmental governance is mounting (Nimubona 2012). For instance, the European Union has mentioned the possibility of imposing tariffs on countries deemed to emit too much greenhouse gas.Footnote 3 Whether trade in fact affects the environment is central for the legitimacy of these attempts.

The remainder of this paper is structured as follows. In the second section, I briefly review the relevant literature pertaining to the EKC, trade, and the environment. In the third section, I consider how spatial effects may affect empirical analyses of pollution trajectories. I present evidence that once trade effects are taken into account, the environmental gains made in industrialized countries disappear. I also show that countries that develop later pollute more than countries that industrialized in the past. The fourth section concludes.

2 Literature and Theory

While the debates raged over the ratification of the North American Free Trade Agreement (NAFTA) in the first half of the 1990s, a number of commentators worried about the possible side effects of trade on the environment. A 1993 article in the New York Times set the stage for these debates: “few dispute that Nafta has the potential to affect the environment in the United States, indeed the hemisphere – for good as well as ill” (emphasis added).Footnote 4 Indeed, the decade witnessed increasing concerns about the possible unintended consequences of globalization in general, and free trade in particular.

Grossman and Krueger’s (1995) influential study shed new light on this debate. They argue that environmental deterioration increases at low levels of income until a tipping point is reached, after which pollution diminishes as countries grow wealthier. Noticing the analogy to Kuznets’s study on income inequality, scholars have since referred to this inverted-U shaped trajectory as the EKC. If trade leads to growth, encouraging trade will help states reach the tipping point and abate pollution in the long run.

Empirically, evidence in support of the EKC has been found for various pollutants (Grossman and Krueger 1991; Grossman and Krueger 1995; Kleemann and Abdulai 2013; Selden and Song 1995), including carbon dioxide (at least for wealthy countries; see Musolesi et al. 2010). However, these findings have been criticized. Harbaugh et al. (2002) point out the sensitivity of these studies to model specification, data quality, and definition of the dependent variable (e.g., what type of pollutants are considered). Another source of criticism is that many of these studies rely on cross-sectional data, which ignores the dynamics and heterogeneity of each state’s pollution trajectory.

Furthermore, some have suggested that even if the inverted-U relationship holds for industrialized countries, this may be an artifact of some trade-related effect. The trade-environment nexus is complex, and scholarship is divided on the effects of the former on the latter (Beladi and Oladi 2011). The trade literature generally posits that free trade is welfare- and efficiency-increasing. In a simple Heckscher-Ohlin framework, trade is determined by the relative endowment in factors of production. As trade barriers vanish, the economic activity in a given country shifts to the production of goods requiring the factor that is relatively abundant (Leamer 1995). Hence, the efficiency hypothesis posits that trade alleviates the pressure on the relatively scarce resource. If this resource is a natural resource, such as land, then trade weakens the demand on nature.

In addition, trade favors output growth as well as the spread of clean technologies. Richer societies place a higher premium on a clean environment, and their governments are more capable of implementing potent environmental regulations (Dasgupta et al. 2002). Copeland and Taylor (1994) suggest that income may endogenously lead to stricter environmental laws. Demand for clean energy in conjunction with high investment rates may lead to a virtuous feedback loop: growth leads to better institutions and technology, which reinforce the demand for a clean environment (Ayres 1993; Lomborg 2001; Simon 1998). Efficiency-enhancing technologies in particular may play a key role, since they can help newcomers avoid costly investments (Lanjouw and Mody 1996). These claims have been highly publicized in the debates on free trade agreements (Stern 2004).

The strength of the efficiency hypothesis relies on the soundness of a number of assumptions. The most critical is that prices reflect the scarcity of resources (Ekins et al. 1994). If goods are correctly priced, then prices will rise at a socially optimal rate as resources become scarce, ensuring an efficient allocation of resources. However, this does not hold if the production and consumption of these goods involve externalities or public good aspects (Hardin 1968).

Concerns have emerged that trade liberalization might have a disruptive effect in the presence of market failures. Chichilnisky (1994) argues that countries specialize in the extraction of resources whose property rights are ill defined, even in the absence of a comparative advantage in doing so. The key insight is that the marginal cost of extraction does not reflect the true value of these resources as assets. According to Chichilnisky, this explains why developing countries tend to overuse their natural resources, making themselves poorer in the process.

Furthermore, increased market competition due to trade may lead to weaker environmental policies. Trade liberalization exposes and exacerbates institutional failures: countries that want to attract foreign investors may voluntarily weaken their environmental regulations (Cole 2004). The political incentives to do so may be powerful. In wealthy industrialized countries, citizens may ask for a higher provision of green public goods (Franzen and Meyer 2010). If they live under democratic institutions, their policymakers are likely to respond by imposing more stringent environmental regulations (Bernauer and Koubi 2009). The incentives are radically different in developing countries. There, individuals may place a premium on economic growth over environmental quality. Policymakers, whether authoritarian or democratic, have an incentive to oblige, at least to the extent that it keeps them in power. They are then more likely to impose weak environmental demands on businesses.

A related argument suggests that poorer countries also have lower legislative capacity. These states do not possess the resources to design, implement, and monitor compliance with tough environmental laws (Haas 1989). The governments of such countries are more likely to be corrupt and captured by the interests of powerful firms (Bardhan 1997; Olson 1965). In these cases, even if citizens actually do wish to receive more green public goods, their hopes are unlikely to be met by regulators.

At any rate, when environmental laws are lax, producers will not internalize the costs of the negative externalities of their business. This in turn will create an incentive for firms to move their production to locations that impose low environmental demands, assuming that other production costs are held constant or lower. Such an incentive is strengthened when international trade is cheap. Typically, the multiple rounds of tariff reductions under the WTO have provided ideal conditions for shifting production to other countries. This effect is generally referred to as the pollution haven hypothesis, or sometimes the outsourcing argument.

The evidence for trade-related effects (whether positive or negative) on pollution is mixed. Early research by Leonard (1988) and Tobey (1990) fails to find support for a weakening of environmental regulations in wealthy countries. Frankel and Rose (2005) find no evidence that trade has a detrimental effect on the environment. However, these studies use cross-sectional data and include trade openness as their main independent variable. Furthermore, using an instrumental variable approach with samples as small as 30 observations raises doubts about the strength of their findings, as the small-sample properties of IV models are poor. Grether and De Melo (2003) consider a similar question, but find some evidence for trade-related outsourcing of dirty production. Cole’s (2004) study is similar to the analysis performed in this paper; he finds a trade-related effect for some pollutants, but this effect does not eliminate the EKC. His study, however, focuses entirely on industrialized countries over a relatively short time span, losing much of the dynamics of the data. Peters et al. (2011) provide evidence that trade may account for industrialized countries’ ability to stabilize their emissions. The limits of their study are the relatively short time period of the data (1990–2008) and the need to impute most of the dataset except for three reference years. Kleemann and Abdulai (2013) use panel data and find evidence in support of the pollution haven hypothesis for energy use per capita and adjusted net savings.

We are left with a puzzle. By most scholarly accounts, industrialized countries have witnessed an improvement of the quality of their environment, as the EKC predicted. The improvement of environmental quality in industrialized countries is consistent with a more efficient use of resources, as the efficiency argument would predict. Skeptics argue that the gains made in wealthy countries were achieved at the cost of higher environmental degradation in developing countries. Yet, the evidence in favor of either argument has remained elusive. I argue below that the reason for the ongoing uncertainty lies in a mismatch between the empirical test of these hypotheses and their theoretical implications. I contribute to the literature by using an alternative empirical analysis that relies on the spatial effects that trade may have on the environment.

3 Empirical Analysis

3.1 Pollution and Spatial Effects of Trade

To identify and estimate the effects of trade, I conceptualize trade as a mechanism of diffusion, that is, an intermediating factor that conditions how outcomes in one country affect outcomes in other countries (Neumayer and Plümper 2010; Simmons and Elkins 2004). Research on the global diffusion of policies, institutions, and outcomes has burgeoned in environmental studies (Bechtel and Tosun 2009; Busch and Jörgens 2005; Neumayer and Plümper 2010). The core idea of the diffusion argument is that outcomes across space are not independent, but may influence each other. For instance, pollution in Mexico and in the US might be correlated. In turn, this implies that econometric analyses should model trade as a spatial weight relating carbon emissions across countries.

None of the theories presented above imply that trade in itself is a determinant of pollution. Trade favors efficiency or encourages pollution, but does not on its own determine pollution. Thus, trade is a conditional factor that makes environmental problems more or less acute, depending on whether the efficiency or the outsourcing effects dominate.

Theoretically, trade should therefore be modeled as the mechanism through which pollution is diffused. This is not just a semantic question, but has serious implications for any econometric analysis. Both the efficiency and the pollution haven hypotheses predict that there is some relationship between outcomes in one country and outcomes in other countries. The efficiency argument suggests that as the environment improves in country \(i\), it does not worsen in other countries by more than what was gained in country \(i\). The pollution haven argument, on the other hand, implies that as one country improves its environmental record, some other country will worsen its own environmental performance. Hence, environmental outcomes are spatially correlated. This intuition has largely been ignored by the literature.

This simple yet central insight is of major importance. In the presence of correlation of the error term across units (countries), an OLS estimation of the parameters of a linear model of the form \(y_{i,t} = {\mathbf {X}}_{i,t}' \beta + \varepsilon _{i,t}\) will be biased. Recent development of models that account for correlation of outcomes across units solves these issues.

3.2 Spatial Diffusion: Empirical Strategy

Spatial econometrics recognizes that units might be correlated across space (Anselin 1988; Franzese and Hays 2008; Hays et al. 2010). Outcomes in unit \(i\) may be causally related to outcomes in unit \(j\), and vice versa. Both the efficiency and the outsourcing hypotheses posit that a change of carbon emissions in one country may have external consequences on emissions in another country. According to the outsourcing theory, a decrease in pollution in a wealthy country is associated with an increase of pollution in another, presumably poorer, country. The efficiency hypothesis predicts that a decrease in carbon emissions in a given country will be accompanied by a decrease of pollution in other countries, leading to net efficiency gains.

The simplest spatial model is the spatial OLS (or S-OLS) model and is defined as follows:

$$\begin{aligned} \text{ CO }_2\,\text{ per } \text{ capita }_{i,t} = \mu + \rho {\mathbf {W}}_{i,j,t} \text{ CO }_2\,\text{ per } \text{ capita }_{j,t} + {\mathbf {X}}_{i,t}' \beta + \phi _i + \theta _t + \varepsilon _{i,t}, \end{aligned}$$
(1)

where \({\mathbf {W}}\) is the connectivity matrix that relates outcomes in \(j\) to outcomes in \(i\). The vector \({\mathbf {X}}\) includes income per capita and its squared term, and democracy. In addition, some models include population, the relative size of the industrial sector, population density, and oil prices. All models have country fixed effects, and all include either a time trend or year fixed effects. The parameter \(\rho \) is to be estimated. This parameter indicates the degree to which our units experience spatial interdependence.

The connectivity matrix \({\mathbf {W}}\) is a square matrix whose diagonal entries are 0, reflecting that units have no connection to themselves. Entry \(w_{i,j,t}\) reflects the degree to which \(i\) and \(j\) are related at time \(t\). Notice that \(w_{i,j,t}\) does not have to be identical to \(w_{j,i,t}\). Here, the connectivity matrix is measured as the imports by country \(i\) from country \(j\) in year \(t\). The rationale for using imports is best illustrated as follows. Let \(i\) be the United States, and \(j\) Mexico (and let us ignore the year for the time being). Let us suppose that the United States now outsources some polluting factory to Mexico, and then imports the goods produced by this factory. This would translate to (i) an increase in imports from Mexico, and (ii) an increase in per-capita carbon emissions in Mexico, both of which would lead to a reduction in per-capita carbon emissions in the United States. The interactive effect of imports and foreign production leads us to expect \(\rho \) to be negative: an increase in \(w_{US,M}\) and in \(\hbox {CO}_2\) per capita in Mexico would lead to lower \(\hbox {CO}_2\) per capita values in the United States.
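The construction of the trade-weighted foreign emissions term for a single year can be illustrated with a small numerical sketch. The three countries, import values, and emission figures below are invented for illustration and are not drawn from the data used in this paper; `spatial_lag` is a hypothetical helper name.

```python
import numpy as np

def spatial_lag(imports, emissions):
    """Return sum_j w_ij * y_j for each importer i in a single year.

    imports[i, j] holds imports by country i from country j;
    emissions[j] holds per-capita CO2 emissions of country j.
    """
    W = np.asarray(imports, dtype=float).copy()
    np.fill_diagonal(W, 0.0)  # a country has no connection to itself
    return W @ np.asarray(emissions, dtype=float)

# Hypothetical values (not from the paper): rows are importers, columns exporters.
imports = np.array([
    [0.0, 5.0, 1.0],
    [2.0, 0.0, 0.5],
    [0.2, 0.3, 0.0],
])
emissions = np.array([5.0, 1.0, 0.5])

lag = spatial_lag(imports, emissions)
```

Note that the weights are not symmetric (`imports[0, 1]` differs from `imports[1, 0]`), in line with the observation that \(w_{i,j,t}\) need not equal \(w_{j,i,t}\).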

As noted by Plümper and Neumayer (2009), the spatial methodology literature generally assumes that this connectivity matrix ought to be row-normalized (i.e., each row should sum to 1). The rationale is that the influence of each outside unit should be a relative fraction of its ties to the main unit. There are three main reasons for row-normalizing the spatial matrix. First, it bounds the absolute value of the spatial autoregressive coefficient \(\rho \) between 0 and 1 (LeSage and Pace 2009, 15). Second, the asymptotic properties of the estimates from a spatial model depend on the features of the weight matrix (Elhorst 2003, 249; Elhorst 2012, 8). Third, the vulnerability of a unit such as a country to external variables is often relative to the remaining units. However, as Plümper and Neumayer (2009) argue, this is a matter of theory. Clearly, as the example in the previous paragraph shows, row-normalizing is not necessarily appropriate because the theoretical argument relates the levels of carbon dioxide in \(i\) to those in \(j\). At the same time, the technical reasons in favor of row-normalization cannot be ignored. I report the estimates of both models with row-normalized and with non-row-normalized spatial matrices; I find that my conclusions are not affected by either approach.
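Row-normalization itself is a simple rescaling, sketched below under the assumption that rows with no recorded imports are left at zero (a choice made here for illustration; the paper does not state how such rows are handled). The matrix values and the name `row_normalize` are hypothetical.

```python
import numpy as np

def row_normalize(W):
    """Divide each row by its sum so rows sum to 1; all-zero rows stay zero."""
    W = np.asarray(W, dtype=float)
    row_sums = W.sum(axis=1, keepdims=True)
    # Avoid division by zero for countries without any recorded imports.
    safe = np.where(row_sums == 0.0, 1.0, row_sums)
    return W / safe

# Hypothetical import matrix (not from the paper).
W_raw = np.array([
    [0.0, 5.0, 1.0],
    [2.0, 0.0, 0.5],
    [0.0, 0.0, 0.0],  # a country with no recorded imports
])
W_norm = row_normalize(W_raw)
```

After normalization, the spatial term becomes a weighted average of partners' emissions rather than a sum scaled by trade volumes, which is the trade-off between theoretical fit and technical convenience described above.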

In addition to these robustness tests, I also replicate the main results using a time-invariant spatial matrix. Drawing on the literature on trade gravity (Deardorff 1998), I use the inverse of the distance between two countries to measure their spatial connectivity. This enables me to use a time-invariant spatial matrix, thus avoiding some of the issues that arise with time-varying matrices (Elhorst 2012). For all practical purposes, the main results are similar to the ones reported below: the concave trajectory of per-capita carbon emissions is not robust to controlling for spatial effects. The results are reported in Table A18.

Technically, by introducing \(y_{j,t}\) on the right-hand side of Eq. 1, we face an endogeneity problem (Franzese and Hays 2008). To avoid this issue, I use a spatial two-stage least squares (S-2SLS) approach (Kelejian and Prucha 1998). The instrument for \({\mathbf {W}}y_{j,t}\) is readily found in \({\mathbf {W}}{\mathbf {X}}_{j,t}\). Remember that \({\mathbf {X}}\) is a vector containing income per capita, its squared term, and the level of democracy. Therefore, I instrument trade-weighted foreign carbon emissions with foreign trade-weighted income per capita.

These instruments must satisfy the usual criteria for 2SLS estimation and cross-spatial endogeneity, namely that the instruments must be causally related to the instrumented variable, but uncorrelated with the error term of the second stage of the estimation (Franzese and Hays 2008; Murray 2006). The correlation between foreign output and foreign carbon emissions is undisputed. The exclusion restriction is more subtle. It requires that foreign income has no effect on domestic carbon emissions, except through trade. This seems plausible, since any effect a change in foreign demand may have on domestic carbon emissions must be somehow reflected in a change in trade patterns. The results of the first stage estimation are reported in the Appendix (Table A12) [(W \(\ldots \) capita)]. I also report the results of the reduced form model [\(({\mathbf {W}}_{i,j,t} \cdot \text{ GDP } \text{ per } \text{ capita }_{j,t})\) instead of \(({\mathbf {W}}_{i,j,t} \cdot {\hbox {CO}_2}\text{ per } \text{ capita }_{j,t})\)] in Table A15. This avoids the problem of complicated feedback loops, whereby pollution in country \(i\) affects pollution in \(j\), which then affects \(i\) in turn. The results remain unchanged from the analysis presented below.

An alternative solution to the endogeneity problem is to lag \(y_{j,t}\) (Beck et al. 2006). The main argument against this approach is that it is only valid if our estimates do not suffer from serial correlation. Furthermore, I expect the effects of trade to be contemporaneous, which renders this approach inadequate. However, I show in the Appendix that the results remain the same if one employs the lagged approach and various variants thereof (Table A13).

The first and second stage equations to be estimated are given by Eq. 2:

$$\begin{aligned} {\mathbf {W}}_{i,j,t} \hbox {CO}_2 \,\,\text{ per } \text{ capita }_{j,t}= & {} \alpha + {\mathbf {W}}_{i,j,t} {\mathbf {X}}_{j,t} + \nu _{j,t} \nonumber \\ \hbox {CO}_2\,\, \text{ per } \text{ capita }_{i,t}= & {} \mu + \rho \widehat{{\mathbf {W}}_{i,j,t} \hbox {CO}_2 \,\,\text{ per } \text{ capita }_{j,t}} + {\mathbf {X}}_{i,t}' \beta + \phi _i +\theta _{t} + \varepsilon _{i,t} \end{aligned}$$
(2)
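The logic of the two stages can be sketched on simulated data. The example below is a generic 2SLS illustration and not the estimation code used for this paper: it omits the country and year fixed effects, uses a single instrument, and draws all variables from invented distributions. The names `ols` and `two_stage_least_squares`, the true coefficient of \(-0.5\), and the error structure are assumptions made for the illustration.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients of y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def two_stage_least_squares(y, x_exog, endog, instrument):
    """Manual 2SLS: coefficients on [constant, x_exog, endog]."""
    const = np.ones_like(y)
    # First stage: endogenous regressor on exogenous controls and instrument.
    Z = np.column_stack([const, x_exog, instrument])
    endog_hat = Z @ ols(Z, endog)
    # Second stage: outcome on controls and first-stage fitted values.
    X_hat = np.column_stack([const, x_exog, endog_hat])
    return ols(X_hat, y)

# Simulated data (all values hypothetical).
rng = np.random.default_rng(0)
n = 5000
true_rho = -0.5
x = rng.normal(size=n)                  # stand-in for domestic income
z = rng.normal(size=n)                  # stand-in for trade-weighted foreign income
u = rng.normal(size=n)
e = 0.8 * u + 0.6 * rng.normal(size=n)  # error correlated with u: endogeneity
wy = 1.0 * z + u                        # stand-in for trade-weighted foreign emissions
y = 1.0 + true_rho * wy + 0.3 * x + e

rho_ols = ols(np.column_stack([np.ones(n), x, wy]), y)[2]
rho_2sls = two_stage_least_squares(y, x, wy, z)[2]
```

In this simulation the OLS coefficient on the endogenous term is pulled away from the true value, while the instrumented estimate stays close to it. The sketch returns point estimates only: standard errors taken from a second-stage regression on fitted values are not valid, so a dedicated IV routine should be used for inference.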

There exists a wide range of other spatial panel data models. The main model estimated in this article is commonly referred to as the spatial autoregressive model (LeSage and Pace 2009, 32; Elhorst 2003). The motivation for using the spatial autoregressive model is that the theory suggests that domestic carbon emissions should be correlated with foreign emissions, but not with foreign output. This plausibly allows me to constrain the effect of \({\mathbf {W}}x_{j,t}\) to be 0. However, this is but one of a range of possible spatial models. Alternatives include the spatial error model (whereby the error terms are spatially correlated) or the spatial Durbin model, where \(y_{i,t}\) is correlated with both \(y_{j,t}\) and \(x_{j,t}\) via the spatial weight matrix \({\mathbf {W}}\). In robustness tests, I reestimated similar models using the spatial Durbin approach. The substantive results remain unchanged (Table A17).

Estimates may be sensitive to specification. In the Appendix (Section A5), I report the results of a specification test using sulfur dioxide emissions instead of carbon dioxide. The rationale is that we know ex ante that sulfur emissions have largely been phased out across the world. This implies that most countries should exhibit an inverted-U shaped trajectory with respect to sulfur and income. Thus, even when accounting for spatial effects, one would expect income to have a positive effect and its squared term a negative one. Indeed, I find that this is the case (Table A16).

3.3 Data and Variables

To estimate the model presented above, I build a dataset that contains 151 countries over the 1950–2000 period, where variables vary by country-year. On average, each country is observed for 39 years. All variables used in this analysis are reported in Table 1, and their sources are listed in the Appendix (Section A2).

Table 1 Summary statistics of the variables used in the main estimates

The main dependent variable used in this paper is per-capita carbon dioxide emissions, obtained from Boden et al. (2010). Carbon dioxide is a major greenhouse gas (Solomon et al. 2007). As such, carbon emissions capture a type of environmental degradation. Furthermore, as they are correlated with other polluting activities, they are a useful proxy for a variety of pollutants that could be affected by some trade-related diffusion. Using per-capita data allows me to consider the elasticity of change of the polluting structure in a country with respect to changes in the covariates. The variable is measured in metric tons of carbon per inhabitant. The mean is 1.1 metric tons of carbon dioxide emissions per capita.

The bilateral trade data comes from Gleditsch (2002) and is measured in millions of US dollars. Following the spatial approach, I interact imports by country \(i\) from a foreign country \(j\) with its per-capita carbon emissions, then I sum up this number for each importing country over all its trade partners. To obtain my instrument, I then do the same using the foreign country’s income per capita instead of its emissions. Income data comes from Maddison (2007).

I control for a range of potential confounding factors. First, in line with the EKC argument, I include income per capita and its squared term. Next, I control for the country’s level of democracy using data from the Polity project (Marshall 2010). This builds upon a rich literature that suggests that democracies may be more inclined to protect their environment (Neumayer 2002). I also control for the structure of a country’s economy. I include the relative size of a country’s industrial and service sectors (excluding the agricultural sector), its population, and its population density. This data comes from the World Bank’s World Development Indicators. Finally, I control for OPEC membership, since countries that possess their own oil resources may have incentives to provide them cheaply to their own population. This may lead them to emit more per capita.

3.4 Results

Models (1) and (2) in Table 2 are the naive estimates, where (1) includes a linear time trend and (2) includes year fixed effects. The results replicate the EKC findings, since income has a positive effect while its squared term is negative. This reflects the inverted-U relationship between the two variables. Here, the maximum expected level of carbon emissions occurs when income reaches $32,000 per capita.
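The income level at which a fitted trajectory peaks follows directly from the quadratic specification. Writing \(\beta _1\) and \(\beta _2\) for the coefficients on income per capita and its squared term (notation introduced here for illustration), and holding all other terms fixed, the first-order condition is:

$$\begin{aligned} \frac{\partial \, \hbox {CO}_2\,\text{ per } \text{ capita }_{i,t}}{\partial \, \text{ income }_{i,t}} = \beta _1 + 2 \beta _2 \, \text{ income }_{i,t} = 0 \quad \Longrightarrow \quad \text{ income }^{*} = -\frac{\beta _1}{2 \beta _2}. \end{aligned}$$

This point is a maximum only when \(\beta _1 > 0\) and \(\beta _2 < 0\); when the squared term is positive or indistinguishable from zero, no interior peak exists and the trajectory is increasing in income.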

Table 2 Spatial model of carbon dioxide emissions. For 2-SLS models, the estimates are those from the second stage

As I claim above, Models (1) and (2) are misspecified and ought to control for spatial correlation. Table 2 contains two further sets of models. Models (3) to (6) contain the estimates using a spatial weight matrix \({\mathbf {W}}\) that is not row-normalized. Model (3) is a simple spatial OLS model; Models (4) to (6) are estimated with an instrumental variable approach, and differ in their exact specification. Next, Models (7) to (9) replicate the same specifications, but this time \({\mathbf {W}}\) is row-normalized.

There are two main findings. First, the spatial term, \(\hat{\rho }\), is always significantly negative. The interpretation is that as both imports from \(j\) to \(i\) and carbon emissions per capita in \(j\) grow, carbon emissions per capita in \(i\) decrease. This is exactly what the outsourcing approach predicts: as factories are closed in wealthy countries while they are being rebuilt elsewhere, and as imports from the new location increase, we observe a decrease in pollution in the home country. This result establishes strong evidence in favor of the spatial correlation between countries in terms of pollution through imports as the diffusion mechanism. In other words, pollution in \(i\) is essentially dependent on pollution in \(j\). While this basic idea is at the root of the environmental studies literature, this paper provides empirical support for such an effect.

What is more interesting is to see the confounding effects that including the spatial \(\hbox {CO}_2\) variable has on the EKC. In particular, I find that the squared term of per capita income becomes positive and in some specifications significantly so. The only exception to this pattern comes from Models (8) and (9); but the maximum level of pollution suggested by these estimates occurs at an income level of about $87,000, an absurdly high level. For all practical purposes, per-capita carbon emissions grow linearly with per-capita GDP. This suggests that including a spatial term in the canonical model of carbon emissions renders the relationship between income and pollution linear at the very least.

It is important to evaluate the substantive effects of the estimates. This is particularly arduous: point estimates are difficult to interpret directly because of the existence of numerous feedback loops, whereby one country’s carbon emissions influence another’s, which then affect the original country (LeSage and Pace 2009). (Since \(|\hat{\rho }| < 1\), this effect decays gradually.) The nonlinearity of the model and its very large spatial weight matrix render the computation of marginal effects extremely complex and the information they contain difficult to communicate. In addition, since the outsourcing argument implies that both imports and \(\hbox {CO}_2\) per capita change together at the same time, it is unclear how the feedback loop from the spatial lag model will operate. I overcome these problems in the following ways.

First, drawing on the procedure documented in LeSage and Pace (2009, 34) (and its Matlab code), I compute both the direct and indirect effects from the spatial autoregressive model as well as from the corresponding Durbin model (see also Franzese and Hays 2007). The results are reported in Table A16 and summarized here. In the former, the direct effect of GDP per capita is \(0.161\) (\(t=18.17,\, p<0.001\)) and that of its squared term is \(0.001\) (\(t=1.55,\, p=0.122\)). The indirect effect is \(0.037\) (\(t=17.85,\, p<0.001\)) for GDP per capita, and \(0.0003\) (\(t=1.55,\, p=0.122\)) for GDP per capita squared. The total effects then are \(0.199\) (\(t=18.16,\, p<0.001\)) for GDP per capita and \(0.002\) (\(t=1.55,\, p=0.12\)) for GDP per capita squared. In the Durbin model, the direct effect is \(0.168\) (\(t=2.6,\, p=0.009\)) for GDP per capita and \(0.001\) (\(t=0.14,\, p=0.89\)) for its squared term. The indirect effect is \(-0.059\) (\(t=-2.98,\, p=0.003\)) for GDP per capita and \(0.0005\) (\(t=0.02,\, p=0.99\)) for GDP per capita squared. Finally, the total effect is \(0.11\) (\(t=1.3,\, p=0.19\)) for GDP per capita and \(0.002\) (\(t=0.6,\, p=0.94\)) for its squared term. In sum, the effect of income never exhibits a concave pattern; the effect of income on carbon emissions is positive and mostly linear. This confirms the earlier findings.
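To make the direct/indirect decomposition concrete, the following is a minimal sketch of the standard summary measures for a spatial autoregressive model, in which the matrix of partial derivatives for one covariate is \(({\mathbf {I}} - \rho {\mathbf {W}})^{-1}\beta\). It is written in Python rather than the Matlab code mentioned above. The function name `sar_effects`, the three-country weight matrix, and the values of `rho` and `beta` are hypothetical and are not the paper's estimates. The simulation step used to obtain the reported \(t\) statistics is not reproduced.

```python
import numpy as np

def sar_effects(W, rho, beta):
    """Average direct, indirect, and total effects of one covariate in a SAR model.

    Model (assumed form): y = rho * W y + x * beta + e,
    so the n x n matrix of partial derivatives dy/dx' is (I - rho W)^{-1} * beta.
    """
    n = W.shape[0]
    S = np.linalg.inv(np.eye(n) - rho * W) * beta
    direct = np.trace(S) / n      # mean own-country effect (diagonal entries)
    total = S.sum() / n           # mean row sum: own effect plus all spillovers
    indirect = total - direct     # mean spillover, including feedback loops
    return direct, indirect, total

# Hypothetical row-normalized weights for three countries (illustration only)
W = np.array([[0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5],
              [0.5, 0.5, 0.0]])

direct, indirect, total = sar_effects(W, rho=-0.2, beta=1.0)
print(direct, indirect, total)
```

When the weight matrix is row-normalized, the average total effect equals \(\beta / (1 - \rho )\), which offers a quick check on the computation.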

I compare the estimated carbon trajectories when using the naive estimates (which, as we saw, suggest the presence of an EKC) and when using estimates from the spatial model. The trajectories are plotted in Fig. 1. In this illustration, the intercept is set to zero, which is of course not representative. Indeed, based on the estimates, the expected level of carbon emissions per capita when all covariates are set to zero ranges between 6 and 10 tons. Nonetheless, the important finding concerns the difference in expected \(\hbox {CO}_2\) emissions based on the inclusion—or not—of spatial effects. The divergence in the carbon trajectories appears to become statistically significant when income reaches a level of about $20,000 per capita. The expected difference reaches about 3 tons of \(\hbox {CO}_2\) per capita when per-capita income is about $40,000. This discrepancy represents more than one standard deviation.

Fig. 1 Estimated carbon trajectories. The estimated trajectory with trade effects is based on the estimates from Table 2, Model (5). The estimated trajectory without trade effects is based on the estimates from Table 2, Model (1). All other covariates are set to zero, including the country-specific intercept

I also estimate an additional spatial model, namely the reduced-form version of the S-2SLS model. That is, instead of including foreign carbon emissions on the right-hand side \(({\mathbf {Wy}})\), I estimate the effect of foreign economic output \(({\mathbf {WX}})\). This eliminates feedback loops. I find that the point estimate for GDP per capita is 0.155 and that for its squared term is 0.002; both terms are highly significant. Again, this model finds no evidence for a concave effect of income. An advantage of this model is that one can more easily estimate the effects of foreign shocks. In the politically sensitive context of NAFTA, I simulate the combined effect of an increase of income in Mexico and an increase of US imports from that country. I find that an increase in Mexican GDP per capita of $1,000, combined with an increase in US imports of $1 million, reduces US carbon emissions per capita by 0.03 tons.

In summary, I find that the EKC is a fragile finding that is not robust to controlling for spatial correlation. This finding provides support for the argument that the net environmental effect of trade is detrimental. Trade has neither diffused clean technologies nor incentivized a more efficient use of resources enough to compensate for the creation and development of pollution havens, at least for carbon dioxide. Of course, these results are not a cost-benefit analysis and ignore other benefits of trade, but they do support the argument that the overall effect has been adverse for the environment.

3.5 Inter- and Intra-industry Trade

The empirical evidence uncovered above does not directly imply that the outcome is due to the outsourcing of \(\hbox {CO}_2\)-intensive industries to pollution havens. For instance, the implications of inter- versus intra-industry trade are very different. Inter-industry trade is more likely to promote outsourcing whereas intra-industry trade is possibly more conducive to efficiency gains.Footnote 5 Furthermore, it is also more likely that intra-industry trade leads to the spread of superior and possibly clean technologies, especially when it takes place among industrialized countries.

To parse out the difference between trade among homogeneous countries and trade among heterogeneous countries, I proceed in three steps. First, I reestimate the model in Table 2. Instead of examining the effect of all imports, regardless of their origins, I limit the spatial effect to imports coming from non-OECD countries. If the pollution haven hypothesis is correct, the average effect of imports from developing countries should be at least as strong as the average effect of imports from any country. Thus, I only consider how imports from non-OECD countries affect carbon emissions. The results are reported in Table 3. As in Table 2, the first two columns show the naive models. The third column reports the estimates from a spatial OLS model, while the next three columns use spatial 2-stage least squares to estimate the parameters of the model. The last three models report the same estimates, this time using a row-normalized spatial matrix.
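The restriction of the spatial weights to non-OECD exporters, and the optional row-normalization, can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: the function name `restrict_and_normalize`, the layout of the import matrix (rows are importers \(i\), columns are exporters \(j\)), and all numbers are hypothetical.

```python
import numpy as np

def restrict_and_normalize(imports, is_oecd, row_normalize=True):
    """Build spatial weights from imports of i (rows) from j (columns),
    keeping only non-OECD exporters."""
    W = np.asarray(imports, dtype=float).copy()
    np.fill_diagonal(W, 0.0)                        # a country does not import from itself
    W[:, np.asarray(is_oecd, dtype=bool)] = 0.0     # drop imports from OECD exporters
    if row_normalize:
        row_sums = W.sum(axis=1, keepdims=True)
        # Rows with no non-OECD imports stay at zero instead of dividing by zero
        W = np.divide(W, row_sums, out=np.zeros_like(W), where=row_sums > 0)
    return W

# Hypothetical bilateral imports for three countries; country 0 is an OECD member
imports = np.array([[0, 10, 30],
                    [5, 0, 15],
                    [20, 20, 0]])
is_oecd = [True, False, False]

W = restrict_and_normalize(imports, is_oecd)
co2 = np.array([10.0, 2.0, 4.0])     # hypothetical CO2 per capita
spatial_lag = W @ co2                # import-weighted foreign emissions (the Wy term)
print(spatial_lag)
```

With row-normalization, each country's spatial lag is a weighted average of its non-OECD partners' emissions; without it, the lag also scales with the volume of imports.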

Table 3 Spatial model of carbon dioxide emissions. For 2-SLS models, the estimates are those from the second stage

The results are very similar to those in Table 2. The EKC disappears once the spatial term is included. Furthermore, the slope of the spatial term is steeper. This implies that carbon emissions in the home country are more reactive to changes in trade with developing countries. This makes sense in the context of the pollution haven hypothesis: whereas intra-industry trade, which mainly takes place in the North, has little effect on pollution, inter-industry trade is more likely to affect pollution patterns.

Next, I compare the carbon trajectories of countries that are currently considered to be developing with those of countries that are further along in their economic development. Developing countries are more likely to be targets for outsourced, pollution-intensive industries. If the carbon trajectory of developing countries lies above the average trajectory of industrialized countries, then this may be indicative of a carbon burden coming from outsourced industries. This excess of carbon dioxide adds to the level of emissions that would have obtained in the absence of trade. This would indicate the existence of an environmental second mover disadvantage: countries that develop later do not benefit from superior technology and must carry the industries that are undesired in industrialized countries. If, however, the trajectory lies below that of industrialized countries, then developing countries may in fact be benefiting from developing later in time by taking advantage of new and cleaner production technology.

A simple test for this idea is to separate countries into those that are already industrialized and those that are currently developing. The difference in per-capita carbon emissions between these two groups, holding everything else constant, can then be interpreted as an estimate of the net effect of trade. In turn, this gives us a first approximation of the size of the inter-industry trade effect. The regression equation is:

$$\begin{aligned} \hbox {CO}_2 \,\text{per capita}_{i,t}= & {} \tau \,\text{Developing Country}_i + \lambda \,\text{GDP per capita}_{i,t} + \phi \,\text{GDP per capita}_{i,t}^{2} \nonumber \\&+\, \psi \,\text{Total Imports per capita}_{i,t} + {\mathbf {X}}_{i,t}' \beta + \gamma _k + \delta _t + \varepsilon _{i,t} \end{aligned}$$
(3)

Empirically, I define a developing country as one that was not an original member of the OECD. Table A6 in the Appendix reports the estimates when countries that recently reached high income levels, such as South Korea, are included in the industrialized group; the results do not change noticeably.

Obviously, only a few countries have reached high levels of income per capita. This raises the issue of comparing very wealthy countries to poorer ones; in other words, the lack of overlap on important covariates may bias the estimates, especially in the presence of nonlinearity. To reduce these concerns, I limit the sample to comparable cases, that is, to those cases that have similar values on the main independent variables, in particular income. I use a propensity score matching approach to achieve this (Ho et al. 2007).Footnote 6 In the Appendix, I show that the results hold if other matching techniques are used, such as the k-nearest neighbor approach (Tables A4 and A5). In the first step, I estimate the propensity of a state to be a developing country. I use income per capita, its squared term, output, democracy, and their various interactions to improve the balance between the treated (developing countries) and the control group (OECD countries).

Importantly, I pool all data, which allows for comparison across time. For instance, a developing country that currently has an average income of $10,000 may be matched with an industrialized country at the time when the latter had a similar income per capita. This is consistent with my intent, which is to compare countries at similar stages of economic development. In the Appendix (Table A9), I show that the results are not affected if least developed countries are dropped from the analysis altogether. Furthermore, I show that the results are substantially similar if the sample is limited to the post-1960 period (Table A8). Global changes in technology are accounted for through year fixed effects.

In the second step, I regress per-capita carbon emissions on income, its squared term, imports per capita, democracy, as well as the same control variables used above. In addition, I include a term that denotes whether a country is developing. This term captures whether the carbon trajectory of these countries has so far been above or below the one followed by industrialized countries. Since my treatment is time invariant, I cannot include country fixed effects. Instead, I use continent fixed effects. Finally, I control for imports per capita to account for trade effects. In the Appendix (Table A7), I show that using trade openness instead of imports per capita does not affect the results. Following the propensity score approach, I weight each observation by the inverse of the probability that it is included because of the sampling design.
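The two-step logic can be sketched on simulated data as follows. This is a simplified illustration, not the paper's implementation: plain inverse-probability weighting stands in for the matching weights, the continent and year fixed effects and the other controls are omitted, and every function name, variable, and parameter value below is hypothetical.

```python
import numpy as np

def fit_logit(X, d, iters=25):
    """Logistic regression by Newton-Raphson; X must include a constant column."""
    b = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ b))
        grad = X.T @ (d - p)
        hess = (X * (p * (1.0 - p))[:, None]).T @ X
        b = b + np.linalg.solve(hess, grad)
    return b

def ipw_weights(p, d):
    """Inverse-probability weights: 1/p for treated units, 1/(1-p) for controls."""
    return np.where(d == 1, 1.0 / p, 1.0 / (1.0 - p))

def wls(X, y, w):
    """Weighted least squares, computed by rescaling rows by sqrt(w)."""
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return coef

# Simulated country-years (assumption: a true developing-country effect of 0.5)
rng = np.random.default_rng(0)
n = 2000
income = rng.uniform(1.0, 30.0, n)                      # thousands of dollars
p_true = 1.0 / (1.0 + np.exp(-(1.5 - 0.1 * income)))    # poorer units more likely treated
developing = (rng.uniform(size=n) < p_true).astype(float)
co2 = 0.5 * developing + 0.16 * income + rng.normal(0.0, 1.0, n)

# Step 1: propensity of being a developing country, from income and its square
z = (income - income.mean()) / income.std()             # standardized for stable fitting
X_ps = np.column_stack([np.ones(n), z, z ** 2])
p_hat = 1.0 / (1.0 + np.exp(-X_ps @ fit_logit(X_ps, developing)))
weights = ipw_weights(p_hat, developing)

# Step 2: weighted regression of emissions on the indicator and income terms
X_out = np.column_stack([developing, income, income ** 2, np.ones(n)])
tau_hat = wls(X_out, co2, weights)[0]
print(tau_hat)
```

In this sketch the coefficient on the developing-country indicator plays the role of \(\tau\) in Eq. (3). Because the treatment does not vary within a country, country fixed effects could not be added to the second step.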

In Table 4, I report the results of this analysis. Model (1) is the naive model; Models (2) to (6) add the developing country effect. I find that developing countries emit between 0.29 and 0.71 more metric tons of carbon per capita than industrialized countries. This compares with a sample mean of roughly 1.1 tons and represents between a fifth and half of a standard deviation of the dependent variable. This shows that the carbon trajectory of developing countries lies significantly above the one that was followed in the past by currently industrialized countries. The implication of this finding is that developing countries suffer from an environmental second mover disadvantage: countries that develop later in time pollute more, on average, than countries that developed earlier. Notice that the disappearance of the EKC is unsurprising here: since I restrict the sample to comparable cases, the nonlinear part of the carbon trajectory for countries with very high income is suppressed by design.

Table 4 Explaining carbon trajectories—P-score matching design

Finally, it is possible that some non-OECD countries are not proper second movers: they are not developing and may remain low-growth countries for the foreseeable future. In such a situation, I may be capturing something other than a second mover disadvantage. In the Appendix, I report an alternative way to test this hypothesis (Table A11). I restrict the sample to countries that are currently OECD members; I then estimate the effect that joining the OECD (which can be interpreted as becoming a first mover) has on \(\hbox {CO}_2\) emissions. In line with my argument, I find that joining the OECD reduces emissions by 0.5–1 metric tons of \(\hbox {CO}_2\) per capita. This provides further evidence that countries that develop earlier pass the buck (outsource emissions) to countries that develop later in time.

4 Conclusion

This paper provides a theoretical model of the effect of trade on pollution, suggesting that it is a mechanism of diffusion. Further, it finds evidence that (i) developing countries emit more carbon dioxide on a per-capita basis than industrialized states at similar stages of development, (ii) trade accounts for part of this effect, and (iii) trade may explain the marginal decrease of pollution in industrialized countries.

This, however, does not imply that trade ought to be restricted. The welfare effects of trade may well compensate for the negative externalities of carbon emissions. Indeed, this paper does not make any claim about the aggregate effects of trade. However, it provides some evidence that apparent gains made in industrialized countries are the result of a displacement of polluting industries to the developing world. That is, pollution is correlated across countries, underlining their interdependence.

Additional research is needed in this area. In particular, understanding the institutional factors that affect where industries move and why may offer a large payoff. While the pollution haven literature has made claims about what can happen when states compete for polluting production, the interaction between domestic institutions and foreign industries requires better theories and empirical analyses. What, besides production costs, motivates firms to locate factories in a particular country? Do foreign institutions condition the movement of industries? Or are property rights more important? Considering that the outsourcing hypothesis largely rests on an institutional argument, the nature of the diffusion process may shed additional light on the spatiality of trade.

The second stream of possible research relates to understanding why the positive effects of trade, in particular in the form of clean technology diffusion, are not stronger. What explains the adoption and the spread of these technologies? Are the obstacles of an institutional nature? We know little about which countries are able to learn from other countries’ innovations.

Finally, since trade tends to depend on international agreements, whether at the regional or global level, further theoretical studies of the optimal design of international institutions are warranted. How should the WTO or regional trade agreements be designed in the presence of pollution havens? Are direct cash or technology transfers superior alternatives?