Introduction

On a global scale, land-use change and forestry activities have historically been, and are currently, net sources of carbon dioxide to the atmosphere. During the decade of the 1990s, carbon dioxide (CO2) emissions to the atmosphere caused by changes in land use were estimated to be 1.6 billion t C/year (Bolin and Sukumar 2000), with tropical deforestation essentially responsible for most of this source. Activities that reduce deforestation rates, increase forestation, or improve land use efficiency offer significant potential for mitigating greenhouse gas (GHG) emissions, thereby reducing the potential impacts of climate change. Through projects and policies that change forest and other land management practices, humans have the potential to change the direction and magnitude of the flux of carbon dioxide between the land and atmosphere. At the same time these changes can provide multiple co-benefits to meet environmental and socioeconomic goals of sustainable development.

Afforestation and reforestation projects are generally accepted as projects that can generate tradable GHG emission reductions (e.g., under the Kyoto Protocol of the UN Framework Convention on Climate Change [UNFCCC]). Forest conservation projects, on the other hand, have faced obstacles to acceptance because of the difficulty of determining key elements of the project cycle. For instance, some have argued that determining baselines for forest conservation projects is too difficult and uncertain. Others have raised objections with respect to “leakage” (i.e., the off-site effects of project activities on carbon stocks and GHG emissions) (Brown et al. 2000b). Without inclusion of projects that are designed to avoid deforestation and improve the sustainability of agriculture in developing countries, a large opportunity is lost (Niles et al. 2002).

At the same time, many countries continue to be interested in developing forest conservation projects, given the potential for such projects to slow or even reverse high rates of deforestation and thereby generate credible GHG emission reductions. Given both the analytical challenges that these projects raise and the continuing interest in them, we examine issues specific to this project type.

A fundamental and challenging component of all project activities, and avoided deforestation projects specifically, is the determination of the extent to which project interventions lead to GHG benefits that are additional to business-as-usual scenarios (i.e., the baseline scenarios). The development of a baseline is a key step in the implementation of land use, land-use change, and forestry (LULUCF) projects to ensure accurate crediting of their carbon impacts (OECD/IEA 2003) because GHG benefits of a project activity are computed as the difference in carbon stocks and other GHG emissions between the project activity and the baseline. A key issue, therefore, is how to develop a baseline scenario for avoided deforestation that reasonably represents the net emissions without the project.
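The crediting arithmetic described above can be illustrated with a minimal sketch. The function name, the sign convention, and the numbers are assumptions made for illustration only; in particular, other GHG emissions are assumed to be expressed in the same carbon-equivalent unit (t C) as the stocks.

```python
def ghg_benefit(baseline_stock_tc, project_stock_tc,
                baseline_other_emissions_tc=0.0,
                project_other_emissions_tc=0.0):
    """Net GHG benefit of a project relative to its baseline, in t C.

    Assumed convention: a positive value means the project holds more
    carbon and/or emits less than the business-as-usual baseline.
    """
    stock_gain = project_stock_tc - baseline_stock_tc
    avoided_emissions = baseline_other_emissions_tc - project_other_emissions_tc
    return stock_gain + avoided_emissions


# Hypothetical values, not taken from any of the study areas.
benefit = ghg_benefit(baseline_stock_tc=80_000, project_stock_tc=100_000,
                      baseline_other_emissions_tc=500,
                      project_other_emissions_tc=200)
```

Because the baseline enters the difference directly, any error in the baseline projection carries through one-for-one to the credited benefit.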

There are currently no standard practices for developing baselines for avoided deforestation projects. A baseline has two major components: the projected land-use or land-cover change, and the corresponding carbon stocks in live and dead vegetation and soil. Of the two components, the projection of changes in land use is the more important and yet the more difficult to address analytically (OECD/IEA 2003), because many socioeconomic and environmental factors affect the way people use land and these are difficult to predict. Moreover, once a project to reduce deforestation is implemented, the rate and pattern of land-use change that would have occurred in the project area without the project can no longer be monitored.

Existing baseline estimates are limited by the absence of agreed standardized methods. For many of the existing pilot forestry-based carbon projects, estimates of changes in land use and baselines were determined on a project-by-project approach using simple logical arguments that assumed continuation of observed past trends for the limited project area or a region. These projects generally did not use analytically rigorous and transparent agreed methods because they did not exist at the time and were not required by voluntary programs to which the projects were reported as demonstrations (Brown et al. 2000b; OECD/IEA 2003). They also did not test alternative baseline approaches. In addition, this project-by-project approach is likely to increase investment costs, further undermining the potential for developing these kinds of projects (OECD/IEA 2003). The result is the perception of LULUCF baselines as subjective projections of land-use change and hence GHG mitigation potential with high uncertainty, high cost per unit of carbon benefit, and a lack of transparency.

Developing regional baselines for the land-use component by project-activity type offers an alternative to the project-by-project approach (also called the performance standard approach in the World Resources Institute/World Business Council for Sustainable Development 2003 project protocols). Regional baselines are projections of the magnitude and in some cases spatial depiction of one or more land-use change activities (e.g., forestation, deforestation) over a region in which a potential mitigation project could be located. These baselines would use regional data and transparent analytic assumptions not derived from a specific project, to set a generic baseline for the defined class of activity. This baseline can be either spatially resolved (e.g., a projection for specific pixels or lands), or an average rate of change over time for the activity in that region. The concept was pioneered by the Scolel Te project team in Chiapas State, southern Mexico, which developed several alternative, spatially resolved baselines projected out 50 years for about half of the state (Tipper and De Jong 1998; Tipper et al. 1998).

Regional baselines may have several advantages, including: reduced investment cost to develop compared to project-specific baselines; consideration of regional factors that could affect land-use changes; and opportunity for host country or state governments to identify the effects of and target the type of projects supportive of their sustainable development. Use of regional baselines is likely to result in more transparent and credible baselines. Although regional baselines have the advantages presented here, there is potentially a major disadvantage if they are not spatially resolved. Project developers could identify areas where deforestation appears likely to be lower than the regional baseline thus getting more carbon credits than were actually being generated. At the other extreme, areas that appear to have potentially higher deforestation rates than the baseline would be avoided because the carbon credits would be underestimated.

For the carbon stocks, most pilot projects based their baselines on estimates from the scientific literature in combination with some field measurements in nearby areas. The use of estimates from the literature for the carbon stocks is a reasonable first approximation for the baseline. Unlike the land-use change component of the baseline, the carbon stocks can be monitored over the length of the project. Thus, once a project area is selected, the carbon stocks can be monitored at that locale, the first approximation revised, and a more project-specific baseline can be produced. Carbon stocks and their changes in above- and below-ground biomass, on a unit area basis, can be measured under many circumstances to relatively high levels of accuracy and precision at a modest cost (95% confidence intervals of less than ±10% of the mean, at an estimated cost of less than about $1/t C; Brown 2002a).

In this paper we focus on developing baseline projections of changes in land use, particularly projecting deforestation. We have identified three approaches for developing regional baselines for changes in use of the land. The approaches use models that provide a conceptual basis for integrating diverse measures into a self-consistent framework and for making extrapolations across time and space. Here we report on the application of these three models to determine baseline scenarios in land-use change for six regions in the tropics, four of which encompass a pilot carbon-offset project.

The methods range from relatively simple model extrapolations of past trends in land use based on simple drivers such as population growth, to more complex extrapolations of past trends using spatially explicit models of land-use change driven by biophysical and socioeconomic factors. All models were used to project the baseline for changes in land use over the same 20-year period. The regions used in this work were specifically chosen to encompass existing sites where several of us had already been actively engaged and where data were available.

The study was designed to address an overarching research question and related questions with policy relevance. The principal research question was: what is the most analytically feasible and credible approach for establishing deforestation baselines? Secondary questions include: (1) Which baseline-setting method provides more credible results, by project activity type and land use conditions? (2) What is a reasonable time frame over which a deforestation baseline should be projected? (3) Under what changes in baseline conditions, and how often, should baselines be reviewed and potentially revised? (4) How feasible and practical is each of the methods? (5) What are the tradeoffs among data availability, spatial scale of analysis, and precision of a baseline? (6) Lastly, can these results offer any potential guidance to policymakers confronted with the task of establishing guidelines for land-use based projects to mitigate climate change? We conclude with a discussion of lessons learned and steps that can be undertaken to develop credible baselines.

This paper is a summary of two large projects that applied these three models to six tropical regions (four regions supported by the US Environmental Protection Agency [Belize, Bolivia, Brazil, and Chiapas, Mexico] and two regions supported by US Agency for International Development-Mexico [Campeche and Michoacán, Mexico]). Details on the descriptions of the study areas and models, with corresponding sources, are given in Brown (2002b, 2003).

Methods

Description of study areas

The six study regions featured different ecological and socioeconomic conditions, population dynamics, and uses of the land (Table 1). The six study areas included in this analysis are sub-regions of (Fig. 1): (1) Belize encompassing the Rio Bravo Climate Action project in northern Belize; (2) Santa Cruz state, Bolivia encompassing the Noel Kempff Climate Action project (Brown et al. 2000a); (3) Paraná state in Brazil encompassing the Itaqui Climate Action project in the Atlantic rainforest zone; (4) Campeche, Mexico encompassing a planned project in the Calakmul Biosphere Reserve area; (5) Chiapas, Mexico encompassing the Scolel Te project (Castillo-Santiago et al. 2006; De Jong et al. 2005); and (6) Michoacán, Mexico. Further details of each study area are covered in the larger reports mentioned above.

Table 1 Summary of geographic and land use characteristics of the six study areas
Fig. 1 Location of sites used in the study. (A) State of Paraná in Brazil, (B) Santa Cruz Department in Bolivia, (C) northern Belize, (D) Michoacán, Mexico, (E) Chiapas, Mexico, and (F) Campeche, Mexico. “Project area” refers to the area of the pilot carbon sequestration projects in these regions

Description of the models

Our goal was to consistently compare multiple, competing methodological approaches to deforestation baseline setting that ranged from models that used readily available non-spatial data for relatively large geographical areas (e.g., millions of ha), to models that required more intensive data collection but could operate in smaller geographic areas (e.g., tens to hundreds of thousands of ha). Kaimowitz and Angelsen (1998), based on a review of 146 existing tropical deforestation models, grouped such models into three classes: analytical, simulation (including programming), and regression models. The three models used in this study to simulate future changes in land use are described below (further details are given in Brown 2002b, 2003). Each of these models represents one of the classes proposed by Kaimowitz and Angelsen: FAC is a non-spatial analytical model, LUCS is a non-spatial simulation model, and GEOMOD is a spatial regression and rule-based model.

Forest area change (FAC)

This model was first formulated in the framework of the FAO Forest Resources Assessment Project implemented during 1990–1994 (Food and Agriculture Organization; FAO 1993; Sciotti 1991), and revised in 1998 (Sciotti 2000). The deforestation model was developed to overcome the lack of multi-temporal information on forest cover in tropical countries. The goal was to develop a modeling approach that could produce the required forest area change information for all countries. The basis was multi-date observations for a limited number of countries, in combination with another set of correlated variables for which data were available for all countries. In building this model it was assumed that the overall pattern of expansion of non-forest area over time (deforestation) would be described by a logistic curve of two key variables, with different parameters for different ecological zones within a country. The model uses historical data on forest cover and associated population density. Using these data, two key variables were developed, generally expressed at a sub-national level: the dependent variable—ratio of non-forest area to total area, and the independent variable—population density. Then, using projections of human population growth for the area in question, the model simulates the change in forest cover over time.
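The logic of the FAC approach can be sketched as follows. This is an illustration only: the functional form (a plain logistic in the logarithm of population density), the parameter names `a` and `b`, and all input values are assumptions, not the FAO parameterization, which is described in the sources cited above.

```python
import math


def non_forest_fraction(pop_density, a, b):
    """Logistic share of non-forest area as a function of population density.

    Assumed form: 1 / (1 + exp(-(a + b * ln(density)))). The parameters a
    and b are hypothetical and would be fitted per ecological zone.
    """
    return 1.0 / (1.0 + math.exp(-(a + b * math.log(pop_density))))


def project_forest_fraction(pop_density0, growth_rate, years, a, b):
    """Forest fraction for years 0..years under compound population growth."""
    fractions = []
    for t in range(years + 1):
        density = pop_density0 * (1.0 + growth_rate) ** t
        fractions.append(1.0 - non_forest_fraction(density, a, b))
    return fractions


# Hypothetical inputs: 10 persons/km2 growing at 2.5% per year for 20 years.
baseline = project_forest_fraction(10.0, 0.025, 20, a=-3.0, b=1.0)
cumulative_loss_pct = 100.0 * (baseline[0] - baseline[-1]) / baseline[0]
```

With population density as the only driver, the projected loss depends entirely on the population projection and on the administrative unit over which density is averaged.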

Advantages of this model for baseline setting include minimal data requirements, potentially reducing costs of its use, and its applicability to large regions (e.g., millions of ha). Disadvantages include its lack of spatial resolution, reliance on only two major variables to project complex deforestation patterns and processes, and its inability to be used at smaller geographic scales relevant to sequestration projects if key data variables are not available.

Land use carbon sequestration (LUCS)

This model was developed to estimate land-use change in rural areas that depend largely upon low-productivity agriculture for subsistence and fuel wood for energy (Faeth et al. 1994). The model assumes that land-use change is primarily driven by changes in population and land management. As the population grows, more land is required to supply food and livelihoods, and in some cases, fuel wood. While demand for food and income grows, the land’s ability to meet that demand may increase or decrease depending on changes in productivity and other activities. The key parameters used in this model are: the rate of population growth and the year it is expected to stabilize; the initial area of principal land uses, including: permanent agriculture, shifting agriculture, agroforestry, and native closed and open forests, plantations and secondary forests; and required agricultural land as a function of population, agricultural land required per person, fraction of food imported and agricultural land required for export production. The main driving force after initialization is change in population in the modeled area.
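The population-driven land demand described above can be sketched in a few lines. This is not the LUCS code: the functional form, the rule that any shortfall in agricultural land is met by clearing forest, and all parameter values are assumptions made for illustration. The sketch models forest loss only, whereas LUCS can also represent other land-use transitions.

```python
def population(p0, rate, year, stabilization_year):
    """Compound growth that stops once the stabilization year is reached."""
    return p0 * (1.0 + rate) ** min(year, stabilization_year)


def required_ag_land(pop, ha_per_person, fraction_imported, export_ha):
    """Agricultural land needed (ha) for domestic food plus export crops."""
    return pop * ha_per_person * (1.0 - fraction_imported) + export_ha


def simulate_forest(forest_ha, ag_ha, p0, rate, stabilization_year,
                    ha_per_person, fraction_imported, export_ha, years):
    """Forest area per year when any land shortfall is met by clearing forest."""
    history = [forest_ha]
    for year in range(1, years + 1):
        pop = population(p0, rate, year, stabilization_year)
        need = required_ag_land(pop, ha_per_person, fraction_imported, export_ha)
        shortfall = max(0.0, need - ag_ha)
        cleared = min(shortfall, forest_ha)  # cannot clear more than remains
        forest_ha -= cleared
        ag_ha += cleared
        history.append(forest_ha)
    return history


# Hypothetical region: population stabilizes after 15 years.
history = simulate_forest(forest_ha=50_000.0, ag_ha=10_000.0, p0=20_000.0,
                          rate=0.02, stabilization_year=15, ha_per_person=0.5,
                          fraction_imported=0.1, export_ha=1_000.0, years=20)
```

In this sketch forest loss stops as soon as the population stabilizes, which shows how strongly the result depends on the assumed growth rate and stabilization year.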

Advantages of this model for baseline setting include its applicability to many scales and its ability to model many types of land-use change activities (not just deforestation). Disadvantages include its lack of spatial resolution, model code and structure that are not readily understandable by the operator, and the assumptions that are needed for many poorly known parameters.

Geographical modeling (GEOMOD)

This model was developed to try to replicate spatially explicit land-use change in Costa Rica and was subsequently applied in SE Asia and Africa to estimate carbon releases from tropical deforestation over time (Dale 1994; Hall et al. 1995; Pontius et al. 2001). It uses spatially distributed data to simulate landscape dynamics in a geographical information system (GIS) (Hall et al. 2000, 2006; IDRISI Project 2003). There are two components to this model: the rate of land-use change and where the change will occur. To derive the rate of land-use change, an extrapolation of past rates is generally used, based on interpreted satellite imagery for two or more points in time for the area under study. To simulate where deforestation will occur, the model uses numerous spatial data layers of biophysical and socioeconomic factors (e.g., elevation, slope, soils, and distance from rivers, roads and already established settlements) to explain the pattern of deforestation. The model is calibrated by assigning weights to map cells based on analysis of the importance of each of these driving factors and combination of factors.
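The two components described above, a rate and a location rule, can be illustrated with a toy raster. This sketch is not GEOMOD: the linear extrapolation of the rate, the use of a single precomputed suitability score per cell, and all names and values are assumptions. GEOMOD's actual calibration of driver weights is not reproduced here.

```python
def annual_loss_cells(forest_cells_t1, forest_cells_t2, year1, year2):
    """Linear rate of forest loss (cells per year) between two image dates."""
    return (forest_cells_t1 - forest_cells_t2) / (year2 - year1)


def allocate_deforestation(land_cover, suitability, n_cells):
    """Convert the n_cells forest cells with the highest suitability.

    land_cover: 2D list, 1 = forest, 0 = non-forest.
    suitability: 2D list of the same shape; a higher score means the cell is
    assumed more likely to be cleared (e.g., closer to roads, flatter).
    """
    candidates = [(suitability[r][c], r, c)
                  for r, row in enumerate(land_cover)
                  for c, value in enumerate(row) if value == 1]
    candidates.sort(key=lambda item: -item[0])  # most suitable first
    result = [row[:] for row in land_cover]     # leave the input unchanged
    for _, r, c in candidates[:n_cells]:
        result[r][c] = 0
    return result


# Hypothetical 2 x 3 landscape.
cover = [[1, 1, 0],
         [1, 1, 1]]
scores = [[0.9, 0.2, 0.0],
          [0.5, 0.7, 0.1]]
projected = allocate_deforestation(cover, scores, n_cells=2)
# projected == [[0, 1, 0], [1, 0, 1]]
```

Separating the rate from the location rule means the same projected quantity of deforestation can be placed on the map under different assumptions about drivers.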

The GEOMOD model has an internal validation procedure, the kappa-for-location index, which measures the improvement of the model over what a random selection would achieve (Pontius 2000, 2002). Use of GEOMOD quantifies some of what has been termed “counterfactual uncertainty” (Kerr 2001; Moura Costa 2001) inherent in all models used to estimate the business-as-usual baseline. The kappa-for-location statistic represents a standardized procedure of assessment of some aspects of this “counterfactual uncertainty” because it quantifies model performance compared to random allocation of change. Like other models, however, it still must make projections based on assumptions with associated uncertainties. The difference between GEOMOD and other models is that GEOMOD tests the validity of the assumptions.
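A kappa-for-location statistic can be sketched as below, using a commonly cited form: observed agreement minus the agreement expected from random allocation, divided by the best agreement attainable with the simulated category quantities minus that same chance agreement. Whether this is exactly the variant implemented in GEOMOD is an assumption; the function name and the example maps are hypothetical.

```python
def kappa_for_location(simulated, reference):
    """Kappa for location of a simulated map against a reference map.

    Both maps are flat lists of category labels (e.g., 1 = forest,
    0 = non-forest) of equal length. A value of 1 means cells are placed as
    well as the simulated quantities allow; 0 means no better than random.
    """
    n = len(reference)
    observed = sum(s == r for s, r in zip(simulated, reference)) / n
    chance = 0.0  # agreement expected with random placement
    best = 0.0    # agreement with perfect placement, given the quantities
    for category in set(simulated) | set(reference):
        p_sim = simulated.count(category) / n
        p_ref = reference.count(category) / n
        chance += p_sim * p_ref
        best += min(p_sim, p_ref)
    if best == chance:
        raise ValueError("kappa for location is undefined for these maps")
    return (observed - chance) / (best - chance)


reference_map = [1, 1, 0, 0]
perfect = kappa_for_location([1, 1, 0, 0], reference_map)      # 1.0
random_like = kappa_for_location([1, 0, 1, 0], reference_map)  # 0.0
```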

Potential advantages of GEOMOD include its capability of spatial resolution at any scale for which data are available because it is raster-based (and thus gives deforestation estimates for any pixel or geographic scale requested within the analytic domain, for an entire region). Additionally, incorporation of the kappa-for-location statistic allows evaluation of model performance versus chance. Potential disadvantages include its large data requirements, the need to experiment with a large number of variables to identify those providing the most explanatory power for predicting deforestation, and the potential cost of data acquisition and analysis.

Scale of simulations

The geographic scale selected as the baseline modeling domain for each model has a significant effect on estimates of the initial percent of forest area in each of the six regions (Table 2). The FAC model, which was applied across large regions, tends to result in lower percent forest cover estimates than the more highly resolved GEOMOD and LUCS models, and thus generates substantially higher baselines of forest area from which project activities of slowing forest loss rates would be calculated. For example, the percent initial forest cover in Paraná, Brazil; Santa Cruz, Bolivia; and Campeche, Mexico is considerably lower for the FAC model than for the other two because the FAC model was applied to the total area of these three states. Expanding the size of the modeling domain due to data resolution limitations adds in lower-carbon-density disturbed forest and agricultural lands not included in the geographically more constrained modeling domains of LUCS and GEOMOD (which have a higher percent of forest lands). The FAC estimates of initial forest cover average about 62% of GEOMOD and LUCS estimates for the six regions. Thus the simple selection of level of data aggregation used produced an almost 40% difference in the initial forest area that could affect baseline projections. These differences in forest cover between the six regions illustrate the contrasting situations in level of development and subsequent pressures on the forested landscape.

Table 2 Areas of land encompassed by each model for each region (in million ha), estimated initial percent forest cover, and average percent cover by model, (FAC = Forest Area Change model; GEOMOD = Geographic Model; and LUCS = Land Use and Carbon Sequestration model)

The FAC model was simulated at the entire state level and for the entire country of Belize (Table 2). For all study areas in Mexico, the lack of reliable historical data of forest cover prevented a locally parameterized version of the FAC model from being developed; instead a general model was used with effects of local ecofloristic zones incorporated. The LUCS model, on the other hand, can simulate land-use changes within smaller sub-national units depending on the scale at which population data are provided. For most of the study areas, the LUCS model encompassed the same area as that used by GEOMOD; the exception was Campeche where LUCS simulated one large municipality only (Calakmul representing more than 75% of the total area simulated by GEOMOD) (Table 2). GEOMOD simulated land-cover change at a scale where boundaries were defined to reflect biophysical, socioeconomic, and cultural or other relevant factors for all study areas.

The main reason areas simulated differ among the models is related to the spatial scale of available data required for each model. For example, the FAC and LUCS models rely on available data that are generally reported at sub-national political units (e.g., population data at the municipality level, or forest cover data at the state level), within which data are not further subdivided. Consequently the FAC and LUCS models are limited in their application to the corresponding scales of the available data (e.g., municipalities for population data). On the other hand, GEOMOD can model at any scale desired for which satellite imagery can be acquired, and is limited rather by the availability of spatial databases of interest, particularly socioeconomic databases, and the processing capacity of the computer running the model.

As mentioned above, several of the study areas encompass pilot projects or planned pilot projects. For the Santa Cruz, Bolivia and Campeche, Mexico areas, the existing or proposed large pilot projects were about 640,000 ha and 323,000 ha in area, respectively. The analytic domain for GEOMOD and LUCS models is about 5–6 times the pilot project areas, whereas the domain for the FAC model is about 13–60 times the project area. For the smaller projects in Belize (about 15,000 ha) and Paraná, Brazil (about 5,000 ha), the analytic domain for GEOMOD and LUCS models is about 30–38 times the project area, and for the FAC model the domain is about 146 (Belize) and almost 4,000 (Brazil) times larger. Although Chiapas contains a pilot project, it consists of several hundred very small landowners scattered throughout the analytic domain for all models.
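As a worked check of these ratios, the Santa Cruz figures given in this paper (a project of about 640,000 ha, a GEOMOD and LUCS domain of 3.7 million ha, and a FAC domain of about 36 million ha) can be combined as follows. The function name is hypothetical.

```python
def domain_to_project_ratio(domain_ha, project_ha):
    """How many times larger the modeling domain is than the project area."""
    return domain_ha / project_ha


project_ha = 640_000  # Santa Cruz pilot project area
geomod_lucs_ratio = domain_to_project_ratio(3_700_000, project_ha)  # about 5.8
fac_ratio = domain_to_project_ratio(36_000_000, project_ha)         # about 56
```

Both values fall within the ranges quoted above for the large pilot projects.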

Results and discussion

Comparison of projected baselines for deforestation

To make a meaningful comparison of the land-use change component of the baseline, the results from each modeling approach were expressed as the cumulative percent of the initial forest cover lost or deforested over time for a 20-year period for each of the six study areas (Fig. 2). (The results presented here for GEOMOD are only the rate projections; the spatial component will be presented later.) It is clear from this analysis that there is little similarity in the deforestation projections produced by the different models for a given region. The maximum projected cumulative loss in forest cover over the 20-year period ranges from 14% to 52% of the initial forest cover. The FAC model projected the maximum deforestation in four of the six areas (Table 3). The minimum projected loss over the 20-year period ranges from a gain of 7% to a loss of 21%, and the LUCS model projects the minimum loss in four out of the five cases. The highest projected loss in forest cover is about two times the lowest projected loss for Chiapas and Michoacán, and as high as 36–70 times the lowest for Santa Cruz and Campeche.
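The comparison metric used here can be sketched as follows. The function names and the short forest-area series are hypothetical; the Campeche percentages (25% for FAC, 11.5% for GEOMOD, and 0.7% for LUCS) are those reported in this paper.

```python
def cumulative_percent_lost(forest_area_by_year):
    """Cumulative loss in each year as a percent of the initial forest area."""
    initial = forest_area_by_year[0]
    return [100.0 * (initial - area) / initial for area in forest_area_by_year]


def max_to_min_ratio(loss_by_model):
    """Ratio of the highest to the lowest projected cumulative loss."""
    lowest = min(loss_by_model.values())
    highest = max(loss_by_model.values())
    if lowest <= 0:
        # A net gain or zero loss makes the ratio meaningless.
        raise ValueError("ratio is undefined when the lowest loss is <= 0")
    return highest / lowest


# Hypothetical forest areas (ha) for three points in time.
losses = cumulative_percent_lost([200.0, 190.0, 150.0])  # [0.0, 5.0, 25.0]

# 20-year cumulative losses (%) reported for Campeche.
campeche = {"FAC": 25.0, "GEOMOD": 11.5, "LUCS": 0.7}
ratio = max_to_min_ratio(campeche)  # about 36
```

Normalizing by initial forest area allows models run over different domains to be compared, although it does not remove the effect of the domain on the projected rate.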

Fig. 2 Cumulative % of initial forest area deforested for each study area by each of the three models. FAC = Forest Area Change model, LUCS = Land Use and Carbon Sequestration model, and GEOMOD = Geographic Model. The high and low scenarios of the FAC model for Belize represent high- and low-population growth projections

Table 3 Minimum and maximum baseline projected cumulative loss of forest cover over 20-year period from base year, as % of initial forest cover, for the six study areas and model producing each value

For Belize, only two models were used because LUCS was not applicable. Deforestation in northern Belize is caused by Mennonite farmers who convert the forests to mechanized agriculture rather than to subsistence agriculture, and LUCS could not readily model this type of commercial agricultural conversion. Depending on the model and population scenario (e.g., the FAC model results are based on projected high and low rates of population growth for the whole country), the cumulative amount of forest lost over the 20-year period ranges from about 10% to 50% of that present at the start of the simulations.

In the Santa Cruz, Bolivia case, the amount of forest loss estimated by LUCS and GEOMOD is considerably lower (less than 2% of the initial forest cover lost after 20 years) than that projected by the FAC model (about 14% of the initial forest cover lost). The LUCS and GEOMOD models were applied to the same region (3.7 million ha) adjacent to the Noel Kempff project area, whereas the FAC model was applied to the whole state of Santa Cruz, an area of about 36 million ha. The high rate of forest loss projected by the FAC model results from the influence of high population growth rates in large cities and towns throughout the state, particularly the main city of Santa Cruz de la Sierra (see Fig. 1B). This growth produces a high deforestation estimate in the model, even though it is currently occurring within and radiating from the urban areas and is not yet evident in the far reaches of the department. In contrast, the low rates projected by LUCS were caused by the simulation of local-scale conditions where only a scattering of small communities occur and population growth is low. The low forest cover change rates projected by GEOMOD (rates based on analysis and projection of spread of deforestation from the capital, Santa Cruz de la Sierra, in 100 km rings using satellite images from 1975 through 1995; Hall et al. 2006) reflect the projected slow rates of population spread, and corresponding forest clearing, in progressive waves in the zones farthest from the departmental capital city of Santa Cruz.

The model simulations for Paraná, Brazil produce the most contrasting results of all six areas. For the Brazil LUCS simulation, the population dynamics of water buffalo were used instead of human population because buffalo livestock management was the main driving force behind deforestation, and during the past decade or so the population of water buffalo has been declining at about 4% per year. LUCS projects a gain of forest of about 7% of the initial amount and GEOMOD projects a net loss of only 0.1% with reforestation of abandoned pasture areas keeping pace with new deforestation. In contrast, the FAC model projects a continuing loss so that after 20 years, another 14% of the forest is gone. The results from the FAC model are based on the population–forest cover relationship for the whole state of Paraná, a state that encompasses a high, more temperate plateau, and where the forest clearing has been extensive in the past from urban growth and development (Fig. 1A). The lowland coastal area modeled by LUCS and GEOMOD encompasses municipalities that show little to no growth in population and consequently little deforestation over the recent past.

For Campeche, the FAC model projected that 25% of the forests would be deforested over the 20-year period, compared to 11.5% projected by GEOMOD and the 0.7% projected by LUCS. Somewhat like the Bolivia area, the FAC simulation is influenced by the concentration of human populations and infrastructure, and resulting forest conversion, in the west and northwest section of the region with conversion in the rest of the region more scattered (Fig. 1F). Even though the GEOMOD simulation of the total area produced results that were about half those based on FAC, we did find that for the two municipalities closest to the west and northwest of the GEOMOD area, the projected cumulative deforestation was similar to that projected by FAC.

The GEOMOD model projected Chiapas to have the highest rates of forest loss compared to all other regions, with 52% of the initial forest gone after 20 years. In this case, GEOMOD projected deforestation based on projected population growth from official sources and one remote sensing image because of the unavailability of existing imagery products for two points in time. Even though the area simulated by the FAC model was almost three times larger than that used by LUCS, both gave practically the same results and projected that about 20% of the forest would be gone within the 20-year period.

Because of the high and relatively evenly distributed density of human population and subsequent use of the land across the entire region of Michoacán, the three models projected amounts of forest loss over the 20-year period that were more similar to each other, ranging from a low of 21% (GEOMOD and FAC) to a high of 35% (LUCS), or less than a two-fold difference. The tendency for convergence of results from the three models in Michoacán implies that no particular concentration of human activity dominates deforestation patterns. This is similar to the situation for Chiapas.

A comparison of all model outputs across all six regions shows that depending upon which model is used, we obtain quite different results, largely driven by how population change is modeled. In general, the FAC model projects the highest amount of forest loss (four out of six cases) and the LUCS model projects the lowest amount of forest loss (four out of five cases). Both of these models rely heavily on population dynamics, with the FAC model using published projections and the LUCS model using population change based on a hypothesized growth rate. When GEOMOD made projections of forest loss linked to population projections rather than from remote sensing products, as in the case of Chiapas, a high rate of deforestation also resulted because population growth in the region is exponential. As described above, the FAC model is applied at the national administrative-unit scale where population data needed to simulate the model are generally available. However, when the national administrative unit encompasses more than one biophysical-socioeconomic zone, as in the case of Paraná, Brazil (lowland sparsely populated coastal zone and populated cool plateau; Table 1), or where the pattern of deforestation has a discernible frontier or wave, as in the case of Santa Cruz and Campeche, the FAC model gives higher rates of deforestation in remote areas than the other two models because of the influence of the highly concentrated population in cities and towns far removed from the area of interest. On the other hand, when human populations and their infrastructure are widely dispersed across the landscape, regardless of whether different biophysical-socioeconomic zones occur, as in the case of Chiapas and Michoacán, all three models produce results that have a narrower range of variation, particularly in the near-term (about 10 years).

Evaluation of the models for projecting rates of deforestation

At the outset of this work, the questions we were attempting to answer by comparing three different models were (1) what is the most analytically feasible and credible approach for establishing deforestation baselines, and (2) are the models feasible and practical to use for this purpose? To address these questions, we evaluated the models against a set of five criteria and 13 indicators (Brown 2003). The five criteria and corresponding indicators were: (1) transparency, with indicators of understandability and replicability; (2) accuracy and precision, with indicators related to model calibration, validation, and uncertainty in data bases; (3) applicability, with indicators related to the ability to deal with multiple scales and multiple land uses; (4) compatibility with international standards (i.e., standard definitions of forest); and (5) cost-effectiveness, with indicators related to intensity and availability of data needs, time to simulate models, and knowledge and skills needed to run the models. For each indicator, a score from 1 (lowest) to 5 (highest) was assigned; the indicator scores were then averaged for each criterion, and the criterion averages were summed for a maximum of 25 points. The overall evaluation gave the GEOMOD model the highest score (22.6), with little difference between the scores of the FAC (18.6) and LUCS (17.5) models. However, for some criteria, the order of the evaluation was different from the overall trend, for example:

  • For transparency, the GEOMOD and FAC models scored the highest and LUCS scored the lowest because its model code and structure are not readily understandable by the operator.

  • The data bases needed for all three models tend to have a high degree of uncertainty associated with them, either because they depend on interpretation of remote sensing imagery (GEOMOD), on national statistics (FAC and LUCS), or on assumptions for many parameters that are poorly known (LUCS).

  • The GEOMOD and LUCS models are the most applicable for modeling land-use change as they can be applied to any scale and to many changes in land uses; the FAC model was built to simulate only deforestation at sub-national political units with population growth as the single driver.

  • The FAC model is compatible with international requirements because it has been officially used and accepted by FAO to estimate deforestation for the years 1990 and 1995 for all developing countries, and the model was built on a clear and internationally accepted definition of forest.

  • The FAC model scored the highest on cost-effectiveness indicators, whereas the other two models require more data, time and effort to simulate.
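The scoring arithmetic used in this evaluation can be sketched as follows. This is only an illustration of the procedure (average the indicator scores within each criterion, then sum the five criterion averages for a maximum of 25). The indicator scores and the way the 13 indicators are grouped under the criteria are hypothetical placeholders, not the values assigned in Brown (2003).

```python
from statistics import mean

def overall_score(indicator_scores):
    """Average indicator scores (1-5) within each criterion, then sum the criterion means."""
    for criterion, scores in indicator_scores.items():
        if not scores or any(s < 1 or s > 5 for s in scores):
            raise ValueError(f"scores for {criterion!r} must be between 1 and 5")
    criterion_means = {c: mean(s) for c, s in indicator_scores.items()}
    return criterion_means, sum(criterion_means.values())

# Hypothetical scores for one model: 13 indicators grouped under 5 criteria.
example = {
    "transparency": [5, 4],
    "accuracy_precision": [4, 4, 3],
    "applicability": [5, 5],
    "international_compatibility": [4],
    "cost_effectiveness": [3, 4, 3, 4, 3],
}
criterion_means, total = overall_score(example)
print(criterion_means, round(total, 1))
```

Because each criterion contributes its average rather than its sum, a criterion with many indicators (such as cost-effectiveness here) carries the same weight as one with a single indicator.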

Main factors explaining the empirical pattern of land-use change

Whereas all the models estimate the rate of deforestation, GEOMOD is the only one of the three specifically developed to project where deforestation is likely to occur in the future. Spatially explicit models like GEOMOD can project the location and pattern over time of estimated deforestation, which is of interest to land managers, government agencies, and local and international sequestration project developers or evaluators. For example, GEOMOD analyzed a total of 29 spatially distributed factors to determine which ones explain the historical pattern of human settlement and deforestation in each of the six regions. Significance is based on the percent of each class of each factor already deforested at time one, the calibration period. From these percentages a weighted map of potential land-use change (PLUC) is produced that tells the model which forested cells to select for future deforestation. We analyzed these PLUC maps using principal components analysis (PCA) to compare the importance of factors across the six regions. The PCA-derived values indicate how much of the land-use variation at time one is explained by each factor compared to all others analyzed for that region, but they do not necessarily provide a measure of statistical significance (Table 4).

Table 4 Amount of variation each factor explains in GEOMOD, based on principal component analysis, in the spatial distribution of forest and non-forest lands in the historic time-1 land-use map of each of the six regions

Not all factors were used in all regions due to data availability constraints (Table 4). An importance factor was calculated to estimate how many factors in each variable category (physical, biological, distance to areas of past disturbance, sociological/demographic, and economic/infrastructure) ranked among the top three in a study region. A comparison of the importance factors reveals that physical factors (factors 7–19), with 9 out of 23 (0.39), biological factors (factors 20–21), with 0.50, and distance to areas of past disturbance (factors 22–25), with 0.38, were each about twice as important as either sociological/demographic factors (factors 26–28), with 0.20, or economic/infrastructure factors (factors 1–6), with 0.24, in explaining empirical land-use patterns. Elevation (factor 7) ranked among the top three factors in all five regions where it was analyzed, and slope, an elevation derivative, was among the top three in Chiapas. Distance to roads (factors 3–5) was highly significant in half of the regions, principally Paraná, Chiapas, and Belize, and distance to already deforested areas (factors 23–25), which was also highly significant in three regions, explains between 11% and 17% of the variation in deforestation in Santa Cruz, Belize, and Campeche. Distance to assumed market areas and/or community services (factors 1–2) was ranked among the top three factors only in Belize. Land tenure (factor 26) ranked high in both regions where it could be analyzed, Belize and Chiapas, but ranked among the top three in only the latter. Distance to water sources (factors 13–17) was not nearly as important as we assumed, except for Campeche and Michoacán, where rainfall averages between 750 mm/year and 800 mm/year, significantly lower than in the other regions analyzed.

Strength of factors in projecting future land-use change: which factors, and how many are needed?

The percent of cells projected correctly, based on a comparison of GEOMOD’s simulated time-2 map with the actual time-2 validation map, ranges from 90% to 99.8% for all sites except Chiapas, where only 72% were correctly modeled (Table 4). However, it is possible to get a high percent correct when little change is occurring between two time periods, as in Santa Cruz and Paraná. Also, a certain percent of the cells will be modeled correctly based simply on random assignment, or chance alone, due to persistence of large areas of either agriculture when population is high or forest when it is not. The kappa-for-location statistic, which varies between 0 (no better than a random model) and 1 (a perfect simulation), takes this into account and provides a better metric of how well the model performed than percent correct alone. For Belize, Paraná, Campeche, and Michoacán, the kappa-for-location is greater than 0.5, suggesting that GEOMOD improved significantly over a random assignment of newly deforested cells. For Santa Cruz, the lower kappa suggests that model enhancements could be made, thus illustrating the importance of validation as a means of building the best model possible to achieve the most robust projections (Hall et al. 2006).
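The difference between percent correct and a location kappa can be illustrated with a minimal sketch. It uses one common formulation of a kappa for location, in which observed agreement is compared with the agreement expected from random placement of the simulated quantities and with the best agreement those quantities allow. The exact computation used in the GEOMOD validation is not given here, so the function and the ten-cell maps below are illustrative assumptions only.

```python
def kappa_for_location(simulated, reference):
    """Location kappa for two categorical maps given as equal-length sequences of labels.

    0 means agreement no better than random placement of the simulated quantities;
    1 means the simulated quantities are placed as well as they possibly could be.
    """
    if len(simulated) != len(reference) or not simulated:
        raise ValueError("maps must be non-empty and the same size")
    n = len(simulated)
    categories = set(simulated) | set(reference)
    # Fraction of cells where the two maps agree (percent correct / 100).
    observed = sum(s == r for s, r in zip(simulated, reference)) / n
    sim_share = {c: sum(s == c for s in simulated) / n for c in categories}
    ref_share = {c: sum(r == c for r in reference) / n for c in categories}
    # Agreement expected if the simulated quantities were located at random.
    chance = sum(sim_share[c] * ref_share[c] for c in categories)
    # Best agreement achievable with the simulated quantities.
    best = sum(min(sim_share[c], ref_share[c]) for c in categories)
    if best == chance:
        raise ValueError("kappa for location is undefined for these maps")
    return (observed - chance) / (best - chance)

# Hypothetical maps: F = forest, N = non-forest.
reference = ["F", "F", "F", "F", "F", "F", "N", "N", "N", "N"]
simulated = ["F", "F", "F", "F", "F", "N", "F", "N", "N", "N"]
percent_correct = 100 * sum(s == r for s, r in zip(simulated, reference)) / len(reference)
kappa = kappa_for_location(simulated, reference)
print(percent_correct, round(kappa, 2))
```

In this toy case eight of the ten cells agree, yet much of that agreement would also arise by chance, so the kappa is considerably lower than the percent correct suggests.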

The individual importance of factors in explaining patterns of land use for a past time period does not necessarily portend their ability to predict a future landscape. This underlines the importance of validation in the modeling process. The predictive strength of empirical patterns is enhanced or diminished in combination with other factors and must be tested before projecting into the future. All the factors analyzed for Santa Cruz, Belize, and Paraná, not just the top three, were required to derive the best possible fit (kappa-for-location) between the simulated and actual time-2 maps. In Campeche, seven (2, 3, 6, 7, 17, 27, 29) of the 11 factors analyzed were necessary to improve more than 50% over a random model, and those seven did not even include any of the PCA top three. In Chiapas, a combination of only five (3, 24, 20, 27, 28) of the seven yielded the best fit possible, and in Michoacán only two factors, slope (factor 8) and distance to water sources (factor 16), were required to produce an 88% improvement over a random model. This is not surprising in a region where steep slopes are being developed as the best land is already in production. In both Chiapas and Michoacán, only one factor of the final “best” predictive set had ranked among the top three in the PCA of past pattern: distance to roads (factor 3) and distance to year-round and seasonal water sources (factor 16), respectively.

Even though a large initial list of driving factors was included in the spatial modeling, the factors providing the best fit in validation could be reduced to a few key ones. Targeting a few key factors per activity type and region could offer potential for streamlining and standardizing the PLUC map upon which simulation of the future without-project landscape is based, thereby reducing the data requirements and costs of spatial baseline analysis. For instance, we found that in five out of six regions, distance to roads (factor 3) was included in the final set of factors, and in four out of six regions the following were required: distance to towns (factor 2), elevation (factor 7), distance to areas of some kind of earlier human use (factors 23, 24, and 25), and distance to water (factors 13, 14, 16, and 17). Distance to roads, though important elsewhere, did not enhance validation in Michoacán; this could be due to the high density of both roads and deforested areas in the region.

Potentiality for deforestation

We created a final map of potential land-use change (PLUC) (Fig. 3) in GEOMOD based on the factors for each region that yielded the best “goodness of fit” (Table 4) between the simulated and actual time-2 land-use map as measured by the kappa-for-location statistic. The PLUC map, indicating each cell’s likelihood for future development, was derived by summing the percent developed for all factors yielding the best kappa-for-location in validation. The model simulates the distribution of potential future deforestation by selecting the highest value cells (those most likely to be deforested) in these maps in descending order up to the amount of area projected to be lost over a 20-year period. We then aggregated these values into three quartiles to visualize the areas most likely (red) and least likely (blue) to come under future deforestation pressure (Fig. 3).
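The derivation of a PLUC score and the selection of cells can be sketched in a few lines. This is a simplified, unweighted illustration of the idea described above (percent developed per factor class, summed over factors, then forested cells taken in descending order). It is not GEOMOD's implementation, and the factor names, classes, and eight-cell landscape are hypothetical.

```python
def percent_developed_by_class(factor_classes, developed):
    """For one factor, percent of cells in each class already developed at time 1."""
    totals, hits = {}, {}
    for cls, dev in zip(factor_classes, developed):
        totals[cls] = totals.get(cls, 0) + 1
        hits[cls] = hits.get(cls, 0) + (1 if dev else 0)
    return {cls: 100.0 * hits[cls] / totals[cls] for cls in totals}

def pluc_scores(factors, developed):
    """Sum, per cell, the percent developed of the class it falls in for every factor."""
    scores = [0.0] * len(developed)
    for factor_classes in factors.values():
        lookup = percent_developed_by_class(factor_classes, developed)
        for i, cls in enumerate(factor_classes):
            scores[i] += lookup[cls]
    return scores

def select_cells(scores, developed, n_cells):
    """Pick the n still-forested cells with the highest PLUC score, in descending order."""
    forested = [i for i, dev in enumerate(developed) if not dev]
    ranked = sorted(forested, key=lambda i: scores[i], reverse=True)
    return ranked[:n_cells]

# Hypothetical 8-cell landscape: True = already developed at time 1.
developed = [True, True, True, False, False, False, False, False]
factors = {
    "elevation": ["low", "low", "low", "low", "mid", "mid", "high", "high"],
    "road_distance": ["near", "near", "far", "near", "near", "far", "far", "far"],
}
scores = pluc_scores(factors, developed)
next_to_clear = select_cells(scores, developed, n_cells=2)
print(scores, next_to_clear)
```

The number of cells selected would come from the separately projected rate of forest loss; the PLUC score only determines the order in which cells are taken.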

Fig. 3 Maps showing the location of potential deforestation in each region analyzed, based on GEOMOD’s calculation of each cell’s potential suitability for human use. Suitability is derived through analysis of the important biophysical/socio-demographic/economic factors determining where people have chosen to settle in the past. Suitability values are ranked into quartiles to facilitate visualization of the areas of most likely future deforestation pressure, independent of the rate of change experienced in the region. The bottom quartile is considered as having no potential of being deforested.

These mapped cells with varying potentiality for deforestation essentially provide the estimated timing (or order) and location of deforestation across a landscape over the period of projection. They therefore also provide a spatially resolved estimate, for all lands evaluated for deforestation-avoidance projects, of the relative departure from a business-as-usual baseline that a project could achieve; carbon benefits in avoided deforestation projects are estimated and measured as a positive departure from a baseline. Lands assigned a low probability of deforestation over 20 years would have a relatively low departure from the baseline, and lands with a high probability of conversion would have a higher departure, provided that project activities prevent forest conversion.

In the study regions where human populations and their infrastructure are widely dispersed across the landscape (e.g., Chiapas and Michoacán, yellow color on maps in Fig. 3), high potentiality for deforestation is generally scattered in relatively small parcels with few areas that have low potentiality. In contrast, areas with large blocks of forest with both high and low potentiality for deforestation are located in those study regions where human populations and infrastructure are not widely scattered and where a deforestation frontier is evident (e.g., Belize, Campeche, and Bolivia; Fig. 3).

Four of the study regions (Belize, Bolivia, Brazil, and Chiapas) have pilot carbon sequestration projects embedded within them (see Fig. 1A, C, E). Large blocks within these study regions have low and medium potentiality for deforestation, and only smaller areas have high potentiality, so targeting project sites to high-potentiality areas is important for demonstrating departure from the baseline for greenhouse gas mitigation programs. The GEOMOD approach was used in developing final baselines for three of the pilot projects (Belize, Bolivia, and Brazil) and took into account the patterns shown in Fig. 3.

Carbon emissions baseline

In the analyses presented so far, the focus has been on developing the land-use change component of the baseline. However, carbon sequestration projects need to develop a baseline of carbon emissions or removals by projecting the rate of land-use change over a given time period combined with carbon stock data. The benefit of using spatially explicit models to project where the change will occur is that they provide a means for matching change locations with the corresponding carbon stocks. This is particularly advantageous in areas where forest types vary across a project landscape (e.g., flooded forests and upland forests, degraded and mature forests). The “location” tells us which forest type is being cleared.

The application of this approach to an example pilot project, the Noel Kempff pilot project in Bolivia, is shown in Fig. 4 (Brown 2002b). The carbon baseline for the Bolivian pilot project is not a monotonically increasing curve but an irregular pattern of high emissions in some years and lower emissions in others (Fig. 4). This irregular pattern is caused by two main factors: (1) the deforestation is modeled within a larger landscape and, in any given year, the total amount of forest projected to be lost does not all occur within the project boundaries because not all the most suitable land exists there, and (2) the pilot project area had six different forest strata with a corresponding range of carbon stocks, and in any given year forest with higher or lower carbon stocks could be cleared. Thus in this example, the rate of deforestation and the identification of lands suitable for conversion are established in the regional context, but the actual baseline is developed at the project scale, where the area cleared within the project area is matched to the carbon stocks measured in the same area. The carbon baseline is therefore project specific.
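A minimal sketch shows why matching cleared area to stratum-specific carbon stocks produces an irregular annual baseline. The strata names, carbon stocks, and cleared areas below are hypothetical and are not the Noel Kempff data. The sketch also makes the simplifying assumption that the full carbon stock of a cleared hectare is counted as emitted in the year of clearing.

```python
def annual_baseline_emissions(cleared_ha_by_year, carbon_stock_mg_per_ha):
    """Annual emissions (Mg C) = sum over strata of area cleared (ha) x carbon stock (Mg C/ha)."""
    emissions = []
    for cleared in cleared_ha_by_year:
        emissions.append(sum(area * carbon_stock_mg_per_ha[stratum]
                             for stratum, area in cleared.items()))
    return emissions

# Hypothetical carbon stocks for two forest strata (Mg C/ha).
stocks = {"upland": 200.0, "flooded": 100.0}
# Hypothetical area cleared inside the project boundary in each of three years (ha).
cleared_by_year = [
    {"upland": 10, "flooded": 0},
    {"upland": 0, "flooded": 30},
    {"upland": 5, "flooded": 5},
]
yearly = annual_baseline_emissions(cleared_by_year, stocks)
print(yearly)
```

Both the area falling inside the project boundary and the stratum being cleared change from year to year, so the annual emissions rise and fall rather than following a smooth curve.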

Fig. 4 Carbon baseline of annual net carbon emissions for a pilot carbon sequestration project, the Noel Kempff project in Bolivia.

If the baseline projection for the Noel Kempff pilot project were based on the other two models and used in combination with an area-weighted carbon stock for the project area (147.6 Mg C/ha; Brown 2002a), the projected baseline would be a monotonically increasing curve with total carbon emissions of 11.54 Tg C for the FAC model and 0.183 Tg C for the LUCS model over the 20-year period (applying the percent deforestation rate to the area of the project). The total emissions from GEOMOD (summed annual emissions from Fig. 4) would be 1.05 Tg C over the same 20-year period.

If the carbon benefits of stopping deforestation are estimated as the difference between the baseline emissions and the “with-project” emissions (essentially zero), as is typically done (Brown et al. 2000a, b), then the benefits from using GEOMOD would be 1.05 Tg C for the 20 years, with either an order of magnitude less using LUCS or an order of magnitude more using FAC. Thus, the choice of model used to make the projections can have a major effect on the potential carbon benefits.
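The arithmetic behind this comparison can be checked with a short sketch that uses only the totals and the area-weighted carbon stock quoted above. The helper names are illustrative. The implied cleared area is a back-calculation under the uniform-stock assumption, so it is computed only for FAC and LUCS; the GEOMOD total was built from stratum-specific stocks.

```python
CARBON_STOCK_MG_PER_HA = 147.6   # area-weighted stock quoted in the text
WITH_PROJECT_TG = 0.0            # with-project emissions, "essentially zero" in the text
baseline_tg = {"FAC": 11.54, "GEOMOD": 1.05, "LUCS": 0.183}  # 20-year totals (Tg C)

def carbon_benefit_tg(baseline, with_project=WITH_PROJECT_TG):
    """Carbon benefit = baseline emissions minus with-project emissions."""
    return baseline - with_project

def implied_area_cleared_ha(emissions_tg, stock=CARBON_STOCK_MG_PER_HA):
    """Area whose clearing would release the given emissions at a uniform stock (1 Tg = 1e6 Mg)."""
    return emissions_tg * 1e6 / stock

benefits = {model: carbon_benefit_tg(b) for model, b in baseline_tg.items()}
ratio_to_geomod = {model: benefits[model] / benefits["GEOMOD"] for model in benefits}
implied_area = {model: implied_area_cleared_ha(baseline_tg[model]) for model in ("FAC", "LUCS")}

for model in baseline_tg:
    print(model, benefits[model], round(ratio_to_geomod[model], 2))
print({model: round(area) for model, area in implied_area.items()})
```

With these inputs the FAC total is about 11 times the GEOMOD total and the LUCS total is about 0.17 times it, which is the spread the comparison above refers to.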

Strategy for generating deforestation baselines

A large opportunity to mitigate GHG emissions is lost without the inclusion of projects designed to avoid deforestation and improve the sustainability of agriculture in developing countries (Klooster and Masera 2000; Niles et al. 2002). Sathaye et al. (2006) estimate that under quite moderate carbon price scenarios, by 2100, the global cumulative carbon benefits from avoided deforestation are 51–78% of all potential in the land use sector. Many developing countries continue to be interested in forest conservation projects because of their potential to slow or even reverse high rates of deforestation and to conserve biodiversity and other natural resources. In this section, we propose, based on the work presented here and the lessons learned, a common methodology to advance the development of credible baselines for deforestation. This approach also may be generally applicable to other climate change mitigation activities involving land-use change, like afforestation, reforestation, and restoration of degraded forests, but we have not assessed them here.

For an avoided deforestation project to produce credible carbon benefits, the baseline needs to demonstrate that the area was under threat of deforestation. Large areas of tropical forests are often not under threat of deforestation and would therefore not be eligible for such a project. An analysis of deforestation threats using spatial models is suited to this task. For the six areas analyzed here, we have generated maps showing the areas of most immediate threat scaled from high to low potentiality for deforestation (Fig. 3). Projects intended to stop deforestation would make a measurable difference to carbon emissions in areas of high to medium potentiality. An additional advantage of using potential land-use change (PLUC) maps as shown in Fig. 3 is that other development criteria could be overlain on the map to help select areas that meet multiple goals. For example, maps of ranges of threatened or endangered species, maps of poverty indicators, or maps of critical watersheds could be overlain on the PLUC maps, and the intersection of other development goals with the highest threat for deforestation could be identified.

The temporal dimension for avoided deforestation baselines is a significant analytic and policy issue—how far into the future can, and should, the baseline be projected? Rates and patterns of land-use change are subject to biophysical factors regulating human use of the land that change marginally over time, but socioeconomic and political factors are more dynamic and less predictable through time. Thus, the farther business-as-usual baseline scenarios are projected into the future, the less reliable they are likely to be. We suggest that a 10-year period is a reasonable time frame for projecting baselines forward based on the following: (a) historical data are often collected over the decadal time frame (e.g., population data), and may indicate future projections over the same time period given the dynamics of development and growth in most countries; (b) for certain regions in our analyses there tends to be some convergence among the model projections during the first 10-year period; and (c) from a policy perspective, a decade is two Kyoto commitment periods (of 5 years), and roughly two political election cycles (averaging 4–6 years generally, varying by political system). We propose a projection timeframe for land-use changes and associated carbon benefits equal to the proposed project length (currently set at 20–60 years under the guidelines developed for the CDM), but a baseline locked in for only the first 10-year period, and then reviewed and adjusted if needed throughout the project duration. Any revisions to a baseline are likely to affect the carbon credits ascribed to the project in the subsequent time period because the projected rate of deforestation could change. New projects implemented in a region during the second 10-year period would of course use the revised baseline.

Combining the baseline duration issues with the work presented here, we propose a methodology for developing a baseline projection for an avoided deforestation project that involves three major steps comprising six tasks (below):

  • Step 1: Develop historic land-use change and deforestation estimate:

  • Task 1: Determine analytic domain and obtain historic data:

    • Delineate the approximate regional analytic domain scale:

      • About 5–7 times the area of large projects (e.g., greater than several hundreds of thousand ha; magnitude and thresholds recommended will vary with regional conditions), or 20–40 times the area of smaller projects (e.g., tens of thousands of ha or less; will vary by regional conditions).

    • Obtain historic data on land use and socioeconomic characteristics for approximately the past 10 years, ideally including two recent remote sensing imagery data sets at least 5 years apart, and identify potential major baseline drivers.

  • Task 2: Analyze candidate historic baseline drivers and identify major drivers:

    • Analyze satellite imagery to produce maps of land use or obtain existing digital, satellite imagery-based, land-use maps for location of deforestation. Analyze candidate baseline drivers (e.g., see Table 4 for key ones) to find the three to four key drivers that best describe patterns of historic land-use change. Weight these drivers according to their importance in the time one or calibration period. Select those that produce upon simulation the best match with a time two or validation period map.

  • Step 2: Generate baseline projection for deforestation

  • Task 4: Use key drivers to project potential land-use change (PLUC):

    • Use the three to four key drivers of land-use change to generate a potential land-use change (PLUC) map, or a map of areas projected to have high to low risk for deforestation. A number of spatial models, such as GEOMOD or others locally available (e.g., Castillo-Santiago et al. 2006; De Jong et al. 2005), can be used to create the PLUC map. Potential deforestation can be divided into quartiles, from high risk to low risk.

  • Task 5: Project rates of deforestation using the PLUC map:

    • A 10-year baseline projection for project GHG accounting and crediting purposes is suggested, as baseline projections beyond a 10-year period are not likely to be realistic because rates of land-use change are subject to many factors that are difficult to predict over the long term.

    • Project rates of land-use change over a 10-year period. Initially for this step, it would make sense to employ change detection of satellite imagery, such as that used by GEOMOD in five out of six cases, because such images would already be on hand as part of the data base for the PLUC map. However, if at least two such images are not available (two images will give a linear projection only, but for a 10-year period this may be adequate) and the pattern of existing deforestation is dispersed across the landscape, then the simpler FAC model could be used.

    • Assess the relative carbon impacts of the mitigation actions in the proposed project case area and activities, by land parcel. Each parcel combines land and socioeconomic characteristics with proposed mitigation activity, overlain on projected high potential for deforestation (and thus relatively high carbon impact of project activities), to low potential (thus low carbon impact, as these lands are unlikely to be converted). For example, one might apply the total estimated rate of deforestation, expressed as a percent of the initial forest cover, to areas in the potential project area mapped as high potential, some discounted rate to the medium potential (how to discount would likely be a policy decision), and assume no deforestation in areas with low potential. This step would result in a projection of the baseline rate and location of deforestation.

  • Task 6: Combine PLUC map with projected rates of deforestation and carbon stock estimates and make baseline projections

    • Estimate the carbon stocks in the forests being cleared from measurements in the potential project area or from the literature depending on the status of project development. If it is only a feasibility study, then literature data or limited field studies would suffice, but if the project is beyond a feasibility stage, more detailed measurements and analyses of the carbon stocks would be needed (e.g., see Brown 2002a).

    • Combine the rate of forest loss over the 10-year period with carbon stock data to produce the deforestation baseline, as shown for example in Fig. 4. If a potential avoided deforestation project were at an implementation stage, the GEOMOD model could be used to simulate where the land is likely to change in the project area using the rate data, and subsequently to match these locations with the corresponding carbon stock data.

    • For reporting of estimated GHG benefits, the project could submit its baseline driver assumptions to a GHG registry or marketing program for review of their reasonableness and for some form of certification of these assumptions, the baseline they produce, and hence the estimated project GHG benefits.

  • Step 3: At agreed interval (e.g., +10 years), review and re-assess baseline:

    • Because a 10-year baseline might be considered to be short and interest is in longer-term projects, it could be envisaged that the spatial PLUC map and estimates of rates of land-use change would be redone on a 10-year cycle. This would allow for the rates and changes in spatial drivers (e.g., new roads, new communities, new protected area, etc.) to be incorporated into the derivation of the new PLUC map and for adjustments in the estimation of the rate of land-use change and carbon stocks.
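Tasks 5 and 6 above can be sketched as a short calculation: derive a linear rate of forest loss from two land-use maps, apply it to the forest in each potentiality class with a class-specific discount, and convert the cleared area to carbon. All numbers below are hypothetical. The discount applied to the medium class in particular is a placeholder for what the text describes as a policy decision, and a single carbon stock is used where a real project would use stratum-specific measurements.

```python
def annual_loss_rate(forest_ha_t1, forest_ha_t2, years_apart):
    """Linear annual loss as a fraction of time-1 forest area, from two land-use maps."""
    return (forest_ha_t1 - forest_ha_t2) / forest_ha_t1 / years_apart

def baseline_emissions_mg(forest_ha_by_class, rate, years, discount, stock_mg_per_ha):
    """Apply the rate, scaled per potentiality class, and convert cleared area to carbon (Mg C)."""
    cleared = {cls: area * rate * years * discount[cls]
               for cls, area in forest_ha_by_class.items()}
    return cleared, sum(cleared.values()) * stock_mg_per_ha

# Hypothetical regional forest area from two images taken 5 years apart (ha).
rate = annual_loss_rate(forest_ha_t1=100_000, forest_ha_t2=95_000, years_apart=5)

# Hypothetical forest area in the candidate project area, by PLUC class (ha).
forest_by_class = {"high": 10_000, "medium": 20_000, "low": 30_000}
# Full rate for high potential, an assumed 50% discount for medium, none for low.
discount = {"high": 1.0, "medium": 0.5, "low": 0.0}

cleared, emissions = baseline_emissions_mg(
    forest_by_class, rate, years=10, discount=discount, stock_mg_per_ha=150.0
)
print(round(rate, 4), cleared, round(emissions))
```

Re-running the same calculation with updated maps and drivers at the agreed review interval corresponds to Step 3.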

Greenhouse gas mitigation programs or market investors in GHG offsets are likely to require periodic review of the reasonableness of the project baseline under changing market, commodity product, population, other socioeconomic factors, natural disaster, or other circumstances in the project’s region. An agreed period for review, say 10 years out, would provide certainty to investors that the baseline would be in place for at least that time, yet would allow baseline updating if conditions warrant.

Operationally, the analysis of baseline conditions and assumptions about the values of baseline drivers could be proposed by a project, and reviewed and certified by a greenhouse gas registry program for 10 years. After 10 years, or another agreed number of years, the baseline conditions and drivers would be reviewed by the project and program and, if still valid, proposed unchanged for another 10 years. An agreed set of baseline conditions and drivers could be identified in advance that, if they change by an agreed percentage or amount, would automatically trigger a required revision to a baseline. If no such changes trigger a revision, then the original baseline driver values would be re-certified for the next period. Candidate conditions and triggers for revision might vary by the key baseline drivers for a given mitigation activity and region.
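The trigger mechanism described here could be expressed as a simple check at each review. The sketch below is one possible reading, using percentage thresholds only. The driver names, certified values, and thresholds are hypothetical and would in practice be agreed between the project and the registry program.

```python
def drivers_triggering_revision(certified, current, thresholds_pct):
    """Return the drivers whose relative change since certification meets the agreed threshold."""
    triggered = []
    for driver, threshold in thresholds_pct.items():
        old, new = certified[driver], current[driver]
        if old == 0:
            raise ValueError(f"certified value for {driver!r} is zero; use an absolute threshold")
        change_pct = abs(new - old) / abs(old) * 100.0
        if change_pct >= threshold:
            triggered.append(driver)
    return triggered

# Hypothetical driver values at certification and at the 10-year review.
certified = {"population": 100_000, "road_km": 500}
current = {"population": 108_000, "road_km": 650}
thresholds_pct = {"population": 15.0, "road_km": 20.0}

triggered = drivers_triggering_revision(certified, current, thresholds_pct)
action = "revise baseline" if triggered else "re-certify original baseline"
print(triggered, action)
```

An empty result corresponds to re-certifying the original driver values for the next period; any triggered driver corresponds to a required revision.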

The steps and tasks outlined above clearly represent a fairly substantial modeling and analysis effort, which raises the question of how such an effort could be supported. We propose that it could be supported by traditional overseas development assistance and performed by relevant agencies within a country, particularly if other land-use changes were also modeled and analyzed. Not only would such analyses provide regional baselines for GHG mitigation projects and position countries well to participate in the nascent carbon market, but they would also provide information to assist in the identification of other carbon sequestration opportunities. The approach outlined above could also help a country identify its potential GHG emission liabilities into the future and provide an opportunity for the country to plan alternative development pathways.