Introduction

Besides pure scientific interest, the assessment of carbon sinks and sources of forests has gained high policy relevance in the course of the implementation of the Kyoto Protocol (UNFCCC 1997). However, an accurate estimate of the contribution of forest ecosystems to the global carbon cycle remains a major challenge, as it is difficult to directly measure carbon pools or fluxes over large areas. For estimating carbon dynamics, process-based ecosystem models are of particular interest because they allow not only for the estimation of the carbon budget under various environmental conditions, but also for the interpretation and quantification of the possible causes of changes in carbon stocks as a result of environmental changes (cf. White et al. 1998; Churkina et al. 2003). The potential error sources of these models can be separated into errors in (1) model theory and equations, (2) the forcing data, primarily climatic data, (3) the initial conditions of the model, and (4) the parameter estimates.

In the present study, we tested the process-based ecosystem model Biome-BGC (Running and Hunt 1993; Hunt et al. 1996; Thornton 1998) with the default parameterization at a number of specific central European forest sites. Using the model without parameter adaptation is a test of whether physiology can be represented in a general manner, while knowing that a lot of site-specific variability of ecological processes exists. Therefore, it is as much a test of the model parameterization as of the model theory and equations.

Biome-BGC has already been applied in a range of studies for assessing carbon fluxes in forests (e.g., Melillo et al. 1995; Hunt et al. 1996; Churkina and Running 2000). Biome-BGC originates from a coniferous forest ecosystem model, Forest-BGC (Running and Coughlan 1988; Running and Gower 1991), and it has undergone extensive validation of several water and carbon cycle components in North American and European forest ecosystems (e.g., Nemani and Running 1989; Hunt et al. 1991; Running 1994; White et al. 1997, 2000; Cienciala et al. 1998; Cramer et al. 1999; Law et al. 2001, 2003; Thornton et al. 2002; Churkina et al. 2003). However, these previous evaluation studies have all focused on flux simulations.

In contrast, the Biome-BGC evaluation presented in this study includes an analysis of carbon fluxes as well as of carbon pools. The model evaluation was based on data from central European forests comprising an extended altitudinal gradient, which allowed for tests under a wide range of climatic conditions. We tested the performance of different model components at various time scales and examined the sensitivity of Biome-BGC to climate and soil parameters. First, we examined the sensitivity of simulated long-term aboveground carbon storage (80-year simulation) to changes in climate and soil characteristics along an altitudinal gradient in the Dischma valley (Switzerland, subalpine to alpine zone). Then, we simulated medium-term (30 years) changes in forest growth and aboveground carbon storage at 19 forest yield research plots ranging from the colline to the upper subalpine zone in Switzerland and compared these values with long-term measurements and with simulation results from a semi-empirical forest growth model, SILVA (Pretzsch et al. 2002; Schmid et al. 2006). Finally, in a 4-year simulation, we compared simulated annual leaf area index (LAI), daily net ecosystem exchange (NEE), and daily actual evapotranspiration (AET) with measured data from two central European research sites of the EUROFLUX project (Aubinet et al. 2000; Valentini et al. 2001; Valentini 2003).

Materials and methods

Model description

We used the ecosystem process model Biome-BGC version 4.1.2 described in Thornton et al. (2002), with the minor modifications described in the Biome-BGC User’s Guide, version 4.1.2 (P.E. Thornton, personal communication). Biome-BGC is a biogeochemical model that simulates above- and belowground carbon, water, and nitrogen cycles of different vegetation types. The model is strongly controlled by LAI and climate.

Regarding spatial structure, Biome-BGC is based on some simplifying assumptions. Trees are not defined individually, but rather the whole ecosystem (above- and belowground parts) is split up into the different pools that are relevant for the carbon, water, and nitrogen cycles. The vertical structure of the model includes a differentiation into a number of layers between the rooting system and the vegetation canopy, whereas the ecosystem is assumed to be horizontally homogeneous. Additionally, the model does not consider tree species, but forests are divided into four different plant functional types: evergreen and deciduous needleleaf forest, and evergreen and deciduous broadleaf forest. Due to the lack of horizontal structure, Biome-BGC provides point estimates of carbon, water, and nitrogen pools.

The temporal framework of the Biome-BGC model is based on a dual discrete time step approach (Thornton 1998). Most ecosystem processes are calculated on a daily basis (e.g., soil water balance, photosynthesis, new leaf and fine root growth, litterfall, and carbon and nitrogen dynamics in the litter and soil layer); they are driven by daily values of temperature, precipitation, vapor pressure deficit, and radiation. However, a few processes—including the determination of phenological timing and the allocation of carbon and nitrogen to the growth of new tissue—are simulated on an annual time step.

Study sites

Dischma valley (carbon storage along an altitudinal gradient)

We simulated aboveground carbon storage along an altitudinal gradient in the Dischma valley (46°46′N, 9°53′E) located in the eastern part of the Swiss Alps. The valley runs from south–southeast to north–northwest and has a continental to moderate central-alpine climate (Walder 1983; Riedo et al. 2001). Its elevation extends from 1,500 to 3,200 m a.s.l. While at the valley bottom hay meadows and pastures are the predominant vegetation type, the hillslopes are primarily covered by spruce-dominated forests up to the timberline (Hefti and Bühler 1986). Above the timberline (located at approximately 2,100 m a.s.l. under current land-use conditions), dwarf shrubs and alpine tundra dominate the landscape.

Within the valley, we chose two transects (at the valley entrance and in the middle of the valley), with two slopes each [east–northeast (ENE) and west–southwest (WSW)]. We applied Biome-BGC at every 10-m elevation interval along each of the four slopes. Information on soil, stand characteristics, and management history was available from the “Man and the Biosphere” (MAB) project (Krause 1986) (Table 1), in which soil and stand data had been collected in the Dischma valley during the years 1982 and 1983.

Table 1 Site characteristics of the four slopes in the Dischma valley along which aboveground carbon storage was simulated (Krause 1986)

Forest yield research plots (medium-term carbon dynamics)

We used 19 long-term forest yield research plots of the Swiss Federal Institute of Forest, Snow, and Landscape Research (WSL) to estimate aboveground carbon; we compared simulated and measured carbon storage in aboveground living plants (subsequently called ‘aboveground carbon’) during a 30-year period for these sites. For each research plot, detailed information about single-tree growth, single-tree mortality, and forest management was collected at intervals ranging from 1 to 13 years. The general characteristics of the sites used in this study are summarized in Table 2. The sites range from the colline to the upper subalpine zone in Switzerland and include three different plant functional types. The majority of the stands were between 60 and 120 years old at the beginning of the simulation period, and all stands are managed, the only exception being the site Horgen (no. 8). Plot area varies between 0.2 and 1.0 ha. Soil characteristics (soil texture and soil depth) were derived from the soil suitability map of Switzerland (BFS 1992).

Table 2 Site characteristics of the 19 forest yield research plots and location of the meteorological station used to generate the climate of the plot

EUROFLUX sites (daily NEE and AET)

To compare daily model estimates of NEE and AET with measurements, we used datasets provided by the EUROFLUX project (Aubinet et al. 2000; Valentini et al. 2001; Valentini 2003). These datasets were available for the years 1996–1999. EUROFLUX uses a standardized protocol and the eddy covariance methodology (Leuning and Moncrieff 1990), an established technique, to measure the long-term exchanges of CO2, water vapour, and sensible heat between vegetated surfaces and the atmosphere at various sites across Europe. A complete description of the methodology and the instrumentation used at these sites is given by Moncrieff et al. (1997) and Aubinet et al. (2000). For this study, we used data from the two central European EUROFLUX sites that are closest to the Alpine region, i.e., Sarrebourg-Hesse (eastern France) and Bayreuth (southern Germany; cf. Table 3). The forests at both sites are of natural origin and are managed. Data on the management history and soil conditions were available from the EUROFLUX network.

Table 3 Site characteristics of the two EUROFLUX sites

Model input data and parameters

Meteorological and environmental data

The required daily climate input data were generated for each site separately. For the Swiss test sites (the forest yield research plots and the altitudinal gradient in the Dischma valley), daily climate measurements from a nearby meteorological station of MeteoSwiss (the national weather service of Switzerland) were extrapolated to the site by means of the weather generator MTCLIM 4.3 (Running et al. 1987; Thornton and Running 1999). Based on daily values of at least maximum and minimum air temperature and total precipitation, MTCLIM generated daily values of air temperature, total precipitation, total incoming radiation, and daylight average humidity as required for Biome-BGC, and adjusted them to the location of interest, correcting for elevation, slope, and aspect differences between the meteorological station and the location. For the forest yield research sites, we applied the default MTCLIM lapse rates for minimum and maximum air temperature (−3.0 and −6.0°C km−1, respectively), since at these sites altitudinal differences between the selected meteorological station and the site of interest were small. In the case of the altitudinal transect in the Dischma valley, however, the altitudinal difference between the meteorological station and the highest location amounted to nearly 1,000 m. Therefore, we derived local lapse rates of minimum and maximum air temperature from two meteorological stations (Davos, 1,560 m a.s.l., and Weissfluhjoch, 2,590 m a.s.l.), resulting in local lapse rates of −3.3 and −6.9°C km−1, respectively. These were then used by MTCLIM to extrapolate climate values from one single meteorological station to different locations along the altitudinal gradient.
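The derivation and use of a local lapse rate can be illustrated with a minimal sketch. It is not the MTCLIM implementation; the function names and the example temperatures are hypothetical and serve illustration only, whereas the two station elevations are those given above for Davos and Weissfluhjoch.

```python
def local_lapse_rate(temp_low, temp_high, elev_low_m, elev_high_m):
    """Temperature change per km of elevation gain (deg C km^-1) between two stations."""
    if elev_high_m == elev_low_m:
        raise ValueError("stations must differ in elevation")
    return (temp_high - temp_low) / ((elev_high_m - elev_low_m) / 1000.0)


def extrapolate_temperature(temp_station, elev_station_m, elev_site_m, lapse_rate):
    """Shift a station temperature to a site elevation using a linear lapse rate."""
    return temp_station + lapse_rate * (elev_site_m - elev_station_m) / 1000.0


# Hypothetical mean maximum temperatures (deg C) at the lower and upper station.
tmax_davos, tmax_weissfluhjoch = 8.0, 0.9
rate = local_lapse_rate(tmax_davos, tmax_weissfluhjoch, 1560.0, 2590.0)

# Extrapolate the lower-station value to a hypothetical simulation point at 2,000 m a.s.l.
tmax_site = extrapolate_temperature(tmax_davos, 1560.0, 2000.0, rate)
```

A negative lapse rate means that temperature decreases with elevation, so a site above the station receives a lower temperature than the station value.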

For the simulation at the two EUROFLUX sites, which are situated outside Switzerland, the surface weather data were drawn from two different databases. During the EUROFLUX measurement period (1996–1999) we used the daily climate values measured directly at the EUROFLUX sites (Granier 2003; Tenhunen and Schulze 2003). For the years prior to the EUROFLUX measurements, we used a climate dataset provided by Mitchell et al. (2004). It comprises gridded monthly climate variables for Europe at a spatial resolution of 10′ × 10′ covering the period from 1901 to 2000. The data include temperature, diurnal temperature range, precipitation, vapor pressure, and cloud cover. For both EUROFLUX sites under consideration, climate data from a nearby raster point were used and corrected for differences in elevation. Finally, the monthly data were downscaled linearly to daily data as required by Biome-BGC.
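One possible way to downscale monthly means linearly to daily values is sketched below. The exact procedure used in the study is not specified here, so the following choices are assumptions of this sketch: monthly means are anchored at the middle of each month of a 365-day year, daily values are interpolated linearly between neighbouring anchors, and the year is wrapped onto itself at both ends (for a multi-year series, the actual neighbouring months would be used instead). The function name `monthly_to_daily` is hypothetical.

```python
from bisect import bisect_right

# Mid-month day of year for a 365-day year (assumed anchor points of the monthly means).
MID_MONTH_DOY = [16.0, 45.5, 75.0, 105.5, 136.0, 166.5,
                 197.0, 228.0, 258.5, 289.0, 319.5, 350.0]


def monthly_to_daily(monthly_values):
    """Linearly interpolate 12 monthly means to 365 daily values."""
    if len(monthly_values) != 12:
        raise ValueError("expected 12 monthly values")
    # Pad with December before the year and January after it so that
    # every day of the year lies between two anchor points.
    xs = [MID_MONTH_DOY[-1] - 365.0] + MID_MONTH_DOY + [MID_MONTH_DOY[0] + 365.0]
    ys = [monthly_values[-1]] + list(monthly_values) + [monthly_values[0]]
    daily = []
    for doy in range(1, 366):
        i = bisect_right(xs, doy) - 1  # anchor at or before this day
        weight = (doy - xs[i]) / (xs[i + 1] - xs[i])
        daily.append(ys[i] + weight * (ys[i + 1] - ys[i]))
    return daily
```

Such an interpolation suits smoothly varying variables such as temperature; it reproduces the monthly value on the anchor day and does not add day-to-day variability.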

For the simulations, transient scenarios of atmospheric CO2 concentration and nitrogen deposition were used. Atmospheric CO2 concentrations show a continuous increase from 296 ppm in 1900 to 373 ppm in 2002 (Erhard et al. 2005), whereas nitrogen deposition was assumed to remain at a constant rate of 2 kg ha−1 up to the year 1949 (Holland et al. 1999) and thereafter to increase linearly up to the current value. For the simulations in the Dischma valley and at the forest yield research plots, the current annual nitrogen deposition rate was taken from the nitrogen deposition map of Switzerland (value of the year 1998; BUWAL 1996; Rihm and Kurz 2001) (Tables 1, 2). For the EUROFLUX simulations, data on nitrogen deposition were available from the EUROFLUX network (Table 3).
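The nitrogen deposition scenario described above can be sketched as a piecewise-linear function of the year. This is a minimal illustration, not the code used in the study; the function name, the default arguments, and the choice to hold the current value constant after the reference year are assumptions of this sketch. The reference year 1998 corresponds to the deposition map used for the Swiss sites, and `current_value` stands for the site-specific current deposition rate.

```python
def nitrogen_deposition(year, current_value, baseline=2.0,
                        ramp_start=1949, reference_year=1998):
    """Annual N deposition (kg N ha^-1): constant baseline, then a linear increase."""
    if year <= ramp_start:
        return baseline
    if year >= reference_year:
        return current_value  # assumption: held constant after the reference year
    fraction = (year - ramp_start) / (reference_year - ramp_start)
    return baseline + fraction * (current_value - baseline)


# Scenario for a hypothetical site with a current deposition of 20 kg N ha^-1.
scenario = {year: nitrogen_deposition(year, 20.0) for year in range(1900, 2003)}
```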

Ecophysiological parameters

The following plant functional types that are used in Biome-BGC occurred at the test sites: evergreen needleleaf forest, deciduous needleleaf forest, and deciduous broadleaf forest. In the model, each of these plant functional types is defined by a set of 44 ecophysiological characteristics that do not change over time (parameters). For the present study, we used the default parameterization of model version 4.1.2, except for the annual whole-plant mortality fraction. The default value of this parameter (0.005 year−1) was replaced by values from the Swiss national forest inventory (NFI) (Brassel and Brändli 1999) (Table 4) depending on the vegetation zone and on forest management. This mortality rate includes natural tree mortality and mortality due to disturbances such as windthrow. The vegetation zones are defined according to Ott et al. (1997). For the simulation at the forest yield research plots, where mortality had been recorded in detail, we used the measured values instead of those in Table 4.

Table 4 Whole-plant mortality fraction (year−1) for the different vegetation zones based on Swiss NFI data (Brassel and Brändli 1999)

Since the model does not currently simulate mixed forest stands, we divided sites with mixed-species stands into different plots according to the basal area fraction covered by the respective plant functional type, and simulated them separately. In this study, the separation of mixed forest stands into their components led to better agreement with the measured dynamics than including only the dominant forest type. Nevertheless, the potential implications of this separation on stand growth must be considered (see also the discussion of the importance of interacting plant functional types by Law et al. 2001).

Simulation experiments and analysis of output data

In Biome-BGC, the ecosystem is represented by a number of carbon, water, and nitrogen pools, the model’s state variables. Due to a lack of measured data for the initialization of some of the state variables (mainly the belowground carbon and nitrogen pools), we performed model simulations for their initialization (so-called “spin-up runs”). The spin-up runs were used to bring the state variables into steady state with respect to the site’s climate and the specified plant functional type. For the simulation in the Dischma valley and at the EUROFLUX sites, we aimed at reproducing preindustrial environmental conditions during the spin-up run. Therefore, the atmospheric CO2 concentration was set to 296 ppm, approximating the level at the end of the nineteenth century (Erhard et al. 2005), and for the annual nitrogen deposition we used 2 kg N ha−1 (Holland et al. 1999). For the spin-up run, we used climate data from the years 1901 to 1930. In the case of the forest yield research plots, where climate measurements from this period often were lacking, the same environmental conditions were used as for the simulation run. After the spin-up run, the model runs and output analyses described below were performed (a summary of these simulations is given in Table 5).

Table 5 Summary of the sequence of the different model runs and the data analyses

Carbon storage along an altitudinal gradient

The spin-up run, conducted at every 10 m of elevation along each of the four slopes in the Dischma valley between 1,500 and 2,500 m a.s.l., included timber harvesting of 10% of the standing biomass every tenth year, which approximates the historic management in the Dischma valley as derived from expert knowledge and data from the Swiss NFI (Mahrer 1989; Brassel and Brändli 1999). We assumed that the state at the end of this spin-up run represents the year 1900. Then, we simulated ecosystem dynamics from 1901 to 1980. During the simulation, a timber harvest of 5% of the standing biomass was performed every tenth year, as derived from data of the MAB project (Hefti and Bühler 1986). We applied this harvesting intensity to all simulation points at all elevations, as the altitudinal dependence of harvesting intensity is unknown. To test the sensitivity of model outputs to harvesting, the model simulations were also performed without timber harvest in the twentieth century, but based on the same initialization.

The endpoints of these simulations (year 1980) along the four slopes—once with and once without harvesting—provided eight altitudinal gradients of aboveground carbon. They were compared to the observed aboveground biomass gradient that had been calculated by Schumacher (2004) based on measurements from the MAB project (Hefti and Bühler 1986). This measured gradient represents the mean over the entire Dischma valley. To compare the measured with the simulated gradient, we applied a factor of 0.5 to convert biomass to carbon (IPCC 2003).

Medium-term carbon dynamics

To initialize aboveground carbon (and nitrogen) pools of the model after the spin-up run at the 19 forest yield research plots, we converted the measured single-tree data to aboveground carbon. This was done using allometric functions based on Swiss high-resolution data (Perruchoud et al. 1999; Kaufmann 2001), the wood densities described in Körner et al. (1993), and again a factor of 0.5 to convert biomass to carbon (IPCC 2003). Soil carbon values as obtained from the spin-up run were left unchanged. Then, we performed a 30-year simulation run including timber harvest and mortality based on the history of management practices and tree mortality at each site. Depending on the years when measurements were made, the starting year of model simulations varied between 1950 and 1972 for the different sites. To obtain simulated aboveground carbon increments, we subtracted the initial aboveground carbon from the final aboveground carbon as simulated by Biome-BGC at the end of the simulation period, and compared this increment with the measured increment. Finally, the aboveground carbon increments simulated by Biome-BGC were compared to the simulation results of the semi-empirical single-tree model SILVA 2.2 (Pretzsch et al. 2002) applied at the same sites and for the same period. For the SILVA simulations, we used the default model parameterization of the dominant European tree species and the model was initialized with climate and soil data from the test sites. The details of these SILVA simulation experiments were described by Schmid et al. (2006).

Daily NEE and AET

For the simulations at the two EUROFLUX sites, we used a combination of results from the spin-up run for the initialization as well as our knowledge of each site’s management history. The spin-up run was followed by a model run representing the development of the years 1900–1999 including site-specific timber harvesting amounts as provided by the EUROFLUX network.

The results of these model runs were used to analyze the fluxes of carbon and water. To this end, we compared simulated values of NEE and AET with eddy covariance flux measurements for the period from 1996 to 1999. NEE is the net exchange of carbon between the biosphere and the atmosphere. NEE provides the size and direction of the net carbon flux, which can be positive (a flux from the atmosphere to the biosphere) or negative; at equilibrium it would be zero. NEE is a critical variable to consider for long-term (decadal) carbon storage (IGBP et al. 1998). In contrast, AET can serve as an indicator of water balance in the ecosystem, since it represents the net flux of water from the land surface back to the atmosphere. AET includes canopy and soil evaporation, plant transpiration, and snow sublimation. Additionally, the simulated annual LAI from 1996 to 1999 was compared with measured data. LAI is a measure of canopy density and size and is defined as the projected leaf area per unit ground area. Finally, we tested the sensitivity of NEE and AET to changes in the canopy water interception coefficient, which is a crucial factor in the modeled water cycle.

Data analysis

For the analysis of the simulated medium-term increment of aboveground carbon, we used the relative difference between the simulated (x) and the measured (X) increment of each forest yield research plot. Based on these values, we performed two-sided Wilcoxon signed rank tests for dependent data samples with a significance level of 0.05. Moreover, the absolute bias (ē) and the relative bias (ē%) of the relative differences were calculated over i = 1, ..., n forest yield research plots (Eqs. 1, 2).

$$ \bar{e} = \frac{\sum\limits_{i = 1}^{n} \left( x_{i} - X_{i} \right)}{n} $$
(1)
$$ \bar{e}\% = \frac{100 \times \bar{e}}{\bar{X}}. $$
(2)
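Equations 1 and 2 translate directly into a few lines of code. The following is a minimal sketch with hypothetical function names; `simulated` corresponds to the values x and `measured` to the values X, and the example numbers are illustrative only.

```python
def bias(simulated, measured):
    """Mean difference between simulated (x) and measured (X) values (Eq. 1)."""
    if len(simulated) != len(measured) or len(simulated) == 0:
        raise ValueError("need two non-empty series of equal length")
    return sum(x - X for x, X in zip(simulated, measured)) / len(simulated)


def relative_bias(simulated, measured):
    """Bias expressed as a percentage of the mean measured value (Eq. 2)."""
    mean_measured = sum(measured) / len(measured)
    return 100.0 * bias(simulated, measured) / mean_measured


# Illustrative values only, not data from this study.
example_simulated = [11.0, 13.0]
example_measured = [10.0, 10.0]
example_bias = bias(example_simulated, example_measured)
example_relative_bias = relative_bias(example_simulated, example_measured)
```

A positive bias indicates that the model overestimates the measured values on average, a negative bias that it underestimates them.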

The analysis of the NEE and AET fluxes was based on 7-day averaged fluxes instead of daily fluxes to avoid daily flux “noise”, as Biome-BGC has been designed primarily to capture weekly to seasonal rather than daily variations of NEE (S. Running, personal communication). Again, we calculated bias and relative bias to compare simulated NEE and AET fluxes with measured values. Additionally, we used two-sided paired t-tests with a significance level of 0.05 and linear regression analysis. However, instead of using the common ordinary least squares (OLS) regression, we applied reduced major axis (RMA) regression (Sokal and Rohlf 1995). In contrast to OLS, the RMA regression method considers errors in both the simulated and the measured values. The slope b of the regression line x = a + bX is calculated in RMA as the ratio of the standard deviation of x (s_x) to the standard deviation of X (s_X):

$$ b = \pm \frac{s_{x}}{s_{X}} = \pm \sqrt{\frac{\sum (x - \bar{x})^{2}}{\sum (X - \bar{X})^{2}}}. $$
(3)

The sign of b is the sign of the following sum of products:

$$ \sum \left( X - \bar{X} \right)\left( x - \bar{x} \right). $$
(4)

The intercept a and the coefficient of determination R² are calculated in the same way as in OLS regression. For a more complete description and discussion of RMA, see Sokal and Rohlf (1995) and Niklas (1994).
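The RMA calculation can be sketched as follows. This is a minimal illustration with a hypothetical function name, not the code used in the study. For the regression line x = a + bX, the slope is taken as s_x/s_X with the sign of the sum of products, the intercept as a = x̄ − b·X̄, and R² as the squared correlation coefficient, as in OLS regression.

```python
import math


def rma_regression(measured, simulated):
    """Reduced major axis fit of simulated = a + b * measured.

    Returns the tuple (a, b, r_squared).
    """
    n = len(measured)
    if n != len(simulated) or n < 2:
        raise ValueError("need two series of equal length with at least two values")
    mean_X = sum(measured) / n
    mean_x = sum(simulated) / n
    ss_X = sum((X - mean_X) ** 2 for X in measured)
    ss_x = sum((x - mean_x) ** 2 for x in simulated)
    sum_products = sum((X - mean_X) * (x - mean_x)
                       for X, x in zip(measured, simulated))
    if ss_X == 0.0 or ss_x == 0.0:
        raise ValueError("both series must vary")
    # Slope magnitude is s_x / s_X; its sign follows the sum of products
    # (a zero sum of products is treated as positive here).
    b = math.copysign(math.sqrt(ss_x / ss_X), sum_products)
    a = mean_x - b * mean_X
    r_squared = sum_products ** 2 / (ss_X * ss_x)
    return a, b, r_squared
```

Unlike the OLS slope, the RMA slope does not shrink towards zero when the correlation is weak, which is why a slope close to 1 and an intercept close to 0 must be read together with R².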

Results

Carbon storage along an altitudinal gradient

The measured aboveground carbon gradient for the year 1980 (Schumacher 2004) represents the average over the entire Dischma valley. Therefore, it cannot be used for an (absolute) quantitative comparison with the simulated gradients of the year 1980 that refer to two particular transects at the entrance and in the middle of the Dischma valley. However, the qualitative comparison between the shape of the measured and the simulated gradient allows for an assessment of the biological plausibility of our simulations. Under the simulations with forest management (5% of the standing biomass is harvested every tenth year), the shape of the simulated and measured carbon gradients agreed quite well up to an elevation of about 2,000 m a.s.l. (Fig. 1). Above this elevation, however, the simulated gradient was much flatter than the measured one, thus leading to higher biomass carbon stocks. Furthermore, the simulated timberline lay well above the observed timberline.

Fig. 1

Aboveground biomass and carbon along altitudinal gradients in the Dischma valley. Bold solid line measured values over the entire valley calculated by Schumacher (2004); grey lines simulation results without management; black lines simulation results with management. Fine solid lines ENE slope at valley entrance; fine dashed lines WSW slope at valley entrance; fine dashed dotted lines ENE slope in the valley middle; fine dotted lines WSW slope in the valley middle

Variations in climatic and soil conditions between the four slopes that we studied did not seem to strongly impact aboveground carbon stocks. Within one management regime (5% harvest every tenth year or no harvest), the results differed only slightly between the four slopes (maximum difference of 17.8 t ha−1 at 2,190 m a.s.l.; Fig. 1). In contrast, aboveground carbon stocks were quite sensitive to management impacts: the absolute difference between 5% harvest every tenth year and no harvest was largest at the valley bottom (35.7 t ha−1). However, the relative differences remained fairly constant along the gradients, since harvesting was implemented as a percentage of standing biomass without altitudinal differentiation.

Medium-term carbon dynamics

The relative difference between the simulated and measured increment of aboveground carbon during the 30-year growth period was calculated for each of the forest yield research plots (Fig. 2). For comparison, Fig. 2 also shows the relative increment differences obtained by the single-tree simulator SILVA at the same forest yield research sites of the colline, the montane, and the subalpine zone (Schmid et al. 2006). Note that the three upper subalpine sites were found to be located outside the application region of SILVA (cf. Schmid et al. 2006).

Fig. 2

Relative differences in aboveground carbon increment [(simulated increment − measured increment)/measured increment] shown for each test site. Information about the test sites is given in Table 2. Circles results from Biome-BGC. Diamonds results from the single tree simulator SILVA. enf Evergreen needleleaf forest, dbf deciduous broadleaf forest, dnf deciduous needleleaf forest

The results from Biome-BGC revealed that, in the colline region, the model mainly underestimated growth (ē = −17.283, ē% = −17.92), but not significantly (two-sided Wilcoxon signed rank tests for dependent data samples, n = 11, P value = 0.998). However, Biome-BGC significantly overestimated growth at the montane, subalpine, and upper subalpine sites (ē = 17.719, ē% = 45.55) (two-sided Wilcoxon signed rank tests for dependent data samples, n = 8, P value = 0.0289). Across all 19 forest yield research plots, a slight but not significant underestimation of measured growth was found (ē = −2.541, ē% = −3.60) (two-sided Wilcoxon signed rank tests for dependent data samples, n = 19, P value = 0.730). At the majority of the sites, i.e., 14 out of 19, the simulated increment was within ±30% of the measured increment, and at seven sites, the simulated increment differed by less than 10% from the measured value. The comparison of the relative increment differences of Biome-BGC with those of SILVA revealed no significant difference between the two models (two-sided Wilcoxon signed rank tests for dependent data samples, n = 16, P value = 0.211). Moreover, at those sites where the relative difference in aboveground carbon increment was larger than ±40% in the Biome-BGC simulation, SILVA simulations differed considerably from measurements as well. Yet, at these sites the Biome-BGC differences were larger than those of SILVA.

Daily NEE and AET

Simulated LAI of the years 1996–1999 at the two EUROFLUX sites Sarrebourg-Hesse and Bayreuth agreed well with measured values (Table 6). The comparisons of simulated 7-day averaged NEE and AET values with eddy flux measurements are illustrated in Figs. 3 and 4.

Table 6 Measured and simulated LAI (m2/m2) for the four simulation years
Fig. 3

Simulated (black) and measured (grey) development of daily NEE over time (left) and scatterplot of simulated versus measured daily NEE (right) at the EUROFLUX sites Sarrebourg-Hesse and Bayreuth. Negative NEE is a source to the atmosphere, positive NEE is a sink. Solid line in the scatterplot: intercept = 0 and slope = 1; dashed line linear RMA regression between simulated and measured daily NEE

Fig. 4

Simulated (black) and measured (grey) development of daily AET over time (left) and scatterplot of simulated versus measured daily AET (right) at the EUROFLUX sites Sarrebourg-Hesse and Bayreuth. Solid line in the scatterplot: intercept = 0 and slope = 1; dashed line linear RMA regression between simulated and measured daily AET

Linear RMA regression analysis of simulated versus measured NEE showed reasonable agreement of the simulated and observed flux variance for both sites (Sarrebourg-Hesse: R² = 0.595; Bayreuth: R² = 0.545; Fig. 3, Table 7). Nevertheless, the NEE fluxes at Bayreuth were significantly overestimated by the model (two-sided paired t-test, n = 175, P value = 2×10−16). At Sarrebourg-Hesse no significant differences were found (two-sided paired t-test, n = 187, P value = 0.462). Dividing the simulation period into days with net carbon uptake (simulated NEE > 0, predominantly summer months) and days with net carbon release (simulated NEE ≤ 0, predominantly winter months) revealed large seasonal differences in simulation accuracy. Biome-BGC simulated the variance in the measured fluxes relatively well during the net carbon uptake period at Sarrebourg-Hesse (R² = 0.507) and less well at Bayreuth (R² = 0.310), but the model failed when simulating the fluxes during the net carbon release period (R² = 0.070; R² = 0.002) (Table 7). Simulations during the net carbon uptake period significantly overestimated the NEE fluxes at both sites (two-sided paired t-test, Sarrebourg-Hesse: n = 99, P value = 10−7; Bayreuth: n = 101, P value = 2×10−16). During the net carbon release period, however, measured NEE was underestimated, but only Sarrebourg-Hesse showed a significant difference (two-sided paired t-test, Sarrebourg-Hesse: n = 88, P value = 2×10−16; Bayreuth: n = 74, P value = 0.937). The flux overestimation during the net carbon uptake period and the flux underestimation during the net carbon release period are also supported by the intercept and slope from the linear RMA regression (Table 7). The standard errors of these two statistical parameters were probably underestimated, because the regression analysis was based on statistically dependent values (temporal autocorrelation). Therefore, the intercepts may be closer to 0 and the slopes closer to 1, respectively, than shown in Table 7. However, over the entire simulation period flux measurements and simulations agreed on the fact that both forests are net carbon sinks.

Table 7 Results from the linear RMA regression (x = a + bX) between measured and simulated daily average NEE

Regarding daily AET, simulated values agreed well with measured data at Sarrebourg-Hesse (R² = 0.675), but the model failed to simulate the variance in the measured AET fluxes at Bayreuth (R² = 0.284) (Fig. 4, Table 8). However, measured AET was significantly overestimated by the model at both sites (two-sided paired t-test, Sarrebourg-Hesse: n = 187, P value = 2×10−16; Bayreuth: n = 175, P value = 2×10−16; Table 8). The simulated AET peaks at Bayreuth occurred on days with extremely high precipitation, with a correlation coefficient between simulated AET and precipitation of 0.829 (7-day averaged values).

Table 8 Results from the linear RMA regression (x = a + bX) between measured and simulated daily average AET

The overestimation of AET at Bayreuth was found to be caused mainly by the fact that Biome-BGC simulates high rates of evaporation of rainwater that was intercepted by the canopy (Fig. 5). Yet, canopy water interception is not the dominant component of AET in temperate forests (Flemming 1995). A sensitivity analysis of AET and NEE to changes in the canopy water interception coefficient revealed that decreasing the value of the interception coefficient led to improved AET simulation (Table 9). Particularly, decreasing its value from 0.041 (default value of model version 4.1.2) to 0.01 LAI−1 day−1 (value from the ecophysiological parameter set from the Biome-BGC web database; Biome-BGC 2004) led to a strongly improved AET simulation. A decrease of the coefficient value to 0.00025 LAI−1 day−1 (found in Churkina et al. 2003) further improved simulated AET, especially at the Bayreuth site. However, decreasing the canopy water interception coefficient did not necessarily lead to better results in the NEE flux simulations (Table 9).

Fig. 5
Monthly mean values of simulated daily AET and its components. Solid lines AET, long dashed line canopy evaporation (from intercepted rainwater), dash dotted line plant transpiration, dotted line soil evaporation, short dashed line snow sublimation

Table 9 Sensitivity of NEE and AET, represented by the coefficient of determination R² of the linear RMA regression analysis, to changes in the canopy water interception coefficient

Expanding this sensitivity analysis to the 30-year growth simulation at the forest yield research plots showed that the change of the interception coefficient from 0.041 to 0.00025 LAI−1 day−1 led to a higher simulated aboveground carbon sink. It caused a shift in the relative difference in aboveground carbon increment towards higher values between +5.4% (St. Moritz, no. 17) and +38.0% (Chanéaz, no. 5) (cf. Fig. 2). At 13 of the 19 forest yield research plots, this led to larger relative differences in aboveground carbon increment between simulation and measurement. In other words, while decreasing the interception coefficient helped to improve short-term AET simulations, the medium-term simulations of aboveground carbon increment deteriorated considerably.

Discussion

Carbon storage along an altitudinal gradient

According to the records from the MAB project, the current timberline (i.e., the upper elevation limit of closed forest; cf. Körner 1998) in the Dischma valley is located at about 2,100 m a.s.l. The strong decrease in measured aboveground biomass observed between 2,000 and 2,100 m a.s.l. is usually thought to be due to intensive grazing and avalanches, which are common features in the Dischma valley. Thus, the elevation of the current upper timberline is a result of anthropogenic land use and, to a smaller extent, of natural disturbances. The potential timberline in the Dischma valley is generally assumed to be situated at least 100 m higher, i.e., at 2,200 m a.s.l. (Walder 1983). Moreover, in the Dischma valley, the treeline (highest occurrence of tree patches; cf. Körner 1998) is located at about 2,300 m a.s.l. (Walder 1983).

Biome-BGC, however, simulates aboveground carbon stocks of 60–90 t ha−1 at elevations around 2,100 m, which still represents a closed forest, and carbon stocks of 35–70 t ha−1 at 2,300 m a.s.l., where potential treeline should be located. It has to be emphasized that Biome-BGC simulates growth depending on climatic and edaphic conditions and on forest management, but without any further influences such as grazing or avalanche activity. Nevertheless, the simulations indicate that Biome-BGC overestimates growth at and above the current timberline. This conclusion is also supported by the fact that the model simulates tree growth up to 2,500 m a.s.l., 200 m above climatic treeline.

One reason for this growth overestimation at treeline might be that some climate-based influences crucial to tree growth at high elevations are not simulated in the model. For example, in the model version ATE (Alpine Treeline Ecotone)-BGC developed by Cairns and Malanson (1998), a winter injury process due to wind exposure was incorporated as the main model adaptation to the upper treeline ecotone. Another reason might be that in Biome-BGC, as in most other biogeochemical models (e.g., Foley et al. 2000; Sitch et al. 2003), photosynthesis is assumed to be the limiting process of plant production. However, Körner and Paulsen (2004) suggested that, particularly at low temperatures, the transfer of sugars into structural tissue limits carbon storage, rather than the process of photosynthesis per se (cf. Körner 1998). Therefore, the increased limitation of tissue growth under the cold environmental conditions at high elevations could also contribute to the growth overestimation of Biome-BGC near upper treeline.

A further potential explanation for the overestimation of growth at high elevations could be the respiration submodel of Biome-BGC. The magnitude of the respiration response to temperature is modeled by a prescribed rate defined at a reference temperature (i.e., 15°C) and a proportional change in the rate for a 10°C change in temperature, defined by the so-called Q10 parameter. At low temperatures, the Q10 model leads to extremely small respiration rates, which in turn can easily lead to an increased storage of biomass carbon. Therefore, several authors (e.g., Qi et al. 2002; Zierl and Bugmann 2006) have suggested that the Q10 model is inappropriate for larger temperature ranges, and they postulate a temperature-dependent function for the Q10 parameter.
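The Q10 formulation described above can be written as R(T) = R_ref × Q10^((T − T_ref)/10). The following minimal sketch evaluates it with the reference temperature of 15°C mentioned in the text. The Q10 value of 2.0 and the unit base rate are illustrative assumptions, and the function name is hypothetical.

```python
def q10_respiration(base_rate, temp_c, q10=2.0, ref_temp_c=15.0):
    """Respiration rate at temp_c for a fixed Q10.

    base_rate is the rate at ref_temp_c; the rate is multiplied by q10
    for every 10 degC of warming and divided by q10 for every 10 degC
    of cooling.
    """
    if q10 <= 0:
        raise ValueError("q10 must be positive")
    return base_rate * q10 ** ((temp_c - ref_temp_c) / 10.0)


if __name__ == "__main__":
    # Relative respiration (base rate = 1.0 at 15 degC) across a cold range
    for temp in (25.0, 15.0, 5.0, -5.0):
        print(f"{temp:6.1f} degC: {q10_respiration(1.0, temp):.3f}")
```

With a fixed Q10 of 2.0, respiration at −5°C is a quarter of the reference rate, which illustrates how cold sites can end up with very small simulated respiration losses.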

Medium-term carbon dynamics

The Biome-BGC simulation results from the 19 forest yield research plots revealed that the model tends to underestimate measured growth at low elevations and to overestimate growth at high elevations. These differences may be a result of incorrect climatic input data, which raises the question of the accuracy of the weather generator MTCLIM. Since MTCLIM was designed especially for applications in mountainous terrain (Thornton et al. 2000), we expect the model to provide rather accurate data. Moreover, sites for which climate data had to be extrapolated across large elevation differences (e.g., Embrach, Oberhünigen, Hospenthal, Morissen with differences larger than 400 m) were not those that revealed the largest differences between simulated and measured increment of aboveground carbon. Thus, it is more likely that the deviations between simulation and measurements were caused by the parameters describing the plant functional types, which are the same regardless of elevation, whereas in reality the parameters of a plant functional type are likely to change over large altitudinal ranges as acclimation occurs. Alternatively, the differences may be due to the applied mortality rates, which may be low compared to actual mortality rates despite their increase with elevation. We cannot conclude this firmly, but we surmise that the different climatic conditions in combination with plant acclimation phenomena at high elevations (cf. Theurillat and Guisan 2001) constitute the main reasons for the differences between simulations and measurements.

It is striking that those test sites that revealed large deviations from the measurements in the Biome-BGC simulations (differences of more than ± 40%) also showed relatively large errors in the simulations with SILVA (Schmid et al. 2006), a model parameterized for central European forests. However, even though Biome-BGC simulates the impact of environmental conditions on growth in a far more mechanistic manner than SILVA, the Biome-BGC errors exceeded those of SILVA. This could indicate that Biome-BGC is more susceptible to uncertainties in climatic input data and in the accuracy of the parameter estimation than the SILVA model, or that variations in plant physiological traits over a large range of altitudes may limit the accuracy of process-based simulations. It also suggests that the relevant processes and their interactions and interdependencies are not fully understood yet.

Finally, the differences between measured and simulated growth may be due to site-specific drivers of forest growth [e.g., grazing as observed at the upper subalpine site in Sils (no. 19)] that would be quite difficult to quantify and to incorporate adequately into Biome-BGC.

Daily AET and NEE

The comparison of simulated and measured NEE values indicates that the model has a better predictive ability for photosynthesis-related processes or carbon sinks (simulated NEE > 0, predominantly summer months) than for respiration-related processes or carbon sources (simulated NEE ≤ 0, predominantly winter months). At both EUROFLUX sites, the variance of the measured fluxes was not simulated well by Biome-BGC.

Problems with the way respiration is modeled in Biome-BGC have also been found in other studies (e.g., Hunt et al. 1996; Thornton et al. 2002; Churkina et al. 2003). However, the NEE differences between simulation and measurement during respiration periods may also be due to measurement errors, mainly errors of flux measurements in winter (Baldocchi 2003; Churkina et al. 2003). During winter, nights are significantly longer than days, and thus the contribution of the nighttime flux to the total daily flux becomes more important. Because at night the turbulent flux is partially inhibited by stable stratification of the atmospheric surface layer above the canopy (winds are light and intermittent), the measurement of nighttime ecosystem respiration is difficult and can lead to a flux underestimation (Baldocchi 2003).

The strikingly high simulated AET at Bayreuth warrants some discussion. Total evapotranspiration is mainly a function of climatic factors and LAI. The comparisons of simulated vs. measured LAI (a measure of canopy density and size) were very favorable. Therefore, the strong overestimation of AET, particularly for the evergreen needleleaf forest at Bayreuth, cannot be attributed to an erroneous simulation of canopy leaf area. Moreover, it is unlikely that the overestimation is caused by deficient climatic inputs, as weather data were measured on-site. Our finding that mainly the evaporation of rainwater intercepted by the forest canopy leads to this AET overestimation agrees well with a Biome-BGC study by Thornton et al. (2002). In that study, the discrepancy between simulation and measurement of AET in evergreen needleleaf forests was attributed to measurement errors, i.e., a suspected flux underestimation bias in the measurements when the sonic anemometers are wet. Thornton et al. (2002) even omitted the simulated evaporation of intercepted water when comparing simulated AET with measured fluxes. In the present study, however, measurement errors cannot fully explain the large differences between simulated and measured AET. Independently of these AET differences, the simulated share of evaporated interception water relative to total forest evaporation at Bayreuth must be considered clearly too large: according to Flemming (1995), evaporation of water intercepted by the canopy reaches 30–40% of AET for spruce, 25–35% for pine, and 15–25% for deciduous forests. The results at Sarrebourg-Hesse agree well with these data, but those at Bayreuth clearly do not. Therefore, not only measurement errors but also deficiencies in the simulation of intercepted water evaporation are likely to contribute to the differences between simulated and measured AET.
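The plausibility check against the ranges of Flemming (1995) can be sketched as follows. The AET components are those distinguished in Fig. 5; the function names are hypothetical, and any component values passed in are placeholders rather than results from this study.

```python
# Share of canopy interception evaporation in AET, after Flemming (1995)
REPORTED_RANGES = {
    "spruce": (0.30, 0.40),
    "pine": (0.25, 0.35),
    "deciduous": (0.15, 0.25),
}


def interception_share(canopy_evap, transpiration, soil_evap, sublimation=0.0):
    """Fraction of total AET contributed by canopy evaporation."""
    total = canopy_evap + transpiration + soil_evap + sublimation
    if total <= 0:
        raise ValueError("total AET must be positive")
    return canopy_evap / total


def within_reported_range(share, forest_type):
    """True if the share lies inside the range reported for the forest type."""
    low, high = REPORTED_RANGES[forest_type]
    return low <= share <= high
```

For example, hypothetical components of 1.0, 1.5, and 0.5 mm day−1 for canopy evaporation, transpiration, and soil evaporation give a share of one third, which would fall inside the spruce range, whereas a share of 60% would not.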

Although a reduction in the canopy interception coefficient led to better AET simulations, this was not true for NEE simulations. In Biome-BGC (as in virtually all other biogeochemistry models), the water and carbon cycles of the ecosystem are coupled through a control on stomatal conductance. Therefore, changing the parameters of the water cycle also has an impact on the carbon cycle. Even if the effect on carbon fluxes remains small, the change becomes apparent when looking at medium-term carbon storage. This demonstrates the sensitivity of the model output to single parameters, and the difficulty of appropriately simulating the different components of biogeochemical cycling in a forest ecosystem, particularly if the predictions are evaluated simultaneously at several temporal scales.

In addition to these potential model-based and measurement-based biases, uncertainties in the management history may also contribute to the discrepancies between simulation and measurements. Several studies have shown that management and other disturbances can play an important role when assessing carbon fluxes at short (e.g., Churkina et al. 2003) as well as at decadal and longer time scales (e.g., Thornton et al. 2002).

Conclusions

Model-based predictions of changes in the global carbon cycle across the twenty-first century are a scientific challenge and have become of key political interest over the past few years. In this study, we tested the ability of the ecosystem process model Biome-BGC to simulate different aspects of forest growth. We wanted to evaluate within one single study both long-term and medium-term dynamics of carbon pools as well as daily carbon and water fluxes of a series of coniferous and deciduous forests in central Europe from the colline zone to upper treeline. In the context of carbon sources and sinks, estimating the accuracy of long-term carbon simulations has become quite important, particularly because short-term simulation successes cannot be interpreted as long-term predictability. Moreover, analyzing the accuracy of a model applied across heterogeneous regions is particularly challenging because ecosystem processes often respond in a nonlinear manner to variations in climate and physiography (cf. Band et al. 1991).

Simulations along an altitudinal gradient in the Alps resulted in vigorous tree growth up to approximately 200 m above the climate-determined treeline. This may be attributed to a lack of mechanical disturbance processes in the model, but probably also to changing plant traits at high elevations that are not accounted for in the model. A further reason for the overestimation could be the incomplete representation of ecological processes and their interactions, such as the limiting effect of low temperatures on tissue growth (transfer of sugars into structural tissue) that is not included in Biome-BGC, or the respiration model that is not well adapted to the cold climates at high elevations.

Further, our results showed that Biome-BGC appropriately simulated medium-term (30 years) dynamics of carbon fluxes at different elevations and climatic regions in Switzerland. At low elevations, the model tended to underestimate measured growth, while it generally overestimated growth at high elevations. However, the accuracy of these simulations did not differ significantly from that of the semi-empirical single-tree model SILVA, which is parameterized specifically for central European tree species.

At the Sarrebourg-Hesse site, Biome-BGC performed well in simulating the annual course of daily net carbon uptake during the growing season, but the model was less successful in reproducing net carbon release by the forests during winter. At Bayreuth, NEE simulations deviated strongly from measurements. The differences between simulation and measurement during respiration periods can be attributed to some extent to flux measurement errors. Also, the overestimation of daily water fluxes may partly be due to measurement errors (i.e., biases for wet canopy evaporation), but there are also deficiencies in the model itself. Our attempt to improve daily water flux simulations by changing the value of the canopy water interception coefficient led to better estimates of the water balance, but applying this ‘improved’ parameter to the simulation of medium-term carbon dynamics led to larger differences (overestimation) between simulation and measurements.

By using the process-based model Biome-BGC without any site-specific parameter adaptations, we tested the model parameterization as well as the model theory and equations. The test results show that future applications of Biome-BGC could benefit much from an improvement of some of the model algorithms (e.g., the Q10 model) as well as from a detailed analysis of the ecological significance of crucial parameters such as the canopy water interception coefficient. In spite of these model deficiencies, comparing Biome-BGC simulations with measurements and with simulations of the semi-empirical model SILVA adapted to central European forests, we conclude that Biome-BGC provides good results in terms of carbon fluxes as well as in terms of carbon pools in large parts of the Central European forests.