Introduction

Tropical forest ecosystems embody an important potential for C storage over other terrestrial ecosystems, though proportionally smaller in extent compared to soil and ocean ecosystems (Baker et al. 2010). Hence, understanding the amount of C content stored in tropical forests contributes a critical step toward quantification of their contribution in climate change mitigation. This study juxtaposes two sampling designs, one using a geostatistical framework and the other a design-based framework, derived from different theoretical and philosophical bases for their precision for estimating C stock. Consequently, the application of either of these methods for C stock estimation may not always give the same results.

There is compelling evidence that the average temperature of the earth is increasing and it is this potentially adverse phenomenon that has stimulated research into the dynamics of C sequestration by forest ecosystems (Wysowski 2010). Carbon dioxide (CO2) is believed to play a major role in global warming (Aydin Coskun and Gençay 2011). Therefore, solutions to problems of climate change are aimed at reducing the levels of CO2 in the atmosphere (UNFCCC 1998). Informed decisions about curbing the devastating effects of climate change require elimination of uncertainty in estimates of the rate of deforestation (Friedlingstein et al. 2006). Hence, quantification of C stocks is important in the determination of the extent of C sequestration by forests.

To date, significant advances have been made toward the estimation and quantification of C in tropical forests using techniques ranging from forest inventories to remote sensing (Gibbs et al. 2007). The most direct way to quantify the C stored in living forest biomass is through harvesting of all the trees in a known forest, weighing the biomass, and then applying a conversion factor to C stock (Leemans et al. 1996). This method could be accurate for a particular forest, but it is not environmentally sustainable and, moreover, it is time consuming and expensive. A greater part of tropical forests typically contains over 300 species and research has shown that species-specific allometric relationships are not necessary for the generation of reliable estimates of forest C stocks (Bhat and Ravindranath 2011). Instead, Brown (2002) demonstrated that generalized allometric equations, stratified by broad forest types, are highly effective for the tropics. This is because the diameter of trees at breast height alone explains more than 95% of the variation in above-ground tropical forests’ C stock.

Researchers and forest practitioners including Houghton (2005) have estimated the amount of C sequestered in forests using allometric equations, all with a goal of providing unbiased and accurate estimates of C stock in forest ecosystems. Sales et al. (2007) compared the performance of a geostatistical method with a simple biomass estimation using the sample mean for mapping forest biomass as a step toward the estimation of CO2 emissions due to land use and land cover changes in the Brazilian Amazon. The research demonstrated the superiority of geostatistics in improving estimates of CO2 emissions in the Amazon forests, one of the world’s biggest CO2 reservoirs and sinks.

Sampling methods are largely based on design-based (classical) statistical techniques, central to which is randomized sampling founded on the probability theory (Keller et al. 2001). An alternative and promising technique is that of a model-based (geostatistical) analysis, which is independent of randomization. For example, Sales et al. (2007) demonstrated a significant reduction in the root mean squared error (RMSE) of biomass estimates in Rondonia (Brazil) by using kriging with external drift (KED) compared to the use of a simple biomass estimation using the sample mean. However, given the spatial variability of biomass due to changing soil catena and topography in tropical forests, a model-based approach might be more appropriate than other techniques in improving C stock estimates in such landscapes (Bryan et al. 2010; Montes et al. 2005). This is because the derivation of site-specific estimates and predictions of C stock in inaccessible areas using the model-based framework are possible, which is useful for planning and management.

Moreover, given the extensive and complex nature of tropical forests and landscapes, these techniques are usually faced with inevitable limitations of error propagation within the data processing chain (Goodchild 1994; Wang et al. 2005). It is physically impossible to sample everywhere due to the prohibitive costs of survey sampling and inaccessibility of some locations. In light of this background, the inevitable limitation regarding the aforementioned biomass estimation methods opens avenues for the geostatistical approach. Therefore, the aim of this study was to investigate the precision of the model-based approach and the design-based approach in estimating C stock in the Ludikhola watershed of Nepal.

Methodology

Study Area

The Ludikhola watershed (Fig. 1) is situated in the Gorkha district of the western development region of Nepal. The area lies between 27°55′02.85″N latitude and 27°59′43.58″E longitude with an average annual precipitation of 1,972–2,000 mm/year and an average temperature of 23.1°C with an area coverage of 5,750 ha. The watershed has a characteristic hilly physiography with altitudes ranging from 318 to 1,714 m (ANSAB 2010). The watershed falls under the sub-tropical ecological zone with Schima wallichi, Shorea robusta, and Castanopsis indica as predominant species.

Figure 1
figure 1

Location of the study area, Ludhikola watershed, Nepal

Measurement of Tree Biomass

Measurements of diameter at breast height (DBH) (ca. 1.3 m) of trees with heights greater than or equal to 10 cm DBH were made in each of the circular 500 m2 supports using diameter tape, clinometers, and linear tapes. Trees with DBH less than 10 cm generally have insignificant C stocks (Gibbs et al. 2007). On average, slopes of at least 30% characterize most of the plots in the study area and a slope correction was therefore necessary to correct for the DBH of the measured trees in different slopes (Bhat and Ravindranath 2011).

Biomass Calculation and C Stock Derivation

Due to the lack of local allometric equations, the general equation proposed by the Intergovernmental Panel on Climate Change (IPCC 2007) was used for estimating forest biomass, while an equation developed by Basuki et al. (2009) for tropical forest was used for the S. robusta species. The basis for the application of this allometric equation for this species in particular is the fact that the mean annual rainfall (2,000 mm) and temperature range (21–34°C) are similar to the climatic conditions prevailing in the study area. Similarly, the equation used for the other species was also formulated using DBH ranging from 5 to 148 cm and the rainfall (2,000 mm) and temperature were similar to the study area. The biomass for each individual tree species was subsequently converted to C stocks per species using a conversion factor of IPCC (2007). Per plot (support) values of C stocks were then expanded to unit area, in this case, a hectare (mg C ha−1).

Model-Based Approach

Sampling Design

In this study, 186 observations with a support of 500 m2 as the unit of replication were sampled from the September 19th to the October 13th 2011. Seven community forests (CF) with area coverage of 687.9 ha were sampled on a 100 × 100 m2 regular grid, with grid nodes being the location of the sampling points. Assuming that the UTM grid intersection is random with respect to the study area, de Gruijter et al. (2006) assert that the optimal location of a sampling point is the grid node or intersection. The 100 × 100 m2 regular grid was derived from the reconnaissance variogram modeling carried out during the pre-field work exercise (Webster and Oliver 1992).

An initial location was randomly chosen within the study area using the ArcGIS sampling facility. The establishment of this point paved the way for the rest of the sampling locations, which were defined so that all the locations were at regular 100 × 100 m2 intervals over the entire study area. Sampling was therefore undertaken and done on every node of the regular grid over the seven CFs.

Reconnaissance Variogram

Sixty observations collected in 2010 on a support of 500 m2 were used in the modeling of a reconnaissance variogram for the determination of scales of spatial variability (Rachina 2011). The data collected using a stratified random scheme were used for the derivation of the sampling interval of C stock for the present study (Oliver and Webster 2008). Thus, the range of influence and the grid distance for estimating C stock using geostatistics were derived from the parameters of the reconnaissance variogram of the data.

Spatial Exploratory Analysis

A spatial exploratory analysis of the response variable was made in order to shape the road map for the subsequent analysis. This is important as it allows for the assessment of the adequacy of modeling assumptions before making a decision to transform. A spatial plot of C stock observations with respect to their locations was done using the geoR geostatistical library (Diggle and Ribeiro 2007). A test of directional dependence of the target variable was made through directional variogram modeling (Diggle and Ribeiro 2007).

Kriging

We used ordinary kriging as a reference for assessing the actual gain for accounting for covariates. As noted by Yan et al. (2007), the precision with which a variable is estimated may be improved by auxiliary variables. For instance, Hudson and Wackernagel (1994) used elevation data to improve a kriged map of temperature and Odeh et al. (1994) showed regression kriging to produce better estimates than ordinary kriging.

KED was used to model the spatial structure of C stock because of its ability to incorporate many covariates. Sales et al. (2007) demonstrated KED interpolation to perform better than classical statistical approaches in estimating forest biomass. The KED algorithm limits stationarity within each search neighborhood, thereby offering more local detail than when ordinary kriging is used (Deutsch and Journel 1998). In order to match the sampling density of the variable in question and the scales of spatial variability, a grid cell resolution of 10,000 m2 was subsequently used in the kriging methods (Hengl 2007). For KED, the assumption of normality was tested with the residuals of the best linear model.

Selection of the Best Linear Model of Feature Space

A criterion-based approach using the Akaike Information Criterion (AIC) was followed to select the best candidate model for use in KED (Akaike 1978). Criterion-based methods employ a wider search for the best model and compare models in a preferable manner (Faraway 2002). A statistic related to the AIC is the Bayesian information criterion (BIC), which works by imposing heavy penalties to larger models and results in the selection of smaller models in comparison to AIC and its algorithm. Forward, backward, and stepwise approaches for the inclusion or exclusion of predictors are inconsistent, resulting in different final models, even from the same dataset (MacNally 2000; James and McCulloch 1990). Hence, the selection of the AIC approach for linear model selection is justified due to the inability of stepwise procedures in considering the increased probability of Type I error due to multiple testing.

A principal component analysis was carried out in order to make the explanatory variables independent. A principal component analysis therefore provides a selection criterion of candidate variables to retain (or eliminate), which consequently results in data reduction (Mansfield and Helms 1982). The number of potential predictors from the principal components’ regression was 3 and by the criterion-based rule, 23 (equal 8) candidate models were fitted and the model with the lowest AIC was chosen for subsequent modeling using KED.

Cross-Validation

In order to assess the predictions, cross-validation statistics as outlined in Webster and Oliver (2001) were calculated for each prediction method. The validation statistics used included the mean error (ME), the mean squared error (MSE), the RMSE, and the predicted residual sum of squares (PRESS). Variogram models for the kriging variants were cross-validated to assess the validity of the fitted models and to compare estimates from the variogram models with actual values (Utset et al. 2000).

Design-Based Approach

Sampling Design

Under this plan, the population of interest was divided into mutually exclusive and exhaustive strata and a simple random sample was taken within each stratum (Cochran 1977). The resulting sample size for this design was therefore made up of one hundred and fifteen C stock observations, with the proportions as illustrated in Table 1.

Table 1 Proportions and Required Sample Sizes for the Sampled CF

Non-Spatial-Exploratory Analysis

The normality assumption regarding the measured target and predictor variables was assessed through a tabular display of summary statistics. It is from the summary displays that the decision to transform non-normally distributed variables was made. The normalizing transformations can result in the data meeting the assumptions of the distribution (Longford 2008).

Analysis of Variance (ANOVA)

A single-factor ANOVA model was fitted to the data and analyzed for significant differences of mean C stock among the sampled CFs. Furthermore, the analysis facilitated the calculation of C stock estimates in each of the sampled stratum.

Jackknifing

Jackknifing provides a parametric statistical inference of the dataset by applying re-sampling without replacement to the original dataset (Wells 1994). A jackknifing procedure was applied in order to calculate the standard error and bias estimates of the fitted ANOVA model.

Assessment of Confidence Interval Overlap

A test of significance to establish the extent of overlap between the two estimators was carried out using a method applied by Goldstein and Healy (1995). Assuming the mean of two estimators to be denoted by \( \bar{X}_{1} \) and \( \bar{X}_{2} \), independently and normally distributed with standard errors \( \sigma_{1} \) and \( \sigma_{2} \) , respectively, the confidence intervals do not overlap if the following inequality (1) is satisfied:

$$ \left| {\bar{X}_{1} - \bar{X}_{2} } \right| > {\rm Z}_{\beta } \left( {\sigma_{1} + \sigma_{2} } \right) $$
(1)

where \( {\rm Z}_{\beta } \) is the (positive) normal quartile with two-tailed probability \( \beta \).

The probability that the inequality in Eq. (1) is satisfied under the null hypothesis of equal means of C stocks of the aforementioned estimators is given by Eq. (2) as follows:

$$ \gamma_{12} = 2\left[ {1 - \varphi \left( {\frac{{{\rm Z}_{{\frac{\alpha }{2}}} \left( {\sigma_{1} + \sigma_{2} } \right)}}{{\sqrt {\left( {\sigma_{1} + \sigma_{2} } \right)} }}} \right)} \right] $$
(2)

where \( \gamma_{12} \) denotes the comparison of estimator 1 (model-based) to estimator 2 (design-based) and \( \varphi \) is the normal integral.

Results

Model-Based Approach

Reconnaissance Variogram

The results of this analysis showed a high nugget/total sill ratio (0.62), indicating a weak spatial structure and high short-range variability, even after averaging over the 500 m2 support. However, an analysis of spatial structure of C stock from the 186 samples collected for the present study showed ranges of 750–1,514 m (Fig. 2), which are considerably longer than those in the preliminary analysis. Thus, the grid distance could have been wider than the lag distance subsequently used for the modeling of the spatial structure of C stock.

Figure 2
figure 2

Stagewise comparison of the effects of covariates on the spatial correlation structure of C stock

Spatial Exploratory Data Analysis

There was no discordance of C stock observation with respect to their spatial neighbors, particularly with respect to geography (Fig. 3). The concentration of sampled C stock was high in the southern region, suggesting the existence of a spatially varying mean due to a trend surface (Ribeiro and Diggle 2001). It is therefore evident from this plot that the behavior of C stock density is not the same across the different geographical regions of the sampled field. A quartile plot (Fig. 3b) of the empirical distribution of C stock revealed a predominance of the third quartile (>75%) C stock values in the southeast and southwestern regions of the sampled CF.

Figure 3
figure 3

(a) The behavior of C stock density and distribution with respect to geography. (b) Quartiles of the empirical distribution of measured C stock values

The variograms indicate a clear lack of independence of the primary variable with respect to direction (Fig. 4). This fact is vindicated by the behavior of the response variable plotted in the northern, northeastern, eastern, and southeastern directions. The directional variograms exhibit different nugget effects and total sills, a hint for anisotropy. The existence of anisotropy was evident and this suggests a non-stationary mean in the primary variable, and the stationarity assumption was therefore not valid (Oliver and Webster 2008). This outcome justifies the modeling of C stock variability and distribution with a trend. Subsequent modeling of C stock consequently relied on the information regarding the displayed trend.

Figure 4
figure 4

Directional variograms for C stock

Feature Space Model Selection Criterion

Incorporation of the different predictor variables from the results of a principal component’s analysis resulted in an additive model with elevation and distance providing the best linear model of feature space. As illustrated in Table 2, this model accounted for most of the variability and subsequently took away the greater part of the spatial correlation structure of C stock (87.3%). Furthermore, the model had the best qualities for explaining the variability of C stock which ranged from linear model diagnostics (meeting regression assumptions) to the amount of variability explained for C stock by the predictors. However, models including interaction terms had lower AIC (i.e., excessive complexity) values than the most parsimonious model eventually applied for KED as indicated in Table 2. Thus, an additive relationship among C stock, elevation, and distance gave the following relation:

$$ \left( {{\text{Carbon}}\;{\text{stock}}} \right)^{\frac{1}{2}} = 20 - 0.02\;{\text{Elevation}} + 0.55\;\log \left( {\text{Distance}} \right). $$
(3)
Table 2 Summary of the Selection Criterion for the Best Candidate Linear Model of C Stock Prediction Using KED

The range shows that the spatial dependence of C stock is increased from the estimated range of 450 m from the reconnaissance variogram analysis to 1,541 m for the present study (Table 3). Cambardella et al. (1994) described different classes of spatial dependence using the ratio between the nugget and total sill variance. Values less than 25% are categorized as strongly spatially dependent, 25–75% as moderately spatially dependent, and values more than 75% as weakly spatially dependent. Hence, a nugget to sill ratio for C stock of 0.30 (Fig. 2) corresponds to a moderately spatially dependent variable and implies that 30% of the C stock variability consisted of unexplained, short-distance random variation.

Table 3 Parameters for the Ordinary Variogram and Residuals for the Distance and Elevation Covariates

Ordinary Kriging

Ordinary kriging, which relies on the spatial dependence in the primary variable (sampled C stock values), was used as a reference to assess the actual gain of accounting for covariates. A spherical model was fitted to the variogram. The predicted C stock density varied from 2.37 to 10.99 sqrt (mg C ha−1) (1st and 99th percentiles) with standard error varying from 2.17 to 3.23 mg C ha−1, Figure 5a. The kriging prediction variances (Fig. 5b) give a statistical measure of uncertainty across the spatial field and show that most of the locations near sampling locations had smaller uncertainty compared to locations remote from the known C stock observations. Thus, the quality of the ordinary kriging prediction map is not better than 10.44 sqrt (mg C ha−1).

Figure 5
figure 5

Kriging results illustrating (a) ordinary kriging predictions’ sqrt (mg C ha−1), (b) ordinary kriging variances (mg C ha−1), (c) KED predictions’ sqrt (mg C ha−1), and (d) KED kriging variances (mg C ha−1) for C stock

Kriging with External Drift (KED)

The inclusion of elevation and distance as predictor variables resulted in a decrease of the total sill of the variogram and a shortening of the range of influence (Fig. 2). The best feature space model predicted C stock density with a range of 3.82–13.32 sqrt (mg C ha−1) (1st and 99th percentiles) with a standard error varying from 0.27 to 0.29 mg C ha−1 (Fig. 5c). The shortening of the range of influence and the decrease in the total sill demonstrate the predictive power which a linear additive model of elevation and distance has on the spatial correlation structure of C stock compared to a pure-based approach using the ordinary variogram. Hence, in Figure 5d, the quality of predictions was greatly improved as a result of using covariates in the prediction of C stock, with the highest predictions occurring in the southwest part of the sampled CF (Fig. 5c). Consistent with theory, the KED estimated error variance seemed to be dependent on the observed data configuration, where the uncertainty of estimation decreased toward the sampling locations (Fig. 5d).

Cross-Validation Statistics and Model Diagnostics

The summary statistics of the cross-validation procedure, as proposed by Isaaks and Srivastava (1989), shows that the mean prediction errors approach zero, with KED giving a far superior error distribution than ordinary kriging, an indication of non-biasedness (Table 4; Fig. 6).

Table 4 Summary of Cross-Validation Statistics
Figure 6
figure 6

(a) Diagnostic plots of the residuals of the kriging prediction variants. (b) Model diagnostics of the best linear model (elevation + log (distance) of feature space

Design-Based Approach

Non-Spatial Exploratory Data Analysis

Serious deviations from normality in the response, distance, slope, and NDVI covariables were noted. A maximum likelihood estimate of \( \lambda \) was 0.5 with estimated 95% confidence interval of \( \left( {0.4 \le \lambda \le 0.6} \right) \). Hence, a Box–Cox transformation of the target variable resulted in a square root transformation.

Analysis of Variance (ANOVA)

A square root transformation of C stock resulted in failure of rejection of the null hypothesis of a Bartlett’s test (P (\( \alpha = 0.05 \)) = 0.069). A graphical box and whisker plot showing the variability of mean C stock in the different CFs (Fig. 7) shows that the Taksatari CF had the most density of C stock, 14.91 sqrt (mg C ha−1), and the Chisapani CF had the least C stock density, 8.57 sqrt (mg C ha−1). An ANOVA demonstrated significant differences in the mean C stock density of the sampled CF under the management of different CF user groups (CFUGs) (F 6,103 = 5.21, P = 0.00017). The Tukey–Kramer (Kramer 1956) multiple comparison method was used to establish significant differences among the CFs.

Figure 7
figure 7

Mean C stock density within the seven sampled community forests

Jackknifing

The ANOVA model was validated using the jackknifing technique. Two measures of accuracy of the estimator, \( \hat{\theta } \), calculated from the data with the ith observation removed, namely the standard error and the bias of the estimator, were calculated and gave values of 0.16 mg C ha−1 and 0 sqrt (mg C ha−1), respectively, with a 95% confidence interval of 9.74 ≤ \( \hat{\theta } \) ≤ 9.77 sqrt (mg C ha−1).

Assessment of Confidence Interval Overlap

The best linear model of feature space using KED gave predictions with a narrower margin of error of 9.89 ± 0.17 sqrt (mg C ha−1) compared to the C stock confidence interval obtained using the design-based approach as 9.91 ± 0.8 sqrt (mg C ha−1). However, the design-based approach gave slightly higher total C stock estimates in comparison to the model-based approach (Table 5).

Table 5 Summary of Total C Stock Estimates for the Design-Based Approach and the Model-Based Approach Using KED

The 95% confidence intervals for the two estimators were \( 9.72 \le \bar{X}_{1} \le 10.06 \) sqrt (mg C ha−1) and \( 9.11 \le \bar{X}_{2} \le 10.71 \) sqrt (mg C ha−1) for the model-based approach and the design-based approach, respectively. There is insufficient evidence for the rejection of the null hypothesis that there is no significant difference between the mean C stock estimated by the model-based approach and the mean C stock estimated using the design-based approach.

Discussion and Conclusions

The results of KED appear justified in terms of the known physical and presumed anthropogenic relationships imposed on the local mean by elevation and, to a lesser extent, by distance to the nearest human settlements. It is therefore the partitioning of the data into a deterministic trend component and a residual “noise” component that vindicates non-stationary geostatistics (Hengl 2007). The precision of C stock estimation using ordinary kriging could not match that using KED. This emanates from the assumptions that govern the ordinary kriging algorithm compared to KED. A test of directional dependence for C stock showed the variable to be highly anisotropic, with different nuggets and total sills in various directions. However, KED modeled this trend using explanatory variables in the form of elevation and log (distance), thereby smoothening the variance in the predictions (Berterretche et al. 2005).

It is not only the incorporation of secondary information that can make KED better than other kriging variants but it also depends on the quality of the secondary information used (Ahmed and De Marsily 1987). With the increasing availability of topographic information at finer resolutions than before, it is easy to access information like elevation, slope, and aspect from the Shuttle Radar Topography Mission (SRTM) 30-m elevation online databases. In that case, the application of KED is more attractive and outweighs the pure-based approach of using ordinary kriging.

The model-based approach using KED gave a narrower margin of error for the mean C stock estimates than the margin of error obtained from the design-based approach. This is partly because of the lower variances and a relatively larger sample size used for the KED algorithm. However, the design-based approach predicts slightly higher total C stock estimates than any of the kriging variants. The design-based technique is known to overestimate since it assigns the same weighting to all the predictions and to all the residuals (Montes et al. 2005). On the other hand, KED has the ability of incorporating auxiliary information to further improve estimates of a primary variable (e.g., C stock) (Isaaks and Srivastava 1989). The kriging variants, especially the KED, greatly reduce the uncertainty associated with predicting the variable in question by using predictors as its aides, an important piece of information for C investment and forest management. The findings of the present study are in conformity with the results of a comparative study by Guibal (1973) and Montes et al. (2005).

From the assessment of the extent of overlap between the C stock mean estimators of the two sampling protocols, it is clear that the mean C stock estimates for both methods fall within the 95% confidence bounds of each other. The estimators will overlap 99.7% of the time, implying insufficient evidence to suggest that the mean C stock estimates are significantly different from each other. The reason for this substantial extent in overlap emanates from the fact that the larger mean C stock estimate from the design-based approach is lower than the upper confidence limit of the smaller mean from the model-based approach (Moore and McCabe 2002). Hence, the model-based approach is not significantly better than its design-based counterpart as we had postulated at the onset of this study. However, the model-based approach looks relatively superior to the design-based approach as we demonstrate that the estimated C stock of the former has a narrower confidence interval and margin of error. This is an important leap toward the judgment and evaluation of an estimator as it clearly shows the amount of uncertainty that we have in estimating the population parameter of interest. In other words, there is a very small distance between the sample statistic and the population parameter for the model-based approach rather than for the design-based approach, a position that favors the model-based approach as an option for C stock estimation.

In light of evidence presented in this study, we conclude that for forest management and C stock estimation, we are closer to the estimated population parameter with a model-based approach than with a design-based approach. Due to the limitation in the sample size used for the determination of scales of spatial variability of C stock in the reconnaissance variogram analysis, a mismatch in the sampling intervals used and the actual scales of spatial variability of C stock could have been made. We therefore suggest that future studies focus on the improvement of the determination of scales of spatial variability and explore the possibility of testing more covariates in the geostatistical modeling of C stock. For the purposes of C stock monitoring and accounting, the balance of probabilities favors the model-based approach since the design-based approach generalizes uncertainty of C stock estimates for the area of interest.