Introduction

The effect of climate and topography on vegetation has attracted much scientific attention in ecological research. Climate is widely used to explain vegetation pattern and distribution around the world (Holdridge 1947; Woodward 1987; Fang and Yoda 1989, 1990; Neilson 1995). Spatial variation in topographic conditions is a critical factor influencing vegetation pattern, species distribution, and plant growth (Bennie et al. 2008). For instance, elevation, slope, and aspect may be crucial where vegetation distribution is restricted to niches with favorable microclimatic conditions (Hennenberg and Bruelheide 2003). Several studies have successfully used topographic variables to predict plant growth and species distributions in mountain areas (Guisan and Zimmermann 2000; Yang et al. 2006; Lassueur et al. 2006).

Traditionally, field survey sample plot data are used to build the relationship between vegetation and climatic and topographic variables (Wang et al. 2005a; Yang et al. 2006; Zhang et al. 2006). Selecting and surveying field samples is labor-intensive and time-consuming, and sample numbers are often limited. Remote sensing data, on the other hand, provide multidimensional information for spatial estimation of vegetation status, such as leaf area index (LAI), normalized difference vegetation index (NDVI), and derivatives such as net primary productivity (NPP). Globally available digital elevation models (DEMs) facilitate extraction of site-specific conditions such as elevation, aspect, and slope, and can also be used in the interpolation of climatic variables. The combination of DEM and remote sensing data can therefore yield large numbers of samples with information far surpassing field survey data (Bader and Ruijten 2008; Greenberg et al. 2009).

Historically, statistical regression and correlation techniques have been used to describe the relationship between vegetation and environmental variables (Yang et al. 2006; Zhang et al. 2006). Global ordinary least squares (OLS) regression, however, assumes that this relationship is spatially stationary: a single model with one set of constant coefficients is applied over the entire geographic space (Foody 2003; Propastin and Kappas 2008). Thus, spatial variations in the relationships between vegetation and climate are ignored. In contrast, geographically weighted regression (GWR) is a recently developed regression method with special emphasis on spatial non-stationarity (Fotheringham et al. 2002). The technique weights observations by their proximity to each location and allows regression model parameters to vary in space. GWR models have been used to analyze relationships between NPP and environmental variables (Wang et al. 2005a), NPP and the maximum NDVI (Wang et al. 2005b), NDVI and rainfall (Propastin and Kappas 2008), and forest patterns, topography, and fire history (Kupfer and Farris 2007). An important conclusion of the above studies is that GWR efficiently reveals spatial variations in empirical relationships among variables that are otherwise ignored in traditional analysis.

The Haihe Catchment is a semiarid region in North China where vegetation is highly sensitive to environmental factors, especially drought. The predominant climate is the Asian monsoon climate, with cold, dry winters (December to February) and hot, rainy summers (June to August). Annual precipitation is 380–800 mm, about 75% of which falls in the rainy months of June to September. Annual average temperature is 9.5–12.5°C. The catchment is characterized by rapid vegetation degradation, soil erosion, and desertification, which are predicted to affect environmental conditions in the nation’s capital and its environs. Hence, in the last three decades, tremendous effort has been made to improve environmental conditions in the catchment. Unsuitable activities (even those intended to facilitate vegetation recovery) can result in further environmental degradation. For example, afforestation in semiarid regions could cause further drying of soils, resulting in land degradation (McVicar et al. 2007). Thus, analyzing the spatial distribution and variation of vegetation under different topographical conditions is important for regional vegetation management and planning.

The objective of this study was to estimate the effect of topographic and climatic variables on vegetation pattern. The Moderate Resolution Imaging Spectroradiometer (MODIS) LAI product is used as an indicator for vegetation state. DEM data are used to estimate variations in topographical factors like elevation, slope, and aspect, and to further evaluate the effect of topographical conditions on the climatic variables of temperature, precipitation, and solar radiation. The spatial relationships among LAI, climatic, and topographical variables are then analyzed via GWR.

Methods

Study area

Haihe Catchment is in North China, and it comprises the plain area (the so-called North China Plain) and the mountain terrain. While agriculture is practiced in the plain area, the mountain region is largely under natural vegetation. This study considers only the mountain regions consisting of the Taihang Mountains in the west and Yanshan Mountains in the north (see Fig. 1).

Fig. 1

Location of the study area in Haihe Catchment and the background topography

Data preparation

The 8-day MODIS LAI product from 2001 to 2006 at 1-km resolution was downloaded. The yearly maximum LAI was extracted for each grid cell through the maximum value composite method (Holben 1986). The average maximum LAI for each grid cell was then calculated by averaging the six yearly maxima. This average maximum LAI was used in all further analysis (Fig. 2a).
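The compositing step can be sketched as follows. This is a minimal illustration, assuming the 8-day LAI composites have already been read into a NumPy array with one year label per composite; the function name and array layout are hypothetical and not taken from the study.

```python
import numpy as np

def mean_annual_max_lai(lai, years):
    """Average of the yearly maximum LAI for each grid cell.

    lai   : array of shape (n_composites, rows, cols), NaN for missing values
    years : array of shape (n_composites,) giving the year of each composite
    (both inputs are assumed layouts, not the authors' data format)
    """
    lai = np.asarray(lai, dtype=float)
    years = np.asarray(years)
    # Maximum value composite: per-cell maximum within each year
    yearly_max = [np.nanmax(lai[years == y], axis=0) for y in np.unique(years)]
    # Average the yearly maxima over all years
    return np.mean(yearly_max, axis=0)
```

With six years of data, `yearly_max` holds six grids and the result is their cell-wise mean.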

Fig. 2

Spatial distribution of the regression variables: a LAI, b temperature, c precipitation, and d reference evapotranspiration

The 90 m × 90 m DEM data were downloaded from the U.S. Geological Survey (USGS) website. To ensure compatibility with LAI, the spatial resolution of the DEM was reduced to 1 km × 1 km by averaging. This operation is assumed to be reasonable in that digital elevation data at a given resolution are interpretable as averages for the area surrounding a point (Helmlinger et al. 1993).

Aspect and slope were derived from the DEM with the ArcGIS software package. Since aspect is a cyclic variable with both 0° and 360° defined as north, a cosine transformation was used to derive a continuous gradient with a value of 1 in the north and −1 in the south (Guisan et al. 1999).
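A short sketch of the cosine transformation is given here. It assumes aspect in degrees clockwise from north; treating negative values as flat cells without a defined aspect (as ArcGIS codes them with −1) and mapping them to NaN is an assumption of this sketch, not a step described in the text.

```python
import numpy as np

def cos_aspect(aspect_deg):
    """Map aspect in degrees (0 = 360 = north) to [-1, 1]: north = 1, south = -1."""
    a = np.asarray(aspect_deg, dtype=float)
    # Assumption: negative aspect marks flat cells, which have no direction
    return np.where(a < 0, np.nan, np.cos(np.deg2rad(a)))
```

East- and west-facing cells both map to about 0, so the transformed variable expresses only the north-south component of aspect.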

Topographic Wetness Index (TWI) is a function of both slope and the upstream contributing area per unit width orthogonal to the flow direction. TWI has been shown to be highly correlated with soil attributes including horizon depth, percent silt, organic matter content, and phosphorus (Moore et al. 1993). TWI is calculated as:

$$ {\text{TWI}} = {\text{ln}}\left( {{\frac{\text{As}}{\tan \beta }}} \right) $$
(1)

where As is the specific catchment area expressed in square meters per unit width orthogonal to the flow direction; and β is the slope angle expressed in radians (Gessler et al. 1995).
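Equation 1 can be evaluated per grid cell as in the sketch below. The lower bound on slope (`min_slope`) is a hypothetical safeguard added here to avoid division by zero on flat cells; the text does not say how flat cells were handled.

```python
import numpy as np

def twi(specific_catchment_area, slope_rad, min_slope=1e-3):
    """Topographic Wetness Index, TWI = ln(As / tan(beta)) (Eq. 1).

    specific_catchment_area : As, contributing area per unit contour width
    slope_rad               : beta, slope angle in radians
    min_slope               : assumed floor for beta on flat cells (hypothetical)
    """
    beta = np.maximum(np.asarray(slope_rad, dtype=float), min_slope)
    return np.log(np.asarray(specific_catchment_area, dtype=float) / np.tan(beta))
```

Large contributing areas and gentle slopes both raise TWI, which is why high values tend to mark valley bottoms.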

The spatial resolution of the DEM can affect the accuracy of derived topographic parameters such as aspect, slope, and TWI. For example, slope estimated from coarse-resolution data can significantly underestimate the true slope of a region (Zhang et al. 1999). However, the accuracy of topographic variables at a resolution of 1 km is acceptable for integration with coarse-resolution satellite data (also see Dobos et al. 2000) and for analysis of the topographic effect on temperature, precipitation, and solar radiation (see Zeng et al. 2005, 2009).

Temperature and precipitation are the two climatic variables used in nearly all terrestrial model simulations (Woodward 1987; Wang et al. 2005a). Hence daily meteorological data were obtained from 40 stations (within and around the study area) from the National Meteorological Information Centre of China. Both temperature and precipitation data were analyzed on a daily basis and aggregated to monthly scale. Then annual average temperature and annual total precipitation were calculated for the period 2001–2006, corresponding to the period for which LAI data were available.

Inverse distance weighted (IDW) interpolation (Shepard 1968) was used to establish spatial distribution of relative humidity, wind speed, and sunshine hours. Topographic effect is critical for temperature, precipitation, and solar radiation. Aside from spatial interpolation of temperature, the lapse coefficient showing negative effect of elevation on temperature was also taken into consideration. For any given site (pixel), s, temperature is calculated as:

$$ T_{\text{s}} = \sum\limits_{i = 1}^{n} {(W_{i} T_{i} )} $$
(2)

where \( T_{\text{s}} \) is the temperature at site s; \( W_{i} \) is the weight coefficient for station i; \( T_{i} \) is the adjusted temperature for station i; and n is the total number of weather stations.

$$ W_{i} = {\frac{1}{{d_{i}^{2} }}} $$
(3)

where \( d_{i} \) is the distance between site s and station i.

$$ T_{i} = T_{{{\text{o}}i}} - a(h_{s} - h_{i} ) $$
(4)

where \( h_{s} \) is the site elevation; \( h_{i} \) is the elevation of station i; a is the lapse coefficient (which varies from 0.56°C/100 m in December to 0.83°C/100 m in June) from regression analysis on the 40 stations; and \( T_{{{\text{o}}i}} \) is the observed temperature at station i.
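Equations 2 to 4 can be combined into a single routine, sketched below. Two points are assumptions of this sketch rather than statements from the text: the inverse-distance weights are normalized to sum to one so that the result is a weighted mean, and a site that coincides with a station simply takes that station's adjusted value. The lapse coefficient is passed in because it varies by month.

```python
import numpy as np

def idw_temperature(site_xy, site_elev, stn_xy, stn_elev, stn_temp, lapse):
    """Elevation-adjusted IDW temperature at one site (Eqs. 2-4).

    lapse : lapse coefficient a in degrees C per 100 m; elevations in m.
    """
    stn_xy = np.asarray(stn_xy, dtype=float)
    dx, dy = (stn_xy - np.asarray(site_xy, dtype=float)).T
    d = np.hypot(dx, dy)
    # Eq. 4: move each station temperature to the site elevation
    dh = site_elev - np.asarray(stn_elev, dtype=float)
    t_adj = np.asarray(stn_temp, dtype=float) - lapse * dh / 100.0
    if np.any(d == 0):
        return float(t_adj[d == 0][0])
    # Eq. 3: inverse squared distance, normalized here (assumption)
    w = 1.0 / d**2
    w /= w.sum()
    # Eq. 2: weighted sum of adjusted temperatures
    return float(np.sum(w * t_adj))
```

For two equidistant stations the result is the plain average of their elevation-adjusted temperatures.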

Similar to that for temperature, the relationship between precipitation and elevation is developed and lapse coefficient derived (Yang et al. 2006). The regression relation between precipitation and elevation is expressed as:

$$ P = b_{0} + b_{1} {\text{ele}} + b_{2} {\text{ele}}^{2} $$
(5)

where P is the precipitation; ele is the elevation; and \( b_{0} \), \( b_{1} \), and \( b_{2} \) are the regression coefficients fitted to the 40 stations. IDW was used to interpolate precipitation at any given pixel using adjusted precipitation data for neighboring stations (Fig. 2b and c).
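The quadratic relation in Eq. 5 can be fitted by ordinary least squares, for example with `numpy.polyfit`. The sketch below covers only the fit; how the fitted curve is then used to adjust station precipitation before IDW is not detailed in the text and is not reproduced here. The function name is hypothetical.

```python
import numpy as np

def fit_precip_elevation(ele, precip):
    """Least-squares fit of P = b0 + b1*ele + b2*ele**2 (Eq. 5).

    Returns (b0, b1, b2). np.polyfit returns the highest power first,
    so the coefficients are reordered to match Eq. 5.
    """
    b2, b1, b0 = np.polyfit(np.asarray(ele, dtype=float),
                            np.asarray(precip, dtype=float), deg=2)
    return b0, b1, b2
```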

After interpolating sunshine hours from neighboring station data, the effect of latitude, elevation, aspect, slope, and shade on solar radiation was estimated following Ye et al. (2004). Reference evapotranspiration (ET0) is a critical water balance factor, and is widely used in climate-vegetation studies (Neilson 1995). The FAO Penman–Monteith equation (Allen et al. 1998) was used to calculate ET0 in this study. Spatially interpolated data for air temperature, wind speed, solar radiation, and relative humidity were used in the ET0 calculation (Fig. 2d).
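For orientation, the daily FAO Penman–Monteith equation of Allen et al. (1998) can be sketched as below. Several simplifications are assumptions of this sketch and not the authors' procedure: net radiation is supplied directly rather than derived from solar radiation, saturation vapour pressure is computed from mean temperature alone, and atmospheric pressure defaults to a sea-level value instead of being derived from elevation.

```python
import math

def fao56_et0(t_mean, u2, rn, rh_mean, pressure=101.3, g=0.0):
    """Daily reference evapotranspiration ET0 (mm/day), FAO Penman-Monteith.

    t_mean   : mean air temperature (degrees C)
    u2       : wind speed at 2 m height (m/s)
    rn, g    : net radiation and soil heat flux (MJ m-2 day-1)
    rh_mean  : mean relative humidity (%)
    pressure : atmospheric pressure (kPa); sea-level default is an assumption
    """
    es = 0.6108 * math.exp(17.27 * t_mean / (t_mean + 237.3))  # saturation vapour pressure (kPa)
    ea = es * rh_mean / 100.0                                  # actual vapour pressure (kPa)
    delta = 4098.0 * es / (t_mean + 237.3) ** 2                # slope of vapour pressure curve
    gamma = 0.000665 * pressure                                # psychrometric constant
    num = 0.408 * delta * (rn - g) + gamma * 900.0 / (t_mean + 273.0) * u2 * (es - ea)
    den = delta + gamma * (1.0 + 0.34 * u2)
    return num / den
```

Because ET0 depends only on atmospheric demand, it can be high in places where actual water availability is low.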

Spatial autocorrelation

Spatial autocorrelation measures similarity of samples for a given variable as a function of spatial distance (Legendre 1993). Moran’s I (Legendre and Legendre 1998) is used in this study to evaluate the spatial pattern of LAI, environmental factors (elevation, temperature, precipitation and ET0), and the regression model residual. Under the null hypothesis that no spatial autocorrelations exist among the variables, Moran’s I has an expected value near zero for large numbers of samples. Positive and negative values respectively indicate positive and negative autocorrelations. Tests of significance of Moran’s I are done via Bonferroni correction following Legendre and Legendre (1998).
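One point of a spatial correlogram can be computed as in the following sketch, which evaluates Moran's I for a single distance class using binary weights (1 for pairs whose separation falls in the class, 0 otherwise). The binary weighting and the function name are assumptions; the text does not specify the weighting scheme. The full pairwise distance matrix makes this suitable only for modest sample sizes.

```python
import numpy as np

def morans_i(values, coords, d_min, d_max):
    """Moran's I for pairs separated by a distance in (d_min, d_max]."""
    x = np.asarray(values, dtype=float)
    xy = np.asarray(coords, dtype=float)
    z = x - x.mean()
    # Pairwise Euclidean distances between all sample locations
    dist = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)
    # Binary weights for the distance class; self-pairs (distance 0) are excluded
    w = ((dist > d_min) & (dist <= d_max)).astype(float)
    return (len(x) / w.sum()) * (z @ w @ z) / (z @ z)
```

Repeating the calculation over successive distance classes and plotting I against lag distance gives a correlogram.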

Geographically weighted regression

Relationships among LAI, climate variables, and topographical factors such as elevation, aspect, and TWI were constructed using conventional regression methods (OLS) and GWR analysis. Compared with traditional OLS, GWR expands application of standard regression to spatial data, allowing for spatial change in parameters. Mathematically, GWR can be represented as:

$$ y = \beta_{0} (\mu ,\nu ) + \beta_{1} (\mu ,\nu )x_{1} + \cdots + \beta_{n} (\mu ,\nu )x_{n} + \varepsilon $$
(6)

where y is the dependent variable; \( x_{1} \) to \( x_{n} \) are the independent variables; (μ, ν) denotes the sample coordinate in space; and ε is the error term.

The parameter β is estimated from:

$$ \hat{\beta }(\mu ,\nu ) = (X^{\text{T}} W(\mu ,\nu )X)^{ - 1} X^{\text{T}} W(\mu ,\nu )y $$
(7)

where \( \hat{\beta }(\mu ,\nu ) \) is the estimate of β, X is the matrix of independent variables, and W(μ, ν) is the weighting matrix, which ensures that observations near the location have greater influence than those far away (also see Fotheringham et al. 2002).
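Equation 7 is a weighted least-squares solve repeated at each location. The sketch below evaluates it at one target point. The Gaussian distance-decay kernel and the fixed `bandwidth` parameter are assumptions made for illustration; the text does not state which kernel or bandwidth selection was used.

```python
import numpy as np

def gwr_local_coefficients(X, y, coords, target, bandwidth):
    """Local GWR estimate (X^T W X)^-1 X^T W y at one location (Eq. 7).

    X         : (n, p) independent variables (an intercept column is added)
    coords    : (n, 2) sample coordinates
    target    : (2,) location (mu, nu) at which to estimate
    bandwidth : Gaussian kernel bandwidth (assumed kernel, same units as coords)
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xd = np.column_stack([np.ones(len(X)), X])
    d = np.linalg.norm(np.asarray(coords, dtype=float)
                       - np.asarray(target, dtype=float), axis=1)
    # Diagonal of W(mu, nu): nearer observations receive larger weights
    w = np.exp(-0.5 * (d / bandwidth) ** 2)
    XtW = Xd.T * w
    return np.linalg.solve(XtW @ Xd, XtW @ y)
```

Calling the function at every sample location yields the maps of local coefficients; as the bandwidth grows, the weights flatten and the result approaches the global OLS estimate.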

The corrected Akaike information criterion (AICc) (Akaike 1973) is used here for comparing the performance of the models with both OLS and GWR. As a general rule, the lower the AICc, the closer the model approximation is to reality. Thus, the best model is the one with the smallest AICc (Fotheringham et al. 2002).
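As a hedged illustration, the AICc form given for GWR by Fotheringham et al. (2002) can be written as below. Here `trace_s` is the trace of the hat matrix (the effective number of parameters; for OLS it equals the number of fitted coefficients), and taking the error standard deviation as sqrt(RSS/n) is an assumption of this sketch. The software used in the study may compute the criterion differently.

```python
import math

def gwr_aicc(rss, n, trace_s):
    """AICc = 2n ln(sigma) + n ln(2 pi) + n (n + tr(S)) / (n - 2 - tr(S)).

    rss     : residual sum of squares
    n       : number of observations
    trace_s : trace of the hat matrix S (effective number of parameters)
    """
    sigma_hat = math.sqrt(rss / n)  # assumed estimate of the error standard deviation
    return (2.0 * n * math.log(sigma_hat)
            + n * math.log(2.0 * math.pi)
            + n * (n + trace_s) / (n - 2.0 - trace_s))
```

A model is preferred only if its reduction in residual variance outweighs the penalty for the additional effective parameters.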

Different combinations of the independent variables were designed to predict LAI (the dependent variable) via both GWR and OLS. Since temperature estimated from elevation is naturally highly correlated with elevation (Figs. 1 and 2b), simultaneous use of elevation and temperature was avoided. The independent variables of each model are listed in Table 1. In this study, 20,000 randomly selected grid cells were used as GWR and OLS inputs for the 185,000 km² study area.

Table 1 Independent variables, corrected Akaike Information Criterion (AICc), and coefficient of determination (R²) of designed models for ordinary least squares regression (OLS) and geographically weighted regression (GWR)

Results

Spatial autocorrelation is a precondition for the application of GWR. Spatial correlograms for LAI and the environmental variables of elevation, temperature, precipitation, and ET0 are shown in Fig. 3. All the factors show positive autocorrelation over short distances and negative autocorrelation at large distances. The spatial correlogram for LAI indicates that LAI is positively autocorrelated up to 200 km. For the environmental variables, positive spatial autocorrelation is up to 150 km for elevation and temperature, 200 km for precipitation, and 120 km for ET0. The spatial autocorrelation can be interpreted in terms of trends or linear gradients across the study area.

Fig. 3

Spatial correlograms for LAI, elevation, temperature, precipitation, ET0, and residuals of model 13 from OLS and GWR (closed circles indicate that Moran’s I values are significantly larger than the value expected under the null hypothesis of no positive autocorrelation, and open circles represent no significant positive autocorrelation)

OLS and GWR model performance

Table 1 shows AICc and coefficient of determination (R²) for each model using OLS and GWR methods. The dependent variable is LAI, while independent variables are individually listed in Table 1. The improvement of model performance is evident from OLS to GWR, both from the values of AICc and R² and from the F-test.

With GWR, models 13 and 14 perform best, with the lowest AICc and the highest R². Model 13 has the highest R² for OLS. The difference between models 13 and 14 is that the former includes elevation while the latter includes temperature. The influence of elevation and temperature on LAI is determined by comparing models 1 and 4, 8 and 9, and 11 and 12. Table 1 shows that temperature has less influence on LAI than elevation under OLS regression, whereas their contributions are similar under GWR.

Detailed statistics of results from models 13 and 14 for both OLS and GWR are given in Table 2. A parameter is considered non-stationary when the inter-quartile range of its local estimates is greater than the ±1 standard deviation range of the equivalent global parameter (Fotheringham et al. 2002). Table 2 clearly indicates that in model 13, all the inter-quartile ranges of parameters estimated by GWR fall outside the ±1 standard deviation range of the equivalent OLS parameters, except for ET0. OLS estimates for the parameters of precipitation and Cos(aspect) lie approximately between the first quartile and the median of the local estimates, showing that most local estimates of both parameters are higher than the OLS values. OLS estimates for the parameters of elevation and TWI lie between the minimum and the first quartile, indicating that most local parameters of elevation are much higher than the OLS value. It is interesting to note that the elevation parameter has a positive median value under GWR, whereas a negative relationship exists between LAI and elevation under OLS. The OLS estimate for ET0 is even smaller than the GWR minimum, indicating that all local ET0 estimates are higher than the OLS value.
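The non-stationarity criterion described above reduces to comparing two widths: the inter-quartile range of the local estimates and the full ±1 standard deviation band (two standard deviations wide) of the global estimate. A minimal sketch, with a hypothetical function name:

```python
import numpy as np

def is_nonstationary(local_estimates, global_sd):
    """True if the IQR of local GWR estimates exceeds the +/-1 SD band of the OLS estimate."""
    q1, q3 = np.percentile(local_estimates, [25, 75])
    # The +/-1 SD band around the global estimate is 2 * SD wide
    return bool((q3 - q1) > 2.0 * global_sd)
```

This is a descriptive screening rule rather than a formal test, which is why a separate significance test is also useful.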

Table 2 Descriptive statistic of parameter estimates for models 13 and 14 from both OLS and GWR

Similar to model 13, all inter-quartile ranges of GWR-estimated parameters in model 14 fall outside the ±1 standard deviation range of equivalent OLS parameter estimates. For temperature, the parameter is negative with a median value almost equal to that for OLS.

The above results suggest that the relationships between LAI and the independent variables are spatially non-stationary. This is also confirmed by Monte Carlo significance testing of the parameters. Figure 4a shows the spatial distribution of the coefficient of determination (R²) for model 13, with a median of 0.574 and a range of 0.121–0.801. High values are noted for the northern Yanshan Mountain regions, where spatial variations in LAI are large (Fig. 2a). This may be due to broad temperature and precipitation gradients (see Fig. 2b and c). In contrast, the southern Taihang Mountains, where LAI remains spatially similar, have relatively low values. This could reflect the role of water and heat gradients in driving LAI in the study area. Figure 4b illustrates the spatial distribution of the LAI residual for GWR. Residual values within the range ±1.0 dominate in most areas, with some high residuals in high-LAI areas (Fig. 2a). There are generally more positive residuals, suggesting that model 13 underestimates LAI under GWR.

Fig. 4

Spatial distribution of GWR-estimated LAI parameters: a R², and b LAI residual for model 13

Figure 3 also shows the spatial correlograms for the residuals of model 13 under OLS and GWR. The OLS correlogram indicates that the residuals are significantly and positively autocorrelated at lag distances of less than 200 km. In comparison, no positive spatial autocorrelation is noted for the residuals of GWR model 13 at any lag distance. This suggests that GWR could solve many of the problems of spatially autocorrelated error terms in the traditional global OLS model.

Ecological interpretation of GWR estimate

In Table 2, independent variable parameters range from negative to positive. In order to avoid compensating effects from other independent variables, GWR analysis is applied on each single independent variable (models 1–6), which is used to interpret spatially non-stationary relationships in the context of ecology.

Based on R² for models 1–6, model performance is better for the northern part of the Yanshan Mountains than for the Taihang Mountains. Furthermore, the spatial patterns of R² for all the models are very similar, though they have different ranges. Elevation and temperature have the strongest effect on LAI, followed by precipitation, TWI, aspect, and ET0. Generally, areas of high elevation and high climatic variability have high R². This could be driven by high variation in LAI in regions with high variability in climate. ET0 has the lowest effect on vegetation. This is to be expected because ET0 is only an indicator of potential evaporation rather than a direct measure of water availability.

For most of the regions, elevation, Cos(aspect), and precipitation have a positive influence on LAI, while temperature and ET0 have a negative influence. This is particularly true for the northern Yanshan Mountains, where there are relatively high variations in LAI and climatic variables. In contrast, R² is low for the Taihang Mountains, suggesting that climatic factors do not significantly influence LAI there. This could be heavily driven by the low variations in LAI: when variation in the dependent variable is small, variations in climatic factors can explain little of it. Furthermore, low variations in LAI suggest limited water, since LAI is relatively low in most of the southern regions of the Taihang Mountains.

TWI, however, has a mixed influence on LAI. The parameter has a median of −0.086 with a range from −0.268 to 0.050. Positive values are mainly found in the north of the Yanshan Mountains, the southwest of the Taihang Mountains, and the transition zone between the two mountain ranges, while negative values occur in other regions, especially where LAI and precipitation are relatively high. This presumably reflects that LAI may increase with TWI in arid regions, while the relationship between LAI and TWI is complex in relatively humid regions where TWI alone cannot indicate soil moisture.

Discussion

Superiority of GWR over global OLS

The importance of spatial autocorrelation in the analysis of geographical variation in ecological data has been addressed by a number of scholars (Legendre et al. 2002; Lichstein et al. 2002). A main advantage of GWR, as pointed out by Fotheringham et al. (2002), is that it provides more directly interpretable solutions to spatially autocorrelated error terms of regression models applied to spatial data.

This article describes spatial relations among LAI and environmental variables in semiarid Haihe Catchment, China via two different regression techniques—GWR and global OLS. Although OLS shows a clear (either negative or positive) and direct effect of different climatic and topographical factors on vegetation status, the results of GWR provide more accurate predictions by significantly reducing autocorrelation and absolute regression residuals (Foody 2003, 2004; Wang et al. 2005a; Kupfer and Farris 2007; Propastin and Kappas 2008). Furthermore, GWR spatially reflects the effect of climate and topographic factors on vegetation, which is not possible with OLS.

Generally, vegetation growth declines as altitude increases because of decreasing temperature. However, this is not always true for water-scarce regions, as both low temperature and high precipitation at high elevations could result in better water conditions, which in turn support strong vegetation growth. Several studies in North China have observed such a trend. In this study, similar trends are noted under GWR, with a positive median parameter value of 0.0001 and a range of −0.0047 to 0.0049 between LAI and elevation (Table 2). For OLS, however, elevation is negatively related to LAI (−0.0011). This suggests that the global OLS estimate is misleading, possibly because of interactions among the variables used in model 13.

Major factors influencing LAI and spatial heterogeneity

It is well established that LAI is concurrently influenced by heat and water (Wang et al. 2008). In this study, LAI for the natural vegetation of the mountain regions of Haihe Catchment exhibits spatial correlation with climatic (temperature, precipitation, and ET0) and site (elevation and aspect) conditions as revealed by remote sensing data and GWR analysis. This is similar to the findings of Yang et al. (2006) and Zhang et al. (2006). Using sample plot data and ordinary regression analysis, both studies show that vegetation growth in Taihang Mountains has significant correlation with environmental variables.

Climate is the most critical factor influencing vegetation. Since temperature generally lies within a suitable range for vegetation growth in the study area, precipitation and evapotranspiration (respective indicators for water availability and consumption) largely determine water condition of the vegetation. The fact that ET0 only indicates potential water requirement rather than water deficit renders it less important in LAI analysis.

Traditionally, field survey sample plot data are used to build relationships between vegetation and climatic and topographic variables. Field data are typically collected over relatively small areas near meteorological stations. In this study, the meteorological stations are limited and sparse, with only 40 stations in an area of over 185,000 km². Thus, even though interpolations generally have accuracy concerns, using topographical data to estimate climatic factors at a given site is worthwhile. With globally available DEMs, spatial estimation of temperature, precipitation, and ET0 based on topographic effects can give detailed information on water and heat conditions in a pixel. To verify the effect of interpolation on spatial autocorrelation of the variables, autocorrelations of temperature and precipitation from the 40 ground-based stations and from the interpolated data were compared. Both variables show positive autocorrelation over short distances and negative autocorrelation at large distances. Temperature and precipitation are positively autocorrelated up to 150 km and 200 km, respectively. There is only a very slight increase in Moran’s I over short distances for the spatially interpolated temperature and precipitation compared with the 40 ground-based station data. This further suggests that the interpolation did not introduce any artificial influence on the GWR application.

Variation in topography determines microclimatic conditions at any given site (Bennie et al. 2008). For example, soil moisture is normally higher on north-facing than on south-facing slopes. This explains the positive correlation between Cos(aspect) and LAI. Similar results are reported by Yang et al. (2006), whose study suggests that south-facing slopes are not good for forest growth. Furthermore, elevation is positively related to precipitation and negatively related to temperature. This dual effect of elevation favors increased water supply and less water depletion, conditions that favor healthy plant growth. This makes elevation the most crucial factor (even over temperature) in estimations of LAI.

OLS analysis, however, shows that precipitation has the greatest influence on LAI, followed in decreasing order of importance by elevation, TWI, ET0, aspect, and temperature (see Table 1).

GWR analysis reveals that different driving factors influence LAI in different regions of the Haihe Catchment. For instance, the precipitation gradient is the most important driver of LAI variation and of good model performance in rugged terrain, especially in the northeast region. On the other hand, temperature is the most limiting factor for LAI variation in the hilly areas along the eastern regions of the Taihang Mountains. Theoretically, an increase in temperature not only increases respiration rate but also aggravates the shortage of plant water supply, which in turn decreases the rate of photosynthesis (Wang et al. 2005a). Under similar elevation, temperature, and precipitation conditions, topographic factors such as aspect and TWI then become crucial for LAI analysis.

Combining analyses from GWR and OLS, it is, therefore, clear that water availability is the most limiting factor for vegetation growth.