Introduction

Forest biomass and net primary productivity (NPP) form an integral part of the global terrestrial carbon cycle by operating as a sink or source of atmospheric CO2 (Dixon et al. 1994; Pan et al. 2011). These are crucial biophysical parameters for understanding ecosystem functioning in any forested landscape. Forest biomass and NPP are known to be causally correlated (Cramer et al. 1999). However, merely observing an increase in the carbon storage of forest vegetation is insufficient to infer an increase in productivity (Keeling and Phillips 2007). Thus, site-specific factors influencing the process of carbon assimilation, viz. temperature, moisture and nutrient availability, must be taken into consideration for productivity assessment (Lieth 1975; Melillo et al. 1993; Laurance et al. 1999; Knapp and Smith 2001; Malhi et al. 2004; Raich et al. 2006).

Forest biomass represents the potential amount of living or dead organic matter that gets added to the biosphere, whereas NPP signifies the rate at which net assimilation of organic carbon by green vegetation occurs over a period. Thus, their quantification becomes imperative for discerning the energy flow exchanges and nutrient fluxes in a terrestrial ecosystem, more importantly in complex mountainous terrain (Chave et al. 2008; Shirima et al. 2015). The two most widely followed approaches for estimation of forest above ground biomass (AGB) are ground-based methods and remote sensing (RS)-based methods combined with field data (Roy and Ravan 1996; Kale et al. 2002; Tiwari et al. 2005; Kumar et al. 2011; Singh et al. 2012; Devagiri et al. 2013; Salunkhe et al. 2016). Ground-based methods furnish precise information but are time- and labor-intensive and destructive to semi-destructive in nature (de Gier 2003). They also fail to capture the overall spatiotemporal micro-climatic variability when practiced in rugged mountainous terrain on a large scale. An alternative is to use satellite data based on RS techniques, since these can provide synoptic and time-series coverage of an area (Kale et al. 2002; Patenaude et al. 2005; Rosillo-Calle 2007; Ravindranath and Ostwald 2008). Maisongrande et al. (1995), Nelson et al. (2000), Lu et al. (2004) and Lu (2005) have utilized satellite data-derived variables as predictor variables for geospatial modeling of forest AGB by applying either empirical or biophysical process-based models (Kale et al. 2002; Gasparri et al. 2010; Manna et al. 2014). However, selection of the method is usually based on the availability of the input parameters and on the micro-climatic and topographic conditions of the region.
In the Indian scenario, AGB assessments were done by developing regression equations between crown cover and stand biomass using satellite data in different eco-regions of the western Himalaya (Tiwari and Singh 1984; Tiwari 1994; Tiwari et al. 2005). Kale et al. (2002, 2005) and Kumar et al. (2011) applied a similar approach for AGB assessment in tropical forest ecosystems of Central India and the Shivalik Himalaya. Singh et al. (2012) performed spatial up-scaling of AGB using multi-season NDVI images of MODIS satellite data in temperate forests of Jammu & Kashmir Himalaya. Upgupta et al. (2015) used very high-resolution data of Cartosat-1 and QuickBird to assess AGB of forest plantations. Though the majority of studies have used a few vegetation indices (VIs) for spectral modeling of AGB, many authors have reported saturation of VIs at higher biomass values (Steininger 2000; Kasischke et al. 2014). Hence, attempts are being made the world over to develop suitable regression models (Lu 2006; Mutanga et al. 2012). The random forest (RF) algorithm is an ensemble technique that exploits bagging (bootstrap aggregation) of decision trees, with random selection of predictor variables at each split, to perform classification and regression analyses (Breiman 2001). With a modest fine-tuning of parameters, the RF algorithm produces outcomes with high accuracy at high computational speed (Gislason et al. 2006; Lawrence et al. 2006; Meacham et al. 2016; Liu et al. 2017; Safari et al. 2017; Le et al. 2018; Pandit et al. 2018a; Teluguntla et al. 2018). The algorithm is also capable of identifying the important independent variables w.r.t. the dependent variable by using the recursive feature elimination (RFE) function (Guyon and Elisseeff 2003; Ismail and Mutanga 2010; Belgiu and Drăguţ 2016; Dang et al. 2019).

NPP is recognized as a key indicator for assessing ecosystem pattern, processes and overall health. The increased availability of RS data has increased the number of models simulating the dynamics of NPP over time. Cramer et al. (1999) and Ruimy et al. (1999) have reviewed productivity models in two groups, viz. production efficiency models (PEMs) and canopy production models (CPMs). The diagnostic PEMs rely on the concept of light use efficiency (LUE), ‘the efficiency with which light energy is used by the vegetation to sequester carbon’ (Monteith 1972; Kumar and Monteith 1981). A few examples of PEMs coupled with RS data are Global-PEM (Prince and Goward 1995), Carnegie–Ames–Stanford Approach ‘CASA’ (Potter et al. 1993), Carbon Fixation (C-Fix), etc. The prognostic CPMs such as Global Biome Model (Haxeltine and Prentice 1996), Carbon Assimilation in the Biosphere (Warnant et al. 1994), HYBRID (Friend and Cox 1995), etc., follow the principles of major biophysical processes, viz. photosynthesis, respiration and allocation of assimilates. Modeling of NPP through CPMs requires various field-derived eco-physiological inputs, whereas the LUE model utilizes the fraction of absorbed photosynthetically active radiation (fAPAR) and maximum LUE (εmax). Roy and Jain (1998), Kale et al. (2002) and Kale and Roy (2012) estimated NPP using PEMs in tropical deciduous forests of central India. Chhabra and Dadhwal (2004), Nayak et al. (2010), Chitale et al. (2012) and Nayak et al. (2013, 2015) used C-Fix and CASA models to estimate monthly NPP over the Indian subcontinent. Singh et al. (2011) studied the inter-annual variability in NPP using the Global-PEM model. Tripathi et al. (2018) and Behera et al. (2019) estimated monthly NPP in the forests of Northern India using CASA and Biome BGC models. NPP estimation has now moved toward monitoring of in situ carbon fluxes (Chhabra and Dadhwal 2004; Goroshi et al. 2014; Dadhwal 2012; Deb Burman et al. 2017).
However, due to unavailability of in situ data, only a limited number of studies related to biomass and carbon dynamics are reported from the remote interiors of the Indian Himalayan Region (IHR). Despite holding 39.33% forest cover of the total geographical area, the forest ecosystems of IHR are insufficiently studied in terms of NPP, specifically at basin and watershed scales. Under such a scenario, the present study is an attempt (a) to estimate forest AGB using the RF algorithm and (b) to assess spatiotemporal variation in forest NPP using the LUE approach in a Himalayan watershed.

Materials and methods

Study area

Aglar watershed lies from 30° 27′ 4″ to 30° 38′ 5″ N and 77° 56′ 15″ to 78° 18′ 45″ E in Tehri-Garhwal district of Uttarakhand, India (Fig. 1a). The watershed covers an area of ~ 307.28 km2 with altitudinal variation of 690–3015 m amsl, representing tropical to humid temperate biomes. The highest rainfall is recorded in the months of June to September and the lowest in December, with an annual average of > 2000 mm (IMD 2015). The altitudinal gradient and climatic conditions favor a variety of vegetation formations. The dominant vegetation types are gregarious formations of Himalayan Moist Temperate forest of Quercus leucotrichophora, Quercus floribunda and Cedrus deodara, Himalayan subtropical Pine forest (Pinus roxburghii) and tropical mixed miscellaneous forest with dominant species like Grewia optiva, Terminalia chebula, Bauhinia variegata, etc. However, the lower elevations of the southern aspect are mostly dominated by subtropical scrub and grasslands interspersed with agricultural fields (Fig. 1b).

Fig. 1 a Location map of Uttarakhand, India b LULC and c homogeneity map of Aglar watershed (CD = Cedrus deodara, MF = Mixed forest, PR = Pinus roxburghii, QM = Quercus mixed, QS = Quercus scrub), and d LAI sampling points per 0.1 ha plot

Satellite data and preprocessing

The ortho-rectified images of Landsat-8 OLI with 30 m spatial resolution were used for mapping of forest cover type, forest density and modeling of forest AGB and NPP (illustrated in Fig. 2). The cloud-free data for each month of the year 2015 were downloaded from https://earthexplorer.usgs.gov, except for July and August, when thick cloud cover prevailed during the monsoon season. For modeling of forest AGB, the 17 April 2015 image was chosen, as forest vegetation has distinct foliage cover at that time. The entire dataset was atmospherically corrected, and VIs, textural components and linearly transformed images were then generated from it (Table 1).

Fig. 2 Methodology paradigm for assessing geospatial pattern of AGB and NPP

Table 1 List of predictor variables computed from Landsat 8 OLI image for AGB modeling

Ground data collection

A stratified random sampling approach based on homogeneity map was followed to capture the variability of forest types, forest density and terrain conditions (Fig. 1c). The forest cover type and density cover maps were prepared from dry and growing season Landsat-8 OLI images. To determine sample size, a few samples were laid initially to ascertain the variance and range of bole diameter, tree height and biomass in each stratum. The sample size was determined using Chako’s formula (1965) (Eq. 1)

$$N = t^{2} \times \left( {{\text{Coefficient}}\;{\text{of}}\;{\text{variation}}} \right)^{2} /\left( {\% \;{\text{Standard}}\;{\text{error}}} \right)^{2}$$
(1)

where N is sample size and t is value of t test at 95% confidence level.
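
As a worked illustration of Eq. 1, the short Python sketch below computes the number of plots for one stratum. The function name and the pilot values (t = 1.96, coefficient of variation of 40% and allowable standard error of 10%) are hypothetical and are not taken from the study; rounding up to a whole plot is likewise an assumption.

```python
import math


def sample_size(t_value: float, cv_percent: float, se_percent: float) -> int:
    """Eq. 1: N = t^2 * CV^2 / (%SE)^2, rounded up to a whole number of plots."""
    if se_percent <= 0:
        raise ValueError("allowable standard error must be positive")
    n = (t_value ** 2) * (cv_percent ** 2) / (se_percent ** 2)
    return math.ceil(n)  # assumption: round up so the target precision is met


# Hypothetical pilot statistics for a single stratum (not values from the study)
n_plots = sample_size(t_value=1.96, cv_percent=40.0, se_percent=10.0)
print(n_plots)  # 62
```

Because N grows with the square of the coefficient of variation, strata with more variable pilot measurements receive proportionally more plots for the same allowable error.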

Consequently, considering probability proportional to size, 71 sample plots of 0.1 ha were distributed across the different strata. Relevé size was based on earlier studies of the Forest Department. Ground inventory data on species, tree diameter at breast height (i.e., 1.37 m), tree height, canopy closure, number of stories, soil characteristics, pH, etc., were noted. The monthly leaf area index (LAI) observations were collected from 14 representative plots with the help of a well-calibrated AccuPAR LP-80 ceptometer. The LAI measurements were taken by traversing the area in all directions during early or late hours of the day or under overcast sky conditions to avoid the speckle effect of direct sunlight (Fig. 1d).

Estimation of forest AGB

The species-specific volumetric equations developed by FSI (1996) were applied to get the volume of individual trees. Tree volume was multiplied by wood-specific gravity (FRI 2002) to obtain bole biomass. A biomass expansion factor was used to obtain total tree biomass (Haripriya 2002). The individual tree biomass was summed up to obtain plot-level biomass and then scaled to get pixel-level biomass. For geospatial modeling of AGB, the RF algorithm from the randomForest package was applied in the R environment. The algorithm combines large sets of decision trees, each grown on randomly selected sets of variables, to improve classification and regression analysis. We used the recursive feature elimination (RFE) function to find the important independent variables w.r.t. the dependent variable. RFE was performed over 96 variables consisting of 6 spectral bands, 5 linear transformed images, 7 simple ratios, 30 complex ratios and 48 textural (5 × 5) variables. An effective set of 24 variables was identified on the basis of the lowest RMSEC value, marked as a red circle in Fig. 3. For modeling, two-thirds of the samples (n = 53), also called in-bag samples, were utilized for training the algorithm, and one-third of the samples, i.e., out-of-bag ‘OOB’ samples (n = 18), were used for cross-validation to determine the model error (or OOB error). The major parameters required for proper optimization of RF were ‘ntree,’ the total number of regression trees grown from bootstrap samples of the observations; ‘mtry,’ the number of predictor variables examined at each node; and ‘node-size,’ the smallest size of the end nodes of the trees grown. After multiple iterations, the optimum ‘ntree’ selected was 500 at ‘mtry’ = 8, as it resulted in the smallest OOB error.
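
The field-data chain described above (tree volume, specific gravity, expansion factor, plot total, pixel value) can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names and tree values are hypothetical, the species-specific volume equations are not reproduced, specific gravity is converted to mass using the density of water (1000 kg m−3), and pixel-level biomass is assumed to be the per-hectare value multiplied by the area of a 30 m pixel.

```python
def tree_biomass_kg(volume_m3, specific_gravity, bef):
    """Total tree biomass: bole volume x specific gravity x expansion factor."""
    # Specific gravity is relative to water (1000 kg per m3), giving bole mass in kg
    bole_kg = volume_m3 * specific_gravity * 1000.0
    return bole_kg * bef


def plot_agb_mg_per_ha(trees, plot_area_ha=0.1):
    """Sum tree biomass over one plot and express it in Mg per hectare."""
    total_kg = sum(tree_biomass_kg(v, sg, bef) for v, sg, bef in trees)
    return total_kg / 1000.0 / plot_area_ha  # kg -> Mg, then per hectare


def pixel_agb_mg(agb_mg_per_ha, pixel_size_m=30.0):
    """Assumed scaling of per-hectare AGB to one square pixel (30 m for OLI)."""
    pixel_area_ha = pixel_size_m ** 2 / 10_000.0
    return agb_mg_per_ha * pixel_area_ha


# Hypothetical trees: (volume in m3, specific gravity, biomass expansion factor)
trees = [(0.5, 0.6, 1.5), (1.0, 0.5, 1.5)]
agb = plot_agb_mg_per_ha(trees)
print(round(agb, 2), round(pixel_agb_mg(agb), 3))
```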

Fig. 3 RFE-based selection of optimum variables

Estimation of NPP

The integration of LUE and process-based models was identified as an efficient approach to model NPP using RS data (Kale et al. 2002; Lu et al. 2010). NPP is predominantly governed by realized LUE (ε*) and absorbed photosynthetically active radiation (APAR) (Monteith 1972), which are related through Eqs. 2 and 3.

$${\text{NPP}} = \varepsilon^{*} \times {\text{APAR}}$$
(2)
$${\text{APAR}} = {\text{fAPAR}} \times {\text{PAR}}$$
(3)

where PAR stands for photosynthetically active radiation in the visible range (400–700 nm), calculated as 45–50% (Potter et al. 2003) of the shortwave radiation, and fAPAR is the fraction of PAR absorbed by the canopy, calculated using Eq. 4 (Ruimy et al. 1999).

$${\text{fAPAR}} = 0.95 \times \left( {1 - {\text{e}}^{{ - k \times {\text{LAI}}}} } \right)$$
(4)

k in Eq. 4 is the light extinction coefficient and was set as 0.5 (Jarvis and Leverenz 1983), and LAI is the leaf area index modeled by regressing field-observed monthly/seasonal LAI with corresponding NDVI.
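
As an illustration of Eqs. 3 and 4, the following minimal Python sketch converts LAI to fAPAR and then to APAR. The function names, the PAR fraction of 0.5 (the upper end of the 45–50% range quoted above) and the example inputs are assumptions made for illustration only; the LAI–NDVI regression itself is not reproduced because its functional form is not stated here.

```python
import math


def fapar_from_lai(lai, k=0.5):
    """Eq. 4: fAPAR = 0.95 * (1 - exp(-k * LAI)), with k = 0.5 as in the text."""
    return 0.95 * (1.0 - math.exp(-k * lai))


def apar(fapar, shortwave, par_fraction=0.5):
    """Eq. 3: APAR = fAPAR * PAR, taking PAR as a fraction of shortwave radiation."""
    # par_fraction = 0.5 is an assumed value within the quoted 45-50% range
    return fapar * par_fraction * shortwave


# Hypothetical monthly inputs: LAI of 2 and 500 MJ m-2 of shortwave radiation
f = fapar_from_lai(2.0)
print(round(f, 3), round(apar(f, 500.0), 1))
```

fAPAR rises steeply at low LAI and levels off toward 0.95 as the canopy closes, so errors in LAI matter most in open stands.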

The complex interactions between vegetation structure, soil moisture, climatic factors and solar radiation affect maximum LUE (\(\varepsilon_{\hbox{max} }\)) to govern the spatiotemporal variations of forest productivity (Kale et al. 2002; Nemani et al. 2003). Therefore, integration of temperature (Ts), moisture (Ws) and phenology (Ps) scalars was important in order to get ε* of a species in different forest ecosystems (Eq. 5).

$$\varepsilon^{* } = \varepsilon_{\hbox{max} } \times {\text{Ts}} \times {\text{Ws}} \times {\text{Ps}}$$
(5)

For the present study, \(\varepsilon_{\hbox{max} }\) values for different forest types were obtained from Nayak et al. (2010). Terrestrial ecosystem model-based Ts (Raich 1991) and LSWI-based Ws (Xiao et al. 2005) were obtained from Eqs. 6 and 7.

$${\text{Ts}} = \frac{{\left( {T - T_{\hbox{min} } } \right)\left( {T - T_{\hbox{max} } } \right)}}{{\left( {T - T_{\hbox{min} } } \right)\left( {T - T_{\hbox{max} } } \right) - \left( {T - T_{\text{opt}} } \right)^{2} }}$$
(6)

where Tmin, Tmax, Topt and T are minimum, maximum, optimum and average air temperatures (in °C), respectively. We used Tmin and Tmax (2.68 °C and 23 °C, respectively) as recorded by the Automatic Weather Station (AWS) installed in the area. Topt for photosynthesis in temperate evergreen forest ranges from 10 to 25 °C; it was set to 21 °C following Cunningham and Read (2002).
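
The behavior of Eq. 6 can be checked with a short Python sketch using the Tmin, Tmax and Topt values given above. The function name is hypothetical, and setting Ts to zero outside the Tmin–Tmax range is an assumption commonly made with this scalar rather than a step stated in the text.

```python
def temperature_scalar(t, t_min=2.68, t_max=23.0, t_opt=21.0):
    """Eq. 6: temperature scalar of the Terrestrial Ecosystem Model."""
    if t <= t_min or t >= t_max:
        return 0.0  # assumption: no assimilation outside the tolerance range
    num = (t - t_min) * (t - t_max)
    den = num - (t - t_opt) ** 2
    return num / den


# Ts equals 1 at the optimum temperature and declines toward either limit
for temp in (5.0, 12.0, 21.0, 22.5):
    print(temp, round(temperature_scalar(temp), 3))
```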

$${\text{Ws}} = \frac{{1 + {\text{LSWI}}}}{{1 + {\text{LSWI}}_{ \hbox{max} } }}$$
(7)

where LSWImax stands for maximum value of LSWI for each pixel in the growing period.

To account for leaf developmental stage, phenology scalar (Ps) was obtained from Xiao et al. (2004) (Eq. 8).

$${\text{Ps}} = \frac{{1 + {\text{LSWI}}}}{2}$$
(8)

A similar approach was adopted by Huang et al. (2010) to model NPP in the mountainous forests of Guangdong Province, China.
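
To show how Eqs. 2, 5, 7 and 8 combine, a minimal Python sketch is given below. All function names and input values are hypothetical; in particular, the εmax of 1.0 gC MJ−1 and the monthly APAR of 100 MJ m−2 are placeholders chosen for easy arithmetic, and the units are assumed so that NPP comes out in gC m−2 per month.

```python
def water_scalar(lswi, lswi_max):
    """Eq. 7: Ws = (1 + LSWI) / (1 + LSWImax)."""
    return (1.0 + lswi) / (1.0 + lswi_max)


def phenology_scalar(lswi):
    """Eq. 8: Ps = (1 + LSWI) / 2."""
    return (1.0 + lswi) / 2.0


def realized_lue(eps_max, ts, ws, ps):
    """Eq. 5: eps* = eps_max * Ts * Ws * Ps."""
    return eps_max * ts * ws * ps


def npp(eps_star, apar):
    """Eq. 2: NPP = eps* * APAR."""
    return eps_star * apar


# Hypothetical pixel values for one month (not taken from the study)
ws = water_scalar(lswi=0.2, lswi_max=0.5)   # 0.8
ps = phenology_scalar(lswi=0.2)             # 0.6
eps_star = realized_lue(eps_max=1.0, ts=0.8, ws=ws, ps=ps)
print(round(npp(eps_star, apar=100.0), 2))
```

Since every scalar lies between 0 and 1 for non-negative LSWI, the realized LUE can only be reduced from εmax, and the most limiting factor dominates the product.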

Results and discussion

Estimation of AGB

A total of 3317 trees were measured for bole diameter (dbh) and tree height. The tree density (per 0.1 ha) ranged from 23 to 147 in Quercus (Oak) mixed forest, 17 to 73 in Cedrus deodara (Deodar) forest, 12 to 56 in Pinus roxburghii (Pine) forest and 27 to 82 in mixed miscellaneous forest. The diameter ranged from 12.61 to 40.63 cm (± 6.45 SD) in Quercus forest, from 27.47 to 58.64 cm (± 10.71 SD) in Cedrus deodara forest, from 17.27 to 56.85 cm (± 12.48 SD) in Pinus roxburghii forest and from 19.78 to 21.26 cm (± 0.52 SD) in mixed forest.

AGB based on primary data ranged from 325.86 to 470.98 Mg ha−1 (± 37.44 SD) in Cedrus deodara forest, from 87.85 to 413.93 Mg ha−1 (± 88.70 SD) in Quercus mixed forest, from 173.31 to 367.75 Mg ha−1 (± 92.23 SD) in mixed miscellaneous forest, from 60.26 to 358.49 Mg ha−1 (± 85.74 SD) in softwood Pinus roxburghii forest and from 25.78 to 46.27 Mg ha−1 (± 12.98 SD) in scrub. It may be discerned from the range of AGB that Cedrus deodara forests have a good and uniform crop of trees and canopy density. The widest range was noticed in Quercus forest, which is an indication of site conditions and biotic pressures (Quercus is a chief fuel wood and fodder species). However, the vigor of the Quercus forest improves with slope toward the higher ridges, possibly because these areas are inaccessible. Pinus roxburghii forests grow on exposed, rocky, poor soils under high biotic pressure, and hence a wide range of AGB was observed. Lower AGB in Pinus roxburghii forest in comparison with Quercus mixed forest may be attributed to the high specific density of Quercus wood (Tiwari et al. 2005). The results are in close agreement with the study performed by Sharma et al. (2016) under similar forest types of the Garhwal Himalaya. AGB for Quercus mixed forest in this study was lower than that reported by Sharma et al. (2016); however, it was close to the AGB estimated by Dimri et al. (2017) under similar forest of Garhwal Himalaya.

For RF-based regression modeling of AGB, RFE ranked NDVI as the most important predictor variable, along with NNIR, ARVI, IPVI, GNDVI, etc., on the basis of %IncMSE and IncNodePurity (Fig. 4). The individual spectral bands (B2, B3 and B4) also formed a robust combination for prediction of AGB (Kumar et al. 2011; Singh et al. 2012; Vicharnakorn et al. 2014), and the presence of textural variables improved the prediction accuracy (Lu 2005). RF predicted the highest AGB in Cedrus deodara forest (407.73 Mg ha−1) and the minimum in mixed scrub (48.52 Mg ha−1), in agreement with primary data. High AGB was found in dense Quercus mixed forests occurring on the higher ridges in the watershed (Fig. 5). However, the results indicated that RF underestimated AGB in high-density forests with high AGB and overestimated it in low-density forests with low AGB (Pandit et al. 2018a, b). The average value of modeled AGB was found to be 268.22 Mg ha−1 in the present study. Singh et al. (2012) reported 210.48 Mg ha−1 in temperate forest in northwestern Himalaya in Kashmir valley. The high AGB reported in this study can be ascribed to favorable environmental conditions, viz. high moisture availability and soil organic carbon in Garhwal Himalaya as compared to the dry temperate regions of Kashmir valley (Kishwan et al. 2009). Soil pH also plays an important role in net carbon assimilation. The former has more acidic soils, with pH ranging from 5.50 and 6.64, respectively (Gairola et al. 2012; Wani et al. 2014), which could also be a plausible reason behind it.

Fig. 4 Relative importance of variables selected by RFE (refer Table 1 for details)

Fig. 5 Spatial distribution of AGB in Aglar watershed

The validation of RF-predicted vis-à-vis primary AGB at pixel level gave a coefficient of determination (R2) of 0.84, RMSE of 42.46 Mg ha−1, %RMSE of 19.49%, MAPE of 19.94% and MAE of 34.68 Mg ha−1 (Fig. 6). Safari et al. (2017) reported that RF outperforms with most important variables from Landsat-8 OLI data with low RMSE and moderate R2 values. Table 2 compares model statistics obtained from various regression models used to assess AGB over the world.
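
For readers wishing to reproduce such validation statistics, a minimal Python sketch is given below. The exact definitions used in the study are not stated, so two assumptions are made: R2 is computed as 1 − SSres/SStot, and %RMSE is RMSE expressed as a percentage of the mean observed AGB. The function name and the three observed/predicted pairs are hypothetical.

```python
import math


def validation_metrics(observed, predicted):
    """Return R2, RMSE, %RMSE, MAE and MAPE for paired AGB values."""
    n = len(observed)
    mean_obs = sum(observed) / n
    residuals = [p - o for o, p in zip(observed, predicted)]
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    rmse = math.sqrt(ss_res / n)
    return {
        "r2": 1.0 - ss_res / ss_tot,             # assumed definition of R2
        "rmse": rmse,
        "pct_rmse": 100.0 * rmse / mean_obs,     # assumed normalization by the mean
        "mae": sum(abs(r) for r in residuals) / n,
        "mape": 100.0 * sum(abs(r) / o for r, o in zip(residuals, observed)) / n,
    }


# Hypothetical plot-level AGB (Mg per ha): field-observed vs. modeled
stats = validation_metrics([100.0, 200.0, 300.0], [110.0, 190.0, 310.0])
print({name: round(value, 3) for name, value in stats.items()})
```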

Fig. 6 Validation of field-observed versus RF-modeled AGB

Table 2 Comparison of regression analysis-based models statistics for AGB prediction

Spatiotemporal pattern of NPP

Regression analysis performed between field-observed LAI and NDVI showed high correlation for each corresponding month (Fig. 7). Thus, NDVI images were used for geospatial modeling of LAI and fAPAR for each month. The NPP ranged from 102 to 1056 gC/m2/year with an average of 561.32 gC/m2/year. It was the highest in Quercus mixed forest (663.19 gC/m2/year), followed by forests of Cedrus deodara, Pinus roxburghii and mixed forest with 468.79, 369.07 and 250.57 gC/m2/year, respectively. Monthly variation of NPP among all the forest types studied is shown in Fig. 8a, b. The highest carbon assimilation of 95,148,073.9 gC/year was found in Quercus mixed with evergreen broadleaf species, followed by the needle-leaved forests of Cedrus deodara and Pinus roxburghii with 5,752,954.1 gC/year and 1,863,187.7 gC/year, respectively, and mixed forests with 2,634,737.1 gC/year. A comparison of our findings with global reports is presented in Table 3.

Fig. 7 Scatter plot correlation analysis between field-observed monthly LAI and NDVI

Fig. 8 Distribution of NPP statistics a overall monthly variation b forest-type-wise variation

Table 3 Comparison of NPP estimates among similar forest cover types in the world

Spatiotemporal variations in monthly NPP were noticed across the watershed (Fig. 9). The significant effect of climatic conditions, viz. temperature, precipitation events and PAR, was evident on forest productivity (Fig. 10). NPP peaked during the growing season, which starts from May and culminates in November–December (Fig. 10a). It was the highest in October, which may be attributed to optimal environmental conditions for photosynthesis soon after the receding of the monsoon (Fig. 10b, c). The productivity begins to decline with the onset of the dry winter months in December, and the decrease continues till January. It is primarily due to very low temperature, which decreases the rate of photosynthesis (Zhu et al. 2006). With the rise in temperature and moisture in February, the leaf flushing or ‘green wave’ accelerates, and this, coupled with the rise in temperature in the subsequent months, helps in leaf expansion and maturity (Raich et al. 2006). However, at this stage the dry summer constrains carbon assimilation by the plant foliage due to lack of moisture in soil and atmosphere. Such temporal variations of NPP were also reported by Feng et al. (2007) while studying temperate forests of China in the neighboring region. Other factors such as increased cloud cover reduce the availability of PAR (Beer et al. 2010), which negatively affects the process of carbon assimilation. This phenomenon was evident in the month of September (Fig. 10d). The NPP estimates for major forest types in this study were found to be coherent with the findings of Feng et al. (2007).

Fig. 9 Spatiotemporal distribution of NPP in Aglar watershed

Fig. 10 Mean monthly variation in NPP w.r.t. a environmental variables and b, c, d are their effect on different forest types of Aglar watershed

Conclusion

Himalayan ecosystems exhibit intricate relationships among the environmental variables because of their arc shape, complex terrain (slope, aspect, elevation) and varied orography. Forest biomass and NPP are recognized as important climate regulatory ecosystem services in South Asia furnished by Himalayan forest ecosystems. In-depth knowledge of these biophysical variables is helpful in understanding the status and functioning of forest ecosystems and their sequestration potential, and in framing climate-change mitigation strategies. It is established from the study that temporal primary data and corresponding seasonal earth observation data have good potential for geospatial modeling of AGB using RF and of NPP using LUE in a medium-sized watershed. The RF-based RFE function helped us in prioritizing and identifying the 24 most important variables out of 96 contributing to AGB prediction. The results had lower RMSE and higher R2 than other regression equation-based simple spectral models. The process-based models coupled with earth observation data provided a better understanding of highly dynamic productivity in ecosystems. LUE is a key factor for NPP assessment and is affected by climatic factors such as rainfall, temperature, moisture/humidity and PAR, and by the age of the leaves and of the plant itself. However, PAR, temperature and precipitation were the dominant factors governing it. Climatically, Aglar watershed lies in the westernmost limits of humid temperate forests in the outer Himalayan zone and not far from the vast Indo-Gangetic Plains; thus, it is very likely that NPP would vary from other temperate regions with high or low rainfall and humidity. We report lower NPP compared with earlier reports from other temperate forest ecosystems. The approach is robust and simple and can serve as a good alternative for reliable estimation of AGB and carbon storage potential, especially in areas where ground data are scarce.
Such a watershed-level study would help to understand complex biogeochemical processes by improving regional and global scale models of climate change and NPP. However, the NPP estimates need to be further tested and validated with in situ measurements, viz. carbon flux tower observations.