Introduction

Forests encompass nearly one-third of the Earth’s land cover (FAO, 2015), and they play a key role in global water and carbon cycles (IPCC, 2006; UNFCCC, 2016) and serve as a significant reservoir of raw materials, fuel, and other ecosystem services (Binder et al., 2017). It is crucial to have accurate data on forest resources to manage forests sustainably, especially in tropical regions where forests make up nearly 40% of all terrestrial biomass and store almost 17% of all land-based carbon stocks (Lucas et al., 2004). Between 1990 and 2000, the extent of temperate forests increased by about 3 million hectares per year, while tropical forests lost an average of more than 12 million hectares annually during the same time frame. Uncertainty in biomass variation is greatest in tropical forests, posing a significant challenge in estimating the carbon flux dynamics in the area (Millennium Ecosystem Assessment, 2005). The United Nations Framework Convention on Climate Change (UNFCCC) has recognized forest biomass as a crucial climate variable necessary to decrease the uncertainties in our understanding of the climate system (GCOS, 2010).

Different methods have been developed to assess forest biomass, and among those, traditional field-based approaches are the most accurate ones for biomass estimation. However, these approaches are generally labor-intensive and time-consuming, as well as having constraints in providing continuous spatial distribution of biomass for large areas (Brown et al., 2002). The estimation of above-ground biomass of forest ecosystems by employing earth observation data has drawn a lot of attention in recent years for a number of reasons, including the capacity to spatially extrapolate ground measurements on forest biophysical parameters, which facilitates mapping of AGB of large areas, the increased accessibility of various remote sensing data types, and the critical nature of the estimation of forest biomass for the conservation of forests and the evaluation of carbon stock and carbon fluxes (Verkerk et al., 2014; Corona, 2016). Among the several sensor types, synthetic aperture radar (SAR) exhibits the greatest potential for estimating the above-ground biomass of forests due to its sensitivity to the plant canopy and penetrating capabilities (Le Toan et al., 1992).

SAR sensitivity to AGB varies with wavelength because wavelength controls how deeply microwave signals penetrate the canopy and how much they scatter off woody and other structural elements of the vegetation. Longer wavelengths penetrate more deeply into the canopy and scatter more strongly from the tree trunks (Du et al., 2000; Saatchi & McDonald, 1997). Many studies have demonstrated this sensitivity by correlating the SAR backscatter to AGB at various frequencies such as P-band (Sandberg et al., 2011; Santos et al., 2003; Saatchi et al., 2011), L-band (\(\sim\) 15–30 cm wavelength) (Cartus et al., 2012; Lucas et al., 2010), S-band (\(\sim\) 7.5–15 cm wavelength) (Ningthoujam et al., 2016; Ningthoujam et al., 2017), and C-band (\(\sim\) 4–8 cm wavelength) (Dobson et al., 1992; Pulliainen et al., 1999; Vaghela et al., 2021). Previous research has shown that L-band cross-polarization (L-HV) is the best available option (because P-band spaceborne SAR is not yet accessible), although S- and C-band SAR are also effective for retrieving AGB in low-biomass forests. In earlier studies, the SAR signal saturated rapidly in high-biomass forests: C- and S-band below about 50 t/ha, and L- and P-band at \(\le\) 100 t/ha and \(\le\) 200 t/ha of biomass, respectively (Imhoff, 1995; Le Toan et al., 1992; Luckman et al., 1997; Ningthoujam et al., 2017; Schlund & Davidson, 2018).

Empirical correlation between field-measured AGB and SAR backscatter intensity has been a popular technique for estimating AGB using SAR since the 1980s. Previous research has shown that using SAR data in multiple frequencies and polarizations in the regression models can improve AGB estimation (Harrell et al., 1997; Kellndorfer et al., 1998; Wagner et al., 2003). However, the complexity of the tree’s architecture, the distribution of its leaves and branches, its electromagnetic properties, and other factors like topography, soil moisture, and nearby disturbances had a significant impact on the empirical relationships (Luckman et al., 1998). In numerous studies that used empirical regression techniques for AGB retrieval, significant deviations from the regression line were reported (Dobson et al., 1995; Harrell et al., 1997). To avoid this, canopy scattering models (Ferrazzoli et al., 1995; Karam et al., 1995; Mougin et al., 1993; Saatchi & McDonald, 1997; Tavakoli et al., 1993; Ulaby et al., 1990; Wang & Qi, 2008) that take into account the structural components of the vegetation stands can be used to predict AGB. Physical models can represent the canopy as either a group of discrete scatterers or a random continuous medium, but the former approach has the advantage of being a more appropriate description of the canopy. Therefore, models that describe the canopy as a collection of random scatterers of varying sizes and orientations with given shapes, namely cylinders, and disks representing branches and leaves, respectively, in a homogeneous medium, are the most plausible kinds of models. Models can be developed from an electromagnetic perspective using either wave theory or energy transport theory. 
To characterize the varying component of the medium’s dielectric constant, scattering models based on wave theory (the distorted Born approximation) (Fung et al., 1978; Saatchi & McDonald, 1997; Soja et al., 2020; Tsang & Kong, 1981) use a correlation function to account for the complexity of the medium. This approach is suitable for media with weak scattering where the fluctuating component of the dielectric constant has a small ratio to the mean medium value (Lee & Kong, 1985; Ulaby et al., 1986). On the other hand, models based on the energy transfer (radiative transfer theory approach) (Eom & Fung, 1984; Karam et al., 1995; Ulaby et al., 1990) account for multiple scattering by taking the average of the Stokes parameters over the probability distributions of the orientation, shape, and size of the canopy components (Ulaby et al., 1986). This method relies on the theoretical approximation of vegetation backscattering premised on first- or second-order radiative transfer functions (RTFs), as the mathematical formulation of the scattered radiation results in integro-differential equations with open-ended solutions. The use of canopy scattering models in temperate and boreal forests with sparse to medium densities (Dobson et al., 1995; Ranson & Sun, 1997; Saatchi & Moghaddam, 2000) has been extensively studied. However, there have been very few studies on scattering models used in tropical forest areas (Wang & Qi, 2008).

This work builds upon a detailed investigation of the potential of dual-polarized multi-frequency (L-, S-, and C-band) SAR backscatter for estimating the above-ground biomass of tropical vegetation by exploiting a microwave scattering model. The microwave scattering model for the vegetation layer was built based on the framework suggested by Karam and Fung (1988), and the backscattering from the underlying surface was modeled with the improved integral equation model (\(I^{2}EM\)) put forth by Fung and Chen (2010). The total backscatter intensity was simulated by combining these two models. In this proposed methodology, the vegetation was modeled as a layer of defoliated trunks and approximated as a layer of dielectric cylinders with finite heights. The model allows retrieving the backscattering from above-ground woody structures containing most of the tree biomass. Constrained nonlinear minimization of a cost function was used to invert the simulated backscatter intensity and to retrieve the biophysical parameters, i.e., the tree height and the trunk radius, at each pixel of ALOS-2, NovaSAR, and Sentinel-1 images. The biophysical parameters retrieved from the model are then applied to allometric equations to estimate the AGB. Additionally, the dependence of SAR backscatter on AGB is illustrated by the single-frequency relationships with saturation levels.

Study area and data

The study area is located on the western slopes of the southern Western Ghats in the Thiruvananthapuram district of Kerala. It covers an approximate area of 151 \(km^{2}\) with varying topography and vegetation types. The terrain is undulating, with elevations ranging from 100 m at reservoir level to 1777 m near Agasthyamalai peak. The eastern portion of the area is characterized by steep slopes, cliffs, and rocky outcrops and comprises numerous waterfalls and intact forests. The terrain on the western side is rather gentle, with disturbed forests and plantations. The region is mainly characterized by moderately to steeply undulating terrain units, except for a few isolated hillocks. The study area delineated is shown in Fig. 1 with a land cover map produced using the maximum likelihood classification of Sentinel-2 imagery.

Fig. 1 Location map of the study area

The study area has two climatic regimes: tropical and montane subtropical. Temperature varies considerably with location, topography, and altitude. The mean annual rainfall is about 300 cm, contributed by both the southwest and northeast monsoons. Owing to these differing climatic and topographic characteristics, the vegetation of the study area is remarkably variable. Primary vegetation types in the study area include semi-evergreen, wet evergreen, and tropical moist deciduous forests and plantations of rubber, eucalyptus, and acacia. Moist deciduous forests and plantations are mainly spread in the lower elevation areas, whereas wet evergreen forests are confined to high-elevation regions in the eastern part of the study area, and semi-evergreen forests are mainly located adjacent to the streams in medium- to high-elevation areas. All vegetation types have visible seasonal variation, except for evergreen forests. The area distribution of the vegetation types in the study area is given in Table 1. The most prevalent vegetation type in the study area is moist deciduous forest, which covers 39.61 \(km^{2}\) or 26.23% of the entire geographical area. The semi-evergreen forest, which also occupies about 26% of the area, is the second largest vegetation type. Plantations and wet evergreen forests make up 12.86% and 7.53%, respectively, of the entire geographic area. The dominant tree species include Artocarpus hirsutus, Terminalia paniculata, Pterocarpus marsupium, Wrightia tinctoria, Macaranga indica, Canarium strictum, Lophopetalum wightianum, Cullenia exarillata, Diospyros candolleana, Eucalyptus grandis, Acacia auriculiformis, and Hevea brasiliensis.

Table 1 Species details and area distribution of vegetation classes

Ground data and above-ground biomass estimation

Forest inventory data, including vegetation allometric parameters, was collected in December 2019 and March 2021 over 21 sample plots of 0.1 ha each, distributed among the various forest types of the study area. The sample plots were distributed among moist deciduous forests, semi-evergreen forests, wet evergreen forests, and plantation stands of rubber, acacia, and eucalyptus (Fig. 2). The study locations were carefully chosen to include all significant vegetation types in the study area and were located on relatively flat terrain to reduce topographic effects. To account for the fact that plantation stands can be of various ages, representative samples from young, middle-aged, and mature plantations have been included in the field data.

Fig. 2 Vegetation type cover map with field sample location points for the study site

All trees in a plot with a girth at breast height (GBH) greater than 10 cm were measured. Numerous forest biophysical parameters exhibit slow seasonal variation and can be assumed to remain stable for several months. The following parameters were measured for each plot: tree height, GBH, tree number density, and tree species names. A measuring tape and a laser rangefinder were used to measure GBH and tree height, respectively. The corresponding GBH measurements were used to calculate the diameter at breast height (DBH). The ranges of the mean tree height and mean DBH were 4.08 to 17.75 m and 5.02 to 40.70 cm, respectively. Table 2 presents forest parameter statistics based on this survey in the study area. Tree species were identified based on common names and the reports of the Kerala Forest Research Institute. GPS was used to record the latitude, longitude, and altitude of each sampled plot.

Table 2 Summary statistics for field sample data in the study area

A general allometric equation (Eq. 1) with coefficients specific to the various vegetation types was used to estimate above-ground biomass from the biophysical parameters.

$$\begin{aligned} {\begin{matrix} \ln {B} = a+b\ln ({\rho D^{2}H}) \end{matrix}} \end{aligned}$$
(1)

where B is above-ground biomass, \(\rho\) is the tree wood density recommended by the Forest Survey of India (FSI), D is tree trunk diameter at breast height, and H is tree height. Table 3 lists the vegetation-specific coefficients used in the allometric equation. The frequency distribution of the AGB of the field-measured samples is shown in Fig. 3.
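As an illustration of how Eq. (1) is applied, the following minimal sketch converts a girth measurement to DBH and evaluates the allometric relation for a single tree. The coefficient values, the wood density, and the units (D in cm, H in m, \(\rho\) in g/cm\(^3\)) are hypothetical placeholders chosen only for the example; the values actually used are those of Table 3 and the FSI recommendations.

```python
import math


def dbh_from_gbh(gbh_cm):
    """Diameter at breast height from girth, assuming a circular stem."""
    return gbh_cm / math.pi


def agb_from_allometry(a, b, rho, d, h):
    """Solve Eq. (1), ln(B) = a + b * ln(rho * D^2 * H), for B."""
    return math.exp(a + b * math.log(rho * d ** 2 * h))


# Hypothetical coefficients and wood density, for illustration only.
A_COEF, B_COEF, RHO = -2.0, 0.9, 0.6

dbh = dbh_from_gbh(62.8)  # girth in cm (assumed unit)
biomass = agb_from_allometry(A_COEF, B_COEF, RHO, d=dbh, h=12.0)
```

Because the relation is linear in log space, the coefficients a and b can be swapped per vegetation class without changing the function.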

Table 3 Values for coefficients used in allometric equation

Fig. 3 Frequency distribution of field-measured above-ground biomass

The following input parameters are used in the RT models: SAR frequency, incident angle, polarization, soil moisture, surface RMS height, correlation length, vegetation dielectric constant, tree height, tree number density, and trunk radius. Because the values of the parameters vary across a given plot, the average values of the biophysical parameters were selected as the input for forward modeling. Some variables can be measured, such as tree number density or tree height, while others, such as correlation length or RMS height, are very difficult to measure. For those variables, either an estimate or a value taken from the literature was used.

Satellite data

This study used multi-polarized and multi-frequency SAR data (Fig. 4), including L-band dual-polarized (HH/HV) data from ALOS-2, S-band dual-polarized (HH/HV) data from NovaSAR, and C-band dual-polarized (VV/VH) data from Sentinel-1.

Fig. 4 RGB images (R: HH, G: HV, B: HH/HV ratio) of ALOS-2 and NovaSAR, and RGB image (R: VV, G: VH, B: VV/VH ratio) of Sentinel-1A over the study area

ALOS-2 L-band (1.5 GHz) SAR data were acquired in fine mode and processed as a level 1.5 detected geocoded data product with 25 m spatial resolution. NovaSAR S-band (3.2 GHz) tri-pol (HH/HV/VV) data (of which only the HH and HV channels were used) were acquired in ScanSAR mode with a resolution of 30 m. The Sentinel-1A C-band (5.405 GHz) data were downloaded from Copernicus Data Hub as ground range detected (GRD) in interferometric wide swath (IW) mode with 20 m resolution. The scene was radiometrically calibrated (Small, 2011) and geocoded based on Shuttle Radar Topography Mission (SRTM) data. The digital number (DN) of the images was converted to the normalized radar backscattering coefficient (sigma-naught) using the calibration equations specific to each sensor. More details on the satellite imagery used for the study are given in Table 4. The temporal gap between the SAR images was due to the lack of data for the same year in the study area. Because forest dynamics in protected areas are considered to be slow, the temporal gap is expected to have little impact on changes in biomass.
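The DN-to-sigma-naught step can be sketched as follows. The relation \(\sigma ^{0}_{dB} = 10\log _{10}(DN^{2}) + CF\) with a calibration factor of \(-83\) dB is the form commonly published for ALOS PALSAR detected amplitude products; that this exact form and factor were applied here is an assumption, and the other sensors use their own calibration constants. The helper names are illustrative.

```python
import math

# Assumed calibration factor (dB) for ALOS-2 detected amplitude products.
ALOS2_CF_DB = -83.0


def dn_to_sigma0_db(dn, cf_db=ALOS2_CF_DB):
    """Convert an amplitude digital number to sigma-naught in dB."""
    if dn <= 0:
        return float("nan")  # no-data pixels cannot be calibrated
    return 10.0 * math.log10(dn ** 2) + cf_db


def db_to_linear(value_db):
    """dB to linear power units (needed before averaging or adding)."""
    return 10.0 ** (value_db / 10.0)


def linear_to_db(value):
    """Linear power units back to dB."""
    return 10.0 * math.log10(value)
```

Averaging pixels over a plot is normally done in linear power units and converted back to dB afterwards, which is why both conversions are included.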

Table 4 Characteristics of satellite imagery used for modeling

Methodology

The overall framework of the work is shown in Fig. 5. The process entails the following steps:

  • Preprocessing SAR data and acquiring backscatter intensity.

  • Fine-tuning and validating the vector radiative transfer (VRT)-based forward model (for each vegetation class) using representative field-measured biophysical variables.

  • Retrieving biophysical parameters through model inversion.

  • Estimating above-ground biomass from the retrieved biophysical parameters.

This methodology was applied independently to each SAR image, and the outcomes were assessed by comparing them to ground truth data. Pre-processing of ALOS-2 and NovaSAR was carried out using Environment for Visualizing Images (ENVI) 5.3.1 software (Exelis Visual Information Solutions, 2015), and pre-processing of Sentinel-1 was carried out using Sentinel Application Platform (SNAP) (European Space Agency, 2015). Modeling and simulations were performed with custom Python 3 scripts.

Vector radiative transfer modeling

In the developed model, the stands of vegetation were split into two layers: a layer of dielectric finite-length cylinders functioning as defoliated trunks with random orientation distributions and the underlying rough ground. A pixel’s total backscatter intensity (\(\sigma ^{0}\)) is an additive contribution from the randomly oriented trunk layer and the ground layer underneath.

Fig. 5 Schematic work-flow for retrieval of above-ground biomass from RT model

The simulation did not consider saplings, grass, or understory vegetation. The backscattering coefficient for the layer of circular cylinders over the rough surface is calculated using the first-order solution of the radiative transfer equation. The total backscattering coefficient, \(\sigma ^{0}_{pq}(i)\), can be written as follows:

$$\begin{aligned} {\begin{matrix} \sigma ^{0}_{pq}(i) = \sigma ^{0}_{c_{pq}} + \sigma ^{0}_{g_{pq}} \end{matrix}} \end{aligned}$$
(2)

where the backscattering from the cylinders and from the ground are represented by \(\sigma ^{0}_{c_{pq}}\) and \(\sigma ^{0}_{g_{pq}}\), respectively.

The backscatter model put forth by Karam and Fung (1988) was modified to simulate the backscattering coefficients (HH/HV for ALOS-2 and NovaSAR, and VV/VH for Sentinel-1) from the cylinder layer independently at each frequency. The scattering matrix of the trunk was determined by approximating its inner field with the field inside an equivalent infinite cylinder, which was calculated using the classical method (Wait, 1955; Wait, 1959) in terms of the Hankel functions, the Bessel functions, and their first derivatives. The extinction coefficient is then calculated using the forward scattering theorem, and the scattering amplitude is transferred to the reference frame (Karam & Fung, 1982). \(\sigma ^{0}_{c_{pq}}\), the cylinder layer’s backscattering coefficient, can be expressed as follows:

$$\begin{aligned} \begin{aligned} \sigma ^{0}_{c_{pq}}&= \frac{4\pi \cos \theta _{i}}{\langle K_{e}^{p}(i)\rangle +\langle K_{e}^{q}(i)\rangle } \\&\quad \cdot \left\{ 1-\exp \left[ -\left( \langle K_{e}^{p}(i)\rangle +\langle K_{e}^{q}(i)\rangle \right) n_{0}d \sec \theta _{i}\right] \right\} \\&\quad \cdot \langle \mid f_{pq}(-i,i)\mid ^{2}\rangle \end{aligned} \end{aligned}$$
(3)

where \(n_{0}\) is the number of cylinders per unit volume, d is vegetation layer depth, and \(\theta _{i}\) is the incidence angle. The equations for scattering amplitude \(\left<\mid f_{pq}(-i,i)\mid ^{2}\right>\) and extinction coefficient \(\left<K_{e}^{p}(i)\right>\) of the cylinder layer are given in (4) and (5):

$$\begin{aligned} \langle K_{e}^p(i) \rangle = \int ^{2\pi }_{0} d\alpha \int ^{\pi }_{0}d\beta \int ^{\pi }_{0}d\gamma \, p(\alpha , \beta , \gamma )K_{e}^p(i) \end{aligned}$$
(4)
$$\begin{aligned} \langle \mid f_{pq}(-i,i)\mid ^{2} \rangle = \int ^{2\pi }_{0} d\alpha \int ^{\pi }_{0}d\beta \int ^{\pi }_{0}d\gamma \, p(\alpha , \beta , \gamma ) \mid f_{pq}(-i,i)\mid ^{2} \end{aligned}$$
(5)

where the scattering amplitude (\(f_{pq}(-i,i)\)) and the extinction coefficient (\(K_{e}^{p/q}(i)\)) of a single cylinder are respectively given in (6) and (7).

$$\begin{aligned} f_{pq}(s,i) = \sum _{p_{sl}}\sum _{q_{il}}f'_{pq}(s,i)(p_{s}\cdot p_{sl})(q_{il} \cdot q_{i}) \end{aligned}$$
(6)

where \(f_{pq}(s,i)\) is the scattering amplitude tensor element.

$$\begin{aligned} \begin{aligned} K_{e}^p(i)&= 4\pi kl \,\mathrm{Im}\bigg [(\epsilon _{r}-1)\bigg (-\bigg \{\bigg [e_{0\nu } B_{0} \cos \theta _{a}+2\sum _{n=1}^{\infty }(e_{n\nu }B_{n}\cos \theta _{il} - j\eta h_{n\nu }A_{n})\bigg ]\\&\quad \cdot \cos \theta _{a}+\bigg (e_{0\nu }Z_{0} + 2\sum _{n=1}^{\infty }e_{n\nu }Z_{n}\bigg )\sin \theta _{il}\bigg \}\\&\quad \cdot (\nu _{il}\cdot q_{i})^{2} + \bigg \{\eta h_{0h}B_{0} + 2\sum _{n=1}^{\infty }(\eta h_{nh}B_{n} + je_{nh}\cos \theta _{il}A_{n})\bigg \}(h_{il}\cdot q_{il})^{2}\bigg )\bigg ] \end{aligned} \end{aligned}$$
(7)

where k is the wave number, Im( ) denotes the imaginary part, and \(\theta _{il}\) is the incidence angle in the reference frame. The polarization vectors in the reference frame are \(h_{il}\) and \(\nu _{il}\), and the relative dielectric constant of the cylinder with respect to the background medium is \(\epsilon _{r}\). \(\eta =\sqrt{\mu _{0}/\epsilon _{0}}\), where \(\mu _{0}\) and \(\epsilon _{0}\) are, respectively, the permeability and the permittivity of the background medium. Refer to Karam and Fung (1988) for comprehensive explanations of the equations.
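To make the structure of Eqs. (2) and (3) concrete, the sketch below evaluates the cylinder-layer term once the orientation-averaged extinction coefficients and scattering amplitude are known, and then adds a ground term in linear units. The function names and the numerical inputs are illustrative assumptions; computing the averaged quantities themselves requires the full formulation of Karam and Fung (1988).

```python
import math


def cylinder_layer_sigma0(ke_p, ke_q, f2_pq, n0, d, theta_i):
    """First-order cylinder-layer backscatter of Eq. (3), linear units.

    ke_p, ke_q : orientation-averaged extinction terms <K_e^p>, <K_e^q>
    f2_pq      : orientation-averaged <|f_pq(-i, i)|^2>
    n0         : number of cylinders per unit volume
    d          : vegetation layer depth
    theta_i    : incidence angle in radians
    """
    ke_sum = ke_p + ke_q
    two_way_loss = math.exp(-ke_sum * n0 * d / math.cos(theta_i))
    return 4.0 * math.pi * math.cos(theta_i) / ke_sum * (1.0 - two_way_loss) * f2_pq


def total_sigma0_db(sigma_c, sigma_g):
    """Eq. (2): add canopy and ground terms in linear units, return dB."""
    return 10.0 * math.log10(sigma_c + sigma_g)


# Hypothetical inputs, for illustration only.
sigma_c = cylinder_layer_sigma0(0.02, 0.03, 1e-4, 0.05, 12.0, math.radians(35.0))
sigma_total_db = total_sigma0_db(sigma_c, sigma_g=0.01)
```

Two limits are worth noting: for a very thin layer the expression tends to \(4\pi n_{0}d\langle \mid f_{pq}\mid ^{2}\rangle\), and for a very thick layer it tends to \(4\pi \cos \theta _{i}\langle \mid f_{pq}\mid ^{2}\rangle\) divided by the summed extinction, which is the saturation behaviour expected of a dense canopy.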

The complexity and randomness of the medium are accounted for by describing the orientation of the trunks with probability density functions. The cylindrical scatterers are assumed to be uniformly oriented in the azimuthal direction. Assuming that the cylinder orientation angles are uncorrelated, the joint probability density function can be calculated from Eq. (8):

$$\begin{aligned} {\begin{matrix} p(\alpha ,\beta ,\gamma ) = p(\alpha ) p(\beta ) p(\gamma ) \end{matrix}} \end{aligned}$$
(8)

The angles (\(\alpha , \beta\), and \(\gamma\)) in the equation are termed Tait-Bryan angles. Because the cylinders are rotationally symmetric about their axes, their orientation can be defined by taking

$$\begin{aligned} {\begin{matrix} \gamma = 0 \quad \quad \textrm{and} \quad \quad p(\gamma ) = 1 \end{matrix}} \end{aligned}$$
(9)
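Under Eqs. (8) and (9), the ensemble averages of Eqs. (4) and (5) reduce to a double integral over \(\alpha\) and \(\beta\). A minimal numerical sketch using the midpoint rule is given below; the uniform \(p(\beta )\) used in the example is a placeholder assumption, since the actual inclination distribution is not restated here, and the function names are illustrative.

```python
import math


def orientation_average(quantity, p_beta, n_alpha=72, n_beta=90):
    """Average quantity(alpha, beta) over cylinder orientations.

    Assumes p(alpha) = 1 / (2*pi) (azimuthally uniform), gamma = 0 with
    p(gamma) = 1, and a caller-supplied, normalized p_beta on [0, pi].
    """
    d_alpha = 2.0 * math.pi / n_alpha
    d_beta = math.pi / n_beta
    p_alpha = 1.0 / (2.0 * math.pi)
    total = 0.0
    for i in range(n_alpha):
        alpha = (i + 0.5) * d_alpha
        for j in range(n_beta):
            beta = (j + 0.5) * d_beta
            total += quantity(alpha, beta) * p_alpha * p_beta(beta) * d_alpha * d_beta
    return total


def uniform_beta(beta):
    """Placeholder inclination density (assumption, for illustration only)."""
    return 1.0 / math.pi


mean_value = orientation_average(lambda a, b: math.cos(a) ** 2, uniform_beta)
```

In practice `quantity` would be the single-cylinder extinction coefficient or squared scattering amplitude evaluated at the given orientation.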

The model put forth by Karam and Fung (1988) used the Kirchhoff model under the scalar approximation to obtain the soil backscatter, \(\sigma ^{0}_{g_{pq}}\), which represents the scattering characteristics of the rough soil surface under the assumption that the soil is a continuous, gently undulating dielectric surface. This was inadequate for reproducing radar scattering under diverse soil moisture and roughness conditions, especially in sparse, moist deciduous forests and young plantation sites where the soil surface contributed significantly to the total backscatter. To simulate the soil surface scattering more accurately, the improved integral equation model (\(I^2EM\)) proposed by Fung and Chen (2010) was adopted in the present work. The \(I^{2}EM\) surface backscatter model can be applied to a variety of soil surface conditions. The general form of the equation for the backscattering coefficient of the surface layer using the \(I^{2}EM\) model is given in (10).

$$\begin{aligned} \begin{aligned} \sigma _{pp} = \frac{k^2}{4\pi }\exp [-4k^{2}_z\sigma ^{2}] \Bigg \{\left|{(2k_z\sigma )f_{pp}+\frac{\sigma }{4}(F_{pp1}+F_{pp2})}\right|^2w(2k\sin \theta ,0) \\ + \sum _{n=2}^{\infty }\left|(2k_z\sigma )^n f_{pp}+ \frac{\sigma }{4}F_{pp1}(2k_z\sigma )^{n-1}\right|^{2}\frac{w^n(2k\sin \theta ,0)}{n!}\Bigg \} \end{aligned} \end{aligned}$$
(10)

where \(p=v,h\) denotes the polarization, k stands for the radar wave number, \(k_{z}=k\cos \theta\) is its vertical component, \(\sigma\) is the rms height, \(\theta\) denotes the incidence angle, \(f_{vv}=2R_{v}/\cos \theta\), and \(f_{hh}=-2R_{h}/\cos \theta\). \(R_h\) and \(R_v\) are the horizontally and vertically polarized Fresnel reflection coefficients, respectively. The parameters w and \(w^{n}\) are the surface spectra, i.e., the two-dimensional Fourier transforms of the surface correlation coefficient and of its nth power, respectively. For in-depth explanations of the equation, see Fung and Chen (2010).
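The Fresnel reflection coefficients and the coefficients \(f_{vv}\) and \(f_{hh}\) defined above can be evaluated as in the following sketch. It assumes a homogeneous soil half-space under air, the standard textbook form of the Fresnel coefficients, and a hypothetical complex soil dielectric constant; the remaining \(I^{2}EM\) terms (\(F_{pp1}\), \(F_{pp2}\), and the surface spectra) are not reproduced.

```python
import cmath
import math


def fresnel_coefficients(eps_r, theta):
    """Return (R_h, R_v) for a half-space of relative permittivity eps_r."""
    cos_t = math.cos(theta)
    root = cmath.sqrt(eps_r - math.sin(theta) ** 2)
    r_h = (cos_t - root) / (cos_t + root)
    r_v = (eps_r * cos_t - root) / (eps_r * cos_t + root)
    return r_h, r_v


def kirchhoff_field_coefficients(eps_r, theta):
    """Return (f_vv, f_hh) with f_vv = 2 R_v / cos(theta), f_hh = -2 R_h / cos(theta)."""
    r_h, r_v = fresnel_coefficients(eps_r, theta)
    cos_t = math.cos(theta)
    return 2.0 * r_v / cos_t, -2.0 * r_h / cos_t


# Hypothetical moist-soil dielectric constant, for illustration only.
f_vv, f_hh = kirchhoff_field_coefficients(complex(15.0, 3.0), math.radians(35.0))
```

Soil moisture enters the surface model through `eps_r`, which is why the moisture estimate matters most where the canopy is sparse.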

Model inversion and validation

The estimation of tree height and trunk radius from the simulated backscatter coefficients of the dielectric cylinder model can be described as an inverse problem. To solve it, the iterative optimization (IO) approach was used to invert the simulated backscatter intensity. Iterative optimization (Wang, 2010) is a popular method for ill-posed inversion problems. As an illustration, consider the case where Y, the vector of outputs of the model M, is related to the vector of input parameters as

$$\begin{aligned} {\textbf {Y}} = {\textbf {M}}(\Theta , {\textbf {X}}) + \epsilon \end{aligned}$$
(11)

where \(\Theta\) is the vector of model input parameters. During the inversion process, a merit function S(X) is minimized for n observations to obtain X,

$$\begin{aligned} {\begin{matrix} S(X) = \sum\limits_{i=1}^{n}\left[ Y_{i} - M(\Theta ,X_{i})\right] ^{2} \end{matrix}} \end{aligned}$$
(12)

This non-linear merit function can be minimized with conventional optimization methods (Jacquemoud et al., 1995). The method starts from an initial guess of the parameters and updates it iteratively until the merit function approaches a minimum. Here, the problem is the constrained minimization of a non-linear, multivariate scalar function. The allowable height and radius ranges were limited to 3 to 25 m and 0.04 to 0.5 m, respectively. The L-BFGS-B technique (Morales, 2002), which handles such bound constraints, was used to find the parameter values (\(X_{i}\)) that minimize the merit function within these ranges, and these values were taken as the best result. The modeled tree heights and trunk radii were compared with ground measurements made at the study sites for validation. The vegetation cover map produced for the study area was used to apply the algorithm to the image data for pixel-by-pixel estimation of above-ground biomass. The parameters of the forward model were fixed separately for three vegetation classes, defined by grouping vegetation types with similar input variables. The first class comprised rubber, eucalyptus, and acacia plantations; the second class comprised moist deciduous forests; and the third class comprised evergreen (semi- and wet-evergreen) forests. Using two ground truth points from each vegetation class, two parameters of the forward model, viz., tree number density and vegetation dielectric constant, were fixed. Soil moisture data was retrieved from the Soil Moisture Active Passive (SMAP) satellite for the respective dates. The other parameters, such as surface RMS height and correlation length of the rough ground, were taken from the literature. Six points from the ground truth data were used to fix the parameters of the forward model, and the remaining data (fifteen points) were used as independent validation data. Retrieving additional biophysical parameters would require quad-pol data. The predicted radius and height values from the model inversion were used in the allometric Eq. 1 with vegetation-specific coefficients to estimate the above-ground biomass for each stand.
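The inversion step can be sketched as the bounded minimization of the merit function of Eq. (12). The example below is deliberately simplified: a hypothetical two-channel toy forward model stands in for the VRT and \(I^{2}EM\) simulation, and a dependency-free grid search with successive refinement stands in for L-BFGS-B. With SciPy available, `scipy.optimize.minimize` with `method="L-BFGS-B"` and a `bounds` argument would be the natural replacement for `invert`.

```python
import math

# Bounds from the text: tree height 3-25 m, trunk radius 0.04-0.5 m.
BOUNDS = ((3.0, 25.0), (0.04, 0.5))


def toy_forward_model(height, radius):
    """Hypothetical stand-in for the forward model: (co-pol, cross-pol) in dB."""
    co = -14.0 + 3.0 * math.log(height) + 2.0 * math.log(radius)
    cross = -22.0 + 1.0 * math.log(height) + 4.0 * math.log(radius)
    return co, cross


def merit(params, observed, model=toy_forward_model):
    """Eq. (12): sum of squared differences between observed and simulated."""
    return sum((y - m) ** 2 for y, m in zip(observed, model(*params)))


def invert(observed, bounds=BOUNDS, n_grid=21, n_rounds=15, half_width=4):
    """Bounded grid search, shrinking the window around the best point."""
    lo = [b[0] for b in bounds]
    hi = [b[1] for b in bounds]
    best = None
    for _ in range(n_rounds):
        steps = [(h - l) / (n_grid - 1) for l, h in zip(lo, hi)]
        candidates = [
            (lo[0] + i * steps[0], lo[1] + j * steps[1])
            for i in range(n_grid)
            for j in range(n_grid)
        ]
        best = min(candidates, key=lambda p: merit(p, observed))
        # Shrink the search window, never leaving the allowed ranges.
        lo = [max(b[0], x - half_width * s) for b, x, s in zip(bounds, best, steps)]
        hi = [min(b[1], x + half_width * s) for b, x, s in zip(bounds, best, steps)]
    return best


observed = toy_forward_model(15.0, 0.2)  # synthetic "pixel" with known truth
height_est, radius_est = invert(observed)
```

Two channels constrain two unknowns here, which mirrors the use of dual-pol data to retrieve height and radius.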

A simple linear regression analysis was conducted independently between the cross-pol backscatter intensities of each frequency and field-measured AGB. The AGB predicted by the inversion of the scattering model was compared to the AGB predicted through the regression of various SAR frequencies. To gauge how well the retrieval procedures worked, the coefficient of determination (\(R^2\)) and root mean square error (RMSE) were used. Spatial maps of above-ground biomass for the study area were generated using the scattering model for the selected SAR frequencies. The SAR images were resampled to a 32 m \(\times\) 32 m pixel size (which is also approximately equal to the size of the field plots) to decrease the computation time during the optimization phase. The non-vegetated areas were masked out from the procedure.
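For reference, the two accuracy measures can be computed as in the sketch below. It assumes that \(R^2\) is defined as one minus the ratio of residual to total sum of squares; if \(R^2\) is instead taken as the squared correlation of a fitted line, the value can differ. The sample numbers are made up.

```python
import math


def rmse(measured, predicted):
    """Root mean square error between measured and predicted values."""
    n = len(measured)
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)


def r_squared(measured, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


# Made-up AGB values in t/ha, for illustration only.
measured_agb = [40.0, 95.0, 150.0, 210.0]
predicted_agb = [55.0, 90.0, 140.0, 180.0]
error = rmse(measured_agb, predicted_agb)
score = r_squared(measured_agb, predicted_agb)
```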

Results

This section assesses the potential of SAR data at various frequencies, used in the scattering model, to predict above-ground biomass and other biophysical variables. It also examines the effectiveness of using SAR data in the regression model for estimating biomass. Finally, the results of the scattering model and the linear regression model are compared for the selected SAR sensors.

Relationship between SAR backscatter and biomass

We examined the co- and cross-polarized signals of the chosen SAR data to check how sensitive \(\sigma ^{0}\) is to the above-ground biomass. Cross-polarized returns were found to have the best \(\sigma ^{0}\) sensitivity to AGB across all frequencies. Therefore, only the cross-polarization channels were used as input in the regression analysis. However, the cylinder scattering model made use of both co- and cross-polarization returns. This subsection examines the relationships between the field-measured above-ground biomass and the radar backscattering coefficients in the cross-polarizations of the L-, S-, and C-bands. At each study location, the backscatter coefficient was obtained from the calibrated, topographically corrected SAR images (at 32 m \(\times\) 32 m pixel size). Field measurements of tree height, diameter at breast height (DBH), and wood specific gravity were used at 21 locations to estimate above-ground biomass. The AGB at the selected sampling sites was found to range from 5.02 to 250 t/ha, and most of the area was found to be in the range of > 200 t/ha. The field AGB at the field locations had a logarithmic relationship to the SAR backscatter coefficients. The data for L(HV) and C(VH) show typical AGB versus backscatter relationships, with steeper slopes at the lower biomass range and shallower slopes throughout the range of higher biomass levels. The curves flattened, i.e., backscatter became insensitive to biomass, at levels greater than approximately 100 t/ha. The L(HV) backscatter increased quickly in the fitted \(\sigma ^{0}\)-biomass curve, then slowed and saturated at a biomass of nearly 100 t/ha. Similarly, the S(HV) and C(VH) data exhibited the strongest sensitivity at a very low biomass interval (\(\le 50\) t/ha), and the fitted points are much more dispersed than for the L-band. The trend-line of the data with L(HV) has relatively higher slopes than the S(HV) and C(VH) trend-lines at higher biomass levels, indicating a greater sensitivity to biomass. Even at lower biomass levels, the S- and C-bands experience very rapid saturation, which could be because these shorter-wavelength signals do not penetrate the canopy as deeply as L-band signals do.

Regression relations were established between the field-measured above-ground biomass and cross-polarization channels to predict above-ground biomass. Figure 6 presents the validation plots of predicted biomass from linear regression of SAR backscatter and measured biomass data. The AGB predicted with L(HV) showed a moderate correlation (\(R^2\) = 0.48) with the measured AGB, which dropped to low correlations (\(R^2\) = 0.12 and \(R^2\) = 0.03) when using S(HV) and C(VH). Since L(HV) data was not sensitive to AGB beyond 150 t/ha, most of the predicted values fell in a range of 0–150 t/ha. Estimating AGB with linear regression resulted in high error values for all three frequencies.
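The regression baseline can be sketched as an ordinary least-squares fit of \(\ln (AGB)\) against the cross-pol backscatter in dB, followed by back-transformation. The direction of the fit and the synthetic data below are assumptions made for illustration, not the study's fitted coefficients.

```python
import math


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_ln_agb(sigma0_db, agb):
    """Regress ln(AGB) on cross-pol backscatter (dB)."""
    return fit_line(sigma0_db, [math.log(b) for b in agb])


def predict_agb(sigma0_db, intercept, slope):
    """Back-transform the log-linear prediction to AGB."""
    return math.exp(intercept + slope * sigma0_db)


# Synthetic, noise-free example data (assumption, for illustration only).
hv_db = [-20.0, -18.0, -16.0, -15.0]
agb_t_ha = [math.exp(13.0 + 0.5 * s) for s in hv_db]
intercept, slope = fit_ln_agb(hv_db, agb_t_ha)
```

Because the fit is exponential in backscatter, small dB errors near saturation translate into large biomass errors, which is consistent with the high error values reported for the regression approach.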

Fig. 6 Logarithmic growth equation fitted between field-measured AGB and backscatter coefficient (dB) of (a) ALOS-2, (c) NovaSAR, and (e) Sentinel-1. Validation plots for the AGB predicted from linear regression of \(\ln (AGB)\) and cross-pols of (b) ALOS-2, (d) NovaSAR, and (f) Sentinel-1. Solid lines depict the linear fit through the data

Dielectric cylinder model results and validation

The results obtained from the scattering model are reported in this subsection. As in the previous subsection, the scatter plots are based on the biophysical parameters retrieved by the model at the different frequencies. Unlike the regression model, the results are based on the inversion of both co- and cross-polarized data. The plot average of the biophysical parameters (DBH, height, and AGB) was considered for each study location. Fig. 7 shows the validation of the model-retrieved parameters against ground truth measurements for the L-, S-, and C-bands. For ALOS-2 data, correlations were consistently high, with minimal error for all parameters. In contrast, Sentinel-1 and NovaSAR data yielded poorer results, with higher errors and much lower correlations. With L-band data, the height estimation had an \(R^2\) of 0.74 and an RMSE of 2.3 m, and the radius estimation had an \(R^2\) of 0.81 and an RMSE of 0.025 m. Using S-band data, the height estimation had an \(R^2\) of 0.5 and an RMSE of 2.97 m, while the radius estimation had an \(R^2\) of 0.63 and an RMSE of 0.037 m. The height estimation using C-band data had an \(R^2\) of 0.49 and an RMSE of 2.99 m, while the radius estimation had an \(R^2\) of 0.48 and an RMSE of 0.044 m.

Fig. 7

Scatterplots of predicted tree height and radius to ground measured values at the study sites for ALOS-2 (a and b), NovaSAR (c and d), and Sentinel-1 (e and f)

The allometric equation (Eq. 1) was used to estimate the above-ground biomass from the radius and height values predicted by the model inversion. Fig. 8 shows the validation plots of biomass estimation using the L-, S-, and C-bands. The L-band AGB estimate outperformed those of the other SAR frequencies, with an \(R^2\) of 0.73 and an RMSE of 35.90 t/ha. The S-band AGB estimate had an \(R^2\) of 0.37 and an RMSE of 63.37 t/ha, and the C-band AGB estimate had an \(R^2\) of 0.25 and an RMSE of 72.32 t/ha. According to these findings, L-band data is more promising for estimating vegetation biomass in tropical forest areas than C- and S-band data.

Fig. 8

Validation plots for the AGB predicted and AGB maps from the dielectric cylinder model with ALOS-2 (a and b), NovaSAR (c and d), and Sentinel-1 (e and f)

Three separate AGB maps (Fig. 8) were generated using the dielectric cylinder model and SAR data of different frequencies to depict the spatial distribution of the predicted AGB over the chosen study area. The non-forest areas were masked on the AGB maps. To speed up the computation required for the optimization process, the resampled SAR images, i.e., those with 32 m \(\times\) 32 m pixel size, were used. The maps use a color scheme that gradually transitions from vivid red to deep blue, signifying an increase in AGB from 0 to >250 t/ha. The study area encompassed sites ranging from low to high biomass, and the predicted outcomes show that the area has a heterogeneous above-ground biomass distribution. The AGB maps agreed with the ground measurements in that areas with semi-evergreen and evergreen forests had the highest biomass, followed by moist deciduous forests. The biomass of plantations in the study area was also notably high. Locations with higher AGB are associated with older, denser forests, while places with lower AGB are associated with younger, sparser forests. The AGB is high and more evenly distributed in the eastern part of the study area in particular, which is composed of semi-evergreen and evergreen forests. These forests are the least disturbed since they are located at high elevations and inside the core areas of the wildlife sanctuary. However, in the western part of the study area, which is dominated by plantations and sparse deciduous forests, biomass was low and more uneven. Plantations and moist deciduous forests have varying biomass ranges, as they contain very young to mature patches. The estimation of AGB with the selected SAR frequencies was considerably affected by the saturation of SAR signals. The highest estimated biomass range using C-band data was between 150 and 200 t/ha.
In the C-band AGB map, the AGB is generally underestimated, with the majority of pixels falling in the 0–50 t/ha range. With S-band data, AGB ranges of up to 200–250 t/ha are observed in some pixels, but, as with the C-band data, most pixels underestimate the AGB. The AGB predicted using L-band data shows clear spatial variation. In regions with low biomass, i.e., <150 t/ha, AGB is predicted with greater accuracy using L-band data, whereas ambiguity arises in locations with high biomass values, specifically >200 t/ha. An overview of the accuracy assessment is provided in Table 5.

Table 5 Performance evaluation of the dielectric cylinder model with ALOS-2, NovaSAR, and Sentinel-1 data

Discussion

This work is founded on a comprehensive evaluation, by means of a microwave scattering model, of the potential of dual-polarized multi-frequency (L-, S-, and C-band) SAR backscatter for determining the woody biomass (i.e., most of the above-ground biomass) of tropical vegetation. The previously developed dielectric cylinder scatter model (Karam & Fung, 1988) and \(I^{2}EM\) surface scatter model (Fung & Chen, 2010) were modified to include first-order scatter mechanisms from the ground and trunk layers. Simulating the extinction and scattering components from the canopy layer with the radiative transfer approach makes it possible to retrieve the biophysical parameters, and subsequently estimate the biomass, of different kinds of vegetation, independent of their location. The RTM-based method can be used anywhere because it explicitly establishes the relationships between the canopy parameters and the backscatter. The application of a polarimetric scattering model to estimate above-ground biomass is strongly supported by numerous earlier investigations. Liao et al. (2013) used the Michigan Microwave Canopy Scattering (MIMICS) model (Ulaby et al., 1990) to estimate the above-ground biomass of wetland vegetation. A similar study by Wang and Qi (2008) used a first-order radiative transfer theory to estimate the woody biomass of tropical forests. Another study by Saatchi and Moghaddam (2000) used a backscatter model to map the crown, stem, and total biomass of boreal forests. The Iterative Optimization (IO) approach (L-BFGS-B method) was used to invert the simulated backscatter intensity to predict the biophysical parameters from the dielectric cylinder model. The IO approach to inverting scatter models for parameter estimation has been used successfully in many earlier works (Mandal et al., 2019; Polatin and Sarabandi, 1994; Soja et al., 2020).
Utilizing many of the current models is still challenging. For instance, the Michigan Microwave Canopy Scattering Model requires far too many input variables (more than 60 input parameters). As a solution, we concentrated on using an approximate scatter model to retrieve biophysical parameters. Furthermore, many models lack precision because they rely on theoretical surface scattering models of questionable accuracy, such as the small perturbation model (SPM), physical optics (PO), and geometrical optics (GO) models (Oh et al., 1985). Hence, a more advanced, improved integral equation model (\(I^{2}EM\)) proposed by Fung and Chen (2010) was utilized in the present work to accurately simulate soil surface scattering.
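
The iterative inversion described above can be sketched with SciPy's L-BFGS-B optimizer. This is a minimal illustration under stated assumptions, not the implementation used in this work: `toy_forward_model` is a hypothetical stand-in with arbitrary coefficients, not the dielectric cylinder or \(I^{2}EM\) model, and the parameter bounds, first guess, and least-squares misfit are likewise assumed for illustration.

```python
import numpy as np
from scipy.optimize import minimize


def toy_forward_model(params):
    """Hypothetical stand-in for the scattering model (NOT the cylinder model).

    Maps (height in m, trunk radius in m) to [co-pol, cross-pol] backscatter
    in dB using arbitrary illustrative coefficients.
    """
    height, radius = params
    co_pol = -12.0 + 3.0 * np.log(height) + 1.0 * np.log(radius)
    cross_pol = -18.0 + 1.0 * np.log(height) + 4.0 * np.log(radius)
    return np.array([co_pol, cross_pol])


def misfit(params, observed_db):
    """Sum of squared differences between simulated and observed backscatter."""
    return float(np.sum((toy_forward_model(params) - observed_db) ** 2))


def invert(observed_db, first_guess=(10.0, 0.10)):
    """Retrieve (height, radius) by bounded minimization of the misfit."""
    bounds = [(2.0, 40.0), (0.02, 0.50)]  # assumed physical limits
    result = minimize(
        misfit,
        x0=np.array(first_guess),
        args=(observed_db,),
        method="L-BFGS-B",
        bounds=bounds,
        options={"ftol": 1e-12},
    )
    return result.x


# Simulate an observation from known parameters, then try to recover them.
true_params = np.array([20.0, 0.15])
observed = toy_forward_model(true_params)
height_hat, radius_hat = invert(observed)
print(f"retrieved height {height_hat:.2f} m, radius {radius_hat:.3f} m")
```

Using both a co-polarized and a cross-polarized channel gives two observations for two unknowns, and the bounds keep the search within plausible tree dimensions.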

The selected frequencies were found to have limitations in predicting AGB. The signal saturation arising at AGB of more than 100 t/ha (Dobson et al., 1992; Le Toan et al., 1992; Luckman et al., 1998) limits the sensitivity of L-band SAR to tropical forests, whereas the S- and C-bands have their highest sensitivity at AGB \(\le\) 50 t/ha and are only sensitive to the top layer of the canopy (Ningthoujam et al., 2016). P-band SAR data is anticipated to achieve better performance. However, because of the lack of P-band data for the study area, only L-, S-, and C-band SAR can currently be used to monitor the region’s tropical vegetation. Numerous earlier studies (Dobson et al., 1992; Le Toan et al., 1992; Mitchard et al., 2011; Ranson & Sun, 1994) revealed that L-band cross-polarized backscatter is more sensitive to changes in biomass, whereas the co-polarized signal and higher frequencies are less strongly linked to biomass. The results of this research likewise support earlier research (Saatchi et al., 2011) by establishing that the L-band data has a high sensitivity to AGB < 100 t/ha. In line with earlier studies, the findings indicate that L-band backscatter is more suitable for mapping young, sparse forests with low biomass content (Peregon & Yamagata, 2013). Tropical forests consistently have higher measured errors for the predicted AGB than temperate/boreal forests (Bharadwaj et al., 2015; Saatchi et al., 2011). These frequencies tend to saturate at a given biomass range, and this might contribute to errors in the modeling results. The other observed differences between measured and estimated AGB can be attributed to the following factors. First, because the model is sensitive to its parameters, any errors in the biophysical inputs, such as inaccuracies in the number density and dielectric constant of the trunk component, can lead to inaccuracies in the model.
Second, the limitations of the first-order radiative transfer model prevent the inclusion of multiple scattering mechanisms among the scattering elements (Liang et al., 2005) and of mixed species and multiple vegetation layers in the model simulation (Ningthoujam et al., 2017). Third, studies have reported that topographic effects in SAR imagery affect AGB estimation (Wang & Qi, 2008). In this study, the modeled AGB on the east side of the study area, which is covered by evergreen and semi-evergreen forests, was highly uncertain. Because of the high relief and steep mountaintop slopes in these forests, model errors could be highly significant. The combined effects of these factors may cause an overestimation or underestimation of the above-ground biomass of the selected vegetation stands.

The findings show that the current microwave scattering model can successfully simulate AGB in tropical vegetation using the selected SAR data sets. With the scattering model, the L-band gave better estimates (\(R^2\) = 0.73, RMSE = 35.90 t/ha) for the AGB prediction. Similar correlations, ranging from 0.407 to 0.76, were identified in studies predicting AGB using L-band data in tropical forests (Hamdan et al., 2014; Mitchard et al., 2009). For the selected bands, the RTM-based approach had a higher level of retrieval accuracy than linear regression. The final AGB maps have a spatial resolution of 32 m, which is finer than that of the region’s current large-scale AGB maps (Reddy, 2016), while retaining a comparable degree of accuracy. In comparison to other modeling techniques, the implementation of the polarimetric scattering model allowed for a more precise and detailed simulation of AGB. Additionally, the RTM-based approach does not require a large set of training data, whereas in machine learning algorithms a sufficient training database is crucial for the efficiency of the models (Hongliang & Shunlin, 2003). Wang and Qi (2008) used a first-order radiative transfer model to estimate the woody biomass of tropical forests with 32 sampling sites. Another study by Soja et al. (2020) used a canopy scatter model with P-band SAR data to estimate AGB using six sampling plots. In comparison to empirical models, radiative transfer theory-based models are more reproducible since they are less dependent on field data (Houborg et al., 2007; Quan et al., 2015; Yebra et al., 2013). The results of this study have significant implications for carbon-related initiatives like UN-REDD and can support monitoring and risk management systems in achieving their goals. Because these products have such high resolution, it is possible to monitor forests and carbon stocks with greater accuracy and to detect even very slight variations in biomass.
In tropical forests, methodologies employing phase or coherence rather than only backscatter could improve the accuracy of the AGB estimation, although they are constrained by the availability of data (Berninger et al., 2018). More accurate measurements of the biomass of tropical forests could also be achieved by combining optical data with SAR data (Mitchard et al., 2014; Ploton et al., 2012; Sandberg et al., 2011). Additionally, the launch of new P-band satellites, such as ESA’s Earth Explorer Biomass (European Space Agency, 2008; European Space Agency, 2012), offers great promise for obtaining better estimates of biomass in these areas.

Conclusion

By utilizing a microwave scattering model, this work presents a thorough investigation of the capabilities of dual-polarized multi-frequency (L-, S-, and C-band) SAR backscatter for estimating the above-ground biomass of tropical vegetation. In a vegetation stand, most of the AGB is contained in the woody portion comprising the branches and trunks. This prompted us to choose a cylinder scattering model in which the canopy was treated as a defoliated trunk layer consisting of a group of randomly distributed dielectric cylinders of fixed height, and the underlying surface as rough ground. To simulate backscattering from the rough soil surface, an \(I^{2}EM\) model was implemented. Parameter retrieval was carried out using model inversion. The predicted biophysical parameters were validated using the measured data from the field.

Ground measurements were collected during the field survey from 21 sample plots, each measuring 0.1 ha in size, spread across the various forest types in the study area. Because the study area is highly mountainous, the sample locations were carefully chosen at relatively low slope sites, minimizing topographic impacts while including all main vegetation types in the study area. The plot average of the biophysical variables (DBH, height, and AGB) was considered for each study location. The radiative transfer model was provided with the ground-measured data, allowing it to quantify the scattering and attenuation imparted by woody structures. The forward model was fine-tuned using two points from each type of vegetation, and the remaining data were used as independent validation points. The forward model was inverted using an iterative optimization approach. A general allometric equation with coefficients specific to vegetation class was used to determine the above-ground biomass from the model-retrieved biophysical parameters. The modeled results showed a varying biophysical parameter distribution in tropical forests. By evaluating the dependence of SAR backscatter on the standing biomass of the varied vegetation types in the study area, it was observed that SAR backscatter lacks sensitivity beyond a certain range of biomass. The sensitivity of \(\sigma ^{0}\) to the AGB was found to decrease with decreasing wavelength due to the scattering and attenuation contributed by the foliage and small branches of the canopy. The regression analysis showed that retrieval algorithms for above-ground biomass using the cross-polarization of L-band SAR data perform better (\(R^2\) = 0.48, RMSE = 50.02 t/ha) than those using the S-band (\(R^2\) = 0.12, RMSE = 70.98 t/ha) and C-band (\(R^2\) = 0.03, RMSE = 80.84 t/ha) data. This validation helped to demonstrate the efficiency of L-band data in estimating AGB in the selected mixed vegetation patch.
The tree height and trunk radius were estimated by the microwave canopy scattering model inversion with L-band data, with \(R^2\) of 0.74 and 0.81 and RMSE of 2.3 m and 0.025 m, respectively. The scattering model inversion gave better results than the regression-based approach for all the frequencies. In this approach, the L-band gave better estimates of AGB (\(R^2\) = 0.73, RMSE = 35.90 t/ha) than the higher frequencies. The S- and C-bands performed worse in the scattering model, with \(R^2\) of 0.37 and 0.25 and RMSE of 63.37 t/ha and 72.32 t/ha, respectively. Finally, AGB maps were prepared for the study area with each frequency of SAR data for comparison.
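
The calibration and validation split summarized above (two points per vegetation type for tuning, the remainder for independent validation) can be sketched as follows. The plot identifiers, forest-type labels, and the rule of taking the first two plots of each class in list order are assumptions made for illustration; the text does not state how the two calibration points were selected.

```python
from collections import defaultdict


def split_plots(plots, n_calibration=2):
    """Split plots into calibration and validation sets per forest type.

    Takes the first `n_calibration` plots of each class for calibration
    (an assumed selection rule) and keeps the rest for validation.
    """
    by_class = defaultdict(list)
    for plot in plots:
        by_class[plot["forest_type"]].append(plot)

    calibration, validation = [], []
    for members in by_class.values():
        calibration.extend(members[:n_calibration])
        validation.extend(members[n_calibration:])
    return calibration, validation


# Hypothetical plot records, not the study's field data.
plots = [
    {"plot_id": "P01", "forest_type": "evergreen"},
    {"plot_id": "P02", "forest_type": "evergreen"},
    {"plot_id": "P03", "forest_type": "evergreen"},
    {"plot_id": "P04", "forest_type": "moist deciduous"},
    {"plot_id": "P05", "forest_type": "moist deciduous"},
    {"plot_id": "P06", "forest_type": "moist deciduous"},
    {"plot_id": "P07", "forest_type": "plantation"},
    {"plot_id": "P08", "forest_type": "plantation"},
    {"plot_id": "P09", "forest_type": "plantation"},
    {"plot_id": "P10", "forest_type": "plantation"},
]

calibration, validation = split_plots(plots)
print("calibration:", [p["plot_id"] for p in calibration])
print("validation: ", [p["plot_id"] for p in validation])
```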

SAR-based biomass estimation is limited by the saturation of the backscatter signal in higher biomass ranges. In addition, the environmental conditions, particularly the topography, affected the accuracy of biomass mapping in the highly undulating terrain of the study area, a problem that needs to be addressed by future studies in this domain. It was also difficult to increase the number of ground truth points in the inaccessible terrain of the tropical Western Ghats. The approach could be substantially improved by addressing these issues and by investigating solutions based on quad-pol SAR data.