Introduction

Pure spectral elements provide a quick synopsis of vast hyperspectral image data (Thompson et al., 2010). An endmember is the unmixed representative of a sample, and its extraction is one of the most important steps in hyperspectral data analysis (Plaza & Chang, 2006). Spectral unmixing determines the pure spectral elements used for classification (Somers et al., 2012). According to Veganzones and Grana (2008), spectral unmixing methods include geometric, lattice computing and heuristic methods, each of which has subtypes. Plaza et al. (2004) compared a number of endmember extraction methods to determine the better performing algorithm and considered the linear mixture model to be more appropriate. Chen et al. (2018) used the Sequential Maximum Angle Convex Cone (SMACC) with hyperspectral data to determine metallic-rich zones in rocks. Aufaristama et al. (2018) used SMACC for endmember selection in a volcanic study and considered it to be quick in endmember selection. Classification performed using the endmembers could help in further analysis and monitoring of parameters such as vegetation.

Vegetation is an important component of the ecosystem owing to its role in exchanges with the atmosphere and in climate change-related processes worldwide (Pei et al., 2018). Natural and man-made phenomena such as urban sprawl and climate dynamics strongly influence the behaviour of flora (Pei et al., 2018; Ekwueme & Agunwamba, 2021). For example, vegetation activity is enhanced by rising carbon dioxide (Piao et al., 2012) and diminished during droughts (Ji & Peters, 2003). Changes in the stability of biodiversity have been attributed to climate change and land use patterns (Sala et al., 2000; Hansen et al., 2001). Global warming has placed global biodiversity under threat (Malcolm et al., 2006) and has caused biological alterations (Parmesan & Yohe, 2003; Root et al., 2003). Studies have been carried out to determine the impact of global warming on biodiversity worldwide (Kappelle et al., 1999; Noss, 2001). Hence, it is of utmost importance to monitor the environment to detect changes in biodiversity composition (Prasad et al., 2010), which includes vegetation health.

Vegetation health monitoring is essential for detecting stress conditions in vegetation, and hyperspectral remote sensing plays an important role in this (Kureel et al., 2021). Shafri and Hamdan (2009) considered a red edge-based technique to be better than index-based techniques while working on plant disease infection. However, Dutta et al. (2009) determined vegetation health using vegetation indices and the random forest method and inferred that the method is reliable for vegetation health analysis, provided a threshold is set for each vegetation index to separate the healthy and stressed classes. Further, Kureel et al. (2021) derived the vegetation health of Lonar forest in Maharashtra using hyperspectral remote sensing and a combination of several indices and concluded that the method is efficient for vegetation health analysis.

Remote sensing is a promising technology that has gained preference over traditional mapping methods because of its efficiency (Kuenzer, 2011). Multispectral and hyperspectral data have been widely used to determine ecosystem dynamics (Shippert, 2003), and in recent years hyperspectral remote sensing has gained much popularity and has been considered a very useful technology (Navin & Agilandeeswari, 2020). However, only a limited number of studies have used SMACC for endmember extraction in combination with the support vector machine (SVM). The present study highlights the advantages of the automated endmember extraction method SMACC for selecting endmembers in hyperspectral data classification, together with the use of several environmental variables for estimating vegetation health. We hypothesised that SMACC is highly beneficial for endmember selection. Hence, the objectives of the study are (1) to determine the efficiency of SMACC for endmember extraction by deriving the land use land cover (LULC) of the forest from it using SVM and (2) to determine the vegetation health status from the same.

Materials and method

Study area

We selected the Barkot forest range of Dehradun district in the state of Uttarakhand, India, as the study area, located around 30° 06′ North latitude and 78° 18′ East longitude (Fig. 1). This region is situated in the Himalayan foothills with an altitude range of 340 to 560 m above mean sea level (MSL) (Attri & Kushwaha, 2018). It covers parts of Rajaji National Park and the Shivalik range. The main land use land cover (LULC) classes are forest, urban, water body, grassland and cropland. The rivers Ganga, Chandrabhaga and Song flow around the area.

Fig. 1 The Barkot forest range in Dehradun district of the Indian state of Uttarakhand

The region is bounded by the lush green forest ranges of Motichur (southern region) and Lachchhiwala (western region) and the urban settlements of Doiwala, Rishikesh and Bhaniawala. It falls under Sub-Group 3C, North Indian Tropical Moist Deciduous Forests, of Champion and Seth’s (1968) classification of Indian forests (Attri & Kushwaha, 2018). Tectona grandis (Teak), Shorea robusta (Sal) and Mallotus philippensis (Indian redwood) are the main tree species of the forest (Bhattacharjee et al., 2019). The climate is tropical to subtropical moist, with temperatures ranging from a minimum of about 2 °C to a maximum of about 41 °C and an average annual precipitation of about 2300 mm (Nandy et al., 2017). The area consists of gentle slopes with fine and loamy thermic haplustalf type of soil (Shah & Subudhi, 2009).

Data used

For analysis, we used cloud-free satellite data obtained from the United States Geological Survey (USGS) open source geo-portal (http://earthexplorer.usgs.gov/). The details of the data are given in Table 1.

Table 1 Specifications of the satellite data used for the study

Data pre-processing

We pre-processed the EO-1 Hyperion data before analysis. The first step was bad band removal, in which we retained the bands with non-zero data and removed the noisy bands. We then carried out de-striping to remove noisy lines, using the Tactical Hyperspectral Operations Resource (THOR) workflow. Further, we eliminated the bad columns: we selected each bad column and assigned it values averaged from the neighbouring pixels or columns.
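The bad column repair described above can be sketched as follows. This is only an illustration in plain Python, not the THOR implementation: the function name `repair_bad_columns`, the list-of-rows band layout and the choice of the nearest good column on each side are assumptions made for the example.

```python
def repair_bad_columns(band, bad_cols):
    """Return a copy of `band` (a list of rows) in which every bad column is
    replaced by the mean of the nearest good columns to its left and right.

    Assumption: "neighbouring" means the nearest non-bad column on each side,
    taken row by row. The software used in the study may differ.
    """
    n_cols = len(band[0])
    bad = set(bad_cols)
    repaired = [list(row) for row in band]  # leave the input untouched
    for c in sorted(bad):
        # Search outwards for the nearest good column on each side, if any.
        left = next((k for k in range(c - 1, -1, -1) if k not in bad), None)
        right = next((k for k in range(c + 1, n_cols) if k not in bad), None)
        neighbours = [k for k in (left, right) if k is not None]
        if not neighbours:
            continue  # no good column to interpolate from
        for r, row in enumerate(band):
            repaired[r][c] = sum(row[k] for k in neighbours) / len(neighbours)
    return repaired


# Toy 2 x 3 band whose middle column (index 1) is a dead detector column.
band = [[10.0, 0.0, 14.0],
        [20.0, 0.0, 26.0]]
fixed = repair_bad_columns(band, bad_cols=[1])
```

For a full Hyperion cube the same function would be applied band by band, with the list of bad columns identified separately for each band.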

After eliminating the bad bands and bad columns, we corrected for atmospheric errors using the Fast Line-of-sight Atmospheric Analysis of Spectral Hypercubes (FLAASH). FLAASH requires several parameters that vary with the acquisition conditions. The parameters used are given in Table 2.

Table 2 Parameters used for FLAASH for atmospheric correction of EO-1 Hyperion data

Finally, we georeferenced the data using the Landsat 5 ETM+ data.

Endmember selection

We performed endmember selection for classification using SMACC. Endmembers represent the spectrally pure classes of an image, and their abundances are constrained to be non-negative. SMACC finds the spectral endmembers and their abundances throughout an image: at each step it computes the angle each pixel vector makes with the current convex cone and selects the vector with the maximum angle as the next endmember. The method is useful for hyperspectral datasets because of the difficulty of unmixing them (Gruninger et al., 2004). The endmembers were selected for ten classes: S. robusta (Sal), T. grandis (Teak), mixed forest, scrub, grass, riverine forest, cropland, settlement, dry riverbed and water. The selected endmembers were then used to define regions of interest (ROIs): we matched the spectra of each selected class to create its ROI. The classification was performed using SVM.
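The text does not state which similarity measure was used to match spectra when creating the ROIs. One common choice is the spectral angle between a pixel spectrum and a reference endmember, and the sketch below assumes that measure purely for illustration. The function names and the four-band reflectance values are hypothetical, and this is not the SMACC algorithm itself.

```python
import math


def spectral_angle(a, b):
    """Angle in radians between two spectra treated as vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cosine = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.acos(cosine)


def best_match(pixel, library):
    """Name of the library endmember with the smallest angle to `pixel`."""
    return min(library, key=lambda name: spectral_angle(pixel, library[name]))


# Hypothetical four-band reflectances, for illustration only.
library = {
    "sal": [0.05, 0.08, 0.45, 0.30],
    "water": [0.06, 0.04, 0.02, 0.01],
}
pixel = [0.06, 0.09, 0.40, 0.28]
label = best_match(pixel, library)
```

Because the angle ignores vector length, two spectra that differ only by a brightness factor match exactly, which is why the measure is often preferred when illumination varies across a scene.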

For the health assessment, we derived the spectral variables Normalised Difference Vegetation Index (NDVI), Carotenoid Reflectance Index (CRI), Anthocyanin Reflectance Index (ARI), Modified Simple Ratio (MSR), Modified Chlorophyll Absorption Ratio Index (MCARI) and Water Band Index (WBI), the details of which are given in Table 3. For each variable, we assigned a threshold range of values and a corresponding weightage favouring healthy vegetation, based on the literature (Rouse, 1974; Hati et al., 2020; Chen, 1996; Peñuelas et al., 1993), as depicted in Fig. 2.

Table 3 The indices used for health assessment with their formulae
Fig. 2 The graph depicting the assigned threshold values for the variables used
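The threshold-and-weightage scheme can be sketched for two of the indices. NDVI and WBI are written in their standard forms, (NIR - Red) / (NIR + Red) and R900 / R970. The threshold ranges and weights in `RULES` are hypothetical placeholders, since the values actually used are those of Fig. 2, and the function names are likewise assumptions.

```python
def ndvi(nir, red):
    """Normalised Difference Vegetation Index from NIR and red reflectance."""
    return (nir - red) / (nir + red)


def wbi(r900, r970):
    """Water Band Index: reflectance at 900 nm over reflectance at 970 nm."""
    return r900 / r970


# Hypothetical (low, high) "healthy" ranges and weights, NOT the study's values.
RULES = {
    "ndvi": ((0.5, 1.0), 2),
    "wbi": ((1.0, 1.2), 1),
}


def health_score(values, rules=RULES):
    """Sum the weights of all indices whose value falls in its healthy range."""
    score = 0
    for name, ((low, high), weight) in rules.items():
        if low <= values[name] <= high:
            score += weight
    return score


# One pixel with hypothetical reflectances.
pixel_indices = {"ndvi": ndvi(0.50, 0.10), "wbi": wbi(0.45, 0.42)}
score = health_score(pixel_indices)
```

A higher score means more indices fall within their healthy range. The per-pixel scores could then be binned into health categories, although the binning used in the study is not given in the text.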

Further, we calculated the Land Surface Temperature (LST), where Pv was used to derive emissivity as \(P_v = \left[\frac{NDVI - NDVI_{min}}{NDVI_{max} - NDVI_{min}}\right]^{2}\) (Chavez, 1996) where Pv = proportion of vegetation, NDVImin = minimum value of NDVI, and NDVImax = maximum value of NDVI.

Now, ETM6 (emissivity) was calculated as

\({E}_{TM6}=0.004Pv+0.986\) (Moran et al., 1992).

Then, L (spectral radiance at sensor) was calculated as \(L = \left[\frac{L_{MAX} - L_{MIN}}{QCAL_{MAX} - QCAL_{MIN}}\right] \times (QCAL - QCAL_{MIN}) + L_{MIN}\) (Brivio et al., 2006) where,

QCAL = quantized calibrated pixel value in DN, LMAX = spectral radiance scaled to QCALMAX (here LMAX = 15.303 W/(m²·sr·µm)), LMIN = spectral radiance scaled to QCALMIN (here LMIN = 1.238 W/(m²·sr·µm)), QCALMAX = maximum quantized pixel value in DN (here, QCALMAX = 255) and QCALMIN = minimum quantized pixel value in DN (here, QCALMIN = 1).

Then LST was calculated as the brightness temperature \(BT = \frac{K_{2}}{\ln\left(\frac{K_{1}}{L} + 1\right)} - 273.1\) where,

K1 = band-specific thermal conversion constant from the metadata (here, K1 = 607.76),

K2 = band-specific thermal conversion constant from the metadata (here, K2 = 1260.56) and L = spectral radiance at sensor. The methodology flowchart is depicted in Fig. 3.
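The LST steps can be sketched per pixel with the constants quoted in the text. The function names are assumptions. The temperature step uses the standard Planck-inversion form K2 / ln(K1 / L + 1) with the 273.1 offset as given in the text. The text does not show how emissivity enters the final LST, so it is computed here but not applied.

```python
import math

# Constants as quoted in the text for the thermal band.
LMAX, LMIN = 15.303, 1.238      # W/(m^2 sr um)
QCALMAX, QCALMIN = 255, 1       # DN
K1, K2 = 607.76, 1260.56        # thermal conversion constants


def proportion_vegetation(ndvi, ndvi_min, ndvi_max):
    """Pv = [(NDVI - NDVImin) / (NDVImax - NDVImin)]^2."""
    return ((ndvi - ndvi_min) / (ndvi_max - ndvi_min)) ** 2


def emissivity(pv):
    """E_TM6 = 0.004 * Pv + 0.986."""
    return 0.004 * pv + 0.986


def radiance(qcal):
    """Rescale a DN value linearly to at-sensor spectral radiance L."""
    gain = (LMAX - LMIN) / (QCALMAX - QCALMIN)
    return gain * (qcal - QCALMIN) + LMIN


def brightness_temperature_c(l):
    """Brightness temperature in deg C from radiance (Planck inversion)."""
    return K2 / math.log(K1 / l + 1) - 273.1


# One hypothetical pixel: DN 130 in the thermal band, NDVI 0.5 in a 0..1 scene.
pv = proportion_vegetation(0.5, 0.0, 1.0)
eps = emissivity(pv)
bt = brightness_temperature_c(radiance(130))
```

With these constants a mid-range DN gives a temperature of roughly 20 °C, which is the same order as the LST values reported for the study area.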

Fig. 3 The methodology flowchart of the study

Result and discussion

The LULC classification was derived along with the vegetation health and LST map.

The SVM classification shown in Fig. 4 resulted in an overall accuracy of 89.13% and a kappa coefficient of 0.87. The class S. robusta had an accuracy of 93.40%, the highest among all the classes, followed by T. grandis (92.88%), which indicates the species-level classification efficiency of SMACC in hyperspectral data. The efficiency of SMACC for classification has been noted earlier by Chen et al. (2018) and Aufaristama et al. (2018). The high accuracy of these two classes might also be due to the presence of more super pixels.

Fig. 4 The LULC map derived by SVM classification depicts the different vegetation and other classes
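Overall accuracy and the kappa coefficient are both derived from a confusion matrix. The sketch below shows the two calculations on a hypothetical two-class matrix. The study's own ten-class matrix is not given in the text, so the counts here are illustrative only, and the function names are assumptions.

```python
def overall_accuracy(cm):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def cohen_kappa(cm):
    """Cohen's kappa: agreement beyond what is expected by chance."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    observed = sum(cm[i][i] for i in range(n)) / total
    row_totals = [sum(cm[i]) for i in range(n)]
    col_totals = [sum(cm[i][j] for i in range(n)) for j in range(n)]
    # Chance agreement from the products of the marginal totals.
    expected = sum(row_totals[i] * col_totals[i] for i in range(n)) / total ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical counts: rows are reference classes, columns are predictions.
cm = [[45, 5],
      [10, 40]]
oa = overall_accuracy(cm)
kappa = cohen_kappa(cm)
```

Kappa is lower than overall accuracy because it discounts the agreement that would be expected from the class totals alone, which is why both figures are usually reported together.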

Further, the combination of the variables showed that the majority of the area (78.6%) fell under the most healthy to moderately healthy vegetation categories (Fig. 5). The most healthy vegetation was found in the T. grandis class, followed by S. robusta. This might be due to the lack of anthropogenic disturbance in the area and to favourable environmental, topographic and climatic factors. The least healthy categories included the riparian forest, croplands, river and settlement. In riverine or riparian forests, variations in edaphic factors caused by floods, such as the deposition of new sediments and the duration and frequency of flooding, challenge the adaptability of the vegetation (Priyadarshana et al., 2009). Similarly, the impact of climate change on hydrological dynamics (Oo et al., 2020; Faye et al., 2022) could contribute to limiting vegetation health. This may explain why the riparian forests fall under the least healthy category. Similar studies that derived vegetation health considered the method to serve the purpose efficiently (Kureel et al., 2021; Dutta et al., 2009). This may be due to the wide range of values with which the variables discriminate the vegetation classes, where the classes with the most suitable values in all variables are considered healthy. This indicates the effectiveness of environmental variables in vegetation analysis.

Fig. 5 The vegetation health map derived from the analysis depicting the categorisation of vegetation according to their health status

The lowest LST range, 13.13 to 18.40 °C, was again observed for the T. grandis and S. robusta classes (Fig. 6). Among the vegetation classes, the riparian forest had the highest LST. This indicates a relation between vegetation health and LST, with LST lowest for the healthiest vegetation. Hernández-Clemente et al. (2019) considered the combined use of hyperspectral and thermal data to be beneficial for indicating vegetation health, as they capture biophysical changes.

Fig. 6 The LST map derived from Landsat 5 ETM+

Applying an efficient automated endmember extraction method such as SMACC together with SVM, one of the most systematic classification methods, to hyperspectral data is the technological strength of this work. The estimation and categorisation of vegetation health and its correlation with LST could open paths for future research based on hyperspectral data. Hyperspectral data carry great potential for species-level floral analysis, and adding health estimation could further strengthen decision-making in conservation planning. However, despite the advantages of a large number of bands and narrow bandwidths, the limited spatial and temporal availability of hyperspectral data is a major weakness. Multispectral data such as Sentinel-2 cover most regions of the earth with regular repeat acquisitions, which makes them useful for current-time analysis. The same cannot be said for hyperspectral data, whose coarse temporal resolution and limited regional availability restrict further research.

Conclusion

The study showed the importance of the automated endmember extraction method SMACC for hyperspectral datasets: in line with our hypothesis, it was useful for deriving pure endmembers for the ROIs. Good accuracy was observed in species-level vegetation separability with the SMACC-derived ROIs, and SMACC was a quick method for deriving the super pixels. The healthiest vegetation classes were T. grandis and S. robusta, whereas the riverine forests fell under the stressed category and are expected to have deteriorating or poor health. Hence, measures are needed to restore them to the healthy vegetation category, as they may harbour important, endemic and rare species which need conservation. LST was inversely related to vegetation health: vegetation in poor health was associated with higher LST.

Climate change and global warming are global concerns that leave biodiversity vulnerable to decline. This study supports future work, as it is helpful for forestry activities such as biodiversity conservation and contributes to floral monitoring and mapping. However, the major limitation of the study is the lack of periodic data, as hyperspectral data are openly available only for limited regions and without regular repeat coverage. Also, the endmember selection process was approximate, which might affect the precision.