Introduction

Remote sensing is the practice of collecting information about the earth’s surface with the help of aircraft or space-borne satellites for use in applications (Sharma et al., 2013) that include, but are not limited to, oceanography, cryosphere studies, hydrology, agriculture, and weather monitoring (Sood et al., 2021a, 2021b). In land-use and land-cover (LULC) applications, it plays a vital role in the estimation of soil moisture and erosion, forest cover mapping, urban planning, crop yield monitoring and prediction, and management of natural resources (Bhosle et al., 2019; Taloor et al., 2020; Singh et al., 2022). With the continuous improvements in satellite imaging technology, high-resolution imagery of the earth’s surface is available in a wide range of spectral bands that can be utilized in numerous applications. However, many challenges remain unresolved, such as detecting land-cover changes in big cities, swath width problems, and resolution issues (Vivekananda et al., 2020). Multispectral imaging allows the acquisition of earth’s surface information in different spectral bands, i.e., the red, green, blue, near-infrared, thermal infrared, and short-wave infrared. However, because these bands are relatively wide, some critical information may be lost; advanced algorithms can retrieve part of it, but not to the extent possible with hyperspectral imaging (Wang et al., 2019).

To overcome the limitations of multispectral datasets, hyperspectral imaging can be valuable for extracting critical information about different natural resources. Hyperspectral imaging allows the collection of earth’s surface information in much narrower bands (10–20 nm). Observing information at such a narrow spectral resolution has numerous advantages, such as quantifying surface materials, identifying and quantifying molecular absorption, and discriminating between different crops via different classification algorithms (Mahesh et al., 2015; Caballero et al., 2020). Classification is the process of extracting information classes from a multi-band or hyperband raster image to form a thematic map. Compared with multispectral remote sensing, the classification of hyperspectral remote sensing data is a challenging task because of the enormous amount of information available (Dahiya et al., 2023). On the other hand, hyperspectral image classification offers better discrimination among the different class categories than multispectral image classification (Dahiya et al., 2023; Jarocińska et al., 2023). Previous studies have demonstrated the potential of a hyperspectral dataset (HyspIRI) compared with multispectral datasets (i.e., Landsat 8 and Sentinel-2) for mapping forest alliances in Northern California (Clark et al., 2020). In another study, Clarke et al. (2009) explored summer and multi-seasonal variable groups via hyperspectral and multispectral datasets and concluded that the hyperspectral data performed better than the multispectral data. It has also been suggested that target-specific absorption features could be considered in the classifiers to improve the outcomes.

Generally, classification algorithms are categorized into supervised and unsupervised or hard and soft classifiers. Some of the commonly used classifiers are summarized in Table 1. In the past few years, various methods have been used to classify multispectral data, such as neural network (NN) (Zhong et al., 2020), support vector machine (SVM) (Negri et al., 2016), principal component analysis (PCA) (Licciardi et al., 2012), k-nearest neighbor (Huang et al., 2016), maximum likelihood classifier (MLC) (Sood et al., 2018), and linear mixture model (LMM) (Singh et al., 2021a, 2021b). Detailed information on classifiers can be found in the literature (Lu et al., 2007). Pu et al. (2008) performed a comparative analysis of multispectral sensors, i.e., the Advanced Land Imager (ALI) onboard the Earth Observing-1 (EO-1) satellite and the Landsat-7 Enhanced Thematic Mapper Plus (ETM+), and a hyperspectral sensor, i.e., the Hyperspectral Imager (Hyperion), using vegetation indices (VIs), spectral texture information, maximum noise fractions (MNFs), and multivariate prediction models. They concluded that the hyperspectral dataset was more effective than ALI and Landsat-7 in forest mapping and leaf area index (LAI) estimation. It has also been observed that accuracy can be improved with the help of machine learning or deep learning classification models. However, very few studies have analyzed the performance of hyperspectral data with different classification methods. Moreover, a comparative analysis with multispectral datasets such as Landsat-8 OLI/TIRS is also required. Therefore, there is a need to perform a comparative analysis of different classifiers for both multispectral and hyperspectral datasets.

Table 1 A detailed comparison of various classification algorithms

The focus of the present study is to evaluate the performance of various classifiers in land-use monitoring using hyperspectral and multispectral datasets. The objectives are (a) to implement different classifiers, i.e., SVM, MLC and feedforward neural network (FF-NN), using hyperspectral and multispectral datasets; (b) to compute the accuracy assessment of each classifier with the different datasets; (c) to compare the performance of the hyperspectral and multispectral datasets on each classifier; and (d) to discriminate crops using hyperspectral imagery with the best classifier and to compare the results with the multispectral dataset. This study has been conducted over a part of the Indian states of Haryana and Uttar Pradesh. It has numerous applications in forestry, vegetation monitoring, and soil detection. It can also be used for monitoring crop stress, detecting various plant diseases, weather forecasting, and many more (Taloor et al., 2021).

Study Area and Satellite Dataset

Study Area

The study area is part of the North Indian states of Haryana and Uttar Pradesh, having geographical coordinates between 30°8″ N and 29°16″ N in latitude and 77°16″ E to 77°6″ E in longitude, as shown in Fig. 1. In these regions, the major class categories include vegetation/cropland, built-up area, barren land, and water, with agricultural land covering the major portion of the study area. Moreover, these states are among the biggest contributors to agriculture in India, and agriculture in these states is one of the primary sources of income and employment, which plays a significant role in the gross domestic product (GDP) of India. Therefore, the continuous monitoring and mapping of agricultural land are crucial for its effective management and for meeting future requirements. Remote sensing offers a cost-effective solution for monitoring and mapping agricultural land and various types of crops.

Fig. 1

Location of study site. a Image of India (highlighted area representing study site) b False color image (RGB: 40,30,20) of the study area (Hyperion) c False color image (RGB: 5, 4, 4) of the study area (Landsat-8) d Reference image

Dataset

In the present work, two cloud-free images from the Landsat-8 OLI/TIRS and EO-1 based Hyperion (hyperspectral) satellites were acquired on 12 March 2017 and 4 March 2017, respectively. The datasets were downloaded from the United States Geological Survey (USGS) EarthExplorer online platform (https://earthexplorer.usgs.gov/). Landsat-8 provides eleven spectral bands: band 1 (0.43–0.45 µm), band 2 (0.45–0.51 µm), band 3 (0.53–0.59 µm), band 4 (0.64–0.67 µm), band 5 (0.85–0.88 µm), band 6 (1.57–1.65 µm), band 7 (2.11–2.29 µm), band 8 (0.50–0.68 µm), band 9 (1.36–1.38 µm), band 10 (10.6–11.19 µm) and band 11 (11.5–12.51 µm). Bands 1–7 and 9 have a spatial resolution of 30 m and band 8 has a spatial resolution of 15 m, whereas bands 10 and 11 have a spatial resolution of 100 m. On the other hand, the Hyperion EO-1 dataset includes 242 spectral bands with a separation of 10 nm, covering wavelengths of 356–2577 nm at a spatial resolution of 30 m.

To validate the outcomes, the Pléiades constellation dataset was acquired from Google Earth historical imagery at a spatial resolution of 0.5 m (panchromatic) and 2 m (multispectral). Airbus Defence and Space/Centre National d'Etudes Spatiales (CNES) oversees the operation of this constellation. It provides high-resolution imaging, which gives more specific information about the area. The Google Earth viewer incorporated into ERDAS Imagine version 2015 allows for image-based grounding on Google Earth: users can connect with Google Earth, go to a predetermined area, and analyze the satellite imagery with respect to Google Earth (Dahiya et al., 2023).

Methodology

The methodology of the proposed work is divided into three sections: (a) preprocessing of hyperspectral and multispectral datasets, (b) classification using MLC, SVM and FF-NN classifiers, and (c) accuracy assessment.

Preprocessing

Preprocessing is the first and most important step after data collection. It removes the numerous errors caused by a variety of circumstances, including the position of the sun, varying atmospheric conditions, errors produced by the satellite sensors, and errors resulting from rugged topography. If these errors are not corrected, they may alter the result. The Hyperion EO-1 data was collected from the USGS website and consists of 242 spectral bands with a wavelength range of 356–2577 nm. The sensor’s built-in visible near-infrared (VNIR) detector gathers information in bands 1 to 70, while the short-wave infrared (SWIR) detector gathers information in bands 71–242. During preprocessing, the bad (non-informative) bands were removed from the datasets. The bands removed from the study include 1–7 (non-illuminated), 58–76 (overlap region), 221–224 (water vapor region), and 225–242 (not used). Out of 242 bands, only 196 were used for classification. Similarly, out of the 11 spectral bands of Landsat-8, only 9 informative bands were used for classification.
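The bad-band removal described above can be illustrated with a minimal Python sketch. It only encodes the band ranges quoted in the text; the helper name `good_band_indices` is hypothetical and does not correspond to any ENVI function.

```python
# Bad-band ranges as quoted in the text (1-based, inclusive).
BAD_BAND_RANGES = [(1, 7), (58, 76), (221, 224), (225, 242)]
TOTAL_BANDS = 242  # Hyperion EO-1 band count


def good_band_indices(total_bands=TOTAL_BANDS, bad_ranges=BAD_BAND_RANGES):
    """Return the 1-based numbers of the bands that survive bad-band removal."""
    bad = set()
    for first, last in bad_ranges:
        bad.update(range(first, last + 1))
    return [band for band in range(1, total_bands + 1) if band not in bad]


kept = good_band_indices()
print(len(kept), kept[0], kept[-1])
```

With the ranges exactly as listed, this sketch keeps bands 8–57 and 77–220, i.e., 194 bands. The 196 bands reported above would therefore correspond to a slightly different bad-band list, which the text does not specify.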

Moreover, radiometric correction has also been performed on the Hyperion EO-1 and Landsat-8 datasets using the Fast Line-of-Sight Atmospheric Analysis of the Spectral Hypercubes (FLAASH) atmospheric correction tool available in the Environment for Visualizing Images (ENVI) v5.3 software. The sensor stores the intensity of the electromagnetic radiation (EMR) as digital numbers (DN). These DN need to be transformed into physically meaningful units such as radiance or reflectance, and converting DN readings into radiance values is part of the estimation of reflectance. By considering the maximum and minimum radiance values of each band, the DN imagery can be translated into radiance values (Mishra et al., 2009). According to Singh et al. (2018), the radiance \({x}_{i\lambda }\) is computed as follows:

$${x}_{i\lambda }=\left[\frac{D{M}_{i\lambda }}{MGray} \times \left(B{\mathrm{max}}_{\lambda }-B{\mathrm{min}}_{\lambda }\right)\right]+ B{\mathrm{min}}_{\lambda }$$
(1)

where \(B{\mathrm{max}}_{\lambda }\) is the maximum radiance provided in the metadata, \(B{\mathrm{min}}_{\lambda }\) is the minimum radiance provided in the metadata, \(D{M}_{i\lambda }\) is the digital number of pixel i in band λ, and \(MGray\) is the maximum DN value for that band. It is also necessary to calculate the various angles, including the solar zenith angle, azimuth angle, and elevation angle. The solar zenith angle is the angular distance, measured in degrees, between the Sun and the point directly overhead.
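As an illustration of Eq. (1), the following Python sketch rescales a small array of digital numbers into radiance. The function names and the numerical values (an 8-bit band with a radiance range of 1.0 to 201.0) are hypothetical and chosen only to make the arithmetic easy to follow; real values come from the image metadata.

```python
import numpy as np


def dn_to_radiance(dn, b_min, b_max, m_gray):
    """Eq. (1): linearly rescale digital numbers into the range [b_min, b_max]."""
    dn = np.asarray(dn, dtype=np.float64)
    return (dn / m_gray) * (b_max - b_min) + b_min


def solar_zenith_from_elevation(elevation_deg):
    """The solar zenith angle is the complement of the solar elevation angle."""
    return 90.0 - elevation_deg


# Hypothetical 8-bit band: DN 0..255 mapped onto radiance 1.0..201.0.
band_dn = np.array([[0, 51], [102, 255]])
radiance = dn_to_radiance(band_dn, b_min=1.0, b_max=201.0, m_gray=255)
print(radiance)
print(solar_zenith_from_elevation(60.0))
```

A DN of 0 maps to the minimum radiance and a DN equal to the maximum gray level maps to the maximum radiance, with all other values interpolated linearly in between.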

Classification Algorithms

In the present paper, three popular supervised classification algorithms, namely (a) MLC, (b) SVM, and (c) FF-NN, have been implemented to classify hyperspectral and multispectral imagery, as explained in the subsequent sections. These classifiers were chosen because of their numerous advantages, as described in the coming sections. Other classifiers were also tested, but for the present work MLC, SVM, and FF-NN showed the best results. As shown in Table 1, KNN is not suitable for high-dimensional data, and RF and DT suffer from an overfitting issue, which SVM resolves; SVM also showed better results for the current work. MP is a time-consuming classifier and is therefore not suitable for complex work. MLC is a robust classifier, and NN is a fast-learning classifier that can extract hidden features and improve the accuracy, so both were selected for the current work.

Maximum Likelihood Classifier (MLC)

The MLC is one of the most widely used supervised classifiers; it computes the posterior probability of a pixel belonging to a specific class category. In other words, the pixel is allocated to the class category for which its likelihood is maximal, an approach that is also beneficial for more complex models of evolution (Lillesand et al., 2015). On the other hand, if the likelihood of the pixel is smaller than the threshold value, the pixel remains unclassified. MLC is also known as a parametric method, as it is based on assumptions about the frequency distribution of each class category. This approach is used to train the model for the classification of pixels into specific class categories. The flow diagram of MLC is shown in Fig. 2a.

Fig. 2

Comparison of the methodology of different supervised classification algorithms: a MLC b SVM c FF-NN

Steps for the execution of the MLC algorithm:

Step 1: In the first stage, various training samples (n) are selected based on different observations and spectral signatures.

Step 2: Select the number of classes.

Step 3: Afterward, files of spectral signatures of chosen class categories are generated for algorithm training.

Step 4: Compute the mean vector and covariance matrix of each class and evaluate the discriminant function as follows

$$M=\mathrm{ln}\left({c}_{j}\right)-\frac{1}{2}\mathrm{ln}\left(\left|{\mathrm{Cov}}_{j}\right|\right)-\frac{1}{2}{\left(P-{m}_{j}\right)}^{T}{\mathrm{Cov}}_{j}^{-1}\left(P-{m}_{j}\right)$$
(2)

In Eq. (2), \({c}_{j}\) is the probability of class j, \({\mathrm{Cov}}_{j}\) is the covariance matrix of class j, P is the measurement vector of the pixel, and \({m}_{j}\) is the sample mean vector of class j.

Step 5: To train the model, 1000 samples were used, which were subdivided into training (~ 80%) and validation (~ 20%) sets. After that, the region of interest (ROI) is selected from the input image and the required class is selected with the ROI tool of the ENVI v5.3 software.

Step 6: Set the probability threshold to a single value and set the data scale factor to 1. The scale factor is used as a division factor when converting integer-scaled reflectance or radiance data into floating-point values. After this, classification is performed.

Step 7: Then, the classified maps are visually interpreted with the reference data. If the desired result is not found, then go to step 1 and repeat the whole process.

Step 8: If the desired result is found, then a false color code is allocated to the class category of thematic images.
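The discriminant function of Step 4 can be sketched in Python as follows. This is a minimal illustration under stated assumptions and not the ENVI implementation: the two-band training samples are synthetic, the class names and priors are hypothetical, and the probability threshold of Step 6 is omitted.

```python
import numpy as np


def class_statistics(samples):
    """Mean vector and covariance matrix of one class (rows = pixels, columns = bands)."""
    samples = np.asarray(samples, dtype=np.float64)
    return samples.mean(axis=0), np.cov(samples, rowvar=False)


def discriminant(pixel, prior, mean, cov):
    """ln(c_j) - 0.5*ln|Cov_j| - 0.5*(P - m_j)^T Cov_j^-1 (P - m_j)."""
    diff = np.asarray(pixel, dtype=np.float64) - mean
    _, log_det = np.linalg.slogdet(cov)          # stable log-determinant
    mahalanobis = diff @ np.linalg.solve(cov, diff)
    return np.log(prior) - 0.5 * log_det - 0.5 * mahalanobis


def classify(pixel, classes):
    """Assign the pixel to the class with the largest discriminant value."""
    scores = {label: discriminant(pixel, *stats) for label, stats in classes.items()}
    return max(scores, key=scores.get)


# Synthetic two-band training samples (hypothetical reflectance values).
rng = np.random.default_rng(0)
water = rng.normal(loc=[0.05, 0.02], scale=0.01, size=(50, 2))
vegetation = rng.normal(loc=[0.04, 0.45], scale=0.03, size=(50, 2))

# Each entry holds (prior probability, mean vector, covariance matrix).
classes = {
    "water": (0.5, *class_statistics(water)),
    "vegetation": (0.5, *class_statistics(vegetation)),
}
print(classify([0.05, 0.40], classes))
```

Solving the linear system instead of inverting the covariance matrix explicitly is a numerical convenience; with the 196 Hyperion bands, the covariance matrices are large and may be poorly conditioned when few training pixels are available.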

Support Vector Machine (SVM)

The SVM is a supervised algorithm used for both regression and classification. It is further categorized into linear SVM for linearly separable data and nonlinear SVM for inseparable data. In SVM, the hyperplane with the maximum margin is selected to separate the dataset into distinct classes. Nonlinear SVM is implemented with the help of kernels, which map the low-dimensional input into a high-dimensional space in which the classes become easier to separate (Maulik and Chakraborty, 2017). Various types of kernels, such as linear, polynomial, and radial basis function kernels, can be used according to the requirement. The flow diagram of the SVM algorithm is shown in Fig. 2b.

Steps for the execution of the SVM algorithm:

Step 1: Select ‘n’ number of samples for algorithm training.

Step 2: A radial basis function kernel is selected for classification, as it maps the input space into an infinite-dimensional space using Eq. (3) as given below.

$$k\left(x,{x}_{i}\right)=\mathrm{exp}\left(-\gamma \sum {\left(x-{x}_{i}\right)}^{2}\right)$$
(3)

In Eq. (3), k stands for the kernel and γ (gamma) ranges from 0 to 1; its value is assigned manually during model training. Here, x and \({x}_{i}\) are the data points used for margin selection.

Step 3: Set all the parameters to find a hyperplane.

Step 4: Compute the hyperplane as given below.

$$w\cdot x+b=0$$
(4)

In Eq. (4), w is a vector normal to the hyperplane and b is an offset.

Step 5: To train the model, 1000 samples were used, which were subdivided into training (~ 80%) and validation (~ 20%) sets. After that, the region of interest (ROI) is selected from the input image and the required class is selected with the ROI tool of the ENVI v5.3 software.

Step 6: Set the gamma value in the kernel function to 0.005. After kernel selection, a penalty parameter is chosen that regulates the trade-off between allowing training errors and enforcing strict margins. The cost of misclassifying points rises as the value of the penalty parameter is increased; its default value is 100. After this, classification is performed.

Step 7: Then, the classified maps are visually interpreted with the reference data. If the desired result is not found, then go to step 2 and repeat the whole process.

Step 8: If the desired result is found, then a false color code is allocated to the class category.
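The radial basis function kernel of Step 2 and the hyperplane of Step 4 can be illustrated with a short Python sketch. It only evaluates the two formulas and is not a full SVM solver; the function names, the example points, and the hyperplane parameters are hypothetical, while the default gamma of 0.005 follows Step 6.

```python
import numpy as np


def rbf_kernel(x, xi, gamma=0.005):
    """Radial basis function kernel: exp(-gamma * sum((x - xi)^2))."""
    diff = np.asarray(x, dtype=np.float64) - np.asarray(xi, dtype=np.float64)
    return float(np.exp(-gamma * np.sum(diff ** 2)))


def side_of_hyperplane(x, w, b):
    """Sign of w.x + b: +1 or -1 on either side of the hyperplane, 0 on it."""
    return int(np.sign(np.dot(w, x) + b))


# Identical points give a kernel value of 1; the value decays with distance.
print(rbf_kernel([1.0, 2.0], [1.0, 2.0]))
print(rbf_kernel([0.0, 0.0], [10.0, 0.0]))

# Hypothetical hyperplane x1 - x2 = 0 in a two-dimensional feature space.
w, b = np.array([1.0, -1.0]), 0.0
print(side_of_hyperplane([2.0, 1.0], w, b), side_of_hyperplane([1.0, 2.0], w, b))
```

A small gamma such as 0.005 makes the kernel decay slowly with distance, which yields smoother decision boundaries; larger values make each training sample influence only its immediate neighbourhood.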

Feedforward Neural Network (FF-NN)

The FF-NN is a type of artificial neural network (ANN) and is also known as the multi-layered network of neurons (MLN). It consists of several layers, i.e., the input layer, the hidden layer, and the output layer. The hidden layer between the input and the output layer performs a non-linear transformation of the input to produce the desired output (Paoletti et al., 2019). Each perceptron in one layer is connected to every perceptron in the next layer, so that information is constantly transferred, or "fed forward", from one layer to the next. It allows more generalized and accurate results as compared to other supervised algorithms. It is used to solve various problems, such as the validation of data, and helps to find patterns in data. The flow diagram of the FF-NN algorithm is shown in Fig. 2c.

Steps for the execution of the FF-NN algorithm:

Step 1: Select the ‘n’ number of training samples.

Step 2: Afterward, select the number of hidden layers, i.e., the layers located between the input and the output layer.

Step 3: Select the number of iterations to train the model.

Step 4: Compute the values as given below.

$$y=a\left({W}_{1}{X}_{1}+{W}_{2}{X}_{2}+\dots +{W}_{n}{X}_{n}\right)$$
(5)

In Eq. (5), a is the activation function, \({W}_{1},{W}_{2},\dots ,{W}_{n}\) are the weights, and \({X}_{1},{X}_{2},\dots ,{X}_{n}\) are the input neurons.

Step 5: To train the model, 1000 samples were used, which were subdivided into training (~ 80%) and validation (~ 20%) sets. After that, the region of interest (ROI) is selected from the input image and the required class is selected with the ROI tool of the ENVI v5.3 software.

Step 6: Choose the activation method and set the Training Threshold Contribution to a value between 0 and 1.0. The training algorithm dynamically modifies the weights between nodes and, if necessary, the node thresholds. Setting the Training Threshold Contribution to 0 leaves the internal weights of the nodes unaffected. Adjusting the internal weights of the nodes can result in better classifications, while using too many weights might result in poor generalization.

Step 7: Then, the classified maps are visually interpreted with the reference data. If the desired result is not found, then go to step 2 and repeat the whole process.

Step 8: If the desired result is found, then a false color code is allocated to the class category.
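The forward computation of Eq. (5) can be sketched in Python as follows. The logistic (sigmoid) activation, the layer sizes, and the constant weights are assumptions made purely for illustration; the text does not state which activation was selected, and, like Eq. (5), the sketch omits bias terms and the training procedure.

```python
import numpy as np


def sigmoid(z):
    """Logistic activation (an assumption; other activations can be substituted)."""
    return 1.0 / (1.0 + np.exp(-z))


def neuron(x, w, activation=sigmoid):
    """Eq. (5): y = a(W1*X1 + W2*X2 + ... + Wn*Xn)."""
    return activation(np.dot(w, x))


def feedforward(x, layers, activation=sigmoid):
    """Propagate the input through a list of weight matrices, one per layer."""
    out = np.asarray(x, dtype=np.float64)
    for weights in layers:
        out = activation(weights @ out)
    return out


# A single neuron whose weighted sum is 0 outputs sigmoid(0) = 0.5.
print(neuron([1.0, 2.0], [0.5, -0.25]))

# Hypothetical network: 3 input bands -> 4 hidden neurons -> 5 class outputs.
layers = [np.full((4, 3), 0.5), np.full((5, 4), 0.1)]
scores = feedforward([0.2, 0.4, 0.6], layers)
print(scores.shape)
```

In a trained network the weight matrices differ per neuron, and the class with the largest output score is assigned to the pixel.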

Accuracy Assessment

To validate the outcomes, an accuracy assessment was carried out for each classified map generated by MLC, SVM, and FF-NN, using a reference dataset to determine the accuracy of the classified results. Field surveys and the visual interpretation of high-resolution Google Earth imagery were used to gather the reference data. The research area was divided into five land-cover classes, all of which could be identified in the field and in the Google Earth imagery, i.e., dense vegetation, deciduous vegetation, built-up, water, and barren. The ENVI v5.3 tool was used to collect a total of 996 locations (approximately 1000 samples) for both the Hyperion EO-1 and Landsat-8 datasets using stratified random sampling, with each land-cover type allotted a number of points according to its size. The samples were further divided into a training set (~ 80%) and a validation set (~ 20%). Multiple training samples (50 to 60 polygons) were chosen from each class category from the Hyperion and Landsat-8 datasets across the agricultural region in Haryana and Uttar Pradesh in order to train the model. Fivefold cross-validation was repeated 10 times on the samples to determine the final accuracy. Before each repetition, the dataset was randomly shuffled and new folds were created to improve the model performance. The essential components of the accuracy assessment included the producer’s accuracy (PA), user’s accuracy (UA), overall accuracy (OA), and kappa coefficient (Kc). The PA defines the probability that a reference pixel is correctly classified, and the UA defines the probability that a pixel assigned to a class actually belongs to that class, whereas OA and Kc represent the collective accuracy and the agreement between the classified and reference data beyond that expected by chance, respectively (Dahiya et al., 2023).
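The four accuracy measures can be derived from a confusion matrix, as in the following minimal Python sketch. The two-class matrix is hypothetical and serves only to show the arithmetic; rows are assumed to hold the reference classes and columns the classified classes, and every class is assumed to contain at least one sample.

```python
def accuracy_metrics(confusion):
    """Return (PA per class, UA per class, OA, kappa) from a confusion matrix.

    confusion[i][j] is the number of samples of reference class i
    that were assigned to class j by the classifier.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    reference_totals = [sum(row) for row in confusion]
    classified_totals = [sum(row[j] for row in confusion) for j in range(n_classes)]
    correct = [confusion[i][i] for i in range(n_classes)]

    producers = [correct[i] / reference_totals[i] for i in range(n_classes)]
    users = [correct[j] / classified_totals[j] for j in range(n_classes)]
    overall = sum(correct) / total
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(r * c for r, c in zip(reference_totals, classified_totals)) / total ** 2
    kappa = (overall - expected) / (1 - expected)
    return producers, users, overall, kappa


# Hypothetical two-class example with 100 validation samples.
confusion = [[45, 5],
             [10, 40]]
producers, users, overall, kappa = accuracy_metrics(confusion)
print(producers, users, overall, kappa)
```

For this example, the OA is 0.85 and the kappa coefficient is 0.70, because half of the agreement would already be expected by chance given the row and column totals.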

Results and Discussion

In the present study, two datasets, namely Hyperion EO-1 and Landsat 8 OLI/TIRS, are used as input. Three supervised classifiers, i.e., MLC, SVM, and FF-NN, have been implemented using the hyperspectral as well as the multispectral dataset to classify the different categories and to assess the impact on LULC over a part of the Indian states of Haryana and Uttar Pradesh. During the classification process, various categories are explored, such as dense vegetation, deciduous vegetation, built-up, water, and barren. The MLC algorithm has been implemented according to Eq. (2) for assigning a class category to a pixel based on maximum likelihood. Afterward, SVM has been implemented using Eqs. (3) and (4), in which the kernel is selected according to the desired result and the hyperplane is selected with the maximum margin. The FF-NN has been implemented according to Eq. (5), with the number of hidden layers and iterations selected to extract the maximum features. The final classified outputs from each classifier using hyperspectral and multispectral imagery are shown in Fig. 3.

Fig. 3

Input and classified outcomes from Hyperion EO-1 and Landsat-8 using different classifiers a Input Hyperion EO-1 image (RGB: 40-30-20) b MLC c SVM d FF-NN e Input Landsat-8 image f MLC g SVM h FF-NN

Accuracy assessment is one of the important steps for evaluating the errors in thematic or classified images and the efficiency of the model. Tables 2 and 3 present the accuracy assessment parameters computed for hyperspectral and multispectral imagery, respectively. The accuracy assessment confirms the effectiveness of the FF-NN classifier (91.20% with Hyperion and 82% with Landsat-8) as compared to the other classification methods, i.e., SVM (87.60% with Hyperion and 80% with Landsat-8) and MLC (84.40% with Hyperion and 72.40% with Landsat-8). The assessment of the supervised classifiers is based on the parameters PA, UA, Kc, and OA. From the experimental outcomes, it is evident that the FF-NN algorithm obtained the highest accuracy among the supervised classifiers and that, for every classifier, the hyperspectral dataset yielded a higher accuracy than the multispectral dataset.
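The gain obtained from the hyperspectral dataset can be made explicit with a short calculation on the accuracies quoted above. The sketch below is plain arithmetic on the reported values; the dictionary layout and variable names are illustrative only.

```python
# Accuracies (%) as reported in the text for each classifier and dataset.
reported_accuracy = {
    "FF-NN": {"Hyperion": 91.20, "Landsat-8": 82.00},
    "SVM": {"Hyperion": 87.60, "Landsat-8": 80.00},
    "MLC": {"Hyperion": 84.40, "Landsat-8": 72.40},
}

# Difference in percentage points between the two datasets, per classifier.
gains = {
    name: round(values["Hyperion"] - values["Landsat-8"], 2)
    for name, values in reported_accuracy.items()
}
best = max(reported_accuracy, key=lambda name: reported_accuracy[name]["Hyperion"])
print(gains, best)
```

The differences amount to 9.2 percentage points for FF-NN, 7.6 for SVM, and 12.0 for MLC, so the hyperspectral dataset improves every classifier, with the largest gain for MLC and the highest absolute accuracy for FF-NN.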

Table 2 Accuracy assessment of different classifiers for (Hyperion EO-1) hyperspectral dataset
Table 3 Accuracy assessment of different classifiers for (Landsat 8 OLI/TIRS) multispectral dataset

In addition to the above analysis, the potential of hyperspectral data has also been examined on a subset of the input and processed datasets, as shown in Fig. 4. From the visual interpretation, the difference between the hyperspectral and multispectral outcomes can be easily seen. The potential of hyperspectral data has further been examined for the discrimination of different vegetation types using the FF-NN classifier and compared with the Landsat-8 dataset, as shown in Fig. 5; the corresponding statistical analysis is given in Table 4. These outcomes depict the effectiveness of the hyperspectral classified images as compared to the multispectral classified images. The main reason behind such results is the ability of hyperspectral sensors to deliver narrow-band information and to produce a spectrum for every pixel. On the other hand, the multispectral dataset is easy to process but provides only limited information, which results in the loss of vital information. The major challenge associated with hyperspectral imagery is its impact on computing speed when used with FF-NN. However, alternative or advanced ANN approaches can also be explored to speed up the classification process.

Fig. 4

Representation of subset a Hyperion EO-1 imagery classified by b MLC, c SVM, and d FF-NN; e Landsat-8 imagery classified by f MLC, g SVM, and h FF-NN; and i reference dataset

Fig. 5

Discrimination of ornamental crops a Landsat-8 Input Image b Hyperion EO-1 Input Image c Reference Image d FF-NN Landsat-8 e FF-NN Hyperion EO-1

Table 4 Accuracy assessment of discriminant ornamental crops using FF-NN classifier for (Hyperion EO-1) Hyperspectral dataset and (Landsat-8) Multispectral dataset

Conclusion

In the present work, Hyperion EO-1 and Landsat 8 data are evaluated over a region of the Indian states of Haryana and Uttar Pradesh. This article shows the potential of three well-established supervised classifiers, i.e., MLC, SVM, and FF-NN, using hyperspectral and multispectral datasets. From the experimental outcomes, it is apparent that the FF-NN classification method obtained the highest accuracy (91.20% with Hyperion and 82% with Landsat-8) as compared to the other classification methods, i.e., SVM (87.60% with Hyperion and 80% with Landsat-8) and MLC (84.40% with Hyperion and 72.40% with Landsat-8). It is also apparent that the hyperspectral dataset yields more accurate classified maps than the multispectral dataset. Moreover, the potential of hyperspectral data in vegetation discrimination is also evident as compared to multispectral data. This study can be further used for different applications in different regions, such as crop identification, disease detection, and crop growth monitoring for sustainable crop production.