Introduction

Remote sensing (RS) satellites collect vital information for mapping and monitoring the Earth’s surface (Rogan & Chen, 2004). To monitor and document the Earth’s surface effectively, remote sensing data are commonly used to collect information about land use and land cover (LULC) (Sowmya et al., 2017). Land use refers to the human use of landscapes, such as urban areas, buildings, and agriculture (Kim, 2016). Land cover refers to the natural elements that appear on the landscape, such as water, mountains, and forests (Lv et al., 2019). LULC classification is the process of assigning land cover classes to pixels. Waterbody, urban, cultivation, building, forest, agriculture, plain, mountain, bare land, and highland are a few example LULC classes (Talukdar et al., 2020). According to recent studies, LULC classification is an essential tool for assessing effects on a variety of characteristics of the Earth’s surface, including terrestrial ecosystems, water balance, biodiversity, and climate (Alshari & Gawali, 2021). For LULC classification, Landsat 8 images are the most commonly used dataset from the United States Geological Survey (USGS). USGS imagery includes Landsat, MODIS, and Sentinel 2. Among them, the Landsat time series spanning nearly 40 years is of particular note. Kulkarni and Vijaya (2021a) used Landsat 8 imagery to identify the LULC changes that occurred in Bangalore during 2013, 2016, and 2019. They chose Bangalore as the study area because of urban sprawl and the city’s significant increase in built-up areas. Similarly, Madurai, in the southeastern region of India, was chosen as the study area for this research. Data were gathered by the Landsat 8 satellite in 11 different bands using two different sensors. The information required for LULC classification may come from any band. However, if the bands are highly correlated with one another, the information provided by each band will be redundant. As a result, only a subset of all accessible bands may be utilized. The selection of spectral, texture, and vegetation-index features is therefore an important step in LULC classification (Kulkarni & Vijaya, 2021b). Numerous vegetation indices were used in the current study to achieve consistent accuracy and to ensure a comprehensive analysis of vegetation conditions and characteristics (Qu et al., 2021). These indices offer a more exact view of vegetation dynamics, which helps to differentiate forest, agricultural, and uncultivated regions.

Traditionally, most LULC classifications were based on pixel-based (PB) classification of remotely sensed images (Varma et al., 2016). They used supervised categorization, unsupervised categorization, or both. Several supervised machine-learning algorithms have been used for LULC classification, including support vector machines (SVM) (Heumann, 2011), maximum likelihood (ML) (Sinha et al., 2015), k-nearest neighbors (kNN) (Hudait & Patel, 2022), and random forests (RF) (Hütt et al., 2016). Several unsupervised machine-learning algorithms are also commonly used for LULC classification, including a priori (Lee et al., 2018), principal component analysis (PCA) (Deng et al., 2008), and independent component analysis (ICA) (Lu et al., 2019). A PB classifier does not take into account the spatial and contextual information associated with a particular pixel. As higher-resolution imagery becomes more available, it may be possible to use this spatial information to produce more accurate LULC classifications (Willhauck, 2000). To address these issues, remote-sensing image analysis increasingly uses segment-based, or object-based (OB), classification instead of pixel-based classification. By analyzing objects individually, it is possible to minimize the spectral variability within a class, as well as classification errors caused by pixel artifacts due to spectral differences in atmospheric corrections. OB classification offers another advantage for broad-area mapping in that it reduces computational complexity, although segmentation may be a time-consuming process (Blaschke, 2010).

Many studies have compared the performance of PB and OB approaches for classifying land use. Gholoobi et al. (2010) examined the performance of PB and OB classification of land use in mountainous regions. Based on their findings, the OB classification approach does not yield noisy results. According to Johansen et al. (2010), the OB approach reduces the effects of differences in sensor view, cloud shadows, high-spatial-frequency noise, and unregistered images. Based on the work of Aggarwal et al. (2016), OB classification can be accomplished without the limitations of PB classification, which depends solely on the spectral values of the remotely sensed data. Zhou et al. (2008a) discussed the limitations of OB classification: the OB method requires greater computational power than PB classification, and the effectiveness of the rules depends heavily on expert knowledge. Several OB LULC classification problems can be solved successfully using SVM classification algorithms (Shao & Lunetta, 2012). One issue with SVM is that, although the training data set is divided by the class boundary line, many misclassifications occur near this boundary. To refine the classification decision near the SVM boundary, a kNN distance measure is used at a second level. The two-level classification (SVM-kNN) thus reduces the misclassification rate. Garcia-Gutierrez et al. (2010) and Rithesh (2017) proposed and discussed a two-step SVM-kNN classification approach, which applies the SVM and kNN classifiers sequentially to solve classification problems. During the OB classification process, similar pixels are grouped into clusters, and objects are generated based on the clustering and segmentation of the pixels. The accuracy of OB LULC classification has improved significantly since the segmentation process was developed.

Traditional segmentation algorithms are based on region growing, thresholding, level sets, and active contours. Many segmentation algorithms, such as fractal net evolution (Zhou et al., 2008b), bottom-up region-merging (Im et al., 2008), multitemporal segmentation (Civco et al., 2002), hierarchical segmentation (Tassi & Vizzari, 2020), and multiresolution segmentation (Atik & Ipbuker, 2021), achieve good object-level results, but most traditional segmentation algorithms still need a substantial improvement in segmentation speed. Currently, the superpixel segmentation algorithm (Wang et al., 2017) is widely applied to image segmentation and classification in various fields because of its low computational cost, faster processing speed, and better noise robustness. Based on the concept of superpixel segmentation, the simple linear iterative clustering (SLIC) and simple non-iterative clustering (SNIC) segmentation algorithms were presented in 2012 and 2017, respectively (Achanta & Süsstrunk, 2017). Several approaches were developed from the original SLIC segmentation technique, and these algorithms have the advantage of being quick and simple to compute.

The increasing accessibility of geographical imagery has recently made it possible to develop cloud-based spatial analysis frameworks such as the Google Earth Engine (GEE). Previously, a large-volume spatial analysis such as LULC classification was impractical to implement because of the complexity of the algorithms. The GEE interface gives users the opportunity to state, create, and execute algorithms according to their needs through an intuitive interface. Furthermore, it offers a publicly accessible data catalog that encompasses a comprehensive archive of Landsat images (including Landsat 7, 8, and 9), along with the MODIS dataset and Sentinel 1, 2, and 3 imagery. For these reasons, the proposed work is implemented and executed on the GEE platform for simplicity and efficiency.

In the present study, a novel boundary-specific two-level learning approach augmented with auxiliary features is used to evaluate LULC classification systems. The primary contribution lies in the comparison between PB classification and OB classification methods using SVM classification within the GEE environment. To enhance OB classification accuracy, auxiliary features are incorporated alongside traditional features, and the SNIC segmentation algorithm is utilized for segmentation instead of the previously employed multiresolution segmentation. Finally, a boundary-specific classification algorithm combining SVM and kNN is employed to minimize SVM’s misclassification rate. These findings provide valuable insights for improving LULC classification techniques.

Material and methods

Methodological framework

The boundary-specific two-level classification framework for the LULC classes is shown in Fig. 1, along with a comparison of the classification accuracy of PB and two OB techniques, with and without SNIC (OBS and OB, respectively), using Landsat 8 datasets. The typical workflow comprises the following steps: (i) data composition, (ii) creation of PB and OB data, (iii) boundary-specific two-level classification, and (iv) accuracy evaluation. In the first step, data composition merges data from several sources, such as spectral bands, dates, and resolutions, to produce a more complete image of the study area. This helps to increase the quantity and quality of information available from Landsat 8 data. In the pre-processing step, the Landsat 8 data are filtered by date, region of interest (ROI) mask, and cloud coverage. Features are then collected from the preprocessed image; they are of three main types: spectral, textural, and indices. The next step is the creation of PB and OB data. PB data is the combination of bands Surface Reflectance_Band7 (SR_B7), SR_B5, and SR_B3. In OB, that spectral composite image is segmented into objects (groups of similar pixels) by applying multiresolution segmentation. The spectral and textural information of these objects is then used for the classification. To implement the second approach, OBS, auxiliary features are added to the existing spectral features, and the SNIC segmentation approach is used to create the objects.

Fig. 1 Flowchart illustrating the proposed methodology’s data processing and analysis implemented in GEE

The final methodological step is boundary-specific two-level classification. At the first level, the dataset is divided into training and testing parts and classified with the SVM classifier. Based on the classification result, the training data are labeled and compared with the user-defined class labels. kNN classification is then applied to refine the training set and improve the accuracy of the SVM classification. A confusion matrix is generated for each method from the classification output and the ground truth values. These matrices yield a number of qualitative and quantitative measures that are used to interpret the accuracy of the output LULC map and to compare the performance of the various techniques.

Study area

Madurai district (the study area) is situated in the state of Tamil Nadu, in the southern part of India. It is bordered in the north by the districts of Dindigul and Thiruchirapalli, in the east by Sivagangai, in the west by Theni, and in the south by Virudhunagar. The district has 3,038,252 residents and covers an area of 3710 km². It lies between the North Latitudes of 9°30.00 and 10°30.00 and the East Longitudes of 77°00.00 and 78°30.00 (Alaguraja et al., 2010). The study region is bounded by the Southeast Ghats and several mountain spurs of the Western Ghats. During 8 months of the year, the climate of the study area is predominantly hot and dry. The complex patterns of topography and climate that dictate the distribution of LULC in Madurai give rise to a great deal of biodiversity as well as unique landscapes (Rajesh et al., 2020). Forestry, agriculture, urban, water bodies, uncultivated land, and bare land are the most common land use and land cover patterns in the study area, and these six classes are used for the LULC classification. The selected classes, their codes, and their descriptions are presented in Table 1. The number of classes is chosen based on the characteristics of the study area and the problem to be solved. The study area was carefully examined with respect to how the various classes are distributed in the data (Mohideen, 2017). The location map of the study area is shown in Fig. 2.

Table 1 Class descriptions, codes, and descriptions of the LULC classes
Fig. 2 Location map for the study area

Data composition

Landsat 8 Operational Land Imager (OLI) images are used in this study. The Landsat 8 satellite is operated by the National Aeronautics and Space Administration (NASA) and the USGS. On average, Landsat 8 produces a 30-m resolution image of a given location approximately every 2 weeks, and its products are categorized into three tiers (tier_1, tier_2, and tier_RT), two collections (collection_1 and collection_2), and three processing methods (surface reflectance, top of atmosphere, and raw images) (Knight & Kvaran, 2014). Among these categories, “Landsat 8 Level_2 tier_2 collection_2 Surface Reflectance” images are collected from the GEE archive; these images are already atmospherically corrected. The input parameters used to load the data in GEE are shown in Table 2. The images comprise five visible and near-infrared (VNIR) bands and two short-wave infrared (SWIR) bands (Barsi et al., 2014). To minimize the impact of cloud coverage and to capture the most suitable period for vegetation growth, the dataset is filtered to the interval from June 2015 to October 2015. Moreover, the image is filtered by the geometric boundary of the study area; the images before and after filtering are shown in Fig. 3a and b.
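As a minimal illustration of what working with the surface reflectance product involves, the following sketch converts stored integer values to unitless reflectance. The scale factor (0.0000275) and offset (−0.2) are those published by the USGS for Landsat Collection 2 Level-2 surface reflectance bands; they are not stated in this paper and are an assumption here, as are the function names and the sample values.

```python
# Scale factor and offset published by the USGS for Landsat Collection 2
# Level-2 surface reflectance bands (assumed here; not stated in the paper).
SR_SCALE = 0.0000275
SR_OFFSET = -0.2


def to_reflectance(dn: int) -> float:
    """Convert a stored integer SR value to unitless surface reflectance."""
    return dn * SR_SCALE + SR_OFFSET


def scale_band(band: list[list[int]]) -> list[list[float]]:
    """Apply the conversion to every pixel of a 2-D band, clamped to [0, 1]."""
    return [[min(1.0, max(0.0, to_reflectance(dn))) for dn in row] for row in band]


# Hypothetical SR_B5 (near-infrared) values for a 2 x 2 patch.
sr_b5 = [[7273, 20000], [30000, 43636]]
for row in scale_band(sr_b5):
    print([round(value, 3) for value in row])
```

Clamping to the range [0, 1] is a design choice of this sketch; it guards against slightly negative or saturated values after scaling.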

Table 2 Data collection input specification in GEE
Fig. 3 Multispectral image of the study area (a) prior to cloud and ROI masking and (b) after applying the masking process

Feature collection

According to earlier research, different remote sensing feature sets are responsive to different forms of LULC. As a result, there is no single comprehensive feature set for LULC. In the feature collection process, 7 spectral characteristics, 4 texture features, and 4 spectral indices were extracted from the preprocessed Landsat 8 OLI images (Table 3). To create the PB classification data composition, SR_B7, SR_B5, and SR_B3 are linearly combined. This band combination is well suited to tracking agricultural crops, which tend to appear in bright green hues (Acharya & Yang, 2015). A similar combination of bands is used for OB classification. In addition, gray level co-occurrence matrix (GLCM) features and entropy are added to create the feature set for OB classification. For OBS classification, auxiliary datasets (F2, F3, F4, and F5) are added to the earlier feature set. Here, an auxiliary dataset is a geospatial dataset that is not derived from the Landsat 8 satellite. A global forest/non-forest map (FNF) (F2) is created by categorizing backscattering coefficients in a 25-m resolution mosaic, differentiating “forest” and “non-forest” using variable thresholds (Altunel et al., 2020). The inland water bodies auxiliary feature, “GLCF: Landsat Global Inland Water” (F3), aids in identifying water bodies with Landsat imagery (Chen et al., 2015). The soil texture auxiliary feature (F4) characterizes soil properties that influence vegetation growth globally, including numeric properties at different depths and the soil class distribution (De Lannoy et al., 2014). The global population dataset (F5) enhances predictions for dense classes and reduces confusion between similar urban classes (Chen et al., 2020). These open access datasets in GEE are shown in Fig. 4; they provide the variation and spatial distribution of human settlements, soil features, inland water bodies, and forest cover.
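To make the GLCM-based texture feature concrete, the following minimal sketch computes the entropy of a gray level co-occurrence matrix for a small patch of quantized gray levels. It is an illustration only and not the implementation used in this study: the horizontal one-pixel offset, the symmetric counting, the base-2 logarithm, and the function names are all assumptions.

```python
import math
from collections import Counter


def glcm(patch, dx=1, dy=0, symmetric=True):
    """Return normalized co-occurrence probabilities {(i, j): p} for one offset."""
    counts = Counter()
    rows, cols = len(patch), len(patch[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                a, b = patch[r][c], patch[r2][c2]
                counts[(a, b)] += 1
                if symmetric:
                    counts[(b, a)] += 1   # count each pair in both directions
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


def glcm_entropy(probabilities):
    """Shannon entropy (base 2) of the co-occurrence distribution."""
    return -sum(p * math.log2(p) for p in probabilities.values() if p > 0)


# A uniform patch has no texture; a checkerboard patch has two equally
# likely gray level pairs.
uniform = [[1, 1], [1, 1]]
checkerboard = [[0, 1], [1, 0]]
print(glcm_entropy(glcm(uniform)), glcm_entropy(glcm(checkerboard)))
```

Higher entropy indicates a more heterogeneous texture, which is why such a measure can help to separate, for example, built-up areas from homogeneous water surfaces.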

Table 3 Feature set description and code
Fig. 4 The auxiliary datasets used in the study. A Global human settlements: white pixels denote human settlements. B Forest cover: green pixels denote forest regions. C Soil features: green and gray pixels denote different soil textures. D Inland water bodies: blue pixels denote water body regions

SNIC segmentation

In the OBS method, the simple non-iterative clustering (SNIC) algorithm is used for image segmentation. SNIC is an advanced super-pixel segmentation method that builds on simple linear iterative clustering (SLIC). Compared with SLIC, SNIC reduces the computation time of segmentation by using a non-iterative procedure. The main parameters of SNIC segmentation in the GEE environment are super-pixel size, connectivity, compactness, neighborhood size, and seed shape. Figure 5 shows the input parameters and output image of the SNIC segmentation algorithm for the study area.

Fig. 5 The outcome of SNIC segmentation in the GEE environment with parameter list

In SNIC, K initial centroids are generated in the image plane on a regular grid, where K is the user-specified number of super-pixels and the main user-defined parameter of the algorithm. K determines the super-pixel size s, which may be computed as follows:

$$s=\sqrt{N/K}$$

where N is the number of pixels in the image \({\left\{{I}_{i}\right\}}_{i=0}^{N}\). Each element in SNIC consists of the spatial position of the candidate pixel, the CIELAB color of the pixel, and the label of a super-pixel centroid. The priority queue Q is initialized with K elements, one per centroid. While Q is not empty, the element with the minimum distance is popped from the queue, and its pixel, if still unlabeled, receives the label of that element’s centroid. For each connected neighbor pixel of the popped element that is still unlabeled, a new element is formed, assigned the distance from the connected centroid and the label of that centroid, and pushed onto Q. This procedure is repeated until Q is empty, that is, until all of the image’s pixels have been labeled. The result is an image in which each pixel is mapped to the nearest centroid.
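The priority-queue procedure described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the GEE implementation: it works on a single gray channel instead of CIELAB color, uses 4-connectivity, and normalizes the spatial and color terms by s and by a hypothetical `compactness` value. The number of grid seeds can also differ slightly from K when the image size is not a multiple of the grid step.

```python
import heapq
import math


def snic(image, k, compactness=10.0):
    """Label each pixel of a 2-D grayscale image with a super-pixel index."""
    rows, cols = len(image), len(image[0])
    s = math.sqrt(rows * cols / k)       # super-pixel size, s = sqrt(N / K)
    step = max(1, round(s))
    labels = [[-1] * cols for _ in range(rows)]
    centroids = []                       # running sums: [row, col, value, count]
    queue = []                           # entries: (distance, row, col, label)

    # Seed the centroids on a regular grid.
    for r in range(step // 2, rows, step):
        for c in range(step // 2, cols, step):
            heapq.heappush(queue, (0.0, r, c, len(centroids)))
            centroids.append([0.0, 0.0, 0.0, 0])

    while queue:
        _, r, c, label = heapq.heappop(queue)
        if labels[r][c] != -1:
            continue                     # already claimed by a closer centroid
        labels[r][c] = label
        acc = centroids[label]           # update the centroid incrementally
        acc[0] += r
        acc[1] += c
        acc[2] += image[r][c]
        acc[3] += 1
        mean_r, mean_c, mean_v = acc[0] / acc[3], acc[1] / acc[3], acc[2] / acc[3]
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):   # 4-connectivity
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and labels[nr][nc] == -1:
                spatial = ((nr - mean_r) ** 2 + (nc - mean_c) ** 2) / s
                colour = (image[nr][nc] - mean_v) ** 2 / compactness
                heapq.heappush(queue, (math.sqrt(spatial + colour), nr, nc, label))
    return labels


# A 4 x 8 image with a dark left half and a bright right half, K = 2,
# so N = 32 and s = sqrt(32 / 2) = 4.
image = [[0] * 4 + [100] * 4 for _ in range(4)]
for row in snic(image, k=2):
    print(row)
```

Because every pixel is labeled exactly once when it is first popped, the procedure needs no repeated passes over the image, which is the property that distinguishes SNIC from the iterative SLIC.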

Boundary-specific two-level classification

This methodology utilizes a two-level boundary-specific classification method to compute the LULC classification. Earlier investigations have shown that, as a consequence of the uniqueness and limitations of the tools, no classifier can be categorically regarded as superior to other classifiers, since none can guarantee high-quality classification for all datasets (Vivekananda et al., 2021; Alshari & Gawali, 2021). In complex multidimensional datasets, binary SVM classification can provide satisfactory classification results. On the other hand, it does not work well when applied to large and imbalanced datasets. It is therefore necessary to perform another classification at a second level in order to reduce the misclassification rate. A kNN classifier is implemented at the second level to reduce the misclassifications caused by the SVM classifier. The boundary-specific two-level classification method thus combines the strengths of the two classifiers to increase LULC classification accuracy. This method has proven to be reliable and accurate in various applications. Figure 6 illustrates the algorithm steps for implementing the boundary-specific two-level classification. Figure 7 illustrates the boundary-specific two-level approach in a two-dimensional space, which utilizes the cascade learning method. In the first level of classification, the SVM classifier is applied to the dataset, forming a hyperplane based on the support vectors and the kernel function. This hyperplane separates the dataset into two classes with labels +1 and −1. The objects falling inside the strip around the hyperplane form the dataset for the kNN classifier. To develop a training set for the kNN classifier, objects from the boundary region are selected, assuming that the SVM classifier will continue to classify objects outside this area correctly but may make mistakes inside it. The kNN classifier works differently from the SVM classifier and can improve the overall data classification quality in some cases.
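The cascade described above can be sketched as follows. This is a minimal illustration under several assumptions that do not come from the paper: the SVM is taken to be already trained with a linear kernel, so that its decision function is f(x) = w · x + b; the strip is defined as |f(x)| < 1; k = 3; and the weights, samples, and function names are hypothetical.

```python
import math
from collections import Counter


def decision_value(x, w, b):
    """Signed SVM score f(x) = w . x + b for a linear kernel."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def knn_predict(x, train, k=3):
    """Majority label among the k nearest training samples (Euclidean distance)."""
    nearest = sorted(train, key=lambda sample: math.dist(x, sample[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def two_level_predict(x, w, b, train, margin=1.0, k=3):
    """Level 1: SVM sign outside the strip. Level 2: kNN inside the strip."""
    score = decision_value(x, w, b)
    if abs(score) >= margin:
        return 1 if score > 0 else -1
    # Only training samples that also fall inside the strip are used by kNN.
    boundary = [s for s in train if abs(decision_value(s[0], w, b)) < margin]
    if len(boundary) < k:
        return 1 if score >= 0 else -1   # too few neighbours: keep the SVM label
    return knn_predict(x, boundary, k)


# Hypothetical trained hyperplane x0 = 0 and labeled training samples.
w, b = (1.0, 0.0), 0.0
train = [((-2.0, 0.0), -1), ((2.0, 0.0), 1), ((0.2, 0.0), -1), ((0.3, 0.5), -1),
         ((0.4, -0.5), -1), ((-0.5, 0.0), -1), ((0.9, 0.0), 1)]

print(two_level_predict((3.0, 0.0), w, b, train))   # far from the boundary: SVM decides
print(two_level_predict((0.3, 0.0), w, b, train))   # inside the strip: kNN decides
```

In this toy example the second query lies on the positive side of the hyperplane, so the SVM alone would label it +1, but its three nearest neighbors inside the strip all carry the label −1 and the second level overrides the first.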

Fig. 6 Algorithm steps for boundary-specific two-level classification using SVM and kNN

Fig. 7 Boundary-specific two-level classification in two-dimensional space

In the GEE environment, the important parameters for the SVM and kNN classifier are shown in Table 4.

Table 4 Classification parameters and code for SVM and kNN in the GEE environment

Accuracy assessment

For training and testing the boundary-specific two-level classification in the GEE environment, 200 polygons are randomly collected from the study area. The number of sample polygons is chosen based on the study area size and previous studies (Avci et al., 2023). To create the ground truth points, the polygons are overlaid on the Google Earth high-resolution base map, and each polygon is manually labeled as FAC, UAC, WBC, BLC, ULC, or ALC (Al-Abdulrazzak & Pauly, 2014). By random selection, 70% of these points are used for training and 30% for validation. The confusion matrix is used to assess the accuracy of the PB, OB, and OBS LULC classifications. To assess the LULC classification accuracy quantitatively, the producer’s accuracy (PA), user’s accuracy (UA), overall accuracy (OA), and kappa coefficient (K) are calculated from the confusion matrix. For the PB classification methods, the confusion matrix is calculated based on the number of pixels. For object-based classification methods, the confusion matrix can be determined based on either the number of objects or the area of the objects.
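For reference, the four measures can be derived from a confusion matrix as in the following sketch. It assumes that rows hold the reference (ground truth) classes and columns hold the mapped classes; the two-class matrix is hypothetical and is not taken from Tables 6 or 7.

```python
def accuracy_metrics(matrix):
    """matrix[i][j] = number of samples of reference class i mapped to class j."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    diagonal = [matrix[i][i] for i in range(n)]
    row_totals = [sum(row) for row in matrix]                              # reference totals
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]   # mapped totals

    overall = sum(diagonal) / total                                # OA
    producers = [diagonal[i] / row_totals[i] for i in range(n)]    # PA per class
    users = [diagonal[j] / col_totals[j] for j in range(n)]        # UA per class
    # Agreement expected by chance, from the marginal totals.
    expected = sum(row_totals[i] * col_totals[i] for i in range(n)) / total ** 2
    kappa = (overall - expected) / (1 - expected)                  # K
    return overall, producers, users, kappa


# Hypothetical two-class confusion matrix with 100 validation samples.
oa, pa, ua, kappa = accuracy_metrics([[45, 5], [10, 40]])
print(f"OA = {oa:.2f}, K = {kappa:.2f}")
print("PA =", [round(value, 2) for value in pa])
print("UA =", [round(value, 2) for value in ua])
```

The sketch assumes that every class occurs at least once in both the reference and the mapped data; otherwise a row or column total of zero would make the corresponding PA or UA undefined.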

Results

Figure 8 shows the step-by-step outcome of PB LULC classification. The initial step of the classification is masking and creating the true color composite image. The true-color composite layer is used to examine how pixels are visualized for each band. To generate a true-color composite image, the “SR_B4,” “SR_B3,” and “SR_B2” bands are utilized. In the next step, spectral features and indices are collected. As a result of the feature collection, the study area is classified into one of the PB LULC classes such as FAC, UAC, WBC, BLC, ULC, and ALC using the supervised classification method.

Fig. 8 The step-by-step outcome of PB LULC classification

The results of the OB and OBS LULC classifications are displayed in stages in Figs. 9 and 10. The processing stages of the OB and OBS approaches are similar, but the segmentation methodologies differ. In the OB LULC classification approach, a multiresolution segmentation algorithm is employed to construct the objects, and contextual information is added to the feature collection. The OBS LULC classification process uses the SNIC approach for segmentation, and auxiliary features are introduced during the feature collection stage. These results suggest that object-based classifiers take into account smoothness, form, and texture in addition to spectral values, whereas PB classifiers take into account only spectral values.

Fig. 9 The step-by-step outcome of OB LULC classification

Fig. 10 The step-by-step outcome of OBS LULC classification

Figure 11 illustrates the significance of augmenting the feature set with auxiliary features in the OBS method. Here, F1 denotes the earlier feature set, comprising spectral features (7), texture features (4), and indices (4), with which the OBS method achieves an accuracy of 92.45%. F2 corresponds to the inclusion of the forest cover auxiliary feature alongside the earlier feature set, resulting in an accuracy of 93.36%. When the inland water bodies feature is incorporated with F1, the accuracy increases to 93.60% (F3). Similarly, adding the soil feature or the population feature to F1 yields accuracies of 94.25% (F4) and 94.39% (F5), respectively. Finally, augmenting the earlier feature set with all auxiliary features raises the overall accuracy to 94.42%. Each feature provides only a marginal improvement over its predecessor, but the combined augmentation leads to a notable increase of about 2 percentage points in accuracy.
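The gains reported above can be tabulated directly from the stated accuracies. The short sketch below only restates the arithmetic; the dictionary keys are descriptive labels chosen for this illustration.

```python
BASELINE_OA = 92.45  # OA (%) of the OBS method with feature set F1 only

augmented_oa = {
    "F1 + forest cover (F2)": 93.36,
    "F1 + inland water bodies (F3)": 93.60,
    "F1 + soil texture (F4)": 94.25,
    "F1 + population (F5)": 94.39,
    "F1 + all auxiliary features": 94.42,
}

# Gain over the F1 baseline, in percentage points.
gains = {name: round(oa - BASELINE_OA, 2) for name, oa in augmented_oa.items()}

for name, gain in gains.items():
    print(f"{name}: +{gain:.2f} percentage points")
```

The combined feature set gives the largest gain, 1.97 percentage points, which is the increase of about 2 points referred to in the text.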

Fig. 11 Auxiliary feature augmentation

In GEE, a confusion matrix was developed to statistically compare the ground truth labels of the validation points with the output classifications and thereby assess the accuracy of the LULC classifications. In addition to providing the OA and K, the confusion matrix also indicates which LULC classes produce errors (quantified by the UA and PA). These errors can be used to identify potential areas of improvement in the analysis. Furthermore, the confusion matrix can be used to identify the most and least accurate LULC classes. Table 5 lists the OA and K values of the PB, OB, and OBS LULC classifications. The results of the OBS LULC classification accurately depict the land cover of the study area. The OA and K values of the OBS LULC classification (OA: 94.42% and K: 0.92) are higher than those of the PB (OA: 81% and K: 0.76) and OB (OA: 91% and K: 0.89) LULC classifications. Therefore, the OBS LULC classification is more accurate than the other two.

Table 5 Accuracy assessment of the proposed LULC classification methodologies

Based on the classification methodologies employed in this study, Fig. 12 shows the proportion of the total area occupied by each class. According to OBS, the most accurate methodology, BLC is the largest class by area and ULC is the second largest. UAC follows, occupying nearly 14% of the total study area. The remaining classes occupy smaller areas of the total study area, with the smallest proportion being occupied by the ALC class. This is followed by the FAC and WBC classes, which occupy nearly 7% and 1.7% of the total area, respectively. Overall, the results show that BLC is the most dominant class in the study area.

Fig. 12 Distribution of 6 LULC classes for the study area

The SVM classification in the OBS method produced an OA of 94.42% and a K of 0.92, which was a marked improvement over the PB and OB methods. However, in terms of PA and UA, the accuracy of the ULC and UAC classes remained quite low because of misclassification between them. The highest rates of misclassification occur between two pairs of classes: ULC and UAC, and UAC and BLC. To improve the accuracy of the OBS method, special attention should be paid to these misclassifications.

The proposed two-level boundary-specific SVM-kNN classifier can increase the accuracy of the classification. Applying the kNN classifier to objects that lie near the separating hyperplane found by the SVM classifier can reduce the number of misclassified objects. With this method, the OA increases from 94.42% to 95.78% and K increases from 0.92 to 0.94. There is also a slight increase in the PA and UA of ULC and UAC when compared with SVM alone. The confusion matrices of the OBS method with SVM classification and with boundary-specific two-level classification are displayed in Tables 6 and 7, respectively.

Table 6 Confusion matrix for SVM Classification in OBS method
Table 7 Confusion matrix for boundary-specific two-level classification in OBS method

Figure 13 displays a map of a small section of the study area with a scale of 500 m, which clearly outlines the land use and land cover classes. This region is mainly composed of waterbodies, along with some urban, agricultural, and uncultivated land. The output map from the OBS LULC classification method accurately differentiates the land within the waterbody and the road as urban area.

Fig. 13 LULC classification outcome of boundary-specific two-level classification in the OBS method

Discussion

LULC classification is heavily dependent on the accuracy of the classification algorithms. Thus, improving the classification accuracy is of primary importance in remote sensing. Prior studies have attempted to increase the accuracy of LULC classification using both the traditional PB (Varma et al., 2016; Whiteside & Ahmad, 2005) and OB (Dorren et al., 2003; Kavzoglu & Yildiz, 2014) methods. To improve on the LULC classification accuracy reported in the literature, a Landsat 8 OLI image of the Madurai study area is used. The choice of Landsat 8 is based on the work of Chen et al. (2015), which shows that medium-resolution (Landsat-like, 10–30 m) sensors are more competent for detecting most human-nature interactions in high-resolution LULC classification. Low- to moderate-resolution sensors, by contrast, are not suitable for high-precision LULC classification studies at regional scales. The two leading platforms for medium-resolution satellite land imaging are Landsat and Sentinel. Because of Landsat’s compatibility with its earlier missions and its extensive historical datasets compared to the Sentinel satellites, Landsat is frequently used in research (Chander et al., 2009). The increased radiometric performance and the thermal band calibration of Landsat 8 also contribute to improved analyses of LULC classification (Roy et al., 2014). Thus, the “Level 2 Surface Reflectance image from the Landsat 8 OLI (tier 2)” was employed. Surface reflectance (SR) data from Landsat 8 Level-2 data products have already undergone certain radiometric and geometric adjustments (Acharya & Yang, 2015), under the direction of the USGS. Because these corrections have already been applied to the Landsat 8 Level-2 data products, using this dataset often does not require further radiometric and geometric corrections by individual researchers (Vermote et al., 2016).

The study aims to develop an OB LULC classification by combining auxiliary features, SNIC, and boundary-specific two-level classification on the freely accessible GEE platform. The study also compares the PB and OB LULC classifications. The OA achieved by the PB classification was 81%, whereas the OA of the OB classification without SNIC and auxiliary features was 91%. OB thus improves accuracy by 10 percentage points over the PB classification, which is a significant improvement over the accuracies achieved by Whiteside and Ahmad (2005) and Weih and Riggan (2010). In addition to OB, Qu et al. (2021), Zhu et al. (2016), and Hurskainen et al. (2019) discuss the advantages of integrating auxiliary features for better classification. The integration of freely available auxiliary features (F2, F3, F4, and F5) with spectral and textural features in this paper has demonstrated a significant enhancement (an improvement of about 2 percentage points) in the OA of LULC classification (Fig. 11). The feature set F1 (7 spectral characteristics, 4 texture features, and 4 spectral indices) is selected based on the results obtained in a previous study (Rohini & Geraldine Bessie Amali, 2023).

This paper also analyzes and compares the effect of two segmentation algorithms, multiresolution and SNIC segmentation, on OB classification. Multiresolution segmentation and SNIC segmentation provide accuracies of 91% and 94.42%, respectively, in OB LULC classification. The OBS approach yields a gain of more than 13 percentage points in OA over the PB classification (94.42% versus 81%). The SNIC segmentation algorithm uses a “compactness factor” to define cluster shapes (with greater values leading to more compact clusters), “connectivity” to decide how neighboring clusters merge (either four connections, like a rook’s moves, or eight connections, like a queen’s), and a “neighborhood size” to eliminate artifacts at tile boundaries when merging nearby clusters. The compactness (0.1) and connectivity (8) parameters were selected through systematic, iterative experimentation and analysis.

Previous studies (Machhale et al., 2015; Zanchettin et al., 2012) improved the classification accuracy of SVM by using a hybrid SVM/kNN model, in which the SVM and kNN algorithms are applied simultaneously to the complete dataset. In this study, however, only a subset of the data is considered for the second-level kNN, based on the boundary condition of the first-level SVM classification (Fig. 7). The results indicate that the boundary-specific two-level classification algorithm proposed in this paper provides better results than the existing literature, despite the reduced dataset used for training at the second level. A significant increase of 5% in UA and PA for LULC classes was also observed (Tables 6 and 7).

Conclusion

This study offers valuable insights into LULC classification for monitoring land degradation in the study area of Madurai. The primary focus of this research is to enhance classification accuracy through the integration of auxiliary features, the use of the SNIC segmentation algorithm, and the implementation of a boundary-specific two-level classification approach using SVM and kNN. The evaluation of PB and OB classification techniques highlights the limitations of PB methods compared with OB methods. Advancements in the computational capabilities of platforms like GEE and improvements in the SNIC segmentation algorithm are poised to elevate LULC classification outcomes for Landsat 8 data, even at a 30-m resolution. The study illustrates the benefit of incorporating auxiliary features spanning forest cover, inland water bodies, soil characteristics, and population data. The study also develops a novel boundary-specific two-level classification methodology that combines the SVM and kNN techniques to reduce the misclassification rate. The proposed method increases the OA of the OBS classification from 94.42% to 95.78% and K from 0.92 to 0.94. Overall, the present study provides a solid foundation for further research and opens avenues for improving LULC classification techniques, thereby enabling more accurate and reliable land cover information for various scientific and practical purposes. A limitation of the proposed method is that it does not take advantage of deep learning techniques, which can learn complex patterns from large quantities of data; these will be considered in future work. The proposed LULC classification can also be extended and applied to multitemporal data to identify changes over a given period of time.