Introduction

Over the past few decades, advances in remote sensing and the easier availability of satellite images have made land-use analysis through image classification a vital research topic (Robles Granda 2011; Tehrany et al. 2014). Automated image classification is one of the easiest and most widely preferred techniques for preparing land use land cover (LULC) maps of an area (Rozenstein and Karnieli 2011). Past studies have sought the best performing classification algorithm by comparing different algorithms. However, none of them provides a comprehensive comparative analysis of all the popular classification algorithms (Srivastava et al. 2012).

The literature offers various studies that compare pixel-based and/or object-based classification approaches. The basic difference between the two approaches is the underlying spatial unit: pixel or object (Duro et al. 2012; Tehrany et al. 2014). Pixel-based classification uses the spectral information stored as digital numbers (DN) in each pixel, where each pixel represents a different feature on the earth’s surface. The object-based classification approach considers spatial features (e.g. shape, size, tone/color, texture, pattern and association) and divides the image into homogeneous objects (Gao et al. 2011; Tehrany et al. 2014).

Srivastava et al. (2012) evaluated three pixel-based classification algorithms, namely artificial neural network (ANN), support vector machine (SVM) and maximum likelihood (ML), using low resolution Landsat TM/ETM+ images and found ANN to be a better classifier than SVM and ML. Rozenstein and Karnieli (2011) used a low resolution Landsat TM image to compare different pixel-based classification algorithms: supervised (ML), unsupervised (ISODATA) and a hybrid method (developed by combining spectral signatures from supervised and unsupervised classification). Their results revealed that the hybrid method (73.5%) performed better than the unsupervised (70.67%) and supervised (60.83%) algorithms. A similar order of performance was observed after post-classification processing. The hybrid method was significantly better than supervised classification, but not significantly better than the unsupervised method. Cleve et al. (2008) compared pixel-based (ISODATA) and object-based (nearest neighbour) classification in a wildland-urban interface area using high resolution aerial photographs and found the object-based approach (80.08%) to be better than the pixel-based approach (62.11%). Using medium resolution (10 m) SPOT 5 data, Tehrany et al. (2014) compared pixel-based (decision trees (DT)) and object-based (SVM and k-nearest neighbour (k-NN)) classification approaches. The object-based k-NN algorithm performed significantly better than both the object-based SVM algorithm and the pixel-based DT classification. Duro et al. (2012) used SPOT (10 m) data to compare pixel-based and object-based classification approaches using random forests (RF), DT and SVM algorithms. 
Their statistical tests revealed that the difference between pixel-based and object-based classification is not statistically significant when the same algorithm is applied; however, the object-based DT classification differed significantly from the object-based RF and SVM classifications. Duro et al. (2012) stated that both pixel-based and object-based approaches produced similar overall accuracies, but pixel-based classification was less time consuming to process. The LULC map produced using the object-based approach was found to be visually smoother. Using pixel-based (ML) and object-based (k-NN) approaches to extract urban land cover from very high resolution (VHR) QuickBird data, Myint et al. (2011) observed that the object-based classifier (90.40%) performed significantly better than the pixel-based classifier (67.60%). Jozdani et al. (2019) compared the performance of different deep learning algorithms, common ensemble algorithms and SVM in classifying urban areas using the object-based approach. Their findings indicated the multilayer perceptron (MLP) as the best classifier; other classifiers such as SVM were also found capable of mapping LULC in complex landscapes. Using Landsat 8 data, Qu et al. (2021) demonstrated that integrating auxiliary datasets improves both pixel-based and object-based classification results. The object-based approach performed better than the pixel-based approach, and also achieved higher accuracy when only spectral datasets were used. Gudiyangada Nachappa et al. (2020) compared conventional pixel-based models (data-driven frequency ratio and expert-based analytic hierarchy process) with geon-based object-based classification approaches. The results indicated that the object-based approach provided higher accuracy than both pixel-based models and produced meaningful spatial units. Tassi et al. (2021) compared pixel-based and object-based classification approaches and evaluated the impact of integrating textural details into the classification process. The accuracy of the pixel-based approach did not improve with the integration of textural details. The object-based approach performed better than the pixel-based approach when the 15 m panchromatic band of Landsat 8 was employed: adding the panchromatic band did not improve the pixel-based results, but it produced a more detailed LULC map with the object-based approach.

Many LULC classification studies (Gao et al. 2009; Hu and Weng 2011; Duro et al. 2012; Tehrany et al. 2014) report that the object-based classification approach provides more accurate results than the pixel-based approach. Although higher resolution images carry more information than coarser ones, they pose challenges for pixel-based classification (Cleve et al. 2008). Unlike in natural landscapes, the many features packed into a small space in an urban area can be captured precisely in a higher spatial resolution image, but such a high level of detail may congest the spectral details of urban features (Myint et al. 2006, 2011). This occurs because pixel-based classification considers only spectral information and neglects spatial information, an attribute that is significant in object-based classification (Benz et al. 2004; Walter 2004; Myint et al. 2011; Duro et al. 2012). Similar spectra from different features in urban areas (e.g. buildings, rooftops, roads, sidewalks and other bright surface objects) lead to the “mixed pixel problem” or “salt and pepper effect” (Kelly et al. 2004; Cleve et al. 2008; Myint et al. 2011; Ouyang et al. 2011) in pixel-based classification. This causes higher intra-class spectral variability, which lowers the statistical separability between classes and thereby leads to misclassification and low classification accuracy (Su et al. 2004). The object-based classification approach is therefore used to overcome these challenges (Cleve et al. 2008; Ouyang et al. 2011). Many researchers (Herold et al. 2003; Durieux et al. 2008; Hu and Weng 2011; Myint et al. 2011) have employed object-based classification for urban areas, considering them too complex to be classified accurately by pixel-based methods. 
Besides higher accuracy, the object-based approach has another advantage over pixel-based classification: it classifies objects with their proper shape, which is rather hard to achieve in pixel-based classification (Baatz et al. 2004; Ouyang et al. 2011). However, the object-based approach is believed to be more time consuming and labour intensive (Duro et al. 2012).

With the above background, the present study provides a comprehensive comparative view of the most widely used classification algorithms while taking account of satellite datasets from different platforms. Nine classification algorithms are applied to the different satellite datasets: eight pixel-based algorithms, namely maximum likelihood (ML), neural network (NN), support vector machine (SVM) with linear, polynomial, radial basis function (RBF) and sigmoid kernels, random forests (RF) and naive Bayes (NB), and one object-based classification (OBC) using ML.

Study area

The study area is the National Capital Territory (NCT) of Delhi (Fig. 1). It covers an area of 1,484 km2 and extends from 28.4084° N to 28.8845° N and from 76.8328° E to 77.3377° E. It has a predominantly urbanized landscape with some agricultural area at the outskirts and the river Yamuna flowing through it.

Fig. 1

Study area—NCT of Delhi

Data and methodology

In this study, satellite images of different spatial resolutions are used. Detailed specifications of the images are given in Table 1, which shows the variation in the datasets (in terms of time periods and sensors) considered in the present work. LISS and IRS-1D images were obtained from the National Remote Sensing Centre (NRSC), Indian Space Research Organisation (ISRO) (https://www.nrsc.gov.in/). The years 2005, 2010 and 2016 were chosen on two criteria: (1) the availability of cloud-free satellite images and (2) sufficient temporal separation for land-use change to be prominent. To avoid seasonal effects, all the images were acquired for the same season.

Table 1 Specifications of satellite images used for classification to prepare LULC of Delhi (2005–2016) (* in accordance with metadata of the image)

Image pre-processing

Before classification, all the satellite images for 2005, 2010 and 2016 were pre-processed in ERDAS® IMAGINE 2016 (Hexagon Geospatial 2016). The pre-processing methodology is shown in Fig. 2. The images of 2005 and 2010 were resolution merged to generate higher spatial resolution images for the respective years. The highest spatial resolution among all the satellite images was 5 m; therefore, the resolution-merged images were resampled to 5 m. Various resolution merging techniques were tried, including wavelet, high-pass filter (HPF), modified IHS (intensity, hue, saturation), principal component (PC) based resolution merging, projective resolution merging and hyperspherical color space (HCS). The images merged using the HCS technique were chosen for further use for both 2005 and 2010, as they appeared the most accurate of all.

Fig. 2

Brief methodology

Image classification

Both pixel-based and object-based classification algorithms were applied to the processed subset images. Nine LULC classes were identified: water bodies, built-up, dense vegetation, sparse vegetation, cropland, fallow land, open land, scrubland/forest and sediment.

Pixel-based classification

For the training dataset, spectral signatures from more than 3000 pixels were selected from the satellite image of each year. Thereafter, the training sets from spectrally similar pixels were merged. The same training dataset was used for all the pixel-based classification algorithms in the study.

All the pixel-based classification algorithms were run in R version 3.3.2 (R Development Core Team 2016), an open-source statistical computing environment. The add-on R packages used to build and run the classification algorithms were “rasclass” for ML (Wiesmann and Quinn 2011), “nnet” for NN (Venables and Ripley 2002), “kernlab” for SVM (Karatzoglou et al. 2004), “randomForest” for RF (Liaw and Wiener 2002) and “naivebayes” for NB (Majka 2018).

Object-based classification (OBC)

The OBC was carried out in ArcGIS 10.5 (ESRI 2016). First, image segmentation based on the mean shift approach was performed to create segments or features of interest. There is no common scale (Myint et al. 2011) or fixed criterion for estimating the best segmentation parameters (Ouyang et al. 2011; Duro et al. 2012). Researchers (Chen et al. 2006; Ouyang et al. 2011) identify the scale that delineates objects in the way that visually corresponds best to real-world objects and adopt it as the appropriate scale level for classification. Initially, segmentation was tried with different parameter values (Fig. 3). The spectral detail value was kept constant and different spatial detail values were tried in order to decide the segmentation parameters. Based on the visual attributes of the segmented images, the parameter values spectral detail = 20, spatial detail = 20 and minimum segment size = 5 pixels were found to be the most appropriate and precise. The features of the segmented image served as the underlying units for OBC. On average, each image was segmented into more than 200,000 image objects. Once segmentation was complete, training samples were collected from the segmented raster. Using the training samples and the ML classifier, a classifier file was generated, and OBC was then executed based on this file.

Fig. 3

Image segmentation parameters in OBC approach used in the study

Accuracy assessment

Accuracy assessment of thematic (LULC) maps is crucial since the reliability of remotely sensed LULC maps depends on their accuracy. In the present study, 542 points across the 9 LULC classes were selected in each year’s dataset by equalized random sampling. Accuracy was determined using (1) the confusion (or error) matrix and (2) McNemar’s test (Kavzoglu 2017). The confusion matrix provides three accuracy measures: overall accuracy, producer’s accuracy and user’s accuracy. It is based on a comparison between the reference image and the classified image (output). The columns of the matrix refer to the LULC classes of the reference image, whereas the rows refer to the LULC classes of the classified image. Each cell shows the number of pixels of a given reference class assigned to a given classified class, and the diagonal cells show the number of pixels classified correctly. The overall accuracy is the number of correctly classified pixels divided by the total number of pixels. It describes the classification accuracy of the entire image, whereas producer’s accuracy and user’s accuracy describe the accuracy of individual LULC classes. The producer’s accuracy of a class is the number of correctly classified pixels of that class divided by the total number of pixels of that class in the reference image. The user’s accuracy of a class is the number of correctly classified pixels of that class divided by the total number of pixels assigned to that class in the classified image.
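The three measures described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure used in the study: the function name `accuracy_measures` and the 3-class matrix are hypothetical, and the matrix follows the convention stated above (rows: classified image, columns: reference image).

```python
def accuracy_measures(matrix):
    """Return (overall, producer's, user's) accuracy from a square confusion matrix.

    Rows are classes of the classified image and columns are classes of the
    reference image. A class with no pixels gets None instead of an accuracy.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[k][k] for k in range(n))  # diagonal cells
    row_totals = [sum(matrix[k]) for k in range(n)]  # pixels per classified class
    col_totals = [sum(matrix[r][k] for r in range(n)) for k in range(n)]  # per reference class

    overall = correct / total
    producers = [matrix[k][k] / col_totals[k] if col_totals[k] else None
                 for k in range(n)]
    users = [matrix[k][k] / row_totals[k] if row_totals[k] else None
             for k in range(n)]
    return overall, producers, users


# Hypothetical 3-class confusion matrix (values invented for illustration).
example = [
    [45, 3, 2],
    [10, 25, 5],
    [5, 2, 53],
]
overall, producers, users = accuracy_measures(example)
print(f"OA = {overall:.2%}")
print("PA =", [f"{p:.2%}" for p in producers])
print("UA =", [f"{u:.2%}" for u in users])
```

Returning `None` for a class without pixels is a design choice of this sketch; it mirrors the situation where a map contains too few pixels of a class for PA or UA to be evaluated.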

McNemar’s test is a statistical test used to evaluate the statistical significance of differences in the performance of different classifiers (Dietterich 1998). The test is applied to a 2 × 2 contingency table whose cells give the number of samples classified correctly by both methods, incorrectly by both methods, and correctly by only one of the two methods (Kavzoglu 2017). The test statistic is given by Eq. (1)

$$\chi^{2} = \frac{\left( \left| a_{ij} - a_{ji} \right| - 1 \right)^{2}}{a_{ij} + a_{ji}}$$
(1)

where $a_{ij}$ is the number of pixels incorrectly classified by method i but correctly classified by method j, and $a_{ji}$ is the number of pixels incorrectly classified by method j but correctly classified by method i. The statistic $\chi^{2}$ follows a chi-square distribution with one degree of freedom. If the computed value exceeds the tabulated $\chi^{2}$ value, the two methods are said to perform differently, i.e. the difference in accuracy between methods i and j is statistically significant.
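As a worked illustration of Eq. (1), the following Python sketch computes the continuity-corrected statistic and a two-tailed p value. The function name `mcnemar_test` and the discordant counts are hypothetical and are not taken from the study's tables. The p value uses the identity that, for one degree of freedom, the upper-tail probability of the chi-square distribution equals erfc(sqrt(χ²/2)).

```python
import math


def mcnemar_test(a_ij, a_ji):
    """McNemar's chi-square with continuity correction (Eq. 1) and its p value.

    a_ij: samples misclassified by method i but correct under method j.
    a_ji: samples misclassified by method j but correct under method i.
    """
    discordant = a_ij + a_ji
    if discordant == 0:
        # The two methods never disagree, so there is no evidence of a difference.
        return 0.0, 1.0
    chi2 = (abs(a_ij - a_ji) - 1) ** 2 / discordant
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical discordant counts for two classifiers (invented for illustration).
chi2, p_value = mcnemar_test(60, 35)
# 3.841 is the tabulated chi-square critical value for df = 1 at the 5% level.
significant = chi2 > 3.841
print(f"chi-square = {chi2:.3f}, p = {p_value:.4f}, significant = {significant}")
```

Comparing the statistic with the tabulated value 3.841 and comparing the p value with 0.05 are equivalent decisions at the 5% level.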

Many researchers (Cohen 1960; Foody 2004; Rozenstein and Karnieli 2011; Duro et al. 2012) have pointed out that when the same validation samples are used to assess different algorithms, the assumption that the algorithms are evaluated on independent samples is violated. In such instances, statistical comparison using kappa is unjustifiable (Foody 2004; Duro et al. 2012; Rozenstein and Karnieli 2011). Hence, Agresti (2002) and Zar (2009) recommend McNemar’s test for comparing classification algorithms. It is a non-parametric statistical measure for assessing the accuracy of thematic maps (Yan et al. 2006; Dingle Robertson and King 2011; Rozenstein and Karnieli 2011; Whiteside et al. 2011; Duro et al. 2012).

McNemar’s test gives a p value and a chi-square value, which determine the statistical significance of the difference between two algorithms (Foody 2004; De Leeuw et al. 2006; Rozenstein and Karnieli 2011). The test is recommended because not every difference in accuracy between two algorithms is statistically significant. In this study, McNemar’s test was applied to the 27 LULC maps (nine algorithms for each of the three years) to identify statistically significant differences between the OBC approach and each pixel-based algorithm, and among the pixel-based algorithms.

Temporal analysis of LULC change

The LULC maps of 2005, 2010 and 2016 were compared to analyse the change in LULC over the specified period. The post-classification comparison technique was adopted as it is widely used and is considered to provide more accurate results than other techniques such as PCA and image differencing (Dingle Robertson and King 2011). Class-wise LULC area statistics were tabulated to analyse the nature and trend of land-use change shown by the different algorithms over time.
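The class-wise area tabulation behind a post-classification comparison can be sketched as follows. This is a minimal illustration, assuming each classified map is available as a flat sequence of class labels at the 5 m pixel size used in the study; the function names and the toy maps are hypothetical.

```python
from collections import Counter

PIXEL_SIZE_M = 5.0  # spatial resolution of the classified maps, in metres


def class_areas_km2(labels, pixel_size_m=PIXEL_SIZE_M):
    """Area covered by each class in km2, from per-pixel class labels."""
    pixel_area_km2 = (pixel_size_m ** 2) / 1_000_000  # m2 converted to km2
    counts = Counter(labels)
    return {cls: n * pixel_area_km2 for cls, n in counts.items()}


def area_change_km2(earlier, later):
    """Class-wise area difference (later minus earlier) between two dates."""
    classes = sorted(set(earlier) | set(later))
    return {cls: later.get(cls, 0.0) - earlier.get(cls, 0.0) for cls in classes}


# Toy maps for two dates (labels invented for illustration).
map_early = ["water"] * 40_000 + ["built-up"] * 160_000
map_late = ["water"] * 20_000 + ["built-up"] * 180_000

areas_early = class_areas_km2(map_early)
areas_late = class_areas_km2(map_late)
change = area_change_km2(areas_early, areas_late)
for cls, delta in change.items():
    print(f"{cls}: {areas_early[cls]:.2f} km2 to {areas_late[cls]:.2f} km2 ({delta:+.2f} km2)")
```

With real rasters the labels would come from the classified image arrays; the area arithmetic stays the same.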

Theory

A brief description of the algorithms used in pixel-based and object-based classification is given in Table 2.

Table 2 A brief description of the algorithms used in pixel-based and object-based classification

Results

The LULC maps classified using all the studied algorithms are shown in Figs. 4, 5 and 6. The accuracy of all the maps was assessed using the confusion matrix. The accuracy measures (overall accuracy (OA), producer’s accuracy (PA), user’s accuracy (UA) and kappa statistic) for all the years are given in Table 3, and the results of McNemar’s test for 2005, 2010 and 2016 are given in Tables 4, 5 and 6 respectively.

Fig. 4

Different classification algorithms performed on Delhi year 2005 dataset. a Standard False Colour Composite of Delhi satellite image, b ML, c NN, d SVM (linear), e SVM (polynomial), f SVM (RBF), g SVM (sigmoid), h RF, i NB, j OBC (ML)

Fig. 5

Different classification algorithms performed on Delhi year 2010 dataset. a Standard False Colour Composite of Delhi satellite image, b ML, c NN, d SVM (linear), e SVM (polynomial), f SVM (RBF), g SVM (sigmoid), h RF, i NB, j OBC (ML)

Fig. 6

Different classification algorithms performed on Delhi year 2016 dataset. a Standard False Colour Composite of Delhi satellite image, b ML, c NN, d SVM (linear), e SVM (polynomial), f SVM (RBF), g SVM (sigmoid), h RF, i NB, j OBC (ML)

Table 3 Accuracy assessment results of different classification algorithms for years 2005, 2010 and 2016. (PA: Producer accuracy; UA: User accuracy; Dense v.: dense vegetation; Sparse v.: sparse vegetation; Crop l.: cropland; Fallow l.: fallow land; Open l.: openland; Scrub l/f.: scrubland/forest)
Table 4 Results of McNemar’s test (p value, chi square) for year 2005 dataset (*denotes two-tailed p values; statistically significant values (p < 0.05) are in bold)
Table 5 Results of McNemar’s test (p value, chi square) for year 2010 dataset (*denotes two-tailed p values; statistically significant values (p < 0.05) are in bold)
Table 6 Results of McNemar’s test (p value, chi square) for year 2016 dataset (*denotes two-tailed p values; statistically significant values (p < 0.05) are in bold)

On average, the overall accuracy of the LULC maps is approximately 50%. This is far below the established standard that the accuracy of LULC maps should be at least 85% for the maps to be useful for planning and management (Anderson et al. 1976). However, the LULC maps in the present work are not intended for planning and management but for comparing the relative effectiveness of the different algorithms in classifying remotely sensed satellite images. Therefore, the outputs of the algorithms were evaluated as produced, without any post-classification processing (i.e. filtering or recoding) to increase the overall accuracy (Rozenstein and Karnieli 2011).

Accuracy assessment of LULC maps using confusion matrix

Overall accuracy (OA)

From Table 3, it is evident that among all the studied algorithms, RF achieved the highest OA (54.98% in 2005; 52.58% in 2010; and 56.83% in 2016) and naive Bayes the lowest (39.11% in 2005; 41.14% in 2010; and 35.42% in 2016). All four SVM kernels performed better than ML and NN in all three years. However, no trend is observed in the relative performance of the kernels across the three datasets.

In comparison to the pixel-based algorithms, the object-based classification approach performed poorly in 2005 (44.46% OA) and 2010 (43.91% OA); however, in 2016 its performance (54.98% OA) was very close to that of the best performing pixel-based algorithm, RF (56.83% OA). This suggests that object-based classification performs better with an original high-resolution dataset (i.e., LISS4 MX70 of 2016) than with resolution-merged datasets (i.e., LISS 3 merged with LISS 4 for 2010 and LISS 3 merged with IRS-1D for 2005), even though the resolution of the resolution-merged datasets is the same as that of the original dataset, i.e. 5 m. This pattern is not seen in any of the pixel-based algorithms.

Producer’s accuracy (PA) and User’s accuracy (UA)

Referring to PA and UA in Table 3, no algorithm has the highest PA and/or UA for all the LULC classes in any year. Taking > 85% accuracy as the criterion for individual classes, NN in 2005 and NB in 2016 have the highest PA for dense vegetation (87.1%) and water (90.32%) respectively. ML in 2005 (89.47%), ML in 2010 (95.45%), NN in 2005 (89.47%), RF in 2010 (95.83%), RF in 2016 (96.15%), NB in 2010 (88.46%) and OBC in 2005 (90.48%), in 2010 (91.67%) and in 2016 (100.00%) have the highest UA for water. RF in 2016 has the highest UA for cropland (85.17%). All the SVM kernels in 2010 have the highest UA for sediment (100.00%). This shows that the classification algorithms perform better with respect to UA than PA. Analysing the results class-wise, the highest PA in 2005 is for built-up (75.56%) classified using RF, in 2010 for water (68.43%) classified using SVM sigmoid, and in 2016 for water (80.65%) classified using RF. Similarly, the highest UA in 2005 is for water (90.48%) classified using OBC, in 2010 for sediment (100.00%) classified using all the SVM kernels, and in 2016 for water (100.00%) classified using OBC.

Statistical significance assessment of LULC maps using McNemar’s test

From Tables 4–6, the results of McNemar’s test at the 5% significance level reveal that in 2005, OBC differs significantly (p < 0.05) from many pixel-based algorithms (NN (p = 0), SVM linear (p = 0.015), SVM polynomial (p = 0.023), SVM RBF (p = 0.038), RF (p = 0) and NB (p = 0.046)). In 2010, OBC differs significantly from RF (p = 0.001), and in 2016 from ML (p = 0.033), NN (p = 0) and NB (p = 0).

McNemar’s test at the 5% significance level also reveals statistically significant differences (p < 0.05) between many pairs of pixel-based algorithms; however, no consistent pattern of significance among the algorithms is observed. Any pair of pixel-based classification algorithms can be compared for statistical significance using the p values in Tables 4, 5 and 6.

Temporal analysis of LULC change

Table 7 shows the temporal LULC change (in km2) derived from all the classification algorithms for all the years. As RF and OBC are found to be the best classification algorithms based on OA, only these two are discussed in detail here. The results for the rest of the algorithms can be seen in Table 7.

Table 7 Temporal LULC analysis (2005–2016) by different classification algorithms

The area covered by water decreased from 2005 to 2016 for RF (46.41 km2 to 20.38 km2) as well as for OBC (29.08 km2 to 19.79 km2). This apparent decline is an obvious error, as such a sharp decline in the area of water bodies in Delhi is not feasible. RF shows an overall decline in built-up (564.17 km2 in 2005 to 450.592 km2 in 2016), which is incorrect for a study area like Delhi that is constantly urbanising. OBC shows a realistic trend of an increase in built-up (374.98 km2 in 2005 to 564.93 km2 in 2016), though the mapped built-up area cannot be relied upon as OBC has low PA (45.31% in 2005; 53.57% in 2010; and 69.63% in 2016) and UA for built-up in all the years. For dense vegetation, both RF (24.84 km2 in 2005 to 67.62 km2 in 2016) and OBC (20.59 km2 in 2005 to 64.65 km2 in 2016) show an overall increase over the years, which is incorrect considering the land-use of Delhi. Delhi has only the central ridge forest as dense vegetation, and it has not increased by such a magnitude over the given period. For sparse vegetation, RF and OBC show opposite overall trends: RF shows an increase (111.37 km2 in 2005 to 247.80 km2 in 2016) and OBC shows an overall decrease (258.232 km2 in 2005 to 242.68 km2 in 2016). Similar opposite trends are observed between RF and OBC for cropland and fallow land. The trend of LULC change for open land shown by both RF and OBC is inaccurate, as open land in Delhi is likely to have been converted into either built-up or green spaces over the period. Hence, the decline in open land from 2005 to 2010 (49.30 km2 to 27.84 km2 for RF and 81.11 km2 to 51.79 km2 for OBC) is understandable and justifiable; however, the sudden increase in open land area in 2016 is an error. Similar erroneous trends are seen for the scrubland/forest class with RF and OBC. For sediment, RF and OBC show different trends. 
In RF, the area of sediment increased to 42.33 km2 in 2010 (from 12.10 km2 in 2005) and then declined to 10.43 km2 in 2016. This sudden increase of sediment in 2010 is an error and remains unexplained. OBC shows a constant decline in sediment (30.38 km2 in 2005 to 7.73 km2 in 2016), though the amount of change appears large considering that sediment is located only along the banks of the Yamuna River in Delhi.

Discussion

The results of the study imply that it is difficult to achieve high overall accuracy when classifying large urban areas in detail using 5 m resolution satellite imagery. This is consistent with the findings of Myint et al. (2011), who stated that high accuracy is difficult to attain in detailed mapping of large urban areas. Visual analysis of the LULC maps also reveals that the maps prepared using the pixel-based approach show the salt and pepper or mixed pixel effect, whereas the maps prepared using the object-based approach show a visually smoothened landscape that resembles the real landscape, as found by Duro et al. (2012). This difference arises because the heterogeneity of urban landscapes, with many features of different sizes in a small space, congests the spectral details of the urban features (Myint et al. 2006, 2011). Pixel-based classification, which considers only spectral information, therefore produces the salt and pepper effect. OBC, on the other hand, considers spatial as well as spectral information of the features (Benz et al. 2004; Walter 2004; Myint et al. 2011; Duro et al. 2012), identifies the objects more precisely and leads to more accurate classification (Kelly et al. 2004; Cleve et al. 2008; Ouyang et al. 2011). Thus, the study demonstrates that the OBC (ML) approach is preferable to the pixel-based classification approach for preparing LULC maps of urban areas using satellite images with original high (5 m) spatial resolution. Among the pixel-based algorithms, RF performs better than the others. Although the original and resolution-merged datasets have the same resolution (i.e. 5 m), the type of dataset affects the performance of OBC. This illustrates that besides complex landscapes and classification algorithms, the type of remotely sensed data is another factor that affects the accuracy of the prepared LULC maps (Manandhar et al. 2009). 
In our study, this happens because the resolution merging technique used, i.e. hyperspherical color space (HCS) (Padwick et al. 2010), merges the edges of features with the shadow regions in the image and thereby leads to the disappearance of smaller edges (Dahiya et al. 2013; Ghosh and Joshi 2013). The merged image thus lacks spatial detail to some extent (Ghosh and Joshi 2013), and spatial detail is a significant attribute in OBC. This is why the resolution-merged datasets of 2005 and 2010 have shown lower accuracy with the object-based approach. In this study, HCS resolution merging was used because it generated resolution-merged datasets for 2005 and 2010 that appeared visually more accurate (Agrafiotis and Georgopoulos 2015) than those generated using Ehlers fusion, wavelet, HPF, modified IHS and subtractive resolution merging methods. However, considering the results and the fact that resolution-merging techniques affect the quality of the merged products (Wang et al. 2005; Ghosh and Joshi 2013), it is suggested that before classification, the accuracy of merged datasets prepared using different techniques be assessed by several measures and not only visually.

It has been observed that the time taken to select the object features for the OBC approach is almost equal to that taken to select the training data for pixel-based classification, provided the user has expertise in carrying out OBC. Otherwise, it can be very time consuming and labour intensive. Regarding the accuracy assessment of OBC, a few researchers (e.g. Cleve et al. 2008) believe that a procedure that can assess the shape and topology of the features should be adopted, because OBC takes into account the spatial topology, shape etc. of the classified features. In our study, a pixel-based accuracy assessment method is used to assess the performance of the pixel-based algorithms as well as the OBC approach, since a pixel is the smallest unit of an LULC map (Myint et al. 2011). The results reveal that unlike OA, the type of dataset (original or resolution merged) has no clear impact on the PA and UA of LULC classes for the different algorithms.

The high PA of NN in 2005 for dense vegetation (87.1%) and of NB in 2016 for water (90.32%) suggests that NN and NB are the most powerful algorithms for classifying the respective classes. The high UA of OBC in 2005 for water (90.48%), of RF and SVM in 2010 for water (95.83%) and sediment (100.00%) respectively, and of OBC and RF in 2016 for water (100.00%) and cropland (85.71%) suggests that these algorithms are the most reliable in classifying the respective classes as they are present on the earth’s surface. These results reveal that although RF and OBC performed as the best classifiers based on OA, class-wise neither of them has high (> 85%) PA for any of the LULC classes, or high UA for any class other than those mentioned above. The PA and UA statistics (Table 3) also show some shortcomings of the classification algorithms in a few of the LULC classes. None of the algorithms classified the sediment class accurately in the 2005 dataset, resulting in 0.00% PA and UA. The reason could be the small percentage area of sediment in the study area. Similarly, NB in 2016 did not classify open land accurately. NN did not classify cropland in 2005, or cropland and sediment in 2016. Visual inspection showed that the LULC maps in question do not contain enough pixels of the respective class for the accuracy of that class to be evaluated. Hence, no PA or UA is available for these classes.

In addition, all the LULC maps were used for temporal LULC analysis. The only aim was to analyse how well the different algorithms mapped the trends of the different LULC classes over the years. The nature and trend of LULC change were evaluated based on knowledge of the development that occurred in the study area over the period. Comparison among the algorithms in terms of LULC change, or quantification of LULC change, was not attempted as the overall accuracy of all the LULC maps was quite low. The results revealed that neither RF nor OBC showed satisfactory performance, although OBC mapped the LULC change trend correctly for the built-up class.

Conclusion

In the present study, a comparative evaluation of different classification algorithms and of the impact of different types of satellite images on classification has been performed using the confusion matrix and McNemar’s test. The results indicate that OBC differs significantly (p < 0.05) from one or more pixel-based algorithms in each of the three years (2005, 2010, 2016). Various pairs of pixel-based algorithms also differ significantly (p < 0.05) in the three years, although no consistent pattern has been observed. RF has the highest overall accuracy (54.98% in 2005; 52.58% in 2010; 56.83% in 2016) and is the best performing classification algorithm, whereas naive Bayes has the lowest overall accuracy (39.11% in 2005; 41.14% in 2010; 35.42% in 2016). OBC exhibits lower overall accuracy (44.46% in 2005; 43.91% in 2010; 54.98% in 2016) than the pixel-based algorithms. Moreover, visual investigation of the LULC maps reveals that despite lower accuracy, OBC-derived maps are visually smooth and contiguous in comparison to pixel-based maps, which show the salt and pepper effect. The assessment of different types of satellite data reveals that OBC performed markedly better with the original high-resolution dataset. The poorer performance of OBC with resolution-merged images could be attributed to the HCS resolution merging algorithm used in this study, which degrades the sharpness and spatial detail of the output to some extent, a property that is important in OBC. Hence, the study suggests that for preparing an LULC map of an urban area, the OBC approach is recommended with satellite images of original 5 m spatial resolution, whereas the pixel-based RF algorithm is recommended with resolution-merged images of 5 m spatial resolution. The findings of the study may be useful for future studies mapping urban land-use using higher resolution or resolution-merged images.