Introduction

Satellite remote sensing provides an unprecedented opportunity to map and monitor environmental processes and land cover change accurately through repeated data collection over vast areas of the Earth (Lam 2008; Rimal et al. 2019a). Among the most important and widely used remote sensing datasets are those of the Landsat satellites. The Landsat mission has continuously collected global imagery since 1972 and provides an observation of Earth roughly every two weeks at 30 m x 30 m resolution (Cohen and Goward 2004). The USGS’s Landsat open data policy of 2008 allowed researchers to mine these data freely and to calculate land cover change in ways that were not previously possible (Wulder et al. 2012). Landsat data products have also improved over time, giving researchers more bands of data, which allow for more accurate classification of different geological processes. Landsat 5-TM (8 bits radiometric) has seven bands, and Landsat 7-ETM+ (8 bits radiometric) has eight spectral bands with 30 m resolution. The latest generation, Landsat 8-OLI, has 11 bands (12 bits radiometric) and is regarded as the best option for the analysis of the Earth’s environment (Phiri and Morgenroth 2017; Zhu et al. 2015). Classifying Landsat imagery provides a cost-effective and accurate means to derive land cover maps that can be used for environmental management, urban planning, forestry, agriculture and many other sectors. To derive land cover maps from Landsat data, researchers must use image classification algorithms. Image classification is the most useful technique for deriving land cover information from satellite images, and common methods include pixel-based and object-based classification.

Pixel-based classification is the most commonly used method for classifying satellite imagery and uses numeric approaches to recognize patterns by pixel within an image (Steiner 1970). These classifiers can be grouped into two broad categories: parametric and nonparametric. Parametric classifiers are grounded in probability theory, as the classification assumes that image values are normally distributed (Lu and Weng 2007). The earliest computer-based classifications were mainly based on parametric approaches. Parametric techniques include the Ameba approach described by Bryant (1979); parallelepiped, minimum distance (MD) function, maximum likelihood (ML), artificial neural networks (ANN) and fuzzy classification (FC) described by Campbell (1996); ISODATA (Duda and Hart 1973); extraction and classification of homogeneous objects (ECHO) (Kettig 1975); layer classification (LC) (Jensen 1979); and contextual classification (Swain 1984). Advances in pattern recognition techniques led to the development of advanced nonparametric classifiers. These have proved more useful because they do not base classification on statistical parameters or on a normality assumption (Rodriguez-Galiano et al. 2012). Nonparametric classifiers include support vector machine (SVM), ANN and decision tree, of which SVM is the most commonly used (Srivastava et al. 2012). Details on the different classification approaches can be found in Phiri and Morgenroth (2017).

However, land cover maps derived from pixel-based classifications contain misclassifications due to spectral variation within land cover classes (Jawak et al. 2015) and/or mixed pixels caused by similar reflectance from two or more land cover types (Blaschke 2010). Therefore, an accuracy assessment is an essential and integral component of any image classification. It is the process that estimates the reliability of the land cover data derived for the analysis. It quantifies the quality of the data and makes that quality easier for map users to judge (Stehman and Czaplewski 1998).

Of the different parametric and nonparametric methods, SVM and ML are the two most widely used classifiers (Kavzoglu and Colkesen 2009). ML is mainly used with a supervised classification approach (Kavzoglu and Colkesen 2009; Rijal et al. 2018; Schneider 2012; Sharma et al. 2018) for mapping land cover and urban development (Rimal et al. 2017; Thapa and Murayama 2009). ML can achieve an overall accuracy of up to about 84.4% (Thapa and Murayama 2009). Meanwhile, SVM is a set of related learning algorithms (Otukei and Blaschke 2010) that has achieved an overall accuracy above 86.6% (Taati et al. 2014). It yields higher accuracy than traditional methods of classification (Kavzoglu and Colkesen 2009; Lee et al. 2017; Qian et al. 2014; Schneider 2012).

Because both classifiers are widely used, studies have compared them head to head (Rokni et al. 2014), but not with regard to understanding land cover change in Nepal. In the Nepalese context, land cover change in the rapidly urbanizing Kathmandu Valley has been the subject of multiple studies using the ML classifier (Rimal et al. 2017; Thapa and Murayama 2009) and the SVM classifier (Rimal et al. 2018). However, none of these studies compared the accuracy of the two approaches. This study compares the accuracy of ML and SVM to develop a more accurate urban change map of the Kathmandu Valley.

Method

Study Area

The study area is situated in Nepal’s province number 3 and includes Kathmandu, the country’s capital. The area comprises the administrative cities of the Kathmandu Valley (Kathmandu 11 cities, Lalitpur three cities and Bhaktapur four cities) and Kabhrepalanchowk district (six cities) (CKVKD). Geographically, it lies between 27°31′ and 27°49′ north latitude and 85°11′ and 85°43′ east longitude, with a total area of 1215.23 km² (Fig. 1). The total population of the four districts more than doubled between 1991 and 2011, from 1.43 to 2.90 million (CBS 2014). We selected CKVKD because it has witnessed significant urban expansion over the last three decades. Accurate estimation of land cover change in the valley could provide useful information on the trend and pattern of urban change and could support the design and planning of urban development.

Fig. 1 Location map of study area

The accelerated urban history of the Kathmandu Valley dates back to the late 1950s (Toffin 2010) and was led by migration and population growth (Rimal et al. 2018; Thapa and Murayama 2010). The rapid pace of urban expansion in the region has resulted in a significant transformation in land use (Rimal et al. 2017; Thapa and Murayama 2009). Major transformations include the increase in urban/built-up area and the sharp decline of cultivated land. Urban/built-up area occupied 2.94% of the total area in 1967 and extended to 14.19% in 2000 (Thapa and Murayama 2009). Similarly, urban coverage of the study area increased by 103.82 km² (from 40.53 km² in 1988 to 144.35 km² in 2016), an increase of 346.85%, whereas cultivated land declined by 122.91 km² (from 764.87 km² to 641.96 km²) from 1988 to 2016. According to the simulation analysis, urban/built-up area will extend to 200 km² and 238 km² by 2024 and 2032, while cultivated land will decline to 587 km² and 555 km² in the respective years (Rimal et al. 2018). Historical land cover change of the study area during 1988–2016 is shown in Table 1 and Fig. 2a–h, and the detailed land cover statistics are presented in Appendix Table 3. According to the analysis, the largest temporal transformation occurred from cultivated land to built-up area, while changes in other classes were negligible. Cultivated land declined from 52.91% in 1988 to 46.12% in 2015. Kabhrepalanchowk, the district adjoining Kathmandu, which includes important farmland and forest resources, is also confronting a similar trend of urbanization and land cover change.

Table 1 Historical land cover change between 1988 and 2016
Fig. 2 Land cover map classified based on SVM approach: a 1988; b 1992; c 1996; d 2000; e 2004; f 2008; g 2013; h 2016

Land Cover Extraction and Sample Point Collection

We developed six land cover categories: urban area, cultivated land, vegetation cover, sand area, water body and open field. A total of 1200 sample points (i.e., 200 for each class) were considered for training and were tested for the accuracy assessment. Training samples are frequently used for accuracy assessment (Jensen 1996; Sexton et al. 2013; Sloan and Pelletier 2012). Classification accuracy was assessed by labelling random sample points using field survey data, Landsat satellite images and Google Earth high-resolution satellite images. Similarly, topographical data developed by the Survey Department of Nepal (GoN 1995), at a scale of 1:2500, were used as references. Overall accuracy (OA), user’s accuracy (UA) and producer’s accuracy (PA) were computed and tested.
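As an illustration of how OA, UA and PA relate to one another, the following minimal sketch computes the three measures from a confusion matrix. It is not the workflow used in this study. The matrix layout (rows are classified labels, columns are reference labels), the function name `accuracy_metrics` and the three-class toy counts are assumptions made only for this example.

```python
def accuracy_metrics(confusion):
    """Return (OA, UA per class, PA per class) for a square confusion matrix.

    Assumed layout: confusion[i][j] is the number of sample points that the
    map assigns to class i (row) and the reference data assign to class j
    (column). Every row and column is assumed to contain at least one sample.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n))
    row_totals = [sum(confusion[k]) for k in range(n)]
    col_totals = [sum(confusion[i][k] for i in range(n)) for k in range(n)]

    overall = correct / total                                         # OA
    users = [confusion[k][k] / row_totals[k] for k in range(n)]       # UA
    producers = [confusion[k][k] / col_totals[k] for k in range(n)]   # PA
    return overall, users, producers


# Hypothetical three-class counts, not data from this study.
toy = [
    [48, 2, 0],
    [6, 40, 4],
    [1, 3, 46],
]
oa, ua, pa = accuracy_metrics(toy)
print(f"OA = {oa:.4f}")
print("UA =", [round(v, 4) for v in ua])
print("PA =", [round(v, 4) for v in pa])
```

UA divides the correctly labelled points by the row total (everything the map labels as that class), whereas PA divides by the column total (everything the reference data label as that class), so the two can differ for the same class.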

Methods and Materials

A total of eight L1T (terrain-corrected) scenes with minimal cloud cover from Landsat 5-TM, 7-ETM+ and 8-OLI, covering 1988 to 2016, were collected from the United States Geological Survey (USGS) website https://earthexplorer.usgs.gov (Table 2). Images were selected from the months of February, April, October and November because of full to partial cloud coverage in the remaining months. Image processing was conducted in the ENVI environment. The Fast Line-of-sight Atmospheric Analysis of Spectral Hypercubes (FLAASH) (Ibrahim Mahmoud et al. 2016) atmospheric correction model was applied, and a positional root-mean-square (RMS) rectification error of no more than 15 m (0.5 pixel) was accepted. All available scenes were then stacked, and land cover was extracted. The details of the Landsat images used for this study are provided in Table 2.
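The acceptance threshold above corresponds to half of a 30 m Landsat pixel. The following minimal sketch shows how a positional RMS error could be computed from control-point residuals and compared with that threshold. The function name `positional_rmse` and the residual values are hypothetical and are not taken from this study.

```python
import math

PIXEL_SIZE_M = 30.0      # Landsat pixel size stated in the text
MAX_RMSE_PIXELS = 0.5    # acceptance threshold stated in the text


def positional_rmse(residuals):
    """Root-mean-square positional error for (dx, dy) residuals in metres."""
    squared = [dx * dx + dy * dy for dx, dy in residuals]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical control-point residuals in metres (illustration only).
residuals_m = [(6.0, -8.0), (-9.0, 12.0), (3.0, 4.0)]
rmse_m = positional_rmse(residuals_m)
accepted = rmse_m <= MAX_RMSE_PIXELS * PIXEL_SIZE_M
print(f"RMSE = {rmse_m:.2f} m, accepted = {accepted}")
```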

Table 2 Satellite images used in this study

TM is Thematic Mapper, ETM+ is Enhanced Thematic Mapper Plus, and OLI is Operational Land Imager.

The maximum likelihood classification calculates the following discriminant function for each pixel, as described in the help section of ENVI version 5.3.

$$g_{i} (x) = \ln p(\omega_{i} ) - \frac{1}{2}\ln \left| {\varSigma_{i} } \right| - \frac{1}{2}(x - m_{i} )^{T} \varSigma_{i}^{ - 1} (x - m_{i} )$$
(1)

where i = class; x = n-dimensional data (where n is the number of bands); p(ωi) = probability that class ωi occurs in the image, assumed to be the same for all classes; |Σi| = determinant of the covariance matrix of the data in class ωi; Σi⁻¹ = its inverse matrix; and mi = mean vector of class ωi.
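To make the terms of Eq. (1) concrete, the following minimal sketch evaluates the discriminant for a single pixel and assigns the class with the largest value. It is an illustration under stated assumptions and not ENVI's implementation. The helper names (`solve_and_logdet`, `ml_discriminant`, `classify`), the two-band class statistics and the pixel values are hypothetical.

```python
import math


def solve_and_logdet(cov, rhs):
    """Solve cov * y = rhs by Gaussian elimination and return (y, ln|cov|)."""
    n = len(cov)
    a = [list(cov[r]) + [rhs[r]] for r in range(n)]  # augmented copy
    log_det = 0.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        a[col], a[pivot] = a[pivot], a[col]
        log_det += math.log(abs(a[col][col]))
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    y = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(a[r][c] * y[c] for c in range(r + 1, n))
        y[r] = (a[r][n] - s) / a[r][r]
    return y, log_det


def ml_discriminant(x, mean, cov, prior):
    """g_i(x) = ln p(w_i) - 1/2 ln|S_i| - 1/2 (x - m_i)^T S_i^-1 (x - m_i)."""
    diff = [xv - mv for xv, mv in zip(x, mean)]
    y, log_det = solve_and_logdet(cov, diff)          # y = S_i^-1 (x - m_i)
    mahalanobis_sq = sum(d * yv for d, yv in zip(diff, y))
    return math.log(prior) - 0.5 * log_det - 0.5 * mahalanobis_sq


def classify(x, classes):
    """classes maps a name to (mean, cov, prior); return the best class name."""
    return max(classes, key=lambda name: ml_discriminant(x, *classes[name]))


# Hypothetical two-band class statistics with equal priors (illustration only).
classes = {
    "urban": ([0.30, 0.25], [[0.01, 0.0], [0.0, 0.01]], 0.5),
    "vegetation": ([0.05, 0.45], [[0.01, 0.0], [0.0, 0.01]], 0.5),
}
label = classify([0.28, 0.27], classes)
print(label)
```

Solving the linear system instead of explicitly inverting the covariance matrix yields the same quadratic term and also gives the log-determinant from the pivots.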

The SVM algorithm finds a hyperplane that separates the dataset into a pre-defined number of categories (Mountrakis et al. 2011). SVM approaches generally use one of four kernel functions: linear, polynomial, radial basis function (RBF) and sigmoid (Kavzoglu and Colkesen 2009; Lee et al. 2017). RBF kernels are more powerful than the others (linear, polynomial, sigmoid) and are used to achieve better results (Rimal et al. 2019b).

The classification equations of each kernel are described in the help section of ENVI version 5.3. The kernel functions used in SVM are as follows:

$$\begin{aligned} &(\text{i})\quad {\text{Linear:}}\;K\left( {x_{i} ,x_{j} } \right) = x_{i}^{T} x_{j} \\ &(\text{ii})\quad {\text{Polynomial:}}\;K\left( {x_{i} ,x_{j} } \right) = (g\,x_{i}^{T} x_{j} + r)^{d} ,\quad g > 0 \\ &(\text{iii})\quad {\text{Radial}}\;{\text{basis}}\;{\text{function:}}\;K\left( {x_{i} ,x_{j} } \right) = e^{ - g\left\| {x_{i} - x_{j} } \right\|^{2} } ,\quad g > 0 \\ &(\text{iv})\quad {\text{Sigmoid:}}\;K\left( {x_{i} ,x_{j} } \right) = \tanh (g\,x_{i}^{T} x_{j} + r) \end{aligned}$$
(2)

where g (the gamma term), d (the polynomial degree) and r (the bias term) are user-controlled parameters of the kernel functions.
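The four kernels in Eq. (2) can be written directly as short functions. The following is a minimal sketch for illustration and not ENVI's code. The function names and the example vectors are hypothetical, while the parameter names g, r and d follow Eq. (2).

```python
import math


def dot(a, b):
    """Inner product x_i^T x_j of two equal-length vectors."""
    return sum(av * bv for av, bv in zip(a, b))


def linear_kernel(xi, xj):
    return dot(xi, xj)


def polynomial_kernel(xi, xj, g, r, d):
    return (g * dot(xi, xj) + r) ** d


def rbf_kernel(xi, xj, g):
    squared_distance = sum((av - bv) ** 2 for av, bv in zip(xi, xj))
    return math.exp(-g * squared_distance)


def sigmoid_kernel(xi, xj, g, r):
    return math.tanh(g * dot(xi, xj) + r)


# Hypothetical two-band pixel vectors (illustration only).
a, b = [1.0, 2.0], [3.0, 4.0]
print(linear_kernel(a, b))
print(polynomial_kernel(a, b, g=0.5, r=1.0, d=2))
print(rbf_kernel(a, b, g=0.1))
print(sigmoid_kernel(a, b, g=0.01, r=0.0))
```

The RBF kernel depends only on the distance between the two vectors, so it equals 1 for identical pixels and decays towards 0 as they move apart, with g controlling how fast it decays.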

Results

Comparison of SVM and ML Accuracies

In this study, the overall land use/land cover (LULC) classification accuracies achieved using the SVM classifier were 88.75% (1988), 90.83% (1992), 90.33% (1996), 89.67% (2000), 91.92% (2004), 88.92% (2008), 90.92% (2013) and 90.25% (2016). The overall classification accuracies of the alternative ML classifier were 85.83% (1988), 88.08% (1992), 87.92% (1996), 86.83% (2000), 88.58% (2004), 85.50% (2008), 87.00% (2013) and 88.92% (2016).

The SVM classifier obtained a higher OA than the ML classifier in every classification year (Fig. 3). SVM obtained a maximum accuracy of 91.92% and a minimum of 88.92%, while the ML classifier ranged from a minimum of 85.50% in 2008 to a maximum of 88.58% in 2004. The mean overall accuracy of SVM is 90.40 (± 0.91)% and that of ML is 87.54 (± 1.39)%. The difference in OA between the two classifiers shows that SVM is about 2.9 percentage points more accurate than ML in determining land cover types.

Fig. 3 Overall classification accuracy of SVM and ML in different classification years

The SVM classifier identified all the classes more accurately than the ML classifier (Figs. 4, 5). For instance, the highest SVM UA for urban area (91.5%) occurred in 2013, while the ML UA for that year was lower (88.5%). Similarly, the highest SVM UAs for cultivated land, vegetation cover, sand area, water body and open field were 91%, 90.5%, 91.5%, 93% and 94.5% in 2004, 1992, 2016, 1996 and 2008, respectively. By contrast, the ML UAs for the respective classes in the same years were 86.5%, 85%, 85%, 90% and 88.5%.

Fig. 4 User’s accuracy assessment

Fig. 5 Producer’s accuracy assessment

The producer’s accuracy (PA) of the SVM classifier was also generally higher than that of the ML classifier. The highest SVM PA for urban/built-up area was 91.6% in 2004, compared with 86.2% for ML. For SVM, the PA of cultivated land was 83.4% in 2016, the PA of vegetation cover was highest (90.24%) in 2016, and the PA of water body was also notable in 2016 (98.39%), whereas the PA of SVM was consistently dominant in 1988, 2004 and 2008. For ML, by contrast, the PA of cultivated land was 81.9% in 2016 and that of vegetation cover was 86.47% in 2016. The ML PA of sand area was 86.67% in 2013, and that of water body was 94.18% in 2016. The ML PA of open field in 1988, 2004 and 2008 was 97.4%, 97.77% and 94.65%, respectively. The highest UA and PA from the SVM classifier were observed mostly in open field (Figs. 4 and 5), the lowest SVM UA was observed in urban/built-up area (80% in 1988), and the lowest SVM PA was observed in cultivated land (75.56% in 1988) (Appendix Table 4).

Discussion

SVM and ML are both well-recognized algorithms for land cover classification of any area (Bray and Han 2004; Srivastava et al. 2012). ML is the classical parametric classifier, which assumes a multivariate normal distribution of the data (Kavzoglu and Colkesen 2009). SVM, in particular, produces accurate and improved land cover classification because of its nonparametric nature (Vapnik 1971). SVM reduces the classification error on unseen data while allowing a controlled level of misclassification. SVM is popular in land cover classification because it achieves higher accuracy than ML in identifying urban and other land cover types (Bray and Han 2004; Huang et al. 2002; Kavzoglu and Colkesen 2009; Schneider 2012). However, Scholz et al. (1979), Hixson et al. (1980) and Campbell (1981) argued that the selection of sample points (training data) is more important than the choice of classification algorithm for achieving higher classification accuracy.

Accuracy assessment is a complex and essential step in land cover classification and mapping (Campbell 1996). It refers to the process typically conducted to indicate the correctness of a map or classification (Foody 2002). Accuracy assessment is undertaken to measure map quality, evaluate various classification algorithms and identify errors. Assessment and validation of the land cover map provide measures of data quality, including the overall accuracy, user’s accuracy and producer’s accuracy. In the assessment, high accuracy means that the bias of the land cover classification is low. Producer’s accuracy indicates how well a given area can be classified, and user’s accuracy indicates how reliably a classified pixel in the image matches the category on the ground (Congalton 1991). Accuracy assessment is a fundamental yet challenging task in thematic mapping (Foody 2002).

Conclusions

Higher user’s and producer’s accuracies were obtained from the SVM classifier than from the ML classifier. SVM was found to be effective in land cover classification, particularly for urban/built-up area. Its higher accuracies were attributed to more distinct spectral signatures; however, the distinct signatures of open field also yielded higher accuracies in the ML approach. Of the six land cover classes, the highest user’s and producer’s accuracies were observed in open field, whereas the lowest user’s accuracy was observed in urban/built-up area and the lowest producer’s accuracy in cultivated land. For urban/built-up area, SVM obtained an accuracy above 86% at each time stamp, which is considered highly reliable, and this accuracy was higher than that of ML. Based on this evidence, we recommend SVM as a suitable option for precise classification of land cover, particularly urban/built-up area.