Introduction

There are two general approaches to dataset classification: supervised and unsupervised classification (Shi 2014). One of the major, cost-intensive problems in supervised classification is acquiring training data. For many cases, therefore, unsupervised classification (also referred to as clustering) represents an alternative that solves the problem to some extent (Han et al. 2012). Clustering algorithms, which require no a priori training data, are used to identify different clusters within a dataset; some of them work with a given number of clusters, while others identify the clusters with no prior knowledge of their number (Mann and Kaur 2013).

Cluster analysis, or simply clustering, refers to the process of dividing a set of data or observations into subsets (clusters) in such a way that the members of each subset share similar properties with one another, while being significantly different from the members of any other cluster (Han et al. 2012; Shi 2014). In other words, clustering aims to allocate a set of data into different clusters so that intra-cluster similarity is maximized while inter-cluster similarity is minimized (Karaboga and Ozturk 2011). Satellite images represent spatial datasets that have attracted a great deal of attention from researchers. High-spatial-resolution satellite images are a potential source of information for better identification of different clusters (Niazmardi et al. 2012). However, since such images show a high level of detail, existing algorithms face a degree of uncertainty that influences the clustering process.

Numerous algorithms have been proposed to solve the clustering problem. Among them, K-Means is one of the most common and simplest (Rosenberger and Chehdi 2000). Examples of other clustering algorithms include K-Medoids, FCM (Fuzzy C-Means), and KFCM (Kernel-based Fuzzy C-Means). Various studies have built on these methods to improve the clustering process.

Santos and Pedrini (2016) combined the K-Means algorithm with an entropy technique to obtain enhanced results. Among other studies where the K-Means algorithm was used for clustering, one can refer to Jia et al. (2016), Rozanda et al. (2015), and Liao and Compsc (2003).

Yang and Huang (2007) compared the FCM algorithm against the K-Means algorithm and showed that FCM can end up with a lower cost function; they further added a term to the cost function to improve the FCM algorithm. Among other studies where the FCM algorithm was employed for clustering, one can refer to Kannan et al. (2011), Bilgin et al. (2008), Bidhendi et al. (2007), and Ooi and Lim (2006).

Zhang and Chen (2003b) compared the FCM and KFCM algorithms and showed that KFCM outperforms FCM when it comes to cluster identification in high-spatial-resolution images. Among other studies where the KFCM algorithm was utilized for clustering, one can refer to Tsai et al. (2012), Das and Sil (2010), Yang and Tsai (2008), and Zhang and Chen (2003a).

Despite their partial superiority over other algorithms, each of the above-mentioned algorithms suffers from a general problem: being highly dependent on an initial guess, they may return local optima when the initial cluster centers are assigned randomly (Karaboga and Ozturk 2011). In recent years, in response to this concern, combinations of these algorithms with optimization algorithms have been proposed (Chen et al. 2014; Esmin and Matwin 2012; Goel et al. 2011; Youssef 2011).

Zhaoxia (2011) succeeded in improving clustering results by combining the genetic optimization algorithm with the K-Means algorithm. Xu and Xiao (2012), Yu and JinZhi (2010), and Ke et al. (2009) are other examples of studies wherein the genetic algorithm was used to improve the results of the above-mentioned clustering algorithms.

Among the studies where the artificial bee colony (ABC) algorithm was incorporated into FCM clustering, one can refer to Karaboga and Ozturk (2010), who showed that FCM combined with the ABC algorithm performs better than FCM combined with other optimization algorithms. Sathishkumar et al. (2013) and Karaboga and Ozturk (2011) further used a combination of the ABC and FCM algorithms to enhance clustering results.

Zhao and Zhang (2011) used a combination of KFCM and ABC to cluster images. Niazmardi et al. (2012) used KFCM combined with the particle swarm optimization (PSO) algorithm to cluster hyperspectral images, demonstrating the superiority of the combined approach over the K-Means and FCM algorithms.

Each of the studies cited in the two groups above has provided a separate combination to improve clustering; the present research, in contrast, proposes combinations of the K-Means, K-Medoids, FCM, and KFCM methods with the ABC optimization algorithm, introduces a hybrid cost function into these combinations, and investigates the results. In the cited studies, clustering uncertainty and quality are not accounted for simultaneously; in the present research, not only are both investigated, but the best hybrid algorithm for clustering high-spatial-resolution satellite images is also selected. Moreover, the proposed solution not only makes it possible to identify the optimum number of clusters within the images, but is also able to identify non-convex clusters.

In the remainder of this paper, the “Hybrid Cost Function (HCF)” section introduces a hybrid cost function (HCF); the “Hybrid Algorithms” section then addresses the local-optimum problem by combining the proposed HCF with the K-Means, K-Medoids, FCM, KFCM, and ABC methods. The “Methods for Results Assessment” section describes the indices used to evaluate the performance of these methods in clustering high-spatial-resolution satellite images. The final section compares the clustering uncertainty through the Shannon entropy technique and the clustering quality through indicators such as the Kappa coefficient and the compactness and separation criteria; it further provides a discussion of the results.

Hybrid Cost Function (HCF)

For most of the clustering algorithms used for raster data (e.g. satellite images), cluster calculation is based on the minimization of a cost function presented in Eq. (1) (Chunfei and Zhiyi 2013; Ke et al. 2009; Yu and JinZhi 2010).

$$ \mathrm{CF} = \sum_{i=1}^{k} \sum_{p \in C_i} d\left(z_p, m_i\right) $$
(1)

where \( k \) is the number of clusters, \( m_i \) is the center of the ith cluster (\( C_i \)), the \( z_p \) are the image pixels, and \( d(z_p, m_i) \) is the distance between the center of a cluster and the pixels assigned to that cluster (Han et al. 2012). It is worth noting that Eq. (1) uses only the distance between the center of a cluster and the members assigned to it; no other parameters that could improve the clustering are considered.
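As a concrete illustration, the following is a minimal Python sketch of Eq. (1); the names `pixels` (an n × \( N_b \) array of pixel vectors), `labels`, and `centers` are illustrative and not part of the original formulation.

```python
import numpy as np

def basic_cost(pixels: np.ndarray, labels: np.ndarray, centers: np.ndarray) -> float:
    """Eq. (1): sum over clusters of the Euclidean distances from each
    assigned pixel to its cluster center."""
    cost = 0.0
    for i, m_i in enumerate(centers):
        members = pixels[labels == i]          # pixels assigned to cluster i
        cost += np.linalg.norm(members - m_i, axis=1).sum()
    return cost
```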

The cost function used in this study, shown in Eq. (2), is defined differently from the common functions: it is the sum of three terms (Omran et al. 2002; Salman et al. 2005).

$$ f(x, z) = w_1 \bar{d}_{\max}(x, z) + w_2 \left( z_{\max} - d_{\min}(x) \right) + w_3 J_e $$
(2)

where \( w_1 \), \( w_2 \), and \( w_3 \) are the weights of the three terms in the total cost function. These weights are user-defined; when the terms are of the same significance, equal weights can be used. The weights \( w_1 \), \( w_2 \), and \( w_3 \) are non-negative and sum to 1. \( x \) is the vector of the cluster centers, i.e., \( x = (m_1, \ldots, m_j, \ldots, m_{N_c}) \), in which \( m_j \) is the center of the jth cluster and \( N_c \) is the number of clusters. \( z \) is a matrix containing the members of each cluster. In the first term of Eq. (2), the mean distance from the center of each cluster to its members is calculated, using the distance of Eq. (4); the longest such mean distance is taken as \( \bar{d}_{\max} \). Equation (3) shows how this value is calculated.

$$ \bar{d}_{\max}(x, z) = \max_{j = 1, \ldots, N_c} \left\{ \sum_{z_p \in C_j} d\left(z_p, m_j\right) / \left| C_j \right| \right\} $$
(3)

where \( |C_j| \) is the cardinality of the jth cluster. The distance from each cluster center to its members is calculated through Eq. (4).

$$ d\left(z_p, m_j\right) = \sqrt{\sum_{k=1}^{N_b} \left(z_{pk} - m_{jk}\right)^2} $$
(4)

where \( N_b \) is the number of image bands.

It is clear that the value of the first term in Eq. (2) should be minimized to increase the clustering quality (Omran et al. 2002). In the second term of Eq. (2), the distances between each pair of cluster centers (i.e., \( d(m_i, m_j) \)) are calculated, so as to find the smallest such distance (Eq. 5).

$$ d_{\min}(x) = \min_{\forall i, j,\; i \ne j} \left\{ d\left(m_i, m_j\right) \right\} $$
(5)

To identify the main clusters, the distance between cluster centers should be maximized; since the overall cost function is minimized, this distance is therefore subtracted from the largest distance existing within the dataset, \( z_{\max} \), given in Eq. (6). Consequently, the second term of Eq. (2) should also be minimized.

$$ z_{\max} = \sqrt{N_b}\, d_m $$
(6)

where \( d_m \) is the maximum distance within the dataset.

In the third term of Eq. (2), a general measure of clustering quality is calculated as the average, over all clusters, of the mean center-to-member distance, according to Eq. (7).

$$ J_e = \frac{\sum_{j=1}^{N_c} \left[ \sum_{z_p \in C_j} d\left(z_p, m_j\right) \right] / \left| C_j \right|}{N_c} $$
(7)

Equation (1) considers only the distance between each cluster center and the members of that cluster. In contrast, in order to improve clustering performance, Eq. (2) combines three terms that account not only for the center-to-member distances, but also for the distance between the centers of different clusters and the average of the mean distances within each cluster.
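Putting Eqs. (2)–(7) together, the HCF can be sketched as follows. This is a minimal Python sketch under stated assumptions: every cluster is non-empty, and the maximum dataset distance \( d_m \) of Eq. (6) is approximated by the largest per-band value range; all function and variable names are illustrative.

```python
import numpy as np

def hybrid_cost(pixels, labels, centers, w=(1/3, 1/3, 1/3)):
    """A sketch of the HCF of Eq. (2); assumes every cluster is non-empty."""
    w1, w2, w3 = w                             # non-negative weights summing to 1
    n_c, n_b = centers.shape
    # Mean center-to-member distance of each cluster (inner term of Eqs. 3, 7).
    mean_d = np.array([
        np.linalg.norm(pixels[labels == j] - centers[j], axis=1).mean()
        for j in range(n_c)
    ])
    d_max_bar = mean_d.max()                   # Eq. (3): largest cluster mean
    j_e = mean_d.sum() / n_c                   # Eq. (7): average over all clusters
    # Eq. (5): smallest distance between any two cluster centers.
    d_min = min(np.linalg.norm(centers[i] - centers[j])
                for i in range(n_c) for j in range(i + 1, n_c))
    # Eq. (6): z_max = sqrt(N_b) * d_m; d_m (the maximum distance within the
    # dataset) is approximated here by the largest per-band value range.
    d_m = (pixels.max(axis=0) - pixels.min(axis=0)).max()
    z_max = np.sqrt(n_b) * d_m
    return w1 * d_max_bar + w2 * (z_max - d_min) + w3 * j_e
```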

Hybrid Algorithms

Because the K-Means (K), K-Medoids (KM), Fuzzy C-Means (FCM), and Kernel-based Fuzzy C-Means (KFCM) algorithms depend largely on their starting point (primarily random initial values for the cluster centers), they may yield local optima (Karaboga and Ozturk 2011). To address this critical issue, combining clustering methods with optimization algorithms has been generally recommended in recent years. In the present study, three components (the HCF, the clustering algorithms, and ABC) were combined to obtain a clustering approach of higher quality and lower uncertainty for high-spatial-resolution satellite images.

Specifically, the most popular clustering methods are combined with the ABC optimization algorithm and the hybrid cost function, and the efficiency of each combination is investigated from three perspectives: the clustering results are evaluated via the Kappa coefficient, the compactness and separation criteria, and the entropy technique, so as to select the most efficient method for clustering high-spatial-resolution satellite images. Figure 1 shows a layout of the process.

Fig. 1 General process of the application of hybrid algorithms and their evaluation

ABC-FCM Hybrid Algorithm

First introduced by Dunn (1973) and later improved by Bezdek (1980), the FCM clustering algorithm is one of the most widely used algorithms in the field (Xiaojun et al. 2012). In contrast to the K-Means method, in which each pixel belongs to only one cluster, this algorithm assigns to each pixel a set of membership degrees indicating the pixel's membership in each of the different clusters. The FCM algorithm iteratively partitions the pixels into \( N_c \) clusters, which are obtained by minimizing Eq. (8) (Ouadfel et al. 2012).

$$ J = \sum_{k=1}^{n} \sum_{i=1}^{N_c} u_{ik}^{f}\, d^2\left(z_k, m_i\right) $$
(8)

where \( f \) is the fuzzification degree, which is usually set to 2 (Niazmardi et al. 2012; Pour and Homayouni 2016; Xiaojun et al. 2012), \( u_{ik} \) is the degree to which pixel \( k \) belongs to the ith cluster, obtained through Eq. (9), and \( m_i \) is the center of the ith cluster, calculated by Eq. (10).

$$ u_{ik} = \frac{1}{\sum_{j=1}^{N_c} \left( \frac{d_{ik}}{d_{jk}} \right)^{2/(f-1)}} $$
(9)
$$ m_i = \frac{\sum_{k=1}^{n} u_{ik}^{f}\, z_k}{\sum_{k=1}^{n} u_{ik}^{f}} $$
(10)

As this algorithm depends heavily on the initial, randomly selected cluster centers, it may converge to local optima (Karaboga and Ozturk 2011). To address this issue, the algorithm should be combined with a meta-heuristic such as the ABC algorithm.
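For reference, a minimal sketch of the standard FCM iteration of Eqs. (8)–(10) is shown below, with random initialization and Euclidean distances; the names are illustrative. The random start is precisely what makes the final result sensitive to initialization.

```python
import numpy as np

def fcm(pixels, n_c, f=2.0, iters=100, tol=1e-5, seed=None):
    """A sketch of standard FCM (Eqs. 8-10) with random initialization."""
    rng = np.random.default_rng(seed)
    u = rng.random((n_c, len(pixels)))
    u /= u.sum(axis=0)                         # memberships of each pixel sum to 1
    for _ in range(iters):
        uf = u ** f
        centers = (uf @ pixels) / uf.sum(axis=1, keepdims=True)   # Eq. (10)
        # d[i, k]: distance from pixel k to center i.
        d = np.linalg.norm(pixels[None, :, :] - centers[:, None, :], axis=2)
        inv = np.maximum(d, 1e-12) ** (-2.0 / (f - 1.0))
        u_new = inv / inv.sum(axis=0)          # Eq. (9)
        if np.abs(u_new - u).max() < tol:      # converged (possibly to a local optimum)
            return centers, u_new
        u = u_new
    return centers, u
```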

The ABC algorithm was first introduced by Karaboga (2005) for optimization purposes. The algorithm mimics the explorative behavior of bees searching for food (Karaboga 2005). Karaboga and Ozturk (2011) showed that this algorithm outperforms other optimization methods when it comes to clustering; hence, it was selected from among the existing meta-heuristics.

There are three types of bees in the ABC algorithm: scout bees, follower bees, and employee bees. All bees act as employee bees in the first iteration; thereafter, the bee with the smallest fitness-function value is considered the scout bee, and the other bees are assigned accordingly (Karaboga and Ozturk 2011). In the proposed method, the fitness function is obtained through Eq. (2), and the objective is to minimize its value. Once the scout bee is identified, new locations are assigned to the follower and employee bees through Eq. (11); the values differ for these two types of bees (Karaboga and Ozturk 2011).

$$ z_i = Z_i + \varphi_i \left( v_{i1} - v_{i2} \right) $$
(11)

where \( z_i \) is the new location of the bee, \( Z_i \) is the location of the scout bee, \( \varphi_i \) is a random value ranging from −1 to 1, and \( v_{i1} \) and \( v_{i2} \) are the limits of the search range. The location of an employee bee is obtained with \( v_{i1} = z_{\max} \) and \( v_{i2} = z_{\min} \). Figure 2 shows the bee structure in the proposed method. According to Fig. 2, each bee encodes, as unknowns, the cluster centers within each band, for which it calculates an optimum value via an iterative approach. As such, each bee hosts \( N_c \times N_b \) unknown parameters.

Fig. 2 Structure of bees in ABC-FCM
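A minimal sketch of the bee encoding and the position update of Eq. (11) follows; in the proposed method, the fitness of each candidate would then be evaluated with the HCF of Eq. (2). The uniform sampling of \( \varphi_i \) and all names are illustrative assumptions.

```python
import numpy as np

def move_bees(scout, v1, v2, n_bees, seed=None):
    """A sketch of Eq. (11): new bee locations generated around the scout
    (current best) bee; phi is drawn uniformly from [-1, 1] (assumption)."""
    rng = np.random.default_rng(seed)
    phi = rng.uniform(-1.0, 1.0, size=(n_bees,) + scout.shape)
    return scout + phi * (v1 - v2)             # z_i = Z_i + phi_i (v_i1 - v_i2)

# Each bee encodes one candidate solution: the N_c x N_b matrix of cluster
# centers (Fig. 2). For employee bees, v1 = z_max and v2 = z_min as in the text.
```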

ABC-KFCM Hybrid Algorithm

Developed from the FCM algorithm (Zhang and Chen 2003b), the KFCM algorithm projects the pixels into a kernel space before looking for similarities between points. Equation (12) shows the algorithm's objective (Niazmardi et al. 2012).

$$ J = \sum_{i=1}^{N_c} \sum_{k=1}^{n} u_{ik}^{f} \left\| \varphi(z_k) - \varphi(m_i) \right\|^2 $$
(12)

The distance in Eq. (12) is calculated using Eq. (13) (Niazmardi et al. 2012).

$$ \left\| \varphi(z_k) - \varphi(m_i) \right\|^2 = K(z_k, z_k) + K(m_i, m_i) - 2K(z_k, m_i) $$
(13)

The Gaussian Kernel Function, K, is obtained through Eq. (14) (Niazmardi et al. 2012).

$$ K(z_k, m_i) = \exp\left( -\left\| z_k - m_i \right\|^2 / \sigma^2 \right) $$
(14)

According to Eq. (14), \( K(z_k, z_k) = K(m_i, m_i) = 1 \); accordingly, Eq. (12) reduces to Eq. (15).

$$ J = 2 \sum_{i=1}^{N_c} \sum_{k=1}^{n} u_{ik}^{f} \left( 1 - K(z_k, m_i) \right) $$
(15)

In the FCM method, the cluster centers are the only unknown parameters, while the KFCM algorithm has an additional unknown parameter: the width of the Gaussian function (σ).
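The kernel computations of Eqs. (14) and (15) can be sketched as follows. For brevity, this sketch uses a single scalar σ, whereas the proposed method treats one σ per band as unknown (see below); all names are illustrative.

```python
import numpy as np

def gaussian_kernel(z, m, sigma):
    """Eq. (14): K(z, m) = exp(-||z - m||^2 / sigma^2)."""
    return np.exp(-np.sum((z - m) ** 2, axis=-1) / sigma ** 2)

def kfcm_objective(pixels, centers, u, f=2.0, sigma=1.0):
    """Eq. (15): J = 2 * sum_i sum_k u_ik^f * (1 - K(z_k, m_i)).
    A scalar sigma is used here for brevity (assumption); the proposed
    method optimizes one sigma per band."""
    k = gaussian_kernel(pixels[None, :, :], centers[:, None, :], sigma)
    return 2.0 * np.sum((u ** f) * (1.0 - k))
```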

The combination of the ABC algorithm with the KFCM method is identical to that described for FCM in the “ABC-FCM Hybrid Algorithm” section, but the bee structures are different. Figure 3 shows the bee structure in the ABC-KFCM algorithm. As in Fig. 2, each bee encodes the cluster centers within each band as unknowns; in the KFCM algorithm, however, the σ parameter of each band is also unknown. As such, each bee hosts \( N_c \times N_b \) unknown cluster centers (one unknown per cluster per band) along with \( N_b \) unknown σ parameters.

Fig. 3 Bee structures in ABC-KFCM

Methods for Results Assessment

One of the important issues in clustering is the assessment of results. This examination helps identify the most appropriate grouping for the data under study (Halkidi et al. 2001). While the literature offers various methods for assessing clustering results, this paper utilizes three: the Kappa coefficient, the entropy technique, and the compactness and separation criteria.

Kappa Coefficient

The Kappa coefficient is one of the most important statistical indices that can be extracted from the error matrix. In effect, this index removes the effects of chance from the model assessment. Equation (16) shows how the Kappa coefficient is calculated (Mather and Tso 2009).

$$ \hat{k} = \frac{n \sum_{i=1}^{k} n_{ii} - \sum_{i=1}^{k} n_{i+} n_{+i}}{n^2 - \sum_{i=1}^{k} n_{i+} n_{+i}} $$
(16)

where \( n \) is the total number of reference (ground-truth) pixels, \( k \) is the number of clusters, \( n_{ii} \) is the ith diagonal element of the error matrix (the correctly clustered pixels), and \( n_{i+} \) and \( n_{+i} \) are the sums of the components of the ith row and ith column, respectively. The Kappa coefficient is a value between 0 and 1, with 0 indicating a completely random clustering and 1 an ideal clustering. The Kappa coefficient is, in fact, a pessimistic estimate and yields an accuracy lower than the actual value (Mather and Tso 2009).
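A minimal sketch of Eq. (16), assuming the error matrix has been built with reference classes on the rows and clusters on the columns (names are illustrative):

```python
import numpy as np

def kappa(conf):
    """Eq. (16): Kappa from a k x k error matrix, where conf[i, j] counts
    the pixels of reference class i assigned to cluster j."""
    n = conf.sum()
    diag = np.trace(conf)                                 # sum of n_ii
    chance = (conf.sum(axis=1) * conf.sum(axis=0)).sum()  # sum of n_i+ * n_+i
    return (n * diag - chance) / (n ** 2 - chance)
```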

Entropy Technique

Entropy is a general concept in physics, the social sciences, and information theory. It expresses the uncertainty of the expected information content of a message. In other words, entropy in information theory is a measure of the uncertainty represented by a discrete probability distribution: the uncertainty is higher when the distribution is scattered rather than sharply peaked. Equation (17) shows how this parameter is calculated (Asagharpour 2010).

$$ E = -\sum_{i=1}^{n} \left[ P_i \times \ln\left(P_i\right) \right], \quad P_i = \frac{d_i}{\sum_{i=1}^{n} d_i} $$
(17)

where \( d_i \) is the distance from each cluster center to its components.
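A minimal sketch of Eq. (17); the input is the vector of center-to-member distances \( d_i \), and the names are illustrative:

```python
import numpy as np

def clustering_entropy(d):
    """Eq. (17): Shannon entropy of the normalized center-to-member
    distances d_i; lower entropy indicates lower clustering uncertainty."""
    p = np.asarray(d, dtype=float)
    p = p / p.sum()                            # P_i = d_i / sum_i d_i
    p = p[p > 0]                               # skip zeros, where Ln is undefined
    return -np.sum(p * np.log(p))
```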

Compactness and Separation Criteria

Berry and Linoff (1996) introduced two criteria called compactness and separation. Expressed in terms of standard deviation, the compactness criterion shows how close the members of each cluster are to one another; this value should be minimized (Halkidi et al. 2001). Separation shows how distinct the clusters are; Berry and Linoff (1996) introduced three approaches for measuring it (a sketch of both criteria follows the list below):

  1. Measuring the distance between the closest two components in clusters.
  2. Measuring the distance between the farthest two components in clusters.
  3. Measuring the distance between cluster centers.
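A minimal sketch of both criteria is given below. Since the text does not pin down which statistic summarizes the center-to-center distances, the mean pairwise distance is used here as one possible choice; all names are illustrative.

```python
import numpy as np

def compactness(pixels, labels, centers):
    """Compactness as the standard deviation of the center-to-member
    distances; lower values mean tighter clusters."""
    d = np.concatenate([
        np.linalg.norm(pixels[labels == j] - centers[j], axis=1)
        for j in range(len(centers))
    ])
    return d.std()

def separation(centers):
    """Separation via the third approach above: distances between cluster
    centers, summarized here by their mean (one possible choice)."""
    n_c = len(centers)
    d = [np.linalg.norm(centers[i] - centers[j])
         for i in range(n_c) for j in range(i + 1, n_c)]
    return np.mean(d)
```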

Implementation and Evaluation

To evaluate the efficiency of the proposed methods, they were implemented and tested on the clustering of WorldView-2 satellite images. Captured on October 9, 2011, the test images cover Lawrence, Kansas, USA. A spatial resolution of 0.5 m was obtained for these images following spectral sharpening. All eight spectral bands (four standard colors: red, blue, green, and near-IR; four new colors: red edge, coastal, yellow, and near-IR2) were used. Figure 4 shows the three case-study areas selected for cluster analysis using the proposed method. Images of different sizes were selected to make sure image size did not affect the assessment results: Region 1 is 750 × 833 pixels, Region 2 is 484 × 478 pixels, and Region 3 is 734 × 697 pixels.

Fig. 4 The study areas selected from the WorldView-2 satellite images

To evaluate the proposed combinations, six algorithms were used for image clustering: K-Means; modified K-Means (MK), i.e., K-Means using the hybrid cost function; modified K-Means combined with ABC (ABC-MK); modified K-Medoids combined with ABC (ABC-MKM); FCM combined with ABC (ABC-FCM); and KFCM combined with ABC (ABC-KFCM). K-Means and MK were compared in order to isolate the effect of the proposed cost function on clustering. The proposed cost function was also used in the other methods, and the ABC algorithm was combined with each of them to find the most optimized clustering method for high-resolution satellite images. Figure 5 shows the clustering of the study areas using the different algorithms.

Fig. 5 Clustering of the study regions using different algorithms

Figure 6 shows the value of the hybrid cost function over the iterations of each of the algorithms.

Fig. 6 Hybrid cost function value in different iterations and for the utilized algorithms

Table 1 reports the assessment results of the algorithms for the regions under study. The values obtained from K-Means and modified K-Means show that the hybrid cost function succeeded in optimizing the results. The entropy values in Table 1 indicate improved trustworthiness and reduced uncertainty for the modified K-Means algorithm compared to the K-Means algorithm: the uncertainty values in regions 1, 2, and 3 decreased from 51.2322, 49.7665, and 49.7651 to 41.6859, 33.9081, and 41.7540, respectively.

Table 1 Assessment results of the proposed algorithms

Standard deviation served as the measure of compactness for the clustering assessment. The standard deviations obtained for the regions under study show that the hybrid cost function decreased the standard deviation of the distances from each cluster member to the cluster center, indicating the closeness of cluster members to one another. According to Table 1, the standard deviations for the K-Means algorithm were 50.8733, 47.0922, and 43.6543 for regions 1, 2, and 3, respectively; for the modified K-Means algorithm, these values decreased significantly to 23.1087, 34.3867, and 33.5498, respectively. To estimate the separation level of the clusters under the different algorithms, the cluster distances were assessed. By increasing the inter-cluster distance, the hybrid cost function attempts to assign similar pixels to the same cluster while maximizing inter-cluster differences. The cluster distances for the K-Means algorithm were 432.5019, 239.6501, and 467.9342 for regions 1, 2, and 3, respectively; for the modified K-Means algorithm, they increased to 492.5623, 245.5923, and 481.0512, respectively.

Moreover, Table 1 shows that combining the ABC algorithm with the other algorithms contributed to optimized results. This can be explained by the fact that the ABC algorithm prevents the clustering algorithms from being trapped in local optima. Considering the values reported in Table 1, compared to modified K-Means, the ABC-MK algorithm decreased the entropy for region 1 by 5.3057; it further decreased the compactness by 2.1359 and increased the cluster distance by 31.3458. The improvements observed in the other regions are also reported in Table 1.

According to Table 1, the hybrid cost function values for regions 1, 2, and 3 decreased significantly by 77.33, 45.0008, and 30.3791, respectively, once the ABC algorithm was incorporated into the modified K-Means algorithm. Furthermore, Table 1 shows that the algorithms can be sorted in order of increasing uncertainty, compactness, and cost function, and decreasing separation (i.e., from best to worst), as ABC-KFCM, ABC-MK, ABC-FCM, ABC-MKM, modified K-Means, and K-Means. The results indicated that the hybrid algorithm consisting of the proposed cost function, the KFCM algorithm, and the ABC algorithm was the most optimized method. Therefore, this hybrid algorithm can be recommended for the clustering of high-resolution satellite images.

The only remaining issue was to assess the clustering against reality; this was accomplished using the Kappa coefficient, which compares the clustering results of high-spatial-resolution satellite images against ground truth. Figure 7 shows the Kappa coefficient values obtained for each algorithm in each study area. Once the hybrid cost function was introduced, the Kappa coefficient increased from 0.636, 0.718, and 0.628 to 0.753, 0.814, and 0.715 for regions 1, 2, and 3, respectively. Following the incorporation of the ABC algorithm into the modified K-Means algorithm, the Kappa coefficient exhibited enhancements of 0.118, 0.080, and 0.142 for regions 1, 2, and 3, respectively. The ABC-KFCM algorithm yielded Kappa coefficients of 0.942, 0.969, and 0.947 for regions 1, 2, and 3, respectively; these values were 0.306, 0.251, and 0.319 higher than those yielded by the K-Means algorithm.

Fig. 7 Values of Kappa coefficient for the provided algorithms

Conclusion

This study showed the effect of a hybrid cost function and intelligent hybrid methods on the uncertainty and quality of cluster analysis for high-spatial-resolution satellite images. The improvement in clustering quality and the increased trustworthiness of the cluster values were first assessed by introducing the hybrid cost function. Clustering quality was assessed via the Kappa coefficient, the compactness and separation levels, and the cost function values, while uncertainty was assessed through entropy measurements. These assessments were conducted on three study areas across a high-spatial-resolution satellite image. The results showed that, in all three regions, the uncertainty level decreased according to the entropy values. Furthermore, the clustering compactness and cost function decreased, while the separation of clusters and the Kappa coefficient increased; that is, every indicator showed an enhancement in clustering quality.

The uncertainty and clustering quality of high-spatial-resolution satellite images were then assessed by combining ABC, the hybrid cost function, and the clustering methods (i.e., K-Means, K-Medoids, FCM, and KFCM). According to the quality indicators, the combination of the ABC algorithm, the hybrid cost function, and KFCM leads to even further enhancements in clustering certainty and quality. The proposed hybrid algorithm thus proved to be the most optimized of the methods considered for the clustering of high-spatial-resolution satellite images. It is recommended that future work address the problem with other optimization methods, so as to compare them and find an optimal approach.