1 Introduction

Segmentation is a fundamental step in image analysis systems. Many segmentation approaches have been proposed in the literature, and they can be classified into three main categories: region-oriented approaches, contour-oriented approaches, and clustering approaches. Region-oriented and contour-oriented approaches rely on techniques developed specifically for image segmentation. They therefore benefit from properties of image data, namely spatial correlation. Segmentation by clustering, in contrast, relies on techniques originally designed for the analysis of any type of data, not specifically images. Even if these techniques give acceptable results when applied to image segmentation, their performance can be further improved by adapting them to this particular task. Segmentation based on the fuzzy c-means (FCM) clustering algorithm (Bezdek, Ehrlich, and Full (1984)) is no exception. This algorithm has been widely used in the literature for grouping image pixels into regions. In its basic form, the assignment of a pixel to a cluster is based exclusively on the pixel's own characteristics. The information related to the spatial correlation in the local neighborhood of each pixel is not taken into account. Incorporating this important feature of the image signal is of great interest, especially when segmenting noisy images (Choudhry, and Kapoor (2016); Liew, Leung, and Lau (2000)). Indeed, noise appears as pixels that look visually different from their neighbors although they belong semantically to the same region. This configuration creates an ambiguity for classification algorithms, making it difficult for them to classify noisy pixels correctly. To overcome this issue, improved versions of FCM have been proposed in the literature.
These versions incorporate spatial information at different levels of the FCM algorithm: the objective function, the dissimilarity distance, and the updating of the membership degrees. In this paper, we propose a modified FCM algorithm that belongs to the last of these categories, where spatial information is incorporated when the membership degrees are updated. Our proposal, called robust contextual fuzzy c-means (RCFCM), defines three types of local spatial contexts and introduces a novel readjustment factor that adaptively amends the membership degrees according to these contexts. Most existing approaches introduce spatial information through a uniform parameterized mathematical model, in which the adaptation to normal or noisy pixels is implemented via model parameters that are dynamically calculated from the context data. In contrast, our proposal uses rule-based modeling, with a specific processing strategy for each context type. This offers a greater adaptation ability and lets us cope better with non-linearly separable data, since normal and noisy pixels are processed differently, each with a dedicated model. Moreover, the RCFCM algorithm is easy to set up: except for the size of the neighborhood to be considered, all other parameters are estimated automatically.

The rest of the paper is organized as follows. The next section reviews the state of the art relevant to this work. Section 3 presents the conventional FCM algorithm and details its limits when dealing with noisy images. The novel approach is introduced in Section 4. Section 5 describes the experiments and the assessment of our proposal. Finally, the discussion of the obtained results and the conclusions are reported in Sections 6 and 7, respectively.

2 Related Work

To overcome the FCM sensitivity to noise, most algorithms try to incorporate spatial contextual information into the clustering process. The review of literature shows that such incorporation was addressed at three main levels.

The first one is the level of the objective function to optimize. The main idea in such approaches is to modify the conventional objective function of the FCM algorithm by adding a regularization term that quantifies the dissimilarity between a pixel and its neighbors in the feature space (Ahmed, Yamany, Mohamed, Farag and Moriarty (2002); Lei, Jia, Zhang, He, Meng, and Nandi (2018); Wang, Song, Soh, and Sim (2013); Wang, Wang, Fang, and Yang (2020c); Wang, Pedrycz, Li, and Zhou (2020a); Wang, Pedrycz, Zhou, and Li (2020b); Wang, Wang, Fang, and Jiao (2021); Zhang, Pan, Wu, Chen, Mao, and Wu (2020)). Pixels’ memberships and clusters’ centers expressions are obtained by optimizing this new objective function using the Lagrange multiplier technique. Most of these methods consider a single objective function. They tend to be effective for well-separated spherical clusters, but their performances decrease with more complicated cluster structures (Zhao, Liu, and Fan (2015)).

The second level of spatial contextual information incorporation is that of the distance used to measure the dissimilarity between each pixel and the clusters’ centers (Ayech, El Kalti, and El Ayeb (2010); Despotovic, Vansteenkiste, and Philips (2013); Liew, Leung, and Lau (2000, 2003); Mohamed, Ahmed, and Farag (1998)). Most of the proposals made in this context use a weighted two-term distance, where the first term expresses the conventional distance of the pixel to a given cluster while the second term expresses a spatial distance, namely the distance separating the pixel’s neighbors from the considered cluster. Weights control the influence given to each term. They can be statically fixed for all the pixels or dynamically calculated according to the local context of each pixel. The main difference between the proposals of this category lies in the way the neighbors are taken into account. We note here that the updating of the membership degrees and of the clusters’ centers remains identical to that of the standard FCM algorithm. It is also noteworthy that the spatial distances of the above approaches are built around the Euclidean distance, so these approaches lose performance when the data are not linearly separable. To cope with this issue, kernel distance-based methods project the data into a higher-dimensional space in order to make them linearly separable. In Chang-Chien, Nataliani, and Yang (2021); Elhedda, Mehri, and Mahjoub (2020); Yang, Zhang, Lu, and Ma (2010), the authors introduced various kernel distances in the FCM algorithm. Other works, such as Elazab, Wang, Jia, Wu, Li, and Hu (2015); Yang, and Tsai (2008); Zhao, Jiao, and Liu (2013), combined the use of a kernel distance with the incorporation of spatial information into the cost function. The main shortcoming of these techniques is their high computational time.

The third level of spatial contextual information incorporation is that of the membership degrees updating function. In that case, the partition matrix is updated in such a way as to take into account the neighbors’ membership degrees. Two main approaches can be distinguished in that case. In the first approach, the membership degrees of each pixel are updated by a new value, obtained by a combination with those of its neighbors and based on a linear spatial function. The membership degrees of all the pixels are updated uniformly using this spatial function (Alruwaili, Siddiqi, and Javed (2020); Adhikari, Sing, Basu, and Nasipuri (2015); Shamsi, and Seyedarabi (2012); Li, and Li (2006); Chuang, Tzeng, Chen, Wu, and Chen (year)). In the second approach, the spatial context of each pixel is first analyzed. Then, according to this analysis, its membership degrees are amended using a readjustment factor. Conditional rules are often used to model the knowledge related to the amendment (Fan, Zhen, and Xie (2003); Tian, Yu, and Shen (2012); Tolias, and Panas (1998)).

It should be noted that, apart from the FCM framework, other fuzzy clustering techniques incorporating spatial information have been proposed in the literature. In Zhao, Liu, and Fan (2015), fuzzy clustering is modeled as a multi-objective optimization problem to satisfy multiple segmentation requirements. The authors proposed a multi-objective spatial fuzzy clustering algorithm for image segmentation that optimizes two objective functions. The first expresses the fuzzy compactness with spatial information, and the second expresses the fuzzy separation. In Zhao, Liu, Li, Liu, Lan, and Fan (2021), the authors proposed the use of two membership functions expressing, respectively, the local and non-local spatial information. The multiobjective optimization is implemented using evolutionary algorithms. In Kalaiselvi, and Gomathi (2020), the authors proposed a fuzzy deep neural network (FDNN) for change detection between multi-temporal images. The fuzzyfication layer of the FDNN retains spatial information and the variation of neighbor pixels in order to reduce the effect of speckle when classifying pixels as “changed” or “unchanged”.

3 Fuzzy C-Means Clustering

3.1 FCM Algorithm

The fuzzy c-means (FCM) algorithm is a popular technique for the unsupervised clustering of multivariate data. It extends the hard clustering K-means algorithm (Jain (2010)) by introducing fuzzy logic. Unlike the K-means algorithm, which assigns each data sample to exactly one cluster, FCM assigns each sample to all the clusters with fuzzy membership degrees. The clustering is performed by iteratively minimizing a cost function based on a quadratic criterion that represents the weighted distance separating each sample from the clusters’ centers. This cost function, denoted J, is given by the following equation:

$$\begin{aligned} J_{\text {FCM}}(U,V) = \sum _{i=1}^{C}\sum _{j=1}^{N} \mu _{ij}^m \Vert x_j-v_i\Vert ^2, \end{aligned}$$
(1)

where

\(x_j \in X\) refers to the feature vector of the \(j^{th}\) sample,

\(X=\{x_1, x_2, \dots , x_N\} \subset \mathrm {I\!R}^p\) is the dataset of samples, N the size of X and p the dimension of the feature space,

\(V = (v_1, v_2, \dots , v_C)\) is the vector of the clusters’ centers,

\(v_i\) represents the \(i^{th}\) cluster center and C is the number of clusters,

m is the fuzzification factor (\(m > 1\)),

\(\mu _{ij}\) refers to the membership degree of the sample j to the cluster represented by the center \(v_i\),

\(U=[\mu _{ij}]\) is the partition matrix. This matrix fulfills the two following constraints:

$$\begin{aligned} \mu _{ij} \in [0,1] \text { } \forall \text { } i,j, \end{aligned}$$
(2)
$$\begin{aligned} \sum _{i=1}^{C}\mu _{ij} = 1 \text { } \forall \text { } j . \end{aligned}$$
(3)

The FCM algorithm proceeds as follows: first, samples are assigned to the different clusters with random membership degrees. Clusters’ centers are then calculated using Eq. 4.

$$\begin{aligned} v_i = \frac{\displaystyle \sum _{k=1}^{N} \left( \mu _{ik}\right) ^m x_k}{\displaystyle \sum _{k=1}^{N} \left( \mu _{ik}\right) ^m} . \end{aligned}$$
(4)
Fig. 1 Image with two regions having a noisy pixel

Fig. 2 Three examples of noisy pixels’ local contexts

Considering the newly calculated centers, the samples’ membership degrees are updated using Eq. 5.

$$\begin{aligned} \mu _{ij} = \frac{1}{\displaystyle \sum _{k=1}^{C} \left( \frac{\Vert x_j-v_i\Vert }{\Vert x_j-v_k\Vert }\right) ^\frac{2}{m-1}} . \end{aligned}$$
(5)

This iterative process is repeated until convergence which is reached when the change in the objective function between two consecutive iterations (t) and (\(t+1\)) is smaller than a given threshold \(\epsilon \).

$$\begin{aligned} \Vert J_{\text {FCM}} (U^{(t+1)}, V^{(t+1)}) - J_{\text {FCM}} (U^{(t)}, V^{(t)}) \Vert < \epsilon . \end{aligned}$$
(6)
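The iteration described by Eqs. 1 to 6 can be sketched in a few lines of NumPy. This is a minimal illustration rather than a reference implementation: the function name `fcm`, the `max_iter` and `seed` arguments, and the small floor on the squared distances (to avoid dividing by zero when a sample coincides with a center) are assumptions introduced here.

```python
import numpy as np

def fcm(X, C, m=2.0, eps=1e-4, max_iter=100, seed=0):
    """Standard FCM on X of shape (N, p); returns U of shape (C, N) and V of shape (C, p)."""
    rng = np.random.default_rng(seed)
    U = rng.random((C, X.shape[0]))
    U /= U.sum(axis=0, keepdims=True)            # random memberships satisfying Eq. 3
    J_prev = np.inf
    for _ in range(max_iter):
        Um = U ** m
        V = (Um @ X) / Um.sum(axis=1, keepdims=True)              # Eq. 4
        d2 = ((X[None, :, :] - V[:, None, :]) ** 2).sum(axis=2)   # squared distances, (C, N)
        d2 = np.maximum(d2, 1e-12)               # assumed guard against zero distances
        inv = d2 ** (-1.0 / (m - 1.0))
        U = inv / inv.sum(axis=0, keepdims=True)                  # Eq. 5
        J = float((U ** m * d2).sum())                            # Eq. 1
        if abs(J - J_prev) < eps:                                 # Eq. 6
            break
        J_prev = J
    return U, V
```

For a gray-level image, `X` would be the pixel intensities reshaped to `(N, 1)`, and `U.argmax(axis=0)` gives the hard label of each pixel.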

3.2 Limitation of the FCM Algorithm in Noisy Pixel Clustering

The main drawback of the conventional FCM algorithm when dealing with image segmentation lies in its use of the individual pixels’ data without taking into consideration the local context represented by the spatial neighborhood. In images, data are spatially correlated. Hence, the incorporation of the context becomes advantageous to reduce FCM sensitivity to noise.

Figure 1 illustrates such a situation. It represents an image with two regions: black (0) and white (255). Applying the FCM algorithm to segment the two regions produces two well-separated clusters, one per region. However, FCM fails to correctly assign the noisy pixel denoted NP in the image: because of its gray level, the pixel is assigned to the white region cluster although it belongs spatially to the black region.

Figure 2 shows another example of a noisy image with three samples of noisy pixels and their corresponding neighborhoods (marked by the red squares).

Fig. 3 From top to bottom: column a, the three local windows N1, N2, and N3 marked in Fig. 2; column b, the clustering result into three clusters for N1, N2, and N3; column c, the expected result when the local context is incorporated

Figure 3 shows in more detail these neighborhoods (column a), the result of clustering the image into three classes using the standard FCM (column b), and the expected outcome when the local context is incorporated (column c). Note that when more than one cluster is present in the local context, noisy pixels take the label of the dominant cluster.

4 Proposed Approach

To overcome the above-mentioned limitations of the standard FCM algorithm when dealing with noisy images, we propose in this paper a novel approach that incorporates contextual information in the clustering process. This approach proceeds by readjusting the membership degrees after each iteration in such a way as to take into consideration the local context of each pixel. To set out our approach, we need to introduce first some key notions we have defined.

4.1 Key Notions

Our approach is mainly based on the notion of spatial context. The spatial context of a pixel j, denoted SC\(_j\), is defined by the neighboring pixels belonging to an \(L \times L\) square window centered on j. Given the membership degrees of j to the different clusters \(\{\mu _{ij}, i = 1..C\}\), we define two particular types of clusters: the pixel cluster and the context cluster.

Definition 1: Pixel cluster

The pixel cluster of a pixel j, denoted P\(_j\), is defined as the cluster to which j has the highest membership degree.

$$\begin{aligned} P_j = \arg \max _{i=1..C}(\mu _{ij}) . \end{aligned}$$
(7)

Definition 2: Context cluster

The context cluster of a pixel j, denoted C\(_j\), is defined as the cluster that is the pixel cluster of the largest number of pixels in the local context of j.

$$\begin{aligned} C_j = \arg \max _{i=1..C}(f(i,j)) , \end{aligned}$$
(8)

where

$$\begin{aligned} f(i,j) = \sum _{k \in SC_j} w_{ik} \text {, where } {\left\{ \begin{array}{ll} w_{ik} = 1 &{} \text { if } P_k = i , \\ w_{ik} = 0 &{} \text { else} . \end{array}\right. } \end{aligned}$$
(9)
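Definitions 1 and 2 can be illustrated with the short NumPy sketch below. The function names, the edge-replicated border handling, and the tie-breaking rule (the lowest cluster index wins when several clusters reach the same count) are assumptions made for this illustration. Following Eq. 9 as written, the count runs over the whole window, central pixel included.

```python
import numpy as np

def pixel_clusters(U, shape):
    """Eq. 7: label map P (image-shaped) from a partition matrix U of shape (C, N)."""
    return U.argmax(axis=0).reshape(shape)

def context_clusters(P, C, L=3):
    """Eqs. 8-9: most frequent pixel cluster inside each L x L window of P."""
    r = L // 2
    H, W = P.shape
    padded = np.pad(P, r, mode="edge")       # assumption: replicate border pixels
    counts = np.zeros((C, H, W), dtype=int)  # counts[i] holds f(i, j) for every pixel j
    for dy in range(L):
        for dx in range(L):
            window = padded[dy:dy + H, dx:dx + W]
            for i in range(C):
                counts[i] += (window == i)
    return counts.argmax(axis=0)
```

On a uniform region containing one isolated outlier, `pixel_clusters` keeps the outlier's own label while `context_clusters` returns the label of the surrounding region.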

Given these two particular types of clusters, we define two other key parameters: the contextual similarity degree and the intrinsic similarity degree.

Definition 3: Contextual similarity degree

The contextual similarity degree of a pixel j, denoted \(\alpha _j\), is defined as the proportion of neighbors of j having the same context cluster as j.

$$\begin{aligned} \alpha _j = \frac{\text {card}\left( k \in SC_j / k \ne j, C_j=C_k \right) }{L^2-1} . \end{aligned}$$
(10)

Definition 4: Intrinsic similarity degree

The intrinsic similarity degree of a pixel j, denoted \(\beta _j\), is defined as the proportion of neighbors of j having the same pixel cluster as j.

$$\begin{aligned} \beta _j = \frac{\text {card}\left( k \in SC_j / k \ne j, P_j=P_k \right) }{L^2-1} . \end{aligned}$$
(11)
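Since Eqs. 10 and 11 differ only in the label map they use, both degrees can be computed by one helper, sketched here under assumptions: the name `similarity_degree` and the edge-replicated borders are choices made for this example, not taken from the paper.

```python
import numpy as np

def similarity_degree(labels, L=3):
    """Fraction of the L*L - 1 neighbors that share the central pixel's label."""
    r = L // 2
    H, W = labels.shape
    padded = np.pad(labels, r, mode="edge")  # assumption: replicate border pixels
    same = np.zeros((H, W), dtype=float)
    for dy in range(L):
        for dx in range(L):
            if dy == r and dx == r:
                continue                      # k != j: skip the central pixel
            same += (padded[dy:dy + H, dx:dx + W] == labels)
    return same / (L * L - 1)
```

Passing the context cluster map gives \(\alpha\) (Eq. 10), and passing the pixel cluster map gives \(\beta\) (Eq. 11).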

4.2 Types of Pixels

In our approach, we distinguish three types of pixels, each requiring a dedicated processing during clustering: noisy pixels, contour pixels, and region pixels. The distinction between these three types is mainly based on the analysis of the spatial context.

A noisy pixel (denoted NP) is a pixel that represents noise. It generally belongs to a cluster that is different from those of its neighbors. In terms of the notions introduced above, a noisy pixel has a pixel cluster P\(_{\text {NP}}\) that differs from the pixel clusters P\(_k\) of its neighbors k. Hence, within its spatial context, its context cluster differs from its pixel cluster, and its \(\beta _{\text {NP}}\) factor is zero or very close to zero. For a given pixel j, the noisy pixel estimation rule can be expressed as follows:

$$\begin{aligned} j \text { IS NP} \text { IF } P_j \ne C_j \text { AND } \beta _j \approx 0 . \end{aligned}$$

A region pixel (denoted RP) is a normal pixel that belongs to a region. Its features are correlated with those of its neighbors and, consequently, it has the same pixel cluster P\(_{\text {RP}}\) as its neighbors. Hence, its context cluster and its pixel cluster are the same. As a result, its contextual and intrinsic similarity degrees \(\alpha _{\text {RP}}\) and \(\beta _{\text {RP}}\) are equal and both close to 1. For a given pixel j, the region pixel estimation rule can be expressed as follows:

$$\begin{aligned} j \text { IS RP} \text { IF } P_j = C_j \text { AND } \alpha _j \approx \beta _j \approx 1 . \end{aligned}$$

A contour pixel (denoted CP) is a pixel situated on a contour separating two or more regions. Its context generally contains pixels that are similar to it (same cluster) and others that belong to the cluster(s) of the bordering region(s). Its P\(_{\text {CP}}\) and C\(_{\text {CP}}\) may be equal or different depending on its position with respect to the contour. Unlike for NP and RP, the \(\alpha _{\text {CP}}\) and \(\beta _{\text {CP}}\) factors of a CP are close to neither 0 nor 1 but take values around 0.5. For a given pixel j, the contour pixel estimation rule can be expressed as follows:

$$\begin{aligned} j \text { IS CP} \text { IF } \alpha _j \approx 0.5 \text { AND } \beta _j \approx 0.5 . \end{aligned}$$
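The three estimation rules can be sketched as a single labeling function. The text states the conditions only approximately (\(\approx 0\), \(\approx 0.5\), \(\approx 1\)), so the tolerance `tol`, its default value, and the choice of treating every pixel that is neither NP nor RP as a CP are assumptions of this sketch, as are the names `NOISY`, `REGION`, and `CONTOUR`.

```python
import numpy as np

NOISY, REGION, CONTOUR = 0, 1, 2

def classify_pixels(P, Cmap, alpha, beta, tol=0.2):
    """Label each pixel as NOISY, REGION or CONTOUR from P, C, alpha and beta maps."""
    types = np.full(P.shape, CONTOUR, dtype=int)        # assumed fallback: contour pixel
    region = (P == Cmap) & (alpha >= 1 - tol) & (beta >= 1 - tol)
    noisy = (P != Cmap) & (beta <= tol)
    types[region] = REGION   # P_j = C_j and alpha_j, beta_j close to 1
    types[noisy] = NOISY     # P_j != C_j and beta_j close to 0
    return types
```

A stricter variant could require both \(\alpha_j\) and \(\beta_j\) to lie within `tol` of 0.5 for a CP and leave the remaining pixels unprocessed.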

Figure 4 represents two portions of an image that has two regions, one black (label B) and the other white (label W). It gives some illustrations of the three types of pixels that we consider in this work (one RP, one NP, and four examples of CP). Table 1 gives the values taken by the parameters we defined for each of these pixels. For a sample j, these parameters concern, respectively, the pixel cluster P\(_j\), the context cluster C\(_j\), the contextual similarity degree \(\alpha _j\), and the intrinsic similarity degree \(\beta _j\). These values are consistent with our characterization of the various types of pixels. Note that in the case where \(\alpha _j= \beta _j\) (case of CP1 in Fig. 4), C\(_j\) takes the label of the central pixel (C\(_j=\)P\(_j\)).

Fig. 4 Different types of pixels

4.3 Integration of the Contextual Information

The integration of the contextual information that we propose is performed at the level of the membership degrees updating. Hence, after the computation of the new membership degrees according to the standard FCM algorithm, these degrees are analyzed in order to estimate the type of each pixel (NP, RP, or CP). This analysis is based on the \(\alpha _j\) and \(\beta _j\) parameters as presented in the previous section. According to the type of the pixel in process, we propose a readjustment strategy that tries to compensate for the weakness of the standard FCM at this level.

Table 1 Pixel cluster, context cluster, intrinsic, and contextual similarity degrees of the illustrative pixels of Fig. 4

4.4 Updating Strategy for NP Pixels

If the pixel being processed is an NP, then its membership degree to its context cluster should be increased, while its membership degrees to the other clusters, including its pixel cluster, should be decreased. This updating strategy can be expressed for a given pixel j by the following rule:

$$\begin{aligned} \text {IF } j \text { is NP} \text { THEN } increase \text { } \mu _{c_jj} \text { AND } decrease \text { } \mu _{kj} . \end{aligned}$$

where \(\mu _{c_jj}\) refers to the membership degree of the pixel j to its context cluster C\(_j\) and \(\mu _{kj}\) refers to the membership degrees of the pixel j to the other k clusters.

To implement this rule, we propose a readjustment that brings the membership degree of the pixel NP to an estimation of the context cluster membership degree. This estimation is given by the average of the membership degrees of a selection of representative neighbors which are not noisy and which belong to the context cluster of NP. A neighbor is representative when it has a higher membership degree to the context cluster than that of the pixel NP. Formally, for a given noisy pixel j and an iteration (t), this update is made using the readjustment factor \(\eta _j\):

$$\begin{aligned} \eta _j^{(t)} = \left| \left( \frac{1}{N_c} \sum _{k \in V_{cj}} \mu _{c_jk}^{(t)}\right) - \mu _{c_jj}^{(t)}\right| , \end{aligned}$$
(12)

where

$$\begin{aligned} V_{cj} = \{x_k \in SC_j / P_k = C_j \text { and } \mu _{c_jk} > \mu _{c_jj}\} \text { and } N_c = \text {card}(V_{cj}) . \end{aligned}$$

The membership degree \(\mu _{c_jj}\) is then increased as follows:

$$\begin{aligned} \mu _{c_jj}^{(t+1)} = \mu _{c_jj}^{(t)}+ \eta _j^{(t)} . \end{aligned}$$
(13)

To fulfill the condition given by Eq. 3 the membership degrees of j to the other clusters are decreased proportionally to their importance as follows:

$$\begin{aligned} \mu _{lj}^{(t+1)} = \mu _{lj}^{(t)} - \frac{\mu _{lj}}{\displaystyle \sum _{\begin{array}{c} m=1..c \\ m\ne c_j \end{array}} \mu _{mj}} \eta _j^{(t)}, \text { } l: 1..C \text { and } l \ne c_j . \end{aligned}$$
(14)
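The NP update can be sketched for a single pixel as follows. The sketch averages the neighbors' membership degrees to the context cluster of j, as the text describes, and selects representative neighbors exactly as in the definition of \(V_{cj}\). The function name `readjust`, its argument layout, and the decision to leave the pixel unchanged when \(V_{cj}\) is empty are assumptions.

```python
import numpy as np

def readjust(mu_j, mu_nbrs, P_nbrs, target):
    """Raise mu_j[target] to the mean of its representative neighbors (Eqs. 12-14).

    mu_j: (C,) memberships of pixel j; mu_nbrs: (K, C) memberships of its K neighbors;
    P_nbrs: (K,) pixel clusters of the neighbors; target: index of the context cluster C_j.
    """
    rep = (P_nbrs == target) & (mu_nbrs[:, target] > mu_j[target])   # set V_cj
    if not rep.any():
        return mu_j.copy()            # assumption: nothing to do without representatives
    eta = abs(mu_nbrs[rep, target].mean() - mu_j[target])            # Eq. 12
    new = mu_j.astype(float)
    others = np.arange(mu_j.size) != target
    new[others] -= mu_j[others] / mu_j[others].sum() * eta           # Eq. 14
    new[target] += eta                                               # Eq. 13
    return new
```

Because every representative neighbor has a membership of at most 1, \(\eta_j\) never exceeds the total membership held by the other clusters, so the result stays in \([0,1]\) and still sums to 1.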

4.5 Updating Strategy for CP Pixels

If the pixel being processed is a CP, its membership degree to its pixel cluster should be increased. To satisfy the constraint given by Eq. 3, its membership degrees to the other clusters should be decreased. This updating strategy can be expressed for a given pixel j by the following rule:

$$\begin{aligned} \text {IF } j \text { is CP} \text { THEN } increase \text { } \mu _{p_jj} \text { AND } decrease \text { } \mu _{kj} , \end{aligned}$$

where \(\mu _{p_jj}\) refers to the membership degree of the pixel j to its pixel cluster \(P_j\) and \(\mu _{kj}\) refers to the membership degrees of the pixel j to the remaining clusters k.

The readjustment factor that we propose for this updating strategy is given as follows:

$$\begin{aligned} \eta _j^{(t)} = \left| \left( \frac{1}{N_p}\sum _{k \in V_{pj}} \mu _{p_jk}^{(t)}\right) - \mu _{p_jj}^{(t)}\right| , \end{aligned}$$
(15)
$$\begin{aligned} V_{pj} = \{x_k \in SC_j / P_k = P_j \text { and } \mu _{p_jk} > \mu _{p_jj}\} , \end{aligned}$$
$$\begin{aligned} N_p = \text {card}(V_{pj}) . \end{aligned}$$

Then, the membership degrees are updated according to the following equations:

$$\begin{aligned} \mu _{p_jj}^{(t+1)} = \mu _{p_jj}^{(t)}+ \eta _j^{(t)} , \end{aligned}$$
(16)
$$\begin{aligned} \mu _{lj}^{(t+1)} = \mu _{lj}^{(t)} - \frac{\mu _{lj}}{\displaystyle \sum _{\begin{array}{c} m=1..c \\ m\ne p_j \end{array}} \mu _{mj}} \eta _j^{(t)}, \text { } l: 1..C \text { and } l \ne p_j . \end{aligned}$$
(17)
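As a worked illustration of Eqs. 15 to 17, consider hypothetical values that are not taken from the experiments: three clusters, a contour pixel j with membership degrees \((\mu _{1j}, \mu _{2j}, \mu _{3j}) = (0.5, 0.3, 0.2)\), so that \(P_j = 1\) (and, for simplicity, \(C_j = 1\) as well), and two representative neighbors whose membership degrees to cluster 1 are 0.7 and 0.9.

$$\begin{aligned} \eta _j^{(t)}&= \left| \tfrac{1}{2}(0.7 + 0.9) - 0.5 \right| = 0.3 , \\ \mu _{1j}^{(t+1)}&= 0.5 + 0.3 = 0.8 , \\ \mu _{2j}^{(t+1)}&= 0.3 - \tfrac{0.3}{0.3 + 0.2} \times 0.3 = 0.12 , \\ \mu _{3j}^{(t+1)}&= 0.2 - \tfrac{0.2}{0.3 + 0.2} \times 0.3 = 0.08 . \end{aligned}$$

The updated degrees sum to \(0.8 + 0.12 + 0.08 = 1\), so the constraint of Eq. 3 is preserved.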

4.6 Updating Strategy for RP Pixels

For an RP, the context cluster and the pixel cluster are the same (C\(_{\text {RP}}\) \( = \)P\(_{\text {RP}}\)). This means that its assignment by the standard FCM is going in the right direction. For such pixels, our approach preserves the membership degrees without amendment.

4.7 Proposed Clustering Algorithm

Considering the updating strategies described in the above sections, the clustering algorithm that we propose is summed up in Algorithm 1. The main principle is identical to that of the standard FCM algorithm. However, at the updating stage, our algorithm starts by estimating the type of the pixel to cluster. Then, depending on this type, the appropriate updating strategy is applied according to the rules introduced in the previous sections.

Algorithm 1 Robust contextual fuzzy c-means (RCFCM)
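Because Algorithm 1 is given only as a figure, the following NumPy sketch assembles the steps described in Sections 3 and 4 for a gray-level image. It should be read as one possible interpretation and not as the authors' listing. The function names, the tolerance `tol` used for the approximate conditions, the edge-replicated borders, the treatment of all non-NP, non-RP pixels as CP, and the choice to leave a pixel unchanged when it has no representative neighbor are all assumptions.

```python
import numpy as np

def windows(A, L):
    """Stack the L*L shifted views of map A (borders edge-replicated): (L*L, H, W)."""
    r, (H, W) = L // 2, A.shape
    pad = np.pad(A, r, mode="edge")
    return np.stack([pad[dy:dy + H, dx:dx + W] for dy in range(L) for dx in range(L)])

def rcfcm_sketch(img, C, L=3, m=2.0, eps=1e-4, max_iter=50, tol=0.2, seed=0):
    """Gray-level RCFCM-style clustering; returns memberships of shape (C, H, W)."""
    H, W = img.shape
    x = img.astype(float).ravel()
    U = np.random.default_rng(seed).random((C, x.size))
    U /= U.sum(axis=0)
    nb = np.arange(L * L) != (L * L) // 2            # window mask without the center
    J_prev = np.inf
    for _ in range(max_iter):
        Um = U ** m
        V = (Um @ x) / Um.sum(axis=1)                                    # Eq. 4
        d2 = np.maximum((x[None, :] - V[:, None]) ** 2, 1e-12)
        inv = d2 ** (-1.0 / (m - 1.0))
        Uimg = (inv / inv.sum(axis=0)).reshape(C, H, W)                  # Eq. 5
        P = Uimg.argmax(axis=0)                                          # Eq. 7
        Pw = windows(P, L)
        Cmap = np.stack([(Pw == i).sum(axis=0) for i in range(C)]).argmax(axis=0)  # Eqs. 8-9
        alpha = (windows(Cmap, L)[nb] == Cmap).mean(axis=0)              # Eq. 10
        beta = (Pw[nb] == P).mean(axis=0)                                # Eq. 11
        is_rp = (P == Cmap) & (alpha >= 1 - tol) & (beta >= 1 - tol)
        is_np = (P != Cmap) & (beta <= tol)
        target = np.where(is_np, Cmap, P)            # C_j for NP, P_j otherwise (CP)
        Uw = np.stack([windows(Uimg[i], L) for i in range(C)])           # (C, L*L, H, W)
        new = Uimg.copy()
        for y, col in zip(*np.nonzero(~is_rp)):      # RP memberships are kept as they are
            t, mu = target[y, col], Uimg[:, y, col]
            rep = nb & (Pw[:, y, col] == t) & (Uw[t, :, y, col] > mu[t])
            if not rep.any():
                continue
            eta = Uw[t, rep, y, col].mean() - mu[t]                      # Eqs. 12 / 15
            others = np.arange(C) != t
            new[others, y, col] = mu[others] * (1 - eta / mu[others].sum())  # Eqs. 14 / 17
            new[t, y, col] = mu[t] + eta                                 # Eqs. 13 / 16
        U = new.reshape(C, -1)
        J = float((U ** m * d2).sum())                                   # Eq. 1
        if abs(J - J_prev) < eps:                                        # Eq. 6
            break
        J_prev = J
    return U.reshape(C, H, W)
```

On a two-region image in the spirit of Fig. 1, with a single white pixel inside the black region, this sketch relabels the isolated pixel to the black cluster, whereas the hard labels of plain FCM would keep it white.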

5 Experiments

To assess the effectiveness of our approach, we carried out experiments on both synthetic and real images corrupted by three types of noise at different rates: Gaussian noise (Gauss), salt-and-pepper noise (S&P), and mixed noise (Mixed). The latter is created by mixing salt-and-pepper noise and Gaussian noise at the same rate.
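The corruption step can be sketched as follows for 8-bit gray-level images. The text does not specify how the noise rate is parameterized, so this sketch assumes that the salt-and-pepper rate is the fraction of altered pixels, that the Gaussian rate is the noise variance for intensities scaled to \([0,1]\), and that mixed noise applies Gaussian noise first and salt-and-pepper noise second. The function names are hypothetical.

```python
import numpy as np

def salt_and_pepper(img, rate, rng):
    """Set a fraction `rate` of the pixels to 0 or 255, chosen at random."""
    out = img.copy()
    mask = rng.random(img.shape) < rate
    out[mask] = rng.choice([0, 255], size=int(mask.sum()))
    return out

def gaussian(img, rate, rng):
    """Add zero-mean Gaussian noise; `rate` is read as the variance on a [0, 1] scale."""
    noisy = img / 255.0 + rng.normal(0.0, np.sqrt(rate), img.shape)
    return np.clip(noisy * 255.0, 0, 255).astype(np.uint8)

def mixed(img, rate, rng):
    """Gaussian noise followed by salt-and-pepper noise, both at the same rate."""
    return salt_and_pepper(gaussian(img, rate, rng), rate, rng)
```

A call such as `mixed(img, 0.04, np.random.default_rng(0))` would then produce one reproducible noisy instance under these assumptions.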

The proposed robust contextual fuzzy c-means algorithm (RCFCM) is compared with the following state-of-the-art FCM-based algorithms: FCM (Jain (2010)), FCMS1 (Chen, and Zhang (2004)), FCMS2 (Chen, and Zhang (2004)), EnFCM (Szilagyi, Benyo, Szilágyi, and Adam (2003)), FGFCM (Cai, Chen, and Zhang (2007)), FLICM (Krinidis, and Chatzis (2010)), FRFCM (Lei, Jia, Zhang, He, Meng, and Nandi (2018)), WRFCM (Wang, Pedrycz, Li, and Zhou (2020a)), and FCM_SICM (Wang, Wang, Fang, and Yang (2020c)).

5.1 Parameters Setting

All these algorithms were applied with a fuzzification parameter \(m = 2\) and minimum error \(\epsilon = 10^{-4}\). The \(\alpha \) parameter used in FCMS1, FCMS2, and EnFCM to control the effect of the spatial context is set to 0.85. The spatial and gray level scale factors used in FGFCM are respectively set to \(\lambda _s = 3\) and \(\lambda _g = 6\), and the size of the neighborhood is \(3 \times 3\). For the FRFCM algorithm, a \(3 \times 3\) window is used for the structuring element used to produce the marker image and for the kernel of the median filter used to update the membership degrees. For the WRFCM algorithm, the parameters are set as follows: \(\xi =0.0008\), \(\phi =5\), and the neighborhood size = \(3 \times 3\). The geometric and photometric spread parameters of the bilateral filter used by the FCM_SICM are respectively set to \(\sigma _d = 3.5\) and \(\sigma _r = 2\). The eps parameter of this algorithm is set to 0.000001. Finally, for our algorithm RCFCM, the only parameter is the size of the context, and it was set to \(3 \times 3\).

5.2 Performance Metrics

Performances are evaluated using three metrics which are accuracy, Dice index, and peak signal-to-noise ratio.

5.2.1 Accuracy

The accuracy (ACC) is defined as the sum of the ratios of the correctly classified pixels for each cluster to the total number of pixels. It is given by the following equation:

$$\begin{aligned} ACC = \sum _{k=1}^C \frac{A_k \cap C_k}{\sum _{j=1}^C C_j} , \end{aligned}$$
(18)

where C is the number of clusters, and \(A_k\) and \(C_k\) denote the sets of pixels assigned to cluster k by the clustering technique and by the ground truth, respectively; set expressions are read as pixel counts.
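Eq. 18 can be computed directly from two label maps, as in the sketch below. It assumes that the cluster indices produced by the algorithm have already been matched to the ground-truth indices; the function name `accuracy` is chosen for this example.

```python
import numpy as np

def accuracy(pred, truth, C):
    """Eq. 18: correctly labeled pixels, summed over clusters, over the total pixel count."""
    correct = sum(np.sum((pred == k) & (truth == k)) for k in range(C))
    return correct / truth.size
```

For instance, with `pred = [0, 0, 1, 1, 2, 2]` and `truth = [0, 1, 1, 1, 2, 0]`, four of the six pixels agree and the accuracy is 4/6.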

5.2.2 Dice Index

The Dice index (DI) gives the degree of similarity between the segmented image and the ground truth. Using the same notation as in Eq. 18, this index is defined as follows:

$$\begin{aligned} DI = \frac{1}{C}\sum _{k=1}^C \frac{A_k \cap C_k}{A_k \cup C_k} . \end{aligned}$$
(19)
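The index of Eq. 19, taken exactly as written (per-cluster intersection over union, averaged over the clusters), can be sketched as follows. Labels are again assumed to be matched to the ground truth, and every cluster is assumed to appear in at least one of the two maps so that no union is empty.

```python
import numpy as np

def dice_index(pred, truth, C):
    """Eq. 19: mean over clusters of |A_k intersect C_k| / |A_k union C_k|."""
    ratios = []
    for k in range(C):
        a, c = pred == k, truth == k
        ratios.append(np.sum(a & c) / np.sum(a | c))
    return float(np.mean(ratios))
```

With `pred = [0, 0, 1, 1, 2, 2]` and `truth = [0, 1, 1, 1, 2, 0]`, the per-cluster ratios are 1/3, 2/3, and 1/2, which average to 0.5.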

5.2.3 Peak Signal-to-Noise Ratio

The peak signal-to-noise ratio (PSNR) is a metric that expresses the quality of a reconstruction of an image compared to the original image. In our case, the reconstructed image is the one obtained as the output of the studied clustering algorithms while the original image is the ground truth. This metric is interpreted as follows: the higher the PSNR, the better the quality. PSNR is expressed in dB and formulated as in Eq. 20.

$$\begin{aligned} PSNR = 10 \log _{10} \left( MAX^{2} \left[ \frac{1}{N} \sum _{i=1}^N (s_{i} - o_{i})^{2} \right] ^{-1} \right) , \end{aligned}$$
(20)

where \(s_i\) and \(o_i\) denote the value of the \(i^{th}\) pixel in the segmented image and in the original image, respectively, N is the number of pixels, and MAX is the maximum value that a pixel can take.
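Eq. 20 translates into a few lines of NumPy. The function name `psnr`, the default `max_val=255.0` for 8-bit images, and the convention of returning infinity when the two images are identical are assumptions of this sketch.

```python
import numpy as np

def psnr(seg, orig, max_val=255.0):
    """Eq. 20: peak signal-to-noise ratio in dB between two images of equal shape."""
    mse = np.mean((seg.astype(float) - orig.astype(float)) ** 2)
    if mse == 0:
        return float("inf")          # assumed convention for identical images
    return float(10.0 * np.log10(max_val ** 2 / mse))
```

As a quick check, if half of the pixels differ by 255 and the rest are identical, the mean squared error is \(255^2/2\) and the PSNR equals \(10 \log_{10} 2 \approx 3.01\) dB.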

Fig. 5 Examples of synthetic images used in experiments

5.3 Results on Synthetic Images

Figure 5 shows two synthetic images among those used in experiments. The first image (Fig. 5a) contains three classes and presents both straight and curved contours separating regions. The second image (Fig. 5b) contains some thin structures (plant branches) that should be preserved as much as possible by the clustering techniques.

Fig. 6 Segmentation results on synthetic image (Fig. 5a) corrupted by salt-and-pepper noise at 10%. a Noisy image and b ground truth. From c to l: the results of FCM, FCMS1, FCMS2, EnFCM, FGFCM, FLICM, FRFCM, WRFCM, FCM_SICM, and RCFCM

Fig. 7 Segmentation results on synthetic image (Fig. 5a) corrupted by Gaussian noise at 4%. a Noisy image and b ground truth. From c to l: the results of FCM, FCMS1, FCMS2, EnFCM, FGFCM, FLICM, FRFCM, WRFCM, FCM_SICM, and RCFCM

Fig. 8 Segmentation results on synthetic image (Fig. 5a) corrupted by mixed noise at 4%. a Noisy image and b ground truth. From c to l: the results of FCM, FCMS1, FCMS2, EnFCM, FGFCM, FLICM, FRFCM, WRFCM, FCM_SICM, and RCFCM

Figures 6, 7, and 8 show the results obtained after applying the different FCM-based algorithms to the image in Fig. 5a. For illustrative purposes, we present the results for one rate for each type of noise. Visual observation of the results shows that the RCFCM algorithm ranks among the best techniques in terms of quality. The contours of the geometric shapes (square, circle, and triangle) are well preserved, even at the corners. Table 2 lists the quantitative results obtained for this image for the three considered metrics. The values obtained show that our proposal gives the best performance for the salt-and-pepper noise. RCFCM is ranked second for the Gaussian and mixed noise at 2%. However, when the rate of these noises increases, its performance drops compared to FLICM, WRFCM, and FCM_SICM. The PSNR results are globally consistent with those of the accuracy and the Dice index.

Table 2 Quantitative results obtained for the image given by Fig. 5a

The second synthetic image (Fig. 5b) is segmented into three classes: black (value 0), gray (value 127), and white (value 255). Branches and leaves are within the gray class. They represent thin graphic elements that are sensitive to spatial clustering.

Figures 9, 10, and 11 show the outcomes of the segmentation of this image for the various types of noise. The details of the quantitative evaluation are given in Table 3. These findings show that the RCFCM algorithm obtains the best evaluation results for most types of noise, except for the mixed noise at 4%, where it is outperformed by the FRFCM algorithm. They also reveal that it achieves the best preservation of thin structures, while some other techniques considerably deteriorate them.

5.4 Results on Real-World Images

Figure 12 gives two examples of real images among those used in experiments. These images were also corrupted by the three types of noise at different rates.

Fig. 9 Segmentation results on synthetic image (Fig. 5b) corrupted by salt-and-pepper noise at 10%. a Noisy image and b ground truth. From c to l: the results of FCM, FCMS1, FCMS2, EnFCM, FGFCM, FLICM, FRFCM, WRFCM, FCM_SICM, and RCFCM

Fig. 10 Segmentation results on synthetic image (Fig. 5b) corrupted by Gaussian noise at 4%. a Noisy image and b ground truth. From c to l: the results of FCM, FCMS1, FCMS2, EnFCM, FGFCM, FLICM, FRFCM, WRFCM, FCM_SICM, and RCFCM

Fig. 11 Segmentation results on synthetic image (Fig. 5b) corrupted by mixed noise at 4%. a Noisy image and b ground truth. From c to l: the results of FCM, FCMS1, FCMS2, EnFCM, FGFCM, FLICM, FRFCM, WRFCM, FCM_SICM, and RCFCM

Table 3 Quantitative results obtained for the image given by Fig. 5b

Fig. 12 Examples of real images used in experiments

The image in Fig. 12a was segmented into two clusters in order to separate the rice grains from the non-uniformly illuminated background. Figures 13, 14, and 15 visually compare the result of the RCFCM algorithm with those of its peers, while Table 4 gives the quantitative comparison. We notice that the main competitor of our technique is the WRFCM algorithm. The best results are shared between this technique and ours, with a slight superiority for WRFCM.

Fig. 13

Segmentation results on real image (Fig. 12a) corrupted by salt-and-pepper noise at 10%. a Noisy image and b ground truth. From c to l: the results of FCM, FCMS1, FCMS2, EnFCM, FGFCM, FLICM, FRFCM, WRFCM, FCM_SICM, and RCFCM

Fig. 14

Segmentation results on real image (Fig. 12a) corrupted by Gaussian noise at 4%. a Noisy image and b ground truth. From c to l: the results of FCM, FCMS1, FCMS2, EnFCM, FGFCM, FLICM, FRFCM, WRFCM, FCM_SICM, and RCFCM

Fig. 15

Segmentation results on real image (Fig. 12a) corrupted by mixed noise at 4%. a Noisy image and b ground truth. From c to l: the results of FCM, FCMS1, FCMS2, EnFCM, FGFCM, FLICM, FRFCM, WRFCM, FCM_SICM, and RCFCM

Figure 12b represents a brain magnetic resonance image (MRI). The accurate segmentation of brain tissues such as gray matter, white matter, and cerebrospinal fluid is an important step for the detection of many diseases. Figures 16, 17, and 18 show the results obtained by clustering the brain MRI image into three clusters to isolate each type of tissue. They show that, for salt-and-pepper noise, RCFCM achieves good delineation of the different tissue types while removing noise. However, Gaussian and mixed noise caused a classification ambiguity, especially between gray matter and white matter. The quantitative evaluation of these results is reported in Table 5. As in most of the other experiments, RCFCM outperformed the other algorithms when dealing with salt-and-pepper noise. For Gaussian and mixed noise, however, the best performance is globally achieved by the FLICM algorithm.

Table 4 Quantitative results obtained for the image given by Fig. 12a

6 Discussion

All the experiments carried out showed that our algorithm produces very competitive results, placing it globally among the top studied techniques. It achieved the best performance for salt-and-pepper noise, and it gave acceptable results for Gaussian and mixed noise at low rates, where it was slightly outperformed, mainly by the WRFCM algorithm (often with a deviation of around \(1\%\)) for some images and by the FLICM algorithm for others. However, as the noise became more severe, the performance of RCFCM decreased compared to the top techniques. This point will be studied further in our future work in order to improve it. It should be noted that RCFCM has the merit of obtaining these performances without resorting to filtering, unlike several of the studied techniques, which exploit filtered versions of the image in the classification process (mean and median filtering for FCMS1 and FCMS2, morphological filtering for FRFCM, and bilateral filtering for FCM_SICM). Moreover, in the majority of works, the integration of spatial information has led to the use of additional parameters that are difficult to configure. For some techniques, these parameters are set experimentally, sometimes depending on the image under study (as is the case for WRFCM). Such techniques need to be extensively tested on each image to find the appropriate parameter value; otherwise, their performance may fluctuate. In comparison, RCFCM has the advantage of being easily configurable since, apart from the standard FCM parameters, it requires only one additional, easy-to-set parameter, which is the context size.

Fig. 16

Segmentation results on real image (Fig. 12b) corrupted by salt-and-pepper noise at 10%. a Noisy image and b ground truth. From c to l: the results of FCM, FCMS1, FCMS2, EnFCM, FGFCM, FLICM, FRFCM, WRFCM, FCM_SICM, and RCFCM

Fig. 17

Segmentation results on real image (Fig. 12b) corrupted by Gaussian noise at 4%. a Noisy image and b ground truth. From c to l: the results of FCM, FCMS1, FCMS2, EnFCM, FGFCM, FLICM, FRFCM, WRFCM, FCM_SICM, and RCFCM

Fig. 18

Segmentation results on real image (Fig. 12b) corrupted by mixed noise at 4%. a Noisy image and b ground truth. From c to l: the results of FCM, FCMS1, FCMS2, EnFCM, FGFCM, FLICM, FRFCM, WRFCM, FCM_SICM, and RCFCM

Table 5 Quantitative results obtained for the image given by Fig. 12b

7 Conclusion

In this paper, a novel robust contextual clustering algorithm is proposed to address the limitations of the conventional FCM algorithm in the classification of noisy pixels in images. Contextual clustering is performed by incorporating the information coming from the contexts of pixels into the clustering process. For each pixel, the context is defined by its surrounding neighbors, delimited by a square window. The incorporation of contextual information can be made at different levels. In the proposed approach, it was made at the level of the membership degrees updating and was based on two new parameters: the contextual similarity degree and the intrinsic similarity degree. These parameters are used to estimate the type of pixel being processed. In this regard, three types of pixels were identified: noisy pixel, contour pixel, and region pixel. For each of these types, a specific strategy for updating the membership degrees was established. The reported experimental results have demonstrated the effectiveness of the approach on two levels. The first level is related to the main objective of this work, which is the improvement of the clustering quality by reducing the misclassification of noisy pixels. The second level concerns the preservation of contours and details, where the proposed approach has achieved a satisfactory tradeoff between noise removal and detail preservation. This tradeoff was made possible by the adaptive updating strategy adopted in this work.
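The two building blocks mentioned above, the square-window context and the membership degrees, can be illustrated with the following minimal sketch. It shows only the membership update of the conventional FCM algorithm and a plain neighborhood extraction. The contextual and intrinsic similarity degrees and the per-type readjustment of RCFCM are not reproduced here. Excluding the central pixel from its context and clipping the window at the image borders are assumptions made for this illustration, as are the function names.

```python
def fcm_memberships(value, centers, m=2.0):
    """Conventional FCM membership degrees of one gray value to each center."""
    dists = [abs(value - c) for c in centers]
    # A value that coincides with one or more centers belongs fully to them.
    if any(d == 0 for d in dists):
        hits = [d == 0 for d in dists]
        return [1.0 / sum(hits) if h else 0.0 for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((di / dj) ** exponent for dj in dists) for di in dists]


def square_context(image, r, c, size=3):
    """Gray values of the neighbors of pixel (r, c) in a size x size window."""
    half = size // 2
    rows, cols = len(image), len(image[0])
    return [
        image[i][j]
        for i in range(max(0, r - half), min(rows, r + half + 1))
        for j in range(max(0, c - half), min(cols, c + half + 1))
        if (i, j) != (r, c)  # assumption: the pixel itself is not in its context
    ]


image = [
    [0, 0, 0],
    [0, 255, 0],  # isolated bright pixel in a dark region
    [0, 0, 0],
]
centers = [0.0, 127.0, 255.0]
print(fcm_memberships(image[1][1], centers))  # [0.0, 0.0, 1.0]
print(square_context(image, 1, 1))            # eight neighbors, all 0
```

In this toy image, the conventional update assigns the isolated bright pixel entirely to the white cluster although all of its neighbors are black, which is the kind of situation a contextual readjustment of the membership degrees is meant to handle.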