Introduction

Breast cancer is the most common cancer and the second leading cause of cancer death among American women today. Medical imaging is essential for screening and diagnosing breast cancer, and magnetic resonance imaging holds great potential as a non-invasive tool for the detection and diagnosis of breast lesions. Conventional MRI techniques [1, 4] attempt to characterize tissues based on proton-density signal intensity (essentially, the water content), as modulated by effects of the molecular structure and the associated microscopic magnetic field environment. The signal intensity of breast lesions [6, 7] is predominantly determined by the water content and fibrous matter of the individual tissues, rather than by specific cellular characteristics. Because both benign and malignant lesions may have high water content and cellular or fibrous content, they exhibit similar signal behaviors and consequently have similar T1-weighted and T2-weighted measurements [5]. This also accounts for the wide variations in signal within the benign and malignant classes of lesions. These studies have led to the conclusion that reliable tissue characterization for the detection and diagnosis of breast lesions, based on the tissue contrast of plain MRI, is not feasible. Over the last decade attention has therefore shifted from plain MRI to contrast-enhanced MRI using paramagnetic contrast agents, typically gadolinium chelates. Dynamic contrast-enhanced MRI (DCE-MRI) [3, 10, 32] of the breast has been increasingly used in clinical practice for diagnostic imaging and post-treatment evaluation, but its specificity is still limited. DCE-MRI makes it possible to evaluate the architectural features of a breast lesion in several orientations, and it also enables the radiologist to analyze the dynamic contrast enhancement characteristics [22, 23] of the lesion.

Patient movement, partial volume effects, intensity inhomogeneity, and limitations of the imaging equipment are well-known artifacts that arise during the imaging process. It is therefore very important to segment DCE-MR images before diagnosing breast cancer, since a diagnosis based on a DCE-MR image corrupted by such artifacts can sometimes have fatal consequences [13]. Many methods are available for MR image segmentation [27, 30]. Among them, fuzzy segmentation methods [14, 17, 20, 26] are of considerable benefit, because they retain much more information from the original image than hard segmentation methods. The Fuzzy C-Means (FCM) algorithm [2, 25] assigns pixels to fuzzy clusters without labels. Unlike hard clustering methods, which force pixels to belong exclusively to one class, FCM allows pixels to belong to multiple clusters with varying degrees of membership. The main disadvantages of fuzzy clustering are that it needs a large amount of time to converge and that it is sensitive to noise and outliers in the data, because it uses the squared norm to measure the similarity between prototypes and data points.

To cluster more general datasets, many algorithms have been proposed that replace the squared norm with other similarity measures. A recent development is to use the kernel method to construct kernel versions of the FCM algorithm. Zhang and Chen [33] proposed KFCM for clustering incomplete data and for medical image segmentation. However, a disadvantage of KFCM in the segmentation of medical images is that it does not consider any spatial information in the image context, which makes it very sensitive to noise and other imaging artifacts. Hence researchers have incorporated local spatial information [15, 16, 28, 29] into the conventional FCM (yielding FCM_S) and KFCM [8] algorithms to improve the performance of image segmentation [24, 25]. One disadvantage of FCM_S is that it computes the neighborhood term at each step, which is very time-consuming. To reduce the computation time, Chen and Zhang [31, 34] proposed a modified KFCM_S by adding a spatial penalty term; however, KFCM_S still computes the neighborhood term at every iteration, which remains time-consuming.

In this paper, we propose three new robust algorithms for DCE-breast MR image segmentation based on the concepts of KFCM, tolerance [5], an additional penalty term, and entropy [11, 18, 19]. The tolerance vector [9, 21] improves the similarity between the data points and the cluster centers in the proposed algorithms. The proposed objective functions, with the penalty term, the extended additional penalty term, and the additional entropy term, are developed for effective image segmentation, robustness to noise and outliers, desirable memberships, and improved similarity measurement.

To reduce the number of iterations, these algorithms select the initial centers using the dist-max initialization method, and they incorporate spatial information. The experimental results show that the proposed algorithms are effective and more robust to noise and outliers. Furthermore, the experimental results suggest which algorithm is best for segmenting DCE-breast MR images.

The structure of the paper is as follows: “Traditional KFCM” describes the traditional KFCM algorithm. The proposed new fuzzy clustering algorithms, which alleviate the drawbacks of the existing algorithms, are presented in “Robust KFCM with spatial information (RKFCM_S)”. The experimental results are described and analyzed in “Results and discussions”, and conclusions are drawn in “Conclusion” based on the preceding discussion.

Traditional KFCM

Given a dataset \( X = \left\{ {{x_1},{x_2}, \ldots, {x_n}} \right\} \subset {R^p} \) , the basic FCM algorithm partitions the dataset X into c fuzzy subsets by minimizing the following objective function

$$ {J_{fcm}}\left( {U,V} \right) = \sum\limits_{i = 1}^n {\sum\limits_{k = 1}^c {u_{ik}^m{{\left\| {{x_i} - {v_k}} \right\|}^2}} } $$
(1)

Here c denotes the number of clusters and n the number of data points.

U :

represents the matrix of memberships \( u_{ik} \), where \( u_{ik} \) is the membership of \( x_i \) in class k.

V :

represents the set of cluster centers or prototypes (\( v_k \in {R^p} \)).

The parameter m is a weighting exponent on each fuzzy membership and determines the amount of fuzziness of the resulting classification.

The objective function J is minimized by a well-known alternating iterative algorithm subject to the constraint

$$ \sum\limits_{k = 1}^c {{u_{ik}}} = 1\;\forall i $$
(2)

Define a nonlinear map \( \varphi :x \to \varphi (x) \in F \), where \( x \in X \). Here X denotes the data space and F the transformed feature space, which may be of higher or even infinite dimension. KFCM minimizes the following objective function

$$ {J_{kfcm}}\left( {U,V} \right) = \sum\limits_{i = 1}^n {\sum\limits_{k = 1}^c {u_{ik}^m{{\left\| {\varphi \left( {{x_i}} \right) - \varphi \left( {{v_k}} \right)} \right\|}^2}} } $$
(3)

where

$$ {\left\| {\varphi \left( {{x_i}} \right) - \varphi \left( {{v_k}} \right)} \right\|^2} = K\left( {{x_i},{x_i}} \right) + K\left( {{v_k},{v_k}} \right) - 2K\left( {{x_i},{v_k}} \right) $$
(4)

where \( K\left( {x,y} \right) = \varphi {(x)^T}\varphi (y) \) is an inner-product kernel function. If we adopt the Gaussian function as the kernel, i.e., \( K\left( {x,y} \right) = \exp \left( { - \frac{{{{\left\| {x - y} \right\|}^2}}}{{{\sigma^2}}}} \right) \), then K(x, x) = 1 and, according to Eq. 4, Eq. 3 can be rewritten as

$$ {J_{kfcm}}\left( {U,V} \right) = \sum\limits_{i = 1}^n {\sum\limits_{k = 1}^c {u_{ik}^m\left( {1 - K\left( {{x_i},{v_k}} \right)} \right)} } $$
(5)

Minimizing Eq. 5 under the constraint (2) on \( u_{ik} \), we have

$$ {u_{ik}} = \frac{{{{\left( {\frac{1}{{1 - K\left( {{x_i},{v_k}} \right)}}} \right)}^{\frac{1}{{m - 1}}}}}}{{\sum\limits_{j = 1}^c {{{\left( {\frac{1}{{1 - K\left( {{x_i},{v_j}} \right)}}} \right)}^{\frac{1}{{m - 1}}}}} }} $$
(6)
$$ {v_k} = \frac{{\sum\limits_{i = 1}^n {u_{ik}^mK\left( {{x_i},{v_k}} \right){x_i}} }}{{\sum\limits_{i = 1}^n {u_{ik}^mK\left( {{x_i},{v_k}} \right)} }} $$
(7)

Here we just use the Gaussian kernel function for simplicity. If we use other kernel functions, there will be corresponding modifications in Eqs. 6 and 7.
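To make the alternating updates concrete, a minimal R sketch of one KFCM iteration with the Gaussian kernel is given below (the experiments in “Results and discussions” are coded in R). The function names, the data matrix X (n × p), the center matrix V (c × p), and the parameters m and sigma are our own illustrative choices, not code from the original implementation.

```r
# Gaussian kernel matrix: K[i, k] = exp(-||x_i - v_k||^2 / sigma^2)
kernel_matrix <- function(X, V, sigma) {
  K <- matrix(0, nrow(X), nrow(V))
  for (k in 1:nrow(V)) {
    diff <- sweep(X, 2, V[k, ])            # x_i - v_k for every i
    K[, k] <- exp(-rowSums(diff^2) / sigma^2)
  }
  K
}

# Membership update, Eq. 6 (assumes K < 1, i.e. no x_i coincides with a center)
update_U <- function(K, m) {
  W <- (1 / (1 - K))^(1 / (m - 1))
  W / rowSums(W)                           # enforce constraint (2)
}

# Center update, Eq. 7: kernel-weighted means of the data
update_V <- function(X, U, K, m) {
  Wm <- U^m * K                            # u_ik^m K(x_i, v_k)
  (t(Wm) %*% X) / colSums(Wm)
}
```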

Robust KFCM with spatial information (RKFCM_S)

Initialization

FCM is a local search optimization algorithm that is very sensitive to the initial centers: it easily gets trapped in a local optimum if the initial centers are chosen at random. To avoid blind random initialization and to make the initial centers approach the globally optimal solution, we propose the following initialization method for our algorithms.

  1. Stage 1:

    Let \( X = \left\{ {{x_1},{x_2}, \ldots, {x_n}} \right\} \subset {R^p} \) be a p-dimensional data set. Find \( m_1, m_2, \ldots, m_n \), where \( {m_i} = \frac{{{x_{i1}} + {x_{i2}} + \cdots + {x_{ip}}}}{p} \), i = 1, 2, ..., n. Arrange the \( m_i \) in ascending order.

  2. Stage 2:

    Rearrange the data matrix according to the sorted mean values, i.e., \( X' = \left[ {x{'_1},x{'_2}, \ldots, x{'_n}} \right] \), and partition the data into c groups.

Find \( s = \frac{n}{c} \), where s is the number of elements in each group. The number of clusters c is specified according to the nature of the dataset.

  1. Case 1:

    Suppose s is an integer, then s elements exist in each cluster.

  2. Case 2:

    Suppose s is not an integer, and write s = q.d, where q is the integer part and d the fractional part. If d < 0.5, then s is rounded down to q; if d ≥ 0.5, then s is rounded up to q + 1.

The first group contains the first s data points of X′, the second group the next s data points, and so on; the (c−1)th group contains the (c−1)th block of s data points, and the cth group contains all the remaining elements.

  1. Stage 3:

    Construct a distance table for each group showing the pairwise distances between the elements within that group, i.e., if group \( k = \left[ {x_1^k,x_2^k, \ldots, x_s^k} \right] \), the distance table is

$$ \begin{array}{c|cccc} & x_1^k & x_2^k & \cdots & x_s^k \\ \hline x_1^k & d_{11}^k & d_{12}^k & \cdots & d_{1s}^k \\ x_2^k & d_{21}^k & d_{22}^k & \cdots & d_{2s}^k \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ x_s^k & d_{s1}^k & d_{s2}^k & \cdots & d_{ss}^k \end{array} $$

  1. Stage 4:

    Select the maximum distance from the distance table of each group. If \( d_{ij}^k \) is the maximum distance of the kth group, find the mean value \( M_k \) of the elements \( x_i^k \) and \( x_j^k \) and take \( M_k \) as the kth cluster center, k = 1, 2, ..., c.
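The four stages can be condensed into a short R function. This is an illustrative sketch under our reading of the procedure: Case 2 is implemented as round-half-up, the last group absorbs any remainder, and the sketch assumes n is large enough relative to c that every group is non-empty.

```r
# Dist-max initialization: returns a c x p matrix of initial centers
distmax_init <- function(X, c) {
  n  <- nrow(X)
  Xs <- X[order(rowMeans(X)), , drop = FALSE]   # Stages 1-2: sort by row mean
  s  <- floor(n / c + 0.5)                      # Case 2: round half up
  V  <- matrix(0, c, ncol(X))
  for (k in 1:c) {
    from <- (k - 1) * s + 1
    to   <- if (k < c) k * s else n             # cth group keeps the remainder
    G    <- Xs[from:to, , drop = FALSE]
    D    <- as.matrix(dist(G))                  # Stage 3: distance table
    ij   <- which(D == max(D), arr.ind = TRUE)[1, ]
    V[k, ] <- (G[ij[1], ] + G[ij[2], ]) / 2     # Stage 4: mean of farthest pair
  }
  V
}
```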

Objective function of robust KFCM with spatial information (RKFCM_S)

Although KFCM can be directly applied to image segmentation like FCM, it is helpful to impose some spatial constraints on the objective function. We propose a modification of Eq. 5 that introduces penalty terms containing spatial neighborhood information. To avoid poor results when dealing with data corrupted by noise and other artifacts, an additional penalty term is also considered.

The additional penalty term of the proposed objective function is obtained by considering the neighborhood attraction of each pixel. These penalty terms act as regularizers and bias the solution toward piecewise-homogeneous labeling. Such regularization is helpful in segmenting images corrupted by Gaussian noise. The modified objective function is given by

$$ {J_{rkfcm\_ s}}\left( {U,V} \right) = \sum\limits_{i = 1}^n {\sum\limits_{k = 1}^c {u_{ik}^m\left( {1 - K\left( {{x_i},{v_k}} \right)} \right)} } + {\gamma_1}\sum\limits_{i = 1}^n {\sum\limits_{k = 1}^c {u_{ik}^m\left( {1 - K\left( {{{\overline x }_i},{v_k}} \right)} \right)} } + {\gamma_2}\sum\limits_{i = 1}^n {\sum\limits_{k = 1}^c {u_{ik}^m\left( {1 - K\left( {{{\tilde{x}}_i},{v_k}} \right)} \right)} } $$
(8)

where \( {\overline x_i} \) and \( {\tilde{x}_i} \) are, respectively, the mean and median of the neighboring pixels lying within a window around each pixel \( x_i \) of the given image, and γ 1, γ 2 > 0. The parameters γ 1 and γ 2 control the effect of the neighborhood terms so that each pixel obtains desirable memberships; together with the neighborhood terms, they diminish the effect of noise on a pixel. Hence the proposed objective function with the penalty terms is effective for image segmentation, robust to noise and outliers, and computationally inexpensive.
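Note that \( {\overline x_i} \) and \( {\tilde{x}_i} \) depend only on the input image, so they can be precomputed once before the iterations begin; this is what keeps the penalty terms inexpensive compared with recomputing a neighborhood term at every step, as KFCM_S does. A sketch in R, assuming a w × w window (3 × 3 by default) that is truncated at the image borders:

```r
# Precompute the mean image (x-bar) and median image (x-tilde) of Eq. 8
neighborhood_images <- function(img, w = 3) {
  r <- nrow(img); cl <- ncol(img); h <- w %/% 2
  xbar <- xmed <- img
  for (i in 1:r) for (j in 1:cl) {
    win <- img[max(1, i - h):min(r, i + h), max(1, j - h):min(cl, j + h)]
    xbar[i, j] <- mean(win)                  # mean of the neighborhood
    xmed[i, j] <- median(win)                # median of the neighborhood
  }
  list(mean = xbar, median = xmed)
}
```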

Membership value evaluation

The objective function (8) is minimized subject to the constraint (2) using the method of Lagrange multipliers.

$$ {L_{rkfcm\_ s}}\left( {U,V,\lambda } \right) = \sum\limits_{i = 1}^n {\sum\limits_{k = 1}^c {u_{ik}^m\left( {1 - K\left( {{x_i},{v_k}} \right)} \right)} } + {\gamma_1}\sum\limits_{i = 1}^n {\sum\limits_{k = 1}^c {u_{ik}^m\left( {1 - K\left( {{{\overline x }_i},{v_k}} \right)} \right)} } + {\gamma_2}\sum\limits_{i = 1}^n {\sum\limits_{k = 1}^c {u_{ik}^m\left( {1 - K\left( {{{\tilde{x}}_i},{v_k}} \right)} \right)} } - \sum\limits_{i = 1}^n {{\lambda_i}} \left( {\sum\limits_{k = 1}^c {{u_{ik}}} - 1} \right) $$
(9)

Here \( \lambda = \left( {{\lambda_1},{\lambda_2}, \ldots, {\lambda_n}} \right) \) denotes the Lagrange multipliers.

Taking the derivative of (9) with respect to u ik and setting the result to zero, we have, for m > 1,

$$ \frac{{\partial {L_{rkfcm\_ s}}}}{{\partial {u_{ik}}}} = mu_{ik}^{m - 1}\left( {1 - K\left( {{x_i},{v_k}} \right)} \right) + {\gamma_1}mu_{ik}^{m - 1}\left( {1 - K\left( {{{\overline x }_i},{v_k}} \right)} \right) + {\gamma_2}mu_{ik}^{m - 1}\left( {1 - K\left( {{{\tilde{x}}_i},{v_k}} \right)} \right) - {\lambda_i} = 0 $$
(10)

Solving for u ik , we have

$$ {u_{ik}} = {\left( {\frac{{{\lambda_i}}}{{m\left( {\left( {1 - K\left( {{x_i},{v_k}} \right)} \right) + {\gamma_1}\left( {1 - K\left( {{{\overline x }_i},{v_k}} \right)} \right) + {\gamma_2}\left( {1 - K\left( {{{\tilde{x}}_i},{v_k}} \right)} \right)} \right)}}} \right)^{1/(m - 1)}} $$
(11)

Since \( \sum\limits_{j = 1}^c {{u_{ij}}} = 1\,\forall i \),

$$ \sum\limits_{j = 1}^c {{{\left( {\frac{{{\lambda_i}}}{{m\left( {\left( {1 - K\left( {{x_i},{v_j}} \right)} \right) + {\gamma_1}\left( {1 - K\left( {{{\overline x }_i},{v_j}} \right)} \right) + {\gamma_2}\left( {1 - K\left( {{{\tilde{x}}_i},{v_j}} \right)} \right)} \right)}}} \right)}^{1/(m - 1)}}} = 1 $$
(12)

Or

$$ {\left( {{\lambda_i}/m} \right)^{1/(m - 1)}} = {\left[ {\sum\limits_{j = 1}^c {{{\left( {\frac{1}{{\left( {1 - K\left( {{x_i},{v_j}} \right)} \right) + {\gamma_1}\left( {1 - K\left( {{{\overline x }_i},{v_j}} \right)} \right) + {\gamma_2}\left( {1 - K\left( {{{\tilde{x}}_i},{v_j}} \right)} \right)}}} \right)}^{1/(m - 1)}}} } \right]^{ - 1}} $$
(13)

Substituting into (11), the zero-gradient condition for the membership estimator can be rewritten as

$$ {u_{ik}} = \frac{1}{{\sum\limits_{j = 1}^c {{{\left( {\frac{{\left( {1 - K\left( {{x_i},{v_k}} \right)} \right) + {\gamma_1}\left( {1 - K\left( {{{\overline x }_i},{v_k}} \right)} \right) + {\gamma_2}\left( {1 - K\left( {{{\tilde{x}}_i},{v_k}} \right)} \right)}}{{\left( {1 - K\left( {{x_i},{v_j}} \right)} \right) + {\gamma_1}\left( {1 - K\left( {{{\overline x }_i},{v_j}} \right)} \right) + {\gamma_2}\left( {1 - K\left( {{{\tilde{x}}_i},{v_j}} \right)} \right)}}} \right)}^{1/(m - 1)}}} }} $$
(14)
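Update (14) generalizes (6) by replacing the single kernel distance with the penalized combination. A vectorized R sketch (ours), where Kx, Kbar, and Kmed are the kernel matrices of the original, mean-filtered, and median-filtered pixel values against the current centers, each computed with kernel_matrix from “Traditional KFCM”:

```r
# Membership update for RKFCM_S, Eq. 14
update_U_rkfcm_s <- function(Kx, Kbar, Kmed, g1, g2, m) {
  D <- (1 - Kx) + g1 * (1 - Kbar) + g2 * (1 - Kmed)   # combined distance
  W <- (1 / D)^(1 / (m - 1))
  W / rowSums(W)                                      # constraint (2)
}
```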

Updating cluster center

The objective function (8) can be rewritten as

$$ {J_{rkfcm\_ s}}\left( {U,V} \right) = \sum\limits_{i = 1}^n {\sum\limits_{k = 1}^c {u_{ik}^m\left( {1 - \exp \left( { - {{\left\| {{x_i} - {v_k}} \right\|}^2}/{\sigma^2}} \right)} \right)} } + {\gamma_1}\sum\limits_{i = 1}^n {\sum\limits_{k = 1}^c {u_{ik}^m\left( {1 - \exp \left( { - {{\left\| {{{\overline x }_i} - {v_k}} \right\|}^2}/{\sigma^2}} \right)} \right)} } + {\gamma_2}\sum\limits_{i = 1}^n {\sum\limits_{k = 1}^c {u_{ik}^m\left( {1 - \exp \left( { - {{\left\| {{{\tilde{x}}_i} - {v_k}} \right\|}^2}/{\sigma^2}} \right)} \right)} } $$
(15)

Taking the derivative of (15) with respect to v k and setting the result to zero, we have

$$ \left[ {\sum\limits_{i = 1}^n {u_{ik}^mK\left( {{x_i},{v_k}} \right){x_i} + {\gamma_1}\sum\limits_{i = 1}^n {u_{ik}^mK\left( {{{\overline x }_i},{v_k}} \right){{\overline x }_i} + {\gamma_2}\sum\limits_{i = 1}^n {u_{ik}^mK\left( {{{\tilde{x}}_i},{v_k}} \right){{\tilde{x}}_i}} } } } \right] - {v_k}\left[ {\sum\limits_{i = 1}^n {u_{ik}^m\left( {K\left( {{x_i},{v_k}} \right) + {\gamma_1}K\left( {{{\overline x }_i},{v_k}} \right) + {\gamma_2}K\left( {{{\tilde{x}}_i},{v_k}} \right)} \right)} } \right] = 0 $$
(16)

Solving for v k , we have

$$ v_k^{t + 1} = \frac{{\sum\limits_{i = 1}^n {u_{ik}^m\left( {K\left( {{x_i},v_k^t} \right){x_i} + {\gamma_1}K\left( {{{\bar{x}}_i},v_k^t} \right){{\bar{x}}_i} + {\gamma_2}K\left( {{{\tilde{x}}_i},v_k^t} \right){{\tilde{x}}_i}} \right)} }}{{\sum\limits_{i = 1}^n {u_{ik}^m\left( {K\left( {{x_i},v_k^t} \right) + {\gamma_1}K\left( {{{\bar{x}}_i},v_k^t} \right) + {\gamma_2}K\left( {{{\tilde{x}}_i},v_k^t} \right)} \right)} }} $$
(17)

where t is the iteration count; at t = 0 the initial centers are used.
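Update (17) has the same structure as (7), with the mean and median terms contributing to both numerator and denominator. An illustrative R sketch, where x, xbar, and xmed are the n × p matrices of original, mean, and median values:

```r
# Center update for RKFCM_S, Eq. 17
update_V_rkfcm_s <- function(x, xbar, xmed, U, Kx, Kbar, Kmed, g1, g2, m) {
  Um  <- U^m
  num <- t(Um * Kx) %*% x +
         g1 * (t(Um * Kbar) %*% xbar) +
         g2 * (t(Um * Kmed) %*% xmed)
  den <- colSums(Um * (Kx + g1 * Kbar + g2 * Kmed))
  num / den                                 # row k of num divided by den[k]
}
```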

RKFCM_S algorithm

The RKFCM_S algorithm for segmenting breast MR images into different regions can be summarized in the following steps.

  1. Stage 1:

    Select initial cluster centers \( \left\{ {{v_k}} \right\}_1^c \) by using the dist-max initialization method.

  2. Stage 2:

    Compute the partition matrix using (14).

  3. Stage 3:

    Update the centers of the clusters using (17).

  4. Stage 4:

    Estimate objective function using (8).

  5. Stage 5:

    Repeat Stages 2–4 until termination. The termination criterion is as follows:

$$ \left\| {{J_t} - {J_{t - 1}}} \right\| < \xi $$
(18)

where t is the iteration count, \( \left\| \cdot \right\| \) is the Euclidean norm, J is the objective function, and ξ is a small threshold that can be set by the user.
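Putting the stages together, the whole RKFCM_S procedure can be sketched in R as below. This is our illustrative assembly of the pieces defined earlier; γ 1 = 0.7, γ 2 = 0.9, and ξ = 0.001 are the values used in “Results and discussions”, while sigma and the window size are arbitrary placeholders.

```r
# RKFCM_S on a gray-level image matrix; returns memberships, centers, labels
rkfcm_s <- function(img, c, m = 2, sigma = 150, g1 = 0.7, g2 = 0.9,
                    xi = 0.001, max_iter = 100) {
  x    <- matrix(as.vector(img), ncol = 1)            # pixels as 1-D features
  nb   <- neighborhood_images(img)
  xbar <- matrix(as.vector(nb$mean),   ncol = 1)
  xmed <- matrix(as.vector(nb$median), ncol = 1)
  V <- distmax_init(x, c)                             # Stage 1
  J_old <- Inf
  for (it in 1:max_iter) {
    Kx   <- kernel_matrix(x,    V, sigma)
    Kbar <- kernel_matrix(xbar, V, sigma)
    Kmed <- kernel_matrix(xmed, V, sigma)
    U <- update_U_rkfcm_s(Kx, Kbar, Kmed, g1, g2, m)  # Stage 2, Eq. 14
    V <- update_V_rkfcm_s(x, xbar, xmed, U,
                          Kx, Kbar, Kmed, g1, g2, m)  # Stage 3, Eq. 17
    D <- (1 - Kx) + g1 * (1 - Kbar) + g2 * (1 - Kmed)
    J <- sum(U^m * D)                                 # Stage 4, Eq. 8
    if (abs(J - J_old) < xi) break                    # Stage 5, Eq. 18
    J_old <- J
  }
  list(U = U, V = V, labels = max.col(U))
}
```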

Improved RKFCM_S with tolerance (IRKFCM_ST)

This section proposes an improved RKFCM_S (IRKFCM_ST) that improves the similarity measurement between the pixel intensities and the cluster centers by considering neighborhood attraction. The improved version of (8) contains a new distance with tolerance ε in the neighborhood attraction, and is given by

$$ {J_{irkfcm\_ st}}\left( {U,V} \right) = \sum\limits_{i = 1}^n {\sum\limits_{k = 1}^c {u_{ik}^m\left( {1 - K\left( {{x_i} + {\varepsilon_i},{v_k}} \right)} \right)} } + {\gamma_1}\sum\limits_{i = 1}^n {\sum\limits_{k = 1}^c {u_{ik}^m\left( {1 - K\left( {\overline {{x_i} + {\varepsilon_i}}, {v_k}} \right)} \right)} } + {\gamma_2}\sum\limits_{i = 1}^n {\sum\limits_{k = 1}^c {u_{ik}^m\left( {1 - K\left( {\overline{\overline {{x_i} + {\varepsilon_i}}}, {v_k}} \right)} \right)} } $$
(19)

where \( \overline {{x_i} + {\varepsilon_i}} \) and \( \overline{\overline {{x_i} + {\varepsilon_i}}} \) are, respectively, the mean and median of the neighboring pixels lying within a window around \( x_i \) with tolerance term \( \varepsilon_i \). The parameters γ 1, γ 2 and m are the same as in RKFCM_S. The tolerance term \( \varepsilon_i \) corresponds to a bound on the error of the data element \( x_i \); that is, the tolerance of \( x_i \) has the upper bound \( \kappa_i \).

The above objective function satisfies the conditions (2) and

$$ {\left\| {{\varepsilon_i}} \right\|^2} \leqslant {\left\| {{\kappa_i}} \right\|^2},\,\left( {{\kappa_i} > 0} \right) $$
(20)

Membership value evaluation

The objective function (19) is minimized subject to the constraints (2) and (20) by the Karush–Kuhn–Tucker method. Taking the first derivatives of (19) with respect to \( u_{ik} \) and \( v_k \) and setting them to zero yields two necessary, but not sufficient, conditions for (19) to attain a local extremum, as follows.

The Lagrangian function of (19) is

$$ {L_{rkfcm\_ st}}\left( {U,V,\lambda } \right) = \sum\limits_{i = 1}^n {\sum\limits_{k = 1}^c {u_{ik}^m\left( {1 - K\left( {{x_i} + {\varepsilon_i},{v_k}} \right)} \right)} } + {\gamma_1}\sum\limits_{i = 1}^n {\sum\limits_{k = 1}^c {u_{ik}^m\left( {1 - K\left( {\overline {{x_i} + {\varepsilon_i}}, {v_k}} \right)} \right)} } + {\gamma_2}\sum\limits_{i = 1}^n {\sum\limits_{k = 1}^c {u_{ik}^m\left( {1 - K\left( {\overline{\overline {{x_i} + {\varepsilon_i}}}, {v_k}} \right)} \right)} } - \sum\limits_{i = 1}^n {{\lambda_i}} \left( {\sum\limits_{k = 1}^c {{u_{ik}}} - 1} \right) - \sum\limits_{i = 1}^n {{\delta_i}} \left( {{{\left\| {{\varepsilon_i}} \right\|}^2} - {{\left\| {{\kappa_i}} \right\|}^2}} \right) $$
(21)

Taking the derivative of (21) with respect to u ik and setting the result to zero, we have, for m > 1,

$$ \frac{{\partial L}}{{\partial {u_{ik}}}} = mu_{ik}^{m - 1}\left( {1 - K\left( {{x_i} + {\varepsilon_i},{v_k}} \right)} \right) + {\gamma_1}mu_{ik}^{m - 1}\left( {1 - K\left( {\overline {{x_i} + {\varepsilon_i}}, {v_k}} \right)} \right) + {\gamma_2}mu_{ik}^{m - 1}\left( {1 - K\left( {\overline{\overline {{x_i} + {\varepsilon_i}}}, {v_k}} \right)} \right) - {\lambda_i} = 0 $$

Solving for u ik , we have

$$ {u_{ik}} = {\left( {\frac{{{\lambda_i}}}{{m\left( {\left( {1 - K\left( {{x_i} + {\varepsilon_i},{v_k}} \right)} \right) + {\gamma_1}\left( {1 - K\left( {\overline {{x_i} + {\varepsilon_i}}, {v_k}} \right)} \right) + {\gamma_2}\left( {1 - K\left( {\overline{\overline {{x_i} + {\varepsilon_i}}}, {v_k}} \right)} \right)} \right)}}} \right)^{1/(m - 1)}} $$
(22)

Since \( \sum\limits_{j = 1}^c {{u_{ij}}} = 1\,\forall i \),

$$ \sum\limits_{j = 1}^c {{{\left( {\frac{{{\lambda_i}}}{{m\left( {\left( {1 - K\left( {{x_i} + {\varepsilon_i},{v_j}} \right)} \right) + {\gamma_1}\left( {1 - K\left( {\overline {{x_i} + {\varepsilon_i}}, {v_j}} \right)} \right) + {\gamma_2}\left( {1 - K\left( {\overline{\overline {{x_i} + {\varepsilon_i}}}, {v_j}} \right)} \right)} \right)}}} \right)}^{1/(m - 1)}}} = 1 $$

Or

$$ {\left( {{\lambda_i}/m} \right)^{1/(m - 1)}} = {\left[ {\sum\limits_{j = 1}^c {{{\left( {\frac{1}{{\left( {1 - K\left( {{x_i} + {\varepsilon_i},{v_j}} \right)} \right) + {\gamma_1}\left( {1 - K\left( {\overline {{x_i} + {\varepsilon_i}}, {v_j}} \right)} \right) + {\gamma_2}\left( {1 - K\left( {\overline{\overline {{x_i} + {\varepsilon_i}}}, {v_j}} \right)} \right)}}} \right)}^{1/(m - 1)}}} } \right]^{ - 1}} $$

Substituting into (22), the zero-gradient condition for the membership estimator is expressed as

$$ {u_{ik}} = \frac{1}{{\sum\limits_{j = 1}^c {{{\left( {\frac{{\left( {1 - K\left( {{x_i} + {\varepsilon_i},{v_k}} \right)} \right) + {\gamma_1}\left( {1 - K\left( {\overline {{x_i} + {\varepsilon_i}}, {v_k}} \right)} \right) + {\gamma_2}\left( {1 - K\left( {\overline{\overline {{x_i} + {\varepsilon_i}}}, {v_k}} \right)} \right)}}{{\left( {1 - K\left( {{x_i} + {\varepsilon_i},{v_j}} \right)} \right) + {\gamma_1}\left( {1 - K\left( {\overline {{x_i} + {\varepsilon_i}}, {v_j}} \right)} \right) + {\gamma_2}\left( {1 - K\left( {\overline{\overline {{x_i} + {\varepsilon_i}}}, {v_j}} \right)} \right)}}} \right)}^{1/(m - 1)}}} }} $$
(23)

Updating cluster center

The objective function (19) can be rewritten as

$$ {J_{rkfcm\_ st}}\left( {U,V} \right) = \sum\limits_{i = 1}^n {\sum\limits_{k = 1}^c {u_{ik}^m\left( {1 - \exp \left( { - {{\left\| {{x_i} + {\varepsilon_i} - {v_k}} \right\|}^2}/{\sigma^2}} \right)} \right)} } + {\gamma_1}\sum\limits_{i = 1}^n {\sum\limits_{k = 1}^c {u_{ik}^m\left( {1 - \exp \left( { - {{\left\| {\overline {{x_i} + {\varepsilon_i}} - {v_k}} \right\|}^2}/{\sigma^2}} \right)} \right)} } + {\gamma_2}\sum\limits_{i = 1}^n {\sum\limits_{k = 1}^c {u_{ik}^m\left( {1 - \exp \left( { - {{\left\| {\overline{\overline {{x_i} + {\varepsilon_i}}} - {v_k}} \right\|}^2}/{\sigma^2}} \right)} \right)} } $$
(24)

Taking the derivative of (24) with respect to v k and setting the result to zero, we have

$$ \left[ {\sum\limits_{i = 1}^n {u_{ik}^mK\left( {{x_i} + {\varepsilon_i},{v_k}} \right)\left( {{x_i} + {\varepsilon_i}} \right) + {\gamma_1}\sum\limits_{i = 1}^n {u_{ik}^mK\left( {\overline {{x_i} + {\varepsilon_i}}, {v_k}} \right)\overline {{x_i} + {\varepsilon_i}} + {\gamma_2}\sum\limits_{i = 1}^n {u_{ik}^mK\left( {\overline{\overline {{x_i} + {\varepsilon_i}}}, {v_k}} \right)\overline{\overline {{x_i} + {\varepsilon_i}}} } } } } \right] - {v_k}\left[ {\sum\limits_{i = 1}^n {u_{ik}^m\left( {K\left( {{x_i} + {\varepsilon_i},{v_k}} \right) + {\gamma_1}K\left( {\overline {{x_i} + {\varepsilon_i}}, {v_k}} \right) + {\gamma_2}K\left( {\overline{\overline {{x_i} + {\varepsilon_i}}}, {v_k}} \right)} \right)} } \right] = 0 $$
(25)

Solving for v k , we have

$$ v_k^{t + 1} = \frac{{\sum\limits_{i = 1}^n {u_{ik}^m\left( {K\left( {{x_i} + {\varepsilon_i},v_k^t} \right)\left( {{x_i} + {\varepsilon_i}} \right) + {\gamma_1}K\left( {\overline {{x_i} + {\varepsilon_i}}, v_k^t} \right)\overline {{x_i} + {\varepsilon_i}} + {\gamma_2}K\left( {\overline{\overline {{x_i} + {\varepsilon_i}}}, v_k^t} \right)\overline{\overline {{x_i} + {\varepsilon_i}}} } \right)} }}{{\sum\limits_{i = 1}^n {u_{ik}^m\left( {K\left( {{x_i} + {\varepsilon_i},v_k^t} \right) + {\gamma_1}K\left( {\overline {{x_i} + {\varepsilon_i}}, v_k^t} \right) + {\gamma_2}K\left( {\overline{\overline {{x_i} + {\varepsilon_i}}}, v_k^t} \right)} \right)} }} $$
(26)

where t is the iteration count; at t = 0 the initial centers are used.

Tolerance evaluation

Taking the derivative of (21) with respect to \( {\varepsilon_i} \) under the constraint (20) and setting the result to zero, we get

$$ {\varepsilon_i} = - {\alpha_i}\sum\limits_{k = 1}^c {u_{ik}^m\left( {{x_i} - {v_k}} \right)} $$
(27)

where \( {\alpha_i} = \min \left\{ {{\kappa_i}{{\left\| {\sum\limits_{k = 1}^c {u_{ik}^m\left( {{x_i} - {v_k}} \right)} } \right\|}^{ - 1}},{{\left( {\sum\limits_{k = 1}^c {u_{ik}^m} } \right)}^{ - 1}}} \right\} \)
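In an implementation, (27) is evaluated once per iteration, between the membership and center updates. A vectorized R sketch under our reading of (27), with x an n × p data matrix, V the c × p centers, U the memberships, and kappa the vector of per-datum bounds \( \kappa_i \):

```r
# Tolerance update, Eq. 27: eps_i = -alpha_i * sum_k u_ik^m (x_i - v_k)
update_eps <- function(x, V, U, m, kappa) {
  Um <- U^m
  g  <- rowSums(Um) * x - Um %*% V                  # sum_k u_ik^m (x_i - v_k)
  gnorm <- sqrt(rowSums(g^2))
  alpha <- pmin(kappa / pmax(gnorm, .Machine$double.eps),  # avoid divide by 0
                1 / rowSums(Um))
  -alpha * g                                        # ||eps_i|| <= kappa_i holds
}
```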

Improved RKFCM_S with tolerance algorithm (IRKFCM_ST)

The IRKFCM_ST algorithm for segmenting breast MR images into different regions can be summarized as follows:

  1. Stage 1:

    Select initial cluster centers \( \left\{ {{v_k}} \right\}_1^c \) by using the dist-max initialization method, and give values for \( \kappa_i \) and \( \varepsilon_i \).

  2. Stage 2:

    Calculate the partition matrix using (23).

  3. Stage 3:

    Update the centers of the clusters using (26).

  4. Stage 4:

    Calculate the tolerance value using (27).

  5. Stage 5:

    Estimate the objective function using (19).

  6. Stage 6:

    Repeat Stages 2–5 until termination. The termination criterion is as follows:

$$ \left\| {{J_t} - {J_{t - 1}}} \right\| < \xi $$
(28)

where t is the iteration count, \( \left\| \cdot \right\| \) is the Euclidean norm, J is the objective function, and ξ is a small threshold that can be set by the user.

Novel RKFCM_S with entropy term (NRKFCM_SE)

This subsection derives a novel RKFCM_S from Eq. 8 with an additional entropy term, so that both local spatial contextual information and feature-space information are incorporated into the image segmentation. The additional entropy term is included to handle large amounts of noise and to ensure effective fuzzification. The parameters γ 1 and γ 2 are included in the objective function to control the degree of the memberships and centers.

The proposed objective function of Novel RKFCM_S with Entropy term method is

$$ {J_{nrkfcm\_ se}}\left( {U,V} \right) = \sum\limits_{i = 1}^n {\sum\limits_{k = 1}^c {{u_{ik}}\left( {1 - K\left( {{x_i},{v_k}} \right)} \right)} } + {\gamma_1}\sum\limits_{i = 1}^n {\sum\limits_{k = 1}^c {{u_{ik}}\left( {1 - K\left( {{{\overline x }_i},{v_k}} \right)} \right)} } + {\gamma_2}\sum\limits_{i = 1}^n {\sum\limits_{k = 1}^c {{u_{ik}}\left( {1 - K\left( {{{\tilde{x}}_i},{v_k}} \right)} \right)} } + {\beta^{ - 1}}\sum\limits_{i = 1}^n {\sum\limits_{k = 1}^c {{u_{ik}}\log {u_{ik}}} } $$
(29)

where \( {\overline x_i} \) and \( {\tilde{x}_i} \) are, respectively, the mean and median of the neighboring pixels lying within a window around \( x_i \). As β approaches ∞, the above objective function gives almost the same effect as RKFCM_S with m = 1, whereas small values of β make the distribution of the memberships more uniform.

Membership value evaluation

The objective function (29) is minimized under the constraint (2) using the method of Lagrange multipliers. To obtain the membership and cluster-center update equations, the first derivatives of (29) with respect to \( u_{ik} \) and \( v_k \) are set to zero, yielding two necessary, but not sufficient, conditions for (29) to attain a local extremum, as follows.

The Lagrangian function of (29) is

$$ {L_{nrkfcm\_ se}}\left( {U,V,\lambda } \right) = \sum\limits_{i = 1}^n {\sum\limits_{k = 1}^c {{u_{ik}}\left( {1 - K\left( {{x_i},{v_k}} \right)} \right)} } + {\gamma_1}\sum\limits_{i = 1}^n {\sum\limits_{k = 1}^c {{u_{ik}}\left( {1 - K\left( {{{\overline x }_i},{v_k}} \right)} \right)} } + {\gamma_2}\sum\limits_{i = 1}^n {\sum\limits_{k = 1}^c {{u_{ik}}\left( {1 - K\left( {{{\tilde{x}}_i},{v_k}} \right)} \right)} } + {\beta^{ - 1}}\sum\limits_{i = 1}^n {\sum\limits_{k = 1}^c {{u_{ik}}\log {u_{ik}}} } - \sum\limits_{i = 1}^n {{\lambda_i}} \left( {\sum\limits_{k = 1}^c {{u_{ik}}} - 1} \right) $$
(30)

Taking the derivative of (30) with respect to \( u_{ik} \) and setting the result to zero, we have

$$ \frac{{\partial L}}{{\partial {u_{ik}}}} = \left( {1 - K\left( {{x_i},{v_k}} \right)} \right) + {\gamma_1}\left( {1 - K\left( {\overline {{x_i}}, {v_k}} \right)} \right) + {\gamma_2}\left( {1 - K\left( {{{\tilde{x}}_i},{v_k}} \right)} \right) + {\beta^{ - 1}}\left( {1 + \log {u_{ik}}} \right) - {\lambda_i} = 0 $$

Solving for u ik , we have

$$ {u_{ik}} = \exp \left( {\beta {\lambda_i} - 1} \right).\exp \left[ { - \beta \left[ {\left( {1 - K\left( {{x_i},{v_k}} \right)} \right) + {\gamma_1}\left( {1 - K\left( {\overline {{x_i}}, {v_k}} \right)} \right) + {\gamma_2}\left( {1 - K\left( {{{\tilde{x}}_i},{v_k}} \right)} \right)} \right]} \right] $$
(31)

Since \( \sum\limits_{j = 1}^c {{u_{ij}}} = 1\,\forall i \),

$$ \exp \left( {\beta {\lambda_i} - 1} \right) = \frac{1}{{\sum\limits_{j = 1}^C {\exp \left[ { - \beta \left[ {\left( {1 - K\left( {{x_i},{v_j}} \right)} \right) + {\gamma_1}\left( {1 - K\left( {\overline {{x_i}}, {v_j}} \right)} \right) + {\gamma_2}\left( {1 - K\left( {{{\tilde{x}}_i},{v_j}} \right)} \right)} \right]} \right]} }} $$

Substituting into (31), the zero-gradient condition for the membership estimator can be rewritten as

$$ {u_{ik}} = \frac{{\exp \left[ { - \beta \left[ {\left( {1 - K\left( {{x_i},{v_k}} \right)} \right) + {\gamma_1}\left( {1 - K\left( {\overline {{x_i}}, {v_k}} \right)} \right) + {\gamma_2}\left( {1 - K\left( {{{\tilde{x}}_i},{v_k}} \right)} \right)} \right]} \right]}}{{\sum\limits_{j = 1}^C {\exp \left[ { - \beta \left[ {\left( {1 - K\left( {{x_i},{v_j}} \right)} \right) + {\gamma_1}\left( {1 - K\left( {\overline {{x_i}}, {v_j}} \right)} \right) + {\gamma_2}\left( {1 - K\left( {{{\tilde{x}}_i},{v_j}} \right)} \right)} \right]} \right]} }} $$
(32)
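Update (32) is a row-wise softmax over the negative combined kernel distances; subtracting each row's minimum distance before exponentiating leaves the ratio unchanged and avoids underflow for large β. An illustrative R sketch:

```r
# Membership update for NRKFCM_SE, Eq. 32 (a row-wise softmax)
update_U_nrkfcm_se <- function(Kx, Kbar, Kmed, g1, g2, beta) {
  D <- (1 - Kx) + g1 * (1 - Kbar) + g2 * (1 - Kmed)
  D <- D - apply(D, 1, min)      # per-row shift; cancels in the ratio
  W <- exp(-beta * D)
  W / rowSums(W)
}
```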

Updating cluster center

The objective function (29) can be rewritten as

$$ {J_{nrkfcm\_ s}}\left( {U,V} \right) = \sum\limits_{i = 1}^n {\sum\limits_{k = 1}^c {{u_{ik}}\left( {1 - \exp \left( { - {{\left\| {{x_i} - {v_k}} \right\|}^2}/{\sigma^2}} \right)} \right)} } + {\gamma_1}\sum\limits_{i = 1}^n {\sum\limits_{k = 1}^c {{u_{ik}}\left( {1 - \exp \left( { - {{\left\| {\overline {{x_i}} - {v_k}} \right\|}^2}/{\sigma^2}} \right)} \right)} } + {\gamma_2}\sum\limits_{i = 1}^n {\sum\limits_{k = 1}^c {{u_{ik}}\left( {1 - \exp \left( { - {{\left\| {{{\tilde{x}}_i} - {v_k}} \right\|}^2}/{\sigma^2}} \right)} \right)} } + {\beta^{ - 1}}\sum\limits_{i = 1}^n {\sum\limits_{k = 1}^c {{u_{ik}}\log {u_{ik}}} } $$
(33)

Taking the derivative of (33) with respect to v k and setting the result to zero, we have

$$ \left[ {\sum\limits_{i = 1}^n {{u_{ik}}K\left( {{x_i},{v_k}} \right){x_i} + {\gamma_1}\sum\limits_{i = 1}^n {{u_{ik}}K\left( {{{\overline x }_i},{v_k}} \right){{\overline x }_i} + {\gamma_2}\sum\limits_{i = 1}^n {{u_{ik}}K\left( {{{\tilde{x}}_i},{v_k}} \right){{\tilde{x}}_i}} } } } \right] - {v_k}\left[ {\sum\limits_{i = 1}^n {{u_{ik}}\left( {K\left( {{x_i},{v_k}} \right) + {\gamma_1}K\left( {{{\overline x }_i},{v_k}} \right) + {\gamma_2}K\left( {{{\tilde{x}}_i},{v_k}} \right)} \right)} } \right] = 0 $$
(34)

Solving for v k , we have

$$ v_k^{t + 1} = \frac{{\sum\limits_{i = 1}^n {{u_{ik}}\left( {K\left( {{x_i},v_k^t} \right){x_i} + {\gamma_1}K\left( {{{\overline x }_i},v_k^t} \right){{\overline x }_i} + {\gamma_2}K\left( {{{\tilde{x}}_i},v_k^t} \right){{\tilde{x}}_i}} \right)} }}{{\sum\limits_{i = 1}^n {{u_{ik}}\left( {K\left( {{x_i},v_k^t} \right) + {\gamma_1}K\left( {{{\overline x }_i},v_k^t} \right) + {\gamma_2}K\left( {{{\tilde{x}}_i},v_k^t} \right)} \right)} }} $$
(35)

where t is the iteration count; at t = 0 the initial centers are used.

NRKFCM_S with entropy term algorithm (NRKFCM_SE)

The algorithm of NRKFCM_SE for the segmentation of breast MRI can be summarized in the following steps

  1. Stage 1:

    Select initial cluster centers \( \left\{ {{v_k}} \right\}_1^c \) by using the dist-max initialization method.

  2. Stage 2:

    Calculate the partition matrix using (32).

  3. Stage 3:

    Update the centers of the clusters using (35).

  4. Stage 4:

    Estimate objective function using (29).

  5. Stage 5:

    Repeat Stages 2–4 until termination. The termination criterion is as follows:

$$ \left\| {{J_t} - {J_{t - 1}}} \right\| < \xi $$
(36)

where t is the iteration count, \( \left\| \cdot \right\| \) is the Euclidean norm, J is the objective function, and ξ is a small threshold that can be set by the user.

Results and discussions

In this section, we describe experiments on real breast images corrupted with Gaussian noise, comparing the segmentation performance of the following algorithms: (i) FCM, (ii) KFCM [33], (iii) spatially constrained KFCM (KFCM_S) [34], (iv) RKFCM_S, (v) IRKFCM_ST, and (vi) NRKFCM_SE. We test the six algorithms under noise on the DCE-breast MR images given in Fig. 1(a–b) and on the real contrast-enhanced breast magnetic resonance images (ce-BMRI) given in Fig. 2(a–b). The algorithms are coded in the R programming language and were run on a 2.66 GHz Intel Core 2 Duo personal computer with 500 GB of memory. Here we choose the parameters γ 1 = 0.7, γ 2 = 0.9, and ξ = 0.001. Figure 1(c–n) shows the segmentation results of FCM, KFCM, KFCM_S, RKFCM_S, IRKFCM_ST, and NRKFCM_SE, respectively. As shown in Figs. 1 and 2(c–h), without spatial constraints neither FCM nor KFCM can separate the four classes, whereas KFCM_S nearly succeeds and the proposed methods succeed in correcting and classifying the data, as shown in Figs. 1 and 2(i–n). From the images we can see that, without spatial constraints, both FCM and KFCM are badly affected by the noise, while KFCM_S partially eliminates it and our proposed methods eliminate almost all of the noise in the given images. The proposed algorithms give better segmentation results than the existing algorithms and are more effective at reducing noise and outliers in the pixel data. In particular, the NRKFCM_SE method gives the best segmentation result in the fewest iterations.

Fig. 1

Comparison of segmentation results on DCE-breast MRI. (a–b) Pre-contrast and post-contrast images corrupted by Gaussian noise; (c–d) segmented by FCM; (e–f) by KFCM; (g–h) by KFCM_S; (i–j) by RKFCM_S; (k–l) by IRKFCM_ST; (m–n) by NRKFCM_SE

Fig. 2

Comparison of segmentation results on ce-left and ce-right BMRI. (a–b) Images corrupted by Gaussian noise; (c–d) segmentation results by FCM; (e–f) by KFCM; (g–h) by KFCM_S; (i–j) by RKFCM_S; (k–l) by IRKFCM_ST; (m–n) by NRKFCM_SE

Table 1 gives the segmentation accuracy of the six algorithms on two different noisy images, where segmentation accuracy is defined using the silhouette value (Kannan [12, 13]). The average silhouette value measures the degree of confidence in the clustering assignment of a particular observation: well-clustered observations have values near 1 and poorly clustered observations have values near −1. The silhouette value s(i) of object i is given by \( s(i) = \frac{{w(i) - v(i)}}{{\max \left\{ {v(i),w(i)} \right\}}} \). For each object i we denote by G the cluster to which it belongs, and compute

$$ v(i) = \frac{1}{{\left| G \right| - 1}}\sum\limits_{j \in G,\,j \ne i} {d\left( {i,j} \right)} $$
Table 1 Segmentation accuracies

The quantity v(i) is the average distance between the ith object and all other objects in its cluster G. Now consider a second cluster H different from G and put

$$ d\left( {i,H} \right) = \frac{1}{{\left| H \right|}}\sum\limits_{j \in H} {d\left( {i,j} \right)}, \quad H \ne G, $$

which is the average dissimilarity of i to all objects of H.

After computing d(i, H) for all clusters H ≠ G, we take the smallest of these values:

$$ w(i) = \mathop {\min }\limits_{H \ne G} d\left( {i,H} \right) $$

The cluster that attains this minimum (that is, the H with d(i, H) = w(i)) is called the neighbor of object i; it is the second-best cluster for object i.
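For reference, the average silhouette value used in Table 1 can be computed directly from a distance matrix D and the cluster labels. The sketch below is ours (the cluster package's silhouette() offers an equivalent); it follows the common convention of assigning s(i) = 0 to objects in singleton clusters.

```r
# Average silhouette value for labels cl, given the full distance matrix D
silhouette_avg <- function(D, cl) {
  n <- length(cl)
  s <- numeric(n)
  for (i in 1:n) {
    own <- which(cl == cl[i] & seq_len(n) != i)
    if (length(own) == 0) next                    # singleton: s(i) = 0
    v <- mean(D[i, own])                          # within-cluster average
    w <- min(sapply(setdiff(unique(cl), cl[i]),   # nearest other cluster
                    function(h) mean(D[i, cl == h])))
    s[i] <- (w - v) / max(v, w)
  }
  mean(s)
}

# Example use with the output of rkfcm_s (hypothetical variable names):
# res <- rkfcm_s(img, c = 4)
# silhouette_avg(as.matrix(dist(matrix(as.vector(img), ncol = 1))), res$labels)
```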

From Table 1, the best clustering validity, 0.89, was obtained by our NRKFCM_SE in the experiments on the breast image data. Further, it is clear from Fig. 2(m–n) that the NRKFCM_SE method completely succeeded in correcting and classifying the breast data and almost entirely eliminated the effect of noise in the images. The NRKFCM_SE method is essentially different from both the existing methods and our other proposed methods.

Conclusion

New algorithms for DCE-breast MR image segmentation, based on kernelized fuzzy c-means with spatial information, tolerance, and entropy terms, were proposed in this paper. The algorithms select the initial cluster centers using the dist-max initialization method. To enhance noise immunity, the clustering of the center pixel is influenced by the neighborhood mean and median values, and the spatial information is incorporated into the membership function to improve the segmentation result. This neighboring effect reduces the number of spurious blobs and biases the solution toward piecewise-homogeneous labeling. To show the effectiveness of the proposed methods, FCM, KFCM, KFCM_S, and the proposed methods were applied to DCE-breast and ce-breast images, and the proposed methods were compared with the other three. The experimental results indicate that the proposed algorithms are more robust to noise and faster than many other segmentation algorithms. In particular, the NRKFCM_SE method provided the most accurate segmentation among the compared methods.