1 Introduction

Nowadays, fuzzy theory is applied in many areas of human life through a variety of applications [32, 34]. Data clustering is the process of grouping objects into groups such that the objects within a group are more similar to each other than to objects in other groups. Data clustering can be divided into two subgroups: hard clustering and fuzzy clustering. In hard clustering, each data point belongs to exactly one cluster, while in fuzzy clustering, each data point can belong to several clusters, with a membership degree expressing how strongly it belongs to each of them. Within fuzzy clustering, Fuzzy C-Means (FCM) combined with a support vector machine has been used to solve the channel equalisation problem [19]; by integrating FCM, this model reduces the number of parameters compared with other models. In many cases, users provide additional information to guide the clustering process; the resulting algorithms are called semi-supervised fuzzy clustering algorithms [27, 29]. Data clustering techniques have been applied in many areas, such as document classification [11], logistics [25], dental radiograph segmentation [31], medical image segmentation [1, 4, 12], risk classification [44], etc. In satellite image processing alone there are many application areas, for example the classification of vehicles participating in traffic [7] and the classification of satellite images to forecast droughts, floods and wildfires [22, 36].

There are several kinds of additional information in semi-supervised fuzzy clustering: (1) must-link and cannot-link constraints; (2) labels of data; (3) pre-defined membership degrees. In the first kind, a must-link constraint requires that two elements be in the same cluster, whereas a cannot-link constraint states that a pair of elements cannot belong to the same cluster [17]. In the second kind, the labels of part of the data are known while the rest are not. In the last kind, the additional information provides membership degrees of data points in the dataset, both to achieve better clustering quality and to focus on the objects that users need to emphasise [30]. Thong [37] presented an overview of semi-supervised fuzzy clustering algorithms showing the use of these three kinds of additional information in typical algorithms.

Aware that the fuzzification coefficient is an important factor affecting the performance of fuzzy clustering, this paper proposes improvements to semi-supervised standard fuzzy C-Means clustering (SSFCM) that use multiple fuzzifiers to increase cluster quality. The two proposed models, named MCSSFC-P and MCSSFC-C, use a different fuzzifier for each data point and for each cluster, respectively, and are formulated as optimization problems. The fuzzifier values are updated to obtain the best values of the objective functions. Evaluations on different datasets show that our models outperform several related models.

2 Related Works

In this section, an overview of semi-supervised fuzzy clustering methods is given. Some related concepts are also presented in this part.

2.1 Research on Semi-supervised Fuzzy Clustering Algorithms

Many semi-supervised fuzzy clustering algorithms based on FCM [5] have been introduced, following different approaches. These methods combine classic techniques such as kernel functions, weight functions, adaptive functions, etc.

2.1.1 Semi-supervised Fuzzy Clustering Based on Multiple Kernel Function

Frigui et al. [13] examined existing fuzzy kernel clustering algorithms. This research provided an overview of unsupervised algorithms and of semi-supervised algorithms that incorporate partial supervision into the objective function to control the optimization process. Huang et al. [18] showed that most efforts in this direction have been devoted to supervised learning, support vector classification and regression.

In [28], a semi-supervised fuzzy clustering algorithm based on a multi-kernel function was proposed. This algorithm improved land classification performance by using a function that combines multiple kernels, in which each kernel has a different weight for each attribute. Kanzawa [20] also proposed a semi-supervised fuzzy clustering approach based on modifying differential kernel matrices. This method improved the FCM algorithm by using an unknown kernel function and controlling the entropy to apply soft constraints. Experiments showed the higher performance of this method compared with other methods. Apart from its high clustering quality, the limitation of this algorithm is the parameter-estimation process for the kernels: the calculated parameters depend on the initialization of the original values. Salehi et al. [33] presented semi-supervised multiple-kernel fuzzy clustering based on entropy and relative entropy. In this method, the entropy coefficient and the entropy-based divergence measure are applied in place of the generator for the unsupervised part and of the corresponding geometric distance measure for the semi-supervised part, making the combination of the unsupervised and semi-supervised parts explicit.

2.1.2 Safe Semi-supervised Fuzzy Clustering

Gan et al. [14] proposed a safe semi-supervised clustering method with a weight function. First, the algorithm used an unsupervised clustering method to partition the dataset and defined a difference matrix to calculate the safety reliability of each labelled sample, under the assumption that a correctly clustered sample must have a high level of certainty. Then, the algorithm built a graph-based regularization term to model the relationship between the labelled and unlabelled samples based on the clustering results. This regularization term made the predictions of the labelled samples approximate those of their local homogeneous neighbours, so the risk from labelled samples is expected to be reduced. The algorithm actively takes advantage of labelled patterns by restricting the corresponding outputs to the specified class labels.

Another safe semi-supervised fuzzy clustering method based on FCM was proposed by Gan [16]. This method guards against label errors by tying the corresponding predictions to those obtained from the unsupervised subset, while the predictions of other labelled patterns may still approach their given labels. Labelled samples are thus expected to be exploited safely through the balance between the unsupervised and semi-supervised parts. The experimental results show that the proposed algorithm can yield better results than FCM. However, this method still has some limitations, including the inability to deal with pairwise constraints and the remaining risk from labelled and unlabelled samples. Gan et al. [15] also proposed a confidence-weighted safe semi-supervised clustering method. This model aimed to determine the safety of each sample, because each sample in the dataset affects the model's performance differently; a sample with highly accurate clustering results is highly reliable. With three implementation phases, the model ensured that the labelled samples were exploited safely. The performance of this method was compared with unsupervised and semi-supervised clustering methods on selected datasets.

2.1.3 Semi-supervised Fuzzy Clustering Based on Adaptive Function

Casalino et al. [8] developed a semi-supervised fuzzy clustering algorithm based on FCM that integrates an adaptive function and data stream classification. The algorithm used a dynamic component to estimate the required number of clusters based on the data distribution and then used FCM to find the optimal results. In another approach, Yu et al. [43] proposed a constraint-propagation method based on transitive closure, using a bridging operator and affinity propagation, to address the first constraint; a semi-supervised clustering framework based on random subspaces with a set of confidence coefficients to address the second constraint; and an adaptive semi-supervised clustering aggregation framework to address the third constraint. The optimal subspace set was found by a new adaptive process. The experimental results of this line of research are better than those of related work; however, the computational cost of the algorithm is large because of its multiple steps, which increases processing time, especially for large datasets. Cai et al. [6] presented an adaptive object-selection method for semi-supervised fuzzy clustering, and their experimental results show that it can improve clustering performance.

Some other methods using neural networks were suggested by Arshad et al. [2, 3]. These methods are mostly related to FCM and selective fusion techniques. First, the algorithm divided the dataset into separate parts, including unlabelled data, properly labelled data and incorrectly labelled data. Then, the FCM method was used to cluster all the data, and the clustering results were fed into the attribute-selection process to give the final results. This method increased classification accuracy but did not consider the relationships between attributes. Following the approach of Śmieja et al. [35], a neural-network framework for semi-supervised clustering with pairwise constraints (must-link or cannot-link) was proposed. In this framework, semi-supervised clustering consists of two phases: in the first, two Siamese neural networks are used to label pairs of unlabelled points; in the second, a supervised neural network performs clustering using the fully paired dataset generated in the first phase. Extensive experiments on different datasets showed the high efficiency of the proposed method. In [25], a method was given to solve two problems in semi-supervised clustering: (1) improving the clustering process with discriminative features and (2) improving clustering efficiency by using both labelled and unlabelled data. This method was based on deep learning combined with semi-supervised learning, and the experimental results showed its high efficiency.

Kumar et al. [23] developed a semi-supervised clustering method for dental radiograph segmentation. This approach used Otsu's method to divide the dental X-ray image into background and foreground regions, then used the FCM algorithm together with the proposed algorithm to increase clustering efficiency. In semi-supervised rough fuzzy Laplacian Eigenmaps [26], a set of semi-supervised fuzzy similarity granules was constructed to describe the similarity between samples, based on the rule that homogeneous samples have higher similarity than heterogeneous ones; the method also evaluated the similarity of two samples within the same class. A Laplacian proximity graph and a class-based neighbourhood graph were created to describe the topological structure among patterns and between each pattern and its prototype, ensuring that homogeneous samples in the lower-dimensional space are mapped closer together and more compactly around the prototypes. The recommended method provided high classification accuracy and good data visualization compared with several related semi-supervised methods [6, 9, 10, 24, 27, 33, 38,39,40,41].

2.2 Semi-supervised Standard Fuzzy C-Means Clustering (SSFCM) Algorithm

Semi-supervised standard fuzzy C-Means clustering (SSFCM) was introduced by Yasunori et al. [42] using a pre-defined membership degree matrix \(\overline{U}\). In this method, \(\overline{U}\) is integrated into the objective function of FCM to improve the clustering performance of the algorithm. The objective function has the form:

$$J\left( {U,V} \right) = \sum\limits_{i = 1}^{N} \sum\limits_{j = 1}^{C} \left| {u_{ij} - \overline{u}_{ij} } \right|^{m} \left\| {X_{i} - V_{j} } \right\|^{2} \to \min$$

The main steps of SSFCM are presented in Table 1 below.

Table 1 Semi-supervised standard fuzzy C-mean clustering algorithm
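To make the iteration concrete, the following is a minimal NumPy sketch of one SSFCM update under the Euclidean metric. The centre and membership formulas are the constant-fuzzifier special case of Eqs. (8) and (10) derived in Section 3.1; the function and variable names are our assumptions, not the paper's notation.

```python
import numpy as np

def ssfcm_step(X, U, U_bar, m=2.0):
    """One SSFCM update: cluster centres, then memberships (sketch).

    X: (N, d) data; U: (N, C) memberships; U_bar: (N, C) pre-defined degrees.
    """
    W = np.abs(U - U_bar) ** m                     # |u_ij - u_bar_ij|^m
    V = (W.T @ X) / W.sum(axis=0)[:, None]         # centres (cf. Eq. (8) with m_i = m)
    D = np.linalg.norm(X[:, None, :] - V[None, :, :], axis=2)
    D = np.maximum(D, 1e-12)                       # guard against zero distances
    R = (1.0 / D) ** (2.0 / (m - 1.0))
    # memberships (cf. Eq. (10) with m_i = m); rows still sum to 1
    U_new = U_bar + (1.0 - U_bar.sum(axis=1, keepdims=True)) * R / R.sum(axis=1, keepdims=True)
    return U_new, V
```

In use, the step is repeated from an initial membership matrix until the change in the objective falls below a threshold.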

2.3 Fuzzy C-Means Clustering with Multiple Fuzzifiers Algorithm (MC-FCM)

Khang et al. [21] proposed a fuzzy clustering algorithm that uses a different fuzzifier for each data point to increase cluster quality. The main idea is to use multiple fuzzifiers instead of the single fuzzification coefficient of FCM. The fuzzifiers \(m_i\) are defined by (1); when all \(m_i\) are equal, the MC-FCM algorithm reduces to the FCM algorithm.

$$m_{i} = m_{1} + \left( {m_{2} - m_{1} } \right)\left( {\frac{{S_{i} - S_{min} }}{{S_{max} - S_{min} }}} \right)^{\alpha } ;i = \overline{1,N},$$
(1)

where \(m_1\) and \(m_2\) are the lower and upper bounds of \(m_i\) \((1 \le m_1 \le m_2)\); \(\alpha\) is an exponent parameter; \(S_{i} = \sum\nolimits_{j = 1}^{N/C} D_{ij}\) with \(D_{ij} = \left\| {X_{i} - X_{j} } \right\|\) \(\left( {\forall i, j = \overline{1,N} } \right)\); \(S_{max} = \max_{i} \left( {S_{i} } \right)\) and \(S_{min} = \min_{i} \left( {S_{i} } \right)\).
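As an illustration, a small NumPy sketch of Eq. (1) is given below. The definition of \(S_i\) above is terse; we read it as summing the distances from \(X_i\) to its \(\lceil N/C \rceil\) nearest points, which is an assumption, as are the default values of m1, m2 and alpha.

```python
import numpy as np

def point_fuzzifiers(X, C, m1=1.5, m2=4.0, alpha=1.0):
    """Per-point fuzzifiers m_i from Eq. (1); sketch under stated assumptions."""
    N = X.shape[0]
    # D_ij = ||X_i - X_j|| for all pairs
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    k = int(np.ceil(N / C))
    # S_i: sum of the k smallest distances from X_i (the row includes the
    # zero self-distance; a refinement could exclude it)
    S = np.sort(D, axis=1)[:, :k].sum(axis=1)
    t = (S - S.min()) / (S.max() - S.min() + 1e-12)  # normalise to [0, 1]
    return m1 + (m2 - m1) * t ** alpha               # Eq. (1)
```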

The main steps of MC-FCM are presented in Table 2 below.

Table 2 Fuzzy C-Mean Clustering with Multiple Fuzzifiers algorithm

3 Methods

Based on the advantages of additional information in semi-supervised clustering (from SSFCM) and of multiple fuzzifiers (from MC-FCM), in this paper we propose two improvements to the semi-supervised standard fuzzy C-Means clustering algorithm that use multiple fuzzifiers, per data point and per cluster, to increase cluster quality. The main idea is to use SSFCM with multiple fuzzifiers instead of a single one. Mathematical models in the form of optimization problems are introduced. The two novel algorithms are named the semi-supervised fuzzy clustering algorithm with multiple point fuzzifiers (MCSSFC-P) and the semi-supervised fuzzy clustering model with multiple cluster fuzzifiers (MCSSFC-C); their details are presented in Sections 3.1 and 3.2, respectively.

3.1 Semi-supervised Fuzzy Clustering Algorithm with Multiple Point Fuzzifiers (MCSSFC-P)

By applying the SSFCM algorithm with multiple fuzzifiers, MCSSFC-P is formulated as an optimization problem with the objective function (2):

$$J\left( {U,V} \right) = \sum\limits_{i = 1}^{N} \sum\limits_{j = 1}^{C} \left| {u_{ij} - \overline{u}_{ij} } \right|^{{m_{i} }} \left\| {X_{i} - V_{j} } \right\|^{2} \to \min$$
(2)

and the constraints:

$$u_{ij} \in \left[ {0,1} \right],\;\sum\limits_{j = 1}^{C} u_{ij} = 1,\;\forall i = \overline{1,N} ;\qquad \overline{u}_{ij} \in \left[ {0,1} \right],\;\sum\limits_{j = 1}^{C} \overline{u}_{ij} \le 1,\;\forall i = \overline{1,N}$$
(3)

The additional information is defined from the results of the FCM algorithm: the pre-defined membership degrees are calculated by (4). If part of the data is labelled, the membership degrees are instead defined from the labels of these samples by (5).

$$\overline{u}_{ij} = \begin{cases} u_{ij}^{\left( {FCM} \right)} , & \text{if } u_{ij}^{\left( {FCM} \right)} = \max_{k = \overline{1,C}} \left( {u_{ik}^{\left( {FCM} \right)} } \right) \\ 0, & \text{otherwise} \end{cases},\quad i = \overline{1,N} ;\; j = \overline{1,C}$$
(4)
$$\overline{u}_{ij} = \begin{cases} 1, & \text{if } X_{i} \text{ is labelled with class } j \\ 0, & \text{otherwise} \end{cases},\quad i = \overline{1,N} ;\; j = \overline{1,C}$$
(5)
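A sketch of both cases in NumPy follows, assuming labels are encoded as integers 0..C-1 with a negative value for unlabelled samples (our convention, not the paper's):

```python
import numpy as np

def predefined_memberships(U_fcm, labels=None, C=None):
    """Pre-defined degrees u_bar via Eq. (4) (from FCM) or Eq. (5) (from labels)."""
    if labels is None:
        # Eq. (4): keep only each row's maximum FCM membership, zero elsewhere
        return np.where(U_fcm == U_fcm.max(axis=1, keepdims=True), U_fcm, 0.0)
    # Eq. (5): crisp memberships for labelled samples, zero rows otherwise
    labels = np.asarray(labels)
    U_bar = np.zeros((len(labels), C))
    known = labels >= 0
    U_bar[np.where(known)[0], labels[known]] = 1.0
    return U_bar
```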

To solve the optimization problem (2)–(3), we use the Lagrange multiplier method. Then we have

$$L = \mathop \sum \nolimits_{i = 1}^{N} \mathop \sum \nolimits_{j = 1}^{C} \left| {u_{ij} - \overline{u}_{ij} } \right|^{{m_{i} }} \left\| {X_{i} - V_{j} } \right\|^{2} - \mathop \sum \limits_{i = 1}^{N} \lambda_{i} \left( {\mathop \sum \limits_{j = 1}^{C} u_{ij} - 1} \right)$$
(6)

Taking the derivative of \(L\) with respect to \(V_j\), we get

$$\frac{\partial L}{{\partial V_{j} }} = - 2\sum\limits_{i = 1}^{N} \left| {u_{ij} - \overline{u}_{ij} } \right|^{{m_{i} }} \left( {X_{i} - V_{j} } \right) = 0$$
(7)

From Eq. (7), cluster centres Vj are defined by

$$V_{j} = \frac{{\mathop \sum \nolimits_{i = 1}^{N} \left| {u_{{{\text{ij}}}} - \overline{u}_{ij} } \right|^{{m_{i} }} X_{i} }}{{\mathop \sum \nolimits_{i = 1}^{N} \left| {u_{{{\text{ij}}}} - \overline{u}_{ij} } \right|^{{m_{i} }} }}, j = \overline{1,C}$$
(8)

On the other hand, taking the derivative of \(L\) with respect to \(u_{ij}\), we get

$$\frac{\partial L}{{\partial u_{ij} }} = m_{i} \left( {u_{ij} - \overline{u}_{ij} } \right)^{{m_{i} - 1}} \left\| {X_{i} - V_{j} } \right\|^{2} - \lambda_{i} = 0$$
(9)

From Eqs. (3) and (9), membership degrees uij are calculated using (10):

$$u_{ij} = \overline{u}_{ij} + \left( {1 - \sum\limits_{k = 1}^{C} \overline{u}_{ik} } \right)\frac{{\left( {\frac{1}{{\left\| {X_{i} - V_{j} } \right\|}}} \right)^{{\frac{2}{{m_{i} - 1}}}} }}{{\sum\nolimits_{k = 1}^{C} \left( {\frac{1}{{\left\| {X_{i} - V_{k} } \right\|}}} \right)^{{\frac{2}{{m_{i} - 1}}}} }},\quad i = \overline{1,N} ;\; j = \overline{1,C}$$
(10)

The general diagram of MCSSFC-P is shown in Fig. 1 below.

Fig. 1 General diagram of MCSSFC-P algorithm

In the MCSSFC-P algorithm, the input data and parameters are used to define a fuzzifier for each data sample by Eq. (1). Using these fuzzifiers, the SSFCM scheme is run to determine the cluster centres and the membership-degree matrix. Following the general diagram, the specific steps of MCSSFC-P are given in Table 3 below, and a code sketch follows the table.

Table 3 Semi-supervised fuzzy clustering algorithm with multiple point fuzzifiers
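A compact sketch of the MCSSFC-P loop is given below, reusing point_fuzzifiers() from the sketch in Section 2.3. Random initialisation of U and the convergence test on the change in J are our assumptions; the update formulas are Eqs. (8) and (10).

```python
import numpy as np

def mcssfc_p(X, U_bar, C, m1=1.5, m2=4.0, alpha=1.0, eps=1e-3, max_step=1000):
    """Sketch of MCSSFC-P: SSFCM iterations with per-point fuzzifiers m_i."""
    N = X.shape[0]
    m = point_fuzzifiers(X, C, m1, m2, alpha)[:, None]  # Eq. (1), shape (N, 1)
    U = np.random.dirichlet(np.ones(C), size=N)         # rows sum to 1
    J_old = np.inf
    for _ in range(max_step):
        W = np.abs(U - U_bar) ** m                      # |u_ij - u_bar_ij|^{m_i}
        V = (W.T @ X) / W.sum(axis=0)[:, None]          # centres, Eq. (8)
        D = np.maximum(np.linalg.norm(X[:, None] - V[None], axis=2), 1e-12)
        R = (1.0 / D) ** (2.0 / (m - 1.0))
        U = U_bar + (1 - U_bar.sum(1, keepdims=True)) * R / R.sum(1, keepdims=True)  # Eq. (10)
        J = np.sum(np.abs(U - U_bar) ** m * D ** 2)     # objective, Eq. (2)
        if abs(J_old - J) < eps:
            break
        J_old = J
    return U, V
```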

3.2 Semi-supervised Fuzzy Clustering Model with Multiple Cluster Fuzzifiers (MCSSFC-C)

In fuzzy clustering, the fuzzifier represents the uncertainty of the data. In most fuzzy clustering algorithms and standard semi-supervised fuzzy clustering algorithms, the fuzzy parameter (also called the fuzzifier) is a constant for all data samples (denoted by m and usually chosen as 2, based on experience). However, this parameter represents the uncertainty with which a data sample belongs to the clusters, and in practice each data sample has its own degree of uncertainty, which in turn affects the uncertainty of the clusters: the more uncertain data points a cluster contains, the higher its uncertainty. A constant fuzzifier is therefore not well suited to the clustering process. When a constant fuzzifier is used for all data samples, the objective function is

$$J\left( {U,V} \right) = \sum\limits_{i = 1}^{N} \sum\limits_{j = 1}^{C} u_{ij}^{m} \left\| {X_{i} - V_{j} } \right\|^{2} \to \min$$
(11)

Because the data samples in a cluster have a certain similarity, we propose a membership value for each cluster (\(u_j\)), based on the membership values of the samples and the number of samples in that cluster at the time of consideration. Let \(C_j\) be the set of samples in the jth cluster; then

$$u_{j} = \frac{{\sum\nolimits_{{X_{i} \in C_{j} }} u_{ij} }}{{\left| {C_{j} } \right|}},\quad j = \overline{1,C}$$
(12)

The objective function (11) achieves the minimum if and only if

$$u_{{{\text{ij}}}} = \frac{1}{{\mathop \sum \nolimits_{k = 1}^{C} \left( {\frac{{\left\| {X_{i} - V_{j} } \right\|}}{{\left\| {X_{i} - V_{k} } \right\|}}} \right)^{{\frac{2}{m - 1}}} }}, i = \overline{1,N} ;j = \overline{1,C}$$
(13)

Thus, provided that the jth cluster has membership value \(u_j\), from (12) and (13) we get

$$u_{j} = \frac{1}{{\left| {C_{j} } \right|}}\sum\limits_{{X_{i} \in C_{j} }} \frac{1}{{\sum\nolimits_{k = 1}^{C} \left( {\frac{{\left\| {X_{i} - V_{j} } \right\|}}{{\left\| {X_{i} - V_{k} } \right\|}}} \right)^{{\frac{2}{m - 1}}} }};\quad j = \overline{1,C}$$
(14)

Each cluster will also have a corresponding fuzzy parameter. Usually, the value of the fuzzifier is chosen from experience, without a general formula; the selection of m is only an approximation, and it is not feasible to find a formula that computes the exact fuzzifier m.

In our research, we use different fuzzifier values (called multiple fuzzifiers), one per data point in MCSSFC-P and one per cluster in MCSSFC-C. Starting from their initial values, the fuzzifiers may be updated across iterations to reach the optimal value of the objective function. Treating (14) approximately, we obtain

$$m_{j} = 1 + \frac{2}{{\log u_{j} + \log \left| {C_{j} } \right|}},\,\,j = \overline{1,C}$$
(15)

The \(m_j\) computed by (15) satisfies the condition that a fuzzifier must be greater than or equal to 1. In addition, the fuzzifier should be inversely related to the value of the membership function, which holds for (15): the larger the cluster's membership value, the less uncertainty there is, so the fuzzifier value is small, and vice versa. With per-cluster fuzzy parameters, the cluster membership value from (14) is replaced by formula (16), with a fuzzifier for each cluster; a short code sketch of computing \(u_j\) and \(m_j\) follows Eq. (16).

$$u_{j} = \frac{1}{{\left| {C_{j} } \right|}}\mathop \sum \nolimits_{{X_{i} \in C_{j} }} \frac{1}{{\mathop \sum \nolimits_{k = 1}^{C} \left( {\frac{{\left\| {X_{i} - V_{j} } \right\|}}{{\left\| {X_{i} - V_{k} } \right\|}}} \right)^{{2/\left( {m_{j} - 1} \right)}} }},j = \overline{1,C}$$
(16)
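A small sketch of this step: samples are assigned to their argmax cluster, u_j is taken from Eq. (12) and m_j from Eq. (15). The paper does not state the logarithm base; base 10 is our assumption, and empty clusters fall back to m_j = 2.

```python
import numpy as np

def cluster_fuzzifiers(U):
    """Cluster memberships u_j (Eq. (12)) and fuzzifiers m_j (Eq. (15)); sketch."""
    N, C = U.shape
    labels = U.argmax(axis=1)            # assign each sample to its best cluster
    u = np.zeros(C)
    m = np.full(C, 2.0)                  # fallback value for empty clusters
    for j in range(C):
        members = labels == j
        if members.any():
            u[j] = U[members, j].mean()                       # Eq. (12)
            denom = np.log10(u[j]) + np.log10(members.sum())  # log u_j + log|C_j|
            if denom > 0:                # keeps m_j >= 1, cf. the bounds below
                m[j] = 1.0 + 2.0 / denom                      # Eq. (15)
    return u, m
```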

Based on formula (12), the membership degree of the jth cluster satisfies \(u_{j} \in \left( {\frac{1}{C},1} \right)\). Then we have

$$\log \frac{{\left| {C_{j} } \right|}}{C} < \log u_{j} + \log \left| {C_{j} } \right| < \log \left| {C_{j} } \right|$$

Since \(m_j = 1 + 2/x\) is decreasing in \(x\), the fuzzifier of the jth cluster therefore satisfies

$$1 + \frac{2}{{\log \left| {C_{j} } \right|}} < m_{j} < 1 + \frac{2}{{\log \frac{{\left| {C_{j} } \right|}}{C}}}$$

However, since the model is very sensitive to changes in the fuzzifier, we use a single parameter to adjust the fuzzifier values; enlarging or shrinking their range in this way can yield the highest accuracy. The new objective function for the semi-supervised fuzzy algorithm with multiple cluster fuzzifiers \(m_j\) is then

$$J\left( {U,V} \right) = \sum\limits_{i = 1}^{N} \sum\limits_{j = 1}^{C} \left| {u_{ij} - \overline{u}_{ij} } \right|^{{\alpha \, m_{j} }} \left\| {X_{i} - V_{j} } \right\|^{2} \to \min$$
(17)

The coefficient \(\alpha\) is included to allow flexible adjustment of the effect of the fuzzy parameters. Because the algorithm is very sensitive to changes in \(m_j\), we must ensure that the objective function remains optimal after each loop. Hence, in each iteration, the fuzzy parameters are updated only when the following condition is met:

$$\left| {J^{\left( t \right)} - J^{{\left( {t - 1} \right)}} } \right| < \varepsilon$$
(18)

When they change, the new fuzzy parameters \(m_j\) are used to update the values of U and V as follows:

$$V_{j} = \frac{{\sum\nolimits_{i = 1}^{N} \left| {u_{ij} - \overline{u}_{ij} } \right|^{{m_{j} }} X_{i} }}{{\sum\nolimits_{i = 1}^{N} \left| {u_{ij} - \overline{u}_{ij} } \right|^{{m_{j} }} }},\quad j = \overline{1,C}$$
(19)

With the fuzzifier \(m_j\), the membership values are calculated by the following formula:

$$u_{ij} = \overline{u}_{ij} + \left( {1 - \sum\limits_{k = 1}^{C} \overline{u}_{ik} } \right)\frac{{\left( {\frac{1}{{\left\| {X_{i} - V_{j} } \right\|}}} \right)^{{\frac{2}{{m_{j} - 1}}}} }}{{\sum\nolimits_{k = 1}^{C} \left( {\frac{1}{{\left\| {X_{i} - V_{k} } \right\|}}} \right)^{{\frac{2}{{m_{j} - 1}}}} }},\quad i = \overline{1,N} ;\; j = \overline{1,C}$$
(20)

Here, the pre-defined membership degrees \(\overline{u}_{ij}\) are calculated according to the data. If the dataset has no labelled data, the values of \(\overline{u}_{ij}\) are computed as follows:

$$\overline{u}_{ij} = \begin{cases} u_{ij} , & \text{if } u_{ij} = \max_{k = \overline{1,C}} \left( {u_{ik} } \right) \\ 0, & \text{otherwise} \end{cases},\quad i = \overline{1,N} ;\; j = \overline{1,C}$$
(21)

With labelled data, the values of \(\overline{u}_{ij}\) are defined as follows:

$$\overline{u}_{ij} = \begin{cases} 1, & \text{if } X_{i} \text{ is labelled with class } j \\ 0, & \text{otherwise} \end{cases},\quad i = \overline{1,N} ;\; j = \overline{1,C}$$
(22)

From the above analysis, the paper proposes a semi-supervised fuzzy clustering algorithm with multiple fuzzifiers for clusters (MCSSFC-C, Fig. 2). In this approach, the initial fuzzifiers of all clusters are equal: \(m^{\prime}_{j} = 2\) for all j = 1,…, C. During clustering, these parameters change only when condition (18) is satisfied, i.e., the cluster fuzzifiers are updated only when the clustering result of the next iteration is better than that of the previous one. This is the key point of the proposed algorithm. In the worst case, no fuzzifier change yields a better clustering result, and the proposed algorithm reduces to the SSFCM algorithm. A code sketch of the whole procedure follows Fig. 2.

Fig. 2 Semi-supervised fuzzy clustering algorithm with multiple cluster fuzzifiers
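The sketch below assembles the whole procedure, reusing cluster_fuzzifiers() from the sketch above: all m_j start at 2 (so the first passes behave like SSFCM) and are re-estimated only once the objective has stabilised, per condition (18). Initialisation and stopping details are our assumptions.

```python
import numpy as np

def mcssfc_c(X, U_bar, C, alpha=1.0, eps=1e-3, max_step=1000):
    """Sketch of MCSSFC-C: SSFCM with per-cluster fuzzifiers m_j."""
    N = X.shape[0]
    m = np.full(C, 2.0)                               # initial m_j = 2 for all clusters
    U = np.random.dirichlet(np.ones(C), size=N)
    J_old = np.inf
    for _ in range(max_step):
        W = np.abs(U - U_bar) ** m[None, :]           # |u_ij - u_bar_ij|^{m_j}
        V = (W.T @ X) / W.sum(axis=0)[:, None]        # centres, Eq. (19)
        D = np.maximum(np.linalg.norm(X[:, None] - V[None], axis=2), 1e-12)
        R = (1.0 / D) ** (2.0 / (m[None, :] - 1.0))
        U = U_bar + (1 - U_bar.sum(1, keepdims=True)) * R / R.sum(1, keepdims=True)  # Eq. (20)
        J = np.sum(np.abs(U - U_bar) ** (alpha * m[None, :]) * D ** 2)  # Eq. (17)
        if abs(J_old - J) < eps:                      # condition (18)
            _, m_new = cluster_fuzzifiers(U)          # Eqs. (12) and (15)
            if np.allclose(m_new, m):
                break                                 # no further change: stop
            m = m_new
        J_old = J
    return U, V, m
```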

3.3 Numerical Examples to Illustrate the Steps of the Algorithm

To illustrate the proposed MCSSFC-C algorithm, we give an example with 10 data samples, 4 attributes and 2 clusters, as shown in Table 4.

Table 4 The data illustrate the proposed algorithm


The values obtained by this process are the cluster fuzzifiers m1 = 7.0616 and m2 = 3.993 based on formula (15), and the cluster memberships u1 = 0.9345 and u2 = 0.9501 based on formula (12). The process in Table 5 is repeated until the stopping condition is satisfied.

Table 5 The process of define membership matrix and centres of clusters

Figure 3 describes the data distribution and clustering results in a 3-dimensional space.

Fig. 3 Data distribution and clusters in 3-dimensional space

3.4 Discussions

The goal of computing fuzzifier values for the clusters is to optimize the objective function. In the worst case the fuzzifiers never change, i.e., the cluster fuzzifiers stay at their initial value \(m_{j} = 2\); the accuracy of MCSSFC-C is then equivalent to that of SSFCM. The advantage of the proposed algorithm is that it finds a fuzzifier for each cluster, which is more suitable than using a single fuzzifier for the whole dataset.

The computational complexity of the proposed algorithm comprises the complexity of steps 1 to 7 in Table 3. This includes the complexity of the FCM algorithm, O(T1ndc2), and, in each loop: updating the fuzzy membership matrix U by Eq. (10) or (11), updating V by Eq. (9), calculating J(U, V) by Eq. (7), and updating the fuzzy parameters and checking the stopping condition, with complexity O(ndc2); plus calculating the candidate fuzzy parameter of each cluster by Eq. (5) and the cluster membership by Eq. (6), with complexity O(2ndc2). Therefore, the computational complexity of the MCSSFC-C algorithm is O(3T2ndc2), where T1 and T2 are the numbers of iterations of FCM and MCSSFC-C, respectively.

As the number of samples approaches infinity, the computational complexity of SSFCM and MCSSFC-C is thus of the same order. Compared with SSFCM, however, MCSSFC-C adds steps 4 and 6, which let the algorithm try many different fuzzifier values; this slows convergence because the optimality conditions for choosing the fuzzy parameters must be computed and tested. MCSSFC-C has roughly three times the computational cost of SSFCM, so in practice it runs slower than the SSFCM algorithm.

In this research, we proposed two improvements of the SSFCM model, the MCSSFC-P model and the MCSSFC-C model, with the following main advantages:

  • MCSSFC-P extends the SSFCM algorithm by using a fuzzifier for each data sample. Additional information, provided by experts or based on previous analyses, increases the accuracy of clustering. Using this information, different fuzzifier values are flexibly defined for each sample based on its membership degree, instead of the fixed fuzzifier used for all samples in SSFCM.

  • MCSSFC-C defines a fuzzifier value for each cluster. This reduces computing time compared with MCSSFC-P. Moreover, giving the samples in the same cluster the same fuzzifier increases the performance of this model.

However, MCSSFC-C still has some limitations:

  • The fuzzifier parameter values in MCSSFC-C are chosen mainly based on the experience of our research team.

  • The analysis of which types of data suit the proposals in this paper is not yet complete.

4 Experimental Results

The algorithms were implemented in MATLAB 2014 on a Dell laptop with a Core i5 processor. In the evaluation, we use both two-label and multi-label datasets: six benchmark datasets (Wine, Iris, Breast, Liver, Diabetes and Arrhythmia) taken from the UCI Machine Learning Repository, listed in Table 6.

Table 6 Experimental datasets

Data classification experiments were performed with the SSFCM [42], MCSSFC-P (fuzzifiers taken per data point) and MCSSFC-C (fuzzifiers taken per data cluster) algorithms in two cases: using labelled data and not using labelled data to compute the pre-defined membership values. The experimental parameters are eps = 0.001, Maxstep = 1000 and alpha = 1.0. The Davies–Bouldin (DB), PBM, Partition Coefficient (PC) and Classification Entropy (CE) indexes are used to evaluate the effectiveness of the algorithms. Furthermore, the accuracy (Acc) of the classification results is assessed as the percentage of correctly classified samples, where Acc1 is calculated on the data with known labels and Acc2 on the entire dataset.
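For clarity, Acc1 and Acc2 can be computed as sketched below; this assumes that cluster indices already correspond to class indices, which holds when the pre-defined memberships were built from labels via Eq. (22).

```python
import numpy as np

def accuracies(U, y_true, labelled_mask):
    """Acc1 on labelled samples, Acc2 on the whole dataset (sketch)."""
    y_pred = U.argmax(axis=1)              # hard assignment from memberships
    acc1 = (y_pred[labelled_mask] == y_true[labelled_mask]).mean()
    acc2 = (y_pred == y_true).mean()
    return acc1, acc2
```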

Remote sensing images of two regions (a and b) of BacDa, Hoabinh, with size 515 × 515, are used for the experiment (see Fig. 4). With the algorithm parameters set up as above, the remote sensing image data are classified into 4 classes corresponding to 4 different types of land cover.

Fig. 4 Remote sensing image data

4.1 Experimental Results on the Benchmark UCI Datasets

Labelled data were obtained at a rate of 10% of each experimental dataset, and the pre-defined membership values were calculated from the labelled data. Table 7 shows the values of the indexes used to evaluate the cluster quality of the SSFCM, SSFC-FS [38], MCSSFC-P and MCSSFC-C algorithms using labelled data. SSFCM and SSFC-FS use a constant fuzzifier, while MCSSFC-P and MCSSFC-C use multiple fuzzifiers in the clustering process.

Table 7 Experimental results on datasets using labelled data

On the DB and PBM indexes, the MCSSFC-C algorithm achieved the best values on 3/6 datasets, followed by the MCSSFC-P algorithm with 3/6. On the CE index, MCSSFC-P achieved the best values on 4/6 datasets, followed by MCSSFC-C with 2/6. On the PC index, MCSSFC-C achieved the best values on 4/6 datasets, followed by MCSSFC-P with 2/6. On the Acc1 index, MCSSFC-C achieved the highest accuracy on 3/6 datasets, followed by MCSSFC-P with 3/6, while SSFC-FS obtained the best Acc1 on the Diabetes dataset. On the Acc2 index, MCSSFC-C achieved the highest accuracy on 3/6 datasets, followed by MCSSFC-P with 1/6; the Acc2 results of the two proposed algorithms are equal on 2/6 datasets (Iris and Breast).

Based on the results in Table 7, over 6 datasets (Wine, Iris, Breast, Liver, Diabetes and Arrhythmia) and 6 cluster-quality indicators (DB, PBM, CE, PC, Acc1 and Acc2), i.e., 36 comparisons, the best results are mainly obtained by MCSSFC-P and MCSSFC-C: the proposed algorithms achieved the best value 19/36 times, while SSFC-FS obtained the best result only once. This indicates that the algorithms with multiple fuzzifiers reach higher cluster quality.

4.2 Experimental Results on the Remote Sensing Image Data

In these experiments, the additional information is defined from the results of FCM, and the number of clusters is set to 4. Figures 5 and 6 show the two original images and the images clustered by SSFCM, MCSSFC-P and MCSSFC-C.

Fig. 5 Result of remote sensing image classification on image 1: a) color image; b) SSFCM; c) MCSSFC-P; d) MCSSFC-C

Fig. 6 Result of remote sensing image classification on image 2: a) color image; b) SSFCM; c) MCSSFC-P; d) MCSSFC-C

In addition to the visual results above, the validity indexes are calculated and given in Table 8 below.

Table 8 Experimental results on image datasets

Based on the values in Table 8, MCSSFC-P and MCSSFC-C give better results than SSFCM in most cases. The PBM and PC values obtained by MCSSFC-C are the highest on both selected images. For the DB index, MCSSFC-C gets the best result on image 1 and MCSSFC-P on image 2; for the CE index, MCSSFC-C gets the best result on image 2 and MCSSFC-P on image 1.

The values of fuzzifiers obtained by applying MCSSFC-C on these images are presented in Table 9.

Table 9 Fuzzy parameters achieved by algorithm MCSSFC-C in clusters

Overall, Tables 8 and 9 show that using multiple fuzzy parameters (the MCSSFC-P and MCSSFC-C algorithms) can improve the accuracy of clustering results compared with using a single fuzzifier (the SSFCM algorithm).

5 Conclusions

In this paper, we proposed two improvements of semi-supervised standard fuzzy C-Means clustering based on multiple fuzzifiers. The contributions of this research are: (i) improving clustering quality in SSFCM by using a different fuzzifier for each data point (MCSSFC-P); (ii) improving clustering quality and reducing run time by using a different fuzzifier for each data cluster (MCSSFC-C); (iii) implementing both improvements on UCI datasets and remote sensing images to evaluate the performance of the proposed models and compare them with related models.

The results evaluated with the DB, PBM, CE, PC and Acc indicators show that the proposed methods give better clustering results on most of the indicators used; the MCSSFC-C and MCSSFC-P algorithms each achieved the best value in 19 of 36 comparisons. The experimental results on remote sensing image data show that the MCSSFC-C algorithm gives better classification results than the other algorithms on the PBM and PC indexes. With the above results, it can be confirmed that using multiple fuzzy parameters can improve the quality of data clustering.

The experimental results indicate that different clusters or data samples have different levels of uncertainty, and the advantage of the proposed algorithms is that they assign a fuzzy parameter to each cluster or point. This provides greater precision than using one fuzzifier for the whole dataset. However, the added steps for finding fuzzifier values lead to greater computational complexity and longer run time.

In future studies, we will develop an algorithm that considers the combined effects of fuzzy coefficients for each cluster and each data sample, and we will extend the multiple-fuzzifier approach to 3-dimensional functions depending on U, V and X, as well as to mixed additional information including labels, constraints, membership degrees and fuzzifiers. Furthermore, the optimal parameters of the proposed algorithms can be determined by meta-heuristic algorithms such as GA, PSO, etc.