Abstract
Multi-view clustering, which integrates information from different views for better performance, has gained widespread attention. However, the existing clustering methods cannot represent the uncertainty and imprecision in cluster assignment caused by the tendency of clusters to overlap and the diversity among views. To surmount this issue, in this paper, we propose an adaptive weighted multi-view evidential clustering (WMVWC) method based on the framework of belief functions. The proposed WMVEC can be viewed as a multi-view version of conventional evidential c-means clustering. We construct the objective function of WMVEC by integrating the learning of view weights and credal partition into a unified framework, and design an optimization scheme to obtain the optimal results of WMVEC. Specifically, the view weight can measure the contribution of each view in clustering. The credal partition can provide a deeper understanding of the data structure by allowing samples to belong not only to singleton clusters, but also to a union of different singleton clusters, called meta-cluster. Experiment results demonstrate the effectiveness of the proposed WMVEC with respect to other state-of-the-art methods on real-world datasets. The code can be available at link.
Access provided by Autonomous University of Puebla. Download conference paper PDF
Similar content being viewed by others
Keywords
1 Introduction
Clustering, as a core paradigm of unsupervised learning, has been widely used in various domains owing to its powerful capacity in data preprocessing [13]. Many representative clustering methods like k-means and fuzzy c-means (FCM) have been invented for single-view data [10]. With the prompt development of information technology, however, single-view data can no longer meet the actual needs, and we are often faced with multi-view data represented by different sources or features. For example, an image can be expressed by multiple feature descriptors, such as LBP, SIFT and HOG. In such situations, the aforementioned classical clustering methods will no longer be applicable.
To counter this flaw and explore multi-view data, a load of derivatives of multi-view clustering methods have been emerged. One of the most representative is hard partition-based [2, 3, 6, 15]. For example, Cai et al. [2] suggested a \(\ell _{2, 1}\) norm-based multi-view k-means clustering, called RMVKM, to learn the weight of each view. Later, Chen et al. [3] developed a two-level variable-weighted k-means (TW-k-means) clustering for multi-view data by introducing two entropy regularizations to adjust the importance of the weights. Based on collaborative learning, Zhang et al. [15] presented a two-level weighted collaborative k-means (TW-Co-k-means) clustering to consider the weight of views and features. However, these multi-view clustering methods all consider that the relationship between objects and clusters is distinct, i.e., an object can only belong to a cluster completely or not.
Unlike hard partition-based clustering methods, fuzzy clustering describes the uncertainty by assigning each object to various singleton clusters with different memberships [11]. Many multi-view methods based on the idea of fuzzy clustering have been developed [4, 5, 7, 14]. Cleuziou et al. [4] proposed collaborative fuzzy clustering for multi-view data, named Co-FKM, whereas it equally considers the contribution of each view. Jiang et al. [7] proposed a weighted view collaborative fuzzy c-means clustering to identify the contributions of different views. Recently, Yang and Sinaga [14] suggested a feature-weighted multi-view FCM based on collaborative learning (Co-FW-MVFCM), which can automatically identify the contribution of each view and feature. Although these methods have achieved reasonable results to some extent, they cannot represent the imprecision in the result. In fact, uncertain and imprecise cluster structure is very ubiquitous in applications, that is, the cluster information of some objects is difficult to distinguish.
To address such issue, a new concept of credal partition based on the theory of belief functions is introduced [8, 9]. Credal partition extends the concepts of hard, fuzzy and possibilistic partitions by assigning the objects, not only to singleton clusters, but also to the union of several singleton clusters (called meta-cluster) with different masses of belief. To date, some evidential clustering methods have been developed in the framework of belief functions [9, 12]. Evidential c-means (ECM) clustering [9], as the evidential counterpart of FCM, can well characterize uncertain and imprecise cluster structures and derive effective clustering results. Unfortunately, ECM is built upon the assumption of access to single-view data. Hence, how to characterize the uncertainty and imprecision of multi-view data in cluster assignment and improve the clustering performance is an open issue.
In this paper, we propose a new adaptive weighted multi-view evidential clustering (WMVEC) method. Specifically, we introduce view weights to capture the different contributions of each view in clustering and employ entropy regularization to regulate the distribution of view weights.
The main contributions are summarized as follows:
-
1.
Inspired by the theory of belief functions, we propose a novel WMVEC method, which can characterize the uncertainty and imprecision in clustering results.
-
2.
We design the objective function of WMVEC and obtain the weight vector, cluster prototype matrix and credal partition matrix in an alternating optimization way.
-
3.
We conduct extensive experiments on several real-world datasets to verify the superiority of WMVEC with respect to other state-of-the-art methods.
The remainder of this paper is organized as follows. Section 2 proposes the WMVEC method and illustrates its optimization process. In Sect. 3, the effectiveness and practicability of the proposed WMVEC method are analyzed using real-world datasets. Finally, we make a brief conclusion in Sect. 4.
2 Multi-view Evidential Clustering
As mentioned earlier, many real-world applications are currently crowded with multi-view data. However, how to characterize the uncertainty and imprecision in clustering results on such data is still a challenging task. Conventional evidential clustering is no longer adequate for the situations we encounter today.
In this section, we propose a novel adaptive weighted multi-view evidential clustering (WMVEC) method to simultaneously learn the consensus credal partition and the importance of each view.
2.1 Formulation
Given a dataset \(\mathcal {X} = \{\textbf{X}_1, \textbf{X}_2, \cdots , \textbf{X}_{\mathcal {H}}\}\) and h-th view is expressed as \(\textbf{X}_h = \{\textbf{x}_{1,h}, \textbf{x}_{2, h}, \cdots , \textbf{x}_{n,h}\} \in \mathbb {R}^{n \times \mathcal {P}_h}\). The objective function of WMVEC can be represented as follows:
with
where \(r_h\) is the view weight of h-th view, \(m_{ij}\) is the (i, j)-th element of credal partition matrix \(\textbf{M} \in \mathbb {R}^{n \times 2^c}\) and denotes the mass of belief of i-th object associated to cluster \(A_j\). \(\bar{\textbf{v}}_{j,h}\) is the j-th cluster prototype of h-th view. The parameter \(\lambda \) is used to adjust the distribution of view weights, and parameters \(\alpha \), \(\beta \) and \(\delta \) are the same as in ECM. The objective function \(\mathcal {J}_{MVEC}\) consists of two parts, the first part is similar to ECM and computes the sum of intra-cluster weighted distances in each view by assigning weights to the views. The second part is a negative weight entropy, which regulates the effect of view weight.
2.2 Optimization
In this subsection, we provide an efficient iterative method to minimize (2). More specifically, we employ the Lagrange optimization method to alternately update one variable while leaving others fixed.
Updating. \({\textbf{M}}\) With \(\textbf{V}\) and \(\textbf{r}\) fixed, \(\textbf{M}\) is computed by Theorem 1.
Theorem 1
Suppose \(\textbf{V} = \textbf{V}^*\) and \(\textbf{r} = \textbf{r}^*\) are fixed, \(\mathcal {J}_{{WMVEC}}(\textbf{M}, \textbf{V}^*, \textbf{r}^*)\) is minimized iff:
Proof
We employ Lagrange multipliers \(\tau _{i}\) to solve the constrained minimization problem with respect to \(\textbf{M}\). The objective function (2) can be rewritten as:
By differentiating the Lagrangian with respect to the \(m_{ij}\), \(m_{i\emptyset }\), and \(\tau _i\) and setting the derivatives to zero, we obtain:
We thus have from (7):
Using (9)–(11), we can obtain the updating (4)–(5).
Updating. \({\textbf{r}}\) With \(\textbf{M}\) and \(\textbf{V}\) fixed, \(\textbf{r}\) is computed by Theorem 2.
Theorem 2
Suppose \(\textbf{M} = \textbf{M}^*\) and \(\textbf{V} = \textbf{V}^*\) are fixed, \(\mathcal {J}_{{WMVEC}}(\textbf{M}^*, \textbf{V}^*, \textbf{r})\) is minimized iff:
Proof
Lagrange multiplier \(\sigma \) is used to solve the constrained minimization problem with respect to \(\textbf{r}\). The objective function (2) becomes:
By differentiating the Lagrangian with respect to the \(r_{h}\) and \(\sigma \) setting the derivatives to zero, we obtain:
We thus have from (14):
Using (15)–(16), we can obtain the updating (12).
Updating. \({\textbf{V}}\) while fixing \({\textbf{M}}\) and \({\textbf{r}}\) With \(\textbf{M}\) and \(\textbf{r}\) fixed, \(\textbf{V}\) is computed by Theorem 3.
Theorem 3
Suppose \(\textbf{M} = \textbf{M}^*\) and \(\textbf{r} = \textbf{r}^*\) are fixed, \(\mathcal {J}_{{WMVEC}}(\textbf{M}^*, \textbf{V}, \textbf{r}^*)\) is minimized iff:
where \(B_{lq,h}\) (resp. \(H_{lk,h}\)) is the l, q-th (resp. l, k-th) element of the matrix \(\textbf{B}_h \in \mathbb {R}^{c \times \mathcal {P}_h}\) (resp. \(\textbf{H}_h \in \mathbb {R}^{c \times c}\)).
Proof
The partial derivatives of \(\mathcal {J}_{{WMVEC}}\) with respect to the cluster centers are given by:
where
Setting these derivatives to zero gives c linear equations that can be written as:
The system of linear equations can be equally represented by (17) and (18).
For the convenience of implementation, the proposed WMVEC method is outlined in Algorithm 1.
2.3 Computational Complexity
The computational complexity of ECM is \(O(tn2^c)\), where t is the number of iterations, n and c represent the number of objects and clusters respectively. WMVEC is an extension of ECM in multi-view scenarios by introducing view weights. Therefore, the computational complexity of WMVEC is \(O(t\mathcal {H}n2^c)\), and \(\mathcal {H}\) is the number of views. It is worth noting that the number of clusters grows exponentially with the computational complexity, and the assignment of objects to high-cardinality meta-clusters is challenging to interpret in practice. Therefore, we can consider imposing constraints to limit the number of focal elements in the meta-cluster, for example, like in ECM, to keep only the meta-cluster containing two clusters. For higher cluster datasets, we can restrict the framework of discernment to consist of only singleton clusters, empty set, and \(\varOmega \). By doing so, the computational complexity of WMVEC can be reduced to \(O(t\mathcal {H}nc^2)\) and \(O(t\mathcal {H}n(c+2))\).
3 Experiments
3.1 Experiment on Iris Dataset
In this experiment, we demonstrate the ability of credal partition in WMVEC on the IrisFootnote 1 dataset. The Iris dataset contains 150 objects with 3 clusters, where each object consists of four features (Sepal.Length, Sepal.Width, Petal.Length and Petal.Width). We split the Iris dataset into three views by different features, as shown in Fig. 1(a), (e), (i). We can see that \(\{\omega _1\}\) is clearly distinguished from \(\{\omega _2\}\) and \(\{\omega _3\}\), whereas \(\{\omega _2\}\) and \(\{\omega _3\}\) are partially overlapped, and view 3 is much easier to distinguish between \(\{\omega _2\}\) and \(\{\omega _3\}\) than the other two views.
From Fig. 1(b)–(d), (f)–(h), (j)–(l), we can observe that WMVEC can clearly characterize the imprecision and uncertainty of cluster assignment. It carefully assigned indistinguishable objects to relevant meta-cluster (i.e. \(\{\omega _2, \omega _3\}\)). What is more, the parameter \(\alpha \) determines the size of the meta-cluster. A large \(\alpha \) tends to correspond to lower imprecision and higher error, and vice versa. In applications, \(\alpha \) should be adjusted according to the specific situation or user preferences. In the following experiments, we take \(\alpha = 2\), as suggested in [9].
3.2 Experiments on Real-World Multi-view Datasets
To verify the validity of the proposed method with respect to other methods, we constructed experiments on eight public multi-view datasets, including WebKBFootnote 2, Multiple featuresFootnote 3, Prokaryotic Phyla [1], SensIT VehicleFootnote 4, HumanEva 3D Motion [1] and MSRCv1 [14]. Table 1 shows the basic information of the used multi-view datasets, including the numbers of objects, clusters, views and features.
We compare the performance of the proposed WMVEC method with respect to a number of related methods, which are shown as follows: FCM [11], ECM [9], Co-FKM [4], RMVKM [2], TW-k-means [3], SMVF [6], TW-Co-k-means [15], Co-FW-MVFCM [14] and MVASM [5]. Among them, FCM and ECM are two classical single-view clustering methods, RMVKM, TW-k-means, SMVF and TW-Co-k-means are representative multi-view clustering methods based on hard partition, Co-FKM, Co-FW-MVFCM and MVASM are state-of-art methods based on fuzzy partition. In order to fully evaluate the performance of the above mentioned methods, we applied two popular performance metricsFootnote 5, namely Accuracy (ACC) and Rand Index (RI). Note that the larger values of ACC and RI correspond to better clustering performance.
Table 2 and Table 3 show the clustering results of WMVEC and other methods on real-world datasets. Specifically, several points can be drawn from the observation.
-
1.
Compared with single-view clustering methods (i.e. FCM and ECM), WMVEC generally has obvious advantages, which indicates that it is not appropriate to use single-view clustering methods to deal with multi-view data. Besides, we can see that on the second view of the Prop dataset, FCM and ECM both achieve relatively good results, however, these results are physically meaningless.
-
2.
Compared with the multi-view clustering method, the proposed WMVEC achieves better performance compared to other methods. In particular, the clustering methods considering view weights can often obtain better results, which further demonstrates the rationality of introducing view weights.
-
3.
More importantly, the advantage of WMVEC is not only that it can be transformed into hard or fuzzy partition to obtain outstanding performance, but WMVEC with credal partition can also offer deeper insight into the data. It can effectively characterize the uncertainty and imprecision in cluster assignment and avoid the risk of misassignment, which is very valuable in some cautious decision-making applications.
3.3 Parameter Study
Finally, we also tested the parameter sensitivity of WMVEC. The ACC and RI results with different parameters \(\beta \) and \(\lambda \) are shown in Fig. 2 and Fig. 3. We take \(\beta \) that varies in \(\{1.1, 1.2, \cdots , 2.0\}\) and \(\lambda \) that varies in \(\{e^0, e^1, \cdots , e^{10}\}\). It can be seen that the values of \(\beta \) and \(\lambda \) have a significant impact on the performance of the WMVEC. For each dataset, the WMVEC can gain better results if appropriate \(\beta \) and \(\lambda \) can be selected. In addition, we can clearly see that the WMVEC can often obtain better results with smaller \(\beta \) and larger \(\lambda \).
4 Conclusion
In this paper, we propose an adaptive weighted multi-view evidential clustering (WMVEC) based on the framework of belief functions. As such, WMVEC is able to handle the uncertainty and imprecision of multi-view data clustering, and provides a credal partition, which extends the fuzzy, possibilistic and rough ones, allowing the representation of the imprecision and uncertainty about the assignment of objects to clusters. What is more, WMVEC can automatically learn the importance of views. Experimental results on several real-world datasets illustrate the potential and superiority of our proposed method. In future studies, we intend to extend the proposed method to deal with mixed types of data as well as incomplete data.
Notes
- 1.
- 2.
- 3.
- 4.
- 5.
For credal partition, we can obtain a hard or fuzzy partition by Pignistic probability transformation [9].
References
Benjamin, J.B.M., Yang, M.S.: Weighted multiview possibilistic c-means clustering with L2 regularization. IEEE Trans. Fuzzy Syst. 30(5), 1357–1370 (2022)
Cai, X., Nie, F., Huang, H.: Multi-view k-means clustering on big data. In: IJCAI (2013)
Chen, X., Xu, X., Huang, J.Z., Ye, Y.: TW-k-means: automated two-level variable weighting clustering algorithm for multiview data. IEEE Trans. Knowl. Data Eng. 25(4), 932–944 (2013)
Cleuziou, G., Exbrayat, M., Martin, L., Sublemontier, J.H.: CoFKM: a centralized method for multiple-view clustering. In: ICDM, pp. 752–757 (2009)
Han, J., Xu, J., Nie, F., Li, X.: Multi-view k-means clustering with adaptive sparse memberships and weight allocation. IEEE Trans. Knowl. Data Eng. 34(2), 816–827 (2022)
Jiang, B., Qiu, F., Wang, L.: Multi-view clustering via simultaneous weighting on views and features. Appl. Soft Comput. 47, 304–315 (2016)
Jiang, Y., Chung, F.L., Wang, S., Deng, Z., Wang, J., Qian, P.: Collaborative fuzzy clustering from multiple weighted views. IEEE Trans. Cybern. 45(4), 688–701 (2015)
Liu, Z.: An effective conflict management method based on belief similarity measure and entropy for multi-sensor data fusion. Artif. Intell. Rev. 1–28 (2023)
Masson, M.H., Denœux, T.: ECM: an evidential version of the fuzzy C-means algorithm. Pattern Recognit. 41(4), 1384–1397 (2008)
Oyewole, G.J., Thopil, G.A.: Data clustering: application and trends. Artif. Intell. Rev. 56(7), 6439–6475 (2023)
Ruspini, E.H., Bezdek, J.C., Keller, J.M.: Fuzzy clustering: a historical perspective. IEEE Comput. Intell. Mag. 14(1), 45–55 (2019)
Su, Z., Denœux, T.: BPEC: belief-peaks evidential clustering. IEEE Trans. Fuzzy Syst. 27(1), 111–123 (2019)
Xu, R., Wunsch, D.: Survey of clustering algorithms. IEEE Trans. Neural Netw. 16(3), 645–678 (2005)
Yang, M.S., Sinaga, K.P.: Collaborative feature-weighted multi-view fuzzy c-means clustering. Pattern Recognit. 119, 108064 (2021)
Zhang, G., Wang, C., Huang, D., Zheng, W., Zhou, Y.: TW-co-k-means: two-level weighted collaborative k-means for multi-view clustering. Knowl.-Based Syst. 150, 127–138 (2018)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Liu, Z., Huang, H., Letchmunan, S. (2023). Adaptive Weighted Multi-view Evidential Clustering. In: Iliadis, L., Papaleonidas, A., Angelov, P., Jayne, C. (eds) Artificial Neural Networks and Machine Learning – ICANN 2023. ICANN 2023. Lecture Notes in Computer Science, vol 14257. Springer, Cham. https://doi.org/10.1007/978-3-031-44216-2_22
Download citation
DOI: https://doi.org/10.1007/978-3-031-44216-2_22
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-44215-5
Online ISBN: 978-3-031-44216-2
eBook Packages: Computer ScienceComputer Science (R0)