Abstract
Co-clustering is a powerful technique with varied applications in text clustering and recommender systems. For large-scale, high-dimensional, sparse real-world data, there is a strong need for an overlapping co-clustering algorithm that mitigates the effect of noise and non-discriminative information, generalizes well to unseen data, and performs well with respect to several quality measures. In this paper, we introduce a novel fuzzy co-clustering algorithm that incorporates multiple regularizers to address these issues. Specifically, we propose MRegFC, which simultaneously considers regularization terms corresponding to Entropy, Gini Index, and Joint Entropy. We demonstrate that MRegFC generates significantly higher-quality results than many existing approaches on several real-world benchmark datasets.
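The abstract names three regularization terms but does not give their formulas. The sketch below only illustrates the textbook definitions of these quantities on fuzzy membership matrices; it is not the MRegFC objective. It makes several assumptions: memberships are stored as nested lists indexed `[item][cluster]`, the function names are hypothetical, and the joint entropy is taken over the products of object and feature memberships that share a cluster.

```python
import math

def entropy(memberships):
    """Shannon entropy summed over all membership values (0 * log 0 treated as 0)."""
    return -sum(p * math.log(p) for row in memberships for p in row if p > 0)

def gini_index(memberships):
    """Gini impurity, 1 - sum of squared memberships, summed over items."""
    return sum(1.0 - sum(p * p for p in row) for row in memberships)

def joint_entropy(object_memberships, feature_memberships):
    """Entropy of the products u[i][c] * v[j][c] over all objects i, features j,
    and clusters c. This pairing is an assumption, not taken from the paper."""
    n_clusters = len(object_memberships[0])
    total = 0.0
    for c in range(n_clusters):
        for u_row in object_memberships:
            for v_row in feature_memberships:
                p = u_row[c] * v_row[c]
                if p > 0:
                    total -= p * math.log(p)
    return total

# Hypothetical toy data: one object and one feature, two clusters.
crisp = [[1.0, 0.0]]   # fully assigned to the first cluster
fuzzy = [[0.5, 0.5]]   # evenly split between both clusters

crisp_scores = (entropy(crisp), gini_index(crisp), joint_entropy(crisp, crisp))
fuzzy_scores = (entropy(fuzzy), gini_index(fuzzy), joint_entropy(fuzzy, fuzzy))
```

On this toy data all three quantities are zero for the crisp assignment and positive for the evenly split one, which is why terms of this kind can be used to control how fuzzy a co-clustering solution is.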
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Garg, V.K., Chaudhari, S., Narang, A. (2013). Multi-regularization for Fuzzy Co-clustering. In: Lee, M., Hirose, A., Hou, ZG., Kil, R.M. (eds) Neural Information Processing. ICONIP 2013. Lecture Notes in Computer Science, vol 8227. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-42042-9_9
DOI: https://doi.org/10.1007/978-3-642-42042-9_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-42041-2
Online ISBN: 978-3-642-42042-9
eBook Packages: Computer Science, Computer Science (R0)