Abstract
This paper proposes a clustering technique that minimizes the need for subjective human intervention and is based on elements of rough set theory (RST). The proposed algorithm is unified in its approach to clustering and makes use of both local and global data properties to obtain clustering solutions. It handles single-type and mixed attribute data sets with ease. The results from three data sets of single and mixed attribute types are used to illustrate the technique and establish its efficiency.
Author information
Charlotte Bean received her Master of Mathematics (MMath) degree from the University of Hull, UK, in 1999 and her Ph.D. degree in computer science from the same university in 2004. She is currently a research fellow in the Medical School at the University of Warwick, UK.
Her research interests include data analysis and the theoretical development of techniques and algorithms used in statistical modeling and data mining, especially those using tools such as rough set theory, fuzzy set theory (FST), and artificial neural networks.
Chandra Kambhampati received his Ph.D. degree from City University, London, in 1988 for his dissertation on algorithms for optimizing control. He has held positions in the Department of Cybernetics at the University of Reading and in the Department of Chemical and Process Engineering at the University of Newcastle upon Tyne. He is currently a reader in the Department of Computer Science at the University of Hull. He also heads the Neural, Emergent and Agent Technologies (NEAT) Group, which includes a number of doctoral and graduate students and research associates. He has authored and co-authored over 100 papers on optimization, adaptive optimization, optimal control, neural networks, and fuzzy logic for control and robotics. He is a member of IEEE and IEE.
His research interests include fault tolerant control, networked control systems, signal processing, neural networks, and multiagent systems.
Cite this article
Bean, C., Kambhampati, C. Autonomous clustering using rough set theory. Int. J. Autom. Comput. 5, 90–102 (2008). https://doi.org/10.1007/s11633-008-0090-3