Abstract
This paper analyses the properties of four alternative representation/operator combinations suitable for data clustering algorithms that keep the number of clusters variable. These representations are investigated in the context of their performance when used in a multiobjective evolutionary clustering algorithm (MOCK), which we have described previously. To shed light on the resulting performance differences observed, we consider the relative size of the search space and heuristic bias inherent to each representation, as well as its locality and heritability under the associated variation operators. We find that the representation that performs worst when a random initialization is employed, is nevertheless the best overall performer given the heuristic initialization normally used in MOCK. This suggests there are strong interaction effects between initialization, representation and operators in this problem.
Access provided by Autonomous University of Puebla. Download to read the full chapter text
Chapter PDF
Similar content being viewed by others
References
Cole, R.M.: Clustering with genetic algorithms. Master’s thesis, University of Western Australia, Australia (1998)
Corne, D.W., Knowles, J.D., Oates, M.J.: PESA-II: Region-based selection in evolutionary multiobjective optimization. In: Proceedings of the Genetic and Evolutionary Computation Conference, pp. 283–290. ACM Press, New York (2001)
Falkenauer, E.: Genetic Algorithms and Grouping Problems. John Wiley and Son Ltd., New York (1998)
Goulden, I.P., Jackson, D.M.: Combinatorial Enumeration, p. 192, 3.3.28. John Wiley and Sons, Inc, New York (1983)
Handl, J., Knowles, J.: Evolutionary multiobjective clustering. In: Proceedings of the Eighth International Conference on Parallel Problem Solving from Nature, pp. 1081–1091. Springer, Berlin, Germany (2004)
Handl, J., Knowles, J.: An evolutionary approach to multiobjective clustering. IEEE Transactions on Evolutionary Computation (in press, 2006)
Jain, A.K., Dubes, R.C.: Algorithms for Clustering Data. Prentice-Hall, Englewood Cliffs (1988)
Ma, P.C.H., Chan, K.C.C., Yao, X., Chiu, D.K.Y.: An evolutionary clustering algorithm for gene expression microarray data analysis. IEEE Transactions on Evolutionary Computation (in press, 2006)
MacQueen, L.: Some methods for classification and analysis of multivariate observations. In: Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, pp. 281–297. University of California Press, Berkeley (1967)
Park, Y.-J., Song, M.-S.: A genetic algorithm for clustering problems. In: Proceedings of the Third Annual Conference on Genetic Programming, pp. 568–575. Morgan Kaufmann, San Francisco (1998)
Radcliffe, N.J., Surry, P.D.: Fitness variance of formae and performance prediction. In: Foundations of Genetic Algorithms, vol. 3, pp. 51–72. Morgan Kaufmann Publishers, San Mateo (1995)
Raidl, G.R., Gottlieb, J.: Empirical analysis of locality, heritability and heuristic bias in evolutionary algorithms: A case study for the multidimensional knapsack problem. Evolutionary Computation 13(4), 441–475 (2005)
Rothlauf, F., Goldberg, D.E.: Redundant representations in evolutionary computation. evolutionary computation 11(4), 381–415 (2003)
Sloane, N.J.A.: Series A060281 in The On-Line Encyclopedia of Integer Sequences
Zitzler, E.: Evolutionary algorithms for multiobjective optimization: methods and applications. PhD thesis, Swiss Federal Institute of Technology (ETH), Zurich, Switzerland (1999)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Handl, J., Knowles, J. (2006). An Investigation of Representations and Operators for Evolutionary Data Clustering with a Variable Number of Clusters. In: Runarsson, T.P., Beyer, HG., Burke, E., Merelo-Guervós, J.J., Whitley, L.D., Yao, X. (eds) Parallel Problem Solving from Nature - PPSN IX. PPSN 2006. Lecture Notes in Computer Science, vol 4193. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11844297_85
Download citation
DOI: https://doi.org/10.1007/11844297_85
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-38990-3
Online ISBN: 978-3-540-38991-0
eBook Packages: Computer ScienceComputer Science (R0)