Abstract
Because of its unsupervised nature, clustering is one of the most challenging problems, considered as a NP-hard grouping problem. Recently, several evolutionary algorithms (EAs) for clustering problems have been presented because of their efficiency for solving the NP-hard problems with high degree of complexity. Most previous EA-based algorithms, however, have dealt with the clustering problems given the number of clusters (K) in advance. Although some researchers have suggested the EA-based algorithms for unknown K clustering, they still have some drawbacks to search efficiently due to their huge search space. This paper proposes the two-leveled symbiotic evolutionary clustering algorithm (TSECA), which is a variant of coevolutionary algorithm for unknown K clustering problems. The clustering problems considered in this paper can be divided into two sub-problems: finding the number of clusters and grouping the data into these clusters. The two-leveled framework of TSECA and genetic elements suitable for each sub-problem are proposed. In addition, a neighborhood-based evolutionary strategy is employed to maintain the population diversity. The performance of the proposed algorithm is compared with some popular evolutionary algorithms using the real-life and simulated synthetic data sets. Experimental results show that TSECA produces more compact clusters as well as the accurate number of clusters.
Article PDF
Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.Avoid common mistakes on your manuscript.
References
Abraham A, Jain L, Goldberg R (2005) Evolutionary multiobjective optimization. Springer, Berlin
Al-Harbi SH, Rayward-Smith VJ (2006) Adaptive k-means for supervised clustering. Artif Intell 24:219–226
Al-Sultan K (1995) A tabu search approach to the clustering problem. Pattern Recognit 28(9):1443–1451
Bäck T, Hammel U, Schwefel HP (1997) Evolutionary computation: comments on the history and current state. IEEE Trans Evol Comput 1(1):3–17
Bandyopadhyay S, Maulik U (2002a) An evolutionary technique based on K-means algorithm for optimal clustering in ℝN. Inf Sci 146:221–237
Bandyopadhyay S, Maulik U (2002b) Genetic clustering for automatic evolution of clusters and application to image classification. Pattern Recognit 35:1197–1208
Brown D, Huntley C (1992) A practical application of simulated annealing to clustering. Pattern Recognit 25(4):401–412
Cowgill MC, Harvey RJ, Watson LT (1999) A genetic algorithm approach to cluster analysis. Comput Math Appl 37:99–108
Davies DL, Bouldin DW (1979) A cluster separation measure. IEEE Trans Pattern Recognit Mach Intell 1(2):224–227
Everitt B, Landau S, Leese M (2001) Cluster analysis. Arnold, Sevenoaks
Fogel DB (2006) Evolutionary computation: toward a new philosophy of machine intelligence. IEEE Press, New York
Garai G, Chaudhuri BB (2004) A novel genetic algorithm for automatic clustering. Pattern Recognit Lett 25:173–187
Gen M, Cheng R (1997) Genetic algorithms and engineering design. Wiley, New York
Gen M, Cheng R (2000) Genetic algorithm and engineering optimization, Wiley series in engineering design and automation. Wiley, New York
Goldberg DE (1989) Genetic algorithms in search, optimization, and machine learning. Addison Wesley, Reading
Halkidi M, Batistakis Y, Vazirgiannis M (2001) On clustering validation techniques. J Intell Inform Syst 17:107–145
Geva A (1999) Hierarchical unsupervised fuzzy clustering. IEEE Trans Fuzzy Set 7(6):723–733
Handl J, Meyer B (2007) Ant-based and swarm-based clustering. Swarm Intell 1(2):95–113
Hillis WD (1990) Co-evolving parasites improve simulated evolution as an optimization procedure. Physica D 42:228–234
Hruschka ER, Campello RGB, Freitas AA, Carvalho APL (2009) A survey of evolutionary algorithms for clustering. IEEE Trans Syst Man Cybern Part C 39(2):133–155
Hruschka ER, Ebecken NFF (2003) A genetic algorithm for cluster analysis. Intell Data Anal 7:15–25
Jain A, Murty M, Flynn P (1999) Data clustering: a review. ACM Comput Surv 31(3):264–323
Jeong YS, Kim SJ, Jeong MK (2008) Automatic identification of defect patterns in semiconductor wafer maps using spatial correlogram and dynamic time warping. IEEE Trans Semicond Manuf 21(4):625–637
Johnson RA, Wichern DW (2002) Applied multivariate statistical analysis. Prentice-Hall, New York
Kim JY, Kim Y, Kim YK (2001) An endosymbiotic evolutionary algorithm for optimization. Appl Intell 15(2):117–130
Kim YK, Kim JY, Kim Y (2000) A coevolutionary algorithm for balancing and sequencing in mixed model assembly lines. Appl Intell 13:247–258
Kim YK, Kim JY, Shin KS (2007) An asymmetric multileveled symbiotic evolutionary algorithm for integrated FMS scheduling. J Intell Manuf 18:631–645
Kim YK, Park KT, Ko JS (2003) A symbiotic evolutionary algorithm for the integration of process planning and job shop scheduling. Comput Oper Res 30:1151–1171
Koza JR (1992) Genetic programming. MIT Press, Cambridge
Korkmaz EE (2010) Multi-objective Genetic Algorithms for grouping problems. Appl Intell 33:179–192
Kuncheva LI, Bezdek JC (1997) Selection of cluster prototypes from data by a genetic algorithm. In: Proceedings of 5th European congress on intelligent techniques and soft computing, pp 1683–1688
Lu Y, Lu S, Fotouhi F, Deng Y, Brown SJ (2004) FGKA: a fast genetic K-means clustering algorithm. In: Proceedings of ACM symposium on applied computing, pp 622–623
Margulis L (1970) Origin of eukaryotic cells. Yale University Press, New Haven
Maulik U, Bandyopadhyay S (2000) Genetic algorithm-based clustering technique. Pattern Recognit 33:1455–1465
Michalewics Z (1996) Genetic algorithms+data structures=evolution programs, 3rd edn. Springer, Berlin
Moriarty DE, Miikkulainen R (1997) Forming neural networks through efficient and adaptive coevolution. Evol Comput 5:373–399
Murthy CA, Chowdhury N (1996) In search of optimal clusters using genetic algorithms. Pattern Recognit Lett 17:825–832
Ozyer T, Zhang M, Alhajj R (2009) Parallel clustering of high dimensional data by integrating multi-objective genetic algorithm with divide and conquer. Appl Intell 31:318–331
Ozyer T, Zhang M, Alhajj R (2011) Integrating multi-objective genetic algorithm based clustering and data partitioning for skyline computation. Appl Intell 35:110–122
Potter MA (1997) The design and analysis of a computational model of cooperative coevolution. PhD dissertation, George Mason University
Rosin CD, Belew RK (1997) New methods for competitive coevolution. Evol Comput 5:1–29
Saitta S, Raphael B, Smith IFC (2008) A comprehensive validity index for clustering. Intell Data Anal 12:529–548
Selim S, Alsultan K (1991) A simulated annealing algorithm for the clustering problems. Pattern Recognit 24(10):1003–1008
Sung C, Jin H (2000) A Tabu-search-based heuristic for clustering. Pattern Recognit 33:849–858
Syswerda G (1991) A study of reproduction in generational and steady-state genetic algorithms. In: Foundations of genetic algorithms. Morgan Kaufmann, San Mateo, pp 94–101
Tou JT, Gonzalez RC (1974) Pattern recognition principles. Addison-Wesley, Reading, MA
Tseng LY, Yang SB (2001) A genetic approach to the automatic clustering problem. Pattern Recognit 34:415–424
Vesanto J, Alhoniemi E (2000) Clustering of the self-organizing map. IEEE Transactions on Neural Networks 11(3):586-600
Wang CH (2008) Recognition of semiconductor defect patterns using spatial filtering and spectral clustering. Expert Syst Appl 34:1914–1923
Whitley D (1989) The genitor algorithm and selection pressure: why rank-based allocation of reproductive trials is best. In: Proceedings of the third international conference on genetic algorithms. Morgan Kaufmann, San Mateo, pp 116–121
Xu R, Wunsch D (2005) Survey of clustering algorithms. IEEE Trans Neural Netw 16(3):645–678
Yuan T, Kuo W (2008) Spatial defect pattern recognition on semiconductor wafers using model-based clustering and Bayesian inference. Eur J Oper Res 190:228–240
Author information
Authors and Affiliations
Corresponding authors
Rights and permissions
About this article
Cite this article
Shin, K.S., Jeong, YS. & Jeong, M.K. A two-leveled symbiotic evolutionary algorithm for clustering problems. Appl Intell 36, 788–799 (2012). https://doi.org/10.1007/s10489-011-0295-y
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10489-011-0295-y