Abstract
This paper introduces KEEL, a software tool to assess evolutionary algorithms for Data Mining problems of various kinds, including regression, classification, unsupervised learning, etc. It includes evolutionary learning algorithms based on different approaches (Pittsburgh, Michigan and IRL), as well as the integration of evolutionary learning techniques with different pre-processing techniques, allowing it to perform a complete analysis of any learning model in comparison to existing software tools. Moreover, KEEL has been designed with a double goal: research and education.
Additional information
Supported by the Spanish Ministry of Science and Technology under Projects TIN-2005-08386-C05-(01, 02, 03, 04 and 05). The work of Dr. Bacardit is also supported by the UK Engineering and Physical Sciences Research Council (EPSRC) under grant GR/T07534/01.
About this article
Cite this article
Alcalá-Fdez, J., Sánchez, L., García, S. et al. KEEL: a software tool to assess evolutionary algorithms for data mining problems. Soft Comput 13, 307–318 (2009). https://doi.org/10.1007/s00500-008-0323-y
DOI: https://doi.org/10.1007/s00500-008-0323-y