Abstract
Association rule mining often yields unmanageably large rule sets of low precision, which makes the perusal of the extracted knowledge ineffective. Extracting and exploiting compact, informative generic bases of association rules has therefore become a must. Such bases also provide a powerful verification technique for preventing gene mis-annotation or poor clustering in the UniGene library. However, the extracted generic bases are still oversized, and exploiting them remains impractical, so presenting the expert with only the most valuable nuggets of knowledge is a pressing issue. To address this drawback, we propose a novel approach called EGEA (Evolutionary Gene Extraction Approach). It aims to considerably reduce the quantity of knowledge, extracted from a gene expression dataset, that is presented to an expert. We first use a genetic algorithm to select the most predictive set of genes with respect to patient situations. Once the relevant attributes (genes) have been selected, they serve as input to a second stage, which extracts generic association rules from this reduced set of genes. The notable decrease in the number of generic association rules extracted from the selected gene set improves the quality of knowledge exploitation. Experiments carried out on a benchmark dataset showed that the selected set includes genes not previously known to be associated with prognosis. These genes may serve as molecular targets for new therapeutic strategies to repress the relapse of pediatric acute myeloid leukemia (AML).
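The first stage described above, genetic-algorithm selection of a predictive gene subset, can be illustrated with a minimal sketch. The abstract does not specify the fitness function, the encoding, or the genetic operators used by EGEA, so everything below is an assumption made for illustration: a bit-mask encoding, a nearest-centroid classifier as a stand-in fitness, tournament selection, one-point crossover, and synthetic binary expression data. The function names (`make_toy_data`, `fitness`, `select_genes`) are hypothetical. The second stage, generic association rule extraction, is not covered here.

```python
import random


def make_toy_data(rng, n_samples=40, n_genes=12, informative=(0, 3), noise=0.1):
    """Hypothetical data: binary expression levels and two patient classes.

    Only the genes listed in `informative` track the class label (with some
    noise); every other gene is random.
    """
    rows, labels = [], []
    for _ in range(n_samples):
        label = rng.randint(0, 1)
        row = [rng.randint(0, 1) for _ in range(n_genes)]
        for g in informative:
            row[g] = label if rng.random() > noise else 1 - label
        rows.append(row)
        labels.append(label)
    return rows, labels


def fitness(mask, rows, labels, penalty=0.2):
    """Assumed fitness: nearest-centroid training accuracy on the selected
    genes, minus a penalty proportional to the subset size."""
    genes = [g for g, bit in enumerate(mask) if bit]
    if not genes:
        return 0.0
    centroids = {}
    for c in (0, 1):
        members = [r for r, y in zip(rows, labels) if y == c]
        if not members:
            return 0.0
        centroids[c] = [sum(r[g] for r in members) / len(members) for g in genes]
    correct = 0
    for r, y in zip(rows, labels):
        dist = {
            c: sum((r[g] - m) ** 2 for g, m in zip(genes, centroids[c]))
            for c in (0, 1)
        }
        correct += int(min(dist, key=dist.get) == y)
    return correct / len(rows) - penalty * len(genes) / len(mask)


def select_genes(rows, labels, rng, pop_size=20, generations=30, p_mut=0.05):
    """Evolve bit masks over genes; return the best mask and the best
    fitness recorded at each generation."""
    n_genes = len(rows[0])

    def score(mask):
        return fitness(mask, rows, labels)

    pop = [[rng.randint(0, 1) for _ in range(n_genes)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        history.append(score(pop[0]))
        nxt = [pop[0][:]]  # elitism: the current best survives unchanged
        while len(nxt) < pop_size:
            a = max(rng.sample(pop, 3), key=score)  # tournament of size 3
            b = max(rng.sample(pop, 3), key=score)
            cut = rng.randrange(1, n_genes)  # one-point crossover
            child = a[:cut] + b[cut:]
            child = [1 - bit if rng.random() < p_mut else bit for bit in child]
            nxt.append(child)
        pop = nxt
    best = max(pop, key=score)
    history.append(score(best))
    return best, history


if __name__ == "__main__":
    rng = random.Random(0)
    rows, labels = make_toy_data(rng)
    best, history = select_genes(rows, labels, rng)
    print("selected genes:", [g for g, bit in enumerate(best) if bit])
```

The size penalty in `fitness` is one simple way to push the search toward small gene subsets, which is what makes the later rule-extraction stage cheaper; a real study would likely use cross-validated accuracy rather than training accuracy.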
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Esseghir, M.A., Gasmi, G., Yahia, S.B., Slimani, Y. (2006). EGEA : A New Hybrid Approach Towards Extracting Reduced Generic Association Rule Set (Application to AML Blood Cancer Therapy). In: Tjoa, A.M., Trujillo, J. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2006. Lecture Notes in Computer Science, vol 4081. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11823728_47
DOI: https://doi.org/10.1007/11823728_47
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-37736-8
Online ISBN: 978-3-540-37737-5
eBook Packages: Computer Science, Computer Science (R0)