Abstract
In recent years, DNA microarray technology has attracted growing attention in both scientific and industrial fields. Identifying the informative genes that cause cancer is important for improving early diagnosis and for delivering effective chemotherapy. To gain deeper insight into the cancer classification problem, it is necessary to take a closer look at the gene selection methods that have been proposed. We believe that gene selection should be an integral preprocessing step for cancer classification. Furthermore, finding an accurate gene selection method is a significant issue in cancer classification, because gene selection reduces the dimensionality of the microarray dataset and selects informative genes. In this paper, we review, classify, and compare state-of-the-art gene selection methods. We evaluate the performance of each gene selection approach based on its classification accuracy and the number of informative genes it selects. Our evaluation uses four benchmark microarray datasets for cancer diagnosis (Leukemia, Colon, Lung, and Prostate). In addition, we compare the performance of the gene selection methods to identify the one that can select a small set of marker genes while ensuring high cancer classification accuracy.
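As a rough illustration of the two evaluation criteria named in the abstract (classification accuracy and number of selected genes), the following minimal sketch ranks genes with a simple filter score and measures leave-one-out accuracy. Everything in it is an assumption made for illustration and is not taken from the paper: the signal-to-noise filter score, the nearest-centroid classifier, the function names, and the tiny synthetic expression matrix, which stands in for a real benchmark dataset.

```python
from statistics import mean, stdev

def snr_scores(X, y):
    """Filter score per gene: |mean_0 - mean_1| / (sd_0 + sd_1). Illustrative choice."""
    scores = []
    for g in range(len(X[0])):
        a = [row[g] for row, label in zip(X, y) if label == 0]
        b = [row[g] for row, label in zip(X, y) if label == 1]
        denom = stdev(a) + stdev(b)
        scores.append(abs(mean(a) - mean(b)) / denom if denom > 0 else 0.0)
    return scores

def select_top_genes(X, y, k):
    """Return the indices of the k highest-scoring genes."""
    scores = snr_scores(X, y)
    return sorted(range(len(scores)), key=lambda g: scores[g], reverse=True)[:k]

def nearest_centroid_predict(X_train, y_train, x, genes):
    """Assign x to the class whose centroid is closest on the selected genes."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(y_train)):
        rows = [r for r, l in zip(X_train, y_train) if l == label]
        centroid = [mean(r[g] for r in rows) for g in genes]
        dist = sum((x[g] - c) ** 2 for g, c in zip(genes, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def loocv_accuracy(X, y, k):
    """Leave-one-out accuracy; genes are re-selected inside each fold
    so the held-out sample never influences the selection."""
    correct = 0
    for i in range(len(X)):
        X_tr, y_tr = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        genes = select_top_genes(X_tr, y_tr, k)
        if nearest_centroid_predict(X_tr, y_tr, X[i], genes) == y[i]:
            correct += 1
    return correct / len(X)

# Synthetic toy data (6 samples x 4 genes); only gene 0 separates the classes.
X = [
    [1.0, 5.0, 3.0, 2.0],
    [1.2, 4.0, 3.5, 2.5],
    [0.9, 6.0, 2.5, 1.5],
    [5.0, 5.5, 3.2, 2.2],
    [5.3, 4.5, 2.8, 1.8],
    [4.8, 5.0, 3.1, 2.1],
]
y = [0, 0, 0, 1, 1, 1]

selected = select_top_genes(X, y, k=1)
accuracy = loocv_accuracy(X, y, k=1)
print(f"selected genes: {selected}, LOOCV accuracy: {accuracy:.2f}")
```

Re-running the selection inside each fold is a deliberate design choice: selecting genes on the full dataset before cross-validation would leak information from the held-out sample and inflate the reported accuracy.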
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Alshamlan, H., Badr, G., Alohali, Y. (2019). A Comparative Study of Gene Selection Methods for Microarray Cancer Classification. In: Abawajy, J., Othman, M., Ghazali, R., Deris, M., Mahdin, H., Herawan, T. (eds) Proceedings of the International Conference on Data Engineering 2015 (DaEng-2015) . Lecture Notes in Electrical Engineering, vol 520. Springer, Singapore. https://doi.org/10.1007/978-981-13-1799-6_60
DOI: https://doi.org/10.1007/978-981-13-1799-6_60
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-1797-2
Online ISBN: 978-981-13-1799-6
eBook Packages: Intelligent Technologies and Robotics (R0)