Abstract
Gene selection is an important problem in microarray data processing. This paper proposes a new gene selection method based on the Wilcoxon rank sum test and the Support Vector Machine (SVM). First, the Wilcoxon rank sum test is used to select a candidate subset of genes. Each selected gene is then trained and tested separately using an SVM classifier with a linear kernel, and the genes with high testing accuracy are chosen to form the final reduced gene subset. Leave-one-out cross validation (LOOCV) classification results on two datasets, Breast Cancer and ALL/AML leukemia, demonstrate that the proposed method achieves a 100% success rate with the final reduced subset. The selected genes are listed and their expression levels are plotted, showing that the selected genes clearly separate the two classes.
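The two-stage procedure described in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation. The function names, the cutoffs `z_cut` and `acc_cut`, and the use of leave-one-out accuracy for the per-gene score are all hypothetical choices, since the abstract gives no thresholds or per-gene evaluation protocol. To keep the sketch dependent only on the Python standard library, a simple threshold rule (the midpoint between the two class means) stands in for the linear-kernel SVM; it is a substitution, not the paper's classifier. The rank sum statistic uses the normal approximation without a tie correction.

```python
from math import sqrt


def rank_sum_z(a, b):
    """Wilcoxon rank-sum z-score for lists a and b (normal approximation)."""
    pooled = sorted((v, i) for i, v in enumerate(a + b))
    ranks = [0.0] * len(pooled)
    start = 0
    while start < len(pooled):
        end = start
        while end + 1 < len(pooled) and pooled[end + 1][0] == pooled[start][0]:
            end += 1
        avg_rank = (start + end) / 2 + 1  # tied values share their average rank
        for k in range(start, end + 1):
            ranks[pooled[k][1]] = avg_rank
        start = end + 1
    n1, n2 = len(a), len(b)
    w = sum(ranks[:n1])  # rank sum of the first class
    mean = n1 * (n1 + n2 + 1) / 2
    sd = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # assumption: no tie correction
    return (w - mean) / sd


def loocv_accuracy(values, labels):
    """Leave-one-out accuracy of a one-gene threshold rule.

    Stand-in for the linear-kernel SVM: the cut is the midpoint between
    the two class means of the training fold. Assumes labels are 0/1 and
    each class has at least two samples.
    """
    correct = 0
    for held in range(len(values)):
        train = [(v, y) for j, (v, y) in enumerate(zip(values, labels)) if j != held]
        class0 = [v for v, y in train if y == 0]
        class1 = [v for v, y in train if y == 1]
        m0 = sum(class0) / len(class0)
        m1 = sum(class1) / len(class1)
        cut = (m0 + m1) / 2
        # Predict class 1 when the held-out value falls on class 1's side of the cut.
        pred = 1 if (values[held] > cut) == (m1 > m0) else 0
        correct += pred == labels[held]
    return correct / len(values)


def select_genes(expr, labels, z_cut=2.0, acc_cut=0.95):
    """Return (gene, accuracy) pairs that pass both stages.

    expr maps a gene name to its expression values, one per sample.
    z_cut and acc_cut are hypothetical cutoffs chosen for illustration.
    """
    kept = []
    for gene, values in expr.items():
        a = [v for v, y in zip(values, labels) if y == 0]
        b = [v for v, y in zip(values, labels) if y == 1]
        if abs(rank_sum_z(a, b)) < z_cut:
            continue  # stage 1: drop genes the rank sum test does not flag
        acc = loocv_accuracy(values, labels)
        if acc >= acc_cut:  # stage 2: keep genes that classify well alone
            kept.append((gene, acc))
    return kept
```

Calling `select_genes(expr, labels)` with `expr` as a dictionary of per-gene expression lists and `labels` as a 0/1 list returns only the genes that survive both the rank sum filter and the per-gene accuracy check. Replacing `loocv_accuracy` with a routine that trains a real linear SVM on the single gene would bring the sketch closer to the method the abstract describes.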
Keywords
- Support Vector Machine
- Acute Myeloid Leukemia
- Support Vector Machine Classifier
- Gene Selection
- Linear Kernel
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Liao, C., Li, S., Luo, Z. (2007). Gene Selection Using Wilcoxon Rank Sum Test and Support Vector Machine for Cancer Classification. In: Wang, Y., Cheung, Ym., Liu, H. (eds) Computational Intelligence and Security. CIS 2006. Lecture Notes in Computer Science, vol 4456. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74377-4_7
DOI: https://doi.org/10.1007/978-3-540-74377-4_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74376-7
Online ISBN: 978-3-540-74377-4
eBook Packages: Computer Science, Computer Science (R0)