Abstract
Classifying microarray data is a demanding task for computational techniques because of the data's inherent characteristics, chiefly a small sample size and a high-dimensional input space. Two-class classification techniques have been widely applied to this type of data, while one-class learning is considered a promising alternative. In this paper, we study the suitability of one-class classification for microarray datasets and analyze the role played by feature selection. The superiority of this approach is demonstrated by comparing it with the classical two-class approach on several benchmark datasets.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Pérez-Sánchez, B., Fontenla-Romero, O., Sánchez-Maroño, N. (2015). One-Class Classification for Microarray Datasets with Feature Selection. In: Iliadis, L., Jayne, C. (eds) Engineering Applications of Neural Networks. EANN 2015. Communications in Computer and Information Science, vol 517. Springer, Cham. https://doi.org/10.1007/978-3-319-23983-5_30
DOI: https://doi.org/10.1007/978-3-319-23983-5_30
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23981-1
Online ISBN: 978-3-319-23983-5
eBook Packages: Computer Science, Computer Science (R0)