Abstract
Breast cancer is the most common type of cancer in women worldwide. Despite this, few studies apply data mining techniques in ways that can help medical doctors in their daily practice.
This paper presents a comparative study of three ensemble methods (TreeBagger, LPBoost and Subspace) that uses a clinical dataset with 25% missing values to predict the overall survival of women with breast cancer. The missing values were imputed with the k-nearest neighbor (k-NN) algorithm using four distinct numbers of neighbors, in order to determine the best one for this particular scenario. Tests were performed for each combination of ensemble method and k-NN configuration, and their performance was compared using a Friedman test. Despite the complexity of this challenge, the results are promising: the best algorithm configuration (TreeBagger using 3 neighbors) achieves a prediction accuracy of 73%.
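As an illustration of the imputation step described above, the following is a minimal sketch of k-NN imputation in plain Python. It is not the authors' implementation: the abstract does not state the distance metric, the four neighbor values tested, or how neighbor values are aggregated, so the Euclidean distance over shared features, the mean aggregation, the function name `knn_impute`, and the toy data are all assumptions made for this example.

```python
import math


def knn_impute(rows, k):
    """Fill None entries with the mean of that feature over the k nearest rows.

    Assumptions (not taken from the paper): distance is Euclidean over the
    features observed in both rows, rescaled by the share of features
    compared; neighbor values are averaged; neighbors are always looked up
    in the original data, never in already-imputed values.
    """
    n_features = len(rows[0])
    imputed = [list(r) for r in rows]  # copy so the input is left untouched
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for m, other in enumerate(rows):
                # A neighbor must be a different row that has feature j observed.
                if m == i or other[j] is None:
                    continue
                shared = [(a, b) for a, b in zip(row, other)
                          if a is not None and b is not None]
                if not shared:
                    continue
                sq = sum((a - b) ** 2 for a, b in shared)
                dist = math.sqrt(sq * n_features / len(shared))
                candidates.append((dist, m, other[j]))
            candidates.sort()
            nearest = candidates[:k]
            if nearest:
                imputed[i][j] = sum(v for _, _, v in nearest) / len(nearest)
    return imputed


# Hypothetical toy data: three numeric features, None marks a missing value.
rows = [
    [1.0, 2.0, None],
    [1.0, 2.0, 10.0],
    [1.1, 2.1, 12.0],
    [9.0, 9.0, 50.0],
    [1.2, None, 14.0],
]

for k in (1, 3):
    print(k, knn_impute(rows, k)[0][2])
```

With this toy data, the missing third feature of the first row is filled with 10.0 for k = 1 and 12.0 for k = 3, which shows how the choice of k changes the imputed value and why the study compares several settings.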
© 2014 Springer International Publishing Switzerland
Cite this paper
Abreu, P.H. et al. (2014). Overall Survival Prediction for Women Breast Cancer Using Ensemble Methods and Incomplete Clinical Data. In: Roa Romero, L. (eds) XIII Mediterranean Conference on Medical and Biological Engineering and Computing 2013. IFMBE Proceedings, vol 41. Springer, Cham. https://doi.org/10.1007/978-3-319-00846-2_338
DOI: https://doi.org/10.1007/978-3-319-00846-2_338
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-00845-5
Online ISBN: 978-3-319-00846-2
eBook Packages: Engineering, Engineering (R0)