Abstract
Correctly classifying patterns in images is a challenging task that has recently become a focus of research in machine learning and computer vision. For practical model building, images are described by many variables, such as shape, texture, color, and spectral features. Hundreds or thousands of features may be extracted from an image, each carrying only a small amount of information. Selecting the optimal, relevant features is therefore very important for correctly classifying benign and malignant tumors in a breast cancer dataset. In this paper we analyze several feature selection algorithms, namely best-first search, the chi-square test, gain ratio, information gain, recursive feature elimination, and random forest, on our dataset. We also propose a technique that ranks all the selected features based on the scores assigned by the different feature selection algorithms.
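The abstract does not spell out how the scores from the different algorithms are combined into one ranking. As a rough illustration only, one common way to do this is mean-rank aggregation: convert each method's scores to ranks, then average the ranks per feature. The sketch below assumes that approach; the function names, feature names, and score values are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple


def ranks_from_scores(scores: Dict[str, float]) -> Dict[str, int]:
    """Rank features 1..n by descending score (1 = best); ties broken by name."""
    ordered = sorted(scores, key=lambda feature: (-scores[feature], feature))
    return {feature: position for position, feature in enumerate(ordered, start=1)}


def aggregate_ranks(
    per_method_scores: Dict[str, Dict[str, float]]
) -> List[Tuple[str, float]]:
    """Average each feature's rank across methods; best (lowest mean rank) first."""
    per_method_ranks = [ranks_from_scores(s) for s in per_method_scores.values()]
    features = set(per_method_ranks[0])
    # Assumption: every method scored exactly the same set of features.
    if any(set(r) != features for r in per_method_ranks):
        raise ValueError("all methods must score the same features")
    mean_rank = {
        f: sum(r[f] for r in per_method_ranks) / len(per_method_ranks)
        for f in features
    }
    return sorted(mean_rank.items(), key=lambda item: (item[1], item[0]))


# Made-up scores for illustration; real values would come from the selectors.
scores = {
    "chi_square": {"contrast": 12.4, "entropy": 9.1, "area": 3.3, "mean_intensity": 1.0},
    "information_gain": {"contrast": 0.40, "entropy": 0.45, "area": 0.10, "mean_intensity": 0.05},
    "random_forest": {"contrast": 0.35, "entropy": 0.30, "area": 0.25, "mean_intensity": 0.10},
}
consensus = aggregate_ranks(scores)
for feature, rank in consensus:
    print(f"{feature}: mean rank {rank:.2f}")
```

Working with ranks rather than raw scores avoids having to put chi-square statistics, information gain, and forest importances on a common scale, at the cost of discarding how large the gaps between features are.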
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Verma, K., Singh, B.K., Tripathi, P., Thoke, A.S. (2015). Review of Feature Selection Algorithms for Breast Cancer Ultrasound Image. In: Barbucha, D., Nguyen, N., Batubara, J. (eds) New Trends in Intelligent Information and Database Systems. Studies in Computational Intelligence, vol 598. Springer, Cham. https://doi.org/10.1007/978-3-319-16211-9_3
DOI: https://doi.org/10.1007/978-3-319-16211-9_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16210-2
Online ISBN: 978-3-319-16211-9
eBook Packages: Engineering, Engineering (R0)