Abstract
In software assurance, it is crucial to prevent a program with defective modules from being released to users, since doing so reduces maintenance cost and increases software quality and reliability. Many prior attempts to detect defects automatically employed conventional classification techniques, e.g., Decision Tree, k-NN, and Naïve Bayes. However, their detection performance was limited by class imbalance: the number of defective modules is very small compared to that of non-defective modules. This paper aims to achieve a high prediction rate by employing an unbiased SVM called “R-SVM,” our version of SVM tailored to domains with imbalanced classes. The experiment was conducted on the NASA Metric Data Program (MDP) data set. The results showed that the proposed system outperformed all of the major traditional approaches.
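The abstract does not describe how R-SVM removes the bias, so the following is only a minimal sketch of one common remedy for class imbalance: moving the decision threshold of an already-trained classifier instead of keeping the default cut-off of 0. It is not the authors' method. The function names and the decision values are hypothetical and made up for illustration, and in practice the threshold would be chosen on held-out validation data.

```python
def f1_at_threshold(scores, labels, threshold):
    """F1 of the defective class (label 1) when predicting score >= threshold."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if s >= threshold and y == 1)
    fp = sum(1 for s, y in pairs if s >= threshold and y == 0)
    fn = sum(1 for s, y in pairs if s < threshold and y == 1)
    if tp == 0:
        return 0.0  # no defective module found, so F1 is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_threshold(scores, labels):
    """Try every observed score as a cut-off and keep the one with the highest F1."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda t: f1_at_threshold(scores, labels, t))


# Hypothetical decision values for 10 modules, only 2 of them defective.
# A classifier biased toward the majority class scores every module below 0.
scores = [-1.9, -1.7, -1.5, -1.4, -1.2, -1.1, -1.0, -0.9, -0.6, -0.4]
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

default_f1 = f1_at_threshold(scores, labels, 0.0)  # default cut-off misses both defects
tuned = best_threshold(scores, labels)
tuned_f1 = f1_at_threshold(scores, labels, tuned)

print(f"default F1 = {default_f1}, tuned threshold = {tuned}, tuned F1 = {tuned_f1}")
```

With the default cut-off, every module is predicted non-defective, which looks accurate overall but finds no defects; shifting the threshold recovers the minority class in this toy data.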
Copyright information
© 2015 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Choeikiwong, T., Vateekul, P. (2015). Software Defect Prediction in Imbalanced Data Sets Using Unbiased Support Vector Machine. In: Kim, K. (eds) Information Science and Applications. Lecture Notes in Electrical Engineering, vol 339. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-46578-3_110
DOI: https://doi.org/10.1007/978-3-662-46578-3_110
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-46577-6
Online ISBN: 978-3-662-46578-3
eBook Packages: Engineering, Engineering (R0)