Abstract
Software defect detection aims to automatically identify defective software modules for efficient software testing, in order to improve the quality of a software system. Although many machine learning methods have been successfully applied to this task, most of them fail to consider two practical yet important issues in software defect detection. First, it is difficult to collect a large amount of labeled training data for learning a well-performing model; second, a software system usually contains far fewer defective modules than defect-free modules, so learning must be conducted over an imbalanced data set. In this paper, we address these two practical issues simultaneously by proposing a novel semi-supervised learning approach named Rocus. This method exploits the abundant unlabeled examples to improve detection accuracy, and employs under-sampling to tackle the class-imbalance problem during learning. Experimental results on real-world software defect detection tasks show that Rocus is effective: its performance is better than that of a semi-supervised learning method that ignores the class-imbalance nature of the task, and of a class-imbalance learning method that does not make effective use of unlabeled data.
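The two ingredients the abstract combines can be sketched in a few lines. The following is a minimal illustrative sketch, not the paper's actual Rocus algorithm: it pairs random under-sampling of the majority (defect-free) class with a self-training-style pseudo-labeling loop over unlabeled modules. The function names, the single-feature 1-nearest-neighbour base learner, and the toy data are all assumptions introduced here for clarity.

```python
import random

def undersample(labeled, seed=0):
    """Randomly drop majority-class (defect-free, label 0) examples
    until both classes are the same size."""
    rng = random.Random(seed)
    pos = [ex for ex in labeled if ex[1] == 1]
    neg = [ex for ex in labeled if ex[1] == 0]
    major, minor = (neg, pos) if len(neg) >= len(pos) else (pos, neg)
    return minor + rng.sample(major, len(minor))

def nn_predict(train, x):
    """1-nearest-neighbour on a single numeric feature (toy base learner)."""
    return min(train, key=lambda ex: abs(ex[0] - x))[1]

def rocus_like(labeled, unlabeled, rounds=3):
    """Each round: rebalance the labeled pool by under-sampling, then
    pseudo-label the unlabeled modules and add them to the pool."""
    pool = list(labeled)
    for _ in range(rounds):
        if not unlabeled:
            break
        balanced = undersample(pool)
        pool += [(x, nn_predict(balanced, x)) for x in unlabeled]
        unlabeled = []
    return undersample(pool)

# Toy data: feature = a code-complexity score, label 1 = defective.
labeled = [(0.9, 1), (0.8, 1), (0.1, 0), (0.2, 0), (0.15, 0), (0.3, 0)]
unlabeled = [0.85, 0.05]
model = rocus_like(labeled, unlabeled)
print(nn_predict(model, 0.95))  # a high-complexity module is flagged defective
```

The final pool is class-balanced by construction, which is the point of the under-sampling step: without it, the pseudo-labeling loop would be dominated by the defect-free majority.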
Additional information
This work was supported by the National Natural Science Foundation of China under Grant Nos. 60975043, 60903103, and 60721002.
Cite this article
Jiang, Y., Li, M. & Zhou, Z.H. Software Defect Detection with Rocus. J. Comput. Sci. Technol. 26, 328–342 (2011). https://doi.org/10.1007/s11390-011-9439-0