Handling Class Imbalance Problem in Heterogeneous Cross-Project Defect Prediction

Vashisht, Rohit; Rizvi, Syed Afzal Murtaza

doi:10.1007/978-981-15-5113-0_7

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1165)

Abstract

Software Defect Prediction (SDP) is one of the key tasks in the testing phase of the Software Development Life Cycle (SDLC). It identifies modules that are more susceptible to defects and therefore require more testing, so that flaws are found early and the extra cost of software development is reduced. Much research has been performed on Cross-Project Defect Prediction (CPDP), which seeks to predict defects in a target application that lacks historical defect data, or has only limited defect data, so that an efficient, generalized model for forecasting defects in a software project can be constructed. The proposed research work focuses on defect prediction using heterogeneous metric sets, in which there are no common metrics between the source and target applications. This paper also discusses the Class Imbalance Problem (CIP), which occurs in a dataset because of the disproportionate number of positive and negative instances. A classifier trained on an imbalanced dataset gives biased results. We used the Adaptive Boosting (AdaBoost) method to manage CIP in Heterogeneous Cross-Project Defect Prediction (HCPDP), and the experimental findings demonstrate significant improvements after CIP is handled.
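
The idea of boosting on an imbalanced defect dataset can be illustrated with a minimal sketch. It is not the authors' experimental setup: the two metrics (`loc`, `complexity`), the toy data, and all function names are hypothetical, and the heterogeneous metric matching step of HCPDP is left out entirely. The sketch implements plain discrete AdaBoost with decision stumps, where each round raises the weight of misclassified modules so that later stumps pay more attention to them. The toy data is separable by design so that the result is easy to check.

```python
import math

def train_stump(X, y, w):
    """Return (weighted_error, feature, threshold, polarity) of the best stump."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for pol in (1, -1):
                preds = [pol if row[j] >= t else -pol for row in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, pol)
    return best

def stump_predict(stump, row):
    _, j, t, pol = stump
    return pol if row[j] >= t else -pol

def adaboost_fit(X, y, rounds=5):
    n = len(X)
    w = [1.0 / n] * n                      # start with uniform instance weights
    model = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)   # clamp to avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)      # vote of this stump
        model.append((alpha, stump))
        # Increase weights of misclassified modules, decrease the others.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, yi, row in zip(w, y, X)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def adaboost_predict(model, row):
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in model)
    return 1 if score >= 0 else -1

# Hypothetical toy data: [loc, complexity]; +1 = defective, -1 = clean.
# 12 clean modules versus 3 defective ones (an imbalanced 4:1 ratio).
X = [[50, 2], [60, 3], [80, 4], [90, 2], [100, 5], [110, 3],
     [120, 6], [130, 4], [140, 5], [150, 7], [160, 6], [180, 8],
     [310, 20], [350, 25], [420, 31]]
y = [-1] * 12 + [1] * 3

model = adaboost_fit(X, y, rounds=5)
predictions = [adaboost_predict(model, row) for row in X]
minority_recall = sum(p == 1 for p, yi in zip(predictions, y) if yi == 1) / 3
print("recall on defective modules:", minority_recall)
```

On real, non-separable defect data the reweighting step is what matters: minority (defective) modules that the early stumps miss gain weight, so later stumps are pushed to classify them correctly. Measures such as recall on the defective class are usually more informative than accuracy when the classes are imbalanced.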

Author information

Authors and Affiliations

Jamia Millia Islamia, Delhi, India
Rohit Vashisht & Syed Afzal Murtaza Rizvi

Corresponding author

Correspondence to Rohit Vashisht.

Editor information

Editors and Affiliations

Maharaja Agrasen Institute of Technology, Rohini, Delhi, India
Deepak Gupta
Maharaja Agrasen Institute of Technology, Rohini, Delhi, India
Ashish Khanna
CHRIST (Deemed to be University), Bengaluru, Karnataka, India
Siddhartha Bhattacharyya
Department of Information Technology, Faculty of Computers and Information, Cairo University, Giza, Egypt
Aboul Ella Hassanien
Department of Computer Science, Shaheed Sukhdev College of Business Studies, University of Delhi, Rohini, Delhi, India
Sameer Anand
Department of Computer Science, Shaheed Sukhdev College of Business Studies, University of Delhi, Rohini, Delhi, India
Ajay Jaiswal

Cite this paper

Vashisht, R., Rizvi, S.A.M. (2021). Handling Class Imbalance Problem in Heterogeneous Cross-Project Defect Prediction. In: Gupta, D., Khanna, A., Bhattacharyya, S., Hassanien, A.E., Anand, S., Jaiswal, A. (eds) International Conference on Innovative Computing and Communications. Advances in Intelligent Systems and Computing, vol 1165. Springer, Singapore. https://doi.org/10.1007/978-981-15-5113-0_7

DOI: https://doi.org/10.1007/978-981-15-5113-0_7
Published: 02 August 2020
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-5112-3
Online ISBN: 978-981-15-5113-0
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)
