An Empirical Study on Discovering Software Bugs Using Machine Learning Techniques

  • Conference paper
Computational Intelligence and Data Analytics

Abstract

A bug is a defect in software that needs to be identified early to avoid the unnecessary burden it causes later. Bug discovery in software modules is a long-standing problem. Recently, machine learning (ML) has become a useful and appropriate solution to many real-world problems, and its use is an important step forward in improving the state of the art in bug detection. As an artificial intelligence (AI)-based approach, ML makes bug detection more effective given the large number of software modules to be examined. Many existing methods have incorporated ML for bug discovery; however, there is a need for improvement through an appropriate methodology. In this paper, we propose a methodology that uses two ML techniques, decision tree (DT) and random forest (RF), to discover bugs in software modules efficiently. An empirical study is carried out using the Python data science platform. Experimental results show that RF performs better than DT in terms of bug prediction accuracy.
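The DT-versus-RF comparison described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the dataset, the features, or the hyperparameters, so the synthetic module metrics (`complexity`, `churn`, `loc_tens`), the labelling rule, the tree depth, and the forest size below are all assumptions chosen for illustration. The sketch uses only the standard library; a real study would more likely rely on an established ML library.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels, feature_ids):
    """Return (impurity, feature, threshold) of the best split, or None."""
    best = None
    for f in feature_ids:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best

def build_tree(rows, labels, rng, max_depth=5, n_features=None):
    """Grow a tree; leaves are class labels, inner nodes are 4-tuples."""
    majority = Counter(labels).most_common(1)[0][0]
    if max_depth == 0 or len(set(labels)) == 1:
        return majority
    all_features = range(len(rows[0]))
    # A plain decision tree considers every feature at each node; a forest
    # tree considers a random subset (n_features) at each node.
    feats = list(all_features) if n_features is None else rng.sample(all_features, n_features)
    split = best_split(rows, labels, feats)
    if split is None:
        return majority
    _, f, t = split
    li = [i for i, r in enumerate(rows) if r[f] <= t]
    ri = [i for i, r in enumerate(rows) if r[f] > t]
    left = build_tree([rows[i] for i in li], [labels[i] for i in li], rng, max_depth - 1, n_features)
    right = build_tree([rows[i] for i in ri], [labels[i] for i in ri], rng, max_depth - 1, n_features)
    return (f, t, left, right)

def predict_tree(tree, row):
    """Walk from the root to a leaf and return the leaf's label."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

def build_forest(rows, labels, rng, n_trees=25, max_depth=5):
    """Bagging: each tree is trained on a bootstrap sample of the data."""
    n_features = max(1, int(len(rows[0]) ** 0.5))
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx], [labels[i] for i in idx],
                                 rng, max_depth, n_features))
    return forest

def predict_forest(forest, row):
    """Majority vote over the trees in the forest."""
    return Counter(predict_tree(t, row) for t in forest).most_common(1)[0][0]

def accuracy(predict, rows, labels):
    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(labels)

def make_module(rng):
    """Hypothetical software module: three metrics and a noisy 'buggy' label."""
    complexity = rng.randint(1, 30)   # assumed cyclomatic complexity
    churn = rng.randint(0, 20)        # assumed number of recent changes
    loc_tens = rng.randint(1, 50)     # assumed size in tens of lines
    score = 0.08 * complexity + 0.1 * churn + 0.02 * loc_tens + rng.gauss(0, 0.6)
    return [complexity, churn, loc_tens], int(score > 2.75)

rng = random.Random(0)
data = [make_module(rng) for _ in range(200)]
X = [features for features, _ in data]
y = [label for _, label in data]
X_train, y_train, X_test, y_test = X[:150], y[:150], X[150:], y[150:]

tree = build_tree(X_train, y_train, rng)
forest = build_forest(X_train, y_train, rng)
dt_acc = accuracy(lambda r: predict_tree(tree, r), X_test, y_test)
rf_acc = accuracy(lambda r: predict_forest(forest, r), X_test, y_test)
print(f"DT accuracy: {dt_acc:.2f}  RF accuracy: {rf_acc:.2f}")
```

Because the data here are synthetic, the printed accuracies say nothing about the paper's results; the sketch only shows the shape of the comparison, in which a single tree and a bagged ensemble of randomized trees are trained on the same modules and scored on the same held-out set.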



Corresponding author

Correspondence to G. Ramesh.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

Cite this paper

Ramesh, G., Reddy, K.S.S., Ramu, G., Reddy, Y.C.A.P., Somasekar, J. (2023). An Empirical Study on Discovering Software Bugs Using Machine Learning Techniques. In: Buyya, R., Hernandez, S.M., Kovvur, R.M.R., Sarma, T.H. (eds) Computational Intelligence and Data Analytics. Lecture Notes on Data Engineering and Communications Technologies, vol 142. Springer, Singapore. https://doi.org/10.1007/978-981-19-3391-2_14
