Credit Risk Scoring: A Stacking Generalization Approach

  • Conference paper
Information Systems and Technologies (WorldCIST 2023)

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 799)

Abstract

Forecasting the creditworthiness of customers in new and existing loan contracts is central to lenders’ activity. Credit scoring uses analytical methods to transform historical loan application and loan performance data into credit scores that signal creditworthiness, inform and determine credit decisions, set credit limits and loan rates, and assist in fraud detection, delinquency intervention, or loss mitigation. The standard approach to credit scoring takes a “winner-take-all” perspective: for each dataset, a single statistical learning or machine learning classifier believed to be the “best” is selected from a set of candidates using some method or criterion, often neglecting model uncertainty. This paper empirically compares the predictive accuracy of single classifiers with that of the stacking generalization approach in credit risk modelling, using real-world peer-to-peer lending data. The findings show that stacking ensembles consistently outperform most traditional individual credit scoring models in predicting the default probability. Moreover, the findings show that adopting a feature selection process and hyperparameter tuning improves the performance of both the individual credit risk models and the super-learner scoring algorithm, yielding models that are simpler, more comprehensive, and have lower classification error rates. Improving credit scoring models to better identify loan delinquency can substantially reduce loan impairments and losses, leading to an improvement in the financial performance of credit institutions.
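To make the contrast between a single "best" classifier and stacking concrete, the sketch below illustrates the general idea of stacked generalization in plain Python. It is not the authors' implementation: the synthetic two-feature data, the three base learners (nearest centroid, k-nearest neighbours, logistic regression), the logistic meta-learner, and all function names and settings are assumptions chosen for illustration only. The key step is that the meta-learner is trained on out-of-fold predictions of the base learners, so it learns how to weight them without seeing predictions made on their own training rows.

```python
import math
import random


def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def make_data(n, seed=0):
    """Synthetic binary data; a hypothetical stand-in for loan records."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = 1 if rng.random() < 0.3 else 0  # 1 = default (minority class)
        shift = 1.5 if label else 0.0
        X.append([rng.gauss(shift, 1.0), rng.gauss(shift, 1.0)])
        y.append(label)
    return X, y


def fit_centroid(X, y):
    """Base learner 1: score by relative distance to the two class centroids."""
    def mean(rows):
        return [sum(col) / len(rows) for col in zip(*rows)]
    c0 = mean([x for x, t in zip(X, y) if t == 0])
    c1 = mean([x for x, t in zip(X, y) if t == 1])
    return lambda x: sigmoid(math.dist(x, c0) - math.dist(x, c1))


def fit_knn(X, y, k=15):
    """Base learner 2: share of defaults among the k nearest training rows."""
    def predict(x):
        nearest = sorted(zip(X, y), key=lambda p: math.dist(x, p[0]))[:k]
        return sum(t for _, t in nearest) / k
    return predict


def fit_logistic(X, y, lr=0.5, epochs=300):
    """Base learner 3 and meta-learner: logistic regression by gradient descent."""
    w = [0.0] * (len(X[0]) + 1)  # w[0] is the intercept

    def score(x):
        return sigmoid(w[0] + sum(wi * xi for wi, xi in zip(w[1:], x)))

    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, t in zip(X, y):
            err = score(x) - t
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        w[:] = [wi - lr * g / len(X) for wi, g in zip(w, grad)]
    return score


def stack_fit(X, y, base_fitters, meta_fitter, n_folds=5):
    """Train the meta-learner on out-of-fold base predictions."""
    n = len(X)
    oof = [[0.0] * len(base_fitters) for _ in range(n)]
    for fold in range(n_folds):
        train = [i for i in range(n) if i % n_folds != fold]
        held_out = [i for i in range(n) if i % n_folds == fold]
        X_tr, y_tr = [X[i] for i in train], [y[i] for i in train]
        for j, fit in enumerate(base_fitters):
            model = fit(X_tr, y_tr)
            for i in held_out:
                oof[i][j] = model(X[i])
    meta = meta_fitter(oof, y)
    bases = [fit(X, y) for fit in base_fitters]  # refit on all training rows
    return lambda x: meta([b(x) for b in bases])


X, y = make_data(400)
X_train, y_train, X_test, y_test = X[:300], y[:300], X[300:], y[300:]
stacked = stack_fit(X_train, y_train,
                    [fit_centroid, fit_knn, fit_logistic], fit_logistic)
probs = [stacked(x) for x in X_test]  # predicted default probabilities
accuracy = sum((p >= 0.5) == bool(t) for p, t in zip(probs, y_test)) / len(y_test)
print(f"hold-out accuracy of the stacked model: {accuracy:.3f}")
```

In practice one would more likely rely on an established library implementation of stacking, add the feature selection and hyperparameter tuning steps mentioned in the abstract, and evaluate with metrics suited to imbalanced default data rather than plain accuracy.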



Acknowledgements

This work has been supported by Fundação para a Ciência e a Tecnologia, grants UIDB/04152/2020 - Centro de Investigação em Gestão de Informação (MagIC) and UIDB/00315/2020 (BRU-ISCTE-IUL).

Corresponding author

Correspondence to Jorge M. Bravo.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Raimundo, B., Bravo, J.M. (2024). Credit Risk Scoring: A Stacking Generalization Approach. In: Rocha, A., Adeli, H., Dzemyda, G., Moreira, F., Colla, V. (eds) Information Systems and Technologies. WorldCIST 2023. Lecture Notes in Networks and Systems, vol 799. Springer, Cham. https://doi.org/10.1007/978-3-031-45642-8_38
