Hyperparameter Tuning in Random Forest and Neural Network Classification: An Application to Predict Health Expenditure Per Capita

Conference paper in: Data Intelligence and Cognitive Informatics

Part of the book series: Algorithms for Intelligent Systems (AIS)

Abstract

There is a lack of literature on how much hyperparameter tuning improves classification performance when predicting health expenditure per capita (HE). In this study, the effect of hyperparameter tuning on the classification performance of random forest (RF) and neural network (NN) classifiers is compared for grouping World Bank (WB) member countries in terms of HE. Data were gathered from 188 WB member countries for the year 2019. GDP per capita, mortality, life expectancy at birth, and the share of the population aged 65 years and over are used as predictors. The number of trees (RF) and the number of neurons in the hidden layer (NN) are varied from 5 to 100, while the k-fold cross-validation parameter is varied from 2 to 20. The dependent HE variable is transformed into binary categories, and the categories are well balanced (50%–50%). Classification performance is good for both learning techniques (AUC > 0.95), with RF (AUC = 0.9609) slightly superior to NN (AUC = 0.9596) in terms of average AUC over the hyperparameter-tuning grid.
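
The experimental design described above can be sketched as a small grid search. The following scikit-learn snippet is an illustrative reconstruction, not the authors' code: the synthetic dataset (`make_classification` standing in for the 188-country WB data), the specific grid points, and the random seeds are all assumptions; only the overall shape of the experiment (vary model size 5–100 and folds 2–20, compare mean ROC-AUC of RF vs. NN) follows the abstract.

```python
# Illustrative sketch of the tuning grid: vary model size and k-fold
# parameter, compare RF vs. NN by mean cross-validated AUC.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score

# Hypothetical stand-in for the WB data: 188 countries, 4 predictors
# (GDP per capita, mortality, life expectancy, population 65+),
# balanced binary HE target as in the paper.
X, y = make_classification(n_samples=188, n_features=4, n_informative=4,
                           n_redundant=0, weights=[0.5, 0.5], random_state=0)

results = {}
for size in [5, 100]:        # number of trees (RF) / hidden neurons (NN)
    for k in [2, 20]:        # k-fold cross-validation parameter
        rf = RandomForestClassifier(n_estimators=size, random_state=0)
        nn = MLPClassifier(hidden_layer_sizes=(size,), max_iter=2000,
                           random_state=0)
        results[(size, k)] = (
            cross_val_score(rf, X, y, cv=k, scoring="roc_auc").mean(),
            cross_val_score(nn, X, y, cv=k, scoring="roc_auc").mean(),
        )

# Average AUC over the whole grid, as reported in the abstract.
rf_avg = float(np.mean([v[0] for v in results.values()]))
nn_avg = float(np.mean([v[1] for v in results.values()]))
print(f"average AUC  RF={rf_avg:.3f}  NN={nn_avg:.3f}")
```

In the paper the grid is denser (sizes 5 to 100, k from 2 to 20); the two values per axis here keep the sketch fast while preserving the structure of the comparison.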



Author information

Correspondence to Gulcin Caliskan.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Caliskan, G., Cinaroglu, S. (2023). Hyperparameter Tuning in Random Forest and Neural Network Classification: An Application to Predict Health Expenditure Per Capita. In: Jacob, I.J., Kolandapalayam Shanmugam, S., Izonin, I. (eds) Data Intelligence and Cognitive Informatics. Algorithms for Intelligent Systems. Springer, Singapore. https://doi.org/10.1007/978-981-19-6004-8_62
