Abstract
Supervised machine learning methods are described, demonstrated and assessed for predicting employee turnover within an organization. Numerical experiments are performed on real and simulated human resources datasets representing organizations with small, medium and large employee populations, using ten methods: (1) a decision tree; (2) a random forest; (3) gradient boosting trees; (4) extreme gradient boosting; (5) logistic regression; (6) support vector machines; (7) neural networks; (8) linear discriminant analysis; (9) Naïve Bayes; and (10) K-nearest neighbors. Through a comprehensive evaluation process, the performance of each method in predicting employee turnover is analyzed and compared using statistical methods. Additionally, reliable guidelines are provided on the selection, use and interpretation of these methods for human resources datasets of varying size and complexity.
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Zhao, Y., Hryniewicki, M.K., Cheng, F., Fu, B., Zhu, X. (2019). Employee Turnover Prediction with Machine Learning: A Reliable Approach. In: Arai, K., Kapoor, S., Bhatia, R. (eds) Intelligent Systems and Applications. IntelliSys 2018. Advances in Intelligent Systems and Computing, vol 869. Springer, Cham. https://doi.org/10.1007/978-3-030-01057-7_56
DOI: https://doi.org/10.1007/978-3-030-01057-7_56
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-01056-0
Online ISBN: 978-3-030-01057-7
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)