Abstract
In a developing country like India, the growth of citizens, and consequently the advancement of the nation, depends on the education provided to them. However, the delivery of education has been hindered by considerable dropout rates, which have multiple social and economic consequences. Hence, it is crucial to find ways to overcome this problem. The advent of machine learning and the availability of immense amounts of data have enabled the development of data science and, consequently, its application in educational institutions. Educational data mining enables educators to monitor student requirements and provide the necessary response and counselling. In this paper, we use advanced machine learning algorithms such as logistic regression, decision trees, K-nearest neighbours and random forest to predict whether a student will drop out or continue his/her education. The accuracy of each model is calculated and compared. The results show that ML techniques are useful in this domain, with random forest being the most accurate classifier for predicting dropout. Educational institutions can use this research as a base to identify which students may need more attention and modify their teaching methods accordingly, with the end goal of a 0% dropout rate.
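As a rough illustration of the workflow the abstract describes (train a classifier, predict dropout, measure accuracy), the following dependency-free sketch applies K-nearest neighbours to synthetic data. Everything in it is an assumption made for illustration: the two features (`attendance`, `grade`), the rule that generates the labels, the 225/75 train/test split and the choice of `k=5` are hypothetical and are not taken from the chapter's dataset or method.

```python
import random
from collections import Counter
from math import dist


def make_synthetic_students(n, seed=0):
    """Generate hypothetical (features, label) pairs; label 1 = dropped out."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        attendance = rng.uniform(0.3, 1.0)  # fraction of classes attended
        grade = rng.uniform(0.2, 1.0)       # normalised average grade
        # Assumed rule: low attendance and low grades raise dropout risk.
        risk = 1.0 - 0.5 * attendance - 0.5 * grade + rng.gauss(0, 0.05)
        data.append(((attendance, grade), 1 if risk > 0.4 else 0))
    return data


def knn_predict(train, point, k=5):
    """Majority vote among the k training records closest to `point`."""
    neighbours = sorted(train, key=lambda rec: dist(rec[0], point))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


def accuracy(train, test, k=5):
    """Fraction of test records whose predicted label matches the true one."""
    correct = sum(knn_predict(train, x, k) == y for x, y in test)
    return correct / len(test)


students = make_synthetic_students(300)
train, test = students[:225], students[225:]  # simple hold-out split
acc = accuracy(train, test, k=5)
print(f"KNN accuracy on held-out synthetic students: {acc:.2f}")
```

The same hold-out accuracy measure could be computed for each of the other classifiers named in the abstract, which is how a comparison between models would typically be made.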
Copyright information
© 2021 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Makhloga, V.S., Raheja, K., Jain, R., Bhattacharya, O. (2021). Machine Learning Algorithms to Predict Potential Dropout in High School. In: Khanna, A., Gupta, D., Pólkowski, Z., Bhattacharyya, S., Castillo, O. (eds) Data Analytics and Management. Lecture Notes on Data Engineering and Communications Technologies, vol 54. Springer, Singapore. https://doi.org/10.1007/978-981-15-8335-3_17
DOI: https://doi.org/10.1007/978-981-15-8335-3_17
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-8334-6
Online ISBN: 978-981-15-8335-3
eBook Packages: Intelligent Technologies and Robotics (R0)