Introduction

There is ever-increasing interest worldwide in the subject of education dropout, with high dropout risk being one of the biggest challenges [1]. Dropout seriously impacts education systems, resulting in lower enrolment and failure to meet academic goals [2]. As a result, schools, colleges, universities, and governments face economic and social consequences. The problem becomes more severe when administrators lack the resources to detect students at risk of dropping out [3]. Consequently, remedial procedures are rarely adopted in time to retain students in schools and colleges [4]. Predicting student dropout and detecting the elements that lead to this significant phenomenon is therefore becoming a priority [5]. Moreover, most of the predictive modeling techniques in use are difficult to explain, which might be one of the reasons why the dropout problem persists [1].

Machine Learning (ML) techniques are among the most widely researched solutions for dropout prediction. In developed nations, extensive research has been done on creating student dropout prediction algorithms [6,7,8], and there is substantial work on ML-based techniques to prevent dropouts [9, 10]. Knowledge from this literature can shift dropout prevention from reactive to proactive. This is more practical now than ever because Information and Communication Technologies (ICTs) have transformed how information is gathered and handled, a vital element of data-driven analysis of observed events. However, confusion still prevails concerning the viability of current analytical procedures and models, and despite past research endeavors, several difficulties remain unaddressed.

The necessity to assess the efficiency of systematic reviews examining the prediction of education dropout using statistical techniques inspired this investigation. In addition, this review looks beyond the results to highlight persistent problems such as data imbalance, interpretability, and geographic inequalities. These unresolved problems are noted as promising areas for future study, highlighting the domain's dynamism and the opportunities for advancement and innovation in dealing with student dropout on a worldwide scale. A sequential procedure is used to identify, select, and evaluate the synthesized investigation results in line with the research objectives [11, 12]. This systematic review and meta-analysis aims to survey the works conducted on statistical techniques for education dropout prediction between 2000 and 2023. The objectives of the study are:

  1. To better understand the statistical methods and strategies used to predict student dropout.

  2. To evaluate the efficiency and standard performance metrics of current statistical methods in the reviewed studies.

  3. To identify the research challenges and limitations confronting current statistical techniques for predicting student dropout.

Methods

Survey methodology

This systematic review and meta-analysis was performed to examine what types of statistical techniques are used to predict early education dropout. We framed the research question using the Population, Intervention, Comparison, Outcome (PICO) model. The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines for searching and assessing published articles were followed [13].

Search strategy

The electronic literature databases IEEE Xplore, ScienceDirect, Web of Science, Association for Computing Machinery (ACM), and Scopus were searched to collect relevant articles published between 2000 and 2023. Figure 1 summarizes the general procedure followed in our systematic review. The databases were searched using identified keywords, which were trialled multiple times and modified for each database to obtain relevant studies; they are given in Appendix A. Finally, full-text papers were manually searched and selected for this study.

Fig. 1. Steps of our review methodology

Inclusion and exclusion criteria

Research articles obtained in the search were assessed using inclusion criteria fixed by the researchers. Articles published in peer-reviewed journals or scientific forums were mainly considered. Studies that predict early education dropout using statistical techniques were considered for review, as were studies that use student databases for the prediction. Studies that do not focus on predicting students' education dropout were excluded, as were studies missing the data required for a thorough investigation of dropout prediction. As part of this systematic review, we followed PRISMA's basic principles to ensure clarity and quality. Table 1 provides the inclusion criteria used when selecting the articles.

Table 1 Inclusion criteria in our systematic literature review

Two authors independently reviewed the potential research studies by reading the titles and abstracts of the articles retrieved through the search query and manual searches. In the second phase, the identified research articles were fully screened to remove duplicates and irrelevant studies. Two authors examined the selected research papers independently to decide whether to include each paper in the review; a third author resolved discrepancies between the two through joint discussion. In the third phase, the quality of the included articles was evaluated using the Joanna Briggs Institute (JBI) Critical Appraisal Checklist for Analytical Cross-Sectional Studies [14], a methodological quality assessment tool for various types of research. Based on this assessment, methodologically sound articles were included for qualitative synthesis; few studies were excluded for poor methodology. A summary of the risk of bias is provided in Appendix B. As shown in Fig. 2, articles were searched and screened following the PRISMA flowchart.

Fig. 2. PRISMA flowchart of our systematic review

Data extraction

An effective data extraction form was created in Microsoft Excel based on the survey objectives and the inclusion criteria [15]. Research articles selected for qualitative synthesis were analyzed to extract the data supporting the review's primary focus. Information such as type of dropout, country, sample size, data source and software used, methodology, prevalence, year of publication, and title of the study was extracted.

The initial records identified through database searches comprised 749 articles from IEEE Xplore, 51 from ScienceDirect, 453 from the Association for Computing Machinery (ACM), 185 from Scopus, and 858 from Web of Science. After screening the title and abstract of each research article, 29 articles from IEEE Xplore, 6 from ScienceDirect, 10 from ACM, 23 from Scopus, and 25 from Web of Science were selected for further investigation. A manual search added 20 articles. After removing duplicates and screening full texts, 13 articles from IEEE Xplore, 5 from ScienceDirect, 3 from ACM, 2 from Scopus, 6 from Web of Science, and 7 from the manual search remained. Appendix C summarizes the search and screening outcomes. Ultimately, 36 publications were included in the qualitative synthesis and 22 in the quantitative synthesis. Table 2 provides the characteristics of the included studies; other features are given in Appendix D.

Table 2 Features of the included studies

Statistical analysis

To determine the proportion of the minority class (dropouts), a meta-analysis using the raw proportion (PRAW) summary measure under a random-effects model was conducted. Multiple factors led to the selection of this method. First, it is well suited for meta-analytic examination of binary outcomes, such as dropout vs. non-dropout. Second, it is robust and suitable for our analysis because it considers both within-study and between-study variability, which matters where studies from various sources are included [51]. Furthermore, a random-effects model enables a more accurate representation of the underlying diversity among studies. Heterogeneity was assessed with the I² statistic, where values between 75 and 100% indicate considerable variability among studies. To determine the reliability of our conclusions, we performed a sensitivity analysis following established meta-analysis practice using the leave-one-out approach. We further stratified our analysis by type of dropout (university vs. school). Additionally, publication bias was examined using trim-and-fill plots, Egger's test, and the rank correlation test. The meta and metafor packages in R were used for all analyses.
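To make the pooling procedure concrete, the sketch below reimplements the core of a random-effects proportion meta-analysis (the DerSimonian-Laird estimator) in Python with hypothetical study counts; the actual analyses in this review were run with the meta and metafor packages in R.

```python
# Minimal sketch: DerSimonian-Laird random-effects pooling of raw
# proportions (the PRAW summary measure). Study counts are hypothetical.
import numpy as np

events = np.array([120, 45, 300, 80, 15])   # dropouts per study
n = np.array([600, 150, 2000, 320, 90])     # sample size per study

p = events / n                 # raw proportion per study
v = p * (1 - p) / n            # within-study variance

w = 1 / v                      # fixed-effect (inverse-variance) weights
p_fixed = np.sum(w * p) / np.sum(w)

# Cochran's Q and DerSimonian-Laird between-study variance tau^2
Q = np.sum(w * (p - p_fixed) ** 2)
df = len(p) - 1
c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
tau2 = max(0.0, (Q - df) / c)

# random-effects weights, pooled proportion, 95% CI, and I^2
w_re = 1 / (v + tau2)
p_pooled = np.sum(w_re * p) / np.sum(w_re)
se = np.sqrt(1 / np.sum(w_re))
ci = (p_pooled - 1.96 * se, p_pooled + 1.96 * se)
i2 = max(0.0, (Q - df) / Q) * 100

print(f"pooled proportion = {p_pooled:.4f}, 95% CI = ({ci[0]:.4f}, {ci[1]:.4f})")
print(f"tau^2 = {tau2:.4f}, I^2 = {i2:.1f}%")
```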

Results

This section contains basic information on the reviewed studies, the data pre-processing techniques used, the statistical methods used to predict student dropout, and the model assessment metrics used to evaluate the performance of the model.

Trends in dropout prediction research

The number of studies focusing on predicting student education dropout is steadily increasing. Since 2017, interest in dropout prediction models has grown, reflecting the universal trend toward identifying at-risk students early to improve their educational outcomes, as shown in Fig. 3. Our database search covered publications only up to the end of March 2023, which might explain the apparent minor drop in the count of published studies in 2023.

Fig. 3. Number of articles distributed by year of publication

Geographic distribution of research

The included research was carried out in nations such as Bangladesh, Brazil, Chile, Denmark, Egypt, Germany, Ghana, Guatemala, Hungary, India, Italy, Korea, the Netherlands, Peru, the Philippines, Portugal, Slovakia, South Africa, South Korea, Spain, Thailand, the United States of America, and Uruguay. We grouped the studies by continent, as shown in Fig. 4: ten studies were from Europe, followed by nine from South America, seven from Asia, and five each from Africa and North America. Of the 36 studies, twenty-six (72%) used university databases to predict student dropout, while ten (28%) used school dropout data, as shown in Fig. 5.

Fig. 4. Distribution of included studies by continent

Fig. 5. Distribution of included studies by type of dropout

Dataset characteristics

The datasets used in the research articles are classified into five categories based on the number of records included, as shown in Fig. 6. Seven studies used an experimental dataset of fewer than 1000 students, twelve used datasets of 1001 to 10,000 students, and eleven used 10,001–50,000 students. The remaining six studies used datasets of more than 50,000 students for early prediction. When we analyzed the prediction accuracy of each model, we found varied results depending on the sample size used. Articles with smaller datasets as well as larger ones provided mixed findings. To illustrate, an included study with a sample size of 2401 students gave a weak prediction, with a model accuracy of 70.47% [33]. In contrast, a study conducted in Slovakia with 261 students achieved a higher accuracy of 91.66% [46].

Fig. 6. Distribution of included studies by sample size

In this context, it is essential to point out that drawing a significant conclusion from classifying training datasets into two pools, small or adequate sample size, is challenging. This is because the mixed findings are influenced by factors such as data imbalance, error tolerance, and the kind of prediction technique used. Also, comparing the performance of each prediction model across varying datasets will not provide conclusive results. Intuitively, the larger the sample, the more accurate the prediction should be; however, this was not evident from our qualitative synthesis.

Software utilization in dropout prediction

As for the software used to examine the datasets, we identified eight different tools across 22 studies, as shown in Fig. 7. The prominence of the widely utilized software WEKA, R, and Python is doubtless due to their wide range of built-in learning algorithms for mining, flexibility in modeling, and functionality. Fifteen studies did not indicate the software they used to develop their dropout models.

Fig. 7. Distribution of included studies by software used for analysis

Understanding prediction methods

Data pre-processing

The essential goal of data pre-processing is to understand the data and its variables. Before applying statistical techniques, it is vital to perform pre-processing procedures such as cleaning, integration, encoding, imputation, dimensionality reduction, standardization, and variable transformation. The quality and dependability of the available data directly influence the outcome; hence, pre-processing should be highlighted as a significant task. In practice, specific pre-processing procedures were used to prepare the previously described data so that classification tasks could be completed accurately. First, all available data were combined into a single dataset; students without fully complete records were removed during this step.
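As an illustration of what such a pre-processing stage might look like in code, the following is a minimal scikit-learn sketch combining imputation, encoding, and standardization; the column names are hypothetical and not drawn from any reviewed study.

```python
# Sketch of a typical pre-processing pipeline (imputation, encoding,
# standardization) using scikit-learn. Column names are hypothetical.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

numeric_cols = ["gpa", "attendance_rate", "age"]      # hypothetical
categorical_cols = ["gender", "funding_type"]         # hypothetical

numeric_pipe = Pipeline([
    ("impute", SimpleImputer(strategy="median")),     # fill missing values
    ("scale", StandardScaler()),                      # standardize features
])
categorical_pipe = Pipeline([
    ("impute", SimpleImputer(strategy="most_frequent")),
    ("encode", OneHotEncoder(handle_unknown="ignore")),
])

preprocess = ColumnTransformer([
    ("num", numeric_pipe, numeric_cols),
    ("cat", categorical_pipe, categorical_cols),
])

# Usage on a hypothetical student table:
# X = pd.read_csv("students.csv")[numeric_cols + categorical_cols]
# X_ready = preprocess.fit_transform(X)
```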

As one might expect, the dataset is highly imbalanced, since students who leave their studies are a minority; the ratio between negative (non-dropout) and positive (dropout) samples is around 3:1. Although this is good news for educational management, training an ML model for binary classification on a highly imbalanced dataset may result in poor final performance, fundamentally because the classifier would underrate the class with fewer samples [52]. To resolve this issue, sampling or balancing/rebalancing algorithms may be applied to the data during pre-processing.

Only a few studies in this qualitative synthesis used the Synthetic Minority Over-sampling Technique (SMOTE), random under-sampling, or random over-sampling. Appendix E lists the 26 different data pre-processing techniques found in this survey.
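For instance, a minimal application of SMOTE with the imbalanced-learn package might look as follows; synthetic data stands in for a real student dataset, and in practice resampling should be applied only to the training split.

```python
# Sketch: rebalancing a dropout dataset with SMOTE (imbalanced-learn).
from collections import Counter
from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification

# ~3:1 non-dropout (0) to dropout (1), as described above
X, y = make_classification(n_samples=4000, weights=[0.75, 0.25],
                           random_state=42)
print("before:", Counter(y))

# SMOTE synthesizes new minority-class samples; apply to training data only
X_res, y_res = SMOTE(random_state=42).fit_resample(X, y)
print("after: ", Counter(y_res))
```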

Data analysis

After pre-processing, extraction and analysis were used to transform the acquired data into information and achieve the desired outputs. In education, statistical or machine-learning approaches can be used to estimate student dropout.

Supervised or structured learning depends on training with a collection of labeled data so that the model can classify unlabelled data in the test set with the highest possible accuracy [53]. This learning paradigm is effective and generally finds answers to linear and nonlinear problems such as classification, prediction, forecasting, robotics, and so on.

Previous research concentrated on supervised or structured learning techniques for identifying academic dropout students; commonly used models include the Bayesian classifier, association rule learning, Logistic Regression, Random Forest, ensemble learning, and neural network models [54]. Among these approaches, researchers prefer neural networks and decision trees for forecasting students' success [55]. Gray et al. (2014) describe a neural network as having the unique ability to recognize all possible correlations between indicator factors and to reliably detect independent and dependent factors [56, 57]. In contrast, decision trees have been employed to discover smaller or larger data structures and forecast their value, as they are simple and straightforward [58, 59].

A few predictive models were developed to resolve the dropout issue using methodologies such as time-to-event analysis, the Generalized Linear Model, Linear Discriminant Analysis, and Bayesian Networks [10, 31, 47]. Different regression methodologies, such as Probit Regression, Multi-Task Logistic Regression, and Neural Multi-Task Logistic Regression, were also introduced for early student dropout identification. None of our chosen investigations utilized unsupervised learning.

Time-to-event analysis is used to investigate data on the time elapsed until an incident occurs [60]. It provides various components for dealing with the censored-data issues that arise in modeling longitudinal data, which occur widely in other application domains [48].

The use of time-to-event analysis to examine student dropout was pioneered in the area of education and management. To investigate the impact of various school classifications on school effects, Lamote et al. (2013) employed a multilevel discrete-time hazard model [16]. Ameri et al. (2016) used a semi-parametric method to construct a time-to-event analysis framework to identify at-risk pupils [10]. This methodology collects time-varying characteristics and exploits that knowledge to better estimate student dropout, using Wayne State University student datasets from 2002 to 2009. Indeed, in time-to-event analysis, individuals are generally monitored for a specific amount of time, focusing on the moment the event of interest happens [61]. Analyzing academic, socioeconomic, and equity factors, Gutierrez-Pachas et al. (2023) employed parametric, semi-parametric, and advanced survival approaches to predict higher-education dropout [48]. The advantage of time-to-event analysis over other techniques is its potential to incorporate a temporal factor into the framework and to manage censored information successfully. Despite the demonstrated effectiveness of time-to-event approaches in other fields, such as health, technology, finance, and human resource management, few attempts have been made to apply such techniques to the problem of student dropout [62].
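As a hedged illustration of how such a model can be fitted in practice, the sketch below uses the Python lifelines package to fit a Cox proportional hazards model on a tiny hypothetical student table; the column names and data are illustrative, and none of the reviewed studies necessarily used this package.

```python
# Sketch: modeling time-to-dropout with a Cox proportional hazards model
# (lifelines). Students still enrolled at the end of observation are
# right-censored (dropped_out = 0). All data are hypothetical.
import pandas as pd
from lifelines import CoxPHFitter

df = pd.DataFrame({
    "semesters":   [2, 6, 8, 3, 8, 5, 1, 8],   # follow-up time
    "dropped_out": [1, 1, 0, 1, 0, 1, 1, 0],   # 1 = event, 0 = censored
    "gpa":         [2.1, 2.8, 3.5, 1.9, 2.2, 2.4, 3.0, 3.8],
    "first_gen":   [1, 0, 0, 1, 1, 0, 1, 0],   # first-generation student
})

cph = CoxPHFitter()
cph.fit(df, duration_col="semesters", event_col="dropped_out")
cph.print_summary()  # hazard ratios for gpa and first_gen
```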

Linear Discriminant Analysis acts as a dimensionality reduction algorithm, attempting to lessen data complexity by projecting the original feature space onto a lower-dimensional one while retaining significant data variation; it also requires no parameter settings. Del Bonifro et al. (2020) developed an ML technique to anticipate the dropout of first-year undergraduate students. The proposed technique permits estimating the risk of a student leaving an academic course and can be used either during the application stage or during the first year [31].
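A minimal sketch of Linear Discriminant Analysis in this dual role, as a classifier and as a supervised reducer, using scikit-learn on synthetic data rather than the dataset of [31], might look like this:

```python
# Sketch: LDA as classifier and supervised dimensionality reducer.
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20,
                           weights=[0.75, 0.25], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

lda = LinearDiscriminantAnalysis()   # no hyperparameter tuning needed
lda.fit(X_tr, y_tr)
print("held-out accuracy:", lda.score(X_te, y_te))

# As a reducer: for a binary problem LDA projects onto at most 1 axis
X_1d = lda.transform(X_tr)
print("reduced shape:", X_1d.shape)  # (n_samples, 1)
```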

The classic methodology for estimating a statistical relationship between a dependent variable and one or more independent variables is regression. Berens et al. developed an early detection framework using probit regression to predict student success in tertiary education and provide targeted intervention [23]. Appendix F categorizes the 36 research articles according to the prediction methods utilized for early student dropout. Table 3 shows the prediction strategies most often involved in the qualitative synthesis: Random Forest was used in 23 studies, followed by Decision Tree in 16, Logistic Regression in 14, Support Vector Machine in 12, Artificial Neural Network in 11, and the Naïve Bayes classifier and K-Nearest Neighbour in 10 studies each.

Table 3 Frequency of most involved prediction techniques
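To illustrate how these frequently used classifiers are typically compared, the following sketch cross-validates a few of them on synthetic stand-in data; it mirrors the general evaluation pattern of the reviewed studies, not the setup of any specific one.

```python
# Sketch: cross-validated comparison of commonly used dropout predictors
# (see Table 3). Synthetic data stands in for a real student dataset.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=3000, weights=[0.75, 0.25],
                           random_state=1)

models = {
    "Random Forest": RandomForestClassifier(random_state=1),
    "Decision Tree": DecisionTreeClassifier(random_state=1),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "K-Nearest Neighbour": KNeighborsClassifier(),
}
for name, model in models.items():
    auc = cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean()
    print(f"{name:>20}: mean AUC = {auc:.3f}")
```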

Performance evaluation

Evaluation metrics for dropout prediction

Given the inherently probabilistic nature of predictive models, evaluating their outcomes is crucial. A model's performance assessment is greatly influenced by evaluation measures, which also help determine what changes should be made to improve prediction accuracy. Various criteria have been suggested in the literature for building more accurate prediction models [63, 64], and the kind of problem being handled determines the evaluation metrics to be used. We analyzed the model measurements used in the investigations and found that 21 studies evaluated their prediction quality through accuracy, followed by the area under the curve (AUC) in 19 studies. Precision, recall, kappa, and F1 measures were additional performance measures identified for classification problems [50].

Similarly, Mean Squared Error (MSE) and Mean Absolute Error (MAE) were identified in a few studies addressing regression problems [10, 27, 48]. The reviewed studies used several performance indicators to assess the validity of the dropout prediction models: a single measure was utilized in eight (22%) investigations, seven investigations (19%) employed two performance criteria, ten (28%) used three, and more than four measures were used in 37% of the investigations, as given in Appendix D.
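The classification metrics discussed above can all be computed with scikit-learn; the sketch below shows one way to obtain them for a model trained on synthetic stand-in data.

```python
# Sketch: computing the evaluation metrics most often reported in the
# reviewed studies (accuracy, AUC, precision, recall, F1, kappa).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (accuracy_score, cohen_kappa_score, f1_score,
                             precision_score, recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, weights=[0.8, 0.2], random_state=7)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=7)

clf = RandomForestClassifier(random_state=7).fit(X_tr, y_tr)
y_pred = clf.predict(X_te)
y_prob = clf.predict_proba(X_te)[:, 1]   # class-1 scores for AUC

print("accuracy :", accuracy_score(y_te, y_pred))
print("AUC      :", roc_auc_score(y_te, y_prob))
print("precision:", precision_score(y_te, y_pred))
print("recall   :", recall_score(y_te, y_pred))
print("F1       :", f1_score(y_te, y_pred))
print("kappa    :", cohen_kappa_score(y_te, y_pred))
```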

Challenges in predicting student dropout

Data imbalance

Of the 36 papers included in the qualitative synthesis, we meta-analyzed the 22 studies that reported the proportion of dropouts in their dataset. The estimated pooled proportion of overall dropouts was 0.2061 (95% confidence interval (CI): 0.1845–0.2278). Estimates varied greatly, from 0.0023 to 0.6018, which might be partially attributed to variations in dropout types. This means that, on average, only about 20% of the samples used for dropout prediction were dropouts, indicating the data imbalance problem in student dropout prediction. The estimated tau-squared value was 0.0016, with a standard error (SE) of 0.0015, suggesting some heterogeneity among the effect sizes of the included studies. The I² statistic also showed significant heterogeneity, with a value of 99.96% (P < 0.05). We carried out further stratified meta-analyses to better comprehend this heterogeneity. Focusing on the specific type of dropout, we calculated a pooled proportion of dropouts in universities of 0.2393 (95% CI: 0.2086–0.2700), significantly higher than for school dropouts at 0.1734 (95% CI: 0.1429–0.2039). Estimates for these groups ranged widely, from 0.0023 to 0.6018 for university dropouts and from 0.0100 to 0.5522 for school dropouts, as shown in Fig. 8. Sensitivity analysis was performed to strengthen the reliability of the findings using the Baujat plot, influence analysis, and leave-one-out analysis. The trim-and-fill method and the rank correlation test for funnel plot asymmetry showed no publication bias, whereas Egger's test indicated publication bias; details are given in Appendix G.

Fig. 8. Forest plot showing the pooled proportion of dropout

Discussion

Student dropout prediction has become essential for higher-education leaders [65]. Statistical learning has acquired massive momentum in the past ten years for improving early student dropout identification, and statistical learning and advanced machine learning are reported to improve dropout prediction [66]. Significant concerns exist around automating the appraisal of individual dropouts, which might serve as a bridge to student success in school [67, 68]. Yet it remains unclear how statistical learning and machine learning can be used to illustrate and predict early student dropout. The present systematic review was conducted in an effort to bridge this research gap.

To address the first and second objectives, we thoroughly examined the statistical and machine learning methods employed in the included studies. The decision to go back twenty years was influenced by revolutionary advancements in statistics and machine learning that produced high-quality results in education dropout prediction. The study closest to our systematic review is that of Chen J (2022), which looked into investigations estimating massive open online course (MOOC) dropouts from 2012 to 2022. Although that survey attempted to summarize the principal methods for predicting early dropout, it did not provide a detailed analysis of the predicted models' outcomes [69].

Based on our review, the development and use of predictive analysis methods for student dropout has been at an all-time high since 2017. Moreover, developed countries are taking giant strides in researching the early identification of at-risk students, and the quantity of articles published in the field of education dropout prediction is increasing year by year [70, 71]. Still, researchers' attempts to develop modeling techniques that predict early student dropout remain unsatisfying [68]. Our review observed that in many studies the dropout prediction models were created as independent modules rather than as assessment software. Though ensemble methods are well established to improve predictive performance [64], almost 90% of the research studies developed a model that relied on a single statistical or machine learning technique. Furthermore, despite their importance, few models were extended to explain and validate the estimate of student dropout [72]. Only a few studies used advanced survival techniques and Bayesian networks for early dropout prediction. We used a taxonomy to organize the prediction models from our qualitative synthesis [70]. Random Forest, Decision Tree, Logistic Regression, Support Vector Machine, Naïve Bayes classifier, K-Nearest Neighbour, Artificial Neural Network, and Gradient Boosting have been the most utilized methods for estimating student dropout.

A similarity across all of the included studies is the wide use of diverse statistical approaches. Researchers use a wide range of techniques, including traditional linear models, decision-based models, and advanced ensemble methods [22, 26, 27]. This variety emphasizes that statistical methods have distinct advantages in handling the multidimensional nature of dropout prediction. Regarding evaluation metrics, accuracy and AUC were the most commonly used for assessing the effectiveness of the predictive models [49, 50]. Other performance metrics, such as specificity and sensitivity, kappa, F1 score, the confusion matrix, mean absolute error, mean squared error, C-index, and precision-recall, were also used [36, 48]. Numerous studies recognized the difficulty of imbalanced datasets in prediction, where dropout instances form a minority class. The necessity of methods to resolve data imbalance was highlighted in most of the studies; still, only a few solved the issue using balancing and rebalancing algorithms [47, 49, 50].

The regional focus of the investigations showed a clear difference: while considerable dropout prediction research has been done in developed nations, there were few studies from emerging or resource-constrained countries [44, 46, 48]. This discrepancy underlines the need for specialized solutions in various global situations. Models also did not consistently consider temporal elements, such as variations in dropout risk over time; some research focused on temporal dynamics, while other work mainly used static predictors [10, 21]. Studies also revealed differences in the distinct types and dimensions of risk factors adopted by dropout models: some examined various sociodemographic, academic, and behavioral variables, while others concentrated on fewer predictors [41].

To address the third objective, a meta-analysis was performed to determine the proportion of dropouts in the included studies. This investigation exposed a fundamental problem in student dropout prediction research: data imbalance. Many scientific works fail to consider how underrepresented dropout cases are in the currently accessible datasets. This becomes a significant issue, particularly concerning student academic performance, as dropout pupils often differ from those who stay [73]. Subsequently, future research should consider developing student dropout models that account for this data imbalance problem. Data balancing techniques have successfully addressed the issue of data imbalance in predicting student dropout using machine learning [74]; these methods have enhanced accuracy and decreased data bias in the models. More study is required in multiple geographic areas to guarantee the adaptability of these strategies across various educational contexts. Methods for balancing data include over-sampling, under-sampling, and combinations of the two: under-sampling reduces occurrences of the majority class, while over-sampling adds instances of the minority class. Data-level strategies employed include SMOTE and the Adaptive Synthetic Sampling approach (ADASYN); algorithm-level techniques, such as Balanced Random Forest and SMOTE-Bagging, have also been used [75]. A number of the included studies illustrate the use of data-balancing approaches in predicting student dropout: one study that used a decision tree method with under-sampling to balance the dataset achieved a precision of 98.9% [29], and an AUC of 78% was obtained in a different study that combined over-sampling and under-sampling methods with a logistic classifier [49].
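A sketch of such a combined strategy, assuming imbalanced-learn's pipeline with illustrative sampling ratios (not the exact setup of [49]), might look like this:

```python
# Sketch: SMOTE over-sampling of the minority (dropout) class plus random
# under-sampling of the majority class, feeding a logistic classifier.
from imblearn.over_sampling import SMOTE
from imblearn.pipeline import Pipeline
from imblearn.under_sampling import RandomUnderSampler
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=5000, weights=[0.9, 0.1], random_state=3)

pipe = Pipeline([
    # over-sample minority to 50% of the majority, then under-sample the
    # majority until the minority reaches 80% of it (illustrative ratios)
    ("over", SMOTE(sampling_strategy=0.5, random_state=3)),
    ("under", RandomUnderSampler(sampling_strategy=0.8, random_state=3)),
    ("clf", LogisticRegression(max_iter=1000)),
])
# with an imblearn pipeline, resampling happens only inside each training
# fold, which avoids information leakage into the validation folds
print("mean AUC:", cross_val_score(pipe, X, y, cv=5, scoring="roc_auc").mean())
```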

Additionally, several other challenges were identified. Primarily, most prediction techniques are applied and assessed in advanced nations using data sources gathered from developed nations. The barriers to acquiring available datasets from emerging countries necessitate the development of new datasets [76], including converting individuals' enrolment data from paper-based records to digital form. Moreover, to the best of our knowledge, only a few studies have been directed at developing nations. We suggest further examination to investigate the worth of statistical learning in preventing dropouts in emerging countries.

The inability to comprehend results is one of the fundamental flaws of ML models, particularly advanced ones, as it is challenging to ascertain exactly how the results were derived; in some circumstances it is hard to comprehend why a specific prediction was made [49]. As a result, educational managers might not believe that such models can be relied upon to support their decisions, mainly when a student's future is at stake. To promote algorithmic transparency and reliability, explainable artificial intelligence is viewed as a viable technique that could make outputs more intelligible as well as acceptable [77, 78].
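Dedicated XAI tools such as SHAP or LIME provide rich, per-student explanations; as a lighter-weight, model-agnostic starting point, the sketch below uses scikit-learn's permutation importance on a dropout-style model, with hypothetical feature names and synthetic data.

```python
# Sketch: model-agnostic interpretability via permutation importance.
# Feature names are hypothetical; data are synthetic.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

feature_names = ["gpa", "attendance", "age", "credits_failed", "commute_km"]
X, y = make_classification(n_samples=2000, n_features=5, n_informative=3,
                           weights=[0.75, 0.25], random_state=5)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=5)

model = RandomForestClassifier(random_state=5).fit(X_tr, y_tr)
result = permutation_importance(model, X_te, y_te, n_repeats=20,
                                random_state=5, scoring="roc_auc")
# importance = drop in AUC when a feature's values are shuffled
for name, imp in sorted(zip(feature_names, result.importances_mean),
                        key=lambda t: -t[1]):
    print(f"{name:>15}: {imp:.4f}")
```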

Next, most studies have focused only on early identification in general terms. To add detail, research in emerging nations must emphasize building a more robust and exhaustive detection system that can recognize at-risk individuals in forthcoming cohorts, rank individuals by their likelihood of dropping out of school, and flag individuals who are at risk well before they drop out [79].

Moreover, several researchers utilized academic datasets to address the issue of student dropout. Considering the resource constraints that developing nations experience, they can use alternative sources of school-level information that capture school-related attributes and apply appropriate prediction techniques to strengthen the suggested computational analysis [80]. To advance the area of student dropout prediction, research initiatives should be expanded beyond developed countries to address the difficulties encountered by developing countries. A sophisticated strategy for preventing dropout is required because these regions frequently have diverse cultural, social, and educational backgrounds [49]. Statistical methods can be modified to meet emerging countries' distinctive needs and difficulties, resulting in more efficient and locally appropriate dropout prevention measures. Statistical models must also include temporal changes: a student's academic trajectory is unpredictable, and dropout risks might change over time. Models that consider these temporal factors can deliver more precise and realistic predictions [48].

Limitations

As with all research efforts, a few limitations should be acknowledged. As with any survey of this breadth, we may have missed some studies predicting dropout due to our chosen search queries or screening procedures. Moreover, we focused our inquiry on predictive models of student dropout over the last twenty years. A few investigations reported only some of the exploratory and estimation characteristics of interest, such as the dataset's quality, prediction model type, and variables affecting dropout, which in turn affects the quality of our qualitative synthesis. Unfortunately, many investigations did not document a precise approach, making the evaluation more difficult. Finally, our review was guided by its primary objectives, which may have shaped the review process and the conclusions drawn.

Conclusion

Through a comprehensive and data-driven approach, our study delved into the crucial area of student dropout prediction, an important precursor to successful student transition. This evidence-based approach has the potential to significantly improve student outcomes and pave the way for a more successful and fulfilling educational journey. We followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses and systematic literature review procedures to design the survey, and we introduced an overview of statistical methods for dealing with the problem of student dropout. The summary supports a few conclusions. First, while several strategies have been presented for dealing with student dropout in advanced nations, there is little literature on utilizing such techniques for dropout prediction in emerging countries. Second, researchers must focus more on handling data imbalance, another important problem; despite extensive efforts in employing various prediction techniques, inadequate assessment metrics are often used to evaluate model performance. Third, many experts focused on early prediction rather than ranking students or estimating the components for resolving the dropout issue. Finally, school-level datasets should be reviewed to develop alternative strategies that would assist administrators in predicting at-risk students for early intervention.