Machine Learning and Deep Learning-Based Students’ Grade Prediction

Korchi, Adil; Messaoudi, Fayçal; Abatal, Ahmed; Manzali, Youness

doi:10.1007/s43069-023-00267-8

Published: 31 October 2023

Volume 4, article number 87, (2023)

Operations Research Forum


Abstract

Predicting student performance in a curriculum or program offers the prospect of improving academic outcomes. With effective performance prediction methods, instructional leaders can allocate resources and instruction more accurately. This paper aims to identify machine learning algorithms suited to predicting student grades as a basis for early intervention. Predictive models spot at-risk students early, allowing educators to provide timely support and customize teaching methods; they also help institutions assess program success and refine or expand programs through data-driven decisions. The student grade prediction problem consists of developing models that forecast students’ future academic performance or grades from various input features and historical data. To address it, we used a student dataset comprising personal information and grades and applied several regression algorithms: decision tree, random forest, linear regression, k-nearest neighbor, XGBoost, and deep neural network. We chose these algorithms for their suitability and distinct strengths. We assessed their performance using the determination coefficient, mean absolute error, mean squared error, and root mean squared error. The deep neural network outperformed the others with a determination coefficient of 99.97%, supporting its reliability in predicting student grades and enabling teaching staff to provide timely assistance to students who need it.


1 Introduction

Predicting students’ marks is a common problem in educational data mining, with applications in areas such as student assessment, course design, and academic advising [1]. Machine learning techniques, such as linear regression and neural networks, can be used to build predictive models that can estimate students’ marks based on various factors, such as past grades, attendance, and test scores. These models can help educators to identify at-risk students, design personalized learning interventions, and provide feedback to students on their progress [2].

The primary goal of this endeavor is to harness the predictive capabilities of machine learning to improve educational outcomes. By utilizing historical data and the inherent patterns within it, the machine learning algorithms enable educators and institutions to identify students who may be at risk of underperforming or dropping out. This early intervention can significantly impact a student’s academic journey, providing timely support and personalized learning experiences.

One of the key advantages of using machine learning for predicting students’ marks is its ability to analyze and consider a multitude of variables simultaneously. In traditional assessment methods, educators might rely solely on a student’s performance in a single exam or assignment. However, machine learning models can take into account a wide range of factors that influence academic performance, including study habits, socioeconomic background, and even extracurricular activities. This holistic approach to prediction can offer a more comprehensive view of the educational path of a student.

Moreover, these predictive models can adapt and improve over time as more data becomes available. This adaptability is particularly valuable in the dynamic field of education, where student demographics, teaching methods, and curricula can change from year to year. By continuously training and refining these models, educators can make data-driven decisions to enhance the learning experience for their students. Various approaches can be used to predict students’ performance with machine learning algorithms, but they generally involve the following steps:

Collect and prepare the data: this involves collecting the relevant data from students’ records, such as their past grades, attendance, and test scores. To ensure that the data is in an acceptable format for the machine learning method, it should be cleaned and preprocessed.
Choose a machine learning model: for this purpose, a variety of machine learning models, including neural networks, decision trees, and linear regression, can be utilized. The choice of model will depend on the characteristics of the data and the specific goals of the prediction.
Train the model: once the model has been chosen, it needs to be trained on the data. This involves feeding the model a large number of examples and adjusting its parameters to reduce the difference between the predicted and the actual scores.
Evaluate the model: after the model has been trained, it is important to evaluate its performance to determine how well it is able to predict students’ marks. This can be done using techniques such as cross-validation, where the model is tested on a portion of the data that was not used for training.
Make predictions: the model can be used to make predictions on new data after it has been trained and assessed.
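As a minimal sketch of the evaluation step, the block below runs 5-fold cross-validation with scikit-learn. The synthetic data, the three hypothetical features (past grade, attendance, test score), and the choice of linear regression are assumptions made for illustration; they are not taken from the study.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for student records (hypothetical features:
# past grade, attendance, test score); this is not the paper's data.
rng = np.random.default_rng(0)
X = rng.uniform(0, 100, size=(200, 3))
y = X @ np.array([0.5, 0.2, 0.3]) + rng.normal(0, 2, size=200)

# 5-fold cross-validation: each fold is held out once for testing
# while the model is trained on the remaining four folds.
scores = cross_val_score(LinearRegression(), X, y, cv=5, scoring="r2")
mean_r2 = scores.mean()
```

Averaging the per-fold scores gives a performance estimate that depends less on one particular split of the data.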

In this paper, we will explore different methodologies for predicting students’ grades through the application of machine learning. These methodologies encompass aspects such as data collection and preprocessing, model selection, model training, evaluation, and prediction.

The organization of this paper is as follows: Sect. 2 delves into pertinent research in the field, Sect. 3 enumerates the machine learning techniques utilized, Sect. 4 outlines the adopted methodology, Sect. 5 presents the results of the predictive models employed, Sect. 6 contains a discussion regarding the rationale behind our selection of the mentioned algorithms and the distinct qualities that set these algorithms apart from others, influencing our choice for the proposed work. Section 7 presents the final thoughts and conclusions of the research.

2 Related Work

Several studies have explored the use of machine learning for predicting student grades. One widely used approach is linear regression, which is a statistical method for finding the linear relationship between a dependent variable (in this case, student grades) and one or more independent variables (such as attendance, test scores, and other factors). Linear regression has been shown to be effective in predicting student grades in a number of studies [3,4,5,6].

Another popular approach for predicting student grades is the use of decision tree algorithms, which build a tree-like model of decisions based on the data. Decision trees have been used to predict student grades in a number of studies [7,8,9] and have demonstrated their effectiveness in performing this specific task.

In addition to linear regression and decision trees, other machine learning algorithms that have been used for predicting student grades include k-nearest neighbor (k-NN) [10, 11] and random forests [12]. These approaches have also been shown to be effective for this task, although they may have different strengths and limitations depending on the specific characteristics of the data and the goals of the prediction.

To predict students’ performance based on the use of the internet as a learning resource and the impact of the time spent by students on social networks, the authors of a study [13] used a variety of machine learning algorithms, including decision trees, naïve Bayes, artificial neural networks (ANN), and logistic regression. They discovered that the ANN model, which had an accuracy of about 80%, performed the best.

The BiLSTM deep neural network model was employed by the authors in [14] coupled with an attention mechanism model, to predict students’ grades from historical data. The results showed that the BiLSTM combined with the attention mechanism yielded a better accuracy of 90.16%.

In another study [15], the authors applied a deep learning model to predict students’ academic performance. They employed a data set containing different variables such as demographic, social, educational, and student grades. They used the synthetic minority oversampling (SMOTE) technique to overcome the data imbalance problem. Their proposed solution resulted in approximately 96% accuracy for grade predictions across courses.

Sekeroglu et al. [16] looked into two data sets to predict and categorize student performance using several machine learning techniques, such as backpropagation, long-short term memory, support vector regression, and for classification, gradient boosting classifier. As a result, the support vector regression model outperformed the other algorithms at the R-squared score of 83% in grade prediction, and for classification, the backpropagation model performed the best with an accuracy equal to 87%.

The purpose of [19] is to enhance online teaching quality by predicting student pass rates, improving academic performance, and strengthening online education management. Researchers have used machine learning to forecast pass rates and identify key student factors impacting learning. However, they have not developed an online education-specific pass rate prediction model or introduced deep neural network (DNN) algorithms.

The study establishes a pass rate prediction feature model for online education, optimizing decision tree (DT) and support vector machine (SVM) algorithms using grid search. It compares these with DNN and finds DNN more complex with lower interpretability, while DT and SVM are simpler. Figure 1 illustrates that all three algorithms perform well at different feature model partition ratios, but DNN excels.

In [22], the author introduced a student performance prediction system based on deep neural network (DNN). They conducted training and testing on a Kaggle dataset, employing various algorithms including decision tree, naïve Bayes, random forest, support vector machine, k-nearest neighbor, and DNN using R Programming. The comparison of algorithm accuracies revealed that DNN achieved the highest accuracy at 84%, as depicted in Fig. 2.

2.1 Comparison of Some Similar Works

Table 1 compares some similar works, summarizing the methodologies used and the advantages and disadvantages of each study.

3 Machine Learning Model Used

The choice of the best algorithm depends on the specific dataset and problem at hand. It is often a good practice to experiment with multiple algorithms, tune their hyperparameters, and evaluate their performance using appropriate metrics to determine which one works best for your particular use case. Additionally, feature engineering and data preprocessing play a significant role in improving prediction accuracy.

In the context of predicting student grades, the selection of machine learning algorithms is driven by the nature of educational datasets and the objectives of the prediction task. In general, no algorithm is strictly ruled out, but some are poorly suited to the task or are generally not recommended for various reasons.

Decision trees and random forests are often preferred for their ability to handle diverse data types, capture non-linear relationships, and mitigate overfitting. Linear regression is chosen when the assumption of a linear relationship between features and grades is reasonable and when interpretability is paramount. K-nearest neighbors can be effective in identifying students with similar characteristics who tend to achieve similar grades. XGBoost is a favored choice for large and high-dimensional datasets, offering robustness and the ability to capture complex interactions. Deep neural networks come into play when complex, non-linear relationships or unstructured data are involved. The ultimate selection depends on factors such as the prediction goal, dataset characteristics, and the trade-offs between interpretability and predictive accuracy. Hence, we opted for the following algorithms, owing to their demonstrated effectiveness in numerous research investigations in the field of student grade prediction:

Decision tree regressor

A decision tree is a machine learning method that categorizes data or makes predictions based on the answers to a sequence of questions about the input features. It is a supervised learning model, which means that it is trained and tested on a data set that contains the known target values. It can be drawn as a tree-shaped graph in which each path from the root to a leaf represents one possible outcome under the given conditions.

Random forest regressor

A random forest is a supervised learning technique [17, 18] that employs an ensemble learning approach for regression. It is a meta-estimator that fits a number of decision tree regressors to different subsamples of the data set and averages their predictions to increase prediction accuracy and reduce overfitting [19].
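The averaging behavior can be checked with a short sketch using scikit-learn's RandomForestRegressor. The synthetic data and the choice of 25 trees are assumptions made for illustration only.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic data for illustration only (three hypothetical score features).
rng = np.random.default_rng(0)
X = rng.uniform(0, 100, size=(150, 3))
y = X.sum(axis=1) + rng.normal(0, 3, size=150)

forest = RandomForestRegressor(n_estimators=25, random_state=0).fit(X, y)

# Each fitted tree predicts on its own; the forest returns their mean.
per_tree = np.stack([tree.predict(X) for tree in forest.estimators_])
manual_mean = per_tree.mean(axis=0)
forest_pred = forest.predict(X)
```

Here manual_mean and forest_pred agree up to floating-point rounding, which is the sense in which the forest "employs the mean" of its trees.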

Linear regression

Linear regression is a popular statistical method for modeling the relationship between a dependent variable and one or more independent variables. It is a simple but powerful technique that assumes this relationship is linear and predicts the target value as a weighted sum of the independent variables. It is principally used to quantify the relationship between variables and to make predictions.
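As a small illustration, a linear model of the form y = b0 + b1·x1 + b2·x2 can be fitted by ordinary least squares with NumPy. The data and the coefficient values below are invented for the example and are deliberately noiseless so that the fit recovers them.

```python
import numpy as np

# Noiseless synthetic target: y = 5 + 0.6*x1 + 0.4*x2 (illustrative values).
rng = np.random.default_rng(0)
X = rng.uniform(0, 100, size=(100, 2))
y = 5.0 + 0.6 * X[:, 0] + 0.4 * X[:, 1]

# Add a column of ones so the first coefficient is the intercept.
X_design = np.column_stack([np.ones(len(X)), X])

# Ordinary least squares: minimizes the sum of squared residuals.
coef, *_ = np.linalg.lstsq(X_design, y, rcond=None)
```

The recovered coef array holds the intercept followed by the two slopes; with noisy real data the estimates would only approximate the underlying relationship.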

K-nearest neighbors regressor

K-nearest neighbors (k-NN) regressor is a type of supervised learning algorithm used for regression tasks. It works on the principle of finding the k-nearest data points to a new, unseen data point and using their target values (dependent variable) to predict the value for the new data point. Here is how the k-NN regressor algorithm works:

Training: during the training phase, the algorithm stores the feature vectors and their corresponding target values (dependent variable) from the training dataset.
Prediction: given a new, unseen data point, the algorithm finds the k training points closest to it and predicts the target value from their target values, typically as their average.
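The k-NN principle described above can be sketched in a few lines of NumPy. Euclidean distance and an unweighted mean of the neighbors' targets are assumptions here, as are the toy data and the helper name knn_predict; the text does not state which distance or weighting was used.

```python
import numpy as np

def knn_predict(X_train, y_train, x_new, k=3):
    """Predict a target as the mean of the k nearest training targets."""
    distances = np.linalg.norm(X_train - x_new, axis=1)  # Euclidean distance
    nearest = np.argsort(distances)[:k]                  # indices of the k closest points
    return y_train[nearest].mean()

# Toy data: two hypothetical features per student and a total score.
X_train = np.array([[60.0, 65.0], [62.0, 70.0], [90.0, 95.0], [30.0, 35.0]])
y_train = np.array([190.0, 200.0, 280.0, 100.0])

# The two closest students have totals 190 and 200, so the prediction is 195.
prediction = knn_predict(X_train, y_train, np.array([61.0, 66.0]), k=2)
```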

XGBoost regressor

XGBoost is a supervised machine learning algorithm used on large data sets. It is an accurate implementation of gradient boosting which can be applied to predictive modeling by regression.

Deep neural network

A deep neural network is composed of an input layer, an output layer, and at least three layers of interconnected nodes, or “neurons,” in between. This allows it to process data in a complex way, using advanced mathematical models. Each of these layers performs its own type of sorting and categorization in a process called feature hierarchy.

4 Methodology

The methodology of this study includes the following steps, which are summarized in the figure below (Fig. 3).

4.1 Data Set Description

The goal of this study is to predict students’ total scores using machine learning and deep learning techniques. To achieve this, we used a data set containing information on 1000 students, including characteristics such as gender, education level of parents, lunch, and exam preparation courses. In addition, the data set included scores on math, reading, and writing exams, as shown in Figs. 4 and 5.

4.2 Data Cleaning and Preprocessing

The first step in our analysis was to clean and preprocess the data. This included handling missing values, converting categorical variables to numeric form (Fig. 6), and scaling the data to ensure that all variables were on the same scale.
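A minimal sketch of these three preprocessing steps with pandas and scikit-learn follows. The tiny data frame, its column names, and the specific choices (mode and median imputation, one-hot encoding, standardization) are assumptions made for illustration; the text does not specify which techniques were applied.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Tiny hypothetical frame; column names are assumptions for illustration.
df = pd.DataFrame({
    "gender": ["female", "male", "female", None],
    "lunch": ["standard", "free/reduced", "standard", "standard"],
    "math score": [72.0, 47.0, None, 90.0],
    "reading score": [72.0, 57.0, 95.0, 88.0],
})
cat_cols = ["gender", "lunch"]
num_cols = ["math score", "reading score"]

# 1) Missing values: most frequent value for categorical columns,
#    median for numeric columns.
for col in cat_cols:
    df[col] = df[col].fillna(df[col].mode()[0])
for col in num_cols:
    df[col] = df[col].fillna(df[col].median())

# 2) Categorical to numeric via one-hot encoding.
encoded = pd.get_dummies(df, columns=cat_cols, drop_first=True, dtype=int)

# 3) Scale numeric columns to zero mean and unit variance.
encoded[num_cols] = StandardScaler().fit_transform(encoded[num_cols])
```

In a full pipeline the scaler would normally be fitted on the training split only and then applied to the test split, so that no information from the test data leaks into training.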

4.3 Feature Engineering

To improve the performance of the machine learning algorithms, we performed feature engineering on the data set. This involved selecting the most relevant features and creating a new feature, the total score, which serves as the target variable and is obtained by combining the existing math, reading, and writing scores (Fig. 7). Feature engineering seeks to produce a more robust and predictive data set for the machine learning algorithms (Fig. 8).
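One plausible way to build the target variable is sketched below with pandas. It assumes that "combining" means a plain sum of the three exam scores and that the columns carry the names shown; both are assumptions, and the sample values are invented.

```python
import pandas as pd

# Invented sample rows; column names are assumed for illustration.
df = pd.DataFrame({
    "math score": [72, 69, 90],
    "reading score": [72, 90, 95],
    "writing score": [74, 88, 93],
})

score_cols = ["math score", "reading score", "writing score"]

# New feature: row-wise sum of the three exam scores.
df["total score"] = df[score_cols].sum(axis=1)
```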

4.4 Data Visualization

To enhance comprehension and interpretation of information, we opt for representing our processed data through graphical elements such as charts and visual displays. Utilizing visual representations like graphs is an effective means of unveiling patterns and trends within complex data, ultimately aiding in simplifying information and facilitating more informed decision-making.

We proceeded to visualize the dataset, aiming to gain a deeper understanding of its contents and explore relationships among various variables. Our goal was to detect any discernible patterns or trends within the data. This helped us to identify the most important features for predicting students’ total scores. Figures 9, 10, 11, and 12 represent graphically the different variables.

A correlation study is a statistical analysis that measures the relationship between two or more variables. It is used to understand how the values of one variable vary with the values of another. The Pearson correlation coefficient, often denoted “r” or Pearson’s r, quantifies the strength and direction of the linear relationship between two continuous variables, that is, how closely the paired data points cluster around a straight line. Its value falls between −1 and 1, with −1 denoting a perfect negative correlation, 0 denoting no linear correlation, and 1 denoting a perfect positive correlation. To perform a correlation study in Python, the corr() method of a Pandas DataFrame or the pearsonr() function from the scipy.stats module can be used.
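Both routes mentioned above can be sketched as follows. The two synthetic score columns are invented for the example; only corr() and pearsonr() come from the text.

```python
import numpy as np
import pandas as pd
from scipy.stats import pearsonr

# Synthetic, positively related scores (illustrative only).
rng = np.random.default_rng(0)
math_scores = rng.uniform(40, 100, size=200)
reading_scores = 0.8 * math_scores + rng.normal(0, 8, size=200)
df = pd.DataFrame({"math score": math_scores, "reading score": reading_scores})

# Pairwise correlation matrix; DataFrame.corr() uses Pearson by default.
corr_matrix = df.corr()

# Single pair: Pearson's r together with a two-sided p-value.
r, p_value = pearsonr(df["math score"], df["reading score"])
```

The off-diagonal entry of corr_matrix and r are the same quantity; corr() is convenient for a whole table of variables, while pearsonr() also reports a p-value for one pair.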

Upon data processing, we acquire the correlation depicted in Fig. 13. It becomes evident that a robust relationship exists among the variables: math score, reading score, writing score, and the target variable total score.

4.5 Data Splitting

To ensure the validity of our results, we used a 30/70 ratio to divide the data set into training and testing sets. The testing set was used to gauge how well the models performed, while the training set was used to train the machine learning algorithms.
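A split of this kind can be sketched with scikit-learn's train_test_split. The block assumes that 30% of the 1000 records are held out for testing and 70% are used for training; the placeholder arrays and the random_state value are likewise assumptions.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder arrays standing in for 1000 student records.
X = np.arange(2000).reshape(1000, 2)
y = np.arange(1000)

# Hold out 30% of the rows for testing (assumed direction of the ratio);
# random_state fixes the shuffle so the split is reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=42
)
```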

4.6 Motivation

We are motivated to use regression models to predict student performance since they are valuable tools in education for the following reasons:

Identifying influential factors: regression helps pinpoint factors like attendance and socioeconomic status that affect student performance.
Data-driven decisions: schools use regression to analyze vast data sets, enabling informed decisions to enhance teaching methods and support systems.
Early intervention: regression identifies at-risk students, allowing early intervention and support, preventing long-term academic struggles.
Efficient resource allocation: schools optimize limited resources by focusing on areas identified by regression models, ensuring maximum impact.
Policy assessment: policymakers assess existing policies’ effectiveness using regression, guiding adjustments for improved education systems.
Personalized learning: regression tailors teaching methods based on individual student factors, enhancing engagement and achievement.
Continuous improvement: regression supports ongoing research, enabling the testing of new hypotheses and contributing to the evolution of effective educational practices.

4.7 Machine Learning Algorithm Implementation

The implementation of machine learning algorithms is driven by the desire to leverage data to automate tasks, make predictions, gain insights, and solve complex problems across a wide range of domains and applications. It empowers organizations and individuals to extract value from data and make data-informed decisions. For this reason, we implemented a range of machine learning algorithms for predicting student grades, including decision trees, random forests, linear regression, k-nearest neighbor, XGBoost, and deep neural networks [20].

The procedures listed below were used to implement these algorithms using the scikit-learn library and the Python programming language.

Create an instance of the algorithm class: each algorithm is implemented using a corresponding class from a library such as scikit-learn. To create an instance of the class, we need to call the class with any desired hyperparameters (the random_state parameter is set to ensure that the results are reproducible).
Fit the model to the training data: once the model instance has been created, its fit() method is used to fit it to the training data. This method takes the training features and the target variable as inputs and adjusts the model’s internal parameters to fit the data.
Repeat the process for each algorithm: the process of creating and fitting the model is repeated for each algorithm. After all the algorithms are trained, they can be used to make predictions on the testing data.
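The three steps above can be sketched as a loop over scikit-learn estimators. The synthetic data, the default hyperparameters, and the random_state value are assumptions rather than the settings used in the study. XGBoost is left out because it lives in the separate xgboost package; its XGBRegressor class follows the same fit()/predict() interface and could be added to the dictionary if that package is installed.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data: three scores and their sum as the target.
rng = np.random.default_rng(0)
X = rng.uniform(0, 100, size=(300, 3))
y = X.sum(axis=1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Step 1: create one instance per algorithm (random_state for reproducibility).
models = {
    "DT": DecisionTreeRegressor(random_state=42),
    "RF": RandomForestRegressor(random_state=42),
    "LR": LinearRegression(),
    "k-NN": KNeighborsRegressor(),
}

# Steps 2 and 3: fit each model on the training set, then predict on the test set.
predictions = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    predictions[name] = model.predict(X_test)
```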

4.8 Deep Neural Network Implementation

The DNN used was built using the Keras library in Python using the following steps:

The first step is to create a model object using the sequential class. This creates a model that is a linear stack of layers, where the input goes through each layer sequentially and the output of one layer is the input of the next layer.
Next, the layers are added to the model using the model.add() method. The model has 15 layers, each with a specified number of neurons and an activation function. The activation function determines the output of a neuron given an input or set of inputs. In this case, the relu activation function is used for all but the output layer, which uses a linear activation function.
After the layers are added, the model is compiled using the model.compile() method. This step specifies the optimizer, loss function, and metrics that will be used to train the model. The Adam optimizer is used, and the loss function is the mean squared error (MSE). The MAE, MSE, and RMSE metrics are also used to evaluate the model's performance.
Finally, the model is trained using the model.fit() method, which takes the training data and target variables as inputs and trains the model for the given number of epochs. The model is trained for 100 epochs. The model is also evaluated on the testing data at the end of each epoch through the validation_data parameter.
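The four steps above can be sketched with Keras as follows. This is a reduced illustration and not the authors' network: it uses three Dense layers instead of 15 and 5 epochs instead of 100 so that it runs quickly, and the layer widths, the synthetic data, and the import path (Keras as bundled with TensorFlow) are assumptions.

```python
import numpy as np
from tensorflow import keras

# Synthetic stand-in data: three scaled features and their sum as the target.
rng = np.random.default_rng(0)
X_train = rng.uniform(0, 1, size=(200, 3)).astype("float32")
y_train = X_train.sum(axis=1)
X_test = rng.uniform(0, 1, size=(50, 3)).astype("float32")
y_test = X_test.sum(axis=1)

# Step 1: a Sequential model is a linear stack of layers.
model = keras.Sequential()

# Step 2: add layers; ReLU in the hidden layers, linear output for regression.
model.add(keras.Input(shape=(3,)))
model.add(keras.layers.Dense(32, activation="relu"))
model.add(keras.layers.Dense(16, activation="relu"))
model.add(keras.layers.Dense(1, activation="linear"))

# Step 3: Adam optimizer, MSE loss, MAE and RMSE tracked as extra metrics.
model.compile(
    optimizer="adam",
    loss="mse",
    metrics=["mae", keras.metrics.RootMeanSquaredError()],
)

# Step 4: train, evaluating on the held-out data after every epoch.
history = model.fit(
    X_train, y_train,
    epochs=5,
    validation_data=(X_test, y_test),
    verbose=0,
)
```

The history.history dictionary keeps the per-epoch training and validation values, which is useful for spotting overfitting when the validation loss stops improving while the training loss keeps falling.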

5 Model Evaluation and Results

To assess and contrast how well the machine learning algorithms work, we used regression plots of the models as well as several metrics, including the R-squared score, mean absolute error (MAE), mean squared error (MSE), and root mean squared error (RMSE).

5.1 Regression Plots

A regression plot is a valuable tool in statistical analysis and modeling. It is a scatter plot that illustrates the connection between two variables and the fitted line or curve that represents the model’s predictions. It is a useful tool for visualizing the performance of a machine learning model and identifying trends and patterns in the data.

We used regression plots to visualize the predictions made by each model on the test set (y_test). These plots show the relationship between predicted and actual values, allowing us to see how accurately each model predicted the total scores of students, as shown in the following figures (Figs. 14, 15, 16, 17, 18, 19). The regression plots can be created in Python with the regplot() function of the Seaborn library.
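A plot of this kind might be produced as sketched below. The y_test and y_pred arrays are synthetic placeholders for a model's actual and predicted totals, and the styling options and axis labels are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script also runs headless
import numpy as np
import seaborn as sns

# Placeholder values standing in for actual and predicted total scores.
rng = np.random.default_rng(0)
y_test = rng.uniform(80, 300, size=100)
y_pred = y_test + rng.normal(0, 5, size=100)

# Scatter of actual vs. predicted values with a fitted regression line.
ax = sns.regplot(x=y_test, y=y_pred, scatter_kws={"s": 15}, line_kws={"color": "red"})
ax.set_xlabel("Actual total score")
ax.set_ylabel("Predicted total score")
```

The closer the points lie to the fitted line, and the closer that line is to the diagonal, the better the model's predictions match the actual scores.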

Based on the regression plots of the models used, we can see that the DNN model fits the data better than the other machine learning algorithms used.

5.2 Performance Metrics

Performance metrics play a fundamental role in assessing, improving, and optimizing performance across a wide range of domains. They provide a structured and measurable way to evaluate, compare, and make decisions, ultimately leading to better outcomes, increased efficiency, and informed actions.

We selected the following metrics to determine which of the models exhibits superior performance.

R-squared score: this metric gauges how much of the target variable’s variance the model is able to account for. An improved fit is indicated by a higher R-squared value.
Mean absolute error (MAE): this is the average absolute difference between the predicted values and the actual values. Better fit is indicated by a lower MAE.
Mean squared error (MSE): this is the average squared difference between the predicted values and the actual values. Better fit is indicated by a lower MSE.
Root mean squared error (RMSE): this measure is used to quantify the error in the same units as the target variable and is the square root of the MSE. Better fit is indicated by a lower RMSE.
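The four metrics can be computed as sketched below with scikit-learn and NumPy. The four actual and predicted values are invented so that the arithmetic is easy to follow by hand.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Invented actual and predicted total scores; errors are 2, -5, 3, 0.
y_true = np.array([200.0, 150.0, 250.0, 180.0])
y_pred = np.array([198.0, 155.0, 247.0, 180.0])

r2 = r2_score(y_true, y_pred)              # 1 - SS_res / SS_tot
mae = mean_absolute_error(y_true, y_pred)  # (2 + 5 + 3 + 0) / 4 = 2.5
mse = mean_squared_error(y_true, y_pred)   # (4 + 25 + 9 + 0) / 4 = 9.5
rmse = np.sqrt(mse)                        # square root of the MSE
```

Because the RMSE is defined as the square root of the MSE, the two values always move together, and the RMSE is expressed in the same units as the target.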

Table 2 shows the R-squared, MAE, MSE, and RMSE values of the different models used.

Table 1 Analyzing various research papers within the domain of student grade prediction through the application of machine learning techniques


Based on the results of the regression plots and the different evaluation metrics shown in Table 2, the DNN model was found to be the best-performing model with a determination coefficient equal to 99.97% and MAE = 0.45, MSE = 0.05, RMSE = 1.13, followed by the LR model with an R-squared equal to 99.10% and relatively high errors. In the third position, there is the k-NN, followed by the RF model, then the DT, and the XGB in the last position.

Table 2 Model evaluation based on R-squared, MAE, MSE, and RMSE


6 Discussion

In the research papers we have examined, particularly in [5, 19], and [22], deep neural networks (DNNs) demonstrate superior performance in predicting students’ grades compared with DT, LR, RF, k-NN, and XGB. This is attributed to their capability to autonomously learn and extract pertinent information from raw data, reducing the need for labor-intensive feature engineering. Additionally, a DNN can be customized for specific tasks, thereby conserving both time and resources. Ongoing advances in DNNs within artificial intelligence (AI) and machine learning contribute significantly to innovation, particularly in domains such as enhancing learning processes. This does not imply that the other algorithms cannot achieve superior performance; on the contrary, they do so in settings that differ from our specific problem.

To draw comparisons between our study and previous research in the same domain and in order to avoid falling into the errors already made by other researchers, we concentrated on addressing the shortcomings observed in some prior works. Specifically, we made enhancements in the following aspects:

Incompleteness of the data, which leads to inaccuracies.
The limited interpretability of the methods, especially complex ones like neural networks, which hinders a complete understanding of the reasons behind specific predictions.
The overfitting problem.
The quality of data and underlying assumptions.
Lack of sufficient detail regarding the methodology and parameters utilized in the hybrid machine learning approach.
Dataset limitations that restrict the generalizability of findings to other contexts.
Limitations concerning the choice of algorithms, parameter tuning, and the representativeness of the dataset.
The quality and completeness of the data used.

7 Conclusion

The objective of this paper is to apply machine learning algorithms for the prediction of student scores. After implementing and evaluating a range of machine learning algorithms, our findings demonstrated that the deep neural network model performed better than the competing algorithms in terms of determination coefficient and error metrics. With a determination coefficient of 99.97% and negligible errors, the deep neural network demonstrated the highest level of accuracy in predicting students’ grades.

These results have important implications for educators and administrators looking to use machine learning to improve student outcomes and support student success. By identifying the most effective algorithms for predicting student grades, we can better understand the factors that contribute to student performance and tailor teaching approaches and support to the specific needs of individual students.

Overall, this study highlights the potential of machine learning to revolutionize the way we approach education by providing personalized and targeted support to students. By continuing to explore and refine these techniques, we can continue to make progress in helping students achieve their full potential.

Finally, we can say that the application of machine learning techniques to predict students’ marks has the potential to revolutionize the field of education. These models can offer a data-driven approach to identifying at-risk students, personalizing education, and improving overall academic outcomes. However, their implementation should be guided by a strong commitment to ethics and privacy to ensure that the benefits of these technologies are realized while safeguarding students’ rights and well-being.

Data Availability

The datasets used during the current study are freely available in the UCI repository.

Code Availability

The code will be available upon request to reviewers.

Funding

The authors did not receive support from any organization for the submitted work.

Author information

Authors and Affiliations

Faculty of Juridical, Economic and Social Sciences, Chouaib Doukkali University, El Jadida, Morocco
Adil Korchi
National School of Commerce and Management, Sidi Mohamed Ben Abdellah University, Fez, Morocco
Fayçal Messaoudi
Faculty of Sciences and Techniques, Hassan Premier University, Settat, Morocco
Ahmed Abatal
Faculty of Science Dhar El Mahraz, Sidi Mohamed Ben Abdellah University, Fez, Morocco
Youness Manzali

Contributions

The authors confirm their contribution to the paper as follows: Study conception and design: KA, MF, MY, AA. Data collection: KA, MF, MY, AA. Analysis and interpretation of results: KA, MF, MY, AA. Draft manuscript preparation: KA, MF, MY, AA. All authors reviewed the results and approved the final version of the manuscript.

Corresponding author

Correspondence to Adil Korchi.

Ethics declarations

Ethical Approval

Not applicable

Consent to Participate

Not applicable

Consent for Publication

All authors of the manuscript have agreed to authorship, have read and approved the manuscript, and have given consent for its submission.

Conflict of Interest

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Korchi, A., Messaoudi, F., Abatal, A. et al. Machine Learning and Deep Learning-Based Students’ Grade Prediction. Oper. Res. Forum 4, 87 (2023). https://doi.org/10.1007/s43069-023-00267-8

Received: 21 July 2023
Accepted: 13 October 2023
Published: 31 October 2023
DOI: https://doi.org/10.1007/s43069-023-00267-8
