Keywords

1 Introduction

Educational data mining is a relatively new study subject that attempts to understand hidden connections in various educational scenarios, such as learner data analysis, student learning activity recognition, instructor course design, and academic planning, and scheduling. Student academic performance is defined from different perspectives, and quantifiable assessment serves an essential part in today’s educational institutions. Student performance prediction (SPP) certainly makes sense [1]. SPP could support learners in selecting appropriate programs or activities and making educational strategies. SPP can assist educators in modifying educational content and teaching approaches related to student capability and identifying at-risk individuals. SPP can assist educational administrators in evaluating the curriculum program and improving the coursework. Generally, educational development participants could produce better initiatives to expand academic attainment.

Furthermore, the data-driven SPP review attempts as an unbiased benchmark for the education sector. In diverse contexts, student performance prediction (SPP) can be expressed as various issues. The most critical process of data mining in education is predicting the students’ performance. It examines online information and uses several approaches and models such as correlation analysis, neural network models, rule-based frameworks, regression, and Bayesian networks [2]. Based on characteristics retrieved from information filtering, such an approach allows everyone to anticipate the performance of the students, i.e., forecast his performance in a program and their performance rating. The classification and prediction techniques help detect undesired academic achievements such as incorrect activities, decreased morale, cheating, and underachievement. Segmentation, grouping, anomaly analysis, feature engineering, logistic regression, and artificial neural are the most commonly utilized student performance prediction (SPP) algorithms.

The section categorization of this paper is as follows: In Sect. 2, several existing techniques of student performance prediction (SPP) are surveyed. Sect. 3 discussed academic prediction analysis’s applications and advantages, and Sect. 4 described several student prediction models with existing performance evaluations with different classification models. Sect. 5 elaborates the theoretical analysis conclusion of the several learning models.

2 Literature Review

Several existing methods of student performance prediction models are reviewed in this section.

Chango et al. [3] designed a data fusion approach based on blended learning. The information fusion system was used to determine university students’ overall educational excellence integrating different, multidimensional data across blended educational contexts. Information about first-year university graduates was collected and pre-processed from a diversity of ways, including classroom sessions, practical classes, interactive moodle workshops, and a midterm test. The main goal was to determine where the information fusion method yields the most extraordinary outcomes with our dataset. The findings indicate that aggregates and picking the best characteristics method with fractional order data generate the best predictions. Silva et al. [4] proposed an artificial intelligence model to predict academic performance. The neural network was used for the analysis of the performance of students. A backward propagating approach was used to develop a multilayer perceptron neural network to classify the chance to dominate the competition. The classification performance rates were very high, with an average of 74.98%, including all programs, demonstrating the determinants’ usefulness in forecasting educational attainment. Two estimation techniques, specifically student evaluation ratings and ultimate performance of students, were developed by Alshabandar et al. [5]. The algorithms could identify the determinants that affect MOOC students’ educational objectives. Mainly as a consequence, all methods perform practical and precise measurements. The most negligible RSME improvement was produced by RF, with an overall average of 8.131 general students’ evaluations grading system.

In contrast, GBM yielded the best performance in the ultimate version of the student, with an overall average of 0.086. Al Nagi et al. [6] used multiple classification algorithms on an educational database for courses online to select the optimum model to classify academic achievement based on critical variables that may lead to different results. Artificial neural network (ANN), decision tree (DT), KNN, and support vector machine (SVM) were some of the classifiers employed. Experiments were performed with actual statistics, as well as the algorithms were assessed using four performance indicators: precision, accuracy, f-measure, and recall. Raga et al [7] used the deep neural network design and Internet communications features as training sets. The authors were investigated with constructing a forecasting model for academic success in the initial stages of the teaching and learning process. Firstly, several measurements were carried out to find the model parameters for just the highest Convolutional Neural Network (CNN) architecture, as it was used as a foundation classification model. This result's accuracy for forecasting exam findings for a specific course was 91.07% with a ROC and AUC value of 0.88.

In contrast, the overall efficiency with forecasting midterms consequences was 80.36%, with a ROC AUC value of 0.70. Czibula et al. [8] presented Students performance prediction using relational association rules (S-PRAR). This unique categorization approach relies on interpersonal sequential pattern development to forecast an academic program’s outcome regarding student ratings during the academic session. Investigations on three multiple datasets acquired from Romania’s Babes-Bolyai University revealed that the S-PRAR was implemented to solve well. The S-PRAR classification algorithm presented in the proposed research benefited from becoming generalized since it was not limited to the learners’ achievement classification step. Table 1 discussed various existing student academic performance prediction models depicted with research gaps, comparative techniques, future work, and performance metrics.

Table 1 Comparative analysis of various existing methods of student performance prediction models

3 Applications and Advantages of Student Performance Models

3.1 Applications

There are various applications of the student prediction performance [9, 10], and some of the applications are (i) Assessing students for enhancing their outcomes: The main objective of the student performance prediction model is to monitor the performance of the students and help the learners to improve their performance. (ii) Help instructors in curriculum/course design and improvement: Student performance prediction (SPP) helps the instructors to design the program course. Also, it helps to analyze students’ interests. Students can quickly learn the intersecting content; therefore, SPP gives the instructors directions for designing the intersecting course. (iii) Providing comments to educators: Students’ performance can be predicted with the help of data mining approaches. The approaches are used to analyze the student’s achievements and based on achievement analysis. Comments are provided to educators. (iv) Student’s recommendations: Recommendation systems can also be used to tap into scientific relations collected in course-learning databases. This recommender platform’s objective is to facilitate learners across a whole program using a competency-based evaluation approach. Learners must attain escalating levels of achievement with each program competency through productive projects. Learners may find suggestions helpful in advancing toward the next level of expertise.

3.2 Advantages

The applications mentioned earlier of student prediction performance provided various advantages: Improves student’s self-reflection: The student performance prediction model stores the overall information of the students, which can be used to predict the academic performance [2]. Whenever students lack in any course, it provides an alarm to students. Also, it gives a performance chart of the students that can reflect the students learning. Identifying Unwanted Student Behaviors: The student performance prediction model helps determine students’ erratic and unwanted behavior. Student’s poor performance risk identification: Proactive advising of the student performance prediction model can aid students in a reasonable timeframe using the information to identify vulnerable learners in a program. Academic advisers and student achievement professionals need meaningful information regarding learner performance and outcomes, which the student performance prediction model provides. Measure the impact of tool adoption: Universities may accurately assess how and why the platform is being implemented through access to the information on improving the educational activities and promoting the implementation where it will have the most impact. Universities can also evaluate third-party services’ usage; a small number of academic staff may employ that and therefore do not reflect the hefty premium.

4 Supervised-Based Learning Classification Models for Student Performance Prediction

Predictive modeling is commonly used within educational data mining methods to determine academic achievement. Numerous activities are employed to develop predictive modeling, including categorization, Analysis, and segmentation. Categorization is the most common requirement used to determine academic achievement. Under the categorization issue, numerous techniques have been used to estimate academic achievement [11].

4.1 Artificial Neural Networks (ANN)

An artificial neural network combines multiple nodes and works like a human brain system. An artificial neural network is another commonly used technique for student performance prediction. The advantage of ANN is that it can find the relation between the large dataset and search for every possible response. ANN is a group of interlinked input/outcome labels, and a load exists on each link. At the time of the training stage, the system acquires knowledge through load arrangement to estimate the accurate labels of the input module [12]. It is exceptionally capable of deriving explanations from complex or non-specific data. The basic structure of the artificial neural network is presented in Fig. 1.

Fig. 1
A set of two diagrams has three layers. 1, input, hidden, and output layer. 2, inputs, weights, and activation functions.

The basic structure of ANN model [12]

4.2 Convolution Neural Network (CNN)

CNN is a deep learning technique for processing information. A convolutional neural network is a branch of artificial intelligence that collects information using convolutional layers, a computer vision unit method. It is designed as hierarchical spatial features from lower to higher-level patterns [13]. It is comprised of pooling, convolution, and fully connected layers. When the layers are weighted, then the structural design is created. The main functions of the layers of convolution neural network are categorized into different areas: The input layer may store the pixel values of the picture. The basic architecture of the convolutional neural network is presented in Fig. 2.

Fig. 2
An architectural flow has two sets. Feature extraction and classification. Feature extraction has input, convolution, and pooling layers. Classification has connections between fully connected and output nodes.

The basic architecture of CNN model [13]

The convolution layer may identify the output value of neurons that connect to the local area of the input by calculating the scalar product among the weights and area attached to the information. Thus, the rectified linear unit aimed to put on activated value like sigmoid to result of activation generated by the last layer. The pooling layer may execute the down sample with the spatial dimension of required input and decrease the number of metrics with the activated value. And, fully connected layer performs a similar function searched in standard neural network and creates the class scoring from the activators for the classification purpose.

4.3 Support Vector Machine

The (support vector machine (SVM) was created with binary classification in consideration. Several categorization methods have been proposed to generalize SVM to the multi-class case. SVM is the simplest way to classify binary data classes [14]. SVM, one of the machine learning algorithms based on supervised learning, helps in classification and regression. The main objective of SVM is to provide a hyperplane and generate the different class clusters [15].

4.4 Decision Tree

The decision tree is a prediction and classification technique. The structure of the decision tree is a flowchart-like structure. Each node is connected to the other and forms a tree structure. The internal node denotes test attributes; the test outcome is presented with branches [16]. The class label is shown as leaf nodes or terminal nodes. The trees inside the decision tree framework can be trained by dividing the source set into small subsets based on the test attribute values.

4.5 K-NN Model

The K-NN technique ensures that the particular incoming instance and existing cases are equivalent and assigns the new case mostly in subcategories that are most compatible with the existing subcategories [17]. The K-NN method accumulates all available information and identifies a subsequent set of statistics premised on its resemblance to the current data. It implies that new information can be quickly sorted into a well-defined subcategory that uses the K-nearest neighbor method. The K-NN approach is used for regression and classification issues, but it is more generally applied for classification methods.

4.6 Random Forest

Random forest is a flexible, straightforward computational model in the vast majority of circumstances, produces tremendous success with hyper-parameters or without hyper-parameters modification. Along with its simplicity and versatility, it has become one of the most commonly used approaches. It can be used for classification and regression tasks. The essential characteristics of the RF algorithm are that it can manage sets of data with both categorical and continuous, as in regression and classification issues [17].

4.7 Fuzzy Logic

Fuzzy logic is a method of variables computing that enables the computation of numerous possible conditional probabilities using the exact attributes. Fuzzy logic is a type of logic that is used to simulate human understanding as well as thinking. It is a computing technique that focuses on “truth degree” instead of the conventional “correct or incorrect” (1 or 0) binary decision logic that the digital machine is built on “fuzzy” refers to something unclear or ambiguous [18]. A fuzzy system can be described as a set of IF–THEN logic containing fuzzy propositions, a mathematical or differential calculation containing random variables as variables that represent the uncertainty of attribute values. In Table 2, various existing methods of student performance prediction (SPP) with attributes and performance metrics.

Table 2 Existing methods of student performance prediction (SPP)

The comparison of various existing methods of student performance prediction (SPP) is presented in Figs. 4 and 5. The comparison analysis of different current methods of student performance prediction provided that the naïve Bayes method has attained maximum accuracy, precision, and recall compared to the other academic prediction models.

Fig. 4
A bar graph compares the accuracy rates of existing models. The naive Bayes method has the highest accuracy, and A N N has the lowest accuracy.

Comparison between several existing models with an accuracy rate

Fig. 5
A double bar graph compares the recall and precision rates of existing models. The Naive Bayes method has the highest recall and precision rate.

Comparison between several existing models with recall and precision rate

5 Conclusion and Future Work

This paper analyzed recent research on student performance prediction (SPP) models. Estimating student achievement is the most efficient approach for learners and educators to enhance learning by ensuring that students complete their courses on time. Existing research has used a range of methodologies to develop the best forecasting models. Many factors were selected and evaluated to identify the most critical and robust characteristics to estimate an optimal framework. Attendance, CGPA, race, and evaluation score have all been used by the majority of the investigators. Several features have an immense impact on whether or not a learner will complete their studies. Among the most critical aspects of evaluating academic achievement are the forecasting techniques. Several researchers use the decision tree, artificial neural network, support vector machine (SVM), K-Nearest Neighbor K-NN, and other categorization, prediction, and segmentation techniques. Many use a combination of strategies to create a more reliable system with higher predicted accuracy. The comparison analysis of various existing methods of student performance prediction provided that the naïve Bayes method has attained maximum accuracy, precision, and recall. Further work will introduce the novel feature extraction and prediction models with mathematical expressions to fetch the reliable feature values and improve the classification metrics.