1 Introduction

With competition in the education sector steadily increasing, student contentment has emerged as an essential factor in attracting and retaining the best students, who in turn raise the rank and status of the university. The term ‘Student Contentment’ is a measure that quantifies the student’s appraisal of the services offered by colleges and universities [6]. Student satisfaction is of compelling interest to schools and colleges as they try to consistently improve the learning conditions for students, meet the expectations of their constituent groups and governing bodies, and demonstrate their institutional effectiveness. It is an important indicator of the quality of learning experiences. Higher education institutions consider student satisfaction one of the significant components in determining the quality of different programmes in present markets [14, 38, 39]. Students spend resources such as effort, money and time at the university, so satisfaction matters to them as they aim to achieve a quality education in their higher studies. The contentment level at a particular institute or university alters the motivation of the students and ultimately changes the rate of student retention at that university. Measuring it allows universities to build a system for continuously monitoring how effectively the university caters to student needs. From a university point of view, satisfied students are more likely to do well academically, which further augments the financial status and reputation of the university. Student contentment reflects not only how much a student enjoys the time spent at university, but also the student’s performance within and outside the university. A recent study found that student contentment has a remarkable influence on the perceived distinctiveness of an institution, which has a direct impact on the success of student recruitment efforts [20].
Student contentment is vital for promoting institutional life and is an essential influence on the university’s standing in global rankings.

In recent times, various studies have highlighted the significance of examining the life satisfaction of students, for its academic and personal repercussions [26, 32]. Most of this research targets a broad range of topics related to student contentment. For example, some practitioners explore which aspects influence student contentment [28, 35]. Others investigate the correlation between student contentment and other factors such as learning environment, service quality, or instructional design and management style [20]. The authors of [17] focused on the viewpoint of students on sustainability. The authors of [33] examined whether Facebook likes are sufficient to determine student contentment in Open Distance Learning (ODL). Some research analyzed student contentment in courses such as Computer Architecture and Organization taught with multiple e-Learning tools [3]. The researchers in [9] discussed how student contentment is related to teaching, along with other university experiences.

Further, the work in [18] investigated the contentment level at the School of Technology and Applied Sciences, Mahatma Gandhi University, Kerala. The authors of [29] concluded that the contentment level of students rose from 2007 to 2013 with improvements in laboratory conditions, including laboratory notes, computers, and other engineering equipment. The researchers in [40] examined how meaningful the participation and contentment of students are in online learning. Research has also focused on the methods that faculty follow to promote online learning among students [1, 23]. Some researchers [13] worked on a project that aims to enhance student contentment using a Virtual Learning Environment. The work in [25] aimed to identify the aspects that affect the happiness of students in universities in Pakistan and to evaluate whether each factor relates positively or negatively to satisfaction. The research in [31] predicted the levels of student contentment in face-to-face learning. The work in [19] examined the extent to which interaction and other predictors contribute to student contentment in online learning. The researchers in [21] determined the predictors of student contentment, focusing on recreational sports and cultural facilities within the campus. The research in [12] found that log files of student course activities play a significant role in predicting student contentment with modules in a virtual learning environment. The authors of [11] showed that data mining techniques can select a smaller number of constructs that need consideration when addressing student contentment. The work in [15] concluded that the two most significant factors in student course contentment are the number of students enrolled in a course and the rate of high merit in their final grading.

In recent times, machine learning has been applied in the field of educational studies. The 2018 New Media Consortium Horizon (NMCH) report calls for machine learning to be embraced as part of adaptive skills integrating artificial intelligence within the succeeding 2–3 years, and as part of analytics skills in less than a year [5, 8, 16, 22, 24].

Most studies on quality in the higher education sector pay more attention to academic features than to administrative factors [10, 27, 30, 34, 41]. These studies mostly concentrate on the quality of courses, their teaching methodologies, and effective course delivery tools. The author of [7], for example, suggests several determinants influencing the image of a higher education institution, such as average class strength, diversity of courses, academic reputation, students’ qualifications and personal qualities, staff-student interaction, etc. On the other hand, the author of [4] highlights the significance of student consultation, student well-being, library facilities, teaching environment, technical facilities, etc.

Until now, research has usually focused on determining correlations between student contentment and the academic success of students, their final scores, or individual psychological elements. This work is a first step towards predicting the student contentment score from different aspects: Academics, Faculty zone, General, Research zone, Extra-Curricular, Hostel, Technology, Teaching Practices and Recreational zones. The purpose of this work is to exploit machine learning practices in the field of educational research. The paper presents an ensemble machine-learning model for predicting the student contentment score from data procured using a purpose-designed questionnaire circulated online [37].

1.1 Major research contribution

The following are the main contributions:

  1. The proposed predictive analytic ensemble model integrates the capabilities of the individual predictive models, thereby generating improved and consistent predictions. The proposed framework performed notably better than the best-performing base model, viz. ‘Gaussian Process with Polynomial Kernel’.

  2. From a computational point of view, the ensemble approach is more demanding, and wrapper feature selection using a cuckoo-inspired heuristic is applied to optimize the generalization performance of the predictive model.

  3. The superiority of the novel ensemble prediction model was validated on data procured using a purpose-designed questionnaire circulated online.

  4. In contrast to the baseline models, the proposed ensemble showed superior predictive strength for student satisfaction.

  5. The prediction efficiency of the proposed model was compared with the base models, and the significance of the proposed prediction model was explored.

1.2 Organization

The remainder of the paper is organized as follows: Section 2 gives the detailed background of the materials and methods for the proposed analytical model; Section 3 presents the experimental results of the proposed model and compares it with the base models on various performance metrics; and Section 4 concludes the work and highlights future research directions.

2 Materials and methods

This section presents a detailed description of the proposed methodology along with the features of the procured data. The proposed methodology encompasses a prediction module that predicts the contentment rate of the students by employing multiple ensemble machine learning models on the ratings received from the students through Google Forms.

2.1 Data collection

In the current work, the data for evaluating student contentment were collected first-hand by circulating a questionnaire to the students of the university through online means. The survey, named The Overall Contentment Predictor Questionnaire (TOCPQ), is designed to capture the perceptions of the students, which can further aid in forecasting a student’s contentment level with the university, as shown in Table 2. The survey focused on distinct groups such as Academics, Campus environment (hostel facility, library facility, in-class environment), General, and Research, as shown in Table 1. The data collected include 78 parameters and 4386 distinct instances. The participants were pursuing graduation, post-graduation, and doctorate degrees at the university (Table 2).

Table 1 Broad classification of the dataset
Table 2 The proposed overall contentment predictor questionnaire
Table 3 The models along with their tuning parameters
Table 4 Parameters results for Dataset with features, with different training data
Table 5 Average values of evaluation parameters for 10 models (5-Fold Cross validation)
Table 6 The 5 models combination (based on RMSE)
Table 7 Different ensemble models with the 5 models

2.2 Data preprocessing

Data collection is followed by data pre-processing, in which irrelevant data are excluded from the dataset by cleaning noisy records and imputing missing values through interpolation. Categorical attributes are converted to numerical values (Fig. 1).
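The pre-processing steps described above can be sketched as follows. This is a minimal illustration in standard-library Python, not the authors' R code: the function names, the choice of linear interpolation, the first-appearance integer coding, and the min-max form of normalization (normalization is only mentioned later, in Section 3) are all assumptions, and the sample values are invented for the example.

```python
from typing import Dict, List, Optional


def interpolate_missing(values: List[Optional[float]]) -> List[float]:
    """Fill None entries by linear interpolation between the nearest observed
    neighbours; gaps at either end copy the nearest observed value."""
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("column has no observed values")
    filled: List[float] = []
    for i, v in enumerate(values):
        if v is not None:
            filled.append(v)
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled.append(values[right])
        elif right is None:
            filled.append(values[left])
        else:
            t = (i - left) / (right - left)
            filled.append(values[left] + t * (values[right] - values[left]))
    return filled


def encode_categories(labels: List[str]) -> List[int]:
    """Map each distinct label to an integer code, in order of first appearance."""
    codes: Dict[str, int] = {}
    return [codes.setdefault(label, len(codes)) for label in labels]


def min_max_normalise(values: List[float]) -> List[float]:
    """Rescale a column to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Illustrative (invented) column of ratings with gaps, and a categorical column.
ratings = interpolate_missing([4.0, None, 2.0, None, None, 5.0])
degree = encode_categories(["UG", "PG", "UG", "PhD"])
scaled = min_max_normalise(ratings)
```

Integer coding imposes an order on the categories; whether that is appropriate depends on the attribute, and the paper does not say which coding was used.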

Fig. 1 Solution encoding

This step is followed by the application of the cuckoo search meta-heuristic, wherein the original dataset of 78 features is reduced to an optimal subset of features.

2.3 Proposed methodology

The prediction module consists of four broad sub-modules, namely Data Preparation, Feature Selection using cuckoo search, Stacking Ensemble, and Cross-Validation. The pipeline of the methodology is illustrated in Fig. 3.

2.3.1 Cuckoo search meta-heuristic

Cuckoo search, a meta-heuristic well suited to combinatorial optimization problems, is modelled on the lazy (brood-parasitic) behaviour of cuckoos, which lay their eggs in the nests of host birds. The eggs resemble those of the host bird. If the host bird recognizes the foreign eggs, it either abandons the nest or throws those eggs out. On the basis of this behaviour, the authors of [36] proposed the cuckoo search algorithm, a novel meta-heuristic optimization algorithm. The algorithm treats every nest as a single solution and iteratively improves the population by replacing older solutions with new solutions (cuckoo eggs) according to their fitness function values. The detailed pseudocode of the proposed model for finding the student’s contentment score is given in Fig. 2.

Fig. 2 The pseudocode of the proposed approach

The main sections of the pseudo code are as follows.

Encoding of Solution

The solution is encoded as a string of ‘m’ bits, where m is the number of features in the original data (here m = 78); a 1 signifies that the feature is selected and a 0 that it is discarded from the feature space.
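As a small illustration of this encoding, the sketch below decodes a bit mask into the indices of the selected features and applies it to one data row. The helper names and the six-feature example are hypothetical; the paper uses m = 78.

```python
from typing import List, Sequence


def selected_indices(mask: Sequence[int]) -> List[int]:
    """Return the positions of the features whose bit is 1."""
    return [j for j, bit in enumerate(mask) if bit == 1]


def apply_mask(row: Sequence[float], mask: Sequence[int]) -> List[float]:
    """Keep only the values of one data row whose feature bit is 1."""
    if len(row) != len(mask):
        raise ValueError("row and mask must have the same length")
    return [value for value, bit in zip(row, mask) if bit == 1]


# Toy example with m = 6 features (illustrative values only).
mask = [1, 0, 0, 1, 1, 0]
row = [3.0, 4.0, 1.0, 5.0, 2.0, 4.0]
kept = selected_indices(mask)
reduced_row = apply_mask(row, mask)
```

In a wrapper scheme, each candidate mask would be scored by training a model on the reduced rows and measuring its error; that scoring step is not shown here.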

The solution \( {\mathrm{X}}_{\mathrm{i},\mathrm{j}}^{\mathrm{k}} \) is characterized as in Fig. 1.

Fig. 3 Flow of the proposed scheme

The new solutions are mapped to binary values using Eq. (1).

$$ \mathrm{Z}\left({\mathrm{X}}_{\mathrm{i},\mathrm{j}}^{\mathrm{k}}\right)=\frac{1}{1+{e}^{{\mathrm{X}}_{\mathrm{i},\mathrm{j}}^{\mathrm{k}}}}\quad \mathrm{and}\quad {\mathrm{X}}_{\mathrm{i},\mathrm{j}}^{\mathrm{k}+1}=\left\{\begin{array}{ll}1,& \mathrm{if}\ \mathrm{Z}\left({\mathrm{X}}_{\mathrm{i},\mathrm{j}}^{\mathrm{k}}\right)<\hat{\mathrm{u}}\\ {}0,& \mathrm{otherwise},\end{array}\right. $$
(1)

where \( \hat{\mathrm{u}}\sim \mathrm{U}\left(0,1\right) \) and \( {\mathrm{X}}_{\mathrm{i},\mathrm{j}}^{\mathrm{k}} \) represents the solution at generation k.
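A minimal sketch of the binarization in Eq. (1) follows. The function names are hypothetical, and the numerically stable evaluation of Z is an implementation choice rather than something stated in the paper. Because Z(x) = 1 − σ(x) for the usual logistic function σ, a bit is set to 1 with probability σ(x), so this rule should behave like the more common form that compares σ(x) with a uniform draw.

```python
import math
import random
from typing import List, Sequence


def z_transfer(x: float) -> float:
    """Z(x) = 1 / (1 + e^x) as written in Eq. (1), evaluated without overflow."""
    if x >= 0:
        e = math.exp(-x)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(x))


def binarise(solution: Sequence[float], rng: random.Random) -> List[int]:
    """Set bit j to 1 when Z(x_j) falls below a fresh uniform draw, else 0."""
    return [1 if z_transfer(x) < rng.random() else 0 for x in solution]


rng = random.Random(42)
bits = binarise([-50.0, 50.0, 0.3, -0.7], rng)
```

Strongly negative components therefore almost always deselect a feature, and strongly positive components almost always select it.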

New Solutions generation using Lévy flight

The new solution is obtained from a randomly selected solution using a Lévy flight, as given in Eq. (2).

$$ {\mathrm{X}}_{\mathrm{i},\mathrm{j}}^{\mathrm{k}}={\mathrm{X}}_{\mathrm{i},\mathrm{j}}^{\mathrm{k}-1}+\alpha \times \mathrm{L}\times \left({\mathrm{X}}_{\mathrm{i},\mathrm{j}}^{\mathrm{k}-1}-\mathrm{Optimal}\_{\mathrm{X}}_{\mathrm{i},\mathrm{j}}\right) $$
(2)

where \( {\mathrm{X}}_{\mathrm{i},\mathrm{j}}^{\mathrm{k}} \) is the new solution generated using the Lévy flight; \( {\mathrm{X}}_{\mathrm{i},\mathrm{j}}^{\mathrm{k}-1} \) is an arbitrarily chosen solution from the population; α is the step size; \( \mathrm{Optimal}\_{\mathrm{X}}_{\mathrm{i},\mathrm{j}} \) is the best solution found so far; and L is the step length (Lévy flight vector). The newly obtained solutions are compared with the old ones on the basis of their fitness function values, and the fitter solution is passed on to the population for subsequent processing.
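The update in Eq. (2) can be sketched as below. The paper does not say how the Lévy step L is drawn; this sketch assumes Mantegna's algorithm, a common choice, and assumes that the value 1.5 listed in Section 3 for the Lévy step is the exponent β. All function names are hypothetical.

```python
import math
import random
from typing import List, Sequence


def mantegna_sigma(beta: float) -> float:
    """Scale of the numerator Gaussian in Mantegna's algorithm (assumed here)."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    return (num / den) ** (1 / beta)


def levy_step(beta: float, rng: random.Random) -> float:
    """Draw one heavy-tailed step: u / |v|^(1/beta), u ~ N(0, sigma^2), v ~ N(0, 1)."""
    u = rng.gauss(0.0, mantegna_sigma(beta))
    v = rng.gauss(0.0, 1.0)
    while v == 0.0:  # avoid division by zero in the (vanishingly rare) exact-zero case
        v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


def levy_update(x_prev: Sequence[float], best: Sequence[float],
                alpha: float, beta: float, rng: random.Random) -> List[float]:
    """Eq. (2) per dimension: x + alpha * L * (x - best), with a fresh L each time."""
    return [xj + alpha * levy_step(beta, rng) * (xj - bj)
            for xj, bj in zip(x_prev, best)]


rng = random.Random(7)
candidate = levy_update([0.2, -1.0, 0.5], [0.0, 0.0, 0.5], alpha=0.15, beta=1.5, rng=rng)
```

Note that a component already equal to the best solution is left unchanged by Eq. (2), since the difference term vanishes. The real-valued candidate would then be binarized with Eq. (1) before its fitness is evaluated.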

Discovery of alien eggs

For each cuckoo within the population, a probability matrix is used to identify the alien eggs, as presented in Eq. (3).

$$ {\rho}_{\mathrm{i},\mathrm{j}}=\left\{\begin{array}{ll}1,& \mathrm{if}\ \mathrm{random}\left(0,1\right)<\rho \\ {}0,& \mathrm{otherwise},\end{array}\right. $$
(3)

where \( {\rho}_{\mathrm{i},\mathrm{j}} \) indicates whether a foreign egg is noticed in solution i for the jth dimension of the cuckoo, and ρ is the discovery probability.

Comparing ρ with the uniform random number random(0, 1) determines whether a local random walk is required. After the discovery matrix is computed, new solutions are obtained using Eq. (4).

$$ {\mathrm{X}}_{\mathrm{i},\mathrm{j}}^{\mathrm{k}}={\mathrm{X}}_{\mathrm{i},\mathrm{j}}^{\mathrm{k}-1}+\mathrm{LS}\times {\rho}_{\mathrm{i},\mathrm{j}} $$
(4)

where \( {\rho}_{\mathrm{i},\mathrm{j}} \) is the probability matrix of Eq. (3) and the matrix LS is the local step size, found using Eq. (5).

$$ \mathrm{LS}=\operatorname{rand}\left(0,1\right)\times \left(\mathrm{rand}\_\mathrm{perm}\left(\mathrm{Solution}\left(\mathrm{i}\right)\right)\ \mathrm{rand}\_\mathrm{perm}\left(\mathrm{Solution}\left(\mathrm{j}\right)\right)\right) $$
(5)

where rand_perm() randomly permutes the solution.

The new solution replaces the old one in the succeeding iteration if its objective function (fitness) value is better than that of the present one. The entire procedure, from the generation of solutions to the search for foreign eggs, is repeated for a fixed number of iterations.
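The discovery step of Eqs. (3) to (5) can be sketched as follows. One point is an assumption and is flagged as such: Eq. (5) as printed shows no operator between the two permuted solutions, and this sketch takes their difference, as in the usual formulation of cuckoo search. The function names are hypothetical.

```python
import random
from typing import List, Sequence


def discovery_mask(m: int, rho: float, rng: random.Random) -> List[int]:
    """Eq. (3): 1 where a uniform draw falls below rho, else 0."""
    return [1 if rng.random() < rho else 0 for _ in range(m)]


def local_step(sol_i: Sequence[float], sol_j: Sequence[float],
               rng: random.Random) -> List[float]:
    """Eq. (5), ASSUMING a difference of the two randomly permuted solutions."""
    scale = rng.random()
    a = rng.sample(list(sol_i), len(sol_i))  # shuffled copy of solution i
    b = rng.sample(list(sol_j), len(sol_j))  # shuffled copy of solution j
    return [scale * (ai - bi) for ai, bi in zip(a, b)]


def abandon(x_prev: Sequence[float], sol_i: Sequence[float], sol_j: Sequence[float],
            rho: float, rng: random.Random) -> List[float]:
    """Eq. (4): move only the dimensions flagged by the discovery mask."""
    mask = discovery_mask(len(x_prev), rho, rng)
    step = local_step(sol_i, sol_j, rng)
    return [x + s * p for x, s, p in zip(x_prev, step, mask)]


rng = random.Random(3)
x_new = abandon([0.4, -0.2, 1.1], [0.4, -0.2, 1.1], [0.9, 0.1, -0.3], rho=0.25, rng=rng)
```

With ρ = 0.25, roughly a quarter of the dimensions are perturbed on average; the rest carry over unchanged.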

2.3.2 Flow of proposed scheme

The first step deals with data collection, which includes the design and circulation of the questionnaire. The next step involves data preprocessing. Cuckoo search based wrapper feature selection is applied to the original dataset with 78 features to select a specific subset of features. At the next level, the reduced dataset with the selected features is used to train the regression models with their optimum tuning parameters, employing 5-fold cross-validation [2]. These machine learning models are assessed on various evaluation parameters, viz. Correlation, R-Squared, Root Mean Square Error and Accuracy. Out of the 10 machine learning models employed, a combination of five models is chosen from the possible combinations on the basis of RMSE, namely the combination that yields the better RMSE under stacked generalization. Figure 3 depicts the flow of the proposed methodology.

In a stacking ensemble, the predictions from different base learners are employed as inputs for a meta-learning algorithm [5]. This meta-learner is trained to optimally integrate the predictions from the base models to generate a new set of predictions.

In the stacked generalization framework, the predictions from four base models are exploited by the meta-model to predict the student satisfaction score.

The following are the steps for generating the proposed model with the stacking ensemble technique:

  • Specify a set of 10 base learners (with specific model parameters).

  • State a meta-learner.

  • Train each base learner on the training data for student score prediction.

  • Out of the different combinations, the five-model combinations with better RMSE are identified (each configured as a group of four base models and one meta-learner), which further refines the performance of the prediction model. The final model is chosen using the stacking ensemble. As the data pass through the five models, the models are trained on the data to offer reliable and precise outcomes. The best ensemble model, selected on the basis of RMSE, contributes the final prediction.
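The stacking steps above can be illustrated with a deliberately small sketch. It is not the authors' configuration: it uses two toy base learners (a least-squares line on one feature each) instead of four, a two-weight least-squares meta-learner, interleaved folds, and synthetic data, all of which are assumptions made to keep the example self-contained. The point it shows is that the meta-learner is trained on out-of-fold predictions, so it never sees a base prediction made on a row the base model was trained on.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple

Row = Sequence[float]
Predictor = Callable[[Row], float]
Fitter = Callable[[List[Row], List[float]], Predictor]


def fit_line(feature: int) -> Fitter:
    """Hypothetical base learner: least-squares line on a single feature."""
    def fit(X: List[Row], y: List[float]) -> Predictor:
        xs = [row[feature] for row in X]
        mx, my = sum(xs) / len(xs), sum(y) / len(y)
        sxx = sum((x - mx) ** 2 for x in xs)
        slope = sum((x - mx) * (t - my) for x, t in zip(xs, y)) / sxx if sxx else 0.0
        return lambda row: my + slope * (row[feature] - mx)
    return fit


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def out_of_fold(fit: Fitter, X: List[Row], y: List[float], k: int = 5) -> List[float]:
    """Predict every row with a model trained on the other k-1 (interleaved) folds."""
    preds = [0.0] * len(X)
    for f in range(k):
        train = [i for i in range(len(X)) if i % k != f]
        model = fit([X[i] for i in train], [y[i] for i in train])
        for i in range(f, len(X), k):
            preds[i] = model(X[i])
    return preds


def fit_meta(p1: Sequence[float], p2: Sequence[float],
             y: Sequence[float]) -> Tuple[float, float]:
    """Hypothetical meta-learner: least squares y ~ w1*p1 + w2*p2 (2x2 normal equations)."""
    a = sum(u * u for u in p1)
    b = sum(u * v for u, v in zip(p1, p2))
    d = sum(v * v for v in p2)
    r1 = sum(u * t for u, t in zip(p1, y))
    r2 = sum(v * t for v, t in zip(p2, y))
    det = a * d - b * b
    return (r1 * d - b * r2) / det, (a * r2 - b * r1) / det


# Synthetic data: the target depends on both features, so neither base learner suffices.
rng = random.Random(0)
X = [[rng.uniform(1, 5), rng.uniform(1, 5)] for _ in range(60)]
y = [2.0 * a + 1.0 * b + rng.gauss(0, 0.1) for a, b in X]

oof1 = out_of_fold(fit_line(0), X, y)
oof2 = out_of_fold(fit_line(1), X, y)
w1, w2 = fit_meta(oof1, oof2, y)
stacked = [w1 * u + w2 * v for u, v in zip(oof1, oof2)]
```

By construction, the least-squares meta-learner fits the out-of-fold predictions at least as well as either base learner alone; an honest estimate of the ensemble's error would still require evaluating it on data held out from the meta-learner as well.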

3 Experimental results

The performance of the prediction models was assessed by applying all the approaches listed in Table 3. R [36] is used to implement the machine learning models on the processed data.

The work is implemented in the R language on a standalone machine with an Intel Core i7 processor running at 2.80 GHz, 8 GB RAM and a 64-bit Windows OS. The main stages of the proposed work are as follows:

  • Data procurement and Pre-processing

  • Feature selection using Cuckoo Search Meta-heuristic

  • Prediction using 10 machine learning approaches

  • Selecting best combination of five learners (on the basis of RMSE)

  • Application of the stacking ensemble, using four models at a time as base learners and the fifth as meta-learner

  • Choosing the best ensemble for final testing.

The dataset procured is free from missing values. It contains values on mixed scales, such as percentages, ratios, and average scores. The ratings given by the students of the university are processed by the prediction module. The parameter settings of the underlying CS algorithm are:

  • Initial solution count, N - 20

  • Generations count, Gen - 500

  • Step size, α - 0.15

  • Discovery probability, ρ - 0.25

  • Levy step length, L - 1.5

The normalize() function in R is used for normalization. As the data come from a real survey, the experiments are run on the original data without any treatment of outliers. Table 3 lists the tuning parameters of the models used in the present study along with their required packages.

Various parameters are being analyzed to compute the overall contentment rate within the university. To evaluate the performance of the proposed methodology various evaluation parameters are calculated such as Root Mean Square Error (RMSE), Correlation, R-Squared and Accuracy.

Root Mean Square Error (RMSE)

Root mean square error is the standard deviation of the prediction errors, obtained from the differences between the values predicted by the model and the actual values. The lower the RMSE, the better the performance of the model. The RMSE is given in Eq. (6).

$$ RMSE=\sqrt{\frac{1}{n}\sum \limits_{i=1}^n{\left({y}_i-{\hat{y}}_i\right)}^2} $$
(6)

where n represents the total number of observations, \( {y}_i \) denotes the actual observed values and \( {\hat{y}}_i \) denotes the predicted values.

Correlation(r)

The correlation coefficient measures how strongly two variables are linearly related. For variables x and y, the correlation (r) is computed as in Eq. (7).

$$ r=\frac{N\sum xy-\left(\sum x\right)\left(\sum y\right)}{\sqrt{\left[N\sum {x}^2-{\left(\sum x\right)}^2\right]\left[N\sum {y}^2-{\left(\sum y\right)}^2\right]}} $$
(7)

N - total number of samples

\( \sum xy \) - summation of products of paired sample values

\( \sum x \) - summation of x values

\( \sum y \) - summation of y values

\( \sum {x}^2 \) - summation of squared x values

\( \sum {y}^2 \) - summation of squared y values

R-Squared (R2)

R-squared measures how well the data fit the regression line. For a least-squares linear fit it equals the square of the coefficient of correlation, and it is given in Eq. (8).

$$ {R}^2=\frac{\sum {\left({\hat{y}}_i-\overline{Y}\right)}^2}{\sum {\left({y}_i-\overline{Y}\right)}^2} $$
(8)

Here, \( {y}_i \) denotes the observed values of the dependent variable, \( \overline{Y} \) their mean, and \( {\hat{y}}_i \) the fitted values.

Accuracy

Accuracy illustrates how well the model is able to predict the values of the defined target class. It is computed by comparing the actual values with the predicted values, as in Eq. (9).

$$ Accuracy=\left( mean\left( abs\left( Actual== Predicted\right), 1, 0\right)\right)\ast 100 $$
(9)
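The four evaluation parameters of Eqs. (6) to (9) can be sketched in a few lines. The function names and sample values are illustrative only. Eq. (9) is read here literally, as the percentage of predictions that exactly equal the actual values, which presumes that predictions lie on the same discrete scale as the ratings; that reading is an assumption. The example fits a least-squares line so that the relation between Eq. (8) and the squared correlation can be checked.

```python
import math
from typing import Sequence


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Eq. (6): square root of the mean squared prediction error."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Eq. (7): correlation computed from raw sums."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    return (n * sxy - sx * sy) / math.sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))


def r_squared(actual: Sequence[float], fitted: Sequence[float]) -> float:
    """Eq. (8): explained sum of squares over total sum of squares."""
    mean = sum(actual) / len(actual)
    return sum((f - mean) ** 2 for f in fitted) / sum((a - mean) ** 2 for a in actual)


def accuracy(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Eq. (9), read literally: percentage of exact matches."""
    hits = sum(1 for a, p in zip(actual, predicted) if a == p)
    return 100.0 * hits / len(actual)


# Illustrative data and a least-squares line fitted to it.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
mx, my = sum(x) / len(x), sum(y) / len(y)
slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
fitted = [my + slope * (a - mx) for a in x]

r = pearson_r(x, y)
r2 = r_squared(y, fitted)
```

For predictions that do not come from a least-squares line, Eq. (8) and the squared correlation can differ, so it is worth stating which of the two a reported R2 refers to.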

The cleansed original dataset is fed to the cuckoo search meta-heuristic for feature selection. Table 4 provides the r, RMSE, R2 and accuracy results for the dataset with selected features over the different folds of the 5-fold cross-validation, while Table 5 reports the average results of the 5-fold cross-validation for the different evaluation parameters. The main evaluation parameter used in this work is predictive RMSE.

Figure 4 depicts the average RMSE values of the 10 base machine learning models on the reduced dataset after feature selection. The figure shows that the Gaussian Process with Polynomial Kernel model has the lowest RMSE, 1.18, and the Boosted Linear Model the highest, 4.726%. The K Nearest Neighbour model performs better than the Boosted Linear Model, with a lower RMSE of 3.75.

Fig. 4 Comparison of average RMSE values for different machine learning models

Figure 5 depicts the comparison between machine learning models over the accuracy evaluation parameter on the reduced dataset. The Gaussian Process with Polynomial Kernel is the best performer in terms of accuracy with an accuracy value of 81.67% while Boosted Generalized Linear Model shows the least accuracy value of 72.12% in comparison to rest of the base models.

Fig. 5 Comparison of accuracy values for different machine learning models

These models are further used in the stacking ensemble, with a group of four as base learners and one as meta-learner, leading to different possible combinations of the heterogeneous learners.

All heterogeneous learners are used (in combinations of 5) for the stacking ensemble, which further helps to keep the ensemble from overfitting.

From these diverse predictor combinations, 5 models (in combination) are chosen from the set of 10 base models on the basis of the RMSE values given in Table 6. Table 7 presents the average RMSE values of the stacking ensembles built from the 5-model combination.
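To make the size of this search concrete, the sketch below counts the five-model subsets of ten learners and enumerates the ways one chosen subset can be split into four base learners and one meta-learner. This is one reading of the procedure and is offered as an assumption; the paper does not state whether every subset was evaluated. The five model names are those of the ensemble reported in this section, and no RMSE values are invented here.

```python
from itertools import combinations
from typing import List, Sequence, Tuple

Configuration = Tuple[Tuple[str, ...], str]  # (four base learners, meta-learner)


def configurations(five_models: Sequence[str]) -> List[Configuration]:
    """Let each of the five models take one turn as meta-learner."""
    return [
        (tuple(m for m in five_models if m != meta), meta)
        for meta in five_models
    ]


# Number of ways to pick 5 learners out of 10 (learners stand in as indices 0..9).
n_five_model_sets = len(list(combinations(range(10), 5)))

chosen = [
    "Self Organizing Map",
    "Multilayer Perceptron",
    "Boosted Generalized Linear Model",
    "Gaussian Process with Polynomial Kernel",
    "Partial Least Squares",
]
configs = configurations(chosen)

# The configuration reported as best in this section uses Partial Least Squares as meta-learner.
reported_best = next(c for c in configs if c[1] == "Partial Least Squares")
```

Under this reading there are 252 candidate subsets and five stacking configurations per subset, each of which would be scored by its cross-validated RMSE.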

Figure 6 shows that the ensemble of Self Organizing Map, Multilayer Perceptron, Boosted Generalized Linear Model and Gaussian Process with Polynomial Kernel, with Partial Least Squares as meta-learner, attains the lowest RMSE of 0.373. The combination of Multilayer Perceptron, Partial Least Squares and Boosted Generalized Linear Model did not prove to be a better combination for predicting the contentment score of a student in the university. The proposed stacking ensemble with the cuckoo search meta-heuristic proves beneficial for computing the overall contentment rate of the university. The proposed methodology can help identify the parameters that affect student contentment levels the most.

Fig. 6 Comparison of average RMSE values for different stacking ensembles

4 Conclusion and future scope

The current work focuses on the development of an ensemble machine learning model to predict the overall student contentment score from the students’ perceived level of satisfaction. The proposed approach uses the cuckoo search meta-heuristic to select the best combination of features from the original set of 78 features. The work exploits the stacking ensemble to improve the efficiency of the proposed approach. The results indicate that an ensemble of Self Organizing Map, Multilayer Perceptron, Boosted Generalized Linear Model and Gaussian Process with Polynomial Kernel, with Partial Least Squares as meta-learner, gives the lowest RMSE of 0.373. The current work can be extended by enhancing the dataset with feedback data and reviews captured from social media.