1 Introduction

Education worldwide was affected by the COVID-19 pandemic, and the shift to digital environments was the solution for continuing the teaching and learning processes. The transition was carried out through complete dependence on a learning management system (LMS), blended E-Learning, or traditional blackboard teaching and learning combined with E-Learning [1].

However, these systems have yet to prove their success in engaging users in a quality learning process by providing the necessary information in a timely and effective way [2, 3] with respect to the pedagogical, learning, and adaptation dimensions [4]. To address the need to evaluate E-learning systems, the current study presents a systematic and analytical approach to evaluation criteria using the analytic hierarchy process (AHP) [5, 6], the entropy method (EM) [7], the criteria importance through inter-criteria correlation (CRITIC) [8], and simple derivation weighting (SDW) [9, 10] applied to the proposed evaluation criteria. The present research is undertaken to accomplish two major objectives, namely:

  1.

    To develop and implement an evaluation model that enables making decisions based on indicators with respect to the pedagogical, adaptive, student, and course learning outcomes dimensions from a multiple-criteria perspective.

  2.

    To generalize the proposed model so that it can be applied under three different scenarios reflecting worse and better conditions of system functionality, student behavior, and learning outcomes (LO), in order to evaluate the impacts of the proposed alternatives under different conditions.

This paper is organized as follows. Section 2 presents previous research in the literature on evaluating E-Learning. Section 3 discusses the proposed methodology. Section 4 presents the experimental setup and results, while a discussion of the findings is presented in Sect. 5. Finally, Sect. 6 concludes the present study.

2 State of the Art of Evaluating E-learning

E-learning systems are multidisciplinary by nature. Many researchers from fields such as computer science, information systems, psychology, education, and educational technology have been trying to evaluate E-learning systems. The evaluation process has gained good recognition, even though it is still in its early stages and has dealt only with partial components; each evaluation project or study had its own justification and reasoning. The main categories of the literature are as follows.

Learning during COVID-19 has influenced many experts to evaluate the performance of E-learning tools and websites. In [11], MCDM was integrated with linguistic hesitant fuzzy sets to evaluate and select the best E-learning website for network teaching. The results indicate that MCDM can be practical and effective when used for website selection in a vague and uncertain linguistic environment.

The researchers in [5] employed the analytic hierarchy process (AHP) with group decision-making (GDM) and fuzzy AHP (FAHP) to study the diversified factors from different dimensions of a web-based E-learning system. The MCDM approach produced realistic results in categorizing each dimension and the critical success factors of the E-learning system. This categorization was intended as a decision tool for allocating stakeholders' resources.

On the other hand, when MCDM was used in [12] for problem structuring, it was hampered by weak inter-connections and strong intra-connections between criteria, and it was highly affected by the criteria definitions and by the number of criteria in each cluster (cluster size). Another factor that plays a great role in using MCDM is scale selection [13]. In statistical methods and AHP, indicators are ranked on a rating scale of [0 to 100], where a higher rating indicates a higher performance of the approach under the evaluation criteria.

3 Proposed Methodology

The methodology starts with defining the system evaluation criteria and generating alternatives that relate system capabilities to the goal. It then evaluates all alternatives in terms of the criteria, and after applying a multi-criteria analysis method, one alternative is selected as optimal. This process iterates until a better multi-criteria optimization is reached. As the first step, the experts and decision makers defined the evaluation criteria; we selected the list described in Table 1.

Table 1. The hierarchal structure for evaluation

After defining the criteria, we represent them as scenario indicators which are obtained either from computer simulation models or from model results. The model outputs the priority and performance of the alternatives under the proposed criteria scenario. This study determines the best alternative on the basis of different multi-criteria decision-making (MCDM) methods, including simple additive weighting (SAW) [9] and compromise programming (CP) [14]. Each method is also applied with a set of criteria weights that represent objective judgments derived from experts' reviews as well as subjective preferences of decision makers.

As for the methodology for calculating criteria values, we suggest that each criterion is specified with a range of values from a lower threshold to an upper threshold and measured over a finite amount of time. In this paper the range of the criteria is set to [0.0–2.0], representing the lower and upper thresholds respectively; in certain conditions these values may be fixed to a specific number to serve better judgment. As for criteria weights, one way is to use AHP [5, 6] as an elicitation technique; the other involves assigning performance scores to each alternative in a performance matrix.

In order to derive the preference structure and the performance matrix we used the analytic hierarchy process (AHP) [5, 6], the entropy method (EM) [7], the criteria importance through inter-criteria correlation (CRITIC) [8], and simple additive weighting (SAW) [9]. We chose AHP because of its superiority in judgment and scaling over other methods [14]. We use a performance matrix (PM) whose columns correspond to the criteria (C1, C2, …, Cm) and whose rows correspond to the alternatives (A1, A2, …, An), with the entries \(a_{ij}\) being the indicators for all alternatives across all criteria. Once the matrix is set up, the next step of the decision process is to define the weights (\(w_{1}\), \(w_{2}\), …, \(w_{m}\)) of the criteria. The value of each alternative or solution is calculated according to Eq. (1), where \(A_{i}\) represents the suggested alternative (in our case, the system or solution) and \(A_{ij}\), \(d_{ik}\) denote the performance value of the ith system under criterion j and dimension k, respectively.

$$A_{i} = \sum\nolimits_{k = 1}^{p} {d_{ik} = \sum\nolimits_{j = 1}^{n} {A_{ij} W_{j} } }$$
(1)
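To make Eq. (1) concrete, the following minimal Python sketch scores a set of alternatives with a weighted sum; the matrix values and weights are illustrative placeholders, not the study's data.

```python
import numpy as np

# Performance matrix: rows = alternatives A1..An, columns = criteria C1..Cm,
# with indicator values in the paper's [0.0, 2.0] range (illustrative numbers).
PM = np.array([
    [1.2, 0.8, 1.5, 0.6],
    [0.9, 1.1, 0.7, 1.4],
    [1.6, 1.3, 1.0, 0.9],
])

# Criteria weights (summing to 1), e.g. elicited with AHP or an objective method.
w = np.array([0.40, 0.25, 0.20, 0.15])

# A_i = sum_j a_ij * w_j  (Eq. 1); the best alternative has the largest score.
scores = PM @ w
best = int(np.argmax(scores))
print("scores:", np.round(scores, 3), "best alternative index:", best)
```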

When defining the weights of the criteria, note that weights in MCDM do not have a clear economic significance, but their use provides the opportunity to model the preference structure. That is why they are assigned by decision makers (DM) as importance weights which represent the relative importance of the criteria [15].

In evaluating the current system against the criteria, we use rating-based and ranking-based methods [16]. The rating-based method involves rating an alternative under each dimension and then calculating the sum of its weighted performance under each criterion, giving it an overall weight with which it competes with the other alternatives [19]. In the ranking-based method, the alternatives are compared pairwise with respect to each criterion or dimension to derive normalized relative priorities of each alternative. The overall priority of each alternative and the resulting rankings can then be used as the basis for selection. To apply this to our proposed system, it should be assessed against the criteria or dimensions. For each criterion, if its performance value is less than the pre-defined threshold, this indicates an area which needs improvement. However, if there are many such areas, priority is given to the area with the greatest weighted distance from perfection, defined by:

$$\text{WDP} = \text{CW} \times \text{DW} \times \left( \text{Perfect score} - \text{Performance score} \right)$$
(2)

where WDP: weighted distance from perfection; CW: criteria weight; DW: dimension weight.
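A minimal sketch of how Eq. (2) could be applied to pick the improvement area, assuming the perfect score equals the upper threshold of 2.0 used in this paper; the criteria names, weights, and scores below are hypothetical.

```python
# Weighted distance from perfection (Eq. 2), used to prioritize improvement areas.
PERFECT_SCORE = 2.0  # assumed: upper threshold of the criteria range in this paper

def wdp(criterion_weight, dimension_weight, performance_score,
        perfect_score=PERFECT_SCORE):
    """WDP = CW * DW * (perfect score - performance score)."""
    return criterion_weight * dimension_weight * (perfect_score - performance_score)

# Hypothetical criteria that fell below their thresholds.
areas = {
    "SLO":         wdp(0.30, 0.40, 1.1),
    "Pedagogical": wdp(0.25, 0.35, 1.4),
    "CLO":         wdp(0.20, 0.25, 0.9),
}
# The area with the greatest WDP gets improvement priority.
priority = max(areas, key=areas.get)
print(sorted(areas.items(), key=lambda kv: -kv[1]), "-> improve:", priority)
```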

We depict our implementation of AHP in Table 2.

Table 2. Algorithm 1. Procedure of analytic hierarchy process
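As a complement to Table 2, the sketch below shows one common way to carry out the AHP priority and consistency computations, assuming a reciprocal pairwise comparison matrix on Saaty's 1–9 scale; the matrix entries are illustrative, not the study's actual judgments.

```python
import numpy as np

# Illustrative reciprocal pairwise comparison matrix for three criteria.
A = np.array([
    [1.0, 3.0, 5.0],
    [1/3, 1.0, 2.0],
    [1/5, 1/2, 1.0],
])

# 1. Priority vector: principal eigenvector, normalized to sum to 1.
eigvals, eigvecs = np.linalg.eig(A)
k = np.argmax(eigvals.real)
w = np.abs(eigvecs[:, k].real)
w = w / w.sum()

# 2. Consistency check: CR = CI / RI, judgments accepted if CR < 0.10.
n = A.shape[0]
lam_max = eigvals.real[k]
CI = (lam_max - n) / (n - 1)
RI = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24}[n]   # Saaty's random consistency indices
CR = CI / RI

print("weights:", np.round(w, 3), "consistency ratio:", round(CR, 3))
```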

The next step in the experimental setup is defining the management alternatives, as seen in Table 3. The first alternative, A01, aims to enhance the educational process and the E-learning performance by adding positive modifications to courses; these modifications are determined after an assessment process for each course to identify its weak points. Alternative A02 assumes that the deletion of a course is more appropriate when there is no use in modifying it, or when its contribution to system performance is significantly poor; this is usually followed by a suggestion for a replacement of the deleted course.

Another alternative similar to A01 is A03, in which modifications, or cancelation in the case of A04, are determined and applied to the system's learning agents to enhance their productivity and thereby boost the system's performance; modifications may include a change of platform or additional functionalities, and these actions are automated unless manual intervention is necessary.

Alternative A05 is designed and applied to provide an indicator to the decision makers of whether the student or course LOs have been written correctly or need additional revision; sometimes the regression in performance is caused by wrongly written LOs which could not be achieved.

If the value for alternatives A05 or A06 is high, this is an indicator that improvements are necessary not only for the LOs but also for the courses and agents, respectively.

Table 3. The alternative matrix

As for the reference scenarios, we based them on changing the factors affecting the E-learning system, such as changes in learning outcomes, since these have a direct effect on our system and were initially the goal of the evaluation. The simulation of the evaluation methods is run to identify the impact of changing learning outcomes on our system.

The values that are entered into the performance matrix were obtained by:

  A.

    Pairwise comparison, where it is the responsibility of the decision makers (DM) or experts to make such a comparison between each criterion and the others, usually based on their formal or informal judgment.

  B.

    Random distribution of values until a satisfactory state or result is reached.

The first scenario is based on an increased demand on the system to fulfill the LOs. The second scenario represents the best conditions, containing random values in the range [0.0–2.0] assigned to the criteria and alternatives, assuming a focus on preserving the current LO achievements with respect to an increase in available agents, content, and courses. The decrease in these factors is projected in the third scenario.
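A minimal sketch of how option B above could be used to fill a reference-scenario performance matrix with random values in the [0.0–2.0] range; the 8 × 6 shape follows the setup in Sect. 4, while the seed and the acceptance test are assumptions made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=42)

def draw_scenario(n_alternatives=8, n_criteria=6, low=0.0, high=2.0):
    """Draw one candidate performance matrix for a reference scenario."""
    return rng.uniform(low, high, size=(n_alternatives, n_criteria))

# Redraw until the matrix passes a plausibility check (placeholder for the
# decision makers' "satisfactory state or result" acceptance rule).
PM = draw_scenario()
while PM.std() < 0.1:
    PM = draw_scenario()
print(PM.round(2))
```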

4 Experimental Results

We start with the performance matrix (PM) for each scenario, containing 6 dimensions versus 8 alternatives; thereby we have 3 matrices, shown in Table 4, for the three scenarios.

In scenario (1) the feasible alternatives can be ranked with A03, A04, and A07 at the top; these alternatives focus on agents' modification or removal to enhance performance. They are followed by A01, A05, and A06, which focus on modifying courses and LOs. As for A00 and A02, they were not feasible enough to overcome the domination of the other alternatives, as their values were slightly lower than all the others. This indicates that the top alternatives deserve deeper analysis. The same holds for scenarios (2) and (3).

Table 4. Performance matrix for different reference scenarios

The next step is calculating the criteria weights according to the methods explained before, namely the analytic hierarchy process (AHP), the entropy method (EM), the criteria importance through inter-criteria correlation (CRITIC), and simple derivation weighting (SDW). These objective weights are displayed in Table 5. The objective weights ranked the most important criteria for assessing the alternatives' performance as the SLO, pedagogical issues, and CLO in scenario (1), but this changed in scenarios (2) and (3), where the pedagogical issues gained more importance than the SLO, followed by the CLO; after these come the adaptation criteria in all scenarios.
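For reference, the following sketch shows textbook formulations of two of the objective weighting methods named above, the entropy method and CRITIC, applied to an illustrative performance matrix; it is not the study's data or exact implementation.

```python
import numpy as np

# Illustrative performance matrix (alternatives x criteria), benefit-type criteria.
PM = np.array([
    [1.2, 0.8, 1.5, 0.6],
    [0.9, 1.1, 0.7, 1.4],
    [1.6, 1.3, 1.0, 0.9],
    [0.5, 1.7, 1.2, 1.1],
])

def entropy_weights(X):
    p = X / X.sum(axis=0)                                # column-wise proportions
    e = -(p * np.log(p)).sum(axis=0) / np.log(len(X))    # entropy of each criterion
    d = 1.0 - e                                          # degree of diversification
    return d / d.sum()

def critic_weights(X):
    Z = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))  # min-max normalization
    sigma = Z.std(axis=0, ddof=1)                        # contrast intensity
    R = np.corrcoef(Z, rowvar=False)                     # inter-criteria correlation
    C = sigma * (1.0 - R).sum(axis=1)                    # information content
    return C / C.sum()

print("EM    :", entropy_weights(PM).round(3))
print("CRITIC:", critic_weights(PM).round(3))
```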

Table 5. Criteria weights obtained by weighting methods.

These criteria were considered important, while others such as the system agents or system content were not considered to have a significant role in the decision-making process, as their scores are less divergent. The sum of these 3 criteria weights derived by the CRITIC method and SDW ranges from [0.73–0.92] for all scenarios. One interesting remark is that the pedagogical issues criterion gained importance as the scenarios moved towards better and worse values, as in scenarios (2) and (3).

The results collected from the AHP method are illustrated in Fig. 1. The number of performed pairwise comparisons between the criteria is 6 × (6 − 1)/2 = 15, and the alternatives are evaluated against each criterion to determine those which are not considered important in the objective weighting methods. With respect to the preference judgments of the DMs, who gave priority to environmental effects, the SLO were considered to have strong importance over the CLO and demonstrated importance over the pedagogical issues.

Fig. 1. Results obtained from the analytic hierarchy process.

As for ranking the alternatives, the distance methods calculate the Euclidean distance, demonstrated here by compromise programming (CP), complemented by the simple additive weighting method (SAW); a code sketch of both ranking procedures is given after Table 6. The alternative rankings summarized in Table 6 were as follows:

In scenario (1), using EM, CRITIC, and SDW, alternatives A03, A05, and A06 were given the highest ranking among all alternatives for all methods used. They were followed by alternatives A07, A04, and A01, while the worst were A00 and A02.

Table 6. Alternatives ranks from different MCDMs.
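As mentioned above, the following is a minimal sketch of the two ranking procedures, assuming a Euclidean (p = 2) compromise-programming distance to the ideal point and a plain SAW score; the matrix and weights are illustrative placeholders only.

```python
import numpy as np

# Illustrative performance matrix (alternatives x criteria) and criteria weights.
PM = np.array([
    [1.2, 0.8, 1.5, 0.6],
    [0.9, 1.1, 0.7, 1.4],
    [1.6, 1.3, 1.0, 0.9],
])
w = np.array([0.40, 0.25, 0.20, 0.15])

ideal, worst = PM.max(axis=0), PM.min(axis=0)
# CP distance: smaller is better (closer to the ideal alternative).
cp = np.sqrt((((ideal - PM) / (ideal - worst) * w) ** 2).sum(axis=1))
# SAW score: larger is better.
saw = PM @ w

print("CP ranking :", np.argsort(cp))       # ascending distance
print("SAW ranking:", np.argsort(-saw))     # descending score
```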

As for AHP, the alternatives were ranked according to Table 7 as follows: alternatives A04, A03, and A01 were given greater importance by the DMs, who were concerned with the outcomes criteria in a scenario aimed at enhancing the educational process; the alternative modifying the agents was most preferable, since it costs less and provides better results than the others. Interestingly, when modifying the LOs was not preferred by the DMs, all alternatives including this choice were given low values and were placed at the bottom of the list.

Table 7. Ranking of alternatives using AHP

5 Discussions and Findings

It has become quite important to develop evaluation methods for E-learning systems that remain applicable under changing policies and environmental constraints. It is also necessary to focus this evaluation process on systems including their learning agents of all types and layers, in the form of a simulator capable of handling all kinds of scenarios and producing accurate results for the DMs; this is an important step in improving such systems.

Considering all presented scenarios, the alternative of modifying the system agents was elected as the most efficient management alternative, followed by combining it with modifications to the LOs. In this way it would be possible to solve problems quickly if implemented immediately and to recover from failures in order to improve current performance.

Traditional measures such as canceling a course or agents are not considered efficient for enhancing the learning process and ensuring learners' achievement of the LOs. All the MCDM methods prioritize the alternatives in a similar arrangement; thereby the decision deduced from any one method will be relatively similar to the others, with minor differences based on the method chosen and major differences based on the weights assigned to the criteria.

The EM, CRITIC, AHP, and SDW methods can prove reliable in making use of all the information contained in the raw data; they are therefore considered reasonable ways of weighting criteria and can be expected to produce robust and fair decisions.

6 Conclusion

The success of E-Learning depends highly upon adaptation and the fulfillment of teaching objectives. The factors affecting the success of E-Learning systems are many; therefore it is essential to evaluate them so that the stakeholders, such as educational authorities, students, and instructors, are able to control the negative effects of each of these E-Learning factors and their dimensions in an effective manner. The multi-criteria decision-making (MCDM) approach presented in this paper could prove successful in prioritizing each of the pedagogical, learning, and adaptation dimensions according to the presented evaluation criteria. This categorization of factors will help stakeholders ensure continuous improvement in E-learning systems and the design of more effective learning environments.