1 Introduction

Digital educational resources are means that support teaching and learning. Guaranteeing the usability of these resources allows teachers and students to use them efficiently and effectively. For this reason, digital educational resources assume a technological and pedagogical mediation function.

Digital educational resources are artifacts stored and accessible on a computer, designed for educational purposes, with identity and autonomy with respect to other objects, and with adequate quality standards (Ramos et al., 2010). Yang (2014) notes that these digital resources include digital video, digital audio, multimedia software, websites, learning management systems, simulation programs, and resources that enable online discussions. Examples of these resources are virtual courses, learning objects, educational games, and educational repositories. Open educational resources, in particular, stand out for their adaptability and possibility of modification:

The open provision of educational resources, enabled by information and communication technologies, for consultation, use and adaptation by a community of users for non-commercial purposes (UNESCO, 2002, p. 24).

Different technological approaches determine how to design and implement digital educational resources, guaranteeing their effectiveness, satisfaction, efficiency, operability, and learning. In this sense, the usability characteristic influences the determination of the quality of an educational resource. Ensuring usability is essential in the interactive and communicative process established in b-learning, e-learning, and m-learning (Kumar & Goundar, 2019).

To ensure the correct usability assessment of digital educational resources, standards must be used to guarantee harmony between the criteria and their metrics (Revythi & Tselios, 2019). The most frequently used are:

  • ISO/IEC 9126-1 (understandability, learnability, operability, and attractiveness). It is a model that classifies the quality of software based on six characteristics (Usability, Functionality, Reliability, Efficiency, Maintainability, and Portability), which are manifested as a result of the internal attributes of the software (functional requirements) (International Organization for Standardization, 2001).

  • ISO/IEC 9241-11 (effectiveness, efficiency, and satisfaction). Standard focused on usability and ergonomics (hardware and software) (International Organization for Standardization, 2018).

  • ISO/IEC 25010 (appropriateness recognizability, learnability, operability, user error protection, user interface aesthetics, and accessibility). Standard that determines eight quality characteristics (Usability, Functional Suitability, Performance Efficiency, Compatibility, Reliability, Security, Maintainability, and Portability) to evaluate the properties of a software product (International Organization for Standardization, 2011).

  • IEEE Std. 1061 (understandability, ease of learning, operability, and communicativeness). IEEE Standard for a Software Quality Metrics Methodology.

Regardless of the criteria established in the standards presented above, there are other general criteria related to the Nielsen heuristics (Nielsen, 1994) that allow a more specialized evaluation. In addition, according to Mohd-Khir and Ismail (2019), general criteria are not enough: a usability evaluation must also include criteria of pedagogical usability.

Recent systematic reviews (Vieira et al., 2019) underline the importance of evaluating the usability of digital educational resources according to the characteristics of the students, and reflect on the need to use already validated criteria and metrics, mainly the standards ISO/IEC 9126-1 and ISO/IEC 9241-11 (Yanez et al., 2019).

The models for evaluating usability (Hartson & Pyla, 2012; Pensabe-Rodriguez et al., 2020) are based on traditional Software Engineering procedures (Inspection, Inquiry, and Test Methods). However, current computational models are implemented to reduce the uncertainty in the data offered by the evaluators of digital educational resources (experts, teachers, or students). For this reason, the technologies applied to education use models to further personalize and facilitate learning (Educase, 2018, 2019).

Various papers describe how to assess usability in different environments (evaluation methods: expert evaluation, model evaluation, user evaluation, and location evaluation), including digital educational resources in educational informatics. Some researchers employ criteria associated with standards and pedagogical usability criteria (Alshehri et al., 2019), while others refine these traditional approaches by using computational models (Paunović et al., 2018; Ramanayaka et al., 2019).

The application of analytical, multi-criteria, hierarchical, and fuzzy logic methods contributes, in general, to the prediction of learning and to the design of digital educational resources that are more interactive, collaborative, and highly personalized to the characteristics of the students. Recent research (Zawacki-Richter et al., 2019) confirms trends in the use of computational models in education and identifies the need for critical educational reflection.

The ethical, educational, and social implications of applying computational models in education must be analyzed from an interdisciplinary perspective (Zawacki-Richter et al., 2019). Their introduction has been effective in educational processes; however, professors are sometimes unaware of them.

A recurring limitation is that the usability of digital educational resources is assessed only with traditional engineering methods or only from a pedagogical perspective (Hinojo-Lucena et al., 2019). This is a consequence of excluding current computational models that can improve the praxis and theory of technologies applied to education. As a result, possible discrepancies are observed between educational research and computational research on the evaluation of the usability of digital educational resources.

For this reason, the objective of this research is to determine whether or not there are convergences in educational research and computational research in assessing the usability of digital educational resources.

2 State of the art

2.1 Main strengths and weaknesses

Various systematic reviews published between 2015 and February 2020 identify trends over the last 10 years in the assessment of usability in digital educational resources. They focus on the traditional assessment methods (Inquiry, Test, and Inspection) and cover some computational methods that are used to strengthen the assessment of usability.

In the last decade, some systematic reviews have been published in Scopus and WoS and, to a lesser extent, mappings and meta-analyses (Table 1). In computational research (Yanez et al., 2016, 2019; Salas et al., 2019), the evolution of the main methods for assessing the usability of digital educational resources is determined and, in educational research, the main criteria of pedagogical usability (Abuhlfaia & Quincey, 2018; Gunesekera et al., 2019; Hainey et al., 2016; Kumar & Mohite, 2018; Missen et al., 2019; Murillo et al., 2019; Silveira et al., 2020; Vee Senap & Ibrahim, 2019; Vieira et al., 2019; Sulaiman & Mustafa, 2019).

Table 1 Research present in Scopus or WoS (2015–April 2021)

The referenced studies (Table 1) individually analyze various criteria to assess the usability of digital educational resources, without determining similarities or differences between educational and computational research. Another limitation is that these studies focus fundamentally on two types of digital educational resources, educational games and virtual learning environments; it is therefore vital to analyze empirical studies of other types of digital educational resources. Furthermore, the scientific literature lacks meta-analyses of empirical research on the assessment of the usability of digital educational resources, which hinders the development of interdisciplinary research.

Consequently, it was decided to determine whether or not there are convergences between educational research and computational research in the assessment of the usability of digital educational resources. To fulfill this purpose, the following scientific questions were formulated:

  • Question 1: What general usability and pedagogical usability criteria are used in the assessment of digital educational resources in educational research?

  • Question 2: What methods and algorithms are used to assess the usability of digital educational resources, and what criteria and metrics do they use, in computational research?

3 Method

The objective of this paper is to analyze empirical research to determine whether convergence exists between educational and computational research on the assessment of the usability of digital educational resources. To fulfill this objective, the PRISMA protocol (Moher, 2009) was used to carry out two systematic reviews and answer the two scientific questions.

3.1 Search strategy and quality criteria

  • Selection criteria for scientific questions 1 and 2: English-language empirical research published in Scopus, ACM Digital Library, IEEE Xplore, and Springer was analyzed. The search was limited to peer-reviewed journals and conference proceedings.

  • Exclusion criteria for scientific questions 1 and 2: articles and tutorials with a poor scientific basis were not included, nor were those with limited structure designs or those that do not justify or prove their results.

The search strategy (Tables 2 and 3) was carried out from November 2019 to February 2020 (2561 and 115 initial records, respectively). Duplicate papers were eliminated and the analysis was limited to those published from 2015 to February 2020 (first scientific question) and from 2000 to February 2020 (second scientific question).

Table 2 Initial search string for the first scientific question
Table 3 Initial search string for the second scientific question

3.2 Validity assessment and data extraction

The Keywording technique (Odun et al., 2019) was used and, to ensure external validity, articles that did not argue their results were discarded. For conclusion validity, a procedure was developed in which three reviewers completed the data of the papers according to the Keywording technique. For construct validity, measurement was performed using the known extreme groups approach.

In the selection of the primary studies, the following were analyzed: abstracts, keywords, variables, case studies, and the testing of their hypotheses.

To determine the reliability of the evaluators (A, B, C), Cohen’s kappa coefficient (κ) was used, in which values from 0.40 to 0.60 are characterized as adequate, from 0.60 to 0.75 as good, and those greater than 0.75 as excellent. The consistency between evaluators A and B for the inclusion and exclusion of articles was κ = 0.78; between A and C, κ = 0.81; and between B and C, κ = 0.67. These results are satisfactory: agreement is excellent for two of the three pairs of evaluators and good for the third.

When applying the PRISMA protocol (Figs. 1 and 2), 69 articles were selected (57 related to the first scientific question and 12 to the second question).
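As an illustration of how such an agreement coefficient is obtained, the following minimal Python sketch computes Cohen's kappa for two raters and maps it onto the bands quoted above. The include/exclude decisions are invented for the example and are not the data of this review; the function names `cohen_kappa` and `interpret` are likewise hypothetical.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must label the same non-empty item list")
    n = len(ratings_a)
    # Observed agreement: share of items given the same label by both raters.
    p_o = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_o - p_e) / (1 - p_e)


def interpret(kappa):
    """Map kappa onto the bands quoted in the text."""
    if kappa > 0.75:
        return "excellent"
    if kappa >= 0.60:
        return "good"
    if kappa >= 0.40:
        return "adequate"
    return "below adequate"


# Hypothetical include/exclude decisions for ten screened papers.
rater_a = ["in", "in", "out", "out", "in", "out", "in", "out", "out", "in"]
rater_b = ["in", "in", "out", "out", "in", "out", "out", "out", "out", "in"]
kappa = cohen_kappa(rater_a, rater_b)
print(round(kappa, 2), interpret(kappa))
```

With three evaluators, the same function would simply be applied to each pair (A–B, A–C, B–C).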

Fig. 1 PRISMA diagram associated with Question 1 (slightly modified after Brunton & Thomas, 2012, p. 86)

Fig. 2 PRISMA diagram associated with Question 2 (slightly modified after Brunton & Thomas, 2012, p. 86)

3.3 Coding and data analysis

A form was designed that contained the title of the article; its variables (dependent and independent); year of publication; the indexing database; general criteria of usability and pedagogical usability; and assessment methods. Statistics were computed with Microsoft Excel.

3.4 Limitations

All theoretical studies are prone to epistemological biases; for this reason, we tried to carry out a rigorous analysis, although it is limited by the search strategy. The analyzed papers are written in English, so the study did not analyze publications in other languages. Important databases were chosen, although this also limits the selection and exclusion of scientific papers. Short articles and tutorials were not included. Future research is expected to refine the search strategy by including other languages.

4 Results

To answer the first question, 57 articles (Fig. 3) indexed in Scopus, IEEE Xplore, ACM Digital Library, and Springer (Fig. 4) were analysed.

Fig. 3 Distribution of the 57 papers

Fig. 4 Distribution of articles according to their indexing

The ISO/IEC 9241-11 and ISO/IEC 9126-1 standards were the most frequent, and the criteria of effectiveness and efficiency were the most used (Fig. 5).

Fig. 5 Distribution of the use of international standards

The studies are grouped below according to the criteria used to assess the usability of digital educational resources.

  • ISO/IEC 9241-11. Studies highlight the use of user-centred design together with satisfaction (Harpur & de Villiers, 2015; Koohang & Paliszkiewicz, 2015; Ibarra et al., 2016; Yanez et al., 2016; Varsaluoma et al., 2016; Quiñones & Rusu, 2017; Rumanová & Drábeková, 2017; Sobodić et al., 2018; Alshehri et al., 2019; Eltahir et al., 2019; Mohd-Khir & Ismail, 2019; Hadjerrouit & Gautestad, 2019; Bernardino-Lopes & Costa, 2019; Yanez et al., 2019; Vieira et al., 2019; Alomari et al., 2020).

  • ISO/IEC 9126-1. The authors who apply this standard highlight the use of operability and learnability (Koohang & Paliszkiewicz, 2015; Alsabawy et al., 2016; Chin et al., 2016; Casano et al., 2016; Ibarra et al., 2016; Varsaluoma et al., 2016; Ramírez et al., 2017; Emang et al., 2017; Kumar & Mohite, 2018; Hadjerrouit & Gautestad, 2019; Bernardino-Lopes & Costa, 2019; Wan-Sulaiman & Mustafa, 2019).

  • The IEEE Std. 1061 standard is used by Koohang and Paliszkiewicz (2015) and the ISO/IEC 25010 standard by Wan-Sulaiman and Mustafa (2019).

  • Nielsen’s heuristics (Nielsen, 1994), related to error prevention, consistency and standards, and help and documentation (Jou et al., 2016; Revythi & Tselios, 2019).

  • Research that selects certain criteria from the standards (Alshehri et al., 2019; Ávila et al., 2017; Awang et al., 2019; Beswick & Fraser, 2019; Bozkurt & Ruthven, 2016; Calderon et al., 2018; Chang et al., 2016; Chen, 2018; Chu et al., 2019; Didik-Hariyanto & Bruri-Triyono, 2020; Hadjerrouit, 2015; Harpur & de Villiers, 2015; Ishaq et al., 2019; Pujiastuti et al., 2020; Radovan & Perdih, 2018; Salas et al., 2019; Sarkar et al., 2019; Sobodić et al., 2018; Toda et al., 2015; Tomaschko & Hohenwarter, 2017; Tsouccas & Meletiou, 2017).

These studies are only referenced because, in the opinion of the authors of this research, they state specific criteria of pedagogical usability; the other articles, however, use general quality criteria related to the standards specified above (Table 4).

Table 4 Pedagogical usability criteria

It is worth emphasizing that the assessment of usability in educational research includes both general (standard) and specific (pedagogical usability) criteria. However, is there a convergence between these studies and computational research?

In this particular context (Question 2), little evidence was found: of 115 initial records, only 12 (10.43%) were selected (Table 5). In these papers (n = 12), 33.33% apply criteria associated with pedagogical usability and the rest apply only criteria based on standards. Of the 12 articles, 10 are indexed in Scopus, one in IEEE Xplore, and one in Springer.

Table 5 Computational models to assess the usability of digital educational resources

Description of computational models:

  • (S1) The author uses 10 characteristics and their sub-characteristics associated with pedagogical usability. The procedure follows the basic steps of fuzzy logic: it establishes the input and output linguistic variables; the fuzzy inference rules; the activation conditions; the Takagi–Sugeno fuzzy model; and defuzzification.

  • (S2) The method is a Fuzzy Cybernetic Analytic Network Process (FCANP) composed of four steps. First, the structure networks and the expert method are established. Second, the level of importance of the relationships and the weights are determined using linguistic variables to obtain a fuzzy matrix. Third, a Grand Matrix is established, which includes: (1) the Objective Row, establishing the goal of each criterion (C1, C2 … Cn) of ISO/IEC 9126-1; (2) the Factor Row, which establishes a vector with the comparison between criteria and their objectives and a vector with the relationships between each criterion; (3) the Sub Factor Row, which contains a vector that compares and relates the sub-criteria and metrics (M1, M2 … Mn) with the general criteria (C1, C2 … Cn); and (4) the Alternative Row, a vector that illustrates the comparison of existing alternatives in the relationships between sub-criteria and metrics. Finally, the usability assessment is obtained by multiplying the global importance of each metric by the Alternative Row.

  • (S3) The procedure is based on the Fuzzy Analytical Hierarchy Process (FAHP). As a particular case, a consistency test is included to determine the random consistency index; the result is only accepted if the consistency ratio is less than 1.

  • (S4) For the assessment of the usability of digital educational resources and the selection of learning objects, the Multi-Criteria Decision Analysis Approaches and the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) are used.

  • (S5) This study suggests a fuzzy DEMATEL model (decision-making trial and evaluation laboratory) to determine the interrelations between the assessment criteria of learning management systems. It determines 12 fundamental criteria for the evaluation of usability, focusing on user satisfaction and learnability.

  • (S6) The research uses the Complex Proportional Assessment of alternatives (COPRAS) technique to assess the usability and operability of learning management systems. The authors focus their study on the aesthetics criteria; navigation; user-friendly interface; structuring of information, and customization.

  • (S7) The methodology uses a fuzzy linguistic model by aggregation operators with linguistic information which handles words directly. This work reflects an adaptation of the SERVQUAL methodology (multiple-item scale for measuring customer perceptions of service quality). An ordinal fuzzy linguistic modeling is used in this research to represent the users’ perceptions with words, based on the Linguistic Ordered Weighted Averaging (LOWA) and Linguistic Weighted Averaging (LWA), to assess groupware usability.

  • (S8) This research applies the fuzzy TOPSIS method (Fuzzy Technique for Order Preference by Similarity to Ideal Solution) proposed by Chen (1992).

  • (S9) The study models a fuzzy Mamdani-type system; it uses the Triangular Fuzzy Number formula, the max–min composition rule for each fuzzy rule, and defuzzification.

  • (S10) The authors assess the usability of digital educational resources using the fuzzy logic of the Mamdani algorithm.

  • (S11) The research uses a hybrid method that integrates FAHP and fuzzy DEMATEL (fuzzy decision-making trial and evaluation laboratory method). The AHP-DEMATEL relation is obtained by multiplication with the normalized direct relation matrix produced by the DEMATEL method.

  • (S12) The quality of the learning objects is assessed using multi-criteria decision analysis (MCDA) theory and triangular fuzzy numbers.
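Several of the models above (for example S9 and S10) rely on Mamdani-type inference with triangular fuzzy numbers, max–min composition, and defuzzification. The following minimal Python sketch illustrates that general pipeline only; it does not reproduce any of the reviewed models. The two input criteria, the fuzzy sets, the two rules, and the 0–100 output scale are all assumptions chosen for the example, and centroid defuzzification is assumed because the reviewed studies do not state which defuzzifier they use.

```python
def triangular(x, a, b, c):
    """Membership of x in the triangular fuzzy number (a, b, c)."""
    if x <= a or x >= c:
        # Outside the support, except at the peak of a "shoulder" set
        # where the peak coincides with an end point (a == b or b == c).
        return 1.0 if x == b else 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical fuzzy sets: inputs on a 0-10 scale, output on a 0-100 scale.
LOW, HIGH = (0, 0, 6), (4, 10, 10)
POOR, GOOD = (0, 0, 60), (40, 100, 100)


def mamdani(learnability, satisfaction, steps=101):
    """Two-rule Mamdani inference with centroid defuzzification."""
    # Rule strength = min of the antecedent memberships (fuzzy AND).
    rules = [
        (min(triangular(learnability, *LOW), triangular(satisfaction, *LOW)), POOR),
        (min(triangular(learnability, *HIGH), triangular(satisfaction, *HIGH)), GOOD),
    ]
    num = den = 0.0
    for k in range(steps):
        y = 100 * k / (steps - 1)
        # Max-min composition: clip each consequent, then aggregate with max.
        mu = max(min(strength, triangular(y, *out)) for strength, out in rules)
        num += y * mu
        den += mu
    # None signals that no rule fired for this combination of inputs.
    return num / den if den else None


print(mamdani(9, 9), mamdani(5, 5), mamdani(2, 2))
```

A realistic rule base would cover every combination of input sets so that at least one rule always fires; the two rules here are kept deliberately sparse to make the min and max steps easy to follow.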

5 Discussion

Regarding question 1, educational investigations use general (international standards) and specific (pedagogical usability) criteria to assess usability. These results confirm similar research (Gunesekera et al., 2019; Missen et al., 2019; Salas et al., 2019; Silveira et al., 2020; Vee Senap & Ibrahim, 2019; Wan-Sulaiman & Mustafa, 2019; Yanez et al., 2019). However, we highlight that some of these studies only apply general criteria, which limits the evaluation of the usability of educational resources.

Regarding question 2, similar research (Zawacki-Richter et al., 2019) also highlights that computational studies tend to use general criteria to assess usability. Therefore, interdisciplinary research between educational and computer science is essential. Our objective is to identify the computational methods and techniques most used to evaluate the usability of digital educational resources from an educational point of view and not just a technological one. Therefore, we highlight the computational studies that use pedagogical usability criteria and their main weaknesses.

The two reviews reflect important interdisciplinary aspects, highlighting that:

  • There are various classifications of criteria, with different names, but in essence, the pedagogical usability criteria that are most used in digital educational resources with an emphasis on the design of virtual courses and learning objects are perceived usefulness; self-evaluation; interactivity-interaction platform; personalization; clarity of goals, objectives, and outcomes; effectiveness of collaborative learning; content; learning and support; visual design; navigation; accessibility; interactivity; self-assessment and learnability; compatibility with learning preferences.

  • The pedagogical usability criteria most used in digital educational resources with an emphasis on digital educational games are functional playability, structural playability, audio-visual playability, and social playability.

  • Computational models make little use of pedagogical usability, which is inconsistent when evaluating a digital educational resource. The most frequent criteria, adapted to the names used in educational research, are interactivity-interaction platform; personalization; clarity of navigation; and accessibility.

  • Educational research fundamentally uses pedagogical usability criteria; however, these studies do not always use all or most of the criteria present in the standard they choose. Also, to assess the usability of digital educational resources, they fundamentally use questionnaires rather than a variety of assessment methods or techniques. The scientific literature reiterates the need to include computational methods based on artificial intelligence to reduce the uncertainty of human thought present in usability assessment methods such as questionnaires, inspection of standards, and heuristic evaluation.

Research with an emphasis on education lacks scientific analysis in the assessment of usability from the theoretical and practical alternatives offered by the mathematical and computational sciences since they are permeated with the uncertainty present in the assessment given by the expert, the professor, or the student.

Research with an emphasis on computing lacks pedagogical criteria that underlie the practice of its theoretical model; for this reason, its empirical results may not be well received in the community of educators. Interdisciplinarity provides answers to highly complex social and scientific problems, enriching the frontiers of science. In this sense, educational informatics and technologies applied to education have a great challenge to meet.

The introduction of computational models in education allows us to diversify how the usability of digital educational resources is evaluated and determined. In addition, it helps reduce the “uncertainty of human thought” present in the assessment of usability, which arises from final users, evaluators (experts), and the interpretation of the results obtained with third-party tools, for example, Selenium, WooRank, GTmetrix, Mouseflow, Pingdom Tools, and Crazy Egg. Therefore, we will enunciate the main characteristics and deficiencies of these computational models.

Great progress has been made in evaluating the usability of digital educational resources; however, as a result of this analysis, the introduction of computational models still lacks pedagogical foundations that strengthen their design and praxis in pedagogical practice, which of course is applied through software that supports the computational algorithms designed. These identified computational models are permeated by the disadvantages of multicriteria methods and those based on fuzzy logic, with neutrosophy being an alternative that could respond to this, at least theoretically.

As a trend, computational models use various multi-criteria techniques, the most used being AHP and TOPSIS, to determine the usability of a digital educational resource or to select which resources are the most appropriate (Harshan et al., 2018). The first is used to determine the weight of importance of each of the attributes and the second to rank the alternatives in the usability assessment.
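To make the ranking role of TOPSIS concrete, the sketch below implements the classic (crisp) version of the technique in plain Python. It is an illustration under stated assumptions, not the procedure of any reviewed study: the three resources, the three criteria (effectiveness, task time, satisfaction), the scores, and the weights are invented, and the weights are simply given here rather than derived with AHP.

```python
from math import sqrt


def topsis(matrix, weights, benefit):
    """Closeness scores for alternatives (rows) over criteria (columns).

    benefit[j] is True when larger values of criterion j are better.
    Assumes no all-zero column and at least two distinct alternatives.
    """
    n_crit = len(weights)
    # 1. Vector-normalise each column, then apply the criterion weight.
    norms = [sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n_crit)] for row in matrix]
    # 2. Ideal and anti-ideal points, criterion by criterion.
    columns = list(zip(*v))
    best = [max(col) if benefit[j] else min(col) for j, col in enumerate(columns)]
    worst = [min(col) if benefit[j] else max(col) for j, col in enumerate(columns)]
    # 3. Relative closeness to the ideal: 1 is best, 0 is worst.
    scores = []
    for row in v:
        d_best = sqrt(sum((x - b) ** 2 for x, b in zip(row, best)))
        d_worst = sqrt(sum((x - w) ** 2 for x, w in zip(row, worst)))
        scores.append(d_worst / (d_best + d_worst))
    return scores


# Hypothetical data: effectiveness (0-1), task time in seconds, satisfaction (1-5).
resources = ["R1", "R2", "R3"]
matrix = [[0.9, 40, 4.5], [0.7, 30, 3.8], [0.6, 55, 3.0]]
weights = [0.5, 0.2, 0.3]
benefit = [True, False, True]  # task time is a cost criterion

scores = topsis(matrix, weights, benefit)
ranking = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)
print([resources[i] for i in ranking])
```

The fuzzy variants mentioned in the reviewed studies replace the crisp scores with fuzzy numbers but keep the same three steps.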

Other alternatives are classic fuzzy logic methods, which allow data with a high degree of imprecision to be processed, for example, the opinions of experts, students, and teachers. The most widely used methods are FAHP and the Takagi–Sugeno (T-S) model. The first (FAHP) allows decisions to be made based on criteria in fuzzy environments, and the second models a nonlinear system using a set of local linear models defined by fuzzy rules of the IF–THEN form, each describing a significant behavior of the system as a linear model (Takagi & Sugeno, 1985). In general, their disadvantages are the difficulty of guaranteeing a stable system and the interpretation of the fuzzy values.
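The idea of blending local linear models through fuzzy rules can be sketched as follows. This is a minimal first-order Takagi–Sugeno example with a single input; the membership functions, the two rules, and their coefficients are assumptions made for illustration and are not taken from any reviewed study.

```python
def ts_inference(x, rules):
    """First-order Takagi-Sugeno output for a single input x.

    Each rule is (membership_function, (a, b)) and reads:
    IF x is <set> THEN y = a * x + b.
    The output is the firing-strength-weighted mean of the local models.
    """
    weighted = total = 0.0
    for membership, (a, b) in rules:
        w = membership(x)          # firing strength of the rule
        weighted += w * (a * x + b)
        total += w
    if total == 0:
        raise ValueError("no rule fires for this input")
    return weighted / total


def low(x):
    """Hypothetical 'low satisfaction' set on a 0-10 scale."""
    return max(0.0, min(1.0, (10 - x) / 10))


def high(x):
    """Hypothetical 'high satisfaction' set on a 0-10 scale."""
    return max(0.0, min(1.0, x / 10))


# Hypothetical rule base mapping satisfaction to a 0-100 usability score.
rules = [
    (low, (2.0, 10.0)),   # IF satisfaction is low  THEN y = 2x + 10
    (high, (5.0, 50.0)),  # IF satisfaction is high THEN y = 5x + 50
]

for x in (0, 5, 10):
    print(x, ts_inference(x, rules))
```

Unlike a Mamdani system, no defuzzification over an output universe is needed, because each rule consequent is already a crisp linear function.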

Among these techniques, there is a greater tendency to use AHP, since it allows the problem to be decomposed into different levels; however, to reduce the uncertainty of human thought in the values given by the user, the FAHP model is used. Even when these methods are used with some efficiency in the assessment of the usability of educational resources, they do not escape the weaknesses of fuzzy logic; for this reason, neutrosophy is proposed to address these limitations, offering the following advantages:

  • It provides the user with a more efficient algorithm than classic AHP, FAHP, and intuitionistic fuzzy AHP.

  • It efficiently describes the values of the expert's preference judgment, considering three different degrees: membership, indeterminacy, and non-membership.

  • It points out how to improve inconsistent judgments.

To resolve the limitations of the aforementioned computational models (poor pedagogical base and problems with the AHP and FAHP algorithms) and of the models proposed in educational research (little use of international standards and the presence of uncertainty in human thought at the time of assessment), the following model is proposed as future work (Fig. 6).

  1. Pedagogical characterization: It is important to determine the characteristics of the students (learning styles; learning needs, motivation, etc.), the educational process (face-to-face, b-learning, m-learning, or e-learning model), and the types of educational resources we have or need (Al-Fraihat et al., 2019; Alshehri et al., 2019).

  2. Usability criteria: Declare that criteria of general usability and pedagogical usability will be used and the relationship between them, to create a graph or matrix of dependencies (Fig. 7) (Al-Fraihat et al., 2019).

  3. Computational model: To choose which computational model (Garg & Jain, 2017) is the most appropriate, it is necessary to answer the question: Do you want to assess usability or choose the best alternative among several digital educational resources according to their usability?

Fig. 6 Initial model for usability assessment

Fig. 7 General modeling of criteria to assess usability

It is proposed to use the AHP-N neutrosophic technique. Its general structure is described below.

Based on the structure of this technique (neutrosophic analytic hierarchy process, AHP-N) and its modification (Molina et al., 2020) for the assessment of usability in web applications, an extension is proposed to assess the usability of digital educational resources and to select the best option among several of these resources according to their usability.

Firstly, the foundations of neutrosophy are established as a mathematical theory developed by Florentín Smarandache (Liu & Shi, 2017) applied to decision-making problems (Romero et al., 2020) in coherence with mathematical definitions (Clemen, 1996).

Let N be a set defined as (1):

$$N=\{(T,I,F) : T,I,F \subseteq [0,1]\}$$
(1)

A neutrosophic valuation v is a mapping from the set of propositional formulas to N, i.e., for each sentence p we have \(v(p)=(T,I,F)\).

The central definition applied here is that of the single-valued neutrosophic set (SVNS):

Definition: Let X be a universe of discourse. A single-valued neutrosophic set A over X is an object of the form:

$$A = \left\{ \left\langle x, u_{A}(x), r_{A}(x), v_{A}(x) \right\rangle : x \in X \right\}$$
(2)

where

$$u_{A}(x): X \to [0,1],\; r_{A}(x): X \to [0,1] \;\text{and}\; v_{A}(x): X \to [0,1], \;\text{with}\; 0 \le u_{A}(x) + r_{A}(x) + v_{A}(x) \le 3 \;\text{for all}\; x \in X.$$
(3)

These three functions denote, respectively, the degree of truth membership, the degree of indeterminacy, and the degree of falsity of x in the set A.

For convenience, a single-valued neutrosophic (SVN) number is denoted by \(A = (a, b, c)\), where a, b, c ∈ [0,1] and \(a+b+c\le 3\).

Subsequently, the alternatives are ranked according to the Euclidean distance between SVN numbers; however, hybrid vector similarity measures and weighted hybrid vector similarity measures can also be used for SVN numbers (Liu & Wang, 2018; Romero et al., 2020).
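A minimal Python sketch of this ranking step is given below. It assumes one common normalised form of the Euclidean distance between SVN vectors, namely the square root of the mean squared difference over the three components, and it ranks alternatives by their distance to an ideal SVN number (1, 0, 0). Both choices, as well as the names and sample assessments, are assumptions for illustration; the cited works may define the distance or the reference point differently.

```python
from math import sqrt
from typing import NamedTuple


class SVN(NamedTuple):
    """Single-valued neutrosophic number (truth, indeterminacy, falsity)."""
    truth: float
    indeterminacy: float
    falsity: float


def make_svn(t, i, f):
    """Validate that each component lies in [0, 1] (so their sum is <= 3)."""
    for value in (t, i, f):
        if not 0.0 <= value <= 1.0:
            raise ValueError("SVN components must lie in [0, 1]")
    return SVN(t, i, f)


def euclidean_distance(xs, ys):
    """Normalised Euclidean distance between two equal-length SVN vectors."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("vectors must be non-empty and of equal length")
    total = sum(
        (x.truth - y.truth) ** 2
        + (x.indeterminacy - y.indeterminacy) ** 2
        + (x.falsity - y.falsity) ** 2
        for x, y in zip(xs, ys)
    )
    return sqrt(total / (3 * len(xs)))


IDEAL = make_svn(1.0, 0.0, 0.0)  # assumed ideal: fully true, no doubt, not false

# Hypothetical assessments of two resources on two criteria.
assessments = {
    "R1": [make_svn(0.9, 0.1, 0.1), make_svn(0.8, 0.2, 0.1)],
    "R2": [make_svn(0.5, 0.4, 0.3), make_svn(0.6, 0.3, 0.4)],
}
distances = {
    name: euclidean_distance(vector, [IDEAL] * len(vector))
    for name, vector in assessments.items()
}
ranking = sorted(distances, key=distances.get)  # closest to the ideal first
print(ranking, distances)
```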

The algorithm to select which alternative is the most suitable according to the assessment of the usability of digital educational resources is structured in five stages.

Following the conception of the AHP-N technique (Romero et al., 2020) the algorithm that is proposed is made up of five activities and supported by computational models (Abdel-Basset et al., 2018) to determine the values of the criteria given by the users who will carry out the usability assessment.

5.1 Algorithm objective

Determine the usability of a digital educational resource and rank the different educational resources according to that assessment.

$$C=\{C_1,C_2,\dots,C_n\},\; n\geq 2,\; \text{set of criteria.}$$
(4)

Note: It is worth remembering that there are general criteria (international standards and authors' criteria) and particular criteria (pedagogical usability).

$$E=\{E_1,E_2,\dots,E_t\},\; t\geq 10,\; \text{set of experts.}$$
(5)

Note:

  • It is important to know the perception of students or teachers.

  • Experts in usability and pedagogical usability should be selected.

$$RE=\{R_1,R_2,\dots,R_a\},\; a\geq 2,\; \text{set of digital educational resources.}$$
(6)

5.2 The weighting of expert criteria

Firstly, the usability criteria and their relationships are modeled (Fig. 7)

Subsequently, following the AHP technique, the relative weights of the criteria are determined using the basic scale (Saaty, 1980): Extremely important [8, 9]; Very strong [6, 7]; Strong [4, 5]; Moderately important [2, 3]; Equally important [1]. The intermediate values between two adjacent judgments are 2, 4, 6, and 8. From the answers, a preference matrix is obtained for each respondent (Rodríguez & Martínez, 2013).

All the elements of the matrix are positive. Denoting by \(M_{ij}\) the entry in row i and column j, the entries below the diagonal are filled as follows:

$$M_{ji} = \frac{1}{M_{ij}}$$
(7)

To assign the weights, objective or direct allocation methods are used, the approximate method being the simplest and most classic. Recent studies also show the importance of the geometric mean of the rows of the comparison matrix (Romero et al., 2020, p. 755), a method that is also proposed here to calculate the weights. Subsequently, the consistency ratio (CR) is calculated from the consistency index (CI) of the judgment matrix and the random index (RI) (Abdel-Basset et al., 2018). The result is accepted if CR ≤ 0.10.
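The weighting step can be sketched as follows. This is a minimal illustration, not the authors' implementation: it fills the reciprocal matrix from the upper-triangle judgments, computes the weights by the row geometric mean, and estimates CR with CI = (λmax − n)/(n − 1) and CR = CI/RI. The helper names and the three-criterion example are hypothetical; the RI values are the commonly tabulated ones attributed to Saaty.

```python
from math import prod

# Commonly tabulated random index (RI) values for matrix sizes 1..10.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45, 10: 1.49}

def build_matrix(n, upper):
    """Fill an n x n reciprocal matrix from judgments {(i, j): value}, i < j."""
    m = [[1.0] * n for _ in range(n)]
    for (i, j), value in upper.items():
        m[i][j] = float(value)
        m[j][i] = 1.0 / value  # reciprocal entry below the diagonal
    return m

def geometric_mean_weights(m):
    """Normalize the geometric mean of each row to obtain the weights."""
    n = len(m)
    means = [prod(row) ** (1.0 / n) for row in m]
    total = sum(means)
    return [g / total for g in means]

def consistency_ratio(m, w):
    """Estimate lambda_max as the mean of (M w)_i / w_i, then CI and CR."""
    n = len(m)
    if n < 3:
        return 0.0  # 1x1 and 2x2 reciprocal matrices are always consistent
    mw = [sum(m[i][j] * w[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(mw[i] / w[i] for i in range(n)) / n
    ci = (lambda_max - n) / (n - 1)
    return ci / RANDOM_INDEX[n]

# Hypothetical judgments for three criteria: C1 vs C2 = 3, C1 vs C3 = 5, C2 vs C3 = 2.
matrix = build_matrix(3, {(0, 1): 3, (0, 2): 5, (1, 2): 2})
weights = geometric_mean_weights(matrix)
cr = consistency_ratio(matrix, weights)
print(weights, cr, "accepted" if cr <= 0.10 else "revise judgments")
```

In practice, one such matrix would be built per respondent, and judgments would be revisited whenever CR exceeds 0.10.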

5.3 Vector evaluation

At this stage, each expert \(E_t\) assesses each criterion i for each project j, representing the SVN numbers as vectors. For example, SVN numbers based on triangular fuzzy numbers (neutrosophic SVN) can be used, yielding a neutrosophic triangular scale (Abdel-Basset et al., 2018):

  • Extremely important [8, 9] corresponds to: \(\tilde{9}: \left\langle (9,9,9); 1.0, 1.0, 1.0 \right\rangle\)

  • Very strong [6, 7] corresponds to: \(\tilde{7}: \left\langle (6,7,8); 0.90, 0.10, 0.10 \right\rangle\)

  • Strong [4, 5] corresponds to: \(\tilde{5}: \left\langle (4,5,6); 0.80, 0.15, 0.20 \right\rangle\)

  • Moderately important [2, 3] corresponds to: \(\tilde{3}: \left\langle (2,3,4); 0.30, 0.75, 0.70 \right\rangle\)

  • Equally important [1] corresponds to: \(\tilde{1}: \left\langle (1,1,1); 0.50, 0.50, 0.50 \right\rangle\)

  • Intermediate values (2, 4, 6, 8) correspond to:

$$\tilde{2}: \left\langle (1,2,3); 0.40, 0.65, 0.60 \right\rangle ;$$
$$\tilde{4}: \left\langle (3,4,5); 0.60, 0.35, 0.40 \right\rangle ;$$
$$\tilde{6}: \left\langle (5,6,7); 0.70, 0.25, 0.30 \right\rangle ;$$
$$\tilde{8}: \left\langle (7,8,9); 0.85, 0.10, 0.15 \right\rangle$$

At this point, the neutrosophic comparison matrix is constructed so that:

$$\tilde{M}=\left(\begin{array}{ccc}\tilde{1}& \cdots & \tilde{m}_{1n}\\ \vdots & \ddots & \vdots \\ \tilde{m}_{n1}& \cdots & \tilde{1}\end{array}\right),\ \text{where}\ \tilde{m}_{ji}=\tilde{m}_{ij}^{-1}$$
(8)

To obtain the final weights, the matrix \(\tilde{M}\) is converted into a numerical pairwise comparison matrix, using the formulas established (Molina et al., 2020) for triangular neutrosophic numbers. Subsequently, the score (S) and accuracy (A) degrees of each \(\tilde{m}_{ij}\) are calculated:

$$S\left(\tilde{m}_{ij}\right) = \frac{1}{S\left(\tilde{m}_{ji}\right)}; \quad A\left(\tilde{m}_{ij}\right) = \frac{1}{A\left(\tilde{m}_{ji}\right)}$$
(9)
$$\mathrm{S}(\tilde{\mathrm{a}})=\frac{1}{8}[\mathrm{a}_1+\mathrm{a}_2+\mathrm{a}_3](2+\alpha_{\tilde{\mathrm{a}}}-\upbeta_{\tilde{\mathrm{a}}}-\updelta_{\tilde{\mathrm{a}}})$$
(10)
$$\mathrm{A}(\tilde{\mathrm{a}})=\frac{1}{8}[\mathrm{a}_1+\mathrm{a}_2+\mathrm{a}_3](2+\alpha_{\tilde{\mathrm{a}}}-\upbeta_{\tilde{\mathrm{a}}}+\updelta_{\tilde{\mathrm{a}}})$$
(11)
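Equations (9) to (11) can be checked with a short sketch. It assumes that a triangular neutrosophic number is stored as a tuple ((a1, a2, a3), α, β, δ); the function and variable names are hypothetical, and the sample value is the \(\tilde{5}\) entry of the scale listed above.

```python
def score(a):
    """Score degree S of a triangular neutrosophic number, Eq. (10)."""
    (a1, a2, a3), alpha, beta, delta = a
    return (a1 + a2 + a3) * (2 + alpha - beta - delta) / 8.0

def accuracy(a):
    """Accuracy degree A of a triangular neutrosophic number, Eq. (11)."""
    (a1, a2, a3), alpha, beta, delta = a
    return (a1 + a2 + a3) * (2 + alpha - beta + delta) / 8.0

# Scale entry for "Strong", as listed in the neutrosophic triangular scale.
FIVE = ((4, 5, 6), 0.80, 0.15, 0.20)

# Crisp entry m_12 and its reciprocal m_21, following Eq. (9).
m_12 = score(FIVE)
m_21 = 1.0 / m_12

print(m_12, accuracy(FIVE), m_21)
```

Repeating this for every upper-triangle entry and filling the lower triangle with reciprocals yields the numerical comparison matrix.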

Finally, the following matrix is obtained:

$$M=\left(\begin{array}{ccc}1& \cdots & {\mathrm{m}}_{1\mathrm{n}}\\ \vdots & \ddots & \vdots \\ {\mathrm{m}}_{\mathrm{n}1}& \cdots & 1\end{array}\right)$$
(12)

Once this matrix is obtained, the priority vector is determined from its eigenvector. At this point, the consistency ratio (CR) is calculated according to the classical AHP procedure.
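One way to obtain the priority vector is power iteration on the numerical matrix, sketched below under stated assumptions: the function name, the tolerance, and the sample matrix are hypothetical, and the example uses a perfectly consistent matrix so that the result is easy to verify by hand.

```python
def principal_eigenvector(m, iterations=100, tol=1e-12):
    """Approximate the principal eigenvector and lambda_max by power iteration."""
    n = len(m)
    w = [1.0 / n] * n
    lambda_max = float(n)
    for _ in range(iterations):
        mw = [sum(m[i][j] * w[j] for j in range(n)) for i in range(n)]
        # Because sum(w) == 1, sum(M w) approximates lambda_max at convergence.
        lambda_max = sum(mw)
        new_w = [v / lambda_max for v in mw]
        converged = max(abs(a - b) for a, b in zip(new_w, w)) < tol
        w = new_w
        if converged:
            break
    return w, lambda_max

# Hypothetical consistent 3 x 3 matrix (m_13 = m_12 * m_23).
M = [[1.0, 2.0, 4.0],
     [0.5, 1.0, 2.0],
     [0.25, 0.5, 1.0]]

priorities, lam = principal_eigenvector(M)
ci = (lam - len(M)) / (len(M) - 1)  # consistency index of the classical AHP
print(priorities, lam, ci)
```

For a consistent matrix, λmax equals n and CI is zero; real expert judgments generally give λmax slightly above n, and CR is then obtained by dividing CI by the random index.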

  4. Implementation. It is vitally important to assess the results obtained in the application of traditional engineering methods in the assessment of usability, together with the results reflected by the computational model or algorithm used.

  5. Evaluation. This stage is very important. Its objective is to verify that each stage fulfills its mission. Assessing usability is a process that culminates in the effectiveness of student learning in the use of digital educational resources.

The proposed model has the advantage of using neutrosophy in the evaluation of the usability of digital educational resources, for which it modifies the classical AHP algorithm. This algorithm is chosen because it is the most widely used in computational models owing to its stability and efficiency (Molina et al., 2020). In addition, the proposed model requires the researcher to determine beforehand the specific quality criteria (pedagogical usability) and which general criteria (established in international standards) will be used.

The following limitations of the proposed model stand out: (1) the researcher must determine which pedagogical foundations to assume, since the pedagogical usability criteria to be used depend on them; (2) the development of neutrosophic models requires an understanding of mathematical models and their correct computational implementation; and (3) although the model is based on theoretical and practical foundations established in the scientific literature, it is only in its design phase and has therefore yet to be validated in practice.

6 Conclusions

This paper describes the main usability criteria for evaluating digital educational resources. The work was designed from two perspectives: didactic experiences derived from educational research and computing experiences derived from computational research. Subsequently, an analysis was carried out to determine the differences and congruences between the two types of research.

In educational research, studies that evaluate the usability of digital educational resources integrate general criteria from ISO/IEC 9241-11 and ISO/IEC 9126-1 with pedagogical usability criteria.

In research with a technological emphasis, the use of the ISO/IEC 9241-11 standard emerges as a trend; however, the theoretical models applied lack pedagogical usability criteria. The most widely used computational models are AHP, FAHP, and fuzzy logic.

In the use of international standards (ISO/IEC 9241-11 and ISO/IEC 9126-1), educational and technological research converge; however:

  • the pedagogical usability criteria do not coincide in their denomination;

  • educational research lacks the use of computational models to perfect its methodologies for evaluating the usability of digital educational resources;

  • technological studies that use pedagogical usability criteria (25%) do not describe their metrics or their meaning, nor do they declare or argue their pedagogical foundation.

As a result of the two reviews carried out, the creation of hybrid models to assess usability is proposed as future work. These models should integrate:

  • criteria and metrics of general usability based on international standards (ISO/IEC 9241-11 and ISO/IEC 25010, which updated ISO/IEC 9126-1);

  • criteria approved by the scientific community (example: Nielsen criteria for usability);

  • pedagogical usability criteria;

  • current hybrid computational models based on FAHP methods, the Takagi–Sugeno (T-S) model, and the AHP-N model.

Regarding question 1, the results of the systematic review made it possible to identify which quality criteria are most used to assess the usability of digital educational resources, both general (according to the criteria of international standards) and specific (pedagogical usability). As explained above, the research analyzed in the systematic review shows two deficiencies: (1) educational research tends to use either only general criteria or only specific ones, and (2) using only classical usability evaluation methods does not achieve a complete and interdisciplinary evaluation. Therefore, the model designed as a result of this research includes among its components an AHP-based algorithm to strengthen the standard evaluation procedure.

Finally, regarding question 2, the systematic review reveals that even when the computational models report satisfactory results in their "computational" aspect, they lack specific pedagogical usability criteria, which from an educational perspective is inadequate and insufficient. Therefore, the designed model can surpass the previous ones because (1) it includes and requires the selection and application of both general criteria (established by international standards) and specific criteria (pedagogical usability), and (2) it adapts the classical AHP algorithm to the neutrosophic conception, since this new discipline tends to solve the computational limitations of classical algorithms such as AHP, TOPSIS, and AHP-DEMATEL, among others.

There are limitations when considering the implications of this study. The main limitations are the search period, the selection of certain databases, and the language. The main recommendation to assess the usability of digital educational resources is to design models or procedures that integrate pedagogical, engineering, and computational aspects.

We recommend analyzing articles published from February 2020 to the present (July 2021) to confirm or refute the results of this paper. Virtual education as a social process is constantly evolving and changing. Therefore, the constant analysis of the scientific literature refines our paths toward the improvement of educational technology.