
1 Introduction

E-Learning involves activities that assist the teaching-learning process using Information and Communication Technologies (ICT). There are different ways to implement e-learning strategies. The most recent ones are based on online courses (distance learning) and courses following some kind of "mixed" modality (a.k.a. Blended Learning). In the first type, students only use digital platforms and do not physically attend their campus. In the second, a certain number of classes occur in a digital environment whereas the rest take place in person [1]. The e-learning process can be synchronous or asynchronous. The synchronous model involves a learning modality in which both instructors and students are online and communicate directly with each other in real time. In the asynchronous model, instructors and students interact intermittently, and not necessarily in real time, through discussion groups, e-mail, chats, and other communication solutions integrated within a given OLE (Online Learning Environment). E-learning normally relies on Learning Management Systems (LMS). An LMS is a type of software that supports the administration, documentation, delivery, and tracking of the activities established during the teaching-learning process, allowing teachers and students to communicate and interact efficiently and effectively [2]. Some examples of LMS are Moodle, Sakai, dotLRN, ROLE, ATutor, Claroline, Dokeos, Subtopic, OpenACS, ILIAS, OpenUSS, Doubtfire, Totara, and OpenClass, among others.

LMS are not exclusive to educational environments such as universities and schools, but have also been adopted by companies, hospitals, government agencies, and other institutions. Given their relevance, evaluating these platforms to improve their quality and performance in terms of usability, UX, and success in supporting the educational process is of the utmost importance. The evaluation of LMS can improve the way teachers and students interact and ideally overcome difficulties that get in the way of successful learning, since usability and UX are perhaps the most important factors in that regard.

Usability concerns the pragmatic aspects of task execution by users, specifically, the interactions between two types of users, students and teachers, through an interface. UX is related to the subjective, phenomenological quality of the interaction with and through a digital system, i.e., aspects related to emotions and aesthetic value experienced by users [3].

When the teaching-learning process occurs through an LMS, students need to set aside time to learn how to use the system while simultaneously attending their classes. Therefore, if an LMS does not offer good usability, students will spend time not only overcoming the learning curve of the system but also battling issues caused by its faulty design. In many cases, students and teachers do overcome these obstacles and continue using the platform, albeit ending up with a negative experience. It follows that a more pleasant and satisfying LMS environment (i.e., one that provides a better UX) tends to be more stimulating for students [2, 4].

Nonetheless, there are still significant difficulties when it comes to evaluating the usability and UX of LMS and, ultimately, proposing alternatives to improve their interfaces [2]. Although some approaches have sought to integrate existing heuristics and pedagogical principles to improve the quality of LMS interfaces [5,6,7], software development culture is still reluctant to consider usability, user experience, and pedagogical principles when developing systems. Consequently, evaluation methods for LMS need to be improved and perhaps need to incorporate more sophisticated empirical procedures, for example, automated analysis, technological resources, or even artificial intelligence (AI) [2, 4, 7, 8].

This paper describes a Systematic Mapping (henceforth SM) of the main approaches used over the last decade (2010–2020) to evaluate the usability and UX of LMS. The article is organized as follows: part one presents a brief introduction to LMS; part two presents some studies about LMS; part three presents the methodology adopted to carry out the SM; part four presents a discussion of the results and a triangulation with a reference study; finally, part five offers some concluding remarks.

2 Related Work

In recent years, several studies have attempted to systematize the methodological approaches used to evaluate the quality of interaction in LMS [4, 9,10,11]. For example, a Literature Review (LR) carried out by [10] addressed the relationship between ergonomics and usability in e-learning contexts. One of the most important conclusions of this study is that, instead of using evaluation methods adapted for educational systems, most studies are conducted using generic software evaluation metrics.

The authors note the constant use of consolidated methods to assess the usability of learning systems, such as interviews and questionnaires, or a combination of methodological approaches related to User Centered Design (UCD), Cognitive Walkthrough (CW), among others. On the other hand, in 2012 these authors identified the start of a paradigm shift for LMS evaluation, observing that several researchers began to analyze ergonomic principles of usability focusing on e-learning [10].

For their part, [9] developed a systematic mapping (SM) to find publications related to the usability of mobile e-learning. The study highlights the absence of appropriate frameworks or guidelines to assess usability and educational factors in m-Learning (Mobile e-Learning) systems. The authors proposed a model for m-learning applications covering the system development phases, considering factors such as educational goals, usability, and student experience. Subsequently, the survey conducted by [11] proposed an update to Cota's SM [9]. The authors mention that, although it has been possible to perceive an advancement in the techniques and evaluation approaches for m-Learning environments, there were still not enough frameworks or guidelines to improve aspects related to UX, usability, or the pedagogical context. As a result, the authors proposed a framework to evaluate mobile learning environments.

The systematic mapping carried out by [4] focused on publications that evaluated LMS from the perspective of the usability and UX of desktop and mobile applications. The SM analyzed 62 publications, retrieving information such as origin, type, method of execution of the study, existence of learning factors, technical application restrictions, and resource availability. The authors concluded that there was still insufficient evidence to indicate a most appropriate method for evaluating learning environments. The SM reveals the need for further research to find better techniques to evaluate the complexity of OLEs.

The above studies, however, were not included in the sample analyzed by this paper, precisely because they are systematic literature reviews, but they were nonetheless used as a reference for the development of our own SM.

3 Research Method

In general terms, a systematic mapping is a rigorous and well-defined research method used to identify, evaluate, and interpret the largest possible number of relevant publications about a given topic. An SM yields results that are less influenced by the researchers' biases. It also makes it possible to capture more information about the variety of methods applied within a certain area of research [12]. Our SM expands on the approach followed by Nakamura's [4] "Usability and User Experience Evaluation of Learning Management Systems". In addition to identifying LMS evaluation methods, our SM surveyed the criteria and metrics used over the last decade, as follows:

Objective of the Systematic Mapping:

To obtain evidence about, and identify gaps in, the research techniques used to evaluate LMS from the standpoint of usability and UX over the last 10 years. Secondary objectives include determining which criteria have been used to assess the quality of LMS interfaces and of the interaction between students and teachers.

Working Protocol:

The process consisted of two steps: definition of the scientific knowledge bases to be consulted, and definition of the data retrieval criteria, such as language, document types, date of publication, and large clusters of keywords.

Knowledge Bases:

B-ON (Online Knowledge Library) was used to retrieve the publications. The library indexes the following databases: Academic Search Complete, American Chemical Society, American Institute of Physics, Annual Reviews, Association for Computing Machinery, Business Source Complete, Coimbra University Press, Current Contents (ISI), Elsevier, Essential Science Indicators (ISI), Eric, IEEE, Institute of Physics, ISI Proceedings, Journal Citation Reports (ISI), LISTA.

Research Criteria:

Knowledge area, publication date, document type (PDF, DOC, etc.), language.

Data Retrieval Goal:

The SM used Basili's GQM (Goal-Question-Metric) paradigm [13] as a reference, as shown in Table 1.

Table 1. Research objectives according to Basili's GQM (Goal-Question-Metric).
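As an illustration of how the GQM structure can guide data extraction, the sketch below encodes a goal with its questions and metrics as a small Python data structure. The goal, question, and metric texts are placeholders, not the actual contents of Table 1.

```python
# Illustrative encoding of Basili's Goal-Question-Metric hierarchy.
# The goal, question, and metric texts are placeholders and do NOT
# reproduce the actual contents of Table 1.
gqm = {
    "goal": "Characterize usability/UX evaluation of LMS (2010-2020)",
    "questions": [
        {
            "question": "Which techniques are used to evaluate LMS?",
            "metrics": ["frequency of each technique per publication"],
        },
        {
            "question": "Which criteria ground the evaluations?",
            "metrics": ["frequency of cited criteria (e.g., Nielsen, "
                        "ISO 9241-11) per year"],
        },
    ],
}

# Walk the hierarchy: each question is answered by one or more metrics.
for q in gqm["questions"]:
    print(q["question"], "->", "; ".join(q["metrics"]))
```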

Search Method:

Seventeen guidelines were defined to perform the Data Retrieval (DR), as shown in Table 2.

Table 2. Data retrieval questions.

Retrieval Research Languages:

Portuguese and English.

Fig. 1. Papers by filters.

Database Research Terms:

The procedures described by [14] were used to define the research terms. They suggest defining the following parameters: population, intervention, comparison, outcome, and context. Based on that, the following set of parameters was defined: Population: LMS; Intervention: techniques, tools, and processes; Comparison: does not apply, since the objective is to characterize the techniques; Outcome: assessment of usability, UX, or pedagogical criteria for learning management systems; Context: does not apply, because there is no comparison to determine the context. The research terms were divided into two clusters: LMS, concerning the different ways of writing the research terms and their synonyms; and Usability + UX, relating to the different types of research approaches for these two terms. There was a further subdivision considering the defined languages. The terms were identified using authors referenced within HCI. An exploratory study (EE) carried out previously helped in this process [2]. This phase used a semantic analysis tool to identify terms related to the keywords found in the previous exploratory study: UX Analysis, Usability Analysis, LMS, and Pedagogical Criteria. The search string went through several tests until we found one that returned results semantically close to the terms of interest. During refinement, we noticed that the search term "pedagogical criteria" and its variants returned a considerable number of results outside the scope of the Systematic Mapping. Therefore, we decided to remove it from the Data Retrieval phase, without affecting the application of the protocol.
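As a minimal sketch of how the two term clusters can be combined into a boolean search string, consider the snippet below. The specific synonyms listed are illustrative assumptions; the actual string used in this SM is not reproduced here.

```python
# Sketch: assembling a boolean search string from the two term clusters
# (LMS variants, and Usability + UX variants). The synonym lists below
# are illustrative assumptions, not the exact terms used in the SM.
lms_terms = ['"learning management system"', 'LMS',
             '"virtual learning environment"', '"e-learning platform"']
quality_terms = ['usability', '"user experience"', 'UX']

def or_group(terms):
    """Join alternative spellings and synonyms with OR, in parentheses."""
    return "(" + " OR ".join(terms) + ")"

# Clusters are combined with AND, as in a typical database search string.
search_string = or_group(lms_terms) + " AND " + or_group(quality_terms)
print(search_string)
```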

Application of the work protocol:

The application of the search protocol to the scientific databases resulted in 326 publications.

Papers Selection:

At this stage our goal was to ensure the retrieval of papers exclusively related to usability, UX, and LMS. The process was conducted in three stages, using three sequential filters: Filter 1 (F1) - applied to the title, abstract, and keywords of the papers retrieved by the search string. After this first step, 217 publications remained. Filter 2 (F2) - the complete reading of all items remaining from F1. After applying F2, 109 publications were left. Filter 3 (F3) - corresponds to the Data Treatment (DT) and happened during the extraction of information from the papers (Data Extraction - DE), trying to identify the answers to the Systematic Mapping questions. After applying F3, 77 publications remained. Figure 1 shows the funnel of retrieved publications and their respective phases.
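The funnel can also be summarized programmatically; the sketch below simply replays the counts reported above, with the stage labels standing in for the manual screening that was actually performed.

```python
# Sketch of the selection funnel (cf. Fig. 1). The counts come from the
# text above; the stage labels stand in for the manual screening steps.
funnel = [
    ("Search string applied to B-ON", 326),
    ("F1: title, abstract, keywords", 217),
    ("F2: full-text reading", 109),
    ("F3: data treatment / extraction", 77),
]

previous = None
for stage, remaining in funnel:
    note = "" if previous is None else f" (excluded {previous - remaining})"
    print(f"{stage}: {remaining}{note}")
    previous = remaining
```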

Inclusion and Exclusion Criteria:

Throughout every stage, the inclusion criteria (IC) and exclusion criteria (EC) were applied as indicated in Table 3 and Table 4. Four “inclusion criteria” and nine “exclusion criteria” were defined. The exclusion and inclusion criteria were adapted from our reference study, albeit with some changes [4].

Table 3. Criteria used to include papers after string application.
Table 4. Criteria used to exclude papers after string application.

Considerations About the Inclusion/Exclusion Criteria:

Regarding Exclusion Criterion 2 (EC-2): the systems denominated MOOCs (Massive Open Online Courses) are learning platforms focused on massive, unrestricted access by students [15]. The main MOOC representatives are Coursera, Udacity, Udemy, and Open University. There are obvious difficulties in accessing the administrative environment of these platforms, which are mostly closed-source (Exclusion Criterion 8, EC-8). Additionally, it is not possible to identify whether a course is absent from a MOOC because it is of no interest to the teacher/instructor or due to usability and UX issues [4]. In the exploratory study (EE) it was possible to verify the categorization of LMS in the context of Online Learning Environments [2]. OLEs can be categorized into two groups: LMS and MOOC. LMS can be subdivided into open source (e.g., Moodle, Sakai, dotLRN, ROLE) and closed source (e.g., Blackboard, Ping Pong, Canvas LMS, McGraw-Hill Education, Blackbaud). Based on this, Exclusion Criterion 8 (EC-8) was defined considering the specifics of closed-source applications.

Data Extraction:

After selecting the publications with Filter 2 (F2), data extraction was based on the complete reading of each publication. The extraction followed a set of pre-defined questions, according to the work of [16] and [4]. In addition, new questions were added for this study. The purpose of this phase was to ensure that the same extraction criteria were applied equally to every publication.

Data Treatment:

The data obtained were processed with Microsoft Power BI and Excel. The first step was to find the frequency of publications related to the evaluation of usability/UX in the LMS context between 2010 and 2020. This study was carried out in September 2020; therefore, the data for this year may be incomplete, which may explain the low rate of publications in that period. 2012 and 2019 have the lowest publication counts. The b-on library (Biblioteca de Conhecimento Online, available at https://www.b-on.pt) was used, and it covers a considerable set of scientific indexers. However, the papers analyzed were restricted to this database. The data of this phase can be found at: https://jc7.co/dgc21.
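As a minimal sketch of the yearly frequency computation (here in Python/pandas rather than Power BI), assuming a hypothetical CSV export of the extraction spreadsheet with a year column:

```python
# Sketch of the yearly frequency count performed in Power BI/Excel.
# "sm_extraction.csv" and the "year" column are assumptions about the
# layout of the extraction data (available at https://jc7.co/dgc21).
import pandas as pd

papers = pd.read_csv("sm_extraction.csv")  # hypothetical export
per_year = (papers["year"]
            .value_counts()                 # publications per year
            .reindex(range(2010, 2021), fill_value=0)
            .sort_index())
print(per_year)
```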

Table 5 provides an overview of the questions used in the application of the protocol.

Table 5. SM application protocol questions.

The model used was adapted from [4]. At the beginning of the Data Extraction (DE), corresponding to F2, there were 109 publications. During the refinement process, 32 articles were excluded according to the established exclusion criteria. The reasons for exclusion were mostly related to Exclusion Criteria 1 and 9 (EC-1 and EC-9). Some articles were excluded because they were literature reviews themselves. Others were excluded because they analyzed different learning environments, according to EC-1. In some cases, it was not possible to obtain enough information about the process applied in the study. In other situations, studies were at an initial stage, making their process difficult to analyze. Some articles used more than one evaluation technique. In these scenarios, unlike the reference study, we chose to identify the other techniques applied as well. Because of this approach, QE02 (Question 2) was subdivided into three parts: QE02-1 - Primary Technique Type (PTT), QE02-2 - Secondary Technique Type (STT), and QE02-3 - Tertiary Technique Type (TTT). In this context, 116 techniques were identified across the 77 publications, with repetitions. In this study, we also decided to verify the research categorization of the papers. This was defined in QE07 (Question 7) - Category of Research (CR), which was subdivided into QE07-1 - Primary Research Category (PRC), QE07-2 - Secondary Research Category (SRC), and QE07-3 - Tertiary Research Category (TRC).
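A minimal sketch of how the 116 techniques (with repetitions) can be tallied from the three subdivided fields, assuming hypothetical column names PTT, STT, and TTT in the extraction dataset:

```python
# Sketch: tallying evaluation techniques across the three subdivided
# fields of QE02. The column names PTT/STT/TTT mirror the abbreviations
# in the text but are assumptions about the dataset layout.
from collections import Counter

import pandas as pd

papers = pd.read_csv("sm_extraction.csv")  # hypothetical export
counts = Counter()
for col in ["PTT", "STT", "TTT"]:
    counts.update(papers[col].dropna())    # repetitions are kept

print(sum(counts.values()))                # 116 techniques in this SM
print(counts.most_common(5))               # most frequent techniques
```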

4 Preliminary Results

Through our Systematic Mapping we obtained 77 publications extracted after Filter 3 (F3) was applied. Geographically speaking, there was a higher concentration of scientific work from Asia (30), followed by Europe (23) and South America (9). By country, there was a greater concentration of scientific work from Brazil (9), Indonesia (7), and Malaysia (6). For the other countries, we found between one and four studies over the decade. The year 2015 had a peak of 14 publications, the highest in the sample, followed by 2010, 2011, and 2016, which corroborates the results found in [4].

The Systematic Mapping gathered studies with different methodological approaches to investigating the quality of LMS from the perspective of usability and UX. The approaches based on Heuristic Evaluation (HE) are still largely or entirely based on Nielsen's heuristics. They are frequently used alone or in combination with other methods, such as user testing, inspection by evaluators, questionnaires, and other qualitative techniques. On the other hand, some initiatives were more related to the pedagogical field. It was also possible to verify a certain overlap in the scientific production. Many studies make repeated use of standards and methods that are already consolidated. In a way, this can help validate the approaches, but it can also become an obstacle to innovation.

Moreover, the use of new approaches and techniques is still restricted. The analyses and criteria are still limited to qualitative, already consolidated approaches, and quantitative studies are less frequent. The use of automated evaluation processes based on computational strategies, potentially leading to automated (or at least semi-automated) analysis, is still scarce. The criteria used in each study to evaluate the LMS interface were counted, and from this result it was possible to identify the authors most frequently adopted by the analyses. Preliminary results point to the frequent use of the criteria established by Nielsen's studies, followed by the metrics established by ISO 9241-11. Nielsen's heuristics were used more frequently until the middle of the decade, with a higher concentration in the first half of this period, as shown in Table 6. The same is true for the ISO 9241-11 criteria. Interestingly, our data show that the year 2017 marks a reduction in the presence of these references.

Table 6. Frequency of authors cited per year (2010–2020). Data processed and extracted with Power BI.

5 Conclusion

The preliminary results of this Systematic Mapping confirm the significant use of Nielsen's heuristics to evaluate Learning Management Systems (LMS). This is especially evident in the early years of the decade (2010–2020). Towards the end of the decade, it is possible to see a paradigm change in the research field, with the emergence of new authors whose theories were considered as a basis for evaluating LMS interfaces. The study presented here is still under development. However, we expect that it will contribute to identifying the methodological approaches most used during the last ten years to evaluate LMS. A systematization of the criteria used by the studies to evaluate LMS is underway and can be a good starting point for research aimed at improving the quality of interaction in OLEs such as LMS. The data used in this research are available at: https://jc7.co/dgc21.