
1 Introduction

Learning analytics makes use of student data to improve learning and the environment in which it takes place. This improvement is achieved via data-driven interventions, an important step in the learning analytics process. Presently, much learning analytics activity is aimed at enhancing academic achievement [16]. Learning, however, is more than its mere outcome in the form of scores and grades. In this study, we investigate which other measures of affected learning can be identified in the existing learning analytics literature. We conduct a systematic literature review and synthesize the results to answer the research question: In what way does existing learning analytics literature measure affected learning?

We structure the results of our study based on a classification scheme derived from prevalent learning theories. Our research supports both academics and practitioners, as it provides (1) different types of affected learning that can be the target of learning analytics activities and (2) actual measures of these effects, which help to determine the benefits of learning analytics for learning.

We structure the remainder of this paper as follows. First, we provide a short overview of the background of the study. We then describe the methodology in detail, followed by an elaboration on the results. Finally, we provide recommendations for future research and discuss the limitations of our study.

2 Background

In this section, we give an overview of learning analytics, its process, and its goals. Furthermore, we introduce a classification scheme used to classify and analyze the key studies found during the literature review.

2.1 Learning Analytics

Learning analytics is “the measurement, collection, analysis and reporting of data about learners and their contexts, for purposes of understanding and optimizing learning and the environment in which it occurs” [43]. Optimized learning can be achieved in various ways, e.g., by personalizing learning, enhancing instructor performance, or improving curricula [33]. Learning analytics takes place at the micro and meso levels within educational institutes, focusing on the learner and their surroundings [48]. Analytics at the macro (or institutional) level is usually referred to as academic analytics.

The Learning Analytics Cycle [10] describes the process of turning data into action and involves four steps: (1) students generate learner data, (2) the infrastructure captures, collects, and stores the data, (3) the collected data is analyzed and visualized, and (4) data-driven pedagogical interventions are designed and applied based on the analysis and visualizations (see Fig. 1). The cycle then starts again, enabling the measurement of the effects caused by the performed interventions. This analysis, however, requires measures that allow for comparison of affected learning.

Fig. 1. Learning Analytics Cycle [10].
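As a minimal illustration, the cycle can be sketched in code. This is our own simplified rendering, not part of [10]; all class names, fields, and the example “nudge” intervention are hypothetical placeholders:

```python
from dataclasses import dataclass, field
from typing import List

# Simplified sketch of the Learning Analytics Cycle [10]; all names and
# the example intervention are hypothetical, chosen only to make the
# four steps and the feedback loop explicit.

@dataclass
class Student:
    name: str
    clicks: int = 0  # a stand-in for any learner-generated activity data

    def generate_activity_data(self) -> dict:
        # Step (1): students generate learner data.
        return {"student": self.name, "clicks": self.clicks}

@dataclass
class Infrastructure:
    storage: List[dict] = field(default_factory=list)

    def capture(self, records: List[dict]) -> None:
        # Step (2): the infrastructure captures, collects, and stores the data.
        self.storage.extend(records)

    def analyze(self) -> dict:
        # Step (3): the collected data is analyzed (and, in practice, visualized).
        n = max(len(self.storage), 1)
        return {"mean_clicks": sum(r["clicks"] for r in self.storage) / n}

def run_cycle(students: List[Student], infra: Infrastructure) -> None:
    infra.capture([s.generate_activity_data() for s in students])
    analysis = infra.analyze()
    # Step (4): a data-driven pedagogical intervention based on the analysis;
    # here, a placeholder nudge for students with below-average activity.
    for s in students:
        if s.clicks < analysis["mean_clicks"]:
            print(f"Nudge {s.name}: activity below course average")
    # Running the cycle again would allow measuring the intervention's
    # effect, which requires comparable measures of affected learning.
```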

A systematic review of learning analytics literature by Papamitsiou and Economides [35] classifies studies by learning setting, analysis method, and research objectives. That study shows that learning analytics uses a wide variety of techniques and is not limited to Learning Management Systems (LMSs), but can also be applied within other Virtual Learning Environments (VLEs), such as web-based education, social learning, and cognitive tutors. The objectives of the studies are diverse and include, e.g., student behavior modeling, prediction of performance, prediction of dropout and retention, and increased (self-)reflection and (self-)awareness. These goals are achieved via pedagogical interventions. Interventions are an important part of the learning analytics process, since in this step information is turned into action. Learning analytics interventions can be defined as “the surrounding frame of activity through which analytic tools, data, and reports are taken up and used” [52]. In our research, we analyze in what way the effects of interventions are measured by selecting key studies that report on empirical results, often from (quasi-)experimental settings and case studies applying data-driven pedagogical interventions. To categorize the various types of measures, we first synthesize a classification scheme from the extant literature.

2.2 Classification Scheme

To evaluate whether learning is indeed affected, we must be able to measure the effect of such interventions by quantifying the observed difference in learning. This raises the question in what way(s) learning can be measured. Below, we discuss several prevalent learning theories from this perspective.

Biggs’s 3P model [6] describes the educational system based on three factors: (1) presage factors which affect learning, (2) the learning process, and (3) the desired learning outcomes – see Fig. 2.

Fig. 2. 3P model [6].

Within Presage, Biggs distinguishes between students and the teaching context, a distinction also present in the learning analytics literature. Siemens and Long [44] distinguish between learning analytics at the course level and the departmental level. Furthermore, Van Barneveld et al. [48] propose a conceptual framework in which learning analytics focuses on both learners and the department. The latter is a broad concept, as it includes the context in which the learning takes place or, in other words, the ‘environment’ to which the aforementioned definition of learning analytics refers. Departmental variables may capture more long-term effects of learning analytics, which have been posed as an important feature of future learning analytics research [16].

Learning can be described either as a process or as the outcome of this process: a (relatively permanent) change in a person’s behavior, knowledge, and/or skills [8]. As Kolb [24] puts it: “learning is best conceived as a process, not in terms of outcomes”. Learning outcomes – or ‘products’, to use the term provided by Biggs [6] – are often operationalized by performance indicators derived from the assessment of learning, such as grades and degrees. A concept closely related to performance is (academic) achievement: performance outcomes that indicate the extent to which a person has accomplished specific goals that were the focus of activities in instructional environments, specifically in school, college, and university [46]. These ‘specific goals’ that students try to achieve are often formulated by the instructor as learning outcomes, which can be defined as a way “to express what the students are expected to achieve and how they are expected to demonstrate that achievement” [55]. To serve this purpose, learning outcomes should be clearly measurable – either directly or indirectly – and used to gauge whether students can move to a higher level. Direct measures specifically assess learning, as students must demonstrate it by performing a task [38]. Indirect measures only give a general indication of learning and may include questionnaires and self-reports. Although grades may seem to be a direct measure, this is debatable. Grades can be regarded as a proxy for learning and therefore as an indirect measure, as they often comprise a combination of learning outcomes or include unrelated corrections such as extra credit for certain activities [14]. Therefore, as learning involves more than just a grade at the end of a course, we take a broader view of learning and include measures related to the process as well.

Based on the literature described above, we discern two dimensions: (1) the level of learning analytics and (2) learning as a (supported) process versus learning as a result. Combining these two dimensions, we propose a classification scheme for learning analytics measures – see Fig. 3 and the illustrative sketch after it. We use this scheme to classify the measures of affected learning found in our literature study.

Fig. 3. Classification scheme for (measures of) learning analytics effects.
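To make the two dimensions concrete, the sketch below encodes one plausible reading of Fig. 3, assuming the four cells correspond to the category names used in Sect. 4; the code itself is our illustration, not part of the cited literature:

```python
from enum import Enum

# Illustrative encoding of the classification scheme (Fig. 3). The
# dimension and category names follow the paper; the mapping of cells
# to categories is our assumption based on Sects. 2.2 and 4.1.

class Level(Enum):
    LEARNER = "learner"        # micro level: the individual student
    DEPARTMENT = "department"  # meso level: the environment/department

class Aspect(Enum):
    PROCESS = "process"        # learning as a (supported) process
    RESULT = "result"          # learning as a result

CATEGORIES = {
    (Level.LEARNER, Aspect.PROCESS): "Learning process",
    (Level.LEARNER, Aspect.RESULT): "Student performance",
    (Level.DEPARTMENT, Aspect.PROCESS): "Learning environment",
    (Level.DEPARTMENT, Aspect.RESULT): "Departmental performance",
}

def classify(level: Level, aspect: Aspect) -> str:
    """Map a measure's level and aspect to one of the four categories."""
    return CATEGORIES[(level, aspect)]

# Example: a course grade is a learner-level result.
assert classify(Level.LEARNER, Aspect.RESULT) == "Student performance"
```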

3 Method

In this section, we provide a detailed description of the method used for our systematic literature review. The method builds on other systematic literature reviews in the learning analytics domain (cf. [7, 33, 35, 39]). In our study, we aim to answer the following research question: In what way does existing learning analytics literature measure affected learning?

3.1 Literature Sources

During the literature review, papers from seven different databases are sourced:

  • Learning Analytics and Knowledge (LAK) is the main conference in the learning analytics field. First organized in 2011, it has produced an extensive body of proceedings papers since. In this study, we include the LAK conference proceedings papers.

  • SpringerLink is the world’s most comprehensive online collection of scientific, technological, and medical journals, books, and reference works, and includes the EC-TEL proceedings.

  • The Association for Computing Machinery (ACM) database is a large, comprehensive database focused on computing and information technology.

  • IEEE Xplore is a technically oriented database and contains papers related to, among other fields, computer science.

  • ScienceDirect is Elsevier’s leading information solution for researchers and includes over 3,800 journals.

  • The Education Resources Information Center (ERIC) database focuses on educational literature and resources.

  • Learning Analytics Community Exchange (LACE) was a European Union funded project; one of its aims was to collect evidence of the effects learning analytics have on education. In the study at hand, we include papers related to the proposition “Learning analytics improve learning outcomes”.

3.2 Search Terms

To search the aforementioned databases for literature related to measures of affected learning, different search terms are used. The search terms are formulated based on an a priori analysis of relevant papers. Generally, the search includes the terms “learning analytics” AND student* AND (achievement OR “student learning” OR “learning goal” OR “learning outcome” OR performance OR “student success”). When the search engine allows it, we specifically search the abstracts for student* and “learning analytics” to ensure we retrieve learning analytics-related articles.
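As an illustration only, the general query can be reproduced programmatically; the sketch below merely assembles the boolean string given above:

```python
# Sketch: assembling the general search query described above.
TOPIC_TERMS = [
    "achievement",
    '"student learning"',
    '"learning goal"',
    '"learning outcome"',
    "performance",
    '"student success"',
]

def build_query() -> str:
    """Return the general boolean query used across the seven databases."""
    topic = " OR ".join(TOPIC_TERMS)
    return f'"learning analytics" AND student* AND ({topic})'

print(build_query())
# "learning analytics" AND student* AND (achievement OR "student learning"
# OR "learning goal" OR "learning outcome" OR performance OR "student success")
```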

3.3 Selection of Papers and Inclusion Criteria

The aim of this study is to identify measurable effects of data-driven interventions in real-life educational settings. It is a first step toward identifying which types of measures are currently used to capture the effects of learning analytics endeavors. These insights will allow the learning analytics community to develop (possibly standardized) instruments to measure these effects, in turn creating opportunities for replicating or reproducing results and performing meta-analyses on effect sizes. We therefore focus on studies reporting quantitative results, as they provide actual measures of learning that can be calculated and ultimately applied in a standardized way. The following inclusion criteria are used during our search process (formalized in the sketch after the list):

  • Paper is written in English;

  • Paper must either be a conference proceeding paper or journal paper;

  • Paper is published between 2011 and July 2017;

  • Paper must describe interventions performed based on data analysis;

  • Paper must present empirical data;

  • Paper must report on quantitative results.
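The sketch below expresses these criteria as a screening predicate. The Paper fields are hypothetical conveniences; in the study itself, screening was performed manually by the researchers:

```python
from dataclasses import dataclass

# Sketch: the inclusion criteria above expressed as a predicate.
# The Paper fields are hypothetical; they mirror the criteria one-to-one.

@dataclass
class Paper:
    language: str
    venue_type: str            # "conference" or "journal"
    year: int
    has_data_driven_intervention: bool
    has_empirical_data: bool
    has_quantitative_results: bool

def meets_inclusion_criteria(p: Paper) -> bool:
    return (
        p.language == "English"
        and p.venue_type in {"conference", "journal"}
        and 2011 <= p.year <= 2017  # simplification of "2011 to July 2017"
        and p.has_data_driven_intervention
        and p.has_empirical_data
        and p.has_quantitative_results
    )
```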

For the papers found in the previous step, the title and abstract are read to determine whether they meet the inclusion criteria. Papers clearly not meeting the criteria are dismissed. If the abstract and title do not provide enough information to make the selection, the paper is scanned – especially the method and results sections – to make a better-informed decision. In a second selection round, the remaining papers are read in full and again gauged against our inclusion criteria. To ensure the objectivity of the selection, a subset of the retrieved articles was handled separately by a second researcher and the results were discussed. No conflicts were observed in the selection of key studies by the two researchers. The key studies are all included in the analysis phase of the review. From these papers, we extracted and collected: author(s); title and subtitle; year; research objectives; level of analytics (descriptive, predictive, or prescriptive); measure or indicator of improved learning; and the operationalization of these measures and indicators. We analyzed these data to synthesize the results presented in the next section.
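A possible record structure for these extracted fields, with field names of our own choosing, could look as follows:

```python
from dataclasses import dataclass
from typing import List

# Sketch: a record type for the data extracted from each key study.
# Field names are ours; the items follow the enumeration above.

@dataclass
class KeyStudyRecord:
    authors: List[str]
    title: str                 # title and subtitle
    year: int
    research_objectives: List[str]
    analytics_level: str       # "descriptive", "predictive", or "prescriptive"
    measure: str               # measure or indicator of improved learning
    operationalization: str    # how the measure or indicator is operationalized
```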

4 Results

This section presents the results of our literature review. Of the 1034 hits on the search terms in the seven databases, 38 key studies meet the inclusion criteria – see Fig. 4. A retention of around 4% may seem strict; however, other literature reviews in the learning analytics domain, such as Bodily and Verbert [7] and Ruiz-Calleja et al. [39], report similar rates of 10% and 3%, respectively.

Fig. 4. Search process results.

4.1 Classifying Key Studies Based on Affected Learning

Using the classification scheme introduced in Sect. 2.2, we now classify the key studies based on the different measures of affected learning.

Learning Process.

The learning process relates to learning-focused activities. Learning analytics key studies within this category try to affect different tasks that can be distinguished during this process, such as planning coursework [21], supporting self-regulated learning [32, 34, 41, 42], time management skills [47], the quantity and quality of discussion board posts [4], engagement with assignments [28], the number of readings [29], plagiaristic behaviors [1], and choosing to solve more difficult questions [11]. One of the major objectives of the key studies in this category is to increase (self-)reflection and (self-)awareness. By providing students with the right visualizations, they can take control of their own learning, thereby improving the learning process.

Student Performance.

Containing 19 key studies, this category is by far the largest in our research. Most studies relate to academic performance, achievement, grades, or scores [9, 11, 12, 15, 17, 22, 23, 25, 36, 37, 52, 54, 55]. Other forms of affected learning mentioned in this category are learning gains [40], content mastery [27], students predicting their own final scores [2], and the quality of a written computer program [5]. Remarkably, some of the key studies claim to affect aspects one would expect in the learning process category – e.g., supporting self-regulated learning [31] or time management skills [47] – but the effects that are measured fall into the student performance category (e.g., grades or scores). That is, the product or outcome of the learning process is measured rather than the actions performed during the learning process. Objectives of the key studies in this category include the increase of (self-)reflection and (self-)awareness, prediction of performance, recommendation of resources, and student behavior modeling.

Learning Environment.

Although the optimization of the learning environment is explicitly mentioned in the commonly accepted definition of learning analytics [43], with only five key studies this category is the smallest in our research. The learning environment is affected by providing teachers with tools to intervene on problematic groups [49, 50], by saving assessment time [18], by improving course quality and outcomes [45], and by directing teachers’ attention [30]. The sole objective of the studies in this category is the improvement of assessment and feedback services.

Departmental Performance.

Instead of focusing on individual students, departmental performance mostly relates to the success of students as a group [3, 12, 17, 22, 26], to student retention [13, 20], or to the financial benefits of Early Warning Systems [19]. The prediction of performance, dropout, and retention are the most common objectives of the key studies in this category.

4.2 Measures of Affected Learning

The previous paragraphs describe the aspects of learning which learning analytics literature aims to affect. We regard these aspects as the dependent variables of the studies. The operationalizations of these dependent variables are measures of affected learning and can be used to describe changes caused by learning analytics. We use our classification scheme to give an overview of the measures used in the key studies (see Table 1).

Table 1. Measures of affected learning.

5 Conclusions and Discussion

The aim of this study was to provide an answer to the research question: In what way does existing learning analytics literature measure affected learning? The first conclusion is that, out of 1034 articles on learning analytics, only 38 describe quantitative, measurable effects of complete learning analytics cycles in education. This is a noticeable shortcoming, since studies in which both qualitative and quantitative results were present also satisfied our inclusion criteria. By analyzing these 38 key studies, we identified different measures of learning that can be affected with learning analytics. The measures are positioned according to a classification scheme with four categories: learning process, student performance, learning environment, and departmental performance. Our study allows for improved positioning of learning analytics research based on concrete measures, enabling better comparison of learning analytics research and endeavors. This systematic literature review shows that key studies mostly relate to the categories student performance and learning process. This was to be expected, as learning analytics particularly aims at learners and learning at the micro level. Only four papers report on measures in more than one category [11, 17, 22, 47], even though cross-categorical learning analytics provide a better, multi-perspective view on learning, as they include both process and performance measures or multi-level measures.

5.1 Recommendations

To justify the use of data analytics within educational processes, the effects of learning analytics on learning must be clear and well-defined. Some of the analyzed papers do report on potential improvements gained by data-driven interventions but do not describe their actual effects in terms of measures of affected learning. By describing those effects, more evidence about the benefits of learning analytics for education can be gathered, consequently strengthening the field in general. We suggest the use of our research outcomes for reporting on and comparing learning analytics results in both research and practice.

Gašević et al. [16] urge us to remember that “learning analytics are about learning”. In line with this statement, and based on the outcomes of this study, we recommend that learning analytics researchers and educational institutes move away from merely performance-based evaluation of learning analytics projects and include measures related to learning processes and the learning environment as well, as that is also a core objective of learning analytics [43]. Moreover, by optimizing the learning environment, learners are provided with better and more prompt feedback, while instructors can make more accurate decisions. Regardless of the dominant learning theory within an institute, a more complete view of learning is obtained by looking at measures from multiple categories of our classification scheme.

5.2 Limitations and Future Work

Our goal was to identify measures of affected learning and to group them based on a classification scheme. To do so, we only included empirical, quantitative results from data-driven interventions in our study. However, several studies use tools, techniques, or methods as an intervention that does not itself rely on data analytics, and then use data to describe the effect the intervention has on learning. Although this provides insight into the variables used to measure affected learning, such studies were disregarded as they do not meet our inclusion criterion demanding data-driven interventions, which are an important step within the learning analytics process. Future research might adopt broader inclusion criteria and extend the current findings with a larger set of key studies, thereby enhancing our results and identifying more and different measures of affected learning.

Finally, this study revealed that in recent learning analytics literature, no default set of constructs exists from which the dependent variable for a study can be selected. In Table 1, we see several different terms for closely related concepts, while the operationalizations also differ between studies. Building on the classification scheme in Fig. 3, a next step would be to devise an ontology of constructs – with operationalizations in the form of measures or instruments – for learning analytics benefits, in order to facilitate the reproducibility of empirical learning analytics research.
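As a first impression of what an entry in such an ontology might look like, consider this hypothetical sketch; all names, fields, and the example are illustrative and not derived from the key studies:

```python
from dataclasses import dataclass
from typing import List

# Hypothetical sketch of an entry in the proposed ontology of constructs;
# the category names come from Fig. 3, everything else is illustrative.

@dataclass
class Operationalization:
    instrument: str            # e.g., a validated questionnaire or a log metric
    unit: str                  # e.g., "grade points", "posts per week"

@dataclass
class Construct:
    name: str                  # e.g., "academic achievement"
    category: str              # one of the four classification-scheme categories
    synonyms: List[str]        # closely related terms found in the literature
    operationalizations: List[Operationalization]

# Example entry, purely illustrative:
achievement = Construct(
    name="academic achievement",
    category="Student performance",
    synonyms=["performance", "grades", "scores"],
    operationalizations=[Operationalization("final course grade", "grade points")],
)
```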