1 Introduction

Big data and analytics are currently burgeoning fields of research and development (Abdous et al. 2012; Ali et al. 2012; Dyckhoff et al. 2012). In education, several concurrent developments are taking place that have implications for big data and analytics in the field of learning. A wide range of promises and anxieties about the coming era of big data and learning analytics (LA) are under debate (Cope and Kalantzis 2016; Ifenthaler 2015; Ifenthaler et al. 2014). Overall, there is widespread consensus that the educational landscape itself is in transition and that the changes are substantial, with expository instructional methods being replaced by more learner-centred approaches to learning. As more and more learning either takes place online or is supported through technology, these active learners produce an ever-increasing stream of data—both inside learning management systems (LMS) and outside, in other IT-based environments (Pardo and Kloos 2011).

Learning analytics refers to the use of “dynamic information about learners and learning environments to assess, elicit, and analyze them for modeling, prediction, and optimization of learning processes” (Mah 2016, p. 288). As Roberts et al. (2017, p. 317) state, there is pedagogical potential in providing students “with some level of control over learning analytics as a means to increasing self-regulated learning and academic achievement”. Visualisation of information, social network analysis, and educational data mining techniques are at the methodological core of this newly emerging field (Greller and Drachsler 2012). Techniques for analyzing big data, such as machine learning and natural language processing, build on the particular characteristics of these data and can be used for learner and teacher feedback, real-time governance, and educational research (Cope and Kalantzis 2016, p. 2).

While this field is multi- or even interdisciplinary, the pedagogical perspective appears to be somewhat underrepresented (Greller and Drachsler 2012). Current research on big data in education revolves largely around (1) the potential of learning analytics to increase the efficiency and effectiveness of educational processes and (2) the ability to identify and support students at risk and thereby reduce drop-out rates. Accordingly, the main problem is that the core focus of research is on prediction, while the potential for supporting reflection on processes of learning is neglected. Therefore, the main purpose of this paper is to map out how LA can be carried out from a pedagogical perspective and to conceptualize a generic framework for the design of LA environments.

2 Research Questions and Methodology

In line with Kelly et al. (2015), the claim that we put forth in this paper is that “theory-led design has the potential to yield innovation in the development of LA tools and, in turn, that the development of LA tools and their use may contribute to learning theory” (p. 15). Our paper presents a framework for the theory-led design of LA environments with particular focus on digital learner support and students’ cognition.

The key research question we pursue in this paper is the following:

How can big data and learning analytics be employed in order to improve learner guidance, students’ learning processes and learning outcomes with regard to meta-cognitive abilities for self-regulated learning?

We pursue these issues by asking a range of more detailed questions:

  • What are critical dimensions/aspects when designing LA services that are integrated in a pedagogic process? And what would a generic framework for designing such LA services need to look like?

  • What generic strategies for developing LA services currently exist? And what form would the concept and set-up of a decision-support framework for devising LA strategies need to take?

  • Which skills are required by learners in their roles as data subjects and/or data clients in order to make competent use of LA services?

The research project we report on here was based on a methodological combination of systematic literature analysis and model development.

The goal of the proposed framework is to provide relevant stakeholders—in particular designers and teachers of learning environments—with decision guidelines from a pedagogical perspective. In order to obtain an overview of existing LA research, an initial systematic literature analysis was conducted. The focus of this analysis was on work that addresses basic conceptualisations of LA, reference models for LA, and methods applied in order to pursue LA. Building on the findings of this literature review, and by combining and expanding or extrapolating existing models, the generic framework for designing LA was created.

Our starting point was the framework provided by Greller and Drachsler (2012). This pedagogical model contains six dimensions: competences, constraints, method, objectives (distinguishing between reflection and prediction), data, and stakeholders. On this basis, we propose a design framework for a more holistic approach to learning analytics, rooted in a pedagogical perspective and focused on students’ cognition, which results in four generic LA approaches that we discussed and elaborated with stakeholders at our university.

For that reason, we conducted a needs analysis (e.g. in terms of relevant competences, data issues, etc.) at our university with 12 lecturers (a diverse group covering large-scale and small-group lectures and different subjects, each with at least five years of teaching experience; four lecturers also hold a programme manager role, and all of them have experience with LA in at least one of the developed generic approaches).

We discussed the developed use cases and received feedback on important implementation factors and needs. These interviews were helpful in order to (1) gain an understanding of the current state of the learning analytics field and (2) identify teachers for setting up an internal task force.

In the process, we applied cognitive mapping techniques with the programme managers and lecturers participating in the task force (Ackermann et al. 2004). We used cognitive mapping as a communication tool between the analysts and the users for adapting the initial framework. Furthermore, we used cognitive mapping to decompose the model into finer detail by using elements of additional frameworks. We structured the use cases according to Greller and Drachsler (2012), and emphasized the learning objectives as well as skills required by learners as a core element for the competent use of LA applications.

3 Results

3.1 Literature Analysis

This study reviews literature selected with a primary focus on big data and learning analytics and their implications for higher education, educational technology, and instructional design. Google Scholar was used to search for and locate academic papers from journals, conference proceedings, and professional magazines with the keywords “big data” and/or “learning analytics” combined with “framework”, “concept”, “model”, “applications”, or “approaches”. The search period was set from 2010 to 2017, and the papers reviewed include both qualitative and quantitative studies from researchers in the field of learning analytics worldwide. For the purpose of this study, the data collection process resulted in the identification of 45 articles. Ten of the articles provided frameworks that were too narrow, e.g., general principles or policy frameworks for the ethical use of data. Therefore, 35 articles were further analyzed and compared. The frequency with which these articles are cited by researchers bears witness to their relevance and to the fact that they are a representative sample of the literature in the field. In addition to this search for original contributions, we conducted a literature analysis to identify current literature reviews on learning analytics. Of primary importance are the reviews by Papamitsiou and Economides (2014), who identified 40 articles; Sin and Loganathan (2015), who identified 45 articles; and Leitner et al. (2017), who identified 101 papers on learning analytics.

Starting from this body of research, the selection criteria for the overview presented in Table 1 were the following:

Table 1 Comparison of conceptual frameworks for developing LA applications
  1. Holistic frameworks that describe or develop LA systems (e.g., static models vs. dynamic process models);

  2. Generic approaches to a partial theory of LA with a focus on LA objectives and students’ competences, as this is our research focus.

The analysis of the contributions in the identified body of research resulted in four categories: (1) research on prediction of performance; (2) research on formative individual feedback and assessment services; (3) research on social learning analytics; and (4) research on competent use of LA applications.

In Table 1, below, the LA frameworks are clustered first in terms of their LA type and then according to the identified categories.

3.2 A Design Framework for Learning Analytics

As the literature analysis reveals, there are “softer” challenges that influence the acceptance of LA. These relate to issues of data ownership, ethical use and potential abuse of LA, and competences required to engage in meaningful LA activities. The pedagogic frameworks (e.g., Bakharia et al. 2016; Greller and Drachsler 2012; Gibson et al. 2014) for engaging in LA differ from other, more process-oriented frameworks (e.g., Clow 2012; Ferguson et al. 2014; Verbert et al. 2012). Building on holistic pedagogic frameworks, we aim at a descriptive framework that can later on be extended to a domain model or ontology. Depending on the (institutional) context, basic pedagogic principles and specific objectives, the workflow and process when engaging in LA may vary (Greller and Drachsler 2012).

The framework we propose (see Fig. 1 below) is similar to that of Greller and Drachsler (2012) and essentially represents a feedback loop. This conceptualization of the overall process as a feedback loop has been inspired by quality development frameworks (e.g., West et al. 2015), and dialogue with the multiple stakeholders involved is a key element. A particular pedagogic theory (or theory in use) and a specific learning design represent the starting point. From this, both the particulars relating to the facilitation of learning and the specifics of LA are derived. The learning outcomes represent the feedback required in order to adjust and improve the process, the pedagogic theory (in use), and the learning design.

Fig. 1 Design framework for LA

The design framework for LA comprises four dimensions:

  • LA objectives

    These may relate to supporting reflection and/or prediction with regard to learning. Likewise, the LA objectives may relate to supporting individual students in their learning or to supporting interaction among students and/or facilitators. The framework of Greller and Drachsler (2012) distinguishes mainly between “reflection” and “prediction” as LA objectives. However, “individual learning” and/or “social learning” need to be differentiated as well. From a pedagogical perspective, it is a design criterion for LA applications whether the focus is on individual learning (e.g., individualized feedback, assessments, tracking learning progress, etc.) or on social learning in a collective context (e.g., social comparison activities, rewards from others as a motivational factor for student engagement, etc.).

  • LA stakeholders

    Stakeholders in LA activities are those who are either subjects or clients of data analysis services. Students and teachers, for example, may be subjects of data analyses in that data resulting from their learning activities are aggregated and analysed. Students, teachers, and institutional representatives, for example, may be clients of data analyses in that such analyses aim at supporting their activities and decisions.

  • LA application

    Learning analytics applications comprise, among other things, technologies, platforms, data sets, and algorithms employed in carrying out analytics activities. The configuration of these elements may vary depending on the specific given context.

  • LA constraints

    These constraints comprise rules and regulations concerning privacy and ownership of data, ethical considerations, as well as cultural norms and values. Again, these constraints may depend on the context at hand, for example, whether the educational institution pursuing LA is a primary school, an institution of higher education, or a commercial provider of learning and development services.

Taking this overall design framework for LA as a starting point, we propose in the following section a systematisation of one dimension of this framework: the learning objectives. The matrix derived there serves as a basis for the use cases, which focus on the learning process and students’ cognition.

3.3 A Framework for Learning Analytics Objectives

With regard to employing LA as a means to support and improve (digital) learning, we propose a set of generic approaches based on a 2 × 2 matrix (see Fig. 2). This matrix combines the main pedagogical objectives of improving students’ cognition and learning processes with either an individual or a social learning context.

3.3.1 Student Cognition: Reflection and/or Prediction

One dimension is set up via the distinction between reflecting on past learning activities and predicting next/future learner activities. Reflection in this context refers to critical self-evaluation on the basis of (1) one’s own data sets created in the process of either learning (students) or supporting learning (teachers/facilitators) and (2) data sets created by others (e.g., a teacher reflecting on his or her own teaching style based on data sets generated by the students) (Greller and Drachsler 2012, p. 41). Prediction refers to anticipating learner activities (e.g., further reducing investment in classwork or discontinuing classwork altogether) and to interventions that aim at preventing such developments (Siemens et al. 2011).

3.3.2 The Context of Learning Activities: Individual LA Systems and/or Social LA Systems

The other dimension is set up via the distinction between individual learning activities and social learning activities. Much work in LA is oriented towards supporting and determining individual achievement, for example by analysing the data generated through summative assessments. The focus on individual learners is related to the goal of personalization and individualization. In order to provide pedagogically valuable feedback, assessment systems have to become intelligent and connected with higher-order learning skills. Adaptive learning systems (focused on individual learning and prediction) represent a distinct and relatively new field of research based on interactive machine learning.

Buckingham Shum and Ferguson (2012, p. 4) have argued that “new skills and ideas are not solely individual achievements, but are developed, carried forward, and passed on through interaction and collaboration”. In consequence, LA in social systems (e.g., in the context of a classroom at a school) “must account for connected and distributed interaction activity”. Buckingham Shum and Ferguson (2012) therefore propose social learning analytics as a domain in its own right. Similarly, gamification or gameful design for learning can be considered a domain of its own that uses LA in social systems, for example to provide visible status and learning progress, social comparison, and reputation (e.g., based on badges). Rule-sets and game design elements implemented in a learning environment can provide systematic support for learning and may contribute to student engagement. They may function as “nudges” that influence student behavior in a predictable manner without having to resort to prohibitions, commandments, or extrinsically motivating incentives (Fig. 2).

Fig. 2 Generic approaches to learning analytics with a focus on students’ cognition

The matrix developed here elaborates one dimension of the proposed framework and emphasizes the need to tackle LA objectives from a pedagogical perspective in order to support students’ learning skills. The matrix provides a starting point for generating use cases in an LA systematic.

4 Use Cases

The following section illustrates how the framework comprising generic approaches can be translated into specific use cases. Starting from the use cases provided by Greller and Drachsler (2012), we elaborate the pedagogical perspective by exemplifying the pedagogical theory. Additionally, we spell out relevant aspects to consider in the design of learning activities.

4.1 Use Case 1: Social Learning Analytics for Reflection

The first use case relates to conducting a social network analysis of students discussing in a forum, for example using the SNAPP tool developed by Dawson (2008). This implies a shift in attention away from the summative assessment of individuals towards learning analytics of social activity (Buckingham Shum and Ferguson 2012, p. 5). In this context, it is relevant to distinguish between social analytics sui generis (e.g., social network analysis or discourse analytics) and socialised analytics that are based on personal analytics while also being relevant in a social learning context (e.g., analytics of user-generated content, analytics of personal dispositions, or analytics of contexts such as mobile computing and the networking opportunities related thereto) (Buckingham Shum and Ferguson 2012, p. 10).

The following example illustrates the first type, social analytics sui generis (Table 2).

Table 2 Exemplary detailing of use case 1
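As a concrete illustration of this type of analysis, the sketch below builds a reply network from a hypothetical forum export and computes two simple centrality measures with the networkx library. It is not the SNAPP implementation; the record format and the choice of indicators are our own illustrative assumptions.

```python
# Illustrative sketch (not the SNAPP implementation): build a reply network
# from a hypothetical forum export and compute simple centrality measures
# that students and teachers could reflect on.
import networkx as nx

# Hypothetical records: (author, replied_to_author) pairs extracted from a forum.
replies = [
    ("anna", "ben"), ("ben", "anna"), ("carla", "anna"),
    ("dan", "anna"), ("carla", "ben"), ("ben", "carla"),
]

G = nx.DiGraph()
for author, target in replies:
    # Weight edges by the number of replies exchanged between two students.
    if G.has_edge(author, target):
        G[author][target]["weight"] += 1
    else:
        G.add_edge(author, target, weight=1)

# Simple reflection indicators: who receives many replies (in-degree) and
# who connects otherwise separate parts of the discussion (betweenness).
print("in-degree:", dict(G.in_degree()))
print("betweenness:", nx.betweenness_centrality(G))
```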

4.2 Use Case 2: Individual Analytics for Reflection

This use case is about LA with a focus on reflection at the individual level. As Evans (2013) found in a thematic analysis of the research evidence on assessment feedback in higher education (based on over 460 articles spanning 12 years), effective online formative assessment can enhance learner engagement during a semester class. Focused interventions (e.g., self-checking feedback sheets, mini writing assessments) can make a difference to student learning outcomes as long as their value for the learning process is made explicit to and is accepted by students and lecturers. The development of self-assessment skills requires appropriate scaffolding on the part of the lecturer working with the students so as to achieve co-regulation (Evans 2013) (Table 3).

Table 3 Exemplary detailing of use case 2
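To make the individual, reflection-oriented perspective more tangible, the following minimal sketch aggregates a single student’s activity log into a short formative summary. The column names, the notion of a score trend, and the wording of the feedback are illustrative assumptions rather than part of the frameworks cited above.

```python
# Minimal sketch of an individual reflection summary, assuming a hypothetical
# activity log with columns: student_id, week, quiz_score, forum_posts.
import pandas as pd

log = pd.DataFrame({
    "student_id":  ["s1", "s1", "s1", "s1"],
    "week":        [1, 2, 3, 4],
    "quiz_score":  [0.55, 0.70, 0.65, 0.85],
    "forum_posts": [0, 2, 1, 3],
})

summary = log.groupby("student_id").agg(
    mean_score=("quiz_score", "mean"),
    score_trend=("quiz_score", lambda s: s.iloc[-1] - s.iloc[0]),
    total_posts=("forum_posts", "sum"),
)

# The feedback stays formative: it prompts self-evaluation rather than grading.
for sid, row in summary.iterrows():
    trend = "improving" if row["score_trend"] > 0 else "flat or declining"
    print(f"{sid}: average quiz score {row['mean_score']:.2f}, trend {trend}, "
          f"{int(row['total_posts'])} forum contributions in the period.")
```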

4.3 Use Case 3: Social Analytics for Prediction

The more environments for working and learning become digital, the more data is generated in the course of activities relating to working and learning: accessing web pages, working on short knowledge tests, posting in an online forum, commenting on a forum post, etc. (Manouselis et al. 2010). Until recently, the availability of such data for analysis had been mostly confined to what is going on inside a particular learning management system (LMS). With the development of the xAPI specification for transfer of interaction data, a much wider range of data from both inside and outside an LMS can be made available for analysis (Berking et al. 2014).
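To illustrate what such an interaction record looks like in practice, the sketch below constructs a single xAPI statement (actor, verb, object) and sends it to a Learning Record Store. The statement structure follows the xAPI specification; the endpoint URL, credentials, and identifiers are placeholders invented for the example.

```python
# Sketch of sending one xAPI statement to a Learning Record Store (LRS).
# Endpoint, credentials, and identifiers are placeholders, not a real service.
import requests

statement = {
    "actor": {"mbox": "mailto:anna@example.edu", "name": "Anna"},
    "verb": {
        "id": "http://adlnet.gov/expapi/verbs/commented",
        "display": {"en-US": "commented"},
    },
    "object": {
        "id": "https://lms.example.edu/forum/thread/42",
        "definition": {"name": {"en-US": "Week 3 discussion thread"}},
    },
}

response = requests.post(
    "https://lrs.example.edu/xapi/statements",   # placeholder LRS endpoint
    json=statement,
    headers={"X-Experience-API-Version": "1.0.3"},
    auth=("lrs_user", "lrs_password"),           # placeholder credentials
)
response.raise_for_status()  # raises if the LRS rejects the statement
```

Because every statement carries its own actor, verb, and object identifiers, records generated inside an LMS and in other environments can be collected in the same store and analysed together.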

These developments help to enable gamified learning designs (Berkling and Thomas 2013). By this we refer to “the use of game design elements in non-game contexts”. Frequently, this takes the form of awarding points and badges for individual learning activities (e.g., posting in a discussion forum) and displaying top performers (or rather point generators) on leaderboards (Deterding et al. 2011; Mak 2013). While there is evidence that gamified designs (can) lead to higher student engagement and improved learning (Dicheva et al. 2015, p. 83), the opportunity to engage in a more systematic motivation design that also includes choices, social integration, team assignments, as well as characters and stories is often missed (Seufert et al. 2017).
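A gamified rule-set of the kind described above can be quite small. The following sketch awards points per activity type, grants a badge once a threshold is crossed, and derives a leaderboard; all point values, thresholds, and badge names are invented for illustration rather than taken from a specific system.

```python
# Minimal sketch of a gamified rule-set: points per activity, a badge rule,
# and a leaderboard. All values and names are illustrative only.
from collections import Counter

POINTS = {"forum_post": 5, "quiz_completed": 10, "peer_review": 15}

events = [
    ("anna", "forum_post"), ("anna", "quiz_completed"),
    ("ben", "quiz_completed"), ("ben", "peer_review"),
    ("carla", "forum_post"),
]

scores = Counter()
for student, activity in events:
    scores[student] += POINTS.get(activity, 0)

# Badge rule: awarded once a student crosses an (illustrative) threshold.
badges = {s: ["active_contributor"] for s, pts in scores.items() if pts >= 15}

# Leaderboard as a "nudge": visible status and social comparison.
for rank, (student, pts) in enumerate(scores.most_common(), start=1):
    print(rank, student, pts, badges.get(student, []))
```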

The following use case focuses on gamified learning designs as an example of behavioral “nudging” (Table 4).

Table 4 Exemplary detailing of use case 3

4.4 Use Case 4: Individual Analytics for Prediction and Prescription

More than 30 years ago, Benjamin Bloom demonstrated that individual tuition leads to a 2-sigma performance improvement in tests compared to the then standard expository teaching techniques in classrooms with about 30 learners (Bloom 1984). The idea of individualised tuition for large numbers of learners is currently being pursued in the context of the research and development of adaptive or intelligent tutorial platforms (Romero et al. 2008). The research and development in this area is based on advances in artificial intelligence and cognitive computing (Verbert et al. 2012). Adaptive learning systems aim at supporting the development of conceptual structures in learners rather than merely supporting (repetitive) problem solving, as was the case in prior generations of so-called intelligent tutorial systems.

Adaptive learning systems closely track student activities and student performance and provide students with adequate learning pathways and adaptive learning resources based on machine learning algorithms and predictive models (Butz et al. 2003).
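A minimal sketch of this predictive step might look as follows: a simple classifier estimates the probability that a student will master the next unit from a few activity features and then routes the learner accordingly. The features, the toy training data, and the 0.5 decision threshold are purely illustrative; production adaptive systems rely on far richer learner models and continuously updated data.

```python
# Minimal sketch of prediction and prescription in an adaptive system.
# Features, training data, and the 0.5 threshold are illustrative only.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical features per student:
# [share of exercises solved, mean quiz score, hours on task]
X_train = np.array([
    [0.9, 0.85, 2.0], [0.4, 0.50, 0.5], [0.7, 0.65, 1.5],
    [0.2, 0.40, 0.3], [0.8, 0.90, 2.5], [0.5, 0.55, 1.0],
])
y_train = np.array([1, 0, 1, 0, 1, 0])  # 1 = mastered the next unit

model = LogisticRegression().fit(X_train, y_train)

new_student = np.array([[0.6, 0.58, 0.8]])
p_mastery = model.predict_proba(new_student)[0, 1]

# Prescriptive step: choose the learning pathway based on the prediction.
pathway = ("enrichment unit" if p_mastery >= 0.5
           else "remedial exercises with worked examples")
print(f"predicted mastery probability: {p_mastery:.2f} -> recommend {pathway}")
```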

However, more substantial empirical research is needed, in particular to investigate the appropriateness of such algorithms in disciplines other than the typical mastery learning subjects (e.g., biology, mathematics, information science) and their effectiveness for reaching higher learning outcomes (Nour et al. 1995) (Table 5).

Table 5 Exemplary detailing of use case 4

5 Discussion

Learning analytics (LA) has the potential to enable learners, teachers, and their institutions to better understand and predict learning and performance. However, the pedagogical perspective, and in particular the focus on reflection instead of prediction, has been neglected in research so far. Therefore, the main contribution of the paper is to provide a generic framework for the design of LA environments from a pedagogical perspective and focusing on students’ cognition.

The presented framework provides a matrix with two important dimensions from a pedagogical point of view: (1) Objective for LA: Reflection versus Prediction and (2) Main context and target group: Individual analysis versus social (network) analysis. Based on the proposed framework we developed use cases in order to define the overall generic strategy in more detail. The proposed conceptual framework serves as a heuristic model for identifying and structuring the research questions. A learning analytics plan for research could be tuned depending on the pedagogic goals.

However, we want to emphasize that the proposed generic framework, while intended as a helpful concept map for further research, has its limits. The proposed two dimensions might be too narrow to fully capture the pedagogical perspective in LA environments. A second limitation of our research is that the sample of teachers in our focus group was rather small, with 12 lecturers. A further limitation is that an empirical validation of the developed framework is still missing. Most important for the validation of the proposed framework is its perceived utility by the stakeholders, in particular the course designers, lecturers, and students in the different use cases. In order to verify that the model does indeed provide actionable information, a pilot within an action research design is planned for the generic model and for each use case, involving a small number of experts from the initial task force, in order to validate and revise them. These more experienced teachers will examine the model in terms of both its accuracy (does the information provided by the model align with what they learn by talking to the student?) and its utility (does it trigger contact with the right students, and are those students then successful?). Once the pilot is completed, the utility will be evaluated and a decision will be made as to whether to implement the model in the production processes, making the results available to all teachers. The model will continue to be refined even after initial implementation.

6 Conclusion and Outlook

Current research and discussion on big data in education focuses largely on (1) the potential of learning analytics to increase the efficiency and effectiveness of educational processes, (2) the ability to identify and support students at risk, and (3) efforts to reduce drop-out rates. Accordingly, the main focus is on prediction. We therefore emphasized the research question of how big data and learning analytics can be employed in order to improve learner guidance, students’ learning processes, and learning outcomes with regard to reflection and meta-cognitive abilities for self-regulated learning.

Competency development on the part of the data clients (students, teachers/tutors, institutions) is a key requirement for progress in this area. On the basis of the survey data available, Greller and Drachsler (2012, p. 51) have pointed out that the large majority of students currently do not have command of the competences required to interpret LA results and to determine appropriate next activities.

In our model (cf. Fig. 1), we include critical evaluation skills among the key competences for LA (similar to Greller and Drachsler 2012). A superficial understanding of data presentation can lead to false conclusions. Furthermore, it is important to understand that data not included in the respective LA approach may be equally if not more important than the data set that is included. To judge a learner’s performance merely on one aspect, such as quantitative data provided by an LMS, is like looking at a single piece taken from a larger jigsaw puzzle. Lifelong learning takes place across a wide range of schooling, studying, working, and everyday life situations. In addition to competency requirements, acceptance factors influence the application or decision making that follows an analytics process. A lack of acceptance of analytics systems and processes can lead to outright rejection of either the results or the suggestions on the part of relevant constituencies (data clients).

In order to deal with these issues, future research should focus on empirical evaluation methods for learning analytics tools (Ali et al. 2012; Scheffel et al. 2014) and on competence models for digital learning (Dawson and Siemens 2014). The conceptual framework can be further elaborated through the application of the four different use cases by adjusting and integrating partial theories for the competence development of students (e.g., mapping multiliteracies to learning analytics techniques and applications (Dawson and Siemens 2014)). It is planned that these cases become four real case studies in which we critically analyse the outcomes, problems, and implications of each case. This will be based on a Student Tuning Model as a continual cycle in which students plan, monitor, and adjust their learning activities (and their understanding of those activities) as they engage with LA (Wise et al. 2016).