1 Introduction

LA is a modern data-gathering technique that mainly relies on automatically aggregated learning traces. While it presents opportunities to gather insights into teaching and learning practices, this data is often difficult to analyse and interpret without additional context and multiple sources of data. With the emergence of Massive Open Online Courses (MOOCs), even more data is aggregated, and since MOOCs are mainly video-driven, videos become the main data source for analytics; the analysis of video data thus becomes crucial. The design of educational videos also needs more evidence-based approaches that connect different video designs with underlying factors such as learning outcomes, learner behaviour and learner perception of the educational value of the videos. Research indicates that the analysis of video learning traces gives useful insights into learning processes [1] and can also be beneficial for learning dashboards; however, research in this direction is still in its infancy [2].

Based on the existing challenges in video analytics, this paper offers a new approach to enrich video analytics with feedback data in order to understand the connection between the design of videos, student behaviour and students' perceptions of the educational value of the videos; this is done by introducing semi-automated student feedback on several scales directly in the videos. In this paper, we present a proof-of-concept study connecting cognitive theory-driven [3] analysis of video designs with semi-automated student feedback to enable meaningful inclusion of interaction data and, potentially, learning outcomes, in order to inform video design but also to build teacher dashboards.

2 Background and Related Literature

Learning Analytics (LA), as a field, has established itself as a ubiquitous method for the analysis of large sets of digital footprints coming from the interactions between and with learners, teachers and the learning environment. LA has many promises, one of which is the capability to contribute to awareness of and reflection on learning processes. However, among the critical issues with learning analytics are the dimensionality of the data (mainly click-based) and the connection of the data with context: theory and design [4, 5]. For this reason, LA is rarely used on its own and is usually combined with other types of data collection and analysis methods, such as self-report data, annotations for sense-making, observations, multimodal data etc. [6].

Video-based learning has been explored from different angles, mostly the effect of videos on learning outcomes, attendance and academic performance, which yields mixed results [1]. There are also different types of videos for learning and, depending on the interactions they afford, we can obtain different types of data. This data can give us information on learning processes when aligned with data on other types of interactions and with student profiles: a “combination of various learning analytics (e.g. content metadata, learners’ profile) as well as the state-of-the-art statistical analysis techniques” [7]. For instance, some studies investigated potential attitudinal differences among diverse video lecture usage patterns and found that usage patterns affect students’ attitudes to video lectures as a learning tool [8]. However, many essential aspects of video-based learning and the related challenges and opportunities remain unexplored, such as how to use all the data obtained from the learner, how to combine data from different sources, how to make sense of heterogeneous learning analytics, how to synchronize and take full advantage of learning analytics coming from different sources, how to use analytics to inform and tune smart learning etc. [7].

Different properties of videos are used as indicators for diverse reasons. One of the most highly cited studies in the area investigated the relationship between student engagement and video properties [9], characterising videos by their length, speaking rate, video type and production style. A literature review found that the most common measurements in video analytics are video watch time, video interactions and learning results, and reported fine-grain measurement indicators for each [2]. According to the literature review by Poquet et al., the most common focus is modality, the most studied independent variable is presentation style, and the most common dependent variable is a recall test. Self-reports (feedback) are often used to evaluate different effect sizes [1].

Even though researchers and practitioners have largely focused on the effects of video learning in higher education, information on the impact of videos on online students’ learning perceptions and experiences has been scarce [10]. Some findings show that students’ satisfaction with video learning has a strong relationship with a positive overall learning experience and with the perception of the impact of video on learning [11]. However, most studies investigate overall perceptions of the general concept of video as a learning tool, not the educational value of individual videos. Moreover, to the best of our knowledge, semi-automated student feedback on the educational value of videos is an underexplored area.

From an analytics perspective, video data can be useful to understand and improve learning processes [12]. Fine-grain video interaction data can bring valuable insights [1] and can also be helpful to build learner or teacher dashboards, but this area is at an initial stage [2]. It is worth noting that most video analysis is based on the interaction analysis of learning traces; for this reason, it is important to look beyond click-stream data. Depending on the learning activity, meaningful interactions may not be tracked by digital learning platforms [13]. Thus, narrowing the analysis down to the data available in digital platforms introduces the so-called “street light effect” bias [14, 15]. Moreover, to make sense of learning data on the one hand, and to have actionable learning dashboards on the other, a connection with theory [16] and human-centred design is needed, involving user feedback both in the data collection and in the development processes [17]. At the same time, automated data alone is often superficial and not enough to create a hypothesis space and, as educational processes and systems are highly contextual, different factors such as pedagogical design, actors, learning settings etc. come into play [18]. Human inference through annotations is often used to make sense of learning analytics data and to contextualise automatically collected learning traces. Moreover, since one of the challenges of learning analytics is the theory-driven analysis of data, we suggest using human inference and combining it with different sources of data [19].

In this paper, we argue that to understand the connection between students' learning perceptions and experiences regarding the educational value of videos, we can collect semi-automated student feedback on the perceived educational value of the videos and combine it with interaction (log) data. To understand the objective value of the videos and their design, i.e. to establish the ground truth, we can also relate them to theory-driven properties of videos through human inference. Moreover, since the feedback data is quantified and collected semi-automatically, it can later be used for different purposes: for real-time analytics and dashboards, or for retrospective analysis combining different sources of data to enrich the analysis.

In the following sections, we present the methodology of this exploratory, proof-of-concept study, present the preliminary results and discuss them thoroughly together with their limitations and potential areas to explore.

3 Methodology, Research Questions and Methods

3.1 Context of the Study, Research Design and Research Question

The context of the study is a blended learning setting in higher education. The study investigates and preliminarily evaluates the usefulness of semi-automated student feedback in the evaluation of the educational value of videos and its inclusion in learning dashboards alongside other data such as logs and learning outcomes. To this end, in this study we investigate the feasibility and usefulness of using semi-automated ratings on videos (Fig. 1) to gather feedback from students on three different scales: (a) quality of audio and video, (b) clarity of the teacher and (c) usefulness of the video to prepare for the exam. We hypothesise that this information can later be aligned with different indicators to enrich the data coming from videos with structured user (student) feedback. The semi-automated student feedback is based on 5-star ratings. This input can potentially be useful not only to inform better design of the videos but also to feed data into learning dashboards.

Fig. 1. The rating system based on 3 scales (translation: 1. Quality of audio and video; 2. Clarity of the lecturer; 3. Usefulness of the lecture for the exam).
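To make the structure of this feedback concrete, the sketch below shows, in R (the language used for the later analysis), how a single rating event could be stored; the column names and the values are our own illustrative assumptions, not the actual schema of the video platform.

```r
# A minimal sketch of how a single semi-automated rating event could be stored.
# Column names (video_id, q1_quality, q2_clarity, q3_usefulness) and values are
# illustrative assumptions, not the actual schema of the video platform.
rating_events <- data.frame(
  video_id      = c("v01", "v01", "v02"),
  student_id    = c("s101", "s102", "s101"),
  timestamp     = as.POSIXct(c("2020-03-02 10:15:00",
                               "2020-03-02 11:40:00",
                               "2020-03-05 09:05:00")),
  q1_quality    = c(5L, 4L, 5L),  # quality of audio and video (1-5 stars)
  q2_clarity    = c(4L, 5L, 5L),  # clarity of the lecturer (1-5 stars)
  q3_usefulness = c(5L, 4L, 4L)   # usefulness of the lecture for the exam (1-5 stars)
)
```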

Therefore, this article answers the following research question: Can we use semi-automated video ratings and theory-driven video annotations to understand what types of videos lead to learning satisfaction and perceived educational outcomes?

To illustrate and evaluate our exploratory study and the proposal, and to operationalize theory-driven video properties, we have used the research-based cognitive theory of Multimedia Learning Principles (MLP). “Multimedia instruction refers to presenting words and pictures that are intended to foster learning”, and the theory consists of 12 principles aimed at providing effective and evidence-based tools for multimedia learning [3].

The theory of 12 principles of multimedia learning has been developed by Richard Mayer and is based on the cognitive theory of learning. Its three main assumptions are that:

  • We have two separate channels for processing information, one is the visual/pictorial and the other one is the auditory/verbal;

  • There is a limited channel capacity for processing;

  • Learning is an active process of filtering, selecting, organizing and integrating information [20].

Multimedia learning is learning from words and pictures; it rests on the assumption that “people learn more deeply from words and pictures than from words alone”.

Of course, it is not enough to simply associate images with words; it is essential to understand how pictures and words can be used together to foster learning while avoiding overloading the learner’s working memory capacity. Within this pedagogical framework, there are three fundamental goals for instructional design to improve the results of learning strategies:

  • Minimize extraneous processing, i.e. cognitive processing that is not related to the instructional goal.

  • Manage essential processing, i.e. understanding what kinds of items are necessary to represent and summarize the complexity of the material.

  • Foster generative processing, i.e. cognitive processing aimed at making sense of the incoming material by organizing and integrating it.

For each of these goals, Mayer grouped the 12 principles of multimedia learning accordingly, explaining their indicators [21].

Five principles to reduce extraneous processing are:

  1) The coherence principle implies avoiding extraneous, distracting material;

  2) The signalling principle suggests that people learn better when essential words are shown on the screen and highlighted;

  3) The redundancy principle suggests that people learn better from animation and narration than from animation, narration and on-screen text altogether;

  4) The spatial contiguity principle implies that corresponding texts and pictures should be placed near each other and on the same page or screen;

  5) The temporal contiguity principle implies that corresponding narration and animation should be presented together, at the same moment.

Three principles to manage essential processing are:

  6) The segmenting principle suggests that people learn better when information is presented in segments rather than as one long stream;

  7) The pre-training principle suggests that people learn better if they already know the basics of what they are learning, for instance the meanings of essential components;

  8) The modality principle suggests that people learn better from graphics and spoken words than from graphics and printed text.

Four principles to foster generative processing are:

  9) The multimedia principle suggests that people learn better from words and pictures than from words alone;

  10) The personalization principle suggests that people learn better from an informal, conversational style;

  11) The voice principle suggests that people learn better from a human voice than from a computer voice;

  12) The image principle suggests that people learn better from an animation on the screen than from a talking-head video of the instructor.
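For annotation purposes, the 12 principles can be operationalised as a simple per-video checklist. The R sketch below is a hypothetical illustration of such a coding scheme; the short principle names and the example values are ours, not an established instrument.

```r
# Hypothetical per-video checklist for the 12 MLP principles (short names are ours).
principles <- c("coherence", "signalling", "redundancy", "spatial_contiguity",
                "temporal_contiguity", "segmenting", "pre_training", "modality",
                "multimedia", "personalization", "voice", "image")

# One row per video, one logical column per principle (TRUE = principle followed).
annotations <- as.data.frame(matrix(TRUE, nrow = 2, ncol = length(principles)))
names(annotations) <- principles
annotations$video_id <- c("v01", "v02")
annotations$redundancy[2] <- FALSE  # e.g. v02 adds redundant on-screen text

# Indicator used in the analysis: total number of principles followed per video.
annotations$n_principles <- rowSums(annotations[principles])
```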

To evaluate the feasibility and usefulness of the approach, this study used several sources of data: video annotations based on the 12 MLP principles; video ratings on several scales to gather semi-automated student feedback based on 5-star ratings; engagement with videos (visualisations, i.e. the number of views); the total number of ratings; and video duration.

3.2 Data Collection and Sample

Videos were coded with annotations according to the 12 MLP principles to denote whether and how many of the principles were followed. One expert coder coded the videos in discussion with a second expert coder to establish reliability; in case of doubt, the codes were agreed between the coders. In addition, for reliability reasons, the codes were reviewed on a random basis. The unit of analysis in this study is the video. We chose 6 different blended courses from the Department of Education and Human Sciences and coded only the videos with more than 25 ratings, to account for the relative uniformity of the data. The course information, with the number of students, is reported below:

  1. Cognitive Psychology (N. of students: 372);

  2. Group Psychology (N. of students: 388);

  3. Environments and Technologies for Training (N. of students: 116);

  4. Digital Linguistics (N. of students: 117);

  5. Developmental and Educational Psychology (N. of students: 113);

  6. Society and Digital Educational Contexts (N. of students: 122).

While the number of coded videos in each course varies (from a minimum of two videos for Developmental and Educational Psychology to a maximum of 20 videos for Group Psychology), we hypothesise that this is due to video properties: in some courses, fewer videos received enough ratings to be included.

As previously mentioned, aside from the annotations, we have collected the following data (a sketch of the resulting per-video record is given after the list):

  1) Semi-automated data:

    • video ratings on several scales to gather semi-automated student feedback based on 5-star ratings; this data was collected between February and May 2020.

  2) Automated data:

    • engagement with videos (visualisations);

    • total number of ratings;

    • video duration.
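A minimal sketch of the resulting per-video record, combining the three kinds of data listed above, could look as follows; all values and column names are invented for illustration only and do not reproduce the study's data.

```r
# Hypothetical per-video records: annotations + semi-automated feedback + automated logs.
# All values are invented; column names are our own illustrative choices.
videos <- data.frame(
  video_id       = c("v01", "v02", "v03", "v04"),
  course         = c("Cognitive Psychology", "Cognitive Psychology",
                     "Group Psychology", "Digital Linguistics"),
  duration_sec   = c(1260, 1980, 1500, 2400),  # automated: video duration
  visualisations = c(410, 355, 390, 210),      # automated: engagement (views)
  total_ratings  = c(48, 31, 40, 27),          # automated: number of ratings received
  n_principles   = c(11, 9, 10, 8),            # annotation: MLP principles followed
  q1_quality     = c(4.6, 4.3, 4.5, 4.1),      # mean rating: audio/video quality
  q2_clarity     = c(4.8, 4.2, 4.6, 4.0),      # mean rating: lecturer clarity
  q3_usefulness  = c(4.7, 4.1, 4.5, 4.0)       # mean rating: usefulness for the exam
)
```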

3.3 Data Analysis and Results

The first finding is that, across the 44 analysed videos, students’ average ratings do not differ significantly: all average ratings are 4 and above, and there was no rating below 4 (on a scale of 5). Initially, we ran some analyses with the Tableau software [22]. We plotted the average N. of total ratings against the total number of principles followed, broken down by course (6 courses in total). Preliminary results show some association between the average N° of MLP followed per course (above 10) and the students’ N° of ratings (Fig. 2), while the different dimensions of the ratings (clarity, quality, usefulness) are not distinguished at this stage.

Fig. 2. Preliminary results: the average N. of total ratings plotted against the total N. of principles followed, broken down by course (6 courses in total). The colour shows the average total N. of principles; details are shown per course. (Color figure online)

While this analysis mainly illustrates our sample, it also gives us some possible ideas on video ratings: videos/courses that tend to have a lower number of principles preserved also get a lower number of ratings on average. However, this is a preliminary finding, as the number of students in some courses was significantly higher than in others and we only coded videos that had more than 25 ratings. This result is also associated with another finding reported below (the association between the N. of visualisations and the N. of principles followed). In a way, the N. of ratings can be indicative of the principles followed. One observation is that, in our sample, the N. of principles followed can most of the time be generalized to the whole course (the number and types of principles followed are almost invariable across the videos of a course). This analysis also illustrates that students tend not to rate some videos if the course contains videos with a low N. of principles preserved (hence the selection of our sample).
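A rough equivalent of this Tableau aggregation, using the hypothetical `videos` frame sketched in Sect. 3.2, could look as follows; it only reproduces the shape of the analysis, not the actual figures.

```r
# Average number of ratings and of principles followed, per course
# (rough equivalent of the Tableau plot, run on the hypothetical `videos` frame).
by_course <- aggregate(cbind(total_ratings, n_principles) ~ course,
                       data = videos, FUN = mean)
plot(by_course$n_principles, by_course$total_ratings,
     xlab = "Average N. of MLP principles followed",
     ylab = "Average N. of ratings per video",
     main = "Ratings vs. principles followed, by course")
```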

From the visualisation (Fig. 3) we can see that clarity and usefulness in some videos are associated with the N° of principles followed; we can notice that when the N° of principles followed descends below 9, clarity and usefulness are rated lower.

Fig. 3. The plot visualizing different data sources analysed together. In some cases, we can observe a slight tendency of decreasing ratings for Usefulness for the Exam and Instructor Clarity when the number of principles followed falls below 9.

To better understand the correlations between different indicators, we also ran a regression analysis on the dataset in R, based on the following indicators:

  1. N. of total principles followed and video duration;

  2. N. of total principles followed and N. of total ratings;

  3. N. of total principles followed and N. of video visualisations (engagement);

  4. Finally, the correlations between the different feedback questions (quality of audio and video; clarity of the lecturer; usefulness of the lecture for the exam) and all the above indicators (video duration, N. of total ratings and N. of video visualisations (engagement)) (Fig. 4).

    Fig. 4. An overall analysis of different indicators: princ = N° of total principles; duration = duration of the video; visual = N° of visualisations; total_rat = N° of total ratings; q1, q2, q3 = the three feedback questions.
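A sketch of these pairwise analyses in R, again on the hypothetical `videos` frame, is given below; the variable names mirror the abbreviations in Fig. 4, and the calls only illustrate the procedure, not the reported results.

```r
# Pairwise correlation tests corresponding to indicators 1-3 above
# (run on the hypothetical `videos` frame; results are illustrative only).
cor.test(videos$n_principles, videos$duration_sec)    # 1. principles vs duration
cor.test(videos$n_principles, videos$total_ratings)   # 2. principles vs total ratings
cor.test(videos$n_principles, videos$visualisations)  # 3. principles vs visualisations

# 4. Feedback questions against all indicators, e.g. as a correlation matrix.
round(cor(videos[, c("n_principles", "duration_sec", "visualisations",
                     "total_ratings", "q1_quality", "q2_clarity", "q3_usefulness")]), 2)
```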

The analysis showed that there is some correlation between the N° of principles and the N° of visualisations (R = 0.37; p = 0.016) (Fig. 5). We could presume that the number of MLP followed should be at least 9 for the videos to have educational value for the students; however, given the size of the sample and the negligible variance between video ratings, further studies are needed. Also, to understand the relationship between the individual principles (out of 12) and the ratings, we will need to analyse the data per principle with a larger data-set in the future.

Fig. 5. Regression analysis based on the N° of principles and the N° of visualisations.
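The simple regression shown in Fig. 5 could be reproduced on such a per-video frame roughly as follows; this is a sketch on the hypothetical data above, not the analysis that produced the reported coefficients.

```r
# Simple linear regression of visualisations on the number of principles followed
# (sketch on the hypothetical `videos` frame; not the reported model).
fit <- lm(visualisations ~ n_principles, data = videos)
summary(fit)  # slope, R-squared and p-value for the principles-visualisations relationship
plot(videos$n_principles, videos$visualisations,
     xlab = "N. of MLP principles followed", ylab = "N. of visualisations")
abline(fit)   # fitted regression line
```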

4 Conclusions and Discussion

Generally, the most interesting preliminary finding of this exploratory, proof-of-concept study is the weak correlation between the research-based MLP and the average student ratings, which showed no significant differences (all above 4). So, our main question in a way remained open: it is also due to these small differences in video ratings that we were not fully able to answer it. However, we found that the N. of principles followed is somewhat correlated with the video visualisations. While this might mean that we need to reconsider the questions asked, it can also be explained by other factors; this finding needs further research with mixed-methods approaches, as it can have design implications for the feedback system and the respective dashboards. Aside from this, our study demonstrates the need for contextual, theory- and design-driven data to solve the validity issues of analytics data, and the need to examine data-sets closely before including them in dashboards.

Aside from field-specific findings that are relevant to TEL and LA researchers, the results of our study can potentially inform research and practice in other contexts where student feedback is used to evaluate the performance of academic staff: although our dataset was not large, we found that average ratings across contexts and designs did not change, even when the videos’ properties did change. This can potentially mean that, first of all, careful consideration of the student evaluation questions is needed. Second, we need to think about the quality and dimensionality of the data: qualitative approaches, different data sources and triangulation, as well as a careful formulation of the questions to be asked, are important. Furthermore, this study once again confirms previous research on the need for contextual data in learning analytics studies [23].

The limitations of this preliminary study include the sampling method for the coded videos, which was based on selecting videos with more than 25 ratings. Excluding videos with fewer ratings might have introduced a selection bias into our data-set. At the same time, the overall data-set for this proof-of-concept study was quite small, which naturally restricts the generalizability of the findings. However, since the nature of this research was exploratory, the indicators highlighted by the results will be used to inform the research design of the next study as well as the design of the semi-automated student feedback tool and the dashboard study. Potential scenarios are discussed in the following section.

5 Future Research

Following this study, we will first analyse a larger data-set, after which we will involve students to investigate the factors behind the ratings and the correct formulation of the rating questions. This will result in a redesign of the rating system, after which we will aggregate more data to re-evaluate it. Moreover, the outcomes of this research will be used to build learning analytics dashboards and to evaluate the actionability of our proposal, i.e. whether our approach brings valuable insights to educators. To create a path for actionable dashboards, we will also run a qualitative study involving a design session with participatory approaches to understand what indicators teachers need for evidence-based teaching practice; the aggregated visualizations will be presented to the teachers to understand whether semi-automated student feedback is informative and actionable for them. We also plan to include different sources of data, such as learner engagement, motivation and learning outcomes, to answer our next research question: What are the relationships between video design, student engagement, students’ perceived educational value and quality of the videos, and the learning outcomes?