Introduction

Videos are a frequent companion of students in school (Feierabend and Klingler 2003; Hobbs 2006). One of their strengths is that they allow for vicarious experiences that exceed the range of experiences possible in real life (Mar and Oatley 2008). In biology education, students can safely observe predators on the hunt (Smith and Reiser 2005); in history education, videos can give students an impression of past eras that are no longer available for direct access (Paschen 1994; Stoddard and Marcus 2010). Thus, videos can be a valuable source of information for students in several subjects.

With regard to this observation, it can be considered problematic that videos were frequently shown to be inferior to print when juvenile and adult recipients tried to recall television or print news (Furnham and Gunter 1985, 1987; Gunter et al. 1984, 1986). Moreover, the transience of information presented in videos is assumed to undermine the positive effects of note-taking on learning outcomes. Overall, note-taking is considered to be a relevant activity during the learning process (Kobayashi 2005); however, the continuous flow of information in videos interferes with the process of taking notes (Ash and Carlton 1953; Kobayashi 2005). Especially for audiovisual learning materials that the students usually cannot take home with them, taking notes could be a major means to preserve information for later learning. Therefore, it is important to reduce the negative effects of transience in video-based learning environments.

The non-interactive videos used in previous studies did not represent state-of-the-art video technology. In contrast to these non-interactive videos, modern video environments offer many opportunities to interact with the contents and to engage in active and self-regulated information processing. In the early 1990s, Wetzel et al. (1994) noted that interactivity in videos benefitted learning; however, there was no appraisal of the utility of specific features. Because interactive learning environments can be equipped with a broad range of different features, investigating the effects of specific features on information processing is warranted. Such features might, for example, allow for controlling the pace of the video, selecting information, sequencing the contents, or even offer the learners feedback on their performance (Domagk et al. 2010; Scheiter and Gerjets 2007). We decided to limit the functionality of the interactive video environment used in the current study to features that afford the control of pacing and the selection of information because we assume that these features address basic mechanisms of information processing. Features that allow for the control of pacing may remediate the negative effects of transience on recall and on note-taking. The benefits of interactive features that allow for the control of pacing were demonstrated in several studies using videos (Schwan and Riempp 2004; Zahn et al. 2004) and animations (Hasler et al. 2007; Höffler and Schwartz 2011).

Features that allow for the selection of information may facilitate the localization of information. The localization of information is considered a central characteristic of on-the-job reading tasks in that media are consumed to find an immediate solution to a prevalent problem (Guthrie 1988; Guthrie and Kirsch 1987). Furthermore, when using video-based information systems in informal learning scenarios at home (e.g., learning software), students are not explicitly informed on where to find the information that they are interested in. Hence, the ability to locate information in dynamic learning environments is an important prerequisite for achievement on the job and in informal learning scenarios. Therefore, it should be considered an important component of information literacy in the twenty-first century (Bruce 1999; Rotherham and Willingham 2010). Implementing chapter selection and an index might facilitate the localization of information in video-based learning environments. However, the benefits of these features should be subject to critical analysis. In two recent studies, Merkt et al. (2011) observed that a video offering only stop and browsing as basic features to control the pace of the information led to the consideration of information from more different chapters of the video than a video additionally offering chapter selection and an index as enhanced features that allowed for the selection of information.

The current study will contribute to the literature on processing state-of-the-art video technology by identifying preconditions for the efficient use of the interactive features of videos to support the localization and extraction of information from them. The theoretical framework of information processing that informed the design of the video-based learning environment will be described in the following section.

Information processing

Based on the original model of Guthrie and Mosenthal (1987), Guthrie (1988) proposed five steps of information processing in reading, namely, (1) the formation of a reading goal, (2) the selection of relevant categories (e.g., chapters in books), (3) the extraction of information from the document, (4) the integration of new information with previous information and the reading goal, and (5) recycling through these steps if the reading goal was not met. The validity of this model was shown in various studies using timetables, payrolls, or textbook chapters (Dreher and Guthrie 1990; Guthrie 1988; Guthrie and Dreher 1990). Taking a closer look at the recipients’ allocation of processing time to the single steps of this model, Dreher and Guthrie (1990) observed that especially for more complex search and integration tasks, more efficient searchers compared to less efficient searchers took relatively more time for category selection compared to the overall time they needed for the task. This finding points to the importance of category selection for the mastery of complex search tasks. Even though Guthrie’s model (1988) was developed in the context of text processing, we assume that it also applies to the processing of audiovisual materials because various studies have shown that similar processes are involved in reading and in watching a video (Magliano et al. 1996, 2001; Tibus et al. 2013).

In the two following subsections, we will take a closer look at empirical evidence concerning interactive features of videos that support the different steps of information processing as postulated by Guthrie (1988). In this process, we will focus on the selection of relevant information and the extraction of this information because these stages of information processing are most likely to benefit from the availability of the interactive features that were investigated in this study.

Extraction of information: stop and browse

When recipients extract and process information from media, they try to make sense of it. According to the Construction-Integration Model (Kintsch 1988, 1998), learners create a representation of the information explicitly mentioned in the medium (textbase) as well as a representation of this explicit information integrated with prior knowledge (situation model). This process is considered to be active and self-regulated—drawing on cognitive resources. For example, recipients draw inferences (Graesser et al. 1994; Magliano et al. 1996) or engage in self-explanations (Chi et al. 1994; McNamara et al. 2004) while reading a text or watching a video.

Because cognitive resources are limited (Sweller et al. 1998), an active elaboration of information might constrain the recipients’ attention to new information due to cognitive overload, particularly if information is transient and the recipients have no possibility to control the flow of information. Sturm (1984) termed this issue the “missing half second”. Obviously, permanent media such as print allow the recipients to stop the reading process and to engage in re-readings and lookbacks whenever necessary. Studies using think-aloud protocols and eye-tracking found such processes to be beneficial for learning with texts (Coté et al. 1998; Hyönä et al. 2002; Hyönä and Nurminen 2006). Whereas transient media such as traditional non-interactive videos do not allow for such processes, the digitization of the medium opened a new range of possibilities for recipients to interact with videos. On the most basic level, features such as stop and browsing enable the recipients to stop and rewind the video at any time. Consequently, more recent studies (Schwan and Riempp 2004; Zahn et al. 2004) have presented some first evidence for the benefits of interactive features that allow for the control of pacing when participants acquired procedural (tying nautical knots) or declarative knowledge (learning about history) from videos. Hence, self-regulated pacing of information seems to be an important prerequisite for learning, both with print (Coté et al. 1998; Hyönä et al. 2002; Hyönä and Nurminen 2006) and with videos (Schwan and Riempp 2004; Zahn et al. 2004). Research on animations pointed to comparable benefits of self-paced over system-paced presentation of information (Hasler et al. 2007; Höffler and Schwartz 2011; Mayer and Chandler 2001).

Selection of information: chapter selection and index

In general, comprehensive textbooks offer features that allow the recipients to locate specific information precisely. For example, a table of contents and an index might serve this purpose. Investigating the usefulness of such features in a textbook search task, Rouet and Coutelet (2008) observed that pupils between the third and seventh grade began with rather crude strategies such as browsing through the pages and eventually developed more sophisticated search strategies such as using a table of contents and an index. The use of these more sophisticated processing strategies was associated with reduced search time and less frequent need for the experimenter’s support to master the task (Rouet and Coutelet 2008). In line with this observation, college students used an index efficiently to locate isolated facts that were listed in an index (Yussen et al. 1993).

State-of-the-art video technology allows for the implementation of comparable features. However, two studies using complex tasks such as summarizing the information that is included in a medium (Merkt et al. 2011) did not find beneficial effects of chapter selection and an index. Instead, an enhanced video including chapter selection and an index resulted in the consideration of information from fewer chapters than a common video that merely offered stop and browsing when gathering information for an essay task. Moreover, the common video led to the naming of marginally more information than the enhanced video in one of these studies. The authors argued that even though the participants may have had appropriate strategies for the use of an index and chapter selection for the localization of isolated facts (see Rouet and Coutelet 2008; Yussen et al. 1993), they may have lacked suitable strategies to integrate the use of these features to master a more complex task. This explanation is in line with the observation that students’ mastery of search tasks declined with increasing task complexity (Dreher 1992).

However, the argumentation of Merkt et al. (2011) should be considered tentative for two reasons:

(1) Merkt et al. (2011) argued that the students lacked appropriate strategies to implement the basic skill of using a table of contents and an index into more comprehensive task assignments. With this argument, they implicitly assumed that students are capable of using the respective features for less comprehensive tasks. However, no simple search tasks comparable to the tasks used in previous studies on textbook search (Rouet and Coutelet 2008; Yussen et al. 1993) were employed. We have addressed this issue by implementing a simple search task requiring the localization of isolated facts.

(2) Even though Merkt et al. (2011) argued that the participants in their studies lacked the strategies to effectively use the features of the enhanced video, there was no manipulation or assessment of appropriate strategies when the students gathered information for the essays. We have addressed this issue by giving students training before working with either an enhanced video or a common video, hence, manipulating their awareness of suitable search strategies.

In their studies, Merkt et al. (2011) collapsed the data regarding the three essay tasks that the students had completed into one measure. Consequently, potential differences with regard to the essay tasks’ characteristics were not taken into account. However, based on a comparison of the results observed by Rouet and Coutelet (2008) and Yussen et al. (1993) on the one hand, and Merkt et al. (2011) on the other, it can be assumed that task characteristics play a crucial role when assessing the effectiveness of interactive features. Therefore, we decided to analyze the essays individually on a more fine-grained level regarding the results for the two essay tasks that were employed in the current study.

Current study

Addressing the issues brought up in the last paragraph of the previous section, we requested students to write two essays (complex task: summary vs. argument) and search for isolated facts (simple task) when working either with an enhanced or with a common video. Prior to these tasks, one half of the students received a search training that addressed suitable strategies for the use of a table of contents and an index when summarizing information from a medium; the other half received control training.

Hypotheses

Our main goal was to investigate the videos’ utility for the selection and extraction of information depending on task demands and knowledge of suitable search strategies. Based on the body of research discussed earlier, we formulated one hypothesis for each task, namely, searching for isolated facts and writing two essays about the contents of the videos.

Hypothesis 1

Studies by Rouet and Coutelet (2008) and Yussen et al. (1993) have shown that students can successfully use an index to locate specific facts in a textbook. We assume that these skills transfer to the use of comparable features in a video-based environment because several studies have pointed to similar processes when recipients process text and video (Magliano et al. 1996, 2001; Tibus et al. 2013). Consequently, we expect the enhanced video offering an index to outperform the common video not offering an index, independent of training in a simple search task that requires the localization of specific information.

Hypothesis 2

Regarding the two essays, we expect an interaction of the factors video and training. More specifically, based on the findings of Merkt et al. (2011), we expect that the availability of chapter selection and an index does not result in superior performance because the students lack appropriate strategies to implement the use of these features into more comprehensive task assignments. However, after the search training, the enhanced video should outperform the common video because the students should have acquired the necessary skills to effectively use chapter selection and an index in comprehensive essay tasks.

Method

Participants

We conducted the study in eight ninth-grade German secondary school classes. Of the 204 students overall, 74 volunteered to participate in the experiment with parental consent. Ten students had to be excluded from the sample because they scored 0 on at least one of the two essay tasks, not fulfilling the minimal requirements for these task assignments. The 64 remaining ninth graders (38 female) were on average 14.84 years of age (SD = 0.48). The students were randomly assigned to the experimental conditions. Please refer to Table 1 for the distribution of pupils across conditions. Each secondary school class received up to €100 for participation, depending on the proportion of students that participated in the entire experiment. Additionally, the best participant in each experimental condition was paid €25.

Table 1 Means (and standard deviations) for the control variables

Materials

Types of media

We used an educational video about post-war Germany after World War II (FWU 2003). In ten chapters (duration: 16 min 24 s), the video described Germany’s development between 1945 and 1950 in the light of the implementation of the Potsdam Treaty in the US-American, British, French, and Soviet zones of occupation. Based on this video, we created two different video environments that differed in the amount of interactivity that they enabled—a common video and an enhanced video. These videos were also used by Merkt et al. (2011).

The common video allowed for controlling the pace of the video. More specifically, it contained a start/stop button and buttons to fast-forward and rewind the video. Thus, it was functionally equivalent to a regular VHS system. Additionally, indicators for the time left to work with the video and the video running time were included. Log-files of the students’ use of the common video’s features were recorded.

The enhanced video allowed for control of pacing and for control of content selection. Control of pacing was implemented via a start/stop button and a slider that allowed navigation along a timeline that was implemented below the video. The timeline was divided into ten chapters that could be directly accessed by selecting the respective section of the timeline. The chapters of the video were additionally listed chronologically in the table of contents. Central key terms were listed alphabetically in an index. Chapter names and numbers were shown above the video; central key terms were shown left of the video while the video was playing. The enhanced video also included indicators for the time left to work with the video and the video running time. The usage frequency of the enhanced video’s features was recorded. The usage frequency of stop was raised by 1 each time the stop button was used. The usage frequency of the use of browsing was raised by 1 each time the slider was moved and released. The frequency of the use of chapter selection was raised by 1 each time a chapter was selected in the table of contents or in the timeline. The usage frequency of the index was raised by 1 each time a keyword was selected in the index.
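The counting rules described above can be sketched as a simple tally over log events. This is a minimal illustration only, not the study's logging code: the event names (`stop_pressed`, `slider_released`, `chapter_selected`, `index_keyword_selected`) and the assumption that the log is a flat sequence of event names are hypothetical and chosen for this example.

```python
from collections import Counter

# Hypothetical mapping from logged event names to the four features.
# The event names are assumptions; the source does not describe the log format.
EVENT_TO_FEATURE = {
    "stop_pressed": "stop",
    "slider_released": "browsing",            # one count per move-and-release
    "chapter_selected": "chapter_selection",  # table of contents or timeline
    "index_keyword_selected": "index",
}


def usage_frequencies(events):
    """Return how often each feature was used, given an iterable of event names."""
    # Start every feature at 0 so unused features still appear in the result.
    counts = Counter({feature: 0 for feature in EVENT_TO_FEATURE.values()})
    for event in events:
        feature = EVENT_TO_FEATURE.get(event)
        if feature is not None:  # ignore events that are not tracked features
            counts[feature] += 1
    return dict(counts)
```

Counting the slider only on release mirrors the rule that browsing was raised by 1 each time the slider was moved and released, so a long drag counts as a single use.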

Training

Before working with one of the two videos, the students received either search or control training. Both trainings were administered in the same computer environment as the video.

The search training was intended to teach strategies that improve the use of the table of contents and the index as these features have been shown, unexpectedly, to narrow the students’ scope when gathering information for essays (Merkt et al. 2011). To improve the students’ strategic use of these features, the search training described strategies for information processing including the use of a table of contents and an index to gather as much information as possible about a topic of interest. Based on theoretical models of information seeking that were shown to make valid predictions about successful search processes (see Guthrie 1988), information processing was described as a process in which recipients initially identify a need for information, subsequently formulate a reading goal, and then locate information in a medium. Additionally, different characteristics of media were discussed such as linearity/non-linearity, segmentation of media into different chapters accumulating most of, but not necessarily all of the information about a specific topic, and the functionality of search features such as a table of contents and an index. Following this theoretical information, the students were guided through an exemplary information gathering task in a practice trial including the following steps: (1) activation of prior knowledge, (2) getting an overview by consulting the table of contents and watching the video as a whole, (3) selecting relevant chapters in the table of contents, (4) re-watching relevant chapters while taking notes, and (5) using these notes as a starting point for subsequent search processes with the index. At each step of this practice trial, the students were asked to perform the actions that had just been described. In a final step, the importance of connecting the new information with prior knowledge was stressed.

The control training did not include any information about specific search strategies. However, to satisfy the call for fair control conditions (Levin 1994; Levin and O’Donnell 1999), the control group also practiced strategies that might be relevant for information gathering and knowledge acquisition. Thus, the importance of an activation of prior knowledge and the integration of new information with prior knowledge was stressed using the Construction-Integration Model (Kintsch 1988, 1998) as a theoretical background. After a theoretical introduction to the model describing the formation of a textbase and a situation model, the students engaged in a practice trial guiding them through an exemplary information gathering task including the following steps: (1) activation of prior knowledge, (2) getting an overview by watching the video as a whole, (3) taking notes about the video’s contents and connections to prior knowledge, and (4) watching the video again to complete the notes. Again, at each step of this practice trial, the students were asked to perform the actions that had just been described. Finally, the importance of integrating the new information with prior knowledge was stressed.

As a final step of both trainings, the students were asked to write down the different steps of information processing as described in the respective training. This final step only served as a rehearsal of the strategies and was not analyzed with regard to the quality of the students’ answers.

To sum up, both trainings pointed out the importance of activating prior knowledge and integrating newly acquired knowledge with what we already know. However, the students in the search training condition received additional information to support the localization of information for comprehensive task assignments, whereas the students in the control training condition received theoretical information about the process of understanding. We feel that this procedure satisfies the call for fair control conditions (see Levin 1994; Levin and O’Donnell 1999) because only those aspects of the search training that were assumed to result in better performance were removed from the training in the control condition.

Measures

History-related interest

History-related interest was measured using eight items from Sparfeldt et al. (2004). Because Sparfeldt et al. (2004) did not include a scale for history-related interest, the procedure was adapted for the purpose of this study. In this procedure, participants indicated their agreement with eight statements (e.g., “I can imagine majoring in history.”) on a scale ranging from 1 (totally disagree) to 6 (totally agree). The authors report alphas between 0.93 and 0.94 for the subjects mathematics, German, physics, and English.

History-related self-concept

History-related self-concept was assessed with the DISC-Grid (differential self-concept-grid; Rost and Sparfeldt 2002). In this procedure, participants indicated their agreement with eight statements (e.g., “It’s easy to have good grades in history.”) on a scale ranging from 1 (totally disagree) to 6 (totally agree). An alpha of 0.94 was reported for the subject history.

Prior knowledge

To assess prior knowledge, participants were asked to assign 24 events of recent and most recent history to one of ten 25-year-slots between 1750 and 1999. The events were selected so that they reflected the contents of local school curricula as well as historical events closely related to our experimental materials. Time was limited to 10 min. Alpha was 0.56 in the current study.
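The slot structure of this measure can be made concrete with a short sketch: ten consecutive 25-year slots starting in 1750 cover exactly the years 1750 to 1999. The function and constant names below are hypothetical, and the source does not state how the responses were scored, so the sketch covers only the mapping from a year to its slot.

```python
SLOT_START = 1750  # first year covered (from the text)
SLOT_WIDTH = 25    # years per slot (from the text)
NUM_SLOTS = 10     # ten slots span 1750-1999


def slot_index(year):
    """Return the 0-based slot for a year (0 = 1750-1774, ..., 9 = 1975-1999)."""
    if not SLOT_START <= year < SLOT_START + SLOT_WIDTH * NUM_SLOTS:
        raise ValueError(f"year {year} lies outside 1750-1999")
    return (year - SLOT_START) // SLOT_WIDTH


def slot_label(index):
    """Return a readable label such as '1925-1949' for a 0-based slot index."""
    first = SLOT_START + index * SLOT_WIDTH
    return f"{first}-{first + SLOT_WIDTH - 1}"
```

For example, an event from 1945 falls into the slot labelled 1925-1949.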

Homework motivation

To measure students’ homework motivation, we adapted one subscale of the Homework Motivation and Preference Questionnaire (HMPQ; Hong and Milgram 2000, 2001) measuring the students’ homework self-motivation. Participants rated three statements (e.g., “When I do my homework, I like to do the best work that I can”) on a 5-point scale ranging from 1 (totally disagree) to 5 (totally agree). The alpha for this scale was 0.81 in the current study.

Search task

To test whether students were capable of using an index for the localization of isolated facts, we implemented a search task. In this search task, the students were required to locate 11 facts. Each of these questions contained one key term that was included in the index of the enhanced video. Thus, the information was directly accessible via the enhanced video’s index. However, the students were not explicitly instructed to use the index. Working time was limited to 5 min to avoid ceiling effects. The students’ answers were recorded in the video environment’s log-files.

Essays

While watching the video, the students wrote two essays. Analogous to the procedure in Merkt et al. (2011), each essay was coded for the amount of information and for the distribution of this information across the videos’ chapters. For that purpose, the information that was presented in the video was represented by individual codes that could be assigned to an essay if that information was mentioned. Each code represented one piece of information that was included in the video and could only be considered once per essay. Both essays were analyzed by two independent raters who assigned codes to the essays. Disagreements between the raters were resolved by discussion. Both raters were blind to condition.
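The two essay measures can be illustrated with a small sketch. It assumes, as a simplification not confirmed by the source, that the distribution measure is the number of different chapters from which coded information was drawn; the code identifiers and the `code_to_chapter` mapping are hypothetical.

```python
def score_essay(assigned_codes, code_to_chapter):
    """Return (amount of information, number of chapters covered) for one essay.

    assigned_codes: codes the raters agreed on for this essay (may contain repeats).
    code_to_chapter: hypothetical mapping from each code to its video chapter.
    """
    # Each code is considered at most once per essay, so duplicates are dropped.
    unique_codes = set(assigned_codes)
    # Assumed operationalization: distribution = count of distinct chapters.
    chapters = {code_to_chapter[code] for code in unique_codes}
    return len(unique_codes), len(chapters)
```

With this reading, an essay that mentions the same piece of information twice gains nothing on the amount measure, and an essay drawing on a single chapter scores 1 on the distribution measure regardless of how much it extracts from that chapter.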

For Essay 1, the participants were asked to describe how the Potsdam Treaty was implemented in the US-American, British, French, and Soviet occupation zones. Overall, 48 codes representing the information from the video were included in the coding scheme and could be assigned to the participants’ essays. The essay required the participants to gather information from seven different chapters of the video conducting a comprehensive information search. Twenty-eight of 48 pieces of information that could be coded as relevant for the essay task were included in four chapters that were titled “The American Occupation Zone”, “The British Occupation Zone”, “The French Occupation Zone”, and “The Soviet Occupation Zone”. Thus, there was an explicit overlap between the wording of the task and the names of these chapters for a substantial amount of information that could have been extracted from the video. Additionally, eight more pieces of information were included in two chapters titled “Reparations and Turn in American Politics” and “Western Zone and the Marshall Plan”. Even though these titles did not explicitly overlap with the wording of the question, a reference between the question and these titles could be made. Hence, access to the majority of information that was considered to be relevant for the task was supposed to be facilitated by the availability of chapter selection in the enhanced video condition (i.e., chapter selection facilitated access to 36 of 48 pieces of information that were represented by codes in the coding scheme).

For Essay 2, the participants were asked to appraise whether the division of Germany was predictable from the events occurring between 1945 and 1949. Overall, 53 codes representing information from the video that could be used to back or oppose the hypothesis were included in the coding scheme and could be assigned to the participants’ essays. This information was distributed across six different chapters of the video. To write the essay, the students were required to draw inferences based on the video’s explicit contents in order to back their argument. More specifically, the students had to decide whether the information from the video supported or opposed the hypothesis that the division of Germany was predictable from the events occurring between 1945 and 1949. Even though five of 53 pieces of information were included in a chapter titled “Emergence of two German States”, the large majority of the relevant information was not included in chapters that could be directly or indirectly linked to the wording of the essay task.

For the amount of information mentioned in the essays, inter-rater correlations were 0.97 and 0.92 for Essay 1 and 2, respectively, both p < 0.001. For the distribution of this information across the videos’ chapters, inter-rater correlations were 0.94 and 0.84 for Essay 1 and 2, respectively, both p < 0.001.

From the description of the two essays, it becomes obvious that the essays differ in their demand characteristics, because Essay 2 required the participants to draw inferences about the video’s contents, whereas Essay 1 was an information collection task that did not require inferences. Hence, the two essays differed with regard to the cognitive skills required for their mastery. Therefore, an analysis of the two essays on the level of individual essays is appropriate.

Procedure

The study was carried out in two regular history lessons (periods) and as a homework assignment. In the first lesson, individual prerequisites (prior knowledge, interest and self-concept in history, homework motivation) that might influence the study’s outcome were assessed. Then the students were given a homework assignment which consisted of training and a video. This homework had to be finished within 1 week. The students were allowed to do the homework whenever and wherever they wanted. However, they were told to finish all the tasks in a single session.

Starting on the homework assignment, the students first did either the search training or the control training. Then they wrote two essays about the implementation of the Potsdam Treaty (Essay 1) and about whether the division of Germany was predictable from the events between 1945 and 1949 (Essay 2) while watching either the enhanced or the common video about “Post-War Germany” between 1945 and 1950. After finishing the essays, the students completed the search task. After 5 min in the search task, the video automatically stopped. Due to necessary changes to the search task after assessing the first ninth grade class, results for the search task are only available for the remaining seven ninth grade classes. The students’ navigation through the video was recorded via log-files.

The second lesson was used to collect the homework assignments. Moreover, the students had the chance to indicate whether there were any technical difficulties. This information was used to determine whether the data of the students could be used for further analyses.

Results

All post hoc comparisons reported in this section were Bonferroni-adjusted.

Control variables

We assessed several variables to control their influence on the dependent measures. Please refer to Table 1 for descriptive data. To test for differences between the experimental conditions, we conducted a 2 × 2-factorial ANOVA with the factors video and training for each variable.

For the variables interest in history, self-concept in history, and prior knowledge, there were no main effects of video and training, and no interaction video × training, all p > 0.10.

For homework motivation, there was an interaction video × training, F(1,60) = 5.83, p = 0.019, ηp² = 0.09. In the control training condition, participants working with the enhanced video (M = 4.05, SD = 0.75) reported marginally more homework motivation than participants working with the common video (M = 3.48, SD = 1.02), p = 0.080. There were no significant differences regarding the students’ homework motivation between the enhanced video (M = 3.16, SD = 1.01) and the common video (M = 3.71, SD = 0.93) in the search training condition, p = 0.106. There also were no main effects of video, F < 1, and training, F(1,60) = 2.01, p = 0.162, ηp² = 0.03. As homework motivation did not correlate with any dependent measure, it was not included as a covariate in further analyses.
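As a reading aid, the partial eta squared values reported with each F test can be recovered from the F value and its degrees of freedom using the standard identity: partial eta squared equals F times the effect degrees of freedom, divided by that product plus the error degrees of freedom. The sketch below is not part of the original analysis; the function name is hypothetical and it only restates that identity.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F statistic and its degrees of freedom.

    Uses SS_effect / (SS_effect + SS_error) rewritten in terms of F:
    (F * df_effect) / (F * df_effect + df_error).
    """
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Interaction video x training for homework motivation, F(1,60) = 5.83
interaction = partial_eta_squared(5.83, 1, 60)
# Main effect of training for homework motivation, F(1,60) = 2.01
training = partial_eta_squared(2.01, 1, 60)
```

Rounded to two decimals, these give 0.09 and 0.03, matching the effect sizes reported for homework motivation.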

Search task

The search task served as an indicator for the students’ ability to locate isolated facts in the video. In the following section, we will first give an analysis of the search outcomes before we explore students’ usage patterns and examine the relationship between usage patterns and search outcomes.

Search outcomes

A 2 × 2-factorial ANOVA with the factors video and training revealed that the participants working with the enhanced video found more facts (M = 5.19, SD = 2.17) than the participants working with the common video (M = 2.50, SD = 1.96), F(1,46) = 19.46, p < 0.001, ηp² = 0.30. The main effect of training and the interaction video × training were not significant, both F < 1. Please refer to Table 2 for more descriptive data concerning the search task.

Table 2 Means (and standard deviations) for the outcome variables

Usage patterns

To compare the usage of the enhanced video’s search features in the different training conditions, a 2 × 3-factorial ANOVA with the factors training and feature (browsing vs. chapter selection vs. index) was performed, with feature as a within-subjects factor. Even though the stop functionality was used in all the experimental conditions (see Table 3), we did not include it as a dependent variable in the ANOVA because we were mainly interested in the usage of the search features. The analysis yielded a main effect of feature, F(2,48) = 4.45, p = 0.017, ηp² = 0.16, with browsing (M = 6.27, SD = 4.46) being used more frequently than the index (M = 2.92, SD = 3.52), p = 0.045. The frequency of use did not differ between chapter selection (M = 3.50, SD = 3.90) and the index, p = 1.000, or between browsing and chapter selection, p = 0.119. There was no main effect of training, F < 1, and no interaction training × feature, F < 1. Thus, the search training had no differential effect on the students’ usage of the enhanced video’s search features in the search task.

Table 3 Means (and standard deviations) for the usage of the enhanced video’s features

The relationship between usage patterns and search outcomes

As a last step, we explored whether the search outcomes were related to the use of any of the enhanced video’s search features. To consider potential differences between the different training conditions, we correlated the search outcomes with usage patterns for each training individually.

When the students used the enhanced video after engaging in the control training, there was a positive correlation between the search outcomes and the frequency of the use of the index, r = 0.59, p = 0.020. However, there were no correlations for the use frequency of browsing and chapter selection, both p > 0.10. When the students used the enhanced video after engaging in the search training, there was no correlation between the use frequency of browsing and the search outcomes, r = 0.16, p = 0.693. However, there was a marginally significant positive correlation between the search outcomes and the use frequency of the index, r = 0.60, p = 0.053, and a negative correlation between the search outcomes and the frequency of the use of chapter selection, r = −0.61, p = 0.048.

Essays

While watching either the enhanced or the common video, the students wrote two essays that required gathering information (Essay 1) and drawing inferences to back an argument (Essay 2). Table 2 gives a detailed overview of the descriptive data with reference to the essays.

Essay 1

For Essay 1, the participants were asked to describe the implementation of the Potsdam Treaty in the four occupation zones, which required them to gather information from the video. Concerning the amount of information mentioned in Essay 1, there was a marginal effect of video, F(1,60) = 3.18, p = 0.079, ηp² = 0.05, with the enhanced video (M = 12.73, SD = 5.85) outperforming the common video (M = 10.10, SD = 7.47). However, this main effect was qualified by a marginally significant interaction video × training, F(1,60) = 3.98, p = 0.051, ηp² = 0.06, with the enhanced video (M = 15.73, SD = 5.01) outperforming the common video (M = 9.60, SD = 6.22) after the search training, p = 0.012, whereas there were no differences between the enhanced video (M = 10.22, SD = 5.41) and the common video (M = 10.56, SD = 8.66) after the control training, p = 0.879. The main effect of training, F(1,60) = 1.96, p = 0.166, ηp² = 0.03, did not reach statistical significance. This pattern emerged even more clearly in a contrast analysis following the procedure proposed by Niedenthal et al. (2002). In this analysis, we used the contrast A (−1, −1, −1, +3, for common video and control training, common video and search training, enhanced video and control training, and enhanced video and search training, respectively) to express the hypothesis that the participants using the enhanced video after the search training would outperform the participants in all the other conditions. As there were four conditions, we additionally used the orthogonal contrasts B (+1, −1, 0, 0) and C (+1, +1, −2, 0) to capture residual variance between the conditions. The three contrasts were entered into a multiple regression analysis with the amount of information reported in Essay 1 as a dependent variable.
In this analysis, contrast A expressing our hypothesis was statistically significant, F(1,60) = 8.59, p = 0.005, R² change = 0.13, whereas contrasts B and C, entered in the regression analysis as a set, did not reach statistical significance, F < 1. Thus, this analysis confirms that the enhanced video combined with the search training led to providing more information than the other conditions that did not differ from each other (see Fig. 1).
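The contrast coding used in this analysis can be illustrated with a short sketch. It checks that the three sets of weights each sum to zero and are mutually orthogonal, and it applies contrast A to the four cell means reported for Essay 1. The variable and function names are assumptions introduced for illustration. The value computed from the unweighted cell means is only descriptive; it does not reproduce the regression on individual scores reported above.

```python
# Contrast weights in the order: common/control, common/search,
# enhanced/control, enhanced/search (as given in the text).
contrasts = {
    "A": (-1, -1, -1, 3),
    "B": (1, -1, 0, 0),
    "C": (1, 1, -2, 0),
}


def dot(u, v):
    """Sum of elementwise products of two equally long sequences."""
    return sum(a * b for a, b in zip(u, v))


def is_orthogonal_set(weights):
    """True if every contrast sums to zero and all pairs have a zero dot product."""
    names = sorted(weights)
    sums_to_zero = all(sum(weights[n]) == 0 for n in names)
    pairwise_zero = all(
        dot(weights[a], weights[b]) == 0
        for i, a in enumerate(names)
        for b in names[i + 1:]
    )
    return sums_to_zero and pairwise_zero


# Cell means for the amount of information in Essay 1, same order as the weights.
cell_means = (10.56, 9.60, 10.22, 15.73)
contrast_a_value = dot(contrasts["A"], cell_means)

print(is_orthogonal_set(contrasts))
print(round(contrast_a_value, 2))
```

A positive value for contrast A is in the direction of the hypothesis that the enhanced video combined with the search training exceeds the other three conditions.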

Fig. 1 Amount of information reported in Essay 1. Thin lines represent standard deviations

Besides the amount of information reported, we also assessed the distribution of this information across the chapters of the video. A 2 × 2-factorial ANOVA with the factors video and training revealed a marginal effect of video, F(1,60) = 3.37, p = 0.071, ηp² = 0.05, with the enhanced video (M = 4.91, SD = 1.35) leading to reporting information from a larger number of chapters than the common video (M = 4.16, SD = 1.83). The main effect of training and the interaction video × training were not significant, both F < 1.

Essay 2

Essay 2 required an appraisal of the hypothesis that the division of Germany was predictable from the events between 1945 and 1949. Overall, the students’ performance in this essay was very low. On average, only 3.75 of 53 possible arguments were named (SD = 2.71). Due to this clear indication of a floor effect, we decided not to analyze the data concerning this essay. Possible reasons for this floor effect are considered in the Discussion.

Usage patterns

While the participants wrote the essays, log-files of their use of the enhanced video’s features were recorded. In the following section, we will explore how the search training influenced the use of the enhanced video’s search features. As both essays were written during one 90-min interval of a homework assignment, the use of the enhanced video’s features while writing the essays cannot be traced back to the individual essays. Therefore, we had to collapse the log-file data regarding the use of the enhanced video’s features while writing the two essays into one measure. Please refer to Table 3 for descriptive data about the usage patterns.

To test whether the search training influenced the usage frequency of browsing, chapter selection and the index in the enhanced video condition, we performed a 2 × 3-factorial ANOVA with the factors training and feature (browsing vs. chapter selection vs. index), with feature as a within-subjects factor. The ANOVA revealed a main effect of feature, F(1.49,46.08) = 12.51, p < 0.001, ηp² = 0.29. As the sphericity assumption was violated, the Greenhouse-Geisser correction for degrees of freedom was used. Post hoc comparisons revealed that the index (M = 2.82, SD = 4.77) was used less frequently than chapter selection (M = 6.09, SD = 5.83), p = 0.022, and browsing (M = 10.85, SD = 10.10), p = 0.001. Additionally, chapter selection was used less frequently than browsing, p = 0.017. There also was a marginal effect of training, with the interactive features being used more frequently after the search training (M = 24.87, SD = 15.92) than after the control training (M = 15.50, SD = 12.65), F(1,31) = 3.55, p = 0.069, ηp² = 0.10, indicating that the search training led to more search processes than the control training. The interaction of the factors training and feature was not significant, F < 1, showing that the search training led to an overall increase in search processes and did not just foster the use of individual search features.
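The fractional degrees of freedom reported for the main effect of feature follow from the Greenhouse-Geisser correction, which multiplies both the effect and the error degrees of freedom of the within-subjects test by an estimate epsilon. The sketch below illustrates this arithmetic. The epsilon value is not reported in the text; the value used here (about 0.743) is an assumption inferred from the reported error degrees of freedom, and the sample size of 33 is likewise inferred from F(1,31) for the training effect.

```python
def greenhouse_geisser_df(n_levels, n_subjects, n_groups, epsilon):
    """Corrected (effect, error) degrees of freedom for a within-subjects factor
    in a mixed design with one between-subjects grouping factor."""
    df_effect = (n_levels - 1) * epsilon
    df_error = (n_levels - 1) * (n_subjects - n_groups) * epsilon
    return df_effect, df_error


# Assumed values: three features, 33 participants in two training groups,
# and epsilon inferred as 46.08 / 62, rounded to four decimals.
assumed_epsilon = 0.7432
df_effect, df_error = greenhouse_geisser_df(3, 33, 2, assumed_epsilon)
print(round(df_effect, 2), round(df_error, 2))
```

With an epsilon of 1, the function returns the uncorrected degrees of freedom (2 and 62 under the same assumptions).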

Discussion

In the current study, we investigated to what extent an interactive video environment that offered students the opportunity to control the pace of the information and the selection of the contents benefitted students’ mastery of different tasks that required the localization and extraction of information from videos. We will first discuss the interpretation of the results before we offer some conclusions of the study.

Interpretation of the results

Two more recent studies by Merkt et al. (2011) failed to find a superiority of an enhanced video offering chapter selection and an index over a common video that merely offered basic features of interactivity. Instead, the common video led to reporting the same amount of information or marginally more information from more different chapters than the enhanced video, indicating a broader scope in the common video condition. The authors argued that the students in these studies lacked appropriate strategies to implement the use of chapter selection and an index in more comprehensive task assignments. The results of the current study back this explanation.

This study investigated scenarios in which an enhanced video, offering enhanced search functionality in the form of chapter selection and an index, benefits the localization and extraction of information compared to a common video not offering such features. For this purpose, we varied task demands (localization of isolated facts versus summarizing the video’s content versus drawing inferences based on the video’s content) and students’ knowledge of suitable search strategies using chapter selection and an index to gather information. We observed that different task demands had differential effects on the utility of the enhanced video’s features.

Concerning the localization of isolated facts, the enhanced video offering chapter selection and an index outperformed the common video independent of training. In line with our hypothesis, the students were able to benefit from the enhanced video’s search functionality when the task required the naming of isolated facts (Hypothesis 1). Moreover, the use of the index was positively correlated with search outcomes, replicating the observations by Yussen et al. (1993), who investigated search processes in textbooks. Because the enhanced video outperformed the common video independent of training, we conclude that at least some ninth graders are aware of successful strategies for locating isolated facts using an index. This is in line with research by Rouet and Coutelet (2008). However, the use of chapter selection was negatively correlated with search outcomes in the search training condition. A possible explanation is that chapter selection was less precise than the index with regard to the search task: the wording of the questions used in the search task could be explicitly mapped to key terms that were included in the index, whereas no such mapping was possible for chapter selection. In light of the time restrictions (time limit 5 min), a heavy reliance on chapter selection might have hampered the students’ search performance, especially when more suitable search strategies (i.e., using the index) were activated by the search training.

With regard to the essay tasks, Hypothesis 2 was partially supported. Concerning the amount of information mentioned in Essay 1, we observed the predicted pattern. More specifically, the students required search training to benefit from the enhanced video’s search features when engaging in a comprehensive search assignment. This observation was corroborated by the search training having led to a marginally more frequent use of the enhanced video’s search features. Without search training, the enhanced video was comparable to the common video condition regarding the amount of information mentioned, replicating the findings of Study 1 reported by Merkt et al. (2011). However, concerning the distribution of this information across the videos’ chapters, Hypothesis 2 was not supported. More specifically, independent of training, the use of the enhanced video led to considering information from marginally more chapters than the use of the common video. This finding contradicts the observation by Merkt et al. (2011) that without search training, students lacked the necessary skills to implement the use of chapter selection and an index into a comprehensive task assignment. However, this study differs from Merkt et al. (2011) in that those students who did not engage in the search training received a control training stressing the importance of prior knowledge, whereas the students in the studies by Merkt et al. (2011) did not receive any training. Apparently, the activation of prior knowledge in the control training sufficed to make the students consider information from a larger number of chapters; however, only training of appropriate search strategies led to a translation of this broader consideration of chapters into actually reporting more information.
This observation resembles the finding by Dreher (1992) that some students were capable of selecting appropriate sections of a text but failed to extract relevant information when engaging in a complex search task with textbooks.

Due to a floor effect, the data of Essay 2 were not analyzed. To understand this floor effect, remember that Essay 2 required students’ inferences about the videos’ contents to back their argumentation about whether the division of Germany was predictable from the events happening between 1945 and 1949. Whereas this task should address higher order cognitive skills than merely gathering information (Essay 1), it might also differ in its incentive to engage in comprehensive search processes. From research on persuasion, it is known that merely one argument from a likable person can change an opinion when personal involvement with the topic is low (Chaiken 1980). Given this observation, it is not surprising that students backed their argumentation with only a few arguments. Furthermore, because Essay 2 was written after Essay 1, the students might have invested less effort into the collection of arguments from the video because they had already engaged in a long and comprehensive homework assignment.

Conclusions

Overall, the results of the current study show that the implementation of chapter selection and an index can indeed improve students’ search processes with videos. However, the utility of such features without previous training is limited to situations in which only a small amount of information is required to achieve a (learning) goal. If the task is more comprehensive, students need instructional support (i.e., search training) in order to benefit from interactivity while taking notes from a video. Because the implications that we can derive from the current data are limited to the extraction of relevant information while the medium is still available, we cannot make definite statements about potential learning outcomes. However, from research on note-taking, it is known that efficient note-taking is associated with better learning outcomes (Kobayashi 2005). Furthermore, using notes as an external store for knowledge and reviewing these notes was also shown to benefit learning (Hartley and Davies 1978; Kiewra 1987). Therefore, it is reasonable to assume that taking more complete notes should result in better learning outcomes. However, this assumption should be subject to more empirical testing before making definite statements about the effects of the video-based learning environments employed in this study on actual learning outcomes.

Further, the current study sheds light on the use of interactive videos on the job. Remember that the localization of information is an important objective of on-the-job reading tasks (Guthrie 1988; Guthrie and Kirsch 1987). If workers need to locate relevant information in video-based information systems, we can assume that they would benefit from the availability of chapter selection and an index that allows for direct access to the relevant sections. However, efficient use of such features depends on the users’ ability to employ appropriate strategies. In this respect, it was shown that the localization of isolated information should be possible without further instructional support, whereas training is necessary to prepare users for the mastery of more complex task assignments. Demonstrating that the use of new technologies for learning purposes does not always intrinsically foster better performance, but requires training, blends nicely with the observation that technology needs to be implemented in a didactically sound fashion. For this purpose, the students’ individual prerequisites that they bring to the learning environment, as well as the pedagogical objectives that the instructor pursues, should be considered because a neglect of these variables might result in suboptimal learning processes (Govindasamy 2002; Jacobson and Spiro 1994).

The availability of appropriate strategies for using technology efficiently should be an important objective of education in the twenty-first century simply because the competent use of information, which is often referred to as information literacy (Bruce 1999; Rotherham and Willingham 2010), starts with the localization of relevant information. Therefore, students’ education should include enabling them to locate relevant information in video-based learning environments, all the more so because the corresponding strategies of using chapter selection and an index should transfer to text-based information systems. Beyond facilitating the localization and extraction of information from the learning materials, interactive videos could also provide support for deeper comprehension. However, it is questionable to what extent features such as chapter selection and an index, which were the only features that were systematically varied in the current study, can serve this purpose or to what extent additional functionality is necessary to reach higher levels of cognitive achievement. According to the classification of learning objectives proposed by Bloom (1956), we assume that the benefits of the interactive features that were systematically varied in the current study are limited to lower level processes that include handling and organizing details that are included in a medium. However, the localization and extraction of information from a medium has to be considered an important prerequisite for deeper comprehension because learners cannot engage in higher level cognitive processes if they fail to locate and extract relevant information from a medium.

We decided to analyze the data of the current study with a quantitative approach without taking into account indicators of deeper understanding such as narrative coherence or argumentative quality. Because we assume that the interactive features investigated in the current study do not intrinsically support deeper comprehension, corresponding measures would reflect the students’ individual prerequisites instead of the effects of interactivity. However, future research should also consider deeper comprehension. For this purpose, instructional designers and researchers need to consult literature on comprehension and derive appropriate features that could influence how the recipients process information. Building on previous research, prompts that ask students to engage in self-explanations (see Berthold et al. 2009; Chi et al. 1994; Schworm and Renkl 2007) or cues that direct the learners’ attention to specific aspects of a video (see de Koning et al. 2010; Fischer and Schwan 2010; Jarodzka et al. 2012, 2013) could be a promising way to go.