The widespread use of advanced learning technologies (ALTs) poses numerous challenges for learners of all ages. Learning with these nonlinear, multi-representational, open-ended learning environments typically involves the use of numerous self-regulatory processes, such as planning, cognitive strategies, metacognitive monitoring and regulation, emotions, and motivation. Unfortunately, learners do not always monitor and regulate these processes during learning with ALTs, which limits their effectiveness as educational tools for enhancing learning about complex and challenging topics. Metacognition and self-regulation comprise a set of key processes that are critical for learning about conceptually rich domains with ALTs, such as hypermedia, intelligent tutoring systems, simulations, multi-agent tutoring systems, serious games, and other hybrid systems. We argue that learning with ALTs involves a complex set of interactions between cognitive, affective, metacognitive, and motivational processes. Although we acknowledge the importance of motivation in learning, it is not a process that we will be discussing in this chapter given our current measurement of it, and we will therefore focus on cognitive, affective, and metacognitive (CAM) processes.

Recent interdisciplinary research provides evidence that learners of all ages struggle when learning about conceptually rich domains with ALTs (Aleven, Roll, McLaren, & Koedinger, 2010; Azevedo, Johnson, Chauncey, & Graesser, 2011; Biswas, Jeong, Kinnebrew, Sulcer, & Roscoe, 2010; Greene, Moos, & Azevedo, 2011). In brief, this research indicates that learning about conceptually rich domains with ALTs is particularly difficult because it requires students to continuously monitor and regulate several key aspects of their learning. For example, regulating one’s learning involves the following: analyzing the learning context, setting and managing meaningful learning subgoals, determining which learning and problem-solving strategies to use, assessing whether selected learning strategies are effective in meeting the learning subgoals, monitoring and making accurate judgments regarding one’s emerging understanding of the topic and contextual factors, and determining whether there are aspects of the learning context that could be used to facilitate learning. During self-regulated learning (SRL), students need to deploy several metacognitive processes to determine whether they understand the material. Students must also consider whether it is necessary for them to modify their plans, goals, strategies, and efforts in relation to dynamically changing contextual conditions. Further, students must monitor, modify, and adapt to fluctuations in their motivational and affective states, and determine how much social support (if any) they may need to perform a task. Depending on the learning context, instructional goals, perceived task performance, and progress made toward achieving the learning goal(s), students may also need to modify certain aspects of their cognition, affect, metacognition, and motivation. As such, we argue that self-regulation plays a critical role in learning with ALTs.

In this chapter, we provide an overview of the theoretical SRL model that serves as the foundation of our research, along with its fundamental assumptions. We then describe how features of a multi-agent, intelligent hypermedia system (i.e., MetaTutor) support learners in regulating several aspects of their learning. We also provide specific examples of key monitoring and regulatory processes used prior to, during, and following learning with MetaTutor. In addition, we provide extensive evidence from five different types of trace data (i.e., concurrent think-alouds, eye-tracking, note-taking and drawing, log files, and facial recognition) and indicate how they contribute to our understanding of SRL. Finally, we present several implications for future research on ALTs that focus on metacognition and SRL.

Self-Regulated Learning as an Event: Theoretical Framework

SRL frameworks, models, and theories attempt to explain how cognitive, affective, metacognitive, and motivational processes and contextual factors influence the learning process (Boekaerts, 2011; Pintrich, 2000; Winne, 2001; Winne & Hadwin, 1998, 2008; Zimmerman, 2000, 2008; Zimmerman & Schunk, 2011). Although there are important differences between various theoretical definitions, self-regulated learners are generally characterized as active and efficient at managing their own learning through monitoring and strategy use (Boekaerts, Pintrich, & Zeidner, 2000; Butler & Winne, 1995; Efklides, 2011; Greene & Azevedo, 2007, 2009; Pintrich, 2000; Winne, 2001; Winne & Hadwin, 1998, 2008; Zimmerman & Schunk, 2001, 2011). Students are self-regulated to the degree that they are metacognitively, motivationally, and behaviorally active participants in their learning (Zimmerman, 1989). The goal of this section is to briefly describe the theoretical basis underlying our research on MetaTutor to understand the temporal dynamics of SRL processes deployed during learning with the system.

SRL involves actively constructing an understanding of a topic or domain, such as human biology (e.g., body systems), by creating subgoals; using learning strategies; monitoring and regulating certain aspects of cognition, behavior, emotions, and motivation; and modifying behavior to achieve the desired goal(s) (see Boekaerts et al., 2000; Pintrich, 2000; Zimmerman & Schunk, 2001). Though this is a common definition of SRL, the literature includes multiple theoretical perspectives that make different assumptions and focus on different constructs, processes, and phases (see Azevedo et al., 2010; Dunlosky & Lipko, 2007; Metcalfe & Dunlosky, 2008; Pintrich, 2000; Schunk, 2008; Winne & Hadwin, 2008; Zimmerman & Schunk, 2011). For present purposes, we further specify SRL as a concept superordinate to metacognition that incorporates both metacognitive monitoring (i.e., knowledge of cognition or metacognitive knowledge) and metacognitive control (i.e., involving the skills associated with the regulation of metacognition), as well as processes related to manipulating contextual conditions and planning for future activities within a learning episode. Ultimately, SRL is based on the assumption that learners exercise agency by consciously monitoring and intervening in their learning.

Our research is theoretically influenced by contemporary models of SRL that emphasize the temporal deployment of these processes during learning (Azevedo, Moos et al., 2010). As such, multiple measures must be used to detect, track, and model learners’ use of cognitive, affective, and metacognitive (CAM) processes during learning. Underlying our approach is Winne and Hadwin’s SRL model (1998, 2008), which proposes that learning occurs in four basic phases: (1) task definition, (2) goal setting and planning, (3) studying tactics, and (4) adaptations to metacognition. The Winne and Hadwin model emphasizes the role of metacognitive monitoring and control as the central aspects of learners’ ability to acquire complex material across different instructional contexts (e.g., using a multi-agent system to track and foster SRL) in that information is processed and analyzed within each phase of the model. Recently, Azevedo and colleagues (Azevedo, Feyzi-Behnagh, Duffy, Harley, & Trevors, 2012a, Azevedo, Landis et al., 2012b, Azevedo, Bouchet et al., 2012c; Azevedo & Feyzi-Behnagh, 2011; Azevedo, Cromley, Moos, Greene, & Winters, 2011; Azevedo & Witherspoon, 2009) extended this model and provided extensive evidence regarding the role and function of several dozen CAM processes during learning with ALTs (e.g., using an intelligent, hypermedia multi-agent system).

In brief, the following assumptions are associated with the current model. First, successful learning involves individuals monitoring and controlling (i.e., regulating) key CAM processes. Second, SRL is context-specific and successful learning may require a learner to increase/decrease the use of certain key SRL processes at different points in time. Third, a learner’s ability to monitor and control both internal (e.g., prior knowledge) and external factors (e.g., changing dynamics of the learning environment, relative utility of an agent’s prompt) is crucial. Fourth, a learner’s ability to make adaptive, real-time adjustments to internal and external conditions, based on accurate judgments of their use of CAM processes, is fundamental to successful learning. Finally, certain CAM processes (e.g., interest, self-efficacy, task value) are necessary to motivate a learner to engage and deploy appropriate CAM processes during learning and problem solving.

An important strength of this model is that it deals specifically with the person-in-context perspective and postulates that CAM processes occur throughout learning with a multi-agent system, which is useful in examining when and how learners regulate learning. The focal macro-level processes discussed in this chapter are reading, metacognitive monitoring, and learning strategies. Reading behavior is critical since it is the most important activity related to acquiring, comprehending, and using content knowledge related to a particular topic. During reading, learners need to monitor and regulate several key processes, such as the following: (1) selecting relevant content (i.e., text and diagrams) based on their current subgoal; (2) spending appropriate amounts of time on each page, depending on their relevance regarding their current subgoal; (3) deciding when to switch or create a new subgoal; (4) making accurate assessments of their emerging understanding; (5) conceptually connecting content with prior knowledge; (6) adaptively selecting, using, and assessing the effectiveness of several learning strategies (e.g., rereading, coordinating informational sources, summarizing, making inferences); and (7) making adaptive changes to behavior based on a variety of external (e.g., quiz scores, quality and timing of agents’ prompts and feedback) and internal sources (e.g., affective experiences, including both positive and negative emotions, perception of task difficulty). In sum, SRL involves the continuous monitoring and regulation of CAM processes during learning with multi-agent, intelligent hypermedia systems (e.g., MetaTutor).

MetaTutor: An Adaptive, Multi-agent Hypermedia Learning System for Biology

MetaTutor is a multi-agent, adaptive hypermedia learning environment, which presents challenging human biology science content. The primary goal underlying this environment is to investigate how ALTs can adaptively scaffold SRL and metacognition within the context of learning about complex biological content (Azevedo, Feyzi-Behnagh et al., 2012). MetaTutor is grounded in a theory of SRL that views learning as an active, constructive process whereby learners set goals for their learning and then attempt to monitor, regulate, and control their cognitive and metacognitive processes in the service of those goals (Winne & Hadwin, 2008). More specifically, MetaTutor is based on several theoretical assumptions of SRL that emphasize the role of cognitive, metacognitive (where metacognition is conceptualized as being subsumed under SRL), motivational, and affective processes (Pekrun, 2006; Pintrich, 2000; Winne & Hadwin, 2008; Zimmerman & Schunk, 2011). Moreover, learners must regulate their cognitive and metacognitive processes in order to integrate multiple informational representations available from the system. Although all students have the potential to regulate, few students do so effectively, possibly due to inefficient or insufficient cognitive or metacognitive strategies, knowledge, or control.

MetaTutor is both (1) a learning tool designed to teach and train students to self-regulate (e.g., by modeling and scaffolding metacognitive monitoring, facilitating the use of effective learning strategies, and setting and coordinating relevant learning goals), and (2) a research tool used to collect trace data on students’ CAM processes deployed during learning.

As a learning tool, MetaTutor has a host of features that embody and foster SRL (see Fig. 28.1). These include four pedagogical agents (PAs), which guide students through the learning session and prompt students to engage in planning, monitoring, and strategic learning behaviors. In addition, the agents can provide feedback and engage in a tutorial dialogue in order to scaffold students’ selection of appropriate subgoals, accuracy of metacognitive judgments, and use of particular learning strategies. The system also allows learners to express metacognitive monitoring and control processes through a palette of actions (see Fig. 28.1). For example, learners can click on a button to indicate that they want to make a statement about their understanding of a page and then indicate on a scale that their understanding is poor. They can also indicate that they want to summarize the content of that page and then freely type their summary in a text box.

Fig. 28.1 Annotated screenshot of the MetaTutor interface

Additionally, MetaTutor collects information from user interactions to provide adaptive feedback on the deployment of students’ SRL behaviors. For example, students can be prompted to self-assess their understanding (i.e., system-initiated judgment of learning [JOL]) and are then administered a brief quiz. Results from the self-assessment and quiz allow PAs to provide adaptive feedback according to the calibration between students’ confidence in their comprehension and their actual quiz performance.

The system’s interface layout also supports SRL processes. As depicted in Fig. 28.1, an embedded palette provides students with the opportunity for initiating an interaction with the system according to the SRL process selected (e.g., take notes). Overall, in line with its theoretical foundations, MetaTutor supports and fosters a variety of SRL behaviors, including prior knowledge activation, goal setting, evaluation of learning strategies, integrating information across representations, content evaluation, summarization, note-taking, and drawing. Importantly, it also scaffolds specific metacognitive processes, such as judgments of learning, feelings of knowing, and monitoring progress toward goals (Feyzi-Behnagh, Khezri, & Azevedo, 2011).

There are some aspects of the espoused theoretical models of SRL yet to be implemented. Initially, the theoretical and empirical foci have been on cognitive, metacognitive, and behavioral learning processes. Thus, this ALT does not extensively incorporate the motivational and affective dimensions of SRL into its design. Affective-related elements are currently collected by the system and analyzed following learners’ interaction with MetaTutor. Moving forward, the varieties and regulation of learners’ affective processes, the affective qualities of human-agent interaction, and how the system and learners’ self-regulation influence the activation, awareness, and motivation will be areas of interest with important implications for SRL theory and instructional design.

Self-Regulated Learning with MetaTutor: Understanding the Nature of CAM Processes Prior to, During, and Following Learning

When interacting with the current version of MetaTutor, during a 2-h session, a student is asked to learn about the human circulatory system. The environment contains 41 static diagrams and hundreds of paragraphs containing 7,545 words. Each of these representations of information is organized similarly to sections and subsections of book chapters, thus allowing students to navigate freely throughout the environment (see table of contents on the left of Fig. 28.1). In addition to CAM processes, motivational and emotional processes may also be assessed during the MetaTutor session. In this section, we describe the nature and role of CAM processes experienced by learners prior to, during, and following their learning session with MetaTutor.

CAM Processes Prior to Using MetaTutor

Once a student is given the overall learning goal for the session and prior to using MetaTutor, she or he analyzes the learning situation, sets meaningful learning goals, and determines which strategies to use based on the task conditions. The student may also generate motivational goals and beliefs based on prior experience with the topic and learning environment, success with similar tasks, contextual constraints (e.g., perception of scaffolding and feedback provided by a PA), and contextual demands (e.g., a time limit for completion of the task).

For example, a student may espouse different achievement goals and beliefs about knowledge prior to engaging with the learning environment. According to achievement goal theory (see Ames, 1992; Ames & Archer, 1988; Hulleman, Schrager, Bodmann, & Harackiewicz, 2010), some students may espouse a more dominant mastery goal for learning if their prior experiences in classroom environments encouraged them to increase competencies by focusing on personal progress. In contrast, other students may enter the learning environment with a tendency to strive for competition and outperform other students, particularly if their learning experiences typically emphasized the importance of performance through peer comparisons.

Further, students’ beliefs about the nature of knowledge and what it means to know—their epistemic beliefs—are another active component during the task definition phase (Muis, 2007). Students adapt their cognitive processing during the preparatory planning phases of learning in response to task complexity, a relationship that is mediated by their epistemic beliefs. That is, students who espouse beliefs in unstructured and variable knowledge report using a greater proportion of deep cognitive processing across all tasks (Bromme, Pieschl, & Stahl, 2010). These constructivist beliefs about knowledge and knowing allow for a greater perception of task complexity and flexibility in selecting strategies best suited to accomplish the task. Such beliefs and motivational approaches can be shaped by previous academic experiences, perceptions, and attitudes, as well as by the instructions provided at the beginning of the MetaTutor learning session. Importantly, differences in goal orientations and epistemic beliefs will likely influence the strategies deployed during learning, as well as the criteria learners use to evaluate success or failure.

Additionally, students may have particular emotional responses prior to interacting with MetaTutor. These may be based on either an existing trait emotion (e.g., more habitual, reoccurring emotions, such as trait test anxiety) that would be aroused by the learning environment or prospective emotional responses that relate to potential outcomes of the particular academic achievement activity (e.g., hope to learn as much as possible about the circulatory system) (Pekrun, 2006).

CAM Processes Deployed During Learning with MetaTutor

During the course of learning, a student may assess whether particular strategies are effective in meeting learning subgoals, evaluate their emerging understanding of the topic, and make the necessary adjustments regarding knowledge, behavior, effort, and other aspects of the learning context. Ideally, the self-regulated learner will make adaptive adjustments based on continuous metacognitive monitoring and control related to the standards of the particular learning task, and these adjustments will facilitate decisions regarding when, how, and what to regulate (Pintrich, 2000; Schunk, 2001; Winne, 2005; Winne & Hadwin, 1998, 2008; Winne & Nesbit, 2009; Zimmerman, 2008; Zimmerman & Schunk, 2011). These monitoring and control processes may interact with motivational facets of learning, such as self-efficacy and epistemic beliefs. Self-efficacy represents an individual’s perceived capacity to successfully complete a learning task (Schunk & Usher, 2011), such as completing a subgoal created within MetaTutor. During the learning session, a student’s confidence about his or her capability to master a certain concept or complete a subgoal may influence his or her decisions about which pages to read in MetaTutor and how long to persist on challenging material, as well as his or her resilience to adverse outcomes, such as poor performance on quizzes.

Another factor that influences online metacognitive behaviors is students’ epistemic beliefs, which are related to the standards that are set for subsequent learning (Muis, 2007). Standards for learning are used to compare an emerging learning product (e.g., comprehension of a text) with the initial goal that was set (e.g., studying in order to be prepared for the posttest). If, for example, a student holds a belief in simple knowledge, he or she may judge that memorization of key terms is an adequate standard for learning, without being motivated to consider their interconnectedness across multiple representations and pages in MetaTutor (Dahl, Bals, & Turi, 2005; Schommer, 1998). In contrast, a belief in complex knowledge motivates a greater effort at understanding its interconnectedness (Muis, 2007; Muis & Franco, 2009). Both self-efficacy and epistemic beliefs can potentially change during learning depending on a host of variables, such as performance on quizzes, self-evaluations about the effectiveness of learning strategies deployed, and emotions experienced during the learning process (e.g., learning-centered emotions).

Activity emotions are also subject to change based on learners’ evolving appraisals, such as control and task value, regarding progress toward achieving learning goals (Pekrun, 2006). These emotions are also influenced by learners’ ability to adaptively regulate their emotions (Gross, Sheppes, & Urry, 2011). Therefore, a learner may approach MetaTutor feeling hopeful (prospective emotion) that he or she will be able to learn about a particular topic of importance (i.e., an appraisal of positive value and medium control), such as the relationship between the circulatory and nervous systems, but become frustrated (activity emotion) after learning that this goal cannot be set because MetaTutor does not cover the nervous system (i.e., an appraisal of low control). The learner may then question whether the learning session will hold anything of interest (i.e., an appraisal of negative value). The learner, however, may be effective in dampening this frustration and, rather than giving up and disengaging from the task (i.e., becoming bored), may set a subgoal that is more focused on the circulatory system and still of personal interest. After having proposed a new subgoal (e.g., to learn about malfunctions of the circulatory system), the learner may then experience enjoyment. In this type of positively valenced emotional state, the learner is better poised to approach and succeed in the achievement task (Pekrun, 2006; Pekrun, Goetz, Frenzel, Petra, & Perry, 2011).

CAM Processes Following Learning with MetaTutor

Following the learning session with MetaTutor, the learner may make several cognitive, motivational, and behavioral attributions that affect subsequent learning (Pintrich, 2000; Schunk, 2001). Learners’ retrospective emotions may be aroused based on their success or failure regarding goal achievement, as well as motivational factors, such as appraisals of control and value (Pekrun, 2006; Weiner, 1985). For example, if learners were successful in achieving their goal, the control-value theory of achievement emotions predicts that they would experience pride if they cared about the goal (positive value) and felt that they were responsible for their success. Conversely, they would be expected to experience shame if they were unsuccessful, cared about the goal, and felt responsible for their failure. The experience of pride or shame may have motivational consequences. That is, learners may either be more eager to learn about content and do so using the intelligent tutoring system or become less interested in learning and/or interacting with intelligent PAs. In other words, a combination of emotions, perceived task value, and personal explanations for success or failure may influence students’ response to the learning environment, feelings about performance, and attitudes toward similar learning situations.
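The two predictions described above can be written as a small decision rule. The following Python fragment is only an illustrative sketch: the function name and its three Boolean inputs are hypothetical, it is not part of MetaTutor, and it covers only the pride and shame cases stated in the text, returning None for every other combination of appraisals.

```python
from typing import Optional


def predicted_retrospective_emotion(
    succeeded: bool, values_goal: bool, feels_responsible: bool
) -> Optional[str]:
    """Return the retrospective emotion predicted for the cases named in the text.

    Hypothetical helper: only the pride and shame cases are encoded.
    Any other combination of appraisals returns None because the text
    makes no prediction for it.
    """
    if values_goal and feels_responsible:
        # Success the learner cares about and takes credit for -> pride;
        # failure the learner cares about and feels responsible for -> shame.
        return "pride" if succeeded else "shame"
    return None


print(predicted_retrospective_emotion(True, True, True))    # pride
print(predicted_retrospective_emotion(False, True, True))   # shame
print(predicted_retrospective_emotion(False, False, True))  # None
```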

The preceding scenarios represent an idealistic approach to self-regulating one’s learning with an ALT, such as MetaTutor. Unfortunately, the typical learner does not engage in all of these adaptive CAM processes during learning with ALTs (see Azevedo & Witherspoon, 2009; Biswas et al., 2010).

Multi-level Processes of SRL During Learning with MetaTutor: Converging Evidence

As a research tool, MetaTutor is capable of measuring the deployment of self-regulatory processes through the collection of rich, multi-stream data, including self-report measures of SRL, online measures of cognitive and metacognitive processes (e.g., concurrent think-alouds), dialogue of agent-student interactions, physiological measures of motivation and emotions, emerging patterns of effective problem-solving behaviors and strategies, facial data on both basic (e.g., anger) and learning-centered emotions (e.g., boredom), and eye-tracking data regarding the selection, organization, and integration of multiple representations of information (e.g., text, diagrams). The collection of these various data streams is critical to enhancing our understanding of when, how, and why students do or do not regulate their learning and adapt their regulatory behaviors. These data are then used to develop computational models designed to detect, track, model, and foster students’ SRL processes during learning (for a review see Azevedo, Moos et al., 2010). In this section, we present data from five different sources that exemplify the complex nature of trace data in terms of frequency of use, level of granularity, temporal sequencing, ease of inference making regarding specific macro-level SRL processes, and the role of context needed, in order to understand how the trace data can augment understanding of conceptual, measurement, and analytical issues. As such, we present data associated with concurrent think-alouds, eye-tracking, note-taking and drawing, log files, and facial detection of emotions.

Concurrent Think-Aloud Protocols: SRL Events Based on Microlevel Processes

Azevedo and colleagues have provided detailed analyses of the dozens of cognitive and metacognitive processes used by learners of all ages (e.g., middle-school, high-school, and college students) when using several ALTs (see Azevedo, 2007; Azevedo, Cromley, Winters, Moos, & Greene, 2005, Azevedo, Moos, Greene, Winters, & Cromley, 2008, Azevedo, Moos et al., 2010; Azevedo et al., 2012a; Azevedo & Witherspoon, 2009; Greene & Azevedo, 2007, 2009). Their analyses of SRL processes during learning with ALTs are of particular relevance since SRL is treated as an event. Their analyses of hundreds of concurrent think-aloud protocols and other process data (e.g., log-file and video analyses) provide detailed evidence of the macro-level (e.g., metacognitive monitoring) and microlevel processes (e.g., JOL) and valence that augments Winne and Hadwin’s (1998, 2008) model. In general, these processes include planning, monitoring, strategy use, and handling of task difficulty and demands (see Azevedo, Moos et al., 2010 for details). The conceptual, theoretical, methodological, and analytical assumptions and issues regarding the use of concurrent think-alouds to examine SRL processes are well documented by Azevedo and colleagues (see Azevedo et al., 2005, 2007, 2010; Azevedo & Witherspoon, 2009; Greene & Azevedo, 2007, 2010 for details). In this section, we contextualize our definitions with examples of metacognitive processes typically used with MetaTutor and then present how learners’ monitoring processes and corresponding judgments are addressed by regulatory processes.

Monitoring Processes During Learning with MetaTutor

As previously mentioned, Winne and colleagues’ model provides a macro-level framework for the cyclical and iterative phases of SRL. The data presented in this section exemplify the microlevel processes that can augment Winne’s model. In particular, we present six metacognitive monitoring processes we have identified as essential to promoting students’ SRL with MetaTutor. Some of these monitoring processes include valence, positive (+) or negative (−), which indicates the learners’ evaluation of the content, their understanding, progress, or familiarity with the material. For example, a learner might state that the current content is either appropriate (positive content evaluation) or inappropriate (negative content evaluation) given their current learning subgoal. Depending on the valence associated with the evaluation (and the accuracy of the metacognitive judgment), the learner may then choose a metacognitive regulatory process to address the result of the judgment (e.g., set a new subgoal, summarize content).

JOL is when a learner becomes aware that he or she does (+) or does not (−) know or understand something just read or inspected (e.g., diagram). Feeling of knowing (FOK) is when the learner is aware of having (+) or having not (−) read, heard, or inspected something in the past (e.g., prior to the learning session) and having (+) or not having (−) some familiarity with the material (e.g., never presented in a previous biology class). Self-test (ST) is when a learner poses a question to himself or herself to assess understanding of the content and determine whether to proceed with additional content or to readjust strategy use. In monitoring progress toward goals (MPTG), learners assess whether previously set goals have been met (+) or not met (−) given particular time constraints. This monitoring process includes a learner comparing the goals set for the learning task (i.e., set during the subgoal phase) with those already accomplished and those that still need to be addressed. A related metacognitive process, time monitoring (TM), involves the learner becoming aware of the remaining time allotted for the learning task. Content evaluation (CE) occurs when a learner monitors the appropriateness (+) or inappropriateness (−) of the current learning content (e.g., text, diagram, or other type of static and dynamic external representation of information) given the overall learning goal and subgoals. In sum, these are just a few of the relevant metacognitive monitoring processes used by students during learning with MetaTutor. Based on our previous discussions of SRL models, these processes play important roles in facilitating and supporting students’ SRL with ALTs.
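To make the coding scheme concrete, the following Python fragment sketches how coded think-aloud segments might be represented and tallied. It is a minimal illustration under stated assumptions, not the coding software used in this research: the names CodedSegment and tally are hypothetical, the timestamps are invented for the example, and the choice of which codes carry a valence is read directly from the definitions above (JOL, FOK, MPTG, and CE are defined with + or −, whereas ST and TM are not).

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Optional

# The six monitoring processes named in the text.
# True means the definition in the text includes a valence (+ or -).
MONITORING_CODES = {
    "JOL": True,    # judgment of learning
    "FOK": True,    # feeling of knowing
    "ST": False,    # self-test
    "MPTG": True,   # monitoring progress toward goals
    "TM": False,    # time monitoring
    "CE": True,     # content evaluation
}


@dataclass(frozen=True)
class CodedSegment:
    """One coded utterance from a think-aloud protocol (hypothetical layout)."""

    time_s: float                  # seconds since the start of the session
    code: str                      # one of the keys of MONITORING_CODES
    valence: Optional[str] = None  # "+" or "-" for valenced codes, else None

    def __post_init__(self) -> None:
        if self.code not in MONITORING_CODES:
            raise ValueError(f"unknown monitoring code: {self.code}")
        if MONITORING_CODES[self.code] and self.valence not in ("+", "-"):
            raise ValueError(f"{self.code} requires a valence of '+' or '-'")
        if not MONITORING_CODES[self.code] and self.valence is not None:
            raise ValueError(f"{self.code} does not take a valence")

    @property
    def label(self) -> str:
        """Code plus valence, for example 'JOL-' or 'TM'."""
        return f"{self.code}{self.valence or ''}"


def tally(segments: Iterable[CodedSegment]) -> Counter:
    """Count how often each code and valence combination occurs."""
    return Counter(segment.label for segment in segments)


# Invented example data, for illustration only.
session = [
    CodedSegment(12.0, "JOL", "-"),
    CodedSegment(40.5, "CE", "+"),
    CodedSegment(75.0, "JOL", "-"),
    CodedSegment(90.0, "TM"),
]
print(tally(session))
```

Keeping the timestamp with each segment preserves the temporal ordering of events, which matters when SRL is treated as an event rather than as an aggregate count.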

Self-Regulation of Learning Based on Metacognitive Monitoring Processes

In this section, we describe the learner’s application of these six monitoring processes within the context of self-regulation with MetaTutor. The processes described in this section are based on empirical findings (e.g., Azevedo et al., 2010, 2012a; Johnson, Azevedo, & D’Mello, 2011). For each monitoring process, we provide the aspects of the learning environment (i.e., MetaTutor) that are evaluated by learners and illustrate them using examples of task and cognitive conditions.

FOK is used when the learner is monitoring the correspondence between his or her own preexisting domain knowledge and the current content. The learner’s domain knowledge and the learning resources are the aspects of the learning situation being monitored when a learner engages in FOK. If a learner recognizes a mismatch between preexisting domain knowledge and learning resources (negative valence), more effort should be expended in order to align the knowledge and resources. Following more effortful use of the learning material, a learner is more likely to experience more positive FOKs. However, if a learner experiences familiarity with some piece of material (positive valence), a good self-regulator will attempt to integrate the new information with existing knowledge by summarizing or taking notes. Often, a learner will erroneously make a positive FOK toward material and quickly move on to other material with several misconceptions still intact. These occurrences can be prevented through feedback from the agent based on the results of the quiz administered after FOK (and JOL) to check content understanding.

In contrast to FOK, JOL is used when a learner is monitoring the correspondence between his or her own emerging understanding of the domain and the learning resources. Similar to feelings of knowing, when engaging in JOL, a learner is monitoring domain knowledge and learning resources. If a learner recognizes that his or her emerging understanding of the material is not congruent with the material (i.e., the learner is confused), more effort should be applied to understanding the material. A common strategy employed after a negative JOL is rereading previously encountered material. In order to capitalize on rereading, a good self-regulator should pay particular attention to confusing elements in a textual passage or diagram. When a learner expresses a positive JOL, he or she might self-test to confirm that the knowledge is as accurate as the evaluation suggests. As with FOK, learners often overestimate their emerging understanding and progress too quickly to other material.

Learners apply self-testing (ST) as a way to monitor their emerging understanding of content. When tackling difficult material, learners should occasionally assess their level of understanding of the material by engaging in ST. If the results of this self-test are positive, the learner can progress to new material. If, however, the learner recognizes that emergent understanding is not congruent with what is stated in the material, he or she should revisit the content. Learners can engage in FOK, JOL, and ST using a palette of self-regulating processes available in MetaTutor. When doing so, a learner is provided with a 6-point Likert scale to evaluate knowledge (FOK) or learning (JOL) about the material just read on the current page. Such assessment is then systematically followed by a quiz (ST). The feedback provided by the agent can, therefore, not only be associated with a learner’s actual knowledge but also related to the validity of the individual’s self-monitoring. Specifically, the agent can indicate situations in which an individual expressed confidence with the material, yet obtained a poor quiz score.
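The comparison between a self-rating and the quiz that follows it can be sketched in a few lines. This is only an illustration: the function name `calibration_label` and both cutoffs (a rating of 4 or more counting as confident, 70% correct counting as a good score) are assumptions made for the example, not values taken from MetaTutor.

```python
def calibration_label(rating: int, quiz_score: float,
                      confident_from: int = 4, pass_from: float = 0.7) -> str:
    """Classify one judgment-quiz pair.

    rating: self-rating on the 6-point Likert scale (1 to 6).
    quiz_score: proportion of quiz items answered correctly (0.0 to 1.0).
    Both cutoffs are illustrative assumptions, not MetaTutor's values.
    """
    if not 1 <= rating <= 6:
        raise ValueError("rating must be between 1 and 6")
    if not 0.0 <= quiz_score <= 1.0:
        raise ValueError("quiz_score must be between 0 and 1")

    confident = rating >= confident_from
    passed = quiz_score >= pass_from

    if confident and not passed:
        return "overconfident"    # expressed confidence, poor quiz score
    if passed and not confident:
        return "underconfident"   # expressed doubt, good quiz score
    return "calibrated"           # judgment and quiz result agree


print(calibration_label(6, 0.3))
```

The "overconfident" case corresponds to the situation described above, in which a learner expresses confidence with the material yet obtains a poor quiz score.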

When monitoring progress toward goals (MPTG), a learner is monitoring the fit between learning results and previously set learning goals for the session. Aspects of the learning situation monitored during MPTG are the learner’s domain knowledge, expectations of results, and the learning goals. Closely related to time monitoring, MPTG is an essential monitoring activity that learners should use to stay “on track” for the completion of the learning task. A learner may be able to generate several critical subgoals, but if he or she does not monitor their completion or incompletion, the subgoal generation SRL strategy will be inadequate. When a learner monitors goal progress and realizes that only one of three has been accomplished in 75% of the time devoted to the learning task, a good self-regulator will revisit the remaining subgoals and decide which is most important to pursue next. In time monitoring (TM), a learner is monitoring the available time with respect to learning goals. These learning goals can be either the global learning goal defined before engaging in the learning task or subgoals created by the learner during the learning episode. If the learner recognizes that very little time remains and few of the learning goals have been accomplished, adaptations should be made. For example, if a learner has been reading a very long passage for several minutes and realizes that learning goals have not been accomplished, a good self-regulator will begin scanning remaining material for information related to the goals not yet reached. In MetaTutor, learners can use the system interface to prioritize subgoals (e.g., to revisit a current subgoal if there is still time left) or confirm that they have finished learning about a particular subgoal (see Fig. 28.1 for the list of self-set subgoals that are always present). In the latter case, the learner is prompted with a long quiz to help them self-test their understanding of all the materials related to this subgoal. 
The learner can also monitor progress by referring to a progress bar that indicates the percentage of relevant material reviewed for the current subgoal. Moreover, pages already visited are marked in the table of contents, which can facilitate the scanning strategy if they want to apply it.
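The joint use of MPTG and TM can be illustrated with a simple proportional check. The sketch assumes, purely for illustration, that a learner is "behind" whenever the share of subgoals completed is smaller than the share of time used; the function name and this rule are hypothetical and are not MetaTutor's logic.

```python
def behind_schedule(subgoals_done: int, subgoals_total: int,
                    time_used: float, time_total: float) -> bool:
    """Return True when goal progress lags behind elapsed time.

    Assumed heuristic: compare the fraction of subgoals completed with
    the fraction of the session time already used.
    """
    if subgoals_total <= 0 or time_total <= 0:
        raise ValueError("totals must be positive")
    progress = subgoals_done / subgoals_total
    elapsed = time_used / time_total
    return progress < elapsed


# One of three subgoals completed after 75% of a 60-min session.
print(behind_schedule(1, 3, 45, 60))
```

For the case described in the text (one of three subgoals accomplished in 75% of the time), the check flags the learner as behind, which is the point at which a good self-regulator would reprioritize the remaining subgoals.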

When learners engage in content evaluation (CE), they are monitoring the appropriateness of the learning material they are currently reading or viewing with regard to their current subgoal(s). In contrast to CE, evaluation of adequacy of content relates to the learner’s assessment of the appropriateness of available learning content, rather than content currently being inspected. The aspects of the learning situations monitored in both of these processes are the learning resources and the learning goals. The learner should remain aware of whether learning goals and learning resources are complementary. If a learner evaluates a particular piece of material as particularly appropriate given their learning goal (positive valence), more cognitive resources should be directed toward this material. Conversely, if particular content is evaluated as inappropriate with respect to a learning goal, a good self-regulator will navigate away from (or simply avoid) this content to seek more appropriate material. A learner can perform CE using the SRL palette, in which case he or she has to state if a particular page and/or image is relevant to the current subgoal. The agent can provide feedback related to the accuracy of this assessment.

In sum, these monitoring processes and corresponding regulatory processes are based on studies examining the role of self-regulatory processes deployed by learners during learning with open-ended hypermedia learning environments. They also play a critical role during learning with other ALTs described in the next section.

Using Eye-Tracking Data to Trace and Infer Self-Regulatory Processes

Eye-tracking has been used extensively in reading research (see Just & Carpenter, 1980; Rayner, 1998), and its use has extended to ALTs, such as multi-agent systems (e.g., Conati & Merten, 2007). Eye-tracking provides fine-grained information about the allocation of a learner’s visual attention in terms of what is attended to, for how long, and in what order (Scheiter & Van Gog, 2009). The information obtained from this channel is important because the objects, text, or images fixated on by the eyes are assumed to be those being processed in the mind (eye-mind assumption; Just & Carpenter, 1980). Eye-tracking provides us with data that are time-stamped to the millisecond and include the location and duration of gaze fixations, saccades, pupil diameter, blinks, and gaze behavior patterns. Within MetaTutor, we use the time-stamped data stream and align it with other data sources and channels, including concurrent think-alouds, video footage of a learner’s face, and reading behavior. Aligning these data channels allows us to understand how learners perceptually attend to and process multimedia materials (e.g., text, diagrams, images, and videos) presented and accessible both linearly and nonlinearly in MetaTutor.

In MetaTutor, where learning material about the human circulatory system is presented in text and diagram format, data from eye-tracking provide valuable information about how learners navigate between the text and diagram(s) (i.e., coordinate informational sources, COIS), how long and how many times they fixate on relevant and irrelevant parts of the text and diagram (e.g., relevant and irrelevant Areas of Interest, AOIs), and how they integrate information presented in multiple representations. These data are critical because they reveal processes often not verbalized by learners in think-aloud protocols (Azevedo, Moos et al., 2010). For example, repeated and prolonged fixations on irrelevant AOIs (e.g., septum) may indicate that the learner does not recognize or understand that the specific part of the diagram is irrelevant. Ideally, PAs in the learning environment should scaffold learners by guiding their attention to relevant material or parts of the interface that are conducive to the successful completion of the learners’ current subgoal. In another example, prolonged fixation on a specific portion of text for which a negative JOL had been made may indicate that the learner is spending time rereading that section to gain a better understanding of the text on that page. This inference needs to be corroborated by examining subsequent behaviors (e.g., clicking the SRL palette to indicate that they understand the textual content or verbalizing a positive JOL). In a similar way, a prolonged fixation after a negative FOK may indicate that the learner has recognized that the material is unfamiliar and is spending time reading and learning it more carefully. Learners can make these metacognitive judgments either by verbalizing them in their think-aloud protocol or by clicking on a button in the SRL palette embedded in MetaTutor’s interface to indicate that they want to make a judgment.
When several channels of data are collected (e.g., think-aloud protocol and eye-tracking) in an experiment, eye-movement traces can be triangulated with think-aloud protocols to investigate different planning, monitoring (e.g., metacognitive judgments), and strategy deployment processes (e.g., rereading, COIS). Analysis of fixation location and duration on different parts of a learning environment’s interface can assist in improving the design of the interface and the presentation of the learning material in order to further scaffold learners’ SRL.
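As an illustration of the fixation analysis described above, the sketch below aggregates fixation counts and total fixation time per AOI and then computes the share of fixation time spent on irrelevant AOIs. The input format (pairs of AOI name and duration in milliseconds), the AOI names, and the function names are assumptions made for the example and do not reflect the export format of any particular eye-tracker.

```python
from collections import defaultdict


def summarize_fixations(fixations):
    """Aggregate fixations per AOI.

    fixations: iterable of (aoi_name, duration_ms) pairs.
    Returns {aoi_name: (fixation_count, total_duration_ms)}.
    """
    counts = defaultdict(int)
    totals = defaultdict(float)
    for aoi, duration_ms in fixations:
        counts[aoi] += 1
        totals[aoi] += duration_ms
    return {aoi: (counts[aoi], totals[aoi]) for aoi in counts}


def share_on_irrelevant(summary, relevant_aois):
    """Proportion of total fixation time spent outside the relevant AOIs."""
    total = sum(ms for _, ms in summary.values())
    if total == 0:
        return 0.0
    irrelevant = sum(ms for aoi, (_, ms) in summary.items()
                     if aoi not in relevant_aois)
    return irrelevant / total


# Hypothetical trace: the learner's current subgoal makes only "text" relevant.
trace = [("text", 300), ("septum", 500), ("text", 200), ("septum", 1000)]
summary = summarize_fixations(trace)
print(summary)
print(share_on_irrelevant(summary, relevant_aois={"text"}))
```

A high share on irrelevant AOIs is only a candidate indicator; as noted above, it still has to be corroborated with think-aloud data or subsequent behavior.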

One of the important channels of data obtained from eye-tracking is pupil diameter. The pupillary response has been associated with increased mental processing activity and task difficulty. Many studies have provided evidence that cognitive processing load is associated with pupil dilation (see reviews by Beatty, 1982, 1988; Hyönä, 1995). According to the working memory model, there is a trade-off between processing demands and cognitive resources, such that when more resources are allocated to one process, fewer remain for the other. In other words, when learners process difficult and complex learning material, the higher processing load on working memory leaves only limited resources free for attending to higher-order processes like metacognition. Investigating pupil dilation data obtained from eye-tracking can help identify the instances during the learning task that require high cognitive processing, which will assist in developing metacognitive scaffolds that can help learners manage their available cognitive resources, direct their actions (e.g., rereading difficult or misunderstood material), and off-load their working memory by using effective learning strategies (e.g., taking notes).
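One simple way to look for such high-load instances is to express each pupil-diameter sample relative to a resting baseline and flag samples that exceed it by some margin. The sketch below does this under stated assumptions: the 10% threshold, the function name, and the sample values are all hypothetical, and real pupillometry would also need to handle blinks and luminance changes.

```python
from statistics import mean


def high_load_samples(pupil_mm, baseline_mm, threshold=0.10):
    """Return indices of samples whose relative dilation exceeds threshold.

    pupil_mm: pupil diameters (mm) recorded during the task.
    baseline_mm: pupil diameters (mm) recorded during a resting period.
    threshold: relative increase over baseline (0.10 = 10%), an assumed value.
    """
    baseline = mean(baseline_mm)
    if baseline <= 0:
        raise ValueError("baseline diameter must be positive")
    return [i for i, d in enumerate(pupil_mm)
            if (d - baseline) / baseline > threshold]


print(high_load_samples([3.0, 3.2, 3.4, 3.6], baseline_mm=[3.0, 3.0]))
```

The flagged indices could then be aligned with log-file time stamps to see which content the learner was viewing at those moments.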

Note-Taking and Drawing: Integrating Knowledge During Learning

Although there are many SRL processes that students may deploy to facilitate learning, note-taking and drawing provide important opportunities for learners to synthesize information and build coherent mental representations of the material. Within an SRL framework, note-taking and drawing represent instantiations of SRL strategies that may vary in quantity (e.g., frequency and duration) and quality (e.g., depth of cognitive processing). As such, not all learners engage in these processes in the same way. For instance, different note-taking patterns or drawing behaviors may emerge according to the degree of metacognitive monitoring, instructional support, and learners’ level of prior knowledge (Moos & Azevedo, 2008). To better understand the relations between these types of strategies and learning outcomes within MetaTutor, note-taking and drawing events are collected as trace data while students interact with the learning environment. The following section describes how these data are collected within MetaTutor, the analytical approaches employed by our research team, and the potential of these data sources to improve scaffolding and advance our understanding of SRL within ALTs.

An instructional video is displayed at the beginning of the learning session with MetaTutor to advise students about the note-taking and drawing features available throughout the session. Learners can take notes in two ways: (1) by selecting the note-taking feature from the SRL palette embedded within MetaTutor and (2) by pen and paper using a digital notepad located on the desk beside the computer. Learners can also use this notepad to draw diagrams. Each time the learner selects the take notes (TN) button on the palette, a new window appears for learners to type notes. There are three tabs associated with this feature. The tab that automatically displays is page notes. Notes under this tab are associated with the page the learner is currently viewing. Under the page note overview tab, learners can view a list of pages associated with their notes. There is also a general notes tab available for learners to take notes that are not directly associated with a particular page. Learners can select save and close to exit this window and return to it at a later time.

The note-taking feature is entirely learner-initiated (i.e., agents do not prompt activation of this learning strategy). In contrast, learners receive prompts from a PA to draw at various points throughout the session. Specifically, when a learner has viewed a relevant page, but has not opened the image associated with the page, he or she is prompted within 45 s to draw. Students are also prompted to draw after they have had an image open for 96 s. These prompts are referred to as coordinating informational sources as they encourage students to integrate multiple sources of information, such as text and images, by drawing visual representations.
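The two drawing prompts can be read as time-threshold rules. The sketch below encodes one possible reading, in which "within 45 s" is treated as a 45-s threshold on a relevant page whose image has not been opened; that reading, the function name, and the parameter names are assumptions, and only the 45-s and 96-s values come from the text.

```python
PAGE_WITHOUT_IMAGE_S = 45  # threshold reported in the text
IMAGE_OPEN_S = 96          # threshold reported in the text


def should_prompt_drawing(page_relevant: bool, image_opened: bool,
                          seconds_on_page: float,
                          seconds_image_open: float) -> bool:
    """Decide whether a drawing prompt would fire (assumed rule logic)."""
    # Rule 1: relevant page viewed, associated image never opened.
    if (page_relevant and not image_opened
            and seconds_on_page >= PAGE_WITHOUT_IMAGE_S):
        return True
    # Rule 2: image has been open long enough.
    if image_opened and seconds_image_open >= IMAGE_OPEN_S:
        return True
    return False


print(should_prompt_drawing(True, False, 50, 0))
```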

Time-stamped log files capture learners’ note-taking and drawing events for subsequent analyses. For example, if a learner draws a diagram on the notepad, a record is created in the log file to indicate the time of occurrence and duration of the event. Thus, the frequency and duration can be captured to provide process data in relation to other SRL events and materials that the learner viewed before and after the drawing was created. Furthermore, the hard copy of a learner’s diagram can be analyzed for quality and potential misconceptions related to the topic. Similarly, notes typed in the note-taking viewer are also time-stamped and stored in the log files. These types of data can also be analyzed in relation to other SRL processes and learning outcomes, including posttest scores.
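A minimal sketch of how frequency and duration might be derived from such time-stamped records is shown below. The event names `TN_OPEN` and `TN_CLOSE`, the tuple layout, and the function name are hypothetical; MetaTutor's actual log format is not reproduced here.

```python
def note_taking_summary(events):
    """Count note-taking episodes and sum their durations.

    events: list of (timestamp_ms, event_name) tuples in time order.
    An episode runs from a 'TN_OPEN' event to the next 'TN_CLOSE' event
    (both names are assumed for this example).
    Returns (episode_count, total_duration_ms).
    """
    episodes = 0
    total_ms = 0
    opened_at = None
    for timestamp_ms, name in events:
        if name == "TN_OPEN" and opened_at is None:
            opened_at = timestamp_ms
        elif name == "TN_CLOSE" and opened_at is not None:
            episodes += 1
            total_ms += timestamp_ms - opened_at
            opened_at = None
    return episodes, total_ms


log = [
    (1_000, "TN_OPEN"), (31_000, "TN_CLOSE"),
    (40_000, "PAGE_VIEW"),
    (60_000, "TN_OPEN"), (75_000, "TN_CLOSE"),
]
print(note_taking_summary(log))
```

An unmatched `TN_OPEN` at the end of a session is ignored in this sketch; a real analysis would need an explicit rule for episodes cut off by the end of the session.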

There are several approaches to analyzing note-taking and drawing within an SRL framework. In previous research (e.g., Trevors, Duffy, & Azevedo, 2011), we have extracted log-file data to obtain frequencies of note-taking episodes (measured by the number of times participants selected TN from the SRL palette), as well as experimental conditions, learning efficiency scores, prior knowledge, and note-taking text. Notes can be segmented into idea units or naturalist segments (Chi, 1997) and subsequently coded for quality using theoretically grounded coding schemes. For example, we have used depth of cognitive processing frameworks (see Entwistle & Peterson, 2004) to determine whether a segment of notes represents either content reproduction (i.e., verbatim copying of the text) or elaboration (i.e., text-based or prior knowledge-based inferences). Video and screen recordings can also be used during coding to determine whether notes represent a deep or shallow level of strategy use. For example, while evaluating a participant’s notes, these recordings can be played to determine which section of text the participant viewed and what types of verbalizations were made during note-taking. This allows coders to verify whether the participant integrated ideas from multiple sections or copied the text verbatim. Based on these analyses, we have found that students frequently engage in content reproduction (i.e., shallow processing), which is negatively related to achievement. Furthermore, although the presence of agents resulted in decreased note-taking behaviors among low prior knowledge learners, the agents did not effectively promote more adaptive note-taking strategies, such as elaboration. As a result, we have modified the architecture of MetaTutor to scaffold deeper level note-taking strategies through modeling and prompts from PAs. 
Moving forward with this research, future analyses may also involve examining learners’ drawing behaviors in relation to note-taking strategies and learning outcomes. Moreover, triangulating these events with eye-tracking and think-aloud data could help to provide a more detailed analysis of the role of note-taking and drawing for SRL. For instance, eye-tracking data would allow us to systematically analyze exactly which sentences or images were viewed before, during, and after note-taking and drawing. Additionally, analyses of think-aloud data may allow us to determine whether there were specific types of metacognitive processes that prompted these learning strategies.

Log Files: Event-Based Traces During System Interaction

Within ALTs, log files provide a time-stamped record of every key stroke and mouse click on system features made by the learner. From this unobtrusive source of data, a great many inferences can be made about learners’ real-time cognitive and metacognitive processes (e.g., Aleven et al., 2010; Malmberg, Jarvenoja, & Jarvela, 2010; Schoor & Bannert, 2012). MetaTutor log files collect hundreds of user- and system-initiated actions every millisecond during a learning session. Computerized log files provide an automatic record of learners’ interactions with the system, which includes, but is not limited to, natural language input by the learner; questionnaire, quiz, and test responses; mouse clicks on any system feature (e.g., concept maps); the frequency and duration of all seven of MetaTutor’s interface layouts viewed by the learner; metacognitive judgments; time spent on individual content pages; time spent with individual diagrams visible; and the use of any external equipment connected to the system (e.g., digital writing pad). Log files also record all events performed by the system. In MetaTutor, this includes learner-agent dialogue moves, text of verbal instructions, feedback, and scaffolding by the four PAs or any system-initiated event, such as the onset of testing, summarizing, or comprehension monitoring activities. In addition, the exact learner- and system-initiated rules triggered by several conditions (e.g., time thresholds) are also logged in the file.

Given the broad scope of information contained in log files, researchers are able to know, for example, how long a learner spent viewing an instructional text, how often he or she went back and forth between the text and related diagram or video, and the frequency and content of summarizations (or other learning products). Furthermore, log files provide a transcription of a PA’s instructions to the learner to evaluate understanding of the current content, the administration and results of a quiz, and the feedback based on the accuracy of the learner’s subjective self-evaluations of comprehension vis-à-vis objective quiz results.

Careful tailoring of system design and features, as described in the example above, can provide evidence of learners’ cognitive and metacognitive processes while minimizing inferences made by researchers. At the cognitive level, the duration of viewing instructional text can be inferred as time spent reading. Likewise, all things being equal, a longer reading time is evidence of increased cognitive processing of textual content (Lorch & van den Broek, 1997; O’Brien, 1995; Zwaan & Singer, 2003). Reading times can be affected by the inclusion of multiple representations of information (van Someren, Reimann, Boshuizen, & de Jong, 1998) or conflicting information (Albrecht & O’Brien, 1993; Cook, Halleran, & O’Brien, 1998). Navigating to and viewing related multimedia can be considered as an attempt to integrate multiple representations of informational sources. At the metacognitive level, features or sequences of events can be designed to promote and record self-monitoring and self-regulation of cognition. For example, Table 28.1 depicts the interactions between a learner and MetaTutor during a sequence of scaffolded monitoring. In this table, the first and second columns represent numbered events and their associated time stamps during the session (in milliseconds), respectively. The third and fourth columns depict the layout number and title (e.g., Student Input). Lastly, the fifth and sixth columns are a record of activities as well as the student input and agent output. In this example, a PA prompts the learner to reflect on her comprehension of the current content after she navigates away from the page too quickly to have read it (e.g., < 7 s). At entry 619, the learner rates her understanding as 5 (on the 6-point Likert scale described earlier) or higher. She obtains a high quiz score, for which she receives positive feedback and encouragement from the agent to move on to new content at entry 632.
For researchers, this episode is evidence of a calibrated metacognitive judgment, onto which various analytical procedures can be applied.

Table 28.1 A 1-min excerpt of a log file depicting a learner’s judgment of learning, quiz results, and positive feedback from a pedagogical agent, Mary

Specifically, educational data mining techniques provide new opportunities for researchers to represent internal cognitive and metacognitive states and their interactions. Biswas et al. (2010) describe hidden Markov modeling (HMM) as an analytical method to discern mental states and probabilistic transitions between these states, such as transitioning from the creation of a learning product to a monitoring state. Although these states cannot be directly recorded in log files, they are ascertained on the basis of learner’s recordable interactions within ALTs; multiple monitoring activities, such as the JOL in Table 28.1, can be grouped together to form the basis for one state, thus providing a higher-level perspective on log-file data (Biswas et al., 2010). Similarly, cluster analysis can group learners across a large number of variables (i.e., multivariate differences), discerning what similar patterns of learner interactions are more and less effective within MetaTutor (Bouchet, Harley, Trevors, & Azevedo, 2012; Bouchet, Kinnebrew, Biswas, & Azevedo, 2012). Latent profile analysis (LPA), latent class analysis (LCA), and latent growth modeling (LGM) are additional analytic techniques that hold great promise for using log-file data to model intraindividual changes during the learning session. These techniques permit the identification of individual growth curves (trajectories) with the opportunity of identifying particular groups/classes of similar curves. Employing these analytical techniques with log-file data provides insight into dynamic cognitive and metacognitive processes not gained with traditional analysis, such as simple frequency counts or pre-post scores alone.
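To make the idea of probabilistic transitions between states concrete, the sketch below estimates first-order transition probabilities from a sequence of coded activities. Note that this is a plain (observable) Markov chain, not a hidden Markov model: an HMM would additionally infer unobserved states from the logged actions. The state labels and the function name are hypothetical.

```python
from collections import Counter, defaultdict


def transition_probabilities(states):
    """Estimate P(next state | current state) from one coded sequence.

    states: list of activity labels in the order they occurred.
    Returns a nested dict {current: {next: probability}}.
    """
    counts = defaultdict(Counter)
    for current, nxt in zip(states, states[1:]):
        counts[current][nxt] += 1
    return {
        current: {nxt: n / sum(followers.values())
                  for nxt, n in followers.items()}
        for current, followers in counts.items()
    }


# Hypothetical coded sequence for one learner.
sequence = ["read", "monitor", "read", "notes", "monitor", "read"]
print(transition_probabilities(sequence))
```

In this toy sequence, monitoring is always followed by reading, whereas reading is followed equally often by monitoring and by note-taking; an HMM-based analysis would go further by grouping such observable actions into latent states.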

The use of any single data source to understand phenomena as complex as learning has inherent limitations. First, the strength of log-file data rests on the degree to which the system’s features and analytic techniques are grounded in a theory of learning. Data from Table 28.1 are meaningful because an explicit decision was made to design a system feature to measure calibration of metacognitive judgment, which can then be analyzed with other monitoring behavior as a reflection of an underlying mental state. Weaker empirical conclusions result from a lack of theoretical explicitness in system design and data analysis. Second, log files are only one limited perspective of the events that occur in a learning session. What information was the learner attending to when making an initial JOL? What influence, if any, would positive or negative feedback have on the learner’s subsequent cognitive, metacognitive, affective, or motivational processes? To answer these relevant questions, researchers need greater context than log files can provide. These issues speak to the need to integrate multiple streams of data to generate defensible inferences about relevant learning processes. In sum, we address these issues by triangulating multiple streams of data (i.e., concurrent think-alouds, eye-tracking, note-taking behavior) during learning with MetaTutor.

Emotional Attribution Through Facial Expression Analyses

In addition to the emerging use and convergence of data streams to understand and measure cognitive and metacognitive processes, we have also begun to collect and examine video data of students’ facial expressions during learning with MetaTutor. This data stream is vital, in that it provides a new data source necessary to understand the fluctuations in students’ emotions during learning. Facial expressions are configurations of different micro-motor (small muscle) movements in the face, which are used to infer a person’s discrete emotional state. Facial expressions have been a popular and well-researched method for analyzing participants’ emotional states for decades (Ekman & Friesen, 1978, 2003), and to this day they remain one of the most widely used, as well as one of the most theoretically and empirically grounded, emotional measurement channels (Arroyo et al., 2009; Calvo & D’Mello, 2010, 2011; D’Mello & Graesser, 2010; Ekman, 1992; Zeng, Pantic, Roisman, & Huang, 2009). Accordingly, facial expression analysis has been the primary method through which we have detected and traced learners’ experience of emotions throughout their learning session with MetaTutor (Azevedo & Chauncey-Strain, 2011; Harley, Bouchet, & Azevedo, 2011, 2012a, 2012b).

Our work analyzing emotions has utilized Noldus FaceReader™ 3.0 and 4.0, a software program that analyzes learners’ facial expressions and provides a classification of their emotional states. The program uses an active appearance model to match and track learners’ faces and then relies on an artificial neural network trained on a database of high-quality facial images from 70 individuals (Lundqvist, Flykt, & Öhman, 1998) acting out Ekman and Friesen’s six basic emotions (Ekman, 1992) in addition to a neutral emotion. FaceReader has been validated through comparison with human coders’ ratings of basic emotions (Terzis, Moridis, & Economides, 2010) and specified acted emotions (Van Kuilenburg, Wiering, & Den Uyl, 2005).

Additionally, using an automatic facial recognition software program gives us the advantage of analyzing learners’ facial expressions much faster than if we were to use Ekman and Friesen’s Facial Action Coding System (FACS; Ekman & Friesen, 1978, 2003), which is highly human-resource intensive because coders must be trained and certified to use it. In short, FaceReader is able to code more data than would be possible with human coders. For example, in a recent analysis we examined a sample of 50 learners engaging with one of MetaTutor’s PAs during the subgoal setting phase of the learning episode (M = 2 min 22 s, SD = 1 min 10 s). During this short portion of the learning session, FaceReader was able to make 224,582 emotional state classifications, each corresponding to a different video frame of footage of a learner engaging with MetaTutor (Harley, Bouchet, & Azevedo, 2012b).
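The scale of these figures can be checked with a short calculation. The implied classification rate below is an inference from the reported numbers, on the assumption that every recorded frame was classified; the source does not state the video frame rate.

```python
learners = 50
mean_duration_s = 2 * 60 + 22   # reported mean of 2 min 22 s, in seconds
classifications = 224_582       # reported number of classifications

total_video_s = learners * mean_duration_s      # total footage analyzed
per_learner = classifications / learners        # classifications per learner
per_second = classifications / total_video_s    # implied classifications per second

print(total_video_s)
print(round(per_learner, 2))
print(round(per_second, 1))
```

Under that assumption, the reported total corresponds to roughly 4,500 classifications per learner, or about 32 per second of footage.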

The preceding example highlights another FaceReader asset: the ability to act as a macro- and micro-measurement tool. In other words, FaceReader can be used to examine incremental transitions in emotional states that occur less than a second apart while also being able to summarize the prominence of different emotional states occurring over a time span of up to 2 h (in our application) without compromising its validity or reliability. Being able to examine emotions data continuously at multiple levels is crucial to examining emotions as a dynamic, rapidly changing psychological process (Ekman, 1992).
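The two levels of analysis can be illustrated with a sequence of per-frame classifications. The sketch assumes, hypothetically, that each frame has been reduced to a single dominant emotion label; it does not reproduce FaceReader's actual output format, and the function names are invented for the example.

```python
from collections import Counter


def emotion_proportions(frames):
    """Macro level: share of frames classified as each emotional state."""
    total = len(frames)
    if total == 0:
        return {}
    return {state: n / total for state, n in Counter(frames).items()}


def emotion_transitions(frames):
    """Micro level: ordered list of (from_state, to_state) changes."""
    return [(a, b) for a, b in zip(frames, frames[1:]) if a != b]


# Hypothetical dominant-emotion label for eight consecutive frames.
frames = ["neutral", "neutral", "surprised", "surprised",
          "angry", "neutral", "neutral", "neutral"]
print(emotion_proportions(frames))
print(emotion_transitions(frames))
```

The proportions summarize the whole window, whereas the transition list preserves the frame-to-frame changes that a summary alone would hide.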

The primary disadvantage of using FaceReader is that its analysis of facial expressions is limited to basic, universal emotions (Ekman & Friesen, 1978, 2003), which do not represent the whole scope of emotions relevant to learning with MetaTutor. Most notably, basic emotions exclude learning-centered emotions, such as boredom and confusion (D’Mello, Craig, & Graesser, 2009; Pekrun, 2006; McQuiggan, Robinson, & Lester, 2008). To capture these emotions, one would need to either develop a new coding scheme, add to an existing coding scheme (e.g., Craig, D’Mello, Witherspoon, & Graesser, 2007), or make use of additional emotional channels (Calvo & D’Mello, 2010, 2011; Mauss & Robinson, 2009; Zeng, Pantic, Roisman, & Huang, 2009), as we are doing. A potential additional disadvantage of FaceReader is that its database is formed from acted, as opposed to naturally occurring, emotions. Given that humans are not able to control all their facial muscles efficiently (Ekman, 2003), it is possible that some subtle differences, such as artificially limited micro-motor muscle variance, may exist between posed and naturally embodied facial expressions. It should be noted, however, that capturing high-quality images of natural, unfolding emotions from multiple angles would be technically challenging without distracting participants and interfering with the emotions one is trying to measure. It should also be noted that these limitations might be more problematic for more subtle emotional states, such as boredom and curiosity, than for higher intensity expressions, such as anger and sadness.

We conclude this section by identifying some of the specific features and opportunities regarding FaceReader through a guided tour of several components of FaceReader’s online interface presented in Fig. 28.2. In the top left-hand corner of Fig. 28.2, the analysis visualization window, we can see the active appearance model FaceReader uses to model participants’ faces, as well as the video quality bar, which is at an acceptable threshold. The top right-hand corner displays the emotional valence (experience of positive or negative emotions). One can see from this window that, for the duration of time shown, the learner has spent most of her visible learning session experiencing negatively valenced emotions (e.g., sadness, anger). The bottom right window illustrates the proportions of the different discrete emotions the learner has experienced, which tell us that, during the time her video has been analyzed, she has embodied a fairly equal proportion of surprise, anger, sadness, and neutrality. The bottom left expression window shows the onset and offset of the different discrete emotions and the transitions between different emotional states, and shows that, at times, different discrete emotional states co-occur (i.e., occur simultaneously) (Harley et al., 2012a). The latter half of this window provides an example in which the learner suddenly embodies an intense surprised expression, which degrades slightly and is accompanied by a short peak of anger. We can interpret from these data that something in the learning environment (e.g., PA feedback) surprised the learner and also made her feel angry, though the experience of anger was fleeting (possibly because the learner successfully downregulated this negative emotion). FaceReader is a rich source of data, especially when combined with other data channels (e.g., log files), which allow us to identify the context in which learners are experiencing their emotions.

Fig. 28.2 FaceReader™ 4.0 interface
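The idea of combining the expression trace with log files to recover context can be illustrated with a minimal sketch. It assumes that both channels share a common clock in seconds and that each emotional state lasts until the next onset; the data, the event names, and the helper functions `emotion_at` and `emotions_after` are hypothetical and are not part of FaceReader or MetaTutor.

```python
from bisect import bisect_right

# Hypothetical emotion trace: (onset time in seconds, dominant emotion label).
# Assumption: each state lasts until the next onset.
emotion_onsets = [(0.0, "neutral"), (12.5, "surprised"), (14.0, "angry"), (15.2, "neutral")]

# Hypothetical log-file events: (time in seconds, event name).
log_events = [(3.0, "page_open"), (12.4, "pa_feedback"), (14.5, "note_taken")]


def emotion_at(onsets, t):
    """Return the emotion label active at time t, or None if t precedes the trace."""
    times = [onset for onset, _ in onsets]
    i = bisect_right(times, t) - 1
    return onsets[i][1] if i >= 0 else None


def emotions_after(onsets, t, window):
    """Return labels of emotional states whose onset falls in (t, t + window]."""
    return [label for onset, label in onsets if t < onset <= t + window]


# For each logged event: the emotion active at that moment and the
# emotions that begin within the following two seconds.
aligned = [
    (event, emotion_at(emotion_onsets, t), emotions_after(emotion_onsets, t, 2.0))
    for t, event in log_events
]

for row in aligned:
    print(row)
```

In this invented trace, the `pa_feedback` event occurs while the learner is neutral and is followed within two seconds by surprise and then anger, which mirrors the interpretation given for the expression window above. The two-second window is an arbitrary choice for illustration.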

Summary and Conclusions

Early in this chapter we noted that MetaTutor is both a research tool and a learning tool. One of the objectives of this chapter has been to demonstrate the interconnectedness of these functions and the capacity for enhancing learning with MetaTutor. One of the chief strengths of MetaTutor is the multitude of different channels available for collecting and analyzing learners’ interactions with the system. Going forward, we are exploring the addition of new channels, new features of existing channels, and how these channels can be aligned to provide an ever deeper and more contextualized understanding of students’ learning and co-regulation with MetaTutor. We conclude this chapter by outlining some of the future directions we are currently pursuing and have planned for MetaTutor.

Developments in measuring and understanding learners’ experiences of affect and motivation represent one of the primary and broadest future directions for MetaTutor. Our analyses, which have focused on basic emotion facial expression analyses, are being expanded to include physiological measures of emotions (e.g., galvanic skin response and pupil dilation) as well as human-rater and self-report measures. These new methods for measuring emotion will provide us with the means to investigate convergent evidence for emotional states across a variety of different affective dimensions, including arousal, valence, discrete, and co-occurring emotions (Conati & Maclaren, 2009; Harley et al., 2012a; Hess & Polt, 1960; Lang, Greenwald, Bradley, & Hamm, 1993; Partala & Surakka, 2003). Some of these methods, including self-report and human-rater (based on a coding scheme that we are developing), will allow us to expand our analyses from basic emotions to include learning-centered ones. In addition, by having access to emotional data that are prospective, state (including trace), retrospective, and trait in nature, we will be able to explore dynamic fluctuations in emotions with a contextualized understanding of antecedents (e.g., co-regulation between PA and learner, trait emotions, motivations). Another component of our research into the nature of emotions is analyzing, and pioneering ways to analyze, learners’ experience of co-occurring discrete emotions (i.e., different discrete emotions experienced simultaneously) (Harley et al., 2012a). These developments will be used to enhance learners’ experience with MetaTutor by providing recommendations for adapting the system, such as changes to the PA’s dialogue and behavior (e.g., facial expression), as well as contributing to the development of a more comprehensive theory of SRL in terms of the role of affect and emotions.

Finally, as more channels of information become available, it will be even more crucial to align and merge them in order to obtain an accurate overview of students’ experience when learning with MetaTutor. Considering the richness of the collected data, educational data mining approaches will be particularly useful in order to (a) group students into different categories according to similarities in their browsing behavior and use of SRL processes; (b) extract from trace logs of the different data channels some patterns of browsing actions, emotions, and/or eye movements that are characteristic of these categories of students; and (c) identify which of those categories future students belong to in real time in order to provide them with the most relevant agents’ feedback and scaffolding strategies (Bouchet, Harley et al., 2012).
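Steps (a) and (c) can be illustrated with a minimal sketch. It is not the method used in the cited work: it assumes plain k-means clustering with fixed starting centroids, two invented features per student, and hypothetical student identifiers. Step (b), pattern extraction, is not covered here.

```python
import math

# Hypothetical per-student features: (proportion of time on relevant pages,
# SRL processes deployed per minute). All values are invented for illustration.
students = {
    "s1": (0.82, 1.9),
    "s2": (0.78, 2.1),
    "s3": (0.35, 0.4),
    "s4": (0.30, 0.6),
    "s5": (0.85, 1.7),
    "s6": (0.40, 0.5),
}


def nearest(point, centroids):
    """Index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def mean(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    return tuple(sum(axis) / len(points) for axis in zip(*points))


def kmeans(points, centroids, iterations=20):
    """Plain k-means from the given starting centroids; returns final centroids."""
    for _ in range(iterations):
        groups = [[] for _ in centroids]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Keep the previous centroid if a group ends up empty.
        updated = [mean(g) if g else c for g, c in zip(groups, centroids)]
        if updated == centroids:
            break
        centroids = updated
    return centroids


points = list(students.values())
# (a) Group students; starting centroids are chosen by hand for reproducibility.
centroids = kmeans(points, [points[0], points[2]])
labels = {name: nearest(features, centroids) for name, features in students.items()}

# (c) Assign a new (hypothetical) student to the closest existing group.
new_student_group = nearest((0.75, 1.8), centroids)

print(labels, new_student_group)
```

With real multichannel data, the features would need to be standardized before clustering, and both the number of groups and the clustering method would have to be justified empirically.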

In summary, we emphasized the importance of using multichannel trace data to examine the complex roles of CAM self-regulatory processes deployed by students during learning with multi-agent systems. We also argued that tracing these processes as they unfold in real time is key to understanding how they contribute both individually and together to learning. In addition, we described MetaTutor (a multi-agent, intelligent hypermedia system) and how it can be used both to facilitate learning of complex biological topics and as a research tool to examine the role of CAM processes used by learners. We also provided a theoretical perspective on, and the underlying assumptions of, SRL as an event, and we provided empirical evidence from five different types of trace data to exemplify how these diverse data sources can be used to understand the complexity of CAM processes and their relation to learning. Lastly, we provided implications for future research on ALTs that focuses on examining the role of CAM processes during SRL with these powerful technological environments.