1 Introduction

A growing research interest in understanding cognitive and metacognitive actions using learner data from online learning environments has encouraged the proliferation of self-regulated learning (SRL) theoretical models. Several studies have pointed out the importance of self-regulation in learning science, as it is closely associated with positive learning outcomes [1,2,3]. Learners who regulate their learning are aware of the learning process as they acquire knowledge or skills, and they actively participate in and control the necessary steps toward mastery [3]. A growing community of scholars proposes a dynamic trajectory model of SRL phases and processes, in which learning behavior develops over time and is composed of preparation, performance, and reflection [4,5,6]. Despite this body of work, few studies have evaluated how these strategies function simultaneously.

Responses to inventories and think-aloud protocols are two categories of self-report data commonly collected in computer-based learning environments (CBLEs) to measure SRL constructs [7], and prior research has leveraged quantitative ethnographic approaches such as Epistemic Network Analysis (ENA) for SRL data mining [8,9,10,11]. These developments benefit learning analytics by making previously unobservable patterns of thinking visible in online learner data while avoiding issues with the accuracy of memory retrieval. Real-time self-report is complementary to online trace measures: it offers a more direct way to examine learners’ thought processes and explains more variance in learners’ performance by revealing otherwise unobservable problem-solving steps and strategies [12]. Shaffer [13] explained the idea of thick description by noting that language is not just a reflection of how the mind works but also a tiered ‘hierarchy of meanings.’

Using learning analytics and quantitative ethnography, we can shed light on the implications of students’ self-reported answers during their interactions with computer-based learning environments. This study seeks to unpack the contextual SRL strategies used in math problem-solving and to understand how they are associated with successful performance. We address two research questions: (1) What are the differences in connections made between four SRL indicators across learner responses with correct and incorrect answers to math problems? (2) How do learner patterns of SRL indicators differ as learners become familiar with CueThink and generate more Thinklets (a series of problem-solving tasks)? We develop automatic detectors that serve as codes within an epistemic network analysis (ENA) approach to SRL process analysis. The detectors target SRL operations from Winne and Hadwin's SMART model, which consists of five operations: (1) searching, (2) monitoring, (3) assembling, (4) rehearsing, and (5) translating. Based on data collected from CueThink, we deductively define four SRL indicators and detect each of them in discourse: (1) numerical representation, (2) contextual representation, (3) outcome orientation, and (4) data transformation. We also defined a fifth indicator, strategy orientation, but the dataset contained too few examples to include it in the analysis. We develop a set of epistemic network models to evaluate differences in individual performance on correctness and in the general development of familiarity. Combining learning analytics with quantitative ethnography is expected to provide insights into the dynamic associations of SRL indicators detected in math problem solving.

2 Literature Review

Self-regulated learning has gained popularity in education research, and its use and conceptualization have expanded across disparate goals and measurements, including: (1) types of measures in data reflecting SRL behaviors, (2) constructed processes or models of SRL, and (3) analytical methods to evaluate SRL. Detecting cognitive and metacognitive learning activities in log data that differentiate less efficient learner behaviors from more efficient ones could inform the design of prompts and scaffolds that are more conducive to long-term independent learning. Deeply rooted in established SRL constructs, recent studies explore the idea of developing detectors to investigate SRL patterns [14,15,16,17]. These detectors focus on fine-grained, micro-level SRL processes to identify help-seeking, self-monitoring, self-assessment, goal-setting, and information assembly. According to Zimmerman's cyclical phase model, students with mastery of SRL analyze the learning task and motivate themselves in a forethought phase, control and stay aware of their learning progress in the performance phase, and assess and react to their performance in the self-reflection phase [18]. Most approaches for detecting self-regulated learning strategies have not fully explored Winne's [19] theories. Winne and colleagues [19, 24, 35] suggest that higher-level cognitive strategies are integrated into every phase and process of SRL, including both top-to-bottom processing and the independence of these strategies from, and their associations with, different cognitive abilities in the context of a learning situation. Our broader research aims to develop new operationalized indicators to understand learners’ SRL behaviors on an online mathematics problem-solving platform.
Winne and Hadwin [20] described the process of SRL as four states in sequence: recognition of the learning task, building goals and plans, implementing them, and reviewing the output of prior steps based on feedback and adaptations. Further developing an integrated theory, Winne's SMART model of SRL [19] identified five operations of cognitive and behavioral action in task performance: searching, monitoring, assembling, rehearsing, and translating. The model attempts to place diverse constructs into one framework that functions in all four of the sequential SRL states [20].

Literature in quantitative ethnography has explored SRL in collected log data based on textual inputs and categorized learning events [8, 9, 11, 21]. Paquette and colleagues [8] examined SRL with trace data collected from 98 students in an open-ended online learning environment that allowed students to build causal models of scientific concepts and phenomena. ENA was used to illustrate the problem-solving actions in relation to ‘information seeking, solution construction, and solution assessment’ (p. 4). Uzir and colleagues [21] studied blended learning environments and investigated time management and learning tactics associated with positive learning outcomes. They used ENA models to illustrate different patterns in different strategy groups. Gamage and colleagues [9] compared MOOC participants based on familiarity (i.e., first-time users versus completers of multiple classes) and used ENA to compare reflections on video-watching and communication in log data and interviews. Wu et al. [11] developed ENA models to evaluate the metacognitive behaviors of learners in collaborative learning contexts based on self-reported reflection. Drawing upon this work, this paper combines a method of labeling log files via text replay with a study of the occurrence of self-regulated learning operations in math problem-solving.

Fig. 1. CueThink's interface: screenshots of CueThink’s four phases. Adapted from Jiayi et al. (2022).

3 Method

3.1 CueThink

CueThink is an online learning application in which middle school students practice math word problems, with the aim of developing their math knowledge, self-regulated learning skills, critical thinking, and independent problem-solving strategies. Students independently complete assignments using CueThink, and teachers review and grade them later. Peer and supervisor feedback can be given asynchronously in CueThink’s online classroom community, in combination with the school setting. Apart from the math problems, which are randomly assigned, the overall online setting is the same for all students. During the problem-solving procedure, students use CueThink to solve a math problem by manipulating multimedia tools (e.g., creating a screencast video, generating a table, or making a draft). CueThink comprises four phases (see Fig. 1) that align with the temporal and sequential nature of self-regulated learning (i.e., Understand, Plan, Solve, and Review) and are consolidated into a Thinklet. Learners can move back and forth across the four phases in CueThink and can select any Thinklet with a status of ‘Create Thinklet’ to continue. At the same time, the tool captures how learners solve the math problem in each phase by recording their performance during learning events, including their textual inputs, answers to multiple-choice questions, a screencast video, and the time spent on each activity. To label the log files, we use a method known as text replays, which allows us to retrieve a segment of learner behavior containing a sequence of actions for a selected period and to organize that information in a textual format.

Students create a Thinklet and start with the Understand phase, where they are invited to engage with a math word problem and respond to three question prompts: (1) “What do you notice?” (2) “What do you wonder?” and (3) “What is your estimation about the answer?” In this phase, students look for information in the given resources, make sense of the story, and are encouraged to gather meaningful pieces of information for completing the learning task. The second phase is the Plan phase, in which students read a multiple-choice question about math problem-solving strategies and have the option to draw a picture, model an equation, work backwards from the solution, etc. The Plan phase encourages students to write a plan of their math problem-solving steps and to discuss solutions based on the story. Students develop their thoughts and strategies based on prior understanding and explain their plans through descriptive written responses. In the Solve phase, students explore multimedia scaffolding tools (e.g., ruler, calculator, colorful pen, etc.) and work on a whiteboard space. Students then make a screencast video to describe and demonstrate their use of the tools. In the Review phase, students provide final answers to the given math problem and reflect on the quality of their answers, video, and performance in prior phases. CueThink fosters this activity by asking students to fill in a checklist. To evaluate students’ performance, the current study focuses on students’ textual responses generated in the Understand and Plan phases (phases 1–2) and their final answers provided in the Review phase (phase 4). The structure of phases and events is summarized in Table 1.

Table 1. Learning Events in Three Phases with Textual Inputs.

3.2 Participants and Procedures

CueThink was used by 79 students from six classes in grades 6 and 7 (approximately 11–12 years old) at a suburban middle school in California. In 2020, the school's ethnic diversity index was near 50%, and about half of the students were English learners. White students and Hispanic/Latino students each made up 40% of the student population, while African American students and Asian students each made up 5%.

Data about student usage events and system operations are recorded in log files within the CueThink system. Users’ log files, for example, record when they log in, what pages they view, and what text they type or select as their answer to multiple-choice questions. Each student spent an average of 5.2 h using CueThink and 1.8 h on each Thinklet [22]. We coded and analyzed 349 Thinklets based on activities and textual responses. We consider a learner high-performing if the learner has more correct final answers than incorrect final answers, which means that more than 50% of the solved problems are correct. Otherwise, we assign the learner to a relatively low-performing group. For the first comparison, we split the 349 Thinklets into two categories based on individuals' rate of correctness: a low-performing group (N_L = 104) and a high-performing group (N_H = 245). For the second comparison, we identified four levels of familiarity in the 349 Thinklets (i.e., the first, second, or third time generating a Thinklet, and more than three times). When we talk about learner behaviors, we focus on automatically detected indicators of SRL processes designed to align with Winne and Hadwin’s SMART model [20], which we discuss further in the next section.
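As an illustration of the grouping rule described above, the following minimal sketch labels a learner from a list of final-answer outcomes. The function name `performance_group` and the boolean input format are hypothetical and are not taken from the CueThink system.

```python
def performance_group(final_answers):
    """Label a learner from a list of booleans (True = correct final answer).

    Hypothetical helper: a learner is "high" only when correct answers
    strictly outnumber incorrect ones (more than 50% correct).
    """
    correct = sum(1 for answer in final_answers if answer)
    incorrect = len(final_answers) - correct
    return "high" if correct > incorrect else "low"


# A learner with exactly 50% correct falls in the low-performing group.
print(performance_group([True, True, False]))  # more correct than incorrect
print(performance_group([True, False]))        # tie, so not high-performing
```

Under this rule a tie is treated as low-performing, which matches the 0%–50% versus 51%–100% split used for the network comparison.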

3.3 Detectors and Code Book

Building automated detectors of self-regulated behaviors as a learning analytics team is a challenging, recursive, and iterative process. The first step was to convert the log data into human-readable text replays and then to operationalize qualitative elements of Winne and Hadwin’s SMART constructs [22]. Each indicator was classified as one of the strategies relevant to the SMART model, which can occur in every phase. Log data are usually restructured to reflect the constructs that researchers would like to measure. To define the final codebook, the research team and system developers met several times across seven stages, including: (1) defining the concepts of the codes and their operationalization, (2) refining the conceptualization in small groups, (3) building the first codebook, (4) refining it again, (5) implementing the codebook, and (6) revising the codebook descriptions based on the implementation [22].

In this study, we mainly focus on the assembling and translating operations, given the design of the learning environment and data availability; other operations not detected in discourse were excluded from our analyses. According to Winne [24], by assembling you are attempting to identify a relationship, and by translating you are changing the way in which a given piece of information is presented. Two coders developed four SRL indicators and coded them manually (see Table 2): (1) numerical representation, (2) contextual representation, (3) outcome orientation, and (4) data transformation. Numerical and contextual representations, usually defined as strategies used at the beginning of the problem-solving process, contribute to a learner’s representation of a problem story and recognition of a learning task [20]. Both indicators reflect assembling in the SMART model in that learners actively use, control, and manipulate information provided in the learning environment to form a general representative picture. Outcome orientation also reflects assembling: it captures a learner’s goal-setting and planning behaviors and stresses a focus on the outcome. Data transformation is representative of the translating operation, in that learners change the way information is presented in the problem to find a solution. The two raters examined inter-rater reliability during the coding process (see Table 3). With clear classification definitions and thorough communication, all Cohen’s kappa results are above 0.6, indicating relatively high agreement between the two raters.

Table 2. SRL Code Book with Examples
Table 3. SRL Code Book Kappa Results
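For readers unfamiliar with the agreement statistic reported in Table 3, the sketch below computes Cohen's kappa for two raters who each assign a binary code (1 = indicator present) to the same responses. The ratings are made-up illustrative values, not data from this study.

```python
def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length lists of categorical labels."""
    n = len(rater_a)
    # Observed agreement: share of items on which the raters match.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label proportions.
    labels = set(rater_a) | set(rater_b)
    expected = sum(
        (rater_a.count(label) / n) * (rater_b.count(label) / n)
        for label in labels
    )
    if expected == 1:  # both raters used a single identical label throughout
        return 1.0
    return (observed - expected) / (1 - expected)


# Illustrative ratings for ten responses on one indicator (hypothetical data).
rater_a = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
rater_b = [1, 1, 0, 0, 0, 0, 1, 0, 1, 0]
print(round(cohens_kappa(rater_a, rater_b), 2))
```

Here the raters agree on nine of ten items and chance agreement is 0.5, so kappa is (0.9 - 0.5) / (1 - 0.5) = 0.8, which would clear the 0.6 threshold mentioned above.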

3.4 Epistemic Network Analysis

Using the ENA webtool (version 1.7.0) [27], we generated two sets of epistemic networks to visualize differences in patterns of connections across the generated codes (numerical representation, contextual representation, outcome orientation, and data transformation). The first set of networks compares low-performing and high-performing student groups based on the number of correct final answers. Correctness of final answers is an essential metric in evaluating a student's abilities in math problem solving. CueThink participants were divided into a low-performing student group (0%–50% of final answers correct) and a high-performing student group (51%–100% of final answers correct). In general, if a learner has more correct answers than incorrect answers, we conceptualize them as high-performing. Unit variables consisted of the binary low/high performance groups, subset by user ID. Conversations were segmented by the variable Thinklet ID so that associations were only calculated across data within each Thinklet. A whole-conversation stanza window was used to aggregate co-occurrences across all lines of data in each Thinklet. Since the initial unit compares two groups, a means rotation was used to maximize differences along the x-axis.
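The accumulation step described above can be sketched as follows. This is a simplified illustration of a whole-conversation window, not the ENA webtool's implementation: it assumes binary co-occurrence within each Thinklet, sums the resulting vectors per unit, and scales the sum to unit length. The short code names and the input format are hypothetical, and the later dimensional reduction and means rotation are not shown.

```python
from itertools import combinations
from math import sqrt

# Short names for the four indicators (hypothetical labels).
CODES = ["numerical", "contextual", "outcome", "transformation"]
PAIRS = list(combinations(CODES, 2))  # six unordered code pairs


def thinklet_cooccurrence(lines):
    """Binary co-occurrence vector for one Thinklet (one conversation).

    `lines` is a list of dicts mapping code name -> 0/1 for each coded line.
    With a whole-conversation window, a code counts as present if it
    appears on any line of the Thinklet.
    """
    present = {code for code in CODES if any(line.get(code, 0) for line in lines)}
    return [1 if a in present and b in present else 0 for a, b in PAIRS]


def unit_vector(thinklets):
    """Sum co-occurrence vectors over a unit's Thinklets, then normalize."""
    total = [0] * len(PAIRS)
    for lines in thinklets:
        for i, value in enumerate(thinklet_cooccurrence(lines)):
            total[i] += value
    norm = sqrt(sum(v * v for v in total))
    return [v / norm if norm else 0.0 for v in total]


# Hypothetical student with two coded Thinklets.
thinklet_a = [{"contextual": 1}, {"transformation": 1}]
thinklet_b = [{"contextual": 1, "outcome": 1}, {"transformation": 1}]
vector = unit_vector([thinklet_a, thinklet_b])
for pair, weight in zip(PAIRS, vector):
    print(pair, round(weight, 2))
```

In this toy example the contextual–transformation pair receives the largest weight because it co-occurs in both Thinklets, which mirrors how edge weights in the networks reflect relative connection strength.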

The second set of networks tracks patterns of change across participants as they repeatedly work through Thinklets. Learners could begin a new Thinklet at any time and tended to generate more Thinklets as they were assigned more math problems. A student's first Thinklet serves as a record of their initial experience using and exploring CueThink. As students begin and complete subsequent Thinklets, they gain familiarity with the specific online learning environment and are more likely to develop expertise in adapting functions, prompts, and external resources in the tool. Most students generated two to four Thinklets, with a few participants creating five or more. To track changes in participants’ connection-making between codes across Thinklets, models were generated using the unit variable “Order,” which groups data into first, second, third, and fourth-or-later Thinklet participation. This unit was also subset by participants’ Thinklet ID. Conversations were again segmented by Thinklet ID, and a whole-conversation stanza window was applied. For both sets of networks, goodness of fit was assessed, and Mann-Whitney tests were calculated to assess statistically significant differences between groups. Figures and interpretive descriptions are provided in the following section to examine how learners using CueThink engaged in self-regulated learning practices, which offers insight into SRL behavioral patterns in relation to math problem-solving performance.
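A minimal sketch of how an “Order” variable of this kind could be derived from log records is shown below. The record layout `(student_id, thinklet_id, created_at)` and the function name are assumptions made for illustration; the labels One, Two, Three, and Four follow the group names used in the results.

```python
from collections import defaultdict

ORDER_NAMES = ["One", "Two", "Three"]


def order_labels(records):
    """Map each Thinklet ID to its familiarity level for its student.

    `records` is an iterable of (student_id, thinklet_id, created_at)
    tuples; `created_at` is any sortable timestamp (hypothetical field).
    The fourth and all later Thinklets share the label "Four".
    """
    by_student = defaultdict(list)
    for student_id, thinklet_id, created_at in records:
        by_student[student_id].append((created_at, thinklet_id))

    labels = {}
    for items in by_student.values():
        for index, (_, thinklet_id) in enumerate(sorted(items)):
            labels[thinklet_id] = ORDER_NAMES[index] if index < 3 else "Four"
    return labels


# Hypothetical log records for two students.
records = [
    ("s1", "t3", 30), ("s1", "t1", 10), ("s1", "t2", 20),
    ("s1", "t4", 40), ("s1", "t5", 50),
    ("s2", "t9", 5),
]
print(order_labels(records))
```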

4 Results and Discussion

4.1 Group by Low and High Performing

Fig. 2. Networks of Low Performers and High Performers

Evaluating the networks based on correctness, we found that the four nodes are generally loosely connected and that data transformation and contextual representation are closely associated in both networks. To detect differences between the low-performing and high-performing groups, defined by the overall rate of correctness, we compare the two networks along the X-axis. Along the X-axis, a two-sample t-test assuming unequal variance showed that the low-performing group (mean = 0.11, SD = 0.26, N = 29) was statistically significantly different from the high-performing group (mean = −0.06, SD = 0.44, N = 51) at the 0.05 alpha level (t(77.96) = −2.24, p = 0.03*, Cohen’s d = 0.45). Observing Fig. 2, there is no extraordinary difference between the two networks if we conduct a Mann-Whitney test. However, the co-occurrence of data transformation and contextual representation appears often in both groups and is more evident in the low-performing group's network. Within the low-performing group's network, most co-occurrences of codes are between 0.33 and 0.40, and no specific connection stands out. In the high-performing group's network, by comparison, the lines connected to data transformation are slightly thicker. Within the network of high performers, the co-occurrence of contextual representation and outcome orientation (0.37) is stronger than that of numerical representation and data transformation (0.27), data transformation and outcome orientation (0.29), outcome orientation and numerical representation (0.30), and numerical representation and contextual representation (0.30). The co-occurrence of contextual representation and outcome orientation stands out more in the high-performing group than in the low-performing group.
For example, a student may go beyond simply providing the numerical answer to the math problem (outcome orientation) by adding a clarifying contextual representation of a background, a situation, a character, or a setting. Julia (pseudonym), a student in the high-performing group, connected these two codes in the following response: ‘I will take the total from my withdrawals from the bank and add it to the total from my deposits, and my leftover amount will be my answer (25 + 50).’ These results indicate that the CueThink data distinguished learners by their rate of correctness and their SRL performance during the problem-solving process. Learners with a high rate of correctness focus heavily on obtaining a numerical answer (outcome orientation) while also using contextual details to understand and solve the problem (contextual representation) within a given situation. Most learners pay attention to contextual details (contextual representation) and manipulate and transform the data given in the math problem (data transformation). But learners who show the intention or habit of anticipating the numerical outcome in the Understand and Plan phases tend to have more correct final answers.
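The unequal-variance t-test reported in this section can be approximately recomputed from the published summary statistics. The sketch below uses the standard Welch formulas and a pooled-SD Cohen's d; whether the authors' software used exactly these formulas is an assumption. Because the inputs are rounded to two decimals, the results (about |t| = 2.17, df = 77.8, d = 0.44) are close to, but not identical with, the reported values (|t| = 2.24, df = 77.96, d = 0.45), and the sign of t depends only on the order of subtraction.

```python
from math import sqrt


def welch_from_summary(m1, s1, n1, m2, s2, n2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2  # squared standard errors
    t = (m1 - m2) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


def cohens_d(m1, s1, n1, m2, s2, n2):
    """Cohen's d based on the pooled standard deviation."""
    pooled = sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2))
    return (m1 - m2) / pooled


# Summary statistics as reported: low-performing group, then high-performing group.
low = (0.11, 0.26, 29)
high = (-0.06, 0.44, 51)
t, df = welch_from_summary(*low, *high)
d = cohens_d(*low, *high)
print(round(t, 2), round(df, 1), round(d, 2))
```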

4.2 Group by Familiarity

Fig. 3. Networks of Four Levels in Familiarity

To build epistemic networks of familiarity, we grouped Thinklets by the order in which each individual generated them. We use One, Two, and Three to refer to the Thinklets generated by a learner for the first, second, and third time. We use Four to refer to the ‘after-third-Thinklet’ condition, which includes all Thinklets generated after the third. In total, there are 77 items in the One category, 76 in Two, 65 in Three, and 108 in Four. Co-occurrences of data transformation and contextual representation show a progressively less stable connection from Time 1 to Time 4. As we can observe, the importance of the four codes is distributed much more evenly in the later networks. However, the diminishing co-occurrence rates do not simply reflect a failure of self-regulated learning operations to connect in the text. These numbers are affected by the sample format (unit and conversation variables). Every learner generates at least one Thinklet, but fewer and fewer learners have more than three or four Thinklets. The differences in numbers within each network are more informative. The largest number in the first, second, and third networks always represents a strong co-occurrence of contextual representation and data transformation, but its importance decreases relative to the other numbers within each network. In contrast, the association of outcome orientation and contextual representation grows in importance from the first to the fourth network. Comparing each pair of the four models, we found significant differences between One vs. Four (Mdn = −0.41, U = 2525, p = 0.00*, r = 0.39), Two vs. Four (Mdn = −0.41, U = 2483, p = 0.00*, r = 0.40), and Three vs. Four (Mdn = −0.41, U = 2701, p = 0.01, r = 0.23). Because multiple hypotheses are tested, the alpha level is corrected with the Bonferroni approach, giving a corrected significance level of 0.008. In this case, two of the comparisons still show significant differences (Fig. 3).
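The Bonferroni threshold mentioned above follows from the number of pairwise comparisons among four groups. The short sketch below works through that arithmetic and applies the corrected threshold to the three p-values reported in the text; the p-values are entered as printed (rounded to two decimals), so 0.00 stands for a value that rounds to zero.

```python
from itertools import combinations

groups = ["One", "Two", "Three", "Four"]
pairs = list(combinations(groups, 2))  # all unordered pairs of the four groups

alpha = 0.05
alpha_corrected = alpha / len(pairs)  # Bonferroni: divide by number of tests
print(len(pairs), round(alpha_corrected, 3))

# p-values as printed in the text (only these three comparisons are reported).
reported_p = {
    ("One", "Four"): 0.00,
    ("Two", "Four"): 0.00,
    ("Three", "Four"): 0.01,
}
significant = [pair for pair, p in reported_p.items() if p < alpha_corrected]
print(significant)
```

With six comparisons the corrected level is 0.05 / 6, or about 0.008, so the One vs. Four and Two vs. Four comparisons pass the corrected threshold while Three vs. Four (p = 0.01) does not.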

In summary, network Four differs most from the other three conditions. When we combined the datasets of One, Two, and Three and compared them with Four, a Mann-Whitney test along the X-axis showed that Thinklets generated after the third time (Mdn = 0.32, N = 45) were statistically significantly different at the alpha = 0.05 level from the other conditions (Mdn = 0.56, N = 78, U = 926, p < 0.001, r = 0.47). We conclude that as students become more familiar with the scaffolds and prompts in the system, they start to use more strategies related to outcome orientation. They also consider outcome orientation and contextual representation together much more often when explaining their understanding of the math problem and their problem-solving strategies.
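The text does not state how the effect size r accompanying each Mann-Whitney test was computed. As an assumption, the rank-biserial correlation, r = 1 − 2U / (n1 · n2), applied to the reported U values and group sizes gives values that match the reported effect sizes to within rounding. The group sizes for the pairwise comparisons are taken from the item counts given for One, Two, Three, and Four.

```python
def rank_biserial(u, n1, n2):
    """Rank-biserial correlation from a Mann-Whitney U statistic.

    Assumes `u` is the smaller of the two U values for the comparison.
    """
    return 1 - (2 * u) / (n1 * n2)


# (U, n1, n2) as reported in the text; the pairing of sizes is an assumption.
comparisons = {
    "One vs. Four": (2525, 77, 108),
    "Two vs. Four": (2483, 76, 108),
    "Three vs. Four": (2701, 65, 108),
    "One-Three combined vs. Four": (926, 45, 78),
}
effect_sizes = {name: rank_biserial(*args) for name, args in comparisons.items()}
for name, r in effect_sizes.items():
    print(name, round(r, 3))
```

The computed values are roughly 0.393, 0.395, 0.230, and 0.472, against reported values of 0.39, 0.40, 0.23, and 0.47; the small gap for Two vs. Four is consistent with U having been rounded before publication.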

5 Conclusion

The findings of this study suggest that researchers can label and measure the connections between various cognitive behaviors with roots in Winne’s SMART model. This paper examines how the assembling and translating operations, through the connection between contextual representation and data transformation, relate to learners' performance as they become more familiar with the learning environment. Additionally, the connection between contextual representation and outcome orientation shows how different aspects or focuses of content can be related within the assembling operation. This sheds light on the possibility of deconstructing the unit category in the SMART model to see different patterns.

The approach applied in this paper connects automated detectors of SRL constructs to ENA models based on the textual responses in the learning environment. Epistemic networks can illustrate the co-occurrences of SRL indicators to show a dynamic pattern of SRL behaviors and stress the importance of connections among micro-level SRL operations. The findings offer insights into how the language of math problem-solving can reveal and support the intentional SRL process. The analyses conducted in this study demonstrated that contextual representation and data transformation are associated with math problem-solving and are heavily relied on by learners. In addition, students focus more on outcome orientation as they become proficient users of the CueThink platform. When they become familiar with the functions and tools embedded in the system and generate more correct answers, we detect more outcome orientation in the text.

It is pertinent to note that a limited number of students participated in this study. Future work should investigate more students and their performance on math problem solving. The current study is also limited in its exploration of the other operations within Winne's [25] SMART model. In the future, we will work on developing more constructs that connect SRL theories with the context of the learning environment. Another challenging problem is how to interpret the connections among our measured constructs of self-regulated learning within a discourse of teaching and curriculum design. One possible implication is that a math problem-solving system could provide suggestions to students about making full use of numbers (numerical representation), building relationships between values and equations (data transformation), and forming expectations and estimates of the final answer (outcome orientation). The difference between the two performance groups is not cogent enough to persuade future learners to learn from the high-performing group, but the comparison could serve as a method for future research. It is also a promising direction to consider learner behavioral events and textual responses together in understanding the continuing development of self-regulated learning. Doing so encourages learning scientists to design more think-aloud activities and to prompt students to verbalize and visualize their cognitive processes. However, we still face the challenge of making sense of the trajectory nature of log data associated with discourse data when the language is generated in response to given prompts and questions. Additionally, instead of only developing codes deductively, we might consider generating codes inductively to better understand the particular learning environment and learners’ personalities. It is crucial to frame what is occurring in the internal world of the learner instead of fitting learner words into a pre-structured framework [26]. This view argues for the importance of interpreting cultures in discourse rather than straying from textual contexts and relying only on abstract concepts. Overall, this study is an exploratory learning journey, and we will continue to appreciate the power of language in explaining learner behaviors and thoughts.