Introduction

As information technology continues to evolve and globalization accelerates, people are increasingly faced with complex and diverse challenges in their daily lives and learning environments (e.g., Jiang et al., 2023; Çini et al., 2023). When dealing with these challenges, people inevitably need to collaborate. Collaboration is not only a process of team agreement, but also an activity that involves information sharing, decision-making, and implementation among multiple partners (e.g., Sun et al., 2020; Ouyang et al., 2023). This process not only requires that team members have trust and respect for each other, but also effective communication skills and conflict resolution (e.g., Andrews‐Todd & Forsyth, 2020). Information sharing is the cornerstone of collaboration, ensuring that team members are kept informed of the project’s progress and can adjust their tasks accordingly. Meanwhile, collaboration in decision-making and implementation is equally critical. It requires members to be able to work together to analyze problems and develop optimal solutions (e.g., Jiang et al., 2023). In particular, interdisciplinary, multilevel collaboration is essential to solving contemporary complex social and scientific problems. Therefore, it is imperative for students to possess a robust and collaborative problem-solving competency that enables them to effectively contribute to the resolution of these complex issues (e.g., Zhang et al., 2022). Complex problem solving often exceeds the capabilities of a single discipline or individual. Similarly, collaborative problem solving (CPS) competencies enable groups to cross knowledge boundaries and bring together diverse perspectives and approaches through joint exploration (e.g., Zheng et al., 2020). 
Consequently, the development of CPS competencies necessitates not just an ability to appreciate and respect the perspectives and contributions of peers, but also proficiency in effective communication and strategic teamwork (e.g., Sun et al., 2020). For instance, Sun et al. (2020) constructed a generic CPS competency model including constructing shared knowledge, negotiation/coordination, and maintaining team function.

CPS integrates the complexities of collaboration and problem solving into a dynamic process in which the two are intertwined and mutually reinforcing (e.g., Swiecki et al., 2020). In developing CPS competency, the significance of research and practical application in CPS cannot be overstated (von Davier et al., 2017). Correspondingly, many assessment frameworks have been proposed to uncover the interactive influences and dynamics of CPS competency and to identify possible bottlenecks in teamwork (e.g., Andrews-Todd & Forsyth, 2020; Ouyang & Chang, 2019; Sun et al., 2020). For example, the CPS framework proposed by Hesse et al. (2015) described social skills centered on interactions with group members through participation, perspective taking, and social regulation. In the same framework, cognitive skills such as task regulation and learning and knowledge building were also essential. In addition, to highlight the dynamic interconnections of collaboration and problem solving, three collaboration skills and four problem-solving skills in PISA were intersected to form 12 categories of skills (OECD, 2017). In particular, the collaboration dimension contained three core skills, i.e., establishing and maintaining shared understanding, taking appropriate action to solve the problem, and establishing and maintaining team organization, while the problem-solving dimension contained four core skills, i.e., exploring and understanding, representing and formulating, planning and executing, and monitoring and reflecting (OECD, 2017). Moreover, Andrews-Todd and Forsyth (2020) categorized CPS skills into four social skills and five cognitive skills on the basis of human–human interaction settings. Social dimension skills include maintaining communication, sharing information, establishing shared understanding, and negotiation. Cognitive dimension skills include exploring and understanding, representing and formulating, planning, executing, and monitoring.
In addition, meta-cognitive skill is an integral part of CPS. It represents the ability of individuals to monitor and adjust their cognitive activities (Çini et al., 2023). Meta-cognitive skill also exhibits profound social attributes that facilitate learners’ adaptability to diverse cognitive styles among their peers (Dindar et al., 2020). As emphasized by Dindar et al. (2020), investigating meta-cognition independently cannot adequately account for the interaction between cognitive and social aspects in the CPS process. There is a complex interplay between cognitive, social, and meta-cognitive skills during CPS. This interaction not only affects individual performance in collaboration, but also has a profound impact on the CPS outcomes of the team as a whole (Tan et al., 2014). Previous studies on CPS have either focused on the dynamic association between cognitive and social dimensions or have chosen the meta-cognitive perspective to explore its impact on CPS performance (Dindar et al., 2020; Smith & Mancy, 2018). As a result, the field of CPS still lacks a perspective that organically integrates the social, cognitive, and meta-cognitive dimensions.

Previous studies have assessed students’ CPS competencies in depth using diverse methods such as questionnaire survey, computerized tests, and quantitative content analysis (e.g., Luengo-Aravena et al., 2024; Ma et al., 2023; An & Zhang, 2024). They have provided a rich empirical foundation for our understanding of how students behave in the area of CPS (e.g., Jiang et al., 2023). However, the statistical analysis method typically calculates the percentage of frequencies for each dimension across the entire CPS session. This procedure explains the difference in the contrast of the individual dimensions accumulated by each category in a long-term temporal context. Analytical methodology that only reveals changes that have occurred or that produce cumulative frequency counts is inadequate (e.g., Lämsä et al., 2021). To address this challenge, more research needs to consider CPS as a process that unfolds over time and space. Furthermore, learning science research has shifted from a result-oriented focus to a process-oriented focus (e.g., Zheng et al., 2020; Xu et al., 2023). Currently, the computer-based environment allows for detailed recording of learners’ traces in the CPS process, such as online collaborative discourses. This makes it possible to analyze CPS as a process that unfolds over a longer period of time (Jiang et al., 2023). Collaborative problem-solving patterns characterize the transitions between a complex set of behaviors exhibited by a group throughout the problem-solving process (Zheng et al., 2020). By analyzing the time sequences formed by learner behaviors, researchers may gain valuable insights into the underlying patterns that shape collaborative problem-solving endeavors (e.g., Li & Liu, 2017; He et al., 2022). Therefore, conducting an in-depth analysis of CPS patterns contributes to revealing the relationship between learning strategies and learning outcomes (Jiang et al., 2023). 
However, students possibly adopt a variety of different learning strategies during the CPS process, leading to heterogeneity of the learning data. Therefore, how to effectively integrate and sanitize these messy data to select appropriate analytical methods is a key issue to understanding CPS patterns in an in-depth manner (e.g., Zhang & Andersson, 2023; Zhang et al., 2023).

Similarly, the investigation of CPS patterns among university students is paramount in higher education (e.g., Jiang et al., 2023). As higher education institutions strive to equip students with the essential skills for success in an increasingly complex world, understanding the dynamics of CPS becomes imperative (e.g., Ouyang & Dai, 2021). Therefore, the exploration of CPS patterns serves as a conduit for unveiling the nuanced strategies employed by university students, shedding light on the factors that contribute to successful CPS outcomes (e.g., Tan et al., 2014; Zhang et al., 2022). Xu et al. (2023) identified four significant CPS patterns, i.e., a consensus-achieved pattern, an argumentation-driven pattern, an individual-oriented pattern, and a trial-and-error pattern by analyzing process data from 19 pairs of undergraduate students during pair programming tasks. Moreover, Zheng et al. (2020) used a combined sequence analysis approach to investigate the CPS pattern from an interaction perspective. Sequence analysis and assessment were conducted by coding student discussion transcripts for both high- and low-performing groups. The research revealed that the process of in-depth discussion arose from conflicting arguments between peers, which facilitated learning performance. In recognition of the increasing emphasis on analyzing CPS patterns in collaborative learning processes among university students, there is an urgent need for more in-depth research to explore potential key differences among different student groups and provide more precise insights and guidance for enhancing CPS outcomes.

Given the above research gaps, this study aims to analyze CPS as a process that unfolds over a longer period of time and within a complex spatial dimension, focusing on exploring the interrelationships between multiple implicit dimensions in CPS. In particular, this study proposes a three-stage analytical framework to conduct a gradual exploration of the CPS pattern. The results provide insights into pattern inquiry in CPS and the design and implementation of collaborative learning.

Literature review

Collaborative problem solving

Collaboration is generally defined as a social activity that aims to maximize productivity (e.g., Bozeman et al., 2001). According to the constructivist learning paradigm, learners actively build their understanding through meaningful engagement in tasks and peer activities during collaboration (e.g., Bada & Olusegun, 2015). Additionally, the success of collaborative learning relies on mutual regulation among group members. For example, group members engage in socially shared regulation of learning, where they coordinate strategic enactments of the joint task, collectively monitor the task’s progress and the group’s products, and make adjustments when needed to optimize collaboration both within and across tasks (e.g., Malmberg et al., 2017). Another important concept related to collaborative learning is meta-cognition, which refers to the knowledge about cognition and the regulation of cognitive processes aimed at achieving specific goals (e.g., Dindar et al., 2020). In this regard, meta-cognitive knowledge and processes of group members are considered to be the driving force behind group-level regulated learning (e.g., Dindar et al., 2020).

CPS is characterized as “the capacity of an individual to effectively engage in a process whereby two or more agents attempt to solve a problem by sharing the understanding and effort required to come to a solution and pooling their knowledge, skills, and efforts to reach that solution” (OECD, 2013, p. 6). In particular, the realm of assessment research has frequently delved into examining the social and cognitive skills exhibited by CPS (e.g., OECD, 2017; Hesse et al., 2015). The concept of CPS competencies is multifaceted, encompassing both social and cognitive dimensions. Social dimension refers to the social interactions that learners engage in when responding or replying to their peers’ ideas or opinions (e.g., Ouyang & Dai, 2021). Interaction between members is often achieved through online synchronous or asynchronous communication tools. Cognitive dimension refers to learners accomplishing individual knowledge inquiry and group knowledge enhancement through discussion (e.g., Damşa, 2014). Therefore, both social and cognitive skills are critical in the CPS process. The combination of the two can significantly improve the efficiency of the CPS process and the quality of the outcomes (e.g., Liu & Matthews, 2005). Andrews-Todd and Forsyth (2020) identified four distinct types of collaborative problem solvers: high social/high cognitive, high social/low cognitive, low social/high cognitive, and low social/low cognitive. They revealed that in a three-person collaborative setting, the presence of at least one member with high social and cognitive capabilities significantly enhanced the CPS outcomes. Furthermore, Li et al. (2022) delved into the intricate relationship between students’ action transition patterns and their levels of social and cognitive skills when engaging in CPS tasks. The results also showed that pairs of students exhibiting high social and cognitive skills displayed a higher average rate of transitions between actions. 
This underscores the importance of diverse skillsets in collaborative endeavors, particularly when it involves critical skills such as those required in CPS. Numerous studies have explored the relationship between social and cognitive dimensions of CPS (e.g., Ouyang & Chang, 2019; Li & Liu, 2017). For example, Ouyang and Chang (2019) found that learners needed to maintain social interactions among peers, such as responding to others’ ideas or making substantive cognitive contributions during CPS tasks. Students with different levels of social activity engagement also tended to differ in their participation and contribution to cognitive activities. According to Zhang et al. (2022), it was important for students to extract more non-shared information by asking questions and giving specialized feedback. This not only contributed to deeper learning for individual students, but also reinforced the centrality of social interaction and cognitive exchange in CPS tasks.

Meta-cognitive dimension refers to a sequence of activities in which learners plan tasks, monitor and regulate the learning process, and facilitate the successful completion of learning tasks during the collaboration process (Hadwin et al., 2017). Therefore, successful collaborative problem solving also requires effective coordination of individual and group processes across both cognitive and social dimensions (e.g., Baker et al., 2001; Hadwin et al., 2017). For example, Smith and Mancy (2018) coded elementary school students’ discourses about the process of small-group math problem solving in a natural classroom setting, where the coding dimensions were meta-cognitive talk, cognitive talk, and social talk. Compared with cognitive talk, the study found that meta-cognitive talk was more likely to conform to the criteria for collaborative talk. Furthermore, Çini et al. (2023) explored meta-cognition in the context of collaborative group learning. The study found an association between the level of meta-cognition at the individual, social, and environmental levels in CPS and CPS task performance, further demonstrating the important role that meta-cognitive awareness plays in the CPS process. Although it has been widely recognized that meta-cognition plays a critical role in students’ collaborative problem solving, the interplay between meta-cognitive and social engagement as well as cognitive engagement during learners’ CPS remains a rarely addressed issue in the research field. Therefore, there is a need for more in-depth research on the interplay of these three dimensions in the process of CPS. It would contribute to our better understanding of the nature of CPS in the digital age.

Multiple analyses of collaborative problem-solving patterns

A key feature of CPS in computer-based settings is the ability to record the behaviors taken by students without interfering with their learning process (e.g., von Davier et al., 2017). All the behaviors taken by different students or groups then form a time sequence of CPS (e.g., He et al., 2022; Li et al., 2019). CPS patterns refer to the sequential relationships that constitute the behaviors of a group in the process of collaborative problem solving (e.g., Zheng et al., 2020). These patterns outline the changing dynamics and interactions of a group throughout the multiple stages of problem identification and solution exploration (e.g., Swiecki et al., 2020; Zhang & Andersson, 2023). Furthermore, educational data mining (EDM) is being emphasized by an increasing number of researchers (e.g., Zhang & Andersson, 2023; Xu et al., 2023). A large number of studies have explored students’ problem-solving processes after clustering process data (e.g., Lee, 2018; Ouyang et al., 2023; Zhang & Andersson, 2023; Xu et al., 2023). For example, Ouyang et al. (2023) proposed a three-layer analytic framework designed to investigate the characteristics of collaboration patterns in CPS activities. Three patterns of collaboration were obtained by coding and clustering students’ verbal and behavioral data. According to Baker (2010), clustering, as an important area of EDM, can be established at several different levels of granularity, such as student clustering to study differences between students, and student behavioral clustering to study learning behavioral patterns. However, previous empirical studies have explored the underlying factors of CPS in the overall sample using mainly a variable-centered approach (e.g., structural equation models), neglecting crucial differences between subgroups (e.g., Biasutti & Frate, 2018; Li & Liu, 2017). In addition, current clustering algorithms typically tended to employ traditional similarity measures when determining clusters. 
Considering the spatial–temporal characteristics of CPS data, there is an urgent need to select appropriate distance metrics on the basis of the unique characteristics of the data to effectively complete the clustering process (e.g., He et al., 2019; Li et al., 2019). When faced with complex datasets with spatial–temporal characteristics, traditional similarity measures may fail to capture the underlying relationships among the data, so it is crucial to implement a more flexible and adaptable distance measure into the clustering algorithms (He et al., 2022).

Numerous studies have utilized various learning analytics methods to explore differences between CPS patterns of high and low performance groups at a fine-grained level (e.g., Zheng et al., 2020; Xu et al., 2023). For example, process mining (PM) is rooted in process modeling-driven methodologies and data mining designed to reveal and understand actual operational processes of events. Through analyzing and visualizing these processes, organizations can identify problems and make decisions on the basis of data (Bannert et al., 2013). Furthermore, many studies have demonstrated that combining indicators produced by several algorithms can better explain learning processes compared with individual algorithms (e.g., Saint et al., 2021; Saint et al., 2020; Zheng et al., 2020). As two complementary algorithms for process mining, the combined analysis of fuzzy miner and pMineR has been proven to provide better insight into fine-grained learning processes in the field of self-regulated learning (SRL) (Saint et al., 2020, 2021). Therefore, there is a need to investigate how to combine complementary process analysis methods to explore collaborative data from multiple perspectives to understand the CPS pattern more precisely.

In conclusion, more and more studies have recognized that CPS is a dynamic process that unfolds over time, in which multiple dimensions of activities have complex interactions. To analyze the complex data with temporal properties generated in computer-based environments, there is a need to select appropriate distance measures for clustering and combine multiple learning analytics to investigate from multiple perspectives. Therefore, this study attempts to fill this research gap by employing a three-stage analytical framework. In particular, this study mainly answers the following two research questions:

  (1) What kinds of CPS subgroups can be identified by using a clustering algorithm that considers both temporal and spatial attributes?

  (2) How do CPS patterns differ between subgroups comprising diverse cognitive, social, and meta-cognitive skill levels?

Methodology

Research context and participants

We conducted this study in an elective course for university students, titled Instructional Design and Courseware Development, at a comprehensive university in central China. The goal of the course was to enable students from different disciplines (Math, History, English, etc.) to present teaching content and organize teaching activities well in specific teaching environments. Therefore, the emphasis of the course was on the flexible application of two information and communication technology (ICT) tools called Seewo Whiteboard 5 and Seewo EasiCare. Moreover, the course instructor recommended websites and tools related to each subject to the students depending on the classroom activities. Through the 8-week course, students were able to integrate subject content knowledge, subject-specific pedagogies, and ICT tools. In addition, the course utilized a task-driven format to facilitate the application of knowledge while enhancing the collaborative skills of students (Zhang et al., 2022). Students were asked to work in groups and collaborate through online discussions to complete two tasks, namely instructional design and multimedia courseware development. We reminded students that the multimedia courseware needed to be developed on the basis of the content of the instructional design. In total, 24 students participated in this study. They worked in eight groups of three students each and discussed through the online synchronous chat tool Tencent QQ to complete the two tasks, and we collected complete discourse data from all eight groups. The grouping procedure followed a previous study (Zhang et al., 2022). Among the participants, there were 18 female and 6 male students, aged approximately 20 to 22 years.

As shown in Fig. 1, the students were required to conduct group discussions to determine the topics of instructional design after week 1. Afterward, the instructional design and multimedia courseware development tasks were organized around the topics. The instructional design task was divided into three stages: first draft, revised draft, and final draft. Through group discussion, the students produced the first draft of the instructional design at the end of week 2, the revised draft at the end of week 3, and the final draft at the end of week 4. Similarly, the multimedia courseware development task was also divided into three stages: first draft, revised draft, and final draft. The students were required to complete the corresponding stages through group discussion at the end of week 5, week 6, and week 7. Finally, each group presented and introduced their products to the whole class in week 8. In this study, WPS Office was used for the instructional design task because it supported synchronous online editing by group members and saved records of the collaborative process at the same time. In addition, for the multimedia courseware development task, this study provided two ICT tools called Seewo Whiteboard 5 and Seewo EasiCare to help students complete the task. The online discussion of the members in each group was mainly conducted through Tencent QQ, an instant messaging software service. For effective peer-to-peer communication and collaboration, each group was tasked with establishing a dedicated chat room on Tencent QQ. Furthermore, assistant instructors actively participated in the chat rooms, offering technical assistance and support to the groups as needed (Zheng et al., 2023; Su et al., 2018).

Fig. 1 Schedule of learning activities for the 8-week course

Procedures for collecting and analyzing CPS process data

Data collection

The present study collected 16 datasets of 8 groups, which corresponded to the discourse data of 8 groups in the instructional design phase and the multimedia courseware development phase, respectively. To ensure data consistency, we excluded task-related information posted by the instructor of the course in each group. In addition, we also collected the products that were completed as a result of each group's collaboration, i.e., instructional design documents and multimedia courseware documents.

Data analyses

When exploring the multiple dimensions of CPS, we found that its complexity and depth can be understood through different theoretical perspectives. Given our focus on the interactive processes of social communication, cognition, and meta-cognition in CPS, three theories are relevant to our study: collaborative learning, constructivism, and self-regulated learning (SRL) (e.g., Laal & Laal, 2012; Bada & Olusegun, 2015; Clark, 2012). Firstly, collaborative learning theory outlines the interpersonal interactions and dynamics inherent among participants in collaborative learning environments (Laal & Laal, 2012). Secondly, the cognitive dimension provides insight into the cognitive processes and knowledge construction mechanisms that sustain CPS, which is closely related to the constructivist perspective. Finally, meta-cognition emphasizes the role of self-regulation and reflective practice in guiding and monitoring cognitive processes during CPS. This dimension is based on SRL theory and emphasizes the importance of meta-cognitive awareness, strategic planning, and adaptive regulation of learning strategies (e.g., Clark, 2012).

Since student collaboration primarily manifested as online discussion mediated by communication technology (e.g., Tencent QQ), we focused on analyzing the correlations and dependencies between verbal behaviors that emerge over time (e.g., Lämsä et al., 2021). In the subsequent analyses, we selected an appropriate cluster analysis method on the basis of the features of the collected time sequences to identify subgroup differences in CPS behaviors. In addition, CPS patterns illustrate the sequential relationships that support group behavior during collaborative activities (Zheng et al., 2020). Such patterns capture the evolving dynamics and interactions between individuals during the various stages of problem identification and solution exploration (e.g., Swiecki et al., 2020; Zhang & Andersson, 2023). Therefore, we proposed a three-stage analytical framework to explore the CPS patterns (see Fig. 2). In the first stage, we coded the 16 collected datasets in terms of meta-cognitive, cognitive, and social communication dimensions (see Appendix 1). As a result, we obtained discourse sequences with temporal properties. In the second stage, we clustered these 16 time sequences. Time sequence clustering consisted of three main parts: similarity measure, prototype computation, and clustering algorithm. In the third stage, to explore the CPS patterns, we further examined the clusters formed in the previous stage using statistical analysis and process mining techniques. Combining multiple analyses provided us with a deeper understanding of CPS patterns than any single approach. Therefore, we used two process mining algorithms. The first algorithm, named temporal process mining, produced sequential process maps with a frequency-based and temporal-based focus. The second algorithm, named stochastic process mining, explored the same data through the probabilistic lens of associative processes, focusing on the transition probabilities between behaviors.
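The transition-probability view mentioned above can be made concrete with a minimal sketch. It estimates first-order transition probabilities from one coded sequence. It is not the process mining software used in this study; the function name and the single-letter codes (M for meta-cognitive, C for cognitive, S for social-communicative) are assumptions made purely for illustration, and the sequence is invented.

```python
from collections import Counter, defaultdict


def transition_probabilities(sequence):
    """Estimate first-order transition probabilities P(next code | current code)."""
    # Count each adjacent (current, next) pair in the coded sequence.
    pair_counts = Counter(zip(sequence, sequence[1:]))
    # Count how often each code appears as the start of a transition.
    outgoing = Counter(sequence[:-1])
    probs = defaultdict(dict)
    for (src, dst), count in pair_counts.items():
        probs[src][dst] = count / outgoing[src]
    return dict(probs)


# Hypothetical coded discourse (illustrative only, not data from the study).
coded = ["S", "C", "C", "M", "C", "S", "S", "C"]
probs = transition_probabilities(coded)

for src in sorted(probs):
    for dst in sorted(probs[src]):
        print(f"{src} -> {dst}: {probs[src][dst]:.2f}")
```

Each row of the resulting table sums to 1, so the values can be read as the likelihood that a given type of behavior is followed by another type.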

Fig. 2 A three-stage analytical framework of CPS processes

  • Stage 1: Coding of CPS processes

We used Python to transcribe the online discourse data from the Tencent QQ platform into Excel documents. To simplify subsequent coding, we split and consolidated the discourse data into single sentences. After transcription and organization, the 16 datasets contained 4116 entries (mean 257.25 per dataset; SD 108.39), which corresponded to 4116 events. The CPS coding framework employed in this study was adopted from Tan et al. (2014). As a framework for dialogic analysis, it was proposed in the context of a computer-based CPS formative assessment task, aiming to theorize, measure, and foster students’ collective creativity. As shown in Appendix 1, the CPS coding framework included meta-cognitive, cognitive, and social-communicative skills as well as sub-skill components. In particular, the meta-cognitive dimension corresponded to the group’s ability to self-examine, reflect, and reformulate solutions as manifested in CPS. The cognitive dimension included both divergent and convergent production, i.e., divergent and convergent thinking. Divergent thinking was the group’s ability to generate ideas, suggestions, and alternatives in facilitated CPS, while convergent thinking was the group’s ability to evaluate, filter, and integrate ideas and suggestions to find the best solution. The social-communicative dimension corresponded to the group’s ability to engage in reciprocal and productive interactions. In the coding framework, the meta-cognitive dimension was subdivided into three coding categories, the cognitive dimension into ten coding categories, and the social-communicative dimension into seven coding categories. These categories are detailed in Appendix 1.

We coded the online discourse data in the Excel documents according to the CPS coding framework outlined in Appendix 1. As recommended by Tan et al. (2014), each message sent by students was considered as a meaningful unit. If a single message contained a composite function, it was analyzed and coded in segments. If a single message was incomplete (i.e., a statement that stopped abruptly or whose communicative function was clearly unfinished), it was coded after being integrated into a meaningful unit. Two coders (the first and second authors) coded the 4116 events of the 16 datasets in three stages. First, the two coders reached a common understanding of the coding framework through discussion before coding. Second, they independently coded a random 20% of the datasets. The interrater reliability coefficient (Cohen’s kappa) between the two coders was 0.830 at this stage, which indicated good reliability (Fleiss, 1981). The two coders then resolved the discrepancies in their coding through discussion, further refining their common understanding of the coding framework. Finally, the remaining 80% of the datasets were equally divided between the two coders for independent coding. As a result, 926 codes belonged to the meta-cognitive dimension, accounting for 22.5%; 1561 codes belonged to the cognitive dimension, accounting for 37.9%; and 1626 codes belonged to the social-communicative dimension, accounting for 39.5%.
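For readers unfamiliar with Cohen’s kappa, the following minimal sketch shows how the coefficient corrects the observed agreement between two coders for the agreement expected by chance. The function name, the single-letter codes, and the two short code lists are hypothetical and serve only as an illustration; they are not the study’s coding data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who assigned one label to each unit."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must code the same non-empty set of units.")
    n = len(rater_a)
    # Observed proportion of units on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # Both raters used a single identical label throughout.
    return (observed - expected) / (1 - expected)


# Hypothetical codes from two coders for the same ten messages.
coder_1 = ["M", "C", "C", "S", "S", "C", "M", "S", "C", "S"]
coder_2 = ["M", "C", "S", "S", "S", "C", "M", "C", "C", "S"]
kappa = cohens_kappa(coder_1, coder_2)
print(round(kappa, 4))
```

In this toy example the coders agree on 8 of 10 messages (0.80), chance agreement is 0.36, and kappa is therefore (0.80 − 0.36) / (1 − 0.36) = 0.6875.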

  • Stage 2: Time sequences clustering of CPS processes

After coding the 16 datasets, we obtained 16 time sequences. A clustering approach was then used to explore CPS patterns. The clustering process was organized into three steps. First, on the basis of the chronological features of the datasets, we used the dynamic time warping (DTW) method to calculate the distance between each pair of CPS time sequences. Next, we obtained the centroids of the clusters by prototype calculation. In particular, we adopted a DTW-based prototype function called DTW barycenter averaging (DBA), a robust method for averaging time sequences. Finally, we selected K-means, a typical representative of partitioning algorithms, to complete the clustering of CPS processes. The detailed clustering process is introduced as follows.

  (1) Similarity measure of CPS processes

As an unsupervised learning task, clustering usually measures the similarity between time sequences by Euclidean distance. However, the common Euclidean distance measure does not accommodate time offsets and ignores the temporal dimension of the data. When two time sequences are highly correlated but one of them is shifted by even one time step, the Euclidean distance will incorrectly measure them as further apart. Distances between time sequences need to be carefully defined to reflect the inherent proximity of these particular data, which is usually based on shape and pattern (e.g., He et al., 2022). Distance metrics used in standard clustering algorithms (e.g., Euclidean distance) are therefore often inappropriate for time sequences, and a better approach is to replace the default metric with one designed for comparing time sequences. Dynamic time warping (DTW) is a technique for measuring the similarity between two time sequences that are not identical in time, speed, or length. It finds the best match between the coordinates of the two sequences, warping the first signal to the time domain of the second and vice versa. The goal of DTW is to identify similar patterns of change independent of the time axis, which makes it possible to produce more intuitively correct similarity measures. The discourse data generated by students during CPS posed the same requirement. Finding the best warping path between two sequences yielded an appropriate similarity measure, and thus DTW was chosen as the similarity measure for this study (e.g., He et al., 2019; Li et al., 2019).

Given a sequence X={\({x}_{1}\), \({x}_{2}\), \({x}_{3}\)…, \({x}_{n}\)} and a sequence Y={\({y}_{1}\), \({y}_{2}\), \({y}_{3}\)…, \({y}_{m}\)}, the distance from X to Y is formulated as the following optimization problem. First, a matrix of size (n+1) × (m+1) is constructed. Second, the matrix is initialized according to formula (1): the cell in row 0 and column 0 is set to 0, and every other cell in row 0 or column 0 is set to infinity. Next, the remaining cells are filled in according to formula (2): for each cell, the absolute value of the difference between the corresponding elements of X and Y is obtained first, and then the smallest value among the three neighboring cells is added to it.

$${\text{DTW}}_{(i,j)}=\left\{\begin{array}{ll}\infty & \text{if } (i=0 \text{ or } j=0) \text{ and } i\neq j\\ 0 & \text{if } i=j=0\end{array}\right.$$
(1)
$${\text{DTW}}_{(i,j)}=|{x}_{i}-{y}_{j}|+\text{min}\left\{\begin{array}{ll}{\text{DTW}}_{(i-1,j-1)} & \text{(match)}\\ {\text{DTW}}_{(i-1,j)} & \text{(insertion)}\\ {\text{DTW}}_{(i,j-1)} & \text{(deletion)}\end{array}\right.$$
(2)

For example, let us assume sequence X={1, 2, 3, 4, 5, 6} and sequence Y={1, 2, 3, 4, 3, 2, 5}. For the cell in column 3, row 5, which is shaded blue in Fig. 3, and its neighboring elements, highlighted by a red box, we calculated \({\text{DTW}}_{(5,3)}=|{x}_{5}-{y}_{3}|+\text{min}(6,3,1)=2+1=3\). In other words, each matrix element is calculated as the absolute difference plus the minimum of the neighboring values. There are three possible selections for the minimum value, which correspond to “match,” “insertion,” and “deletion” in formula 2. These three selections correspond to what we need to do to make sequence Y more compatible with sequence X. Selection 1, “match,” implies that the corresponding elements of the two sequences are identical. Selection 2, “insertion,” implies that an element needs to be inserted in sequence Y to correspond to an element in sequence X. Selection 3, “deletion,” implies that an element needs to be deleted from sequence Y to correspond to an element in sequence X. With formula 1 and formula 2, the calculation of the values of the matrix can be completed. After we get a complete matrix, we can find the shortest path from the starting point to the ending point, which is the diagonal (highlighted in yellow shading in Fig. 3). Finally, we sum the values along the shortest path to get the DTW similarity score between the two sequences, i.e., 4+3+3+1+0+0+0+0+0+0=11.
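The matrix construction in formulas (1) and (2) can be sketched in a few lines. This is a minimal illustration rather than the implementation used in the study; the function name `dtw_matrix` is hypothetical, and rows are assumed to index sequence X and columns sequence Y.

```python
import math

def dtw_matrix(x, y):
    """Accumulated-cost matrix with len(x)+1 rows and len(y)+1 columns."""
    n, m = len(x), len(y)
    # Formula (1): the origin is 0; the rest of row 0 and column 0 is infinity.
    d = [[math.inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    # Formula (2): local cost plus the smallest of the three neighbours.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])
            d[i][j] = cost + min(d[i - 1][j - 1],  # match
                                 d[i - 1][j],      # insertion
                                 d[i][j - 1])      # deletion
    return d

x = [1, 2, 3, 4, 5, 6]
y = [1, 2, 3, 4, 3, 2, 5]
d = dtw_matrix(x, y)
print(d[5][3])    # cell for x_5 and y_3
print(d[-1][-1])  # accumulated cost in the final cell
```

For the example sequences, the cell for \({x}_{5}\) and \({y}_{3}\) evaluates to 3, in line with the worked calculation above, and the accumulated cost in the final cell is 4. That final-cell value is what DTW implementations commonly report as the distance; it is a different quantity from the sum of accumulated values along the path described in the text.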

Fig. 3

Example of a distance matrix for DTW calculation

A warping path is an alignment between two CPS time sequences in which an element of one sequence may be mapped to several elements of the other. The cost of a warping path is calculated by summing the cost of each pair of mapped elements. In addition, a warping path must meet three requirements. The first is the endpoint constraint: the path must start at the lower left corner of the matrix and end at the upper right corner. The second is the monotonicity constraint: the elements of X and Y must appear in the path in the same order as in their respective sequences. The third is the step size constraint: each transition in the path advances by a step of 1. For example, from cell (3, 5) the path can move only to (4, 5), (3, 6), or (4, 6). The algorithm above applies to sequences composed of numbers. In this study, each of the 20 CPS sub-codes was therefore mapped to a number for the DTW calculation.
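The three path requirements can be checked on a concrete example. The sketch below fills the accumulated-cost matrix and then backtracks from the end cell to the start cell. It is an illustration under stated assumptions: the names `dtw_path` and `satisfies_constraints` are hypothetical, ties between equally cheap neighbours are broken arbitrarily (here by tuple order), and indices are 1-based to match the notation in the text.

```python
import math

def dtw_path(x, y):
    """Return one optimal warping path (1-based cells) and its total cost."""
    n, m = len(x), len(y)
    d = [[math.inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = abs(x[i - 1] - y[j - 1]) + min(
                d[i - 1][j - 1], d[i - 1][j], d[i][j - 1])
    # Backtrack from (n, m) to (1, 1), always moving to the cheapest neighbour.
    i, j = n, m
    path = [(i, j)]
    while (i, j) != (1, 1):
        _, i, j = min((d[i - 1][j - 1], i - 1, j - 1),
                      (d[i - 1][j], i - 1, j),
                      (d[i][j - 1], i, j - 1))
        path.append((i, j))
    path.reverse()
    return path, d[n][m]

def satisfies_constraints(path, n, m):
    """Check the endpoint, monotonicity, and step size constraints."""
    endpoints_ok = path[0] == (1, 1) and path[-1] == (n, m)
    allowed_steps = {(1, 0), (0, 1), (1, 1)}
    steps_ok = all((i2 - i1, j2 - j1) in allowed_steps
                   for (i1, j1), (i2, j2) in zip(path, path[1:]))
    return endpoints_ok and steps_ok

x = [1, 2, 3, 4, 5, 6]
y = [1, 2, 3, 4, 3, 2, 5]
path, total = dtw_path(x, y)
print(path)
print(satisfies_constraints(path, len(x), len(y)))
```

Restricting each move to one of the three allowed steps enforces monotonicity and the step size of 1 at the same time, so a single check covers both.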

  (2) Prototype computation of CPS processes

Clustering algorithms rely heavily on prototype computation functions. For example, the mean, a very commonly used prototype function, is often paired with the Euclidean distance measure, since it simply averages the time sequences at each point in time. Partition around medoids (PAM) instead selects from each cluster a representative object, called a “medoid,” which has the smallest distance from all other objects in the same cluster. However, neither the mean nor PAM takes into account the characteristics of DTW. Therefore, we used a DTW-based prototype function called DTW barycenter averaging (DBA). DBA starts from a randomly chosen initial averaging sequence, which serves as the common link between the coordinates of the multiple time sequences to be averaged (Li et al., 2019). At each iteration, the DTW alignment between each sequence and the averaging sequence is computed. For each coordinate of the averaging sequence, the coordinates of the time sequences aligned with it are averaged to obtain a new averaging sequence. This process is repeated until convergence.
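The DBA iteration described above can be sketched as follows. This is a simplified illustration, not the implementation used in the study: the function names `dtw_alignment` and `dba` are hypothetical, the initial averaging sequence is passed in explicitly rather than drawn at random so that the result is reproducible, and the example sequences are invented.

```python
import math

def dtw_alignment(a, b):
    """0-based index pairs (i, j) of one optimal DTW alignment of a and b."""
    n, m = len(a), len(b)
    d = [[math.inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = abs(a[i - 1] - b[j - 1]) + min(
                d[i - 1][j - 1], d[i - 1][j], d[i][j - 1])
    i, j = n, m
    pairs = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        _, i, j = min((d[i - 1][j - 1], i - 1, j - 1),
                      (d[i - 1][j], i - 1, j),
                      (d[i][j - 1], i, j - 1))
        pairs.append((i - 1, j - 1))
    return pairs

def dba(sequences, initial, max_iter=50, tol=1e-9):
    """Refine an averaging sequence by repeated DTW alignment and averaging."""
    average = [float(v) for v in initial]
    for _ in range(max_iter):
        # Collect, for each coordinate of the average, the values aligned to it.
        buckets = [[] for _ in average]
        for seq in sequences:
            for i, j in dtw_alignment(average, seq):
                buckets[i].append(seq[j])
        updated = [sum(b) / len(b) for b in buckets]
        shift = max(abs(u - v) for u, v in zip(updated, average))
        average = updated
        if shift < tol:  # stop once the average no longer changes
            break
    return average

# Hypothetical numeric sequences of unequal length (illustration only).
sequences = [[1, 2, 3, 4], [1, 1, 2, 3, 4], [1, 2, 4, 4]]
prototype = dba(sequences, initial=sequences[0])
print(prototype)
```

Because every coordinate of the averaging sequence is matched to at least one coordinate of each sequence, each bucket is non-empty and the update is always defined.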

K-means clustering is a method that groups data points into a predetermined number (k) of clusters based on their distances to the centroid of each cluster. It is an iterative algorithm that aims to minimize the sum of squared distances between data points and their assigned cluster centroids (e.g., He et al., 2019). As the goal of our research is to identify typical collaborative patterns, the K-means approach is well suited to interpreting homogeneous patterns. In addition, K-means is more resistant to noise and outliers, which helps to recognize time sequences with similar shapes or patterns more easily. The selection of k is crucial, since a good selection can lead to better clustering results. Clustering is carried out separately for each value of k in a given range, and the optimal clustering result is then identified using an evaluation index of clustering performance. The silhouette coefficient, which combines the two factors of cohesion and separation, has been used to evaluate the impact of different operational approaches on clustering performance (He et al., 2022). Therefore, in this study, we varied the number of clusters k from 2 to 10 and used the silhouette coefficient to select the optimal k. The silhouette coefficient ranges from −1 to +1, and the closer it is to 1, the better the clustering result.
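The selection of k by silhouette coefficient can be illustrated with a small calculation. The sketch below computes the mean silhouette coefficient from a precomputed pairwise distance matrix (for instance, DTW distances) and a list of cluster labels. The function name `silhouette_score`, the 4 × 4 distance matrix, and the candidate labelings are hypothetical and only serve to show the mechanics.

```python
def silhouette_score(dist, labels):
    """Mean silhouette coefficient from a pairwise distance matrix."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("at least two clusters are required")
    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a singleton cluster contributes a silhouette of 0
        # Cohesion: mean distance to the other members of the same cluster.
        a = sum(dist[i][j] for j in own if j != i) / (len(own) - 1)
        # Separation: smallest mean distance to the members of another cluster.
        b = min(sum(dist[i][j] for j in members) / len(members)
                for other, members in clusters.items() if other != lab)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(labels)

# Hypothetical distances: items 0-1 are close, items 2-3 are close.
dist = [[0, 1, 4, 4],
        [1, 0, 4, 4],
        [4, 4, 0, 1],
        [4, 4, 1, 0]]
# Hypothetical cluster assignments for two candidate values of k.
candidates = {2: [0, 0, 1, 1], 3: [0, 0, 1, 2]}
scores = {k: silhouette_score(dist, labels) for k, labels in candidates.items()}
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

In practice the labelings for each k would come from the clustering step itself; here they are fixed by hand so that the scoring can be followed line by line.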

  • Stage 3: CPS pattern analysis

We employed two methods to analyze the two clusters (CPS subgroups) identified in stage 2 and explore the characteristics of their CPS patterns. First, we used statistical analyses to obtain the frequency and distribution of each subdimension under the three dimensions of meta-cognition, cognition, and social communication. A chi-squared test was then used to examine whether the frequency and distribution of each subdimension differed significantly between the two clusters.
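The chi-squared comparison can be sketched on a small contingency table. The example below is an illustration under stated assumptions: the function names are hypothetical, the counts are invented rather than taken from the study, and the closed-form p-value applies only to the case of two degrees of freedom (a 2 × 3 table).

```python
import math

def chi_squared(table):
    """Pearson chi-squared statistic and degrees of freedom for a table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for row, row_total in zip(table, row_totals):
        for observed, col_total in zip(row, col_totals):
            # Expected count under independence of cluster and subdimension.
            expected = row_total * col_total / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, dof

def p_value_two_dof(statistic):
    """Upper-tail probability of a chi-squared variable with 2 dof."""
    return math.exp(-statistic / 2)

# Hypothetical counts: rows are clusters, columns are three subdimensions.
table = [[30, 20, 10],
         [20, 20, 20]]
statistic, dof = chi_squared(table)
print(round(statistic, 3), dof, round(p_value_two_dof(statistic), 3))
```

As a point of reference for reading such results, a statistic of 1.935 with two degrees of freedom corresponds to p ≈ 0.380 under this formula.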

Second, we employed process mining (PM) to further explore the CPS patterns at the micro level. Many different algorithms and visualization methods are available for PM (Saint et al., 2021). In this study, two PM algorithms were used to analyze the clusters, for the reasons given earlier. For frequency and temporal analysis, we used the fuzzy miner algorithm developed on the ProM platform (Xu et al., 2023). The algorithm can identify less significant processes and merge them with other low-frequency/high-correlation processes to simplify the process model. We analyzed the process patterns using Disco 3.6.7 software, which examined and visualized node transitions. In addition, we used first-order Markov models (FOMMs) and the pMineR software package to generate and visualize transition probability matrices for both clusters (e.g., Gatta et al., 2017). The layout of this visualization was similar to that of the fuzzy miner, but the lines between one process and the next showed the transition probabilities between processes. Together, the two PM algorithms provided extensive information about CPS patterns from different perspectives (Saint et al., 2020). The fuzzy miner algorithm focused on frequency and temporal analysis: its process maps provide the frequency and the time intervals of the transitions between CPS processes. The probabilistic transition matrix, by contrast, explores the data from the perspective of transition probabilities between CPS processes, giving the probability of a transition from one event to the next. We therefore combined the two algorithms to analyze the CPS patterns of the two clusters more comprehensively at the micro level, obtaining both the most frequent and the most probable transition patterns.
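The idea behind the first-order Markov model can be shown with a short sketch that estimates transition probabilities from coded sequences. It does not reproduce pMineR; the function name `transition_probabilities`, the "BEGIN" and "END" markers, and the two example sequences are assumptions made for illustration, with codes borrowed from the coding scheme (M, P, MGQ, MGR, SGCO).

```python
from collections import Counter, defaultdict

def transition_probabilities(sequences):
    """First-order transition probabilities, with BEGIN and END states."""
    counts = defaultdict(Counter)
    for seq in sequences:
        states = ["BEGIN", *seq, "END"]
        # Count each move from one code to the code that directly follows it.
        for current, following in zip(states, states[1:]):
            counts[current][following] += 1
    # Normalise each row of counts so that it sums to 1.
    return {
        state: {nxt: c / sum(row.values()) for nxt, c in row.items()}
        for state, row in counts.items()
    }

# Hypothetical coded discussions (illustration only).
sequences = [
    ["M", "MGQ", "MGR", "SGCO"],
    ["P", "SGCO", "SGCO", "MGR"],
]
probs = transition_probabilities(sequences)
print(probs["BEGIN"])
print(probs["SGCO"])
```

A transition from a code to itself, such as SGCO to SGCO in the second sequence, appears as a self-loop in the resulting process map.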

Finally, we used a TPACK scale to grade the group products (Zhang et al., 2022). The 16 time sequences corresponded to 16 group products: 8 instructional design documents and 8 multimedia courseware documents. The TPACK scale contains seven dimensions, namely, TK, CK, PK, TCK, TPK, PCK, and TPACK. Each dimension corresponds to an integration of different aspects of subject matter knowledge, pedagogical knowledge, and technological knowledge. Two coders (the first and second authors) graded the products on the seven dimensions, with scores from 1 to 5 on each dimension corresponding to the level of the group product from low to high, for a total of 35 points. They first discussed the definitions of the dimensions and specified the details of the scoring. Next, the group products from six sequences were scored jointly as cases, further reducing the differences in the scoring process. Finally, the two coders graded each of the group products generated from the 16 time sequences. The interrater reliability coefficient (Cohen’s kappa) was 0.801, showing good reliability (Fleiss, 1981). The two coders then negotiated again to resolve the few remaining inconsistencies, and the negotiated scores were taken as the final scores of the group products.

Ethical considerations

This study has been approved by the university’s Institutional Review Board. We informed the participants of the research goals, procedures, duration, and potential disadvantages during their registration process. Participants in this study were allowed to withdraw at any time without penalty. To protect the privacy of the participants, all information about the participants in this study was anonymized and used only for research purposes.

Results

What kinds of CPS subgroups can be identified by using a clustering algorithm that considers both temporal and spatial attributes?

To answer the first research question, we performed a cluster analysis on the 16 sequences with temporal characteristics. As shown in Fig. 4, the silhouette index reaches its maximum value (0.22) when the number of clusters is set to 2 (k = 2). Therefore, after calculating the distances by DTW and clustering with the K-means algorithm, we found that the optimal number of clusters for the 16 datasets was two. Table 1 presents the specific clustering of the 16 temporally characterized datasets generated from the two learning tasks. Cluster 1 contained a total of nine datasets, of which six belonged to the instructional design task and three belonged to the multimedia courseware development task. Each dataset corresponded to a different task phase, and their average discussion duration was 21.4 days. Cluster 2 contained a total of seven datasets, of which two belonged to the instructional design task and five belonged to the multimedia courseware development task. Each dataset corresponded to a different task phase, and their average discussion duration was 21 days. Furthermore, we calculated the learning performance of the two clusters. The group products resulting from the 16 discussion periods were scored on a TPACK scale, with a total score of 35 for the seven dimensions. The mean score for group products in cluster 1 was 27.83, with a maximum of 35 points and a minimum of 24 points. The mean score for group products in cluster 2 was 24.43, with a maximum of 28 points and a minimum of 19.5 points.

Fig. 4

Silhouette index in time sequences clustering

Table 1 Two clusters (CPS subgroups) obtained by DTW calculation and K-means algorithm

How do CPS patterns differ between subgroups comprising diverse cognitive, social, and meta-cognitive skill levels?

Statistical analysis of two CPS subgroups

We analyzed the datasets of the two clusters (CPS subgroups) from a quantitative perspective. As presented in Table 2, cluster 2 made greater use of the skills of the meta-cognitive dimension, employing monitoring, planning, and regulation in higher proportions than cluster 1. However, the chi-squared test showed no significant difference between the two clusters in the meta-cognitive dimension (χ² = 1.935, p = 0.380). The cognitive dimension was composed of both divergent production and convergent production. Divergent production corresponded to divergent thinking, the process by which students presented different ideas to expand the group’s common knowledge pool. As presented in Table 2, cluster 2 tended to generate ideas on the basis of principles or standards. In contrast, cluster 1 proposed more specific ideas and solutions. For example, the students in cluster 1 provided further information to explain and justify the ideas they proposed, or compared and analyzed different ideas. In addition, premature closure (anti-divergent) refers to the desire to get out of a situation or the unwillingness to further consider possible solutions. We observed such discourse in the discussions of cluster 2 but not in those of cluster 1. In other words, cluster 1 did not show a reluctance to consider ideas further during the discussion, while cluster 2 produced a few discourses with a sense of resistance. Convergent production was more about recognizing, questioning, and evaluating the ideas expressed by peers. In other words, the group reached a consensus on the best solution and aggregated the different opinions through mutual evaluation. We found that cluster 1 focused more on mutual evaluation among peers in the group. They engaged in more challenging and evaluating discussions beyond simply agreeing with and recognizing ideas. In contrast, the discussions in cluster 2 focused on analyzing the requirements of the task assignment. 
In particular, they were less likely to critique and evaluate their peers’ ideas. In the social-communicative dimension, we found that cluster 2 tended to build a shared understanding among peers through questions and answers. In addition, they preferred cohesive-task discourse, which means they actively encouraged and provided feedback on their peers’ contributions. Meanwhile, cluster 2 had more running jokes and statements that served purely social functions. Cluster 1 had a higher percentage of both affective and disaffective discourse than cluster 2. In other words, the students in cluster 1 engaged more in affective expressions, which included both positive and negative emotions. For example, they tended to use humor to liven up the group atmosphere, while also expressing negative emotions about the task more often. Neither cluster generated discourse on the uncohesive (anti-social) dimension. Therefore, a friendly discussion was maintained during the collaboration in this course. Finally, chi-squared tests showed significant differences between the two clusters on divergent production (χ² = 12.332, p = 0.002), convergent production (χ² = 98.135, p < 0.001), and the social-communicative dimension (χ² = 53.185, p < 0.001).

Table 2. Frequencies and distributions of CPS codes in two clusters

Frequency-based and temporal process mining of two CPS subgroups

As shown in Figs. 5 and 6, the process map consisted of two features. One feature was the square nodes, which represented different CPS skills (with the less frequent skills filtered out). In addition, the relative importance of the processes at the micro level can be immediately shown by the color shades. For less frequently used CPS skills, the color level became lighter. Another feature was the connectivity between the nodes, where the upper black numbers represented the frequency of transitions between skills and the bottom black numbers represented the time it takes to transition between CPS skills. The frequency and time metrics showed the absolute number of transitions and median lag time between two CPS skills.

Fig. 5

The analysis results of cluster 1 using fuzzy miner algorithm

Fig. 6

The analysis results of cluster 2 using fuzzy miner algorithm

From a transition frequency perspective, the commonality between the two clusters (CPS subgroups) was that students in both clusters tended to prioritize meta-cognitive skills to initiate discussions (monitoring and planning in cluster 1; monitoring in cluster 2). In particular, students tended to begin by focusing on the group’s progress in completing the task and making a plan for the task. For example, they often began by setting a time for the group to discuss, “When will everyone have time to discuss our group products?” (LYT, group 2). In addition, the skills solution generation-concrete (177 in cluster 1; 55 in cluster 2) and mutual grounding-responding (196 in cluster 1; 61 in cluster 2) were repeated frequently in both clusters. The differences lay in how the clusters used the three dimensions of skills in accomplishing their learning tasks.

The two clusters (CPS subgroups) showed different transition patterns, which were reflected in their process maps. Cluster 1 contained two main transition patterns of CPS skills. The first transition pattern started the learning task with a plan. After planning, the group discussion moved to presenting concrete ideas (i.e., solution generation-concrete). Afterward, some of the participants made further analyses and comparisons of the previous ideas (i.e., solution generation-elaboration). Peers briefly agreed on the solution to establish a common understanding, and the group discussion ended here (i.e., solution evaluation-acquiescence → mutual grounding-responding). Other members moved to the monitoring of the task process (i.e., monitoring). The group discussion then shifted to convergent thinking (i.e., solution evaluation-checking → solution evaluation-critique → solution evaluation-justification). The second transition pattern initiated the task with monitoring (i.e., monitoring). In this case, the group discussion followed up with a direct transition to convergent thinking. Finally, group discussions in cluster 1 tended to end with discourses related to social communication skills, e.g., the frequency of transition to end with mutual grounding questioning was 255.

Cluster 2 did not directly express specific views and ideas after opening the discussion through monitoring. They shifted more toward communication on a social level (i.e., monitoring → mutual grounding questioning → mutual grounding responding). Through this model of questioning and answering, the groups attempted to describe the problem and sought to establish a common problem space from which concrete ideas and thoughts could be generated (i.e., problem defining-establishing → solution generation-concrete). Afterward, part of the discussion turned to disputing and criticizing specific ideas (i.e., solution evaluation-critique). The other part of the discussion shifted to confirming (i.e., solution evaluation-checking) and simply agreeing on specific ideas (solution evaluation-acquiescence). However, group discussions in cluster 2 tended to end with cognitive discourses, e.g., the frequency of transition to end with solution evaluation-critique was 17.

From a temporal duration perspective, we found that the two clusters (CPS subgroups) differed in how they used various skills over time. The temporal transition patterns of cognitive skills differed between the group discussions of cluster 1 and cluster 2, e.g., solution generation-concrete (instant in cluster 1; 5 days in cluster 2). The discussion duration for each stage of the course was 1 week, so the 5-day duration in cluster 2 covered most of that period. As a result, the groups in cluster 1 produced a great many specific ideas in a short period of time, whereas the discussions in cluster 2 spanned the entire task period. In addition, the temporal duration of the meta-cognitive skill (such as monitoring) for the groups in cluster 1 was 7 days, which coincided with the task period. In contrast, no long-term use of the monitoring skill was found in cluster 2. The temporal duration from divergent thinking to convergent thinking was longer in cluster 2 (7 days). Finally, we found a similar temporal transition pattern of social-communicative skill (such as affective) between the two clusters (CPS subgroups). As social communication discourses, such as humor and appreciation, were used to liven up the atmosphere, they facilitated coherent communication at the cognitive dimension in both cluster 1 and cluster 2.

Stochastic process mining of two CPS subgroups

Although the above analysis explains the transition patterns between different CPS skills in terms of transition frequency and temporal duration, it may be difficult to understand the dominant transitions without a broader analysis of other micro-process indicators in the process map. Therefore, exploring the same data from the perspective of transition probabilities, we used the R package pMineR to generate and visualize first-order Markov model (FOMM) probabilistic transition matrices for both clusters (CPS subgroups) (Gatta et al., 2017). As illustrated in Fig. 7, all the CPS skills are arranged between the “begin” process and the “end” process. In the process map, the “begin” process showed the transition probability of the first CPS skill used at the beginning of the group discussion. The “end” process showed the transition probability of the last CPS skill used at the end of the group discussion. We found that the discussions in cluster 1 all started with skills related to the meta-cognitive dimension, i.e., monitoring (2/4) and planning (2/4). In addition, we found that the discussions in both clusters did not have a higher probability of switching across skills in the “end” process. This may be due to the characteristics of the task design. Both learning tasks were open ended, without an optimal solution path.

Fig. 7

The analysis results of cluster 1 (top) and cluster 2 (bottom) using pMineR algorithm. M, monitoring; P, planning; R, regulation; SGEP, solution generation-epistemic; SGCO, solution generation-concrete; SGEL, solution generation-elaboration; PC, premature closure (anti-divergent); PD, problem defining/establishing; PA, problem analysis; SEAC, solution evaluation acquiescence; SECH, solution evaluation checking; SECR, solution evaluation critique; SEJU, solution evaluation justification; MGQ, mutual grounding questioning; MGR, mutual grounding responding; AF, affective; CT, cohesive-task; CP, cohesive-playful; DAF, disaffective (anti-social)

Although there was no final CPS activity in either cluster that was most likely to be conducted, we observed that group discussions in cluster 1 tended to end with mutual grounding-responding (196/196) and solution generation-concrete (178/178). The students established a shared problem space by describing a screen view to their partners, i.e., problem defining/establishing. Afterward, they moved to the social-communicative dimension (17/48) and the meta-cognitive dimension (31/48), respectively. Within the social-communicative dimension, we found a transition pattern between mutual grounding-questioning and mutual grounding-responding, with a high transition probability of 0.73 (135/184). Students established a shared understanding with their partners through questioning and adjacent pairs’ responses. Within the meta-cognitive dimension, monitoring served as a mediator for the group members to reach a shared understanding. In addition, we observed multiple Markov chains in cluster 1 that achieved solution generation-concrete. The expression of positive and task-related emotions, affective, shifted to mutual grounding-responding (30/57) and monitoring (27/57). Negative affective expressions, i.e., disaffective, or task-irrelevant discourses, i.e., cohesive-playful, remained more independent of the group discussion, creating a self-loop (i.e., 40/40 of CP, 37/37 of DAF).

Compared with cluster 1, more diverse ending discourses appeared in cluster 2. As in cluster 1, we observed that ending discourses in cluster 2 were characterized by mutual grounding-responding (61/111) and solution generation-concrete (55/55). In other words, discourses of the cognitive dimension ended up generating specific ideas and strategies. Furthermore, in cluster 2, we observed planning discourses after a common understanding of the learning tasks had been reached. In addition to question generation and adjacency pairs of responses among peers (i.e., 25/111 of MGQ and 61/111 of MGR), the group shifted to planning (25/111) at the meta-cognitive dimension after reaching a shared understanding. Positive and task-related affective expressions, i.e., affective, shifted to mutual grounding-responding and monitoring in cluster 2, with a transition probability of 0.5 (6/12). Unlike in cluster 1, the negative emotions, i.e., disaffective, produced a self-loop (5/7), while the remaining portion shifted to the question–answer pattern on the social-communicative dimension (2/7). The task-irrelevant discourse, e.g., cohesive-playful, produced a self-loop (18/26), while the remaining portion shifted to the meta-cognitive dimension of monitoring (8/26).

Discussions

As digital technologies and their usage have become more widespread, learners are able to collaborate on problem-solving tasks in a computer-based environment (e.g., Swiecki et al., 2020; Lämsä et al., 2021). Meanwhile, discourse data from collaborative processes can be recorded as a sequence of events. Compared with traditional measurement techniques, intensive analysis of process data more fully reflects the dynamics and complexity of CPS (Von Davier et al., 2017). Students in this study were required to collaborate on two complex problem-solving tasks. We collected the discourse data from the groups’ discussions, which took place in an online synchronous chat tool while the tasks were being completed. In particular, we obtained 16 time sequences corresponding to the two tasks of the eight groups. Furthermore, we coded the time sequences in terms of meta-cognitive, cognitive, and social-communicative dimensions. Afterward, the clustering was completed and two main clusters were obtained by combining the DTW calculation and the K-means algorithm. The average discussion duration of the sequences in cluster 1 was higher than that of cluster 2. Comparing the scores of the group products, we found that cluster 1 was higher than cluster 2 in terms of average, maximum, and minimum scores. Finally, we investigated the CPS patterns of the two clusters using statistical analysis and two PM algorithms.

Differences in CPS patterns between clusters from statistical analysis

We used statistical analysis to explore the differences between the two clusters on each coding dimension. We found that students in both clusters actively used the skills included in the meta-cognitive dimension, such as monitoring, planning, and regulating. In other words, meta-cognition facilitated smooth group collaboration to some extent and ensured the completion of group products. Similar results were found in Smith and Mancy (2018), showing that activities related to meta-cognitive skills were more likely to meet the requirements and criteria for group collaboration than social-communicative activities and cognitive activities. However, there was a significant difference between the two clusters at the cognitive dimension. Specifically, cluster 1 focused on discussing specific solutions to the problem-solving tasks. They tended to present specific ideas and opinions, and further compared, explained, and summarized them by questioning and evaluating different ideas. Cluster 2 focused on the response requirements of the problem-solving task itself. They tended to present ideas and opinions about the problem solution on the basis of rules or criteria, and showed a desire to complete the task as quickly as possible rather than consider it further. Finally, on the social-communicative dimension, we found more emotional expressions in cluster 1, which included positive and negative emotions. There were more utterances of task-irrelevant social functions in cluster 2. This result is in accordance with previous findings that there is a strong association between social and cognitive activities (e.g., Ouyang & Chang, 2019; Avry et al., 2020). The expression of emotions, which includes both positive and negative emotions, can provide a good foundation for deep knowledge construction, leading to high-quality collaboration. This may also explain the generally higher performance of group products in cluster 1. 
For example, Ouyang and Dai (2021) argued that the way in which peers achieved interactions in the social dimension influenced subsequent cognitive activities such as resource sharing and knowledge creation.

Differences in CPS patterns between clusters from process mining analysis

Firstly, cluster 1 and cluster 2 both preferred to use meta-cognitive skills, such as monitoring and planning, to initiate discussion in the early stages of discussion. Combined with the transition probabilities, we found that the groups were more likely to initiate group discussions with monitoring than with planning. This may be related to the task setting of this study, in which the groups collaborated on the two learning tasks after class through an online synchronous chat tool. Therefore, the groups first needed to agree on a point in time when all members could collaborate online at the same time. This suggests that meta-cognition is not only a reflection of individual behavior, but also should be valued from a group level. Successful collaboration involves the employment of a variety of meta-cognitive strategies to control the progress of team activities and to regulate the processes applied by the group (Biasutti & Frate, 2018). Self-looping in the expression of specific ideas and responses to peers was present in both cluster 1 and cluster 2 due to the fact that students provided multiple specific ideas during the divergent thinking stage, and social communication tended to occur in pairs. As a result, three members of each group were more likely to have repeated neighboring pairs of responses. Combined with the transition probabilities, we found that most of the group discussions eventually turned to generating specific ideas and reaching a shared understanding, which corresponded to two activities at the cognitive and the social-communicative dimensions. Positive affect played an important role in the discussion process in both cluster 1 and cluster 2, where emotional interactions in collaborative situations influenced learners’ cognitive processes and collaboration satisfaction (Huang & Lajoie, 2023). Negative affect reacted more independently of other activities in cluster 1, generating self-looping. 
In cluster 2, negative affect shifted to a question-and-answer mode in the social-communicative dimension in addition to generating self-looping. Meanwhile, task-irrelevant discourse shifted to monitoring and regulating in addition to generating self-looping. These results demonstrated the value of emotional interactions in moderating the CPS processes. Consistent with this finding, Avry et al. (2020) concluded that learners understood and promoted better self-regulation and co-regulation by expressing emotions during CPS. Extending this conclusion, this study enhances the understanding of the impact of emotional expression on cognitive and meta-cognitive dimensions in teamwork. Emotional expression has been shown to facilitate effective communication and information sharing among members, thus contributing to the process of knowledge construction in teamwork. Meanwhile, such social interactions were manifested at the meta-cognitive level by facilitating a better understanding of team members’ roles and contributions to each other. This understanding further guides team members to develop shared goals and expectations through regulation and monitoring (Huang & Lajoie, 2023).
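
The transition probabilities and self-loops discussed above can be illustrated with a minimal sketch. It assumes a first-order estimate computed from one coded discourse sequence; the function name, the code labels, and the example sequence are hypothetical and are not taken from the study's data or from the process mining tool used in it.

```python
from collections import Counter, defaultdict


def transition_probabilities(sequence):
    """Estimate first-order transition probabilities between coded activities."""
    counts = defaultdict(Counter)
    # Count each (current activity -> next activity) pair in the sequence.
    for current, following in zip(sequence, sequence[1:]):
        counts[current][following] += 1
    probs = {}
    for current, followers in counts.items():
        total = sum(followers.values())
        probs[current] = {nxt: n / total for nxt, n in followers.items()}
    return probs


# Hypothetical coded sequence for one group discussion (labels are illustrative).
codes = [
    "monitoring", "idea", "idea", "response",
    "response", "idea", "shared_understanding",
]
probs = transition_probabilities(codes)

# A self-loop is a transition from an activity back to itself.
self_loops = {code: p.get(code, 0.0) for code, p in probs.items()}
```

In this toy sequence the discussion always moves from monitoring to expressing an idea, while both idea expression and peer response show non-zero self-loop probabilities, which mirrors the kind of pattern described for the two clusters.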

Secondly, monitoring in cluster 1 not only initiated the discussion, but also served as a bridge from divergent to convergent thinking in the group discussion. When the group discussion was initiated with making a task plan, the students expressed more specific ideas about the group products. Meanwhile, they provided further explanations to justify that specific idea. In this case, the group members used the meta-cognitive skill of monitoring to transition the group discussion to convergent thinking. This corresponded to the process of evaluating different ideas to reach the best solution. For example, the groups went through the process of disputing, criticizing, and re-evaluating ideas. Meta-cognitive processes have also been found in previous studies in computer-based collaborative environments and are believed to enhance group coordination and develop effective learning (e.g., Biasutti & Frate, 2018). The initial stage of group discussion tended to envision and explore multiple answers from different directions, approaches, and perspectives. Task-based meta-cognitive activities, such as monitoring task progress, motivated students to shift from divergent to convergent thinking. As a result, students were more inclined to make quick judgments and arrive at the best solution from a wide range of possible outcomes at a later stage. Çini et al. (2023) argued that the problem-solving process involved not only the application of various methods and strategies, but also the monitoring and regulation of progress. Therefore, meta-cognitive awareness facilitated students’ transition to high-level thinking skills such as evaluating, summarizing, and creating. In conjunction with the temporal duration of the nodes in process mining, we further found that cluster 1 tended to focus on specific ideas for solutions and to evaluate and summarize, intermittently monitoring and regulating task progress. 
In particular, cluster 1 had a small temporal duration in the use of cognitive skills, so the divergent thinking and summation of different ideas were always continuous and focused. Meanwhile, the groups intermittently engaged in reflection and replanning during discussions. Cluster 2 tended to intermittently express specific ideas, which may be due to the group’s inability to intermittently utilize meta-cognitive skills to achieve a smooth transition between divergent thinking and convergent thinking. Consistent with this finding, Bannert et al. (2013) used process mining techniques to explore temporal patterns of students’ spontaneous learning. The results showed that successful students exhibited more meta-cognitive skill-related event types in the process model. For example, they constantly monitored and evaluated different learning activities. In addition, Chen and Hapgood (2021) found that meta-cognitive skills played an important role in the collaborative writing process. Participants trained in meta-cognitive skills tended to exhibit more collaborative interaction patterns and produce more language-related events. Expanding on the conclusions of previous studies, the present study further confirms that successful collaboration requires meta-cognitive skills to effectively coordinate individual and group processes at both cognitive and social dimensions, especially when group members fail to perceive challenging learning situations and their regulatory needs.

Finally, the meta-cognitive skills in cluster 2 did not serve as a transition between divergent and convergent thinking. In contrast, since the tendency in cluster 2 was more to reach a shared understanding of the task itself through the responses of neighboring pairs in the social-communicative dimension, the discussion ended when the specific ideas expressed by peers were argued and questioned. Therefore, we found that solution-driven cluster 1 had a more complete discussion process. Problem-driven cluster 2 lacked the discussion process of summarizing and evaluating the different ideas to further filter out the best solution. This finding is consistent with the conclusion of previous studies that reaching consensus in arguments is considered critical for achieving high-quality collaboration (Zhang et al., 2022; Xu et al., 2023; Ouyang & Dai, 2021). For example, Zhang et al. (2022) found that the high academic achievement groups tended to negotiate different solutions. They tended to deepen collaboration through disputes and conflicts among peers and generate inferences closely related to the topic, continuously optimizing group products to the best solution. The low academic achievement group was less adept at discussing and integrating disputes among peers, and lacked the process of negotiating to optimize group products. Instead, they focused on how to divide up the learning tasks and how to set a time frame for completing the tasks. Similarly, Zheng et al. (2020) revealed the overall characteristics of the types of student behaviors during CPS. High-performing groups managed to revise and refine their ideas on the basis of shared information and summarized a common solution through argumentation and debate. In contrast, low-performing groups simply provided alternatives without summarizing and revising them. Therefore, the present study further confirms that it is insufficient if groups remain in the stage of sharing information and expressing opinions.
To reach a high level of knowledge construction and identify the optimal solution, learners are required to critique, challenge, and debate ideas and perspectives. In other words, how to resolve cognitive conflicts that arise during collaboration is critical to improving collaborative performance. Cognitive conflict facilitates in-depth discussion and contributes to more creative solutions.

Implications

From a theoretical perspective, this study proposes a three-stage analytical framework grounded in collaborative learning, constructivism, and self-regulated learning theories (Laal & Laal, 2012; Bada & Olusegun, 2015; Clark, 2012). In particular, we selected an appropriate cluster analysis method on the basis of the collected time sequence features to identify subgroup differences in CPS behaviors in the subsequent analyses. During the clustering process, we demonstrated a suitable method for clustering time sequences by computing a distance-based similarity metric with the dynamic time warping (DTW) algorithm. This method was more conducive to identifying and analyzing the best warping path between two time sequences (He et al., 2019). In addition, we employed the DTW barycenter averaging (DBA) prototype function, which is built on DTW and helped to determine the best initial centers for clustering in a more controlled way. Finally, we focused on analyzing the temporal aspects of collaboration using multiple analytical methods, concentrating on correlations and dependencies between behaviors that emerge over time (e.g., Lämsä et al., 2021).
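
To make the notion of a best warping path concrete, the following minimal sketch computes the classic dynamic-programming DTW distance between two sequences and recovers one optimal alignment. It assumes numeric sequences and an absolute-difference local cost; the function name and example values are hypothetical, and the sketch is not the implementation used in this study.

```python
def dtw(a, b, dist=lambda x, y: abs(x - y)):
    """Return the DTW distance between a and b and one optimal warping path."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # Backtrack from the end to recover the aligned index pairs.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [
            (cost[i - 1][j - 1], i - 1, j - 1),  # advance in both sequences
            (cost[i - 1][j], i - 1, j),          # advance in a only
            (cost[i][j - 1], i, j - 1),          # advance in b only
        ]
        _, i, j = min(steps)
    path.reverse()
    return cost[n][m], path


# Hypothetical sequences of different lengths that follow the same overall shape.
distance, path = dtw([1, 2, 3], [1, 2, 2, 3])
shifted_distance, _ = dtw([0, 0], [1, 1])
```

Because the second sequence only repeats one value of the first, the warping path aligns that element twice and the distance is zero, which is why DTW suits sequences that unfold at different paces. A DTW-based K-means procedure would use such distances in place of the Euclidean distance and update each cluster prototype with DBA.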

From a practical perspective, the findings of this study provide some implications for pedagogical design. For example, instructors are encouraged to provide meta-cognitive scaffolding to regulate and monitor group exchanges to facilitate students’ contributions in cognitive activities. The cognitive dimension mainly consists of divergent and convergent thinking. Divergent thinking prompted the groups to think beyond traditional modes of thinking, leading to interesting and novel solutions, while convergent thinking integrated and evaluated a large number of ideas and perspectives after they had been generated by the groups. Indeed, learners often need to use meta-cognitive skills to transition group discussions from divergent to convergent thinking. Therefore, it is necessary for instructors to refine the timing of sub-tasks and provide different meta-cognitive scaffolds and flexible support in a timely manner to facilitate collaboration among learners. In addition, learners may choose to avoid or simply concur when facing cognitive conflicts during collaboration, resulting in the group’s inability to move toward deeper cognitive engagement (Lee et al., 2003). Therefore, it is necessary for instructors to provide cognitive conflict resolution scaffolding in the later stages of group collaboration. Through the conflict of different perspectives, each learner can add to and modify their own perspectives. In the process of cognitive conflict resolution, learners can complete knowledge construction on the basis of the collective achievements.

Conclusions, limitations, and future directions

This study marks a significant stride in the investigation of collaborative problem-solving patterns among university students. After coding the online discourse data, this study used DTW computation and the K-means algorithm to cluster 16 time sequences into two clusters. We focused on the correlations and dependencies between the behaviors of the two clusters as they emerged over time using multiple analytical methods, further revealing the factors that enhance learning strategies and learning outcomes. Meanwhile, there are three limitations in the present study, which lead to future research directions. Firstly, since this study utilized student discourse data to explore CPS patterns, the findings may be more valid for patterns of students in similar research contexts. Secondly, the subjects in this study were 24 students from an elective course at a university. Due to the small sample, some key patterns of CPS may be missing from the analysis. Although conducting research with a limited number of participants is common in the learning sciences (e.g., Zhang et al., 2021; Chen et al., 2021; Ouyang et al., 2023; Xu et al., 2023) due to practical constraints, we acknowledge that the small sample size reduces the generalizability of the results presented. Hence, the findings of the study need to be interpreted with caution, and we must treat it as an initial exploratory study with a limited number of participants. Although the sample size of our study is limited, we believe that the investigation of collaborative problem-solving patterns via the three-stage analytical framework is useful for setting expectations for potential future discoveries in studies with larger samples of participants.
In addition, although we investigated students’ problem-solving skills and prior knowledge, we did not control for the gender distribution, motivation, and learning styles of the students within the groups, which may have partially influenced the collaborative process. Future research could explore groups with larger sample sizes in multiple curricular settings. Finally, the data utilized in this study were discourses generated by the groups using an online synchronous chat tool. Simply clustering and analyzing the online discourse data was not sufficient. Future research could collect process data as comprehensively as possible, including behavioral data, eye movement data, and so on.