Introduction

Metacognition is a crucial lifelong learning skill in both academic and non-academic settings, and is at the center of individual learning (Zimmerman, 2013). The theory of expansive learning (Engeström, 1987) focuses on learning processes in which the very subject of learning is transformed from isolated individuals to collectives and networks (Engeström & Sannino, 2010; Erneling, 2010). That is, learning occurs across workplace boundaries. In the workplace, collaboration is a critical skill for success in the majority of professions (Lobczowski et al., 2021). In research on collaborative learning, increasing attention has been paid to socially shared metacognition (or shared regulation), which is regarded as an important factor in effective collaborative learning (Iiskala et al., 2011). Socially shared metacognition, or social regulation, is an inter-individual form of metacognition that differs from individual metacognition and refers to the consensual monitoring and regulation of joint cognitive processes in demanding collaborative problem-solving contexts (Iiskala et al., 2004). However, from an expansionist viewpoint, social regulation is only a subcomponent of social metacognition (Jost et al., 1998), which includes knowledge or cognition of one’s own or someone else’s emotions, motives, or thinking. Few studies have systematically examined social metacognition in collaborative learning.

In an expansionist review (Jost et al., 1998), the social attribution of metacognition and social metacognition were addressed, the latter involving thinking about the thinking processes or content of self and others. From an expansionist perspective, social metacognition is considered to be the awareness and regulation of cognition about oneself and others, involving knowledge, regulation, and judgement of one’s own and others’ minds as well as emotions (Erneling, 2010; Jost et al., 1998). In educational settings, social metacognition generally occurs during collaborative learning. However, in empirical studies on collaborative learning, most scholars have used the terms socially shared metacognition (Hurme et al., 2009; Lobczowski et al., 2021; Lyons et al., 2021), socially mediated metacognition (Goos et al., 2002; Larkin, 2009), socially shared regulation (Grau & Whitebread, 2012; Isohätälä et al., 2017; Järvelä & Hadwin, 2013; Malmberg et al., 2015), and co-regulation (Lim & Lim, 2020), whereas social metacognition itself has not been widely considered. Most of these empirical studies have used a qualitative coding approach to analyze socially shared metacognition, socially mediated metacognition, socially shared regulation, or co-regulation. Only a few studies have used scales to study social cognitive regulation in collaborative learning, such as the metacognition construct developed for communities of inquiry (Garrison & Akyol, 2013, 2015) and the Group Metacognition Scale for online collaborative learning (Biasutti & Frate, 2018), and these have mainly framed the constructs in terms of cognitive regulation of self and others.
Thus, the existing social metacognitive scales do not cover a comprehensive range of factors, and it is necessary to develop a social metacognition scale that covers the three dimensions of social metacognitive knowledge, social metacognitive skills (e.g., regulation), and social metacognitive judgements (Efklides, 2008). Such a scale will provide a new measurement tool for future empirical research on online collaborative learning, and its three dimensions will allow researchers to analyze other variables that influence online collaborative learning, such as team performance and mental models.

Theoretical background

Metacognition

Metacognition was first introduced by the American scholar Flavell in the 1970s and has been considered as “cognition about cognition” or “thinking about thinking” (Dinsmore et al., 2008). Looking back on the evolution of metacognition, the conceptualizations and components of metacognition vary with scholars. Flavell (1976) conceptualized metacognition as the active monitoring and consequent regulation and orchestration of these processes in relation to the cognitive objects or data on which they bear, usually in service of some concrete goal or objective. In 1979, Flavell proposed a model of metacognition with the four components of metacognitive knowledge, metacognitive experiences, goals, and actions. However, goals and actions were discussed only in terms of how they relate to the first two primary components. Metacognitive knowledge is a kind of knowledge about persons, tasks, and strategies (Flavell, 1979). Knowledge about the person refers to beliefs concerning the nature and capability of personal cognitive variables and universal cognitive properties related to learning. For example, a learner may believe they learn better in the morning than in the afternoon. Or a student may understand that the level of attentional engagement has a strong association with study effectiveness. Knowledge about tasks is knowing the requirements of the task and its goals, as well as understanding how the available information will influence task performance. For instance, a student may understand that the unfamiliar information materials or incomplete instructions will make a task difficult and may change the chosen information materials. Knowledge about strategies means knowing effective problem-solving strategies to achieve the goals of a task. For example, concept mapping is a good strategy for better reviewing what has been learned. 
As for metacognitive experiences, Flavell (1979) pointed out that conscious cognitive and affective thoughts are regarded as metacognitive experiences. These experiences involve monitoring and active self-regulation. Monitoring/self-regulation refers to the ability to evaluate the effectiveness of current strategies and progression towards goals and to regulate one’s behavior during the learning process (Gascoine et al., 2017), which are also considered as metacognitive skills by other scholars (Efklides, 2008; Veenman, 2011).

Besides, Schraw and Moshman (1995) argued that metacognition consists of two main components: knowledge of cognition and regulation of cognition. Compared to Flavell’s definition of metacognition, Schraw’s knowledge of cognition is congruent with Flavell’s metacognitive knowledge, whereas Schraw and Moshman (1995) classified metacognitive knowledge into declarative, procedural, and conditional knowledge according to their functionality, in which declarative knowledge is depicted similarly to Flavell’s knowledge about the person, embracing any knowledge about factors affecting learners’ performance; accordingly, procedural knowledge is largely parallel to Flavell’s knowledge about strategies, although Schraw and Moshman (1995) articulated procedural knowledge as knowledge about the procedures themselves and the automaticity of their performance rather than knowledge about where and when they might be effective, which is described as conditional knowledge. In terms of regulation of cognition, it refers to planning (selecting strategies and time/resource management), monitoring (online awareness of knowing and performance effectiveness), and evaluating (assessing one’s learning results) as essential regulatory skills (Schraw, 1998; Schraw & Moshman, 1995). In contrast, Schraw highlighted the specific skills involved in the regulation of cognition, whereas Flavell attached importance to the experiential manifestations of metacognitive thought (Chen & McDunn, 2022). Additionally, Efklides (2006, 2008) divided intrapersonal metacognition into metacognitive knowledge, metacognitive experiences, and metacognitive skills. Metacognitive knowledge and metacognitive experiences bear manifestations of monitoring function, while metacognitive skills function by controlling. 
Metacognitive knowledge is addressed as declarative knowledge stored in long-term memory and involves models of cognitive processes such as language, memory, and others (Fabricius & Schwanenflugel, 1994), parallel to the conceptualization of metacognitive knowledge by Flavell (1979, 1987). Metacognitive experiences (Efklides, 2006) involve feelings (feelings of familiarity, difficulty, confidence, and satisfaction), judgments/estimates (judgment of learning, source memory information, estimate of effort, estimate of time), and online task-specific knowledge (task features, procedures employed), which occur in working memory (Lories et al., 1998). Efklides’ definition of metacognitive experiences is consistent with the one given by Flavell (1979). Metacognitive skills are addressed as procedural knowledge, such as orientation/monitoring of the comprehension of task requirements, planning the steps to be taken for task processing, checking, and regulating cognitive processing when it fails, and evaluating the outcome of processing (Veenman & Elshout, 1999), regarded as part of self-regulation processes (Pintrich et al., 2000). Building upon the elucidation as above, the similarity and divergence across the three schools of metacognition are presented in Fig. 1.

Fig. 1 Similarities and divergences across three schools of metacognition

Social metacognition

From an expansionist perspective, metacognition is considered to be a process of understanding, and thinking about one’s own and other people’s understanding and thinking (Jost et al., 1998). Cognition and thinking about what others know and think is known as social metacognition. Social metacognition extends from intrapersonal to interpersonal (Efklides, 2008; Jost et al., 1998). Through in-depth insights into the multifaceted model of metacognition (Efklides, 2008), social metacognition refers to the social level of metacognition, encompassing metacognitive judgments about one’s own and others’ metacognitive experiences, metacognitive knowledge, and metacognitive skills, metacognitive knowledge of others’ cognition, and metacognitive skills to control one’s own and others’ cognition and affect at the object level through the personal-awareness level of the interacting persons. Metacognitive judgements and metacognitive knowledge at this level are informed by self-awareness at the personal level as well as by information derived from the ongoing interaction with others, functioning by monitoring reflection. Of the three dimensions of social metacognition, metacognitive judgments about one’s and others’ metacognitive experiences, metacognitive knowledge, and metacognitive skills play a crucial role in co-regulation processes during collaborative activities. Iiskala et al. (2004) indicated that peers who collaborate on problem solving co-regulate their learning in light of their metacognitive judgements based on cues from the metacognitive experiences of their partner. Salonen et al. (2005) further showed this effect of metacognitive experiences that reveals the social aspect of metacognition. 
Metacognitive experiences are thus an inextricable component of the self-regulation process as well as of the co-regulation or shared regulation of cognition, because a person’s underlying cognition and affect are signaled by the experiential part of metacognition reflected in nonverbal behaviors (e.g., gaze, pauses, and smiles), by other manifestations of metacognitive experiences such as false alarms versus correct responses (Brown, 1978), and by the person’s verbal utterances. Accordingly, metacognitive experiences resulting from monitoring one’s own cognition and affect exert an effect on controlling one’s own and others’ cognition (Efklides, 2006).

According to Flavell (1987), “if one has knowledge or awareness of one’s own or others’ emotions or motivations, then it can be considered as metacognition.” Questions about the functioning of others’ minds are so important to us as social actors that we invest considerable metacognitive effort in determining the actions and abilities of others (Nelson, 1998). Judging from these theoretical inferences, metacognitive judgments about one’s own and others’ emotions should be regarded as monitoring function, a kind of manifestation of metacognitive experiences. In this study, in light of metacognitive studies from Jost et al. (1998) and Efklides (2008), social metacognition consists of social metacognitive knowledge, social metacognitive judgments, and social metacognitive skills. Social metacognitive knowledge includes beliefs of other persons and awareness of other persons’ thinking, wherein the term “beliefs” was adapted from most metacognitive notions referred to as descriptive beliefs about how the mind works (Metcalfe & Shimamura, 1994; Nelson, 1992, 1996). Social metacognitive judgments involve judgments about others’ emotions or motivations and judgements about others’ online task-specific knowledge (i.e., evaluation of other persons’ thinking), and social metacognitive skills refer to co-regulation processes (i.e., goal setting, help seeking, strategy regulation for each other, etc.). The third-order five factors are respectively deduced from social metacognitive knowledge, social metacognitive judgements, and social metacognitive skills as shown in Fig. 2.

Fig. 2 The third-order five factors of social metacognition

Social regulation of learning in online collaborative argumentation

Argumentation or transactive discussion is an indispensable component of online collaborative learning (Jonassen & Kim, 2009). Although argumentation is an essential aspect of scientific thinking in education, it has been applied in multidisciplinary and interdisciplinary domains beyond one discipline (Noroozi et al., 2012). Online collaborative argumentation involves self-, co-, and socially shared regulatory learning (Hadwin & Oshige, 2011; Järvelä & Hadwin, 2013; Lobczowski et al., 2020). These regulatory processes are regarded as social regulation of learning. In order to successfully collaborate in a group, peers need to regulate their cognition, motivation, emotions, and behaviors (Hadwin et al., 2018; Järvelä et al., 2014). In terms of self-regulated learning, individuals deliberately plan, monitor, control/regulate (i.e., use strategies), evaluate, and adapt their learning to reach a desirable academic goal (Greene, 2018; Zimmerman, 2013). During social regulation of learning, group regulation mirrors individual regulation in some but not all ways in the form of loosely sequenced phases such as planning/goal setting, monitoring and controlling, and reflecting (Pintrich, 2000; Zimmerman, 2013). For example, when students engage in online collaborative dialectic argumentation, they also undergo a series of loosely sequenced phases like individuals interweaving with dialectic argumentation episodes. Before arguers begin dialectic argumentation, they usually plan together who assumes the proponent or opponent, which is regarded as the planning and goal setting phase of social regulation of learning. After that, each arguer synchronously monitors whether his or her own argument is rebutted by others and regulates evidence and reasoning to defend his or her own argument when they take part in the argumentation. 
During this phase, each arguer becomes aware of the others’ cognition, motivation, emotion, and behaviors and then selects and adapts strategies to manage those factors (Pintrich, 2000). These processes are in line with the monitoring and controlling phase of social regulation of learning. In the final stage of the dialectic argumentation, each arguer needs to summarize his or her own claim, evidence, and reasoning, which is considered the reflection phase of social regulation of learning. According to Efklides (2006, 2008), these social regulation processes interweave with social metacognitive knowledge, social metacognitive judgments, and social metacognitive skills. However, the extant literature taking a social-cognitive perspective on self-regulated learning has paid more attention to momentary and dynamic regulatory skills (i.e., metacognitive skills) concerning cognition, motivation, and emotion (Isohätälä et al., 2018; Lahdenperä et al., 2022), typically analyzed through qualitative discourse analysis, and less attention to social metacognitive knowledge and social metacognitive judgements.

In addition, compared to face-to-face collaborative argumentation, online collaborative argumentation cannot provide nonverbal cues such as facial expressions, gesture, and posture (Isohätälä et al., 2018; Robinson, 2013) for arguers to make metacognitive judgments about others’ feelings and emotions other than emoticons, nonstandard/multiple punctuation, and lexical surrogates in the form of text (Vandergriff, 2013). Hence, the online collaborative argumentation context stimulates personal awareness-level metacognitive experiences (i.e., feelings of familiarity, confidence, satisfaction, etc.) different from those evoked by the face-to-face collaborative argumentation context (Robinson, 2013). To date, few studies have been conducted to develop a scale for measuring social metacognition of individuals within a group during online collaborative argumentation based on the perspectives of social metacognitive knowledge, social metacognitive judgements, and social metacognitive skills.

Scales for measuring social metacognition in collaborative activities

In recent years, although growing attention has been paid to social metacognition in collaborative learning environments, few studies have developed quantitative instruments for assessing it. The majority of existing studies have applied qualitative coding schemes to analyze socially shared regulation or socially shared metacognition, an approach that is ill-suited to large-scale experimental studies. To date, only a few studies have developed scales for social metacognition in collaborative learning (Biasutti & Frate, 2018; Garrison & Akyol, 2013, 2015). For example, the shared metacognition construct for communities of inquiry is currently used to measure social metacognition in collaborative learning (Garrison & Akyol, 2015). It consists of two factors, self-regulation and co-regulation of cognition, with 13 items each. Both factors exhibit a monitoring (awareness) and a managing (strategic action) function: the self-regulation of cognition reflects metacognitive monitoring and managing strategies and skills when the individual is engaged in the personal reflective learning process, and the co-regulation of cognition reflects group-level metacognitive monitoring and managing strategies and skills in collaborative activities. In this self-report questionnaire, items I1 to I7 reflect monitoring strategies of self-regulation, while items I8 to I13 represent managing strategies of self-regulation. Similarly, items G1 to G6 and G7 to G13 relate to the monitoring and managing strategies of co-regulation, respectively. As far as group metacognition for online collaborative learning is concerned, Biasutti and Frate (2018) developed the Group Metacognition Scale, which consists of 20 items assigned to the four dimensions of knowledge of cognition, planning, monitoring, and evaluating, with five items per dimension.

An examination of the two inventories for measuring metacognition in a collaborative group revealed some limitations. Both are restricted to cognitive awareness and regulation, neglecting affective interaction in collaborative activities. In addition, both were designed for general collaborative contexts. Collaborative argumentation, however, not only involves high-level cognitive processes, such as reasoning, co-elaboration, and negotiation, but is also accompanied by emotions, such as irritation, anxiety, joy, empathy, and other affective feelings (Goldberg & Schwarz, 2016; Polo et al., 2016). Productive collaboration requires not only deep-level joint thinking but also a healthy socio-emotional climate (Isohätälä et al., 2018; Mänty et al., 2020). In other words, in an effective collaboration group, each member should be aware of their own and others’ emotions, which affect team cohesion and in turn influence the cognitive progress in collaborative learning. Moreover, Zhang et al. (2021) revealed that language learners adopted emojis and words to regulate emotion in online collaborative settings, while Hernández-Sellés et al. (2019) verified that intra-group emotional support, including encouragement and help from team members, facilitates online collaborative learning. Hence, emotional interaction is also a key component of online collaborative learning. However, the two extant inventories have insufficient psychometric properties for measuring the metacognition involved in online collaborative argumentation and cannot comprehensively cover the social attributes of collaborative learning.

Based on the theoretical models of Flavell (1976, 1979, 1987), Jost et al. (1998), and Efklides (2008), this study developed a self-report questionnaire with the five dimensions of beliefs of other persons (BOP), awareness of other persons’ thinking (AOPT), judgment of other persons’ emotions (JOPE), co-regulation of each other’s thinking (CREOT), and evaluation of other persons’ thinking (EOPT), in which beliefs of other persons and awareness of other persons’ thinking are classified into social metacognitive knowledge; judgment of other persons’ emotions and evaluation of other persons’ thinking are categorized into social metacognitive judgements; and co-regulation of each other’s thinking is referred to as social metacognitive skills. With reference to previous studies (Garrison & Akyol, 2013, 2015; Janssen et al., 2007; O’Neil & Abedi, 1996; Schraw & Dennison, 1994), items were revised in this study. The following questions were required to be answered:

(1) Are the dimensions of the Social Metacognition Inventory (SMI) verified by the exploratory and confirmatory factor analyses?

(2) Is the SMI found to be sufficiently reliable and stable?

Method

Participants

The participants consisted of 518 undergraduates (Mean age = 22.32, SD = 0.48) who took part in online collaborative argumentation at two universities located on the south-eastern coast of China. In total, 61.78% (n = 320) of participants were female, while 38.22% (n = 198) were male. Participants were recruited from the International College of Education and Sino-Foreign Cooperative Educational Institution, where English is their main language of communication.

Procedure

Data were collected at the beginning of the 2022 fall semester. First, 218 undergraduates from the Sino-Foreign Cooperative Educational Institution took part in an online collaborative argumentation activity, scaffolded by Toulmin’s argumentation pattern, on the Tencent QQ discussion board in a flipped lesson of the blended learning-based Psychology Basics curriculum. Immediately after the activity, they were required to respond to the Social Metacognition Inventory (SMI) issued via the online QuestionnaireStar tool. At the same time, 300 undergraduates from the International College of Education, who came from 52 countries, were recruited to attend an online collaborative argumentation activity about the pandemic using Tencent QQ’s discussion board. At the end of the activity, they were also required to answer the SMI on the online QuestionnaireStar tool. The first dataset of 218 responses was assigned to exploratory factor analysis, and the second dataset of 300 responses was assigned to confirmatory factor analysis. Both datasets were subjected to multi-group invariance testing.

Development of the social metacognition inventory for online collaborative argumentation

In light of the Community of Inquiry development methodology of metacognition (Biasutti & Frate, 2018; Garrison & Akyol, 2013, 2015), we first systematically reviewed the literature on definitions and psychometric attributes of social metacognition (Iiskala et al., 2011; Järvelä & Hadwin, 2013; Jost et al., 1998) and refined a construct model with five factors: beliefs of other persons (BOP), awareness of other persons’ thinking (AOPT), judgment of other persons’ emotions (JOPE), co-regulation of each other’s thinking (CREOT), and evaluation of other persons’ thinking (EOPT), enlightened by the constituents of individual metacognition given by Efklides (2008) and Sperling et al. (2002). At the start, the study established 24 indicators, of which some items were adapted from the aforementioned metacognition scales and revised to be suitable for online collaborative argumentation contexts, and some were originally developed according to the qualitative analysis of the dialogue moves in online collaborative argumentation. This Social Metacognition Inventory uses a 5-point Likert type scale as follows: 1: strongly disagree; 2: disagree; 3: neutral; 4: agree; 5: strongly agree (Akyol et al., 2012; Biasutti & Frate, 2018; Garrison & Akyol, 2013). The development process of the items is presented in Table 1. Although as a general guide, a minimum of three items per factor should be developed to maximize the scale reliability and validity (Raubenheimer, 2004), a factor with two items can be considered reliable when the items are highly correlated with each other (r > 0.70) but fairly uncorrelated with other items (Yong & Pearce, 2013). In this study, the factor of judgment of other persons’ emotions (JOPE) has only two items with high correlation (r = 0.77 > 0.70), hence the two items are considered reliable.

Table 1 The development process of the original SMI items

To ensure content validity, the 24 items of the scale (see Table 1) were reviewed by three native English-speaking experts. They had worked in the Sino-Foreign Cooperative Educational Institution for over 3 years and had years of experience in metacognition and online collaborative argumentation research. The experts examined the relevance among items, ambiguous statements, and the association between conceptual validity and the formulation of items. With reference to experts’ suggestions, the items were changed correspondingly. Then, the SMI scale was administered in a pilot study involving 10 native English-speaking undergraduates, who were required to complete the questionnaire and to give comments regarding their understanding and the appropriateness of the items.

Data analysis and results

IBM SPSS Statistics 23.0 and AMOS 23.0 were used to analyze the construct validity and reliability of the SMI scale. The analyses comprised the KMO and Bartlett tests, an exploratory factor analysis, Cronbach’s alpha, a confirmatory factor analysis, and multi-group invariance testing. In the two datasets, missing data were handled using the maximum likelihood method, and outliers were judged with reference to the Mahalanobis d-squared value, normality, and the item’s Z score. A case was considered an outlier and excluded if the difference between its Mahalanobis d-squared value and that of another case was markedly greater than the difference between any other pair of cases in the dataset (Arbuckle, 2009).
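The Mahalanobis screening step can be illustrated with a minimal standard-library sketch. It is not the AMOS procedure itself: the function names and the seven two-item responses below are hypothetical, and the code only ranks cases by their squared distance from the centroid, leaving the exclusion judgement to the analyst.

```python
def mean_vector(rows):
    """Column means of a list of equal-length response tuples."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def covariance_matrix(rows):
    """Sample covariance matrix (denominator n - 1)."""
    n = len(rows)
    mu = mean_vector(rows)
    p = len(mu)
    cov = [[0.0] * p for _ in range(p)]
    for row in rows:
        d = [x - m for x, m in zip(row, mu)]
        for i in range(p):
            for j in range(p):
                cov[i][j] += d[i] * d[j] / (n - 1)
    return cov

def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def mahalanobis_d2(rows):
    """Squared Mahalanobis distance of every case from the centroid."""
    mu = mean_vector(rows)
    cov = covariance_matrix(rows)
    distances = []
    for row in rows:
        d = [x - m for x, m in zip(row, mu)]
        y = solve(cov, d)  # y = cov^-1 @ d
        distances.append(sum(di * yi for di, yi in zip(d, y)))
    return distances

# Hypothetical 5-point Likert responses on two items; the last case is deviant.
responses = [(4, 4), (5, 4), (4, 5), (3, 3), (5, 5), (4, 3), (1, 5)]
d2 = mahalanobis_d2(responses)
# Cases ordered from farthest to nearest, as a starting point for inspection.
ranked = sorted(range(len(d2)), key=lambda i: d2[i], reverse=True)
```

Listing cases in descending order of d-squared makes a large gap between the top case and the rest easy to spot, which is the kind of comparison the exclusion rule above relies on.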

Psychometric properties and the factorial structure of the social metacognition inventory

To analyze the construct of the Social Metacognition Inventory, the adequacy of the sample and the discrepancy among items were first assessed using the KMO and Bartlett tests. With reference to Worthington and Whittaker’s (2006) protocols, a KMO value higher than 0.60 is considered good, while the Bartlett test indicates that the discrepancy among items is acceptable if the significance level is lower than 0.05 (Snedecor & Cochran, 1989). In the first dataset of 218 responses, the KMO value was 0.894 and the Bartlett test yielded χ2 = 2242.94, df = 276 (p < 0.001), which indicates the factorability of the SMI.
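As a cross-check on the reported degrees of freedom, the textbook form of Bartlett’s test of sphericity can be sketched as follows. The function names are illustrative, and the log-determinant passed in the example is a hypothetical placeholder, since the correlation matrix itself is not reported here.

```python
def bartlett_df(n_items: int) -> int:
    """Degrees of freedom of Bartlett's test of sphericity: p(p - 1) / 2."""
    return n_items * (n_items - 1) // 2

def bartlett_chi_square(n_cases: int, n_items: int, log_det_r: float) -> float:
    """Bartlett's statistic: -(n - 1 - (2p + 5) / 6) * ln|R|,
    where R is the item correlation matrix."""
    return -(n_cases - 1 - (2 * n_items + 5) / 6) * log_det_r

# 24 original indicators give the df reported for the first dataset.
df_24_items = bartlett_df(24)  # 276

# Hypothetical log-determinant, for illustration only.
example_statistic = bartlett_chi_square(n_cases=218, n_items=24, log_det_r=-1.0)
```

With 24 indicators the formula gives df = 24 × 23 / 2 = 276, which agrees with the value reported above.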

An exploratory factor analysis (EFA) of the first dataset was first conducted using the principal axis component analysis and the varimax rotation. Five factors were extracted on the basis of eigenvalues greater than 1 and the scree plot (see Fig. 3), following the stopping rules for choosing the number of factors in EFA (Brown, 2009). After the first EFA, the trivial variables with loadings less than 0.50 were deleted, and the remaining variables were analyzed again. After several iterations, the final EFA showed that the SMI has five factors with 17 non-trivial items, all with loadings higher than 0.50 (see Table 2). The seven eliminated indicators are BOP1, AOPT1, AOPT5, CREOT2, CREOT6, CREOT9, and EOPT1.

Fig. 3 The scree plot for EFA of the original SMI with 24 indicators

Table 2 The rotated factor matrix

To verify the construct validity of the SMI, a confirmatory factor analysis was conducted on the second dataset of 300 responses using AMOS 23.0. Because of the high modification index (MI) value for the covariance between CREOT7 and CREOT8, these two indicators were removed to correct the model, which greatly reduced the chi-squared value. In addition, the measurement errors of EOPT5 and EOPT4 were highly correlated, so the item EOPT5 was also deleted, leaving 14 items. According to Fornell and Larcker (1981), if the average variance extracted (AVE) is less than 0.50 but the composite reliability (CR) is higher than 0.60, the convergent validity of the construct is still adequate. Hence, the scale can be regarded as having good convergent validity (see Table 3). Moreover, for each construct, the correlations with other constructs should be less than the square root of the construct’s AVE (Wang, 2019).
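The convergent and discriminant validity criteria described above can be sketched from standardized loadings using the usual formulas AVE = Σλ² / k and CR = (Σλ)² / ((Σλ)² + Σ(1 − λ²)). All loadings, AVE values, and correlations below are hypothetical placeholders, not values from Tables 3 or 4.

```python
import math

def ave(loadings):
    """Average variance extracted: mean of the squared standardized loadings."""
    return sum(l * l for l in loadings) / len(loadings)

def composite_reliability(loadings):
    """CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    total = sum(loadings) ** 2
    error = sum(1 - l * l for l in loadings)
    return total / (total + error)

def convergent_validity_ok(loadings):
    """Adequate if AVE >= 0.50, or if AVE < 0.50 but CR > 0.60."""
    return ave(loadings) >= 0.50 or composite_reliability(loadings) > 0.60

def discriminant_validity_ok(ave_by_factor, correlations):
    """Every inter-factor correlation must stay below sqrt(AVE) of both factors."""
    return all(
        abs(r) < min(math.sqrt(ave_by_factor[a]), math.sqrt(ave_by_factor[b]))
        for (a, b), r in correlations.items()
    )

# Hypothetical standardized loadings for one three-item factor.
example_loadings = [0.70, 0.65, 0.60]
example_ave = ave(example_loadings)                   # about 0.424
example_cr = composite_reliability(example_loadings)  # about 0.688

# Hypothetical AVE values and one inter-factor correlation.
example_aves = {"BOP": 0.49, "AOPT": 0.64}
example_corrs = {("BOP", "AOPT"): 0.55}
```

In this illustration the factor has an AVE below 0.50 but a CR above 0.60, so it would still pass the convergent validity rule stated above.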

Table 3 The convergent validity (CR and AVE) of the SMI

As shown in Table 4, the correlations between the factors are less than the square root of AVE, indicating that the SMI has acceptable discriminant validity, as contended by Ursavaş et al. (2019). After CFA using the maximum likelihood method, the structural model of the SMI showed a good fit: χ2 = 104.093, df = 67, χ2/df = 1.554 (< 3), CFI = 0.976, GFI = 0.953, NFI = 0.936, IFI = 0.976, RFI = 0.913, RMR = 0.026, SRMR = 0.042 (values ≤ 0.08 are acceptable), and RMSEA = 0.043 (values ≤ 0.05 indicate a good fit and values as high as 0.08 a reasonable fit). CFI, GFI, and IFI were greater than 0.95, and RFI was greater than 0.90. NFI, RMR, SRMR, and RMSEA also met the standards given by Byrne (2010). The structural model is depicted in Fig. 4.
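Two of the reported fit indices can be recomputed from the chi-square value, its degrees of freedom, and the sample size. The sketch below assumes the common single-group RMSEA formula with N − 1 in the denominator; whether AMOS applied exactly this convention is an assumption.

```python
import math

def chi_square_ratio(chi2: float, df: int) -> float:
    """Normed chi-square (chi2 / df); values below 3 are usually read as acceptable."""
    return chi2 / df

def rmsea(chi2: float, df: int, n: int) -> float:
    """RMSEA = sqrt(max(chi2 - df, 0) / (df * (N - 1))) for a single group."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

# Values reported for the CFA on the second dataset (N = 300).
ratio = chi_square_ratio(104.093, 67)   # rounds to 1.554
rmsea_value = rmsea(104.093, 67, 300)   # rounds to 0.043
```

Both recomputed values match the reported χ2/df = 1.554 and RMSEA = 0.043 to three decimal places.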

Table 4 The discriminant validity of the SMI

Fig. 4 The confirmatory factor analysis of the SMI (N = 300)

Reliability and stability of the SMI

To examine the reliability of the SMI, this study used the first dataset (n = 218) and the second dataset (n = 300) to analyze the reliability. In the two datasets, the reliability of the SMI is represented by Cronbach’s alpha, as listed in Table 5. Judging from Table 5, regardless of the characteristics of the learning course and the features of the learning task, the 14 indicators of the SMI all indicated good internal consistency. Moreover, the Cronbach’s alpha values are all higher than 0.70.
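For reference, Cronbach’s alpha can be computed from raw item responses as α = k / (k − 1) × (1 − Σ item variances / variance of total scores). The sketch below uses only the standard library; the five three-item response rows are hypothetical and do not come from either dataset.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a tuple of k item scores."""
    k = len(rows[0])
    item_variances = [variance(col) for col in zip(*rows)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Hypothetical 5-point Likert responses of five students to a three-item factor.
example_rows = [(4, 4, 5), (3, 3, 3), (5, 5, 4), (2, 2, 3), (4, 5, 4)]
example_alpha = cronbach_alpha(example_rows)  # about 0.897

# The 0.70 threshold used in the text.
is_reliable = example_alpha > 0.70
```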

Table 5 The average, standard deviation, and Cronbach’s alpha of the SMI

To examine the stability of the SMI, multi-group invariance testing was executed using the first (n = 218) and second (n = 300) datasets. The multi-group configural, metric, and scalar (intercept) invariance, structural covariance, and measurement residuals tests were conducted as suggested by Chen (2007), who used multi-group confirmatory maximum likelihood (ML) factor analysis of variance–covariance matrices. In measurement invariance studies, the invariance of the tested models is usually judged by the Δχ2 and ΔCFI values. Although Byrne (2010) contended that measurement invariance is rejected if the Δχ2 test is statistically significant, Cheung and Rensvold (2002), taking into account the sensitivity of the chi-square test to sample size, argued that the ΔCFI < 0.01 criterion could be used to evaluate measurement invariance. Ursavaş et al. (2019) also adopted this criterion to demonstrate the measurement invariance of a technology acceptance model across preservice and in-service teacher cohorts. In addition, if the configural, metric, and scalar invariances are verified, the measurement invariance of constructs across groups is accepted (Brown, 2006; Schmitt & Kuljanin, 2008). Scholars have indicated that metacognitive skills or metacognitive experiences vary with task difficulty levels (Efklides et al., 1998; Iiskala et al., 2011). Dindar et al. (2020) further supported the relationship between task difficulty levels and metacognitive experiences using structural equation modeling. In the present study, the configural, metric, scalar, covariance, and residual invariances were verified. The construct invariance of the SMI is given in Table 6.
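The ΔCFI decision rule can be sketched as a comparison of each increasingly constrained model with the one before it. The CFI values below are hypothetical placeholders rather than the values in Table 6, and the function name is illustrative.

```python
def invariance_by_delta_cfi(cfi_by_model, threshold=0.01):
    """Compare each nested model with its predecessor; a step is treated as
    invariant when the absolute change in CFI is below the threshold."""
    names = list(cfi_by_model)  # insertion order = order of added constraints
    results = {}
    for previous, current in zip(names, names[1:]):
        delta = abs(cfi_by_model[current] - cfi_by_model[previous])
        results[current] = delta < threshold
    return results

# Hypothetical CFI values for the sequence of nested models.
example_cfi = {
    "configural": 0.970,
    "metric": 0.968,
    "scalar": 0.965,
    "structural covariance": 0.963,
    "measurement residuals": 0.958,
}
example_result = invariance_by_delta_cfi(example_cfi)
all_invariant = all(example_result.values())
```

The configural model serves as the baseline, so the returned dictionary has one entry for each subsequent constraint level.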

Table 6 Measurement invariance tests for the first group (N = 218) and the second group (N = 300)

Discussion and conclusion

The main aim of this study was to construct and validate a self-report questionnaire on social metacognition for online collaborative argumentation. Based on the operational definitions of social metacognition (Efklides, 2008; Jost et al., 1998), and with reference to previous scales on individual metacognition (O’Neil & Abedi, 1996), group cohesiveness (Huang, 2009), and metacognition in COI (Garrison & Akyol, 2013, 2015), this study developed a 24-item social metacognition inventory consisting of five factors: beliefs of other persons, awareness of other persons’ thinking, awareness of other persons’ emotions, co-regulation of each other’s thinking, and evaluation of other persons’ thinking. Although a few studies have developed metacognition instruments for collaborative learning, they have focused mainly on regulation of cognition, neglecting knowledge of cognition, awareness of emotions, and regulation of emotion in collaborative learning. Judged against the definition of social metacognition given by Jost et al. (1998) and the multi-faceted nature of metacognition elucidated by Efklides (2008), the available metacognition scales for collaborative learning do not cover the full range of these facets. Moreover, the difficulty of the learning task and the collaborative learning context both affect learners’ metacognition, including self-regulation and co-regulation or social regulation. Hence, a social metacognition inventory for the online collaborative learning environment is needed so that large-scale quantitative studies on online collaborative learning can be conducted in the future.

In this study, an exploratory factor analysis and a confirmatory factor analysis were conducted to check the construct validity and reliability of the Social Metacognition Inventory. The findings indicated that the revised Social Metacognition Inventory with 14 items had good convergent validity and acceptable discriminant validity, as well as high reliability. Using two datasets, the multi-group invariance test verified the configural, metric, and residual invariance, illustrating that the 14-item SMI had good structural invariance between the two groups. The stability of the SMI was thus verified. In other words, to some degree, the 14-item SMI can be generalized to other non-English-speaking undergraduates, but not to native English-speaking undergraduates.

It should be noted, however, that this study has some limitations. First, the participants, all non-English-speaking learners, came from only two universities in the same region, where English is the main formal language of communication. Second, the difficulty of the learning tasks in online collaborative argumentation was moderate rather than demanding. Previous studies have demonstrated that learners’ regulation of cognition in a collaborative group is largely affected by their prior knowledge, the difficulty of the learning task, the learning partners, and scaffolding. Therefore, whether the developed 14-item SMI is suitable for any collaborative learning task and collaborative learning context needs to be further examined in future research. Moreover, the Social Metacognition Inventory needs to be tested on a population of native English-speaking learners.