Introduction

Collaborative learning has been widely applied as an instrumental approach in EFL reading classes to foster interactive environments where students are encouraged to work together in groups, exchange perspectives, and co-construct the meanings of texts (Huang, 2023; Kiili et al., 2012; Sun & Yuan, 2018; Wang & Zheng, 2019). Additionally, the domain of education has witnessed a recent and remarkable trend, characterized by the integration of mobile devices to enhance the collaborative learning experience. Sung et al.’s (2017) meta-analysis offered compelling evidence supporting the efficacy of collaborative learning facilitated by mobile devices and wireless technology when contrasted with control groups engaged in individual learning or group learning without the aid of learning devices.

Among the wide array of applications of mobile devices in educational contexts, one notable example is web-based student response systems (SRSs), which are also termed clickers, audience response systems, or personal response systems. For the sake of uniformity, the term SRS (student response system) will be used consistently throughout this paper. Numerous meta-analyses (Castillo-Manzano et al., 2016; Chien et al., 2016; Kocak, 2022; Wang & Tahir, 2020; Wood & Shirazi, 2020) that empirically evaluated the use of SRSs reported some commonly noted advantages, such as enhancing learning performance, promoting cognitive and emotional engagement, increasing class participation, enabling anonymity, and reducing anxiety. Moreover, Liu et al. (2017), in their review of the effects of SRSs integrated with different teaching strategies, found that the combination of clicker questions with collaborative learning yielded superior results compared to the use of clickers alone during lectures. Nevertheless, it is important to note that the majority of SRS-related research has been conducted in individual modes, and there is a lack of literature that delves into the role of SRSs in collaborative learning environments and their impact on peer mediation. Further exploration in these areas would provide valuable insights into the efficacy and potential of SRSs in facilitating collaborative learning experiences. Thus, this study seeks to assess and compare the potential effectiveness of gamified and non-gamified SRSs in the context of collaborative reading. Additionally, it aims to examine their respective impacts on learners’ perceptions of collaborative learning experiences.

Collaborative Learning Facilitated with SRS

Several research syntheses and meta-analyses (Chien et al., 2016; Hunsu et al., 2016; Liu et al., 2017) provided compelling evidence that the effect of SRS application appeared more robust when coupled with peer-mediated learning. In this approach, pedagogical emphasis is placed on creating learning environments where learners engage in meaningful and productive peer discussion before giving their responses to SRS questions. Previous studies integrating SRS application with peer discussion (Blasco-Arcas et al., 2013; Chan et al., 2019) claimed that such a combination promoted students’ engagement and a higher level of interactivity among peers and the teacher, which, in turn, contributed to learning outcomes.

A line of research has deliberately compared classrooms where SRS-mediated instruction was incorporated with peer discussion to those that applied individualized SRS. Jones et al. (2012) indicated that learners receiving collaborative SRS strategy demonstrated better near concept transfer abilities compared with those receiving individualized SRS strategy. However, as far as the metacognitive skills were concerned, the results revealed a significant interaction between gender and SRS strategy, further indicating females performed better on the regulation of cognition when utilizing collaborative SRS whereas males did better when utilizing individualized SRS.

McDonough and Foote (2015) analyzed the occurrence of peer interaction when students used individual and shared clickers. The results revealed that shared clicker use resulted in not only greater joint reasoning but also increased accuracy of students’ answers. In Wang’s (2018) study, where Kahoot was applied as a web-based SRS platform, students with group SRS use achieved better immediate learning performance; nevertheless, their advantage did not extend to learning retention. In contrast, the individualized SRS group scored higher on retention of learning content and showed more improvement on delayed tests. According to Wang (2018), one factor limiting the learning success of collaborative SRS could be the social loafing phenomenon, in which some participants did not contribute to the group activities. Regarding learners’ perceptions of these two SRS strategies, Wang (2018) corroborated McDonough and Foote (2015), suggesting that students overall preferred collaborative SRS use over individualized SRS use.

In another study, Sun et al. (2018) investigated the impact of three polling strategies (i.e., traditional IRS clickers, group polling on tablets, and group polling with competition on tablets) on students’ learning performance, anxiety, and attention levels. Although the three polling strategies all promoted learning outcomes, the results indicated that group polling, in general, helped students eliminate anxiety and increase attention levels.

Gamification and Gamified SRS

Gamification has generally been regarded as an effective instructional method that blends learning experiences with games so that learners may be more motivated and engaged. Subhash and Cudney’s (2018) systematic review on gamified learning in higher education revealed that higher perceived learning, improved student attitudes, better learning outcomes, participation, attendance, confidence, and interest in class were widely cited benefits of gamified learning. The study identified several key game elements, including points, badges, leaderboards, levels, quests, and feedback. Another systematic review by Dehghanzadeh et al. (2021) presented an overview of the potential of gamification specifically in the context of learning English as a second language (LESL). The synthesis of 22 relevant studies published between 2014 and 2019 revealed that gamified learning environments can be easily created for LESL through an integration of digital tools and that feedback was the most frequently used gamification element. Another aim of the study was to explore learners’ experiences of gamification for LESL. All the reviewed studies seemed to be overwhelmingly positive in terms of learners’ learning experiences, and some common words used by learners to describe their gamified language learning experiences were “enjoyable, fun, attractive, interactive, and interesting” (p. 12).

Among various forms of gamification, the use of gamified SRS (GSRS) has gained popularity in a wide range of educational contexts and disciplines. A vast amount of literature to date has gauged the effectiveness of GSRS-facilitated instruction. When compared to traditional learning environments, most studies have confirmed the positive effects of integrating GSRSs into class, including increased interest and motivation, positive classroom dynamics, reduced learning anxiety, better attendance and participation, high levels of student satisfaction, enhanced collaboration, and positive perceptions of teachers and students (Chen & Hwang, 2019; Chiang, 2020; Lee et al., 2019; Licorish et al., 2018; Ranieri et al., 2021; Öden et al., 2021; Wang & Tahir, 2020; Zainuddin et al., 2020; Zhang & Yu, 2021). Some studies specifically compared different GSRSs. For example, Göksün and Gürsoy (2019) compared conventional lectures with two experimental groups that used the gamified applications Kahoot and Quizizz, respectively, as formative assessment tools, in terms of academic achievement and student engagement. The results indicated that the Kahoot group outperformed the other two groups on both dependent variables (though not significantly) and that the Quizizz group was the least effective among the three groups. In another study that employed three types of gamification applications (Socrative, Quizizz, and iSpring Learn LMS), Zainuddin et al. (2020) found that the Quizizz group had the highest scores in academic performance, followed by the Socrative group. With respect to perceived engagement, students in both the Socrative and Quizizz groups reported higher perceived levels of four types of engagement (cognitive, behavioral, emotional, and agentic) than those in the iSpring Learn LMS group.

From a review of the abundant related studies, the effectiveness of GSRS incorporation seems promising. However, few attempts have been made to investigate whether the positive relationships between GSRS application and learning outcomes are due to gamification, SRS use, or simply the presentation of questions. In addition, All et al. (2016) pointed out one major methodological limitation of past studies examining the effectiveness of digital game-based learning: many researchers failed to keep the interventions similar in all other aspects so that game elements were the only difference across conditions.

In the present study, three groups were involved and instructed with a collaborative learning approach whereby students engaged in group discussions of guided reading questions in an EFL reading class. The two experimental groups used game-based and non-game-based SRSs, respectively, for question–answer activities, whereas the control group received identical practice questions on PowerPoint slides but did not use any SRS application. The interventions were kept comparable across the three groups in aspects such as time exposure, instructor, learning content, and types of exercises. Specifically, this paper aims to address the following research questions:

  1. Does a statistically significant difference exist in the learning achievements among three groups of students engaged in collaborative learning integrated with gamified SRS, non-gamified SRS, and traditional non-SRS approaches?

  2. Are the students more satisfied with their collaborative learning experiences facilitated with gamified SRS and non-gamified SRS than with the traditional non-SRS classroom?

Methodology

A quasi-experimental design was undertaken to examine the effects of gamification and SRS mediation in an EFL reading class on several dependent variables: learning outcomes and different aspects of collaborative learning such as perceived collaboration experiences, task interest and enjoyment, peer interaction, and social relatedness. The independent variable in this study was the type of digital response system (i.e., non-SRS, SRS, and gamified-SRS) incorporated with question–answer activities through collaborative learning.

Participants

The participants in this study, aged between 17 and 18, were 156 nursing majors. They shared the same L1 (Chinese) and had received an average of 11 years of English education in Taiwan. Three intact classes were recruited from General English courses offered to second-year students at a junior college in northern Taiwan. The three classes were all taught by the researcher, used identical materials, and followed the same syllabus. The three classes were randomly assigned to one control group and two experimental groups. The control group (non-SRS group, N = 52) did not use any digital response system to respond to questions posed by the teacher. One experimental group was designated the SRS group (N = 52) and used a non-gamified SRS, namely Nearpod, for question–answer activities, whereas the other experimental group was designated the gamified SRS group (N = 52) and used a gamified SRS, namely Kahoot.

A sample reading test from TOEIC Bridge was administered before the instruction to ensure that the three groups were comparable in terms of their English proficiency at the outset of the study. An ANOVA showed no significant differences in English proficiency levels across the three classes (F(2, 138) = 0.91, p = 0.40). In other words, the results indicated that the three groups were equivalent in L2 competency at the outset of the research. Based on the proficiency results, the majority of the students were classified as A1-A2 level in the CEFR.
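The equivalence check described above can be illustrated with a minimal sketch of a one-way ANOVA F statistic. The function name `one_way_anova` and the pretest scores below are hypothetical and are not taken from the study; the sketch only shows how an F ratio with between-group and within-group degrees of freedom is formed.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of score lists."""
    all_scores = [score for group in groups for score in group]
    grand_mean = sum(all_scores) / len(all_scores)
    group_means = [sum(group) / len(group) for group in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(group) * (m - grand_mean) ** 2
        for group, m in zip(groups, group_means)
    )
    # Variation of individual scores around their own group mean.
    ss_within = sum(
        (score - m) ** 2
        for group, m in zip(groups, group_means)
        for score in group
    )

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical pretest scores for three small classes (illustration only).
non_srs = [52, 60, 48, 55]
srs = [50, 58, 54, 57]
gamified_srs = [53, 61, 49, 56]
f_stat, df_b, df_w = one_way_anova([non_srs, srs, gamified_srs])
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

The p value would then be read from the F distribution with these degrees of freedom, for instance with a statistics package such as SciPy (`scipy.stats.f_oneway`).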

As the general intervention, question–answer activities were implemented and mediated by group discussions in all three classes. During instruction, the teacher posed questions about the reading content, and students engaged in group discussion to reach a consensus and respond to the questions. Each class was sorted into groups of four based on heterogeneous grouping, using the previously mentioned TOEIC Bridge test results as the grouping reference. That is, each group was formed on the basis of the students’ TOEIC Bridge scores, with two members from the top half and the other two from the bottom half of the ranking. The reason for choosing this grouping method was to ensure that each group was composed of students with mixed abilities, with two high achievers and two low achievers. Thus, more capable learners could provide personalized peer support to their group members so as to achieve effective collaborative learning (Boardman et al., 2018; Jalilifar, 2010).
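The heterogeneous grouping rule above (two students from the top half and two from the bottom half of the ranking) can be sketched as follows. The function name `heterogeneous_groups`, the student IDs, and the scores are hypothetical; the source does not state how students were paired within each half, so pairing them in rank order is an assumption made here for illustration.

```python
def heterogeneous_groups(scores, group_size=4):
    """Form mixed-ability groups from a {student_id: score} mapping.

    Each group takes group_size // 2 students from the top half of the
    ranking and the same number from the bottom half. Pairing in rank
    order is an assumption, not the study's documented procedure.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    half = len(ranked) // 2
    top, bottom = ranked[:half], ranked[half:]
    per_half = group_size // 2

    groups = []
    for start in range(0, half, per_half):
        groups.append(top[start:start + per_half] + bottom[start:start + per_half])
    return groups


# Hypothetical roster of eight students with pretest scores.
roster = {"s1": 90, "s2": 85, "s3": 80, "s4": 75,
          "s5": 60, "s6": 55, "s7": 50, "s8": 45}
for group in heterogeneous_groups(roster):
    print(group)
```

With a class of 52, this rule yields 13 groups of four; class sizes that are not a multiple of four would need an extra rule for the remaining students.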

Experimental Procedure

Throughout the study, two reading texts were used, both taken from the designated, locally published textbook. The titles of the two articles were as follows: (1) “Online Buddies or Online Bullies?” (2) “Take the bills away.” Both articles were similar in length and difficulty, each containing approximately 500 words with a Flesch-Kincaid grade level of 8.
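For readers unfamiliar with the readability index, the standard Flesch-Kincaid grade-level formula can be sketched as below. The word, sentence, and syllable counts in the example are hypothetical and are not the counts of the two texts used in the study.

```python
def flesch_kincaid_grade(total_words, total_sentences, total_syllables):
    """Flesch-Kincaid grade level from raw text counts."""
    words_per_sentence = total_words / total_sentences
    syllables_per_word = total_syllables / total_words
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59


# Hypothetical counts for a text of about 500 words (illustration only).
grade = flesch_kincaid_grade(total_words=500, total_sentences=30, total_syllables=720)
print(round(grade, 1))  # about 7.9
```

Counting syllables automatically is itself a heuristic step, so the sketch takes the counts as inputs rather than deriving them from raw text.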

Care was taken to ensure that the reading instruction administered to the three groups followed similar steps, encompassing three reading phases. At the pre-reading phase, question–answer activities were launched to generate group discussions, not only to introduce the topic but also to activate learners’ prior knowledge and arouse their interest. The question types at this phase normally included open-ended questions and poll questions. During the reading phase, students would normally proceed to read a portion of the text silently for comprehension, and the instructor also incorporated vocabulary instruction into lessons to make the texts comprehensible to them. Meanwhile, the students were encouraged to discuss within groups to clarify the meaning of the texts. Immediately after the designated portion had been read, question–answer activities were launched for the instructor to assess students’ level of comprehension and for students to identify misconceptions. The students were instructed to engage in group discussions before indicating their answers or responses.

The question types used for post-reading question–answer activities included short-answer, multiple-choice, and true–false questions. These questions prompted students to search for specific information within the texts, extract information from different sections of the texts, and draw inferences based on the content. Personal response questions and those lacking specific answers were omitted, as the limitations of student response systems restrict their effective use. The students were engaged in cycles of reading texts, group discussions, and question–answer activities until the text was completed. The question–answer activities created for each complete text contained 20–25 items. The three groups received collaborative learning as a pedagogical approach, identical question formats, and identical reading procedures.

The variation across groups was the type of response system incorporated into the question–answer activities. The reading questions were presented on PowerPoint slides, Nearpod, and Kahoot in the non-SRS, SRS, and gamified-SRS classes, respectively. In the control group (non-SRS class), the students either raised their hands or were randomly chosen by the instructor to indicate their answers after group discussions. In the two experimental groups using digital SRSs, by contrast, students clicked and typed their responses using one shared remote device within each group. The sole difference between the two experimental groups was the application of game mechanics in the digital SRS. To encourage participation, team responses were incorporated into the course grades in all three classes.

Instruments

Figure 1 shows the experiment design of this study. All participants were given a reading comprehension test after completing in-class activities for each text to compare learning achievements due to different SRS applications. The questionnaire used for the study aimed to investigate learners’ perceived collaborative learning, task interest-enjoyment, peer interaction, and perceived social relatedness. The dimension of perceived collaborative learning was measured before and after the intervention so as to explore whether different types of SRS mediation moderated learners’ perceived collaborative learning over time. The other three dimensions (interest-enjoyment, peer interaction, and social relatedness), on the other hand, were measured only after the intervention to compare these measurements across different learning conditions. All questionnaire items were rated on 5-point Likert scales ranging from 1 (strongly disagree) to 5 (strongly agree), except for the peer interaction questionnaire, which ranged from 1 (never) to 5 (always).

Fig. 1 Experiment design of the study

Reading Comprehension Tests

Each test consisted of 10 reading comprehension questions in multiple-choice format, with five literal and five inferential comprehension questions. When constructing test items, the researcher avoided similar wording used for in-class question–answer activities so as to reduce the possibility that students memorized specific questions and answers. Each test item scored 10 points, so the maximum possible score for each reading comprehension test was 100 points.

Collaborative Learning Questionnaire

Perceived Collaborative Learning (PCL)

Perceived collaborative learning measured student perspectives on preferences for group versus individual work and overall perceptions of collaborative learning experiences. The questionnaire was taken from the Collaborative Learning subscale of So and Brush’s (2008) Collaborative Learning, Social Presence, and Satisfaction (CLSS) questionnaire. The Cronbach alpha reliability reported by So and Brush was 0.72 for the subscale. The questionnaire contained 8 items. One original item, “Collaborative learning experience in the computer-mediated communication environment is better than in a face-to-face learning environment,” was slightly modified to “Collaborative learning experience is better than individual learning” to better fit the current study. There was one negatively worded item, “Collaborative learning in my group was time-consuming.” The reliability of the scale on the current sample was 0.86.
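As an illustration of how a Cronbach alpha reliability coefficient of this kind is computed, the sketch below uses the standard formula on hypothetical 5-point responses. The source does not describe how the negatively worded item was handled; reverse-scoring it before the reliability analysis is assumed here as common practice, and the helper names `reverse_score` and `cronbach_alpha` are hypothetical.

```python
from statistics import variance


def reverse_score(value, scale_min=1, scale_max=5):
    """Flip a Likert rating so that a high score always means agreement."""
    return scale_max + scale_min - value


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(responses[0])
    item_columns = list(zip(*responses))
    sum_item_variances = sum(variance(column) for column in item_columns)
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum_item_variances / total_variance)


# Hypothetical responses; the last item is treated as negatively worded.
raw = [
    [4, 5, 4, 2],
    [3, 3, 4, 3],
    [5, 5, 5, 1],
    [2, 3, 2, 4],
]
scored = [row[:-1] + [reverse_score(row[-1])] for row in raw]
print(round(cronbach_alpha(scored), 2))
```

The same function applies unchanged to the other subscales, since alpha depends only on the number of items and their variances.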

Interest-Enjoyment (IE)

Interest-enjoyment measured learners’ interest and enjoyment toward a learning activity, using a 7-item subscale of the Intrinsic Motivation Inventory (McAuley et al., 1989). The questions varied from “I enjoyed doing this activity very much” to the negative item “This activity did not hold my attention at all”. The Cronbach alpha reliability reported in the original study was 0.78. The reliability of the scale on the current sample was 0.91.

Peer Interaction (PI)

Peer interaction, composed of 3 items, measured the extent to which learners engaged in peer interactions during in-class activities. The questionnaire was taken from the Peer Interaction subscale developed by Lai (2021), with a reported Cronbach alpha reliability of 0.84 in the original study. The reliability of the scale on the current sample was 0.82.

Social Relatedness (SR)

Social relatedness reflected learners’ feelings of connectedness and belonging within the groups. Social relatedness was measured via four items adapted from Xi and Hamari’s (2019) Relatedness Need Satisfaction subscale. To fit the present study context, the stem “when I visit the online community” was slightly modified to “In a team activity.” A sample item is “In a team activity, I feel I was supported by others.” The reported Cronbach alpha reliability was 0.83 in the original study. The reliability of the scale on the current sample was 0.88.

Semi-structured Interview

With respect to qualitative data, personal semi-structured interviews were conducted to collect more detailed feedback from the students. Upon completion of the instruction activities, four students, comprising two high achievers and two low achievers, were randomly selected from each class and asked the following questions:

  1. What are the advantages of the learning approach (i.e., combining Kahoot/Nearpod with collaborative learning)? Why?

  2. What are the disadvantages of the learning approach (i.e., combining Kahoot/Nearpod with collaborative learning)? Why?

  3. What are your views on the impact of Nearpod/Kahoot application on collaborative learning and interactions among group members?

Results

An Analysis of Learning Achievements

Table 1 summarizes the mean scores and standard deviations of the three groups’ performance on the two reading comprehension tests and the midterm. As shown in Table 1, the gamified SRS group achieved the highest mean scores and the non-SRS group the lowest on all three tests.

Table 1 Comparison of group differences in learning achievements on descriptive statistics

The learning achievements among the three learning modes were compared using one-way ANCOVA to control for the influence of the covariate (TOEIC Bridge pretest scores) and to examine the effects of gamification and SRS mediation. Table 2 displays the ANCOVA results for the learning outcomes, with the language proficiency pretest controlled. The results indicate that the SRS group achieved the highest adjusted mean scores for the two reading assessments, while the gamified SRS group attained the highest adjusted mean score for the midterm. In contrast, the non-SRS group demonstrated the lowest learning outcomes across all three posttests.
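To make the logic of the adjusted means concrete, the following is a minimal sketch of a classical one-way ANCOVA with a single covariate. It assumes a common regression slope across groups, and the function name `one_way_ancova` and all data values are hypothetical; it is not the analysis script used in the study.

```python
def _mean(values):
    return sum(values) / len(values)


def one_way_ancova(groups):
    """One-way ANCOVA with one covariate.

    groups maps a label to a list of (covariate, outcome) pairs.
    Returns (F, df_effect, df_error, adjusted_means).
    """
    pairs = [pair for members in groups.values() for pair in members]
    grand_x = _mean([x for x, _ in pairs])
    grand_y = _mean([y for _, y in pairs])

    # Total sums of squares and cross-products, ignoring group membership.
    sxx_t = sum((x - grand_x) ** 2 for x, _ in pairs)
    sxy_t = sum((x - grand_x) * (y - grand_y) for x, y in pairs)
    syy_t = sum((y - grand_y) ** 2 for _, y in pairs)

    # Pooled within-group sums of squares and cross-products.
    sxx_w = sxy_w = syy_w = 0.0
    means = {}
    for label, members in groups.items():
        mx = _mean([x for x, _ in members])
        my = _mean([y for _, y in members])
        means[label] = (mx, my)
        sxx_w += sum((x - mx) ** 2 for x, _ in members)
        sxy_w += sum((x - mx) * (y - my) for x, y in members)
        syy_w += sum((y - my) ** 2 for _, y in members)

    slope = sxy_w / sxx_w                       # common within-group slope
    sse_full = syy_w - sxy_w ** 2 / sxx_w       # groups + covariate
    sse_reduced = syy_t - sxy_t ** 2 / sxx_t    # covariate only
    df_effect = len(groups) - 1
    df_error = len(pairs) - len(groups) - 1
    f_stat = ((sse_reduced - sse_full) / df_effect) / (sse_full / df_error)

    # Each group mean is shifted to the value expected at the grand covariate mean.
    adjusted = {
        label: my - slope * (mx - grand_x) for label, (mx, my) in means.items()
    }
    return f_stat, df_effect, df_error, adjusted


# Hypothetical (pretest, posttest) pairs for three small groups.
data = {
    "non-SRS": [(40, 50), (55, 60), (62, 70), (48, 52)],
    "SRS": [(42, 62), (53, 68), (60, 80), (50, 66)],
    "gamified SRS": [(41, 60), (56, 74), (63, 78), (47, 69)],
}
f_stat, df_effect, df_error, adjusted = one_way_ancova(data)
print(f"F({df_effect}, {df_error}) = {f_stat:.2f}")
print({label: round(value, 1) for label, value in adjusted.items()})
```

Because the adjustment subtracts the part of each group mean that is predictable from the pretest, adjusted means can rank groups differently from raw means when the groups differ slightly on the covariate.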

Table 2 Summary of ANCOVA on the learning achievement post-test

Additionally, as shown in Table 2, the ANCOVA revealed no significant differences for reading comprehension test 1 (F = 0.10, p > 0.05), whereas significant differences were observed among the groups for reading comprehension test 2 (F = 5.189, p < 0.05) and the midterm (F = 6.11, p < 0.05). Pairwise comparisons were further performed to determine which pairs of means differed significantly from each other for reading comprehension test 2 and the midterm. The results indicated that, in both reading comprehension test 2 and the midterm, the adjusted mean scores of the non-SRS group were significantly lower than those of both the gamified SRS and SRS groups, whereas no significant differences were found between the gamified SRS and SRS groups.

An Analysis of Questionnaires

Learners’ collaborative learning was evaluated along four dimensions: perceived collaborative learning (PCL), interest and enjoyment (IE), peer interaction (PI), and social relatedness (SR). Descriptive statistics of these four dimensions are summarized in Table 3. The gamified SRS class had higher mean ratings than the SRS class and the non-SRS class in all dimensions of post-PCL, IE, PI, and SR. Pre-post measurements of perceived collaborative learning were employed to investigate whether learners’ views on collaborative learning changed over time due to the interventions. As seen in Table 3, participants’ post-measurement scores of perceived collaborative learning increased by 0.32 for the gamified SRS group, whereas they decreased by 0.09 and 0.08 for the SRS group and the non-SRS group, respectively. To find out whether there was a significant difference across groups, an analysis of covariance (ANCOVA) was applied, using the pre-PCL measurements as the covariate and the post-PCL measurements as the dependent variable. The results indicated that no statistically significant difference was found across groups (F(2, 130) = 1.79, p = 0.17).

Table 3 Means and standard deviations of 4 dimensions of collaborative learning

To examine whether significant differences existed in interest and enjoyment (IE), peer interaction (PI), and social relatedness (SR) among groups, a one-way analysis of variance (ANOVA) was conducted, which likewise revealed no significant differences among the SRS mediation methods on any of these three variables.

An Analysis of Qualitative Data

To address the second research question, thematic analysis was initially employed to analyze open-ended comments from questionnaires, allowing for the emergence of codes and themes. The coding themes were then categorized based on the advantages and disadvantages of the three learning modes. The summarized coding themes and their frequencies of occurrence are presented in Table 4. Subsequently, transcribed individual interviews were carefully reviewed multiple times to identify significant segments aligning with established coding themes, thereby offering deeper insights and context.

Table 4 Summary of perceived advantages and disadvantages of three learning modes

As shown in Table 4, the themes of coded advantages across the three learning modes were similar, including fun/enjoyment, a good atmosphere, associated with learning, engagement, self-reflection, group discussions, and social connections. Regarding the perceived disadvantages, two concerns were raised in all three learning modes: social-emotional stress and social loafing. The data also showed that the non-SRS group’s negative responses outnumbered those of the other two groups, and some other notable complaints were that students remained passive during question–answer activities and that the collaborative learning experience was no better than, or even worse than, individual learning.

Regarding individual benefits, the interviews unveiled that the majority of participants expressed that collaborative learning was more enjoyable, and the classroom atmosphere was more pleasant compared to individualistic and teacher-driven approaches. Furthermore, participants highlighted that peer discussions captured their attention, resulting in increased engagement and reinforcing their learning. In terms of collective learning experiences, below are excerpts extracted from students’ responses that align with the two identified coded advantages: group discussions and social connections.

Group Discussion

  • I can directly ask my team members during the class if I have questions because I am too afraid to ask the teacher (S3, gamified SRS).

  • In order to explain things clearly to my team members, the group discussions provide me with opportunities to think and practice expressing myself (S1, gamified SRS).

  • Due to heterogeneous grouping, students with weaker English proficiency in the same team can ask those with better English proficiency for help (S7, SRS).

  • My team members would ask about my thoughts on the answers and even patiently clarify the meaning of texts and teach me how to pronounce words correctly when I encountered problems (S8, SRS).

  • Through continuous discussions, I have a better impression of the text content (S9, non-SRS).

Social Connection

  • I had the opportunity to interact more with classmates whom I’m not usually familiar with (S4, gamified SRS).

  • I got to make some new friends through group discussion (S5, SRS).

However, two interviewees from the control group, which had no SRS mediation, reported that their group members showed a lack of engagement in group discussions and were hesitant to respond to questions. The following interview excerpts reflect the coded disadvantages for the non-SRS group: passive attitudes and no better than individual learning.

Passive Attitudes

  • Our group is not very proactive in discussing or raising hands to answer (S12, non-SRS).

No Better Than Individual Learning

  • I’m not familiar with my team members, so we didn’t really have much interactions. For me, individual learning is more effective (S10, non-SRS).

The interview data revealed another theme that SRS conditions, where each group’s responses were displayed on the screen for comparison and review, fostered a competitive atmosphere. One participant noted, “Since all groups’ answers were shown on the screen, we tried harder to collaborate within our groups to avoid feeling embarrassed if we didn't get answers correct.” These inter-group competitions created a gamified atmosphere, significantly motivating students in SRS classes to collaborate with their fellow group members.

While peer interactions were seen as a valuable learning resource, they could also result in apprehension and negative social dynamics. The interview excerpts corresponding to the two coded disadvantages are provided below:

Social-Emotional Stress

  • I don’t have strong English skills, I am worried I don’t have any contribution (S3, gamified-SRS).

  • There is not much I can do for the group as I don’t understand the texts (S8, SRS).

  • I hope my group members won’t dislike me because I keep asking them questions (S12, non-SRS).

Social Loafing

  • It was a dreadful group experience, as one member made no effort to search for answers or participate in discussions (S12, non-SRS).

  • Only three of us were engaged in discussion, and the other member was silent the whole time (S9, non-SRS).

Discussion

The statistical results of this study indicated that both experimental groups utilizing SRSs significantly outperformed the control group without SRS mediation on the second reading assessment and the midterm. However, no significant differences were observed among the three groups for the first reading assessment. One plausible reason is the students’ adjustment period to collaborative learning contexts, during which they needed time to familiarize themselves with the dynamics of group discussions and interactions. Several participants mentioned during the interviews that instruction featuring peer discussions was unfamiliar to them.

While improved learning performance is widely recognized as a common advantage of SRS intervention in the literature, the empirical evidence has presented conflicting findings regarding the impact of SRS mediation on examination results. To address these discrepancies, two meta-analyses conducted by Castillo-Manzano et al. (2016) and Hunsu et al. (2016) aimed to shed light on the potential effects of SRSs on academic marks. The findings of both meta-analyses suggested that SRSs exhibited a positive, albeit limited, impact on academic performance compared to other conventional teaching–learning methods. Furthermore, both studies also revealed that the effectiveness of SRSs was influenced by various factors, such as class size, clicker questions (Hunsu et al., 2016), educational context, and the category of discipline (Castillo-Manzano et al., 2016). Of particular note, Hunsu et al. (2016) highlighted that academic outcomes were more strongly linked to clicker questions than to SRS application, as the effectiveness of SRS use disappeared when both SRS and non-SRS groups received similar question–answer instruction.

However, in order to mitigate the potential “unequal-item exposure effect” (Chien et al., 2016, p. 4), all three classes in this study were provided with an identical question–answer intervention, with the only variation being the incorporation of the SRS. Thus, the results of this study indicate that SRS use indeed offered certain advantages in terms of achievement over the traditional method in collaborative learning contexts. According to the qualitative data, such an advantage could be attributed to the fact that SRS use promoted peer interactions, which led to successful collaborative learning. The interviewees in both experimental groups indicated that the SRS application made them more likely to participate actively in group discussions, as their responses to questions would be displayed on the screen, allowing the instructor and the class to compare them. This also aligns with Liu et al.’s (2017) review of SRS use incorporated with different teaching pedagogies, which claimed that one prominent feature of SRSs was their ability to enhance communication and interaction, thus resulting in stronger positive effects on academic achievement in collaborative learning settings compared to traditional lectures. Chan et al. (2019), in their study exploring the relationships between interactivity, active collaborative learning, and learning performance when using SRSs, also highlighted that SRS use facilitated collaborative learning, consequently enhancing the positive effect of interactivity on students’ learning performance.

The benefits of engaging in more active and productive group discussions under SRS conditions were also reflected in the SRS groups’ significantly higher midterm scores. In this research context, while the two reading assessments functioned as formative evaluations of reading comprehension, the midterm exam served as a summative evaluation, providing an overall assessment of students’ language learning outcomes, including vocabulary, grammar, and paragraph reading based on the taught reading texts. The SRS application likely served as a catalyst for promoting deeper content processing during peer interactions, wherein participants continuously monitored their understanding of the texts, actively inferred text meaning using contextual and linguistic clues, and grasped the gist of new words. Although vocabulary and grammar were not the primary focus of the present study, the findings indicated that the enhancement in reading comprehension, as demonstrated in the second reading assessment, also contributed to overall language learning transfer.

When the learning outcomes of the SRS and gamified SRS groups were compared using ANCOVA, no significant differences were found in any of the posttests. This finding thus does not suggest any significant added value of gamification embedded in SRS for learning outcomes, which differs from the findings of Turan and Meral’s (2018) study. In their research comparing game-based SRS (Kahoot) with non-game-based SRS (Socrative), Turan and Meral (2018) suggested that the game-based SRS significantly improved achievement. One plausible explanation for this disparity could be that, in the current study, the non-game-based SRS was applied in a collaborative learning context, whereas in Turan and Meral’s (2018) study, it was used in an individual learning context. Despite the absence of explicit gamification features in the Nearpod used in the SRS class, a game-like atmosphere unintentionally emerged in this study due to the comparison of each team’s responses on the SRS and their contribution towards group performance as part of the course grades (Kay & LeSage, 2009).

Another aim of the study was to explore the potential impact of SRS mediation or gamification on learners’ perceptions of collaborative learning experiences. Although the quantitative analysis of the Likert scale did not show significant differences across the three groups for any of the four aspects of perceived collaborative learning, the qualitative data (open-ended comments, interviews) did demonstrate that learners’ perceptions of collaborative learning in the gamified SRS and SRS classes were generally more positive than those in the non-SRS class. Consistent with previous studies (Chen, 2022; Kocak, 2022; Ranieri et al., 2021; Zhang & Yu, 2021), students in both experimental classes with SRS mediation generally perceived SRS use as contributing to an enjoyable learning environment, fostering participation and engagement, increasing motivation, and enhancing the learning experience compared to those receiving the traditional approach.

One notable theme that emerged from the qualitative data is that the SRS application compensated for group member unfamiliarity, which was highlighted as an advantage by several students during interviews. The students emphasized that being familiar with their group members would make them feel more comfortable and at ease during discussions (Astuti & Lammers, 2020). Janssen et al. (2009) also confirmed that group familiarity was associated with critical and exploratory group norms, positive perceived collaboration, and even group performance. In this research context, however, because the students from the three classes were placed into heterogeneous ability groups for optimal learning, they might not have been familiar with their group members. Even so, the SRS application in both experimental classes created inter-group competition that motivated team members to collaborate towards a shared goal and actively engage in discussions, ultimately leading to group cohesion and maximizing group performance. This stands in contrast to the claims of some students in the control group without SRS use, who expressed that their collaborative learning experience was not significantly different from individual learning, as they felt a lack of meaningful interaction and engagement within their groups, partly due to group unfamiliarity. This study also confirmed the joint effect of peer competition between groups and peer collaboration within groups. Similarly, Pareto et al. (2012) emphasized the benefits of pedagogy featuring peer competition coupled with peer collaboration, which exerted a compelling motivational influence and encouraged students’ active engagement.

The qualitative data showed that inter-group competition boosted intra-group collaboration not only in the gamified SRS class but also in the SRS class without the gamification feature. As mentioned above, this could be because team competition and team rewards in the SRS class might also have contributed to competitive, game-like learning experiences. Ho et al.’s (2022) meta-analysis suggested that competitive conditions may have a more significant impact on learning outcomes than gamification. However, it is worth noting that features of Kahoot, such as scoreboards, audio, graphics, and the countdown tick, might have led to a more competitive learning environment than the Nearpod used in the SRS class (Wang & Tahir, 2020; Zhang & Yu, 2021). Nevertheless, this higher level of competitiveness observed in the gamified SRS class did not have a significant association with learning achievements or perceived collaborative learning experiences in this study.

On the other hand, the lack of significant differences in collaborative perceptions among the three groups could be due to two possible explanations. First, while SRS use might promote peer interactions and collaborative discussions, it might not completely address negative group dynamics, such as free riding or social-emotional stress. Similarly, when comparing individual and group SRS use, Wang (2018) also found free-rider issues, with less motivated learners relying excessively on their team members in group SRS use. This study, however, revealed that the lower participation of certain group members was not solely attributable to low motivation but also to communication apprehension experienced by less advanced students. Ter Vrugte et al. (2015) reported some interesting findings about heterogeneous groupings, indicating that low achievers benefited from collaboration when competition was absent, whereas high achievers experienced greater benefits from collaboration when competition was present. The second explanation is related to the nature of SRSs, which are typically designed for rapid evaluation and feature closed-ended questions. These closed-ended questions did not require deeper thought sharing or extensive dialogue among students, thereby limiting opportunities to enhance students’ perceived feelings of connection with their peers. So and Brush (2008) posited that feelings of connection and closeness with other students affect individuals’ motivation to engage in group activities and their perception of collaborative learning. Collaborative activities that allow relationships to be forged through the exchange of information may be particularly crucial for heterogeneous group learning settings, where psychological distance could serve as a barrier to effective communication.

Although negative group dynamics could be influenced by various factors (i.e., competition, group familiarity, and relevant abilities) in this study, the group reward method could be another potential contributing factor. In collaborative learning contexts, Sung et al. (2017) recommended providing balancing points based on individual performance, whereas Wang et al. (2017) suggested incorporating unrewarded peer competition to reduce inefficient group processes.

Conclusion

The study findings support previous research indicating that incorporating SRS use into collaborative pedagogy can enhance peer-aided learning by boosting motivation, promoting interactivity and peer discussions, and creating an enjoyable learning environment, all of which could lead to positive effects on learning outcomes. Several implications can be drawn from the study. First, SRS could be particularly beneficial in heterogeneous group compositions, where low-ability students could learn from their more capable peers and benefit from collaborative learning. Second, the study reveals that SRS use, even when integrated with gamification features, did not significantly improve collective learning experiences. Furthermore, while SRS is effective for convergent questions, it could limit the exchange of multiple perspectives among students, thereby potentially hindering the development of social interactions and perceived feelings of connection within groups. In such cases, educators should incorporate a variety of question types that encourage students to explore multiple perspectives and engage in deeper discussions, whether before, during, or after SRS activities.

Additionally, educators may rely on appropriate scaffolding to stimulate students’ participation in group discussions. Third, the study highlights the role of group rewards in fostering a gamified and competitive environment, even in instances where explicit gamification features are not present in the SRS. Nonetheless, these gamified dynamics can exert both positive and negative influences on collaborative learning. On the one hand, they enhance interactivity and active participation; on the other hand, they may induce stress and communication apprehension among certain students, potentially leading to inefficient group processes. Thus, educators should design reward guidance with caution.

Given the complexity of collaborative learning practices, it is recommended that future research adopt qualitative discourse analysis to unveil the dynamic and situated nature of group discussions during SRS-enabled collaborative learning. Additionally, a promising avenue for further research could involve examining the varying effects of SRS technologies integrated with peer competition and peer collaboration on low- and high-achieving students in terms of their learning outcomes, motivation, and perceived group dynamics.