1 Introduction

As digital technology has advanced, artificial intelligence (AI) chatbots and metaverse platforms have generated significant interest in the field of computer-assisted language learning (CALL) (e.g., Chen, 2022; Hew et al., 2023; Huang et al., 2022; Hwang, 2023; Ji et al., 2023; Lee & Hwang, 2022; Lee et al., 2023; Wu et al., 2023). Many CALL scholars have paid sustained attention over the last decade to the use of AI chatbots to support students’ language education (Huang et al., 2022; Ji et al., 2023). Research notes that different types of speech-recognition chatbots may provide diverse language learning affordances, such as social interaction (Huang et al., 2022), goal-oriented language learning (Hew et al., 2023), task-based language learning (Jeon, 2022), communication opportunities (Ji et al., 2023), and language-related feedback provision (Hew et al., 2023). With the recent development of large language models, such as ChatGPT and Gemini, AI-based conversational chatbots have gained renewed attention, as they can provide diverse opportunities for language teaching and learning based on their advanced AI technologies for chatbot-human communication (Kasneci et al., 2023).

In addition, there has been a notable increase in research on the utilization of the metaverse in language education (Hwang & Chein, 2022; Hwang et al., 2023b; Wu et al., 2023). Drawing upon prior research on the educational advantages of virtual environment platforms for language teaching and learning (Cheng & Chen, 2016; Lee & Hwang, 2022), scholars have directed their attention towards metaverse platforms that may offer enhanced connectivity, immersive learning experiences, support for collaborative learning, a sense of authentic social presence, and contextually rich language interactions (Hwang, 2023; Lee & Hwang, 2022; Lee et al., 2023; Wu et al., 2023; Zhang et al., 2022). Research on the use of metaverse platforms in language education is evolving, and the growing attention highlights their potential to innovate language learning practices by integrating emerging AI technologies in virtual spaces (Hwang et al., 2023a; Kasneci et al., 2023; Rospigliosi, 2023; Wu et al., 2023). Specifically, the metaverse, which encompasses virtual reality (VR) and augmented reality (AR), among other immersive learning technologies, is expected to be at the center of CALL research when it is integrated with AI technologies, such as chatbots and AI-powered learning management systems (LMSs) (Hwang & Chein, 2022; Wu et al., 2023).

Although prior research has discussed such potential roles of AI chatbots and the metaverse in language education, empirical research on the integration of the metaverse and AI chatbots remains scarce. In addition, as Kim et al. (2022) and Lan et al. (2018) suggested, integrating the design and utilization of AI chatbots and the metaverse into teacher training courses might help prepare pre-service teachers for future language teaching situations by offering interactive and immersive experiences that are not possible in traditional settings. Because AI chatbots offer enhanced task-based language teaching opportunities and the metaverse provides immersive teaching environments, using the two technologies together can help pre-service teachers develop their language teaching affordances (Lee et al., 2024a) and enhance their readiness to design technology-mediated learning environments, which are increasingly important in the current digital era (Lee & Hwang, 2022). Research also notes that design-based training modules are pivotal for the success of teacher preparation programs, as pre-service teachers, who do not have classroom-based teaching experiences, can be active learners of both technology and pedagogy (Campbell et al., 2022; Nami, 2022; Tondeur et al., 2016). However, the integration of these two technologies in language teacher training has rarely been explored.

In sum, while previous studies have investigated the educational effects and students’ learning affordances of AI chatbots and metaverse technologies separately (Huang et al., 2022; Hwang, 2023; Hwang & Chein, 2022; Hwang et al., 2023b; Lee et al., 2023), little is known about how these two technologies can be combined to maximize their educational potential, particularly in pre-service teacher training. To date, only one study has investigated this combined approach (Lee et al., 2024a). To address these gaps, this study aims to explore the integration of AI technology into the metaverse for professional teacher development. It specifically examines how pre-service English-as-a-foreign-language (EFL) teachers design AI chatbots and connect them to metaverse-based virtual classrooms for their teaching demonstrations with their peer teacher candidates. It additionally delves into their perceptions of designing chatbot-based lesson plans and their mock teaching experiences.

2 Theoretical background

2.1 The educational potentials of chatbot use in language education

Research within the CALL field demonstrates an increasing interest in the use of chatbots for language instruction over the last two decades (Huang et al., 2022; Ji et al., 2023). Initially developed as a text-based agent, a chatbot began to be used as a verbal agent with the development of AI technologies, such as natural language processing and automatic speech recognition (Jeon, 2022). According to Huang et al. (2022), AI chatbots can provide various affordances in relation to language learning, such as the technological, pedagogical, and social affordances, albeit with some communicative limitations. It was also noted that these chatbots can support second language (L2) learning through their interactional capabilities for language learning tasks (Kim et al., 2022). Similarly, Jeon (2022) explored these affordances for task-based English language learning for young learners and observed significant effects on their language learning perceptions, including perceived L2 competence, technology control, the pedagogical and interactional values of chatbots, and L2 motivation.

Furthermore, studies have found that educational AI chatbots assume multiple roles in language education, ranging from providing resources, evaluating language abilities, and tailoring instruction to facilitating verbal interaction and offering personalized feedback (Hwang & Chein, 2022; Ji et al., 2023). Among these, the role of a conversational partner is the most frequently reported, as seen in Lee and Jeon (2024), who found that young English learners perceived AI agents as human-like conversation partners. However, it was noted that L2 learners also face technological and psychological challenges when utilizing AI chatbots for language learning, such as communication breakdown, lack of goal-orientation, and a potential novelty effect impacting their motivation and emotional needs (Hew et al., 2023; Huang et al., 2022; Jeon, 2022).

Literature emphasizes that AI chatbots, if designed well, can enhance L2 students’ learning experiences in language teaching (Hew et al., 2023; Huang et al., 2022; Kim et al., 2022; Lee et al., 2020). In this vein, research calls for designing goal-oriented chatbots mimicking real-life scenarios for effective language teaching and learning, requiring consideration of learner traits (Kim et al., 2022). In a teaching context, teachers should learn to design language learning tasks with clear objectives while using platforms like Dialogflow and Botsify to customize chatbots to student needs (Lee et al., 2020). Despite this necessity, there is scant knowledge on promoting pre-service teacher training for educational chatbot design, especially in the CALL field (Ji et al., 2023; Kim & Lee, 2022). For example, Kim and Lee’s (2022) use of Dialogflow for pre- and in-service English teacher development did not delve into diverse AI chatbot design aspects, including pedagogical benefits and limitations (Huang et al., 2022), task-oriented design (Hew et al., 2023; Lee et al., 2020), and its usage perception (Ji et al., 2023).

Educational stakeholders’ perceptions and voices are crucial to examining the effective integration of new technologies in L2 classrooms for both teaching and learning purposes (Nishino, 2012; Teo, 2015; Yang & Chen, 2023). In this context, focusing on the cognitive dimensions of chatbot-assisted L2 education, empirical research has consistently highlighted favorable views from both learners and teachers on the integration of AI chatbots into L2 instruction (e.g., Hew et al., 2023; Lee & Jeon, 2024; Yang & Chen, 2023). For example, Lee and Jeon (2024) demonstrated that young L2 learners embraced AI chatbots as authentic conversation partners in diverse language learning tasks. Hew et al. (2023) used goal-oriented chatbots to facilitate college students’ goal-setting and social presence in language learning, discovering active engagement in chatbot activities along with positive attitudes toward the chatbots’ usefulness and ease of use. In a teacher-training context, Yang and Chen (2023) reported that pre-service teachers’ intentions to adopt chatbots were influenced by the chatbots’ usefulness in enhancing their content understanding and accessibility. Overall, these studies share the view that chatbots are useful, user-friendly, and engaging tools in L2 education.

Recent research indicates that AI chatbots can enhance language learning for EFL students when combined with technologies like VR, AR, large language models, and the metaverse (Hwang & Chein, 2022; Rospigliosi, 2023; Wu et al., 2023). Specifically, within metaverse spaces, chatbots can offer immersive communication as non-player characters (NPCs) (Hwang & Chein, 2022; Hwang et al., 2023a). Yet, there is no empirical research integrating chatbots into the metaverse for training pre-service EFL teachers. This study aims to address this gap by exploring how pre-service EFL teachers can design and integrate AI chatbots into the metaverse for their microteaching.

2.2 The emergence of the metaverse

In this section, we examined literature on virtual space and traced the trajectory of relevant terms, from the spread of the term virtual environment to the recent emergence of the metaverse, to provide a conceptual model of how the term metaverse has emerged (see Fig. 1). Until the late 2010s, what is now known as the metaverse was largely termed a virtual environment (VE) or virtual world (VW) (Cheng & Chen, 2016; Deutschmann & Panichi, 2009). From 2020, the metaverse began to be used interchangeably with these terms but gained more prominence in CALL (Chen, 2022; Jeon et al., 2022; Lee et al., 2023). We found that this evolving concept of VE now mirrors and even surpasses real-life possibilities, with AI and non-fungible tokens (NFT) adding new dimensions to it (Hwang, 2023). As the expanding circles of the model from VE 1.0 to metaVErse indicate, the metaverse builds on previous versions of VE, while also offering enhanced benefits such as greater design freedom, decentralized user roles, and a persistent real-life connection (Chen, 2022; Hwang & Chein, 2022; Hwang et al., 2023b). That is, the term metaVErse suggests an extension and transcendence of earlier VE versions with ‘meta’ signifying going beyond.

Fig. 1 Historical transformation of VEs

VE 1.0 emerged from the advent of Internet technology, allowing information sharing beyond physical barriers through tools, including browsers, email, and social networks. These platforms primarily offered a two-dimensional environment featuring text, images, and videos (Choudhury, 2014). VE 2.0 then emerged with more immersive elements such as VR simulations, AR, and 360-degree content, offering three-dimensional experiences where users interact with objects and NPCs (Cheng & Chen, 2016; Lan et al., 2018). While VE 2.0 occasionally focused on single-user experiences, it set the stage for more realistic and interactive immersion, paving the way for the next generation of the virtual environment. Building on VE 2.0, VE 3.0 emphasized social interactions facilitated within virtual spaces. Multi-user virtual environments (MUVE) and massively multiplayer online role-playing games (MMORPG) began to be used as key platforms, promoting communication and shared experiences among users (Kuznetcova et al., 2019). However, this era often had rigid, developer-defined environments without deep real-life connections (Hwang, 2023). The emerging metaverse, termed VE 4.0, presents a consistent virtual world closely mirroring daily activities, such as work, ownership, learning, creation, and functional economy, in a way that feels more connected to users’ real lives (Hwang & Chein, 2022; Zhang et al., 2022). This addresses the societal need that arose during the COVID-19 pandemic when the complexities of the physical world were constrained (Hwang et al., 2023b). In this sense, the metaverse is not merely a technological concept but a paradigm shift that encompasses social, cultural, economic, and educational potential (Hwang & Chein, 2022; Wu et al., 2023). In essence, each virtual environment evolution, as depicted in Fig. 1, shows the technological progression of VEs in tandem with sociocultural shifts.
The metaverse, which combines the actual and virtual worlds and connects people and technologies, is the peak of these advancements. Within this framework, experts spotlight the pivotal role of emerging technologies such as AI and NFT (Chen, 2022; Hwang, 2023) as precursors to the forthcoming VE iterations, potentially VE 5.0 and beyond (Hwang & Chein, 2022; Rospigliosi, 2023). For example, Hwang (2023) studied a metaverse exhibition featuring NFT-certified artwork and found that university students experienced heightened creative cognition, a sense of achievement, and positive virtual exhibition perceptions. In addition, researchers stress the potential of using various types of AI tools, such as AI chatbots and AI-integrated learning systems, in recent metaverse explorations (Hwang, 2023; Hwang & Chein, 2022; Wu et al., 2023).

2.3 The educational potentials of metaverse use in language education

Recent research in the CALL field has shown an increasing interest in using the metaverse for interactive and immersive L2 education (Hwang, 2023; Hwang et al., 2023b; Lee et al., 2023; Wu et al., 2023). First, the metaverse can help teachers create a synchronous online learning environment that enables L2 learners to engage with teachers and peers through avatars, creating a more realistic sense of social and cognitive presence (Hae et al., 2023; Hwang et al., 2023b; Lee et al., 2023; Sra & Pattanaik, 2023). According to Lee et al. (2023), interacting with others in the metaverse via avatars has a positive impact on students’ emotional and affective engagement in L2 learning, as it provides a stress-free environment where students can have confidence in language communication with others. Moreover, Hwang et al. (2023b) discovered that L2 learners experience a stronger sense of social presence when interacting in a 3D metaverse platform, which results from immediate feedback from teachers and virtual interaction with peers. This authentic online interaction in the metaverse promotes collaborative knowledge construction and dynamic group activities (Jeon et al., 2022; Lee & Wu, 2023; Lee et al., 2023), which is particularly advantageous compared to the limited interaction with peers and teachers in asynchronous distance learning or video conferencing platforms like Zoom or Webex (Li & Yu, 2022; Hwang et al., 2023a). Likewise, the synchronous collaboration and dynamic nature of the metaverse not only enhances cognitive learning opportunities but also establishes a foundation for building collaboration and community, which is crucial in developing affinity skills and facilitating language learning skills (Jeon et al., 2022; Lee & Hwang, 2022; Lee et al., 2023).

Second, previous research has shown that unlike static 2D online teaching tools, the metaverse provides teachers with various opportunities to invite L2 learners into enriching learner-content interactions through 3D-based multimodal experiences (Lee & Wu, 2023; Lee et al., 2023). This means that L2 learners are not passive recipients of language input, but active participants in a cycle of experience and reflection (Lee & Hwang, 2022; Wu et al., 2023). This immersive interaction in the metaverse not only echoes Dale’s (1969) cone of learning theory, advocating for experiential learning for deeper retention but also aligns with Lave and Wenger’s (1991) situated learning theory that highlights learning occurs when prior information is connected to authentic learning contexts and through interpersonal relationships. In the metaverse, L2 learners can navigate and negotiate meaning in authentic language use, reshaping traditional language learning frameworks. These context-rich and immersive experiences are crucial for intuitive and cognitive language acquisition (Hwang et al., 2023b; Lee et al., 2023; Lee & Wu, 2023).

In sum, the metaverse can provide the authenticity and relevance needed for social and cognitive language learning experiences (Lee & Wu, 2023), aligning in particular with the principles of interactionist theories of second language acquisition (SLA), which argue that effective language learning relies on two-way communication and mutual interaction in authentic contexts (Long, 1985; Pica, 1996). In this light, the metaverse encourages increased interaction between teachers, learners, and content, thereby supporting communication, functional language use, task-based activities, collaborative learning, and student-centered active learning, all of which are key aspects of communicative language teaching (CLT) (Hwang, 2023; Hwang et al., 2023b). Given these pedagogical advantages, it is important to expose pre-service EFL teachers to the use of the metaverse in their training programs. The metaverse offers a range of scenarios and activities that can help teachers explore ways to encourage and engage EFL learners (Jeon et al., 2022; Wu et al., 2023). The immersive and interactive environments of the metaverse also facilitate CLT by providing opportunities for authentic language learning contexts, synchronous communication, and collaborative learning (Lee & Wu, 2023). In addition, it can help teachers develop innovative and creative teaching methods, which positively impacts their readiness to design technology-enhanced learning environments (Lee & Hwang, 2022).

2.4 The current study

As discussed in preceding sections, the current body of research on AI chatbots and the metaverse has demonstrated that these technologies possess promising pedagogical potential for L2 instruction, aid in teachers’ professional development in teacher-training programs, and foster positive attitudes among educational stakeholders towards the integration of technology in L2 classrooms (e.g., Hew et al., 2023; Hwang & Chein, 2022; Ji et al., 2023; Kim et al., 2022; Yang & Chen, 2023). Despite these advancements, notable research gaps still remain to be addressed. First, while prior research has examined the educational benefits of AI chatbots and metaverse technologies separately, there is a paucity of empirical research on the combined potentials of these technologies in metaverse-based classes, particularly in the context of pre-service teacher training. Second, although diverse cognitive dimensions such as attitude, perceived usefulness, self-efficacy, engagement, and intention to use these technologies in L2 education have been explored (e.g., Hew et al., 2023; Lee & Hwang, 2022; Lee & Wu, 2023; Yang & Chen, 2023), there is a lack of focus on pre-service teachers’ attitudes towards the combined use of these two advanced technologies in a single, design-based teaching environment. This aspect is crucial as understanding different facets of technology acceptance is key to effectively integrating new technologies into L2 classrooms (Nishino, 2012; Yang & Chen, 2023). Therefore, it is crucial to conduct an empirical study that investigates the combined effects of the metaverse and AI on pre-service teachers’ designing experiences and attitudes.

Addressing these gaps, the current study focuses on how pre-service EFL teachers design AI chatbots for educational purposes and embed them in either traditional or metaverse classrooms. It further explores their perceptions regarding lesson planning and microteaching as part of training modules in these two different classroom environments. The study is guided by the following research questions (RQs).

  1. How did pre-service EFL teachers utilize AI chatbots for language teaching tasks in traditional and metaverse classrooms?

  2. What similarities and differences emerged in perceptions of lesson planning and teaching demonstrations between the two groups?

3 Method

3.1 Participants and context

This study was conducted at a comprehensive university in South Korea. A total of 55 pre-service EFL teachers voluntarily participated in this study (17 males and 38 females; aged 21–25). The participants were Year 2 or Year 3 students who completed basic courses in English education, such as English for Academic Purposes, English Communication Skills, and Second Language Acquisition. In the fall 2022 semester, they took a course named English Education with AI, which was designed to equip these student teachers to teach the English language to young learners with a focus on CALL. The students enrolled in this course were randomly assigned to two different classes to maximize their individual learning opportunities. As a result, 24 students were assigned to Class One, and the other 31 students were assigned to Class Two. To address the research questions in this study, these two groups of students were asked to carry out different technology-design projects. The first group was named the AI chatbot-metaverse group (CMG) and developed AI chatbots and metaverse spaces for teaching practices. They were instructed to develop task-specific chatbots and embed them within the metaverse learning environments. They then conducted microteaching sessions within these virtual classrooms. The second group, named the AI chatbot-only group (COG), was asked to build AI chatbots exclusively tailored for teaching demonstrations. Their microteaching was subsequently executed in traditional physical classroom settings.

3.2 Research procedure and design

Drawing on the existing literature on the professional development of pre-service English teachers with technology (Al-Furaih, 2017; Campbell et al., 2022; Crosthwaite et al., 2023; Jeon et al., 2022; Jeong, 2017; Lee & Hwang, 2022), the authors prepared an integrated research procedure for pre-service teacher training in an AI chatbot-metaverse development project (see Fig. 2). To compare the two groups in their design work and perceptions, we adopted a mixed-methods research approach. The study ran for 16 weeks and comprised five stages.

Fig. 2 Research procedure: AI chatbot-metaverse development project

During the first two weeks, both groups were introduced to L2 acquisition and English education, drawing from Brown and Lee (2015). Participants learned essential strategies for interactive English teaching, specifically through the CLT approach. As per CLT, they explored ways to enhance students’ communicative competence, emphasizing authentic communication environments, task contextualization, learner autonomy, and group interactions for meaningful communication (Brown & Lee, 2015; Nishino, 2012). In the second stage, from Week 3 to Week 4, participants were introduced to AI technology, focusing on CALL. They initially learned traditional lesson planning based on the instructor’s model plans. Then, they explored enhancing interactivity in lessons using ICT tools like AI chatbots, multimedia resources, and metaverse platforms (Crosthwaite et al., 2023). The third stage lasted for eight weeks from Week 5 to Week 12, during which both groups engaged in distinct technology projects. The COG group focused solely on designing AI chatbots, while the CMG group created both AI chatbots and metaverse spaces. Using Google Dialogflow, all participants individually crafted AI chatbots tailored for specific pedagogical tasks, such as warm-ups, exercises, and role-plays, drawing from topics in the 5th- and 6th-grade digital English textbooks (see Fig. 3).

Fig. 3 Digital English textbooks for chatbot and metaverse design projects

Dialogflow provided pre-service teachers with a suite of features, including intent recognition and entity extraction. As illustrated in Fig. 4, when tasked with the topic of ordering juice at a school cafeteria, the participants set up an intent for the learner objective, “Can I have tomato and grape juice?”. To accommodate learners’ diverse preferences for juice ordering, an entity (@Fruit) was defined to encompass different types of juice. In this case, the system response, coded with $Fruit parameters in “Good choice, $Fruit1 and $Fruit2 are our best menu in the menu. How many do you want?”, dynamically integrates the learner’s specific fruit choices.
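The intent, entity, and response pattern described above can be illustrated with a minimal plain-Python sketch. It does not call the Dialogflow API and is not the participants' agent: the entity values other than tomato and grape, the simplified response wording, and the function names `extract_fruits` and `respond` are assumptions made for illustration only.

```python
import re
from string import Template
from typing import List, Optional

# Hypothetical values for the @Fruit entity; the source names only
# tomato and grape as examples.
FRUIT_ENTITY = {"tomato", "grape", "orange", "apple"}

# Simplified stand-in for the response prompt; $Fruit1 and $Fruit2 are
# placeholders filled with the learner's own choices.
RESPONSE = Template("Good choice, $Fruit1 and $Fruit2. How many do you want?")


def extract_fruits(utterance: str) -> List[str]:
    """Return entity values in the order they appear in the utterance."""
    words = re.findall(r"[a-z]+", utterance.lower())
    return [word for word in words if word in FRUIT_ENTITY]


def respond(utterance: str) -> Optional[str]:
    """Fill the response template when the juice-order intent is matched."""
    fruits = extract_fruits(utterance)
    if "juice" not in utterance.lower() or len(fruits) < 2:
        return None  # no match; a real agent would trigger a fallback intent
    return RESPONSE.substitute(Fruit1=fruits[0], Fruit2=fruits[1])


print(respond("Can I have tomato and grape juice?"))
```

Keyword matching stands in here for Dialogflow's trained intent recognition, so the sketch shows only how extracted entity values flow into the response text.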

Fig. 4 Dialogflow function map

Using Dialogflow, both groups were tasked with creating AI chatbots tailored to content stories found in the digital English textbooks (see Fig. 3). Initially, participants designed dialogue scenarios aimed at specific L2 learning tasks derived from their selected textbook content. Next, they developed AI chatbots based on these scenarios to implement the L2 tasks within their classes. They then integrated these customized AI chatbots into their lesson plans, focusing on their application in teaching demonstrations. In doing so, each group devised its own strategies for deploying chatbots in its teaching context, aligning the chatbots with the lesson content and teaching methodologies.

The COG devised strategies to incorporate AI chatbots in traditional classroom environments. They were encouraged to create a variety of classroom activities, teaching aids, and worksheets designed to enhance student interaction with the language through the chatbots. In comparison, the CMG focused on employing their chatbots within virtual environments, thereby constructing a metaverse space that mirrored the contextual and situational backdrop for chatbot interactions. Using the SPOT Virtual program, a 3D-based metaverse platform, CMG created virtual spaces tailored to AI chatbot topics, adding 3D objects from the system or online sources (see Fig. 5). The participants adjusted floor plans, decor, and more to fit their AI chatbot themes within the virtual environments.

Fig. 5 SPOT Virtual customization function

During the subsequent three-week stage, both groups conducted teaching demonstrations, based on their lesson plans integrated with their custom-developed chatbots. The CMG group carried out their teaching demonstrations in metaverse classrooms, structuring their lessons around three key phases: Introduction, Development, and Closure. Initially, they showcased the metaverse classroom spaces they had created and outlined their lesson plans. Their virtual spaces were noted for being enriched with AI chatbots, designed to foster immersive and interactive learning experiences. In comparison, the COG group conducted their teaching in traditional classroom settings, employing AI chatbots in conjunction with a variety of teaching aids and materials. These included worksheets, PowerPoint presentations, multimedia resources, and mobile devices such as phones and tablet PCs, to facilitate dynamic L2 interactions in a physical classroom environment. This approach required the COG participants to find innovative ways to seamlessly integrate chatbots into standard lesson plans.

The objective of both groups’ teaching demonstrations was to actively involve their peers in AI chatbot-based activities, striving for task authenticity within their respective teaching contexts (Crosthwaite et al., 2023; Jeon et al., 2022; Nishino, 2012). Following the teaching demonstrations, feedback was shared among the peer teacher candidates to facilitate a collaborative learning and evaluation process.

3.3 Data collection and analysis

In line with the mixed-methods research approach adopted in this study (Dörnyei, 2007), we used multiple data sources for the triangulation of collected data (see Fig. 6). To address RQ 1, which focuses on the two groups’ uses of AI chatbots either in the metaverse space or the traditional classroom, we utilized a qualitative approach to collect data regarding the participants’ design projects and their use of AI chatbots for teaching practices. Specifically, we collected their chatbot and metaverse design works and their teaching-demonstration videos (n = 55, a total of 832.7 min). The data collection was conducted during the third and fourth stages from Week 5 to Week 15 (see Fig. 2).

Fig. 6 Data collection and analysis

For RQ 2, which examines participants’ perceptions of using AI chatbots for teaching demonstrations, we used survey questionnaires and reflection papers. The questionnaire was developed based on the Technology Acceptance Model (TAM), a well-established framework for evaluating the perception and acceptance of a new technology (Davis, 1989; Venkatesh & Davis, 2000). As discussed in the literature review, core components of TAM, such as participants’ perceived attitudes, self-efficacy, usefulness, and intention to use, have been used by foregoing studies that explore educational stakeholders’ attitudes toward the integration of technology into classrooms (e.g., Hew et al., 2023; Lee & Hwang, 2022; Lee & Wu, 2023; Nishino, 2012; Yang & Chen, 2023). We expected the TAM framework to aid in comprehensively investigating how pre-service EFL teachers perceive, respond to, and plan to use AI chatbots and the metaverse in facilitating L2 learning tasks from the CLT perspective. Then, the questionnaire included additional constructs such as social image, immersion, and engagement to reflect core affordances of the metaverse platform (Hae et al., 2023; Lee et al., 2023; Sra & Pattanaik, 2023). Specifically, these additional constructs may help us gain deeper insights into the external factors that may influence perceptions towards the use of the metaverse.

The survey questionnaire was composed of two sections: (1) demographic information and (2) perception. The second section comprised seven constructs: attitude (Barrette, 2015; Venkatesh & Davis, 2000), perceived technology self-efficacy (Barrette, 2015; Gurer, 2021), perceived usefulness for English language teaching (Nishino, 2012; Teo, 2015), social image (Moore & Benbasat, 2001; Venkatesh & Davis, 2000), immersion (Jennett et al., 2008), engagement (Reeve & Tseng, 2011), and use intention (Barrette, 2015; Gurer, 2021). A seven-point Likert scale was used to measure the constructs. Appendix 1 provides the mean, SD, and Cronbach’s α values of each survey construct and item; the Cronbach’s α values of all constructs exceed 0.7, the recommended threshold of reliability.
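For readers unfamiliar with the reliability check mentioned above, the following sketch computes Cronbach's α from item-level scores using the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of total scores). The response data are invented for illustration and are not the study's data; the function name `cronbach_alpha` is likewise an assumption.

```python
from statistics import variance


def cronbach_alpha(items):
    """Compute Cronbach's alpha.

    items: one list of scores per questionnaire item, each list holding
    the same respondents in the same order.
    """
    k = len(items)
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    # Sum of the sample variances of the individual items.
    item_variance_sum = sum(variance(scores) for scores in items)
    # Each respondent's total score across all items.
    totals = [sum(per_respondent) for per_respondent in zip(*items)]
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))


# Hypothetical seven-point Likert responses: three items, five respondents.
example_items = [
    [5, 6, 7, 4, 6],
    [4, 6, 7, 5, 6],
    [5, 7, 6, 4, 7],
]
print(round(cronbach_alpha(example_items), 3))
```

A construct would meet the threshold cited in the text when the returned value exceeds 0.7.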

We also collected participants’ reflection papers (n = 46, a total of 5,665 words) although nine students failed to submit their papers at the end of the semester. The reflection papers focused on the participants’ perceptions about the use of AI chatbots either in the metaverse or in the traditional classroom from the affective aspects. Sample questions include “How can you describe your feelings and your experience regarding AI chatbot development?” and “What are the benefits and limitations of using AI chatbots (and the metaverse) for English education?”.

Data analysis, using the data collected in a mixed-methods manner, was conducted in two different directions to answer two RQs (see Fig. 6). For RQ 1, we adopted a multimodal content analysis to analyze participants’ chatbot designs and their application in language learning tasks during teaching demonstrations (Serafini & Reid, 2019). Our analysis was guided by the CLT approach, as outlined by Brown and Lee (2015) and Nishino (2012): our initial steps involved examining video content to grasp the participants’ chatbot designs drawing on CLT. Then, we extracted still images from these videos that represented a CLT perspective. This visual data was systematically analyzed and classified according to a coding scheme developed on key CLT principles, including authentic environments, task contextualization, learner autonomy, and group interactions that foster meaningful communication (Appendix 2). Following this step, we engaged in reflective, iterative coding in line with this coding scheme. For data representation, we incorporated still images from video clips complemented by pertinent extracts from teaching demonstrations. Discourse analysis was also adopted for some conversations, with the transcription conventions detailed in Appendix 3.

To address RQ 2, which explores the similarities and differences between the two groups in their perceptions of using AI chatbots for lesson design and teaching demonstrations, we first analyzed the survey data using descriptive statistics and independent-samples t-tests. We calculated the mean and standard deviation (SD) of each construct and then ran t-tests to compare the two groups’ perceptions. Next, the reflection papers were analyzed through sentiment analysis, using Orange 3, an open-source data-mining package, to explore the participants’ affective attitudes toward the use of AI chatbots in their teaching contexts. We used the VADER and SentiArt modules to investigate their emotions, including positive and negative sentiment, anger, and fear, among others. In addition, we conducted co-occurrence network analysis with KH Coder to discover potential relationships among keywords in terms of their meaning-making in the reflection papers.
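To make the idea of a co-occurrence network concrete, the sketch below links two words whenever they appear in the same reflection and weights each link with the Jaccard coefficient, a common similarity measure for such networks. This is a simplified illustration rather than the procedure KH Coder applies internally; the measure, the threshold, the function name, and the sample sentences are all assumptions.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(documents, min_jaccard=0.2):
    """Return {(word_a, word_b): jaccard} for word pairs sharing documents."""
    # Treat each document as a set of lowercase whitespace-separated tokens.
    doc_sets = [set(doc.lower().split()) for doc in documents]
    doc_freq = Counter(word for words in doc_sets for word in words)
    pair_freq = Counter()
    for words in doc_sets:
        for a, b in combinations(sorted(words), 2):
            pair_freq[(a, b)] += 1
    edges = {}
    for (a, b), both in pair_freq.items():
        # Jaccard = documents containing both / documents containing either.
        jaccard = both / (doc_freq[a] + doc_freq[b] - both)
        if jaccard >= min_jaccard:
            edges[(a, b)] = jaccard
    return edges


# Hypothetical, already-cleaned keyword strings (not the study's data).
reflections = [
    "chatbot feedback correct",
    "chatbot feedback",
    "chatbot motivation",
]
for pair, weight in sorted(cooccurrence_edges(reflections, 0.5).items()):
    print(pair, round(weight, 2))
```

In practice, a tool such as KH Coder also handles tokenization, stop-word removal, and community detection, none of which this sketch attempts.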

4 Findings

4.1 AI chatbot and metaverse classroom design

In this section, we detail the findings from a multimodal content analysis examining pre-service teachers’ design and use of AI chatbots in traditional and metaverse classroom settings. While both participant groups designed AI chatbots for language learning tasks, the environments in which these chatbots were employed during teaching demonstrations varied. Thus, this section offers a comparative analysis of student projects across these two different classroom contexts.

4.1.1 Authentic communication environments

In addressing the content from the “I have a cold” unit in the 5th-grade English textbook, some participants from both groups designed chatbots to facilitate interactions between students and a chatbot agent. As shown in Fig. 7, one participant from COG conducted her teaching demonstration using her customized chatbot. This presentation, conducted in a physical classroom, encouraged a student to engage with the chatbot via a mobile device. However, despite the interaction transcribed in Fig. 7 (B), this physical setting did not afford the student an immersive experience pertinent to the language task. In the absence of such an authentic environment, students simulated interactions with a chatbot that adopted the role of a nurse, as if they were situated in a nurse’s office.

Fig. 7 COG’s chatbot use in a traditional classroom

In contrast, one CMG participant developed a chatbot to facilitate conversational practice with a school nurse. Figure 8 (A) showcases a realistic school infirmary setting where students can interact with peers and AI chatbots. To enrich the chatbot experience, a metaverse nurse’s office was designed, immersing students in the infirmary ambiance. Figure 8 (B) further details a conversational setup with the nurse chatbot, enumerating various intents (e.g., welcome, grade, symptoms) and entities (@grade, @person, @disease: cold, fever) suitable for language activities in a nurse’s context. The creator of this space elucidated the chatbot interaction process:

figure a

Fig. 8 CMG’s metaverse environment for a school infirmary

4.1.2 Task contextualization

Task contextualization is a pivotal principle of CLT, which was emphasized in the lectures attended by participants. Accordingly, they were prompted to integrate this concept when designing and using their chatbots. Most participants from COG, who conducted their teaching demonstrations in a traditional classroom, employed presentation screens for language activities, as described in Fig. 9 (A). A subsequent chatbot conversational task is presented within this presentation context, as illustrated in Fig. 9 (B):

figure b

Fig. 9 COG’s use of PPT for language tasks with chatbots

In contrast to the COG, the CMG developed an “Ordering Kiosk” chatbot to help students learn English expressions for ordering food and drinks, as outlined in the 6th-grade English textbook. Participant 8 from the CMG incorporated this kiosk into a metaverse café, aiming to simulate a real-life ordering scenario in virtual spaces. As depicted in Fig. 10 (A), he introduced the café setting and gave instructions on ordering from the displayed menu. Figure 10 (B) highlights a conversation with the AI chatbot. Regarding this activity, the participant explained:

figure c

Fig. 10 CMG’s metaverse café with a kiosk chatbot

4.1.3 Collaborative learning

The last theme is pertinent to the learning mode: individual versus collaborative learning (Brown & Lee, 2015; Nishino, 2012). Observations indicated that COG participants predominantly emphasized individual interactions with chatbots in a traditional classroom setting. Figure 11 shows one pre-service teacher from COG guiding her students to engage with individual chatbots for a camping task, demonstrating the functionality of a chatbot through a presentation:

figure d

Fig. 11 COG’s individual interactions with their chatbots

The learning mode utilized in the CMG’s chatbot integration into the metaverse was quite different. As seen in Fig. 12, one participant designed a chatbot around two camping scenarios from the 6th-grade English unit titled “We are going camping.” In this metaverse space, students were immersed in a realistic camping environment, equipped with interactive 3D objects. They were given a choice: a spot by a pond (Fig. 12 A) or a summer camp setting (Fig. 12 B). During her demonstration, students were asked to use the chatbot collaboratively, determining which environment to explore by navigating a hallway connecting both scenarios. Choosing the water festival took them to the pond-side environment, while the cooking festival directed them to a culinary-themed space. As depicted in Fig. 12 (B), a group of students engaged with the chatbot in a shared virtual camping setting, exemplifying collaborative learning:

figure e

Fig. 12 CMG’s collaborative interactions with chatbots for camping activities

4.2 Survey results

The analysis of the survey data showed that the participants expressed relatively positive perceptions across the seven constructs (see Table 1). The highest scores were given to engagement (M = 5.93, SD = 1.06), indicating that the participants were highly invested in using AI chatbots for teaching practices. In addition, they showed positive perceptions regarding perceived usefulness (M = 5.90, SD = 0.96), social image (M = 5.88, SD = 1.14), and attitude (M = 5.86, SD = 1.14). In contrast, their perceptions of immersion were neutral to positive (M = 4.81, SD = 1.80). Overall, these results indicate that their experiences of using AI chatbots, either in the metaverse or in the traditional classroom, were on average positive.

Table 1 Perceptions regarding the use of AI chatbots

The results showed significant differences between the two groups’ perceptions. For example, the CMG’s mean attitude score toward using AI chatbots (M = 6.46, SD = 0.68) was significantly higher than that of the COG (M = 5.42, SD = 1.21), t(53) = 4.02, p < 0.01, d = 1.06. The CMG also showed significantly more positive perceptions on the other six constructs, with t values ranging from 3.67 for immersion to 2.30 for engagement, although the effect sizes for usefulness (0.68), intention to use (0.63), and engagement (0.61) were moderate rather than large. Overall, the results suggest that the CMG had significantly more positive views on chatbot design and use in metaverse classrooms than the COG had in traditional classrooms.
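For readers who wish to retrace the effect size, the following sketch computes Cohen’s d and the pooled-variance t statistic from summary statistics. The group sizes are not restated in this section, so equal-sized groups are used purely as a placeholder assumption; the function names are likewise illustrative.

```python
from math import sqrt


def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation of two independent groups."""
    return sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))


def cohens_d(m1, sd1, n1, m2, sd2, n2):
    """Standardized mean difference based on the pooled SD."""
    return (m1 - m2) / pooled_sd(sd1, n1, sd2, n2)


def t_statistic(m1, sd1, n1, m2, sd2, n2):
    """Independent-samples t (equal variances assumed), df = n1 + n2 - 2."""
    standard_error = pooled_sd(sd1, n1, sd2, n2) * sqrt(1 / n1 + 1 / n2)
    return (m1 - m2) / standard_error


# Attitude scores reported above; n = 20 per group is a placeholder
# assumption (with equal groups, d does not depend on the value of n).
d = cohens_d(6.46, 0.68, 20, 5.42, 1.21, 20)
print(f"d = {d:.2f}")
```

Under the equal-groups assumption, the result rounds to the reported d = 1.06. The t value is not reproduced here because, unlike d, it depends directly on the actual group sizes.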

4.3 Sentiment and keyword analysis results

To better understand the emotional attitudes of the participants towards their experiences of AI chatbot design, we employed two sentiment analysis methods, VADER and SentiArt. First, VADER enabled the evaluation of sentiments as positive (+ value), negative (- value), or neutral (0 value). The results revealed that the pre-service teachers had generally positive experiences in creating AI chatbots, as indicated by the compound scores, which sum the valence ratings of the lexical items in a text and normalize the result to a range from -1 to +1 (see Table 2). Both groups had a positive compound score, with the mean score of CMG (M = 0.70, SD = 0.41) being higher than that of COG (M = 0.52, SD = 0.62), which suggests a greater overall positive sentiment in CMG. Additionally, while the neutral sentiment scores (M = 0.78 [COG], M = 0.84 [CMG]) were higher than the positive sentiment scores (M = 0.16 [COG], M = 0.13 [CMG]) in both groups, negative sentiment received the lowest scores (M = 0.06 [COG], M = 0.03 [CMG]). These findings suggest that the participants’ engagement with AI chatbot design (and the metaverse) was generally positive.

Table 2 Results of sentiment analysis
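As background for reading the VADER results, the compound score in VADER’s reference implementation is obtained by summing the rule-adjusted valence ratings of the words in a text and squashing that sum into the interval (-1, +1). The sketch below reproduces only this normalization step; the function name is hypothetical, and the constant of 15 is the default in the reference implementation as far as we are aware, so it should be treated as an assumption here.

```python
from math import sqrt


def vader_compound(valence_sum, alpha=15.0):
    """Squash a summed valence rating into (-1, +1).

    valence_sum: sum of the rule-adjusted valence ratings of a text.
    alpha: normalization constant (assumed default of 15).
    """
    return valence_sum / sqrt(valence_sum**2 + alpha)


# Hypothetical summed valences for three texts of growing positivity.
for total in (0.0, 2.0, 6.0):
    print(total, round(vader_compound(total), 3))
```

Because of this squashing, a text with several strongly positive words quickly approaches +1, so group means are best compared alongside their standard deviations.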

Second, the SentiArt analysis allowed for the representation of emotional intensity patterns, including anger, fear, disgust, happiness, sadness, and surprise, towards the experience of creating and using AI chatbots. The findings demonstrated that scores associated with positive emotions, such as happiness (M = 0.79 [COG], M = 0.63 [CMG]), and surprise (M = 0.80 [COG], M = 0.62 [CMG]), were significantly greater than those of negative emotions such as anger (M = 0.04 [COG], M = -0.07 [CMG]), and disgust (M = 0.33 [COG], M = 0.19 [CMG]). All the numerical values, except for fear (M = 0.54 [COG], M = 0.57 [CMG]), were slightly higher in COG compared to CMG.

While the first question addressed participants’ experiences with AI chatbot design, the second question explored their teaching demonstration experiences in physical (COG) and virtual (CMG) classrooms (Fig. 13). Keywords from their responses were analyzed using the co-occurrence network analysis in KH Coder 3.0.

Fig. 13 Co-occurrence of words of COG vs. CMG

The COG’s attitudes towards AI chatbot use in teaching demonstrations were evident in Networks 1, 2, and 3 of word co-occurrence. In Network 1, the prevalence of terms such as “teacher,” “increase,” “learner,” “interest,” and “motivation” indicates that AI chatbots can effectively motivate language learners. Participants often highlighted this benefit in their reflections as below:

figure f

Network 2 highlighted the benefits of AI chatbots in providing corrective feedback, evident from co-occurring words like “receive,” “provide,” “correct,” and “feedback.” Meanwhile, Network 3, featuring terms like “communication,” “confidence,” “mistake,” and “fear,” suggests that AI chatbots can reduce the anxiety related to English language use, thereby enhancing communication confidence. One participant’s excerpt below further supports these insights:

figure g

In contrast, the networks in Fig. 13 detail word co-occurrence from CMG’s metaverse teaching experiences with AI chatbots. Unlike COG’s emphasis on student learning via chatbots only, CMG’s feedback highlights spatial and contextual experiences associated with avatar movement. For instance, Network 4 reveals participants appreciated the ability to navigate the metaverse, engaging with learning materials offered by AI chatbots, as highlighted by terms such as “explore,” “move,” “familiar,” “theme,” and “progress.”

figure h

Furthermore, Network 5 reveals that CMG participants were highly interested in designing and sharing their learning spaces in the metaverse, as evidenced by words such as “experience,” “platform,” “own,” and “design.” Networks 6 (place, activity), 7 (practice, room), and 8 (fascinating, attend) emphasize the enriched teaching experience derived from merging AI chatbots with metaverse spaces, fostering direct interaction with the chatbot content. Participant reflections below further validate these observations.

figure i
figure j

Finally, the participants also occasionally voiced frustration (“regret” in Network 4) over the metaverse platform’s technical constraints, which limited their ability to add more 3D objects and enhance the learning environment:

figure k

5 Discussion

This study examined how pre-service English teachers designed and utilized AI chatbots for lesson planning and teaching demonstrations in metaverse and traditional classroom settings. While both groups developed interactive language learning chatbots, their applications varied according to the teaching environment. We also probed the participants’ perceptions of using AI chatbots for these purposes. Data analysis from student works, surveys, and reflections revealed similarities and differences between the two groups, CMG and COG.

RQ 1 explored how AI chatbots were used for language learning by pre-service EFL teachers in both traditional and metaverse classroom settings. This investigation is crucial to understanding the design and utility of innovative educational environments by pre-service teachers, as it may display their enhanced awareness of technology integration in their teaching practices through the outcomes of actual chatbot integrations, such as chatbot tasks, lesson plans, and teaching demonstrations (Al-Furaih, 2017; Crosthwaite et al., 2023; Jeon et al., 2022). While there were differences in how the CLT approach was implemented in terms of communication environments, task contextualization, and collaboration between the two groups, it is important to recognize the effectiveness of utilizing AI chatbots in both teaching contexts. Specifically, in CMG, participants created 3D virtual spaces, customizing these environments to specific topics addressed by AI chatbots. They used strategies to maximize learner engagement and interaction within these contexts, with AI chatbots serving as interactive agents that guide, facilitate, and participate in scenarios. This approach could lead to more immersive language learning experiences in a context-rich environment, as previous research supports the potential of the metaverse in constructing authentic learning environments (Cheng & Chen, 2016; Lan et al., 2018; Lee et al., 2023). The potential function of the metaverse platform in simulating real-world interactions presented a rich context for applying the CLT approach in a more dynamic and engaging manner (Lee & Wu, 2023; Wu et al., 2023). For COG, the integration of AI chatbots, while more traditional and less immersive compared to CMG, still played a significant role in creating an interactive learning environment through immediate feedback and diverse linguistic interactions. Though contextually less rich than CMG, this mode of learning can be effective in reinforcing language skills and supporting the CLT approach in a controlled and structured teaching environment (Hew et al., 2023; Ji et al., 2023). These findings shed light on the diverse potential contexts in which AI chatbots can be used for language education, thereby offering valuable insights on how to integrate this technology into different educational settings.

When designing language teaching tasks, teachers can take advantage of the unique benefits of each environment—creating immersive, real-life scenarios in the metaverse and using AI chatbots as supportive tools for language practice and reinforcement in traditional classrooms. This study emphasizes the importance of tailoring language teaching strategies and design principles to fit the specific characteristics of each educational setting. By unpacking the experiences and attitudes of pre-service teachers, our research provides empirical evidence on how AI chatbots and the metaverse can effectively support the CLT approach. This approach caters to diverse learning needs and fosters more engaging and effective language learning experiences for potential L2 students (Lan et al., 2018; Nishino, 2012; Wu et al., 2023). For example, one of the pre-service teachers in CMG stated in his reflection, “I believe that creating an AI chatbot alone is beneficial for providing students with English tasks aligned with the CLT perspective. However, integrating it into the metaverse space amplified the effectiveness of the AI chatbot.” Therefore, the outcome of this research corroborates those of previous research that argued for the affordances of design-based activities in enhancing pre-service teachers’ practical knowledge of integrating specific technology tools into their teaching practices based on their pedagogical beliefs (Crosthwaite et al., 2023; Jeon et al., 2022; Yang & Chen, 2023).

RQ 2 addressed the similarities and differences in their perceptions of using AI chatbots for lesson design and teaching demonstrations between the CMG and the COG. First, the survey results showed that the participants in CMG, who implemented the project in the metaverse spaces, had significantly more positive perceptions of designing and utilizing AI chatbots for lesson design and teaching practices. Specifically, the differences were particularly large in affective attitude, immersion, social image, and technology self-efficacy, while the gaps in their perceptions were relatively smaller (but still statistically significant) in terms of perceived usefulness, intention to use, and engagement. The results indicate that the participants who used AI chatbots in combination with metaverse technology felt significantly more satisfied, immersed, self-fulfilled, and technologically self-efficacious than those who did the same project in traditional classroom contexts. Indeed, previous studies on the use of virtual environments (VE) confirm their pedagogical benefits in enhancing L2 learners’ immersive and engaging experiences (Hwang et al., 2023b; Huang et al., 2019; Lan et al., 2018), positive attitudes (Rama et al., 2012; Wang et al., 2020), self-efficacy in language learning (Cheng & Chen, 2016; Wang et al., 2020), perceived usefulness (Hwang et al., 2023b; Lan et al., 2018), and behavioral intentions to use them (Huang et al., 2019). However, teachers’ perceptions of language teaching activities in virtual environments remain less explored, not to mention pre-service EFL teachers’ views (Jeon et al., 2022; Lan et al., 2018). Therefore, this study has provided new findings about pre-service EFL teachers’ positive experiences of designing and using AI chatbots in metaverse virtual classrooms for their teaching practices (Lee & Hwang, 2022).

RQ 2 also addressed participants’ reflections on AI chatbot utilization and metaverse interaction. Emotion analysis showed positive attitudes toward AI chatbot design across both groups. Positive feelings such as happiness and surprise were dominant, with negative sentiments, including anger and disgust, being minimal. The slightly higher VADER compound score in CMG indicates that the participants found value in AI chatbot design within metaverse spaces, underscoring the potential of integrating these technologies to revolutionize educational approaches (Ji et al., 2023; Kim & Lee, 2022). Analyzing the reflections further revealed insights into participants’ teaching demonstrations in both physical (COG) and virtual (CMG) contexts. Co-occurrence network analysis for COG identified prevalent themes such as motivation, learner interest, and corrective feedback. Moreover, AI chatbots in traditional classrooms were recognized for enhancing task-based learning, supporting communication, and mitigating language apprehensions, especially the fear of errors. These observations align with previous research, suggesting the efficacy of AI chatbots in enhancing task-oriented language learning and amplifying human–computer interactions (Hew et al., 2023; Ji et al., 2023). In contrast, CMG themes focused on active metaverse engagement through avatars. Prior studies indicate that students view avatars as their extensions, even if they do not represent physical movement (Cheng & Chen, 2016; Jeon et al., 2022). Another theme emphasized the unique teaching prospects the metaverse offers by ensuring contextual and immersive learning experiences (Hwang et al., 2023a; Lan et al., 2018). Although several technical constraints, such as restricted 3D object integration, were identified as a concern, a more holistic approach, integrating diverse emerging AI technologies, might be important for refining metaverse-based education (Hwang & Chein, 2022; Hwang et al., 2023a, b).

6 Implications

Important implications for the field of CALL can be derived from this study for researchers, teacher educators, and teachers who may want to explore AI chatbots and the metaverse. First, for researchers, this study may serve as an initial reference by providing preliminary results on the combined use of AI chatbots and the metaverse, which future research can build on to examine this use from students’ perspectives. We have found that the combined approach to the technologies introduced three benefits, namely authentic communication environments, task contextualization, and collaborative learning, while also facilitating positive perceptions of pre-service teachers about the training programs. As we found affordances that AI chatbots and the metaverse synergistically provide to technology-based teacher education, exploring how this use leads to students’ measurable learning outcomes and how students actually perceive this new environment would be a valuable complement to the current study.

Second, for teacher educators, this study expands the horizon of technology use for teacher education programs. That is, by showing how combining different technologies can synergistically support teachers’ pedagogical goals, this study takes a step further from previous teacher education programs that focused on a single technology and how it can help teachers achieve pedagogical goals (e.g., Crosthwaite et al., 2023). In line with the results from this study, we recommend that teacher educators consider different combinations of technologies to better support teachers’ pedagogy, rather than being limited to a single technology. As shown in the current study, where AI chatbot and metaverse technologies were examined, the actual benefits and challenges that different combinations of technology introduce are an empirical question that deserves future investigation. On this note, one of the practical challenges observed during the current study is that the technologies we chose for teaching design, Dialogflow and Spot, required a certain degree of technical expertise, which made it difficult for teachers to effectively integrate the two technologies. However, recent advancements in generative AI have simplified the process of developing LLM-based AI chatbots (e.g., My GPT, PoeAI, GetGPT), as well as creating 3D content and metaverse environments. In recent years, there has been significant attention in L2 research on the ease with which educational programs can be created using only natural language prompts (Lee et al., 2024b).

Last, in terms of instructional design principles, this study provides useful model works that can be referenced when teachers attempt to integrate chatbots and/or metaverse technology. For example, if teachers are unable to personally interact with all students in a traditional teaching environment, they can create an AI chatbot to offer personalized interaction and feedback to students (Lee et al., 2024b; Yang & Chen, 2023). Chatbots can also be incorporated to simulate a variety of English learning environments when teachers are unable to provide comprehensible English input to all students (Hew et al., 2023; Huang et al., 2022). Furthermore, when a learning task requires students to engage in conversations in real-world settings, teachers can integrate chatbots into metaverse spaces that simulate the language use context (cf. Wu et al., 2023). For example, if the task involves practicing check-ins at an airport, teachers can design an AI chatbot in the form of a kiosk and integrate it into a metaverse space that resembles a real airport environment. In addition to text-based chatbots, task-based language teaching can be more authentic and provide human-like interaction by integrating chatbots with AI-based conversational agents (Kim et al., 2022; Lee & Wu, 2023). These agents can assume roles similar to teachers and colleagues within virtual spaces, potentially taking the form of NPCs. In fact, some metaverse platforms (e.g., Engage, Virtual Speech) have already started integrating ChatGPT-powered AI chatbots into NPCs (Lv, 2023). Building on the current study, it is thus recommended that teacher educators explore these emerging types of chatbots and metaverse technologies to introduce synergetic benefits in a more efficient and effective manner (Jeon et al., 2022).

7 Conclusion

This study explored how pre-service English teachers designed and used AI chatbots for teaching practices by comparing their experiences across the metaverse and traditional classroom contexts. The outcomes of their chatbot-design projects showed that the metaverse space provided them with more immersive and interactive experiences than the traditional classroom settings. The participants expressed more positive perceptions of the metaverse-based activities than of the classroom-based ones. Based on these findings, we suggested some practical implications for integrating technology-enhanced design projects into teacher education courses for pre-service teachers’ professional development.

Despite these contributions, we acknowledge some limitations. First, a methodological limitation needs to be noted: the findings of this study cannot be broadly generalized to other research contexts because they were obtained with a mixed-methods approach that is mostly descriptive. Therefore, further research is still necessary to verify its conclusions. Second, given that this study concentrated on pre-service teachers’ experiences and perceptions of technology design and teaching practices, it did not comprehensively show how these technologies could improve L2 students’ learning experiences in both physical and virtual classroom settings. Future research could build on the findings of this study by exploring diverse aspects of technology-enhanced L2 learning, including its impact on learning engagement and motivation and the development of L2 productive skills in virtual environments. Third, although this study highlighted the engaging and motivating use of AI chatbots and the metaverse in L2 education, the potential novelty effect of educational technologies needs to be noted in future research (Hew et al., 2023; Huang et al., 2022). This aspect of technology-enhanced L2 education should be considered in both teaching and learning contexts.

To conclude, this research took a significant step forward in both chatbot and metaverse research by demonstrating how the combined use of these technologies in a pre-service training course can maximize the educational potential that each technology offers. This work may serve as a starting point for teacher educators and researchers who might wish to explore different combinations of emerging technologies, particularly AI chatbots in the metaverse space, to support not only pre-service teachers but also in-service teachers.