
1.1 Introduction

Learning a foreign language (FL), or more than one, in addition to one’s native language has always been an important educational issue, especially in the age of globalization. Most countries view their citizens’ ability to use multiple languages as one of the critical ways to increase their competitiveness. Regarding FL instruction, how to help learners acquire linguistic knowledge was the most prominent concern in the early days. However, pragmatic competence has become the core of FL education in recent years (Council of Europe, 2001; Lan et al., 2016) due to the influence of socioeconomic factors of the global market and technological development (Warschauer, 2000).

As mentioned above, the advancement of modern technologies plays an important role in the transformation of human-to-human interaction; consequently, there has been a huge revolution in reading and writing (Eshet-Alkalai, 2004). Traditional paper-based reading and writing are also gradually being replaced by diverse digital editing tools. For example, email, social networks, and other online networks seem to be the most popular means of connecting people in the twenty-first century. As a result, the approach to learning or teaching an FL is also changing due to the advantages and use of technologies (Arnó, 2012; Lan, 2020).

The changes in the approaches to FL learning and teaching mentioned above are also influenced by the transition of learning theories. A look back at the development of FL teaching and learning shows that the research focus has been shifting (Grosse & Voght, 2012): from de-contextualization to contextualization, from behaviorism to social constructivism, and from a linguistic skill-centered to a pragmatic orientation. As a result, researchers investigating the process of second language acquisition no longer focus only on linguistic skills but also consider the factors in real contexts that have the potential to influence the FL learning outcome (Godwin-Jones, 2018).

As mentioned above, language learning and teaching nowadays are contextualization-oriented. Furthermore, according to Lan (2014), the essential components of successful FL learning are (1) learner’s active involvement, (2) authentic contexts, and (3) meaningful and social interaction. In other words, only if a learner is actively involved in meaningful and social interactions in authentic contexts will successful language acquisition happen (Lan, 2020). However, creating the authentic contexts and social occasions needed in an FL educational setting to encourage learners to actively get involved in meaningful interactions is always a challenge for FL teachers and researchers (Lan, 2015). Given the specific features of virtual reality (VR), namely creation, immersion, and interaction, together with a platform to create authentic contexts and the capabilities to support social interaction, VR is a potential solution to the abovementioned obstacles in FL settings (Lan, 2020). However, without support and guidance from solid theoretical foundations, VR could be only a fancy “toy” for FL learners for a short period, and it could fail to enhance FL learning.

This chapter, therefore, focuses on bridging the theoretical foundations of FL learning in VR and their empirical practices. The following sections describe the theoretical foundations of FL learning, especially focusing on the perspectives of contextualized learning, the bridge between theories and practices of FL in VR, theory-driven studies of VR for FL learning, and conclusions.

1.2 Contextualized Foreign Language Learning

Contextualized learning is rooted in Vygotsky’s (1978) sociocultural theory. Sociocultural theory emphasizes the effect of the interaction of interpersonal, cultural-historical, and individual factors on learning and development (Tudge & Scrimsher, 2003). The cultural-historical factor mentioned above further reveals that learning and development cannot be isolated from the contexts. Therefore, a context should include the environment, the objects, and the individuals, and the interaction among the three elements is the essential part of the theory. Although sociocultural theory is also a kind of constructivism, it is different from Piaget’s theory. It emphasizes the interaction between persons and their environment through cultural objects, languages, and social contexts.

Briefly, Vygotskian sociocultural theory includes three key themes: (1) cultural-psychological tools, especially language, mediate human actions, including thinking and speaking; (2) learning is a process of internalization, especially through communicative interaction; and (3) development is a dynamic and historical process in which nature (settings) and humans influence each other. Sociocultural theory emphasizes not only the importance of a learner’s perception but also the internal dialogue and interpersonal interaction in the process of knowledge development.

Sociocultural theory views the social interactions among individuals in their environments as beneficial to transforming learning experiences. As one of the remarkable applications of sociocultural theory to education, “scaffolding” provided by a more capable interlocutor or a peer plays an essential role in the development process and helps the individual reach a higher development level, i.e., the zone of proximal development (ZPD). According to the perspective of sociocultural theory, ZPD stands for “the distance between the actual developmental level as determined by independent problem solving and the level of potential development as determined through problem-solving under adult guidance or in collaboration with more capable peers” (Vygotsky, 1978, p. 86). Following the concept of ZPD, in addition to scaffolding, reciprocal teaching, peer collaboration, and apprenticeships are also widely adopted in educational research and settings.

In addition to sociocultural theory, embodied cognition is another theory that supports contextualized learning. It emphasizes the formative role that the environment (context) plays in the development of the cognitive process (Cowart, 2005). It also argues that the representation of knowledge is grounded in a person’s experiences of interacting with and perceiving the environment, which involve the whole body, including sensation, perception, and action. That is, it considers the interactions among perceptions, the body, and the environment (Barsalou, 2008).

Following the perspective of contextualized learning described above, FL researchers and educators also believe that contexts provide FL learners with a direct link between the FL materials that they have learned and stored in their brains (Legaulta et al., 2019; Prince, 1996) and the underlying concepts or clues, thereby enhancing their FL learning (Lan et al., 2015; Snow, 2005). Under the belief in contextualized learning, FL learning is hence both social and mental, with the person and the environment necessarily connected in an inseparably dialectic relationship (Lantolf, 2005; Swain, 2000). Additionally, based on embodied language processing, the particular way a person moves his/her body affects how he/she comprehends a language (Havas et al., 2007; Lan, Hsiao, Fang et al., 2018). In recent years, the evidence obtained from brain research has also supported the belief that language processing is an embodied process (Hertrich et al., 2016; Willems & Casasanto, 2011). Mahon and Caramazza (2008) further argued that the motor system is automatically activated when a person (a) observes manipulable objects; (b) processes action verbs; and (c) observes the actions of another individual.

In brief, contextualized FL learning is an experience-oriented and dynamic process that highlights situated learning with FL learners’ active involvement in the interaction among individuals, objects, and the environment. Under the belief in contextualized FL learning, creating authentic contexts for FL learning is strongly suggested by several commonly referenced foreign language teaching/learning guidelines, such as The Common European Framework of Reference for Languages (Council of Europe, 2001) and the proficiency guidelines developed by the American Council on the Teaching of Foreign Languages (American Council on the Teaching of Foreign Languages, 2012).

1.3 From Theory to Practice: VR for FL Learning

To integrate the theoretical themes described in the previous section into daily FL teaching and learning, technology is indispensable. Technology has been subtly and profoundly changing the language educational landscape (Arnold & Ducate, 2015). As described in Lan’s (2015) article, it is always a challenge for FL teachers to create the authentic contexts and social occasions that are needed in traditional FL educational settings and that can encourage FL learners’ active involvement in meaningful interactions. By adopting advanced and appropriate technologies, the barriers caused by space or time can be overcome.

However, what are the essential features that technologies need in order to successfully implement contextualized FL learning? Technology that supports the following three essential components of successful language learning should be considered, i.e., learner’s active involvement, authentic contexts, and meaningful and social interaction (Lan, 2014). Among numerous potential technologies, VR is one of the most prominent options in contemporary technologies due to its capability of supporting social interaction by immersing language learners in authentic contexts (Lan, 2020). Consequently, VR has been widely adopted in research related to foreign language learning and teaching (Wang et al., 2020).

VR is a set of images and sounds, produced by a computer, that seem to represent a place or a situation that a person can take part in. According to Robertson et al. (1993), VR can be classified into two categories: immersive VR and non-immersive VR. The first kind of VR emphasizes spatial immersion (Howard-Jones et al., 2014) from the first-person view with a restricted meaning of “being there.” In contrast, the second kind of VR allows users to experience immersion from the third-person view by using a mouse, a keyboard, or a monitor to control their avatars. In addition to the abovementioned definition given by Howard-Jones et al. (2014), Papagiannidis et al. (2008) defined VR according to its function as either game-based VR or society-based VR. Entertainment is the main purpose of game-based VR, whereas providing users with a platform for social connection is the design focus of society-based VR. In this chapter, society-based VR will be the focus, and no strict distinction will be drawn between immersive and non-immersive VR when bridging foreign language learning in VR and the related theoretical foundations.

As described above, VR is a world rich in imagination produced by a computer. Any context existing in the real world or only in human imagination can be created with VR techniques. The world can be as large as a universe or as tiny as a cell. It can be a very common location in daily life like a supermarket (Lan et al., 2015) or a highly specialized room like an operating room (Mohsen, 2016). In this environment, both images and sounds are authentic. By using their avatars, users have the perception of “being there,” regardless of which VR category is used (Lan, 2020). Additionally, after logging in to the virtual world, users can interact with the objects. They can observe, manipulate, or operate the objects. They can also interact with other users in the world, via text or voice, just as they do in the real world (Lan, 2015; Yeh et al., 2018). Interestingly, this kind of avatar-based and social immersion can be viewed as a kind of virtual “whole body” interaction with the environment (Lan, Hsiao, Fang et al., 2018). With the specific features of VR, by immersing themselves in such an authentic world, FL learners can also collaborate with others to accomplish language tasks (Lan et al., 2016; Yeh & Lan, 2018).

Based on what is described above, the specific features of VR seem to perfectly meet the requirements for contextualized FL learning. Therefore, some FL researchers or educators might think that the only thing they need to do is to have FL learners log in to VR, and then the expected learning goals will be reached. Is that correct? Definitely not! A careful and meaningful theory-driven activity design is a necessary catalyst to make successful FL learning happen in VR (Lan, ). As described in Lan’s (2016) article, although the features of VR match the essential components of successful language learning (Lan, 2014), there is no guarantee that the expected learning goals will be reached if no appropriate learning activities are adopted as mediators during the learning process.

A learner-centered activity that inspires and motivates FL learners to interact with other learners in the VR world autonomously is the solution to the problems mentioned above. Numerous approaches are viewed as learner-centered activities, such as task-based learning, problem-solving, inquiry-based learning, discovery learning, and project-based learning. However, when the sociocultural perspective is considered, a combination of cooperative/collaborative activity and the approaches mentioned above should be adopted. During such learning processes, an FL teacher’s role is not that of a knowledge provider but that of a learning facilitator who organizes the learner-centered activities and provides FL learners with support when necessary (Bhattacharjee, 2015; Lan et al., 2016). While working with peers and co-pursuing the learning goals, FL learners are encouraged not only to develop new insights and to connect them with their previous learning experience but also to subconsciously use what they have learned. Such a learning state helps lower learners’ anxiety in using an FL, and consequently enhances their performance in FL learning (MacIntyre, 2017). Additionally, throughout the process of learner-centered learning, FL learners need to identify the authentic situations, raise their own hypotheses and plan their problem-solving actions, collect useful information, and refine their problem-solving plans according to the experience obtained from the learning process or the feedback provided by the teachers, peers, or environment. Obviously, interpersonal or person-environment interaction is necessary during this process. Finally, by collaborating with others throughout the learning process, they will be highly motivated to reach the learning goals. Moreover, what they have learned during the process will cultivate new skills that can be applied in real life.

In sum, if learners simply log in to VR without learner-centered activities that follow the perspectives of contextualized learning, VR will be just another fancy technology for FL learners, and their learning motivation will predictably fade.

1.4 Theory-Driven Studies of VR for FL Learning

The studies included in this section will be briefly introduced in five categories of practice: social connection, game-based learning, self-exploration, cooperative task-based learning, and learning by creation. They are chosen because their theoretical foundations were rooted in the perspectives of contextualized learning and they give good examples of bridging the theoretical foundations and empirical practices; therefore, they maximize the effects of learning in VR on learners’ language acquisition (Lan, 2020).

1.4.1 Social Connection in VR

Building a successful social connection provides opportunities for FL learners to engage in social interaction and therefore enhances FL learning, especially when such a connection involves native speakers and non-native speakers. Additionally, because VR supports social interaction among learners around the globe while immersing them in the contexts, it is naturally becoming one of the preferred tools of FL teachers and researchers. As VR becomes more affordable, commercially available, and accessible, it has attracted more attention from researchers and educators (Lan, 2020; Pan & de C. Hamilton, 2018).

Liaw (2019) used vTime, a VR social networking site, and VR Box, a headset similar to Google Cardboard, in a two-phase study focusing on EFL intercultural communication. The participants of Liaw’s study were college students taking a required year-long EFL course aimed at enhancing oral communication skills. In phase 1, with clear task goals, the participants carried out various cooperative language tasks in VR with their classmates, including giving information, making small talk, and engaging in group discussions. After the participants were familiar with carrying out those tasks with their Taiwanese classmates in VR, in phase 2, they were asked to carry out intercultural communication tasks with English speakers around the world who also used the open social VR software. They were encouraged to interact with as many interlocutors as they could find online. They also needed to upload a YouTube video recording one of their online interactions. At the end of phase 2, each participant was asked to briefly report the learning experience gained during the study. The results reveal the participants’ positive perception of the social and physical presence afforded by the VR environment. Additionally, interacting with international interlocutors in VR not only provided the participants with a joyful learning experience but also expanded their learning from a classroom setting to an unlimited digital wild context.

Tang and her colleagues’ studies are another example of investigating how VR can be used as a catalyst for the social connection of learners of Chinese as a foreign language (CFL). They observed how CFL learners shifted their role in a VR community from that of a peripheral participant to a more active and central one (Tang et al., 2012). During the transformation process, they also tried to determine whether the acquisition of CFL can occur in a virtual situation without explicit instruction (Tang et al., 2016). During the study, they observed and analyzed interpersonal interactions, the development of CFL communication competence by the participating learners, their communication models, and the interaction frequency. The results show that immersion in a friendly VR environment enhanced interpersonal interactions. Additionally, CFL learners demonstrated their communication competence during a natural process of social interaction, including receiving information from others, exploring the virtual world, dialogue, assimilation, adaptation, asking questions, problem-solving, etc. Based on the findings obtained from their study, Tang and her colleagues claimed that language acquisition could be improved autonomously in such a virtual situation.

1.4.2 Game-based Language Learning in VR

Learning a language in addition to one’s native language takes time and constant engagement, which is challenging for most learners. How to transform FL learning into a joyful and gamified process that motivates learners’ involvement has therefore become one of the highlights of FL research. According to Kapp (2012), gamification is a strategy that can engage people, motivate action, promote learning, and solve problems. Numerous studies on game-based learning have reported positive effects of gamification on learners’ motivation and engagement in learning activities, and consequently, on their learning performance (Kotob & Ibrahim, 2019).

In game-based learning in VR, the learning materials are embedded in interactable objects in the environment, such as a car or a non-player character (NPC). Learners can log in to the VR environment repeatedly and learn the embedded materials whenever they are available. Usually, there are learning goals set beforehand, but learners can make their own learning plans and take corresponding actions. Additionally, to prevent learners’ excitement in playing from distracting them from accomplishing the learning mission, common approaches, such as hint prompts, leveled challenges, rewards, or competition, are adopted to promote and maintain learners’ motivation and autonomy during the learning process, as well as to encourage them to log in and keep learning again and again (Lan, 2016).

Lan’s (2015) study about contextual EFL learning includes an example of game-based VR learning. In her study, several contexts were constructed in Second Life (SL), such as restaurants, a night market, an airport, a clinic, a playground, several kinds of shops, and a station. All the learning materials were embedded in the abovementioned contexts. EFL learners could log in and learn the embedded materials anytime, anywhere. In the night market, for example, EFL learners learned English vocabulary words by clicking on objects, such as a teddy bear or a lightsaber. They could also practice English conversation with the NPC boss at the ring toss booth. Figure 1.1 shows the context of the ring toss booth. The EFL learners first clicked the objects on the shelf on the left-hand side to learn the vocabulary, then practiced the conversation with the NPC boss on the right-hand side. After they finished all the learning tasks, they could play the ring toss game, and finally, they could choose a reward from those on the shelf on the left-hand side. All the scores the learners received in the ring toss game were recorded and shown on the screen. In addition to playing the ring toss game and obtaining a reward after accomplishing the learning task, this competition approach was also found to have motivated EFL learners to repeatedly log in to the VR environment and learn the materials by playing.

Fig. 1.1 A night market in SL

It is worth noting that the learning activities reported in Lan (2015) were conducted after school. EFL learners logged in during their free time, before they took their EFL classes. Therefore, it is also a kind of flipped learning application. The learning outcome was promising: the participating EFL learners made a significant improvement in their EFL learning performance.

The study of Lan and her colleagues in 2018 (Lan, Hsiao, Shih et al., 2018) is another example of game-based language learning in VR. However, the participants of this study were special education students, rather than regular participants like those in the studies described in the previous sections. Additionally, the target language was Mandarin Chinese, the students’ first language. Four special education students with a language delay participated in this study. A total of eight lessons and the corresponding contexts were developed, with embedded materials that matched the four students’ learning needs, especially in the design of the human–computer interface. In addition, scaffolding was designed to serve as learning guides to help the four students individually learn the materials by playing through and conquering the graded challenges of each lesson.

The four special education students of that study came to the resource center in their school twice a week for eight weeks. They logged in to the VR contexts together but carried out learning by playing individually. Although they learned individually, they could share their playing experience and findings with the others or the teacher during or after playing via their avatars in the VR contexts, and they loved to do so. Moreover, they even loved to share their VR experience with their families. The special education teacher and one researcher stood by to provide support if necessary, usually technical support. Figure 1.2 shows the students logged in to a health center to learn the materials embedded beforehand.

Fig. 1.2 A health center in VR

The results were very promising. The learning outcomes of the eight-week learning sessions were comparable to those of one year of traditional lessons. In addition to the gains in Mandarin Chinese ability, all four participants enjoyed the learning process very much. For example, after the eight weeks of learning, the teacher asked one student what he thought of the learning experience: “How much do you like it? Very much or just a bit?” His answer was “I super love it.”

1.4.3 Self-exploration in VR

When engaging in self-exploration in VR for language learning, users should be given precise learning guides or learning goals, or they will easily lose the learning focus in the splendid and colorful world and consequently obtain unsatisfactory learning results (Lan, 2016; Mayrath et al., 2011). With a precise learning guide, an authentic VR environment allows learners to expand their experiences by visiting places, such as the Arctic or outer space, which they cannot visit physically in the real world. For example, a virtual field trip is a common activity carried out in VR to have learners explore a target spot without the barriers caused by the limitations of time or space (Blyth, 2018; Pilgrim & Pilgrim, 2016). As reported in Mark’s (2016) study, which integrated Google Cardboard and the Google Expeditions Pioneer Program into student learning, the outcome of a virtual trip was positive. Students instantly engaged in real contexts with immersive expeditions. The tour guide embedded in the expeditions inspired students’ curiosity and helped them keep observing and exploring the visited contexts. It not only excited the participating students as they “traveled” to Machu Picchu or the moon but also helped them better understand the places they “visited” and strengthened their learning.

The study of Lan et al. (2019) is another example of using VR as a self-exploration tool, in this case to enhance Chinese as a second language (CSL) essay writing by students from Singapore. In their study, the VR contexts were constructed in Second Life (SL), including a hotel, two restaurants, and a zoo. The VR exploration was integrated into the pre-writing stage. For the topic of “comparing the two restaurants,” for example, the CSL students logged in to SL to explore and figure out the differences between the Western and the Chinese restaurants (see Fig. 1.3). They wrote down what they found and then used the collected information to complete the subsequent essay writing activity. It was found that, given the very clear requirements of the essay writing activities, the CSL students used the time well in exploring the VR environments and collecting the information needed for essay writing. Students’ learning motivation was high, and consequently, the learning outcome was positive and satisfactory.

Fig. 1.3 Left: The Chinese restaurant; right: The Western restaurant

Similar to the study of Lan et al. (2019), that of Lan and Van (2020) investigated the impact of 360-degree VR videos on basic Chinese writing by CSL students from Vietnam. All the participants were CSL beginners learning Chinese in Taiwan. After receiving instruction on the vocabulary and sentence patterns, the participants used Google Cardboard and 360-degree VR videos on their smartphones to take a VR field trip and visit several famous landscapes in Taiwan, such as Mountain Ali and the Taiwan International Balloon Festival (https://youtu.be/47Gm2Lwp4LA) (see the screenshots shown in Fig. 1.4).

Fig. 1.4 Two screenshots: left, Mountain Ali; right, the Taiwan International Balloon Festival

After the VR field trip had finished, the CSL students wrote down what they saw in the 360-degree VR videos. By using Google Cardboard and the VR videos, all the students expressed a strong feeling of “being there.” Additionally, immersing themselves in the videos with the rich information in the scenes inspired them to write essays of better quality and greater length, compared with those written without such an immersive experience. Interestingly, they became more willing to write a Chinese essay than they were before having an immersive experience.

1.4.4 Cooperative Task-based Learning in VR

Cooperative learning is an effective method and is commonly adopted in FL learning (Lan et al., 2009). Rooted in the constructivist theory, cooperative learning is a form of socially mediated learning that emphasizes the interactions between individuals. However, without a proper activity design or well-shared accountability among individuals, there could be conflicts, leading to a poor learning outcome (Lan et al., 2007).

In FL learning settings, cooperative learning is usually integrated with task-based learning (TBL), forming cooperative task-based learning. A task, from the perspective of FL learning, can be any daily activity that results in processing or understanding a language (Lan et al., 2016). Additionally, a language task comprises two essential elements: the settings and the conditions under which the task takes place (Nunan, 1989). While carrying out a language task, FL learners perform social interaction and meaningful negotiation to reach the task goals. Successful TBL requires learners’ active participation. Based on the description above, a cooperative language task should include four essentials: actual events in daily life, authentic contexts, a precise task goal, and cooperation among a group of accountable members.

The match between the specific features of VR and the abovementioned essentials of cooperative TBL makes VR an ideal platform to implement cooperative TBL for language learning (Lan, 2020). Based on the results obtained from numerous empirical studies, many language tasks can be designed more easily and carried out better in VR than in traditional classrooms (Lan et al., 2016). In the commonly adopted information-gap task, for example, FL teachers split the information into at least two parts and distribute each part to an individual student. The students are then paired or grouped to exchange the information they have in hand without peeking at the others’. However, it is commonly found that the paired students put their information on the desk and then look at the two pieces of information together to figure out the answer (Lan & Lin, 2016). This situation cannot arise in VR because of the barrier between the learners’ avatars.

Lin and her colleagues’ (Lin et al., 2014) study is a good example of implementing cooperative information-gap tasks in VR. Three units and the corresponding contexts were created: “Unit 1: Pair the Friends,” “Unit 2: Family Day,” and “Unit 3: Clues in the Maze.” The participants in this study were 144 CFL learners from Monash University, Australia. They were divided into small groups, and each group was further divided into two small teams. When carrying out the task in each unit, the information was divided into two halves; each small team had one half of the information. The two teams in the same group had to exchange their information via oral interaction with their partner team. A group that failed to exchange the complete information could not work out the answer and would consequently fail to reach the task goal.

For example, in the third unit, “Clues in the Maze,” the participants were either inside or outside the maze (see Fig. 1.5). The participants who were inside the maze did not receive any notes from the teacher, but they could see the items in each shop and could share the information with their peers who were outside.

Fig. 1.5 Left: Inside the maze; right: outside the maze

In contrast, those participants who were outside the maze did not know what was inside the maze, though they were given a map of the maze. The map marked the shops’ numbers as well as their locations. The shop numbers were in random order. Additionally, hints for the answers, such as “when,” “where,” “who,” etc., were added to some shops, as shown in Fig. 1.6. It should be noted that the map with the hints was a digital note in Second Life, rather than a paper-based hard copy. Moreover, the digital note was set to be neither duplicable nor shareable. That is, it could be owned by only one avatar in the context. The unshareable nature of the digital note pushed the participants who owned the map to use Chinese to communicate with their team members inside the maze while leading them to the correct shops to collect the information needed for reaching the goal. Figure 1.6 shows the maze and the map with hints. The results showed that carrying out cooperative language tasks in VR helps CFL learners set clear goals and enrich their oral output.

Fig. 1.6 Left: the maze used in Unit 3, “Clues in the Maze”; right: the map of the maze with hints

Lan and her colleagues (2016) investigated the effects of language tasks on CSL students’ oral communication performance, which is another example of implementing cooperative task-based learning in VR. In that study, the participants were 30 CSL beginners from 4 countries who were learning Chinese in Taiwan. The participants were divided into several small teams and were assigned language tasks. They had to work with their team members to collect, share, and exchange information, and finally reach the task goal, i.e., solve the problem in the scenario. For example, one of the tasks was a detective-style one. The scenario involved four people sharing an apartment. One person’s cookie had been eaten by someone else without permission, and the furious cookie owner wanted to find the “criminal.” There were four rooms in the apartment. Each room contained some clues, some of which were distractors that did not contribute to solving the problem. Therefore, the team members had to discuss how to distinguish the useful information from the distractors. At the end of the task, all the teams told the teacher their answers and provided as many reasons as possible. The team that had the correct answer and provided the most reasons won the competition. Additionally, each member of the winning team was rewarded with a motorcycle for freely exploring the VR environment.

Figure 1.7 shows the students entering the shared living room, “the crime scene,” to see what the “criminal” left there. Figure 1.8 shows the end of the activity, when all the students gathered to tell the teacher their answers and reasons. The motorcycle on the right of Fig. 1.8 was the reward for the winning team. The results of the study show that the CSL students made a significant improvement in their oral communication performance. Additionally, they were highly motivated. They reported that they loved learning Chinese in VR and that it was great fun to cooperate with peers to solve the problems given by the teacher. In brief, their Chinese oral performance, learning motivation, and attitudes were enhanced by cooperative task-based learning in VR.

Fig. 1.7 Students entering the shared living room to see what the “criminal” left

Fig. 1.8 Students getting together to share their answers and reasons

1.4.5 Learning by Creation in VR

According to the Cambridge online dictionary, creation means the act of creating something. Usually, the process of creation involves learners’ ability to produce or use original and unusual ideas. Similar to problem-solving, creation deals with generating solutions. According to Hennessey and Amabile (2010), creativity can occur in daily life and involves problem-solving and pragmatic skills. Through creation, learners construct knowledge by linking what they already know to solutions for the new problems they face, and learning can thus be deepened. Creation also improves learning transfer by enhancing students’ ability to bridge what they learned in classroom settings and the real world. Given these benefits, learning by creation has been attracting researchers’ and educators’ attention (New Media Consortium, 2015).

Some VR tools, such as Tilt Brush, Google Blocks, Tinkercad, Omni-Immersion Vision (OIV), and Minecraft Realms, not only provide learners with an immersive environment but also support hands-on creation activities. With such an authoring tool, users can create their own VR objects or contexts without a high technical threshold. After construction is complete, the creators can share their creations with others by uploading them to the cloud. Moreover, some authoring tools, such as OIV, even allow users to interact socially with others from around the globe in the virtual contexts they have created. Additionally, when language learners are involved in collaborative VR creation, the process becomes one of critical thinking in which they engage in collaboration, problem-solving, and self-directed learning; consequently, their learning outcomes, learning ownership, and autonomy are enhanced (Grover et al., 2015; Yeh & Lan, 2018).

Wu et al. (2019) adopted OIV in an EFL class with a focus on enhancing the communication skills of college students. The participants were Taiwanese EFL learners from the College of Medicine at a university in northern Taiwan. The students wrote their stories and then created the corresponding contexts in OIV. Then, they acted out their stories via their avatars in the VR environment. Their role-play in VR was recorded and shared with their classmates. Figure 1.9 is a screenshot of a video created by one OIV team. In the video, one student chose a modern avatar, a man in a suit, while the other chose a historical one, a female emperor from Chinese history. Additionally, the house and all the objects in the scene were created by the students using OIV. The screenshot shows a couple discussing the husband’s health problem before they decided to go to the doctor. The husband told his wife that he felt dizzy all day. His wife noted that he ate too many hamburgers and was overweight. She suspected that he might be suffering from diabetes and suggested that he go to the doctor.

Fig. 1.9 A screenshot of a student’s video

It was found that the students performed well in their VR creations and role-plays, with detailed descriptions of the scenarios and plots. Cooperative learning by creation in VR benefited the EFL students’ English language skills and healthcare professional-patient communication skills, and enhanced their empathy and understanding toward patients. The participating students also expressed very positive attitudes toward the use of VR creation in their English learning.

Yeh and Lan (2018) collaborated with an EFL teacher at an elementary school in a rural area of northern Taiwan. They conducted a one-semester study by integrating a 3D authoring tool, called Build & Show, into daily English classes. In the study, a class of 15 fifth graders was taught to use Build & Show in their computer lessons. Then, they helped the English teacher by collaboratively constructing the contexts they later needed in their English lessons. Figure 1.10 shows the students cooperatively creating the contexts needed for their English learning. The screenshots in Fig. 1.10 show that more than two students’ avatars were at the same location in the virtual environment. On the left of Fig. 1.10, students were constructing a playground, while on the right, students were working together to arrange the furniture in a living room.

Fig. 1.10 Students used a 3D authoring tool, Build & Show, to create VR contexts needed in their English lessons: left, a playground; right, a living room

Later, during the English lessons, the teacher logged her avatar into the VR environments created by her students and taught the materials there. It was found that the students paid close attention to what the teacher was teaching, with pride showing on their faces. They also showed high ownership of their English learning. It is worth noting that the participants’ learning strategies, planning, self-evaluation, peer-discussion, and problem-solving skills improved as well.

1.5 Conclusion and Suggestions

As VR technology and devices become more accessible, researchers and educators are increasingly interested in understanding the potential of this emerging technology. Moreover, as low-priced VR devices, such as Google Cardboard, become popular, integrating VR technology into daily learning seems to be an inevitable result in the digital age. In the large-scale study reported by Castaneda et al. (2017), more than 1300 students across six grade levels used VR for daily learning during the 2016–2017 school year. Additionally, a report about Google’s ongoing investment in the UK says that Google is going to bring VR to one million UK school children (Heathman, 2016, November 15).

VR has been used for enhancing FL learners’ intercultural awareness, oral communication skills, pragmatic competence, social connection, learning motivation and autonomy, creativity, etc. Although most of the existing literature reports positive and promising results, some essential components must be considered when adopting VR in FL research or empirical practice. In her article on the design principles of 3D VR games for language learning, Lan (2016) indicated that learners, linguistic knowledge and pragmatic competence, and the process of acquiring the language are important elements for the successful use of VR in FL learning. Individual differences, one of the learner variables, are an important factor that researchers and educators must pay attention to. Learners differ in their learning styles, in the learning difficulties they encounter, and in their motivation. All of these individual differences should be taken into account when considering VR as a learning platform or a tool for FL learning. It is also worth noting that whichever kind of VR device is adopted, immersive or non-immersive, situated tasks and appropriate learning scaffolding are also among the major influences on students’ learning outcomes (Castaneda et al., 2017; Lan, 2016, 2020).

In sum, VR, as an emerging technology, is advancing markedly in the digital age. It can be a facilitator or a barrier in the FL learning process. The learners’ needs and the theoretical foundations are the keys to a sound and successful use of VR for language learning; therefore, they should always be emphasized.