5.1 Introduction

Virtual Reality (VR) has been increasingly influential in various disciplines. Through VR’s immersive and experiential effects in games and environments including Second Life, Active Worlds, Quest Atlantis, and World of Warcraft, users can maximize their creative, imaginative, and experimental capabilities. In recent years, these benefits have expanded into the educational field, as educators and investigators have suggested that VR may “improve our digital and cultural literacies, understand more fully the links between immersion, empathy and learning, and develop design skills that can be used productively to exploit virtual spaces” (Warburton, 2009, p. 425). For example, VR enables people to visit a faraway museum without traveling there, and VR videos enhance training for natural disasters by allowing trainees to feel the potential effects more directly and to better predict and prepare for them. Such implementation of VR has furthered the development of the educational field, especially by bringing hard-to-achieve field experiences into the classroom, and the adoption of VR in these disciplines has become increasingly prevalent (Angulo & Velasco, 2013; Warburton, 2009). However, its application to language teaching and learning, especially in Chinese as a foreign language, remains largely unexplored and deserves further examination. In addition, in VR’s early stages, developers largely controlled the tool or environment, which meant the settings were predetermined and educators were not able to customize them for their specific teaching objectives. Some recent tools and platforms, such as Unity and Google Expeditions, have been more open to further adaptation by users. This new flexibility gives educators more freedom to design VR tools to fulfill their pedagogical goals and objectives.

Therefore, in the current study, we investigated the functionality and feasibility of 12 applications (apps) and tools, focusing on their pedagogical value and appropriateness in teaching Chinese as a foreign language in the United States. Using these tools, the study exposes students to simulated authentic Chinese language and cultural environments, which many of them have never experienced before, with the goal of enhancing their language performance. The study aims to incorporate two series of selected apps and tools: (1) Unity, Google Poly, and HTC VIVE goggles; and (2) 360° videos and pictures, Google Expeditions, and Google Cardboard. Used in Chinese language courses, these tools help instructors to create fully immersive environments for learners, in order to foster their autonomy, motivation, and active engagement. Additionally, this study evaluates the effectiveness of different VR apps and tools incorporated for student-centered communicative tasks. Based on analysis of both qualitative and quantitative data gathered in this groundbreaking project, the study will propose effective strategies and offer recommendations for incorporating VR apps and tools in language instruction.

5.2 Virtual Reality in General Education

While VR technology has been applied in movies, theme parks, and games, its development has also influenced the field of education. VR is defined as “a system that aims to bring simulated real-life experiences, providing topography, movement, and physics that offer the illusion of being there” (Smart et al., 2007). Educators are increasingly using these appealing features to create theme-based virtual worlds in which students can have semi-authentic experiences, which is especially valuable when the real environments are not easily available.

In the earlier stages of VR development, educators used two-dimensional (2D) text-based VR tools and environments, such as domains that multiple users could use at the same time (Lin & Lan, 2015). With further advancements, three-dimensional (3D) VR tools and environments, such as Active Worlds and Second Life, have taken the place of the 2D tools, because they more closely simulate the real world. Although many of the 3D VR environments and tools were originally developed for commercial and business use, educators have put considerable effort into adapting these technologies to teaching. For example, Angulo and Velasco (2013) developed a VR environment to support architectural design and reported positive feedback from the participating design-major students; Izatt and colleagues (2014) developed a novel VR application called Neutrino-KAVE, which functions as a visualization and data interaction application. VR has also been used in medical education fields including nursing, medical professional training, and surgical training (Freina & Ott, 2015). Ribaupierre and colleagues (2014) examined the development and use of VR in health care training and discussed the importance of virtual experiences in teaching.

The preliminary findings of these first attempts at developing VR for education suggest that VR can offer great advantages for learning, such as allowing direct experience of objects and events that are physically out of reach, and supporting training in a safe environment. Freina and Ott (2015) also reported that VR “increases the learner’s involvement and motivation while widening the range of learning styles supported” (p. 138). Although these studies were conducted using a variety of VR tools and platforms, very few of these tools were originally designed for educational purposes. Jones (2006) found that because many existing VR environments are meant for gaming instead of education, enormous time and effort are required to adapt them and create new educational materials, resulting in grave inefficiencies.

5.3 Virtual Reality in Foreign Language Teaching

VR has proven educational benefits in math, nursing, science, aviation, and social studies, to name a few. However, research into VR in language education is still in its infancy, and mostly exploratory (Bonner & Reinders, 2018). One of the biggest documented advantages of using VR in language classrooms is reduction of distractions. By blocking out visual and auditory distractions, VR tools and apps fully immerse learners in the content they are viewing and exploring (Gadelha, 2018). Cognitively, VR supports “embodied” and “extended” cognitive activities that stimulate thinking and body movement (Atkinson, 2010).

Research on VR in language education may employ single or multiple existing VR tools. Several language educators have experimented with the online 3D game Second Life for its affordances, effectiveness, and impacts on cultural involvement, oral interactions, and motivation. Developed by Linden Lab, Second Life is a multi-user virtual environment in which participants are able to freely create their contexts for interaction using visual, text, and audio modes (Jauregi et al., 2011; Schwienhorst, 2002). Jauregi and colleagues (2011) adopted Second Life to investigate its impact on the learning of several languages. Data collected from their questionnaire, completed by 430 foreign language learners of Dutch, Portuguese, Russian, and Spanish, show that VR experiences from Second Life have positive effects on cultural, linguistic, interpersonal, and motivational issues. Deutschmann and colleagues (2009) designed oral participation tasks in Second Life and compared the effectiveness of language courses with the Second Life tasks to those without. Results indicate that meaningful VR-based tasks that include authentic and collaborative elements have a direct impact on learner participation and engagement. Liou (2012) examined how Second Life could be used in a computer-assisted language learning (CALL) course for 25 college English learners in Taiwan. The study designed four tasks for student orientation in Second Life: chatting, pedagogical activities, peer review, and a Second Life platform tour. The findings reveal strong student motivation and engagement. The study also observed that “sound pedagogy with appropriate tasks, instead of 3D virtual software alone, guides applications advancing toward language learning objectives or sense-making in student learning” (Liou, 2012).

Cheng and colleagues (2017) investigated the impact of the 3D VR game Crystallize on the study of cultural interaction, such as Japanese bows, on 68 Japanese language learners. They found that VR technology provided “an opportunity to leverage culturally relevant physical interaction, which can enhance the design of language learning technology and virtual reality games” (Cheng et al., 2017, p. 541).

For teaching English as a foreign language, Liaw (2019) adopted a VR social networking site, vTime, which allows users to socialize in immersive virtual environments. The study investigated students’ intercultural communication learning and found occurrences of rich intercultural communication in learners’ interactions in VR environments. Learners were also found to be enjoying the VR activities. However, the study also reported that some students doubted the effectiveness of VR in language practices in general, and further investigation into such concerns may be needed.

Vázquez and colleagues (2018) examined kinesthetic learning in Spanish language education by using Words in Motion, a VR language learning system that “reinforces associations between word-action pairs by recognizing a student’s movements and presenting the corresponding name of the performed action in the target language” (Vázquez et al., 2018, p. 272). Comparing effects on learners in VR to those in non-VR learning experiences, the researchers used knowledge of 20 transitive verbs as their assessment of effectiveness. The study revealed that VR did not affect immediate learning gains of the verbs, but did result in higher retention rates after a week of exposure, compared to the non-VR learners.

Madini and Alshaikhi (2017) explored the acquisition of English-for-specific-purposes (ESP) vocabulary of postgraduate students using 360° videos in VR headsets. Specifically, they examined whether VR headsets helped the 20 participating learners enrolled in the Didactic Terminologies in English Course to retain vocabulary related to their field. The study found positive vocabulary retention through pre- and post-test evaluations, and suggests a need for further research on learners’ perception of vocabulary learning using VR.

Adopting multiple tools, Bonner and Reinders (2018) investigated the capabilities of Augmented Reality (AR) and VR in English language education. In their study, three VR activities were introduced: (1) more realistic presentation practice through 360° videos and VR; (2) VR video creation; and (3) orienting students to a reading topic through 360° videos. The study also raised concerns related to using different kinds of VR tools, such as privacy and security for language educators. Mohsen (2016) compared the use of an online video simulation game that allowed Arabic learners to drag various virtual-surgery devices during a knee surgery simulation, to that of another group of learners who watched a YouTube video of the surgery. The pre- and post-test results showed that the students who played the game demonstrated significantly better language performance in vocabulary than did those who watched the video.

The preceding studies have looked in fruitful ways at VR in language teaching and learning, but none of those described thus far have dealt specifically with Chinese as a foreign language. Only a very few studies have. Several, by Grant and Huang (2010, 2012), involved a series of tasks that the researchers designed and implemented in Second Life. Tasks included purchasing train tickets and inquiring about accommodation. To accomplish them, students had to communicate with nonplayer characters in “Chinese Island,” a customized area within Second Life. Grant and Huang found that learner-to-expert dyads (as opposed to learner-to-learner interactions) generated better results in remedial error correction while resembling interlocutors’ interactions in real life. Their findings show that VR enables students to conduct communicative tasks through collaborative work, increasing their engagement in learning.

Also focused on VR in Chinese language teaching, Cheng (2018) performed an exploratory study investigating four case studies. The study found that a seventh grader’s final version of an essay included more words and longer sentences after the student had watched AR videos and experienced a virtual tour of Beijing through VR technology. Cheng’s study employed a four-step learning strategy that led to better outcomes for essay writing in Chinese characters: (1) read a sample article; (2) learn to use adjectives to modify observed objects and scenes and expand the sentences created; (3) understand the concepts of writing a well-structured paragraph; and (4) produce a short essay. The study concluded that VR technology can reduce students’ learning anxiety, help satisfy individual learning needs, and offer cultural and real-life experiences for learners.

To date, research on VR and language teaching and learning remains in an embryonic stage of development in terms of scope, focus, and the range of languages investigated. The majority of studies into VR and language education reported a positive impact of VR tools on learners’ motivation and a measurable benefit of simulated virtual environments. Researchers have also developed various tasks in accordance with language teaching objectives for more focused and effective VR-incorporated learning. However, these studies have mainly focused on utilizing existing tools and software, and language educators have had very limited access to the development and design process of the tools. When researchers in this field have designed tasks, they have been constrained by the limited range of the original technology designs. As Stockwell (2007) mentioned, “the most important responsibility for those teachers who make the decision to use technology as a part of their language learning environments is to ensure that they are familiar with the technological options available and their suitability to particular learning goals” (p. 118). However, educators’ and students’ agency has not been reflected in any of the reviewed studies. Also, due to the inflexibility of technology design and the resultant inability to closely support teaching goals, current language task design is mostly aimed at developing general language functions instead of focusing on the specific linguistic structures essential for novice learners moving toward intermediate language proficiency.

5.4 Tasks in Foreign Language Teaching

The above-reviewed literature about VR in language education has shown some evidence of the effectiveness of using tasks in VR-incorporated teaching. A task in the language education field generally refers to a communicative problem to be solved by language learners using their own language resources (Ellis, 2009; Skehan, 1998). A task has a primary focus on meaning and should be performed in real-life situations. Therefore, tasks are learner-centered and designed to generate authentic use of language in connection with real-world communication needs. Considering tasks by type, one widely adopted classification was developed by Ellis (2003), who proposed as a main criterion a task’s structure, meaning whether a task is focused or not. A focused task has a specific objective: the use of a particular targeted linguistic feature during meaning-oriented communication. By contrast, an unfocused task does not have a predicted outcome in terms of specific linguistic features. Each of these two broad types of tasks coincides with different pedagogical purposes, and instructors may choose to design and implement contextual, situational, and goal-appropriate tasks. In another classification scheme, which places authenticity as a criterion for dichotomizing tasks, Richards (2001) distinguishes pedagogical from real-world authentic tasks. It is important to note here that the degree of authenticity is decided by instructors’ local judgment rather than through a formulaic, scientific measurement. A third scheme is by Willis and Willis (2007), who identify seven types of tasks involving different degrees of higher-order critical thinking skills and cognitive development based on different pedagogical purposes: listing, ordering and sorting, matching, comparing, problem-solving, sharing personal experiences, and projects and creative tasks. Finally, taking interaction as the central guiding principle, Pica et al. (1993) list task types as jigsaw, information gap, problem-solving, decision making, and opinion exchange, some of which are most commonly used in language classrooms.

For tasks in a curriculum, there are two types of course design: task-based language teaching (TBLT) and task-supported language teaching (TSLT) (Long, 1988; Skehan, 1998). TBLT refers to instruction in which tasks are the core components of curriculum and syllabi. TSLT, sometimes called the “weak” version of TBLT, refers to a more flexible approach in which tasks support the entire curriculum and syllabi, along with other language teaching methods already in use (Ellis, 2003; Littlewood, 2007). Tseng (2014, 2019a) has created a wide array of tasks in alignment with commonly taught topics in Chinese as a foreign language for novice learners moving toward an intermediate level of proficiency. The tasks, which are either pedagogy-oriented or authenticity-focused, include clearly identified communicative modes, can-do statements, instructional procedures, and accompanying rubrics that are well suited for a TSLT curriculum. Looking toward curriculum design, Tseng (2021) has contributed to the field of teaching Chinese as a foreign language by creating a sequence of authentic tasks with a list of authentic materials essential for the development and implementation of TBLT courses geared toward intermediate and developing advanced learners.

Regardless of type and model, language teaching tasks are proven to have significant pedagogical value in communicative language teaching and demonstrable effects on language proficiency, motivation, and cultural awareness. Regarding their connection with VR applications, both focused and unfocused tasks have the clear goal of authentic interaction in real-life situations, which is in line with the educational potential of VR applications. Peterson (2006) finds that VR can efficiently create or simulate an authentic and immersive environment that triggers language learners’ motivation, agency, and cognitive-founded output and communication. As Long (2017) stresses, increasing learners’ motivation is key in developing pedagogic and target tasks. The combination of task design and VR technology therefore emerges as an exciting explorative domain for research in this study.

5.5 The Study

The current study adopts a TBLT curriculum in which tasks, enhanced by VR technology, are created and implemented as a summative assessment to strengthen and evaluate language learning at the end of an instructional topic. Specifically, this curriculum implements four focused tasks that are enhanced by VR technology and include pre-selected topics and language functions in two levels of Chinese language classes. This preliminary study attempts to answer the following two research questions:

  1. Do immersive virtual experiences created by VR technology foster language learning, and to what extent?

  2. What are students’ perceptions of VR technology? To what extent does VR technology enhance learners’ motivation, engagement, and confidence?

In order to answer these two research questions, both quantitative and qualitative data were collected to investigate students’ perceptions about their language learning and motivation, engagement, and confidence in participating in the focused tasks enhanced by two types of VR technology.

5.6 Research Methods

A combination of quantitative and qualitative analysis was employed to supplement each measure. Quantitative data were collected through a questionnaire composed of a set of 5-point Likert scale questions and short open-ended questions on learners’ learning experiences in VR-incorporated tasks, using Unity and Google Cardboard. The data were organized in four categories based on the research questions: (1) learners’ perceptions of language learning with Unity-enhanced tasks and (2) Google Cardboard-enhanced tasks; and (3) learners’ motivation, engagement, and confidence with Unity-enhanced tasks and (4) Google Cardboard-enhanced tasks. Descriptive statistical analysis was utilized for quantitative analysis due to the relatively small size of the study.
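The descriptive analysis described above can be illustrated with a minimal sketch. It assumes the Likert responses are stored as one list of scores per category. The category labels, the `describe` helper, and the scores themselves are hypothetical placeholders introduced only for illustration; they are not the study's instrument or data.

```python
from collections import Counter
from statistics import mean, median, stdev

# HYPOTHETICAL placeholder scores on a 5-point Likert scale
# (1 = strongly disagree, 5 = strongly agree). NOT the study's data.
responses = {
    "language learning (Unity)": [4, 5, 3, 4, 5],
    "language learning (Google Cardboard)": [3, 4, 4, 5, 4],
    "motivation/engagement/confidence (Unity)": [5, 5, 4, 4, 3],
    "motivation/engagement/confidence (Google Cardboard)": [4, 4, 3, 5, 4],
}


def describe(scores):
    """Return basic descriptive statistics for one list of Likert scores."""
    counts = Counter(scores)
    return {
        "n": len(scores),
        "mean": round(mean(scores), 2),
        "median": median(scores),
        "sd": round(stdev(scores), 2),  # sample standard deviation
        # Number of responses at each scale point, including unused points.
        "distribution": {point: counts.get(point, 0) for point in range(1, 6)},
    }


# One summary per category, mirroring the four categories in the text.
summary = {category: describe(scores) for category, scores in responses.items()}

for category, stats in summary.items():
    print(category, stats)
```

Reporting the full response distribution alongside the mean is a common choice for small samples, since a handful of ordinal responses can make the mean alone misleading.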

Qualitative data were coded and analyzed based on grounded theory. Grounded theory is a research methodology that operates almost in a reverse fashion from some of the more established modes of social science research of the positivist tradition. Unlike positivist research, a study that employs grounded theory is likely to begin with a question, or even with the collection of qualitative data as primary research material (Davis, 1995). For the current study, despite the overarching scope provided by the research questions, there is no pre-set hypothesis about the potential findings, especially in terms of students’ perceptions. Therefore, the researchers read the entire data set, selected content relevant to the study purpose, and categorized it based on the nature of the data collected.

5.6.1 Participants and Settings

The study was conducted at a public university in the southeastern United States. Students enrolled in Elementary and Intermediate Chinese language classes were invited to participate in the VR experiments completed through Unity and Google Cardboard. For participants in both levels of classes, the VR-enhanced tasks were part of their daily required class activities, but completing the follow-up survey was entirely voluntary. Table 5.1 gives a summary of the number of students across the two levels of language proficiency.

Table 5.1 Number of student participants in four VR tasks

Of the 16 students taking the Elementary Chinese course, 13 participated in the VR tasks and completed the post-task survey. Of the 15 students taking the Intermediate Chinese course, 14 participated in the VR tasks and completed the post-task survey. In total, 27 students took part in the VR-enhanced tasks and completed the survey. Of the 13 Elementary Chinese learners, 2 were male and 11 were female. Of the 14 Intermediate Chinese learners, 5 were male and 9 were female. Their ages ranged from 18 to 22 years old, with a mean of 19.7 years old. Their language proficiency ranged from novice-high to intermediate-low levels. Since the second scenario of Unity-enhanced task 1 required language knowledge and skills that the Elementary Chinese course students had not yet learned, these students did not participate in that scenario.
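As a quick consistency check on the counts reported above, the following sketch tallies participants by course level. The enrollment, participation, and gender counts come from the text; the participation rates are derived here and are not reported in the study, and all variable names are illustrative assumptions.

```python
# Enrollment and participation counts as reported in the text.
enrolled = {"Elementary Chinese": 16, "Intermediate Chinese": 15}
participated = {"Elementary Chinese": 13, "Intermediate Chinese": 14}
gender = {
    "Elementary Chinese": {"male": 2, "female": 11},
    "Intermediate Chinese": {"male": 5, "female": 9},
}

total_enrolled = sum(enrolled.values())
total_participants = sum(participated.values())

# Gender counts should add up to the number of participants at each level.
gender_consistent = all(
    sum(counts.values()) == participated[level]
    for level, counts in gender.items()
)

# Derived participation rates in percent (not reported in the study).
participation_rate = {
    level: round(100 * participated[level] / enrolled[level], 2)
    for level in enrolled
}
overall_rate = round(100 * total_participants / total_enrolled, 2)

print(total_participants, gender_consistent, participation_rate, overall_rate)
```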

The VR tasks were developed in accordance with the curricula of Elementary Chinese and Intermediate Chinese, focusing on specific topics covered in the course syllabus. The curricula of the participating Chinese courses used Integrated Chinese as the main textbooks. The following sections introduce the pedagogical and technology design of the tasks as well as the implementation process in the Chinese language classrooms.

5.6.2 The VR Tools

After experimenting with more than 10 VR tools and analyzing their pedagogical applications, two sets were selected for this study: (1) Unity, Google Poly, and HTC VIVE headsets; and (2) Google 360° videos/pictures and Google Cardboard headsets. In the following sections, Unity refers to the tasks using the first set of VR tools, and Google Cardboard refers to the tasks using the second set of VR tools. Unity is a cost-free cross-platform game engine that is highly customizable and well suited to educational use. Unlike other VR applications, Unity allows instructors to be fully involved in the technology part of task design, not just the pedagogical part. The instructors can easily become familiar with the user-friendly interface of the software, and they can include diverse objects in the VR world that they create based on their pedagogical needs. Using Unity, they can also create an interactive virtual world that enables users not only to view the scene but also to move around and to touch and move objects in the virtual world. All instructors participating in this study received about 10 hours of software training while they were designing the projects. Accompanying Unity was Google Poly, a website with premade 3D objects, such as buildings, furniture, and plants, that could be added to the VR world. The VR goggles the students used in class were HTC VIVE, a VR system that includes visual and audio access and a remote controller to navigate and move objects in the VR world.

Google Cardboard involves a simpler type of goggles than the HTC VIVE; these allow users to view the VR world, but do not allow as much interaction between the user and the VR world objects as the HTC goggles do. However, Google Cardboard is easy to carry, install, and view, and it has lower requirements for space and supporting equipment for activities. Since Google Cardboard is a viewing tool, the instructors took 360° pictures in selected scenes in China using their cellphones and a professional 360° camera, the Ricoh THETA V 2 × 14.0 MP Ultra HD Camcorder. They then organized the pictures into Google Expeditions and added supporting text and audio materials. In class, instructors sent students a web link generated by Google Expeditions, which students opened on their phones before attaching the phones to the Google Cardboard goggles.

5.6.3 An Overview of Four Tasks with Unity and Google Cardboard

Student participants completed a total of four tasks during the study. The tasks closely aligned with the pedagogical objectives for creating semi-authentic immersive learning experiences, whereby learners stayed “within” the target language and culture most of the time. The design of the tasks included multidimensional prompts to maximize students’ interaction in the target language. In addition, the authentic feeling of space and natural sense of direction imparted by the VR tools significantly eased the pressure of “imagining” directions in the classroom, allowing students to focus more fully on the language tasks. Given the specific features of the two VR tools, Unity was selected for the first two tasks, and Google Cardboard for the remaining two tasks.

Tasks using Unity focused on developing students’ language proficiency in two main language functions: describing space and physical settings, and communicating about the physical movements of a person (giving and following directions) and objects (giving and following instructions when moving objects). A virtual 3D world featuring street views, a certain part of a city, or a dormitory was created specifically for learners to carry out the communicative tasks in collaboration with their classmates. Tasks using 360° pictures and videos and Google Cardboard focused on creating immersive, authentic, and information-rich environments in which students could develop their communicative and presentational skills in the target language. The 360° pictures and videos of authentic settings were created in China before the study started; they included street views, shops, and restaurants. In class, students were asked to conduct communicative and presentational tasks, including ordering food and drinks after reading a menu, describing directions, and discussing public transportation.

All four tasks involved two-way interactions that required collaboration between two learners, facilitated by the instructor or tutors on site. Following is a synopsis of the four tasks enhanced by Unity and Google Cardboard, outlined in four aspects: task type, instructional topic and level, linguistic structures and functional foci, and scenario.

5.6.3.1 Task 1. Unity-Enhanced Task: Cleaning and Describing Your Apartment in Beijing

Instructional topics and level. The first task using Unity and related tools was designed to support learning of the topics “apartment” and “housing” in Elementary Chinese and Intermediate Chinese.

Linguistic structures and functions. The linguistic structures that this task focused on were the “ba” construction (subject + 把 + object + verb + other element) and existential sentences (place + verb + 了 or 着 + numeral + measure word + noun). The language functions were communicating about physical movements of objects (giving and following instructions when moving objects) and describing space and physical settings.

Task type: Information gap. In each pair of students, one viewed the entire virtual world using the HTC VIVE goggles, and the other viewed a screen that only partially displayed the virtual world.

Scenarios. Two scenarios were set in this task to elicit students’ language output.

Scenario 1: You are studying abroad in Shanghai. This is just your third week here. You have rented an apartment. Your potential girlfriend/boyfriend is visiting your place, but you cannot be back in time to clean the apartment which is very messy. Please call your roommate and let him/her help you organize the apartment. Please remember that you only have a vague idea of what is in the room and you are not video calling. Therefore, you need to check where things are. Start the conversation from the beginning of your phone call.

Scenario 2: You had fun with your friend and are now resting in your apartment. Suddenly, your Chinese friend in the US calls and wants to see your new apartment. Virtually show her or him around by introducing your room and furniture. The Chinese friend may ask detailed questions, such as: What’s on the table? What’s on the couch? Are you eating American food? Start the conversation from the beginning of your phone call. [Only Intermediate Chinese students were required to complete this part of the task.]

5.6.3.2 Task 2. Unity-Enhanced Task: Looking for a Peking Duck Restaurant in Beijing

Instructional topics and level. The second Unity task aimed at supporting the topic “directions” in Elementary Chinese.

Linguistic structures and functions. The linguistic structure that this task focused on was direction and location devices. The main functions practiced in this task were describing space and physical settings and communicating about a person’s physical movements (giving and following directions).

Task type: Information gap. In each pair of students, one viewed the entire virtual world using the HTC VIVE goggles and the other viewed a screen that only partially displayed the virtual world.

Scenario. You are visiting Beijing. You would like to meet with your Chinese friend in a Peking Duck restaurant in a busy shopping plaza, which is not very far away from your hotel. Your friend knows the area well and he/she is guiding you on the phone. Follow his/her directions and find the Peking Duck restaurant.

5.6.3.3 Task 3. Google Cardboard–Enhanced Task: Visiting a University Campus and Ordering Food in the Dining Hall in Shanghai

Instructional topics and level. The first task using Google Cardboard and related tools was designed to support the learning of the topics “direction” and “dining” in Elementary Chinese and Intermediate Chinese.

Linguistic structures and functions. The linguistic structures that this task focused on were direction and location words, Chinese food words, and topic-comment sentence structure. The language functions to be practiced were communicating physical surroundings, and ordering and discussing food and flavors.

Task type: Role play. In each pair, one student performed as the student who studied abroad, and the other student performed as his or her language partner in Shanghai.

Scenario: You are studying abroad in Shanghai. Today is your first day on campus. Walk around the campus and find the dining hall where you will meet with your language partner in Shanghai. Order food in the dining hall based on the recommendation of your language partner.

5.6.3.4 Task 4. Google Cardboard–Enhanced Task: Riding Subway Lines and Meeting with Friends in a Milk Tea Shop in Shanghai

Instructional topics and level. The second task using Google Cardboard and related tools was designed to support the learning of the topics “transportation” and “dining” in Elementary Chinese and Intermediate Chinese.

Linguistic structures and functions. The linguistic structures that this task focused on were direction and location words, public transportation words and phrases, Chinese snack and drink words, and sequencing devices in sentences. The language functions to be practiced were communicating physical surroundings, and ordering and discussing drinks and flavors.

Task type: Role play. In each pair, one student performed as the student who studied abroad, and the other student performed as his or her language partner in Shanghai.

Scenario: You have been studying in Shanghai for a few days. Your Chinese friends told you there is a very tasty milk tea shop not very far from the University. Ride two stops of the Shanghai Subway with your Chinese friends to go to the milk tea shop. Ask them as many questions as you wish about how to ride the subway and what flavor of milk tea is good.

5.6.4 Instructions for Using Unity and Google Cardboard for Task Completion

Before students began to engage in the tasks, they received training on how to use the VR tools as a pre-task. The step-by-step instructions for using Unity in tasks 1–2 and Google Cardboard in tasks 3–4 are outlined below. They are in principle identical in Steps 1–3, except for differences in detailed implementation and an added presentation in Step 4 for Tasks 3 and 4.

5.6.4.1 Step-by-Step Instructions for Using Unity to Complete Tasks 1–2

The Unity-related tasks were conducted in a media lab where HTC VIVE goggles were available. Students were assigned to pairs to participate in the two role-playing tasks. Each pair took turns and spent about 15–20 min to complete each task. The implementation of the Unity tasks included three steps, as follows:

  • Step 1. Technology training. Although the young generations are often exposed to new technologies, some students are not familiar with the VR world, so basic navigating and controlling techniques were introduced at the very beginning of the activity. Each student also experienced wearing the goggles and moved around the virtual world for about one minute to reduce the distractions of a new technology before engaging in the language task.

  • Step 2. Language warm-up and instructions. After technology training, the instructors led the class in a five-minute warm-up session and gave instructions about the language structure and procedure of the task.

  • Step 3. Task performance in pairs. In each pair, one student wore the HTC goggles to navigate/move objects in the VR world, while the other student gave instructions. In each scenario, the students took turns and switched roles when performing the task.

During the activities, the instructors or tutors provided language support and corrective feedback related to the language foci.

5.6.4.2 Step-by-Step Instructions for Using Google Cardboard to Complete Tasks 3–4

The Google Cardboard–related tasks were implemented in the regular classrooms. Students were also assigned to pairs to participate in these role-playing tasks. Each pair spent 10–15 min to complete each task. The implementation of the Google Cardboard tasks included the following four steps:

  • Step 1. Technology training. Students used their own cellphones to look at the 360° pictures and slides with the assistance of Google Cardboard. Therefore, before the VR activity, the instructor guided the students to set up their cellphones properly and let them briefly experience the Google Cardboard VR world.

  • Step 2. Language warm-up and instructions. After technology training, the instructors led the class in a five-minute warm-up session including instruction about the language structure and procedure of the task.

  • Step 3. Task performance in pairs. In each pair, both students wore the Google Cardboard to view the 360° pictures and communicate to complete the task. In each scenario, the students took turns and switched roles when performing the task.

  • Step 4. Concluding presentation of the tasks. After the pair work, students were required to present their discussion and communication by answering questions such as “What kind of food did you order in the dining hall?” and “How did you get to the milk tea shop?”

While students were working in pairs, the instructors walked around to provide support and corrective feedback related to the language foci.

5.7 Results and Discussions

The results of the study are organized by the two research questions on students’ perceptions of language learning and their motivation, attitudes, and confidence. Since the design of VR tasks is made possible through the two types of VR tools and apps, Unity and Google Cardboard, four subcategories of discussions follow accordingly.

5.7.1 Students’ Perceptions of Language Learning with Unity-Enhanced Tasks

Both quantitative and qualitative data were collected to investigate students’ perceptions about their language learning, especially the learning of specific linguistic structures. Table 5.2 shows the participating students’ perception of their learning on a 5-point Likert scale. Students rated their perceptions of using specific linguistic structures and functions from Strongly Disagree, rated as 1, to Strongly Agree, rated as 5. Items 2, 3, and 5 inquired about students’ perceptions of the learning of linguistic foci and language functions emphasized in the second scenario of Unity-enhanced task 1, in which only the 14 students from the intermediate course participated. Therefore, the results of items 2, 3, and 5 are calculated based on the participation of 14 students in total.

Table 5.2 Students’ perceptions about linguistic structure and function learning in Unity tasks

The results show that students viewed the Unity activities very positively as a means of fostering their language learning. First, the survey shows that all participating students agreed that their language learning improved through the Unity activities. Most strongly agreed that they could understand and were able to use the specific linguistic structures emphasized in the tasks, including the “把” structure (85%), the “着” and “了” structure (93%), and the direction and location devices (86%). Second, regarding the learning and practice of language functions, all students believed that they could better communicate for the specific purposes at which the tasks aimed. A majority of the students strongly agreed that the VR activity effectively helped their learning of the language functions: 81% strongly believed that they could better give or follow instructions about moving objects; 89% strongly felt that they could better describe interior settings; 81% strongly felt that they could better give and follow directions; and 81% strongly believed that they could better describe exterior places and physical settings.

Students’ qualitative reflections also show that they believed the VR activities effectively fostered their language learning. For example, one participant mentioned in the survey having “never used ‘把’ in such an authentic environment” and felt that he or she gained “deeper and more direct understanding of when and how to use this structure.” Another student wrote that the VR activities helped him or her “become more comfortable with forming sentences about location since we had to use it a lot with repetition in conversation.” Regarding the language function, one student described feeling able to “ask direction confidently when I go to Shanghai next summer after this activity.”

In addition to pointing to the activity’s effectiveness in supporting students’ learning of specific linguistic structures and functions, participants’ comments also reveal that they found the activity of great assistance in applying what they had learned in class. For example, several students stated that the activity helped them remember what they had learned in the previous classes, and that VR was a cool way to apply what they had learned to a realistic scenario. Students also felt that the VR activities further improved their conversation skills:

The activities were very interactive. It provided us with new ways to use Chinese conversationally which really helped me with my conversational skills. I also think it was a more intensive way of learning because in class me and my classmate would use notes or other ways to facilitate our conversation. I don’t think that is going to happen in the real world just like in VR.

Beyond the benefit of better practicing interactive skills, some students also thought that the Unity activities were able to help them practice language outside of the classroom setting: “it forced us to think more like we would outside of an educational setting—it was harder to speak in that setting than I would have thought.” Students also mentioned in the survey that what triggered them to speak in class was the instructor’s guidance and PowerPoint slides, while in the VR environment, it was the semi-authentic and information-rich surroundings that triggered them to converse without much rehearsing. Such differences helped them to better prepare themselves in a potential future real-world situation.

5.7.2 Learners’ Perceptions of Language Learning with Google Cardboard–Enhanced Tasks

Table 5.3 shows the participating students’ perceptions of their learning on a 5-point Likert scale. Students rated their perceptions of using specific linguistic structures and functions from Strongly Disagree, rated as 1, to Strongly Agree, rated as 5.

Table 5.3 Students’ perceptions about linguistic structure and function learning in Google Cardboard tasks

These results show that most students viewed the Google Cardboard activities as helpful to their language learning. First, the survey shows that all participating students agreed that their language learning improved through the Google Cardboard activities. Most participants strongly agreed that they could understand and were able to use the specific linguistic structures emphasized in the tasks: direction and location words (89%), Chinese food/drink/snack-related words (78%), topic-comment sentence structure (89%), and sequencing devices in sentences (85%). The slightly lower number for the learning of Chinese food/drink/snack-related words was further explained in some students’ qualitative reflections, which indicated that some new words had appeared in the VR environment when they looked at the authentic menus, especially in the Chinese milk tea shop. Such exposure to words about drinks they had never learned before had caused some distraction and difficulty in looking for and using the previously learned words.

Second, regarding the learning and practice of language functions, all students believed that they could better communicate for the specific purposes at which the tasks aimed. A majority of the students strongly agreed that the VR activity effectively helped their learning of the language functions; 89% strongly felt that they could better communicate physical surroundings, and 85% strongly felt that they could communicate better in ordering and discussing food/drink/snack and their flavors.

For the Google Cardboard activities, students did not reflect much on how the VR activities fostered their learning of specific linguistic features, although they confirmed this in their ratings in the survey. However, several students wrote that the VR activities helped them acquire skills in language function, and especially that the VR environment fostered their focused attention on the function. For example, one student stated,

When ordering food in the university cafeteria [during the VR activity], me and my classmates were so focused on discussing which dish we wanted to have. We were using all we learned to fulfill this task since the 360° picture looked so real. I wouldn’t focus my mind this much in regular classes.

Many students also commented on the meaningful aspect of the Google Cardboard activities. They reported that the authentic VR surroundings created by 360° pictures from China made them feel that they were granted the opportunity to apply what they had learned to real-world scenarios. “I never imagined that I could either ride subway or order milk tea in Shanghai even if we learned related stuff,” one student mentioned, “but now I think I can do both.”

5.7.3 Learners’ Motivation, Engagement, and Confidence with Unity-Enhanced Tasks

Table 5.4 shows students’ motivation, engagement, and confidence in Unity-enhanced tasks. Overall, most participating students held very positive impressions of the Unity-enhanced tasks. The results indicate that 11% of the students agreed and 89% strongly agreed that the Unity activities motivated them in learning Chinese. The survey results also show that students felt very engaged in the activities: a total of 96% stated that they felt very engaged in the Unity activities. Regarding students’ confidence, 89% of the participating students strongly felt that they would be more comfortable speaking in similar real-world contexts in the future.

Table 5.4 Students’ motivation, engagement, and confidence in Unity-enhanced tasks

The participating students also rated their perceptions of their comfort levels in speaking in a virtual environment and their attitudes about the technology tools they were using to navigate or interact with the VR world. Findings show that all of them felt comfortable (7%) or very comfortable (93%) in speaking in the virtual world, and most of them agreed that the VR tools were easy to use (89%).

Qualitative data also reveal students’ positive attitudes regarding their motivation, engagement, and confidence. Most students believed that the tasks “can be more beneficial to Chinese learning” as compared with regular class activities. Beyond feeling motivated about language activities in VR environments, their motivation also extended to the long-term learning of Chinese. For example, one student stated:

In the VR world, I was able to see myself speaking and doing things in China. It gave me an opportunity to convince myself that I will be able to survive in China with the things I learned in class. I am glad I’ve learned this language and will definitely learn more to prepare myself for the future.

In addition, the Unity activities motivated students to further explore cultural and intercultural aspects of China. For example, some students mentioned that it was very interesting to see Starbucks, Pizza Hut, and hotdogs in the virtual world, and they were curious about Western foods’ current trend in Chinese cities.

Concerning engagement, many students described the Unity-enhanced activities as “really fun,” “unique,” and “engaging,” and they felt that they were fully immersed in the VR world while using Chinese to accomplish the required tasks. Several students mentioned that they were focused on the tasks even more than they were in the regular classroom, because the VR environment provided them with distraction-free surroundings where only task-related objects and texts existed. Many students reflected that they would have liked to have more Unity activities to practice different kinds of structures that they were learning in class. However, some students also mentioned some side effects regarding their engagement in learning when using Unity. As one student mentioned, “because it [VR] is fun, people may get carried away while playing and not engaging in the overall purpose of the activity.”

Qualitative findings also support participating students’ ratings of the Unity-enhanced tasks as having a positive impact on their confidence in speaking Chinese. They felt that being able to speak in a semi-real environment before the “real thing in China” was very beneficial to their mental preparedness and confidence levels, especially in speaking. Some reported that being in a foreign environment “could be very intimidating,” and the VR environment gave them “an opportunity to rehearse the potential future.” One student wrote,

I always thought that being in a foreign country with all the things I don’t know, like the unfamiliar streets and the language. Today when I walked in the [virtual] shopping district in China, I realized it is not that bad. I am able to recognize many words on the streets and this gives me confidence in future traveling.

5.7.4 Learners’ Motivation, Engagement, and Confidence Through Google Cardboard–Enhanced Tasks

Table 5.5 shows students’ motivation, engagement, and confidence in Google Cardboard–enhanced tasks. Overall, the findings of the Google Cardboard projects show positive perceptions and attitudes from participating students.

Table 5.5 Students’ motivation, engagement, and confidence in Google Cardboard–enhanced tasks

All students were motivated in the task, and 89% were very motivated in learning and practicing Chinese through participating in the activities. Students also felt very engaged in the Google Cardboard activities. All participating students agreed that the activities were engaging, and 85% felt they were very engaging. Regarding their confidence in speaking in Chinese, 19% stated that they felt more confident in speaking in similar real-world contexts in the future, and 81% reported feeling very confident after practicing through the Google Cardboard activities. In addition, most students (93%) claimed that they felt very comfortable speaking in a virtual environment. When rating the ease of use of Google Cardboard and related tools, 85% of the students stated that they were very easy to use; however, a few participants did experience some difficulties in the technical setup and use.

In the reflection section of the survey, students reported that they felt “very excited to look around a real Shanghai,” as they felt they were “truly surrounded by the Chinese shops.” “It is a fun way to learn Chinese,” one student wrote; “I hope we can do more of this in class.” According to them, the authentic 360° pictures of China served as a bridge between what they practiced in the regular classroom and the real scenario in China. Such a bridge also showed them the gap between what they knew and what they needed to know to communicate in China. In their survey answers, several mentioned that after seeing the 360° pictures of subway stations and shopping malls in Shanghai, they wanted to research more on the related language, cultural, and societal information online and review more about linguistic devices related to location and purchasing, so that they could better fulfill the communicative task in the real-world context in the future.

The information-rich surroundings created by Google Cardboard also motivated students to learn new Chinese words. In many of the 360° pictures, Chinese words were everywhere, since the photographs were taken in busy commercial areas. Students reported feeling a little overwhelmed at the beginning but becoming used to it after a while. Some mentioned that they had “never felt so eager to learn new Chinese words.” As one student stated,

It was a bit scary when looking at so many characters [in the shopping mall picture] which I have never learned about, but gradually I figured some them out by guessing from the stuff inside of the stores. This is an interesting journey and I really want to learn more words after this so I can recognize more next time.

In addition, many students enjoyed the feeling of cultural involvement when viewing 360° pictures. “Being in China is different from looking at the pictures of China on the slides in class,” one wrote; “this is a very cool culturally immersed experience.”

Some students raised concerns about the Google Cardboard–enhanced tasks. Concerning their confidence in speaking in similar real-world contexts, a few students wrote that they were very excited looking around in the VR environment, which affected their actual language practice, so that they needed more time to be able to feel confident about performing the communication tasks. Some students also mentioned that during the activities, they were not communicating with their partners while wearing the Google Cardboard goggles. Instead, they put the goggles down and only then started their interactions and negotiations. Therefore, several students were not literally speaking in the virtual environment. This is related to one of the common phenomena during the activities: some students reported feeling very dizzy when looking at the 360° pictures, and they had to take the Google Cardboard goggles off frequently to relieve the dizziness. In addition, several students mentioned that their cellphones were not compatible with the application needed to run the 360° pictures, and they had to spend a long time looking for alternative ways to participate in the activities.

In sum, the original goal of implementing VR tasks in Chinese language teaching was to enhance students’ language learning, motivation, engagement, and confidence in a simulated authentic environment. The findings of the study show that the tasks created with Unity and Google Cardboard mostly achieved such language educational purposes.

First, both types of VR-mediated tasks effectively fostered students’ language learning, as evident in students’ self-assessments. Through the design of focused tasks, students worked collaboratively to use predetermined language functions and structures in the VR world’s immersive and semi-authentic surroundings. In an active learning environment created by a combination of a real-world scenario and photographed scene, learners were able to better perceive themselves as conversing in real-life conversations and solving real-life problems. Further, in terms of technology and task design, the instructors’ onsite support substantially contributed to students’ self-perceived improvement in language learning. Whether in a classroom or in a VR studio, the instructor’s scaffolding and feedback guided the students to be able to use targeted linguistic devices in order to achieve expected language functions and to engage in genuine meaningful interactions.

Second, the current study bolstered the findings of the studies summarized in the literature review that students hold very positive attitudes about the VR-incorporated tasks. Beyond recording students’ feelings of “fun” and “exciting,” as documented in the existing studies, we found that Unity-incorporated and Google Cardboard–incorporated tasks enhanced students’ motivation and engagement in Chinese language learning, although each type of task achieved this goal in different ways. One of the unique features of the VR world created by Unity is that it imparts full physical immersion and a feeling of realism to the users. Learners are able not only to view the 3D world but also to interact with it: they can engage in physical movements in the 3D world and touch and move the objects there. Such features help learners to feel more engaged through physical behaviors and memories, which may significantly contribute to their perceptions of motivation and engagement. In contrast to Unity, the VR world created by 360° pictures and Google Cardboard was very information-rich, which means that viewers are immersed in an entirely foreign language environment with Chinese texts, people, and nonverbal elements. Learners can experience more realistic surroundings than in the Unity-related tasks, since the 360° pictures present real streets, restaurants, and shops in China, while in Unity, the scenes are only simulated with some authentic pictures attached to the VR objects. Aided by Google Cardboard, students experience a more authentic virtual tour of China, which may be one of the main contributors to improved motivation and engagement. Considering the learning design and technology of the tasks, it could be beneficial to find a combination of Unity and 360° pictures to maximize the advantages of both sets of tools.

5.8 Reflections on the Use of VR Technology

Currently, second language acquisition literature lacks guidelines and principles for incorporating VR tools and apps into the language classroom. In light of the findings summarized in the aforementioned sections, the following are some notes of caution largely pertaining to pre-task preparation to increase learners’ familiarity with VR technology and after-task reflections on how VR-related tasks could have been carried out more smoothly.

It is worth mentioning that the Unity-incorporated tasks were completed in the media lab, where only one pair of students performed the task at a time, with the instructor and their classmates observing on the spot. This gave the instructor more opportunities to fully support each acting pair and to attend to their here-and-now needs. Such individual support was harder to provide in the Google Cardboard–enhanced tasks, which were conducted in regular classrooms where all students performed the tasks at the same time. Not all students could get immediate support and attentive guidance from the instructor, as they could in the Unity-enhanced tasks. This may explain the slightly higher ratings of students’ self-perceived language improvement in the tasks enhanced by Unity.

Despite such differences in technology features and foci, tasks supported by either type of VR tool can increase learners’ confidence in speaking. One thing to note, however, is the slightly lower confidence level observed in the Google Cardboard–related tasks. This might result from a sense of detachment between viewing through Google Cardboard and speaking the language at the same time. Since the Google Cardboard goggles are easy to take off and put on, some students took their goggles off when speaking and relied on their short-term memory of the 360° pictures that they had just seen. This may have negatively affected the authenticity and synchronicity of real-life communication with native speakers. This pitfall may have led students to perceive their language performance with less confidence and as less authentic than it could be.

As mentioned in the previous section, one reason for a sense of detachment between viewing and speaking in the Google Cardboard–enhanced tasks was the dizziness issue. This can likely be resolved by having learners try out the goggles a second time to help them get used to them. Advising students to turn slowly when viewing the 360° pictures also helps to relieve dizziness. The dizziness issue was never reported or observed in the process of implementing the Unity-enhanced tasks.

While both Unity and Google Cardboard tasks have proven to have positive effects on students’ learning, motivation, and engagement, there exist several challenges in the refinement and adjustment of task design and planning. One of these challenges is associated with the limitations of technology devices, equipment, and applications. For the Unity-related tasks, one of the concerns raised by many students was scheduling. Since space in the media lab was limited, the entire class needed to be divided into two to three groups, which created scheduling inconveniences for students and instructors. Given the relatively expensive equipment that Unity-enhanced tasks require, it is not feasible to install a compatible, well-functioning computer and bring the more expensive goggles into a regular classroom. Because the Unity-enhanced tasks were mainly performed in a supervised setting with a VR specialist present, no major technical issues occurred in their implementation, whereas technical issues did arise when students used their cellphones to complete the Google Cardboard–enhanced tasks. The major problem lies in the fact that individual cell devices are hard to manage, navigate, and reconfigure when no technology support is available in a regular educational setting for immediate problem-solving. Some students did not even have smartphones, so the 360° pictures would not display correctly or at all on their phones. A recent update of the cellphone system also caused some unexpected issues, which prevented some students from viewing through Google Cardboard. A detailed and well-thought-out procedure, planned ahead of time, is needed to ensure that students are technologically well-prepared to perform the language tasks successfully. This relies on a better-conducted technology orientation beforehand.

Another challenge is monitoring students’ emotions and foci. It was observed that students’ overexcitement and emotions may distract them from performing tasks. VR is a new and innovative technology, and students may never have experienced it before. Although excitement contributes in part to a higher level of engagement, the new technology does cause distraction, especially the first time it is used. It was observed that students were better able to focus on the tasks the second time they used the technology, so the more the students try it, the less overexcited they feel. One possible solution could be to extend the technology training time to give students more time to adjust to the new surroundings and explore the virtual world before they actually start interactive tasks.

5.9 Conclusion

The current study is a preliminary investigation of students’ perceptions of and attitudes toward four communicative tasks facilitated by two types of interactive VR tools and apps, Unity and Google Cardboard, in an American university. Through incorporating focused communicative tasks, the VR-enhanced tasks proved to have a positive impact on fostering students’ self-perceived language learning and enhancing their motivation, engagement, and confidence in Chinese learning, especially speaking. Although these findings are very encouraging, the study does have limitations in several aspects.

One concern is simply that the current sample size is relatively small, and further research should include a greater number of language learners across a wider range of proficiency levels, either through cross-sectional studies or longitudinal studies. Currently only a very small percentage of studies—27 out of 167 studies—focus on language learning (Reisoğlu et al., 2017), and the sample sizes tend to be very small (Wang et al., 2019).

VR technology increasingly emerges as a vital tool with the potential to transform how language can be taught and learned, through careful selection and implementation of technology. However, there is ample room for expansion in both scope and depth in follow-up research. The current study focuses on proficiency levels from novice-high to intermediate-low. Another set of pedagogical experiments should involve solid intermediate and advanced learners to test different types of task design and effects.

The current study also included a very limited set of linguistic structures and functions, and expansion beyond this set is another area for further exploration. The development of more tasks involving different linguistic structures and functions, aided by the incorporation of VR design, whether the same as or different from that used in this study, is critically needed in foreign language teaching and learning in general, and in Chinese language teaching and learning specifically. It is well recognized that with the current limitations of VR technology in language teaching and learning, it seems feasible to design tasks to fulfill pedagogical goals, partially or completely, through acquisition or enhanced learning of spatial arrangement, object movement, and directive information. Hypothetically, as VR continues to advance and becomes more accessible, more language functions will be the foci of studies. Along the same lines, vocabulary learning and development in the four language skills may be potential areas to investigate in the near future.

In terms of methodology, an important consideration is that the current study is purely descriptive and does not include a control group. Future studies should include both an experimental group and a control group to compare learning in two different learning contexts: a traditional learning context without VR technology and a newly created learning context that incorporates VR technology. Both groups should have identical curricular and pedagogical objectives. Findings from this type of comparative study will contribute to experimental design and deepen our understanding of the roles VR technology can play in different dynamics and settings.

Future studies are also needed to further refine task design and link tasks closely to language gains. Tasks might be compared to a chameleon: nuances in design matter a great deal. In the current study, role play and information gap, which are among the most frequently implemented communicative tasks in foreign language classrooms, served as a useful starting point for simple task design. Given that VR is still emerging, simplicity in tasks remains key. Future developments in VR technology are expected to make its application to pedagogy more versatile and thus satisfy a greater diversity of language learning needs. These technologies hold great promise for the foreign language teaching and learning community.

Finally, the current study is limited in its analysis, which relies on learners’ self-perceptions; these may differ from actual, objective improvements in language learning. Given VR’s limited capacity and the constraints those limits impose on task design, the current study was unable to record learners’ language output for further analysis. Evidence-driven outcomes in language development, both short-term and long-term, are needed to supplement and strengthen self-perceived results. This capability awaits further advances in VR technology that will make it possible to record speech and save written language output for substantial investigation.

More generally, there is a great need for more studies of the possibilities and challenges of VR in foreign language teaching and learning (Lan, 2020). Future studies can place core emphasis on task design and implementation and on the application of 3D VR technology in close alignment with instructional goals. There is a great deal to explore. A fundamental question guiding second language researchers and practitioners has been how best to incorporate technology into language teaching. This question is all the more salient now, given the great promise of and rapid changes in VR technology.