1 Introduction

Augmented Reality (AR) is an education technology that contributes to contextualising language learning by offering a wide range of authentic experiences to learners across levels and languages. Different from virtual environments, AR superimposes and integrates computer-generated content over the real world environment (Wang et al., 2018), interactively connecting the real and the virtual world in three dimensions in real time (Azuma, 1997).

Despite the advantages that AR technology may bring to language education, few studies have discussed its theoretical underpinnings. Situated learning theory and (socio)constructivist learning theory have been used to understand these underpinnings (Dunleavy, 2014). Proposed by Brown et al. (1989) and based on Vygotsky’s (1978) sociocultural theory (SCT), situated learning theory underscores the importance of contexts and interactions in learners’ (language) learning. As culture and contexts are vital in language learning, especially in second/foreign language learning, adopting AR technology is thought to enable learners to immerse themselves in the target language and its culture, which largely contributes to their language development. Constructivists have highlighted the role of learners’ prior knowledge and sociocultural background in language learning. In an AR-enhanced language learning environment, learners not only immerse themselves in curated learning contexts, but also comprehend and construct the information they obtain from the mixed-reality multimedia inputs. During this process, they complete language tasks, interact with others, and finally apply what they have learnt to other situations.

Technology-enhanced language learning has stimulated the development of game-based learning, which also guides the design of AR-infused language learning products. The most popular AR game is undoubtedly Pokémon GO, which is perceived as particularly beneficial for developing learners’ vocabulary knowledge and digital storytelling abilities. However, research on the use of such games in language education has failed to account for users’ motivation to select AR apps. In this vein, Rauschnabel et al. (2017) point out that existing AR-related theories and practices lack an understanding of users. They therefore propose an adoption framework for mobile AR games based on uses and gratification theory (U&GT), which is one of the most widely applied theories in communication research (Rubin, 2002). U&GT addresses the question of why users select particular media rather than others, pointing to users’ proactive selection and use of media that satisfy their cognitive, social integrative, tension release, affective and personal integrative needs. Rauschnabel et al.’s (2017) model proposes that learners’ reactions to AR games and their intentions to integrate the games into their language learning are shaped by their evaluation and perceptions of the games’ various benefits (i.e. hedonic, emotional and social benefits), risks (i.e. data security and physical security) and social influences. From a theoretical perspective, this model extends our understanding of the factors that may determine learners’ acceptance and use of mobile AR games.

Rauschnabel et al.’s (2017) model encourages researchers to engage with learners’ perceptions, especially their evaluation of and satisfaction with mobile AR games. However, their model, as well as its empirical verification, predominantly focuses on learners’ self-directed uses of AR beyond the classroom. As language education involves not only learners but also other stakeholders such as teachers and schools, more attention should be paid to those stakeholders if AR technology is to be applied to other language learning settings (e.g. formal language learning in the classroom).

2 AR and Language Education in China

AR can be used in different language learning contexts for various learning purposes. It has the potential to overcome spatiotemporal and financial limitations, allowing learning to occur anytime and anywhere. According to Radu (2014), AR can improve learners’ understanding of content, enhance their memory, stimulate their learning motivation and peer collaboration and contribute to better task performance. Although AR has been implemented in a variety of disciplines in recent years, English language learning has been slow to embrace this emerging technology.

In China, the increasing ownership and use of mobile devices offer learners more non-formal language learning opportunities (Zhang & Pérez-Paredes, 2019). These learners increasingly enjoy participating in the mobile learning community. The Ministry of Education (MOE) in China highlights the affordances of AR technologies in education, clarifying that the development of AR may bring huge benefits and may even change the future education industry in China. In 2020, the Chinese government invested 5.76 billion dollars in developing AR and VR (Virtual Reality) technologies, accounting for 30% of the global market. According to the China Education Development Report published by Deloitte China, current AR technologies are mainly used to provide AR-enhanced early childhood education, K12 tutoring and higher vocational education.

In language learning, AR has been suggested to boost learners’ language learning motivation (Liu & Tsai, 2013), benefit learners’ contextualised and authentic language learning (Godwin-Jones, 2016), build a bridge for communication and cultural exchange and, more importantly, help learners develop their language learning abilities (Wu, 2019). As Zhang et al. (2020) argue, the ecology of AR in language education involves different stakeholders (e.g. teachers, designers, learners) in designing, developing and implementing AR-based language learning products. As teachers are crucial stakeholders, their perspectives and attitudes matter for an informed understanding of the contribution of AR to English language education. However, previous studies on AR predominantly aimed to explore the effect of applying AR in classrooms (Lee & Park, 2019). To the best of our knowledge, no study has addressed Chinese EFL teachers’ perceptions of and attitudes towards AR in language learning.

This study sets out to investigate Chinese EFL teachers’ perceptions of the potential of AR in EFL teaching and learning, and to discuss teachers’ expectations of AR-enhanced language learning. Theoretically, the design of this research draws on Traxler and Kukulska-Hulme’s (2016) context-aware mobile learning and Godwin-Jones’s (2016) conceptualisation of AR language learning and practice. Specifically, the study was designed to collect teachers’ perceptions and expectations of AR in relation to (1) language learning, (2) its effectiveness, (3) content, (4) curriculum and pedagogy and (5) future use. Our research aims to offer some insights into how Chinese EFL teachers understand the role of AR in English language learning, allowing researchers to theorise on how AR may contribute to both computer-assisted language learning (CALL) and language education, and to contribute to the conversations on how practitioners can best integrate AR into language classrooms.

We adopted a survey methodology to tap into Chinese EFL teachers’ cognitions. The overarching question of this study is: What are Chinese EFL teachers’ perceptions and expectations of Augmented Reality (AR) in English language teaching and learning?

There are two sub-questions:

  1. What are the differences in teachers’ perceptions between non-tertiary levels and the tertiary level?
  2. What are the differences in teachers’ perceptions between Eastern China and other regions in China?

3 The Study

3.1 Participants

In total, 153 teachers completed the questionnaire. There were 19 males and 129 females; five participants did not state their gender. Of the participants, 102 were teachers from non-tertiary levels (including kindergartens, primary schools, middle schools and high schools), 45 were from the tertiary level and six came from other educational institutions. In terms of school type, 100 teachers were from public institutions, 28 were from private institutions and 25 were serving in training institutions at the time of completing our survey. Categorised by region, 101 of them came from provinces and municipalities in Eastern China, and 52 were from other regions of China. Eastern China, as the main body of the national economy at present, plays a significant role in maintaining the sustained and rapid growth of the national economy and enhancing national economic competitiveness. In 2019, 24 of the top 30 cities in China by GDP were in Eastern China. We therefore assumed that learners, teachers and schools in Eastern China have more chances to access and embrace new technologies and integrate these technologies into their current language teaching and learning.

As AR is an emerging technology, only two of the surveyed teachers had used it in their English language teaching. Even when other purposes were considered (e.g. playing games like Pokémon GO and using a museum audio guide), only eight teachers had any experience at all of using AR.

3.2 Research Design

Data for this study were collected through an online questionnaire, designed and presented on Qualtrics (http://cambridge.eu.qualtrics.com). The questionnaire was piloted by six Chinese Ph.D. students at Cambridge University who specialised in EFL research and had previous EFL teaching experience in mainland China. They suggested shortening the text in the introduction section and adding a short video to introduce AR to the participating teachers. They also suggested adding an “I don’t know” option in order to reduce the likelihood of guessing. This research design received ethics approval from the Humanities and Social Sciences Research Ethics Committee of the University of Cambridge. The questionnaire was distributed to participants who had read the information sheet and agreed to take part in this study.

The questionnaire included 32 questions and took participants between 10 and 15 minutes to complete. All the questions were presented in both Chinese (participants’ L1) and English (participants’ L2), so that participants could choose the language they felt comfortable with before answering, and misunderstandings due to language could be reduced. Firstly, the questionnaire offered participants a short introduction to the study, including (1) the definition of AR, (2) its recent development and practice in different disciplines, (3) the implementation of AR in EFL and (4) the main aim of the survey. After that, participants were asked whether they would like to watch a one-minute introductory video about AR. Secondly, the questionnaire collected participants’ demographic information, including their gender, school level, school type, region, course(s) taught in school and previous AR experiences. Thirdly, the questionnaire focused on language learning motivation and experience (five questions), asking participants whether AR could (1) motivate their students to learn English, (2) lower learning anxiety, (3) make a connection between formal and informal learning and (4) stimulate learners’ self-directed learning. Fourthly, in terms of the effectiveness of AR (four questions), participants were asked to order their perceived effectiveness of AR in teaching and learning English listening, reading, writing, speaking, vocabulary and grammar. Beyond language learning, they were also asked whether AR could enhance the development of non-linguistic skills (e.g. memory, problem-solving, creativity, critical thinking, collaboration, communication, information, media and technology, as well as life and career). The fifth part (two questions) explored participants’ perceptions of how AR changes and improves EFL teaching content, asking them whether this technology could make EFL teaching and learning materials more comprehensive, authentic and contextualised. The sixth part (four questions) addressed whether AR could make the current curriculum more flexible, personalised, authentic and interactive. Finally, the questionnaire asked participants about their willingness to use AR in their future teaching.

3.3 Data Collection and Analysis

We collected the questionnaire data by using a convenience sampling approach. Invitation emails were sent to potential participants within our networking circles with the information sheet and the consent form. The teachers who agreed to participate in this study replied to our email with the signed consent form. Afterwards, we forwarded the survey link and instructions to them by email. In addition, for advancing the generalisability of the research data, we adopted a snowball sampling approach, asking the existing participants to help us recruit more subjects from among their acquaintances. We provided them with the information sheet and our email addresses for the potential participants to decide whether or not to participate.

The main body of the questionnaire uses a six-point Likert scale to capture teachers’ attitudes towards AR. Specifically, for all questions investigating teachers’ perceptions, the six points are: 1 = Strongly disagree, 2 = Disagree, 3 = Somewhat disagree, 4 = Somewhat agree, 5 = Agree and 6 = Strongly agree. Compared with an odd number of responses that may give participants an “easy out”, the six-point responses can yield groupings that are easier to understand and discuss. In addition, three ranking questions asked the participants to order a set of options in decreasing order of possibility. The first looked at whether they felt AR would help them teach English (1) listening, (2) reading, (3) writing, (4) speaking, (5) vocabulary and (6) grammar. The second tapped into their perceptions of AR in helping their students learn the skills mentioned above. The third asked them whether AR would help students learn some non-linguistic skills, including (1) memory, (2) problem-solving, (3) creativity, (4) critical thinking, (5) collaboration, (6) communication, (7) information, media and technology and (8) life and career. The numbers (1–6) stand for participants’ rankings of the different skills. The six-point Likert responses (from Strongly disagree to Strongly agree) were converted to a numerical scale from 1 to 6, and the numeric data were then imported into SPSS. For descriptive analysis, the mean and median of each question were calculated. In addition, to determine whether there were significant differences between school levels and regions, independent-samples t-tests were run, as the assumptions regarding outliers, normality and homogeneity of variance (Levene’s test, p > .05) were met.
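The coding and descriptive steps described above can be illustrated with a minimal Python sketch. It is not the authors' procedure (the study used SPSS); the function names and the example responses are hypothetical, and skipping answers outside the six scale labels (such as "I don't know") is an assumption, since the text does not say how such answers were handled.

```python
from statistics import mean, median

# Verbal labels of the six-point scale, as listed in the text, mapped to 1-6.
LIKERT_CODES = {
    "Strongly disagree": 1,
    "Disagree": 2,
    "Somewhat disagree": 3,
    "Somewhat agree": 4,
    "Agree": 5,
    "Strongly agree": 6,
}


def code_responses(labels):
    """Convert verbal Likert labels to numeric codes.

    Assumption: labels outside the six-point scale are skipped.
    """
    return [LIKERT_CODES[label] for label in labels if label in LIKERT_CODES]


def describe(labels):
    """Return the count, mean and median of the coded responses for one question."""
    scores = code_responses(labels)
    return {"n": len(scores), "mean": mean(scores), "median": median(scores)}


# Hypothetical responses to a single question (not data from the study).
example_responses = [
    "Agree", "Strongly agree", "Somewhat agree", "Agree", "I don't know", "Disagree",
]
summary = describe(example_responses)
print(summary)
```

Reporting the median alongside the mean, as the study does, is a common choice for ordinal Likert data, because the median does not depend on treating the distances between scale points as equal.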

4 Results and Discussion

4.1 Theme 1: Learning Experience and Motivation

Table 1 shows the results of the five questions regarding language learning experience and motivation. The mean of the second question, which asks teachers whether AR can lower students’ language learning anxiety, is only 2.46, which is relatively negative. Teachers’ perceptions of the other four questions are between “somewhat agree” and “agree”, with means of 4.88, 4.93, 4.49 and 4.84, respectively.

Table 1 Descriptive statistics of teachers’ perceptions of AR—Theme 1

Anxiety about learning English is common among Chinese EFL learners at different education levels (Cui, 2011; Tang, 2005). Learners, especially low-proficiency learners, often worry about their ability to answer teachers’ questions appropriately in class, communicate freely with others and express their ideas adequately. Influenced by Confucian culture in China, learners worry about criticism from their teachers and peers (Cui, 2011), feel sensitive about others’ evaluations and care about saving face (Wu, 2017).

In addition to descriptive analyses, independent-samples t-tests were run to determine if there were differences in the teachers’ perceptions of AR in language learning experience and motivation between the tertiary and non-tertiary groups. Except for the second question (i.e. Whether AR could lower students’ language learning anxiety) where the mean of the tertiary group is slightly higher (mean = 2.60, compared with 2.38 in the non-tertiary group), teachers in the non-tertiary group were more positive regarding the other four statements. In particular, there is a statistically significant difference between the two groups regarding AR in the third question (i.e. Whether AR could enhance and enrich students’ language learning) (mean difference = .417, p = .024). However, no statistically significant difference was found in terms of the other four questions between the tertiary and non-tertiary groups, p > .05. Besides, there was no significant difference between the two regional groups, p > .05.
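As a rough illustration of the group comparison reported above, the sketch below computes the mean difference and the equal-variance (Student's) independent-samples t statistic by hand. It is a minimal sketch only: the function name and the two score lists are hypothetical, the study's own tests were run in SPSS, and the p-value (which requires the t distribution) is left to a statistics package.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(group_a, group_b):
    """Student's independent-samples t statistic, assuming equal variances."""
    n_a, n_b = len(group_a), len(group_b)
    mean_diff = mean(group_a) - mean(group_b)
    df = n_a + n_b - 2
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    std_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return mean_diff, mean_diff / std_error, df


# Hypothetical coded scores (1-6) for two groups; not data from the study.
non_tertiary = [5, 6, 5, 4, 6, 5]
tertiary = [4, 5, 4, 5, 3, 4]
diff, t_value, df = pooled_t_statistic(non_tertiary, tertiary)
print(f"mean difference = {diff:.3f}, t({df}) = {t_value:.3f}")
```

The pooled form is used here because the homogeneity-of-variance assumption is reported as met; where that assumption fails, Welch's unequal-variance version would be the usual alternative.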

4.2 Theme 2: Effectiveness

Concerning the question of whether AR would help students achieve their language learning aims in a balanced manner, the mean is 4.48 and the median is 5.00, demonstrating that teachers’ attitude is between “somewhat agree” and “agree”. Independent-samples t-test results showed that there was a significant difference between the tertiary and non-tertiary groups (mean difference = .341, p = .045). However, the difference between the two regional groups was not statistically significant, p > .05.

Regarding the effectiveness of using AR in enhancing learners’ linguistic skills, teachers believed that AR would be more helpful in teaching and learning listening (median = 2.00), speaking (median = 2.00) and vocabulary (median = 3.00). In comparison, writing (median = 5.00) and grammar (median = 6.00) are the two aspects that AR may not be able to help with. In previous studies, too, listening, speaking and vocabulary have attracted researchers’ and educators’ attention. Many studies have attempted to explore the effectiveness of AR in helping students enhance their listening and speaking skills as well as vocabulary knowledge. For example, an AR-based context-aware ubiquitous learning environment designed by Liu (2009), the Handheld English Language Learning Organization (HELLO), seeks to enhance learners’ listening and speaking skills. Santos et al. (2016) see AR as a kind of multimedia, situated in authentic environments, that promotes better word retention and improves learners’ attention.

In terms of non-linguistic skills, AR was perceived as beneficial for cultivating students’ memory and creativity skills, especially compared with career skills (Table 2). Previous studies also provide empirical evidence of the effectiveness of AR in facilitating some non-linguistic skills (e.g. memory: Hou & Wang, 2013; creativity: Yilmaz & Goktas, 2017; collaboration: Szalavári et al., 1998). However, very few of them were conducted in the Chinese context, and fewer still focused on Chinese EFL learners.

Table 2 Descriptive statistics of teachers’ perceptions of AR—Theme 2

As regards tertiary and non-tertiary teachers, independent-samples t-tests showed non-significant differences with regard to teaching and learning different linguistic skills (p > .05). As regards teaching and learning other non-linguistic skills, teachers at the non-tertiary level ranked “creativity” higher than their counterparts at the tertiary level (mean difference = −.638), while tertiary level teachers tended to believe that using AR could facilitate students’ learning of information, media and technology-related skills (mean difference = 1.263). According to the independent-samples t-test results, these differences were statistically significant (p = .0048 and .005, respectively). No significant differences were found pertaining to other non-linguistic skills. When the teachers were categorised by region, the independent-samples t-tests did not show any statistically significant differences regarding the effectiveness of AR (including teaching and learning linguistic skills as well as learning non-linguistic skills).

4.3 Theme 3: Content

In many EFL contexts like China, English language teaching and learning is often limited to the classroom, where teachers still depend on traditional English teaching approaches (e.g. reading aloud, recitation and repetition), and little interaction and few authentic communicative opportunities are provided for learners. AR’s potential to integrate multimodal, authentic and contextualised stimuli into language teaching and learning materials is likely to disrupt the status quo of classroom-based EFL across levels. In this study, teachers’ attitude is generally positive regarding the content supported and afforded by AR. More specifically, they agreed that, with the assistance of AR, they could integrate multimodal stimuli, such as sounds, images and videos, into the current English classroom in order to make language teaching and learning materials more comprehensive (Mean = 5.18, Median = 5.00). They also believed that by using AR, teachers and students could access more authentic and contextualised language learning input (Mean = 5.11, Median = 5.00, Table 3).

Table 3 Descriptive statistics of teachers’ perceptions of AR—Theme 3

Some previous studies conducted in other language learning contexts have verified the benefits of AR in providing learners with multimedia and multimodal language learning inputs. For instance, Santos et al. (2016) drew on multimedia learning theory, designing a handheld AR system with situated multimedia (e.g. text, image, sound and animation) stimuli for vocabulary learning; students in Lee and Park’s (2019) study created multimodal gamified digital stories on a location-based AR application, which allowed them to practice the language in real contexts and share their experiences with others.

The independent-samples t-test results showed that non-tertiary teachers were slightly more positive about the content assisted by AR technology (mean = 5.20 and 5.14) than tertiary teachers (mean = 5.11 and 4.98), but there was no significant difference between the two groups (p > .05). The divergences between teachers in Eastern China and other regions are minimal (mean = 5.19 and 5.12 for Eastern China; mean = 5.12 and 5.14 for other regions), without statistically significant differences (p > .05).

4.4 Theme 4: Curriculum and Pedagogy

As Kerawalla et al. (2006) suggest, AR applications should improve interactivity and flexibility, enabling teachers to adapt the applications to the needs of individual students and empower students to regulate their learning. When considering adopting AR into classroom practice, it is necessary to make the content more flexible and personalised for students. In terms of curriculum and pedagogy, teachers, in general, agreed that using AR could encourage a more flexible, personalised, authentic and interactive curriculum (mean = 4.95, 4.95, 4.89 and 5.12, respectively, Table 4).

Table 4 Descriptive statistics of teachers’ perceptions of AR—Theme 4

Non-tertiary teachers held a slightly more positive attitude than their tertiary counterparts. An independent-samples t-test showed non-significant differences (p > .05). Comparing Eastern China and other regions, the differences between the groups were also small, and no significant differences (p > .05) were found.

Some scholars have attempted to use AR to engage students with an interactive and authentic curriculum in different disciplines (e.g. STEM: Hobbs & Holley, 2015; healthcare: Carlson & Gagnon, 2016; library instruction: Chen & Tsai, 2012). Nevertheless, we have not found studies conducted in the language education field to understand whether and how AR could make the current curriculum more flexible, personalised, authentic and interactive, mitigating the problems of English classrooms. This is an area that deserves further attention.

4.5 Theme 5: Future Use

38% of the teachers in the survey said that they were willing to use AR and integrate this emerging technology into their current teaching. Nearly 60% of teachers were still unsure about whether or not they would use AR. By school level, 39% of non-tertiary teachers planned to use AR in the near future while 58% said “maybe”, and only 2% said they would not use this technology. For tertiary teachers, 35% stated that they would use it, while only 4% held the opposite view. A large number of participants (27.6%) took a wait-and-see approach.

The differences between Eastern China and other regions regarding the future use of AR are not significant. 38% of teachers in Eastern China and 37% of teachers in other regions planned to use AR, while 58% and 60% of teachers in the two regional groups, respectively, were still unsure. Almost none of the participants were against the use of AR in teaching.
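The percentage breakdowns reported for each group amount to simple response shares. The sketch below shows one way to compute them; the function name, the answer labels and the counts are hypothetical and are not taken from the survey data.

```python
from collections import Counter


def response_shares(answers):
    """Return the percentage of each answer, rounded to the nearest whole number."""
    counts = Counter(answers)
    total = len(answers)
    return {answer: round(100 * count / total) for answer, count in counts.items()}


# Hypothetical answers from one group of 50 teachers (not data from the study).
group_answers = ["Yes"] * 19 + ["Maybe"] * 30 + ["No"] * 1
shares = response_shares(group_answers)
print(shares)
```

Because each share is rounded separately, the percentages for a group will not always sum to exactly 100.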

An open-ended question asking participants to specify the reasons why they would or would not use AR in the near future was included in the survey. Teachers stated that they believed that using AR could make language teaching and learning more efficient, interesting and relaxing. Compared with traditional “spoon-feeding” classroom teaching and rote learning approaches, these teachers believed that AR-enhanced language teaching is more authentic, interactive and collaborative, which could lower students’ anxiety and motivate them to achieve their language learning goals. They also believed that AR, as a type of mobile technology, could allow their students to learn a language anywhere and at any time.

However, many teachers felt worried about their abilities to use AR, especially how to integrate it into the current curriculum. They also considered the cost of AR equipment and did not believe that their schools would have the funding to buy the equipment for their students. Besides, the few teachers who stated that they would not use this technology expressed their concern that they were unfamiliar with AR.

5 Conclusions and Suggestions for Future Research

We have examined Chinese EFL teachers’ perceptions and expectations of implementing AR in their English language teaching. As a first attempt to probe into teachers’ perceptions of AR, this study has made a contribution to our understanding of how AR affordances may impact ELT in China across levels and socioeconomic contexts. Our main findings suggest that while Chinese ELT teachers are not familiar with the uses of AR, they believe that the implementation of AR could (1) enhance their learners’ language learning experience and motivation, (2) benefit the teaching and learning of linguistic and non-linguistic skills, (3) enrich language learning content and (4) make the current curriculum more flexible, personalised, authentic and interactive. Similarly, teachers are willing to integrate AR into their English language teaching in the near future. This seems to be a tendency that spreads across levels, institutions and all regions in China. Non-significant differences have been found between tertiary and non-tertiary institutions regarding the four above-mentioned themes. Categorised by region, despite the gaps in the level of economic and technological development, the divergences in teachers’ perceptions and expectations of AR implementation between Eastern China and other regions of China are also minimal, which we did not expect.

At a micro level of analysis (Douglas Fir Group, 2016), EFL teachers’ perceptions and expectations about AR may derive from their previous experiences of adopting AR in other contexts (e.g. in daily life) and/or their perceptions of using other mobile technologies and devices (e.g. using iPads in classroom-based English teaching). At the macro and meso levels, our findings suggest that the state, local governments and schools predominantly focus on investing in and implementing AR in other subjects (e.g. STEM), which may prevent language learners from interacting with more personalised contexts of use and from depending less on declarative knowledge and grammar training. Thus, we are far from a situation where AR is tested in current English language education. The implementation and use of new technologies have been said to make education more inspiring, motivating and meaningful (Singhal et al., 2012). However, it is necessary to adopt evidence-based approaches to weigh up the benefits and limitations of a technology such as AR in the Chinese EFL context before discussing how to make use of this technology to facilitate English language education.

Future research should look at language learners, paying more attention to learners’ interaction with AR-based English language learning resources from a dynamic and multifaceted perspective. In light of the theoretical model proposed by Zhang and Pérez-Paredes (2019) on the use of mobile English learning resources among Chinese EFL learners, future studies could examine how learners filter, select, use and evaluate AR-based resources. From a sociocultural perspective, given the important role of other key stakeholders (e.g. designers) (Zhang et al., 2020), future research could also investigate the impact of communities on the implementation and effects of AR. Methodologically, qualitative approaches will provide us with an in-depth understanding of how different stakeholders understand AR’s contribution to SLA in instructed contexts.