1 Introduction

Culture is defined as the history, customs, etc. of a particular society, and it is formed over many years [1]. Understanding others’ cultures is very important in today’s global society [2, 3]. It is therefore vital for educators to teach learners to understand and value other cultures. Furthermore, learners need to attain a certain level of global competence to understand the world they live in and how they fit into it.

Cultural convergence theory explains cross-cultural understanding [4, 5]. According to this theory, cross-cultural understanding takes place through communication and information sharing among learners from different cultures, when they reach a mutual understanding of each other’s cultures. That is, the experiences and insights of other cultures that learners communicate and share enable them to expand their cultural awareness and behaviors [4, 5].

In the cross-cultural learning process, learners acquire knowledge and skills related to different cultures, and they also absorb new attitudes and values as a result of experience and participation [6]. Traditionally, cross-cultural education in school is based on textbooks and an instructor’s knowledge and experiences. However, neither source can provide a thorough and authentic cross-cultural education [7]. Firstly, textbooks are often biased and mostly present the views of the dominant class. Secondly, teachers may be biased towards other cultures, or they may have only limited cross-cultural knowledge and experiences. Therefore, it is suggested that cross-cultural programs need to be administered as united, connected events and as a knowledge-building continuum [6, 7]. The literature highlights the following essential learning behaviors that lead to cross-cultural understanding [6]: (a) building relationships: interacting with members of the host culture regularly; (b) valuing people of different cultures: expressing interest in and respect for the host culture; (c) listening and observing: spending time observing, reading about, and studying the host culture; (d) coping with ambiguity: understanding ambiguous situations and making sense of new experiences; (e) translating complex information: translating personal thoughts into the language of the host culture.

To facilitate these essential behaviors, various learning activities have been proposed in the literature. Self-introduction is one activity that enables learners to become acquainted with one another and their cultures [8, 9]. This activity raises the comfort level in a classroom and encourages more social interaction among learners [10]. Self-introduction also helps learners to identify and examine their own and their peers’ cultural values [11]. Creating media content and sharing it with others is another activity. It enables peer-to-peer learning, diversification of cultural expression, a more empowered cross-cultural understanding, and respect for multiple perspectives across diverse communities [12, 29]. In addition, during this activity learners are able to discern important concepts in shared content and then synthesize them with information from other sources [12]. The performance and appropriation activity enables learners to adopt alternative identities and to sample and remix media content meaningfully for the purpose of improvisation and discovery [12]. Through performance and appropriation, learners from various cultures can introduce their own culture, share their ideas, artifacts, and perspectives about it, and experience their peers’ foreign culture [7]. Finally, the activity of reflecting on a foreign culture enables learners to share their reflections and experiences with peers. This activity also allows learners to gain a better cross-cultural understanding and to recognize the strengths and weaknesses of a cross-cultural project [9].

It is suggested that learning activities be framed in a specific topic [30]. Therefore, cooking was selected as the topic for our project. Cooking is defined as the preparation of food for consumption [13] and is associated with a specific culture, environment, and history. It is therefore distinctive in terms of ingredients, methods, and dishes [14]. Following this notion, the term “National Cuisine” was proposed [15]; it refers to food cultures that are practiced, in terms of production and consumption, in specific ethnic communities and places. “Stinky tofu” is one distinct example of Chinese national cuisine. It is a kind of fermented tofu with a very strong, unpleasant odor. Stinky tofu was a favorite food of the Chinese from the Wei Dynasty to the Qing Dynasty [16]. Despite its unpleasant smell, many people develop an appetite for stinky tofu, and it is now a popular local food in Taiwan and many regions of China [16].

We believe that learners from different countries may understand each other’s cultures better if they perform the learning behaviors discussed in [6] by participating in the learning activities proposed in [7–12]. Furthermore, if learning activities are framed in a specific topic, such as “National Cuisine,” we assume that the topic will draw learners’ attention, stimulate their motivation, and make them more interested in cross-cultural learning [30].

However, how to ensure that learners from different cultures who speak different languages communicate and share culture-related information with each other is a question that concerns most educators and researchers. One possible solution is the application of computer-assisted technologies. For example, speech-to-text recognition (STR) technology synchronously transcribes text streams from speech input [17]. According to related studies, STR technology is a potential learning tool, and it has been applied successfully in many educational studies [18, 19, 31]. For example, this technology was used to assist learners with cognitive or physical disabilities and those who attend speeches given in their non-native languages [20, 21]. Computer-aided translation (CAT) allows texts to be translated into different target languages [22]. Related studies suggest that CAT technology has great potential to aid learning, especially foreign language learning. For example, CAT was applied to help learners write texts in the target foreign language and correct grammatical and lexical errors in those texts [23]. In another study, EFL learners held an online discussion in which they used CAT to translate and search for appropriate words to express their opinions and ideas and to check grammar and spelling [24]. Therefore, in this study we applied speech-to-text recognition with computer-aided translation to facilitate cross-cultural understanding between learners from two different cultures who do not share a common language of communication. The speech-to-text recognition system generated text from a speaker’s voice input in one language, and the computer-aided translation system simultaneously translated it into another. We aimed to explore how applying these technologies to an educational project may facilitate learners’ cross-cultural understanding.

2 Method

Ten students aged 14 to 18 voluntarily participated in an online cross-cultural educational project. Six participants were Chinese native speakers from Taiwan and four were Russian native speakers from Uzbekistan. None of the participants had experience with STR, but all had two to three years of experience with CAT. In addition, participants had more than five years of computer and Internet experience. According to the participants, none of them had any prior knowledge of the food introduced by their foreign counterparts or of the related culture. Moreover, the participants had never participated in any cross-cultural project.

Two instructors experienced in online cross-cultural educational projects, one Chinese native speaker and one Russian native speaker, guided participants during the project. At the beginning, the instructors explained to participants all steps of the project and how to communicate information to foreign counterparts more efficiently, in order to enhance the counterparts’ cross-cultural understanding and to avoid culture-related misunderstandings and miscommunication. In addition, the instructors trained participants to use speech-to-text recognition and computer-aided translation. Participants then practiced using speech-to-text recognition and computer-aided translation to generate texts in their native language and simultaneously translate them into another language. During the project, an instructor guided participants in using the technologies and offered instant support for technology-related questions.

We aimed to enhance participants’ cross-cultural understanding through participation in an online cross-cultural project. The project was carried out over four weeks. In the first week, participants were asked to introduce themselves, say where they are from, and describe their interests. In the second week, participants introduced their favorite local food and its recipes. In the third week, participants cooked food according to the recipes introduced by participants from the other culture. Finally, in the fourth week, all participants shared their cooking experiences and reflected on what they had learned about the related culture.

STR and CAT systems were introduced to participants to assist their participation in the project. We employed the Android-based Google voice recognition system as the STR tool and the Google Translate system as the CAT tool. Figure 1 shows the communication flow among participants. Participants from Taiwan spoke into a microphone, and the STR system transcribed their speech into Chinese texts. The texts were then translated from Chinese into Russian. After that, the translated texts in Russian were posted online so that participants from Uzbekistan could read them. Participants from Uzbekistan communicated in the same way: their speech in Russian was transcribed into texts, and the texts were translated into Chinese. The texts in Chinese were then posted online for participants from Taiwan to read.

Fig. 1. Communication flow among participants

It is suggested that texts produced by STR and CAT systems may contain mistakes and ambiguities [17, 25, 31]. Therefore, the two instructors corrected inaccuracies in the texts produced by either STR or CAT and prepared error-free texts for participants.
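The communication flow described above, including the instructors’ correction of STR and CAT output, can be sketched in a few lines of Python. This is only an illustrative sketch: the names `relay_message`, `Post`, `Transcribe`, `Translate`, and `Review` are hypothetical, and the three callables are placeholders for the STR tool, the CAT tool, and the instructors’ manual editing rather than calls to any real service.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical type aliases: these callables stand in for the STR tool,
# the CAT tool, and the instructors' manual correction step.
Transcribe = Callable[[bytes, str], str]     # (audio, language) -> text
Translate = Callable[[str, str, str], str]   # (text, source, target) -> text
Review = Callable[[str], str]                # human correction of a text


@dataclass
class Post:
    """A message as it would be posted online for the other group."""
    author: str
    source_text: str       # corrected STR output in the speaker's language
    translated_text: str   # corrected CAT output in the readers' language


def relay_message(author: str, audio: bytes, source_lang: str, target_lang: str,
                  transcribe: Transcribe, translate: Translate,
                  review: Review) -> Post:
    """Speech -> STR text -> correction -> CAT text -> correction -> post."""
    str_text = review(transcribe(audio, source_lang))
    cat_text = review(translate(str_text, source_lang, target_lang))
    return Post(author, str_text, cat_text)
```

The same function would serve both directions of the flow; only the language arguments change (for example, Chinese to Russian for participants from Taiwan and Russian to Chinese for participants from Uzbekistan).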

The data for our analysis were collected from participants’ online communication (i.e. their self-introductions, introductions of local food and recipes, experiences of cooking food, and reflections on learning the related culture) and from one-on-one semi-structured interviews. These data enabled us to explore and understand the process of the cross-cultural educational project, to measure participants’ cross-cultural understanding, and to learn about participants’ opinions on facilitating cross-cultural understanding with the project:

Understanding how the project was carried out. We adopted a concept as the coding unit. Text segments that met the criteria for providing the best research information were highlighted and coded. Codes were then sorted to form categories; codes with similar meanings were aggregated. The established categories produced a framework for reporting research findings. Three raters were involved in the coding process, and major differences in the coding were resolved through discussion among the raters and by consensus. Cohen’s kappa was adopted to evaluate the inter-rater reliability; the result exceeded 0.90, which indicates high reliability.

Measuring cross-cultural understanding. We analyzed participants’ reflections on learning the related culture. In particular, we evaluated text segments extracted from their reflections that represented their cross-cultural understanding. The evaluation included three dimensions: (1) a foreign food, (2) related history, and (3) traditions. We carried out the evaluation by employing Anderson and Krathwohl’s [26] taxonomy. Specifically, we employed the following two levels of the taxonomy: (1) Remember: retrieve relevant knowledge from long-term memory, and (2) Understand: construct meaning from instructional messages, including oral, written, and graphic communication. A score of “1” was given if participants remembered but did not understand how to cook a foreign food and its related history and traditions, whereas a score of “2” was given if participants both remembered and understood them. Participants received a score of “0” if they neither remembered nor understood how to cook a foreign food or its related history and traditions. Three raters were involved in the evaluation process. The inter-rater reliability coefficients among them were calculated using Cohen’s kappa. The mean inter-rater reliability among the three raters exceeded 0.90, which demonstrates excellent agreement beyond chance.
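The scoring rule above can be written out as a small function. The sketch below is only an illustration under stated assumptions: the names `score` and `score_reflection` are hypothetical, scoring each of the three dimensions separately is our assumption, and the case of understanding without remembering, which the rubric does not define, is assigned 0 here.

```python
from typing import Dict, Tuple

DIMENSIONS = ("food", "history", "traditions")


def score(remembers: bool, understands: bool) -> int:
    """Return 2 for Remember and Understand, 1 for Remember only, else 0."""
    if remembers and understands:
        return 2
    if remembers:
        return 1
    # Assumption: "understands without remembering" is not covered by the
    # rubric, so it is treated like "neither" and scored 0.
    return 0


def score_reflection(judgments: Dict[str, Tuple[bool, bool]]) -> Dict[str, int]:
    """Score one reflection per dimension (per-dimension scoring is assumed)."""
    return {dim: score(*judgments[dim]) for dim in DIMENSIONS}
```

For example, a rater who judges that a participant remembers and understands the recipe but only remembers the related history and traditions would pass `{"food": (True, True), "history": (True, False), "traditions": (True, False)}` to `score_reflection`.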

Exploring participants’ project-related opinions. At the end of the project we carried out in-depth, one-on-one semi-structured interviews with all students and the instructors. The interviews contained open-ended questions in which students and instructors were asked about their experiences during the project and their opinions about facilitating cross-cultural understanding with the project. Each interview took approximately 30 min. Interview content was audio-recorded with participants’ permission and then fully transcribed for analysis. Transcribed texts were coded and categorized to produce a framework for reporting research findings. Three raters were involved in the coding process, and major differences in the coding were resolved through discussion among the raters and by consensus. Cohen’s kappa was adopted to evaluate the inter-rater reliability; the result exceeded 0.90, which indicates high reliability.
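Cohen’s kappa is defined for a pair of raters, so with three raters one common approach is to compute kappa for each pair and report the mean. The following sketch illustrates that calculation; averaging over pairs is our assumption about how a single coefficient could be obtained for three raters, and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters who labelled the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must label the same non-empty item list")
    n = len(ratings_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the raters' marginal label proportions.
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


def mean_pairwise_kappa(raters):
    """Average Cohen's kappa over every pair of raters."""
    pairs = list(combinations(raters, 2))
    return sum(cohen_kappa(a, b) for a, b in pairs) / len(pairs)
```

Here each element of `raters` is one rater’s list of codes or scores, in the same item order for all raters.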

3 Results and Discussion

The data analysis revealed that the project was implemented in four steps. In the first step, participants introduced themselves: they mentioned where they are from, what they like to do, and what their favorite local food is. In the second step, participants introduced recipes of their local food. In addition, participants mentioned the history and traditions related to that food. In the third step, participants cooked the food that was introduced by their foreign peers. In the fourth step, participants reflected on their cooking experiences and on what they had learned about the related history and traditions.

According to the participants, none of them had any prior knowledge of the food they cooked during the project or of the history and traditions related to it. After the evaluation, we found that students could both remember and understand how to cook a foreign food and its related history and traditions. According to Anderson and Krathwohl [26], the Remember cognitive level represents the ability to retrieve relevant knowledge from long-term memory, while the Understand level represents the ability to grasp the meaning of the learning material. Our evaluation results show that participants could recall, interpret, summarize, compare, and explain what they cooked and the related culture and traditions. This may suggest that cross-cultural learning took place through the project.

In the interviews, participants stated that all steps of the project were useful for their cross-cultural understanding. For example, the self-introduction enabled them to become acquainted with each other. In addition, participants could learn about each other’s interests, hobbies, and favorite food and notice some cultural differences between themselves and other participants. When participants posted recipes of their local food or cooked food using recipes posted by peers, they could learn more about both their peers’ and their own food and cultures. To better understand their peers’ and their own food and cultures, participants also searched for additional information on the Internet. The reflection step enabled participants to reflect on their cooking experiences and the related cultures. In addition, participants could compare their local food and related culture with those presented by peers and find similarities and differences.

Analysis of the interviews with the instructors confirmed that participating in the project was beneficial for participants’ cross-cultural understanding. According to the instructors, students provided as well as received useful information related to their own or their peers’ culture during the project. All presented information about food, history, and traditions was well understood by participants from both countries.

Self-introduction is a necessary activity for students to become acquainted with one another and their cultures [8, 9]. Through introductions, students identify and examine some of their own and peers’ cultural values [11].

Students acknowledged that our approach made it easier to communicate with peers with whom they had no common language and whom they had never met before. According to the students, our approach helped to ease anxiety and inhibition, and it encouraged them to disclose personal information more frequently and more effectively than in face-to-face interaction.

In this study, the speech-to-text recognition (STR) system generated texts from speech input and the computer-aided translation (CAT) system translated the texts into another language. Table 1 shows the accuracy rates of the STR and CAT systems for text generation and translation with respect to Chinese and Russian. According to the data, self-introduction texts were generated with a 99 % accuracy rate when spoken in Chinese and with a 100 % accuracy rate when spoken in Russian. Recipes of local food were generated with a 91 % accuracy rate when spoken in Chinese and with a 96 % accuracy rate when spoken in Russian. Spoken reflections were generated with a 94 % accuracy rate in Chinese and a 98 % accuracy rate in Russian. One reason for the slight difference between the accuracy rates of STR-texts from Chinese and Russian input (especially 5 % in Step 2 and 4 % in Step 3) is that the participants from Uzbekistan practiced with the STR and CAT technologies more than the participants from Taiwan. It is suggested that STR technology-based teaching and learning activities be designed in such a way that users, i.e. instructors and students, are encouraged to use the technology more regularly [18, 19]. With such an approach, users are able to identify the strengths and limitations of STR through real experience. For example, after noticing that STR technology generates text with errors when speech is too fast or too slow, not fluent, or in a low voice, speakers try to adapt to the recognition capacity of STR. That is, speakers start to speak with moderate speed and volume, less spontaneity, and better fluency.

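Table 1. STR and CAT accuracy rates (%)

Step                     STR Chinese   STR Russian   CAT Chinese→Russian   CAT Russian→Chinese
1 (self-introductions)   99            100           89                    88
2 (recipes)              91            96            76                    74
3 (reflections)          94            98            82                    80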
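The procedure used to compute the accuracy rates is not detailed here. One common way to obtain such a rate, shown below purely as an illustrative assumption and not as the method used in this study, is to compare the system output with a corrected reference text and count the edits (substitutions, insertions, and deletions) needed to turn one into the other. The function names and the `unit` parameter are hypothetical; word units suit whitespace-delimited text such as Russian, and character units suit Chinese.

```python
def edit_distance(reference, output):
    """Levenshtein distance between two token sequences."""
    previous = list(range(len(output) + 1))
    for i, ref_token in enumerate(reference, start=1):
        current = [i]
        for j, out_token in enumerate(output, start=1):
            cost = 0 if ref_token == out_token else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def accuracy_rate(reference: str, output: str, unit: str = "word") -> float:
    """Percentage of reference tokens rendered correctly (floored at 0)."""
    if unit == "word":
        ref_tokens, out_tokens = reference.split(), output.split()
    else:  # character units, with spaces ignored
        ref_tokens = list(reference.replace(" ", ""))
        out_tokens = list(output.replace(" ", ""))
    if not ref_tokens:
        raise ValueError("reference text must not be empty")
    errors = edit_distance(ref_tokens, out_tokens)
    return max(0.0, 1 - errors / len(ref_tokens)) * 100
```

Under this definition, an output that gets one word wrong in a four-word reference would have an accuracy rate of 75 %.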

According to our results, STR technology generated texts from Chinese and Russian input with the lowest accuracy rate in Step 2. One reason that explains this finding is that the sentences used to introduce local food and the related culture were longer than the sentences in which students introduced themselves (Step 1) or reflected on their experiences (Step 3). Another reason is that sentences in Step 2 contained specific names of food ingredients or terminology related to history and culture that the STR technology could not recognize.

After the STR-texts were ready, and before the CAT process, we edited them to make them 100 % accurate in order to increase the CAT accuracy rate. That is, all errors in the STR-texts were corrected, and punctuation marks, such as commas and periods, were added. According to the table, STR-generated self-introduction texts were translated from Chinese into Russian with an 89 % accuracy rate and from Russian into Chinese with an 88 % accuracy rate. STR-generated recipe texts were translated from Chinese into Russian with a 76 % accuracy rate, while recipe texts in Russian were translated into Chinese with a 74 % accuracy rate. STR-generated texts of reflections were translated from Chinese into Russian with an 82 % accuracy rate and from Russian into Chinese with an 80 % accuracy rate.

The difference between the CAT accuracy rates for Chinese and Russian was only 1–2 %. The lowest CAT accuracy rate occurred in Step 2 for both languages (74 and 76 %). The low CAT accuracy rate was perhaps due to the same reasons mentioned earlier: the sentences were longer, and they contained specific names of food ingredients and terminology related to history and culture. According to researchers in the field [27, 28], current CAT technologies have not been able to deliver high-quality translations. It was further argued that CAT technologies produce better translations when confronted with short sentences than with longer and more complicated ones because of the highly limited linguistic context [28]. That is, the longer the sentence is, the more likely it is that the CAT technology will be led astray by the complexities of the source and target languages. Researchers [17–19, 21] argued that STR- or CAT-texts with only a reasonable accuracy rate are acceptable and useful for students. That is, only texts with an accuracy rate of 75–85 % or higher [21] can enhance teaching and learning. Following this suggestion, we may conclude that all STR- and CAT-texts in this study were acceptable and useful for our participants except the recipe texts translated from Russian into Chinese (74 % accuracy rate). To address the low accuracy rate of CAT-texts, several approaches are proposed in the literature. One of them is for the instructor or students to correct errors in CAT-texts (e.g. correct misrecognized words, insert missed words, or delete superfluous wording) [27, 28]. Therefore, we revised all CAT-texts into 100 % accurate texts so as to make them useful and meaningful for teaching and learning.
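The acceptability check described above can be reproduced with the accuracy rates reported in this section. The sketch below assumes the lower bound of the 75–85 % range as the cut-off; the names `MIN_ACCEPTABLE`, `ACCURACY`, and `below_threshold` are hypothetical.

```python
MIN_ACCEPTABLE = 75  # assumed cut-off: lower bound of the 75-85 % range [21]

# (system, language or direction, text type) -> reported accuracy rate in %
ACCURACY = {
    ("STR", "Chinese", "self-introductions"): 99,
    ("STR", "Russian", "self-introductions"): 100,
    ("STR", "Chinese", "recipes"): 91,
    ("STR", "Russian", "recipes"): 96,
    ("STR", "Chinese", "reflections"): 94,
    ("STR", "Russian", "reflections"): 98,
    ("CAT", "Chinese->Russian", "self-introductions"): 89,
    ("CAT", "Russian->Chinese", "self-introductions"): 88,
    ("CAT", "Chinese->Russian", "recipes"): 76,
    ("CAT", "Russian->Chinese", "recipes"): 74,
    ("CAT", "Chinese->Russian", "reflections"): 82,
    ("CAT", "Russian->Chinese", "reflections"): 80,
}


def below_threshold(rates, minimum=MIN_ACCEPTABLE):
    """Return the keys of all texts whose accuracy falls below the cut-off."""
    return sorted(key for key, rate in rates.items() if rate < minimum)
```

With these values, `below_threshold(ACCURACY)` contains a single entry, the recipe texts translated from Russian into Chinese.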

Based on our results, we would like to highlight the pedagogical usefulness of STR and CAT systems for cross-cultural learning. First, our approach can facilitate communication between participants with no common language. Participants do not need to rely on translators but can communicate independently. Moreover, there is no limit on the amount of information they communicate, i.e. it can be small or large. Second, through communication using our approach, participants learn about and understand a foreign culture in an authentic context, as they communicate with the hosts of that culture. Third, participants not only receive information about the foreign culture from its hosts but are also able to ask questions and share opinions, ideas, and reflections in order to understand the foreign culture better and more deeply. Fourth, such communication makes instructors and participants less anxious because no foreign language skills are required during such projects. Therefore, our approach demonstrates the value and importance of using STR and CAT systems in education, especially in cross-cultural learning. As our approach is convenient and allows independent communication, it holds great potential for solving problems that teachers and students typically encounter when teaching and learning cross-cultural understanding through participation in an educational project.

4 Conclusion

The results of this study show that applying speech-to-text recognition with computer-aided translation has the potential to enhance cross-cultural learning. In particular, the application of these technologies helps participants from two different cultures without a common language of communication to interact and share information with each other.

Based on our results, we offer several implications and suggestions. First, we suggest that teachers and students utilize STR and CAT technologies for teaching and learning. In particular, this approach can be useful for courses on cross-cultural understanding when communication between teachers and students with no common language is required. Teachers and students need to be careful about the accuracy rate of texts produced by STR and CAT. To increase the accuracy rate of such texts and make them acceptable and useful for learning, we suggest that students practice with STR and CAT systems more frequently. In this way, they will discover the strengths and limitations of the technologies and then be able to utilize them fully. Making input sentences shorter also helps the accuracy rate; a long sentence should be split into two or more short sentences. In addition, we suggest training both technologies to recognize specific words and terminology that are frequently misrecognized. This can be done by adding these words to the STR or CAT terminology bank so that they will be remembered and recognized correctly in the future. Finally, we suggest that STR- or CAT-texts be edited by teachers or students. That is, mistakes in the texts should be corrected to make the texts acceptable and useful for teaching and learning.
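Two of these suggestions, shortening long input sentences and maintaining a terminology bank, can be supported with simple text checks before translation. The sketch below is a hypothetical illustration and not part of the tools used in this study: `find_long_sentences` only flags sentences that a person may want to split, `apply_term_bank` replaces known misrecognitions with their corrected forms, and the bank entry and the 12-word limit are invented for the example. It assumes whitespace-delimited text; Chinese text would need a character-based limit instead.

```python
import re

# Hypothetical terminology bank: misrecognized form -> corrected form.
TERM_BANK = {"stinky to food": "stinky tofu"}


def find_long_sentences(text: str, max_words: int = 12):
    """Return sentences longer than max_words so they can be split by hand."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return [s for s in sentences if len(s.split()) > max_words]


def apply_term_bank(text: str, term_bank=TERM_BANK) -> str:
    """Replace known misrecognized terms, ignoring letter case."""
    for wrong, right in term_bank.items():
        # A function is used as the replacement so that backslashes in the
        # corrected term are never interpreted as regex escapes.
        text = re.sub(re.escape(wrong), lambda _match, r=right: r, text,
                      flags=re.IGNORECASE)
    return text
```

A teacher or student could run both checks on an STR-text before passing it to the CAT system, then split the flagged sentences manually.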

Some limitations of our study need to be acknowledged. The sample size was small, and only a short period of time was allotted for the project. For these reasons, the obtained results cannot easily be generalized. In future studies, more students will be involved. We are particularly interested in a web-based project in which students from different classrooms around the world, representing more than two cultures and languages, communicate and share information with one another, especially in real time. Furthermore, other methods for bridging cross-cultural differences or enabling cross-cultural understanding will be examined in the future in order to test the feasibility of our approach.