
1 Introduction

The common approach to intercultural collaboration is to learn English [1], since English has become the global language [2]. In international discussions, however, the advantage held by native speakers may be counter-productive. In fact, disparity in language skill is likely to suppress opportunities for non-native speakers to make significant contributions to intercultural communication. Using English in a group with language diversity can act as a hidden barrier that affects socialization and interpretation. Non-native speakers sometimes receive negative assessments, and their intelligence may be underestimated, because of their lack of fluency [3]. We call this phenomenon language asymmetry: the participants in a communication channel have unequal language abilities and hence unequal opportunities to contribute.

Some researchers have attempted to improve communication by improving the quality of machine translation as well as by using human intelligence. For example, Morita [4] introduced a method that uses monolinguals to enhance the fluency and adequacy of both sides of two-language translation-mediated discourse. Taking a cue from this outsourcing of human intelligence, we realized that the ability of the users themselves is another valuable resource. Many people know more than one language; to communicate in a group, we can combine the full abilities of those users with machine translation services to realize the best quality communication.

Several methods have been developed to help non-native speakers effectively take part in conversations, for example, imposing artificial delays to help the non-native speaker understand the conversation [5], signaling the native speaker about the status of non-native speakers [6], helping non-native writers with vocabulary navigation [7], and providing real-time translation using eye-gaze input [8]. Though these methods reduce the burden on non-native speakers, they cannot provide a completely balanced communication environment.

Beyond sharing a language, it is also important to understand different cultures. Because one cannot learn every language, machine translation and other technologies on the Internet can be a solution [1]. Machine translation can enhance the efficiency and effectiveness of discussions [9]. However, machine translation can also cause many communication problems during collaborative work. Because of the uncertainty in machine translation accuracy and the differing foreign language proficiencies of participants, it is difficult to decide which languages or translation services should be used. If a foreign language is chosen, users with lower skill in that language can be left out of the conversation or have less chance to contribute. If machine translators are placed between all users, some of whom might have adequate common language skill, the conversation will not be as fruitful as it should be. Polysemy and synonymy [10], common problems in machine translation, can trigger conversation breakdown, since translation output can be erroneous [11].

Besides language problems, balancing participation is also important for effective discussions and collaboration. In many kinds of collaboration, including collaborative problem solving, idea generation, and collaborative learning, the variety of backgrounds should yield a variety of opinions and ideas. Thus, balancing the participation of participants with different backgrounds is essential to these kinds of collaboration. Several studies have focused on rectifying unbalanced communication, most of them by giving the users feedback in real time. A previous study provided a system that computes relevant features of speech and gives the users feedback via SMS on smartphones and via animations that depict the participation of each user [12]. Related works use various types of interfaces to inform users about the activities of all users so as to increase awareness of who is participating. For example, [13, 14] use a shared display to show speaker participation rates; they also suggest that providing a peripheral display helps to improve certain types of interaction. A similar study describes an interactive table that works as a mirroring tool for group collaboration [15]; this tool also indicates how much each speaker participates in order to create awareness.

The works mentioned above describe various methods to help non-native speakers and to balance discussions. Our research is novel and orthogonal to existing research: our model can support speakers of different languages, with different proficiencies in a shared language, by creating the best balance in terms of the opportunity to participate in communication.

Our approach to balancing the discussion also differs from existing works, since our concept is to level the language burdens. To obtain the best balanced communication environment, we start from an existing study on user-centered QoS [16]. Normally, services are evaluated by users based on the Quality of Service (QoS). Yet information about users' skills is also important in selecting the best machine translation service. Therefore, a function was introduced to calculate the Quality of Message (QoM) by incorporating the users' skills in writing and reading messages when machine translators are used. In this paper, we extend QoM to define a model of the best balanced channel given the parameters of user language skills and machine translation accuracies. We then test the model in a real-world experiment to investigate the ability of our approach to create highly effective multilingual communication environments.

2 Scenario

Figure 1 displays a difficult situation that can arise in multilingual communication. For a conversation between a Chinese user with fair English skill and a Japanese user with limited English skill, it is not complicated to choose the best communication method: the participants will be more effective using machine translation than using English as a shared common language, provided the machine translation quality is acceptable. When a Korean user with good English skill later joins the conversation, however, it becomes more difficult to choose the best method of communication.

Fig. 1. A multilingual communication problem

It is possible to use the shared foreign language, English, to use machine translation, or to combine both options. If English is used as the medium for this conversation, it might cause difficulties for the Japanese user, whose English skill is limited. Machine translation could be a good alternative; however, two of the participants have good enough English skill to communicate directly, which might work better than machine translation given its weaknesses. It is also possible to use both the shared language and machine translation: one participant might use his native language while the other two use English. The problem is where machine translation should be used and which languages should be translated.

This situation is an example of asymmetry in collaboration caused by language: the members of the group have asymmetric opportunities to participate. We believe that the best form of communication provides mutual understanding and an equal chance to participate. Consequently, we tackle the language asymmetry problem.

3 Modeling Multilingual Communication

3.1 Best Balanced Channel

This paper proposes a model that solves the language asymmetry problem by providing equal opportunity to take part in a conversation, even given the asymmetric nature of machine translation described above. We call our model BB, best balanced machine translation.

Based on existing work [17] on user-centered QoS, we model the Quality of Message (QoM) of a message that user Pi, who uses language Li, sends to user Pj, who uses language Lj, via a machine translation service MTi,j, which translates messages from language Li into language Lj. We consider the writing skill of the sender in the input language, the accuracy of MTi,j, and the reading skill of the receiver in the output language. The quality of a message from user Pi to Pj via machine translation service MTi,j, denoted QoM(Pi, MTi,j, Pj) or simply QoMi,j, can then be represented as follows:

$$ QoM\left( P_{i}, MT_{i,j}, P_{j} \right) = writing\_skill\left( P_{i}, L_{i} \right) \times accuracy\left( MT_{i,j} \right) \times reading\_skill\left( P_{j}, L_{j} \right) $$
(1)

This model shows that the writing skill of the sender, the reading skill of the receiver, and the accuracy of the machine translation all impact QoM. As a consequence, selecting the most appropriate language pair is critical.
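The following Python sketch illustrates Eq. (1); the function name and the example values are ours, not the paper's, and all inputs are assumed to be already normalized to the range of 0 to 1.

```python
# A minimal sketch of Eq. (1). The function name and the example values are
# illustrative; all inputs are assumed to be pre-normalized to [0, 1].

def qom(writing_skill: float, mt_accuracy: float, reading_skill: float) -> float:
    """Quality of Message from a sender to a receiver through one MT service."""
    return writing_skill * mt_accuracy * reading_skill

# Example: a sender who writes the source language at 0.8, an MT service rated 0.6,
# and a receiver who reads the target language at 0.9 give QoM = 0.8 * 0.6 * 0.9 = 0.432.
print(qom(0.8, 0.6, 0.9))
```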

To increase the overall quality of communication, the quality of message should be maximized, since messages are dominant parts of conversations. BB comes with a method of choosing the language pairs that will maximize the quality of message.

Let (QoMi,j, QoMj,i) be a QoM pair between user Pi and user Pj, and (MTi,j, MTj,i) be an MT pair between language Li and language Lj. A QoM pair is called Pareto optimal when it is impossible to make one QoM better without making the other QoM worse. A QoM pair is called best balanced when it is Pareto optimal and, if there is more than one Pareto optimal QoM pair, the variance of QoMi,j and QoMj,i is minimum.

If there are more than two users, we need to extend Pareto optimality. Recall that QoMi,j can be maximized by selecting an appropriate language pair (Li, Lj), under the constraint that each user uses a single language. The average QoM of a QoM pair is defined as the average of QoMi,j and QoMj,i.

A set of QoM pairs is called Pareto optimal when it is impossible to make any average QoM better without making another average QoM worse. A set of QoM pairs is called best balanced when it is Pareto optimal and the variance of its average QoMs is minimum among all Pareto optimal sets of QoM pairs. If there is only one Pareto optimal set, the variance does not need to be calculated.
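The selection procedure defined above can be summarized as the following Python sketch. The data layout (dictionaries mapping users to languages and skills) and the helper names are assumptions made for illustration; a pair of users sharing a language is treated as needing no translation (accuracy 1.0).

```python
# A sketch of best balanced channel selection as defined above. The data layout
# and helper names are illustrative assumptions, not the paper's implementation.
from statistics import pvariance


def qom(write, acc, read):
    return write * acc * read                       # Eq. (1)


def accuracy(mt_acc, src, dst):
    return 1.0 if src == dst else mt_acc[(src, dst)]  # same language: no MT needed


def pair_avg_qoms(assignment, skills, mt_acc):
    """Average QoM of every user pair under one language assignment."""
    users = sorted(assignment)
    avgs = {}
    for a in range(len(users)):
        for b in range(a + 1, len(users)):
            i, j = users[a], users[b]
            li, lj = assignment[i], assignment[j]
            q_ij = qom(skills[i][li]["w"], accuracy(mt_acc, li, lj), skills[j][lj]["r"])
            q_ji = qom(skills[j][lj]["w"], accuracy(mt_acc, lj, li), skills[i][li]["r"])
            avgs[(i, j)] = (q_ij + q_ji) / 2
    return avgs


def dominates(x, y):
    """x Pareto-dominates y: no pair is worse and at least one pair is better."""
    return all(x[k] >= y[k] for k in x) and any(x[k] > y[k] for k in x)


def best_balanced(candidates, skills, mt_acc):
    """Return the Pareto optimal assignment whose average QoMs have minimum variance."""
    scored = [(c, pair_avg_qoms(c, skills, mt_acc)) for c in candidates]
    pareto = [(c, a) for c, a in scored
              if not any(dominates(other, a) for _, other in scored if other is not a)]
    return min(pareto, key=lambda ca: pvariance(list(ca[1].values())))
```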

3.2 Example

Assume there are three users P1, P2, P3, who use languages L1, L2, L3, respectively. For the situation in Fig. 1, let ja, ko, and zh represent the Japanese, Korean, and Chinese languages, respectively. Under the assumption that English (en) can be used by everyone to some degree, the possible combinations of languages \( C_x = \left\{ L_1, L_2, L_3 \right\} \) for the communication of the three users are as follows:

$$ \begin{aligned} & C1 = \left\{ {ja,\,ko,\,zh} \right\},C2 = \left\{ {ja,\,ko,\,en} \right\},\,C3 = \left\{ {ja,\,en,\,zh} \right\},\,C4 = \left\{ {ja,\,en,\,en} \right\}, \\ & C5 = \left\{ {en,\,ko,\,zh} \right\},\,C6 = \left\{ {en,\,ko,\,en} \right\},C7 = \left\{ {en,\,en,\,zh} \right\},C8 = \left\{ {en,\,en,\,en} \right\} \\ \end{aligned} $$

If there are n users in the group, each language combination consists of n(n − 1)/2 QoM pairs. For example, C1 consists of three QoM pairs, (QoM1,2, QoM2,1), (QoM2,3, QoM3,2), and (QoM3,1, QoM1,3), and utilizes three MT pairs, that is, six machine translation services: (MTja,ko, MTko,ja), (MTko,zh, MTzh,ko), and (MTzh,ja, MTja,zh).

With the machine translator qualities and the user profiles in the example situation, the only Pareto optimal combination is C4. The conversation will thus be best balanced when the Japanese user uses Japanese while the Korean and Chinese users use English, and the only machine translation services needed are (MTja,en, MTen,ja); (MTen,en, MTen,en) represents no translation.

In many cases, there is more than one Pareto optimal combination. The best balanced combination can then be determined by evaluating the differences among the QoMs using variance: lower variance yields greater equality in the conversation.
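As a usage sketch, the eight combinations C1 to C8 can be enumerated and passed to the best_balanced helper from the previous sketch. Every number below is invented for illustration (it is not Table 1 or Table 2 data), so the combination this code selects need not match C4.

```python
# Usage sketch for the selection procedure of Sect. 3.1, reusing the qom /
# accuracy / best_balanced helpers from the previous sketch. All skill and
# accuracy numbers are invented and are not the paper's experimental data.
from itertools import product

natives = {"P1": "ja", "P2": "ko", "P3": "zh"}           # native languages
skills = {                                                # per user, per usable language
    "P1": {"ja": {"w": 1.0, "r": 1.0}, "en": {"w": 0.3, "r": 0.4}},
    "P2": {"ko": {"w": 1.0, "r": 1.0}, "en": {"w": 0.8, "r": 0.9}},
    "P3": {"zh": {"w": 1.0, "r": 1.0}, "en": {"w": 0.6, "r": 0.7}},
}
langs = ("ja", "ko", "zh", "en")
mt_acc = {(s, t): 0.6 for s in langs for t in langs if s != t}  # flat 0.6 for every MT pair

# C1..C8: each user uses either the native language or English.
candidates = [dict(zip(natives, combo))
              for combo in product(*[[nat, "en"] for nat in natives.values()])]

assignment, avg_qoms = best_balanced(candidates, skills, mt_acc)
print(assignment)   # the chosen language per user
print(avg_qoms)     # the average QoM of each user pair under that choice
```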

4 Experiment

We designed and conducted a preliminary experiment to investigate our model. The experiment was designed to compare our best balanced channel with other channels, namely using English as a common foreign language and using a full translation service among all language pairs. Note that in some cases full machine translation can itself be the best balanced machine translation channel.

4.1 Task

In this experiment, the participants were instructed to play three games together using a chat system with embedded multilingual machine translation.

As the games, we set three survival problems: the desert survival problem (DSP) [18], the winter survival problem (WSP) from project ARISE [19], and the lunar survival problem (LSP) from NASA [20]. DSP is a popular collaborative task that asks the participants to arrange items in a list by their importance, after a crash landing in a desert, in order to survive and reach their destination safely. WSP is similar to DSP, but the setting is in the woods and the weather is extremely cold; the item list thus differs from that of the first game. LSP presents a slightly different situation, a landing on the moon 80 km from the target site; the task is the same as in the first two games but with a different item set.

Whereas the original problems describe the situation in a number of paragraphs in English, we narrated the situation using short, easy English sentences and figures. Our games were simplified to match the English proficiencies of the players. Each story explains the time, the location, and the events that happened while the participants acted as survivors in the story. Participants were asked to rank a set of six items by their importance for each situation.

First, the participants were asked to rank the items individually; then they were asked to communicate and negotiate with the other participants to produce a team answer.

4.2 Experiment Design

At the beginning, we introduced each game and its instructions. Then, we demonstrated how to use the Online Multilingual Discussion Tool (OMDT), software created for multilingual symposia that enables multilingual chat using services from the Language Grid. The Language Grid is a service-oriented collective intelligence platform that allows users to create language services by combining existing language services [22].

In the OMDT web application, the user chooses a display language at the top right of the screen. He or she can then type messages in that language into the message box. When the user clicks send, the message appears below. On the screens of the other users, the same message appears, but in the language selected by each of those users.

We played a twenty-minute example game, the results of which were ignored. During this example game, participants could ask questions and talk. After we made sure the participants understood how to play and how to use OMDT, they were asked to move and sit separately so they could not see each other.

The games were played using three strategies: English (EN), full machine translation (MT), and best balance (BB). The strategy for the first game was chosen randomly; the second game was played using one of the strategies not selected for the first game, and the last game was played with the remaining strategy.

In each game, the participants had approximately 35 min in total. First, they had to try to understand the given problem and then write down their personal answers before discussing their selections with the other participants online, by chatting or using machine translation. Afterwards, they discussed with the other participants to create a team answer. At the end of the game, the participants could give a new personal answer set if the discussion had changed their minds.

After the three games were played with the different communication channels, we interviewed the participants as to how they felt when they played the games with the different communication modes.

4.3 Participants

Our nine research subjects were divided into three groups. Each group consisted of a Chinese, a Japanese, and a Korean participant. All were undergraduate, graduate, or research students from various fields.

The English skill profiles of the participants, displayed in Table 1, consist of (writing_skill, reading_skill) values normalized to the range of 0 to 1. English skills were measured using normalized scores from standard tests (TOEIC, TOEFL, or IELTS). Gender is given as M for male and F for female.

Table 1. Profile of Participants

4.4 Machine Translation

The Language Grid [12] currently offers a number of machine translation services. The services used in this experiment were J-Server and the Toshiba English-Chinese machine translation service; J-Server was used for all translations except those between English and Chinese. To evaluate the quality of the machine translation services, we randomly chose twenty English sentences from a corpus provided by the Japan Electronics and Information Technology Industries Association (JEITA).

We translated the original 20 English sentences into three languages: Chinese, Japanese, and Korean. After that, the twenty sentences in each language were machine translated into the other three languages. For example, the Japanese sentences were translated into Chinese, English, and Korean.

Even though quantitative metrics are valuable for evaluation purposes, they cannot completely replace human assessment [20]. The translated sentences were therefore rated by educated native speakers holding at least a bachelor's degree; at this stage, each language had only one evaluator. The methodology of rating fluency and adequacy, as proposed by the LDC [21], is widely used to measure machine translation quality. Our criteria were the fluency of each translated sentence and its adequacy. Fluency was rated from 0 to 5. Adequacy, that is, how much of the meaning of the original sentence was expressed by the translated sentence, was also rated from 0 to 5.

The ratings of the twenty sentences were averaged to determine the quality of each translation service from one language to another. The fluency and adequacy scores given by the judges were added up and normalized to the scale of 0 to 1, as displayed in Table 2.

Table 2. Quality of translation services
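Under the assumption that each test sentence receives a fluency score and an adequacy score in the range 0 to 5, and that "added up and normalized" means their sum is divided by the maximum of 10 and then averaged over the test sentences, the computation looks roughly as follows; the sample ratings are made up, not the actual evaluation data.

```python
# A sketch of the normalization described above. It assumes "added up and
# normalized" means (fluency + adequacy) / 10 per sentence, averaged over the
# test sentences; the sample ratings below are made up, not the JEITA data.

def mt_quality(ratings):
    """ratings: one (fluency, adequacy) pair per test sentence, each score in 0..5."""
    per_sentence = [(fluency + adequacy) / 10.0 for fluency, adequacy in ratings]
    return sum(per_sentence) / len(per_sentence)

print(mt_quality([(4, 5), (3, 4), (5, 5)]))   # -> about 0.87
```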

4.5 Communication Channel

Using the participant profiles from Table 1 and the quality of translation services from Table 2, the QoM pairs for each combination, C1 to C8, can be calculated as in Table 3. In this case, the only row containing a Pareto optimal set of QoM pairs is C4, so the variance does not need to be calculated and the best balanced channel is C4.

Table 3. QoM values of all possible combinations

As described in the previous section, C4 is {ja, en, en}, which means that, using the best balanced channel (BB), the Japanese participant should use Japanese while the Chinese and Korean participants should use English. The only machine translation used is Japanese-English machine translation.
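The mapping from a language combination to the machine translation services it requires is mechanical, as the sketch below shows; the helper name is ours, and C4 is written as a plain dictionary.

```python
# A small sketch of how a language combination maps to the MT services it needs:
# a user pair sharing a language needs no translation, otherwise both translation
# directions are required. The helper name is ours; the input below is C4.

def required_mt_services(assignment):
    users = sorted(assignment)
    services = set()
    for a in range(len(users)):
        for b in range(a + 1, len(users)):
            li, lj = assignment[users[a]], assignment[users[b]]
            if li != lj:
                services.update({(li, lj), (lj, li)})
    return services

print(required_mt_services({"P1": "ja", "P2": "en", "P3": "en"}))
# -> {('ja', 'en'), ('en', 'ja')}: only Japanese-English MT is needed for C4.
```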

As shown in Fig. 2 below, the strategies used in the experiment were EN, MT, and BB. In addition to BB, we selected the C1 and C8 combinations, since they are common methods used in multilingual communication. The EN channel represents C8, {en, en, en}, in which everyone uses English and no machine translation is used. MT represents C1, {ja, ko, zh}, in which all members use their mother language and communicate entirely via machine translation.

Fig. 2. Strategies of communication in this experiment

Messages in the MT scenario are translated into the languages used by the participants; for example, a message written in Chinese by the Chinese participant is translated into Japanese and Korean. The Japanese and Korean participants see the message in their own languages and can reply using their native languages. Reply messages from the Japanese or Korean user are likewise translated into the other two languages for the other two participants.

5 Behaviors of Participants with Low Shared Language Skill

5.1 Simpler Sentences Used by Japanese When Using English

Using the EN channel, sentences typed by the Japanese users were simpler and shorter. To illustrate, a participant with limited English skill used only simple words and phrases for most of the English conversation, for example, "mirror is second" and "no need for aid", without any further explanation. The longest sentence this participant used in the English conversation was "transmitter tells us location or way". The same participant expressed his opinion more fully, using more complex sentences, when he communicated in his own language via translation, for example, "Raincoat. The reason is to protect ourselves against the direct sun. Not to wear but to use as a shade". Simple sentences are not in themselves a sign of poor conversation quality, but complex sentences may be more natural and can more easily trigger new assessments or interesting discussions.

5.2 Ignoring Incomprehensible English Sentences

Incomprehensible sentences can be caused by low language proficiency. The excerpt below shows part of a conversation in which all participants used English (EN channel) during the WSP game.

(Using EN Channel)

Ko: We can make fire with lighter and tree
Zh: But it is so cold and wet, I wonder if we can make it.
Zh: Do you agree that the chocolate is the most useless one?
Ja: can we solve shortening …?
Ja: chocolate is most useful
Ko: Wait a minute we can get fire from crash

They were discussing which items to select based on the idea of how to start a fire. Ko thought that the lighter was useful for making fire together with wood, while Zh doubted whether this was really possible since the given situation was cold and wet. Zh also expressed the idea that the chocolate was the most useless item and asked whether the others agreed. Then the Japanese participant suddenly asked something about shortening in English, in which "solve" was not understandable, since her English skill is very limited.

Sentences that are not understandable are normally ignored by the other parties [22]. When the low-English-skill participant entered an incomprehensible sentence, the other participants sometimes simply ignored that sentence, as happened in this case: they continued the conversation without referring to what Ja had said.

This specific situation might not harm the quality of the conversation result. However, understanding all messages might trigger interesting topics or ideas to be discussed further; there might also be something important or useful in the sentences that are not understood.

5.3 Less Engagement in Conversation by Japanese Users When Using English

Japanese users tended to engage less in conversations when English was used. The same Japanese participant could be more talkative when he or she used machine translation. Machine translation makes people with low language skill worry less about what to say: they can think in their own language and simply type in that language. Using the mother tongue is more comfortable for participants who have limited shared-language skill and gives them more confidence in joining the conversation.

Table 4 shows the number of utterances in each game by each participant, which reflects how talkative each participant was. Before the measurement, sentences not related to the collaborative task, such as greetings and self-introductions, were excluded. With machine translation, low-language-skill participants engaged in the conversation more often, since they took less time to come up with a sentence. We can see the degree of engagement in the conversation by comparing talkativeness.

Table 4. Number of utterances in each game by each participant

Figure 3 shows the percentage of utterances made by each participant, and Fig. 4 shows the average percentage of utterances by each nationality, whose members had similar English-skill levels. From Figs. 3 and 4, the EN channel yielded unequal participation in the conversation: the Japanese participants tended to talk much less when using EN, while the balance improved when they used MT and BB.

Fig. 3. Talkativeness of each participant in each group as measured by percentage of utterances each participant made

Fig. 4. Average talkativeness grouped by country of origin, measured by percentage of utterances

Conversation Encouragement

In games using the EN channel, one participant sometimes became quiet for a long period. For example, the Japanese user was asked for her opinion many times at different points in a conversation by the other two participants.

(Using EN channel)

15:37:39 Zh: Ja, what do you think?
…
15:48:19 Ko: How to you think about Ja?
…
15:57:42  How about Ja?

The reasons a participant stops talking include not understanding the current conversation, taking time to express an opinion due to the language barrier, having no opinion, or personality. Using machine translation can help a participant facing a language barrier in terms of both expression and understanding, and might increase confidence since the mother language is used. Requests for a specific participant's opinion appeared much less often when the MT or BB channel was used.

6 Benefit of Best Balanced Machine Translation for Conversation Grounding

Machine translation is obviously useful for allowing people who speak different languages to collaborate; however, it also creates problems, for instance translation mistakes and conversation breakdowns. One of the difficulties caused by using machine translation is building mutual understanding. In a group conversation, especially in an intercultural group, having common ground is essential for people to collaborate.

An existing work [23] showed that using machine translation makes it more difficult to ground conversations. The study found that using machine translation violates the requirements for establishing common ground, especially when the number of languages exceeds two: it is difficult for the users to share the same content because of discrepancies between the translations, and the users cannot be aware of which content they do or do not share since they cannot monitor how their messages are translated.

However, both full machine translation and our proposed best balanced machine translation produced more evenly distributed talkativeness and more equal participation than using English as the mediating language. In many cases, the best balanced channel also needs fewer machine translation steps, which can reduce the difficulty of grounding the conversation.

7 Conclusion

The main contribution of this paper is a best balanced machine translation model that harmonizes participation rates in multilingual communication via the selective use of machine translation, based on users' language skills and the quality of available machine translation services.

We conducted an experiment to study how our proposed method works compared to using the users' shared second language and to having all participants simply use their mother tongues via machine translation. We asked the participants to collaborate on ranking the importance of items in three survival games using a chat system with embedded machine translation. Observations made during the experiment showed that the number of utterances by participants who had limited skill in the shared foreign language increased when machine translation services were used. This indicates that the balance of participation among users is enhanced when machine translation is used.

Using our model helps to deal with imbalanced participation in multilingual conversations while raising the probability of successful conversation grounding. It also helps to reduce the chance of machine translation problems that can occur when the quality of machine translation is too low even though the language skills of the users are acceptable. Our model enhances communication quality by selecting the language combination that yields the best balanced conversation, allowing people with different backgrounds to participate in conversations equally. Our concept is to harness the intelligence of both machines and people to boost participation balance in multilingual communication and collaboration.