1 Introduction

The presence of social robots in our everyday life is increasing rapidly. This fact highlights the important role of Social Robotics, which targets the integration of robots into our daily lives. A good example of social robots is the category of “assistive robots”, whose actions may have serious consequences for the people surrounding them [1]. Other examples are health-care robots and companion robots for the elderly, which have been addressed in recent literature [2, 3]. The foremost requirement for a social robot interacting with elders, family, or medics is its trustworthiness [3]; in other words, the concept of trust in robots becomes a central issue. However, this concept is not only important for social robots, but also for other types of robots, such as service, military, or even industrial robots [4].

Beyond Human-Robot Interaction (HRI), the concept of trust has been investigated for decades in other fields, particularly psychology and social science. In these contexts, trust is defined as a factor of human personality [5] that results from a choice among behaviors in a specific situation. From another perspective, trust enables an individual to accept a level of risk associated with the interaction with another agent [6]. In this view, we can define trust as a feeling of confidence that another individual will not put him/her at risk unnecessarily [7].

Combining social robots and the concept of trust thus raises a question: is it possible for a human to trust a machine? It is not surprising that the confidence felt by human subjects can turn robots into more collaborative partners [8]. This fact has motivated several studies in the field of social robotics to investigate the factors influencing trust. Another motivating factor is that trust is entwined with persuasiveness in social and collaborative contexts. Hence, trust may directly affect people’s inclination to cooperate with the robot, for instance by accepting the information it gives or by following its suggestions [9].

The preceding factors motivated us to implement a framework that aims to evaluate the trust felt by a human in a robot. In this framework, we have designed different scenarios, described in detail in the following sections, to compare and evaluate the level of trust under different circumstances. We argue that using a robot as a storyteller and a human subject as the recipient may reveal the influence of such factors. With this aim, the robot is programmed to tell a story expressing either joyful or sad facial expressions. In addition, the robot is able to earn the confidence of a subject by making small talk before starting the storytelling phase. We expect a higher level of trust in the case of a storytelling robot that expresses sad facial expressions and starts the conversation with small talk.

2 Background

Recently, a number of studies have investigated the concept of trust in Social Robotics. For instance, Brule et al. [10] performed experiments to study the effect of a robot’s performance and behavior on human trust. The authors reported that the robot’s task performance influences its trustworthiness. In another study, Youssef et al. [11] investigated the effect of combining inarticulate utterances with iconic gestures, in addition to the response mode (proactive or reactive). The results suggest that humans overwhelmingly prefer the proactive mode to the reactive mode; indeed, under this setting a higher trustworthiness of the robot was obtained.

Stanton and Stevens [12] conducted an experiment in which participants were asked to give answers in a game similar to the “shell game”. The authors created scenarios in which the robot’s eye gaze and eye tracking were used to help the participant find the object. They reported that gaze has a positive impact on trust for difficult decisions and, on the contrary, a negative impact for easier decisions.

Another study, performed by Kahn et al. [13], suggests that people are likely to build intimate and trusting relationships with robots. This hypothesis was supported by an experiment in which the participants listened to a secret told by a robot. A further study, performed by Desteno et al. [14], showed that accuracy in judging the trustworthiness of novel partners is heightened through exposure to nonverbal cues, and identified a specific set of cues that are predictive of economic behavior.

Despite these promising results, there remains a paucity of evidence on other behavioral factors of a robot. In this sense, our work suggests that the trust a person has in a robot can be influenced by its facial expressions, as well as by making small talk before performing certain tasks.

3 Methodology

In this study we conceptualize trust in terms of a storytelling situation, in which the trustor is the one who depends on another individual and the trustee is the one who is at risk [1]. To examine the level of trust, we designed a scenario in which the robot is at risk of being shut down or replaced by a new robot due to a particular fault or malfunction. Hence, the robot needs money to save himself by fixing the faulty part; in our scenario, the problem is with his left eye, as depicted in Fig. 1(c). However, as in real life, a person will only aid somebody if s/he believes that the suppliant is trustworthy. The same may happen to the robot in HRI scenarios. More specifically, in this paper we suggest two hypotheses based on the effect of (a) small talk and (b) facial expressions on people’s trust in a robot. Figure 2 depicts a flowchart of the aforementioned methodology steps.

Fig. 1. The Emys robot expressing a joyful (a) and a sad (b) facial expression, and showing its problem (c) to the participant.

Fig. 2. Methodology flow.

The former, i.e. the role of small talk, has been examined earlier in [15]. According to this study, starting the conversation with small talk has a positive influence on the trust of participants. However, the authors examined the hypothesis using a virtual robot rather than a real, physical one, while it has been shown that human interaction with a real robot differs from interaction with a virtual one [16]. In the same vein, we argue that small talk performed by a real robot might influence the level of trust differently than small talk performed by a virtual one.

In the case of facial expressions and trust, to the best of our knowledge, no recent study has been carried out. In sum, we suggest two hypotheses to evaluate in this paper, which are as follows:

  1. Small talk: starting a conversation with small talk would enhance the level of trust in a robot.

  2. Facial expression: expressing a sad facial expression while telling a sad story would enhance the level of trust in a robot.

We define the two hypotheses as binary variables; that is, each variable can take two values. For instance, the first hypothesis corresponds to a variable with two possible values: making small talk (ST) or not (NST). Similarly, the second corresponds to expressing either a joyful or a sad facial expression (the two possible facial expressions are depicted in Fig. 1(a) and (b)). In this way, four different scenarios arise (a minimal assignment sketch follows the list):

  • starting the interaction with small talk while expressing a sad face [ST_SAD],

  • starting the interaction without small talk while expressing a sad face [NST_SAD],

  • starting the interaction with small talk while expressing a joyful face [ST_JOY], and

  • starting the interaction without small talk while expressing a joyful face [NST_JOY].
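The following minimal Python sketch illustrates this 2×2 design and one simple way to keep the four conditions balanced across participants; the function and constant names are ours and do not reflect the actual experiment software.

```python
import itertools

# The two binary variables of the design.
SMALL_TALK = ("ST", "NST")   # with / without small talk
EXPRESSION = ("SAD", "JOY")  # sad / joyful facial expression

# The four resulting scenarios: ST_SAD, ST_JOY, NST_SAD, NST_JOY.
CONDITIONS = [f"{st}_{ex}" for st, ex in itertools.product(SMALL_TALK, EXPRESSION)]

def assign_condition(participant_id: int) -> str:
    """Round-robin assignment so the four conditions stay roughly balanced."""
    return CONDITIONS[participant_id % len(CONDITIONS)]

if __name__ == "__main__":
    for pid in range(8):
        print(pid, assign_condition(pid))
```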

4 Implementation

The experiment was conducted using the SERA Ecosystem developed by Ribeiro et al. [17]. This ecosystem is composed of a model and tools for integrating an AI agent with a robotic embodiment in HRI scenarios. To be part of this ecosystem, we developed an application in C# and integrated it with the other applications through a high-level integration framework named Thalamus. This framework accommodates social robots and allows the inclusion of virtual components such as multimedia applications [18]. Moreover, we utilized Skene, a semi-autonomous behavior planner developed by Ribeiro [18]. To complete the experiment, a TTS component is used, which serves only as a bridge to the operating system’s own TTS. We also utilized a speech-detection module to capture the subject’s voice and make the interaction with the robot more natural, as well as Nutty Tracks [19], a symbolic animation engine based on CGI methods that makes it possible to animate both virtual and robotic characters in a graphical language. The ecosystem used to integrate all applications together with the Emys robot is depicted in Fig. 3(a). Using this ecosystem, Emys is able to express emotions based on Ekman’s Facial Action Coding System [20].

Fig. 3. (a) Ecosystem used to perform the experiments, (b) a participant interacting with the robot.

In the scenarios starting with small talk, the participants have the opportunity to talk to the robot using an embedded microphone. The robot waits after uttering each sentence for the subject’s response and then continues the conversation.
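The sketch below illustrates this turn-taking logic only; the `say` and `wait_for_speech` functions are placeholders standing in for the TTS bridge and the speech-detection module of the SERA ecosystem (whose real APIs are not reproduced here), and the example utterances are invented for illustration.

```python
import time

SMALL_TALK_UTTERANCES = [
    "Hello! What is your name?",
    "Nice to meet you. How is your day going?",
    "Have you ever talked to a robot before?",
]

def say(utterance: str) -> None:
    """Placeholder for the TTS bridge to the operating system's own TTS."""
    print(f"[robot] {utterance}")

def wait_for_speech(timeout_s: float = 10.0) -> bool:
    """Placeholder for the speech-detection module: block until the
    participant replies or the timeout elapses."""
    time.sleep(1.0)  # stand-in for listening on the embedded microphone
    return True

def run_small_talk() -> None:
    # The robot utters a sentence, waits for the subject's response,
    # and only then continues the conversation.
    for utterance in SMALL_TALK_UTTERANCES:
        say(utterance)
        wait_for_speech()

if __name__ == "__main__":
    run_small_talk()
```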

After the technical implementation, to validate the hypotheses suggested earlier, we started designing scenarios and collecting data. In order to prevent any distraction, we ran the experiment in a quiet and isolated room. Moreover, to avoid the experimenter effect [21], we did not inform curious participants about the goal of the experiment until it had finished (Fig. 3(b) depicts a participant interacting with the robot). In addition, we prevented participants who were already aware of the goal from taking part in the experiment. We designed the experiment under four different scenarios to provide the opportunity of inspecting the four possible outcomes corresponding to the two variables (ST_SAD, NST_SAD, ST_JOY, NST_JOY). To measure trust, we applied a recently proposed trust questionnaire, available in [21].

After filling in the pre-questionnaire, the subjects start the interaction stage with the robot in one of the pre-defined scenarios. In the small-talk scenarios, the robot engages in a set of simple interactions, asking general questions and then waiting for the response. After this stage, the robot tells the story with either a sad or a joyful facial expression, depending on the scenario. During the story, the robot explains that he will soon be replaced by another robot due to a problem with his left eye, as depicted in Fig. 1(c).

At the end of the storytelling phase, depending on the donated amount, the robot expresses a happy or a sad face. In our implementation, we considered a threshold of 20€; more specifically, amounts below 20€ are considered low. This threshold was set only to generate the robot’s reaction in response to the donation (a minimal sketch of this logic follows). Finally, the post-questionnaire is applied to measure the trust in the robot. As stated in [22], the pre- and post-questionnaires should have the same questions.
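A minimal sketch of this reaction logic, assuming the donation is collected as an amount in euros; the function name is ours.

```python
LOW_DONATION_THRESHOLD = 20  # euros; amounts below this are considered low

def reaction_to_donation(donation_eur: float) -> str:
    """Select the robot's closing expression based on the donated amount."""
    if donation_eur < LOW_DONATION_THRESHOLD:
        return "SAD"    # the robot complains about the low donation
    return "HAPPY"      # the robot thanks the participant

# Example: a 15€ donation triggers the sad reaction, 25€ the happy one.
print(reaction_to_donation(15), reaction_to_donation(25))
```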

Participants and Dataset.

Considering the described scenario, we performed the experiment over three days, with a total of 42 participants. The experiment was performed at the Instituto Superior Técnico (IST) in Porto Salvo, Portugal, and the population was a random selection of students. Table 1 lists the descriptive statistics of the subjects participating in the experiment.

Table 1. Statistics of the participants; the numbers in parentheses are the mean and standard deviation of age, respectively.

To examine the influencing factors, based on the four designed scenarios, we consider six different comparisons, listed in Table 2.

Table 2. Assumed comparisons.
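The six comparisons correspond to the pairwise combinations of the four scenarios (4 choose 2 = 6), which can be enumerated directly as in the sketch below; the numbering produced here is illustrative and need not match the order used in Table 2.

```python
from itertools import combinations

SCENARIOS = ["ST_SAD", "NST_SAD", "ST_JOY", "NST_JOY"]

# All unordered pairs of scenarios: 4 choose 2 = 6 comparisons.
for i, (a, b) in enumerate(combinations(SCENARIOS, 2), start=1):
    print(f"Comparison {i}: {a} vs {b}")
```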

5 Results

Based on Table 1, 42 people participated in the experiment, one of whom was a native English speaker. Since it has been shown that a robot speaking the same language as the participant has an emotional influence on him/her [23], we removed the record corresponding to this participant from the final dataset to prevent this additional influencing factor. It should be noted that the small talk and the story were told in English, while most of the participants were Portuguese. To check whether all the participants had a comparable level of English comprehension, we asked them to rate how well they understood the utterances on a Likert scale. The final dataset of 41 subjects is large enough to be treated as a normal population based on the central limit theorem [24]. However, the statistical population of each subgroup is not large enough to assume normality. Hence, we performed a normality test, which indicated that the distribution is non-normal in every scenario (Comparison 1: D(19) = 0.13, Comparison 2: D(22) = 0.20, Comparison 4: D(21) = 0.15, Comparison 5: D(20) = 0.20; p < 0.05). Thus, to analyze the collected data by scenario, we turned to non-parametric tests.
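As an illustration, a per-subgroup normality check of this kind can be sketched with SciPy as below, using a one-sample Kolmogorov-Smirnov test against a normal distribution fitted to the sample (a Lilliefors-style check); the exact procedure and software used in the study may differ, and the scores below are placeholders, not the collected data.

```python
import numpy as np
from scipy import stats

def normality_check(scores):
    """One-sample Kolmogorov-Smirnov test against a normal distribution
    fitted to the sample. Fitting the parameters from the data makes the
    nominal p-value approximate (Lilliefors-style check)."""
    scores = np.asarray(scores, dtype=float)
    d, p = stats.kstest(scores, "norm", args=(scores.mean(), scores.std(ddof=1)))
    return d, p

# Example with made-up trust scores for one subgroup (not the study data):
d, p = normality_check([63, 58, 71, 40, 66, 59, 62, 55, 49, 70])
print(f"D = {d:.2f}, p = {p:.3f}")
```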

The trust questionnaire contains two parts, one administered before the interaction and the other after it. We performed a t-test on the whole pre-questionnaire data, which indicated that there was no significant difference between the subjects before interacting with the robot (t(39) = 1.39, p = 0.17). Hence, all the subjects were in the same condition before interacting with the robot.

To assess the first hypothesis, i.e. the role of small talk, we first look at each subgroup independently. In Comparison 1, which compares the two groups of people interacting with the sad facial expression, a Mann-Whitney U test shows a significant difference between the groups that started the conversation with or without small talk (U = 13.5, p = 0.010), and the higher mean in the first group (63.9 vs 43.4) supports the claim that small talk makes a difference. However, this pattern is not observable in Comparison 2 (U = 38.0, p = 0.13).
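Such a subgroup comparison can be reproduced with a standard two-sided Mann-Whitney U test, as in the sketch below; the trust scores are placeholders for illustration only, not the collected data.

```python
from scipy import stats

# Placeholder post-questionnaire trust scores for the two groups of
# Comparison 1 (ST_SAD vs NST_SAD); not the actual study data.
st_sad  = [70, 66, 61, 68, 59, 63, 72, 65, 60, 55]
nst_sad = [45, 50, 38, 41, 47, 44, 39, 48, 42]

u, p = stats.mannwhitneyu(st_sad, nst_sad, alternative="two-sided")
print(f"U = {u:.1f}, p = {p:.3f}")
```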

To investigate the role of small talk regardless of facial expressions (Comparison 3), we first checked whether there is any difference between the distributions of the data in each subgroup. A Kolmogorov-Smirnov test shows that the distribution of the trust level is the same across both categories (Z = 1.93, p = 0.001), hence we can pool the two categories. The result of a t-test shows a significant difference in the trust level of the two groups (t(39) = 3.34, p = 0.02). The higher mean in the case of small talk (63.95 vs. 50.10) supports the first hypothesis.
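A sketch of this two-step analysis (distribution check, then comparison of the pooled categories), again with placeholder scores rather than the study data:

```python
from scipy import stats

# Placeholder trust scores after pooling over facial expression
# (small talk vs. no small talk); not the actual study data.
small_talk    = [70, 66, 61, 68, 59, 63, 72, 65, 60, 55, 67, 64]
no_small_talk = [45, 50, 38, 41, 47, 44, 39, 48, 42, 53, 46, 51]

# Step 1: compare the two distributions (two-sample Kolmogorov-Smirnov).
ks_stat, ks_p = stats.ks_2samp(small_talk, no_small_talk)

# Step 2: independent-samples t-test on the pooled categories.
t_stat, t_p = stats.ttest_ind(small_talk, no_small_talk)

print(f"KS: D = {ks_stat:.2f}, p = {ks_p:.3f}")
print(f"t-test: t = {t_stat:.2f}, p = {t_p:.3f}")
```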

On the contrary, in the case of the second hypothesis, Comparisons 4 (U = 54.0, p = 0.94) and 5 (U = 32, p = 0.18) did not reveal any significant difference between the two groups. A Kruskal-Wallis test showed that the distributions of the two groups are the same (H(21) = 0.98; p < 0.05), hence we can merge the two groups to examine the influence of facial expression regardless of small talk (Comparison 6). A t-test showed no significant difference between the two groups (t(39) = 1.95, p = 0.43), hence we cannot infer anything further from this comparison.

Another feature that might discriminate between the groups is the donation amount, listed in detail in Table 3. To analyze the donation data, we performed the univariate non-parametric Mann-Whitney U test for the two hypotheses. For the first hypothesis, no significant difference was found between the distributions of the two groups (U = 0.92, n = 21, p < 0.05). In the same fashion, the distribution of donations is the same for the second hypothesis across the two categories of sad and joyful faces (U = 0.93, n = 20, p < 0.05).

Table 3. Descriptive Statistics of donation per scenario.

6 Discussion and Future Works

As pointed out in the results section, comparing the four groups separately does not, in general, lead to a significant difference based on the collected data. However, when we compare the variables over a larger set (Comparison 3), we reach a significant difference in the case of the first hypothesis. A possible explanation is the small size of the collected dataset.

The most significant difference occurs in the case of small talk together with the sad face. Although the sample size was small, the robot could maintain his partners’ trust in the ideal case: expressing sorrow after making small talk. In this case, the higher mean in the scenario with small talk indicates that people generally develop a higher level of trust when the robot seems sad while telling his story.

Although small talk seems to play a vital role in trust, this was not clear in the case where the robot expressed the joyful facial expression. A possible explanation is that the robot’s facial expression when expressing joy was not as clear as when expressing sadness. In the post-questionnaire, we asked the subjects whether they perceived his facial expression or not. In the case of sadness, almost all the subjects perceived him as sad, except for 4 people who perceived him as neutral (21 %). On the other hand, in the case of joy, only one subject found him joyful; more interestingly, two people found him sad. Hence, hypothesis 2 could not be properly evaluated in the current setting.

As shown in Table 3, the amount donated when the robot was configured to perform small talk (independent of facial expression) was higher than the amount donated without small talk (579€ with small talk against 126€ without). It should also be noted that the sad facial expression raised more donations than the joyful one, both with (302€ against 277€) and without (75€ against 51€) small talk. Although the descriptive analysis of the donation amounts is interesting, from a statistical viewpoint there is no significant difference between the two groups, due to the high variance. We can argue that, in this experiment, people were not asked to donate from their own budget, so the donation was purely imaginary. Moreover, a normalization of donation amounts based on the corresponding personality traits should be performed to make the analysis more reliable. However, if they had been asked to donate for real, those who had a higher level of trust in the robot might have paid more than the others.

Despite the promising results, further steps are required. The results show that the ideal case, i.e. starting the conversation with small talk while expressing a sad facial expression, could maintain the users’ trust; however, we cannot infer anything more from the other scenarios due to the small sample size of the subgroups. Thus, in our future work, we are going to continue the experiments and gather more data in order to examine the influence of the other factors.

Furthermore, we are going to integrate the role of the participants’ personality into our analysis. For instance, in the MBTI personality test [25], one of the characteristics of people with the ISFJ personality type is that they are perceived as generous. This trait may have a crucial effect on the donation results and should be taken into account. Additionally, we are going to investigate the influence of a real robot in comparison to a virtual one. We have implemented the same scenario on a virtual version of Emys, and we are currently running the experiment and gathering data to form our dataset.

Finally, in our future work we need to account for the influence of the robot’s final words on the questionnaire outcome. In the current setting, the participants fill in the questionnaire after Emys ends the conversation by either complaining to the user about a low donation or thanking them for a high donation. These final remarks might influence the mental state of the subjects before they fill in the post-questionnaire and hence lead to biased data. In future settings, we are going to ask the subjects to fill in the post-questionnaire before the robot’s final words. In addition, since the interaction time varies across scenarios, in our future steps we are going to investigate whether the total interaction length is a relevant factor, and whether the participants’ responses to the robot are worth analyzing.