1 Introduction

Social machines such as personal assistants, chatbots and robots are increasingly used in our societies. More specifically, humanoid social robotics aims to create robots that resemble humans in their anthropometric structure and that can interact with humans in a way that seems natural. Robots often replace a human or an animal in tasks where their endurance, speed or flexibility improves execution. They can be used in numerous sectors: assisting elderly and/or disabled people, teaching children, or entertainment.

The health crisis caused by Covid-19 led to greater use of these technologies, because they can compensate for a lack of social contact among isolated people and can be used safely in situations that require contact and communication with the public, such as hospitals, airports or restaurants [23].

However, much remains to be done to understand how these machines affect human behaviour and capabilities, starting with the most basic level: their mere physical presence.

1.1 Presence Effect

The presence of others may have powerful effects on cognition, called Social Facilitation and Impairment effects, especially on executive attention (e.g. inhibitory control). This can be explained by the fact that a presence in the environment is an important cue for adapting how to behave and communicate. Cognitive performance can be facilitated or impaired depending on the complexity of the task: the presence of a conspecific improves performance on easy or well-learned tasks and impairs performance on difficult ones [21]. This effect can occur whether the conspecific acts as a simple audience or as a co-actor [21].

Many studies have shown that the presence of a conspecific, in species ranging from cockroaches [22] to baboons [8], impacts cognitive systems. With the emergence of pseudo-conspecifics such as social humanoid robots, the impact of their presence may be questioned. Because humans tend to anthropomorphise objects in their environment quickly, some social machines can be promoted to the status of pseudo-conspecifics and attributed some human traits. Physical embodiment can turn the robot into a social agent that generates social effects in the same way a nearby human does. If the mere presence of a conspecific affects cognitive capacities, it is possible that a social agent with some human traits also generates a presence effect.

1.2 Robotic Presence

An interaction with a robot has an additional dimension compared with interactions with other kinds of social agents such as a chatbot or a personal assistant: the robot is embodied and physically close to its interactor. Embodiment can benefit the acceptance and use of robots during an interaction, compared with an interaction with a picture on a computer screen. A social robot is judged more helpful, watchful and enjoyable than the same robot when it is tele-present (presented on a screen) [19, 20]. Indeed, an embodied robot that is filmed and shown on a screen lies between a physically present robot and a virtual agent without embodiment. In the field of healthcare, people who received advice from a physically present robot took it more seriously, choosing healthier snacks, than people who received the same advice from a tele-present robot or a virtual agent; the presence of the robot makes it more convincing [9]. In 2015, a meta-analysis [11] examined the impact of robotic presence on human cognition by testing whether simple embodiment is enough or whether physical presence is needed. The results showed that robots are perceived as more persuasive and less distracting, and are judged more positively, when they are physically present than when they are only tele-present or virtual agents. The presence of a robot also leads to better performance and faster learning in different cognitive and motor tasks (colour recognition, Tower of Hanoi...). In summary, physical presence alongside people matters more for triggering the social presence effect than embodiment alone.

Although a substantial number of studies show that social robots can affect human behaviour during face-to-face interactions [7, 15], fewer studies have looked at the effects of robotic presence during a task in which the robot is not directly engaged but merely present (e.g. [3, 14]). One study replicated, with a robotic presence, the beneficial effect of human presence on attentional control during a task that requires inhibiting a detrimental automatism [16].

In addition to mere presence, the perception of a potential evaluation by a conspecific can affect performance. Performing more poorly than one's skill level would predict (“choking”) can occur in situations where good performance becomes increasingly important (outcome pressure) or in situations where performance is evaluated (monitoring pressure). According to the choking literature, outcome pressure is associated with reduced executive control of attention [4, 5].

The experiment presented here aims to replicate the influence of mere robotic presence on human cognitive control, and more specifically on human executive control. The second objective is to assess to what extent the presence effect depends on the perceived capacity of the robot to evaluate.

2 Method

2.1 Participants

Ninety-one participants were recruited (mean age = 23.54 years, SD = 5.73; 60 females and 31 males). All participants were right-handed native French speakers with normal or corrected-to-normal vision. They were naive to the purpose of the experiment and did not know that it involved a robot. They had no previous experience with the robot. The sample size was determined from the effect size of the robotic presence effect during a Stroop task [17].

The participants were randomly assigned to three experimental conditions: 30 to the Alone Condition (control condition), 31 to the Non-Evaluative Condition and 30 to the Evaluative Condition.

2.2 Procedure

All participants performed the Stroop task twice. First, every participant performed the Stroop task alone, after the experimenter had left the room. This first session serves as a baseline to account for interindividual differences.

Then, participants moved to another room. Participants in the Alone Condition watched a short landscape video (a distracting task) before performing another Stroop task. The Alone Condition controls for the effects of the room change, training and fatigue. In the two other conditions (Evaluative and Non-Evaluative), the robot was present in the second room, facing a computer screen (see Fig. 1 for the experimental setting). Participants first watched an interaction between the experimenter and the robot. In the Evaluative Condition, the robot explained to the experimenter and the participant that it was able to evaluate the speed and accuracy of Stroop answers scrolling on the screen. The robot then gave a quick (pre-scripted) demonstration in which it commented on Stroop answers (for example, “That was a quick answer!”). In the Non-Evaluative Condition, the robot explained that it was able to evaluate a Flanker task (in which the direction of a target arrow must be reported among distracting arrows), and it explicitly said and demonstrated that it was not able to evaluate Stroop answers. After the interaction, the robot quietly continued to look at its screen while the participant prepared for another Stroop task and the experimenter left the room. During this second Stroop task, the participant sat in front of the robot, which passively watched them 60% of the time (see Fig. 3).

At the end, participants who met the robot (Evaluative and Non-Evaluative Conditions) filled out the Human-Robot Interaction Evaluation Scale (HRIES [18]), which was used to measure how much they anthropomorphised the robot. Participants also rated some perceived competences and evaluation capacities of the robot. These items concerned the Stroop task and the capacity of the robot to evaluate previous participants on this task (e.g. “Is the robot able to give the colour of a word?” or “Is the robot able to correct the colour of a word?”).

Table 1. Experimental conditions.

Robot. The robot used in this experiment is an iCub with a modified head (photograph in Fig. 2). This head is designed to improve its capacity to communicate with humans (articulated lips and jaw, pinnae, irises designed to be easily readable by humans...). The robot is 1 m tall and mounted on a stand, which places its face at the same height as that of a seated adult. The movements of the head and torso, as well as its words during the experiment, were pre-scripted. During the interaction, the experimenter secretly pressed a button on a remote controller to give the illusion that the robot acted and reacted (talking, interrupting, turning to face humans...) with appropriate timing.

Fig. 1. Experimental setting.

Fig. 2. The robot used in the experiment, called Nina, is an iCub with a modified head.

Stroop Task. This well-known task [12] requires individuals to identify, as quickly and as accurately as possible, the colour in which a word is printed while ignoring the word itself (and its meaning). Because word reading is automatic, participants have to inhibit the meaning and/or the response activated by the word dimension.

Identification times are consistently longer for colour-incongruent words (e.g., the word BLUE in green ink) than for colour-neutral signs (e.g., +++ in green ink), a phenomenon called Global Stroop interference. Recent studies have shown that Stroop interference is a composite rather than a unitary phenomenon, reflecting multiple processes and involving different types of conflict: task conflict, semantic conflict, and response conflict ([2]; [1]; see also [13] for a review). We therefore used an extended semantic version of the Stroop task [2] that allows the measurement of all the types of cognitive conflict underlying the Global Stroop interference (standard Stroop interference, task conflict, semantic conflict, response conflict).

To this end, four types of stimuli were used: standard colour-incongruent words (e.g., BLUE in green), associated colour-incongruent words (e.g., SKY in green), colour-neutral words (e.g., DOG in green), and colour-neutral symbols (e.g., +++ in green). The different conflicts are computed as follows:

  • Global Stroop interference: reaction times (RTs) for standard colour-incongruent words minus RTs for colour-neutral symbols (\(\text{BLUE}_{\text{green}} - \text{+++}_{\text{green}}\))

  • Standard Stroop interference: RTs for standard colour-incongruent words minus RTs for colour-neutral words (\(\text{BLUE}_{\text{green}} - \text{DOG}_{\text{green}}\))

  • Task conflict: RTs for colour-neutral words minus RTs for colour-neutral symbols (\(\text{DOG}_{\text{green}} - \text{+++}_{\text{green}}\))

  • Semantic conflict: RTs for associated colour-incongruent words minus RTs for colour-neutral words (\(\text{SKY}_{\text{green}} - \text{DOG}_{\text{green}}\))

  • Response conflict: RTs for standard colour-incongruent words minus RTs for associated colour-incongruent words (\(\text{BLUE}_{\text{green}} - \text{SKY}_{\text{green}}\))

Task conflict occurs because the individual’s attention is drawn to the irrelevant word-reading task instead of being fully focused on the relevant colour-identification task, leading the two processes to compete. Semantic conflict occurs because the (irrelevant) meaning of the word dimension and the (relevant) meaning of the colour dimension interfere. Response conflict occurs because the incorrect pre-motor response activated by the word dimension interferes with the correct pre-motor response activated by the colour dimension.
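As an illustration, the five contrasts defined above can be sketched in a few lines of Python. This is a minimal sketch, not the analysis code of the study: the function name, the dictionary keys and the example RTs are hypothetical and chosen only for readability.

```python
from statistics import mean


def stroop_conflicts(rts):
    """Compute the five Stroop contrasts from correct RTs (in ms).

    `rts` maps a stimulus type to a list of RTs. The key names
    ("incongruent", "associated", "neutral", "symbol") are illustrative
    assumptions, not identifiers taken from the study.
    """
    m = {kind: mean(values) for kind, values in rts.items()}
    return {
        "global_interference": m["incongruent"] - m["symbol"],
        "standard_interference": m["incongruent"] - m["neutral"],
        "task_conflict": m["neutral"] - m["symbol"],
        "semantic_conflict": m["associated"] - m["neutral"],
        "response_conflict": m["incongruent"] - m["associated"],
    }


# Made-up RTs for one participant (not data from the experiment).
example = {
    "incongruent": [810.0, 790.0],
    "associated": [760.0, 740.0],
    "neutral": [730.0, 710.0],
    "symbol": [700.0, 680.0],
}
conflicts = stroop_conflicts(example)
```

By construction, the Global Stroop interference equals the sum of the task, semantic and response conflicts, and the standard Stroop interference equals the sum of the semantic and response conflicts, which is what makes this decomposition convenient.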

The stimuli were taken from [16] and consisted of four colour words (rouge [red], jaune [yellow], bleu [blue], and vert [green]), four colour-associated words (tomate [tomato], maïs [corn], ciel [sky], and salade [salad]), four colour-neutral words (balcon [balcony], chien [dog], pont [bridge] and robe [dress]), and four strings of +++s of the same length as the colour-incongruent stimuli. Colour-incongruent and colour-associated words always appeared in colours that were incongruent with the meaning of their word dimension. There were 192 trials overall, composed of the 16 stimuli presented in different colours, four times each, on a black screen. The interstimulus interval lasted 1500 ms, during which a white fixation cross appeared at the centre of the screen. Responses were given manually on a keyboard with four unlabelled keys (“2”, “4”, “6” and “8”) corresponding to the four colours used (blue, green, yellow and red, respectively). Before the first Stroop task, participants completed a training session in order to learn and automatise the correspondence between keyboard keys and colours. This session comprised 128 training trials in which the letter strings were replaced by symbols (“****”) in the four target colours.

3 Results

3.1 Questionnaires

Anthropomorphism. We compared data from the HRIES with a repeated-measures analysis of variance (ANOVA). The dependent variable is the answer to each item of the HRIES. The independent variables are the item and the presence condition (with an evaluative robot or a non-evaluative robot). There is no significant effect of the presence condition (F(1,56) = 1.327, p = 0.25) and no significant interaction between the item and the presence condition (F(5,302) = 1.019, p = 0.41). This analysis shows that the same anthropomorphic inferences were made in the two robotic presence conditions.

Competence. To check whether the competences of the robot were perceived differently depending on the presence condition, we conducted a repeated-measures analysis of variance (ANOVA) on the competence questionnaire. The dependent variable is the answer to each item of the competence questionnaire. The independent variables are the item and the presence condition (with an evaluative robot or a non-evaluative robot). There is no main effect of the presence condition (F(1,56) = 1.521, p = 0.22). As expected, there is a significant two-way interaction between the presence condition and the perceived competences (F(2,161) = 33.185, p = 2.34e−16). Pairwise comparisons were then carried out with paired t-tests and a Bonferroni adjustment to determine which groups differ. The results for the items targeting the competence of the robot to evaluate the Stroop task (“It knows when the colour of a word has been correctly answered”, p = 0.0025; “It knows when the colour of a word has been rapidly answered”, p = 0.043; and “It knows when the colour of a word has been correctly and rapidly answered”, p = 0.016) reveal significant effects of the presence condition: the robot was perceived as more competent to evaluate the Stroop task in the ‘Evaluative’ condition than in the ‘Non evaluative’ condition. The interaction with the robot thus successfully induced the perception of evaluation capacities.
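For readers unfamiliar with the Bonferroni adjustment mentioned above, the following minimal Python sketch shows the principle: each raw p-value is multiplied by the number of comparisons and capped at 1. The function name and the raw p-values are hypothetical; the number of comparisons actually corrected for in this analysis is not restated here.

```python
def bonferroni(p_values):
    """Return Bonferroni-adjusted p-values (raw p times number of tests, capped at 1)."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical raw p-values, not values from the experiment.
raw_p = [0.001, 0.02, 0.4]
adjusted_p = bonferroni(raw_p)
```

An adjusted p-value is then compared with the usual threshold (e.g. 0.05), which is equivalent to comparing the raw p-value with the threshold divided by the number of comparisons.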

3.2 Stroop Task

Two participants were removed because their mean RTs were more than 2 SD above or below the overall mean. Because the statistical analysis is based on correct reaction times, incorrect responses were removed (2.46% of all responses), as were correct responses with RTs more than 2 SD above or below the mean per participant and per condition (5% of the correct responses). The values of the different Stroop conflicts were computed as explained above.
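The trial-level trimming described above can be sketched as follows. This is only an illustration under stated assumptions: the field names are hypothetical, a "cell" is taken to be one participant in one condition, and the sample standard deviation is used; the original analysis may have differed on these points.

```python
from statistics import mean, stdev


def trim_rts(trials, n_sd=2.0):
    """Keep correct trials whose RT lies within n_sd SDs of their cell mean.

    `trials` is a list of dicts with the (assumed) keys "participant",
    "condition", "rt" (in ms) and "correct" (bool). A cell is one
    participant x condition combination.
    """
    cells = {}
    for t in trials:
        if t["correct"]:  # incorrect responses are dropped first
            cells.setdefault((t["participant"], t["condition"]), []).append(t)

    kept = []
    for cell_trials in cells.values():
        rts = [t["rt"] for t in cell_trials]
        if len(rts) < 2:
            kept.extend(cell_trials)  # SD is undefined for a single trial
            continue
        m, sd = mean(rts), stdev(rts)
        kept.extend(t for t in cell_trials if abs(t["rt"] - m) <= n_sd * sd)
    return kept


# Made-up trials for one participant in one condition (not study data).
example_trials = [
    {"participant": 1, "condition": "incongruent", "rt": rt, "correct": True}
    for rt in (700, 710, 690, 705, 695, 700, 1500)
]
example_trials.append(
    {"participant": 1, "condition": "incongruent", "rt": 650, "correct": False}
)
kept_trials = trim_rts(example_trials)
```

In this made-up example the incorrect trial and the 1500 ms trial are discarded, and the six remaining trials are kept.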

Analysis of Covariance (ANCOVA). An ANCOVA was performed to determine the effect of the presence condition and the type of stimuli on RTs during the second Stroop task, after controlling for RTs during the first Stroop task. This accounts for the between-participant variability in reaction times. RTs during the second Stroop session are the dependent variable; presence condition (alone, non-evaluative and evaluative) and type of stimuli are the grouping variables; RTs during the first Stroop session (performed alone) are the covariate.

Fig. 3. Mean answers for the questionnaires. The anthropomorphism answers are the means over all items of the HRIES; no difference was found between the evaluative and non-evaluative conditions. The competence answers shown are the answers to the item “It knows when the colour of a word has been correctly answered”; there is a significant difference: the evaluative robot was perceived as more competent than the non-evaluative robot.

After adjustment for the first Stroop RTs, there was no statistically significant effect of the type of stimuli (p = 0.47) and no interaction between the type of stimuli and the presence condition (p = 0.71). There was a significant effect of the presence condition (F(2,335) = 7.61, p = 5.86e−04, \(\eta ^2_{G} = 0.043\)). Post hoc analysis was performed with a Bonferroni adjustment. The adjusted mean RT was significantly lower in the alone condition (742.9 ms ± 11) than in the evaluative condition (772.7 ms ± 11), p < 0.001. The adjusted mean RT in the non-evaluative condition (748 ms ± 10) was also significantly lower than in the evaluative condition, p = 0.007. There was no statistically significant difference between the alone condition and the non-evaluative condition (p = 0.53). The non-evaluative presence of a robot thus had no effect on Stroop RTs, whereas the evaluative presence of a robot did, with RTs roughly 30 ms longer than in the other conditions.
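To make the notion of an adjusted mean concrete, the following Python sketch computes covariate-adjusted group means for a simplified, one-factor ANCOVA: each group mean on the second session is corrected by the pooled within-group regression slope times the distance between the group's first-session mean and the grand first-session mean. It is a sketch under stated assumptions, not the analysis reported above, which also included the type of stimuli as a grouping variable; the function name and the data are hypothetical.

```python
from statistics import mean


def adjusted_means(groups):
    """Covariate-adjusted group means for a one-factor ANCOVA.

    `groups` maps a condition label to a list of (covariate, outcome)
    pairs, here (first-session RT, second-session RT) per participant.
    """
    x_means = {g: mean(x for x, _ in pairs) for g, pairs in groups.items()}
    y_means = {g: mean(y for _, y in pairs) for g, pairs in groups.items()}
    grand_x = mean(x for pairs in groups.values() for x, _ in pairs)

    # Pooled within-group slope of the outcome on the covariate.
    sxy = sum((x - x_means[g]) * (y - y_means[g])
              for g, pairs in groups.items() for x, y in pairs)
    sxx = sum((x - x_means[g]) ** 2
              for g, pairs in groups.items() for x, _ in pairs)
    slope = sxy / sxx

    # Shift each group mean to what it would be at the grand covariate mean.
    return {g: y_means[g] - slope * (x_means[g] - grand_x) for g in groups}


# Made-up (first RT, second RT) pairs in ms, not data from the experiment.
example_groups = {
    "alone": [(700, 720), (800, 820)],
    "evaluative": [(720, 770), (820, 870)],
}
adjusted = adjusted_means(example_groups)
```

In this made-up example the raw second-session means differ by 50 ms, but part of that gap reflects the evaluative group being slower at baseline; after adjustment the difference shrinks to 30 ms.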

Fig. 4. Adjusted means during the second Stroop task, after adjustment for reaction times in the first Stroop task.

4 Discussion

The present study replicated an effect of robotic presence on reaction times in a Stroop task under some conditions and brings new evidence about the importance of evaluative robotic pressure.

In previous findings about social presence, reaction times in a Stroop task decreased [16]. In this experiment, reaction times were longer with an evaluative robot than alone or even with a non-evaluative robot. The presence of an evaluative robot affected performance, whereas the presence of a non-evaluative robot did not. Because there is no interaction between the type of stimuli and the presence condition, the presence of an evaluative robot increases reaction times regardless of the stimuli. The impact of evaluative pressure therefore seems to be low-level: the distraction caused by this pressure affects all types of stimuli. Evaluative presence may tax attentional resources by capturing attention when there is a risk of being evaluated during the ongoing task. Choking under robot pressure outweighed the potential facilitation due to the robot’s presence. The priority of choking over facilitation has been reported in previous studies (e.g. [6, 10]). What is more surprising is the absence of a significant difference between the alone condition and the non-evaluative condition. A possible explanation is that, despite our attempt to create a space where participants felt alone, the laboratory context of a research experiment and the presence of cameras in each room may have produced a monitoring effect, even in the alone condition.

In support of the importance of evaluative pressure, the absence of an effect in the non-evaluative condition shows that the impairment in the evaluative condition is not due to distraction caused by the noise of the robot’s motors and battery: the same noise was heard in the experimental room in both presence conditions. Nor is the effect due to a novelty effect caused by meeting a humanoid robot. Thus, despite the comparable environment in the two presence conditions, performance differed significantly with and without a perceived evaluative presence.

Moreover, the level of anthropomorphisation is the same in the two presence conditions. One limitation is that we did not verify whether the interaction is necessary for the robot to be anthropomorphised. It would be interesting to also ask participants in the alone condition to complete the anthropomorphisation questionnaire, without any previous interaction with the robot. It is possible that the robot was too weakly anthropomorphised in both presence conditions and was considered merely a non-social machine. However, since some robots with fewer humanoid features than the one used here have been shown to be anthropomorphised after an interaction [16], it seems reasonable to accept that the robot was anthropomorphised in this experiment.

In conclusion, the present study brings evidence that the presence of a humanoid social robot that has the competence to evaluate the ongoing task may capture attentional resources and impair performance during a Stroop task. Research on robotic presence and evaluative robotic pressure is crucial both for our understanding of the effects of social robots on human cognition, with practical implications for how social robots should be designed, and for the development of this new facet of social robotics based on experimental social-cognitive psychology.