1 Introduction

Humanoid robots may become increasingly present in the most ordinary contexts of millions of people worldwide, a plausible projection given the increasing attention of social robotics to the cognitive, interactive, and affective skills of robots designed to live with humans [1, 2]. While tremendous progress has been made in this area, the influence that the mere presence of humanoid robots may have on human cognition itself remains poorly understood. There is evidence in both adults [3] and children [4] that the presence of a humanoid robot can produce effects similar to those of human presence in terms of feelings [5] and task performance [3, 4]. However, these pioneering efforts overlooked both the attentional processes and the anthropomorphic inferences (the attribution of human characteristics to non-human animals or machines) that may be involved in the influence of robotic presence. Here, we take advantage of research on human presence and attention and argue that the presence of humanoid robots, even passive ones, may affect attentional processes as fundamental as conflict resolution in the Stroop task, at least when the robot being present is anthropomorphized to some extent.

1.1 Brief Review of Earlier Research on Social Presence Effects

Evidence accumulated over more than a century in experimental social psychology shows a tendency for humans and nonhuman animals to perform differently on a myriad of motor and cognitive tasks when in the presence of conspecifics (other members of the same species) than when alone. Following Triplett’s [6] pioneering efforts on what are referred to as social facilitation/impairment (SFI) effects (for reviews see [7,8,9]), many researchers tried to make sense of seemingly contradictory results: whatever the species examined, the presence of conspecifics sometimes facilitates and sometimes impairs task performance. Zajonc [10] was the first to notice that the presence of observers or coactors typically facilitates performance on easy or well-learned tasks, and impairs performance on difficult or poorly-learned tasks. Based on the Hull-Spence behaviorist theory of learning, conditioning, and motivation (well accepted in the 1950s and 1960s) [11], Zajonc suggested that the mere presence of conspecifics increases arousal and, thereby, the frequency of dominant (habitual) responses. According to Hull-Spence, the energization of dominant responses improves performance in well-learned tasks where, by definition, correct responses are dominant, and deteriorates it in poorly-learned tasks where errors are the most likely responses. Zajonc’s view of SFI effects found support in many studies using very different species, whose dominant responses, whether correct or incorrect, increased under social presence compared with isolation. However, although Zajonc’s classic view remains the most common interpretation of SFI effects (see also [12] for a motivational account close to Zajonc’s solution), there is evidence that these effects also involve attentional mechanisms, at least in humans (e.g., [13,14,15,16,17,18,19]) and non-human primates [20, 21].

Thirty years ago, Baron’s [13] distraction/conflict theory suggested the first integrative attentional view of SFI effects. The key idea is that social presence, when it is distracting or diverts attention away from the focal task, can create attentional conflict, a form of response conflict regarding what attentional response one should make (paying attention to the focal task vs. the person present). This conflict, in turn, may threaten the organism with cognitive overload and, ultimately, cause a restriction in attention focus. Ironically, attention focusing may produce just the task effects associated with the energization of dominant responses: facilitation of performance (by screening out nonessential stimuli) when the task is simple or requires attention to a small number of central cues, and impairment of performance (by neglecting certain crucial stimuli) when the task is more complex or demands attention to a wide range of cues.

One strategy for differentiating the two hypotheses (attention focusing vs. dominant response) is to use poorly-learned tasks that involve only a few key stimuli [13]. In this context, the attention focusing hypothesis predicts social facilitation whereas the dominant-response hypothesis predicts social impairment. To this end, Huguet et al. [16] used the well-known Stroop task [22, 23], which requires individuals to identify the color in which a word is printed while ignoring the word itself. Because of the automaticity of word reading [23, 24], identification times are consistently longer for color-incongruent words (the word “BLUE” in green ink) than for color-neutral items (“DESK” in green ink), a phenomenon typically referred to as standard Stroop interference. This interference indicates how difficult the control of attention can be when faced with competing, conflicting automatic activations. To the extent that word reading is the dominant tendency in Stroop’s paradigm, Zajonc’s [10] solution predicts that social presence should increase Stroop interference. In contrast, if social presence leads to a restriction in attention focus, it should reduce Stroop interference (by focusing more exclusively on the letter color cues). Huguet et al. [16] provided the first evidence that Stroop interference is reduced in contexts involving the presence of other human agents, either as observers or coactors [16, 25], compared with when participants perform the Stroop task alone. Reduced Stroop interference under social presence has since been replicated (e.g., [26,27,28]), and its underlying mechanisms clarified, especially regarding which component of the interference is affected.

Recent studies have shown that Stroop interference is a composite rather than unitary phenomenon, reflecting multiple processes and involving different types of conflict: task conflict, semantic conflict, and response conflict (see [24, 29,30,31,32]). Task conflict is thought to arise because the individual’s attention is drawn by the irrelevant (i.e., word reading) activation instead of being fully focused on the relevant (i.e., color identification) task, leading the two processes to compete (e.g., [30,31,32]). Semantic conflict is thought to occur because the meaning of the word dimension and that of the color dimension are simultaneously activated. Since they both correspond to colors, the meaning activated by the irrelevant word dimension interferes with the meaning activated by the relevant color dimension, creating a delay in processing (e.g., [24, 28,29,30]). Response conflict is thought to arise because the incorrect (pre-)motor response activated by the word dimension interferes with the correct (pre-)motor response activated by the color dimension [24, 30, 31]. This distinction is crucial for determining which of these three conflicts (task conflict, semantic conflict, or response conflict) is influenced by social presence.

Augustinova and Ferrand [26] showed that social presence does not prevent semantic processing per se (word reading does occur), but boosts the control of attention at the later stage of response competition, reflecting a reduction of response conflict specifically (for a similar conclusion see also [28]). Thus, there is evidence that Stroop performance is facilitated in the presence of others, a phenomenon reflecting a reduction of response conflict rather than task conflict or semantic conflict. Finally, although Baron’s [13] distraction/conflict theory treated attention focusing as an automatic response to attentional conflict, more recent findings suggest that it may also reflect improved cognitive control in the presence of others. Sharma et al. [28], for example, showed that the reduction of Stroop interference under social presence is prevented by using short response-to-stimulus intervals, which are thought to reduce cognitive control processes. This is also consistent with the more general view that successful Stroop performance relies on executive attention, especially the deployment of top-down inhibitory control to suppress word reading to the benefit of color identification. These findings do not necessarily invalidate Zajonc’s [10] classic theory, but indicate that attentional mechanisms also matter in SFI effects.

1.2 The Present Research

In the present research, we used an extended semantic version of the Stroop task to specify which component of Stroop performance is influenced by robotic presence. This version allowed the measurement of all types of cognitive conflict underlying Stroop interference (task conflict, semantic conflict, response conflict, as described earlier in this paper). This extended version comprised color-incongruent words (e.g., the word BLUE in green ink, i.e., BLUEgreen), color-associated incongruent words (e.g., SKYgreen), color-neutral words (e.g., DOGgreen) and color-neutral letter-strings (e.g., XXXgreen). The inclusion of color-neutral letter-strings (e.g., XXXgreen) allows the separation of task conflict from the two other conflicts: a significant difference in mean response time between color-neutral words and color-neutral letter-strings (e.g., DOGgreen − XXXgreen) reflects differences in activation of the irrelevant reading task set (task conflict per se). The inclusion of color-associated incongruent words (e.g., SKYgreen) allows the separation of semantic conflict from the two other conflicts: a significant difference in mean response time between color-associated incongruent words and color-neutral words (e.g., SKYgreen − DOGgreen) solely reflects the semantic conflict (with no response conflict). Finally, the standard color-incongruent words (e.g., BLUEgreen) allow the separation of response conflict from the two other conflicts: a significant difference in mean response time between standard color-incongruent words and color-associated incongruent words (e.g., BLUEgreen − SKYgreen) solely reflects the conflict occurring at the level of response processing (response conflict per se).

In addition, we used a design which maximized anthropomorphic inferences in only one of two robotic conditions: a robot presence condition preceded by a verbal interaction with the robot (social robot condition) versus the presence of the same robot without any prior interaction (non-social robot condition). This strategy made it possible to determine whether the beneficial effects of social robotic presence, if any, reflect the action of strictly mechanical distraction or more sophisticated, social-cognitive processes involving anthropomorphic inferences. We expected these beneficial effects to occur on Stroop performance exclusively in the social robot condition (maximizing anthropomorphic inferences), compared to when individuals perform the Stroop task alone or in presence of a non-anthropomorphized robot. When the robot is thought to have human characteristics, we reasoned, its presence may produce exactly the same effects as human presence [16, 26, 28]: Under this social robotic condition, the robot’s presence should cause a reduction of standard Stroop interference and a better resolution of response conflict specifically, compared to when the Stroop task is performed in isolation or in presence of a non-anthropomorphized robot.

2 Method

2.1 Participants

Participants were 118 young adults (mean age = 19.24 years, SD = 1.32, 110 females and 8 males) with normal (or corrected-to-normal) vision (39 in the Alone condition, 40 in the Non-Social Robot condition, and 39 in the Social Robot condition). Sample size was determined, as recommended by Tabachnick and Fidell [33], on the basis of the desired power (.80), alpha level (.05), number of groups (three in the main analysis), and anticipated effect size based on human presence effects (using a between-subjects design) in Stroop’s paradigm (\( \eta^{2}_{p} \) = .10; [16]). Using G*Power 3.1 [34], the minimum required sample size was calculated as 90.
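As a worked illustration of the effect size entered in the power computation: G*Power's F-test procedures take Cohen's f as input, and the standard conversion from a partial eta squared is shown below. Which G*Power test family and options were selected is not stated in the text, so this is only a sketch of the conversion step, not a reconstruction of the full calculation.

```latex
f \;=\; \sqrt{\frac{\eta^{2}_{p}}{1-\eta^{2}_{p}}}
  \;=\; \sqrt{\frac{.10}{1-.10}}
  \;=\; \sqrt{.111}
  \;\approx\; .33
```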

2.2 Procedure

Participants performed the standard Stroop task twice (Session 1, Session 2). Each participant therefore served as his/her own control for Stroop performance, which allowed us to control for inter-individual differences on the Stroop task [35]. First (Session 1), participants performed the task alone (the experimenter left the room), then (Session 2) either alone or in the presence of a robot (the experimenter left the room in all conditions). The robot was a 1-m Meccanoid G15KS humanoid, as we assumed that even robots with a basic humanoid appearance can be anthropomorphized [36], at least under certain circumstances (see below). A 3-min break was inserted between the two Stroop sessions, during which participants were randomly assigned to one of three conditions (Fig. 1). In the “non-social robot” condition (n = 40), participants were asked to give their opinion on the appearance of a physically present but passive robot as a means to provide data for unrelated projects with roboticists. In the “social robot” condition (n = 39), participants were asked to interact verbally with the same robot, which was (unbeknownst to them) animated remotely by a human operator using two smartphones to control the robot’s gestures and speech (by selecting pre-established conversational scripts) in a coherent way (“Wizard of Oz paradigm”, [37]). This condition encouraged anthropomorphic inferences (see pre-test on anthropomorphic inferences below). The interaction always followed the same pre-established script (Table 1), the operator having only to choose when to launch a given sequence. In the “alone” condition (n = 39), participants described a picture of a landscape, a task that occupied them for the same amount of time as participants in the other two conditions.

Fig. 1

The design of the experiment was a 2 (Stroop task session 1, Stroop task session 2) × 4 (Type of stimuli) × 3 (Performance context: Alone, Non-social robot, Social robot)

Table 1 Verbal script used in the experiment

After the 3-min break, all participants again performed the Stroop task either alone (as before) or in the presence of the non-social robot versus social robot. In the two robotic presence conditions, the robot was positioned in front of participants (to their right on the edge of their peripheral vision; see Fig. 2) and watched them 60% of the time by turning its head according to a pre-established script (for a similar procedure with human presence, see [16]). The robot was piloted by two smartphones connected via Bluetooth. Movements were controlled by a Motorola Moto G 4G. Sounds were controlled by an LG Optimus 2X connected to a JBL speaker. Both smartphones ran Android. Voices were designed with Voxal by NCH Software using the Pixie voice module. A hidden control camera was used to ensure good control over movements and responses for the Wizard of Oz paradigm.

Fig. 2

Experimental setting

2.2.1 Stroop Task

EPrime 2.1 (Psychology Software Tools, Pittsburgh) running on a PC (Dell Precision) was used for Stroop stimulus presentation and data collection. The participants were seated approximately 50 cm from a 17-inch Dell color monitor. Their task was to identify the color of the letter-strings presented on the screen as quickly and accurately as possible while ignoring their meanings. To this end, the participants were instructed to fixate the white cross (“+”), which appeared in the center of the (black) screen for 500 ms. The cross was then replaced by a letter-string that continued to be displayed until the participant responded (or until 3500 ms had elapsed). After this response, a new stimulus appeared on the screen, again replacing the fixation point and beginning the next trial. The response-stimulus interval was 1 s [28]. The participants responded using a keyboard placed on a table between the participant and the monitor. The keys were labeled with colored stickers, with key “1” representing red, key “2” representing green, key “3” representing blue and key “4” representing yellow. Before the beginning of the experimental block in the first Stroop session, the participants practiced learning which key on the keyboard represented each color (key-matching practice trials). In these 128 practice trials, strings of asterisks presented in the four colors (e.g., ***, ***) were used (instead of the experimental stimuli, see above).

In order to assess the respective contribution of the different conflicts (task conflict, semantic conflict, and response conflict) involved in overall Stroop interference, four types of stimuli were used: standard color-incongruent words (e.g., BLUE in green), associated color-incongruent words (e.g., SKY in green), color-neutral words (e.g., DOG in green), and color-neutral letter strings (e.g., XXX in green; see [30] for presentation parameters). The different conflicts were computed as follows [30]: task conflict (RTs for color-neutral words minus RTs for color-neutral letter strings), semantic conflict (RTs for associated color-incongruent words minus RTs for color-neutral words), response conflict (RTs for standard color-incongruent words minus RTs for associated color-incongruent words), standard Stroop interference (RTs for standard color-incongruent words minus RTs for color-neutral words).
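The four difference scores defined above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function names, dictionary keys, and RT values are hypothetical and are not taken from the study's data or analysis scripts.

```python
from typing import Dict


def stroop_conflicts(mean_rt: Dict[str, float]) -> Dict[str, float]:
    """Decompose Stroop interference from mean correct RTs (ms) per stimulus type.

    Keys of `mean_rt` (hypothetical names): "letter_string", "neutral_word",
    "associated_incongruent", "standard_incongruent".
    """
    return {
        # color-neutral words minus color-neutral letter strings
        "task_conflict": mean_rt["neutral_word"] - mean_rt["letter_string"],
        # associated color-incongruent words minus color-neutral words
        "semantic_conflict": mean_rt["associated_incongruent"] - mean_rt["neutral_word"],
        # standard color-incongruent words minus associated color-incongruent words
        "response_conflict": mean_rt["standard_incongruent"] - mean_rt["associated_incongruent"],
        # standard color-incongruent words minus color-neutral words
        "standard_interference": mean_rt["standard_incongruent"] - mean_rt["neutral_word"],
    }


def improvement(session1: Dict[str, float], session2: Dict[str, float]) -> Dict[str, float]:
    """Session 1 score minus Session 2 score: positive values mean less conflict in Session 2."""
    return {name: session1[name] - session2[name] for name in session1}


# Made-up mean RTs (ms) for one participant, for illustration only.
example_s1 = stroop_conflicts(
    {"letter_string": 600.0, "neutral_word": 620.0,
     "associated_incongruent": 640.0, "standard_incongruent": 700.0}
)
example_s2 = stroop_conflicts(
    {"letter_string": 590.0, "neutral_word": 610.0,
     "associated_incongruent": 630.0, "standard_incongruent": 660.0}
)
example_gain = improvement(example_s1, example_s2)
```

By construction, standard Stroop interference equals the sum of semantic conflict and response conflict, since the associated color-incongruent term cancels out.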

2.2.2 Attitudes Toward Robots

At the end of the experiment, participants in the two robot presence conditions completed Nomura, Kanda and Suzuki’s [38] scale measuring negative attitudes toward robots, hereafter referred to as the NARS scale. The NARS scale comprises three constructs: social/future implications (e.g., “I feel that if I depend on robots too much, something bad might happen”); emotional attitudes (e.g., “I would feel uneasy if robots really had emotions”); and actual interactions (e.g., “I would feel very nervous just standing in front of a robot”). For each dimension, participants rated whether they agreed or disagreed (from 1 to 5).

2.2.3 Anthropomorphic Inferences

Participants also filled out the humanness scale based on Haslam’s ([18]; see “Appendix 1”) dehumanization taxonomy, which comprises four dimensions: human uniqueness (e.g., moral sensibility), animalistic dehumanization (e.g., irrationality), human nature (e.g., interpersonal warmth), and mechanistic dehumanization (e.g., inertness). Again, for each dimension, participants rated the extent to which they agreed or disagreed (from 1 to 5) with attributing the related characteristics to the robot being present. We conducted a pretest with 35 participants to evaluate the degree of anthropomorphism associated with the robot after either the verbal human–robot interaction designed for the experiment (social robot condition) or a simple observation of that same robot (non-social robot condition). The results showed a difference on mechanistic dehumanization, F(1, 34) = 7.78, p = .008, \( \eta^{2}_{p} \) = .193, and human nature, F(1, 34) = 11.59, p = .002, \( \eta^{2}_{p} \) = .261: Participants attributed fewer mechanistic traits and more human nature traits (e.g., interpersonal warmth) to the robot in the social robot condition than in the non-social robot condition. No effects were found on animalistic dehumanization and human uniqueness attributions (ps > .1).

3 Results

3.1 Stroop Data

3.1.1 Data Processing

The data from two participants were discarded because they responded randomly (around 50% accurate responses) in at least one Stroop session. The results obtained from the remaining participants are summarized in Table 2 (presented in “Appendix 2”). Errors occurred on 1.6% of the trials and were analysed independently (see “Appendix 3” for the full analysis of error rates). Correct trials with a reaction time (RT) more than 3 standard deviations below or above the mean of their condition for each participant were considered outliers and removed from RT analyses, which corresponded to 403 trials (1.27% of the trials). This filtering procedure has the advantage of removing extreme values without disproportionately affecting the data of one condition or one participant in particular.
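The trimming rule described above can be sketched as follows. This is a minimal illustration rather than the procedure actually run for the study: the tuple layout and function name are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not specify it.

```python
from collections import defaultdict
from statistics import mean, stdev
from typing import List, Tuple

Trial = Tuple[str, str, float]  # (participant id, condition label, RT in ms); hypothetical layout


def trim_outliers(trials: List[Trial], n_sd: float = 3.0) -> List[Trial]:
    """Drop correct-trial RTs farther than n_sd SDs from their participant-by-condition mean."""
    cells = defaultdict(list)
    for participant, condition, rt in trials:
        cells[(participant, condition)].append(rt)

    kept: List[Trial] = []
    for (participant, condition), rts in cells.items():
        if len(rts) < 2:
            # An SD cannot be computed from a single trial, so the trial is kept as-is.
            kept.extend((participant, condition, rt) for rt in rts)
            continue
        cell_mean = mean(rts)
        cell_sd = stdev(rts)  # assumption: sample SD (n - 1 denominator)
        lower = cell_mean - n_sd * cell_sd
        upper = cell_mean + n_sd * cell_sd
        kept.extend((participant, condition, rt) for rt in rts if lower <= rt <= upper)
    return kept


# Made-up data: 19 ordinary RTs and one extreme RT in the same cell.
demo_trials = [("p01", "neutral_word", 600.0)] * 19 + [("p01", "neutral_word", 3000.0)]
demo_kept = trim_outliers(demo_trials)
```

One property of this rule is worth keeping in mind: with the sample standard deviation, no value in a cell of n trials can lie more than (n − 1)/√n SDs from the cell mean, so a 3 SD criterion cannot exclude anything in cells of 10 or fewer trials.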

Table 2 Mean correct response times (in milliseconds), standard deviations (in parentheses) and error rates as a function of the Type of stimuli, Stroop session, and Performance context

3.1.2 Analysis

We conducted a repeated-measures analysis of variance (ANOVA) with Session (1 and 2) and Type of conflict (standard Stroop interference, task conflict, semantic conflict, response conflict) as within-subjects factors (see also [26, 39]), and Performance context (Alone, Non-social robot, and Social robot) as a between-subjects factor. This analysis revealed a significant Session × Type of conflict × Performance context interaction, F(2, 115) = 6.14, p = .003, \( \eta^{2}_{p} \) = .10 (see Fig. 3a–d). For the sake of simplicity, Fig. 3 (panels a–d) shows this interaction in terms of performance improvement from Session 1 to Session 2 on standard Stroop interference and each Type of conflict in each Performance context. This pattern was examined for standard Stroop interference (A) and each type of conflict (B, C, D) taken separately using two orthogonal contrasts according to our expectations: Alone versus Non-social robot condition; Social robot condition versus Alone and Non-social robot conditions averaged. Consistent with our expectations, the first contrast was not significant, that is, the presence of the non-social robot did not make any difference compared with isolation: (A) t(115) = .86, p = .389, \( \eta^{2}_{p} \) = .01; (B) t(115) = .88, p = .379, \( \eta^{2}_{p} \) = .02; (C) t(115) = -.05, p = .964, \( \eta^{2}_{p} \) < .01; (D) t(115) = -.38, p = .705, \( \eta^{2}_{p} \) < .01. The second contrast proved significant exactly as expected: standard Stroop performance and resolution of response conflict improved from Session 1 to Session 2 in the presence of the social robot more than in the two other conditions averaged, (A) F(2, 115) = 7.00, p = .001, \( \eta^{2}_{p} \) = .12; (B) F(2, 115) = 3.37, p = .038, \( \eta^{2}_{p} \) = .06. This effect was not found on semantic conflict, (C) F(2, 115) = 1.01, p = .368, \( \eta^{2}_{p} \) < .01, or task conflict, (D) F(2, 115) = 1.07, p = .347, \( \eta^{2}_{p} \) = .01. As also indicated in Table 2 (bottom; see “Appendix 2”), performance improvement was significant only in the presence of the social robot (ps < .001 for standard Stroop interference and response conflict), with large effect sizes. Thus, although both standard Stroop interference and response conflict were significant in the three performance contexts in both sessions (all ps ≤ .001, see Table 2 in “Appendix 2”), only the presence of the social robot reduced them significantly in Session 2 relative to Session 1. The presence of the non-social robot left standard Stroop interference and all types of conflict unchanged, compared to when participants worked alone.
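The two planned contrasts described above can be written as weight vectors over the three performance contexts. The sketch below is illustrative only: the specific weights, their signs, the names, and the example means are assumptions (the text names the comparisons but not the coding), and no inferential test is computed.

```python
from typing import Dict

# Hypothetical contrast weights over the three performance contexts.
ALONE_VS_NON_SOCIAL: Dict[str, float] = {"alone": 1.0, "non_social": -1.0, "social": 0.0}
SOCIAL_VS_OTHERS: Dict[str, float] = {"alone": -0.5, "non_social": -0.5, "social": 1.0}


def contrast_value(group_means: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of group means; the weights of a contrast sum to zero."""
    return sum(weights[group] * group_means[group] for group in weights)


def are_orthogonal(w1: Dict[str, float], w2: Dict[str, float]) -> bool:
    """Two contrasts are orthogonal (for equal group sizes) when their cross-products sum to zero."""
    return abs(sum(w1[group] * w2[group] for group in w1)) < 1e-12


# Made-up mean improvement scores (ms), for illustration only.
demo_means = {"alone": 10.0, "non_social": 10.0, "social": 40.0}
demo_first = contrast_value(demo_means, ALONE_VS_NON_SOCIAL)
demo_second = contrast_value(demo_means, SOCIAL_VS_OTHERS)
```

With these weights, the second contrast is simply the social robot mean minus the average of the two other means. The orthogonality check assumes equal group sizes; with the slightly unequal cells reported here (39, 40, and 39), it holds only approximately.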

Fig. 3

Standard Stroop interference at baseline minus standard Stroop interference in experimental session (Alone, Non-Social Robot, Social Robot): the higher the positive value, the higher performance improvement in the Stroop Task from baseline (Session 1) to experimental session (Session 2). Error bars represent ± 1 standard error. *p < .05

3.2 Attitudes Toward Robots

The data related to the NARS and humanness scales were examined using MANOVAs (one for each scale) with their different constructs entered simultaneously as dependent variables, and the two robot presence conditions (social vs. non-social robot) as the independent variable. The two robot presence conditions did not differ on the three constructs of the NARS scale, indicating that attitudes were no more negative in one condition than in the other: actual interactions, F(1, 77) = .67, p = .416, 95% CI [− .32; .42], \( \eta^{2}_{p} \) = .001; emotional attitudes, F(1, 77) = − .44, p = .51, 95% CI [− .30; .60], \( \eta^{2}_{p} \) = .006; or social/future implications, F(1, 77) = 1.401, p = .24, 95% CI [− .16; .62], \( \eta^{2}_{p} \) = .018. Interestingly, the reduction of standard Stroop interference in the social robot condition, compared to the non-social robot condition, remained significant when controlling for participants’ NARS data, F(1, 74) = 5.599, p = .021, 95% CI [5.33; 62.14], \( \eta^{2}_{p} \) = .07.

3.3 Anthropomorphic Inferences

On the humanness scale, the two robot conditions differed significantly from each other, multivariate F(4, 74) = 3.18, p = .018, \( \eta^{2}_{p} \) = .15. As expected, participants in the social robot condition attributed more human nature characteristics (e.g., interpersonal warmth) (univariate F(1, 77) = 5.04, p = .028, 95% CI [− 3.20; − .19], \( \eta^{2}_{p} \) = .06) and fewer mechanistic features (e.g., inertness) to the robot than participants in the non-social robot condition (univariate F(1, 77) = 6.84; p = .011; 95% CI [.71; 5.21], \( \eta^{2}_{p} \) = .082). The two groups did not differ regarding the two other constructs: human uniqueness, univariate F(1, 77) = 2.71; p = .104; 95% CI [− 3.70; .35], \( \eta^{2}_{p} \) = .034; animalistic dehumanization, univariate F(1, 77) = .48; p = .489; 95% CI [− 2.77; 1.34], \( \eta^{2}_{p} \) = .006.

3.4 Mediation Analyses

We tested whether the effects found on anthropomorphic inferences mediated the impact of robotic presence on standard Stroop performance using the PROCESS plugin in SPSS. Not surprisingly, this analysis (see Fig. 4a for the whole mediational pattern) showed that participants attributed fewer mechanistic traits, (a1) t(77) = − 2.96, p = .011, 95% CI [− 5.21, − .71], and more human nature traits, (a2) t(77) = 2.25, p = .028, 95% CI [.19, 3.20], to the social robot than to the non-social robot. Mechanistic dehumanization was not predictive of standard Stroop performance improvement, (b1) t(77) = .813, p = .419, 95% CI [− 1.56, 3.20]. More importantly, the direct effect of robotic presence (social robot vs. non-social robot) on standard Stroop performance improvement was no longer significant when controlling for mechanistic dehumanization and human nature attributions, indicating a complete mediation by anthropomorphic inferences. This effect of robotic presence was fully mediated by the attribution of human nature traits to the social robot, (c′) t(77) = − 1.48, p = .144, 95% CI [− 46.18, 6.85], (b2) t(77) = − 3.50, p < .001, 95% CI [− 10.87, − 2.98]; a mediation representing more than half, κ2 = 11.74, 95% CI [− 27.01, − 1.03], of the total effect size explained by the model, κ2 = 14.91, 95% CI [− 33.54, − .78].

Fig. 4

Mediation of the robotic presence effect on standard Stroop performance improvement (a) and response conflict improvement (b) by anthropomorphic inferences

The same mediation analysis conducted on response conflict specifically (with a1 and a2 identical to those of the previous mediation) revealed quite similar findings (see Fig. 4b). Controlling for the effect of mechanistic dehumanization and human nature attributions, the direct effect of robotic presence on response conflict improvement was no longer significant. This effect was mediated by the attribution of human nature traits to the social robot, (c′) t(77) = − .308, p = .759, 95% CI [− 30.30, 22.17], (b2) t(77) = − 4.45, p = .05, 95% CI [− 8.35, − .54]. The mediating role of mechanistic dehumanization also proved significant, (b1) t(77) = 2.68, p = .001, 95% CI [.89, 6.11]. The human nature and mechanistic mediations represented κ2 = 7.53, 95% CI [− 20.59, .71], and κ2 = 10.35, 95% CI [− 26.68, − 1.41], respectively, of the total effect size explained by the model, κ2 = 17.88, 95% CI [− 35.02, − 5.48].
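The logic of the a, b, c, and c′ paths reported above can be illustrated with ordinary least squares. The sketch below is deliberately simplified and is not the PROCESS model used in the study: it handles a single mediator (the reported analyses entered two mediators in parallel), computes point estimates only (no bootstrap confidence intervals or κ2), and all names and data are hypothetical.

```python
from statistics import mean
from typing import Dict, Sequence


def _cov(u: Sequence[float], v: Sequence[float]) -> float:
    """Sample covariance (n - 1 denominator); _cov(u, u) is the sample variance."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)


def simple_mediation(x: Sequence[float], m: Sequence[float], y: Sequence[float]) -> Dict[str, float]:
    """OLS path estimates for X -> M -> Y with one mediator.

    a: effect of X on M; b: effect of M on Y controlling for X;
    c: total effect of X on Y; c_prime: direct effect of X on Y controlling for M.
    """
    var_x, var_m = _cov(x, x), _cov(m, m)
    cov_xm, cov_xy, cov_my = _cov(x, m), _cov(x, y), _cov(m, y)
    denom = var_x * var_m - cov_xm ** 2  # zero only if M is a perfect linear function of X

    a = cov_xm / var_x
    c = cov_xy / var_x
    b = (cov_my * var_x - cov_xm * cov_xy) / denom
    c_prime = (cov_xy * var_m - cov_xm * cov_my) / denom
    return {"a": a, "b": b, "c": c, "c_prime": c_prime, "indirect": a * b}


# Made-up data: x codes condition (0 = non-social robot, 1 = social robot),
# m is a mediator score, y is an outcome score.
demo_x = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
demo_m = [1.0, 2.0, 1.0, 3.0, 4.0, 5.0, 3.0, 6.0]
demo_y = [10.0, 9.0, 11.0, 8.0, 6.0, 5.0, 7.0, 3.0]
demo_paths = simple_mediation(demo_x, demo_m, demo_y)
```

In this OLS setting the total effect decomposes exactly as c = c′ + a·b, so a direct effect c′ that shrinks toward zero while a·b stays substantial is the pattern described in the text as mediation.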

4 Discussion

There is evidence that attentional mechanisms such as attention focusing can be boosted in performance contexts involving the presence of other human agents, either as observers or coactors [13]. Performance on the Stroop task, requiring the deployment of inhibitory control to focus on letter-color cues at the expense of word meaning, is indeed typically better under these social circumstances, relative to isolation (e.g., [16, 26, 28]). Here, we used an extended version of the Stroop task to specify which component of standard Stroop performance is influenced in the presence of humanoid robots, assuming that this presence may influence response conflict specifically, as does human presence. Perhaps more importantly, we also used a design maximizing anthropomorphic inferences in only one of two robotic conditions—a robot presence condition preceded by a verbal interaction with the robot (social robot condition) versus the presence of the same robot without any prior verbal interaction (non-social robot condition). This strategy made it possible to determine whether the beneficial effects of social robotic presence in the Stroop task, if any, reflect the action of strictly mechanical distraction or more sophisticated, social-cognitive processes involving anthropomorphic inferences. The present findings increase our understanding of robotic presence effects in a number of important ways.

They show that anthropomorphic inferences are needed for the facilitation of Stroop performance to occur in the presence of a humanoid robot. Again, the passive presence of the non-social robot during the Stroop task did not influence performance (neither standard Stroop interference nor the different types of conflict), compared with when participants worked in isolation. This passive presence caused a reduction of Stroop interference and response conflict exclusively (as expected) when it was preceded by a verbal interaction with the robot being present, which also caused anthropomorphic inferences to occur. Taken together, these findings run counter to a purely mechanistic, non-social approach that reduces the effects of the presence of humanoid robots on attention to the action of physical or noise distraction.

Of particular interest here, whether social presence effects involving the presence of human agents can or cannot be reduced to mechanical distraction has long been debated (for a review, see [13]). As noted earlier in this paper, there is evidence that, when the focal task is attention demanding, noise or other mechanical (non-social) sources of distraction can induce a conflict between paying attention to the focal task versus the distractor. This conflict may threaten the organism with cognitive overload and, ultimately, cause a restriction in the range of cue utilization (e.g., [13]), a restriction which can be sufficient for a performance facilitation to occur in the Stroop task (by focusing more exclusively on the letter color cues than on incongruent words). According to this approach, however, the presence of both the social robot and the non-social robot, whose appearance and presence during task performance were strictly identical in both conditions, should have led to better Stroop performance, compared with isolation. Instead, Stroop performance improved exclusively in the social robot condition, in which anthropomorphic inferences about the robot being present were also more likely compared with the non-social robot condition (which did not differ from isolation on Stroop performance). Of course, we cannot exclude the possibility that anthropomorphic inferences about the social robot made its passive presence during Stroop performance more distracting, compared with the presence of its non-social counterpart. However, even this possibility means that the beneficial effects of the social robot cannot be reduced to the action of a mechanical (non-social) source of distraction. This more basic form of non-social distraction does not seem to operate at all in our research; otherwise, the presence of the non-social robot would also have led to better Stroop performance compared with isolation, possibly to a lesser extent than in the social robot condition. This is not what happened.

Further evidence that the effects caused by the presence of the social robot on Stroop performance are truly social can be found in the mediation analyses. These analyses examined whether participants’ anthropomorphic inferences about the robot mediated (vs. simply covaried with) the effects of robotic presence on standard Stroop performance and response conflict. In both cases, the direct effect of social robotic presence was no longer significant when controlling for anthropomorphic inferences, indicating their mediating role. This mediating role can reasonably be taken as evidence that the effects of social robotic presence on attention were indeed social in nature and therefore cannot be trivialized or reduced to the action of a nonsocial source of distraction.
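
The logic of such a mediation analysis can be illustrated with a minimal sketch in Python. It rests on several assumptions that are not taken from the present research: the data are synthetic and purely illustrative, the names `condition`, `anthropomorphism`, `interference`, and `mediation_paths` are hypothetical, and ordinary least squares is used as a stand-in for whatever estimation and inference procedure was actually applied. The sketch only decomposes a total effect into a direct path and an indirect path passing through the mediator.

```python
import random


def _cross(u, v):
    """Sum of cross-products of deviations from the means of u and v."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))


def mediation_paths(x, m, y):
    """Ordinary least squares estimates for a single-mediator model.

    x: predictor (e.g., 0 = non-social robot, 1 = social robot)
    m: mediator (e.g., an anthropomorphism score)
    y: outcome (e.g., Stroop interference)
    """
    sxx, smm = _cross(x, x), _cross(m, m)
    sxm, sxy, smy = _cross(x, m), _cross(x, y), _cross(m, y)
    det = sxx * smm - sxm ** 2
    if sxx == 0 or det == 0:
        raise ValueError("x must vary and must not be collinear with m")

    a = sxm / sxx                           # path a: slope of m regressed on x
    total = sxy / sxx                       # path c: slope of y regressed on x
    b = (smy * sxx - sxm * sxy) / det       # path b: slope of m in y ~ x + m
    direct = (sxy * smm - sxm * smy) / det  # path c': slope of x in y ~ x + m
    return {"a": a, "b": b, "total": total,
            "direct": direct, "indirect": a * b}


# Synthetic, purely illustrative data (NOT the data of the present study).
rng = random.Random(0)
condition = [0] * 30 + [1] * 30
anthropomorphism = [2.0 + 1.5 * x + rng.gauss(0, 0.5) for x in condition]
interference = [100.0 - 10.0 * m + rng.gauss(0, 5.0) for m in anthropomorphism]

paths = mediation_paths(condition, anthropomorphism, interference)
for name, value in paths.items():
    print(f"{name:>8}: {value: .3f}")
```

With ordinary least squares, the total effect equals the sum of the direct effect and the indirect effect (a × b). A pattern consistent with mediation is one in which the direct effect shrinks toward zero once the mediator is controlled for, while the indirect effect remains. The sketch does not test significance; the indirect effect is commonly tested with resampling methods such as the bootstrap.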

This conclusion is also strengthened by the fact that the presence of the social robot had the same impact on Stroop performance as human presence did in earlier research. Social robotic presence indeed reduced, rather than increased, standard Stroop interference and specifically improved the resolution of response conflict (with no effects on semantic and task conflicts). This performance pattern extends the relevance of the attentional view of social facilitation from humans to social robots. According to this view (described earlier in this paper), social facilitation phenomena, at least in humans and nonhuman primates, should not be restricted to the energization of dominant responses [10]. Instead, because social presence can also boost attentional focusing, even when this process requires the deployment of inhibitory control (as in the Stroop paradigm), this attentional view leads to a more complex picture. The picture becomes even more complex when considering that the deployment of top-down inhibitory control can also be impaired rather than facilitated in contexts where the presence of others represents a potential threat to be monitored [15, 20], with negative consequences for learning and other complex tasks relying heavily on executive control resources.

Spatola et al. [40] provided preliminary evidence that Stroop performance can also be facilitated, rather than impaired, in the presence of a "bad robot" responding with contempt and a lack of empathy and producing negative evaluations of human intelligence. However, whether this bad robot was really threatening or, on the contrary, challenging remains unclear (this robot was associated with feelings of discomfort, but this does not necessarily mean that participants felt threatened by its presence). Spatola et al.’s [40] research also left open two important questions. First, because of limitations in the type of Stroop stimuli that were used, the locus or type of Stroop conflict (task conflict, semantic conflict, response conflict) impacted by robotic presence remained unspecified. Second, because of its design, Spatola et al.’s [40] research could not specify the exact role of anthropomorphic inferences in the influence of social robotic presence on Stroop performance. By demonstrating that this presence can influence Stroop performance as human presence does (facilitating standard Stroop performance and, specifically, the resolution of response conflict) and that this influence is mediated by anthropomorphic inferences, the present findings advance on both points.

Finally, the present research has its own limitations. It indicates that social robotic presence can boost attentional focusing even when this process requires the deployment of inhibitory control, but this conclusion is limited to the Stroop task. Future research should clarify whether this finding can be replicated with a variety of tasks in which successful performance requires the deployment of executive resources. Likewise, special attention should be paid to the boundary conditions of the beneficial effects found in the present research. Given earlier findings on executive control in humans and nonhuman primates faced with the presence of potentially threatening others [15, 20], these beneficial effects seem unlikely in contexts where social robots are themselves perceived, rightly or wrongly, as threatening. Of course, robots designed to live with us are not designed to be threatening, but their impact on attentional mechanisms and behavior in general may strongly depend on what people come to believe (anthropomorphic inferences) about them. This is a critical issue for future research in social robotics. As in human–human interactions, a broad range of elements, internal or external to interpersonal relationships, can impact how people perceive and judge robots. Much work remains to be done in this area. For now, in line with the Computers Are Social Actors theory [41], our research supports the proposal that people may understand and relate to machines as they do to fellow creatures. Humans indeed tend to apply the same social scripts (specifying actions to produce in various social situations [42]) in human–robot interactions as in human–human interactions [43]. This tendency may be strengthened by the physical presence of the artificial agent and by its humanoid shape, as this shape provides more social cues to the observer [44, 45]. The more human-like a robot is, the less interaction should be needed to trigger anthropomorphism and thus social presence effects [43, 44]. The match between a robot's technologically advanced appearance and the level of its perceived capacities could also play an important role [46]. If the expectations about capacities induced by the robot's appearance are not fulfilled, the result may be disappointment and fewer anthropomorphic attributions [47]. In this context, the fact that the presence of social robots can impact processes as fundamental as attentional control adds further reasons to pay special attention to the psychological, sociological, and philosophical impact of human–robot interactions.