1 Introduction

Motivation is considered a critical “force” of energy that leads to task engagement or sustained involvement. Psychologist Richard M. Ryan began his paper [20] by saying: “To be motivated means to be moved to do something. A person who feels no impetus or inspiration to act is thus characterized as unmotivated, whereas someone who is energized or activated toward an end is considered motivated.” The work of [10] demonstrates convincingly that robots can be better motivators than computer-based or paper-based methods. The authors conducted a long-term HRI experiment focusing on the role of social robots in motivating people to maintain their weight. Three kinds of weight-loss coaching were examined: a robot coach, a computer-based application, and a paper log (based on a methodology used in the “Nutrition and Weight Management Center” at Boston Medical Center). Their results reveal that people interacted longer with the robot than with the other modalities and experienced a closer relationship with the robot than with the other coaching styles. In other words, people were more motivated to interact with the robot than with a computer-based application or a paper log. In their case, a simple robot appearance and a vocal communication system helped to motivate people.

When focusing only on human–robot interaction (HRI), recent studies suggest that users’ motivation can come from their appreciation of the robot’s social cues. For example, in the work of [23], humans were exposed to several attention styles of the robot: exploration, interaction, interaction avoidance, and full interaction (i.e., a combination of the three previous styles). Users were found to react more when they received more feedback from the robot about their actions.

We posit that robots can motivate people and help improve their task performance by employing some social and intelligent behaviors. In this research, we focus on the role of personality in improving the user experience during HRI. We briefly present some related work on this subject in the following sections.

1.1 Robotic Motivator with Personality Matching Strategy

Lee and collaborators [14] suggest that a robot can motivate people during interaction by using its personality. The authors modeled two kinds of behavior for the AIBO robot: extroverted and introverted (generally modulated through the vocal sound and the speed of the actions performed by AIBO during the interaction with the human partner). The participants were asked to play with the AIBO robot and evaluate its personality. The results showed that participants were more joyful when interacting with the robot whose personality was complementary to theirs. Their work also suggested that by changing the robot’s behavior (in this case, its personality), we could change the motivation level of people interacting with the robot.

Another work focusing on socially assistive robotics, and more precisely on post-stroke rehabilitation therapy [25], examined the effects of a robot’s customized behavior on people’s motivation and task performance. The relationship between the extroversion–introversion personality spectrum and the style of encouragement in a rehabilitation task was explored, and the role of adapting the robot’s behavior to the user’s profile was addressed. The three-factor Psychoticism–Extroversion–Neuroticism (PEN) Eysenck Personality model [5] was employed, with a particular focus on the extroversion dimension. The study showed that users preferred working and interacting with a robot whose personality was similar to theirs during the therapy: extroverted users preferred the robot that challenged them during the exercises, while introverted users favored the robot that praised them.

The work in [1] explored the benefits of combining verbal and non-verbal behaviors to generate appropriate robot personalities during the interaction with a human peer. The system first estimates the interacting human’s personality traits through a psycholinguistic analysis of the spoken language; it then uses the PERSONAGE natural language generator to produce verbal language corresponding to the estimated personality traits. Gestures are generated with the BEAT toolkit, which performs a linguistic and contextual analysis of the generated language relying on rules derived from extensive research into human conversational behavior. The results showed that individuals preferred to interact with the robot that had the same personality as theirs. Participants also expressed a preference for the mixed speech–gesture behavior of the robot, saying that the robot’s speech was more engaging and more effective when accompanied by appropriate gestures than when no gestures were present.

Furthermore, similar results were also found in Human–Computer interaction (HCI). In [18], the authors presented an experiment testing the influence of personality on human task performance. In their experiment, participants were taught to use the HyperCard application [13]. Using a 2 \({\times }\) 2 \({\times }\) 2 factorial design [personality of the interface (extroverted/introverted), subjects’ personality (extroverted/introverted), and task strength (low/high)], they found that introverted participants performed better with the introverted interface than with the extroverted interface. However, this effect was not observed for extroverted participants. In terms of task strength, extroverted participants completed tasks significantly faster than introverted participants at low task strength, but no significant difference was found at high task strength. This work also leads us to believe that, in human–machine interaction, the personality of the machine can influence task performance to some extent.

1.2 Robotic Motivator with Ability to Adapt to Human’s On-going State

By enabling a robot to understand the user’s current internal state, we allow the robot to provide appropriate behaviors in specific interaction scenarios. Recent research in the field of HRI also aims to address the robot’s capability to understand the internal states of the human interaction partner (e.g., [16, 17, 24, 27]). An aware robot can be used, for example, in clinical setups where monitoring a patient’s physiological condition is a crucial factor in the usefulness of a robot as an interaction partner [12]. On the other hand, being able to make assumptions about the user’s psychological condition enables the robot to make efforts to change the user’s mental state in a positive and context-dependent manner [8].

In [3], a robot-based basketball game was designed to alter the game difficulty level according to the player’s anxiety level. The game consisted of a robot arm moving a basketball hoop at variable speeds. The player’s task was to shoot a number of baskets into the hoop within a given period of time. By monitoring the player’s anxiety, the robot arm changed the speed of the hoop in such a way that it could enhance and/or maximize the performance of the player. The player’s anxiety level was deduced from his/her physiological signals. In this experimental setup, the robot’s behavior had a direct influence on the game, and this could be seen as a cooperation task where the robot and its human partner had to work together so as to maximize the performance.

Some studies in social and cognitive sciences show a strong link between physiological signals (arousal), personality (extroversion–introversion dimension), and task performance. The Yerkes–Dodson law [26] states that task performance increases with the level of arousal only up to a certain point; arousal levels that are too high or too low both decrease task performance, as depicted in Fig. 1. Moreover, Eysenck [5] found that the extroversion–introversion personality dimension is a matter of the balance between inhibition and excitation in the brain, and that cortical arousal is strongly linked to the extroversion–introversion level: introverted individuals have a higher arousal level than extroverted individuals. We believe that this personality-dependent arousal behavior can be of great help for social robots if used wisely.

Fig. 1 Hebbian version of the Yerkes–Dodson curve

1.3 Goal of Stress Game Experiment

We designed a “Stress Game Experiment” to study how to increase the task performance of users by tuning the robot’s behavior so as to influence the individual’s level of arousal. In our experimental design, we would like to verify (1) the effectiveness of a personality-based coaching style in a non-rehabilitation context, (2) the correlation between the stress elicitor and the human’s heart rate variability, and (3) the difference in heart rate variability between introverts and extroverts.

The rest of the paper is structured as follows: Sect. 2 describes the system test-bed (the robot, the game, and the ECG sensors); Sect. 3 presents the system architecture; Sects. 4 and 5 describe the experimental design and the hypotheses; Sects. 6 and 7 discuss the results obtained; and finally, conclusions and future work are given in Sect. 8.

2 System Testbed

The goal of our “Stress Game” is to evaluate the robot’s capability to increase the human’s task performance and interest in the game. The game is intended to elicit stress and thus frustration in the player. The robot continuously monitors the on-going performance of the player (and therefore his/her state of frustration) and acts accordingly so as to help lower the player’s frustration level and enhance his/her task performance.

2.1 Robot Test-bed

The experimental test-bed used in this study is the humanoid Nao robot developed by Aldebaran Robotics (Footnote 1). Nao is a robot with \(25\) degrees of freedom, equipped with an inertial sensor, two cameras, eight full-color RGB LEDs in its eyes, and many other sensors, including a sonar, which allow it to perceive its environment with stability and precision. Nao is also equipped with a voice synthesizer and \(2\) speakers enabling vocal communication.

2.2 Stress Game

The “Operation” board game tests players’ hand–eye coordination and fine motor skills. It consists of an “operating table” on which a comic likeness of a patient (nicknamed “Cavity Sam”) is drawn. On the surface, there are 13 openings filled with various funny little objects (see Fig. 2). The player has to extract the objects from the openings with a pair of tweezers without touching the edges. The game was instrumented so as to be connected to the computer by using a Phidget interface kit (Footnote 2). Every time the edges of an opening are touched, a signal is sent by the Phidget interface to the computer. Moreover, the player has to push a button to validate that an object has been successfully removed from the board. The button is also connected to the Phidget interface.

2.3 Shimmer ECG Sensor

It is well known from the psychology literature that internal motivational processes (thoughts and feelings) activate, intensify, or energize observable behaviors [11]. The author in [15] posits that the motivational state (i.e., emotional state), which can be represented by two parameters, valence (positive or negative) and arousal (high or low), is highly correlated with physiological signals such as skin conductivity, heart rate, and blood pressure. In our experiment, we used heart rate as an indicator of the stress level of the player.

To retrieve the player’s heart rate data, we employed the Shimmer ECG sensor (Footnote 3; see Fig. 3) to acquire the player’s signals in real time (sample rate of 50 Hz) and transfer the data via Bluetooth to the computer for further processing. The ECG sensor is attached to the player’s body. The Shimmer ECG can be connected to three electrodes [left arm (LA), right arm (RA), and left leg (LL)], which allow two-channel acquisition: RaLL (i.e., the pair RA+LL) and LaLL (i.e., the pair LA+LL). From each pair, the heart rate can be easily computed (see Sect. 3.1, Fig. 4 for an example of heart rate calculation).

Fig. 3 Shimmer ECG sensor and related materials

Figure 4a presents the raw ECG data. We then calculate its first derivative and filter it with a threshold to retain only the heart beats (see Fig. 4b, c). Heart rate is the number of heart beats per minute, and thus depends on the interval between two consecutive heart beats. When this interval becomes shorter, the heart rate accelerates. An example is shown in Fig. 4: at first, the heart rate is in the normal condition; then the intervals between heart beats become shorter, giving a faster heart rate. If the first period corresponds to the heart rate in the normal condition, the second period can indicate a stress period. The third period can be seen as the one in which the heart rate returns to normal after the stress period, for example once the stressful situation has passed.
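The heart rate calculation described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the "first derivative" is a simple sample-to-sample difference, uses the 50 Hz sample rate and the threshold of 230 quoted in the text and in the Fig. 4 caption, and adds a hypothetical minimum gap between beats (`refractory_s`) so that one beat is not counted twice. All function and parameter names are illustrative.

```python
from typing import List


def first_difference(signal: List[float]) -> List[float]:
    """Discrete first derivative: difference between consecutive samples."""
    return [b - a for a, b in zip(signal, signal[1:])]


def detect_beats(signal: List[float],
                 threshold: float = 230.0,      # threshold quoted for Fig. 4
                 sample_rate_hz: float = 50.0,  # sample rate stated in the text
                 refractory_s: float = 0.3      # assumption: minimum gap between beats
                 ) -> List[int]:
    """Return the sample indices where the derivative exceeds the threshold."""
    diff = first_difference(signal)
    min_gap = int(refractory_s * sample_rate_hz)
    beats: List[int] = []
    for i, d in enumerate(diff):
        # Keep only values above the threshold; everything else is ignored.
        if d > threshold and (not beats or i - beats[-1] >= min_gap):
            beats.append(i)
    return beats


def heart_rates_bpm(beats: List[int], sample_rate_hz: float = 50.0) -> List[float]:
    """Convert each interval between consecutive beats into beats per minute."""
    return [60.0 * sample_rate_hz / (b - a) for a, b in zip(beats, beats[1:])]


if __name__ == "__main__":
    # Synthetic signal: one sharp spike per second (every 50 samples).
    ecg = [0.0] * 200
    for i in range(10, 200, 50):
        ecg[i] = 500.0
    print(heart_rates_bpm(detect_beats(ecg)))
```

A shorter interval between detected beats yields a larger value, which is how an accelerating heart rate would show up in the output.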

3 Implementation

Our Stress Game experiment (see Fig. 5) has two main aims: (1) collecting the player’s heart rate and performance during the game for future development, and (2) testing the robot’s coaching style adapted to the player’s personality.

Fig. 4 Example of heart rate calculation from ECG data sampled at 50 Hz, using a threshold of \(230\). a Extract of the RaLL signal captured by the Shimmer ECG sensor. b First derivative of RaLL. c First derivative of RaLL filtered by the threshold \(230\) (i.e., only values higher than \(230\) are kept; all others are set to \(0\)). d Heart rate calculated from the filtered RaLL signal

Fig. 5 Software architecture of the robotic system

3.1 Collection of the Frustration State of the Players from Their Heart Rate Variability

As previously mentioned, our current work is based on two interesting findings from the psychology literature that correlate physiological signals (arousal), personality (extroversion–introversion dimension), and task performance together [5, 26].

The authors in [2] demonstrated that it is possible to predict the user’s stress state from the electrodermal activity (EDA) signal. The EDA signal was recorded while the participant was engaged in a social interaction activity through a phone (e.g., reading emails or SMS, responding to phone calls). Their system successfully recognized stress in 78.03 % of cases. The work presented in [9] is another example showing that a user’s stressful periods can be predicted from a physiological signal. They developed a stress recognizer that takes skin conductance as input. The system was tested on nine call center employees who processed 1,500 calls. The results showed that their system achieved relatively high accuracy in detecting the stress experienced by the participants during phone calls. Moreover, the authors in [21] showed that a frustration state can be elicited in humans when they are faced with specific events during a computer-based cognitive exercise.

In our Stress Game, the player’s frustration state is expected to be elicited when he/she makes a mistake while playing, or when he/she does not perform well enough to finish the game on time. The player’s frustration state can be detected through his/her physiological signals (such as heart rate, skin conductivity, and blood volume pulsation), as discussed in [15, 21]. Therefore, in this experiment, participants’ heart rate and game events (successes and failures) are carefully stored for later analysis.

3.2 Classification of Participants’ Personality

Based on the results of related work on personality matching in HRI [14, 25], we chose the human subject’s Extroversion dimension as the criterion for matching the robot’s personality to each participant. In the Big5 personality inventory [7], one obtains an Extroversion score between 0 (extremely introverted) and 100 (extremely extroverted). In our design, we classify those whose Extroversion score is low (lower than or equal to 33) as introverted, those whose Extroversion score is high (greater than or equal to 66) as extroverted, and those whose Extroversion score is between 33 and 66 as average introverted. This classification ensures that introverted and extroverted people are well distinguished.
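The three-way classification above amounts to two threshold comparisons. A minimal sketch, with a hypothetical function name and the thresholds taken from the text:

```python
def classify_extroversion(score: float) -> str:
    """Map a Big5 Extroversion score (0 to 100) to the three groups in the text."""
    if not 0 <= score <= 100:
        raise ValueError("Extroversion score must lie between 0 and 100")
    if score <= 33:
        return "introverted"
    if score >= 66:
        return "extroverted"
    # Scores strictly between 33 and 66.
    return "average introverted"


if __name__ == "__main__":
    for s in (20, 33, 50, 66, 90):
        print(s, classify_extroversion(s))
```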

3.3 Robot’s Coaching Styles

During the game with the robot coach, the personality and the on-going performance of the player are used to determine the appropriate verbal reaction of the robot. Even though the robot has many abilities in terms of actions (for example, verbal language, hand gestures, and body gestures), we chose to model only the verbal language. This is because, while playing the Stress Game, the player has to look at the game board all the time, which makes verbal coaching much more important and effective than any other behavioral reaction. However, the verbal utterances were also accompanied by hand gestures. The hand gestures were fixed in advance to go with the various utterances, and they were not adapted to the users’ personality.

Since the robot plays the role of a coach in our experimental scenario, we chose to follow the suggestion of [25] to match the personality of the robot with that of the human subject, as described in Table 1. According to their findings, extroverted subjects preferred a challenging coaching style, while introverted subjects preferred an empathetic coaching style. We thus matched extroverted participants with the Challenging robot, average introverted participants with the Encouraging robot, and introverted participants with the Empathetic robot.
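The matching just described is a fixed lookup from personality group to coaching style. A minimal sketch, assuming the three group labels used in this paper; the dictionary and function names are hypothetical:

```python
# Personality group -> robot coaching style, as stated in the text.
COACHING_STYLE = {
    "extroverted": "Challenging",
    "average introverted": "Encouraging",
    "introverted": "Empathetic",
}


def coaching_style_for(personality: str) -> str:
    """Return the robot coaching style matched to a personality group."""
    try:
        return COACHING_STYLE[personality]
    except KeyError:
        raise ValueError(f"unknown personality group: {personality!r}") from None


if __name__ == "__main__":
    for group in COACHING_STYLE:
        print(group, "->", coaching_style_for(group))
```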

Table 1 Robot coaching style in terms of player’s personality

Some examples of the verbal content used by the robot during the game are given in Table 2. They were designed to reflect, as closely as we could, the character of each coaching style, based on psycho-linguistic studies that show the existence of personality markers in language [4, 6, 22].

Table 2 Examples of robot’s verbal content in terms of its behavioral strategy

4 Stress Game: Experimental Design

4.1 Stress Game Description

A sketch of the experimental design is depicted in Fig. 6. The board game and the robot are placed on a table. The player sits in front of the table and is asked to play four rounds of the game. During each round, the player has to pick up as many objects as possible. An annoying sound is played when the participant touches the edge of an opening. Each touch is counted as a mistake. One round lasts 1 min and represents a different game condition.

The game condition is defined by two factors: difficulty level (normal vs. stressful), and robot’s coaching (with or without), as explained below.

The game’s difficulty level is altered by randomly adding, or not adding, false alarm sounds while the player is playing the game. The two levels are defined as follows:

  • Normal There are no false alarms.

  • Stressful There are false alarms (i.e., annoying sounds are played when no mistake occurs).

The robot’s coaching is either enabled or disabled. When it is enabled, the robot makes verbal comments and gives encouragement about the player’s on-going performance. When the robot’s coaching is disabled, the robot neither speaks nor acts. The robot’s verbal utterances are personality-dependent (see Sect. 3.3 for a detailed description).

Fig. 6 Experiment setup

Player’s performance is assessed in terms of speed (i.e., number of objects removed per minute) and error rate (i.e., number of errors made per object). These two performance indicators are calculated as follows:

$$\text{Speed} = \text{number of objects removed per minute} \tag{1}$$

$$\text{Error rate} = \frac{\text{number of errors}}{\text{number of objects removed}} \tag{2}$$
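As an illustration of Eqs. (1) and (2), the two indicators can be computed from the per-round counts as sketched below. The function names are hypothetical, and returning `None` when no object was removed is an assumption of this sketch, since the text does not say how that case is handled.

```python
from typing import Optional


def speed(objects_removed: int, duration_min: float = 1.0) -> float:
    """Eq. (1): number of objects removed per minute (one round lasts 1 min)."""
    return objects_removed / duration_min


def error_rate(errors: int, objects_removed: int) -> Optional[float]:
    """Eq. (2): number of errors divided by number of objects removed."""
    if objects_removed == 0:
        # Assumption: the ratio is undefined when no object was removed.
        return None
    return errors / objects_removed


if __name__ == "__main__":
    # Example round: 6 objects removed with 3 edge touches in one minute.
    print(speed(6), error_rate(3, 6))
```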

4.2 Experimental Protocol

The procedure adopted in the experiments contains several phases:

  1. Introduction: We explain to the participant the principle of the game and the steps he/she has to follow. We make sure that he/she understands how to play the game and what to expect when the robot is coaching him/her.

  2. Player’s personality identification: Before starting the game, we ask the participant to fill in a Big5 questionnaire in order to determine his/her personality. This information is then used to choose the robot’s coaching style, as described in Table 1.

  3. Recording the heart rate baseline: The participant is invited to relax for 5 min so that we can record his/her heart rate at rest. The collected data is used as a baseline.

  4. Game playing in four rounds: After the previous phases, the participant plays the game \(4\) times; each time corresponds to one condition of the game. The conditions are: (1) no robot intervention, normal alarm system; (2) robot intervention, normal alarm system; (3) no robot intervention, stressful alarm system (i.e., with false alarms); and (4) robot intervention, stressful alarm system. The duration of each game is 1 min. The conditions are presented in a random order. The robot’s action strategy is determined as a function of the player’s personality. After each condition, the player is asked to fill in a post-condition questionnaire about his/her state during the game. Additional questions about the robot’s behavior strategy are asked when the robot is involved in the game (i.e., in the robot conditions).

  5. Rating of the robot’s different coaching styles: After finishing the game, the participant is asked to fill in a web-based questionnaire in which he/she watches three videos, each showing the Nao robot coaching a player with one of three different personalities (i.e., empathetic, encouraging, and challenging). The participant is asked to rate Nao’s behavior as challenging, encouraging, and empathetic on a scale from one (not at all) to seven (very much). He/she is then asked to choose the robot personality that he/she would prefer if he/she had to play the game again with the robot’s coaching enabled.
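The four game conditions of phase 4 form a 2 × 2 design (robot coaching on/off × normal/stressful alarm) presented in random order. A minimal sketch of how such an order could be drawn for one participant follows; the names are hypothetical, and a uniform shuffle is an assumption, since the text only says the order is random.

```python
import itertools
import random
from typing import List, Optional, Tuple

COACHING = (False, True)              # robot coaching disabled / enabled
DIFFICULTY = ("normal", "stressful")  # without / with false alarms


def condition_order(rng: Optional[random.Random] = None) -> List[Tuple[bool, str]]:
    """Return the four (coaching, difficulty) conditions in a random order."""
    rng = rng or random.Random()
    conditions = list(itertools.product(COACHING, DIFFICULTY))
    rng.shuffle(conditions)  # assumption: every ordering is equally likely
    return conditions


if __name__ == "__main__":
    for coaching, difficulty in condition_order():
        label = "robot coaching" if coaching else "no robot coaching"
        print(f"{label}, {difficulty} alarm system")
```

Passing a seeded `random.Random` makes the order reproducible, which can help when the session log has to be matched with the recorded data afterwards.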

5 Hypotheses

Our hypotheses are as follows:

  • Hypothesis 1 The players will perform better when being coached by the robot than when playing the game without the robot’s coaching.

  • Hypothesis 2 The players prefer playing the game with the robot’s coaching to playing without it.

  • Hypothesis 3 There will be a correlation between the players’ personality and their preference regarding the robot’s personality.

  • Hypothesis 4 The player’s heart rate (a) accelerates when he/she makes an error and (b) decelerates when he/she successfully removes an object.

  • Hypothesis 5 Introverted players have greater heart rate variability than extroverted players.

6 Experimental Results

Our system was tested with \(17\) individuals (\(16\) male and \(1\) female; \(9\) introverted, \(2\) average introverted, and \(6\) extroverted). The participants were between \(23\) and \(36\) years old, and all had a background in technical sciences. During the experiments, we recorded the evolution of the heart rate during the game so as to study the correlation between stress and heart rate.

6.1 The Effectiveness of Robot’s Coaching on Participants’ Performance

Participants’ performance is represented in terms of speed and error rate. Figures 7 and 8 illustrate the speed and the error rate for each condition and for each participant, respectively.

Fig. 7 Player’s speed

Fig. 8 Player’s error rate

It is easy to see that in the stressful condition (i.e., game with false alarms), the players’ speed is higher than in the normal condition (i.e., no false alarms). Furthermore, in terms of error rate, players made fewer errors per object in the stressful condition, and even fewer when the robot was talking and acting alongside them. Average values of speed and error rate are presented in Table 3.

Table 3 Average speed (number of objects per minute) and average error rate (number of errors per object) of players during the four game conditions together with their respective standard deviation in parentheses

We also analyzed players’ performance in the different game conditions of the experiment by considering each participant’s speed and error rate across his/her \(4\) game conditions (see Table 3). The average speed in the stressful conditions is higher than in the normal conditions, which means that the participants tended to play faster in the stressful conditions. Moreover, in terms of error rate, players tended to make fewer errors when the game had additional elements (such as the stressful alarm or the robot’s intervention). For example, the average number of errors per object removed in the no-robot, normal-alarm condition is higher than in the other three conditions. It seems that the participants paid more attention (and thus made fewer errors) when the game became more stressful and/or when they received instantaneous feedback (from the robot) about their errors.

Validation of Hypothesis 1 In order to validate our Hypothesis \(1\), we analyzed the users’ performance in the robot-coaching and no-robot-coaching conditions. We can see from Table 3 that participants performed better with the robot’s coaching during the game period (i.e., in our case 1 min). Furthermore, in the normal alarm condition, the participants’ accuracy improved greatly when the robot was alongside them. However, ANOVA finds no significance in this data. More specifically, in terms of speed, the two-factor analysis of variance (i.e., two-way ANOVA) showed no significant main effect for the difficulty level factor, \(F (1,64) = 2.03, p = 0.1594\), \(\eta _{p}^{2} = 0.037\); no significant main effect for the robot’s coaching factor, \(F (1,64) = 0.14, p = 0.7133, \eta _{p}^{2}= 0.0021\); and no significant interaction between difficulty level and robot’s coaching, \(F (1,64) = 0.14, p = 0.7133, \eta _{p}^{2} = 0.0021\). In terms of error rate, the two-factor analysis of variance showed no significant main effect for the difficulty level factor, \(F (1,64) = 0.24, p = 0.6225, \eta _{p}^{2}= 0.0038\); no significant main effect for the robot’s coaching factor, \(F (1,64) = 0.58, p = 0.4506, \eta _{p}^{2} = 0.0089\); and no significant interaction between difficulty level and robot’s coaching, \(F (1,64) = 0.5, p = 0.48, \eta _{p}^{2} = 0.0078\).
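For readers who want to relate the effect sizes to the test statistics, partial eta squared can be recovered from an F value and its degrees of freedom through the standard identity \(\eta _{p}^{2} = F \cdot df_{1} / (F \cdot df_{1} + df_{2})\). The sketch below is illustrative only (the function name is hypothetical); because the F values in the text are rounded, results computed this way can differ slightly from the reported ones.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared from an F statistic and its degrees of freedom."""
    if f_value < 0 or df_effect <= 0 or df_error <= 0:
        raise ValueError("F must be non-negative and degrees of freedom positive")
    return (f_value * df_effect) / (f_value * df_effect + df_error)


if __name__ == "__main__":
    # Robot's coaching factor on speed, as reported in the text: F(1,64) = 0.14.
    print(round(partial_eta_squared(0.14, 1, 64), 4))
```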

A finer view can be obtained by analyzing participants’ performance in terms of their personality across the different game conditions. In the following paragraphs, a three-way repeated-measures ANOVA (personality \({\times }\) coaching style \({\times }\) alarm condition) is conducted to investigate possible interactions among these factors.

Table 4 shows the average speed and error rate of participants according to their personality traits. We noticed that the extroverted participants played faster, and made more errors, than the introverted participants. To investigate the significance of these differences, we applied a repeated-measures ANOVA to the participants’ data (number of objects removed and number of errors per object removed). Findings are presented in the next paragraphs.

Table 4 Average speed (number of objects per minute) and average error rate (number of errors per object) in terms of personality in each condition together with their respective standard deviation in parentheses

Regarding players’ speed, Table 4 suggests that extroverted participants had higher speed than introverted participants. In fact, through repeated-measures ANOVA, we found that the extroverted individuals were significantly faster than introverted individuals only in the no robot coaching, stressful alarm condition (\(F (1,13) = 7.83, p = 0.015\)). Even though extroverted participants also had higher speeds than introverted individuals in the other three conditions, repeated-measures ANOVA showed no significance there.

In terms of players’ error rate, the data from Table 4 show that extroverted participants made more errors per object than introverted participants. When comparing accuracy between introverted and extroverted participants, repeated-measures ANOVA gave a significant difference in the two robot conditions: the with robot, normal alarm condition (\(F (1,13) = 13.3, p = 0.0029\)) and the with robot, stressful alarm condition (\(F (1,13) = 7.2, p = 0.019\)). This result may be explained in several ways. Firstly, it may suggest that the robot coach has a greater effect on extroverted participants than on introverted participants during the game, which is understandable as extroverted people tend to respond to external stimuli more often than introverted people. Secondly, it may suggest that the robot’s coaching style plays some role in participants’ performance: for instance, introverted participants were praised by the robot while extroverted participants were challenged by it, which may have made extroverted participants more stressed and thus led them to make more errors.

Validation of Hypothesis 2 After four rounds of the game, participants were asked whether they preferred to play the game with or without the robot’s coaching. Most of the participants (\(11\) out of \(17\), i.e., 65 %) preferred having the robot coach them while playing (Fig. 9). To verify whether participants’ choice differed significantly from chance, we conducted Pearson’s Chi-square goodness-of-fit test between evenly distributed choices and our experimental data. The test returns a critical value of 13.2361 for \(p<0.001\), which confirms that participants’ choice in our experiment differs from chance. This validates our Hypothesis \(2\), which states that the robot’s coaching condition is preferred to the no-robot coaching condition in the game.

Fig. 9 Participants’ responses about whether to continue the game with or without the robot coach

6.2 Participants’ Perception About Robot’s Coaching Style

When asked about the robot’s behavior, all participants reported that it was somewhat appropriate to their preferences (average rating of \(4.4\) on a \(7\)-point Likert scale) (Table 5). They also found the interaction with the robot engaging, with an average rating of \(4.4\). Moreover, compared to the extroverted players, the introverted players tended to rate the robot’s behavior higher in terms of appropriateness to their preference and of interaction engagement. However, ANOVA shows no significance in their ratings.

Table 5 Mean (and SD) of appropriateness of robot’s behavior towards participants’ preference, rated by participants

As shown in Table 6, all players found that the robot was social, extroverted, and helpful, but introverted players gave higher ratings than extroverted players. Introverted players also considered the robot less stressful than extroverted players did. Concerning the helpfulness of the robot, most players appreciated the robot in the introduction phase, but some of them stated that, during the game, the robot’s speech and gestures stressed them more and distracted them a bit from the game.

Table 6 Mean (and SD) of robot character rated by participants

All players agreed that the robot had a human-like social behavior. Robot’s speech got higher ratings than its gestures (see Table 7).

Table 7 Mean (and SD) of of robot’s expressiveness in terms of speech and gesture, rated by participants

Furthermore, as described in the Step 5 of the experiment, participants were also asked to rate their perception about the robot’s behavior strategies (i.e., empathetic, encouraging, and challenging) based on participant’s personality (see Table 1) via a a web-based questionnaire. Footnote 4 We also asked the participants to evaluate the extroversion–introversion personality trait of Nao robot along with the three robot’s behavior strategies. We had \(13\) previously involved participants’ answers. Average ratings of the online questionnaire are presented in Table 8. The Challenging robot behavior was rated as challenging and encouraging, the Encouraging robot behavior was rated as encouraging, and the Empathetic robot behavior was rated also as encouraging. ANOVA analysis confirms the above statements. The Challenging robot behavior has been found significantly more challenging and encouraging than empathetic (\(F\) (2,36) \(=\) 10.75, \(p = 0.0002\)). ANOVA analysis found no significance in participants’ ratings regarding the Encouraging robot behavior. Furthermore, the Empathetic robot behavior was considered as significantly more encouraging than challenging or empathetic (\(F\) (2,36) \(=\) 13.78, \(p\) \(\le \) 0.0001).

Validation of Hypothesis 3 From the results of the online questionnaire, we were also able to evaluate people’s preference about the robot’s personality in the context of the stress game experiment. Among the \(13\) people who answered the online questionnaire, \(9\) chose the Encouraging robot, \(3\) chose the Empathetic robot, and only \(1\) chose the Challenging robot, as shown in Fig. 10. No correlation was found between participants’ personality and their preference about the robot’s personality (Cohen’s kappa coefficient = 0.1269; see Table 9). This finding leads us to reject Hypothesis \(3\) about the correlation between a participant’s personality and his/her preference about the robot’s coaching style.

Table 8 Mean (and SD) of robot’s personality from the online post-experiment questionnaire, rated by participants

A possible explanation is that people prefer the robot to encourage and motivate them during stressful tasks, and do not want the robot to challenge or criticize them when they make errors in such a situation. This tendency to prefer supportive robots over Challenging robots may originate from the human need for affiliation in stressful situations [19]. This suggests that, in order to optimize the user’s experience during HRI in stressful situations, robots should be designed to act in a supportive, agreeable, and empathetic manner.

6.3 Heart Rate Analysis

During the experiments, we also collected the participants’ heart rate data. As described in Sect. 3.1, each participant’s heart rate was acquired from the Shimmer ECG sensor during the game. The baseline acquisition phase allowed us to identify the heart rate baseline of each participant, and to better determine the stress period of each participant during the game.

Our hypothesis is that a participant’s stress level is correlated with the various events during the game, such as successfully removing an object or making an error (i.e., touching the border of an opening). To verify this hypothesis, we constructed an algorithm that identifies the stress periods of the participants from their recorded heart rate, and then analyzed the correlation on the resulting output.

The stress period identification algorithm consists of three main steps:

Step 1: Heart Rate Calculation. From the raw ECG data recorded by the Shimmer Connect application, we extracted the peak signal by calculating the first derivative of the raw data and filtering it with an adapted threshold (i.e., the threshold was manually fixed for each participant). Please refer to Sect. 3.1 for a detailed explanation.

Fig. 10 Participant’s preference about robot’s coaching strategy in stress game

Table 9 Number of choices of the robot’s coaching strategy in the online post-experiment questionnaire, by participants’ personality

Step 2: Heart Rate Event Detection. Our hypothesis about the correlation between game events and the player’s heart rate is that the heart rate accelerates when the player makes an error and decelerates otherwise (e.g., when he/she successfully removes an object). The two heart rate events of interest in our case are the moment when the heart rate reaches a local minimum and the moment when it reaches a local maximum. This heart rate event detection takes into account the baseline heart rate (acquired in the baseline acquisition phase of our experiment). The player is considered to experience stress if his/her heart rate exceeds his/her baseline heart rate, and not to experience stress if it remains under the baseline. Thus, in our algorithm, when the baseline heart rate is taken into account, a local minimum is reported only when it is below the baseline heart rate, and a local maximum is reported only when it is above the baseline heart rate.

Step 3: Correlation Detection. A co-occurrence between a game event and a targeted heart rate event happens when the two events occur in the same period of time. As we suppose that the game event triggers spontaneous stress in the player, the co-occurrence should appear within a matter of seconds. In our algorithm, a window of \(2\) s on either side of the game event is used to detect a co-occurrence. For example, when the player makes an error and hears the alarm sound, his/her heart rate should reach a local maximum if he/she is stressed by the sound. Suppose that he/she made an error at the \(5\)th second of the game; if our algorithm detects any local maximum in his/her heart rate between the \(3\)rd and the \(7\)th second of the game, a co-occurrence is reported. Considering the \(2\) s after the game event is straightforward, as we suppose that the game event is what triggers the spontaneous stress. The \(2\) s before the game event cover the case where the player anticipates his/her error, for instance when the player sees that his/her tweezers are about to touch the border.
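The three steps above can be illustrated with a minimal Python sketch. It is not the implementation used in the experiment: the function names (`r_peak_times`, `heart_rate`, `heart_rate_events`, `co_occurrence_rate`), the rising threshold crossing used to mark one beat per peak, and the toy values are assumptions made for illustration. Only the baseline rule of Step 2 and the \(2\) s window of Step 3 come from the description above.

```python
from typing import List, Optional, Sequence, Tuple


def r_peak_times(ecg: Sequence[float], fs: float, threshold: float) -> List[float]:
    """Step 1 (sketch): mark a beat where the first difference of the ECG
    rises through a manually chosen threshold. Returns beat times in s."""
    diff = [b - a for a, b in zip(ecg, ecg[1:])]
    times = []
    for i in range(1, len(diff)):
        if diff[i - 1] < threshold <= diff[i]:
            times.append((i + 1) / fs)  # diff[i] ends at sample i + 1
    return times


def heart_rate(beat_times: Sequence[float]) -> List[Tuple[float, float]]:
    """Instantaneous heart rate (time in s, bpm) from successive beat times."""
    return [(t1, 60.0 / (t1 - t0)) for t0, t1 in zip(beat_times, beat_times[1:])]


def heart_rate_events(
    hr: Sequence[Tuple[float, float]], baseline: Optional[float] = None
) -> Tuple[List[float], List[float]]:
    """Step 2: times of local maxima and local minima of the heart rate.
    With a baseline, keep only maxima above it and minima below it.
    Plateaus (equal neighbours) are ignored in this sketch."""
    maxima, minima = [], []
    for (_, prev), (t, cur), (_, nxt) in zip(hr, hr[1:], hr[2:]):
        if cur > prev and cur > nxt and (baseline is None or cur > baseline):
            maxima.append(t)
        elif cur < prev and cur < nxt and (baseline is None or cur < baseline):
            minima.append(t)
    return maxima, minima


def co_occurrence_rate(
    game_events: Sequence[float], hr_events: Sequence[float], window: float = 2.0
) -> float:
    """Step 3: percentage of game events with at least one heart rate event
    within `window` seconds before or after the game event."""
    if not game_events:
        return 0.0
    hits = sum(
        any(abs(h - g) <= window for h in hr_events) for g in game_events
    )
    return 100.0 * hits / len(game_events)


# Toy heart rate series (time in s, bpm); values are illustrative only.
toy_hr = [(1.0, 70), (2.0, 72), (3.0, 85), (4.0, 74), (5.0, 66), (6.0, 71), (7.0, 69)]
maxima, minima = heart_rate_events(toy_hr, baseline=75)  # ([3.0], [5.0])
error_times = [4.5, 20.0]  # hypothetical error events
rate = co_occurrence_rate(error_times, maxima)  # 50.0
```

In this toy example the peak at 6 s is discarded because it stays below the assumed baseline of 75 bpm; calling `heart_rate_events(toy_hr)` without a baseline would report it, which mirrors the two variants (with and without baseline) compared in the results.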

Fig. 11 Percentage of co-occurrence between heart rate events and game events in different game conditions

Validation of Hypothesis 4(a) The results obtained for the detection of the correlation between users’ heart rate acceleration and game events are presented in Fig. 11. In the case of an error–stress correlation, stress is detected about half of the time, meaning that the player experienced stress in roughly half of the errors he/she made during the game. Hypothesis 4(a) therefore cannot be validated. Note that in the Normal conditions (i.e., No robotic coach–Normal alarm and Robotic coach–Normal alarm) there is no false alarm event, so the corresponding percentage of correlation is 0 %, as shown in Fig. 11.

Validation of Hypothesis 4(b) We can also notice the high correlation between heart rate deceleration and the success of removing an object from an opening. The percentage of this correlation is very high (\(94.36\) % when taking into account the baseline heart rate). This suggests that players calmed down almost every time they made a good move during the game. This validates our Hypothesis 4(b).

Validation of Hypothesis 5 Eysenck, in his work on personality [5], stated that introverted individuals have a higher arousal level than extroverted individuals. In our experiment, we observed a trend in the same direction between arousal level (measured as the number of heart rate peaks per minute) and personality. As shown in Fig. 12, the average number of heart rate peaks (i.e., local maxima) during a game is higher for introverted participants than for extroverted participants. However, an ANOVA on the average number of peaks between introverted and extroverted participants across game conditions returns \(F(1,6) = 3.64\), \(p = 0.1049\), \(\eta ^2 = 0.3778\), so the difference is not statistically significant. This does not allow us to validate Hypothesis \(5\) about the correlation between heart rate variability and the subject’s personality.
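The reported effect size can be cross-checked against the reported \(F\) value. For a one-way ANOVA, \(\eta ^2 = F \cdot df_{1} / (F \cdot df_{1} + df_{2})\). The short sketch below applies this standard identity to the rounded values quoted above; the helper name `eta_squared_from_f` is hypothetical, and the one-way design is an assumption inferred from the degrees of freedom.

```python
def eta_squared_from_f(f_value: float, df_effect: int, df_error: int) -> float:
    """Eta squared of a one-way ANOVA recovered from F and its degrees of
    freedom: SS_effect / (SS_effect + SS_error)."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Rounded values quoted in the text: F(1,6) = 3.64.
eta2 = eta_squared_from_f(3.64, df_effect=1, df_error=6)  # about 0.378
```

The result agrees with the reported \(\eta ^2 = 0.3778\) up to rounding of \(F\). A large effect size combined with \(p > 0.05\) is what one would expect from a small sample.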

Fig. 12 Average number of heart rate peaks in terms of game conditions

With the robot’s intervention, the number of heart rate peaks of introverted participants decreases in the difficult level of the game (see Fig. 13). For extroverted participants, the robot’s intervention increases the heart rate in the easy game level and decreases it in the difficult level. This can be considered a positive impact of the robot’s behavior because, according to the Yerkes–Dodson law [26] (which is graphically summarized in Fig. 1), it is preferable to keep a person at a moderate stress level. In our case, the robot was able to decrease the high stress level of introverted people and to moderate the stress level of extroverted people across difficulty levels. Our design of the robot’s behavior in the context of the Stress Game experiment thus appears appropriate for assisting people in coping with stress.

7 Discussion

The automatic detection of human stress and frustration during task performance is an important element of HRI. Several aspects need to be taken into account in order to improve the user’s task performance. Our experiment was designed to create a stressful situation and focused on how a robot with personality can reduce stress. Furthermore, we investigated the correlation between stressful events and physiological signals.

The effect of boosting human task performance while being monitored and encouraged by a personalized robotic coach was not observed in our experiment, hence Hypothesis \(1\) could not be validated, as explained in Sect. 6.1. Moreover, as Hypothesis \(3\) is rejected according to our results, it seems that participants’ preference is task-oriented rather than personality-oriented as reported in [14, 25]. Further research should be conducted to clarify the impact of robots with personality on human performance in various applications.

As Hypothesis \(4\)(a) is currently not validated, it seems that heart rate variability alone may not be sufficient to detect or predict stress in humans during task performance. A combination with other physiological signals (e.g., skin conductance) is to be investigated in future research projects.

Fig. 13 Statistics about heart rate peaks in terms of personality

Fig. 14 Percentage of correlation between heart rate events and game events

7.1 Heart Rate Acceleration at the End of the Game

While examining the heart rate data to construct the stress detection algorithm, we noticed an interesting phenomenon: the players’ heart rate seemed to accelerate as the end of the game approached. To test whether this effect actually occurs, we checked the correlation between heart rate local maxima and the end of the game. As shown in Fig. 14, this correlation exists at a rate of \(62.69\) % when the user’s baseline heart rate is taken into account, and at a rate of \(98.15\) % when no baseline heart rate is considered. This is an interesting phenomenon for HRI, as it suggests that humans may also react physiologically to the beginning and the end of a long event (in our case, the 1 min game). An assistive robot could use this information to choose an appropriate action in order to better assist the user.

The biggest weaknesses of our current results are the lack of female participants and the small size of the tested population. Because of these weaknesses, some of the results cannot be validated statistically.

8 Conclusions

In this paper, we discuss an experimental setup designed to test the robot’s role in reducing the frustration level of the player in order to enhance his/her task performance. Several conditions were tested: robotic coach versus no robotic coach condition; and neutral versus stressful condition.

From the experiment, we were able to evaluate the role of the robotic coach in human performance. While we were able to show that people preferred performing the task with the robotic coach, the robotic coach did not have a clear influence on human performance. We were also able to investigate the personality matching strategy. We found that matching personality between the human and the robotic coach does not always lead to better performance, at least in the context of the non-rehabilitation task tested in our experiment. We believe that this finding is complementary to the current state of the art, and can serve as guidance for future research on personality-based HRI.

Through the experiment, we were also able to verify the correlation between heart rate deceleration and participants’ positive moves (successfully removing objects from the board) in the game. Heart rate acceleration co-occurred with only about \(50\) % of the negative events (noisy alarms) during the game, which may be no better than chance, so it cannot be used on its own to predict or validate the participants’ stress state when they make errors during the game. A combination with other physiological signals (such as skin conductance) may provide better stress detection and prediction, and should be considered in future work on this subject.