1 Introduction

Body language is a crucial feature in human communication. Facial expressions, body posture and gestures all convey information about a person's internal state and contribute to the overall effectiveness of communication. It has been shown that these features also benefit the interaction between humans and robots by ensuring a more fluent and natural communication (Park et al. 2011; Scheutz et al. 2007; Salem et al. 2013; Breazeal et al. 2005). Different research teams have implemented gestures in robots with the aim of investigating different aspects of communication. Since these gestures are generally preprogrammed off-line for a specific robot configuration (Sugiyama et al. 2007; Ido et al. 2006; Zecca et al. 2009), or generated by mapping motion capture data to the robot's configuration (Matsui et al. 2005; Tapus et al. 2012; Do et al. 2008), they are robot-specific and not easily transferable to other robots. To offer a solution to this issue, which is known as the correspondence problem (Dautenhahn and Nehaniv 2002; Alissandrakis et al. 2002), we developed a generic method to generate gestures for social robots. By storing target gestures independently of any configuration and calculating a mapping based on an arbitrary configuration chosen by the user, gestures can be calculated for different robots.

Since different features are important for different types of gestures, our method was designed to work in two modes (Fig. 1). The block mode is used to calculate gestures for which the overall arm placement is crucial, such as emotional expressions (Van de Perre et al. 2015). The end-effector mode, on the other hand, was developed for end-effector-dependent gestures, i.e. gestures for which the position of the end-effector is important, such as manipulation and pointing (Van de Perre et al. 2016). The working principles and results of the block and end-effector mode were presented in detail in previous publications. In this paper, we focus on how these two modes are combined to generate blended deictic gestures and emotional expressions, and how information about the current emotional state can be used to modify functional behaviors calculated by the end-effector mode into affective motions.

Fig. 1

This schema represents the methodology of the developed gesture method, which aims to overcome the limitations of the current state of the art, where gestures are implemented for a specific robot. The method uses a human base model to store target gestures independently of any configuration in a database, and to calculate a mapping at runtime based on the robot configuration specified by the user. Two modes are used to allow for different types of gestures to be calculated. The block mode is used to calculate gestures for which the overall arm placement is crucial, such as emotional expressions, while the end-effector mode was developed for end-effector-dependent gestures, such as deictic gestures. This paper focuses on how the two modes can be combined to generate blended emotional and deictic gestures, and how information concerning the emotional state can be used to modulate functional behaviors into affective motions. Robots: a WE-4RII (Itoh et al. 2004), b KOBIAN (Zecca et al. 2009), c NAO (Belpaeme et al. 2012), d ASIMO (Salem et al. 2009), e Myon (Hild et al. 2012), f HRP-2 (Hirukawaa et al. 2004)

2 Related work

Different attempts have been made to ease the animation of social robots. Balit et al. (2016) proposed using the knowledge of animation artists to generate lifelike robotic motions by providing generic software in which different types and combinations of gestures can be created by keyframing or by 3D character articulation. However, since the generated motions are still dependent on the joint configuration used, this does not address the correspondence problem.

A possible solution lies in the field of developmental robotics, by using neural networks to learn the correspondence between a posture and the robot's joint angles (Andry et al. 2001). A technique to teleoperate a humanoid robot without an explicit kinematic model, using neural networks, was proposed by Stanton et al. (2012). Mühlig et al. (2012) ease the correspondence problem between a human tutor and a robot in imitation learning by representing demonstrated movement skills in a flexible task space representation. Another approach to addressing the correspondence problem in imitation learning was suggested by Azad et al. (2007), who use a reference kinematic model, the Master Motor Map, to convert motion capture data to an arbitrary robot morphology. This is similar to the strategy we use to map target gestures from a database to a robot configuration (see Sect. 3.1). In a later stage, the Master Motor Map was extended with a dynamic model and improved to allow for on-line reproduction of human motion on a humanoid robot (Terlemez et al. 2014). In Koga et al. (1994), a semi-general approach for generating natural arm motions, specifically for manipulation tasks, is presented. Their inverse kinematics algorithm is based on neurophysiological findings and decouples the problem of calculating joint angles for the arm from calculating those for the wrist. The sensorimotor transformation model of Soechting and Flanders (1989) is used to determine the arm posture, while the wrist angles are found by assuming a spherical wrist and using orientation inverse kinematics.

In both Salem et al. (2010) and Le et al. (2011), a gesture framework initially developed for virtual agents is applied to a humanoid robot. In Salem et al. (2010), speech-accompanying gestures are generated for ASIMO by using the speech and gesture production model initially developed for the virtual agent MAX. For a specified gesture, the end-effector positions and orientations are calculated by the MAX system and used as input for ASIMO's whole body motion controller (Gienger et al. 2005). Similarly, in Le et al. (2011), speech-accompanying gestures are generated for NAO by using the GRETA system. The gestures are described independently of the embodiment by specifying features such as the hand shape, wrist position and palm orientation. However, to obtain the corresponding joint values, a predetermined table listing values for the shoulder and elbow joints for all possible wrist positions is used. So although the gestures are described independently of the robot configuration, mapping them to the robot requires hard-coded joint information.

An interesting feature of this framework, however, is the possibility of generating affective motions by modulating a neutral behavior using a set of expressivity parameters. In that way, it is possible to convey an emotional state through an ongoing functional behavior. This is indeed an important feature in human communication, and much research has been performed on how an emotional state is reflected in the motions generated by a human. Wallbott (1998) found that both the quantity and the quality of the emotion influence the generated body movements. A number of studies investigated the effect of different emotions on human gait (Montepare et al. 1987; Crane and Gross 2007), while others focused on attributing affect to static body postures (James 1932; Coulson 2004; Atkinson et al. 2004). Other researchers studied affective arm movements (Pollick et al. 2001) and whole body motion (Meijer 1989; Castellano et al. 2007), with several studies directed at the effect of affect on dance (Dittrich et al. 1996; Castellano et al. 2007). Using the knowledge obtained in these studies, it is possible to create behaviors conveying emotional information by modifying neutral motion patterns (see Sect. 5).

An interesting aspect of our method is that, in contrast to learning techniques, no training is required. The method can directly calculate gestures for a chosen configuration. Furthermore, the method allows a large amount of flexibility, both in the desired robot configuration and in the targeted body motion. It can be used for a collection of robots and virtual models that consist of at least one arm, a body, or a head (see Sect. 3). Regarding the body motions, our system uses two separate modules to calculate different types of gestures. Next to modulating a neutral gesture into an affective motion, it is also possible to combine different types of gestures into one blended gesture. An emotional expression in the sense of an explicit, full body action, as calculated by our block mode, can take place in combination with a deictic gesture, as calculated by our end-effector mode, by assigning each gesture to different body parts. How the modes are combined to generate such a blended body motion is discussed in Sect. 4. But first, to get a better understanding of the method, the working principles of the two modes are briefly reviewed in Sect. 3. In situations where it is not desirable to express an emotional state by explicit bodily expressions, it can instead be expressed by modifying an ongoing functional behavior. How this is implemented in the method is discussed in Sect. 5. A number of results are discussed throughout Sects. 4 and 5. We conclude this paper with a summary and some details about current developments in Sect. 6.

3 Working principles of the method

To ensure a generic method usable for different kinds of robots, the framework was developed without assuming any specific robot morphology. Instead, a simplified model of the rotational possibilities of a human is used: the human base model. Firstly, a set of Body Action Units (BAUs) was defined, based on the human terms of motion. The defined BAUs are listed in Table 1. The units were grouped into blocks, each corresponding to one human joint complex, such as the shoulder or the wrist. These blocks can subsequently be grouped into three body parts, namely the head, body and arm, which we refer to as chains. In that way, our human base model was defined. A standard reference frame was defined, whereby the x-axis is located in the walking direction and the z-axis is pointing upwards, and subsequently a reference frame was assigned to each joint block (see Fig. 2). When a user desires to generate gestures for a certain robot or model, its morphological information is specified by inputting a limited amount of rotational information and the configuration's Denavit–Hartenberg (DH) parameters into the program, whereby the different joints of the robot are grouped into the chains and blocks of the human base model. As such, the method can be used for any robot that consists of at least one arm, a body, or a head.
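To make this configuration step concrete, the following sketch shows one possible way such a specification could be organized. The data structures, joint names and numerical values are purely illustrative assumptions and do not reflect the actual input format of the method.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class Joint:
    name: str
    a: float            # DH parameters (illustrative values below)
    alpha: float
    d: float
    theta_offset: float
    q_min: float        # joint limits [rad]
    q_max: float

@dataclass
class Block:
    bau_ids: List[int]  # Body Action Units covered by this block (Table 1)
    joints: List[Joint] # at most three joints per block

@dataclass
class Chain:
    name: str           # "head", "body", "left_arm" or "right_arm"
    blocks: List[Block]

@dataclass
class RobotConfig:
    name: str
    chains: Dict[str, Chain]

# Roughly NAO-like right arm: a shoulder block (including BAU 10) and an
# elbow block (BAU 13); all numerical values are illustrative only.
nao_like = RobotConfig(
    name="nao_like",
    chains={
        "right_arm": Chain("right_arm", [
            Block([10], [
                Joint("RShoulderPitch", 0.0, -1.5708, 0.0, 0.0, -2.09, 2.09),
                Joint("RShoulderRoll", 0.0, 1.5708, 0.0, 0.0, -1.33, 0.31),
            ]),
            Block([13], [
                Joint("RElbowRoll", 0.105, 0.0, 0.0, 0.0, 0.03, 1.54),
            ]),
        ]),
    },
)
```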

Table 1 The body action coding system
Fig. 2

A reference frame was assigned to each block. For the body 1 block, the reference frame is the standard reference frame. The body 2 and body 3 axes are, respectively, the body 1 and body 2 embedded axes. The head and clavicle reference axes are the body 3 embedded axes. For all other blocks of the arm, the axes are the embedded axes of the previous block

3.1 Block mode

The block mode is used for gestures for which the overall placement of the arms is important, such as emotional expressions. In this mode, the method uses a set of target gestures stored in a database and maps them to a selected configuration. To ensure a good overall posture, it is not sufficient to impose only the pose of the end-effector, since inverse kinematics for robots with a different configuration and different relative arm lengths could result in unrecognisable global postures. Therefore, the orientation of every joint complex the robot has in common with a human needs to be imposed. Hence, the target gestures are stored in the database by specifying the orientation of every joint block i of the base model using the orthopaedic angles (Kadaba et al. 1990) of frame \(i+1\) (the base frame of block \(i+1\)) with respect to frame i (the base frame of block i) (see Fig. 2). To make a robot or model perform a selected expression, a mapped rotation matrix for every present joint block is calculated by combining the information from the database with the morphological data specified by the user:

$$\begin{aligned} R_{i} = {}^{b,i}R_{st} \cdot R_{i,des} \cdot {}^{st}R_{e,i} \end{aligned}$$
(1)

Here, \(R_{i}\) is the mapped rotation matrix for block i, \(^{b, i}R_{st}\) the rotation matrix between the base frame of block i and the standard reference frame, \(R_{i, des}\) the target rotation matrix in standard axes for block i, loaded from the database, and \(^{st}R_{e,i}\) the rotation matrix between the standard reference frame and the end frame of block i, i.e. the base frame of block \(i+1\).
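As a minimal illustration of Eq. (1), the mapping for one block reduces to a product of three rotation matrices; the helper function below is a hypothetical name, not part of the method's implementation.

```python
import numpy as np

def map_block_rotation(R_base_to_std, R_des_std, R_std_to_end):
    """Compute the mapped rotation matrix R_i of Eq. (1) for one joint block:
    R_i = {}^{b,i}R_st . R_{i,des} . {}^{st}R_{e,i}."""
    return R_base_to_std @ R_des_std @ R_std_to_end

# For a robot whose block frames coincide with the standard reference frame,
# the mapped matrix is simply the target rotation loaded from the database.
R_des = np.array([[0.0, -1.0, 0.0],
                  [1.0,  0.0, 0.0],
                  [0.0,  0.0, 1.0]])   # example target rotation: 90 deg about z
R_i = map_block_rotation(np.eye(3), R_des, np.eye(3))
```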

These mapped matrices serve as input for an inverse kinematics algorithm that calculates the joint angles needed to make the specified robot configuration perform the desired expression. Using the Runge–Kutta algorithm (Ascher and Petzold 1998), for every block, the angle values q are obtained from their derivatives \(\dot{q}\), calculated by the following algorithm (Sciavicco 2009):

$$\begin{aligned} \dot{q}=J_{A}^{\dagger }(q)\left( \dot{x_{d}}+K \left( x_{d}-x_{e} \right) \right) +\left( I-J_{A}^{\dagger }(q)J_{A}(q)\right) \dot{q_{0}} \end{aligned}$$
(2)

Here, \(x_{d}\) is the desired end-effector pose. Since the maximum number of joints in one block is three, it is not necessary to use all six parameters of the pose; considering the orientation of the end-effector is sufficient. Therefore, \(x_{d}\) is reduced to the \(zyx\)-Euler angles corresponding to the mapped rotation matrix. \(J_{A}^{\dagger }(q)\) is the Moore–Penrose pseudoinverse of the analytical Jacobian \(J_{A}(q)\). Since only rotational information is imposed, \(J_{A}(q)\) is reduced to its rotational part. \(x_{e}\) is the current end-effector pose, i.e. the current \(zyx\)-Euler angles, and K a positive definite gain matrix. Since the different blocks are treated separately, no redundancy is present, causing the second term \(\left( I-J_{A}^{\dagger }(q)J_{A}(q)\right) \dot{q_{0}}\) to be zero.
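The update of Eq. (2) for one block can be sketched as follows. The kinematics routines (`forward_euler_zyx`, `analytic_jacobian_rot`) are assumed to be supplied by the user for the block at hand, and a plain explicit Euler integration step is used for brevity, whereas the method itself uses a Runge–Kutta scheme.

```python
import numpy as np

def clik_block_step(q, x_d, xdot_d, forward_euler_zyx, analytic_jacobian_rot,
                    K=np.eye(3) * 10.0, dt=0.01):
    """One closed-loop inverse kinematics step for a single joint block (Eq. 2),
    block-mode variant: only the zyx-Euler angles of the block's end frame are
    constrained, and the nullspace term vanishes because a block has at most
    three joints."""
    x_e = forward_euler_zyx(q)                   # current zyx-Euler angles
    J_A = analytic_jacobian_rot(q)               # 3 x n analytical Jacobian (rotational part)
    J_pinv = np.linalg.pinv(J_A)                 # Moore-Penrose pseudoinverse
    q_dot = J_pinv @ (xdot_d + K @ (x_d - x_e))  # Eq. (2); second term is zero here
    return q + dt * q_dot                        # simple Euler integration step
```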

3.2 End-effector mode

The end-effector mode is used for gestures for which the position of the end-effector is crucial, such as deictic gestures. In some situations, for example when reaching for an object, the position of the right and/or left hand is important and specified by the user. This situation is called the place-at condition. The specified position then serves as a basis to calculate the necessary end-effector position for the selected chain, which is used as input for the same inverse kinematics algorithm as used in the block mode (Eq. 2). While in the block mode a constraint is imposed on the end-effector of every block and the inverse kinematics algorithm is used to calculate the joint angles of every block separately, in the end-effector mode a constraint is imposed on the end-effector of the chain, and the algorithm is used to calculate the joint angles of the chain as a whole. Since in the end-effector mode the position is specified, the desired end-effector pose \(x_{d}\) is limited to positional information only, reducing \(J_{A}(q)\) to its translational part. In the highly probable case of an arm chain consisting of more than three degrees of freedom, the functional redundancy is used to guide the configuration into a natural posture. In that case, the second term of Eq. 2 differs from zero, activating the influence of \(\dot{q_{0}}\) on the calculated joint speeds. \(\dot{q_{0}}\) introduces the cost function w(q):

$$\begin{aligned} \dot{q_{0}}= k_{0}\left( \frac{\partial w(q)}{\partial q} \right) ^{T} \end{aligned}$$
(3)

with \(k_{0}\) a positive weight factor. For the cost function w, we decided to work with a slightly adapted form of the joint range availability (JRA) criterion (Jung et al. 1995), whereby an optimal human-like posture is calculated by keeping the joints close to a selected set of minimum posture angles (see our previous publication, Van de Perre et al. 2016):

$$\begin{aligned} w = \sum \limits _{i=1}^n w_{0,i} \frac{ \left( q_{i}-q_{mi} \right) ^{2}}{\left( q_{max,i}-q_{min,i} \right) ^{2}} \end{aligned}$$
(4)

Here, \(q_{i}\) is the current value of joint i and \(q_{mi}\) the minimum posture angle for that joint. \(q_{max,i}\) and \(q_{min,i}\) are the maximum and minimum joint limits, and \(w_{0,i}\) a weight factor for joint i. The pointing condition functions in the same way as the place-at condition, apart from the fact that by specifying a desired pointing position, no direct constraint is imposed on the end-effector. A series of configurations, each with a specific combination of end-effector position and orientation, can fulfil the pointing constraint. When pointing to an object, the index finger is directed towards the object, implying that for a certain position of the end-effector, the orientation needs to be chosen along the line connecting the object and the last wrist joint. In other words, the extension of the end-effector needs to pass through the selected target position. To calculate the different possible postures, the end-effector is gradually virtually extended and the pointing position is imposed on the virtual end-effector. For every virtual length, the optimal configuration is calculated. From the resulting collection of postures, the cost function finally selects the optimal result by comparing the total cost of every configuration; a sketch of this search is given below.
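The following is a rough sketch of that search. The routines `wrist_position_of` and `ik_place_at` are assumed to be supplied by the user, and the geometric shortcut used for the virtual extension (fixing the pointing direction from the initial wrist position) is an illustrative simplification, not the paper's exact implementation.

```python
import numpy as np

def jra_cost(q, q_min_posture, q_lo, q_hi, w0=None):
    """Adapted joint range availability cost of Eq. (4)."""
    w0 = np.ones_like(q) if w0 is None else w0
    return float(np.sum(w0 * (q - q_min_posture) ** 2 / (q_hi - q_lo) ** 2))

def solve_pointing(target, q_init, wrist_position_of, ik_place_at,
                   q_min_posture, q_lo, q_hi,
                   virtual_lengths=np.linspace(0.05, 0.6, 12)):
    """Gradually extend the end-effector virtually, impose the pointing target
    on the virtual tip, and keep the posture with the lowest JRA cost."""
    wrist = wrist_position_of(q_init)
    direction = (target - wrist) / np.linalg.norm(target - wrist)
    best_q, best_cost = None, np.inf
    for L in virtual_lengths:
        # place the real end-effector at distance L before the target, so that
        # its extension by L along the pointing direction passes through it
        goal = target - L * direction
        q = ik_place_at(goal, q_init)            # end-effector mode IK (Eq. 2)
        cost = jra_cost(q, q_min_posture, q_lo, q_hi)
        if cost < best_cost:
            best_q, best_cost = q, cost
    return best_q
```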

Before calculating a trajectory to the desired end-effector position, the possibility of reaching this position with the current configuration is checked by using an approximate calculation of the workspace. If the desired end-effector position is indeed located in the workspace of the robot, a suitable trajectory towards this position is calculated. In case of a reaching gesture towards a position located outside the workspace, the pointing condition can be activated, and a trajectory towards a suitable posture for a pointing gesture is calculated instead.
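The reachability test can be as simple as the sketch below, which assumes a spherical approximation of the workspace around the shoulder; this is an illustrative assumption and not the paper's exact workspace computation. When the test fails for a reach request, the pointing condition described above provides the fallback.

```python
import numpy as np

def in_approximate_workspace(target, shoulder_position, link_lengths):
    """Treat the workspace as a sphere around the shoulder with radius equal
    to the summed link lengths of the arm chain (a coarse over-approximation)."""
    reach = sum(link_lengths)
    return np.linalg.norm(np.asarray(target) - np.asarray(shoulder_position)) <= reach
```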

4 Blended gestures

4.1 Priority levels

During natural communication, humans use and combine different types of gestures. By combining the two modes of our method presented above, it is possible to generate blended emotional expressions and deictic gestures. To do so, priority levels for each chain are assigned to both gesture types and a mode mixer was designed. If the mode mixer is turned off, all gestures are treated separately; starting a new gesture causes a previously started gesture to be aborted. By enabling the mode mixer, different gestures are blended by considering, for every chain, only the end-effector condition(s) corresponding to the gesture with the highest priority level. The priority levels are defined using a number of rules:

  • For an emotional expression, the priority level for each chain is set to the basic level (level 1)

  • A deictic gesture has a higher priority than an emotional expression: the chain corresponding to the pointing/reaching arm receives a higher priority level (level 2)

  • Similarly, gazing has a higher priority than an emotional expression: the head chain receives a higher priority level (level 2)

For every separate chain, the highest priority level present determines which gesture needs to be considered for that chain. The corresponding calculation principles are enabled, and the required constraints are loaded for the different chains: orientational information for every block composing the chain in the block mode, or the desired end-effector position for the complete chain in the end-effector mode.
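A minimal sketch of this selection step is given below; the gesture descriptors and dictionary layout are assumed for illustration only and are not the method's actual data structures.

```python
def resolve_constraints(active_gestures, chains):
    """For every chain, keep only the end-effector condition of the gesture
    with the highest priority level on that chain (mode mixer enabled)."""
    selected = {}
    for chain in chains:
        candidates = [g for g in active_gestures if chain in g["priority"]]
        if candidates:
            winner = max(candidates, key=lambda g: g["priority"][chain])
            selected[chain] = winner["constraint"][chain]
    return selected

# Emotional expression of fear: basic priority (level 1) on every chain, with
# the mapped rotation matrices from the database as constraints.
fear = {"priority": {c: 1 for c in ("head", "body", "left_arm", "right_arm")},
        "constraint": {c: "rotation matrices from gesture database"
                       for c in ("head", "body", "left_arm", "right_arm")}}
# Left-handed deictic gesture: level 2 on the left arm chain only.
pointing = {"priority": {"left_arm": 2},
            "constraint": {"left_arm": "desired pointing position"}}

print(resolve_constraints([fear, pointing],
                          ["head", "body", "left_arm", "right_arm"]))
# -> the left arm chain receives the pointing position (end-effector mode),
#    all other chains keep the database rotation matrices (block mode)
```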

When, for example, an emotional expression is performed in combination with a left handed deictic gesture, the left arm chain has a level 1 priority for the emotional expression but a level 2 for the deictic gesture. Therefore, for that chain, the pointing position is considered and the end-effector mode will calculate the corresponding joint angles. For the other chains, only priority level 1 is present. Therefore, the block mode will calculate the joint angles for all blocks in the remaining chains.

Figure 3 schematically summarizes how the mode mixer and the priority levels determine the imposed constraints, while Fig. 4 visualizes the work flow of one iteration, depending on the priority levels.

Fig. 3

Schematic representation of how the end-effector constraints are determined by the motion mixer and the priority levels

Fig. 4

Work flow of one iteration, depending on the priority levels

Fig. 5

Example illustrating the calculation of a blended gesture for a NAO, b Justin. Left: joint configuration of the robot. Middle: calculated end posture for the emotional expression of fear. Right: calculated end posture for a combination of a pointing gesture with the right arm and the emotional expression of fear

4.2 Examples of blended gestures

Figure 5 illustrates the calculation of a blended gesture for the robots NAO and Justin. The left part of Fig. 5 shows the joint configuration of the robots. NAO has an actuated head and a left and right arm of 5 DOF. Justin's arms, on the other hand, contain 7 DOF, with a remarkably different configuration. In addition, Justin has an actuated body with 3 DOF. The middle of Fig. 5 displays the end posture for the emotional expression of fear, calculated by the block mode. For the right part of Fig. 5, the mode mixer was enabled and a combination of gestures was requested. Next to the expression of fear, a pointing gesture with the right arm was desired, accompanied by gazing towards the pointing location. As explained above, the priority levels determine which calculation principle is activated for every chain, and which corresponding end-effector conditions need to be used. For the expression of fear, all present chains have priority level 1. However, the priority of the pointing gesture for the right arm is higher than the basic level (level 2). Therefore, for the right arm chain, the end-effector mode is activated, whereby the end-effector condition is determined by the desired pointing position (see Sect. 3.2). For all other chains present, the block mode is activated. Since the priority of gazing towards a specified position overrules that of the emotional expression for the head, the rotation matrix needed to obtain the desired gazing direction is imposed. For the left arm chain, and the body chain in case of Justin, the mapped rotation matrices, calculated using data from the gesture database, are imposed as end-effector conditions for every present block in the corresponding chains (see Sect. 3.1).

5 Affective functional behaviors

5.1 Expressivity models

In some situations, it is desirable to express an emotional state in a different manner than by the explicit bodily expressions calculated by the block mode. It is possible, for example, that both arms are involved in a functional behavior and are therefore not available for performing an emotional expression. On the other hand, the recognizability of an emotional expression can decrease severely when one arm is used for a deictic gesture. In such cases it can be useful to express an emotional state through an ongoing functional behavior by modulating it using a set of characteristic performance parameters. In the literature, different expressivity models have been developed to reach that goal. Amaya et al. (1996) proposed a model to generate an emotional animation from neutral motions by calculating an emotional transform based on the difference in speed and spatial amplitude between a neutral and an emotional motion. In Pelachaud (2009), six parameters, namely spatial extent, temporal extent, fluidity, power, overall activation and repetition, were used to modify behavior animations for the virtual agent Greta. Yamaguchi et al. (2006) found that the amplitude, position and speed are relevant parameters in modifying basic motions to express joy, sadness, anger and fear, while Lin et al. (2009) found that the stiffness, speed and spatial extent of the motion can effectively generate emotional animations from an initial neutral motion. Xu et al. (2013a) proposed a method for bodily mood expression, whereby a set of pose and motion parameters modulates the appearance of an ongoing functional behavior. Results indicated that the spatial extent parameters, including hand height and amplitude, the head position and the motion speed are the most important parameters for readable mood expressions (Xu et al. 2013b). Since in all these expressivity models the motion speed and the amplitude are important recurring factors, we decided to focus on these modification parameters in our method.

5.2 Generating affective gestures by influencing the motion speed

In both Xu et al. (2013b) and Yamaguchi et al. (2006), it was experimentally confirmed that the motion speed influences the perceived level of both valence and arousal; a fast motion is associated with high arousal and valence, while a slow motion is attributed to low arousal and valence values. By considering the two-dimensional emotion space of valence and arousal, based on the circumplex model of affect (Posner et al. 2005), we obtained an appropriate speed scaling factor for each emotion (see Fig. 6). When calculating a deictic gesture with the end-effector mode of our method, a suitable trajectory between the initial posture and the end posture is generated by calculating intermediate key frames. The timing between two consecutive frames is fixed, but the number of frames, and therefore the total duration of the gesture, is determined by the speed scaling factor, in order to add affective content to the gesture.
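The sketch below illustrates this idea. The mapping from valence and arousal to the speed scaling factor, and the numerical ranges used, are assumptions for illustration only and not the calibrated values read from Fig. 6.

```python
import numpy as np

def speed_factor(valence, arousal, v_min=0.5, v_max=1.5):
    """Illustrative mapping: the speed scaling factor grows with both valence
    and arousal, each assumed to be normalised to [-1, 1]."""
    s = 0.5 * (valence + arousal)                    # combined level in [-1, 1]
    return v_min + (s + 1.0) / 2.0 * (v_max - v_min)

def number_of_frames(neutral_duration, valence, arousal, frame_dt=0.05):
    """The timing between consecutive key frames is fixed (frame_dt); the
    affective state changes the total duration, hence the number of frames."""
    duration = neutral_duration / speed_factor(valence, arousal)
    return max(2, int(round(duration / frame_dt)))

# e.g. a 1.0 s neutral reach becomes shorter when happy, longer when sad
print(number_of_frames(1.0, valence=0.8, arousal=0.6))    # fewer frames, faster
print(number_of_frames(1.0, valence=-0.7, arousal=-0.6))  # more frames, slower
```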

Fig. 6

Dependency of the modification factors motion speed (\(v_{motion}\)) and amplitude (Amp) on the valence and arousal value, depicted on the circumplex model of affect (Posner et al. 2005)

5.3 Generating affective postures using the nullspace

The second modification parameter, the amplitude of the motion, refers to the spatial extent: the amount of space occupied by the body. Xu et al. (2013b) found that this parameter is only related to the valence; open postures with a high amplitude are coupled with affective states with high valence, while closed, low-amplitude postures are related to states with a low valence (see Fig. 6). As discussed in Sect. 3, the necessary joint angles to reach a desired posture are calculated by the inverse kinematics algorithm of Eq. 2 with, as cost function w, a slightly adapted form of the joint range availability criterion (see Eq. 4). In that way, an optimal humanlike posture is calculated by keeping the joints q close to a selected set of minimum posture angles \(q_{mi}\). Instead of using fixed minimum posture angles, it is possible to express them as a function of the current valence level. Hence, the resulting calculated posture becomes dependent on the current affective state. The Body Action Units most influencing the openness of a posture are BAU 10 and 13, the units corresponding to the abduction/adduction of the shoulder and the flexion/extension of the elbow joint (see Table 1). For the joints corresponding to these BAUs, a linear function of the valence is provided instead of the fixed minimum posture angle used before. When scaling the valence level val for each emotion, as read from the circumplex model of affect (see Fig. 6), between 0 and 1, the following linear function can be used to select the current appropriate value for the minimum posture angle, which we now call the affective posture angle \(q_{ai}\):

$$\begin{aligned} q_{ai} = q_{ai, min} + val\times (q_{ai, max}-q_{ai, min}) \end{aligned}$$
(5)

The minimum value \(q_{ai, min}\) of the affective posture angle corresponds to the value associated with the minimum valence value, i.e. a value generating a closed posture with low amplitude. The angle value is defined in the corresponding reference frame connected to the human base model, relative to the T-pose as visualized in Fig. 2. Therefore, for BAU 10, a value of \(90^{\circ }\) is a suitable choice, since it corresponds to a posture whereby the upper arm touches the flank of the body. Regarding BAU 13, a small-amplitude posture is reached when keeping the forearm as close as possible to the upper arm. A value of \(170^{\circ }\) is therefore an appropriate choice. Similarly, the maximum value \(q_{ai, max}\) of the affective posture angle corresponds to the value associated with the maximum valence value, i.e. the value generating an open posture with high amplitude. This should be a posture whereby both the elbow and wrist are located far away from the body. A suitable choice is therefore \(0^{\circ }\) for BAU 10, and \(80^{\circ }\) for BAU 13.
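A minimal sketch of Eq. (5), using the boundary values given above; the valence levels plugged in are illustrative.

```python
def affective_posture_angle(val, q_ai_min, q_ai_max):
    """Eq. (5): linear interpolation of the minimum posture angle with the
    valence level val, scaled between 0 (lowest valence) and 1 (highest)."""
    return q_ai_min + val * (q_ai_max - q_ai_min)

# Boundary values from the text (degrees, relative to the T-pose of Fig. 2):
# BAU 10 (shoulder abduction/adduction): 90 deg closed ->  0 deg open
# BAU 13 (elbow flexion/extension):     170 deg closed -> 80 deg open
bau10 = affective_posture_angle(0.8, q_ai_min=90.0, q_ai_max=0.0)    # 18 deg, open posture
bau13 = affective_posture_angle(0.2, q_ai_min=170.0, q_ai_max=80.0)  # 152 deg, closed posture
```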

Fig. 7

Example illustrating the expression of affect during a functional behavior for a a human virtual model, b the robot Justin. A reaching gesture was calculated during different affective states: happiness, fear and sadness. Right top: corresponding joint configuration. Main figure: time line showing a set of postures for every affective state, illustrating the effect of the motion speed modification factor on the calculated gesture. The effect of the amplitude modification factor is visible when comparing the end postures for every mood

5.4 Example: deictic gesture during different states of affect

Figure 7 illustrates the results of the two subsections discussed above. A right-arm reaching gesture during different states of affect was calculated for two different configurations: a human virtual model with a 9 DOF arm (Fig. 7a), and the robot Justin (Fig. 7b). This example shows that the developed method can be used not only for existing robots, but also for any virtual model, by assigning a suitable joint configuration to it.

The right top of Fig. 7 shows the corresponding joint configuration of the models used, while the main figure visualizes a set of calculated postures for every affective state on a time line. As discussed in Sect. 5.2, the total timing of the gesture is influenced by the speed factor, whose current value is determined by the current affective state. Since the motion speed increases with both valence and arousal, a high value is obtained for the happy state, resulting in a short total gesture duration of 0.75 s. For the same pointing gesture performed during a sad state, a low speed factor and long duration (1.5 s) are calculated, while for the fearful state, the values lie somewhere in between (duration of 1.0 s).

The influence of the amplitude modification factor is visible when comparing the end postures for each affective state. Since the amplitude of the posture increases with higher valence values, an open posture is calculated for the happy state, whereby the elbow is located far away from the body. For the sad state, the elbow is placed close to the body, generating a closed posture as expected. Since the valence values for fear and sadness are close to each other (see Fig. 6), the difference in posture during the corresponding states is small (a difference of approximately \(10^{\circ }\) for BAU 10), and here the total timing of the gesture is the main differentiator.

6 Conclusions and current work

In this paper, we presented the new developments of our generic method to generate gestures for social robots. The method was designed to work in two modes, to allow the calculation of different types of gestures. The block mode is used to calculate gestures for which the overall arm placement is crucial, such as emotional expressions, while the end-effector mode was developed for end-effector-dependent gestures, such as deictic gestures. The working principles of both modes were discussed in previous publications (Van de Perre et al. 2015, 2016). During human communication, different types of gestures are used and combined. In this paper we discussed how the two modes can be combined to generate blended emotional expressions and deictic gestures. To achieve this, a mode mixer was developed, and for every mode, priority levels were assigned to each chain. The priority levels decide which end-effector constraints need to be considered for each chain. In that way, when gestures with different priority levels are selected with the mode mixer enabled, the imposed end-effector conditions originating from the different gestures result in a blended posture. A combination of a pointing gesture with the emotional expression of fear was calculated for both the robots NAO and Justin to illustrate this new functionality. When one arm is used for a deictic gesture, the recognizability of the emotional expression can decrease. In that case, it can be interesting to express the emotional state not by the explicit bodily expressions calculated by the block mode, but through an ongoing functional behavior. We implemented the possibility to modulate a pointing or reaching gesture into an affective gesture by influencing the motion speed and the amplitude of the posture. To illustrate the results of this new implementation, an affective gesture was calculated for two different configurations for three affective states. Differences in motion speed and posture could be clearly distinguished. However, for configurations with very few DOFs, these differences can diminish. In that case, it can be interesting to implement supplementary modification parameters.

The new implementations were validated on the virtual models of different robots. Current work includes evaluating the method on the physical platforms of a set of robots, including Romeo and Pepper.