1 Introduction

Interactive robots developed for social human–robot interaction (HRI) scenarios need to be socially intelligent in order to engage in natural bi-directional communication with humans. Namely, social intelligence allows a robot to share information with, relate to, understand and interact with people in human-centered environments. Robot social intelligence can result in more effective and engaging interactions and, hence, better acceptance of a robot by its intended users [13]. The challenge lies in developing interactive robots with the capabilities to perceive and identify complex human social behaviors and, in turn, display their own behaviors using a combination of natural communication modes such as speech, facial expressions, paralanguage and body language.

Our research focuses on affective communication as displayed through body language during social HRI. Our previous work in this area has resulted in the development of automated human affect recognition systems for social robots in order to determine a person’s accessibility and openness towards a robot via static body language during one-on-one HRI scenarios [49]. In this paper, by contrast, we focus on a robot’s ability to display emotional body language. In particular, we explore the design of emotional body language for our human-like social robot, Brian 2.0 (Fig. 1). For Brian 2.0 to effectively display emotional body language that can be easily recognized by different human users, we utilize human emotion research to determine how humans display and recognize emotions through body postures and movements, and apply a similar approach to the generation of Brian 2.0’s emotional body language.

Fig. 1 One-on-One HRI with the Social Robot Brian 2.0

In general, it has been identified that non-verbal communication, which includes body language, facial expressions and vocal intonation, conveys a human’s intent better than verbal expressions, especially in representing changes in affect [10]. To date, a significant amount of research has focused on the recognition of human emotions through facial expressions [11] and vocal intonation [12], or a combination of both [13], with comparatively little attention placed directly on emotion recognition from body language. Yet body language plays an important role in communicating human emotions during interpersonal social interactions.

Although initial work by Graham et al. [14] suggested that human bodily cues and hand gestures do not function as an additional source of information in the communication of emotion with respect to facial expressions, more recent research has shown that human body language plays an important role in effectively communicating certain emotions either in combination with facial expressions [15] or on its own [16, 17]. In [15], a study using the display of both congruent and incongruent facial expressions and body language confirmed that both face and body information influence emotion perception. The authors noted that increased attention to bodies and compound displays could provide a better understanding of what is communicated in nonverbal emotion displays. They also mentioned the potential importance of dynamic stimuli. In [16], the influence of the body, face and touch on emotion communication was investigated. With respect to the body, it was determined that body language was the dominant non-verbal communication channel for social-status emotions which include embarrassment, guilt, pride and shame. In [17], a study investigating the recognition of the basic emotions of anger, fear, happiness and sadness, conveyed only through body language, found high recognition rates (greater than 85 %) for all the emotions. Work by Ekman [18] has identified that people are more likely to consciously control or tune their facial expressions than their body language. This is because we generally pay close attention to each other’s facial expressions and, hence, can actively adapt our expressions to others in different scenarios. However, since feedback on body language from others is rare, we do not censor natural body movements. Hence, body language is considered an important channel for communicating a person’s emotions.

With respect to virtual agents, much of the research has focused on investigating the display of emotions through facial expressions or a combination of both facial expressions and tone of voice as discussed in [19]. However, fewer works have emphasized the display of emotions through body movements, e.g. [20], or the combination of facial expressions and body movements, e.g. [19]. Similar developments in non-verbal emotion communication for humans and virtual agents also exist for robotic applications. In particular, with respect to the robotic display of emotions, the majority of the existing research has been on identifying facial nodes and actuation techniques in order for robots to be able to display believable facial expressions, e.g. [21, 22], or on the recognition of the facial display of basic emotions by a robot, e.g. [23]. To date, only a handful of researchers have focused on the use of robotic body language to display emotions, with the primary emphasis being on the display of emotions through dance, e.g. [24–30].

In this work, we aim to identify the appropriate emotional body language for the human-like robot Brian 2.0 to display during natural one-on-one interpersonal social interactions with a person, where in such interactions emotional dance may not be appropriate. Our contributions in this paper are as follows: (1) to uniquely investigate if life-sized human-like social robots can effectively communicate emotion by utilizing a combination of human body language features defined by researchers in psychology and social behavioral science, and (2) to conduct a novel comparison study to investigate the effectiveness of these human body language features in communicating emotion when displayed by such a robot with respect to a human actor, where the robot has fewer degrees of freedom.

Our goal is to demonstrate that body movements and postures for human-like robots can represent certain emotions and, hence, should be considered an important part of interaction on the robot’s side. The comparison study is performed with Brian 2.0 and a human actor both performing body movements and postures based on the same body language descriptors in order to investigate whether non-experts can recognize the emotional body language displayed by the human-like robot, which has fewer degrees of freedom, at recognition rates similar to those for a human. The study will allow us to determine which body movements and postures can be generalized for the robot to display a desired emotion as well as explore whether human body language can be directly mapped onto an embodied life-sized human-like robot. Feasibility, in our case, is assessed via human recognition rates of Brian 2.0’s emotional body language. Distinct from other robot body language studies in the literature, we focus on the use of social emotions that can arise from interpersonal factors during social HRI and on whether these emotions are perceived differently by individuals when displayed by a human-like robot or a human actor. In our work, we consider the implementation of body movements and postures defined by Wallbott [31] and de Meijer [32] for a variety of different emotions.

The rest of this paper is organized as follows. Section 2 provides a discussion on the current research on emotional body language for both humans and robots. Section 3 describes our social robot Brian 2.0 and Sect. 4 defines the emotional body language features utilized for the robot. Sections 5 and 6 present and discuss experiments conducted to evaluate the feasibility of the robot’s emotional body language as well as a comparison study to investigate the perception of the same emotional body language movements and postures when displayed by a human actor versus the robot. Lastly, concluding remarks are presented in Sect. 7.

2 Emotional Body Language

2.1 Human Display of Emotional Body Language

Early research on body language in [33] presented the importance of leaning, head pose and the overall openness of the body in identifying human affect. Participants were shown images of a mannequin in various body postures and asked to identify the emotion and attitude of the posture. The results indicated that posture does effectively communicate attitude and emotion, and that head and trunk poses form the basis of postural expression, with arms, hands and weight distribution being used to generate a more specific expression. More recent research presented in [34] has shown that emotions displayed through static body poses are recognized at the same frequency as emotions displayed with facial expressions. Participants viewed images of a woman displaying different poses for the emotions of happiness, fear, sadness, anger, surprise and disgust, and were asked to identify the corresponding emotions. The results showed that the body poses with the highest recognition rates were judged as accurately as facial expressions. In [35], a study performed with 60 college students utilizing stick figures showed that emotion was strongly related to varying head and spinal positions. For the study, the students were asked to choose, from a list, the emotions of 21 stick figures with three different head positions and seven different spinal positions. The emotions on the list included anger, happiness, caring, insecurity, fear, depression and self-esteem. It was found that upright postures were identified more often as positive emotions while forward leaning postures were identified more often as negative emotions. A comparison of the results with the emotional states of the participants found that the participants’ own emotional states did not influence their emotional ratings of the figures. In [36], Coulson investigated the relationship between viewing angle, body posture and emotional state. Images from three different viewing angles of an animated mannequin in numerous static body poses (derived from descriptions of human postural expressions) were shown to 61 participants who identified the emotions they felt best described each image. The findings indicated that the emotions of anger, sadness and happiness were identified correctly more often than disgust, fear and surprise, and that a frontal viewing angle was the most important viewing angle for identifying emotions. It was also found that surprise and happiness were the only two emotions from the aforementioned emotions that were confused with each other. A similar study to [36] was presented in [37], where instead of an animated wooden mannequin more human-like characters were presented in images to 36 subjects in order for them to distinguish between postures for different expressive emotions. The subjects were asked to group the posture images into the emotions of happiness, sadness, anger, surprise, fear and disgust, and then rate the intensity of emotion expression in each image on a five-point Likert scale. The results identified that happiness had the highest recognition rate, while disgust had the lowest. Furthermore, a different intensity level was assigned to each posture in the same emotion group.

In [17], a database of full body expressions of forty-six non-professional actors, with their faces blurred out, was presented to 19 participants. The participants were asked to categorize the emotion displayed by the expressions based on a four alternative (anger, fear, happiness, sadness) forced-choice task. The results showed that sadness had the highest recognition rate at 97.8 % and happiness had the lowest rate at 85.4 %. In [38], a study was conducted to illustrate that facial expressions are strongly influenced by emotional body language. In the study, twelve participants were presented with images of people displaying fearful and angry facial expressions and body language that were either congruent or incongruent. The participants viewed the images and were asked to explicitly judge the emotion of the facial expression while viewing the full face–body combination. The results showed that recognition rates were lower and reaction times were slower for incongruent displays of emotion. Furthermore, it was found that when the face and body displayed conflicting emotional information, a person’s judgment of facial expressions was biased towards the emotion expressed by the body. Comparison studies presented in [39] also investigated the influence of body expressions on the recognition of facial expressions as well as emotional tone of voice. The results reemphasized the importance of emotional body language in communication, whether displayed on its own or in combination with facial expressions and emotional voices.

Although the aforementioned studies have been successful in validating emotion recognition from human bodies, they all focus only on static poses and do not take into account the dynamics of body language that are also present during social interactions. In [36], even though body movements were not considered, Coulson discusses their potential importance, in addition to static postures, for emotion display.

Recognition and interpretation of a person’s emotions is very important in social interaction settings. Ekman and Friesen [40] were the first to indicate the importance of body language in conveying information concerning affective states between two individuals in communicative situations. Furthermore, a detailed review of the literature by Mehrabian [41] showed a link between the body posture of one person and his/her attitude towards another person during a conversation. In particular, body orientation, arm positions and trunk relaxation have been found to be consistent indicators of a person’s attitude towards the other person. During social interactions, static body poses may not provide enough information to define a person’s emotions as the body can move a great deal while interacting, and such movement can provide information regarding the intensity and specificity of the emotion [42]. Hence, there exists a consensus that both body movements and postures are important cues for recognizing the emotional states of people when facial and vocal cues are not available [42]. In [42], point-light and full-light videos and still images of actors using body motions to portray five emotions (anger, disgust, fear, happiness and sadness) at three levels of intensity (typical, exaggerated and very exaggerated) were presented to 36 student participants for a forced-choice emotion classification study. For the point-light videos, strips of reflective tape were placed on the actors to only highlight the motion of the main body parts including the ankles, knees, elbows and hands, while a full-light video illuminated a person’s whole body. The still images were frames extracted from the point-light and full-light videos which depicted the peak of each emotional expression. The results of the study showed that exaggeration of body movements improved recognition rates as well as produced higher emotional intensity ratings. The emotions were also identified more readily from body movements even with the point-light videos which minimized static form information.

In [43], the characteristics of a person’s gait were examined to see if emotional state could be identified from walking styles. Observers examined four different people walking in an L-shaped path while displaying four emotions and then identified which emotion each walking style represented. The results showed that the emotions of sadness, anger, happiness and pride could be identified at higher than chance levels based on the amount of arm swing, stride length, heavy footedness and walking speed. In [44], the point-light technique was used to present two dances performed by four dancers (two male and two female) to 64 participants. The dances had the same number of kicks, turns and leaps, however, had different rhythms and timing. It was found that the participants identified that certain movements corresponded to the emotions of happy and sad. Namely, the happy dance was more energetic and consisted of free and open movements, while the sad dance consisted of slow, low energy and sweeping movements. In [45], videos of actors performing emotional situations utilizing body gestures with their faces blurred and no audio were presented to groups of young and elderly adults. One group of 41 participants (21 young adults and 20 elderly adults) were asked to label each of the videos as one of the following emotions: happy, sad, angry and neutral. A second group of 41 participants (20 young adults and 21 elderly adults) were asked to rate the following movement characteristics of the body gestures on seven-point Likert scales: (1) smoothness/jerkiness, (2) stiffness/looseness, (3) hard/soft, (4) fast/slow, (5) expanded/contracted, and (6) almost no action/a lot of action. The results with the first group showed that both the young and elderly adults were able to perform accurate emotion identification, however, the elderly adults had more overall error especially with respect to the negative emotions. With respect to movement characteristics, it was found that the angry body language was identified to have the jerkiest movements, followed by happy, while sad and neutral had the smoothest movements. In addition, angry was rated to have the stiffest movements followed by sad. Happy and neutral had the least stiff movements. Lastly, the body movements for happy and angry were found to be faster and have more action than those for sad and neutral. In [46], arm movements performing knocking and drinking actions which portray the ten affective states of afraid, angry, excited, happy, neutral, relaxed, strong, tired, sad and weak were presented as point-light animations to participants. Fourteen participants were asked to categorize each point-light animation as one of the aforementioned ten affective states. It was found that the level of activation of an affective state was more accurately recognized for the arm movements than pleasantness using a two-dimensional scale similar to the circumplex model [47].

In [31], Wallbott investigated the relationship between body movements and postures, and fundamental and social emotions. The movements and postures included collapsed/erected body postures, lifting of the shoulders, and head and arm/hand movements. Six female and six male professional actors performed 14 different emotions. Twelve drama students acted as expert coders to identify a sample of videos which had the most natural and recognizable emotions of the actors. Then these videos were coded by two trained observers. The 14 emotions considered were elated joy, happiness, sadness, despair, fear, terror, cold anger, hot anger, disgust, contempt, shame, guilt, pride and boredom. Inter-observer agreements of 75–99 % were found for the body movement categories representing the upper body, shoulders, head, arms and hands. Wallbott found that statistically significant relationships exist between specific movements and postures of the body, head and arms, and each of the 14 different emotions. For example, boredom can be characterized by a collapsed upper body, an upward tilted head, inexpansive movements, low movement activity and low movement dynamics. A discriminant analysis yielded 54 % correct classification across all the emotions, with shame having the highest correct classification at 81 %, followed by elated joy at 69 % and hot anger at 67 %, and with despair, terror and pride having the lowest classification percentages at 38 %.

In [32], de Meijer investigated the relationship between gross body movements and distinct emotions. The body movements studied included trunk and arm movements, movement force, velocity, directness, and overall sagittal and vertical movements. Eighty-five adult subjects were shown 96 videos of three actors performing these various body movements and asked to rate the compatibility of the body movements, on a four-point Likert scale, with respect to 12 emotions: interest, joy, sympathy, admiration, surprise, fear, grief, shame, anger, antipathy, contempt and disgust. The results showed that the participants rated the majority of the body movements as expressing at least one emotion. Furthermore, it was determined that a unique combination of body movements was utilized to predict each distinct emotion. For example, a stretching trunk movement while opening and raising the arms would lead to the subjects selecting the emotion joy.

The aforementioned literature review has shown the importance of emotional body language in recognizing the emotions displayed by people, especially in social settings. In particular, it has been determined that specific body poses, postures and movements can communicate distinct emotions. Therefore, in order to achieve effective social HRI, it is important for a socially interactive robot to be able to use body language to display its own emotions, which can then be appropriately interpreted by a person engaged in the interaction at hand.

2.2 Robot Display of Emotional Body Language

A number of robots have been designed to display specific emotions through dance, i.e., [24–30]. In particular, some researchers have utilized Laban body movement features from dance to generate robot emotions, i.e. [24–28]. Laban movement analysis investigates the correlation between a person’s body movements and his/her psychological condition [48]. For example, a movement that is strong, flexible and has a long duration gives a psychosomatic feeling of relaxation. The four major Laban movement features are defined as space, time, weight and flow [48]. Space relates to whole-body movements; it measures how direct, open and flexible the body movements are. The time feature determines the speed at which body movements travel spatially, i.e., if a body movement is sudden or sustained. Weight determines the energy associated with movements, i.e., if they are firm or gentle. The flow feature is concerned with the degree of liberation of movements, identifying if movements are free or bound. In [24], Laban features were utilized to create dancing motions for a mobile robot with 1 rotational degree of freedom (DOF) for each arm (two arms in total) and 1 DOF for head nodding. The robot performed six different dances, each displaying one of the following emotions: joy, surprised, sad, angry or no emotion. In [25] and [26], Laban dance features were used to define the motions of the small 17 DOFs KHR-2HV human-like robot for the emotions of pleasure, anger, sadness and relaxation. In particular, in [25], each of these emotions was attributed to only three distinct body movements which consisted of raising and lowering the arms. In [27], Laban dance theory was utilized to describe the body movements of a teddy bear robot. Arm and head motions of the robot were attributed with the emotions of joy, sadness, surprise, anger, fear and disgust. In [28], the 17 DOFs small humanoid Robovie-X robot generated dance movements to express the emotions of anger, sadness and pleasure based on Laban movement analysis and modern dance using its upper body, head, arms, hands, legs and feet.

Other robots have also been designed to mimic human emotional dance without utilizing Laban movement features, e.g. [29, 30]. For example, in [29], the Sony QRIO robot was used to imitate the dance motions of a person in real-time using moving region information, with the goal to create sympathy between a person and the robot. In [30], the Expressive Mobile Robot generated emotionally expressive body movements based on classical ballet using 7 DOFs in its arms, head and wheels. Experiments were conducted to see which body movements people found natural as well as which body movements depicted a feeling of interest by the robot.

A relatively small number of robots have also been developed to display emotions using body movements without incorporating emotional dance. For example, Keepon, a tele-operated chick-like robot, utilizes body movements consisting of bobbing, shaking and swaying to convey the emotions of excitement, fear and pleasure, respectively [49]. The robot has been designed for interactions with children diagnosed with autistic spectrum disorders. In [50], the design of an insect-like robotic head with two arm-like antennas was presented to express different emotions using exaggerated expressions of animated characters. Namely, the change in color of the eyes and antennas, the motion of the antennas and the eye emoticons can be used to display such emotions as anger, fear, surprise, sadness, joy and disgust. Examples of expressive antenna motions include the ends of the antennas being brought in front of the eyes for fear and swept backwards for surprise. In [51], the small humanoid robot Nao was utilized to express the emotions of anger, sadness, fear, pride, happiness and excitement through head movements in a range of different robot poses. The poses of the robot were designed based on motion capture information of a professional actor guided by a director. In [52], the human-like WE-4RII robot was used to display emotions using facial expressions and upper body movements (especially hand movements). The facial and body patterns to display for the emotions were based on recognition rates from a pre-experiment where several simulated patterns were presented to subjects. Both the posture and velocity of the body were used to display the emotions of neutral, disgust, fear, sadness, happiness, surprise and anger. In [53], the Nao robot was also utilized to generate the emotions of anger, fear, sadness and joy with body movements, sounds (i.e., crying, growling, banging), and eye colors (i.e., red for angry, dark green for fear, violet for sad, yellow for joy) in order to map these emotions onto the Pleasure–Arousal–Dominance (PAD) model. The authors stated that they used psychological research inspired by the work of Coulson [36] and de Meijer [32], TV shows and movies to link emotions to body movement, sound and color. The expressions also included dancing for the emotion of joy and saying “Jippie Yay” while the robot’s eyes turned yellow. The robot’s emotion expressions were first evaluated in a pre-test, and then each expressional cue was individually investigated in the experiments in order to determine the expressivity of each individual cue for each emotion. However, for these expressions, the authors did not specify which descriptors from Coulson and de Meijer they considered and for which emotions. Hence, it is not clear how the poses/movements of the small robot are directly linked to existing human psychological studies.

In general, the emotions of robots designed for HRI have mainly been derived from body movements from dance or robot-specific characteristics. For the latter group, robot-specific movements have usually been generated that cannot easily be generalized to other robots. With respect to emotional dance, the corresponding body movements are more appropriate for small robots that can have a larger workspace (i.e., table tops) during HRI, and cannot be effectively used for larger robots engaging in natural one-on-one social interactions, such as our robot Brian 2.0. To date, research into the use of emotions based on human body movements and postures for social interactions is non-existent for robotic applications with the exception of [53]. However, in [53], emotional dance is still incorporated into some of the small sized Nao’s emotional expressions and the link between the robot’s body language and human body language is not directly clear. Hence, our research explores the challenge of using natural human body movements and postures to represent social emotional behaviors for life-sized human-like robots in order for the robots to effectively communicate while building interpersonal relationships during one-on-one social interactions.

2.3 Human Perception of Robotic Body Language

A handful of researchers have primarily investigated human perception of robot body language in representing specific emotions. In [51], the head positions of Nao were utilized to investigate the creation of an affect space for body language. Twenty-six participants were asked to identify the emotions displayed by the robot, based on different head movements, as anger, sadness, fear, pride, happiness or excitement. Participants were also asked to rate the level of valence and arousal of each emotion utilizing a ten-point Likert scale. The results showed that a head-up position increased the recognition rates of the emotions of pride, happiness and excitement, and a head-down position increased the recognition rates of the emotions of anger and sadness. The position of the head was also found to be related to the perceived valence of the robot’s emotion but not to arousal. In [52], the human-like robot WE-4RII was utilized to determine how well participants could recognize the emotions of the robot utilizing facial expressions, body and hand movements. It was found that the participants recognized emotions more often when emotional hand movements were included with facial expressions and body movements. In [53], 67 participants were asked to identify which combination of body movements, sounds, and eye colors that the Nao robot displayed were most appropriate for the emotions of anger, fear, sadness and joy. Then another study was conducted with 42 participants, where the robot separately displayed body movements, sounds and eye colors for the same emotions. In this latter study, the participants were asked to assign a specific value within the PAD model for each of the individual expressions. It was found that body movements achieved the best results. In [54], one set of participants (which included amateurs and expert puppeteers) was asked to create simple non-articulated arm and head movements of a teddy bear robot for different scenarios. Another set of participants was asked to watch animations or videos of these robotic gestures and to judge the emotions that were displayed based on the simple movements created. The emotions that were available to the second set of participants to choose from included happy, interest, love, confused, embarrassed, sad, awkward, angry, surprised and neutral. The participants also rated the lifelikeness of the gestures and how much they liked the gestures. The results showed that emotions can be conveyed through simple head and arm movements for the teddy bear robot and that recognition rates increased when the participants were given the situational context for the gestures. The gestures for fear and disgust were found to be better understood when created by expert puppeteers rather than amateurs, however, this was not true for the other emotional movements. It was also found that positive emotions and more complex arm movements were rated as more lifelike.

Studies determining recognition rates of emotions based on the use of Laban body movements have also been conducted. For example, in [28], emotional dance for the three basic emotions of anger, sadness and pleasure was displayed by the small humanoid Robovie-X robot to two different groups of Japanese participants. In particular, a group of elderly individuals and a group of young individuals were asked to watch and identify each emotion displayed by the robot’s body movements. The results showed differences in the perception of emotion from robot body language between the two groups. The authors suggested that these differences are due to variations in the focus and cognition of the two groups when identifying the emotions such as their attention to different body parts and their perception of the magnitude and speed of the robot’s motions. Hence, body language of the robot should be designed with the consumer in mind. In [25], 33 subjects watched the KHR-2HV human-like robot’s movements and categorized these movements as being a weak or strong display of pleasure, anger, sadness or relaxation. The subjects first watched the robot display basic movements and then eight processed whole-body movements which represented the target emotions. The results showed that the subjects could identify the emotions of sadness, pleasure and anger for the movements but not relaxation, and that some emotions could easily be confused with each other such as pleasure with anger, and sadness with relaxation. In [27], 88 Japanese subjects were asked to identify the emotions related to the Laban body movements displayed by a teddy bear robot with 6 DOFs in the head and shoulders. The emotions were chosen from a list which included joy, anger, surprise, fear, disgust and sadness. They were also asked to rate on a four-point Likert scale how clearly the emotions were displayed. The results found that with simple arm and head movements, the emotions of joy, sadness, surprise and fear could be recognized. However, anger and disgust were not easily recognized by the subjects. In [24], 21 student participants were asked to judge the intensity and type of emotions (joy, surprised, sad, angry and no emotion) displayed by a mobile robot to determine correlations between these emotions and the robot’s effort and shape movement characteristics that are based on Laban movement features. Effort represents dynamic features of movement or quality of movement, whereas shape represents geometrical features of the overall body. The results showed that strong body movements were correlated with joy, and they were also correlated along with ascending and enclosing shape features to surprise. Weak body movements were correlated with sadness and an advancing body movement was correlated with angry.

Contrary to the robotic studies presented above, in this paper, we present a unique comparison study of the recognition rates of the emotional body language of our human-like social robot Brian 2.0 with the recognition rates of the same emotional body language displayed by a human actor in order to investigate the quality of the body language displayed by the robot. This will allow us to determine which body movements and postures can be generalized for the robot to display for a desired emotion, in addition to exploring whether human body language can be directly mapped onto an embodied life-sized human-like robot. We will use non-expert participants in our study, as it is intended that Brian 2.0 will be interacting with the general population. The body language features used in our work will be derived from the emotional body movements and postures defined by Wallbott [31] and de Meijer [32]. We will consider the emotions and corresponding body language that are applicable for social HRI scenarios. The emotions that will be investigated, herein, are happiness, sadness, boredom, interest, elated joy, surprise, fear and anger.

3 The Social Robot Brian 2.0

The human-like robot Brian 2.0 has similar functionalities to a human from the waist up (Fig. 2a). The dimensions of the upper body of the robot have been modeled after a male volunteer. The robot is able to display non-verbal body language via: (a) a 3 DOFs neck capable of expressing realistic head motions such as nodding up and down, shaking from side to side and cocking from shoulder to shoulder, (b) an upper torso consisting of a 2 DOFs waist allowing it to lean forward and backwards as well as turn side to side, and (c) two arms with 4 DOFs each: 2 DOFs at the shoulder, 1 DOF at the elbow and 1 DOF at the wrist. Utilizing these body parts, the robot is capable of displaying various human-like body movements and postures.

Fig. 2 13 DOFs Human-like Social Robot Brian 2.0

4 Emotional Body Language Features

As previously mentioned, since both body movements and postures are important cues for recognizing emotional states displayed by an individual, we focus on defining emotional body language for our robot Brian 2.0 that encompasses both these characteristics. This body language should be consistent with emotions that a robot would display during social HRI scenarios. In this work, the body language classifications of Wallbott [31] and de Meijer [32] are utilized to generate body language corresponding to the emotions of sadness, elated joy, anger, interest, fear, surprise, boredom and happiness. We have chosen to use this set of eight emotions as they provide a large variation across both the valence (positive and negative feelings) and arousal dimensions of affect. For example, sadness represents negative valence whereas elated joy represents positive valence; and boredom represents low arousal whereas surprise represents high arousal. Furthermore, these emotions are included within a group of emotions that psychologists define as social emotions [55–57]. Namely, social emotions, which can also include the basic emotions of happiness, sadness, fear and anger, serve a social and interpersonal function, where an individual’s relationship to another individual can be the central concern for these emotions [58–60]. Hence, these emotions involve the presence of a (real or virtual) social object which may include another person or a socially constructed self [61]. The set of eight emotions that we have chosen herein can be used by the robot to engage in social communication with a person in order to accomplish different interaction goals such as, for example, obtaining compliance or gathering information.

The body language descriptors used for the different emotions are presented in Table 1. The emotions of sadness, elated joy, boredom and happiness are derived from body movements defined by Wallbott [31] and the other four emotions of anger, interest, fear and surprise are derived from de Meijer’s work [32]. Body language classification was taken from both these works in order to allow us to accommodate the range of proposed emotions for our robot. The emotional body movements and postures chosen can be achieved based on the robot’s mobility specifications and include upper trunk, head and arm movements as well as the overall movement quality. Trunk movement is classified as either the stretching or bowing of the upper trunk, which the robot emulates by leaning forwards or backwards at the waist. The movement of the head consists of facing forwards, tilting backwards or facing downwards and is achieved via the robot’s 3 DOFs neck. The arm motions are defined as: (1) hanging—when resting at the sides of the robot, and (2) opening/closing—for opening, the arms start near the center of the robot and move outwards away from the body, while closing consists of the opposite motion. The overall direction of movement is also described as forwards, backwards, upwards and downwards based on the motion of the trunk, arms and head of the robot in these directions. The movement quality represents the overall speed, size and force of movements and is divided into three main categories [31]: (1) movement dynamics which refers to the energy, force or power in a movement; (2) movement activity which refers to the amount of movements, and (3) expansive or inexpansive movements which refer to the large or small spatial extension of the robot’s body.

Table 1 Body language descriptors for different emotions

In order to implement the emotional body language descriptors in Table 1, the kinematic model for Brian 2.0 (shown in Fig. 2b) is utilized. For example, to implement the descriptors for elated joy, the following joints are used. The revolute joint 1 rotates the robot’s trunk to an upright position, where the trunk is perpendicular to the ground to represent a stretched trunk posture. The two shoulder joints (joints 3 and 4, and 7 and 8) and elbow joints (joints 5 and 9) of each arm are used to move the arms of the robot in an upwards and outwards direction to mimic opening of the arms. Joint 12 is used to tilt the head back. The combination of the trunk, arms and head motion represents the overall upward motion of the robot. High movement activity is achieved by repeating the upwards motions several times. High movement dynamics are achieved by high joint velocities. Expansive movements increase the spatial workspace of the robot during the display of body language and are implemented through the motion of opening the arms as well as the rotating of both the trunk and head from left to right using joints 2 and 11.
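To make this mapping concrete, the following Python sketch translates the Table 1 descriptors for elated joy into joint-space keyframes using the joint numbering of Fig. 2b described above. It is an illustrative sketch only: the joint angle values, the neutral pose and the send_trajectory() call are assumptions, not Brian 2.0’s actual control software.

ELATED_JOY_DESCRIPTORS = {
    "trunk": "stretched",               # joint 1: trunk upright, perpendicular to the ground
    "arms": "opening, moving upwards",  # shoulder joints 3, 4 and 7, 8; elbow joints 5 and 9
    "head": "tilted back",              # joint 12
    "overall_direction": "upwards",
    "movement_activity": "high",        # repeat the upwards motion several times
    "movement_dynamics": "high",        # use high joint velocities
    "expansiveness": "expansive",       # rotate trunk and head side to side via joints 2 and 11
}

def elated_joy_keyframes(repetitions=3):
    """Return a list of joint-angle keyframes (hypothetical values, in radians)."""
    neutral = {joint: 0.0 for joint in range(1, 14)}  # 13 DOFs, all at an assumed neutral pose
    peak = dict(neutral)
    peak.update({
        1: 0.0,            # trunk upright (stretched)
        3: 1.2, 4: 0.8,    # one shoulder: raise and open the arm
        7: 1.2, 8: 0.8,    # other shoulder: raise and open the arm
        5: 0.3, 9: 0.3,    # elbows slightly bent
        12: -0.4,          # head tilted back
    })
    frames = []
    for _ in range(repetitions):  # high movement activity: repeat the upwards motion
        frames.extend([peak, neutral])
    return frames

# A controller call such as send_trajectory(elated_joy_keyframes(), speed="fast")
# (a hypothetical API) would then realize the high movement dynamics.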

5 Experiments

The first objective of the experiments was to determine if non-expert individuals would be able to identify emotions from the body language displayed by the human-like social robot Brian 2.0. The second objective of the experiments was to compare how individuals interpret the same emotional body language displayed by the robot and a human actor. Participants were asked to watch videos of both Brian 2.0 and an actor displaying the same emotional body language and then identify the corresponding emotion being displayed in each of the videos. The results were then analyzed to determine which emotions were recognized in both cases, and how the recognition results compared.

For the videos, the actor was instructed to perform the body language descriptors in Table 1 while keeping a neutral facial expression. He rehearsed the body movements and postures under the guidance of the authors prior to their videotaping. With respect to the robot, the neutral pose of the robot’s face was displayed throughout the videos by not actively controlling the robot’s facial actuators during the display of the body language.

5.1 Participants

A total of 50 (30 female and 20 male) participants took part in the overall study after accounting for dropouts. The participants ranged in age from 17 to 63 years with a mean age of 27.78 (SD = 9.13). The participants were all from North America, as was the human actor. None of the participants were familiar with social robots.

5.2 Procedure

Each participant logged on to a secure website that was developed by the researchers. On the website, the participants were able to watch separate videos of first the robot and then the human actor displaying the emotional body language defined in Table 1 in a random order. An initial pilot study with two groups of ten participants was performed prior to the experiment to determine whether the order of presentation of the robot and actor videos would influence recognition of the emotional body language displays. The results of a two-tailed Mann–Whitney U test performed on the recognition rates of the two groups indicated no significant order effects, U = 42, \(p = 0.579\). Based on this finding, we showed the videos of the robot first so that we could also initially focus on obtaining the results needed to address the first objective of the experiment, i.e. to determine if non-expert individuals would be able to identify emotions from the body language displayed by the robot. Emotional body movements and postures were displayed in the videos without any facial expressions for both the robot and actor. This procedure follows a similar approach used in other robot emotional body language studies, e.g. [27, 28, 51, 54]. We decided not to cover the robot’s/actor’s face when presenting the videos to the participants in order to be able to clearly show head movements and the different angles of the head that are significant descriptors for the emotions, as well as any interactions between the other body parts and the head. The participants were informed that the faces in the videos would be in a neutral emotional state. A forced-choice approach was utilized, where after the participants watched each video, they were asked to select the emotion they thought best described the video from the following list of eight possible emotions: sadness, fear, elated joy, surprise, anger, boredom, interest and happiness. The use of this type of forced-choice approach is very popular in studies on emotion recognition, e.g. [17, 42, 45, 46, 51]. Additionally, the forced-choice approach used herein has many advantages, including: (1) it allows for simple interpretation, i.e. it does not require the expert coding of open-ended questions [62], (2) it fits the categorical nature of emotions [62], and (3) by not including a “none of the above” option, it controls for participant bias, ensuring that data are collected from every participant [63, 64]. An emotion needed to be selected by the participant for each video in order for the next video to be displayed to him/her. Eight videos each were shown for the robot and for the actor.
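As an aside on the order-effect check described above, the following is a minimal Python sketch, using SciPy, of how such a two-tailed Mann–Whitney U test could be run on per-participant recognition scores; the two score lists below are hypothetical placeholders, not the pilot study’s data.

from scipy.stats import mannwhitneyu

# Number of correctly identified displays (out of 8) per participant in each pilot group
robot_first_scores = [6, 5, 7, 4, 6, 5, 6, 7, 5, 6]   # illustrative values only
actor_first_scores = [5, 6, 6, 5, 7, 4, 6, 6, 5, 7]

u_stat, p_value = mannwhitneyu(robot_first_scores, actor_first_scores, alternative="two-sided")
print(f"U = {u_stat}, p = {p_value:.3f}")   # no significant order effect if p > 0.05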

The average length of the videos was approximately 10 s, during which the appropriate body movements and postures were repeated three times. Example frames from each of the videos are shown in Figs. 3 and 4. The videos were recorded with a Nikon D7000 camera at 30 frames per second and a resolution of 1,280 by 720 pixels. The layout of the website was such that after each video was played, the list of possible emotions was presented to the participants directly to the left of the video, as shown in Fig. 5.

Fig. 3 Example frames of emotional body language displayed by Brian 2.0 for the eight emotions

Fig. 4 Example frames of emotional body language displayed by the human actor for the eight emotions showing similar movement profiles as the robot

Fig. 5 Example of the website layout for the emotional body language study

5.3 Data Analysis

A within-subjects experimental design was implemented. Confusion matrices were utilized to represent the recognition rates for the emotions for both the robot and human actor. A \(\chi ^{2}\) goodness of fit test was used to determine whether the emotions selected for the corresponding body language occurred at rates above random chance. A binomial test was utilized to determine if the desired emotion could be recognized more often than all other emotions for the respective body language.
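As an illustration of these two tests, the short Python sketch below applies SciPy’s chi-square goodness-of-fit and binomial tests to a hypothetical response distribution for a single body language display; the counts are placeholders rather than the study’s data, and binomtest requires SciPy 1.7 or later.

import numpy as np
from scipy.stats import chisquare, binomtest

N = 50                                            # responses per body language display
observed = np.array([42, 3, 2, 1, 1, 1, 0, 0])    # hypothetical counts over the 8 emotion labels

# Chi-square goodness of fit against a uniform (random-chance) distribution (df = 7)
chi2, p_chi2 = chisquare(observed)                # expected frequencies default to N/8 per label
print(f"chi2(df = 7, N = {N}) = {chi2:.2f}, p = {p_chi2:.3g}")

# Binomial test: is the desired emotion chosen more often than all other emotions combined?
result = binomtest(k=int(observed[0]), n=N, p=0.5, alternative="greater")
print(f"binomial test p = {result.pvalue:.3g}")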

A direct comparison study with respect to the recognition rates for the robot and human actor was conducted to determine the feasibility of using the chosen body language for the human-like social robot. A McNemar test was implemented to test whether there was a significant difference between the recognition rates for the robot and the human actor. The null hypothesis used for the McNemar test was defined as: the emotion recognition rates for both the robot and human actor are the same.
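For completeness, the following sketch shows how such a McNemar test could be computed with the statsmodels library from a 2 × 2 table of paired per-participant outcomes; the counts are hypothetical placeholders, not values from the study.

from statsmodels.stats.contingency_tables import mcnemar

# Paired outcomes for one emotion: rows = actor correct/incorrect, columns = robot correct/incorrect
table = [[20, 15],
         [5, 10]]                        # illustrative counts only

result = mcnemar(table, exact=True)      # exact binomial form, suited to small discordant counts
print(f"statistic = {result.statistic}, p-value = {result.pvalue:.3f}")
# The null hypothesis of equal recognition rates is rejected if the p-value is below 0.05.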

5.4 Experimental Results

5.4.1 Identifying the Emotional Body Language Displayed by the Human-like Robot Brian 2.0

The recognition rates for the emotions displayed by the robot are presented in the confusion matrix in Table 2. Rows in Table 2 represent the emotions chosen by the participants and columns represent the true labeled emotions. Sadness had the highest recognition rate at 84 %, followed by surprise with a recognition rate of 82 %. Anger and elated joy had recognition rates of 76 and 72 %, while boredom and interest had rates of 56 and 38 %, respectively. The emotions with the lowest recognition rates were fear with a rate of 26 % and happiness with a rate of 20 %. It is interesting to note that the body language for happiness was most often recognized as interest, and the body language displayed for fear was recognized equally often as fear and as boredom. Interest had the highest frequency of incorrect recognitions across all the true labeled emotions, at 11 %.

Table 2 Confusion matrix for the emotions of the robot
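The following short sketch, using placeholder counts rather than Table 2’s actual values, illustrates how per-emotion recognition rates are read off such a confusion matrix: the diagonal entry divided by the column total for each true labeled emotion.

import numpy as np

emotions = ["sadness", "elated joy", "anger", "interest",
            "fear", "surprise", "boredom", "happiness"]
rng = np.random.default_rng(0)
confusion = rng.multinomial(50, [1 / 8] * 8, size=8).T   # placeholder counts; each column sums to 50

recognition_rates = np.diag(confusion) / confusion.sum(axis=0)
for emotion, rate in zip(emotions, recognition_rates):
    print(f"{emotion}: {rate:.0%}")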

A \(\chi ^{2}\) goodness of fit test with \(\alpha =0.05\) was utilized to determine if the emotions recognized from the observed body language were due to random chance. The results of the \(\chi ^{2}\) test are as follows for each of the emotions:

  • sadness: \(\chi ^{2}\) (df = 7, N = 50) = 237.04, \(p <0.001\);

  • elated joy: \(\chi ^{2}\) (df = 7, N = 50) = 171.44, \(p <0.001\);

  • anger: \(\chi ^{2}\) (df = 7, N = 50) = 186.48, \(p <0.001\);

  • interest: \(\chi ^{2}\) (df = 7, N = 50) = 57.20, \(p <0.001\);

  • fear: \(\chi ^{2}\) (df = 7, N = 50) = 32.24, \(p<0.001\);

  • surprise: \(\chi ^{2}\) (df = 7, N = 50) = 227.44, \(p<0.001\);

  • boredom: \(\chi ^{2}\) (df = 7, N = 50) = 133.68, \(p <0.001\); and

  • happiness: \(\chi ^{2}\) (df = 7, N = 50) = 37.68, \(p<0.001\).

Hence, the emotions for each of the eight displays of body language were chosen significantly above random chance.

It was hypothesized that the emotional body language movements and postures displayed by the robot would be recognized as their corresponding desired emotion more often than the other seven emotions. We utilized a binomial test to examine this hypothesis. Namely, the null hypothesis is that the desired emotion will be recognized at the same or a lower frequency than the other emotions, i.e., \(p_{1} \le 0.5\). The results of the binomial test are presented in Table 3. It can be concluded that with 95 % confidence the desired emotions of sadness, elated joy, anger and surprise are recognized significantly more often than any of the other emotions. A 75 % confidence level was found for the emotion of boredom being recognized significantly more often than the other emotions. However, the emotions of interest, fear and happiness were not recognized significantly more often than the other emotions. Interest was the emotion most often chosen by the participants for the body language corresponding to the desired emotion of happiness.

Table 3 Results of binomial test for the recognized emotions of the robot

5.4.2 Identifying the Emotional Body Language Displayed by the Human Actor

The recognition results for the emotions displayed by the human actor are presented in the confusion matrix in Table 4. As can be seen by the results, anger had the highest recognition rate of 100 % followed by boredom which had a recognition rate of 86 %. Emotions such as fear, surprise, elated joy and interest had recognition rates of 70, 66, 60, and 56 %, respectively. The emotions with the lowest recognition rates were sadness with a rate of 34 % and happiness with a rate of only 2 %. Similar to the recognition rates with respect to the robot, happiness was again considered to be the least recognized emotion from the corresponding body language. Only one participant chose happiness based on its described body language. From the results, it can be seen that the body language for sadness and happiness were more often recognized as boredom. Hence, boredom had the highest frequency of incorrect recognitions across all the emotions at 18.5 %.

Table 4 Confusion matrix for the emotions of the human actor

A \(\chi ^{2}\) goodness of fit test with \(\alpha =0.05\) was implemented to determine if the observed emotions were chosen at a rate higher than random chance for the human actor. The test was applied to all the emotions except anger, as anger had a 100 % recognition rate for the human actor. The results of the \(\chi ^{2}\) test are as follows:

  • sadness: \(\chi ^{2}\) (df = 7, N = 50) = 150.32, \(p< 0.001\);

  • elated joy: \(\chi ^{2}\) (df = 7, N = 50) = 117.68, \(p< 0.001\);

  • interest: \(\chi ^{2}\) (df = 7, N = 50) = 102.96, \(p< 0.001\);

  • fear: \(\chi ^{2}\) (df = 7, N = 50) = 169.84, \(p < 0.001\);

  • surprise: \(\chi ^{2}\) (df = 7, N = 50) = 160.88, \(p< 0.001\);

  • boredom: \(\chi ^{2}\) (df = 7, N = 50) = 251.76, \(p< 0.001\); and

  • happiness: \(\chi ^{2}\) (df = 7, N = 50) = 201.52, \(p < 0.001\).

Hence, the emotions for these seven displays of body language were chosen significantly above random chance.

Similar to the robot emotions, it was hypothesized that the emotional body language features displayed by the actor would be recognized as their corresponding desired emotion more often than the other seven emotions. The results of the binomial test are presented in Table 5. With 95 % confidence the desired emotions of anger, fear, surprise and boredom can be recognized significantly more often than any of the other emotions. Confidence levels of 89 and 75 % were found for the desired emotions of elated joy and interest, respectively. On the other hand, the emotions of sadness and happiness were not recognized significantly more often than the other emotions. In particular, for both the desired emotions of happiness and sadness, the most recognized emotion by the participants, based on the corresponding body language, was boredom.

Table 5 Results of binomial test for the recognized emotions of the human actor

5.4.3 Comparison

Figure 6 presents a direct comparison for the emotion recognition rates for the robot and human actor. From the figure, it can be seen that the recognition rates were higher for the human actor for the emotions of anger, interest, fear and boredom, while the robot had higher recognition rates for the emotions of sadness, elated joy, surprise and happiness.

Fig. 6 Comparison of recognition rates for robot and human actor

McNemar’s two-tailed test for paired proportions was used to statistically compare the recognition results from the robot and human actor. The null hypothesis was defined as the difference between the recognition rates, \(p_{1}\) for the human actor and \(p_{2}\) for the robot, being zero. The first alternative hypothesis was that the emotion recognition rates of the body language for the human actor are higher than those for the robot, and the second alternative hypothesis was that the recognition rates for the robot are higher than those for the human actor. The \(2 \times 2\) contingency tables comparing the recognition results of the desired emotions of the robot and actor with respect to the other emotions are presented in Table 6 with the McNemar test results presented in Table 7. Significance testing was conducted using \(\alpha =0.05\). The emotions for which the null hypothesis could not be rejected were elated joy, interest and surprise. Hence, there was no statistical difference between the recognition rates for the robot and human actor for these emotions. For all other emotions, the null hypothesis was rejected. In particular, there is a statistically significant difference in the recognition results for the robot and human actor for the five remaining emotions. Namely, the robot has higher recognition rates for sadness and happiness, while the human actor has higher recognition rates for the emotions of anger, fear and boredom.

Table 6 Contingency tables for the recognition results of both the robot and human actor
Table 7 McNemar significance results for the robot and human actor recognition rates

6 Discussions

The recognition results for the human-like social robot showed that participants were able to recognize the emotional body language for sadness, elated joy, anger, surprise and boredom, as defined by Wallbott [31] and de Meijer [32], with rates over 55 %. All these emotions had recognition rates significantly above random chance with respect to all other emotions for the same body language. The body language for the emotion of fear was recognized by the participants as fear and as boredom with the exact same frequency. This can be a difficult emotion for the robot to express based on the defined body movements and postures due to the rigidity of the robot’s body. For example, the rigid body of the robot does not allow it to easily curl in the shoulders and bend the back the way a human would for this particular emotion (see Fig. 4). Furthermore, it is difficult for the robot to mimic the tensing of the muscles in the body to represent the force and energy of the highly dynamic movements for this particular emotion. This made the recognition of this emotion more challenging for the participants. Furthermore, as the emotional body language for fear required the robot to turn its head away and bow its trunk, some participants interpreted this as the robot displaying boredom.

For the robot, the body language for the desired emotion of happiness was recognized more often as interest and boredom. For the actor, the body language for the desired emotion of happiness was recognized most often as boredom. These other emotions share similar descriptors with happiness, such as stretching the trunk and low movement dynamics, which could contribute to the confusion. In general, the body language for happiness had low recognition rates for both the robot and human actor, although the recognition rates were significantly higher for the robot than the actor. Unlike the robot, during this body language display, the actor also had his hands in his pockets for approximately half of the duration of the video. Hands in the pockets have been found to be perceived as a number of different affective states including calm and easygoing [65], casual attitude [66], relief [67], and sad [67]. Hence, this particular gesture may have also resulted in the majority of the participants recognizing this body language display as boredom for the actor. The similarity in descriptors can also be the reason why the robot’s emotional body language for interest was recognized as happiness by 32 % of the participants. Hence, alternative body language descriptors may need to be considered and tested for the emotion of happiness. The challenge will be to identify potential descriptors for happiness for the robot that are also distinct from those used for elated joy, where both emotions have positive valence, but the latter has higher arousal. Wallbott [31] is, to the authors’ knowledge, the only researcher who provides specific human body language descriptors for the emotions of happiness, elated joy, boredom and interest. In our study, we used Wallbott’s descriptors for the first three emotions and descriptors from de Meijer for the emotion of interest. For interest, Wallbott’s body language descriptors are similar to those defined by de Meijer, with the exception that de Meijer also included descriptors that describe the direction and dynamics of body movements for this emotion. The inclusion of other modes such as facial expressions may also need to be considered for happiness. For example, it has been shown in several studies that a universal human facial expression for happiness includes such descriptors as raising the cheeks and moving the corners of the mouth upwards [68]; hence, adding such descriptors to the body language for happiness might be necessary in order to increase recognition rates for this particular emotion for the robot.

The recognition results for the actor showed that the participants most often associated the body language for sadness with boredom; however, this was not the case for the robot. For the robot, the body language for the desired emotion of sadness was recognized significantly more often as sadness than as any of the other emotions. From the comparison study, it was determined that the desired emotion of sadness was recognized at significantly higher rates for the robot than for the actor. This may be a result of the difference in the head positions of the robot and the actor during the videos. On average, the robot’s head was facing more downwards than the actor’s head while displaying the body movements for sadness, as the robot was not able to slouch its shoulders. Studies by both Darwin [69] and Bull [70] have found that dropping/hanging the head is related to the emotion of sadness. The emotions of interest and surprise had statistically similar recognition results for the robot and the actor; this was because the robot was able to easily replicate the body movements for these emotions and did so in a manner similar to the actor. For the emotion of elated joy, because each of the robot’s shoulders has one fewer rotational degree of freedom than a human shoulder, the robot generated the opening and upwards arm movements by also moving its upper arms outwards, whereas the actor lifted his upper arms directly forwards. Despite this difference, the recognition results were statistically similar to those of the human actor. The emotional body language for anger, boredom and fear was recognized at statistically higher rates for the actor; this may be a result of the robot not being able to directly mimic the tensing of the muscles (for anger and fear) or the curling in of the shoulders and bending of the back (for boredom and fear), as previously mentioned.
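One simple way to compare the robot’s and the actor’s recognition rates for a given emotion is a test on a 2×2 table of correct versus incorrect responses, for example Fisher’s exact test. The counts below are hypothetical, and the statistical procedure actually used for the comparison study may differ.

```python
# Sketch of a robot-vs-actor comparison for one emotion (hypothetical counts).
from scipy.stats import fisher_exact

# Rows: robot, actor; columns: recognized correctly, not recognized.
table = [[22, 8],   # robot: 22 of 30 participants chose the intended label
         [14, 16]]  # actor: 14 of 30 participants chose the intended label

odds_ratio, p_value = fisher_exact(table, alternative="two-sided")
print(f"odds ratio = {odds_ratio:.2f}, p = {p_value:.4f}")
```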

As both the robot’s and the actor’s faces were visible in the videos, the lack of facial expressions could have influenced the recognition rates for the emotions, even though the participants were informed that only emotional body language, without any facial expressions, was displayed in both sets of videos. Namely, this might have been one reason why happiness had low recognition rates for both the robot and the actor. It could also have caused the confusion between fear and boredom when the corresponding body language was displayed by the robot. Since the robot’s eyes did not move independently of its head, the robot did not maintain eye contact with the camera to the same extent as the actor did for the emotional body language displays of sadness and surprise. For the display of sadness, due to its more downwards head pose, the robot averted its gaze from the camera for 89 % of the video, while the actor averted his gaze for 55 % of the video. As previously mentioned, this more downwards head pose of the robot, and therefore its averted gaze, may be a reason why its display of sadness had a higher recognition rate. For the display of surprise, due to the range of motion of the robot’s body, the robot averted its gaze for 95 % of the video, while the actor did not avert his eyes. Despite this difference in eye gaze, the recognition rate for surprise for the robot was statistically similar to that for the actor. Although, when comparing Figs. 3 and 4, the robot’s body language for fear and happiness appears to have slightly more instances of averted gaze than the human’s, the overall amounts of time that Brian 2.0 and the actor had averted gazes during their respective videos for these emotions were within 10 % of each other. Previous studies have shown that eye gaze direction does not directly influence the recognition of emotions displayed by facial expressions [71] and that the two are processed independently [72]; however, to the authors’ knowledge, no studies have investigated the direct influence of eye gaze on the recognition of emotional body language. Therefore, this relationship should be further explored in future work.
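The gaze-aversion percentages reported above can be computed by annotating each video frame as averted or not and taking the averted share of the total duration. A minimal sketch, with hypothetical per-frame labels, is given below.

```python
# Sketch of computing the percentage of a video in which gaze is averted,
# given hypothetical per-frame annotations (True = gaze averted from camera).
def averted_gaze_percentage(frame_labels):
    """Return the share of frames (as a percentage) labelled as averted."""
    if not frame_labels:
        return 0.0
    return 100.0 * sum(frame_labels) / len(frame_labels)

# Hypothetical annotation of a short clip sampled at a fixed frame rate.
labels = [True] * 89 + [False] * 11
print(f"averted gaze: {averted_gaze_percentage(labels):.0f}% of the video")
```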

The recognition rates for the robot were also compared to the recognition rates that Wallbott obtained in [31] for the same body language descriptors used for happiness, sadness, boredom and elated joy to provide further insight. Unfortunately, a similar comparison could not be conducted for the emotions obtained from de Meijer’s descriptors, as recognition rates were not provided in [32]. Brian 2.0’s recognition rates for elated joy and boredom were within 10 % of the recognition results that Wallbott observed for these emotions in [31]. Sadness also had high recognition rates in both our robot study and Wallbott’s study. In [31], happiness had a good recognition rate, being distinguishable from all the other emotions except contempt, an emotion we did not consider in our robot study.

Overall, the experimental results showed that the body language descriptors were effective in displaying the emotions of sadness, elated joy, anger, surprise and boredom for our social human-like robot Brian 2.0, warranting the potential use of these social emotions and the corresponding body language for the robot in natural and social HRI settings. On the other hand, the body language for the emotions of happiness, fear and interest was not well recognized for the robot.

While previous studies have compared human and artificial displays of emotional facial expressions and have shown that the latter can also be recognized effectively (though with lower recognition rates than the human displays) [23, 73, 74], our comparison study is novel in that it focuses on a robot’s display of emotional body language. In general, the work presented in this paper can be used as a reference when determining the emotional body language of other life-sized human-like robots or androids. With respect to android body language, it has been stated that there has been little active research in this area [75].

7 Conclusions

Our research focuses on robotic affective communication as displayed through body language during social HRI scenarios. Namely, in this paper, we investigate the use of emotional body language for our human-like social robot Brian 2.0, utilizing body movement and posture descriptors identified in human emotion research. The body language descriptors we explore for the robot are based on trunk, head and arm movements as well as overall movement quality. Experiments were conducted to: (1) determine whether non-expert individuals would be able to identify the eight social emotions of sadness, fear, elated joy, surprise, anger, boredom, interest and happiness from the display of Brian 2.0’s body language, which was derived from a combination of human body language descriptors; and (2) compare how individuals interpret the same emotional body language descriptors displayed by the social robot, which has fewer degrees of freedom, and by a human actor, in order to determine whether the desired emotions can be communicated by the robot as effectively as by a human. Experimental results showed that participants were able to recognize the robot’s emotional body language for sadness, elated joy, anger, surprise and boredom with high recognition rates. Even though the robot was not able to implement some body movement features due to its rigid body, the participants were still able to recognize the majority of the emotions. When comparing the recognition rates, it was determined that the emotion of sadness was recognized at significantly higher rates for the robot than for the human actor, while the robot and actor had similar recognition rates for elated joy, surprise and interest. Both the robot and the actor had the lowest recognition rates for the emotion of happiness, due to its similarity in body movement features to other emotions. Only the emotions of anger, fear and boredom were recognized at significantly higher rates for the human actor. Overall, these experimental findings demonstrate that certain human-based body movements and postures that represent social emotions can be effectively displayed by a life-sized human-like robot. Our future work will consist of integrating the robot’s emotional body language with other natural communication modes we have been working on, such as facial expressions and vocal intonation, in order to develop and test a multi-modal emotional communication system for the social robot.