Facial expressions of emotion have always drawn the attention of researchers, primarily because of their importance in understanding human behavior in general and emotions in particular. Since the beginning of human evolution, facial expressions have been considered the most significant nonverbal channel for communicating emotions. These expressions are relevant not only for the communication of emotions among humans but also across species; indeed, Darwin (1872/1998) argued that human emotional expressions evolved from those of animals.

The psychological theory of emotional expression arguably began with Darwin's seminal work The Expression of the Emotions in Man and Animals (1872), grounded in his theory of evolution. Since then, facial expressions have been studied from multiple theoretical and empirical perspectives, from evolutionary theory to the computational sciences. The present chapter examines the theoretical and empirical approaches utilized in research on facial expressions of emotion.

Two theoretical approaches have dominated research in this area: the evolutionary-biological approach and the sociocultural approach. While researchers have presented evidence in favor of each, they have generally argued over which single perspective can conclusively explain the multitude of findings on facial expressions of emotion. Both approaches have been developed on the basis of observers' responses (as an outcome measure) to different facial expressions of emotion. The evolutionary-biological approach holds that emotions are biologically triggered, as proposed by Darwin, a view further supported by numerous biological and neuroscientific findings. The sociocultural approach, on the other hand, regards social construction as the basis for the development of facial expressions of emotion. In this chapter, we will revisit some of the evidence underlying the evolutionary-biological perspective on emotion expression, which led to the development of the universality thesis. We will then revisit studies that oppose the universality thesis and instead advocate culture-specific influences on the expression and recognition of emotions. The chapter will also discuss the attempts of the interactionist perspective to resolve these theoretical debates, with emphasis on the in-group advantage in facial expressions of emotion.

Applying a theory requires resolving its theoretical underpinnings so that the concept can be translated into measurable constructs. Researchers have widely attempted to resolve the contradictory findings and unresolved debates about the origin and measurement of facial expressions of emotion by transforming these theoretical underpinnings into measurement-based approaches, yet this has often produced more contradictions than solutions. The major measurement approaches include the anatomical and computational perspectives. The present chapter highlights these perspectives, in which facial expressions are treated as predictor measures. The anatomical perspective emphasizes that the anatomy of the facial muscles is responsible for the production of facial expressions of emotion. This, in turn, provides the basis for computational modeling of the face, from automated recognition to endowing virtual characters, for example, animated characters and virtual avatars, with facial expressions of emotion.

The universality of facial expressions of emotion has been formulated on the basis of recognition accuracy above chance level. Similarities in facial expressions have been attributed to an innate biological basis, whereas differences have been considered the result of differences in sociocultural factors. It is believed that, since similar biological structures are shared by individuals across cultures, the localization of specific behaviors in the brain may also be universally similar. The facial expression of emotion and the experience of emotion have been treated as automatically associated processes, except in lying and deceptive behaviors. The brain, behavioral, and computational sciences may not be capable of understanding behavior in general, and facial expressions in particular, from a unidirectional perspective. Rather, these perspectives are complementary to each other; for example, observable changes in facial expressions are the result of activation in the neural architecture. Further, these muscular changes can be coded systematically through an anatomical measure (FACS; Ekman and Friesen 1978). Hence, to understand the incongruence between observable behaviors and neural activations, a comprehensive, interdisciplinary approach would be more suitable. In the end, the present chapter proposes an integrative cultural–computational neuroscience perspective for understanding facial expressions of emotion.

Most research on facial expressions has followed a unidimensional perspective. However, in the recent past, attempts have been made to understand facial expressions through interdisciplinary perspectives. The ultimate aim of each perspective is to achieve generalizability and objectivity in the study of facial expressions of emotion while acknowledging differences. This chapter presents the major perspectives utilized to understand the basic issues of facial expressions of emotion.

1.1 Theoretical Approaches

1.1.1 Evolutionary-Biological Perspective: Universality of Facial Expressions of Emotions

The evolutionary-biological perspective on emotions began with Darwin's (1872) evolutionary theory of emotions. It suggests that expressions of emotion help regulate social interaction and increase the likelihood of survival (see Westen 1996). Knapp (1963) later noted that "emotional phenomena were among the evolving attributes of man which had developed like man himself from antecedents in his animal forebears" (p. 5). Since then, the unique patterns of neural and physiological activity that accompany different emotions have been a central subject in the study of human emotion. The evolutionary-biological perspective was further supported by Izard (1971, 1994) and Ekman (1984), who found that individuals across cultures display the same facial expressions when experiencing the same emotion, so long as culture-specific display rules do not interfere. Ekman (1972) suggested that emotions are expressed in a universally uniform manner through different combinations of facial muscles, governed by a dedicated subset of the neural network. Ekman followed Darwin's universality thesis and explained emotions as the result of a facial affect program that may be modulated by cultural display rules. To address the nature–nurture debate, studies have compared the facial expressions of congenitally blind and sighted people. Dumas (1932) found that congenitally blind people produce spontaneous expressions adequately, similar to people with normal eyesight, but are unable to produce posed expressions adequately. Matsumoto and Willingham (2009) compared the expressions of congenitally and noncongenitally blind athletes at the 2004 Paralympic Games with those of sighted athletes at the 2004 Olympic Games and found no significant differences in facial emotion configurations.

Empirical evidence for the universality thesis comes largely from a series of cross-cultural judgement studies conducted by Ekman and others. Universality refers to the accurate recognition of facial expressions across cultures at better-than-chance levels (Ekman et al. 1987). Ekman and colleagues conducted a series of experimental studies and concluded that emotions are recognized cross-culturally in Western and Oriental populations (Ekman 1972; Ekman et al. 1969, 1987; Izard 1971), and in literate and preliterate populations (Boucher and Carlson 1980; Ekman et al. 1969; Ekman and Friesen 1971). To eliminate the effects of exposure and familiarity, Ekman and Friesen (1971) studied the isolated and preliterate South Fore and Dani people of New Guinea. Participants were given different situations (e.g., "pretend your son has died") and were asked to express themselves; photographs of these expressions were then shown to Western literate populations. In a later study, high recognition accuracy was found across ten different cultures (Estonia, Germany, Greece, Hong Kong, Italy, Japan, Scotland, Sumatra, Turkey, and the United States) for all emotions except sadness (Ekman et al. 1987). In another study, literate participants from Hungary, Japan, Poland, Sumatra, the United States, and Vietnam were shown photographs of facial expressions of emotion, and a high degree of agreement was recorded in recognition across the cultural groups (Biehl et al. 1997). Izard (1971) conducted a multination study among American, European, African, Indian, and Japanese observers judging facial expressions of emotion, and over 78 % cross-cultural agreement in recognition accuracy was observed (for details of the universality thesis, see the chapter by Hwang and Matsumoto).
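To make the better-than-chance criterion concrete, consider a minimal worked example (the numbers below are hypothetical and not taken from the studies cited above): in a forced-choice judgement task with k response labels, chance accuracy is 1/k, and an observed accuracy can be tested against it with a binomial approximation.

% Chance level in a k-alternative forced-choice judgement task:
\[ p_0 = \frac{1}{k}, \qquad k = 6 \;\Rightarrow\; p_0 \approx 0.167 . \]
% Approximate binomial z-test of an observed accuracy \(\hat{p}\)
% over n judgements (illustrative values: \(\hat{p} = 0.78\), n = 100):
\[ z = \frac{\hat{p} - p_0}{\sqrt{p_0 (1 - p_0)/n}}
     = \frac{0.78 - 0.167}{\sqrt{0.167 \times 0.833 / 100}} \approx 16.4 , \]
% far beyond conventional significance thresholds, i.e., well above chance.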

However, universality theory has been criticized on the basis of cross-cultural studies suggesting that facial expressions are not universal but differ across cultures (Russell 1994). Indeed, universality theorists (Ekman 1972; Izard 1971) emphasized the similarities of facial expressions and their recognition across cultures but failed to address the differences between cultures (Matsumoto and Assar 1992). Although the universality thesis has faced several criticisms, Shariff and Tracy (2011, p. 407) have defended it, describing facial expressions of emotion as "easily recognizable signals" even in "geographically and culturally isolated populations." To accommodate cultural differences within a universal framework, Ekman (1982) proposed a neurocultural theory, which suggests that the facial expression of emotion is the outcome of an elicitor that triggers the innate facial affect program. Universality theorists propose that elicitors may change from one culture to another (as a function of the social situation and prevalent norms), but that the facial behavior in response to a given situation conveys the same meaning in all cultures. Ekman therefore proposed a set of primary/basic expressions of emotion (happiness, sadness, fear, anger, surprise, and disgust) whose recognition he believed to be universal.

1.1.2 Sociocultural Perspective: Culture Specificity of Facial Expressions of Emotions

Although the relationship between emotion and neurobiological processes has been established beyond doubt, the social significance and origins of emotions cannot be overlooked. The universalist claim that facial expressions of emotion are expressed and understood pan-culturally has been challenged by cultural psychologists and social constructivists. It is commonly believed that an individual's emotional response is often guided by the evaluation of his or her social situation. Oatley et al. (2006) posit that all emotions are social in nature because they evolved from the need of individuals to deal with the complexities of human social life.

The basic assumption of the sociocultural approach is that emotions result from the socialization process and are constructed primarily by culture, which shapes how emotions are elicited, expressed, and valued through cultural beliefs and practices. Individuals learn to express subjective feelings through socialization, in the course of which they learn about self and others through different social and emotional communications (Markus and Kitayama 1991; Mesquita and Albert 2007). The evolutionary-biological perspective holds that emotions are innate and universal, whereas the sociocultural approach concedes only that some elements of emotion may be universal. Findings in support of both approaches also reveal remarkable cultural differences in emotions, which are learned according to culture-specific meanings of identity, morality, and social structure (Averill 1985; Mesquita 2003; Shweder and Haidt 2000). The sociocultural environment influences one's expressions of emotion in a systematic manner right from birth. Socialization differs from one culture to another, since cultures differ in their characteristics and nature, and these differences further shape an individual's expressions of emotion. Researchers believe that emotions are not solely the result of innate programs but also reflect different types of cultural variation across their components (e.g., Frijda 1986; Mesquita and Frijda 1992; Scherer 1984). Kitayama and Markus (1994) suggested that emotions depend upon cultural situations and cannot be separated from cultural influences. The individualistic and collectivistic characteristics of cultures have also been studied as variables influencing facial expression and recognition of emotion.

Mead (1975) described the importance of culture in emotions, explaining that nature is not the only factor responsible for them. The cultural school of thought grew as a challenge to the "universal" (i.e., emotion is a basic function of human beings that is relatively invariant across cultures) and "differential" (i.e., emotions are differentiated on the basis of accompanying physiological response patterning) theories of emotional experience (Scherer and Wallbott 1994). Russell (1994) argued in favor of culture specificity in the recognition of facial emotion. His arguments were based on the observations that (a) recognition accuracy for facial emotions is not equal across all cultures and (b) there are cultural variations in the semantic attributions made to facial expressions of emotion. In a meta-analysis of emotion studies, Russell (1994) concluded that facial emotion recognition accuracy differs from one culture to another. Certain emotions, such as happiness and sadness, are recognized with comparable accuracy across cultures, whereas emotions such as fear and anger are not (Mandal et al. 1986; Russell 1994). Evidence from several studies (see Biehl et al. 1997; Elfenbein and Ambady 2002; Mandal et al. 1996; Russell 1994; van Hemert et al. 2007) reveals cross-cultural differences in facial expression recognition. The universal affect program for facial expressions may thus be qualified by differences across cultures; as contact between cultures increases, familiarity grows (Elfenbein and Ambady 2003b). Individuals are able to recognize familiar faces easily across large variations in image quality, though our ability to match unfamiliar faces is strikingly poor (Burton et al. 2005). Kitayama and Markus (1994) illustrated that emotions depend upon the dominant cultural frame.

1.1.3 Interactionist Perspective: In-Group Advantage

This perspective suggests that facial expressions of emotion result from the interaction between biological and social/cultural factors: there are certain innate programs that are molded by social/cultural determinants. It seems clear that there are strong innate components of facial expressions, as well as cultural rules that exert a strong influence on their expression and recognition. Young-Browne et al. (1977) tested the ability of 3-month-old infants to discriminate facial expressions and found significant differences between control and experimental groups. Infants were able to discriminate surprise from happiness, and sometimes from fear; fear and surprise expressions have been found to create the most confusion in discrimination tasks. Developmental studies suggest that infants initially express emotions in extreme ways and, developing in a social environment, learn to modulate, minimize, and exaggerate expressions according to social demands. Infants follow the expressions of their caregivers and an innate program of facial expression that is later modulated by cultural display rules.

Though many studies have been conducted to resolve the controversy between universality and culture specificity, little progress has been made toward a consensus. One major reason for this state of affairs is probably the reliance on a single set of photographs depicting facial expressions of different emotions. Such a set is usually shown to observers of different cultures, and the responses are examined to draw conclusions about culture specificity or universality. Relatively few studies have used sets of photographs displaying facial emotions from different cultures to examine the universality hypothesis within a given culture. This approach is important because recognition accuracy for the "in-group" (same ethnic group) and "out-group" (different ethnic groups) can be compared to isolate the elements of universality and culture specificity. It is therefore believed that facial emotions "are a combination of biologically innate, universal expressions and culturally learned rules for expression management" (Matsumoto et al. 1998, p. 148). Dailey et al. (2010) reproduced the in-group advantage using a computational model in Japanese and American cultural contexts.

Elfenbein and Ambady (2002) reported that emotions are universally recognized at better-than-chance accuracy but that, within cultures, there is an "in-group advantage": recognition accuracy is higher when the perceiver and expresser share the same cultural background. Individuals recognize facial expressions displayed by members of their own culture (in-group) more accurately than those displayed by members of other cultures (out-group) (Beaupre and Hess 2006; Elfenbein and Ambady 2002; Thibault et al. 2006), and they respond faster when judging the emotional expressions of in-group members (Elfenbein and Ambady 2003b). This greater accuracy of facial emotion recognition for one's own culture is referred to as the in-group advantage, and Elfenbein and Ambady's (2002) meta-analysis confirmed its existence in the recognition of facial expressions of emotion (for details of the in-group advantage, see the chapter by Elfenbein in this volume).

Elfenbein and Ambady (2003a) proposed a dialect theory in support of the in-group advantage. According to dialect theory, there is a universal pattern of emotion recognition at better-than-chance levels, but cultural differences cause the recognition of facial expressions to vary across cultures. Just as language accents vary across cultures, facial expressions carry culture-specific nuances that result in slightly different expressive signals. These dialects are the result of learning and develop through the attunement of expression between individuals within a culture (Leach 1972, in Niedenthal et al. 2006). On this account, the in-group advantage occurs because members of a given culture are accustomed to perceiving a particular dialect of the universal expressions of emotion and are therefore more accurate in recognizing in-group facial expressions.

“The individual who moves from one class to another or from one society to another is faced with the challenge of learning new ‘dialects’ of facial language to supplement his knowledge of the more universal grammar of emotions” (Tomkins and McCarter 1964, p. 127). Herba et al. (2008), studying children aged 4–15 years, found that children correctly recognize fear and happiness at lower intensities but recognize fear, anger, and disgust less accurately in familiar faces. Familiarity with another culture facilitates the recognition of that culture's facial emotions (Beaupre and Hess 2006; Elfenbein and Ambady 2003b), and familiarity as a construct needs further study in order to minimize the miscommunication of facial expressions of emotion. Ekman et al. (1969) studied a preliterate, isolated tribe of New Guinea in order to minimize the effect of cross-cultural exposure on facial emotion recognition. They observed significant differences between the isolated and Western populations; happiness, anger, and sadness reached about 50 % recognition accuracy, and the shortfall for these and other emotions may be attributed to a lack of exposure to other-race facial emotions. Achieving recognition accuracy above chance level has typically been taken as the criterion of universality, whereas recognition differences have been ignored in order to claim cross-cultural agreement.

Matsumoto (2005) opined that the use of different muscle contractions leads to differences in facial expressions across cultures. He further stated that any face expressing an emotion in a way specific to a culture is more accurately recognized by members of that culture. The qualifier "any" face is important because "culture" has often been confounded with race. Since members of different races have slightly different facial morphology, morphological differences in facial expressions have also been confounded with culture. Cultural norms and patterns create differences in recognition levels across cultures; an individual learns recognition through his or her culture.

1.1.4 Theoretical Approaches: Comments

Neither the evolutionary-biological perspective nor the sociocultural perspective can alone account for the development of facial expressions of emotion. Facial expressions evolved through the process of evolution (Darwin 1872), so they may be considered universal. According to social constructionism, the social environment determines an individual's behavior; since social norms differ across cultures, facial expressions of emotion also vary across cultures. The central theme of Darwin's universality thesis was that certain physical movements of the face and body evolved as adaptations that are biologically basic in form and function. Yet facial expressions of emotion might also be learned, like other socially acquired symbolic communication, so the role of social evolution in the recognizability of facial expressions cannot be negated in order to establish universality. "It is more likely that evolution produced a generative, multipurpose set of mechanisms that work together in each instance to produce a variety of emotional responses that are exquisitely tailored to each situation" (Barrett 2011, p. 403).

Although many issues pertaining to the facial channel of emotion communication have been discussed (such as universality, dimensionality, context specificity, and individual differences), some areas of research remain little explored. One such area is the "eliciting condition" for facial expression. Most researchers have relied on simulated, posed facial expressions of emotion for their convenience and suitability for experimental purposes. Spontaneous expressions, on the other hand, are difficult to obtain under experimental conditions, and because they are not entirely free from cultural display rules, experiments with such stimuli do not permit generalization. Some theorists even believe that "pure," uninhibited facial emotions are rarely expressed, and therefore we seldom perceive them (Russell 1994). Simulated facial expressions, for their part, largely lack the "felt" component of emotion. Ekman (1992), however, noted that subjective feelings may be evoked by instructing the encoder to move the facial musculature in a definite way.

1.2 Measurement-Based Approach

The face has always been considered the key to understanding emotions, so attempts have been made to measure facial expressions and to classify them into emotions. Measuring facial expressions in a way that minimizes the variability of facial emotions across different expressions has always been a challenge for researchers. Facial expressions are easy to observe and understand, but it is difficult to develop a measurement system for them, since they vary in the frequency, intensity, and duration of particular facial changes. Research on understanding emotions through facial expressions has primarily been based on an individual's ability to recognize static and dynamic stimuli, for example, pictures and videos of facial expressions. Measurement of expressions is based primarily on the anatomical changes in the facial muscles during different affective states. Anatomically grounded descriptions of facial expressions have, in turn, provided the basis for the computational sciences to develop automated systems for facial expression recognition.

With facial expressions considered as a dependent measure, their quantification and measurement have been a challenge for empirical research on this subject. Researchers have sought to identify facial expressions of emotion, and to model the expressions characteristic of a particular emotion, through behavioral measures (e.g., self-report, rating systems, judgement studies, and video analysis), electrophysiological approaches (e.g., electromyography (EMG), electroencephalography (EEG), and skin conductance response (SCR)), anatomically based coding systems (e.g., the Facial Action Coding System (FACS; Ekman and Friesen 1978) and MAX (Izard 1979)), and mathematical and computational coding of facial expressions. In the judgemental method, the observer's judgement is treated as the independent measure and the encoder's facial behavior as the dependent measure; facial behaviors are calibrated, and inferences drawn, on the basis of observers' judgements (Ekman 1982). In the electrophysiological method, muscle movements during facial actions are measured either by EMG recordings (Brown and Schwartz 1980; Philipp et al. 2012) or by EEG.

1.2.1 Anatomical Perspective

This perspective evolved in order to develop a model set of expressions for the various emotions. The anatomical perspective holds that different combinations of facial muscles communicate specific categories of emotion. The structure of the human facial musculature has been known to researchers for a long time (Duchenne 1859/1990). Initial attempts were made in the field of medicine by Sir Charles Bell (Bell 1824; in Loudon 1982) in his Essays on the Anatomy of Expression in Painting. The anatomical basis of facial expressions was developed in order to bring objectivity to the understanding of facial expressions of emotion: anatomically based coding systems specify how expressions are produced by particular combinations of activated facial muscles. Carl-Herman Hjortsjö (1969) proposed the first measurement system based on the facial muscles associated with different expressions (in Niedenthal et al. 2006). Later, Ekman et al. (1971) developed an objective coding system, the Facial Affect Scoring Technique (FAST), to measure emotion categories (happiness, anger, etc.) rather than emotion dimensions (pleasantness, unpleasantness, etc.). FAST provides 77 descriptors covering three parts of the face (the forehead, the eyes, and the lower face) for observing the six basic emotions. Observers compare facial expressions with the FAST atlas and assign the corresponding scores, which indicate the emotion most strongly expressed. Izard (1979) developed the Maximally Discriminative Facial Movement Coding System (MAX) along similar lines, based on 27 descriptors, and used it with infants. These systems categorize facial expressions into different emotions such as happiness and surprise, but they did not code the intensity or dynamics of the expressions.

Ekman and Friesen (1978) went on to develop the purely anatomically based FACS, which relies on minimal units of facial muscle action. FACS provides not only a coding scheme but also measures of the intensity and temporal course of muscular activity. FACS coding is based on 44 facial action units (AUs) that, singly and in combination, account for the different facial muscle movements. FACS is not restricted to emotion-specific measurement; it measures all facial movements (Rosenberg 1997). The system specifies the actions produced by particular facial muscles. The quality of these actions, however, likely varies with differences in the facial muscles themselves: different muscles produce different types of movement and are most likely heterogeneous in structure and innervation.
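To illustrate how an anatomically based code supports automated classification, the following Python sketch maps observed action units to the nearest prototype AU combination. The prototype sets are commonly cited illustrations (e.g., AU6 + AU12 for a Duchenne smile), not Ekman and Friesen's official emotion dictionary, and the matching rule is a simple overlap score chosen for clarity.

# Illustrative FACS-style coding: observed action units (AUs) are
# matched against prototype AU combinations for the six basic emotions.
# The prototypes are commonly cited examples, not an official FACS table.
PROTOTYPES = {
    "happiness": {6, 12},           # cheek raiser + lip corner puller
    "sadness":   {1, 4, 15},        # inner brow raiser + brow lowerer + lip corner depressor
    "surprise":  {1, 2, 5, 26},     # brow raisers + upper lid raiser + jaw drop
    "fear":      {1, 2, 4, 5, 20},  # brow actions + upper lid raiser + lip stretcher
    "anger":     {4, 5, 7, 23},     # brow lowerer + lid actions + lip tightener
    "disgust":   {9, 15},           # nose wrinkler + lip corner depressor
}

def classify(observed: set[int]) -> str:
    """Return the emotion whose prototype AU set best overlaps the observed AUs."""
    def overlap(emotion: str) -> float:
        proto = PROTOTYPES[emotion]
        return len(observed & proto) / len(observed | proto)  # Jaccard index
    return max(PROTOTYPES, key=overlap)

print(classify({6, 12}))        # happiness (a Duchenne smile)
print(classify({1, 2, 5, 26}))  # surprise

Note that such a lookup captures only the categorical part of FACS; the intensity and temporal coding mentioned above would require scored AU intensities and onset/offset times rather than plain AU sets.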

1.2.2 Electrophysiological Support

Physiological measures such as EEG and EMG were initially applied to explore the brain architecture underlying emotion. Over the past few decades, the emergence of brain-imaging technologies has redefined the study of the biological and neural basis of emotional behavior. Facial EMG involves measuring electrical potentials from the facial muscles in order to infer muscular contraction, and most studies in this area have used the facial electromyographic technique. These data indicate that different emotional reactions induce facial electromyographic activity of different sorts; for instance, electromyographic activity in the brow region increases with unpleasant thoughts. Valence-specific facial muscle activity has been documented by many researchers (Cacioppo et al. 1986; Hu and Wan 2003; Jancke and Jancke 1990). The finding was replicated in a later study, with the additional observation that the cheek muscles of the two lateral halves (right and left) covary during pleasant expressions (Jancke 1994). A study designed to examine "how rapidly emotion-specific facial muscle reactions are released" revealed that electromyographic activity of the zygomatic major (the muscle used in smiling) sets in within 500 ms of exposure to pleasant stimuli, while activity of the corrugator supercilii (the muscle used in frowning) sets in within 500 ms of exposure to unpleasant stimuli. Reviewing a large body of research in this domain, Dimberg (1997) commented that "humans have a pre-programmed capacity to react to facial expressions and that facial reactions are automatically evoked and controlled by fast operating facial affect programs" (p. 59). Facial EMG has been extensively utilized in recent research (Philipp et al. 2012; Tan et al. 2012) because it is noninvasive yet sensitive enough to record the subtle changes in the facial muscles during facial expressions (Neta et al. 2009; Tassinary et al. 2007).
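As a minimal illustration of how such EMG findings are quantified, the sketch below (with hypothetical channel data, sampling rate, and window length, not a published protocol) computes root-mean-square amplitudes for the zygomatic major and corrugator supercilii in the first 500 ms after stimulus onset and contrasts them as a simple valence index.

import numpy as np

def rms(segment: np.ndarray) -> float:
    """Root-mean-square amplitude of a baseline-corrected EMG segment."""
    return float(np.sqrt(np.mean(segment ** 2)))

def valence_index(zygomaticus: np.ndarray, corrugator: np.ndarray,
                  fs: int = 1000, window_ms: int = 500) -> float:
    """Contrast smiling (zygomatic) vs. frowning (corrugator) activity in
    the first `window_ms` after stimulus onset; positive values indicate
    a pleasant-stimulus-like pattern."""
    n = int(fs * window_ms / 1000)              # samples in the window
    zyg, cor = rms(zygomaticus[:n]), rms(corrugator[:n])
    return (zyg - cor) / (zyg + cor + 1e-12)    # normalized difference

# Hypothetical usage with synthetic signals (1 s sampled at 1000 Hz):
rng = np.random.default_rng(0)
zyg = rng.normal(0.0, 2.0, 1000)   # stronger "smiling" activity
cor = rng.normal(0.0, 0.5, 1000)
print(valence_index(zyg, cor))     # > 0, zygomatic dominance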

Furthermore, there are great individual differences in physical characteristics, resulting in variation in electrophysiological activation. Methodologically, it is difficult to establish baseline data for every subject tested. The reliability of measurement is also affected by the quality of the emotion being experienced, as categories of emotion are differentially related to electrophysiological measures, and emotion categories reduced to dimensions (e.g., positive–negative) reveal a variable picture. For example, happiness (positive) and sadness (negative), despite having opposite emotional qualities, elicit similar cardiovascular activity as measured by heart rate and blood pressure (Rusalova et al. 1975; Ekman et al. 1983). To avoid such difficulties, it is important to develop a profile of electrophysiological responses for each primary emotion. For instance, a fear-provoking stimulus is accompanied by accelerated heart rate (Fredrikson 1981), increased electrodermal reactivity (Ohman and Soares 1994), vasoconstriction in the upper face (Hare 1973), and characteristic facial reactions (Dimberg 1990; cited in Dimberg et al. 1998). Observable behavioral characteristics during emotional states are even more variable than reliable: nonverbal expressions, especially facial behaviors, are modulated to a great degree by culture-specific display norms, and other forms of expressive behavior also depend considerably on an individual's strategy for responding to a social situation.

1.3 Computational Perspective

Advances have recently been made in the mathematical and computational coding of facial behavior. Computational modeling and automatic facial expression recognition have interested researchers for the last two decades. There have been several advances in face and facial feature detection mechanisms, but developing a perfect system remains a challenge for computational researchers. An automatic facial expression recognition system requires robust face detection and facial feature tracking. In an early attempt, Thornton and Pilowsky (1982) tried to quantify facial expression mathematically: 60 key points on the face that may produce visible emotional behavior were identified with the help of a computer graphic procedure, and these key points were joined with smooth curves to obtain a graphic model of facial expression. Pilowsky and Katsikitis (1994) attempted to calibrate facial behavior with a numerical taxonomy program. Artificial neural network approaches, such as adaptive resonance theory 2 (ART-2) and learning vector quantization (LVQ), have also been applied to reliably discriminate facial emotions (Driscoll et al. 1995). Other investigators used a cascade correlation neural network and achieved an 87.5 % success rate in discriminating six facial emotions: happiness, sadness, fear, anger, surprise, and disgust (Zhao et al. 1995).

The problem of automatic facial expression analysis is broadly divided into three stages, though other steps may need to be performed between these stages depending upon the approach taken (Gunn and Nixon 1994). The three stages are as follows:

  1. Face acquisition

  2. Feature extraction

  3. Classification

In the first stage, a face is detected in the given image. Once the face has been detected, features containing the information required for facial expression analysis are extracted from the facial image into a feature vector. Finally, the extracted feature vector is passed through a classifier for classification/recognition; the classifier might be two-class or multiclass.
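A minimal sketch of this three-stage pipeline might look as follows in Python; the Haar-cascade detector, raw-pixel features, and SVM classifier are illustrative assumptions rather than the specific systems discussed in this chapter.

from typing import Optional

import cv2
import numpy as np
from sklearn.svm import SVC

# Stage 1: face acquisition, using OpenCV's bundled Haar cascade
# (an assumed detector; any robust face detector would serve).
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def acquire_face(image: np.ndarray) -> Optional[np.ndarray]:
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None
    x, y, w, h = faces[0]              # take the first detected face
    return gray[y:y + h, x:x + w]

# Stage 2: feature extraction; here simply a normalized pixel vector,
# whereas real systems use geometric or appearance features (Gabor, HOG).
def extract_features(face: np.ndarray) -> np.ndarray:
    face = cv2.resize(face, (48, 48))
    return face.flatten().astype(np.float32) / 255.0

# Stage 3: classification with a multiclass SVM over emotion labels.
clf = SVC(kernel="rbf")

def train(images, labels) -> None:
    # Assumes each training image contains one detectable face.
    X = np.stack([extract_features(acquire_face(img)) for img in images])
    clf.fit(X, labels)

def recognize(image: np.ndarray) -> str:
    face = acquire_face(image)
    if face is None:
        raise ValueError("no face detected")
    return clf.predict(extract_features(face)[None, :])[0]

In practice, the additional steps mentioned above (e.g., face alignment, illumination normalization, and feature tracking across video frames) would sit between these stages.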

An important functionality of future human–computer interfaces will be the capacity to perceive and understand the user's cognitive appraisals, action tendencies, and social intentions, which are usually associated with emotional experience. Because facial behavior is believed to be an important source of such emotional and interpersonal information, automatic analysis of facial expressions is crucial to human–computer interaction. A face can display numerous expressions at a given time, but not every expression is an indicator of the individual's emotional state; separating emotions from other facial expressions is a challenge in developing an automated facial expression recognition system.

Facial expressions result from different combinations of facial musculature and depend upon the craniofacial characteristics of the individual. Some permanent dispositions are reflected in an individual's face; isolating these permanent characteristics from the model face, and increasing accuracy regardless of individual differences, is another challenge. Culture-specific display and decoding rules play a major role in facial expression and in the recognition of facial emotions, and embedding these rules in an automated system is a further challenge in developing a universal automated system for recognizing facial expressions of emotion. Recently, Dailey and colleagues (2010) made an effort to develop a neurocomputational model trained in specific cultural contexts, namely Japanese and American, in order to study the in-group advantage. They modeled culture-specific display rules, the effect of encoder–decoder distance, and the effect of culture-specific decoding rules, and concluded that encoder–decoder distance, culture-specific display and decoding rules, and other factors contribute in an integrated manner to create the differences in facial expressions across cultures.

To develop an interface between anatomical and mathematical models of facial measurement, more research with experimental and clinical data is necessary for generalization and for psycho-diagnosis of emotional behavior. A major challenge for the computational sciences is to develop systems for spontaneous expressions. Developing authentic databases of spontaneous expressions is difficult, as the laboratory setting itself induces subjects to pose their expressions. Emotions are not a suddenly emerging state, yet expressions appear suddenly on the face, so capturing the real expressions associated with a subject's internal state is another challenge. Minimizing individual and cultural differences in developing model emotion expressions presents a further challenge for researchers.

1.4 Conclusion

“Face” is a multidisciplinary subject matter that demands understanding from various perspectives, both at the macro-level (social and cultural) and at the micro-level (neuroscientific and computational). Understanding facial expressions of emotion is one of the great challenges facing the twenty-first-century psychological, behavioral, and computational sciences; if we can rise to it, we can gain fundamental insights into what it means to understand human behavior in general and emotions in particular. To understand the complete gamut of facial expressions of emotion, an integrative, interdisciplinary approach is needed, encompassing the three basic approaches grounded in the social, biological, and computational sciences. Most research in this area has been conducted independently, from a unidimensional perspective; the three approaches should not be considered in isolation but treated as complementary to each other. Such an understanding will help researchers uncover the role of facial emotions in day-to-day interactions.

Emotion in human beings is a gradually unfolding process, and deciphering it amid the localization, regionalization, and lateralization processes of facial expression processing is a challenge. The neural circuitry helps us understand the development of emotional processes in human beings and other species. Accurate assessment of facial expressions of emotion will further help in developing new diagnostic tools, such as automated behavioral assessment systems based on facial expressions of emotion. The major obstacle hindering our understanding of the brain architecture behind facial expressions of emotion is the fragmentation of brain research and the enormous quantity of data it produces; modern neuroscience has been enormously productive but unsystematic, and its findings need revalidation through sociocultural and behavioral approaches. The recent field of cultural neuroscience (Chiao and Ambady 2007) studies the bidirectional relationship between culture and the neural architecture of the brain, and it attempts to bridge the gap between the theories and methods of psychology and genetics. Findings on cross-cultural differences in neural architecture will further enable the computational sciences to develop neural networks for machines built on the foundations provided by cultural neuroscience.

Attempts have been made to automate the recognition of facial expressions of emotion in order to exploit the interface between technology and society. FACS has been the primary tool used by researchers in the computational sciences to develop such systems, though developing a zero-error system remains a challenge. The six basic emotions, regarded as universal in nature, have already been tested in the automation of facial expression recognition. Other established concepts, such as the in-group advantage, self-conscious emotions, and culture-specific emotions, need to be accommodated when translating behavioral cues into technological systems. Transforming cultural differences into computational models has become an emerging issue in computational research, and computational models may in turn help researchers validate sociocultural and neurological models with state-of-the-art technologies.

Automated systems for detecting deception and lying may be developed by incorporating the neural basis of deception with the help of the computational sciences; such systems might be utilized by an interviewer during an interview or interrogation. Studies of micro-momentary expressions have interested researchers in the recent past because of their relevance in deciphering deception and lying (for details, see the chapter by Mark Frank and Elena Svetieva in this volume). Micro-expressions occur when an individual consciously tries to conceal the signs of true feelings (Ekman 2003; Freitas-Magalhães 2012). Recent research findings (Abe et al. 2007; Johnson et al. 2008; Yokota et al. 2013) have identified biological and neural structures involved in deception. Further attempts can be made to decipher deception through an integrative approach involving cultural, computational, and neuroscientific perspectives. Similar attempts can be made to perform preliminary assessments of chronically ill psychotic patients in day-care centers or hospital out-patient departments through automated computational models (see the chapter by Poria, Mondal and Mukhopadhyay in this volume).

All three approaches need to address some basic issues in future research. For example, it would be interesting to examine (a) the dominance of context, the content of interaction, and the intent of judges in perceiving facial expressions of emotion; (b) emotion-specific laterality and its effect on the brain–behavior relationship (which an integrative approach spanning the behavioral and biological sciences may help us understand); (c) cultural differences in hemispheric dominance in triggering the expression of an emotion (where the biological and sociocultural sciences can help); and (d) the development of automated behavioral diagnosis systems (where the behavioral and computational sciences can help). While micro-level perspectives, such as the biological or computational, will help uncover the bases of facial emotions, the macro-level sociocultural perspective will add meaning to them. Thus, an integrative perspective of cultural and computational neuroscience will provide a comprehensive understanding of facial expressions of emotion.