They will stare at a stranger with a fixed gaze and unblinking eyes.

Charles Darwin 1872, p. 328.

Introduction

In 1862, Duchenne de Boulogne published his famous Mécanisme de la Physionomie Humaine, in which he concentrated exclusively on the role of the facial muscles in the expression of emotions (Duchenne de Boulogne 1862). In the 1970s, Paul Ekman and his collaborators followed much the same pattern. In their Emotion in the Human Face, published in 1972, the topic of direct emotional gaze was not even mentioned (Ekman et al. 1972). Further studies by this group led to the introduction of “Ekman faces” as specific facial expressions of the main emotions and to the development of the Facial Action Coding System (FACS) (Ekman and Friesen 1975; Ekman et al. 2002), both of which recast Duchenne’s ideas in a modern setting. The statement that “the movements of the facial muscles have been regarded by many authors as relevant to or a primary element of emotional behavior” (Ekman et al. 1972, p. 12) became a generally accepted maxim. The further evolution of the psychology of emotional expression and its recognition led to a bitter discussion in the 1990s, when the universality of “Ekman faces” was questioned (Ekman 1994; Russell 1994; Russell 1995). All these developments, however, completely ignored the power of the emotional gaze in a still-face setting.

The gaze was not completely forgotten, but studies concentrated either on visible eye movements (gaze direction, gaze aversion, gaze shifts) or on the timing of eye contact (Grumet 1983; Kleinke 1986; Mason et al. 2005). In general, however, experiments measuring eye movements indicated that during emotion recognition people fixate the eye region much longer than any other region of the face (Scheller et al. 2012; Cowan et al. 2014). Independently of emotion recognition, it was postulated that the gaze is “a special cue in human interactions” (Ulloa et al. 2015).

The role that non-muscular cues play in the facial expression of emotion is still poorly researched. An attempt was made to classify facial expressions of emotions into emotions shown by the whole face (actions of the facial muscles, facial reddening) and emotions shown by the eyes, the latter including “gaze direction, eye blinks, tears, and the pupil dilatation” (Kret 2015). Such a classification has some weaknesses. Physiologically, it would be more appropriate to classify the same cues as controlled by the central nervous system (CNS) or by the autonomic nervous system (ANS). The CNS-controlled cues include actions of the facial muscles, gaze direction, and eye blinks. The ANS-controlled cues include facial reddening, tears, and pupil dilation. Following this approach, the emotional direct gaze is a predominantly ANS-controlled phenomenon (Roitblat et al. 2019).

Our previous study (Report 1; Roitblat et al. 2019) demonstrated that the facial expression of emotions can be adequately achieved with minimal muscular involvement, when the gaze mainly expresses an emotion through combined slight activity of the small intraorbital muscles and broader involvement of ANS-controlled reactions such as pupil dilation, aqueous humor formation (“moist eyes”) and outflow, contractions of the ciliary muscle, and actions of Müller’s muscle (sympathetic fibers). In the current Report 2 study, we planned to test the hypothesis that emotions expressed by a still-faced person keeping a direct gaze can be adequately recognized.

Using a Brunswikian lens model to analyze the encoding and decoding processes, the recognition of an emotion in a sender–receiver setting has three possible variations:

  • The sender expressed an emotion vividly – the receiver recognized the emotion correctly;

  • The sender expressed an emotion vividly – the receiver nevertheless did not recognize the emotion or recognized it wrongly;

  • The sender was unable to express an emotion – the receiver did not recognize it.

The third variation is a failure from the outset, while the first two possibilities were tested further. We hypothesized that, just as the majority of people can express an emotion by gaze alone, so the majority of people can adequately recognize a gaze-expressed emotion.

Method

Participants for Experiment 2

The prospective study utilized a cross-sectional survey design. The participants were either high school (grades 10–12) students or college students. To keep the margin of error below 5%, we planned a sample size of 400 for each group (a margin of error of 4.9% at a 95% confidence level with an assumed proportion of 0.5), which is suitable for large cohort screenings. Participants (age 15–25, average age 16.6; n = 800, M 375, F 425) without any psychiatric or medical comorbidity were recruited on a strictly voluntary basis. The rationale for these age limits was based on numerous studies of younger children and older adults that revealed age-related peculiarities in emotion recognition; the concept of an “age-related decline in facial emotion identification” has even been suggested (Noh et al. 2011; Noh and Isaacowitz 2015; Naruse et al. 2013; Chaby et al. 2017). At the same time, numerous studies identify adolescents, young adults, or, more broadly, persons in “early adulthood” as ideal participants for emotion recognition studies (Noh et al. 2011; Noh and Isaacowitz 2015; Widen et al. 2015; Roitblat et al. 2019; Novello et al. 2018).
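As a worked check of the planned sample size (the standard large-sample margin-of-error formula for a proportion, assuming the conventional z = 1.96 for a 95% confidence level):

$$ME = z\sqrt{\frac{p(1-p)}{n}} = 1.96\sqrt{\frac{0.5 \times 0.5}{400}} = 1.96 \times 0.025 = 0.049 \approx 4.9\%$$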

Participants with known and documented psychiatric conditions (including autism spectrum disorder) that might alter cognitive abilities were excluded from the study. All selected participants had normal or corrected vision. The Institutional Review Board approved the research as a “radiation-free, non-invasive, non-interventional anonymous survey of volunteers”. Informed consent: participants were informed before the beginning of the experiment that they would be participating in a study on the facial expression of emotions. Depending on the age of a participant, informed consent was obtained either from the participant alone or from both the participant and a guardian. No personal information was obtained from the participants except age and sex. The research took place at high schools and colleges during the educational process.

Experiment 2 Study Design

All the participants, who acted as judges, were presented with three 10-picture sets of still faces expressing angry, sad, and happy emotions by vivid gaze only. These pictures were selected from 300 full-face pictures obtained during our first experiment (Roitblat et al. 2019). The authors and the 20 judges (n = 28) involved in the previously reported Experiment 1 selected 30 pictures, from 23 participants, with the most vivid expressions. We selected pictures in which the expressed emotion had been correctly recognized by at least 22 of the 28 judges (approximately 80% success). These pictures were randomly divided into three sets of 10 pictures each. The first 10-picture set consisted of full faces as taken (Fig. 1); the second set consisted of middle-part-of-the-face pictures (some forehead, the eyes, and the nose) cropped from the full-face pictures (Fig. 2); and the third set presented the eyes only (Fig. 3). The main aim of cropping the middle part of the face was to remove the lips. Thus, the second and third sets presented the eyes with a wider background and the eyes with a limited background, respectively. To present each of the three emotions in equal numbers, the full-face set consisted of three sad, three angry, and four happy expressions; the middle-part-of-the-face set consisted of four sad, three angry, and three happy expressions; and the eyes-only set consisted of three sad, four angry, and three happy expressions. Thus, each emotion was presented ten times in total, which permitted the analysis of 8000 judgments per emotion (10 pictures × 800 participants).

Fig. 1

The full-face happy gaze picture. The emotion was correctly recognized in 75% of judgments in the three-emotion experiment and in 67% in the five-emotion experiment

Fig. 2

a. The middle-face crop of a happy gaze. The emotion was correctly recognized in 77% of judgments in the three-emotion experiment and in 70% in the five-emotion experiment. b. The middle-face crop of a sad gaze. The emotion was correctly recognized in 72% of judgments in the three-emotion experiment and in 63% in the five-emotion experiment

Fig. 3

a. The eyes-only sad gaze picture. The emotion was correctly recognized in 88% of judgments in the three-emotion experiment and in 79% in the five-emotion experiment. b. Another eyes-only sad gaze picture. The emotion was correctly recognized in 76% of judgments in the three-emotion experiment and in 71% in the five-emotion experiment; the most common incorrect label was an angry gaze

The experiment was performed as a commonly employed labeling task. The pictures were numbered in consecutive order. Half of the judges (Group 1, n = 400) were asked to label the pictures as Angry, Sad, or Happy. The judges were given the following instructions: “Please label three sets of pictures with emotional gazes. You may choose the following emotions: Sad, Angry, and Happy. Write your answers below. Take your time.”

Validation of the Study Design

In psychology, a labeling task involves self-report and is therefore subjective. To validate the designed labeling task, the participants were asked to label the same sets of pictures choosing between either three (main group) or five (validation control group) emotions. Shifting from three to five nominal response categories normally decreases labeling accuracy by 13% to 26%, depending on the circumstances of the experimental design (Groves et al. 2004; Hall et al. 2016). If accuracy decreases by more than 26%, the test was designed incorrectly.

The validation analysis of the three above-described sets of pictures was performed with another 400 judges (Group 2), who were asked to label the same pictures as Angry, Sad, Surprised, Frightened, or Happy. Disgust was excluded because several recent studies indicated that it may not be well recognized (DiGirolamo and Russell 2017; Pochedly et al. 2012; Widen and Russell 2013; Zloteanu et al. 2018). The judges were given the following instructions: “Label three sets of pictures with emotional gazes. You may choose the following emotions: Sad, Angry, Happy, Surprised, and Frightened. Each emotion may be displayed once, more than once, or not at all. Write your answers below. Take your time.”

Age-related differences were not investigated in this study, but for external validation of the picture sets, a group of adults (n = 56, F 36, M 20; age range 27–56) was also tested with the same sets (three-emotion test). Inclusion of this third group tested whether the research can be generalized to a larger sample or a different setting.

In all three groups of participants, the judges performed their task individually and with unlimited time; about five to eight minutes per participant was planned. Correct answers were counted separately for each of the three sets of pictures, and correct answers per emotion were also calculated. The 75% performance threshold was chosen as the criterion for hypothesis verification because it has been conventionally applied in studies of facial expression recognition (Gosselin and Schyns 2001; Rodger et al. 2018).

Data Analysis

The qualitative correct/incorrect recognition data were analyzed as percentages. Total correct recognition responses to each gaze expression comprised the dependent variables. The correlation analysis between the sex of a participant and the ability to recognize a specific emotion was performed using the χ2 criterion with a 95% confidence interval, and a correlation coefficient of r > 0.55 was counted as a significant correlation. The difference in results between Group 1 and Group 2 and the differences in recognition of specific emotions were statistically evaluated by a one-way within-subjects ANOVA (for group comparison) and by a three-way ANOVA (for recognition of specific emotions) in SPSS, Standard version 19 (IBM SPSS Statistics 2010). The level of significance for all analyses was set at p < 0.05, but for the analysis of the three separate tests we also assessed whether findings remained after Bonferroni correction for multiple comparisons, which corresponded to an adjusted α = 0.016. The Bonferroni correction was applied according to generally accepted rules for medical research (Ranstam 2016; Ranstam 2019).
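For readers who wish to reproduce this pipeline outside SPSS, the following is a minimal sketch in Python/SciPy. The contingency counts and accuracy arrays are illustrative placeholders, not the study’s raw data, and SciPy’s between-subjects f_oneway stands in for the within-subjects ANOVA reported above.

```python
# Illustrative sketch of the reported analyses (SciPy as a stand-in for SPSS).
# All input values below are placeholders, not the study's raw responses.
import numpy as np
from scipy import stats

ALPHA = 0.05
ALPHA_ADJUSTED = ALPHA / 3  # Bonferroni correction for the three picture sets

# Sex vs. correct recognition of one emotion: chi-square test of association.
# Rows: female, male; columns: correct, incorrect (placeholder counts).
table = np.array([[310, 115],
                  [250, 125]])
chi2, p_sex, dof, expected = stats.chi2_contingency(table)
phi = np.sqrt(chi2 / table.sum())  # effect size, checked against the r > 0.55 criterion

# Per-judge accuracy (out of 10) across the three picture sets.
rng = np.random.default_rng(0)
full_face, middle, eyes_only = (rng.integers(2, 11, 400) for _ in range(3))
F, p_sets = stats.f_oneway(full_face, middle, eyes_only)

print(f"sex association: p = {p_sex:.3f}, phi = {phi:.2f}")
print(f"picture sets: F = {F:.2f}, p = {p_sets:.3f}, "
      f"survives Bonferroni: {p_sets < ALPHA_ADJUSTED}")
```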

Results

Group 1 judges correctly recognized the emotion in 72% of judgments for full-face pictures (average accuracy 7.2, range 2–10), 73% for middle parts of the face (average accuracy 7.33, range 3–10), and 68% for the eyes only (average accuracy 6.79, range 3–10). Thus, the differences in accuracy of emotion recognition between full faces, middle parts, and eyes were not statistically significant in this group (full face vs. middle part: p = 0.92; full face vs. eyes: p = 0.8). In Group 2, the agreement between the actual emotion and the participant’s judgment was 63% for full-face pictures (average accuracy 6.25, range 2–9), 68% for middle parts (average accuracy 6.84, range 4–9), and 48.5% for the eyes only (average accuracy 4.85, range 1–8). In this group, the best agreement was obtained for the middle-part-of-the-face set (full face vs. middle part: p = 0.34; full face vs. eyes: p = 0.05; middle part vs. eyes: p = 0.03), but the significance was not maintained after Bonferroni correction for multiple comparisons.

For the adult control group, correct recognition of the emotion was achieved in 80% of judgments for full-face pictures (average accuracy 7.96, range 6–10), 68% for middle parts of the face (average accuracy 6.78, range 4–9), and 60% for the eyes only (average accuracy 5.96, range 4–8) in the three-emotion test. Contrary to the younger participants of Group 1, the older adults were more successful in recognizing emotions presented by a full face than by the eyes only (full-face set vs. eyes-only set: p = 0.03), but this significance also was not maintained after Bonferroni correction for multiple comparisons.

The results in recognizing specific emotions are presented in Table 1. It shows that the angry gaze was better recognized than the sad gaze (p = 0.05) and the happy gaze (p = 0.07). However, these differences were not significant after stringent Bonferroni correction.

Table 1 The results in recognizing specific emotions in three-emotion and five-emotion experiments

Summarizing these results against the hypothesis presented in the Introduction:

  • The sender expressed an emotion vividly – the receiver recognized the emotion correctly in 44% to 80% of judgments across the various tests, and these percentages are emotion-specific;

  • The sender expressed an emotion vividly – the receiver nevertheless failed to recognize the emotion or recognized it incorrectly in 20% to 56% of judgments across the various tests.

The correlation analysis between the sex of a participant and the ability to recognize an emotion indicated that the female participants recognized emotional gaze better than the male participants, but this correlation was not very strong (r = 0.63). Female participants recognized the angry gaze and the sad gaze considerably better than the male participants (r = 0.72 and r = 0.68, respectively), but recognition of the happy gaze showed no significant correlation (r = 0.5).

Validation Results Analysis

As can be seen from Table 1, the decline in recognition accuracy between Group 1 and Group 2 was 9% (p = 0.48) for full-face pictures, 5% (p = 0.69) for middle parts of the face, and 19.5% (p = 0.11) for eyes only. While all these percentages are below 26%, recognizing emotions from the eyes without some facial background presented difficulties for many participants. The full-face set and the middle-part-of-the-face set are clearly valid. The differences in emotion recognition between Group 1 and the older adult control group were 8% (p = 0.79), 5% (p = 0.88), and 8% (p = 0.79) for the three analyzed sets, which is acceptable for external validation.

Further Analysis of the Results

The authors were intrigued by the fact that the gaze expressions in the middle-part-of-the-face set were recognized slightly better than the full-face gaze expressions in both the three-emotion test and, especially, the five-emotion test. To clarify this, an additional, non-preplanned test was performed with 200 participants. We selected three emotional gaze pictures of the same person (Fig. 4a). Half of the participants judged the full-face pictures for three emotions, and the other half judged the cropped images (Fig. 4b–d). The cropped images were judged more accurately (full-face correct recognition 75%, n = 75, vs. cropped-image correct recognition 89%, n = 89; p = 0.04). It must be noted, however, that recognizing the emotional gazes of the same person is easier than judging a set of ten pictures of different people.

Fig. 4

a. The full-face picture with a neutral “passport picture” gaze. b. The cropped picture of the same participant expressing the angry gaze; the emotion was correctly recognized in 79% of judgments for the full-face picture and in 94% for the cropped picture. c. The cropped picture of the same participant expressing the sad gaze; the emotion was correctly recognized in 78% for the full-face picture and in 91% for the cropped picture. d. The cropped picture of the same participant expressing the happy gaze; the emotion was correctly recognized in 68% for the full-face picture and in 82% for the cropped picture

Discussion

According to basic probability theory, pure guessing among three alternatives yields a 33.3% probability of a correct answer for each item, while in our experiment the emotional gaze in full-face pictures was correctly recognized in 72% of judgments. The chance probability is 20% when five alternatives are evaluated, while in our experiment the accuracy was 63% for full-face pictures and even 68% for cropped images. The differences between the chance percentages and the experimentally obtained percentages are clearly significant, and one may assume that the question “can a still-face direct emotional gaze be adequately recognized?” can be answered positively. Yet, of all the results obtained, only two met our 75% performance threshold: the angry gaze was correctly recognized in up to 80% of judgments, and the older adults correctly recognized the emotional gaze in full-face images in 80%. While some other results, such as 72% and 73% accuracy, almost approached the threshold, in general the hypothesis verification is not fully convincing. Our results are in relative concord with the earlier findings of Xiao et al. (2016), who observed emotion recognition accuracy of 66% for “isolated faces”. At the same time, the lack of statistical significance, mostly even before the stringent Bonferroni correction for multiple comparisons, between recognition accuracy in the full-face, cropped-image, and eyes-only experiments indicates, first, that a still-face setting was adequately maintained and, second, that the gaze remained the main cue for recognizing an emotion, while the rest of the face signaled no additional cues.
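The comparison with chance can also be made explicit with a one-sided binomial test. The sketch below reconstructs the judgment counts from the reported percentages (400 judges × 10 pictures = 4000 judgments per set), which is our assumption rather than the authors’ raw data.

```python
# Sketch: testing observed recognition accuracy against chance guessing.
# Counts are reconstructed from the reported percentages, not raw data.
from scipy.stats import binomtest

n_judgments = 400 * 10                 # Group 1, full-face set (assumed count)
n_correct = round(0.72 * n_judgments)  # 72% reported accuracy

# Chance level for a forced choice among three emotions is 1/3.
result = binomtest(n_correct, n=n_judgments, p=1/3, alternative='greater')
print(f"72% observed vs. 33.3% chance: p = {result.pvalue:.3g}")
```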

A possible explanation of the obtained percentages may be seen in individual differences in the facial expression recognition ability of different participants. Such differences have already been investigated, and the term “emotion recognition skill” was introduced (Lau et al. 2009; Castro et al. 2018). “Facial expression training” for better facial expression categorization has also been suggested (Pollux et al. 2014). As with any other skill, facial expression recognition ability varies between individuals. It is impossible to achieve 100% accuracy in facial emotional expression recognition even when all muscular and non-muscular facial cues are involved (Kosonogov and Titova 2018; Lau et al. 2009; Castro et al. 2018; DiGirolamo and Russell 2017; Zloteanu et al. 2018; Xiao et al. 2016). What our results demonstrated is that the ANS-controlled direct gaze cue is a powerful component of the general facial expression of an emotion and should be taken into account whenever emotional expressions are investigated. As pointed out in our Report 1, in addition to well-observed vivid muscular facial expressions, the eyes present a sympathetic/parasympathetic response to an emotionally significant situation. The current Report 2 demonstrated that, for the majority of people, these combined ANS-controlled changes are enough to recognize an emotion.

The “Ekman faces” approach was designed to recognize a truly and openly expressed emotion and was based on the specific activity of the facial muscles. Yet various life situations, such as business talks, political talks, court procedures, some official meetings, and even some personal conversations, are emotion-controlled situations with suppressed activity of the facial muscles. Such “facial muscle inhibition” has been researched in detail, and it is known that most people can control their facial muscles relatively well (Bush et al. 1989; Kappas et al. 1989; Kappas et al. 2000). Computerized recognition of emotions, automatic emotion detection, and automatic labeling of broadcast material are also based predominantly on muscle cues (Shan et al. 2009; Cowie et al. 2001). They are only partially successful even in the analysis of informal spontaneous conversation (Tu and Yu 2012; Akakin and Sankur 2011). Ten years ago, Lance and Marsella (2010) mentioned that “gaze is an important but understudied signal for displaying emotion”. Today, we are forced to admit that the topic is still understudied. The importance of the ANS-controlled components of emotional expression was emphasized by Ekman et al. as early as 1983 (Ekman et al. 1983). Ironically, while usually preoccupied with faces, these authors concentrated on such ANS cues as heart rate, hand temperature, skin resistance, and forearm flexor muscle tension, and completely overlooked the ANS-controlled changes during an emotional gaze. In more recent times, in-depth discussions of ANS reactions to emotional situations again mentioned such variables as respiration rate, minute ventilation, blood pressure, heart rate, skin conductance, finger pulse amplitude, and skin temperature (Wac and Tsiourti 2014; Kreibig 2010). The ANS-controlled changes during an emotional gaze were completely ignored despite the existence of sufficient anatomical and physiological data on ANS control of the eye and of the orbital muscles (McDougal and Gamlin 2015; Ruskell 1970; Tyrrell et al. 1995; Steinhauer et al. 2004; Demer et al. 1997; Izci and Gonul 2006; Bradley et al. 2008). The ANS-controlled changes of the gaze during emotional feeling are not an isolated facial phenomenon and can be accompanied by another ANS-controlled facial cue, namely blushing (Moukheiber et al. 2012).

Two situations are possible, however: the ANS-controlled and CNS-controlled facial cues may be in concord or may contradict each other. Most of the time, they are in concord (Soussignan et al. 2013). On the other hand, “feeling negative, but smiling politely” situations, or a simple cold-eyed smile, are well known and have recently been researched in depth (Dijk et al. 2018). Numerous studies of facial expressions of emotions that analyzed eye-tracking data of the receivers and computer-assisted quantification of their regional gaze fixations indicate that the receivers (judges) divide most of the time needed for emotion recognition between the eye region and the lip region of the face (Joyal et al. 2014; Schurgin et al. 2014; Priebe et al. 2015; Jiang et al. 2019; Figueiredo et al. 2019). These reports may explain why the cropped, lip-less images in our experiment were evaluated slightly better than the full-face images. The difference in accuracy was not significant, from 1% to 5%, and these cases most probably included pictures with some eyes–lips disagreement in expression that misled the judges evaluating the full-face images. For example, a vivid happy gaze above completely relaxed, unsmiling lips could mislead some judges.

Regarding the recognition of specific emotions, our finding only partially supports other reports indicating that anger may be recognized better than some other emotions (Mumenthaler and Sander 2012; Švegar et al. 2018). This tendency was observed, but its significance was not maintained after Bonferroni correction. There is no universal opinion on this subject; an opposite viewpoint also exists (Corden et al. 2006), and further research is needed to clarify the topic. Differences between the sexes are likewise not yet fully clarified. Our findings indicated that, in general, the female judges were somewhat more accurate in recognizing emotions than the male participants, but the obtained correlation of r = 0.63 is not very strong. Our findings are in partial agreement with other studies reporting that female participants may demonstrate greater accuracy when recognizing facial expressions under certain conditions (Wells et al. 2016).

As mentioned above, age differences in emotion recognition were not investigated in the current study; the older adult group of participants was tested to validate the design of the experiment as a labeling task with three sets of pictures. While the obtained percentages of accurate emotion recognition show a relatively small difference of 5%–8%, the patterns of recognition accuracy differed between the younger and older adults. The decline in recognition accuracy of the older adults was almost linear from full-face to eyes-only pictures (80% → 68% → 60%), while no such tendency was noted for the younger adults of Group 1. Age-related differences in the recognition of facially expressed emotions are known and well researched (Noh and Isaacowitz 2015; Chaby et al. 2017; Shechner et al. 2017). In the current study, some difference could also have arisen because the Group 1 adolescents and young adults judged pictures of their peers.

In concluding remarks, the “Ekman faces” are not exactly Ekman’s, and not even Duchenne’s. They have been known at least since the seventeenth century, when Charles Le Brun drew them in 1668 (Le Brun 1702) (Fig. 5). We may continue exercising this 350-year-old and well-tested approach to emotional facial expressions. The authors, however, believe that reinforcing CNS-controlled facial muscle cues with ANS-controlled eye cues and combining these cues in further studies may produce more accurate results for our understanding of facial emotional expressions and their recognition. Emotion recognition tests based on the judgment of facial muscle configurations captured in still photographs have recently been criticized, and appraisal theories of emotion have been suggested as a more appropriate approach to the problem (Scherer et al. 2019). The importance of the eye region and of eye-gaze patterns is understood (Cowan et al. 2014), but this understanding has not yet been applied to studies of emotional facial expressions to its full capacity. It was demonstrated that a direct gaze can be assessed and judged even within a neutral facial expression (Todorov and Duchaine 2008). Further investigations in this research area may significantly expand our knowledge of emotional expressions and their recognition.

Fig. 5

The set of pictures of facial expressions of emotions designed by Charles Le Brun in 1668: a. the sad face; b. the angry face; c. the happy face (Le Brun 1702). Public domain

Limitations of the Current Research

The participants of the current study were predominantly Caucasian or of Semitic origin, and the results cannot be generalized to all ethnicities because ethnic differences in emotion recognition have been reported (Russell 1994; Stanley et al. 2013; Krämer et al. 2013). We conducted a field study of a large cohort; a laboratory study of a smaller cohort might produce more accurate results.

Conclusion

We confirm our initial statement (Report 1) that the expression of emotions can be adequately achieved with minimal facial muscular involvement when the gaze predominantly expresses an emotion. In most cases, an “emotional gaze” in a still-face or static-face situation is adequately recognized by people, although emotion recognition skill varies between individuals. The ANS-controlled direct gaze cue is a powerful component of the general facial expression of an emotion and should be taken into account whenever emotional expressions are investigated.