Introduction

Recognizing identity and condition of group members is an important cognitive task in many species of social animals. In human and nonhuman primates, the face plays a highly important role in this aspect of social cognition: it conveys visual information about the owner’s familiarity, emotion, age, gender, health and reproductive status, gaze direction, etc. A specialized neural network for face processing has been found in the visual cortex of various primate species including marmosets (Hung et al. 2015), macaques (Chang and Tsao 2017), and humans (Kanwisher and Yovel 2006), indicating that at least some underlying face processing mechanisms are shared among species. Recognition of conspecific faces has in fact been reported in species as disparate as paper wasps (Sheehan and Tibbetts 2011), cichlid fish (Kohda et al. 2015), and sheep (Kendrick et al. 2001). Furthermore, animals kept as companions by humans (dogs: Bognar et al. 2018; cats: Takagi et al. 2019) have been shown to attend to human faces and process them depending on their relationship with the owner of the face. Regardless of the extent to which facial recognition abilities are shared or have evolved independently in different taxa, in many visually capable species intra- and inter-specific communication is largely mediated through facial signals.

Several studies on visual social perception and communication in tufted capuchin monkeys (Sapajus apella) conducted in the laboratory of Kazuo Fujita have used conspecifics and humans as social stimuli (Anderson et al. 2009; Hattori et al. 2007; Kawaguchi et al. 2019; Kuroshima et al. 2002; Matsuno and Fujita 2018; Morimoto and Fujita 2012; Takimoto and Fujita 2011). Capuchin monkeys, highly social platyrrhine monkeys endemic to South America (Fragaszy et al. 2004), are known to extract a range of information from the faces of conspecifics, including familiarity (Talbot et al. 2016), age (Kawaguchi et al. 2019), symmetry (Paukner et al. 2017), and emotional expression (Calcutt et al. 2017).

Capuchin monkeys, like most other platyrrhines, show polymorphic color vision: dichromatic and trichromatic individuals coexist in the same group (Jacobs 2007). This intriguing diversity is caused by the multiallelic trait of the L/M opsin gene on the X-chromosome (Jacobs et al. 1993). The L/M opsin gene codes photopigments sensitive to the middle to long wavelengths of visible light. Species of the family Cebidae, which includes capuchin monkeys, usually have three variants of the L/M opsin gene in the gene pool of the population, with maximal absorbance at around 530, 545, and 560 nm, respectively (Hiramatsu et al. 2005; Saito et al. 2005). Together with the S opsin gene for short-wavelength sensitivity coded on an autosome, all males (with a single X-chromosome, hence hemizygous) and females homozygous for this allele become dichromats, while heterozygous females become trichromats. By contrast, in catarrhine primates (including humans), tandemly repeated L and M opsin genes on the X-chromosome enable routine trichromatic vision in both males and females (Jacobs 1996). Regardless of the underlying genetic mechanisms, the evolution of trichromatic vision in primates is likely to have had important implications for species’ visual cognition.

The shift from nocturnal to diurnal activity patterns might be one primary etho-ecological factor in the prevalence of trichromatic vision in primates, as nocturnal primates have remained dichromatic, the normal form of color vision for mammals (Heesy and Ross 2001). Another likely driving force in the evolution of trichromacy is frugivory (Mollon 1989; Regan et al. 2001; Sumner and Mollon 2000), while the “social signal” hypothesis (Changizi et al. 2006) proposes that trichromatic vision is optimized for detecting skin color modulations with variations in blood amount and blood oxygen levels. Changizi et al. (2006) focused on the fact that the bare skin region of faces is visually salient for diurnal primates with polymorphic or routinely trichromatic vision, whereas most of the face of nocturnal mammals is covered with fur. In humans, face coloration can be an honest signal of health and emotional state (Stephen et al. 2009; Thorstenson et al. 2019). In macaque monkeys, it is likely that face color modulation communicates reproductive states (Dubuc et al. 2009, 2014; Higham et al. 2011). Although it is arguable that detecting skin color modulations has driven the evolution of trichromacy, it is worthwhile to evaluate the role of trichromatic vision in this ability in social animals. Capuchin monkeys are clearly an excellent nonhuman primate model for investigating the effect of face color modulations on conspecific behavior. Currently, there are no objective data that suggest face color modulates in capuchin monkeys. However, red facial skin is more prominent in males than females in the bald uakari, a platyrrhine monkey that has a bald head and highly polymorphic color vision (Corso et al. 2016). The relatively large region of bare skin around the eyes and nose in capuchin monkeys supports the possibility that their face color modulates depending on emotional or reproductive states, and that this modulation is detectable by conspecifics with trichromatic vision. In addition, a simulation experiment suggested that face color modulation in macaque monkeys can be detected by trichromatic vision with narrower spectral separation of L and M photopigments, which is frequently observed in platyrrhines (Hiramatsu et al. 2017).

Several years ago, I set out to examine the effects of color in a social context, using capuchin monkeys. The original motivation was to investigate how capuchins with different color vision types would respond to color modulation in the faces of their group members. Furthermore, were they sensitive to color modulations of specific face parts? Were they also sensitive to face orientation, reflecting feature-based or holistic/configural processing of faces? Our main experiment used intact face images to examine whether color-related effects are enhanced in the face context. Before running it, Fujita and I conducted a titration experiment with randomized images to adjust for individual differences in color sensitivity. We soon discovered that our experimental design was too complicated, involving too many variables, to provide clear answers to our original question. However, our results did suggest some consistencies and commonalities in capuchins’ recognition of faces of familiar individuals. Below, I describe our experiments in some detail, as this information might provide useful pointers for future perceptual and cognitive studies with capuchin monkeys.

A Study of Color Modulation Detection in Capuchin Monkeys

At the start of our study, the group containing our subjects consisted of Heiji (19-year-old male), Pigmon (15-year-old male), Zinnia (12-year-old male), Zilla (19-year-old female), Kiki (17-year-old female), Theta (17-year-old female), Zen (9-year-old female), Zephie (3-year-old female), and Kojilo (3-year-old female). Zen and Zephie were the offspring of Zilla. Kojilo was Kiki’s daughter, but as an infant she was raised by human caretakers due to Kiki’s poor maternal behavior. Zinnia was not tested in these studies. The monkeys lived together as a group in a multi-cage, two-level complex spanning two rooms, with several interconnecting doors. Color vision type was determined for each monkey by genetic analysis as described elsewhere (Hiramatsu et al. 2005). Only Kiki, Zen, and Kojilo had trichromatic vision, with two classes of L/M opsin genes for photopigment with estimated peak sensitivity at 530 and 545 nm. The other five tested monkeys were dichromats, with L/M photopigments with peak sensitivity at either 530 or 545 nm. All monkeys had extensive experience of experimental visual stimuli being presented on a touch-sensitive monitor, and except for Pigmon, all had participated in a study involving visual categorization of surface materials (Hiramatsu and Fujita 2015). The monkeys received a portion of their daily diet during experimental sessions and the remainder in their home cage after the experiment each day. Water was always freely available in the home environment.

Stimulus Manipulations

The faces of all nine capuchin monkeys in the group were photographed under a full spectrum fluorescent light using a color-calibrated camera (Fig. 5.1a). Each monkey’s facial expression was neutral (at least to human observers). A color modulation towards red or blue was made on three bare skin parts of the pictures: the nose, the area around the left side of the eye (hereafter: left eye), or the area around the right side of the eye (hereafter: right eye), from the viewpoint of the observer. The modulation was made by increasing R or B in the 8-bit RGB values (Fig. 5.1b). Area sizes of the modulated parts were almost the same. For the color titration experiment described below, the color modulation was gradually increased in 100 steps. At the highest level (100), the R or B value was 255 (maximum value); at the lowest level (0), there was no color modulation and each part maintained its original color. The gamma of the monitor (relationship between input RGB values and output monitor luminance) was set to 2.2 and the luminance of RGB values was modulated linearly for a visually typical human observer.
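The level-to-RGB mapping described above can be illustrated with a minimal sketch. It assumes a simple linear interpolation of the 8-bit channel value between its original value (level 0) and 255 (level 100). The text does not give the exact mapping, and the gamma correction mentioned above is not modeled here. The function names and pixel values are hypothetical.

```python
def modulate_channel(original: int, level: int, steps: int = 100) -> int:
    """Interpolate an 8-bit channel from its original value (level 0) to 255 (level `steps`).

    Assumption: linear interpolation in 8-bit value; the source does not state the mapping.
    """
    if not 0 <= original <= 255:
        raise ValueError("original must be an 8-bit value (0-255)")
    if not 0 <= level <= steps:
        raise ValueError("level must lie between 0 and steps")
    return round(original + (255 - original) * level / steps)


def modulate_pixel(rgb, level, direction):
    """Raise only R (direction 'red') or only B (direction 'blue') of one pixel."""
    r, g, b = rgb
    if direction == "red":
        return (modulate_channel(r, level), g, b)
    if direction == "blue":
        return (r, g, modulate_channel(b, level))
    raise ValueError("direction must be 'red' or 'blue'")


# Hypothetical skin pixel, modulated to the mid level (50) in each direction.
skin = (155, 120, 105)
print(modulate_pixel(skin, 50, "red"))   # R moves halfway toward 255
print(modulate_pixel(skin, 50, "blue"))  # B moves halfway toward 255
```

Because only one channel is raised, a scheme of this kind increases both saturation and brightness of the modulated pixels.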

Fig. 5.1

Example of facial stimuli used in the color modulation detection experiment. (a) Neutral faces of nine monkeys. (b) Color-modulated pictures of one face (monkey Zen). The color of the nose, and around the left eye and right eye was modulated toward red and blue. Inverted stimuli were included in the main experiment. Scrambled images containing all the parts of the modulated faces were used in the titration experiment. Note that stimuli with the mid-level color modulation (50) are shown; these correspond approximately to the middle level red color modulation for dichromats in the main experiment. Middle color modulation level of red for trichromats and blue for all monkeys in the main experiment was lower (see Results) than the color modulation level shown here

Experimental Procedures

A four-alternative forced-choice (4AFC) procedure was used in which monkeys had to choose the stimulus picture that differed from the other three when all four stimuli were simultaneously presented on a calibrated touch-sensitive LCD monitor. The monkeys, all tested individually in a familiar operant chamber, responded by touching one stimulus. To prevent confusion arising from changes in the target stimulus on every trial, we used a non-color-modulated stimulus as the target (oddball procedure). Each stimulus image was 300 × 300 pixels (ca. 16 × 16 degrees) and the background was uniformly gray (x = 0.311 and y = 0.330, 30 cd/m²). The four stimuli were randomly assigned to four areas of the monitor, and to maintain the monkey’s attention they were presented in a slightly different position in each area across trials. A trial started after the monkey touched a simple square (start image) that appeared in the center of the monitor (Fig. 5.2).

Fig. 5.2

Schematic of the 4AFC oddball search task. The same task design was used for both the titration and the main experiment

Color Titration

To adjust for individual differences in color sensitivity, we first conducted a color titration experiment using scrambled images of all face parts (both eyes, nose) of the modulated stimuli (Fig. 5.1b). Seven evenly separated color modulation levels were selected from the 100 steps of the stimuli, and 630 4AFC trials with randomized face images (9 faces × 2 color directions × 7 levels × 5 repetitions) were conducted on 5 consecutive days. The monkeys’ task was to detect an oddball stimulus of original color among three identical color-modulated stimuli. Each monkey’s performance was plotted as a function of color modulation level and fitted with a logistic function. This procedure was repeated, narrowing the range of color levels each time, until the fitted curve came within the range of the mean performance ± CI (confidence interval). In 4AFC, the accuracy threshold is 0.625 (1/4 + (1 − 1/4)/2) correct (Kingdom and Prins 2010). In this task, the threshold indicates the minimum color modulation that can be detected. From the psychometric function for each monkey, we estimated the color modulation level at threshold and at four more levels where the estimated proportion of correct responses deviated from the threshold by 0.15 or 0.3, i.e., proportions correct of 0.325, 0.475, 0.775, and 0.925. These values were then used for the main experiment.
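The threshold arithmetic and the read-out of stimulus levels from a fitted curve can be sketched as follows. This assumes a logistic psychometric function scaled between the 4AFC guess rate (0.25) and 1. The text states only that a logistic function was fitted, so the exact parameterization, the function names, and the midpoint and scale values below are hypothetical. The fitting step itself is omitted.

```python
import math


def afc_threshold(n_alternatives: int) -> float:
    """Proportion correct halfway between chance (1/n) and perfect performance."""
    guess = 1.0 / n_alternatives
    return guess + (1.0 - guess) / 2.0


def psychometric(level, midpoint, scale, guess=0.25):
    """Logistic curve rising from the guess rate to 1 (assumed parameterization)."""
    return guess + (1.0 - guess) / (1.0 + math.exp(-(level - midpoint) / scale))


def level_for_proportion(p, midpoint, scale, guess=0.25):
    """Invert the curve: the modulation level predicted to yield proportion correct p."""
    if not guess < p < 1.0:
        raise ValueError("p must lie strictly between the guess rate and 1")
    scaled = (p - guess) / (1.0 - guess)
    return midpoint + scale * math.log(scaled / (1.0 - scaled))


threshold = afc_threshold(4)  # 0.625 for a 4AFC task
targets = [threshold + d for d in (-0.3, -0.15, 0.0, 0.15, 0.3)]

# Hypothetical fitted parameters for one monkey and one color direction.
midpoint, scale = 20.0, 5.0
levels = [level_for_proportion(p, midpoint, scale) for p in targets]
print(levels)
```

Under this parameterization the midpoint of the logistic is itself the threshold level, because the curve passes through 0.625 there.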

Face Color Modulation Experiment

To examine if sensitivity to color modulation is enhanced in the face context, we used intact face images in the main experiment. Based on the titration experiment, five levels of color-modulated stimuli for each face part were recreated for each individual (Fig. 5.1). In addition to face part (three), color direction (red and blue), color level (five), face identity (nine individuals), and face orientation (upright and inverted) were included as stimulus parameters in the analysis. Therefore, there were 540 stimulus conditions in the main experiment. We added 90 scrambled trials used in the final titration experiment to transfer the 4AFC task to the face context (Fig. 5.2). Each test session consisted of 630 trials, with the trial order randomized across the five experimental days of the session (126 trials per day). Each monkey completed five sessions, 3150 trials in total, so that each condition was run five times to obtain a more accurate sample of their responses.
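The trial counts above follow directly from crossing the stimulus parameters. The short sketch below enumerates the design to make the arithmetic explicit. The variable names are illustrative, and the color levels are simply numbered 1 to 5.

```python
from itertools import product

FACE_PARTS = ("nose", "left eye", "right eye")
COLOR_DIRECTIONS = ("red", "blue")
COLOR_LEVELS = range(1, 6)  # five levels derived from the titration experiment
FACE_IDENTITIES = ("Heiji", "Pigmon", "Zinnia", "Zilla", "Kiki",
                   "Theta", "Zen", "Zephie", "Kojilo")
ORIENTATIONS = ("upright", "inverted")

# Every combination of the five stimulus parameters is one face condition.
face_conditions = list(product(FACE_PARTS, COLOR_DIRECTIONS, COLOR_LEVELS,
                               FACE_IDENTITIES, ORIENTATIONS))

SCRAMBLED_TRIALS = 90   # carried over from the final titration experiment
DAYS_PER_SESSION = 5
SESSIONS = 5

trials_per_session = len(face_conditions) + SCRAMBLED_TRIALS
trials_per_day = trials_per_session // DAYS_PER_SESSION
total_trials = trials_per_session * SESSIONS

print(len(face_conditions), trials_per_session, trials_per_day, total_trials)
```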

Differences in Color Sensitivity between Dichromats and Trichromats

Seven monkeys (the exception was Pigmon) completed the titration experiment with scrambled images, allowing estimation of the threshold color modulation level for red and blue directions. The mean threshold for the red direction appeared higher in dichromats (47.5 ± 11.3) than in trichromats (12.7 ± 1.53), underscoring the higher sensitivity for red in the latter. The mean threshold for blue in dichromats (8 ± 1.63) also tended to be higher than in trichromats (5.3 ± 0.58). Statistical analysis indicated lower thresholds in trichromats than dichromats for both red (t(5) = 6.14, p < 0.01) and blue (t(5) = 3.03, p < 0.05), although caution is required due to the small sample size (four dichromats and three trichromats).

Effect of Color, Face Part, and Face Orientation

For the seven individuals who completed the main experiment, accuracy at the mid-color modulation level (the threshold in the titration experiment) did not improve in the face context (Fig. 5.3). However, some tendencies appeared via generalized linear mixed model (GLMM) analysis. We evaluated the effects of various factors on accuracy at the threshold color modulation levels for each individual. Color vision type, color direction, face part, face orientation, face identity, the interaction between color vision type and color direction, and the interaction between face part and face orientation were included in the models as fixed effects. Participant identities were included as random effects. For the analysis of accuracy, the response variable was correctness (1: correct or 0: incorrect) in each trial, and a model was fitted to a binomial distribution with the logit link function.
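One way to write down the accuracy model described above is the following sketch. It assumes a random intercept for each monkey, since the text says only that participant identity was a random effect. All symbols are introduced here for illustration: y_{ij} is correctness on trial i of monkey j, V is color vision type, C is color direction, P is face part, O is face orientation, and I is face identity, with multi-level factors coded as indicator vectors.

```latex
\begin{align}
y_{ij} &\sim \mathrm{Bernoulli}(p_{ij}), \\
\operatorname{logit}(p_{ij}) &= \beta_0 + \beta_V V_j + \beta_C C_{ij}
  + \boldsymbol{\beta}_P^{\top}\mathbf{P}_{ij} + \beta_O O_{ij}
  + \boldsymbol{\beta}_I^{\top}\mathbf{I}_{ij} \\
  &\quad + \beta_{VC}\, V_j C_{ij}
  + \boldsymbol{\beta}_{PO}^{\top}\left(\mathbf{P}_{ij}\, O_{ij}\right) + u_j,
  \qquad u_j \sim \mathcal{N}(0, \sigma_u^2).
\end{align}
```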

Fig. 5.3

Mean accuracy for each monkey under red and blue conditions at the mid-color modulation in the main experiment. Accuracy was averaged over the levels for face parts, face orientation, and face identity. Dichromat: circle with solid line; trichromat: triangle with dotted line. Note that mean accuracies did not exceed 0.625, the expected accuracy from the titration experiment

The GLMM analysis and type II test showed effects on accuracy of color direction (p < 0.001), face part (p < 0.001), face identity (p < 0.05), and an interaction between face part and face orientation (p < 0.001). There was no effect of color vision type. Post-hoc comparison showed that accuracy under the blue condition was higher than under the red condition (p < 0.001) (Fig. 5.3). Regarding face parts, accuracy for color modulation on the left eye was higher than for the right eye and nose under the upright condition (p < 0.001), whereas accuracy for the right eye was higher than for the nose under the inverted face condition (p < 0.001) (Fig. 5.4).

Fig. 5.4

Effect of face part, face orientation, and color direction. The boxplot shows the median, lower and upper quantiles, and the minimum and maximum values for accuracy under each condition across all monkeys (n = 7). Dots indicate outliers

Effect of Face Identity on Reaction Time

To analyze reaction time (RT), a GLMM was fitted with a Gaussian distribution and a log link function, using the same fixed and random effects as in the accuracy analysis. Trials for all subjects with long RTs (1.5 times longer than the overall interquartile range) were removed as outliers (the same trials were also removed from the accuracy analysis). The GLMM analysis revealed a significant effect of face part (p < 0.05). Post-hoc comparison showed that RT for the nose was shorter than for the left eye (p < 0.01).
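The outlier criterion above is stated only briefly. The sketch below shows one common reading, which is an assumption here: Tukey's upper fence, where an RT is dropped if it exceeds the third quartile by more than 1.5 times the interquartile range computed over all trials. The function name and the RT values are hypothetical.

```python
import statistics


def remove_long_rts(rts, k=1.5):
    """Drop RTs above Q3 + k * IQR (Tukey's upper fence; an assumed reading of the rule)."""
    q1, _, q3 = statistics.quantiles(rts, n=4)
    upper_fence = q3 + k * (q3 - q1)
    return [rt for rt in rts if rt <= upper_fence]


# Hypothetical RTs in seconds; the last trial is far slower than the rest.
rts = [0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 5.0]
kept = remove_long_rts(rts)
print(kept)
```

Only the upper tail is trimmed in this sketch, matching the text's focus on long RTs.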

To reveal individual tendencies, a linear mixed model (LMM) was used to analyze effects of color direction, face part, face orientation, face identity, and interactions between color direction, face part, and face orientation on RTs of individual monkeys. The LMMs showed that face identity affected RTs significantly in Heiji (p < 0.01), Zilla (p < 0.05), and Theta (p < 0.001). Post-hoc analyses showed that Heiji’s RTs were longer for the faces of Zen and Kojilo than those of Zinnia, Zilla, and himself. Theta’s RTs were longer for the faces of Pigmon and Zinnia than those of Zephie and Kojilo. Zilla showed the longest RTs for the face of Zephie, who in turn showed the longest RTs for the face of Zilla, although Zephie showed no significant effect of face identity (Fig. 5.5).

Fig. 5.5

Effect of face identity on RT for each monkey. Points indicate mean RTs, lines indicate lower and upper 95% confidence interval limits calculated by the bootstrapping method. Face identities are ordered (from left to right) according to dominance information from Takimoto and Fujita (2011)
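Confidence limits of the kind plotted in Fig. 5.5 can be obtained in several ways. The sketch below shows a percentile bootstrap of the mean, which is an assumption, since the caption does not say which bootstrap variant was used. The function name, resample count, seed, and RT values are hypothetical.

```python
import random
import statistics


def bootstrap_mean_ci(values, n_resamples=2000, confidence=0.95, seed=0):
    """Percentile bootstrap confidence interval for the mean (assumed variant)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    alpha = (1.0 - confidence) / 2.0
    lower = means[int(alpha * n_resamples)]
    upper = means[int((1.0 - alpha) * n_resamples) - 1]
    return lower, upper


# Hypothetical RTs (seconds) of one monkey for one stimulus face.
rts = [0.82, 0.95, 1.10, 0.78, 1.32, 0.91, 1.05, 0.88, 1.20, 0.99]
lower, upper = bootstrap_mean_ci(rts)
print(lower, upper)
```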

The analysis of each individual’s RTs also showed significant to marginally significant effects in Kiki (interaction between face orientation and color: p = 0.034), Theta (interaction between face part and color: p = 0.097), Zephie (color: p = 0.084), and Kojilo (interaction between face orientation and color: p = 0.062). However, the relationship between color and face orientation or face part was inconsistent across the individuals.

Discussion

Below, I review the results of our study on capuchin monkeys’ responses to manipulated facial stimuli, and discuss both possible explanations and implications for the effects or lack of effects of the variables that we manipulated.

Color Vision and Color Modulation in the Face Context

Accuracy in the face context was higher than chance level (0.25) but lower than 0.625 (the performance for scrambled images at threshold color modulation) in both dichromatic and trichromatic monkeys (Fig. 5.3). There are several possible explanations for this result. First, the color of only one part of the face was modulated in the main experiment, whereas in the titration experiment with scrambled images all parts were modulated. The small size of the area of the color modulation might be a reason for the lower performance. Second, since the RGB values of the stimuli were modulated artificially, our stimuli did not reflect truly natural color modulations in monkeys. Our modulation scheme was based on increasing R or B values in the pixels of face parts, and this treatment increased both the color saturation and the brightness of the pixels. In real macaque monkeys, the reddish color modulation associated with the reproductive status also decreases lightness (Higham et al. 2010; Hiramatsu et al. 2017). Although we did not measure the true color modulation of capuchin monkey faces, our modulation scheme may not have been ecologically valid. Furthermore, since the monitor RGB spectra were not optimized for capuchin monkeys’ spectral sensitivity, which differs from our own, the face coloration might have appeared unnatural to the monkeys.

In both dichromats and trichromats, accuracy was higher for blue color modulation than red, although in trichromats we expected a higher or comparable performance for red. The larger chromatic contrast between modulated parts and the surrounding skin under the blue condition compared to red might explain this result. Capuchin monkeys might be more sensitive to unnatural modulation in the blue direction with large chromatic contrast; this possibility remains to be tested.

Nonetheless, the color modulation scheme in our experiments failed to reveal a face enhancement effect in trichromatic capuchin monkeys. It is also possible that our experimental design was too complicated and artificial to obtain such an effect. A study that used face pictures of rhesus macaque monkeys taken under natural conditions, together with a color substitution paradigm in which humans experienced various color vision types, found enhanced detection of ecologically valid face color modulation in participants who experienced trichromatic vision (Hiramatsu et al. 2017). This needs to be examined with monkey participants. The combination of color vision polymorphism and cognitive competence in capuchin monkeys makes them an excellent model for studies of cognitive aspects of color vision in primates.

Face Parts and Face Orientation

Most of our monkeys’ data showed significant effects of face part or an interaction between face parts and face orientation. Accuracy was generally greater for the left eye in the upright condition. The finding of upright superiority is consistent with the widely accepted hypothesis of configural processing of faces in humans (Maurer et al. 2002). Studies with nonhuman primates including capuchins (Calcutt et al. 2017), squirrel monkeys (Nakata and Osada 2012), tamarins (Neiworth et al. 2007), macaques (Adachi et al. 2009), and chimpanzees (Tomonaga 2007) have shown a similar phenomenon. Configural processing of bodies is also suggested by body inversion effects in capuchin monkeys (Matsuno and Fujita 2018). A left eye bias may be consistent with a left gaze bias reported in humans, macaques, and dogs (Guo et al. 2009). In humans, the leftward face bias is at least partly attributable to right hemisphere dominance in processing faces (Megreya and Havard 2011) but this is debated for monkeys (Zangenehpour and Chaudhuri 2005; Tsao et al. 2008). In fact, the results of a recently published meta-analysis led to the conclusion that the inversion effect is not a reliable phenomenon in nonhuman primates (Griffin 2020). Therefore, the left side or upright bias found in our study might be an effect of our captive monkeys’ extensive history of interacting with humans or due to some unidentified procedural aspects of the study. Further research is required to bring greater clarity to the issue of configural face processing in capuchin monkeys.

Face Identity

In three monkeys, all dichromats, RTs varied significantly depending on stimulus face identity. Although there might be no clear relationship with color vision type, dichromatic monkeys with less sensitivity to reddish modulation might have paid more attention to identity during the task. Interestingly, RTs of Heiji, the calm, alpha male in the group, were longer for faces of younger monkeys than older monkeys (including his own). The longer RTs for young monkey faces might reflect interest or concern for those individuals, assuming that he did in fact recognize the pictures as members of his group. By contrast, Theta showed longer RTs in response to adult males (except for Heiji) rather than young individuals. Theta’s accuracy in the test trials was markedly lower than in her titration trials with randomized stimuli. She in fact appeared afraid to touch the face stimuli during test trials, and her social position (the most subordinate adult in the group) might have led to low accuracy and slower RTs to faces of dominant individuals (Figs. 5.3 and 5.4). In other words, for Theta the face context might have had a negative effect on performance. The longest RTs for Zephie and Zilla were in response to seeing each other’s face, possibly an effect of their mother-offspring relationship.

Clearly, many factors might affect RTs, and so the possible explanations offered here are speculative. However, individual differences in the effects of face identity even within the same color vision type exclude the possibility that variations in RT were due to physical properties of face stimuli. By using an eye-tracking technique, Lonsdorf et al. (2019) showed that capuchin monkeys looked longer at unfamiliar conspecific faces of the same sex than the opposite sex. Although our experiment was not set up to examine looking durations, social factors could affect visual cognition of faces especially when those faces are familiar, and those factors might be reflected in RTs. What I hope to have conveyed here is how monkeys living in a social group but tested individually can provide valuable opportunities to investigate how social relationships influence perceptual and cognitive processes.

Conclusion

This chapter describes a study in which we aimed to examine how capuchin monkeys with different color vision types, namely dichromatic and trichromatic, respond to color modulations on face parts of familiar group members. Monkeys were more sensitive to blue than red modulation in the face context, irrespective of color vision type. In other words, we found that the monkeys detected a color modulation despite it having (to our knowledge) no biological relevance. However, we found no evidence of sensitivity to what we expected might be the more biologically relevant change, that is, in redness. We also observed upright orientation and left side biases, and likely effects of social relationships in the face context. These results illustrate the multiplicity of factors associated with face recognition and processing in capuchin monkeys, and this no doubt also applies to a wider range of animals.