Introduction

For primates, the most important visual cue for gathering social information about another individual is the face (Fujita 1993a; Spencer et al. 1997; Burton et al. 1999). Information about identity, age, gender and emotion can be extracted from the face, and facial information leads to the fastest and most accurate identification of an individual (Dahl et al. 2013a; Kano and Tomonaga 2009; Parr 2011; Pascalis and Wirth 2011). The face is also used to discriminate between individuals of the same species and between other species, such as prey and predators (Pascalis and Wirth 2011).

Exactly how facial features are processed in human and non-human animals has been a topic of intense study for decades (Bruce and Young 1986; Ge et al. 2009; Valentine 1991). Diamond and Carey (1986) proposed that faces are processed according to first- and second-order relational features. First-order relational features refer to the eyes, nose and mouth and their fixed positions relative to each other, used to discriminate between face and non-face stimuli. Second-order relational features refer to the relative spatial distances between facial features, used to discriminate between the faces of different individuals. Interestingly, when faces are inverted, judging distances between their second-order relational features becomes more difficult compared to non-face stimuli. This is known as the ‘inversion effect’ (Yin 1969). The inversion effect is consistently found in humans (Goldstein 1965; Scapinello and Yarmey 1970; Yin 1969) and chimpanzees (Dahl et al. 2013c; Parr et al. 1998; Tomonaga 1999, 2007). Faces are thought to be processed configurally. Configural processing refers to “the emergent features of a face that only become apparent when two or more of its basic features (e.g., the eyes, nose, or mouth) are processed at the same time” (Piepers and Robbins 2012, p. 2). The integration of this information into a perceptual whole allows individual recognition and discrimination within and between other species (Parr 2011).

A number of primate studies have investigated visual discrimination of species based on whole-body images, including the face. Humphrey (1974) conducted one of the earliest experiments on species discrimination in rhesus monkeys using a habituation task. The monkeys could distinguish individuals within their own species but not between species. Fujita (1987, 1990, 1993b; Fujita and Watanabe 1995) found longer looking durations for own-species images in Japanese macaques and rhesus monkeys using a sensory reinforcement procedure. In addition, Demaria and Thierry (1988) observed that stumptailed macaques looked longer at own-species images than at images of other macaque species. These experiments show a consistent advantage for processing own-species faces in monkeys.

In contrast, in chimpanzees Tanaka (2003, 2007) found a visual preference for human stimuli over other primate species in free-choice tasks. Tanaka (2003) presented five adult chimpanzees with images of humans, chimpanzees, gorillas, orangutans and other monkey species. Human images were touched more than those of any other species, and preference strength did not differ with phylogenetic distance from chimpanzees. Later, Tanaka (2007) tested eight adult and three infant chimpanzees using the same paradigm and species categories. Consistent with the previous study, adult chimpanzees showed a preference for humans, whilst infants either preferred chimpanzees or showed no species preference. These differences may be explained by social experience during infancy; the adult chimpanzees were mainly raised by humans, whereas the infants were raised by their mothers. Conspecific social experience during infancy may also account for the own-species preferences found in earlier monkey studies (Fujita 1990, 1993b).

Several studies have also investigated visual discrimination of species based on stimuli featuring only faces. Pascalis and Bachevalier (1998) used a visual paired-comparison (VPC) task to investigate species preference in humans and macaques. In the VPC task a preference for looking at new stimuli is measured after a period of familiarization with another stimulus. Looking longer at the new stimulus indicates recognition of the familiarized stimulus. An own-species effect across species was found; humans showed a novelty preference for human faces but not monkey faces, whereas monkeys showed the opposite preference. Dufour et al. (2006) examined recognition of different species faces using the VPC task in humans, Tonkean macaques and brown capuchin monkeys and found recognition was limited to own-species faces. Dahl et al. (2007) used an adaptation paradigm with rhesus monkeys and found they were better at identifying individuals within their own species compared to other species and processed the faces configurally. Gothard et al. (2009) found macaques can recognize individual faces of both macaques and humans. However, recognizing own-species faces involved configural and feature-based processing, whilst recognizing human faces mainly involved feature-based processing. Consistent with discrimination studies involving the whole body and face, own-species effects are also observed with stimuli featuring only the face.

Related to the own-species effect, other studies have focused on the influence of experience and familiarity. An interesting study in chimpanzees by Martin-Malivel and Okada (2007) used a matching-to-sample (MTS) task to investigate the influence of familiarity on categorical perception (CP) of morphed human and chimpanzee faces. CP “occurs when members of a class of stimuli which vary in their sensory characteristics are nevertheless processed as if they are equivalent” (Campbell et al. 1997, p. 1429). They found that a group of chimpanzees with more exposure to human faces than to chimpanzee faces discriminated human faces better than chimpanzee faces and showed a CP effect for human faces. In another group of chimpanzees, familiar with both chimpanzees and humans, no discrimination advantage or CP effect was found for either species. Also in chimpanzees, Dahl et al. (2013b) investigated the effects of lifetime experience on discriminating individual chimpanzee and human faces using a delayed MTS task. Infant chimpanzees were better at discriminating chimpanzee faces and older chimpanzees were better at discriminating human faces. These results support Tanaka (2003, 2007) in demonstrating the importance of developmental experience in species discrimination in the same group of chimpanzees.

In addition to experience and familiarity, perceptual features are also important for individual and species discrimination. For example, Quinn and Eimas (1996) found 3 to 4-month-old human infants used the internal features and outer edges of the face to discriminate between cat and dog images. Roberts and Mazmanian (1988) showed humans had greater difficulty in discriminating between pictures of kingfishers and other birds in greyscale compared to colour. Parr et al. (2000) demonstrated chimpanzees were worse at discriminating conspecific faces with masked eyes compared to control stimuli, but rhesus monkeys were worse when both the eyes and mouth were masked. Marsh and MacDonald (2008) studied the classification of great ape, gibbon and monkey, and prosimian faces by orangutans using a two-choice touch screen procedure. Discrimination performance was best for stimuli in colour and featuring the face only, and worst for stimuli in greyscale and with modified eyes, i.e., removed eyes or inserted infant eyes. Together, such visual cues help primates to discriminate between faces.

Although many studies have investigated discrimination of individual faces within the same species, few studies have investigated discrimination between primate species faces at the categorical level, especially in chimpanzees. The aim of the current study was to systematically examine the factors important for visual discrimination between different species faces in chimpanzees using a MTS task. Examining how chimpanzees, our closest living relatives, visually discriminate between primate species faces may give us greater insight into how the human visual system has evolved to perceive similarities and differences in categories of faces. We predicted discrimination accuracy would be higher for faces in colour than greyscale, upright than inverted orientation, highly familiar than unfamiliar species faces, and perceptually different than perceptually similar species faces. In addition, higher discrimination accuracy was predicted for identical compared to categorical (non-identical) faces within the same species.

Methods

Participants and housing

Five adult female chimpanzees (Pan troglodytes) participated in the study at the Primate Research Institute, Kyoto University (KUPRI), Japan (Table 1; Watanuki et al. 2014). The chimpanzees were members of a social group of 12 individuals living in an environmentally enriched facility consisting of two outdoor enclosures (250 and 280 m²), an open-air outdoor enclosure (700 m²) and indoor living rooms linked to testing rooms. The open-air outdoor enclosure was equipped with 15-m-high climbing frames and included streams and trees (Matsuzawa 2006; Yamanashi and Hayashi 2011). No food or water deprivation was used in the study. The experimental protocol was approved by the Animal Welfare and Care Committee of the KUPRI and the Animal Research Committee of Kyoto University and followed the Guidelines for the Care and Use of Laboratory Primates of the KUPRI (Version 3, 2010). The chimpanzees had extensive experience participating in perceptual and cognitive tasks using a touchscreen, including MTS tasks (e.g., Dahl et al. 2013b; Tomonaga 1999).

Table 1 Basic information about the five chimpanzees

Apparatus

Experiments were conducted in an experimental booth (1.80 × 2.15 × 1.75 m) inside a testing room. Each chimpanzee voluntarily walked to the booth through an overhead walkway connected to the living rooms. Two 17-inch touch-sensitive LCD monitors (1280 × 1024 pixels), encased in Plexiglas, were used to present visual stimuli at a viewing distance of approximately 40 cm. Food rewards (8-mm apple cubes) were delivered via a universal feeder device. All experimental events were controlled by a PC and the computer task was programmed using Microsoft Visual Basic 2010 (Express Edition).

Stimuli

Photographs (200 × 225 pixels) of the faces of six primate species were obtained from the Internet and from colleagues (for examples see Fig. 1): chimpanzee (Pan troglodytes), western lowland gorilla (Gorilla gorilla) and orangutan (Pongo pygmaeus and Pongo abelii) (20 images each), and Japanese human (Homo sapiens), olive baboon (Papio anubis) and white-headed capuchin monkey (Cebus capucinus) (ten images each). All images were cropped and featured unfamiliar faces with a neutral closed-mouth expression against a variety of naturalistic backgrounds. The images were prepared in colour, greyscale, upright and inverted formats and controlled for luminance and contrast using Adobe Photoshop CS2 (Version 9.0). In addition, for conditions presenting faces of two primate species, ten ‘random dot’ control images (five images for each species) were composed by randomly assigning each pixel of the original primate face images to a new position. This procedure ensured content-related information was removed while average brightness levels were maintained.
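The pixel-scrambling procedure is described but not given in code; the following is a minimal illustrative sketch, assuming Python with NumPy and Pillow rather than the tools actually used (file names are placeholders), of how such a ‘random dot’ control image can be generated by reassigning every pixel to a new position, destroying content while preserving the overall intensity distribution and hence average brightness.

import numpy as np
from PIL import Image

def make_random_dot_control(in_path, out_path, seed=0):
    """Scramble an image by moving every pixel to a randomly chosen new
    position; content is destroyed but the pixel intensity distribution,
    and therefore average brightness, is unchanged."""
    rng = np.random.default_rng(seed)
    img = np.asarray(Image.open(in_path))        # (H, W) greyscale or (H, W, 3) colour
    h, w = img.shape[:2]
    flat = img.reshape(h * w, -1)                # one row per pixel
    scrambled = flat[rng.permutation(h * w)].reshape(img.shape)
    Image.fromarray(scrambled).save(out_path)

# e.g., make_random_dot_control("chimp_01.png", "control_01.png")  # hypothetical file names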

Fig. 1
figure 1

Chimpanzee ‘Pal’ choosing an orangutan face in the colour condition

Procedure

The chimpanzees participated in a series of zero-delay MTS tasks (Fig. 1). To begin each trial, the chimpanzee touched a blue start key at the bottom-centre of the screen, after which a sample stimulus appeared directly above the start key position. When the sample was touched it disappeared and three comparison stimuli appeared in the middle of the screen. When a correct choice was made (the touched comparison matched the sample), a chime sounded and a food reward was delivered; when an incorrect choice was made (the comparison did not match the sample), a buzzer sounded and no food reward was given. The inter-trial interval was 2 s. The number of correct choices and response times (ms) were recorded by the PC.

Conditions

The study consisted of five conditions to investigate the influence of colour, orientation, familiarity, and perceptual similarity on the ability to discriminate primate faces under identical and categorical matching formats. On identical matching trials, the sample and the correct comparison stimulus were identical, whilst on categorical matching trials they were different images from the same species category. Identical and categorical matching alternated from trial to trial in each condition, except for the orientation condition, in which only identical matching was presented. The chimpanzees completed a total of 864 trials (72 trials × 12 sessions) for each condition. Trial order was randomized within and across sessions.
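As an illustration only (not the original Visual Basic touchscreen program), the following minimal Python sketch simulates the logic of one zero-delay MTS trial with identical and categorical matching; the stimulus names and the randomly generated ‘touch’ are placeholders.

import random

SPECIES = ["chimpanzee", "gorilla", "orangutan"]

def run_trial(stimuli, identical_matching, rng):
    """Simulate one zero-delay MTS trial: a sample is 'shown', then one
    comparison per species; choosing the comparison of the sample's species
    is correct (chime + food reward), otherwise a buzzer sounds and no reward
    is given. Touchscreen, sounds and the 2-s inter-trial interval are omitted."""
    sample_species = rng.choice(SPECIES)
    sample = rng.choice(stimuli[sample_species])
    comparisons = {}
    for sp in SPECIES:
        if sp == sample_species and identical_matching:
            comparisons[sp] = sample                                            # identical matching: same image
        else:
            comparisons[sp] = rng.choice([s for s in stimuli[sp] if s != sample])  # categorical: same species, different image
    chosen_species = rng.choice(list(comparisons))                              # stand-in for the chimpanzee's choice
    return chosen_species == sample_species

# One 72-trial session with identical and categorical matching alternating trial by trial.
stimuli = {sp: [f"{sp}_{i:02d}.png" for i in range(20)] for sp in SPECIES}
rng = random.Random(0)
session = [run_trial(stimuli, identical_matching=(t % 2 == 0), rng=rng) for t in range(72)]
print(f"simulated accuracy: {sum(session) / len(session):.2f}")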

In condition 1, the chimpanzees matched chimpanzee, gorilla and orangutan faces in colour and greyscale. In condition 2, they matched the same species in upright and inverted orientation in greyscale. Both the sample and comparison images were presented in the same orientation (Tomonaga 1999; Dahl et al. 2013c). In conditions 3, 4, and 5, they matched greyscale baboon and capuchin monkey faces (unfamiliar and perceptually different), chimpanzee and human faces (highly familiar and perceptually different), and orangutan and gorilla faces (unfamiliar and perceptually similar) respectively. Perceptual similarity was defined as the relative similarity between faces of two different species as perceived by the experimenters. In conditions 3, 4, and 5, a third ‘random dot’ control image was added to the two comparison face images to ensure an equal number of choices across all conditions. Ten novel chimpanzee, gorilla and orangutan faces were used in conditions 4 and 5 to reduce the possible influence of learning effects.

Image similarity analysis

Similarity between the chimpanzee, orangutan, and gorilla faces (20 images each) and the baboon, capuchin monkey and human faces (ten images each) in greyscale was calculated. First, the faces were aligned vertically with the eyes positioned centrally. AKAZE (Accelerated KAZE) local features were then extracted using a customized programme written in Python with OpenCV (cf. Alcantarilla et al. 2011). These features were matched using the Brute-Force Matcher method (“cv2.BFMatcher” function). The programme generated a similarity value for each face pair, where lower values indicate higher similarity based on local feature matching (OpenCV Development Team 2014). This similarity matrix was then used for metric multidimensional scaling with the “cmdscale” function in the “stats” library of R version 3.2.2 (R Core Team 2015), producing spatial representations of the similarities between stimuli. We adopted a two-dimensional solution for the present analysis. To evaluate goodness of fit, the “cmdscale” function generated a GOF (goodness of fit) value; the higher the GOF value, the better the spatial solution fits the data.
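The tools are named above but the code itself is not reported; the sketch below shows, in Python, one plausible way to implement the pipeline: AKAZE descriptors matched with a brute-force Hamming matcher to give a pairwise dissimilarity, followed by classical metric MDS (a NumPy analogue of R’s “cmdscale”). The use of the mean matched-descriptor distance as the pairwise score is our assumption, not a value reported in the study.

import cv2
import numpy as np

def akaze_dissimilarity(img_a, img_b):
    """Dissimilarity between two greyscale face images: AKAZE local features
    matched with a brute-force Hamming matcher; the mean descriptor distance
    over matched features is returned (lower = more similar). This exact
    score is an assumption, not the authors' reported measure."""
    akaze = cv2.AKAZE_create()
    _, des_a = akaze.detectAndCompute(img_a, None)
    _, des_b = akaze.detectAndCompute(img_b, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des_a, des_b)
    return float(np.mean([m.distance for m in matches]))

def classical_mds(D, k=2):
    """Classical (metric) MDS, analogous to R's cmdscale: double-centre the
    squared dissimilarity matrix and project onto the top-k eigenvectors."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J
    vals, vecs = np.linalg.eigh(B)                      # eigenvalues in ascending order
    order = np.argsort(vals)[::-1][:k]
    coords = vecs[:, order] * np.sqrt(np.maximum(vals[order], 0.0))
    gof = vals[order].sum() / np.abs(vals).sum()        # one of cmdscale's two GOF variants
    return coords, gof

# Usage (face_paths is a hypothetical list of aligned greyscale image files):
# images = [cv2.imread(p, cv2.IMREAD_GRAYSCALE) for p in face_paths]
# n = len(images)
# D = np.zeros((n, n))
# for i in range(n):
#     for j in range(i + 1, n):
#         D[i, j] = D[j, i] = akaze_dissimilarity(images[i], images[j])
# coords, gof = classical_mds(D, k=2)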

Statistical analysis

The data were analyzed using SPSS (version 23). Paired-comparison t tests and repeated-measures analyses of variance (ANOVAs) were used to analyze accuracy (the mean number of correct choices) and response times (on correct trials). One-sample t tests were used to test whether accuracy differed from chance level. Trials with response times of 5 s or longer after the sample image disappeared were excluded from the analysis, as the sample may no longer have been retained in memory. In conditions 3, 4, and 5, the third random dot control image was never touched, so chance level was 50%.
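For illustration, a one-sample t test against chance with the five chimpanzees has 4 degrees of freedom; the short sketch below shows the equivalent computation in Python with SciPy (the study itself used SPSS), using hypothetical accuracy values.

import numpy as np
from scipy import stats

# Hypothetical per-chimpanzee accuracies (% correct) for one condition;
# the study's actual values appear in the Results and were analyzed in SPSS.
accuracy = np.array([93.0, 96.0, 95.0, 97.0, 94.0])
chance = 100 / 3                                   # 33.33% with three comparison stimuli

t, p = stats.ttest_1samp(accuracy, popmean=chance)
print(f"t({len(accuracy) - 1}) = {t:.2f}, p = {p:.4f}")   # df = n - 1 = 4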

Results

Condition 1: colour effects

Figure 2 shows the mean percentage of correct choices for identical and categorical matching in colour and greyscale. One-sample t tests found that accuracy was above chance level (33.33%) in all conditions: identical colour (M = 95% correct, t(4) = 42.59, p < 0.001), categorical colour (M = 59% correct, t(4) = 16.92, p < 0.001), identical greyscale (M = 87% correct, t(4) = 20.49, p < 0.001) and categorical greyscale (M = 44% correct, t(4) = 3.30, p = 0.030). A two-way ANOVA, with colour type (colour and greyscale) and matching type (identical and categorical) as independent variables, was conducted. A main effect of colour type on accuracy was found, with higher accuracy for colour (M = 77% correct) than greyscale (M = 65% correct) matching (F(1,4) = 172.19, p < 0.001). A main effect of matching type was also found, with higher accuracy for identical (M = 91% correct) than categorical (M = 52% correct) matching (F(1,4) = 166.38, p < 0.001). No interaction was found. For response times on correct trials, a two-way ANOVA found a main effect of colour type, with faster response times for colour (M = 768 ms) than greyscale (M = 938 ms) matching (F(1,4) = 25.46, p = 0.007). A main effect of matching type was found, with faster response times for identical (M = 696 ms) than categorical (M = 1010 ms) matching (F(1,4) = 27.29, p = 0.006). No interaction was found.

Fig. 2
figure 2

Mean percentage of correct choices for identical and categorical matching of chimpanzee, gorilla, and orangutan faces in colour and greyscale. The solid horizontal line indicates a main effect of colour on accuracy. The dashed line represents 33.33% chance level. Error bars represent the standard error of the mean

Condition 2: orientation effects

Figure 3 shows the mean percentage of correct choices for identical matching in upright and inverted orientation in greyscale. One-sample t tests found that accuracy was above chance level (33.33%) in both orientations: upright (t(4) = 8.91, p = 0.001) and inverted (t(4) = 8.84, p = 0.001). A paired-comparison t test found accuracy was higher for upright (M = 75% correct) than inverted (M = 66% correct) matching (t(4) = 3.06, p = 0.038). A paired-comparison t test found no difference in response times on correct trials between upright (M = 889 ms) and inverted (M = 882 ms) orientations.

Fig. 3
figure 3

Mean percentage of correct choices for identical matching of chimpanzee, gorilla, and orangutan faces in upright and inverted orientation in greyscale. The solid horizontal line indicates a significant difference in accuracy between orientation presentations. The dashed line represents 33.33% chance level. Error bars represent the standard error of the mean

Conditions 3, 4, and 5: familiarity and perceptual similarity effects

Figure 4 shows the mean percentage of correct choices for identical and categorical matching of baboon and capuchin monkey (condition 3), chimpanzee and human (condition 4), and gorilla and orangutan (condition 5) faces in greyscale. One-sample t tests found that accuracy was above chance level (50%) in all conditions: baboon and capuchin monkey (identical: M = 91% correct, t(4) = 20.67, p < 0.001; categorical: M = 82% correct, t(4) = 15.43, p < 0.001), chimpanzee and human (identical: M = 95% correct, t(4) = 23.79, p < 0.001; categorical: M = 87% correct, t(4) = 10.72, p < 0.001) and gorilla and orangutan (identical: M = 83% correct, t(4) = 14.03, p < 0.001; categorical: M = 58% correct, t(4) = 8.77, p = 0.001). A 3 × 2 ANOVA with stimulus type (baboon and capuchin monkey, chimpanzee and human, and gorilla and orangutan) and matching type (identical and categorical) as independent variables was conducted. A main effect of stimulus type was found, with higher accuracy for matching baboon and capuchin monkey (M = 87%) and chimpanzee and human (M = 91%) than gorilla and orangutan (M = 70%) faces (F(2,8) = 26.48, p < 0.001). A main effect of matching type was found, with higher accuracy for identical (M = 90%) than categorical (M = 76%) matching (F(1,4) = 78.30, p < 0.001). An interaction between stimulus type and matching type was found, with a sharp decline in accuracy for categorical gorilla and orangutan face matching (F(1,4) = 51.80, p < 0.001). For response times on correct trials, a 3 × 2 ANOVA was conducted. A main effect of stimulus type was found (F(1,4) = 4.60, p = 0.047), with longer response times for matching gorilla and orangutan (772 ms) than baboon and capuchin monkey (698 ms) or chimpanzee and human (668 ms) faces. A main effect of matching type was found, with faster response times for identical (669 ms) than categorical (756 ms) matching (F(1,4) = 48.93, p = 0.002). No interaction was found.

Fig. 4
figure 4

Mean percentage of correct choices for identical and categorical matching of baboon and capuchin monkey, chimpanzee and human, and gorilla and orangutan faces in greyscale. The solid horizontal line indicates a main effect of stimulus type on accuracy. The dashed line represents 50% chance level. Error bars represent the standard error of the mean

Image similarity analysis

Figure 5 shows the two-dimensional representation of the multidimensional scaling analysis of the image similarity data based on local feature matching. The GOF value was 0.207. The chimpanzee, gorilla, and orangutan faces clustered closely together, indicating greater image similarity, whereas the human, baboon, and capuchin monkey faces were spread further apart, indicating less similarity.

Fig. 5
figure 5

Two-dimensional representation of the multidimensional scaling analysis of the image similarity data based on local feature matching. Large circles represent average values and small circles represent individual values. Chimpanzee, gorilla, and orangutan faces consisted of 20 images each, and human, baboon, and capuchin monkey faces consisted of ten images each

Discussion

This study focused on the visual elements important for categorical discrimination between primate species faces. As predicted, discrimination performance for great ape faces was better in colour than in greyscale. This is consistent with the findings of Roberts and Mazmanian (1988) in humans and Marsh and MacDonald (2008) in orangutans, that pictures of animals and faces are easier to discriminate in colour than greyscale. An inversion effect for discriminating great ape faces was found, with lower accuracy for discriminating inverted faces, suggesting the stimuli were perceived as faces and processed configurally. Previous MTS and visual search studies with our chimpanzees (Tomonaga 1999, 2007) have found inversion effects for images of faces but not houses, chairs, or hands as control stimuli, suggesting our results are also face-specific.

Interestingly, although discrimination performance was high for both unfamiliar (baboon and capuchin monkey) and highly familiar (chimpanzee and human) perceptually different species faces, it was not significantly better for highly familiar species faces as predicted. Two studies in chimpanzees have examined the influence of familiarity on discrimination of individual human and chimpanzee faces using the MTS task (Dahl et al. 2013b; Martin-Malivel and Okada 2007). Both studies found discrimination accuracy was highest for species to which the chimpanzees had the most real-life previous exposure. Given that previous experience enhances discrimination of individual faces within species, why did previous experience appear to have no influence on discrimination between species at the categorical level in the current study?

One explanation may be that, from an evolutionary perspective, discrimination of individuals within one’s own species is especially important for the social lives of primates. In chimpanzees, higher-order features (a combination of first- and second-order features) influence individual face discrimination within species (Dahl et al. 2013b; Martin-Malivel and Okada 2007). However, discrimination between individuals of other species, or between species at the categorical level, is less socially relevant and likely involves fewer processing resources allocated to recognition (Pascalis and Wirth 2011). Studies examining the other-race effect have found humans are better at recognizing faces of individuals within their own race and better at categorizing faces of other races (Ge et al. 2009; Levin 1996, 2000). This may be due to greater processing resources being allocated to individuating information in own-race faces, with which we are experienced, and to categorical information in other-race faces, with which we are less experienced (Ge et al. 2009). Similarly, when discriminating between categories of primate species faces, rather than within own-species faces, our chimpanzees may have paid more attention to information important for categorization than for recognition.

Another explanation is the perceived similarity between species faces. We chose primate faces representing a range of phylogenetic groups, including humans, great apes, old world monkeys, and new world monkeys, to examine the potential relationship between phylogenetic distance and discrimination ability. Performance for categorical discrimination of unfamiliar, perceptually similar and phylogenetically closer faces (gorilla and orangutan) was significantly worse than for unfamiliar, perceptually different and phylogenetically more distant faces (baboon and capuchin monkey). In support of this finding, Campbell et al. (1997) found humans were better able to discriminate between physically morphed faces of monkeys and cows, and of humans and cows, than between those of humans and monkeys. Categorical perception was sharpest for monkey and cow and for human and cow faces, suggesting the phylogenetically and morphologically closer human and monkey faces were perceived as more similar to each other. In our chimpanzees, Tanaka (2003, 2007) did not find a relationship between visual preference for primate species and phylogenetic distance from chimpanzees, although the chimpanzees were not required to discriminate directly between species.

After excluding effects of colour and familiarity, we propose that difficulty in discriminating between primate species faces is best explained by their perceived similarity to each other. This conclusion is supported both by the behavioural data (categorical discrimination performance for perceptually similar primate faces was significantly worse than for perceptually different faces) and by the multidimensional scaling analysis of the stimuli (greater local feature similarity between chimpanzee, gorilla and orangutan faces than between human, baboon and capuchin monkey faces). Greater perceptual similarity between great ape faces likely explains the relatively poor discrimination performance in the absence of colour cues, and greater perceptual difference explains the high discrimination performance for baboon and capuchin monkey, and chimpanzee and human faces. Although we do not exclude the possibility that previous experience has some influence on categorical discrimination between species faces, information about perceptual similarity appears to be prioritized. This could have masked the potential influence of previous experience or of a conceptual representation of species (cf. Martin-Malivel et al. 2006).

In this study, the experimenters made subjective judgments about the relative perceptual similarity of primate species faces, which were subsequently supported by the objective multidimensional scaling analysis of the stimuli. However, a useful extension of this study would be to test whether our findings generalize across primate species, by testing other species under the same experimental conditions as our chimpanzees. If the same pattern of responses is found, we could conclude more concretely that non-human primates perceive similarity in different species faces as humans do. In addition, although the multidimensional scaling analysis revealed the relative similarities between faces based on local feature matching, it does not tell us which specific features (e.g., external face contours or internal features such as the eyes, nose, and mouth) may be responsible for discrimination between species. Although the inversion effect we observed provides evidence of configural processing, further analysis of part-based processing, by extracting curvature information about the eyes, nose, and mouth image surfaces (e.g., Dahl et al. 2014a, b) or by systematically removing different facial features (e.g., Parr et al. 2000; Quinn and Eimas 1996; Tomonaga and Imura 2015), may help to reveal the features most important for between-species discrimination. Finally, we acknowledge some general limitations of our study, including the small sample size, the limited number of stimuli and species used for each phylogenetic group, and the fact that sex differences could not be examined because all the chimpanzees were female.

We conclude that our chimpanzees appear to perceive similarity in primate faces in a similar way to humans. Information about perceptual similarity is likely prioritized over the potential influence of previous experience or a conceptual representation of species for categorical discrimination between species faces.