Introduction

Accurate recognition of familiar faces is critical for normal social interaction. Faces provide powerful visual cues to identity, and we can usually determine from a single glance whether or not we know a person. The cognitive mechanisms and neural substrates of face recognition have been the subject of intense study by neuroscientists using a variety of methods, including functional imaging, surface and intracranial electrical recordings, transcranial magnetic stimulation (TMS), and lesion-deficit correlations in neurological patients. In this review, we focus on recent advances in our understanding of the neural circuitry that underpins our remarkable ability to use facial appearance to identify the vast number of individuals we encounter in everyday life. We begin by presenting a cognitive model of face recognition, followed by an overview of what functional imaging studies in normal subjects and intracranial recordings in patients with epilepsy have revealed about the neural systems involved. We next discuss the salient clinical features and lesion correlates of distinct neuropsychological disorders characterized by defective processing of facial identity. We conclude by using converging evidence from these different sources to propose a functional neuroanatomical model of face recognition.

Cognitive Model of Familiar Face Recognition

From the cognitive perspective, recognizing familiar people by their facial appearance entails both modality-specific visual operations and the retrieval of multimodal identity-specific knowledge about unique individuals from long-term memory [1]. As shown in Fig. 1a, the recognition process begins with a visual analysis stage that leads to the construction of a detailed perceptual representation of the face encountered. Because faces constitute a visually homogeneous category with high levels of structural similarity among exemplars, successful discrimination at the individual level requires fine-grained holistic/configural processing that integrates multiple face parts into a unified perceptual representation sensitive to subtle differences in the spatial relationships among component features [2]. The next stage in the recognition process involves activating stored visual memory representations of familiar faces, which gives rise to a feeling of familiarity proportionate to the degree of overlap between the current face percept and the memory trace (Fig. 1a). Under normal circumstances, the activation of face memory representations is followed by the retrieval of multimodal identity-specific information about the familiar person: a diverse collection of relevant biographic/semantic facts (e.g., occupation, name, and personality traits), autobiographical/episodic details (e.g., memories of specific personal encounters), and our emotional response reflecting the personal significance of the individual [3,4,5] (Fig. 1). The amount, type, and quality of the information retrieved in response to the face depend on our level of familiarity with the person and on whether the individual is known to us personally or only through media exposure. Personally familiar people are associated with a rich set of autobiographical memories imbued with emotion, whereas memory representations of famous persons and celebrities are dominated by biographic/semantic information. Because in the clinical setting familiar face recognition is typically assessed with photographs of famous faces, we will focus here on the retrieval of person-specific semantic knowledge.

Fig. 1

a Cognitive model of face recognition. Modality-specific visual components of the recognition process are shown in red; nonvisual components involved in multimodal person semantic knowledge retrieval, emotion processing, and executive control are shown in blue. b The distributed neural network for face-identity processing. Core network components are shown in red (OFA, FFA, ATFA) and extended network components in blue (ATL, the amygdala, PFC). See text for details

The cognitive model depicted in Fig. 1 also includes a central executive component that exerts top-down control over the operations of the face recognition system. Executive control processes include the monitoring and verification of the information retrieved from memory in response to the face cue, setting appropriate criteria for recognition decisions, and initiating strategic memory search processes when the identity of the person remains in doubt. Executive control operations are not usually required for the recognition of highly familiar faces because for these individuals identity-specific information is readily available in an automatic or bottom-up fashion. However, executive processes play a more significant role under conditions of uncertainty when the face cue does not directly elicit relevant person-specific information, leaving the source of facial familiarity underspecified or ambiguous [6, 7]. This typically occurs when trying to identify people we encounter less frequently, as our stored knowledge of these individuals is less detailed and not as easily accessible.
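To make the control flow of this verbal model concrete, the schematic Python sketch below expresses the bottom-up recognition stages and the executive check engaged under uncertainty. It is purely illustrative: the function names, data structures, and the familiarity criterion are hypothetical placeholders of our own and are not part of the published model.

from typing import Optional

def visual_analysis(face_image):
    """Stage 1: construct a holistic/configural percept of the face (placeholder)."""
    return {"percept": face_image}

def match_face_memory(percept, face_memory_store):
    """Stage 2: graded familiarity signal proportional to the overlap between
    the current percept and stored face memory representations."""
    best_record, familiarity = None, 0.0
    for record in face_memory_store:
        overlap = record["overlap_with"](percept)  # hypothetical similarity measure
        if overlap > familiarity:
            best_record, familiarity = record, overlap
    return best_record, familiarity

def strategic_memory_search(percept, familiarity) -> Optional[dict]:
    """Executive control: effortful retrieval, monitoring, and verification
    engaged when the source of familiarity is underspecified (placeholder)."""
    return None  # identity remains unresolved in this sketch

def recognize(face_image, face_memory_store, familiarity_criterion=0.8):
    """Recognition proceeds automatically for highly familiar faces; otherwise
    the executive system is engaged to resolve the identity."""
    percept = visual_analysis(face_image)
    record, familiarity = match_face_memory(percept, face_memory_store)
    if record is not None and familiarity >= familiarity_criterion:
        # Stage 3: automatic retrieval of multimodal identity-specific knowledge
        # (biographic/semantic facts, episodic details, emotional response).
        return record["person_knowledge"]
    return strategic_memory_search(percept, familiarity)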

Neuroimaging and Intracranial Recording Studies of Face Recognition

Neuroimaging studies in normal subjects have provided compelling evidence that, rather than being localized to a single specialized cortical region, face recognition is mediated by a distributed neural network [3,4,5, 8,9,10, 11••, 12••, 13, 14•]. It has been proposed that the face recognition network can be subdivided into a core system dedicated to modality-specific visual processing of faces and an extended system involved in retrieving stored multimodal identity-specific information from memory and in generating emotional responses to the faces of familiar individuals [3,4,5] (Fig. 1b). Core network components relevant to facial identity recognition include three spatially distinct but anatomically [15, 16] and functionally [17, 18] interconnected visual areas located along the posterior-anterior axis of ventral occipito-temporal cortex (VOTC): the occipital face area (OFA), fusiform face area (FFA), and the anterior temporal face area (ATFA) [5, 8,9,10, 11••, 12••, 13, 14•, 19,20,21,22] (Fig. 1b). All three cortical regions demonstrate face selectivity by generating stronger responses to faces than to other visual object categories. Although face-selective cortical patches can be identified in both hemispheres, the right-sided activations are more robust and reliable, consistent with the dominant role of the right hemisphere in face recognition [2, 5, 9, 10]. In addition to face selectivity, all three VOTC regions show sensitivity to facial identity manifested by distinct neural responses to the faces of different individuals [2, 9, 12••, 23,24,25]. It has been suggested that the processing of information across VOTC face-selective areas is hierarchically organized along the posterior-anterior neuroanatomical axis [10, 11••, 12••, 19, 20]. Specifically, the OFA is considered the entry point of the face recognition network, primarily involved in the visual analysis of individual facial features, while the FFA and ATFA are responsible for constructing more complex holistic/configural representations optimal for the selective coding of facial identity [10, 11••, 12••, 19, 20]. Furthermore, whereas perceptual descriptions of faces generated in the OFA are view-specific, the face representations computed by the FFA are mirror-symmetric, and full view-independence is reached in the ATFA, which extracts and encodes invariant facial attributes that allow recognition despite changes in head orientation or expression [10, 11••, 26]. Thus, visual representations of faces become increasingly complex, abstract or view-independent, and identity-sensitive along the posterior-anterior axis of the core face recognition network linking OFA, FFA, and ATFA (Fig. 1b). It has also been shown that the response properties of anterior face-selective regions, including FFA and especially ATFA, are modulated by prior experience with a particular face, with stronger activations to familiar compared with unfamiliar faces [20, 22, 24, 25]. These findings suggest that the relative contribution of distinct core network components to face perception and memory also follows a posterior-anterior functional gradient within VOTC (Fig. 1b). According to this view, OFA and FFA are primarily involved in implementing perceptual operations, while ATFA may contain visual memory representations of familiar faces and thus serve as the critical neural interface for linking these records with multimodal person-specific information represented within the extended network for face recognition [9, 10, 11••, 20,21,22, 26].
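The progression from view-specific to mirror-symmetric to fully view-independent coding can be pictured with the toy Python sketch below. The equivalence classes assigned to each region are schematic assumptions of ours, not measured tuning properties; the sketch only illustrates how the number of distinct codes for one identity shrinks along the posterior-anterior axis.

def ofa_code(identity, view_angle):
    # View-specific: every identity/view combination gets its own code.
    return (identity, view_angle)

def ffa_code(identity, view_angle):
    # Mirror-symmetric: views rotated +30 and -30 degrees map to the same code.
    return (identity, abs(view_angle))

def atfa_code(identity, view_angle):
    # View-independent: all views of one person collapse onto an identity code.
    return (identity,)

views = [-30, 0, 30]
for region, code in [("OFA", ofa_code), ("FFA", ffa_code), ("ATFA", atfa_code)]:
    distinct = {code("person_A", v) for v in views}
    print(region, "distinct codes for one identity across views:", len(distinct))
# OFA -> 3 (view-specific), FFA -> 2 (mirror views merged), ATFA -> 1 (invariant)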

Key components of the extended face recognition network include anterior temporal lobe (ATL) regions (polar, ventrolateral, and medial structures including the hippocampus and perirhinal/entorhinal cortex) implicated in semantic and episodic memory retrieval. Consistent with the proposed role of ATL in the storage of multimodal person-specific knowledge, the faces, voices, and names of unique familiar individuals produce overlapping neural activations in these regions [27, 28, 29•, 30]. The ATL is also recruited when subjects are attempting to learn new associations between faces, voices, names, and other personal biographic/semantic information [31,32,33,34,35,36, 37•]. Collectively, these findings suggest that ATL serves as a multimodal hub of person knowledge, integrating identity-specific visual (face), auditory (voice), and verbal (name) information processed in specialized modality-specific cortical areas with other unique semantic and episodic details about familiar people [38, 39].
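The hub idea can be pictured, very schematically, as modality-specific cues that all index the same person-level record. The toy sketch below is an illustration of this convergence under assumptions of our own (all identifiers and the record structure are hypothetical), not a model drawn from the cited studies.

# Toy illustration of ATL as a multimodal "person hub": face, voice, and name
# cues processed in separate modality-specific areas converge on one record.
person_hub = {
    "person_001": {
        "semantic_facts": ["actor", "starred in several well-known films"],
        "episodic_details": [],
        "emotional_significance": 0.2,
    },
}

modality_index = {
    ("face", "face_code_001"): "person_001",    # output of the core VOTC face network
    ("voice", "voice_code_001"): "person_001",  # voice-selective temporal cortex
    ("name", "name_code_001"): "person_001",    # lexical/verbal representation
}

def retrieve_person(modality, cue_code):
    """Any modality-specific cue retrieves the same multimodal person record."""
    person_id = modality_index.get((modality, cue_code))
    return person_hub.get(person_id)

# Face, voice, and name cues converge on the identical person record:
assert retrieve_person("face", "face_code_001") is retrieve_person("name", "name_code_001")

On this picture, degrading the shared record, rather than any single cue pathway, would impair recognition from faces, voices, and names alike, which anticipates the multimodal person recognition disorders discussed below.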

As shown in Fig. 1b, the extended network also includes the amygdala and other limbic structures (insula, ventral striatum, cingulate, orbitofrontal cortex) involved in assessing the emotional significance and personal relevance of the face. Finally, the activation of ventrolateral prefrontal cortex (PFC) during familiar face processing has been linked to the engagement of top-down monitoring and executive control functions over the operations of the temporal lobe face perception and memory networks [6, 7] (Fig. 1b). Consistent with this view, neural activity within VOTC face-selective areas has been shown to be modulated by top-down signals originating in PFC [40, 41].

Complementing the results of neuroimaging studies, intracranial recordings in epileptic patients have confirmed the distributed nature of the face identity network and the dominant role of the right hemisphere in face recognition [42••, 43]. These studies have also demonstrated that, in contrast to the modality-specific visual responses elicited from core network face-selective areas, neurons in the ATL component of the extended network respond to multimodal person-specific information (face, voice, and name) and show sensitivity to the familiarity and personal relevance of the individual [44,45,46,47]. Face-selective neural responses have been recorded from other extended network nodes, including the amygdala and PFC, which are implicated in emotion processing and in the top-down executive control of face recognition [48, 49]. Intracranial recordings have also provided important information about the temporal dynamics of neural activation within the face recognition network. In particular, the neurophysiological evidence seems consistent with both feedforward and feedback neural signaling, suggesting that face recognition is the emergent property of parallel bottom-up and top-down interactions between core and extended network nodes [50]. Finally, intracranial recordings have confirmed the causal role of distinct network components in face recognition by demonstrating that direct electrical stimulation of these cortical regions interferes with face perception and memory [51,52,53]. Temporary disruption of face recognition has also been documented by targeting the cortical components of the network using the virtual lesion method of TMS [54].

Neuropsychological Disorders of Face Recognition

At the behavioral level, face recognition impairments in neurological patients can manifest either as a failure to identify familiar faces or as false recognition/misidentification of unfamiliar faces [7, 55]. These qualitatively different disorders of face identity processing are associated with damage to distinct functional components of the core and extended network for face recognition.

Prosopagnosia

The central clinical feature of prosopagnosia is the striking inability to recognize previously familiar faces. In severe cases, patients may fail to recognize not only public figures or celebrities but also the faces of close personal acquaintances and family members. Prosopagnosics have no difficulty in distinguishing faces from other visual object categories, but they cannot discriminate the faces of different individuals and complain that all faces look unfamiliar. Due to the profound impairment in perceiving and remembering the unique visual attributes of different faces, individuals with prosopagnosia also show prominent deficits in learning new faces. Importantly, prosopagnosics can recognize familiar people from their voices or names, providing evidence that stored semantic knowledge of these individuals is preserved and can be accessed from nonvisual person-specific cues. Therefore, prosopagnosia represents a modality-specific recognition disorder in which the visual appearance of the face no longer serves as a reliable guide to personal identity [2, 7, 9, 55,56,57].

Prosopagnosia is sometimes subdivided into apperceptive and associative subtypes based on whether the recognition impairment is primarily attributable to abnormal face perception or memory [2, 7, 9, 55,56,57,58,59,60,61]. In apperceptive prosopagnosia, damage to the visual analysis stage (Fig. 1) prevents the construction of a detailed holistic/configural structural description of the face suitable for selectively activating stored memory representations for familiar faces. Due to their general face perception deficit, patients with apperceptive prosopagnosia also have difficulty discriminating unfamiliar faces. In contrast, in associative prosopagnosia the perception of unfamiliar faces is relatively preserved, but recognition cannot take place because memory representations of familiar faces are degraded or inaccessible, or because their activation fails to trigger the retrieval of identity-specific person knowledge (Fig. 1). Loss of face memory representations in associative prosopagnosia can lead to defective mental imagery for familiar faces in addition to recognition failure from visual input [58, 62]. It should be noted, however, that some investigators have questioned the existence of distinct clinical subtypes and argued that defective processing of fine-grained holistic/configural information critical for individuating faces would undermine both face perception and memory and thus constitute a common neuropsychological mechanism underlying all cases of prosopagnosia [2, 59, 60]. Consistent with the proposed breakdown of holistic/configural processing, prosopagnosics typically demonstrate abnormal reliance on a slow, feature-based, or “piecemeal” strategy in face recognition tasks [2, 59, 60].

Although prosopagnosics cannot recognize the individual identity of faces, they may retain the ability to categorize faces based on emotional expression, gender, age, race, and other social attributes such as attractiveness or trustworthiness [7, 56]. The behavioral dissociation between severely impaired face individuation and relatively preserved categorization suggests that category-level face recognition processes do not require access to fine-grained holistic/configural representations and may be accomplished on the basis of coarse-grained visual information or by attending to single facial features considered diagnostic of category membership [7, 56]. Surprisingly, some individuals with prosopagnosia also demonstrate intact implicit or covert processing of facial familiarity and identity despite their poor performance on explicit or overt face recognition tasks [55,56,57,58, 63, 64]. For instance, they may perform significantly better than chance in forced-choice judgments of familiarity when presented with pairs of famous and unknown faces or display an advantage in learning to associate famous faces with the correct vs. the incorrect names [55, 63, 64]. Individuals with prosopagnosia have also been shown to generate discriminatory skin conductance responses (SCRs) when exposed to the faces of familiar individuals they could not overtly recognize [65]. In cases with spared covert recognition, perceptual processing of faces, activation of face memory representations, and access to person-specific semantic knowledge can apparently still take place in a relatively normal fashion. However, overt recognition is prevented because the reduced output of the damaged face recognition system is not sufficient to give rise to a conscious experience of remembering [55, 56]. Unrecognized faces may still trigger an appropriate covert emotional response, as demonstrated by preserved autonomic SCRs to familiar faces.

In terms of neural substrates, prosopagnosia is associated with damage to the core visual components of the face recognition network, including OFA, FFA, and ATFA [2, 7, 9, 26, 55,56,57,58,59,60,61]. Prosopagnosia can also result from damage to the white matter pathways that connect these face-selective VOTC regions with one another and with the other nonvisual components of the extended face recognition system [66, 67]. The usual lesion etiology is stroke in the distribution of the posterior cerebral artery, although cases associated with neurodegenerative disease, surgical excision, or trauma have also been described [2, 7, 9, 26, 55,56,57,58,59,60,61, 67,68,69,70]. Although initial lesion-deficit correlation studies concluded that prosopagnosia required bilateral damage, it is now well established that unilateral right VOTC lesions are both necessary and sufficient to produce the syndrome, confirming the critical contribution of the right hemisphere to face recognition. It has been proposed that posterior VOTC lesions centered on OFA/FFA give rise to apperceptive prosopagnosia whereas anterior lesions primarily involving ATFA result in associative prosopagnosia [7, 9, 26, 55,56,57,58,59,60,61]. The alternative view is that the integrity of the entire core network is necessary for the fine-grained holistic/configural perception of individual facial identity and therefore damage to the posterior vs. anterior components of the network results in qualitatively similar forms of prosopagnosia [2, 59, 60]. Interestingly, functional imaging studies in patients with prosopagnosia have demonstrated face-selective, and in some cases identity-sensitive, neural responses in anatomically spared components of the face recognition network [2, 26, 71,72,73]. The dense behavioral deficit of these patients on face identification tasks despite relatively preserved neural activity in surviving individual network nodes suggests that the structural/functional integrity of the entire face recognition network is necessary for normal performance. Furthermore, neural responses to faces in patients with prosopagnosia have sometimes been observed in the anterior components of the core network following damage to more posterior nodes that were considered to be the source of visual input to these regions [2, 26, 71,72,73]. These findings seem inconsistent with the notion of strict hierarchical processing and suggest the existence of multiple parallel visual pathways from the occipital cortex to face-selective VOTC areas [2, 10, 26, 71,72,73].

Regarding the neural correlates of spared face processing abilities in prosopagnosia, it is possible that although no longer able to support the recognition of individual identity, the damaged core network can still decode categorical facial information about expression, gender, age, race, and various other social attributes [7, 56]. Alternatively, category-level face processing may not normally require the integrity of the right VOTC face identity network implicated in prosopagnosia and could be mediated by other right hemisphere visual pathways or by the intact left hemisphere face recognition system. For instance, functional imaging studies have shown that facial expression recognition, which depends on the processing of rapidly changing or dynamic facial attributes (as opposed to the static or invariant representations that support identity recognition), recruits a dorsal visual pathway projecting to the superior temporal sulcus (STS) [3, 5, 10, 11••, 12••, 74]. To account for the phenomenon of covert face recognition, it can be assumed that although individual components of the damaged system may retain some residual sensitivity to facial identity, as shown by functional imaging studies of patients with prosopagnosia, the degraded neural signals are not robust enough to activate the entire face recognition network. Abnormal propagation of neural activity results in a failure to achieve long-distance cortico-cortical synchronization within the network and also prevents the transmission of information to the frontoparietal attention and working memory systems required for conscious awareness [56]. However, reduced neural activity within the damaged network may be sufficient to support covert face recognition [26, 55,56,57,58, 63, 64, 75, 76], and preserved transmission of face identity information to the amygdala and other limbic structures may mediate spared autonomic SCRs to the faces of emotionally significant individuals [56, 65] (Fig. 1b).

Person Recognition Disorders

An inability to recognize familiar people by their faces is also a prominent clinical feature of person recognition disorders [7, 55, 77,78,79, 80•]. However, in contrast to prosopagnosia, patients with person recognition disorders demonstrate similar difficulties when trying to identify familiar people from nonvisual person-specific cues such as voice or name, or when presented with unique biographic details about the individual (e.g., “he was the first African-American president of the USA”). The multimodal nature of the recognition impairment suggests that the underlying neuropsychological mechanism is likely to involve the degradation or loss of person semantic knowledge (Fig. 1). Consistent with a central deficit of semantic integration, perceptual processing of holistic/configural facial information in modality-specific core visual areas can be preserved [81].

Person recognition disorders are associated with neurological conditions that produce damage to the ATL, including semantic dementia/frontotemporal dementia (FTD) [82, 83, 84••, 85, 86, 87•, 88], Alzheimer’s disease [87•, 88, 89], and temporal lobe epilepsy/lobectomy [90,91,92]. These findings underscore the critical contribution of ATL to the encoding, storage, and retrieval of multimodal semantic information about unique individuals. Importantly, person recognition disorders can be associated with both left and right ATL damage, providing empirical evidence for the bilateral representation of familiar person knowledge [7, 55, 77,78,79, 80•, 81, 82, 83, 84••, 85, 86, 87•, 88, 89, 90,91,92,93]. However, the clinical manifestations of the person recognition impairment can vary as a function of lesion laterality [77, 79, 80•, 82, 83]. Specifically, right ATL lesions are associated with greater impairments in recognizing familiar people from their faces and voices, whereas following left ATL damage the recognition deficit is relatively more severe from name cues or when a naming response is required [77, 79, 80•]. Furthermore, patients with left ATL damage often retain a sense of familiarity with the face even when verbal identity-specific semantic information or the name of the person cannot be retrieved, whereas right ATL lesions are associated with a loss of facial familiarity [77, 79, 80•, 84••, 93]. One interpretation of these hemispheric asymmetries is that conceptual knowledge of familiar people in the left ATL is mostly verbal or language-based whereas in the right ATL it is represented in a nonverbal sensory-based format [77, 79, 80•]. Alternatively, it is possible that both ATLs contain multimodal representations of familiar people and the observed laterality effects reflect stronger input/output connectivity with language areas in the left hemisphere and with modality-specific regions involved in processing nonverbal (visual/auditory) person-identity information in the right hemisphere. Regardless of whether the different behavioral profiles of patients with left vs. right ATL damage are attributable to interhemispheric differences in representing, accessing, or expressing person semantic knowledge, the neuropsychological evidence indicates that intact recognition of familiar people from multiple identity-specific sensory (face, voice, and name) cues requires the structural/functional integrity of both ATLs.

False Recognition/Misidentification of Unfamiliar Faces

In contrast to the impaired recognition of familiar faces that constitutes the behavioral hallmark of prosopagnosia and person recognition disorders, some neurological patients present with memory distortions and falsely claim that novel faces are familiar, at times mistaking unfamiliar individuals for famous people or personal acquaintances [6, 7, 55, 94, 95]. The behavioral double dissociation between defective processing of familiar faces and false recognition/misidentification of unfamiliar faces is also reflected by differences in lesion profiles. In particular, whereas prosopagnosia and person recognition disorders are associated with right VOTC or bilateral ATL lesions, the most striking cases of false facial recognition have been observed following damage to right PFC [6, 7, 55, 94, 95].

Face memory illusions following right PFC lesions cannot be attributed to face perception impairment or defective memory for familiar faces. Instead, false recognition results from abnormal reliance on category-level face memory representations that cannot support the recognition of individual identity. For instance, frontal patients may incorrectly claim that a novel face is familiar or famous based on a strong categorical resemblance to facial prototypes that we associate with celebrity status (e.g., “actress type”). Activation of facial prototypes automatically triggers the retrieval of general semantic knowledge corresponding to the appropriate social stereotype. However, the general sense of facial familiarity can be overridden by engaging in a strategic memory search to retrieve identity-specific semantic details about the individual and using the presence or absence of this information as the appropriate criterion for making face memory decisions [6, 7, 55, 94, 95]. Frontal patients fail to engage in these effortful memory retrieval and monitoring operations and are likely to respond on the basis of general facial familiarity signals, thereby mistaking “looking famous” for “being famous.”

In terms of our cognitive model (Fig. 1), false recognition/misidentification of unfamiliar faces in patients with PFC lesions can be explained by postulating damage to the executive system. In particular, the neuropsychological evidence suggests that frontal lobe structures play an important role in face recognition by implementing strategic memory retrieval, monitoring, and decision functions critical for attributing facial familiarity to a specific context or source [6, 7, 55, 94, 95]. Under conditions of uncertainty when the face cue does not automatically elicit identity-specific information, effortful and strategic recollection of unique personal biographic/semantic details by the frontal executive system provides the principal mechanism for suppressing false recognition attributable to the misleading influence of general familiarity associated with the activation of category-level memory representations (facial prototypes and social stereotypes). Memory distortions in patients with frontal lobe lesions underscore the fact that face recognition is a dynamic process that requires reciprocal bottom-up and top-down interactions and functional integration between temporal lobe face perception and memory networks and frontal executive control systems [6, 7].
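As a purely illustrative sketch of this reasoning (the decision rule, criterion value, and variable names are assumptions of ours, not a published model), the contrast between familiarity-driven responding and executive verification might be expressed as follows:

def judge_famous(familiarity_signal, retrieved_identity_details,
                 executive_intact=True, familiarity_criterion=0.5):
    """Return True if the face is judged to be famous/known."""
    if not executive_intact:
        # Without strategic retrieval and monitoring, the decision rests on the
        # general familiarity signal alone ("looking famous" == "being famous").
        return familiarity_signal > familiarity_criterion

    # Intact executive control: verify that identity-specific semantic details
    # (e.g., name, occupation) can actually be recollected before accepting.
    return familiarity_signal > familiarity_criterion and bool(retrieved_identity_details)

# A novel face that merely resembles an "actress type" prototype:
print(judge_famous(0.7, [], executive_intact=True))   # False: familiarity alone is rejected
print(judge_famous(0.7, [], executive_intact=False))  # True: false recognition

With the executive check bypassed, a strong categorical resemblance to a facial prototype is enough to yield a positive judgment, reproducing the mistaking of “looking famous” for “being famous” described above.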

Conclusions

Functional imaging in normal subjects, intracranial recordings in epileptic patients, and lesion-deficit correlation studies in patients with neurological disorders have made critical contributions to our understanding of the cognitive mechanisms and neural substrates of face recognition. Convergent and complementary findings from these different lines of investigation have provided conclusive evidence that face recognition is mediated by a large-scale neural network comprising core face-selective visual areas in VOTC and nonvisual extended network components that include ATL, the amygdala, and PFC. The core and extended components of the network have distinct functional roles in face identity processing, and damage to these regions results in qualitatively different types of face recognition impairments, manifested either by a failure to recognize familiar faces or by false recognition/misidentification of unfamiliar faces. In particular, damage to the core visual components of the network (OFA, FFA, ATFA) gives rise to prosopagnosia; multimodal semantic impairment following damage to ATL is associated with person recognition disorders; and executive dysfunction is the primary abnormality contributing to false facial recognition/misidentification of unfamiliar faces in patients with PFC lesions.