1 Introduction

It seems natural to argue that pictures foster in us some kind of emotional involvement: we feel some relief when we look at a picture of someone we miss and are not able to meet in real life; sometimes, we would rather avoid looking at a picture of our partner after breaking up with him/her, in order to avoid sadness; we often feel surprised when we look at family photos showing how young our children were many years ago; and we often look at pictures just to have a laugh during boring moments. All these examples suggest that picture perception is sometimes infused with emotional charge. That is, the visual perceptual state we are in during picture perception may be accompanied by emotional charge. I will call this aspect of picture perception the emotional charge of picture perception (henceforth: ECPP).

While philosophy has tried to meticulously explain the peculiar visual features of picture perception (Nanay 2010, 2011, 2015; Wollheim 1980, 1987, 1998; Lopes 2005; Matthen 2005; Hecht et al. 2003), the emotional component is, surprisingly, missing from most of the accounts we currently have, with a few exceptions (Abell and Bantinaki 2010; Lopes 2005: Ch. 2; Nanay 2013). In other words, almost nobody is engaged in the important task of explaining the ECPP, which remains a largely unexplored phenomenon. Even our best model of picture perception, the dorsal/ventral account of picture perception (Nanay 2010, 2011, 2015; henceforth: DVAPP), which integrates the most important empirical results coming from our best model of vision in neuroscience, i.e., the two visual systems model (henceforth: TVSM), lacks a reference to the ECPP.

The aim of the present paper is to offer an account of picture perception that is consistent with and able to explain this neglected aspect of picture perception. The claim defended in this paper is that, as in face-to-face perception, during picture perception we are not only in a visual perceptual state, but also in an emotional state that is directly connected to that visual perceptual state. Crucially, it will be shown that it is possible to defend this claim while remaining consistent with the philosophical/empirical framework of the DVAPP, whose explanatory power is maintained, confirmed and improved.

Note that the DVAPP is an uncontroversial theory in the literature, both philosophically and psychologically (Nanay 2011, 2015; Ferretti 2016a, c). It is a sound philosophical theory because it manages to investigate the nature of the perceptual state we are in when we are looking at a picture (see Nanay 2015: Sect. 4 for a review), which is a crucial philosophical question in picture perception (see §1.2). It is moreover a sound psychological theory because it carries out this investigation by carefully looking at the neurophysiological underpinnings of such a perceptual state (see Nanay 2015: Sect. 3 for a review). This paper follows the spirit of the DVAPP in investigating a crucial philosophical question about picture perception, while at the same time suggesting something new: namely, that we are also in an emotional state when we perceive a depicted object and that this emotional state is linked to our visual state. The same methodology pursued by the DVAPP will be followed: there will be a close look at the evidence concerning the neurophysiological underpinnings of such states. Thus, it will be suggested that different sets of evidence from the psychological and neuroscientific literature on emotions are compatible with the DVAPP. It is precisely thanks to this consistency between the two proposals that the ECPP hypothesis is plausible.

The reader should note that the proposal offered here does not entail that whenever we look at a picture that represents a sad situation, we automatically feel sad (Footnote 1). What is suggested is a possible explanation of why we might entertain the phenomenon related to the ECPP, while remaining in the framework proposed by the DVAPP.

1.1 A Brief Digression on Pictures and Emotions

This paper starts from the usual and most important question in the philosophical debate about picture perception, the one that the DVAPP aims to answer:

  • (Q1) What perceptual state are we in when we see an object in a picture?

It should be pointed out that this paper does not want to enter the debate about whether emotional representations are a form of perceptual representation. The claim can be reformulated in both perceptual and non-perceptual views. If emotions are taken to be representational perceptual states (Döring 2007; Roberts 2003; Prinz 2004; see Nanay 2013: 157 for empirical evidence), then, when answering (Q1), the DVAPP seems to leave aside an important aspect of our perceptual experience of pictures: affective perceptions or emotions. However, if emotions are taken to be representational non-perceptual states (Solomon 2003), (Q1) can be reformulated thus:

  • (Q1a) What representational state are we in when we see an object in a picture?

The fact that emotions are representational states is widely agreed (Barlassina and Newen 2013; de Sousa 2014) (Footnote 2). In this reformulation, a new important fact is highlighted: among the representational states we can be in when looking at pictures (a more general claim with respect to the original formulation of Q1), there are emotional representational states, which is something the DVAPP and all other accounts of picture perception are silent about. The presence of this emotional representational state during picture perception gives rise to the ECPP. This is possible because, as will be suggested, the visual state we are in during picture perception is deeply linked to an emotional state.

It must be noted that the account offered here is not committed to a particular idea on the nature of emotions. It is assumed, as it is in most of the literature, that emotional states are representational states, that they refer to something, that they have content, that they attribute salient emotional properties to objects and that they depend on neurophysiological states (Barlassina and Newen 2013; Prinz 2004; Roberts 2003; Döring 2007; de Sousa 2014; Nanay 2013). The usual loci where we can detect emotional activity in the brain are the insular pathway (Craig 2009), the somatosensory pathway (Khalsa et al. 2009), the orbital sector of the prefrontal cortex - that is, the orbito-frontal cortex - the amygdala and the ventral striatum (Barrett and Bar 2009).

The emotional encoding of the objects we visually perceive is due to the fact that “sensory information from the world is projected rapidly from the back to the front of the brain after an image is presented to the visual system” (Barrett and Bar 2009: 1329). This propagation depends on the anatomo-functional projections from the occipital cortex to the prefrontal cortex. These projections are crucial for visuo-affective encoding. A crucial cortical area involved in this propagation and in the emotional encoding of visual information is the orbito-frontal cortex (Barrett and Bar 2009; see also Duncan and Barrett 2007; Pessoa 2008). However, all the different cortical areas mentioned above are deeply interconnected and are equally important, even though for different reasons, for emotional encoding (Barrett and Bar 2009: 1326). In order to explain the ECPP, this paper will discuss different sets of evidence concerning the activity of the orbito-frontal cortex and of other cortical areas involved in emotional encoding during picture perception.

So, whereas the DVAPP aims to understand what (visual) perceptual state we are in during picture perception, the claim defended here is that, during picture perception, we are not only in a visual perceptual state, but also in an emotional state, which is directly connected to the visual perceptual state. This conclusion will follow from the evidence that the activation of the cortical states of the visual system, which are the basis of the visual perceptual states we are in when we perceive depicted objects, is accompanied by the activation of the cortical states whose task is to detect the salient emotional properties of the objects we visually perceive. In the next section, the discussion starts from the answers given by the DVAPP to Q1. These answers seem to identify all the possible perceptual states, with respect to the TVSM, that we can be in during picture perception. This discussion will help in the development of the theory proposed in this paper.

2 Q1 and the DVAPP

The idea of using the results of the TVSM in order to answer (Q1) was first proposed by Matthen (2005) and then fully developed by Nanay (2010, 2011: 477, 2015). In a nutshell, according to the TVSM, there is a separation of the visual pathways, grounded on distinct anatomical structures (Milner and Goodale 1995/2006; Jacob and Jeannerod 2003): one for conscious visual identification and recognition, whose encoding is in an allocentric frame of reference, and one for unconscious visually guided action, whose encoding is in an egocentric frame of reference. It is well known that these pathways can be dissociated due to cortical lesions. Lesions in the dorsal stream (the occipito-parietal network running from the primary visual cortex through the posterior parietal cortex, with projections to the premotor and motor cortices) impair one’s ability to use what one sees to guide action (optic ataxia), but not object recognition; lesions in the ventral stream (the occipito-temporal network from the primary visual cortex to the inferotemporal cortex) impair one’s ability to recognize things in the visual world (visual agnosia), but not action guidance. Moreover, behavioral studies on normal subjects show that some visual illusions deceive the ventral stream but not the dorsal one; thus, it seems that, contrary to ventral perception, which is the cutting edge of visual experience, dorsal perception is completely impenetrable to consciousness (Milner and Goodale 1995/2006).

As said above, the DVAPP is the best empirically-framed philosophical account of picture perception in the light of vision neuroscience (Matthen 2005; Nanay 2010, 2011, 2015; see also Ferretti 2016a, c), since it investigates picture perception by using the neurophysiological knowledge about vision we get from the TVSM, which is, in turn, the best neuroscientific account of vision we have (Kandel et al. 2013) (Footnote 3). Indeed, the DVAPP is currently the best explanation of what goes on in our visual system when we are perceiving pictures (for a complete discussion of the implications of the DVAPP see Nanay 2011). Following the TVSM in answering (Q1), Nanay defends the following claims: (1) since the perceptual representation of the depicted object is grounded on our recognitional apparatus, which in turn is grounded on the ventral subsystem, the depicted object is represented by ventral perception; (2) the depicted object is not represented by dorsal perception; (3) the picture surface is represented by dorsal perception; and (4) the surface of the picture is not necessarily represented by ventral perception. In other words, during picture perception it is necessary that our ventral vision attributes properties to the depicted object, whereas our dorsal vision attributes properties to the picture’s surface. When we see an object face-to-face, both the dorsal and the ventral visual subsystems attribute properties to the same object: the perceived object. When we see a depicted object, in contrast, the dorsal and the ventral visual subsystems attribute properties to different objects: the ventral subsystem attributes properties to the depicted scene, whereas the dorsal subsystem attributes properties to the surface of the picture (pp. 466–477). Given this background, it is possible to develop the proposal of this paper.

3 Visual States and Emotional States During the Perception of Depicted Objects

This paper wants to provide an account defending the idea that the perceptual state we are in when we see an object in a picture is connected to an emotional state. Moreover, it will be suggested that the connection is explainable within the framework of the DVAPP. In order to explain the ECPP, the claim this paper makes is that depicted objects foster in us an emotional response. Therefore, the paper will only focus on claims (1) and (2), formulated by the DVAPP (§2).

Before proceeding, an important fact must be made explicit. Even though pictures have two components (the picture’s surface/vehicle and the depicted scene/object) and they involve ‘seeing-in’ (a peculiar perceptual state based on a visual ‘twofoldness’ that allows us to represent both the picture’s surface/vehicle and the depicted object (Wollheim 1980, 1987, 1998)), this paper will just focus on the representation of the depicted object while leaving aside (but not denying the presence of) the representation of the picture surface. The idea behind this choice is that the most interesting case, concerning emotional encoding, involves the emotional response to the depicted object we see, rather than its surface. Note that this choice does not affect the plausibility of the account offered; a genuine interpretation of twofoldness - and seeing-in - entails that we visually represent both the depicted object and some of the properties of the picture surface (the locus classicus is Wollheim 1980, 1987, 1998; but see Nanay 2017; Lopes 1996, 2005; Voltolini 2013), even though we may or may not attend to the picture’s surface (this idea of seeing-in and twofoldness is the most plausible one, see Nanay 2011: 461–464, 2017; see also Levinson 1998: 229; Lopes 1996: 37–51, quoted in Nanay 2011: 463–464; see also Ferretti 2016c). Often, during picture perception, we are interested in the depicted object, and we completely ignore its surface (Levinson 1998; Nanay 2011). Moreover, even if face-to-face perception and picture perception are not the same perceptual phenomenon – indeed, only the latter involves seeing-in and twofoldness – they are very similar perceptual states, because both depicted objects (that is, just a portion of what we call a picture) and normal objects can foster very similar responses of the visual system since they are related to very similar visual cues. 
In other words, the visual system attributes very similar properties to both real objects and depicted ones (Vishwanath 2011, 2014; Ferretti 2016a, 2016c; Hecht et al. 2003; it has also been suggested that seeing-in and seeing face-to-face are experiences of the same psychological kind, see Briscoe 2016). Note that it is usually claimed that the relevant difference between face-to-face perception and picture perception is that the former but not the latter can foster the feeling of presence (Matthen 2010; Nanay 2015; Ferretti 2016a, c). However, except for the feeling of presence, it is widely agreed that depicted objects are able to offer many of the visual cues that a normal object usually offers (see Vishwanath 2011, 2014; Briscoe 2016; Ferretti 2016a, b, c). It is therefore not inconsequential that both chunks of our visual system – the ventral and the dorsal one, as well as several specific areas within them – represent both normal and depicted objects (Ferretti 2016a). Finally, the account offered here allows the same interplay between emotional encoding and vision to hold in the case of the picture’s surface, that is, in the case of claims (3) and (4) (§2). In the specific case of (4), it is uncontroversial to suppose that, when we look at a picture and our ventral stream is representing different properties of the surface, the emotional encoding assisting vision is active (Barrett and Bar 2009). A good example of this situation is when the surface is ruined and fosters in us a visual impression of sloppiness.

These reasons are sufficient in order to justify the focus of this paper on depicted objects and on the responses they solicit in our visual system. To sum up, while seeing-in is a peculiar visual state that involves representing both the depicted object and the picture surface, we often focus only on the depicted object (see Ferretti 2016a: Sect. 4.2). The account proposed here is perfectly compatible with the idea of a peculiar twofoldness of picture perception and seeing-in. Furthermore, our visual system represents both depicted and normal objects in a very similar way and attributes to both of them very similar properties (Briscoe 2016; Ferretti 2016a, b, c). Finally, since both chunks of our visual system represent depicted objects and their visual activities are assisted by the emotional encoding of specific emotional areas, it seems uncontroversial to say that even the cortical areas involved in emotional encoding, assisting the visual processing of those visual chunks, represent depicted objects. The literature reported here about emotional (or visuo-affective) responses to depicted objects is clear on this point, as well as on the link between emotional and visual responses.

It has been said that, in order to explain the ECPP, the claim will be that depicted objects foster in us an emotional response, and that the argument given will concern the claims of the DVAPP about the depicted object, but not the surface/vehicle; thus, the discussion will focus on (1) and (2) (§2). However, it seems that, according to the DVAPP, it is only the ventral stream that represents the depicted object (1), whereas the dorsal one does not (2). The DVAPP is grounded on much evidence defending the claim that ventral perception represents depicted objects (Nanay 2011: 467–468), and the discussion developed here just assumes that this claim is at least justified. Thus, the task here could be restricted to showing that ventral visual processing – the only kind of processing involved in the representation of depicted objects according to (1) – gives rise to a visual state connected to an emotional state that is crucial in detecting the salient emotional properties of those depicted objects that are the contents of our picture perception. Despite this, the aim of this paper is not limited to claim (1). Indeed, vision and motor neuroscience have provided enough evidence that dorsal perception can also represent depicted objects – that is, (2) is not always the case. It has also been shown that this is not in contrast with the DVAPP (see §3.1). So, before offering a complete account of how depicted objects foster in us emotional responses in the framework of the DVAPP, there is the need to briefly address the question about the possibility of dorsal perception of depicted objects, something that has been completely neglected by the DVAPP.

3.1 The Question of Dorsal Perception of Depicted Objects

The DVAPP suggests that it is only the ventral stream that represents depicted objects (1), whereas the dorsal one does not (2). However, it has been recently shown that this is not the case (Ferretti 2016a). Indeed, it has been suggested, on the basis of several empirical results, that specific portions of a specific chunk of the dorsal stream, the ventro-dorsal stream, are activated during picture perception (Chao and Martin 2000; see also Buccino et al. 2009; Costantini et al. 2010; Proverbio et al. 2011; Grezes and Decety 2002; Zipoli Caiani 2013). The ventro-dorsal stream is the crucial cortical portion of the dorsal stream involved in the transformation of intrinsic object properties into action properties, with the consequent translation of those action properties into suitable motor acts. The particular circuit involved in the attribution of action properties is the one given by the interplay of the anterior intraparietal area (AIP) and the most rostral part of the ventral premotor cortex, F5 (Romero et al. 2013, 2014; Pani et al. 2014; see also Ferretti 2016a: 3, b: 4.1 for a review). Even though the dorsal attribution of action properties to depicted objects is possible, this does not imply that we can act upon depicted objects, but only that dorsal perception responds to the geometrical patterns exhibited by depicted objects that are similar to those which normally instantiate the arrangement of an action property in a normal object. It is important to note that the dorsal response is active only when the depicted object is apparently presented within the peripersonal space of the observer (Costantini et al. 2010). This is because dorsal perception cannot distinguish between normal and depicted objects (Westwood et al. 2002), and therefore it responds to those objects apparently presented within the peripersonal space of the observer regardless of the real nature of the distal target. 
In other words, in order to foster visuomotor activation in dorsal vision, what matters is that the subject faces a geometrical configuration that is usually linked to an action property and that is apparently located in the peripersonal space of the observer, and thus perceived as apparently reachable, regardless of whether the configuration pertains to a 2-D object or a real object. Therefore, the same dorsal perceptual representation of action properties is shared by both face-to-face perception and the perception of depicted objects (for the complete account see Ferretti 2016a).

There are two reasons why this result is not in contrast with the DVAPP (Ferretti 2016a). First, Nanay (2015) points out that his claim (2) is not necessary in his account. Second, the proposed idea does not violate the intuition that dorsal perception is involved in egocentric encoding, nor does it conflict with the claim that we cannot really egocentrically localize a depicted object as we do a normal one, in line with Nanay (2015: 189). Indeed, dorsal perception can represent depicted objects as having action properties when, in the experimental settings, the depicted object is perceived as apparently located within the subject’s peripersonal space and, thus, as apparently reachable, even if the subjects may not be able to actually act on these action properties, that is, even if these depicted objects do not really afford the apparent motor interaction they seem to recall (Zipoli Caiani 2013; Costantini et al. 2010). This view will not be defended in full detail here (see Ferretti 2016a); however, what is important for the purpose of this paper is that the neuroscientific results mentioned above are very clear about the possibility of a dorsal representation of depicted objects, and this possibility is not necessarily denied by the DVAPP (see Nanay 2015).

This integration is very important for the spirit of the DVAPP, which aims to reconcile philosophy and psychology concerning the topic of picture perception. If we do not endorse this explanation, all the sets of evidence from vision and motor neuroscience about the dorsal activity in the case of depicted objects remain unexplained in our philosophical account (Ferretti 2016a). This integration, which perfectly fits with the DVAPP, is also able to make the DVAPP more and more in tune with psychology and neuroscience. Finally, the fact that also dorsal perception can represent the depicted object entails a sort of completion; both streams can represent both depicted objects (and their surfaces) and normal objects, in a very similar way. But if those two streams are the total components of the visual system of humans (and other mammals), then our visual system functions in almost the same way in both picture perception and face-to-face perception (and pictures and normal objects do not differ that much for our twofold visual elaboration). Thus, while face-to-face perception and picture perception are not precisely the same perceptual phenomenon, they are more related than previously suggested in the philosophical literature (Ibid.).

What we can derive from this addition is that the visual state realized by ventral processing is not the only visual state we can be in during the perception of depicted objects, contrary to what claim (1) of the DVAPP suggests. The fact that dorsal perception is activated during the perception of depicted objects can be included in the DVAPP without any philosophical problem (Nanay 2011, 2015). This is the reason why, although the account proposed here suggests that the dorsal stream also perceptually represents the depicted object, it remains compatible with the DVAPP.

This addition is very important for the proposal offered here. On the one hand, our visual system as a whole is activated during the perception of depicted objects; on the other hand, as will be suggested (§5), we have evidence that both the perceptual states subserved by ventral and dorsal activity are connected to an emotional state. Indeed, anatomo-functional evidence suggests that both streams project to those orbito-frontal areas that are crucial in detecting the salient emotional properties of the objects we visually perceive.

Now, following the idea that both streams perceptually represent depicted objects, the ECPP hypothesis is developed by defending the claim that the activities of both streams are emotionally charged in both face-to-face perception and picture perception. It will be suggested that, since the ventral visual activity is influenced by the emotional processing of the lateral orbito-frontal cortex (and the related interconnected areas), which is involved in emotional encoding, then ventral perception of depicted objects can have an emotional charge. Similarly, it will be suggested that, since dorsal visual activity is influenced by the emotional processing of the medial orbito-frontal cortex (and the related interconnected areas), then dorsal perception of depicted objects can have an emotional charge. These two sets of evidence suggest that the visual information that reaches the two visual pathways (Footnote 4) is influenced by the emotional encoding of the prefrontal cortex, especially the orbito-frontal cortex. In order to argue in favor of these two claims, which are necessary for defending the ECPP hypothesis, the analysis offered here first reports the neural mechanisms thanks to which we visually perceive normal objects with an emotional charge (§ 4). Then, it is suggested that the same processing is activated in the case of the visual perception of depicted objects. The crucial point is that, on the one hand, we know that both streams perceptually represent depicted objects; on the other hand, we know that both streams’ activity is influenced by the emotional encoding subserved by the orbito-frontal cortex (and the related interconnected areas). Thus, reporting evidence concerning the activation of the same emotional states of the orbito-frontal cortex (and the related projections) during the perception of depicted objects is a good explanation in favor of the ECPP (§ 5). 
In particular, § 5.1 discusses the case of dorsal visuomotor processing and emotional processing in picture perception, while § 5.2 discusses the case of ventral object recognition and emotional processing in picture perception. Finally, § 6 explains the implications of this extension of the DVAPP.

4 Visual Perception and Emotions in the Case of Normal Objects: the Visual Streams and the Orbito-Frontal Cortex

From the previous section we know that we are (or, at least, we can be) in both a ventral and a dorsal perceptual state when we see an object in a picture. This section shows how both visual streams project to different areas of the orbito-frontal cortex, which is, in turn, as said above, connected to other important cortical areas involved in emotional encoding. The account mainly followed here is the one proposed by Barrett and Bar (2009), but see also O’Reilly (2010) and Elliott et al. (2000) for related evidence about this cortical geography. The section reports the anatomo-functional links between the two visual streams and the two different chunks of the orbito-frontal cortex that are crucial for emotional encoding during visual perception: 1) the lateral orbito-frontal cortex, which is connected with the ventral stream; and 2) the medial orbito-frontal cortex, which is connected with the dorsal stream.

It has been seen that “sensory information from the world is projected rapidly from the back to the front of the brain after an image is presented to the visual system” (Barrett and Bar 2009: 1329). That means that there are important anatomo-functional projections from the occipital cortex to the prefrontal cortex and that these projections are crucial for visuo-affective encoding. This explains how, when observing normal objects, our vision is emotionally charged. A crucial cortical area involved in this propagation and in the emotional encoding of visual information is the orbito-frontal cortex (Ibid.: 1326). I focus on its activity with respect to the activity of the two visual systems. This will allow arguing, in the next section, that the prefrontal cortex, in particular the orbito-frontal cortex, and its projections work in a similar manner during the perception of depicted objects.

Two functionally related circuits of the orbito-frontal cortex are differentially connected to the dorsal stream and to the ventral one, with different roles in the detection of salient emotional properties, during object perception (Barrett and Bar 2009).

The medial orbito-frontal cortex projects to the dorsal stream and has strong reciprocal connections to its lateral parietal areas (MT and MST). Through largely magnocellular pathways (Bar et al. 2006; Bullier 2001; Laycock et al. 2007), the medial orbito-frontal cortex receives low spatial frequency visual information used to build a gist representation of the object’s identity (see Barrett and Bar 2009: 1329) and computes the initial affective information about the object, triggering the internal bodily changes suitable for potential action performance related to that specific object in a given context (for a review see Ibid.: Sect. 6). Due to its neuroanatomical connections to the lateral parietal cortex, the medial orbito-frontal cortex sends the information on bodily responses to the dorsal stream, and an estimate of the affective meaning of the object for action is built (for technical details see Barrett and Bar 2009: 1329). The brain’s preparation to respond (based on this first encoding) arrives even before the object is consciously perceived (for an analysis of this point see Ibid.: 1330, but also see below). These phenomena show the intrinsic influence that the emotional response might have on the motor response (see also Ferretti 2016b and Ferretti and Chinellato (In press) on the link between the dorsal visuomotor responses and the emotional activity of the medial OFC in motor representations). Showing that this same influence occurs in the case of the perception of depicted objects might be in line with the result offered by Ferretti (2016a) (§ 3.1), namely that depicted objects can be represented by dorsal vision as having action properties. To this extent, dorsal ascription of action properties might march in step with an emotional response often accompanied by the motor response. 
This would be consistent with the idea that motor simulation and its relevant bodily changes, which are shared with the emotional responses to salient stimuli for action, might be active during picture perception (see Ferretti 2016a).

However, there are also parvocellular pathways (Bar et al. 2006; Bullier 2001; Laycock et al. 2007) connecting the lateral orbito-frontal cortex to the inferior temporal areas (TEO, TE and temporal pole) of the ventral stream (for a review see Barrett and Bar 2009: 1330 and Sect. 7). Through the interplay with the inferior temporal cortex, the ventral stream manipulates high-resolution visual cues that are related to visual experience. While the medial orbito-frontal cortex manages motor and bodily reactions, the lateral orbito-frontal cortex is more involved in managing the cues we get from the sensory modalities concerning object identification and in relation to a given context. Showing that the same processing is present during the perception of depicted objects might be in line with the result offered by Nanay (2010, 2011, 2015) that ventral vision is responsible for conscious object recognition even during picture perception. To this extent, the ventral conscious representation of different objects’ features might march in step with a conscious emotional response concerning the affective features exhibited by the depicted object, e.g., looking at a picture of your dog with happiness.

As seen, the prefrontal cortex and its orbito-frontal sector (the orbito-frontal cortex) are crucial for emotional encoding (Barrett and Bar 2009; but see also Duncan and Barrett 2007; Pessoa 2008), but emotional encoding is subserved by a complex interplay between the orbito-frontal cortex, the insular pathway, the somatosensory pathway, the amygdala and the ventral striatum (for the cortical geography see Barrett and Bar 2009). Thus, when talking about the activity of the orbito-frontal cortex we inevitably have to refer to these other cortical states involved in emotional processing. Sensory information can be projected from the back to the front of the brain when we look at an object because there are important anatomo-functional projections from the occipital cortex to the prefrontal cortex. These projections are crucial for visuo-affective encoding. I will return to this later (§ 5).

Here is a summary of the implications of this evidence for the ECPP. We know that the activity of both visual streams is linked to the emotional encoding subserved by the orbito-frontal cortex, thanks to different connections linking each stream to a different portion of this part of the cortex. If all that has been said about visual and emotional responses is also true for the perception of depicted objects, we have reached an important point. It is widely agreed that the perception of depicted objects, like face-to-face perception, can be conscious or unconscious. The same holds for the emotional state linked to the visual state during the perception of depicted objects. Indeed, according to the DVAPP, ventral vision of depicted objects is conscious (Nanay 2010, 2011, 2015) while dorsal vision of depicted objects is unconscious (Ferretti 2016a). The ECPP this paper is trying to explain is in accordance with this view to the extent that, with respect to the neural projections we are talking about, the ECPP can be conscious or unconscious. On the one hand, the affective encoding linked to the visual features of the perceived object that is due to the ventral projections onto the lateral orbito-frontal cortex is usually conscious. On the other hand, the affective encoding linked to the visual features of the perceived object that is due to the dorsal projections to the medial orbito-frontal cortex is usually unconscious. Accordingly, for Barrett and Bar (2009) the affective response mediated by the processing of the orbito-frontal cortex may become conscious and, then, the object is experienced as pleasant or unpleasant: a perceiver experiences himself as reacting to a pleasant object (p. 1328). However, even without conscious accessibility, the percept always includes the affective value of the object: “whether consciously felt or not, an object has affective ‘value’ when it influences a person’s breathing, or heart rate, hormonal secretions, etc. Objects are ‘positive’ or ‘negative’ being able to influence a person’s body state (just as objects are said to be ‘red’ if they reflect light at 600 nm)” (p. 1328). Also: “‘unconscious affect’ is why a drink tastes delicious or is unappetizing, why we experience some people as nice and others as mean and why some paintings are beautiful while others are ugly. When in the foreground, it is perceived as a personal reaction to the world: people like or dislike a drink, a person or a painting. Affect can be experienced as emotional” (p. 1328). So, even if the emotional response mediated by the medial orbito-frontal cortex (and connected to dorsal vision) remains unconscious, it still plays a role in our affective representation of the depicted object. Note that the possibility of an unconscious emotional response, and its considerable importance in our affective evaluation of what we deal with, is addressed in the literature (Berridge and Winkielman 2003).

This entails that the plethora of visuo-affective responses mentioned above, produced by different cortico-cortical projections, whether conscious or unconscious, is at the basis of the ECPP. Note, accordingly, that the interplay between the two streams and the activity of the two portions of the orbito-frontal cortex is extensive. All the information processed by the two streams is sent to the orbito-frontal cortex in order to build an initial affective representation, and the information resulting from this initial affective representation is then relayed back to the visual streams in order to build a specific representation of the perceived object.

Summing up, this section showed how face-to-face vision is charged with an emotional aspect. This is widely agreed upon in the literature on emotion neuroscience (for a review see Barrett and Bar 2009; Price 2007; Zald and Rauch 2007). Showing that the visual perception of depicted objects also carries the same emotional charge would be very important for the DVAPP, in that it might explain something that has remained neglected in the literature: the ECPP. Now, the task is to show that the same story holds concerning the functioning of our visual system and its connection with different emotional areas during the perception of depicted objects. It is known that both streams represent the depicted object (§ 3.1). We also know that the activity of both streams is deeply linked to the emotional activity of the orbito-frontal cortex. The prefrontal cortex in general, and the orbito-frontal cortex in particular, are cortical regions very important for emotional encoding. Since we know that both streams are activated during picture perception and that both streams receive feedback from the orbito-frontal cortex, explaining the ECPP requires reporting the results about the activation of the orbito-frontal cortex, and connected areas, during picture perception.

5 Visual Perception and Emotions During Picture Perception: the Orbito-Frontal Cortex and the Perception of Depicted Objects

It has been suggested that both streams are activated during picture perception, and that both streams receive feedback from the orbito-frontal cortex, which is, in turn, linked to further important cortical areas involved in emotional encoding. Here I report evidence about the activation of the orbito-frontal cortex during picture perception. This should reliably constitute an explanation of the ECPP. Remember that the important point is to suggest a connection between vision and emotion during picture perception, because a lack of this connection would mean that something is missing between the evident ECPP in our ordinary life and the explanation offered by the DVAPP. This paper has offered a possible connection and has maintained its relation with the DVAPP, starting from its background and pointing out the appropriate additions (§ 3.1). It is now important to report sets of experimental results in line with the explanation of the ECPP that was anticipated at the beginning of this paper.

First of all, there is evidence of activation, during picture perception, of specific portions linked to the prefrontal and, specifically, to the orbito-frontal cortex, such as the ventromedial and the medial prefrontal cortex (Asmaro et al. 2014), the medial area of the rostral prefrontal cortex (Kreplin and Fairclough 2013, 2015), the ventrolateral prefrontal cortex (Kohno et al. 2015) and of several emotional areas related to the orbito-frontal cortex such as the amygdala (Scharmüller et al. 2014). This activation is generally confirmed by summaries about the overall recruitment of several emotional areas during picture perception (Codispoti and De Cesarei 2007; Amrhein et al. 2004; for a complete review see Olofsson et al. 2008; Cuthbert et al. 2000; Schupp et al. 2000, 2004; Keil et al. 2002; for the relation between these emotional responses and the motor responses see Hajcak et al. 2007).

There is also specific evidence that observing pictures whose pictorial content elicits fear or pain recruits the overall activation of the limbic system, the amygdala and several parts of the prefrontal cortex (Glotzbach et al. 2011). We also have evidence concerning the activity of the amygdala and of the insula when subjects with social anxiety disorders are faced with images able to foster an anxious response (Shah et al. 2009).

However, it is also possible to gather the evidence with respect to the link between (dorsal) visuomotor processing and emotional response on the one hand, and between (ventral) object recognition and emotional response on the other.

5.1 Pictures, Actions and Emotions

Concerning the link between visuomotor and emotional responses in picture perception, there are experiments showing that observing pictures of neutral or dangerous objects elicits facilitation or inhibition effects on the motor responses involved in action preparation (which, as said in § 3.1, is possible even in the case of depicted objects) related to the action possibilities offered by the depicted objects. Indeed, this is perfectly in line with what has been said about the perception of action properties during the perception of depicted objects, as well as with the emotional response related to action during the perception of depicted objects, reported in § 3.1. While depicted neutral graspable objects that can be approached without any risk activate a facilitating motor response, in the case of images of dangerous objects that pose a potential risk, motor resonance evokes aversive action possibilities, generating an interference effect: information about an object’s potential risks might conflict with the motor actions that are activated while observing that object. Thus, the response might be blocked, because aversive action possibilities are evoked (for a review see Anelli et al. 2012; see also Algom et al. 2004). This is because the prefrontal cortex is crucial in inhibitory and excitatory responses (Munakata et al. 2011). For example, by receiving inputs from other emotional circuits, prefrontal activations allow the acting subject to inhibit motor responses in the case of pictures of dangerous objects (Caligiore et al. 2013; for a review see Anelli et al. 2012). Once again, this seems to be a further link between dorsal processing of depicted objects and the processing subserved by the orbito-frontal cortex that responds to depicted objects. It should be pointed out that the dorsal stream is the high road from the primary visual cortex V1 to the (pre)motor system, and its projections to the medial orbito-frontal cortex are crucial in order to encode the salience of the motor scenario we are faced with, even when the motor scenario is depicted.

So far it has been suggested that pictures in which pleasant or unpleasant objects are depicted can foster the ECPP. However, this paper started with the example of the ECPP concerning the picture of a partner or a relative. Accordingly, the explanation offered here is not restricted to the ECPP concerning pleasant or unpleasant objects: the same explanation can be given concerning pictures of people or of human interactions. This seems to be very important for an explanation of everyday life experiences of pictures and their related ECPP.

In accordance with the idea proposed here, there are several sets of evidence concerning a resonant mechanism active during the perception of different kinds of depicted hands exhibiting painful interaction (Avenanti et al. 2005, 2006; Morrison et al. 2007; Nummenmaa et al. 2008). These results hold for different social variables such as gender and age of participants (Anelli et al. 2012). The literature is very rich. For example, transcranial magnetic stimulation studies with subjects faced with images showing pain inflicted on others (Avenanti et al. 2005, 2006) suggest that viewing pain inflicted on others, even if at the pictorial level, activates a specific corticospinal inhibition that is usually linked to the direct experience of pain (Farina et al. 2003; Le Pera et al. 2001) and involves a resonant activation of pain representations in the subject’s sensorimotor systems – this is a rephrasing of Anelli et al. (2012: 1628). This is very interesting in relation to what has been said above. Indeed, in (§§ 4, 5) it has been suggested that showing that the same visuo-affective representation linked to motor responses is active during the perception of depicted objects might be in line with the result offered in Ferretti (2016a) (§ 3.1) that depicted objects can be represented by dorsal vision as having action properties in (almost) the same way normal objects can. To this extent, dorsal ascription of action properties might be linked to an emotional response that often accompanies the motor response. It has also been said that this would be consistent with the idea that motor simulation and its relevant bodily changes might be active during picture perception. Crucially, in Ferretti (2016a) it is suggested that the AIP-F5 circuit is involved in the attribution of action properties to both depicted and normal objects. But it is also known that in F5 (the most rostral portion of the ventral premotor cortex) there are different families of mirror neurons, which respond both when the subject observes an action performed by another individual and when she/he performs the same or a similar action (Cook et al. 2014). Mirror neurons in particular, and mirror mechanisms in general, seem to play (at least) a causal role in our comprehension of the actions of others, as well as, to some extent, a causal role in empathic understanding. Moreover, the ventro-dorsal stream seems to play a crucial role in conscious action understanding (Gallese 2007). Furthermore, the embodied simulation subserved by mirror mechanisms seems to be at the basis of conscious empathy toward other individuals (Gallese 2005). Given these results, the evidence reported here is sufficient to show that there are similar visuo-motor-affective representations allowing us to perceive (consciously or unconsciously) situations in which emotional responses (in this case, those concerning fear or pain) and motor responses are elicited together (but I am not committed to the idea that these representations are necessarily conscious). This is true both when we perceive dangerous real objects and when we perceive dangerous depicted objects (Anelli et al. 2012), as well as when we perceive real or depicted dangerous motor interactions performed by others. This is also in accordance with the evidence that various mirror mechanisms of different kinds are very important in emotional and empathic responses (for a review about the case of pictures see Anelli et al. 2012; for face-to-face situations see Rizzolatti and Sinigaglia 2008). The (conscious or unconscious) visuo-motor-affective states we can be in during face-to-face vision are very similar to those we can be in during the perception of depicted objects.

Accordingly, in line with what is proposed here, observing pictures showing painful situations (e.g., a hand pierced by a syringe needle) inhibits hand muscles through the cortical motor system (Avenanti et al. 2005). Also, motor resonance is found in TMS experiments in which subjects observe different depicted hands interacting with painful stimuli (Avenanti et al. 2010); furthermore, evidence from fMRI studies showed that, when looking at static pictures of potentially painful stimuli, subjects are also able to rate the intensity of the pain attributed (Jackson et al. 2005).

We know that, following the extension of the DVAPP discussed above (§ 3.1), both visual streams can represent depicted objects. So, we have to account for the link between each stream, the related visual representation and the orbito-frontal area that subserves the emotional encoding linked to the visual activity. The crucial point here is that these results confirm what has been anticipated in (§ 4), namely that (dorsal) visuomotor processing and emotional processing are related both in the case of face-to-face perception and in the case of picture perception – see below for the case of (ventral) object recognition and emotional processing. Resonance and/or motor inhibition during the perception of depicted objects that are related to action are linked to the induction of a sensation of pain related to the depicted object/scene. This allows the subject to represent the object as dangerous (this is very clear in the literature about broken/dangerous affordances, see Borghi and Riggio 2015). This explains the link between the extension of the DVAPP about the attribution of action properties made by dorsal processing (Ferretti 2016a) (§ 3.1) and the hypothesis about the ECPP in relation to action, according to which looking at depicted objects can foster an emotional response even when it comes to the attribution of action properties. In this case, the emotional response related to the depicted object is linked to the dorsal visuomotor one. In other words, when we look at the depicted object, the response of vision-for-action subserved by dorsal visuomotor processing can be related to the emotional response concerning the perception of (aversive or facilitated) action possibilities mediated by the prefrontal cortex.

5.2 Pictures, Object Recognition and Emotions

Concerning the link between object recognition and emotional responses in the case of picture perception, empirical findings suggest that the occipito-temporal cortex (the cortical portion related to the ventral stream, which is connected with the lateral orbito-frontal cortex) and the insula (connected to the orbito-frontal cortex) are involved in managing the affective/emotional information offered by the pictorial content of depicted objects. Indeed, observing pictures whose pictorial contents are related to mutilation fosters emotional responses, such as fear (see Wright et al. 2004; Schienle et al. 2006), mediated by the insula (Phillips et al. 2000; Shapira et al. 2003), whose response to affective pictorial contents is well known in the literature, together with the occipito-temporal cortex (related to the ventral stream) and the orbito-frontal cortex (Schienle et al. 2002; Stark et al. 2003). In line with what has been said, pictures of spoiled food and body products (or mutilations, injuries and corpses) induce responses of disgust, sadness and fear mediated by the occipito-temporal cortex, the amygdala and the insula (for the different activations of these areas in relation to the different kinds of emotional pictorial contents see Wright et al. 2004; Schienle et al. 2006).

This evidence is really important because it helps us to understand the deep relation between picture perception and emotional involvement. Indeed, the lateral orbito-frontal cortex receives information from the ventromedial prefrontal cortex and from the anterior insula (Craig 2009; Barrett and Bar 2009; Vuilleumier et al. 2001; Ghashghaei and Barbas 2002; Barbas and De Olmos 1990). It is not by chance that the occipito-temporal network (again, the cortical portion related to the ventral stream, which is connected to the lateral orbito-frontal cortex) is activated during the perception of depicted objects containing emotionally salient stimuli (Wright et al. 2004), since ventral vision is crucial in high-level object recognition during both face-to-face seeing and picture perception, as shown by the DVAPP.

Once again, the crucial point here is that these results confirm the fact, anticipated in (§ 4), that (ventral) visual object recognition and emotional processing are closely related phenomena both in the case of face-to-face perception and in the case of the perception of depicted objects. If we couple this result with the other result presented above about the interplay between (dorsal) visuomotor processing and emotional processing, we have a plausible explanation for the ECPP.

An important point here is that this paper has focused more on those cases in which the presented stimuli seem to be related to negative emotions (concerning pain, fear, etc.) because these kinds of responses seem to be clear-cut cases of emotional responses. The important implication of these results is that viewing images of dangerous objects and perceiving dangerous objects face-to-face are very similar perceptual processes. Moreover, looking at depicted objects whose content is about a dangerous performance concerning someone else and perceiving the same scenario face-to-face foster similar activations of our visuo-motor-affective representations. That means these representational states are similar. What has been offered here is a coherent set of experimental results for the explanation of the ECPP, which can be fostered with pictures of (dangerous/pleasant) objects or (dangerous/pleasant) scenarios involving different subjects. However, there is also a large literature concerning facilitation effects or positive responses when the presented stimuli are linked to neutral situations (Anelli et al. 2012).

6 The DVAPP and Emotions

To sum up, these results have important implications because they seem to suggest that ventral and dorsal visual activities are linked to specific emotional activities of the orbito-frontal cortex. Based on a specific cortical projection, each visual stream is involved in the processing of particular visual cues together with the related emotional response concerning such cues, given by the specific portion of the orbito-frontal cortex. All this is in line with the description of the functioning of the visual system, and of its different chunks, in the case of picture perception, offered by the DVAPP (Nanay 2010, 2011, 2015; Ferretti 2016a, c). Accordingly, as for the perception of depicted objects, the ECPP can be conscious or unconscious, depending on the visual/emotional projection we are focusing on (cf. § 4).

In other words, all the experimental results reported here suggest a strong activation, also during picture perception, of the orbito-frontal cortex, which is involved in the encoding of the salient emotional properties of the (real) objects we deal with, as well as of other emotional areas interconnected with the orbito-frontal cortex. At this point, we know that both streams perceptually represent depicted objects (§ 3.1) and that both streams receive feedback from different portions of the orbito-frontal cortex (§ 4). Since the orbito-frontal cortex is active during picture perception (§ 5), we have all the experimental ingredients that seem to explain the ECPP. All that has been said suggests that it is thanks to this shared processing that depicted objects can foster in us an emotional charge. When we look at the picture of something pleasant or unpleasant, our visual streams reconstruct the visual features of the depicted object; then, the information is sent to the orbito-frontal cortex, which computes the salient emotional properties of the depicted object and sends this information back to our visual system. The result of this shared processing is what the ECPP is grounded on. Recall that the ECPP can be personal or subpersonal, conscious or unconscious, just as picture perception (Nanay 2011, 2015; Ferretti 2016a) and emotional encoding (Barrett and Bar 2009) can be.Footnote 5

A very important implication of the proposal offered in this paper is the confirmation of the explanatory power of the DVAPP as both a philosophically and a psychologically valid theory of picture perception. As said, the point about dorsal representation of depicted objects (§ 3.1) is very important for the spirit of the DVAPP insofar as, if we do not endorse this explanation, all the sets of evidence from vision and motor neuroscience about the dorsal activity in the case of picture perception remain unexplained under the DVAPP (Ferretti 2016a). The same holds for the explanation of the ECPP, in the framework of the DVAPP, on the basis of the empirical evidence about emotions and visuo-affective encoding: this is a genuine extension of the DVAPP and a confirmation of its power. Furthermore, if this explanation of the ECPP is reliable, this result is, like the result about dorsal representation of depicted objects (§ 3.1), a further proof that, while face-to-face and picture perception are not the same perceptual phenomenon, they are even more related than previously suggested (Ibid.). Indeed, not only can both streams represent both depicted objects (and their surfaces) and normal objects in a very similar manner (Ibid.) (§ 3.1); the visuo-affective activity produced by their joint action with specific portions of the orbito-frontal cortex also gives rise, during picture perception, to emotional representations very similar to those we have in face-to-face perception.

A final important specification is needed. Relying on neural and behavioral evidence about the functioning of our visual system (and its anatomo-functional arrangement) in order to investigate what perceptual state, or what representational state, we are in when we see an object in a picture is a common procedure in the literature. The paper reported different experiments, from both psychology and neuroscience, which rely on experimental settings in which the targets are depicted objects. This procedure, too, is a common one in the literature. Indeed, empirically informed philosophical accounts of picture perception usually rely on these kinds of sets of evidence (Nanay 2011, 2015; Briscoe 2016; Ferretti 2016a, c). On the one hand, using the general resources from neuroscience, we can observe which cortical states are at the basis of our perceptual states. On the other hand, with this knowledge in hand, the experimental settings in which the targets are depicted objects allow us to test if and when the cortical states subserving our perceptual states are active in the experimental setting. In other words, if we know that a given cortical state subserves a specific (visual or emotional) representational state, a good way of testing whether or not this representational state is active in the perception of depicted objects is by testing the way in which the cortical state subserving the representational state responds to the depicted object. These results allow us to understand the behavior of the representational activity of the subject’s brain in the case of depicted objects.

This practice is at the basis of the investigation of Q1 (and, thus, Q1a) and is uncontroversial in the empirically informed philosophical literature (see Nanay 2011: 467, 2015; Briscoe 2016; Ferretti 2016a, b, c) concerning the DVAPP (Nanay 2011, 2015) and its further legitimate extensions (Ferretti 2016a, c). The same procedure can be beneficial for the analysis of the emotional areas that assist visual processing during the perception of depicted objects and that are, on an anatomo-functional description, inextricably linked to our visual brain. This is precisely what has been proposed here.

7 Conclusion

The paper started from the neglect of the ECPP in the literature on picture perception. It underlined that even our best account of picture perception, namely the DVAPP, does not account for the ECPP. It also suggested extending the DVAPP by reporting empirical evidence that the visual representations involved in the perception of depicted objects are emotionally charged in the same manner as the visual representations involved in face-to-face perception. The advantage of this account is twofold: it suggests that we can be in an emotional state during picture perception, and it explains how this emotional state can be linked to the visual state we are in during picture perception. All of this is explained under the DVAPP, whose explanatory power is confirmed and extended. Embracing this proposal allows us to take into account several sets of evidence from vision and affective neuroscience whose important impact on the debate on picture perception remains otherwise unexplored.Footnote 6 The result is a further proof that, while face-to-face and picture perception are not the same perceptual phenomenon, they are even more related than previously suggested.