1 Introduction

John Searle suggested that experiencing a joint action involves a “sense of us.” It has been contended that a collective intention implies the sense of us as one of its “underpinnings” (Crone, 2020). Developmentally, it seems plausible to suppose that the sense of us involved in joint action originates from “emotion sharing” (Hobson and Hobson, 2007; Tomasello, 2019), an early form of shared intentionality that has also been argued to entail a basic sense of us (Schmid, 2014a; Zahavi and Rochat, 2015).

The present paper connects the question of the nature of this basic sense of us with the socio-cognitive problem of infants’ “perceptual” access to other minds (Reddy, 2008). Specifically, I propose that a particular view of emotion sharing—with its intrinsic characterization of the sense of us—is corroborated by a particular developmental account of infant social perception and vice versa.

The particular view of emotion sharing I will pursue is called the “straightforward view” (Schmid, 2009: xvi, 65-69; Salice, 2015: 83). It was originally proposed by the classical phenomenologists Max Scheler and Edith Stein and has been taken up, directly from one or both of them, by its contemporary advocates Hans Bernhard Schmid and Angelika Krebs, as well as by Naomi Eilan and Joel Krueger through Merleau-Ponty’s mediation (Footnote 1). The view suggests that participants in emotion sharing undergo one and the same overarching emotion.

The developmental hypothesis I will elaborate on is the “pairing” account of social perception. This account originates from Husserl and Merleau-Ponty’s philosophy and has been recently put forward as a competitive hypothesis supported by a variety of developmental evidence (Vincini, 2020) and capable of generating new findings (Vincini and Gallagher, 2021). The pairing account posits that infants perceive others’ embodied experiences as belonging to someone other than the self through a process of assimilation to, and accommodation of, their own embodied experience.

Both emotion sharing and infant social perception are the subject of vast and complex debates that are very far from being settled. The present paper does not try to provide sufficient reasons to believe that the straightforward view or the pairing account is correct. I think that a goal of this kind is currently out of reach for any of the theories advocated in those debates. It would certainly be outside the scope of a single paper. Therefore, I ask the reader to keep in mind that, in examining two topics that are so complex and widely debated, the only goal of this paper is to show that the straightforward view and the pairing account strengthen each other.

The paper is divided into two parts. The first part (Sects. 2-6) expounds the prima facie viability of the straightforward view; the second part (Sects. 7-10) examines the pairing account and its reciprocal connections with the straightforward view; the conclusion (Sect. 11) recaps the resulting approach to the developmental origins of the sense of us. The relatively extensive discussion of the straightforward view in the first part is necessary because, if the prima facie viability of the straightforward view could not be ensured, the pairing account would be weakened, not strengthened, by its association with the straightforward view.

2 Background presuppositions

For the purposes of the present paper, the assumptions that are widely accepted in the current phenomenological debate on emotion sharing must be taken for granted. Two of them are particularly important.

First, in emotion sharing, just as in general, experience does not hide, but manifests reality and its structures. The phenomenological tradition (including Husserl, Scheler, Stein, Zahavi, and Schmid, among others) challenges the (Cartesian) division between a phenomenological domain of “mere appearances” and a metaphysical domain of what is “truly real.” Of course, the phenomenological debate acknowledges that sharing is a fallible experience. For example, I can feel that someone is sharing an emotion with me, but later realize that she was not. However, apart from the fact that many phenomenologists would say that we do not have reasons to doubt the veridicality of a large number of episodes of ordinary sharing, the point is that what appears as the structure of sharing in the experience of sharing is indeed the real structure of sharing. Sometimes, this structure can appear in experiences that, on closer examination, are revealed to be fallacious. Nonetheless, it remains true that there is no structure of sharing beyond the one that is ordinarily experienced and that is further verified by scientific investigation, for phenomenologists assume that, when sharing really occurs, experience and science confirm, or in any case are compatible with, each other. Therefore, just like most if not all views in the phenomenological debate, the straightforward view must be understood as identifying both a phenomenological and a metaphysical structure of emotion sharing (Husserl, 1973; León et al., 2019; Scheler, 2008; Schmid, 2009; Stein, 2000).

Second, shared emotions are constituted by components distributed among a plurality of subjects (León et al., 2019; Salice, 2015; Schmid, 2009). This assumption is called the Socially Extended Emotion Thesis, and Krueger and Szanto (2016) show that there is a growing body of research in support of it. The Socially Extended Emotion Thesis typically implies a general thesis about embodiment, i.e., that emotions include neural and extra-neural bodily processes, but goes beyond it by insisting that even the conscious components of a shared emotion can be distributed among a plurality of individuals (León et al., 2019; Krueger and Szanto, 2016).

The second common assumption allows one to appreciate the problem motivating the straightforward view. If emotions have a complex mereological structure, there is the problem of what is part of a unitary emotion and what is external to it or part of another unitary emotion. This is the problem of individuation. Indeed, this problem is taken seriously by all theories that acknowledge that emotions are “patterns” of constitutive components or “emergent states” distributed in space and time (Newen et al., 2015; Oosterwijk and Barrett, 2014; Gallese and Caruana, 2016). The first assumption allows one to anticipate that the way in which the straightforward view addresses the problem of individuation cuts across phenomenology and metaphysics: what is part of a unitary emotion is not independent of what is experienced to be part of it.

It is not possible to understand the straightforward view without considering individuation as the problem addressed by such a view. Accordingly, Sect. 3 presents the straightforward view with a focus on individuation and Sect. 4 delves into the analysis of the process of individuation presupposed by the straightforward view.

3 The straightforward view

Scheler (2008: 255, 258) explicitly characterizes the straightforward view through the opposition in Standard German between (i) “derselbe/dieselbe/dasselbe,” which indicates “token-identity,” or numerical oneness, and is rendered in English by expressions such as “one and the same,” or “selfsame,” and (ii) “gleich,” which denotes type-identity and corresponds to expressions such as “two things of the same kind.” Even more importantly, Scheler (2008: 257-258) states that the individuation of a selfsame emotion shared among a plurality of people is analogous to the individuation of a selfsame individual emotion experienced by one person at different time points and in different ways.

In this fashion, Scheler (1973: 389-393, 522-523; 2008: 244-245; cf. Schloßberger, 2020: 79) challenges the widespread Cartesian assumption that a mental state can be had by only one subject:

Just as the selfsame [dasselbe] mental content can be present in a multiplicity of acts [of a single individual], so it can also be present to a number of different individuals. Just as we can revive, recall and grieve, more or less, over the selfsame [dasselbe] painful experience at different periods in our life, so we can also join with others, in grieving, at one and the same [ein und dasselbe] experience. To be sure, we can never experience the selfsame [dieselbe] (physically localized) sensory pleasure or pain. These states are confined to the individual in whom they occur, and can only be like one another [gleich], never identical. But two people may very well feel the selfsame [dasselbe] sorrow; a strictly identical, not just a similar one [streng dasselbe, nicht nur ein «gleiches» Leid], even though the experience may be differently colored in each case by differing organic sensations. Anyone who holds that mental events are given in each case only to one person will never be able to explain the exact meaning of phrases like: “All ranks were fired with the same [dieselbe] enthusiasm,” “The populace was seized with a common joy, a common grief, a common delight,” and so on. Custom, language, myth, religion, the world of the tale and the saga—how can they be understood on the assumption that mental life is essentially private? (Scheler, 2008: 258; translation modified; the emphasis on the indefinite articles appears in the original German)

Scheler (2008: 12-14, 37, 64) emphasizes both the token-identity of the shared experience and other-awareness as its constitutive component, where “other-awareness” means the awareness of a mental state as (also) had by someone other than the self. Indeed, Stein (1964: 17; translation modified) approvingly summarizes the ideas that Scheler stresses the most:

Scheler clearly emphasizes the phenomenon that different people can have strictly the selfsame [streng dasselbe] feeling (Sympathiegefühle, p. 9 and 31) and stresses that the various subjects are thereby retained.

Stein (1964, 2000) too underscores the compatibility between the numerical oneness of the shared emotion and other-awareness. For example, she calls emotion sharing “feeling of oneness” (Einsfühlen) and describes the prototypical example of the collective joy in communal celebrations as follows:

All of us are seized by an excitement, a joy, a jubilation. We all have the “selfsame” [“dasselbe”] feeling. Have thus the barriers separating one “I” from another broken down here? Has the “I” been freed from its monadic character? Not entirely. I feel my joy while I empathically grasp the others’ and see: it is the selfsame [dieselbe]. (Stein, 1964: 17; translation modified; the emphasis on the indefinite articles appears in the original German)

In order to understand the compatibility between the numerical oneness of the shared emotion and other-awareness, one has to examine how, according to the straightforward view, a shared emotion is individuated as a unity of components experienced as belonging to a plurality of individuals. For this reason, Sect. 4 is entirely devoted to the process of individuation.

4 Individuation

I already noted that Scheler (2008: 257-258) draws an analogy between the individuation of individual and shared experiences, but the analogy goes even further: Scheler (2008: 255) explicitly states that it also holds between the individuation of items of the objective environment and that of shared emotions.

Obviously, like most if not all phenomenologists, Scheler is aware of the differences in the individuation of objective things and experiences. As I shall discuss below, unlike ordinary things, experiences are fully individuated only in reflection and thus one must talk of “pre-individuation” for pre-reflective experiential unities. Moreover, while a perspective on a thing (e.g., a table) is not a part of it, a constituent of a unitary experience can indeed coincide with a perspective on the experiential whole.

Therefore, in order to understand Scheler’s “straightforward” analogy between the individuation of (1) objective-environmental items, (2) individual experiences, and (3) shared experiences, we need to identify what may support such a wide analogy across different domains. The following subsections show that it can be assumed that there is one domain-general process functioning in each of the three domains of individuation referred to by Scheler (Footnote 2).

4.1 Objective-environmental items (things and people)

Famously, Husserl (1999: 40-41; 2001: 297-298) provides us with a classical phenomenological treatment of individuation. For him, any token-identity is a synthesis of a plurality of manifestations. Without denying the distinction (Xu, 2007) between individuation (“how many”) and identification (“which one”), Husserl suggests that the two are linked in that identifying a token phenomenon across appearances entails individuating it as one and not many (De Warren, 2009: 181-182).

Syntheses of individuation/identification take place not just in virtue of spatiotemporal continuity, but also through a process of association by similarity. In general, I can recognize an object as the same one across different time points, under different lighting conditions, from different perspectives and distances, etc. because what I see has much in common across these different circumstances. For this reason, Husserl describes a synthesis of individuation/identification as a kind of “Deckungssynthesis,” i.e., “overlap-synthesis” or “synthesis of coincidence.” Two or more experiences overlap in the sense that they share a characteristic part of their content. Because of this overlap, the experiences can be taken to be appearances of the same token being.

To realize how basic this process is, consider Xu’s (2007) review of the evidence that infants and adults use spatiotemporal, featural, and sortal information in individuating objects. Similarity surely plays a role with regard to featural and sortal information. As an example of featural information use, infants take an object that appears from behind an occluder to be the same one that has previously disappeared behind it if the object has the same shape (at about 4.5 months), the same pattern (at about 7.5 months), or the same color (at about 11.5 months): having the same shape, pattern, or color means being similar in those respects (Xu, 2007: 402). With regard to sortal information, we should note that if “the blue teacup that you see now cannot be the same object as the blue pencil you saw 10 minutes ago,” it is certainly because they do not fall under the same sortal concept, as Xu (2007: 401) says, but also, as we may say from our more abstract perspective, because they are not similar enough to fall under the same sortal concept.

Because of the importance of recognizing people for successful social interaction, face recognition is perhaps the most studied phenomenon of individuation in cognitive science. According to Hugenberg et al. (2010), there are identity-diagnostic characteristics that allow us to recognize the same face across different circumstances and orientations.

In sum, a glance at classical phenomenology and contemporary cognitive science suggests formulating this domain-general assumption: it is, in part, because a plurality of stimuli shares a set of characteristic features (similarity) that the cognitive system can take the stimuli to present the same token being (Schyns, 2018).

4.2 First-person-singular experiences

According to Husserl’s classical analysis of inner time-consciousness, first-person-singular experiences are (pre-)individuated as temporally extended. Such analysis suggests two senses in which it is possible to talk about varying pre-reflective perspectives on experience. The first concerns the “retentional process” abstractly considered.

For Husserl, every present experience includes not only the presentation of the momentarily present phase of an object—the “primal impression”—but also a “retention” of what is just-past and an anticipation (“protention”) of what is to come. In order to retain the just-past phase of the object, one has to retain the experience of that phase, i.e., the former primal impression. Since the experience so retained itself retained other experiences in its turn, present experience implies a structure of “retention of retentions:” present experience is embedded in its past. Thus, as experience proceeds through time, a primal impression undergoes a continuous series of “retentional modifications” through which it is kept track of as “more and more past.” At some point, it moves from being a “near retention” (strongly affecting the present) to being a “far retention” (weakly affecting the present).

For our purposes, the key is that this succession of retentions of the same primal impression can be described as the unfolding of varying experiential perspectives on the same experience. Husserl (1991: 31, 285-286; 2001: 423) characterizes it as “a series of adumbrations”—the same term he employs for spatial perspectives—and suggests that to this variation of “temporal perspective” corresponds a variation of “affective perspective,” since the perspectivized experience becomes less and less forceful.

The second legitimate sense of pre-reflective perspectivity on experience concerns the pre-delineation of concrete individuated experiences. Here I talk about “pre-delineation,” or “pre-individuation,” because I assume—with Brough (2011) and Zahavi (2011)—that there is full-fledged individuation of experience only in reflection.

Nevertheless, there is a sense in which it is true that “we are pre-reflectively aware of the experiences as discrete units” (Zahavi, 2011: 22). Pre-reflectively, experiences are not “separated as neatly from one another as coaches on a train” (Zahavi, 2011: 19), yet they “enjoy a fleeting unity and integrity” (Brough, 2011: 34). Consider Zahavi’s (2011: 19-20) example:

You are sitting and enjoying a glass of wine. Suddenly your reveries are interrupted by the phone. It is your mother asking whether you have remembered to buy a wedding gift for your nephew. You embarrassingly confess that you have forgotten all about it. Now, whereas it would be quite right to stress the qualitative continuity of the temporal phases of an experience—say, the auditory experience of your mother’s voice—it is just not right to divide that experience up into separate and externally related time-slices. […] On the face of it, a denial of their distinctness [between the experience of winetasting and the experience of embarrassment] just seems wrong.

Acknowledging the pre-reflective pre-delineation of temporally extended experiences implies admitting a certain perspectivity on such unities. I can live through an experience as a temporally extended unity only if at the different phases of its streaming—or at other phases of my pre-reflective stream preceding or following the experience—I am not aware of the current phase only, but I have a perspective on the entire experience extending across time. As Brough (2011: 34) puts it, acts of consciousness are…

experienced in temporal modes. I am pre-reflectively aware of my act of perceiving as now or as just past. If the act were not given in temporal modes, we could not experience it as a unity, as “an experience.”

Consider De Warren’s (2009: 204-205) example of hearing bells tolling. At each ring, as I am struck by the sound, I expect my experience of the sound to continue, though decreasing in intensity. In a different scenario, the fatigue that I experience in a workout session is given differently when I am in the middle of it and when I am almost done (cf. Brough, 2011: 33).

Tying together the abstract idea of the retentional process as variation of perspectives and the idea of pre-delineation of concrete temporally extended experiences, we can thus agree with De Warren (2009: 42) in thinking of “the structure of time-consciousness in terms of variable perspectives within a landscape of temporal orientation.”

But what determines the pre-delineation of individuated experiences? It cannot be temporal continuity alone, because temporal continuity does not explain the onset or end of an experience. In other words, it does not explain why, for example, two experiential phases at t1 and t2 are temporally “distant” from one another, yet belong to one experience, whereas the phases at t2 and t3, which are much “closer,” belong to two different experiences.

Therefore, we must, with Zahavi (2011: 19-20), stress the “qualitative” continuity within experiences. “Qualitative” should here be understood liberally, as including the sameness of the “intentional object” (e.g., one’s mother talking on the phone). Indeed, for Husserl (1974: 168; 1962: 286), sameness of the intentional object is a primary factor in constituting the synthesis of “a consciousness,” which can also be “discrete,” i.e., tolerate within itself interruptions of a shorter or longer time-interval. The intentional object can be a routine “practical” goal, e.g., “enjoying a glass of wine:” the different phases (taking the glass, sipping, putting it back, repeating the action, etc.) form a unitary experience in that they all fulfill the intentional object or action-schema. Evidently, “quality” in the stricter sense also plays a role: the feeling of embarrassment starts when my mother is talking and continues when I am giving excuses for my forgetfulness.

Thus, one can reasonably assume that the process of overlap-synthesis contributes to pre-reflectively individuating first-person-singular experiences too: experiential phases are taken to be components of a unitary experience in part because they share the same intentional object or quality.

This assumption is confirmed by Zahavi’s (2011: 23) reference to “the level of passive synthesis.” Indeed, Husserl (2001: 174) indicates that passive syntheses establish experiential unities in virtue of their concrete “content.” This indication suggests a wide use of the term “content” that includes the intentional object and the quality (in the stricter sense) of experiences. Hence, the assumption on the individuating function of overlap-synthesis for first-person-singular experiences can be formulated as follows: experiential phases pre-individuate temporally extended unities partly in virtue of the similarity of their contents.

4.3 First-person-plural emotions

During communal celebrations, how is it possible that we may tend to pre-reflectively experience ourselves as seized by one and the same emotion? The straightforward view clarifies this phenomenon by assuming that the pre-individuation of a shared emotion is partly underpinned by the same process of association by similarity that contributes to the individuation of things and people and the pre-individuation of first-person-singular experiences. Just as a plurality of stimuli and experiential phases are pre-reflectively taken to be perspectives on the selfsame person and the selfsame first-person-singular experience, respectively, so the experiences of the participants in emotion sharing are passively taken to be perspectives on one and the same emotion.

To start unpacking this idea, consider the following characterizations of the situation of communal celebrations described by Stein:

I comprehend the others’ joy and see it as the same. As a result, our respective joys overlap and coincide. (Zahavi and Rochat, 2015: 545)

The individuals involved […] comprehend one another’s emotional response, and sensing the similarity of those responses, their experiences merge into one. (Thonhauser, 2018: 1009)

According to the straightforward view, our respective joys “coincide” precisely in the sense that they “merge into one,” i.e., they are revealed to be one joy. This is possible in virtue of their “overlap,” or “similarity.” It is the overlap between a plurality of experiences that allows these experiences to function as appearances of one and the same experiential unity.

For Scheler (2008: 255) and Stein (2000: 135), one can appreciate how there can be distinct individual ways of appearance of one and the same shared emotion if one understands that there can be distinct ways of presentation of one and the same individual emotion for a single person at different times. The latter idea is explored through the distinction between (a) the “function,” or “Erleben,” and (b) the “what,” or “content,” of the experience (Frings, 1996: 28; Stein, 2000: 16-17)—this is a different sense of “content” than the one introduced in 4.2. “Function” denotes the great variety of ways in which a content can be “picked up” into the mind; “content” denotes what can be experienced to remain the same across those variations (in short or long-lived experiences).

For example, the sorrow for the loss of one’s son/daughter can be repressed, intensely felt (as when one comes across objects reminding one of him/her), or even embraced when one finds a positive meaning in it (Stein, 2000: 18). Across these transformations, the affective response remains the same one (the sorrow, or suffering, provoked by the loss remains), but the attitude towards it changes. At t1, person A, who has a “crush” on person B, is working hard to finish earlier and be able to see his/her beloved in the evening; the feeling is motivating him/her in the background. At t2 in the same afternoon, person A is indulging in the thought of B; his/her love is now fully in the foreground. It seems a case of intellectual stubbornness to persist in repeating that A is not experiencing one and the same feeling at t1 and t2 (Scheler, 2008: 257-258). To take another example from Stein (2000: 19), a useful one outside the emotional sphere, a person who has a certain job or responsibility, or who is in a particular psychophysical state, is acutely receptive (a particular “function,” or “Erleben”) to a class of stimuli (the “content”), even when these stimuli lie in the background of attention. In the last analysis, the distinction between “function” and “content” is even intrinsic to the characterization of the retentional-protentional process as varying inner perspectives on an experience (4.2).

Accordingly, in reference to the two parents standing in front of their dead child, Scheler (2008: 37; translation modified) explicates the interplay between the numerical oneness of the shared emotion and the differentiation of individual perspectives as follows:

The function [Funktion] of feeling in the father and the mother is given separately in each case; only what they feel—the one sorrow—and its value-correlate, is immediately present to them as identical.

The value-correlate is the objective target of the emotion, its “intentional object,” the loss of their child. Hence Stein’s (2000: 135-136; translation modified) description follows Scheler’s identification, in the passage just quoted, of the two elements (the “what” and its intentional object) that constitute the token-identity of the grief as a whole, notwithstanding each individual’s “veneer:”

They all feel “the selfsame” grief. This “selfsameness” has significance that merits precise exposition. […] The correlate of the experience is the selfsame for everyone who participates in it. And correspondingly, the sense-content of each of the individual experiences applying to this correlate is idealiter the selfsame [derselbe], notwithstanding the private veneer that encloses it at any given time. Therefore in every experiential content we have to distinguish a core sense from the particular sheath it takes on in the experiencing of this or that ego. (Footnote 3)

In Stein’s (2000: 16-19) framework, “content” includes a reference to the “egoic contents,” which “lie on the subject side.” “Egoic contents” are what today would be described as “affect (the mental counterpart of bodily sensation, with properties of valence and arousal)” (Hoemann et al., 2019: 1831). “Egoic contents,” or “affects,” are considered to be part of an emotion when they are referred to an intentional object and linked to specific action-tendencies. Stein (2000: 164) suggests that, in a communal emotion,

there’s an identical core that can recur in the egoic contents of different subjects.

For example, in communal celebrations, I have an intuitive awareness of the affective contents experienced by the others: these subjective feelings are apparent “in” the others’ expressive behaviors. Importantly, although the others’ feelings are perceived by me as belonging to the others, they are felt to be continuous with mine in virtue of their qualitative features. Pre-reflectively, I do not experience my feelings and yours as constituting two emotions (“my emotion” and “your emotion”), but I experience them as part of one overall emotion we are living together. As Stein puts it,

egoic contents, which don’t just befall the subject peripherally but rather fill the subject inwardly, are themselves already experienceable as common. (Stein, 2000: 165)

On the whole, Scheler and Stein provide us with much material to address the question of how association by similarity contributes to the pre-individuation of a common emotion: the experiences of the participants are taken to function as perspectives on a common emotion because these experiences share the intentional object, similar evaluative components, expressive behaviors, and the quality of subjective feelings. Salmela (2012) confirms that these are similarities that contribute to the formation of a shared emotion and adds the similarities concerning physiological changes, action-tendencies, and the fundamental concerns of the participants.

To become more familiar with the function of similarity, consider, once again, a couple of examples. While we are all caught up in the excitement over a sporting victory, we see that one of our friends responds with a casual smile and a cursory “Ah, we won, good,” and then goes back to work. His reaction is too different from ours for us to take him to be participating in our excitement. Here “attunement” (similarity of expressive behaviors and subjective feelings) is key. The composer, the man at the triangle, the audience, etc. participate in the communal joy at the successful execution of the premiere (Schmid, 2009: 82). The envious composer, who would have liked his own symphony to be executed instead, does not. Here “valence” is key.

The more examples we examined, the more we would realize that there is no need to provide a list of necessary or sufficient aspects under which experiences have to be similar in order to function as manifestations of a common emotion. On the contrary, a close discussion of the domains considered in 4.1 and 4.2 could even reveal that similarity never individuates through necessary and sufficient conditions. In the case of emotion sharing, the functional similarity could sometimes even be as abstract as Schmid (2009: 67-68) takes it to be.

Let’s recap the most important points with two quotes from Stein. First, we should not be surprised that individual experiences pre-reflectively figure in the individuation of a communal emotion. That different experiences can undergo this kind of synthesis is “nothing new:”

The grief as experiential content of the community is what, of the rationally required grief, is actually realized and intended in the experiences of the individual participants. That an experiential content coalesces out of multiple components is, considering the case of individual experience, nothing new. Indeed, an individual experience too is not something instantaneous. Rather, it develops in a continuity of experience during a certain period of time and shows all sorts of qualitative fluctuations within its unity. (Stein, 2000: 136; translation modified; cf. Stein, 2000: 137)

Returning to the use of the term “content” introduced in 4.2, which includes the intentional object, one can say that pre-individuation partly relies on a process of association via similarity of contents, both in the individual and in the communal case:

Its efficacy [of association in virtue of concrete contents] within individual consciousness makes it understandable that complex communal experiences coalesce. (Stein, 2000: 169)

5 Defending the prima facie viability of the straightforward view

The straightforward view implies that other-awareness is necessary to distinguish sharing from emotional contagion (Scheler, 2008: 37; Stein, 2000: 175; Schmid, 2009: 66): in emotional contagion, I am unaware that my emotion is caused by the other’s, so I experience it as “my own;” in sharing, I am aware that someone else shares the emotion with me, so I experience it as “ours” (Schmid, 2014a: 9; Stein, 2000: 134).

However, León et al. (2019: 4856, 4860, 4862-4863) claim that the straightforward view is incompatible with other-awareness. Because this objection is tantamount to the accusation that the straightforward view is unable to distinguish between contagion and sharing, it seems to undermine the prima facie viability of the view. Hence, I shall address this objection.Footnote 4

León et al.’s objection seems to rely on a generalization of the Husserlian principle of mutual exclusivity between self- and other-awareness. The principle consists in the idea that if I were aware of the experience of the other as “mine,” i.e., as first-personally given, then the other’s experience would belong to my stream of consciousness, not the other’s; I would no longer be other-aware, but self-aware, and vice versa if I were aware of an experience of mine as “yours” (Zahavi, 2005: 154-155). Because León et al. suppose that the principle is not only valid in the domain of individual experience, but also in the domain of shared experiences, they assume that a (pre-)individuating synthesis between experiences given in the modes of self- and other-awareness requires annulling the difference between these modes: one and the same emotion cannot be given as both “mine” (first-person-singular-awareness) and “yours” (other-awareness).

My reply consists in explicating once again the structure of pre-individuating syntheses. The analysis shows that the mutual exclusivity of the modes of self- and other-awareness is not generalizable from the individual to the communal domain. On the contrary, a communal synthesis requires the essential contribution of both modes.

In general, in an experiential synthesis, there is certainly something that brings the components together, but this does not have to annul the differences between the components. Instead, the unity resulting from the synthesis can be a unity of components that are in some respects different. In a pre-individuating overlap-synthesis, what brings the components together is the similarity of the contents, while the different modes are by no means encumbered, and are allowed to play a decisive role.

In the individual case, the synthesis occurs through sameness of intentional object and/or through similar qualities across experiential phases and does not affect the different “temporal modes” characterizing the phases (4.2). If it cancelled their difference, the result of the synthesis would not be a temporally extended experience, because the components would be given as all occurring at the same time.

In the communal case, the components that are brought together are the individual experiences. The synthesis occurs in virtue of some among the numerous similarities of content indicated in 4.3 (same intentional object, similar behavioral, physiological, expressive responses, feelings, underlying concerns, etc.) and does not affect the different modes presenting the individual experiences, i.e., self- and other-awareness. If it annulled their difference, the result of the synthesis would no longer be a communal emotion, i.e., an emotion shared among a plurality of co-subjects, because the emotion would be given as having only one subject (the other would be identical to the self). In a communal synthesis, the unaltered difference of the subjective modes allows for the individual experiences to be taken as individual perspectives on the unitary emotion.

Therefore, I agree with Salice (2015: 58-59):

The [shared] mental state (the emotion) is one, but the way in which it is given to me (or: the way in which I feel it) is radically different from the way in which it is given to you (or: from how you feel it). […] Since I am just co-experiencing the emotion and, hence, since I merely co-own the emotion, I am also aware that the way in which you feel that emotion is precluded to me and that I do not have access to how the emotion is given to you (to what you feel) from within.

Husserl (1999: 109, 114) posited mutual exclusivity of self- and other-awareness in his discussion of singular experience (first, second, and third-personal) and Stein (1964: 33) endorsed it. However, both Husserl (1962: 281; 1973: 201) and Stein (2000: 138, 141, 144, 190) emphasized that communal states are mental states of a “new kind,” or a “higher level.” This is an indication that not all principles valid in the individual domain can be carried over to the communal. In fact, there are at least two reasons to suppose that Husserl did not believe that the principle of mutual exclusivity cut across these domains (Vincini and Staiti, forthcoming).

First, Husserl (1973: 192-204) defended a straightforward view of communal states such as communal convictions, evaluations, and intentions. Second, Husserl described the constitution of communal states precisely in the terms of an individuating overlap-synthesis presented in Sect. 4. Though with a vocabulary that may sound somewhat archaic to the contemporary ear, the passage summarizes the idea of a comprehensive synthesis including an experience given in other-awareness as “necessarily separate:”

Consciousness, however, truly coincides with consciousness [Bewusstsein aber deckt sich wirklich mit Bewusstsein], a consciousness that understands another consciousness constitutes within itself the selfsame that the other consciousness constitutes. Both are one in the selfsame. […] In this way, consciousness merges itself with consciousness, overflowing all the time, encompassing time in the form of simultaneity as in the form of succession. Personal consciousness becomes one with another consciousness, which is, individually, necessarily separate from it, and thus becomes a unity of a super-personal consciousness. (Husserl, 1973: 199; cf. Husserl, 1973: 199-200, lines 37-5) Footnote 5

Like Scheler and Stein (Sects. 3-4), like Salice and Husserl (present Sect.), Schmid (2009: 79), Krueger (2013: 522-524) and Krebs (2015: 141) take the straightforward view to be compatible with the differentiation of self- and other-awareness. For example, Schmid (2009: 77; 2014b) affirms the necessity of pre-reflective self-awareness for individual consciousness and takes pre-reflective self-awareness as the point of departure from which to develop his view of the first-person-plural.

Furthermore, since binocular space perception develops by 3–4 months of age, the argument that “simple visual perception” requires self-world differentiation commits Schmid (2014b: 15) to the view that a 4-month-old baby possesses a primitive self-world differentiation. It is true that, in one passage, Schmid (2014b: 22-23) argues that self-awareness presupposes rather than explains the sense of us, but, in that passage, he doesn’t deny that the sense of us presupposes pre-reflective individual self-awareness.

Analogously, Schmid (2009: xvii, 77-82) argues for the compatibility between straightforward sharing and what he identifies as the truth of individualism, i.e., the “separateness of persons.” He concludes by characterizing other-awareness as a constitutive element of sharing (82-83). After describing shared emotions as implying the sense of us, Schmid (2014a: 9-10) states that the “us” who constitutes the co-subjects of the emotion implies “more than one participant.”

It is important to note that early and contemporary proponents of the straightforward view converge on the compatibility between the token-identity of the shared emotion and the differentiation of self- and other-awareness. However, my reply to León et al. is not only exegetical, but systematic in character. I have argued that the Husserlian principle of mutual exclusivity is not generalizable beyond singular (i.e., individual) experience. In the case of communal experience, pre-individuation works through the similarity of contents and makes use of different modes in order to pre-configure a comprehensive emotion felt as both “mine” and “yours,” i.e., as “ours.” At bottom, co-ownership coincides with a basic characterization of the sense of us, i.e., as the subjective character of an overarching emotional response having self and other as distinct co-subjects.

6 Confirming the prima facie viability of the straightforward view

The straightforward view meets three other requirements for prima facie viability. First, the straightforward view adds no ontological component to a naturalistic theory of emotion, because it merely states how the components contribute to constituting an individuated emotion: these components are physical and conscious processes of interacting individuals.Footnote 6, Footnote 7 The view rejects the consciousness of a “collective person as an individual person of only wider scope” (Scheler, 1973: 522-524; cf. Schmid, 2009: 156) and demands only to acknowledge a particular structure of the emotion of the participating individuals:

The shared feeling is nothing in addition to what the participating individuals feel. (Schmid, 2009: 81)

Second, the straightforward view can clarify how shared emotions appear in reflection. Reflection always occurs from a specific angle. What an individual pre-reflectively feels in emotion sharing can always be considered either as an individual experience or as a subjective perspective on a communal emotion, depending on what reflection aims at (Stein, 2000: 141, 164). Take again the simplest case of emotion sharing between two individuals (“you” and “me”). If, in reflecting on the situation, I look for individual emotions (perhaps because I’m used to thinking of emotions as individual experiences), then I will find two individual emotions, “yours” and “mine.” If, however, I seek to comply with the pre-delineation suggested by my pre-reflective experience and look for a shared emotion, then I find one emotion: “ours.” For this reason, Schmid states:

There are two ways of counting the number of feelings involved. (Schmid, 2009: 81)

However, after we count the individual emotions, there is no feeling left that can be counted as an additional emotion—because there is nothing in addition to what the individual feels—and, analogously, when we count the shared emotion, there are no experiences left that can be counted as additional emotions—because individual experiences are already considered as perspectival constituents of the shared emotion. As Schmid summarizes:

There is no legitimate way of counting that yields three. (Schmid, 2009: 81)

Finally, the straightforward view can account for the functional role of sharing in social cognition, which is in the foreground in developmental science (Mundy, 2018). As discussed by philosophers (Eilan, 2007; Campbell, 2011), this role is both causal and epistemological in that it not only brings about knowledge about others’ mental states, but also justifies it. The form of this justification is: “I know what your mental state is because I am living it too” (Eilan, 2007: 135; Schmid, 2014a: 9; Stein, 1964: 17). Or, to use Campbell’s (2011: 416, 425) term, I know what your mental state is because, in sharing, I have access to it through “introspection.”

Campbell’s notion of introspection corresponds to the Husserlian notion of “originality” (Zahavi, 2005: 53)—indeed, Campbell characterizes sharing as a “primitive” experience. “Originality” is compatible with “fallibility” and denotes the maximal kind of presentational directness in a comparison between experiences. For example, seeing a tree (perceptual presentation) is not infallible, but is more direct than seeing a picture of it (iconic presentation) and merely reading about it (symbolic presentation). First-person-singular awareness of an experience is not infallible, but is a more direct mode of givenness than when the selfsame experience of mine is perceived by another person (other-awareness without sharing).

The straightforward view implies that sharing arises from a reconfiguration of self- and other-awareness. The overarching emotion that emerges in this reconfiguration cannot be given in a more direct way than the first-person-plural: neither to me nor to the other. Therefore, the straightforward view accounts for the special epistemological force of sharing: knowledge of other minds based on original (most direct) givenness.

7 The challenge of early sharing

Although the foregoing sections have surveyed the prima facie viability of the straightforward view, there is still a fundamental challenge that can undermine it. The view does not seem to apply to infant-caregiver emotion sharing. In face-to-face interaction, a baby seems to have a self-experience that is primarily proprioceptive and an experience of the caregiver that is primarily visual. Thus, there seem to be no similarities between the emoting self and the emoting other, and without these kinds of similarities, there is no pre-reflective synthesis that can pre-delineate a shared emotion. If it were not possible to explain how pre-individuation occurs in the anthropologically fundamental case of infant and caregiver, the straightforward view would appear weak, to say the least. This is the challenge of early sharing.

The pairing account may be able to rescue the straightforward view and address this challenge. It’s time to turn to this account. The account states that for each kind of embodied experience of the other that the infant starts perceiving, the similarity between the other’s and one’s own behavior activates the sensorimotor schema formed through first-person-singular experience; the activation of this schema underlies the perception of the other’s behavior as expressive of the corresponding embodied experience (Vincini, 2020).

The idea that pairing can rescue the straightforward view and address the challenge of early sharing is tantamount to the claim that pairing contributes to explaining early sharing too. A first reason not to dismiss this claim is that it is already implicit in Hobson and Hobson’s “identification” hypothesis.

For Hobson and Hobson (2007: 411, 415, 426; 2011: 124), identification is the cognitive-psychological process that underpins different phenomena such as (a) social perception, (b) imitation, and (c) sharing. Hobson and Hobson (2007: 411) define “identification” in terms of “assimilation.” The latter is a notion that Hobson and Hobson (2007: 415; 2011: 130) take from Freud, Laplanche and Pontalis, and that refers to the process of association by similarity. Finally, Hobson and Hobson (2011: 130) explicitly identify “identification” with Husserl and Merleau-Ponty’s notion of pairing. Thus, by equating “identification” with pairing and by hypothesizing the explanatory role of identification for sharing, Hobson and Hobson implicitly suggest the relevance of pairing for sharing.

8 The pairing account of infant emotion perception

The pairing account suggests that infant-caregiver interaction, or “proto-conversation,” is the locus where infants experience the same or similar features in their own and others’ behaviors. Vincini (2020) identifies three main dimensions of self-other similarities to which infants are regularly exposed: 1) others’ vocalizations are more similar to infants’ own vocalizations than most other sounds; 2) parental imitation; 3) spatiotemporal proximity and functional equivalence: both self and other are there in the interaction and play comparable roles—e.g., both self and other “respond” to each other’s gestures shaping the playful interaction (Stern, 1990: 66-67).

There is considerable support for a pairing account of infant emotion perception. First of all, eminent contemporary hypotheses are non-nativist in character (Hoemann et al., 2019; Oosterwijk and Barrett, 2014; Ruba and Repacholi, 2019; Sullivan and Minar, 2020). The pairing account explicitly posits that emotion perception is enabled by the experience of the same or similar features in the expressive behavior of self and others. It thus assigns a key role to the well-known phenomenon of parental mirroring of affective behavior (Heyes, 2018: 501-502). When infants see others resonate their emotional behavior, they rely on their remarkable domain-general capacity of shaping perception in light of abstract features that remain the same across different stimuli (Quinn and Bhatt, 2015: 700, 703, 707).

It is because the other resonates the self’s happy behavior that the infant can assimilate the other’s behavior and see happiness “in” her face. This hypothesis is also supported by the existence of a correlation between previous parental mirroring and the degree of neural mirroring activation when perceiving others’ facial expressions (Rayson et al., 2017; De Klerk et al., 2019). Rayson et al. (2017: 5-6) would even insist that “very little experience of being mirrored” in “relatively infrequent periods of contingency” could be sufficient to mold infant emotion perception.

9 Pairing and sharing

Pairing posits that social perception—just like non-social perception—is constrained by the content of the apprehended stimulus. It follows that others’ embodied experiences cannot be given in the first-person-singular, since this would entail experiencing the other’s expressive behavior as instantiating the “here” of egocentric space. The constraint impeding a complete assimilation is that the other’s behavior is presented as being “over there” in contrast to the “over here” of first-person-singular behavior (Husserl, 1999: 118-119).Footnote 8 Consequently, pairing is an ideal account to explain early emotion sharing because it suggests that emotion perception implies both the functioning of association by similarity and the self-other differentiation that are required for emotion sharing.

At this point, we can connect the dots. The pairing account shows that the similarities between self and other have a functional role in early emotion perception (Sect. 8). Association by similarity underpins the individuation of things and people, and its pre-delineation in first-person-singular experience and emotion sharing (Sect. 4). Hence, it is reasonable to assume that self-other similarities also lead the infant to identify the other’s emotion with her own. In other words, in playful infant-caregiver interaction, the very same similarities that generate emotion perception enable emotion sharing as well. Just as in celebrating the sporting event it is the similarity of the other’s response that allows adults to experience the other as sharing in their own emotion, so the infant, through the same unhindered domain-general process of “overlap-synthesis,” will tend to experience the playful caregiver as sharing in her own emotion.

Is it likely that, in an intense playful interaction, the infant experiences two numerically distinct emotions? No, it seems more plausible that the infant experiences one overall excitement. This excitement is communal: it is lived through from her own first-personally-felt body “over here” and “somehow from a distance” (Merleau-Ponty, 1964: 118), i.e., from “over there” by the perceived caregiver (Stern, 1990: 18-19). The greater plausibility of this type of (pre-)individuation for early emotion sharing has been confirmed by Eilan (2007), Krueger (2013), and eminently by Tronick (2005) with his notion of “dyadic state of consciousness.” The pairing account shows why such (pre-)individuation is possible.

10 Testing the connection: sharing and gaze perception

Lastly, to exemplify how the connection between pairing and sharing could be tested, I shall consider gaze perception and shared attention, since, with regard to these phenomena, there is already an experimental study investigating their correlation (Brune and Woodward, 2007) and joint attention is closely related to emotion sharing (Hobson and Hobson, 2011).

Consider the distinction between:

1) Gaze following, i.e., following the direction of the movement of the other’s head and eyes.

2) The perception of the caregiver’s imminent disposition to interact with X; this arguably derives from learning the consequences of the other’s gazing behavior.

3) The perception of the other’s eyes as expressing the “depths” of the other’s animacy (Stern, 1990: 58, 63).

4) The perception of “care,” i.e., of the positive intentions and emotions the caregiver has toward the infant herself. The pairing account could explain this kind of social perception by noticing that infants experience similar behavioral patterns in self and caregiver. One of these is the basic invariant sequence of emergence of the self’s need (e.g., for warmth or tranquility), contingent response (for the infant: voluntary crying, turning, etc.; for the caregiver: picking her up, carrying her around, etc.), and fulfilment of the self’s need. Another is the pattern of vocalizations, kinematics, turn taking, etc. that in a playful interaction are lived by the infant as part of a positive emotion toward what the infant herself is doing. Hence the infant could assimilate the other’s behavior as expressing positive intentions and emotions toward the self.

5) The perception of the other’s perception where the organ and modality of the other’s perception remains unspecified; see Johnson et al. (2007: 536) and Luo and Johnson (2009: 148) for excellent explanations.

6) Gaze perception: perceiving the other’s visual (modality-specific) experience of the environment, i.e., that the other makes “visual contact” with X (Brooks and Meltzoff, 2005: 541). Gaze perception is the intuitive access to the other’s “experience of seeing” (Woodward, 2003: 309).

The pairing account suggests that #1-5 develop earlier than #6. Specifically, on the basis of the convergent results from three different experimental setups that differentiate gaze following from gaze perception (Beier and Spelke, 2012; Brooks and Meltzoff, 2005; Woodward, 2003), it supposes that gaze perception arises from around 9-12 months on the basis of sensorimotor gazing experience (Vincini, 2020). Nonetheless, the pairing account alone remains somewhat unconvincing with regard to gaze perception. One might argue that there does not seem to be enough similarity between the gazing behavior of self and others to allow for a perception of others’ gaze at 9-12 months.

Here it is the straightforward view that may “rescue” the pairing account. The 9-12 months period is that of the emergence of the strictly defined phenomenon of joint attention in free-play (Boyer et al., 2020). In light of the function of sharing in social cognition (Sect. 6), one can hypothesize that there is also a causal connection going from sharing to social perception: it is sharing visual attention with the other that allows the infant to recognize the other’s visual attention. Given that—in accord with the straightforward view—attention sharing does not rely solely on the similarity of gazing behaviors, but also on a wider range of behavioral similarities between interacting individuals (e.g. both infant and caregiver are sitting and interact with a toy, they both smile, vocalize, point, etc. in relation to the toy and each other), as well as on the more basic similarities between self and others already given at that age (both self and other are given as minded agents), the hypothesis of a causal connection from sharing to gaze perception allows the pairing approach to explain gaze perception without resorting to a nativist, domain-specific process.

Consider, in contrast, Reddy’s (2008) nativist assumption that 2-month-olds perceive that caregivers are “seeing” them, i.e., that they are being “looked at” by caregivers. It is the inclusion of gaze perception in the social perception of the 2-month-old, or—which is substantially the same—the failure to distinguish gaze perception from #1-5, that commits Reddy to a domain-specific process underpinning gaze perception.Footnote 9 How can the non-nativist pairing account be supported in contrast to a nativist hypothesis such as Reddy’s?

Brune and Woodward (2007) found initial evidence for a positive correlation between shared attention and gaze perception, but, unfortunately, their call for further studies specifically testing this link seems to have been lost in the complexity of the field’s empirical questions. In addition to further studies employing Woodward’s (2003) habituation procedure for gaze perception, it could be particularly telling to investigate whether shared attention correlates with Brooks and Meltzoff’s (2005) measure of gaze perception—i.e., gaze-following the adult’s head turnings only when these are executed with open eyes (vs. closed eyes). Because Brooks and Meltzoff’s methodology highlights the specific meaning of perceived eyes, ascertaining a positive correlation would support the idea that it is only from around 9-12 months and, at least in part, thanks to the sharing of visual attention, that infants perceive eyes as expressive of seeing. This empirical correlation between shared attention and gaze perception would therefore support the non-nativist domain-general approach of the pairing account.

11 Conclusion

Let’s recap. The straightforward view is potentially undermined by the challenge that the infant experiences self and other too differently for a synthesis pre-individuating a common emotion to occur (Sect. 7). The pairing account provides a reply: the similarities that enable emotion perception also allow the infant to pre-delineate a comprehensive emotion shared with the caregiver (Sects. 8-9).

Vice versa, the crux of the pairing account—i.e., the functional role of self-other similarities—appears to extend its explanatory power to an area of fundamental anthropological significance. This is thanks to the straightforward view, which implies the idea that the pre-individuation of a shared emotion partly relies on the domain-general process of association by similarity—overlap-synthesis—that also contributes to the individuation of ordinary things and people, and to the pre-individuation of individual experiences (Sects. 3-4). The connection between pairing and sharing—which could be empirically tested—strengthens the non-nativist approach to gaze perception (Sect. 10).

That its association with the straightforward view supports, rather than weakens, the pairing account is ensured by considering that the process of overlap-synthesis does not annul, but rather requires different experiential modes, as can be seen in both the individual and the communal case. The pre-individuation of a communal emotion relies on the similarity of the “contents” of the individual experiences (same intentional object, similar behavioral, physiological, expressive responses, subjective feelings, underlying concerns, etc.) and makes use of different subjective modes of awareness, the first-person-singular and other-awareness, precisely in order to pre-delineate an emotion as both “mine” and “yours,” i.e., as “ours” (Sect. 5). The assumption that the metaphysical question of what components constitute a unitary emotion is not independent of the question of how it is individuated for the subject(s) involved (a phenomenally lived emotion is a mereologically complex embodied affective response that is pre-reflectively experienced as a unity, as well as a reality that has effects on its subject(s) and on the social/non-social environment) is in line with the phenomenological tradition and represents an acceptable presupposition in the current phenomenological debate (Sect. 2).

In short, the straightforward view and the pairing account strengthen each other. If this conclusion is legitimate, the debates on emotion sharing and social perception are still very far from being settled, but are enriched with a straightforward view that is more solid from the cognitive-developmental standpoint and with a pairing account that has expanded its explanatory value, respectively.

That this alliance between pairing and straightforward sharing may be relevant to the phenomenology of joint action can be inferred from the fact that the straightforward approach has been generalized to a variety of shared-intentional phenomena (Scheler, 2008: 258) and specifically applied to joint action (Krebs, 2015: 149; Husserl, 1973: 193; Stein, 2000: 193). However, I would like to signal another possible link as well.

For the straightforward view, sharing is the pre-individuation of a unitary overarching state of which self and other(s) are co-subjects. It therefore implies a basic sense of us. Given the dependence of both pairing and straightforward sharing upon self-other differentiation, what their alliance puts forward for a phenomenology of joint action is a conception of the sense of us as ultimately based on minimal individual awareness—first- and second-personal—which nonetheless brings to the fore how individual awareness can itself be affected, psychologically and epistemologically, by experiences in the first-person-plural (Sect. 6).