Two decades ago, Goldman (1993) drew attention to various ways in which traditional branches of philosophy can productively collaborate with the cognitive sciences. This research field has since experienced remarkable growth, and in the current scientific landscape the junction between the cognitive sciences and philosophy of mind is arguably one of the most dynamically evolving areas. There are numerous issues that are productively discussed within this field, but one of the current debates is particularly stimulating, because it seems indicative of a gradual change in opinion about the way in which the mind and the cognitive apparatus are to be studied. ‘Situated’ and ‘extended’ views of human cognitive activity oppose the idea that all human thought is constituted by computing activity located only in the head (Rupert 2009). Seminal work by Edwin Hutchins, which included the investigation of some cognitive processes that were involved in a ship’s navigation (1995a) and in a cockpit (1995b), has shown how cognition may be viewed as a distributed phenomenon that is fundamentally situated in practices. This early work on situated and distributed cognition has been influential, and today we can distinguish between two prominent views.

According to proponents of the ‘Hypothesis of Embedded Cognition’ (HEMC), cognitive processes depend very heavily on organismically external entities and environmental structures in which cognition takes place (Rupert 2009). Proponents of the ‘Hypothesis of Extended Cognition’ (HEC) like Andy Clark and David Chalmers go a step further, arguing that minds are systems that at times extend beyond the boundaries of the human body to include environmental resources as their proper parts (Clark and Wilson 2009; Clark and Chalmers 1998; Clark 2008, 2010a, b; Chalmers 2008). Thus, while the HEMC claims that cognitive activity exploits the surrounding environment to carry the cognitive load, HEC supporters challenge the view that the physical substrates that make up the vehicles of cognition are exclusively located in the brain and the body of the individual and argue that in some cases extra-cranial and non-biological items are simply constituents of cognitive processes (Clark 2010b, p. 84, 2008, pp. 135–139).Footnote 1 Proper parts of cognitive states sometimes extend into the environment, and hence cognition is sometimes ontologically distributed (Wheeler 2010, p. 246) and constituted by certain active features of the environment (Clark and Chalmers 1998; Clark 2008; see also Menary 2010a; for a critique see Adams and Aizawa 2001, 2008, 2010; Rupert 2004, 2009). One important difference among proponents of the HEC is that while the ‘first wave’ (Clark and Chalmers 1998; Clark 2008, etc.) is based on the parity principle, the ‘second wave’ is characterized by the complementarity principle: external entities, rather than being mereological parts, are seen as complementary tools that become integrated into one cognitive system (see Sutton 2010). This move involves either rejecting (Menary 2010b) or remaining neutral (Rowlands 2010) on the functionalism inherent in the ‘parity principle’.

Clark and Chalmers (1998) invite us to consider intuitively compelling thought experiments involving dispositional beliefs. Inga and Otto both forget the whereabouts of the museum they want to visit. Inga checks her memory, while Otto compensates for his failing internal memory by consulting a notebook he always keeps at hand. These two processes, according to Clark and Chalmers (1998, p. 14), are cognitively indistinguishable. The only difference is that the realizer of Otto’s belief is distributed between Otto’s brain and his notebook (Horgan and Kriegel 2008). Following the ‘parity principle’ of the HEC, if there is functional similarity in causing actions between the contribution of internal and external elements, then equivalent (cognitive) status must be granted to the external entity. Otto’s extended state is just as mental as Inga’s non-extended state, if and only if Otto’s extended state has the same causal-cognitive role in his mental life as Inga’s non-extended state does in hers (Chalmers 2008; Clark 2010a).Footnote 2 In order to avoid an overly permissive account, and hence the accusation of ‘cognitive bloat’, Clark and Chalmers (1998) add additional criteria.Footnote 3

Even though Clark argues in favour of the possibility of a distinct ‘socially extended cognition’ (Clark and Chalmers 1998; Wilson and Clark 2009),Footnote 4 the usual examples in discussions of the HEC by and large involve individual props like notebooks or cellular phones, somewhat neglecting that human problem-solving often relies on social interaction and other intrinsically social components. Only a small number of researchers have explored cases of socially extended cognition. Wilson (2004, 2005) has shown that many advanced cognitive performances are distributed, not merely over the technological and psychical, but also over social and cultural “scaffolds.” He takes seriously the claim of the HEC that cognitive processes may be partly constituted by external resources, and, accordingly, he stresses “the constitutive role that an individual’s social milieu plays in her cognitive activity” (Wilson 2005, p. 230).Footnote 5 Some researchers like Tollefsen (2006), Gallagher and Crisafi (2009), Theiner et al. (2010), Cash (2013), Krueger (2013) and Gallagher (2013) have since explored social extensions, and while their approaches and claims obviously differ, there are at least three commonalities between them.Footnote 6 First, these authors acknowledge that specifically socially extended cognition may be more than merely a subgroup of extended cognition. For instance, Tollefsen (2006) argues that there are important differences between ‘solipsistic’ cognitive systems made up of individual agents and their technological artefacts and ‘collective’ cognitive systems that are constituted by human agents. Second, they point out that socially extended cognition not only displays different regularities; in some cases, it may also be resilient to several objections that have been raised against the standard HEC. Third, these authors extend the HEC to groups and analyse larger problem-solving wholes, such as cognition in larger groups.

This paper draws on this line of thinking about socially extended cognition and subscribes to the first two shared points about the special status of socially extended cognition. However, rather than discussing larger systems, the focus here will be on second-personal interaction between human agents, namely on a dense type of ‘dyadic interaction’ between infant and caretaker that gives rise to the cognitive emotion regulation of the infant.Footnote 7 Krueger (2013) approaches the issue of how attentional control is shaped by specific interventions by the caretaker in a way that entrains the child to be sensitive to some sociocultural norms. While he briefly touches on emotion regulation, I attempt to provide a detailed analysis of the underlying interactive processes and discuss whether and in what sense the relevant effects are best described as emergent. Given this focus, I refrain from reflecting on ways in which dyadic emotion regulation might be influenced by sociocultural norms.

In the first part of the paper, it is argued that ‘dyadic interaction’ between infant and caretaker is a case of socially extended cognition, since cognitive emotion regulation in the infant is achieved within the framework of the interaction, by the inclusion of extra-somatic environmental resources provided by the caretaker (1). Then, it is argued that ‘dyadic interaction’ represents a significantly different extended cognitive phenomenon, and that it exhibits a different dynamic involving ‘uncontrollability’ and ‘irreducibility’ that cannot be fully accommodated within the HEC (2). It will be shown that Clark’s recent and slightly revised version of the HEC (henceforth HEC*) is valuable for getting a grip on the kind of extended cognition that is at stake in ‘dyadic interaction’, but ultimately fails on different grounds (3). The conclusion here is that in their current form, the HEC and the HEC* are not geared to understanding socially extended cognition.

In the second part of the paper, drawing on the concept of emergence, I introduce the ‘Hypothesis of Emergent Extended Cognition’ (HEEC). It will be argued that the HEEC, rather than being a rival to the HEC, should be understood as complementing it. More precisely, the HEEC may be understood as a version of the HEC that explains cases of socially extended cognition, in which cognitive properties are sometimes irreducibly emergent properties of coupled systems. Put simply, the HEC and the HEEC clarify two types of extended cognition, with the HEEC geared towards understanding cognitive processes in dyadic systems. Using two understandings of emergence that differ in their strength (emergence\(_{1}\) and emergence\(_{2}\)), two types of ‘Emergent Extended Cognition’ (HEEC\(_{1}\) and HEEC\(_{2}\)) are introduced (4). It is demonstrated that both HEEC\(_{1}\) and HEEC\(_{2}\) can clarify extended cognitive processes in ‘dyadic interaction’ between infant and caretaker (5–6).

It is concluded that both HEEC\(_{1}\) and HEEC\(_{2}\) are resilient to a central objection that has been raised against the HEC and that operating with these accounts leads both to a more precise grip on the explanandum and to a more robust explanans. In addition, there are significantly distinct interpretations of this phenomenon among developmental psychologists, and one of the contributions of this paper is to bring the analysis to bear on these debates in a productive way. The guiding intuition in this paper is that exploring the cognitive incorporations of genuinely social elements may both advance HEC debates and contribute to the gradual materialization of a novel framework for the pursuit of cognitive science.

1 Early interaction and the social infant

Classical views of human development, championed by Freud, Piaget, Skinner and Winnicott, argued that the newborn is not yet capable of any social interaction. The claim was roughly that infants only notice other humans if they can somehow be placed within their primitive reflex- or action-schemes. However, there has since been a veritable revolution in our understanding of the social and cognitive skills of the infant, and the idea of the asocial infant has lost its scientific credence. There is now overwhelming evidence showing that far from being born as asocial beings, infants, and to a certain degree even newborns, are in possession of inter-subjective competencies that allow them to meaningfully interact. As Meltzoff and Brooks (2007) have recently pointed out, this new orientation relies on at least three fields of empirical findings, all suggesting a close coupling between the infant and the caretaker. In the following, rather than giving an encompassing account of all these fields, I will concentrate on the third field, namely, work on primary intersubjectivity.Footnote 8

1.1 Synchrony and primary intersubjectivity

In their ground-breaking work, Trevarthen (1979), Trevarthen and Hubley (1978) and Stern (1985) have demonstrated that newborns are innately able to enter into interpersonal contact with their caretaker. Central to such primordial intersubjectivity is the development of a shared ‘we-space’ of experience between the infant and the caretaker, in which the ongoing primitive interaction does not refer to anything outside the interaction itself (Trevarthen 1979; Gallagher 2005; Gallese 2006). Studies have indicated the presence of a variety of early interactions that occur in this shared space of experience. Newborns are responsive to the caregiver’s micro-level behaviours regarding direction of gaze, level of arousal, body orientation, tone of voice and facial expressions, which are all indispensable for the ongoing partaking in emotional exchanges (Muratori and Maestro 2007; Feldman 2007). A great deal of research has focused on the engaged mutual exchange of pleasure-giving movements and vocalizations (so-called ‘protoconversations’) and the typical ‘mirror play’ that involves the playful copying of each other’s vocalizations and gestures. While this research has revealed a wide range of interesting patterns, some have noted that such a shared space of mutually coordinated social interactions might itself have a key role in driving certain social cognitive processes (Krueger 2011; Varga and Krueger 2013). In the context of this paper I wish to focus on the phenomenon of ‘synchrony’ that characterizes some of these dyadic interactions and argue that some of the underlying processes make possible the regulation of the infant’s emotions.

‘Synchrony’ refers to an unforeseen degree of temporal coordination of non-verbal behaviors of the child and the caretaker. This includes body movements, gaze, vocalizations and affect during the earliest caregiver-child interactions (Feldman 2007). In part relying on refined micro-analytic technologies, it has been shown that there are precise synchronizations between leg movements and cry patterns in newborns and adult speech (Condon and Sander 1974; Brazelton et al. 1974; Trevarthen 2002; Feldman and Eidelman 2004; Weinberg and Tronick 1994).Footnote 9 In such synchronic interactions, there is an emergence and maintenance of non-predetermined synchronic interaction patterns over time, in which caretaker and infant complement each other’s states and moderate the level of positive arousal in cooperation (Cohn and Tronick 1988; Tronick 2002; Feldman 2007). Importantly, synchrony here is not to be understood in the sense that we would say two watches are synchronized. Synchrony, rather, refers to the co-creation of patterns that involve not mere copying, but also the temporally and dynamically variable completion of each other’s vocalizations and gestures (Feldman 2007; Feldman and Eidelman 2004).

If the continuous, synchronic interaction pattern between the infant and the caretaker breaks down, or if the previously engaged caretaker suddenly puts on a motionless and neutral face, the infant becomes distressed (Tronick et al. 1979). Murray and Trevarthen (1985) have shown that when the infant and the caretaker interact via two connected monitors, the infant becomes distressed when the ‘live’ interaction with the mother is replaced with a recording that shows the mother interacting with the infant just a short time before. As opposed to the case where the caretaker puts on a still face, it is not the simple lack of expressiveness that upsets the infant. Rather, it is the lack of the ongoing open-ended interactive engagement that causes the disturbance. The distress of the infant stems from the lack of the possibility to enter into synchronic interaction with the caretaker.

Synchrony is a rather fragile process that is always on the verge of breaking down. Often, in the course of the caretaker-infant interaction, there are periods of interactive mis-coordination, in which emotions or intentions are mismatched (Reck et al. 2004). For instance, the infant’s expression of a positive affect might be met by the mother’s expression of negative affect. But importantly, instances of mismatch are usually followed by periods of ‘reparation’, where emotions and intentions match again and the regulatory function is re-established (Tronick 1989). In ordinary interactions, reparation reaffirms the feeling in the infant that problems can be overcome by dyadic interaction.

In sum, synchrony refers to the organization of social behaviour into rhythmic sequences, the matching of micro-level affective behaviour between caretaker and infant, during face-to-face interactions.

1.2 Dyadic synchronic interaction and emotion regulation

Important for our aims here, the dyadic synchronic interaction and the matching of micro-level face-to-face interactions make possible the cognitive regulation of the infant’s emotions (Trevarthen 1993; Tronick 1998; Feldman 2007; Manian and Bornstein 2009).Footnote 10 Developmental literature describes this emotion regulation as the cognitive management of emotionally arousing experience (Thompson 1991; Gross 2014; Gross et al. 2011).Footnote 11 More precisely, there is a double cognitive regulation task that is accomplished in synchronic dyadic interaction. First, we may speak of a ‘live’ emotion regulation process. ‘Live’ means that the previously unattainable achievement of cognitive emotion regulation is achieved in real time within the framework of this dense interaction. Second, this interaction is also the key to the progressive building of emotional self-regulation capacities. In other words, besides the ‘live’ process that regulates the infant’s emotion, there is a simultaneously unfolding diachronic process that develops the emotional self-regulation ability of the infant. While the ‘live’ regulation of elevated levels of emotions is an important developmental milestone, emotional self-regulation is an absolutely indispensable ability throughout a human life, which, if properly developed, enables individuals to deal with stressful events and adapt to changed circumstances and situational demands (Kopp 1989). To understand how indispensable dyadic synchronic interaction is for the emotion regulation of the infant, it is helpful to survey the literature on the capacities of infants of depressed mothers who are unable to enter into synchronic dyadic interactions. There is solid evidence that infants of clinically depressed mothers do not succeed in developing appropriate emotion regulation and develop instead maladaptive self-directed regulation strategies with long-term consequences (Manian and Bornstein 2009; Reck et al. 2004; Tronick and Gianino 1986).
As Manian and Bornstein (2009, p. 9) note, “when the interactive attempts of infants have not been appropriately responded to over time by their depressed mothers, infants turn to self-directed regulatory behaviors. If used as the primary regulatory strategy under stress, these self-directed behaviors may develop into a stable style of self-regulation. This style (...) become maladaptive in the child’s broader social world.”

Given the complexity of the phenomenon, it is not surprising that we find quite different interpretations of the developmental literature. While researchers present somewhat differing accounts, a common point is that they take seriously the idea that in mutual synchronic interaction gazes and vocalizations may simultaneously work as effects and causes. For instance, the interaction may start with the infant’s vocalization or gaze eliciting the caregiver’s matching vocalization or gaze. But importantly, there is simultaneity at play here, since the caregiver’s matching reaction at the same time maintains the infant’s vocalization or gaze.

We may distinguish between three interpretations. First, Hofer (1994) puts forward a comparatively straightforward interpretation, proposing that we should understand the caretaker simply as the more or less direct external regulator of the infant’s affective state. Second, others like Bruner (1975) and Stern et al. (1985) argue that the phenomenon is too multifaceted to be a result of any direct regulation by the caretaker. Instead, they propose a more complex HEMC-like understanding, arguing that the caregiver provides the infant with regulatory, emotional-cognitive scaffolding so the infant can regulate herself. In contrast, a third group of researchers argue that both the ‘direct regulation’ and the HEMC (or ‘scaffolding’) models assume the unilateral adjustment of one partner to the other (Fogel 1993; Tronick 1998, 2002). What is lost in these models, they argue, is the “systemic whole-ness” and dynamic nature of the interaction constituting a mutual regulatory parent-infant system (Feldman 2007). In other words, aspects of the relational dynamics of such interaction remain unseen in both the ‘direct regulation’ and the HEMC models, in that they rely on an (overly) individualistic perspective and understand synchronic interaction in terms of a linear causal chain of events. Interestingly, Tronick, a central proponent in the latter group, provides an explanation that understands this phenomenon as a case of extended cognition in the spirit of the HEC. Tronick describes this process as follows:

Moving onto the question of the regulation of emotional states of the infant, we find that the infant’s emotional states are also regulated dyadically. The principal components are the infant’s central nervous system (e.g., limbic sites) and the behaviours it organizes and controls (e.g., facial and vocal emotional displays) and the caregiver’s regulatory input (e.g., facial expressions, touches, gestures). The dyadic emotional regulatory system is guided by communication between internal and external components (i.e., infant and caregiver) (Tronick 1998, emphasis added).

In Tronick’s overall description, the emotion regulation of the infant’s states figures as a cognitive process that at least heavily exploits the caregiver’s facial expressions, touches and gestures, made available in the synchronic dyadic interaction. Thus, we might at first label this as an instance of the HEMC. However, in Tronick’s view, the contribution of the caregiver’s facial expressions, touches and gestures is not merely of causal nature. It is not just that the infant exploits the surrounding environment to carry the cognitive load (HEMC). Rather, Tronick maintains that they are genuine components of the infant’s emotion regulation, like the infant’s central nervous system and the behaviours it organizes. Thus, it seems that it is more correct to say that in Tronick’s view this process is an instance of the HEC. If Tronick, Feldman and Fogel are right, then the emotion regulation of the infant is not only not brain-bound, but realized by vehicles that extend beyond organism boundaries and include extra-somatic environmental resources provided by the caretaker. Also, the contribution of the dyadic interaction is not merely causal, but genuinely constitutive for the infant’s emotion regulation. The caretaker partly serves as an indispensable vehicle of the infant’s cognition, through which the infant is able to accomplish the previously unattainable cognitive achievement of emotion regulation.Footnote 12

At this point, a proponent of the HEMC might question whether Tronick, Feldman and Fogel are right. She might object that the extra-somatic environmental resources provided by the caretaker should not be understood as constitutive parts of the infant’s emotion regulation, but merely as crucial causal contributions. The point is that since causation and constitution are independent metaphysical relations, facts about causal relations do not tell us anything about facts of constitutive relations. Thus, there is no inference from the infant and caretaker being reciprocally coupled to the claim that the caretaker’s activity is a constitutive part of the infant’s emotion regulation (see for instance Adams and Aizawa 2008). There are, however, several reasons why such criticism can be dealt with. The first possibility is to follow Clark’s (1997) and Wheeler’s (2010) line of reasoning and argue that we are dealing with separate parts of a system that are in a state of “continuous reciprocal causation.” In such cases in which the behavior of each part simultaneously affects the behavior of the other parts, an explanation that dissects the overarching system into insulated causally active parts will unavoidably miss out on important features of the dynamics of the system. The kind of dyadic interaction described here exhibits a form of mutually modulatory dynamics that is better understood in a wider perspective than one that focuses on components merely offering inputs and outputs.

In addition, one may also argue that the criticism suggesting a causal-constitutive fallacy itself rests on a problematic notion of constitution. For instance, Ross and Ladyman (2010, p. 159) argue that the coupling-constitution fallacy relies on the “containment metaphor” and thus the belief that “the world is a kind of container bearing objects that change location and properties over time....they themselves are containers in turn, and their properties and causal dispositions are to be explained by the properties and dispositions of the objects they contain (and which are often taken to entirely comprise them).” This picture, however, has no corresponding image in contemporary fundamental physics, and the distinction between causes and constitution tends to be abandoned as sciences “mature” and converge on robust general models. In a different manner, Kirchhoff (2014, 2015) argues that a better understanding of the constitutive relations in purported cases of extended cognition requires departing from the traditional understanding of material constitution in analytical metaphysics. Instead of relying on a notion of synchronic (compositional) constitution that is inconsistent with most cases of the HEC, Kirchhoff suggests operating with a diachronic notion of constitution that shares an affinity with non-eliminative process ontology. The point is that such a criticism misconstrues the nature of the constitution relation involved in most cases of extended cognitive processes, perhaps due to the assumption that the concept of constitution used to describe purported cases of the HEC must dovetail with the standard account of synchronic material constitution in analytical metaphysics. In particular, the charge of the HEC committing a causal-constitution fallacy is based on the assumption that the constitution relation in distributed cognitive processes is synchronic and fundamentally distinct from causation. 
However, this notion of material constitution is incompatible with the relation of constitution that holds in most cases of the HEC. Because standard accounts of synchronic material constitution assume that temporality itself is not essential to understanding constitution, their explanatory framework is inappropriate for describing and explaining characteristically temporal, dynamically unfolding, extended cognitive processes. Kirchhoff goes on to show that constitution need not be synchronic, but can sometimes be understood as a diachronic relation in many ways akin to causation. In such cases, diachronically evolving relations of constitution share crucial features with “continuous reciprocal causation.”

In the framework of this paper, I’ll be operating with such a diachronic notion of constitution, which makes it possible to avoid the coupling-constitution fallacy. On such a notion, it is warranted to maintain that “external” entities and processes can be constitutive parts of the infant’s emotion regulation.

2 Dyadic interaction and synchrony: not a case for the HEC

At first look, the HEC framework seems adequate to understand this phenomenon. Clark considers “Otto-and-the-notebook” as “a single,” “integrated system”, that “can be seen as a cognitive system in its own right” (2005, p. 7; Clark and Chalmers 1998). Similarly, we could also maintain that when it comes to the emotion-regulation of the infant, the “infant-and-the-caretaker” functions as a single integrated system, made up of two components, the infant and the caretaker. Also, following the parity principle, a process of emotion-regulation that we would deem cognitive when it is performed “in the head” should also be thought of as cognitive when performed involving entities outside the head (Chalmers 2008). On this view, what really matters is that the infant-caretaker system that achieves the emotion regulation is functionally equivalent to a single individual’s normally functioning emotion regulation. This seems correct. Recall that emotion regulation occurs by influencing the emotion trajectory (Gross et al. 2011). In individual cases, this may occur by a number of strategies including cognitive reappraisal, emotion inhibition, refocusing attention, etc. In the dyadic cases described in 1.2, the empirical findings indicate that emotion regulation occurs through a dense, micro-level face-to-face interaction. However, although individual and dyadic emotion-regulation occur in different ways, the infant-caretaker system is functionally equivalent to “individual” emotion regulation, as both perform the same function.

So far so good. However, on a second look, problems arise for accommodating dyadic synchronic interaction within the HEC. Calling to mind some HEC criteria (Constancy, Availability, Endorsement, Past-Endorsement), we begin to see important qualitative differences between the interaction that takes place between Otto and the notebook and the one between the infant and the caretaker. First, we can neither say that the (same) caretaker is constantly available to the child at all times (like Otto’s notebook), nor that the interaction is reliable. Rather, as we have seen, interaction is characterized by rapidly occurring shifts between energetic peaks, breakdowns, phases of ‘reparation’ and so on. Thus, neither the external resource nor the interaction itself is constantly available or reliable. Second, we cannot really claim that what is made available for the infant in our case is anything like the information or data that Otto relies on. Rather than providing information, the caretaker ‘merely’ provides the possibility of open-ended, affectively contoured interaction. Otto’s interaction with his notebook can be described in two steps (manipulating the environment, which subsequently provides him with information), while the relevant information in the notebook is fixed at the time of Otto’s inquiry. None of this is the case in dyadic synchronic interaction. It would be wrong to say that the infant aims at manipulating the environment (caretaker) to provide information that would enable self-regulation. What the infant ‘wants’ cannot be put in terms of information needed to achieve a previously fixed goal. Rather than ‘content’ and ‘goal’, as in the case of Otto, there is only the interaction itself, which in certain synchronic constellations gives rise to emotional regulation as a previously unintended by-product of interaction.
Third, it seems odd to say that the ‘information’ received from the caretaker is automatically endorsed or must—at some past point—have been consciously endorsed by the infant. While infants do generally accept their caretakers as epistemic authorities when discovering the world, it is far from being the case that they automatically endorse the ‘information’ received. For instance, when communication in dyadic interactions is out of tune, rather than endorsing the ‘information’ received, there is often crying protest, which—as we have seen—can be ‘repaired’.

In more general terms, several differences between the kinds of extended cognitive processes that are at stake in dyadic and solipsistic systems must be emphasized. Let me highlight the aspects of ‘uncontrollability’ and ‘irreducibility’ that differentiate cognitive processes in dyadic synchronic interaction from the usual HEC-style solipsistic examples.

  1. (1)

    Uncontrollability Otto and a Tetris player can both be said to be in control of the process, in the sense that they have at some past point consciously chosen to use an external entity for an epistemic end. At the time of the endorsement, Otto clearly has a kind of privileged authorizing status that figures as constitutive of the whole extended cognitive process.Footnote 13 In contrast, the cognitive process in dyadic synchronic interaction is beyond the control of either parts of the dyadic system. Dyadic emotion regulation is in this sense a very different extended cognitive phenomenon; it emerges as the unintended result of interaction, rather than being about accomplishing previously fixed tasks that involve processing information and manipulating information-bearing structures.

(2) Irreducibility. This aspect also connects to the notion of emergence, which will be explained in detail later in the paper. For now, a simple contrast may suffice. After consulting the notebook, a new belief arises in the “Otto-and-the-notebook” system about the location of the museum. Importantly, this belief is a reducible systemic property, since it directly follows from the notebook entry that contains the information about the location of the museum. In other words, the belief is not novel compared to what is entailed in the parts of the system. In contrast, the emotion self-regulation in the dyadic synchronic interaction is a systemic cognitive property that cannot be reduced to parts of the system. When I speak of a cognitive property of a system, I mean a cognitive property (the capacity to recall, to categorize, perceive the environment, or, in this case, cognitive emotion regulation) that is instantiated in the (dyadic) system rather than in a single human agent. In line with Hutchins’ thoughts, the system’s cognitive potential depends more on the interaction of its components than on the cognitive potentials of its members.Footnote 14

In all, given the problems with the criteria of the HEC and the major differences connecting to the ‘uncontrollability’ and ‘irreducibility’ of system-level properties that differentiate cognitive processes in dyadic and solipsistic systems, we may begin to think that we are dealing here with a special type of extended process that does not really fit the HEC. Put simply, important differences arise between extended cognition in dyadic interaction and extended cognition in the usual cases. Unlike the usual HEC examples, in our case extended cognition is not about the readiness to engage in strategic, ‘epistemic’ actions with the environment, but about the readiness to initiate larger cognitive systems with other human agents for the sake of interaction itself.

In order to accommodate our example within the HEC, we either have to radically reduce the complexity that characterizes synchronic processes or construct a different version of the HEC that is geared towards understanding cognitive processes in non-solipsistic, dyadic systems. The second alternative is the only really attractive one, and this is the one that will be pursued here. Interestingly—at least in my opinion—Clark provides first steps towards such an account in his recent Supersizing the Mind (2008). He presents a slightly different way of arguing for the HEC (the HEC*) that may help to account for extended non-solipsistic cognitive processes that display ‘uncontrollability’ and ‘irreducibility’. My overall engagement with the HEC and HEC* should not be understood as a direct critique, but rather as a way of working towards an account that is able to deal with certain cases of socially extended cognition.

3 The HEC*

In Supersizing, Clark (2008) puts further emphasis on the uncontrolled and autonomous nature of some extended cognitive processes, calling attention to cases in which cognition proceeds without the controlling agency of the subject. In these cases, “control is itself fragmented and distributed” and “reciprocally interwoven among inner and outer elements” (Clark 2008, p. 136). Furthermore, Clark claims that the real added value in adopting the extended approach over competing accounts is that it can account for cognitive processes that proceed without “the intervention of an all-seeing, all-orchestrating inner executive” (Clark 2008, p. 137). Thus, the HEC* “sacks the privileged inner executor” (Clark 2008, p. 138) and explicates uncontrollable processes that remain below the radar of competing ‘merely’ embedded accounts.

For my purposes, I will disregard potentially problematic issues in this work,Footnote 15 and focus on a different aspect of the HEC* that may give us reasons to suppose it can account for extended cognitive processes that display uncontrollability and irreducibility. Let us now assess whether this is indeed the case.

(1) Uncontrollability. Drawing mainly on the empirical work of Goldin-Meadow (2003), Clark argues that the act of gesturing itself, far from merely being a motor act expressive of fully formed thoughts, plays an active cognitive role by providing an alternative (analogue, motor or visuo-spatial) representational format that reduces the overall neural cognitive load. Gesture and thinking thus continuously inform and alter each other as a coupled system and should therefore be recognized as an “organismically extended process of thought” (Clark 2008, p. 126). In a loop-like manner, gesture “continuously informs and alters verbal thinking, which is continuously informed and altered by gesture (i.e., the two form a genuinely coupled system)” (Clark 2008, p. 125). Thus, while extended cognition remains “organism centered but not organism bound” (Clark 2008, p. 139), at least sometimes the organism is not in control.

While I have argued that Otto-type examples do not include ‘uncontrolled’ processes, the gesture example might provide a different case. In Clark’s classic examples of extended cognition, the relation between the parts of the cognitive system remains asymmetric. Otto incorporates or “co-opts” (Clark 2010a, p. 53) the notebook, endowing it with a specific epistemic status, while he remains the locus of control during the whole process. By contrast, in gesture the elements of the cognitive system play symmetric roles in a two-way-interaction, while the locus of control over the systemic level of the process cannot be located in one of the parts. As Clark (2008) says, gesture is not the result of an intelligent skull-bound process deciding to offload work and storage onto bodily (or environmental) structures. Rather, these are sub-personally integrated routines, selected for their cognitive merits. Consequently, it may seem that there is an ‘uncontrolled’ extended cognitive process at stake with no all-orchestrating inner executive (Ibid., 137). Goldin-Meadow’s studies underline the genuine cognitive role of gesture and show that subjects often neither consciously control nor endorse their gestures.

Overall, then, it seems that the HEC* is able to address uncontrolled cognitive processes, and it could therefore be used to understand extended cognitive phenomena in dyadic interaction. However, as we shall see, the HEC* cannot account for extended cognitive processes that display irreducibility.

(2) Irreducibility. Goldin-Meadow’s work indicates that children who are prevented from gesturing are less capable of solving math problems, and Clark is correct that gesturing is part and parcel of a coupled neural-bodily continuum that represents a kind of self-stimulation that creates loops. Clearly, the motor act of gesturing does not simply express a neurally realized process of thought; rather, it is a systemic output and a self-generated input that drive a loop-like process. Clark draws an interesting parallel to the turbo-driven automobile engine. A turbocharger can dramatically boost the power of the engine by injecting compressed air into the cylinders of the engine. This in turn leads to explosions in the cylinders that generate more energy, which again generates exhaust flow and powers the turbocharger. Clark’s point is that just as the whole turbocharging cycle is a part of the automobile’s own overall power-generating mechanism (the exhaust is a self-generated input that makes a part of a self-stimulating loop), “gesture is both a systemic output and a self-generated input that plays an important role in an extended neural–bodily cognitive economy” (Clark 2008, p. 131).

However, it should be emphasized that this does not imply that without gesture the subjects cannot effectively think about these mathematical problems or that they are in principle incapable of solving them. Goldin-Meadow’s work does not show that those math-related thoughts that are shaped by gesture could not in principle be tokened without the involvement of gesture. Rather, the work indicates that there is a quantitative drop in output if you take away gesturing. Nothing indicates that without gesturing a qualitatively different system-level capacity disappears. The same line of reasoning applies to the turbocharger example. Take away the turbo and the output of the engine drops, but the general ability of the engine to generate power remains. So while Clark’s examples show that certain systemic properties help drive cognitive and power-generating processes, they do not show that the relevant systemic properties are irreducible.

In all, the HEC* provides steps towards getting a grip on the kind of extended cognition that is at stake in ‘dyadic synchrony’. But in their current form, neither the HEC nor the HEC* is geared to understand the kind of non-solipsistic extended cognition that is at stake in dyadic emotion regulation. In many ways, our case here represents a significantly different extended cognitive phenomenon. Not only are there significant differences relating to ‘uncontrollability’ and ‘irreducibility’; in our case cognition itself emerges as the unintended result of social interaction rather than being about accomplishing previously fixed tasks by manipulating external structures.Footnote 16 For this reason, I shall draw on the metaphysics of emergence to construct a version of the HEC that is tailored to understanding such processes: the ‘Hypothesis of Extended Emergent Cognition’ (HEEC). In the following, I will present the concept of ‘emergence’, distinguish between two notions of emergence (\(\hbox {emergence}_{1}\) and \(\hbox {emergence}_{2}\)) and show how they could be used to support two versions of the Hypothesis of Extended Emergent Cognition (\(\hbox {HEEC}_{1 }\) and \(\hbox {HEEC}_{2}\)) that are guarded against an important objection. Importantly, they are not competitors to the HEC, but rather complement it, and provide different ways of supporting its main idea.

4 The concept of emergence and extended cognition

In relatively recent discussions in philosophy of mind and cognitive science, the concept of emergence is resurfacing, partly as a reaction to the threat of a potential reduction to neuroscience (Sawyer 2002). For its proponents, emergence is seen as a promising tool to address both the question of phenomenal consciousness and the complex, non-programmed and self-organizing behaviours of systems that display unforeseen regularities (McClelland 2010). For others, the concept is broad and slippery, often considered with scepticism and accused of covering matters in metaphysical mysticism (Kim 1999; Craver 2007; see O’Connor and Wong 2012).Footnote 17

In debates connected to the HEC, we may distinguish between two manners of using emergence, one in Clark (2008) and one in Theiner and O’Connor (2010) and Theiner et al. (2010). While Clark has made use of the concept of emergence in earlier work where he systematically distinguishes between several notions of emergence (Clark 1997, pp. 103–104, 112–113), in connection to the HEC he employs a (very) weak concept of emergence.Footnote 18 Clark speaks of emergence in the case of the Tetris player (see Clark 2008, p. 137) and in the case of Otto (Clark 2010a). In both cases, the relevant emergent cognitive property is linear and merely exhibits a simple aggregative decomposition. For instance, the new belief about the location of the museum that the coupled system Otto-and-the-notebook exhibits is a reducible systemic property and only qualifies as emergent in a very weak (and trivial) sense of the term. It is reducible since the occurring belief that guides the action (going to the museum) directly follows from the desire (to go to the museum) plus the notebook entry that contains the information about the location of the museum. Rather than being novel, the belief is completely reducible to beliefs contained in the parts of the system. The case would be little different if it involved standard socially extended cognition as depicted by Tollefsen (2006). Say Otto* also suffers from mild Alzheimer’s, but instead of using a notebook he relies on the memory of his deeply devoted wife of 50 years. In a case in which Otto*’s consultation with his wife about the location of the museum would count as extended cognition, it would still be of a kind that exhibits a merely aggregative decomposition and could be predicted from a pre-emergent stage (given a comprehensive knowledge of Otto* and his wife).

In contrast, Theiner and O’Connor (2010) have elaborated a stronger, three-dimensional notion of emergence for application in cognitive science. This is then taken up by Theiner et al. (2010) and made fertile in connection with group cognition and the HEC. Drawing from work on ‘transactive memory systems’, they argue that “groups have cognitive capacities that go beyond the simple aggregation of the cognitive capacities of individuals” (Theiner et al. 2010, p. 378). Theiner and colleagues deploy a notion of emergence that is tailored to suit cognitive phenomena in larger groups, characterized by several aspects. In such groups, there is a certain symmetry between the parts of the system made up of autonomous individuals. Individuals involved in such systems must be engaged in a common goal and possess some knowledge about the domains of expertise of other individuals (Tollefsen 2006, p. 145; Theiner et al. 2010, p. 379, 382). And, finally, the groups in question solve collective problems.

The nature of our explanandum in this paper requires a concept of emergence that is stronger than Clark’s, but that also exhibits decisive differences from the work of Theiner and colleagues.Footnote 19 In our case, in contrast to Theiner and colleagues, we need to take into account the asymmetry in the dyadic system, and thus the gradual difference between a prime cognizer and a ‘mere’ cognition supporter. In addition, we need to bear in mind that our dyadic system arises spontaneously, without involving common goals and knowledge about domains of expertise. And, finally, our example is not about solving a collective problem.

Following these preliminary clarifications, we may proceed to establishing a notion of emergence that is adequate in our context. In the following I would like to distinguish between two notions of emergence (\(\hbox {emergence}_{1}\) and \(\hbox {emergence}_{2}\)) and show how they could be used to support two versions of the Hypothesis of Extended Emergent Cognition (thus \(\hbox {HEEC}_{1 }\) and \(\hbox {HEEC}_{2}\)).

4.1 \(\hbox {Emergence}_{1}\) and \(\hbox {HEEC}_{1}\)

Emergence may be understood to encompass both reducible and irreducible systemic properties. A systemic property P of a system S is reducible, if P’s being instantiated follows from the behaviour of its parts (Stephan 2002, p. 86; Bedau 2007, p. 158). A property is irreducibly emergent with respect to low-level domains when properties concerning that particular phenomenon are not deducible from properties in the low-level domain (Chalmers 2006, p. 244).Footnote 20

In contrast to Clark, the notion of \(\hbox {emergence}_{1}\) that I would like to propose describes irreducible systemic properties.Footnote 21 In our case, irreducibility is implied to the extent that it guarantees that the systemic property is not reducible to beliefs contained by the parts of the system. Also, speaking of irreducibly \(\hbox {emergent}_{1}\) properties does not involve commitment to the view that these must be in principle irreducible. Rather, the claim is that the systemic property arises more or less autonomously as an uncontrollable effect of interaction, which cannot be completely explained by recourse to the intentions of the parts. Thus, it does not preclude the possibility that some phenomena may be reductively explained as deriving from micro-level truths in a complex, non-linear way. In this ‘cautious’ understanding, there is thus nothing metaphysically problematic about combining irreducibility with causal dependence.

In addition to irreducibility and uncontrollability, the concept of \(\hbox {emergence}_{1}\) that I propose also involves diachronic novelty. This is relatively straightforward. In the course of interaction between lower-level processes, some emergent properties are novel. Most of them are novel in a trivial way, as in novel compared to the underlying components of the system.Footnote 22 But some novel systemic properties are also novel in a non-trivial way. These are diachronically (or historically) novel in the sense that they appear for the first time. Examples of such emergent properties are richly provided in evolutionary accounts.

In all, the concept of \(\hbox {emergence}_{1}\) is significantly stronger than in Clark (2008), involving irreducibility and diachronic novelty.Footnote 23 The next step is to use \(\hbox {emergence}_{1}\) to construct the ‘Hypothesis of Extended \(\hbox {Emergent}_{1}\) Cognition’ \((\hbox {HEEC}_{1})\). Put in more formal terms, one condition must be fulfilled for a systemic property P of a system S to be an instance of cognition in the sense of the \(\hbox {HEEC}_{1}\).

An extended systemic property P of a system S is an instance of the \(\hbox {HEEC}_{1}\) if and only if P is an irreducibly emergent cognitive property that is diachronically novel and does not follow from the features of the parts (either taken in isolation or in constellations simpler than S).

The criterion of diachronic novelty has an important role to play here, since, as the ‘mechanistic explanations’ of Craver (2007) and Bechtel (2009) show, relatively few higher-level properties are directly reducible to the sum of lower-level parts. Without diachronic novelty, the \(\hbox {HEEC}_{1}\) might turn out overly inclusive, potentially making itself vulnerable to the accusation of leading to “cognitive bloat.”Footnote 24

4.2 \(\hbox {Emergence}_{2}\) and \(\hbox {HEEC}_{2}\)

In philosophy of mind, ‘whole–part influence’ is usually a distinctive feature of a strong concept of \(\hbox {emergence}_{2}\). This does not simply entail that the relevant systemic properties are irreducible to the lower-level properties. An additional claim that is crucial to \(\hbox {emergence}_{2}\) is that higher-level systemic properties may also have an effect on lower-level properties in a top-down manner (O’Connor 1994, p. 98), while this effect is not deducible from low-level regularities (Chalmers 2006). This is entirely different compared to simple structural macro-properties, like the “V” shape of a bird flock, which only exhibit influence via the activity of their constituting micro-properties. We may say that an \(\hbox {emergent}_{2}\) property P has a direct (‘top-down’) effect on the pattern of behaviour involving S’s parts. Thus, an \(\hbox {emergent}_{2}\) property is an \(\hbox {emergent}_{1}\) property that additionally displays top-down effects. One often-used example is the development of convection rolls in heated liquid (see Bishop 2008; Kelso 1995). Importantly, while the molecules in the pan of oil are subject to random disordered motion, the whole has a top-down effect on the behavior of its parts, as the convection rolls ensure that they “are sucked into an ordered, coordinated pattern” (Kelso 1995, p. 8).

Granted, the issue of \(\hbox {emergence}_{2}\) entailing top-down effects is widely considered controversial. While some researchers claim that it is incoherent because it clashes with non-reductive physicalism (Kim 1999; McLaughlin 1992), others like Macdonald and Macdonald (2010) defend it and aim to work out a broader theory of causal influence that does not violate the principle of the causal completeness of the physical domain. In the light of this debate, it may seem imprudent to rely on such a controversial issue to support my argument. Therefore, to avoid misunderstanding, I must emphasize that my use of \(\hbox {emergence}_{2}\) exploits the heuristic value of the concept, but operates on a different level. Drawing on the work of Sawyer (2002, 2005, 2011), I shift perspective from the realm of physics, non-reductive physicalism and mental causation to the realm of social interaction. Rather than being at odds with the causal completeness of the physical domain, the claim is merely that some \(\hbox {emergent}_{2}\) systemic properties arise as uncontrollable outcomes of interaction and work back on the parts of the interaction. It is not contentious, and is consistent with Durkheimian lines of thought, that interactions among individuals may give rise to social constellations and normative currents that later work back on the individual (Sawyer 2005, p. 69, 2002).

After these clarifications, we may proceed to link \(\hbox {emergence}_{2}\) to extended cognition. Given that an \(\hbox {emergent}_{2}\) property is an \(\hbox {emergent}_{1}\) property that additionally displays top-down effects, the ‘Hypothesis of Extended \(\hbox {Emergent}_{2}\) Cognition’ \((\hbox {HEEC}_{2})\) may be defined like this.

An extended \(\hbox {emergent}_{1}\) cognitive property P of a system S is an instance of the \(\hbox {HEEC}_{2}\) if and only if P has top-down effects over its components.
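The two definitions can be put side by side in a schematic form. This is only a sketch: the predicate names \(\hbox {Ext}\) (P is an extended property of S), \(\hbox {Cog}\) (P is cognitive), \(\hbox {Irr}\) (P does not follow from the features of the parts of S, taken in isolation or in constellations simpler than S), \(\hbox {Nov}\) (P is diachronically novel) and \(\hbox {TD}\) (P has top-down effects on the components of S) are shorthand assumed here for illustration and are not part of the formulations above.

\[
\hbox {HEEC}_{1}(P, S) \iff \hbox {Ext}(P, S) \wedge \hbox {Cog}(P) \wedge \hbox {Irr}(P, S) \wedge \hbox {Nov}(P, S)
\]

\[
\hbox {HEEC}_{2}(P, S) \iff \hbox {HEEC}_{1}(P, S) \wedge \hbox {TD}(P, S)
\]

On this reading, every instance of the \(\hbox {HEEC}_{2}\) is also an instance of the \(\hbox {HEEC}_{1}\), but not conversely, which mirrors the claim that an \(\hbox {emergent}_{2}\) property is an \(\hbox {emergent}_{1}\) property that additionally displays top-down effects.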

As we shall see, the \(\hbox {HEEC}_{2}\) helps us to explain another aspect of the cognitive processes at stake in our example that the HEC is not geared for. The causal influence of an \(\hbox {emergent}_{2}\) property on its constituent parts entails a shift in focus that may be of methodological relevance to debates on the HEC. Instead of analysing causation ultimately in terms of microphysical causes, the idea is to apply a more coarse-grained perspective that also includes the interaction between systemic and lower-level processes. This is compatible with the HEC and its emphasis on the idea that organism and environment at times create unified cognitive systems that should be regarded as proper units of analysis. The notion of \(\hbox {emergence}_{2}\) is also helpful to understand the difference of ‘mechanistic explanations’ in cognitive science.Footnote 25

5 \(\hbox {HEEC}_{1}\) and \(\hbox {HEEC}_{2}\) in dyadic synchronic interaction

In the previous section, I introduced a tailored conception of emergence, distinguished between two notions of emergence (\(\hbox {emergence}_{1}\) and \(\hbox {emergence}_{2}\)) and used them to underpin two versions of the Hypothesis of Extended Emergent Cognition (\(\hbox {HEEC}_{1 }\) and \(\hbox {HEEC}_{2}\)). These are not so much separate hypotheses as accounts designed to deal with “weakly” and “strongly” emergent cases of extended cognition. I have presented them separately, because the \(\hbox {HEEC}_{1}\) targets cases in which the novel system-level property is cognitive and \(\hbox {emergent}_{1}\) (diachronically novel, non-linear and not reducible to the sum of properties at lower levels of complexity), while the \(\hbox {HEEC}_{2}\) deals with cases in which there are additional top-down effects.

To complete the analysis, the next step is to apply these accounts to the cognitive processes occurring in dyadic synchronic interaction. As we shall see, there are epistemic gains in approaching this issue from the vantage point of the \(\hbox {HEEC}_{1}\) and \(\hbox {HEEC}_{2}\).

5.1 \(\hbox {HEEC}_{1}\) in dyadic synchronic interaction

In contrast to Clark’s examples, the emergent emotion regulation ability arising in synchronic dyadic interaction is uncontrollable, irreducible and diachronically novel. The elements of the cognitive system, the infant and the caretaker, play roles in a genuine two-way-interaction, while the locus of control over the systemic level of the process cannot be exclusively located in one of the parts. The emotion regulation ability of the dyad may be regarded as irreducible and emergent. In opposition to Otto’s or Otto*’s case, it is not something like the simple linear ‘sum’ of the infant’s and the caretaker’s states that provides the emotion regulation ability. It is irreducible, because its occurrence does not follow from the features of the parts, either taken in isolation or in less complex constellations.Footnote 26 Also, at least at a certain developmental stage, it is diachronically novel. Hence, the novel system-level property is cognitive and \(\hbox {emergent}_{1}\) (diachronically novel, non-linear and not reducible to the sum of the properties at lower levels of complexity) and we may justifiably speak of a case of the \(\hbox {HEEC}_{1}\).

While our example is not a good fit for the HEC, there are several advantages connected to addressing it with the \(\hbox {HEEC}_{1}\). First, we get a more precise grip on the explanandum by accounting for system-level cognitive properties that do not exhibit a purely aggregative decomposition. Second, we get a more robust explanans since using the \(\hbox {HEEC}_{1}\) does not commit us to saying (as one might think we would with the HEC) that the caretaker is a part of the cognitive process of the infant. Rather, the more modest point is that the interaction is, at a certain developmental stage, a part of the emotional regulation process of the infant.

5.2 \(\hbox {HEEC}_{2}\) in dyadic synchronic interaction

Recall the double task that is accomplished in synchronic dyadic interaction. There is a ‘live’ process that regulates the infant’s emotion in real time and the simultaneously unfolding diachronic process that develops the emotional self-regulation ability of the infant. While the emphasis has previously been put on the ‘live’ process and its intelligibility from within the \(\hbox {HEEC}_{1}\), it is the unfolding diachronic process that will receive attention in the following. This is because only the latter can be said to be an instance of the \(\hbox {HEEC}_{2}\)—thus an emergent systemic property that affects its emergence base and thus exhibits top-down effects. However, even if there is intense interaction between systemic and lower-level processes in our case, is it defensible to speak of top-down effects?

To be brief, the irreducibly emergent, ‘live’-regulating cognitive systemic property described with the \(\hbox {HEEC}_{1}\) influences the primitive (lower-level) cognitive abilities of the infant in a ‘top-down’ manner. This ‘live’-regulating cognitive systemic property brings into being a cognitive self-regulatory capability that is located only at the individual level of the child. In other words, when this capacity emerges as an effect of the organizational structure of the dyad, it is an individual cognitive capacity that is (diachronically) novel in the life of the infant. As demonstrated, this capacity shapes and determines the further unfolding of the infant’s experience in that it makes possible the self-regulation of the infant’s emotional states. Put differently, the dyad exhibits top-down effects on the infant and establishes an ongoing change, which, together with the whole development of the infant contributing to this process, eventually leads to the development of individual emotion regulation abilities. Against this background, it seems safe to conclude that the diachronic cognitive aspect of dyadic interaction fulfills the criteria that I have laid down for the \(\hbox {HEEC}_{2}\): an emerging systemic property that has top-down effects on the simple parts and processes of the system.

6 Some (epistemic) gains

Having completed the analysis of extended cognition occurring in dyadic synchronic interaction, it is now possible to underscore epistemic gains that are connected to the approach offered in this paper. Before going further, I should consider two possible objections. First, I argued that although children who are prevented from gesturing are less capable of solving math problems, it is not the case that they cannot effectively reflect on or are in principle incapable of solving them. Now someone may argue that this objection also goes for my own example: though children who are prevented from entering into synchronic dyadic interactions are less capable of appropriate emotion regulation, it is not the case that they are in principle incapable of emotional regulation. However, this objection is unsuccessful. While the children’s ability to solve math problems momentarily drops in the first case, the second case is qualitatively different. In the second case, children develop maladaptive emotion regulation, thus a form of impaired emotional regulation that has significant psychological and social costs for the individual (see section 1.2 and Manian and Bornstein 2009; Reck et al. 2004; Tronick and Gianino 1986).

The second objection concerns the whole-part relation. If there are top-down effects on parts of the system, then one might simply describe these in terms of a kind of circular causality. But in that case, it might be problematic to secure the claim that the relation between the relevant systemic property and the parts of the system is one of constitution. However, this is only a problem if one operates with a synchronic notion of constitution. In that case, top-down (and bottom-up) processes could be interpreted as involving a kind of circular causality, which would invite the charge of committing the causal-constitution fallacy. But, as I noted earlier, I’m operating with Kirchhoff’s (2015, 2014) notion of diachronic constitution. On this notion of constitution, it is possible to maintain that although the infant’s emotion regulation process is constituted by the interaction of the individual parts of the dynamical system, there are top-down effects on its parts. This is possible because diachronic constitution is symmetric, and because it is the interlevel relationship that counts as constitutive, with constitutive effects running both bottom-up and top-down. In this way, we are able to describe the interlevel relation between the relevant systemic property and the components of the system as a constitutive and not causal relation. Dyadic emotion regulation is a case in which there are constructively mediated top-down and bottom-up effects between diachronically unfolding processes.

Having clarified this issue, I shall proceed by pointing out how the view proposed in this paper yields advances with regard to both explanans and explanandum. First, on the level of explanans, it helps us create an account of extended cognition that is robust and resilient to those critical remarks that have been raised against the HEC. Due to the introduction of the requirement of irreducible emergence and diachronic novelty, the objection from cognitive bloat has no traction against the \(\hbox {HEEC}_{1}\) or \(\hbox {HEEC}_{2}\). Both of these requirements introduce strict limitations to what can be considered as cases of the HEEC. But it is important to point out that the HEEC should not be understood as undermining the HEC, but rather as complementing it, by helping to comprehend dynamic processes involving ‘uncontrollability’ and ‘irreducibility’, and by helping to account for cognitive properties that are sometimes non-programmed properties of coupled systems. In addition, the use of two understandings of emergence that differ in their strength, yielding two types of ‘Hypotheses of Extended Emergent Cognition’ (\(\hbox {HEEC}_{1}\) and \(\hbox {HEEC}_{2}\)), accurately differentiates between aspects and thus further contributes to the descriptive and explanatory precision of the account. So in the end, we have two types of extended cognition, the HEC and the HEEC (which can be further divided into the \(\hbox {HEEC}_{1}\) and \(\hbox {HEEC}_{2}\)). Otto’s case would still qualify as a case of extended cognition on the HEC, while the HEEC could shed light on special dynamic aspects in human interaction.

Second, we get a more precise grip on the explanandum in a narrow manner. The HEEC helps achieve a description that retains the complexity of the phenomenon and accounts for system-level cognitive properties that do not exhibit a purely aggregative decomposition. The analysis may be productively brought to bear on the independently motivated debates between developmental psychologists. Recall that Tronick explains the phenomenon as a ‘Dyadic Expansion of Consciousness’, thus as an expansion of individual states of consciousness. While Tronick does not deliver theoretical support for such a bold claim, among many supporters of the HEC the opinion is that it is highly unlikely that consciousness can be viewed as extended (Clark 2009; Prinz 2009).Footnote 27 There are some who would support the idea that consciousness may extend (see Ward 2012; Noë 2004)Footnote 28, but it is unclear whether these supporters would think that dyadic emotion regulation is a good example for extended consciousness, or that the idea of extended consciousness can help explain dyadic emotion regulation. In any case, my point here is merely that understanding the phenomenon as an instance of the HEEC is both theoretically supported and helps retain the descriptive complexity that Tronick aims to achieve with his notion of the ‘expansion’ of consciousness.

Third, we get a more precise grip on the explanandum in a broad manner. The HEEC expands our understanding of what extended cognition is and how it can be studied. It takes into account that human cognitive activity is often an intrinsically social issue, and human cognition is not always about the readiness to engage in more or less deliberate, ‘epistemic’ actions with the environment. Sometimes cognition is about the readiness to initiate cognitive systems with other human agents for the sake of interaction itself; genuinely cognitive phenomena sometimes arise as unintended and irreducibly emergent results of such processes.

To close, some remarks are warranted on the relation between the \(\hbox {HEEC}_{2}\) and \(\hbox {HEEC}_{1}\) and some of the work on “socially extended cognition” that I mentioned at the beginning of this paper. There are obvious parallels, and while I do not oppose the idea that some form of the \(\hbox {HEEC}_{2}\) might shed light on more cases of extended cognition than just emotion regulation in dyadic systems, I also think that explicating larger socially extended systems might require a different form of the HEEC. The \(\hbox {HEEC}_{2}\) might simply not be complex enough to address processes in larger socially extended systems as discussed in the work of Gallagher (2013) and Cash (2013). Take for instance “mental institution” examples like “legal systems, research practices, cultural institutions, contracts, etc.” that Gallagher discusses. Compared to the type of dyadic interaction that I analyze, Gallagher’s examples exhibit an entirely different level of complexity in terms of both interlevel and intralevel processes. At least without more argument, it would not be warranted to say that the \(\hbox {HEEC}_{2}\) is sufficiently complex to analyze such systems.

But there are additional reasons for thinking that the \(\hbox {HEEC}_{2}\) might not be adequate for an explanation of larger socially extended systems. This becomes clearer when we consider important differences between emotion regulation in dyadic systems and large-scale examples. First, in Gallagher’s examples, it is possible for us to be “enactively coupled” to such large-scale systems without at the same time being engaged in face-to-face, embodied, second-personal interactions. However, as demonstrated, such embodied second-personal engagement is crucial to dyadic emotion regulation. Second, the cognitive properties of legal systems, research practices, cultural institutions and contracts can often be understood as linear and as merely exhibiting a simple aggregative decomposition. For instance, consider Gallagher’s case of “Alexis.” Although cognition may be extended across the legal institutional practices in all the scenarios, there is no reason for thinking that the relevant cognitive property is \(\hbox {emergent}_{2}\) (non-programmed, displaying unforeseen regularities, etc.).Footnote 29 Such important differences begin to indicate that large-scale examples would be better served by a different approach, although some of the processes in these examples may also exhibit a different dynamic involving ‘uncontrollability’ and ‘irreducibility’ that cannot be fully accommodated within the HEC.

7 Conclusion

In contemporary philosophy of the cognitive sciences, a remarkable and vigorous debate deals with alternatives to the standard approach to research and theorizing. Novel ‘situated’ and ‘extended’ approaches oppose the idea that computing activity in the brain constitutes human cognition and acknowledge the major role of the body and external structures. At the same time, the emerging body of work shows that explanatory power is to be gained in cognitive science by such a re-orientation.

To support these novel approaches, I have argued that the HEC needs to be supplemented. To show this, I have extended the discussion towards socially extended cognition in ‘dyadic synchronic interaction’. This provides a strikingly different kind of case than those involving strategic manipulations of the environment in order to reduce the cognitive workload. I argued that at least some dynamic socially extended cognitive phenomena cannot be explained within the HEC (or the HEC*). Drawing on the concept of emergence (\(\hbox {emergence}_{1}\) and \(\hbox {emergence}_{2}\)), I constructed an account of the ‘Hypothesis of Emergent Extended Cognition’ (\(\hbox {HEEC}_{1}\) and \(\hbox {HEEC}_{2}\)) that may be understood as a version of the HEC that is tailored to understand a particular type of socially extended cognition, in which cognitive properties are sometimes irreducibly emergent and non-programmed properties of coupled systems. Both the \(\hbox {HEEC}_{1}\) and \(\hbox {HEEC}_{2}\) proved productive in shedding light on extended cognitive processes and resistant to a central objection that has been raised against the HEC.

This paper can be seen as an attempt to show how complementing the HEC, adopting a wider focus on extended cognition and taking the social nature of cognition seriously can contribute to an emerging alternative framework for the pursuit of cognitive science. Further exploration of the cognitive incorporations of genuinely social elements may both advance HEC debates and contribute to the gradual materialization of a novel framework for the pursuit of cognitive science.