Introduction

One of the most interesting debates in social cognitive neuroscience in the last decade concerns the findings that the perception of an action leads to an activation of the observer’s own motor system (e.g., Caetano, Jousmäki, & Hari, 2007; Calvo-Merino, Grezes, Glaser, Passingham, & Haggard, 2006; Fadiga, Fogassi, Pavesi, & Rizzolatti, 1995; Kilner, Paulignan, & Blakemore, 2003; Kohler et al., 2002; Marshall & Meltzoff, 2011; Paulus, Hunnius, van Elk, & Bekkering, 2011a; van Schie et al., 2008). Usually, it is assumed that the perceived action is mapped onto the observer’s motor repertoire and the perceived action is thus “mirrored” within the observer’s own motor system.

Importantly, empirical findings have suggested that these mirroring processes are involved in the understanding and prediction of others’ actions (e.g., Aglioti, Cesari, Romani, & Urgesi, 2008; Cattaneo et al., 2011; Daum, Prinz, & Aschersleben, 2011; Flanagan & Johansson, 2003; Falck-Ytter, Gredebäck, & von Hofsten, 2006; Gredebäck & Melinder, 2010; Iacoboni et al., 2005; Kochukhova & Gredebäck, 2010; Rotman, Troje, Johannson, & Flanagan, 2006; Springer et al., 2011; Sommerville, Woodward, & Needham, 2005). Sommerville et al. (2005), for example, facilitated 3-month-old infants’ ability to explore objects by providing them with ‘sticky mittens’, which allowed them to easily pick up objects (cf. Needham, Barrett, & Peterman, 2002). In a subsequent task, these infants (but not infants who did not receive this training prior to the task) paid more attention to the goal object of another person’s grasping action, suggesting that active action experiences facilitated infants’ processing of another’s action. More recently, Aglioti et al. (2008) showed that only active basketball players, but not persons with comparable visual experiences, showed a superior performance in predicting the success of free shots. Additionally, only the basketball players showed an enhanced activation in the cortical motor system, when they observed unsuccessful basket throws. This indicates that the basketball players employed their own motor system to predict the other’s action.

On a theoretical level, several authors have proposed that through this matching process, one simulates the other’s action. By means of this simulation, one employs one’s own experience with this action to understand the other’s action (e.g., Agnew, Bhakoo, & Puri, 2007; Blakemore & Decety, 2001; Fogassi et al., 2005; Gallese, Rochat, Cossu, & Sinigaglia, 2009; Goldman, 2006; Kilner, Friston, & Frith, 2007; Rizzolatti & Craighero, 2004; Rizzolatti & Fabbri-Destro, 2008; Rizzolatti, Fogassi, & Gallese, 2001; Wilson & Knoblich, 2005). Most of these approaches concur in the claim that some kind of “knowledge” (Rizzolatti & Craighero, 2004, p. 172) about others’ “intentions” (Blakemore & Decety, 2001, p. 566) is acquired by means of the simulation process. Additionally, it is said to allow “the perceiver to rapidly interpret” (Wilson & Knoblich, 2005, p. 468) the ongoing behavior and to enable “a direct comprehension of the actions of others” (Gallese et al., 2009, p. 110), thus subserving “mind-reading” (Agnew et al., 2007, p. 286). In other words, it has been suggested that by mapping the other’s action onto one’s own motor repertoire one can employ one’s own motor programs to attribute either a certain intentional state to the other or a particular goal to the other’s ongoing action.

Criticism

However, these claims have repeatedly been criticized on various grounds (Borg, 2007; Hickok, 2008; Jacob, 2009a, 2009b; Jacob & Jeannerod, 2005; Saxe, 2005; Uithol, van Rooij, Bekkering, & Haselager, 2011). Hickok (2008), for example, has pointed out that humans are capable of making sense of actions that are not in their own motor repertoire. Support for this claim comes from studies showing that humans are able to understand or predict others’ actions on the basis of pure frequency information (Boseovski & Lee, 2006; Paulus, Hunnius, van Wijngaarden, Vrins, van Rooij, & Bekkering, 2011b), i.e. that action mirroring is not necessary for understanding others’ actions. The strongest criticism, however, comes from a conceptual analysis of the claims made by accounts of action understanding through action mirroring. Jacob (2009a; Jacob & Jeannerod, 2005) has argued that it is conceptually awkward to assume that the activation of one’s own motor system could directly lead to the attribution of an intention or a goal to others. He argued that

“as a result of action mirroring exogenously triggered by the perception of a motor act of grasping a mug, an observer could at best form the intention to grasp a mug without executing the act. Clearly, to form the intention to grasp a mug (without executing the act) is not the same psychological state as believing of an agent (distinct from self) that she intends to grasp a mug. Given the above definition of mindreading, only the latter, not the former, constitutes an instance of third-person mindreading. Furthermore, whereas one can form the intention to grasp a mug and lack the concept intention, one cannot believe that another intends to grasp a mug (and thus ascribe to him the intention) unless one possesses the concept intention” (Jacob, 2009a, p. 235).

In other words, Jacob (2009a) pointed to the fact that there is a conceptual and representational gap between the activation of one’s own motor system through the perception of another person’s action and the ascription of an intention or a goal to this person. Whereas there are theoretical models of how a perceived action can activate the observer’s own motor repertoire (e.g., Heyes, 2010; Prinz, 1997), it is questionable how an activation in one’s own motor system should be equivalent to the ascription of a mental state or an action goal to somebody else. This is an interpretative process, which leads to a representation of somebody else’s intention or an action’s goal as being the intention or goal of somebody else (see also Perner, 1991). Such a representation is conceptually different from an activation of the motor system.

This argument also closely relates to philosophical analyses that suggest that the understanding of an action is a far more complex cognitive process that is embedded within social practices and customs (Brandom, 1994). Furthermore, a full understanding of an action is usually equated with the understanding of the reasons for this action (Hacker, 2010).

Taken together, empirical findings have suggested a relation between action mirroring and action understanding, whereas critics have pointed out that current theoretical models of this relation run into conceptual problems. This situation thus calls for another theoretical approach that relates motor resonance to action understanding while avoiding the problems laid out in the criticism outlined above. Moreover, given that earlier accounts focused mainly on intention ascription, the precise neurocognitive information processing mechanisms that might lead from motor activation to action understanding are poorly understood.

The present contribution suggests a model that addresses both issues. First, it presents a neurocognitive model that relates motor activation through action perception (i.e. action mirroring) to higher-order processes of action processing. Second, it suggests that these processes do not operate on the level of intention ascription, avoiding thus the conceptual problems laid out by, for example, Jacob (2009a).

An ideomotor approach to action mirroring

A closer look at the findings of a possible impact of action mirroring on action understanding shows that the reported effects are often found in the form of predictive eye movements or other gaze-related measures (Cannon, Woodward, Gredebäck, von Hofsten, & Turek, 2011; Daum et al., 2011; Gredebäck & Melinder, 2010; Falck-Ytter et al., 2006; Flanagan & Johansson, 2003; Rotman et al., 2006; Sommerville et al., 2005). I propose that action mirroring is not intrinsically related to action understanding, but rather leads to attentional modifications. By means of this process, information in the environment is preferentially accessed and processed by the organism. However, action understanding can be achieved only through further processing of this information in the cognitive system. But how can action mirroring lead to a shift of (visual) attention?

The theoretical perspective that is proposed in this contribution originates from the ideomotor approach to action control, which dates back to Lotze (1852) and James (1890), and is contemporarily exemplified by the Theory of Event Coding (Hommel, Müsseler, Aschersleben, & Prinz, 2001). This account proposes that actions are controlled through bidirectional action–effect associations (Elsner & Hommel, 2001; Hommel et al., 2001; Nattkemper, Ziessler, & Frensch, 2010; Shin, Proctor, & Capaldi, 2010). In particular, it has been suggested that through the repeated co-occurrences of actions and their sensory consequences, the cognitive representations of the effects will be associated with the activated motor program. Actions are thus represented in terms of their (distal) action effects and not on the level of, for example, kinematic details. When someone subsequently either perceives the same effect again or intends to reproduce this effect, the associated motor program will be activated (e.g., Elsner et al., 2002; Elsner & Hommel, 2001, 2004; Hommel et al., 2001; Kiesel & Hoffmann, 2004; Kunde, Hoffmann, & Zellmann, 2002; Paulus, Hunnius, Vissers, & Bekkering, 2011c; Verschoor, Weidema, Biro, & Hommel, 2010). By means of this, bidirectional action–effect associations subserve intentional action control (Hommel, 2009). Furthermore, an activation of the motor program leads to an activation of the associated representation of the action’s typical effect (Kühn, Keizer, Rombouts, & Hommel, 2011). Such an activation of the effect has an impact on subsequent perception (e.g., Bekkering & Neggers, 2002; Lindemann & Bekkering, 2009; Repp & Knoblich, 2007; see for an overview, Schütz-Bosbach & Prinz, 2007). For example, Bekkering and Neggers (2002) showed that the intention to grasp an object facilitates the visual search for the object with the adequate orientation. 
More directly, Lindemann and Bekkering (2009) asked participants to prepare to grasp an object and rotate it clock- or counterclockwise. Interestingly, the participants were faster to detect a stimulus that seemed to rotate in the same direction as the goal object of the intended object rotation. This suggests that the preparation of an action elicited a representation of the associated effect and that this preactivation (i.e. priming) facilitated subsequent stimulus perception. Taken together, the ideomotor theory provides thus a conceptual framework, which relates motor activation and subsequent event perception to each other.

Importantly, the core assumptions of the ideomotor theory offer a plausible and yet parsimonious explanation of the findings on action mirroring and action prediction. More specifically, following the ideomotor theory action–effect associations can be acquired through own action experiences. When subsequently an action is perceived, which resembles the previously executed action, the perception of the action leads to an activation of the respective motor program in the observer. This, in turn, leads to the activation of the representation of the associated action effect (which may be either a very specific action effect or—generalized—a perceptual dimension that is relevant for this type of action; see Fagioli, Ferlazzo, & Hommel, 2007; Fagioli, Hommel, & Schubotz, 2007). The activated effect representation subsequently modulates visual attention and facilitates the processing of corresponding information in a visual scene (cf. Downing, 2000; McNamara, 1992; Moores, Laiti, & Chelazzi, 2003). As a consequence, for example, anticipatory eye movements to the target of an ongoing action are facilitated (see Fig. 1, for an example).

Fig. 1

The figure visualizes the central idea of the proposed model. The left panel shows how an agent acquires action–effect associations. By means of using a lever, the light is turned on. This relation between action and effect will be represented by the actor (dashed lines indicate the representations of external events in the cognitive system). The right panel shows how a perceived action activates the corresponding motor program in the observer. By means of spreading activation, the representation of the associated action effect is subsequently also activated (e.g., that the light is turned on), which will prepare the observer’s perceptual system for relevant information in the environment.
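The processing steps of the model can be illustrated with a minimal computational sketch. It is not part of the original proposal and makes several simplifying assumptions that are purely hypothetical: associations are reduced to a single bidirectional weight per action–effect pair, learning follows a simple saturating update, and attention is modeled as normalized weights over the events in a scene. The names `IdeomotorAgent`, `experience`, `observe`, `attention_weights`, and the parameter `learning_rate` are illustrative choices only.

```python
from collections import defaultdict


class IdeomotorAgent:
    """Toy agent holding bidirectional action-effect associations (hypothetical sketch)."""

    def __init__(self, learning_rate=0.5):
        # learning_rate is an assumed free parameter, not an empirical value.
        self.learning_rate = learning_rate
        # One weight in [0, 1] per (action, effect) pair, used in both directions.
        self.strength = defaultdict(float)

    def experience(self, action, effect):
        """An own action followed by a sensory effect strengthens their link."""
        key = (action, effect)
        self.strength[key] += self.learning_rate * (1.0 - self.strength[key])

    def motor_program_for(self, desired_effect):
        """Effect -> action: intending an effect activates the associated motor program."""
        candidates = {a: w for (a, e), w in self.strength.items() if e == desired_effect}
        return max(candidates, key=candidates.get) if candidates else None

    def observe(self, perceived_action):
        """Action -> effect: a perceived action activates the matching motor program,
        and activation spreads to the associated effect representations."""
        return {e: w for (a, e), w in self.strength.items() if a == perceived_action}


def attention_weights(priming, scene, baseline=1.0):
    """Primed events receive extra weight; weights are normalized over the scene."""
    raw = {event: baseline + priming.get(event, 0.0) for event in scene}
    total = sum(raw.values())
    return {event: w / total for event, w in raw.items()}


# Left panel of Fig. 1: the agent repeatedly uses the lever and sees the light turn on.
agent = IdeomotorAgent()
for _ in range(3):
    agent.experience("use_lever", "light_on")

# Right panel of Fig. 1: the agent observes someone else using the lever.
priming = agent.observe("use_lever")
weights = attention_weights(priming, ["light_on", "door_opens"])
```

In this sketch, observing the lever action biases attention toward the light without any representation of the other person's intention being formed, which is the distinction the model draws. An agent without prior lever experience would show no such bias.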

Such a process might be useful as it directs one’s attention to information in the environment that may be relevant in the future, and thus prepares the perceiver for a suitable and timely reaction to the other’s action (see Bekkering et al., 2009). This might be an adaptive and yet simple mechanism, which would enable even a less developed organism to act successfully in interaction and competition with others, without relying on high-level cognitive abilities such as a concept of intention (cf. Jacob, 2009a).

Accordingly, this does not mean that by this process the other’s action goal is cognitively represented as such or attributed (i.e. understood that this object is the goal of the other). Within this model, action mirroring leads only to the activation of the representation of a typical effect, which facilitates the processing of the corresponding environmental event. A full understanding of another’s action is not reached by this action mirroring process, as a full understanding of an action requires an explanation of the action in terms of intentions and reasons for this action as well as the social implications of this action (cf. Anscombe, 1957; Brandom, 1994; Hacker, 2010), which rests on knowledge about social rules and practices (Bennett & Hacker, 2003; Hutto, 2008; Nelson, 2007). However, the facilitated processing of certain objects or events in the environment may provide important information for the observer, on which he can rely when reasoning about the other’s action (e.g., when attributing an intention to the other or when considering the possible reasons for the ongoing action). That is, this information may help him to make sense of and eventually understand the other’s action.

Refuting possible objections

One possible objection against the proposed model might be that not only effects are associated with motor programs, but also intentions and goals. Following this objection, while it is possible that an activated motor program activates the associated effect code, it could also activate an associated goal or intention representation. This argument is problematic for two reasons. First, it assumes that an action’s goal or the actor’s intention exists independently from the effect representation and the associated motor code. However, conceptual analyses have suggested that the description of an action as intentional, or the ascription of an intention to an actor, expresses that the respective action has a goal (i.e. is goal-directed), rather than describing an independent psychological state that exists beyond action and desired effect (for thorough discussions of this topic see Anscombe, 1957; Greve, 2001; Kozak, Marsh, & Wegner, 2006; Waytz, Gray, Epley, & Wegner, 2010). Now, one could still argue that a motor program activated by means of action mirroring could activate the associated goal beyond the associated effect. This notion would presuppose in a similar fashion that goals are psychological entities, which exist independently from actions and their desired effects. Yet, when we ask somebody about the goal of his action, he would describe the particular effect that he wants to achieve by this action. This shows that an action’s goal is the desired effect. In other words, goal is just a generic concept to describe the various effects that people strive to realize with their actions. Accordingly, present theories of action control state that the representations of effects serve as goals for future actions (e.g., Hommel et al., 2001). Second, even if we were to assume that intentions/goals are independent psychological states beyond actions and effects, this line of reasoning would run into the same conceptual problems as outlined by Jacob (2009a).
More precisely, if the perception of another person’s action activated the intention that is associated with my corresponding motor program, it could only activate my intention to do something. That is, following the logic of the argument, the intention to do the same action should arise in the observer. This, however, is not the same psychological state as the ascription of an intention to somebody else.

One might also ask whether the proposed model is able to account for findings of context sensitivity in mirror neuron areas (e.g., Iacoboni et al., 2005; Kaplan & Iacoboni, 2006). Note that the presented model is based on an ideomotor approach to action control. This theory holds that actions are controlled by bidirectional action–effect associations and that action–effect associations are thus at the basis of voluntary action. Importantly, research on action control has shown that action–effect associations can be acquired in a context-specific manner (Kiesel & Hoffmann, 2004; see also Kunde, Elsner, & Kiesel, 2007). This suggests that the model is able to account for the findings of context sensitivity in mirror neuron areas.

A final question concerns findings of single-cell recordings in monkeys that show activations only for goal-directed actions, but not intransitive actions (e.g., Umiltá et al., 2001). In this study, a number of mirror neurons fired only when a goal-oriented reaching and grasping action was observed, but not when the goal was hidden or when there was no object to be grasped, which has been taken to indicate that the mirror neurons are a simple way to understand others’ goals (Iacoboni, 2008). Yet, this view is problematic as the ascription of a goal to someone else is a cognitively more complex process that cannot be reduced to a simple mapping mechanism (see Uithol et al., 2011, for an explication of this argument). Importantly, in contrast to these findings, research with human participants has provided converging evidence that the perception of intransitive actions also leads to an activation of motor codes and cortical motor areas. This has been demonstrated in adults and children in behavioral (e.g., Bertenthal, Longo, & Kosobud, 2006; Brass, Bekkering, Wohlschläger, & Prinz, 2000; Catmur, Walsh, & Heyes, 2007; Jonas et al., 2007; Kilner et al., 2003; Press, Bird, Walsh, & Heyes, 2008; for a review see also Heyes, 2011) and neuroimaging studies (Calvo-Merino, Glaser, Grezes, Passingham, & Haggard, 2005; Iacoboni et al., 1999; Koski, Iacoboni, Dubeau, Woods, & Mazziotta, 2003; Maeda, Kleiner-Fisman, & Pascual-Leone, 2002; van Elk, van Schie, Hunnius, Vesper, & Bekkering, 2008). These findings provide evidence for the claim that the perception of an action leads to an activation of the corresponding motor code in the observer.

Yet, this leaves open the question of how to interpret the original findings of goal-selectivity in mirror neurons. On the one hand, it has been suggested that the activation of these particular neurons in the inferior frontal gyrus might be indicative of a higher-order conceptual representation of the situated action (e.g., grasping a cup) rather than of the respective motor code representing a particular motor act (Jacob, 2009b). On the other hand, a second interpretation could be that the authors in this study (Umiltá et al., 2001) selected neurons that actually encode grasping actions (i.e. a motor act instead of the action’s goal), but not reaching behavior. For this reason they did not find activation when no proper grasping action was presented, but the actor only reached to an empty location. This interpretation is supported by the finding that mirror neurons discriminate between different types of hand actions (Gallese, Fadiga, Fogassi, & Rizzolatti, 1996). The interesting fact that the neurons in the Umiltá et al. (2001) study also fired when the goal was hidden could be due to the monkeys having remembered that there is an object behind the occluder (at the beginning of each trial the occluder was briefly lifted so that the monkey was reminded about the object) and imagining or predicting the upcoming grasping action. Note that they observed in the other trials of the experiment how the model was repeatedly grasping such an object. The claim that an imagined or predicted action activates the same motor code as an observed action is in line with the theoretical notion that executed, imagined, and observed actions share a common representational format (e.g., Jeannerod, 2001) and is supported by empirical studies (Schnitzler, Salenius, Salmelin, Jousmäki, & Hari, 1997; Munzert, Lorey, & Zentgraf, 2009).
It directly relates to findings that participants show motor activation not only for an observed, but also for a predicted action (e.g., Kilner, Vargas, Duval, Blakemore, & Sirigu, 2004; Meyer, Hunnius, van Elk, van Ede, & Bekkering, 2011; Southgate, Johnson, El Karoui, & Csibra, 2010). Importantly, this interpretation is also able to explain another finding of the Umiltá et al. (2001) study. It was shown that the apparently goal-selective mirror neurons fire only during the grasping part of the observed reaching and grasping action sequence (either when the grasp is directly observed or predicted), but not during the initial reaching phase. This finding is hard to reconcile with the view that this activity is indicative of the detection of the actor’s goal. Participants have observed many instances of the reaching and grasping action sequence, so that the goal of the action should be clear already at the early initiation of the action. In other words, participants know from the first second that the actor is going to grasp the object. If the respective neurons encoded the action’s goal, they should fire immediately. Nevertheless, the neurons only show activity with the onset of the grasping phase, rendering it unlikely that the neurons code the goal of the action. Rather, this finding is in line with the idea that the neurons’ activity indicates a particular action (here: a grasping, but not a reaching action).

Advantages

The proposed model of action mirroring has a number of advantages. First, studies in the developmental literature have reported clear relations between infants’ action experiences and their action prediction (e.g., Daum et al., 2011; Falck-Ytter et al., 2006; Sommerville et al., 2005). However, the interpretation that infants use their own motor system to ascribe a mental state to the other person is problematic, as the existence of such higher-order cognitive abilities in young infants is not generally accepted (e.g., Haith, 1998; Perner & Ruffman, 2005; Sirois & Jackson, 2006). The present model makes it possible to resolve this conflict by offering an explanation for the developmental findings that does not rely on complex cognitive processes, but on fairly ‘lean’ attentional mechanisms. As previous research has indicated that even infants can acquire action–effect associations (Paulus et al., 2011a; Verschoor et al., 2010), infants could employ their own experiences with actions and effects to process others’ actions and prospectively guide their attention to the relevant information in a scene.

Second, this approach clarifies the precise neurocognitive mechanisms that underlie a potential relation between action mirroring and, for example, action prediction. In other words, whereas previous literature has suggested that action mirroring and action understanding or intention ascription may be related to each other, the precise neurocognitive mechanisms were not clearly spelled out and remained thus rather vague. In contrast, the present model suggests that processes based on spreading activation within previously acquired bidirectional action–effect associations as well as attentional cueing might be the relevant information processing mechanisms that are at the heart of these phenomena.

A third advantage of this model is that it avoids the conceptual problems discussed in relation to other theories (e.g., Hickok, 2008; Jacob, 2009a). It provides an explanation for the relation between action mirroring and action understanding, without equating these two psychic powers. That is, it acknowledges that the ascription of an intention to somebody else and the understanding of an action or an action’s goal rely on other processes such as knowledge about the reasons for an action or the integration of different sources of knowledge (e.g., context information, social rules and conventions; cf. Carpendale & Lewis, 2006; Nelson, 2007; Hacker, 2010; Hutto, 2008). Furthermore, it acknowledges that action mirroring is not necessary to understand others’ actions. Nevertheless, the model takes the reviewed findings seriously and agrees that motor mapping and action mirroring can have a facilitative role in action understanding.

Finally, it should be noted that the model makes concrete predictions that allow it to be tested empirically. More concretely, the model predicts that the perception of another person’s action should lead to an activation of the associated effect codes in the observer. This could be evidenced by priming effects in reaction time tasks or by means of neuroimaging methods. For example, when the execution of an action typically leads to an auditory effect, the perception of this action executed by another person should activate the respective effect code. Future research is needed to investigate this hypothesis.

Conclusion

The present contribution proposes that action mirroring leads to an activation of the representation of the effects that are associated with this action. This activation, in turn, affects the observer’s attention and might facilitate the processing of relevant information in the environment. Accordingly, this model suggests that processes of action–effect binding and attentional cueing are the neurocognitive mechanisms that underlie the processing of others’ actions by means of action mirroring.