Introduction

The control of one’s own movements and of their impact on the external world generates a feeling of control referred to as Sense of Agency (SoA; Moore and Fletcher 2012; Tsakiris et al. 2010). SoA is put forth as a key element of the Self (Daprati et al. 1997; Gallagher 2000) and is fundamental to the development of a feeling of responsibility that fosters social cohesion (Frith 2014). Both “prospective” and “retrospective” accounts have been proposed to explain SoA (see Haggard 2017 for a review). Prospective accounts suggest high SoA is generated when action execution is preceded by fluent action selection (Chambon et al. 2012; Sidarus et al. 2017a; Wenke et al. 2010). Retrospective accounts focus on the role of processes that take place after action execution, like monitoring the consequences of actions, and suggest that high SoA is experienced when actions unfold as predicted (Blakemore et al. 1999, 2000, 2001, 2003; Frith et al. 2000; Wegner and Wheatley 1999). In spite of the differences between these accounts, it is widely acknowledged that detection of discrepancies between planned and actual consequences of the action reduces SoA. However, little is known about how and to what extent different types of prediction errors (e.g., related to the correctness of the movement or to the actual achievement of the targeted goal) affect SoA. Consider the case of a soccer player about to shoot a penalty: the player plans the shot and expects to score. If the planned movement is correctly performed and the goal is scored, no mismatch is identified by the player. However, the player may have to deal with occasional errors that could involve the execution of the movement (e.g., a clumsy performance), the achievement of the goal (e.g., the goalkeeper catches the ball), or both.

Will the player experience the same level of agency in each scenario?

The idea that some errors may impair SoA more deeply than others is illustrated by the well-known example of the soccer player Maradona, who scored a goal by touching the ball with his hand. In doing so, the Argentinian champion led his team to win the World Cup by scoring with an unconventional (and forbidden) movement. Under the assumption that he hit the ball with his hand involuntarily, the example raises the question: how much control is experienced when a goal is achieved with an unplanned movement?

Apart from football, we constantly perform goal-directed actions in everyday situations, which can be as simple as grasping a glass. When we fail to complete such actions, we experience a discernible reduction in our feeling of control (Pavone et al. 2016; Spinelli et al. 2017). As in the case of the soccer player, we may fail or succeed in the presence of a clumsy motor performance or of changes in the external environment.

Although this type of dissociation may bear theoretical and practical implications, previous research does not resolve the uncertainty about whether the reason behind the failure makes any difference to SoA.

This is because the link between action monitoring and SoA has traditionally been addressed by selectively investigating the effects of either movement information or goal achievement. The contribution of movement information to SoA has been studied by manipulating the degree of correspondence (i.e., congruency) between an executed and an observed movement (Daprati et al. 1997; Farrer et al. 2008; Fourneret and Jeannerod 1998; Padrao et al. 2016; Van Den Bos and Jeannerod 2002). A reduction of SoA has been consistently reported for incongruent movements. The influence of goal achievement on SoA has been investigated with tasks resembling videogame interfaces. The experimenters systematically varied the ease with which a target depicted on a computer screen could be reached by means of an input device (Kumar and Srinivasan 2017; Metcalfe et al. 2013; Metcalfe and Greene 2007): failure to achieve the goal was associated with a loss of SoA. Additionally, direct manipulation of the outcome of the action (e.g., a sound or a visual event following the participant’s action) reduces SoA when the outcome is different from the predicted one (Kühn et al. 2011; Sato and Yasuda 2005) or when the outcome is delayed with respect to movement execution (e.g., Farrer et al. 2013; Spengler et al. 2009).

Despite the importance of manipulating movement and goal information within the same experiment to understand their impact on SoA, to the best of our knowledge only two recent studies on this issue have been published thus far. In the first, David et al. (2016) asked participants to observe a virtual hand depicted on a monitor. A movement of the hand was reproduced in the virtual environment, and a tap with the index finger was associated with an outcome, either a sound or a color change in the VR scenario. In each trial, the experimenters manipulated the lag between movement execution and (1) movement observation or (2) outcome occurrence. Participants were asked to judge whether the action they observed was their own. Participants were found to be less likely to attribute the action to themselves when delays were introduced with respect to outcome observation. This led the authors to conclude that SoA is more sensitive to outcome than to movement information. Importantly, the authors manipulated movement and outcome separately, and incongruence took place only in the time domain (both the observed movement and the outcome were correct, but they could occur later than expected).

In the second study, Caspar et al. (2016) asked participants to associate two finger movements with two successive tones. In the experimental phase, participants decided freely which finger to move and the action was followed by the expected or an unexpected tone. As in the Intentional Binding paradigm (Haggard et al. 2002), perceived latency of the tone was taken as an implicit measure of SoA. Importantly, while performing the task, participants observed a robotic hand moving the same or another finger. The authors found that binding between action and tone was stronger for congruent than incongruent tones only if the robot moved the same finger as the participant. These findings suggested that SoA was sensitive to both movement and outcome information.

It is worth noting that in Caspar et al. (2016), participants observed a robotic instead of a humanlike hand and their action was not clearly identifiable as goal-directed, since participants may not have intended to produce certain tones, but rather they expected a specific tone to occur following the movement of a certain finger as learned in preceding training blocks. Importantly, the manipulation of movement and outcome was not simultaneous: outcomes did not immediately follow movement execution (delays between action and outcome were at least 300 ms), while the robotic hand moved immediately after the real hand movement. Thus, the roles of specific types of prediction errors in reducing SoA remain unclear.

To fill this gap, we sought to investigate how SoA is modulated when monitoring two fundamental sub-components of the action at the same time, namely the congruency between performed and observed movement and the achievement of the goal. We reasoned that the simultaneous manipulation of the two sub-components in the context of an intuitive goal-directed action would allow a straightforward comparison of their respective roles for SoA.

We thus devised a novel task that involved simple goal-directed actions (i.e., to press, by raising or lowering the right index, one of two colored buttons), while participants observed actions performed by a virtual humanlike hand. Virtual actions could be congruent or incongruent with participants’ real actions in terms of movement execution and/or the resulting outcome. Moreover, to investigate the temporal dynamics of the effects of movement and goal manipulation on SoA, we introduced delays between action execution and action observation. Causality perception and time perception are known to be closely linked and to influence each other (Desantis et al. 2016; Shimada et al. 2010; Stetson et al. 2006; Timm et al. 2014; Walsh and Haggard 2013). Indeed, representing one’s own actions as the cause of certain outcomes biases time perception of the events associated with these actions (Desantis et al. 2011, 2016). For instance, expecting that specific visual stimuli will follow one’s own actions (as an effect of a previous learning phase) induces a tendency in participants to perceive the onset of the visual stimuli as occurring after their own actions (Desantis et al. 2016). Vice versa, temporal cues are known to contribute to the perception of causality and to SoA: time gaps between action and outcome reduce the sensation that the outcome results from one’s own action (David et al. 2016; Franck et al. 2001; Sato and Yasuda 2005; Shanks et al. 1989; Weiss et al. 2014). Hence, the introduction of delays allowed us to measure SoA by means of judgments of temporal correspondence (Weiss et al. 2014) between the executed and the observed action (henceforth called Synchrony Judgments). We chose this measure since Synchrony Judgments rely on the same information employed to attribute an action to oneself or to someone else (Weiss et al. 2014). This is suggested by fMRI data (Farrer et al. 2008) showing that the inferior parietal cortex is activated both when participants notice delays between their action and visual feedback of their action (i.e., temporal discrepancy) and when they attribute the visual feedback to someone else (i.e., action authorship discrepancy). Reporting a discrepancy between an action and its outcome may thus be equivalent to expressing an explicit agency judgment (Weiss et al. 2014). Besides that, it is possible that Synchrony Judgments might also capture variations in the participants’ Sense of Ownership (SoO), that is, the sense that my body is ‘my own’ and that I am the one who is undergoing an experience (Gallagher 2000; Tsakiris et al. 2010). A possible confusion between different aspects of SoA and SoO was also reported for previous studies that investigated SoA with measures similar to the one we employed (Gallagher 2012, 2013; Gallagher and Zahavi 2007). However, choosing Synchrony Judgments as a measure of SoA may facilitate the comparison of our results with those of similar studies, possibly reducing the influence of self-attribution bias (Tsakiris et al. 2005; Wegner and Wheatley 1999; Weiss et al. 2014). We expected that Synchrony Judgments would be differently influenced by the type of observed action and by the introduction of incongruences.

In keeping with previous studies, we predicted that observation of both an incongruent (compared to a congruent) movement and a missed (compared to an achieved) goal would reduce perceived synchrony between the participant’s action and the one shown in the virtual scenario, which in turn would indicate a diminished SoA.

Research suggests that information relative to movement kinematics may not be adequately monitored as long as the visual feedback is coherent with the goal of the action (Fourneret and Jeannerod 1998). Therefore, we predicted that observing a failure in reaching the goal should be more relevant in diminishing SoA than observing an incongruent movement. This prediction is consistent with the conclusions of David et al. (2016).

Finally, the introduction of delays allowed us to further investigate the temporal dynamics of movement and goal monitoring on SoA.

Materials and methods

Participants

Thirty healthy volunteers took part in the study (15 males; age range 20–32 years; mean ± standard error of the mean (SEM) 24.1 ± 0.538). All participants were right-handed, had normal or corrected-to-normal visual acuity and were naive as to the purposes of the experiment. Explanations of the experimental hypotheses were provided only after the end of the experiment. The experimental protocol was approved by the ethics committee of Fondazione Santa Lucia (Prot. CE/PROG.557) and was performed in accordance with the 1964 Declaration of Helsinki. All participants provided written informed consent to take part in the study and received a reimbursement of € 7.50/h.

Apparatus

The experiment was run by means of a Matlab (The MathWorks, Inc.) custom script and relied on the use of a mixed-reality scenario (Fig. 1a). A virtual response box (composed of two dark gray buttons attached to the upper and lower part of a transparent structure) and a virtual humanoid right limb (forearm + hand) were depicted on a computer screen.

Fig. 1

Experimental setup. Participants sat on a chair in front of an inclined PC monitor. The virtual environment represented a virtual right limb resembling a human hand. The participant’s right arm rested on the table and matched the position of the virtual limb, with the index resting between the two buttons of the real response box (a). The C-shaped response box with two buttons facing each other allowed participants to perform goal-directed actions (b). A keyboard on the left (not visible to the participants) allowed us to collect responses to the Synchrony Judgment questions (see main text for details). Answers were provided by pressing two keys (labeled with “S” for synchronous, and “A” for asynchronous) with the index and middle finger of the left hand.

The index of the virtual hand lay between the two virtual buttons of the response box. Virtual stimuli were created with 3DS Max 2011 (Autodesk, Inc) and were presented on an LED monitor (Benq GL 2250-T; refresh rate, 60 Hz; resolution set to 1280 × 720 pixels) held in an inclined position (12.7° with respect to the horizontal plane) by a wooden structure located on a table. A rectangular hole (7.50 × 5.8 cm) at the front of the structure allowed participants to lay their right hand on the table, under the monitor and hidden from sight. A custom-made C-shaped response box, designed to record downward and upward movements of the index, was placed on the desk below the monitor (Fig. 1b). This was composed of two identical numeric keypads that allowed two button presses with opposite movements. A plastic support fixed to the table (height, 7 cm) held the upper keypad, so that the keys of the two devices were facing each other. To facilitate input acquisition, two plastic buttons (height, 1.5 cm) with a square, flat top face (side length, 3.2 cm) were fixed to single keys of the two keypads and aligned. The distance between the surfaces of the two buttons was adapted for each participant by inserting paper supports below the lower keypad, until the dorsal part of the distal phalanx of the index touched the upper button, while the ventral part rested on the lower button. In this way, the key features of the virtual response box closely matched the features of the physical response box. A keyboard (not visible to participants) was positioned on the table to the left of the monitor and allowed participants to express Synchrony Judgments (see “Action–outcome manipulation” and Fig. 1 for details).

Procedure and task

The study was performed in a dimly lit room. Participants sat comfortably on a chair in front of a table, at a viewing distance of approximately 40 cm from the center of the screen. They were asked to lay their right arm on the table, trying to match the position of the virtual limb, and to insert their index in the space between the two buttons of the response box. A black cloth covered the shoulders and the elbow joint, preventing any visual discontinuity between the virtual arm and the participant’s real limb. Participants were asked to perform goal-directed movements following a color-based rule (see below for details and Fig. 2a for a graphical representation of a typical trial). In each trial, the two dark gray virtual buttons turned blue and yellow, respectively. Participants were instructed to press the button corresponding to a given target color (blue or yellow) as fast as possible. Importantly, in each trial, the target color could appear on the upper button (pressed by lifting the index finger) or on the lower button (pressed by lowering the index finger).

Fig. 2

a Timeline of a typical trial. For explanatory purposes, we show only the case where the color of the target button was blue. At the beginning of each trial, the color of the buttons was dark gray. After 1000 ms a text instruction reminded participants about the color of the target. After a random interval of between 1000 and 1500 ms the buttons flashed once to yellow and blue for 100 ms with a random disposition. Feedback in the virtual scene was shown after a temporal delay (0, 75, 150, 225, 300 ms) only if participants pressed the correct button. The feedback consisted of a congruent (M+) or incongruent (M−) movement with respect to the one performed by the participant, where the goal (i.e., pressing the target color) was achieved (G+) or missed (G−). The type of feedback depended on participant performance as evaluated by a staircase procedure. A fast press was associated with correct feedback (M+G+), while a slow press was associated with one of the types of erroneous feedback (M+G−, M−G+, M−G−). In case of a real error, a prohibition sign appeared on screen and the current trial was aborted. Feedback was shown for 500 ms; then a black rectangle covered the virtual hand and the response box, and participants were asked to provide Synchrony Judgments. Trials were separated by an inter-trial interval of 1000 ms. b This panel represents the possible types of feedback participants observed after pressing a button of the response box. For explanatory purposes, we report only the case where the color of the target button was blue and where it appeared above the index, but the manipulation of movement and goal information was the same when participants were asked to press the yellow button and when the disposition of colors was reversed. On the far left of the figure the initial disposition of the colors is reported. In the center of the figure the four possible types of feedback that followed a correct button press are displayed: one type of feedback was fully correct (M+G+) and was viewed if participants provided a fast response, while the remaining types of feedback were erroneous (M+G−, M−G+, M−G−) and one of them was observed if they provided a slow response. The panel on the right represents the prohibition sign participants viewed if they pressed the wrong button.

At the beginning of each trial, an instruction was presented for 1 s reminding participants about the target color (i.e., “Press Yellow/Blue”). After a random interval (between 1000 and 1500 ms), the two buttons changed from dark gray to yellow and blue for 100 ms with a random disposition (see Fig. 2a). The color change signaled to participants to press the target button as fast as possible. In trials where participants followed the instructions correctly, the button press triggered an action of the virtual hand (i.e., a visual feedback in the virtual scene; see “Action–outcome manipulation” section). In trials where participants followed the instructions erroneously (e.g., the target color was “blue” and they pressed the “yellow” button), a prohibition sign was displayed for 2 s (see Fig. 2b). After this signal, the current trial was aborted and a new trial began.

Participants performed two blocks in which the color of the target was fixed. Thus, the color of the target remained the same for the entire duration of the first block but was changed in the following block. The block order was quasi-counterbalanced across participants (16 and 14 participants started the first block with the blue and yellow color as target, respectively).

Due to the adaptive algorithm employed to determine the type of visual feedback participants observed in the virtual scene (staircase procedure; see “Action–outcome manipulation” for more details), the number of trials of the two blocks was not identical for each participant. Participants performed on average 247 trials (range 223–271; SEM ± 1.89) in the first block and 246 trials (range 224–281; SEM ± 2.12) in the second block. Hence, participants performed on average 493 trials (range 447–547; SEM ± 3.33) in the whole experiment.

Before starting each block, participants performed a practice session to familiarize themselves with the task. During the practice session, they pressed the target color that would be used in the next block. Participants performed on average 24 practice trials (range 18–30; SEM ± 0.62) before the first block and 22 trials (range 15–29; SEM ± 0.65) before the second block.

Action–outcome manipulation

The visual feedback was presented at different delays after participants’ actual button press (0, + 75, + 150, + 225, + 300 ms). The feedback consisted of a button press in the virtual scenario, where the observed movement could be congruent or incongruent with the one participants performed (M+/M−) and the disposition of the colors of the two virtual buttons could be the same as the one preceding their input, or reversed (see Fig. 2b). Thus, by changing the disposition of the colors after the button press, the goal of the action could be either achieved or missed (G+/G−). Overall, we manipulated action–outcome expectations in four different ways: congruent movement with achieved/missed goal (M+G+ and M+G−) and incongruent movement with achieved/missed goal (M−G+ and M−G−). Therefore, one type of feedback was fully correct (M+G+), while the remaining types of feedback were erroneous (M+G−, M−G+, M−G−), since they could conflict with participants’ expectations about the observed movement and/or about goal achievement.
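The four feedback types can be read as a small truth table. The following Python sketch is one consistent reading of the description above, not the authors' Matlab script: the function name `virtual_feedback` and its arguments are hypothetical, and ASCII signs are used in place of the typographic minus. It assumes the participant pressed the correct button, so the target color initially sits on the side of the performed movement.

```python
def virtual_feedback(performed_move, target_color, movement_congruent, goal_achieved):
    """Return (virtual_move, colors_reversed, pressed_color) for one trial.

    performed_move: "up" or "down", the participant's (correct) press.
    target_color:   "blue" or "yellow", the color the participant aimed at.
    Hypothetical helper written for illustration only.
    """
    opposite_move = {"up": "down", "down": "up"}
    other_color = {"blue": "yellow", "yellow": "blue"}

    # M+ shows the performed movement, M- shows the opposite one.
    virtual_move = performed_move if movement_congruent else opposite_move[performed_move]
    # G+ means the virtual hand ends up on the target color, G- on the other one.
    pressed_color = target_color if goal_achieved else other_color[target_color]
    # The colors must swap exactly when one, and only one, of movement and
    # goal is violated (M+G- and M-G+); they stay put for M+G+ and M-G-.
    colors_reversed = movement_congruent != goal_achieved
    return virtual_move, colors_reversed, pressed_color
```

Under this reading, an M−G+ trial shows the opposite movement together with a color swap, so the virtual hand still lands on the target color.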

Whether participants observed correct or erroneous feedback depended on their reaction time. An adaptive algorithm (staircase procedure) was used to set the limit for classifying fast and slow responses in each trial (Walentowska et al. 2016). The mean of the reaction times in the last two trials (the current trial and the previous one) was computed: if the reaction time in the current trial was lower than or equal to the mean value we considered it a “fast” response, while if it was higher than the mean value we considered it a “slow” response. Fast responses were associated with the observation of correct feedback, while trials in which participants provided slow responses were associated with the observation of erroneous feedback. The advantage of this procedure was that the response deadline was updated throughout the experiment, which prevented habituation and fatigue while motivating participants to actively attend to the external stimulus (Walentowska et al. 2016).
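The classification rule can be sketched in a few lines of Python. This is an illustrative reconstruction, not the original Matlab code: the function name `classify_responses` and the `warmup` parameter are hypothetical, and the sketch assumes that only correct button presses enter the sequence and that the first four presses of a block are not classified.

```python
def classify_responses(reaction_times_ms, warmup=4):
    """Label each correct press as 'warmup', 'fast' or 'slow'.

    The threshold for a trial is the mean of the current and the previous
    reaction time, as described in the text.
    """
    labels = []
    for i, rt in enumerate(reaction_times_ms):
        if i < warmup:
            # Assumed: staircase not yet operating, feedback is always M+G+.
            labels.append("warmup")
            continue
        threshold = (rt + reaction_times_ms[i - 1]) / 2
        # rt <= threshold is algebraically the same as rt <= previous rt.
        labels.append("fast" if rt <= threshold else "slow")
    return labels


print(classify_responses([400, 420, 410, 430, 425, 440, 440]))
```

Because the threshold includes the current trial, the rule reduces to comparing each reaction time with the immediately preceding one, which is what keeps the deadline moving with the participant's performance.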

Each type of erroneous feedback (M+G−, M−G+, M−G−) was presented 80 times (16 times for each delay, 8 per block), for a total amount of 120 trials per block and 240 trials for the entire study. The order of appearance of the different types of erroneous feedback was fully randomized.
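The trial counts follow from the design: 3 erroneous feedback types × 5 delays × 8 repetitions = 120 trials per block, and 240 over two blocks. A minimal Python sketch of such a randomized schedule is given below; the names `error_schedule`, `ERROR_TYPES` and `REPS_PER_BLOCK` are hypothetical, ASCII signs replace the typographic minus, and the way the original script consumed the schedule is an assumption.

```python
import random

ERROR_TYPES = ("M+G-", "M-G+", "M-G-")
DELAYS_MS = (0, 75, 150, 225, 300)
REPS_PER_BLOCK = 8  # 8 repetitions per type and delay in each block


def error_schedule(seed=None):
    """Return a shuffled list of (feedback_type, delay_ms) pairs for one block."""
    rng = random.Random(seed)
    cells = [
        (feedback, delay)
        for feedback in ERROR_TYPES
        for delay in DELAYS_MS
        for _ in range(REPS_PER_BLOCK)
    ]
    rng.shuffle(cells)  # fully randomized order of erroneous feedback
    return cells
```

One plausible use is to take the next entry of this list every time a response is classified as slow, so that the block ends once all 120 entries have been shown.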

However, due to the characteristics of the staircase procedure, the appearance of the correct feedback (M+G+) depended on each participant’s reaction times. This meant that the number of M+G+ observations was not identical for each participant. Participants observed M+G+ feedback on average 239 times (range 198–272; SEM ± 2.78; see “Data handling” for more details). The order of delays for M+G+ was randomized.

In each of the two blocks, the first four correct button presses were always followed by the observation of M+G+ feedback. This allowed participants to acclimatize and allowed the staircase procedure for stimulus presentation to start. The feedback remained on screen for 500 ms and was followed by the appearance of a black rectangle covering the virtual hand and the virtual response box. Participants had to judge whether the visual feedback was synchronous or asynchronous with their movement (Synchrony Judgment question, henceforth SJ). Participants were explicitly instructed to focus on the temporal correspondence between their movement and the feedback shown on screen, irrespective of the specific kind of feedback they observed. Answers were collected by pressing two keys (labeled with “S” for synchronous, and “A” for asynchronous) with the index and middle finger of the left hand. The associations between the two judgments (S and A) and the fingers (index and middle) used to provide the response were counterbalanced across participants.

Data handling

We excluded from the analysis the first four trials of each block (i.e., trials where the staircase procedure was not operating). Trials where participants committed a real error by pressing the wrong button according to instructions (range 0–27; mean ± SEM 5.83 ± 1.13; mean percentage of real errors across participants 1.18%) and trials where participants failed to provide any response after the buttons flashed to yellow and blue (e.g., they did not notice the disposition of the colors; range 0–4; 0.6 ± 0.189) were aborted. This left on average 479 valid trials per participant (range 438–512; SEM ± 2.76). In half of these trials (mean ± SEM; absolute value 239 ± 2.78; percentage value 49.94 ± 0.298%), participants viewed M+G+ feedback, while the remaining trials were equally divided among the three types of erroneous feedback (M+G−: absolute value 80 ± 0.056; percentage value 16.70 ± 0.101%; M−G+: absolute value 80 ± 0.92; percentage value 16.67 ± 0.099%; M−G−: absolute value 80 ± 0.92; percentage value 16.70 ± 0.100%). Thus, to perform the statistical analysis on the same number of trials per condition, we implemented an algorithm to select a subset of equally spaced trials for each action–outcome × delay combination. By applying this algorithm, an equal number of trials for each condition was obtained for each participant (absolute value 15.7 ± 0.085; range 15–16; but see Supplementary Materials, where the same analyses were performed on the whole data set and similar results were found).
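The text does not specify how the equally spaced subset was computed. One simple way to do it, shown here as a hedged Python sketch with a hypothetical function name `equally_spaced_subset`, is to pick evenly spread positions between the first and the last trial of a condition, preserving trial order.

```python
def equally_spaced_subset(trials, n_keep):
    """Keep n_keep trials spread evenly over the ordered list `trials`.

    Illustrative assumption: positions are spaced between the first and the
    last trial and rounded to the nearest index.
    """
    n = len(trials)
    if n_keep >= n:
        return list(trials)  # nothing to drop
    if n_keep <= 0:
        return []
    if n_keep == 1:
        return [trials[0]]
    step = (n - 1) / (n_keep - 1)  # greater than 1, so indices never repeat
    indices = [round(i * step) for i in range(n_keep)]
    return [trials[i] for i in indices]
```

For example, an M+G+ cell with about 48 trials at a given delay would be reduced to 16 trials that still span the whole block, rather than only its beginning or end.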

Two dependent variables were taken into account: (a) the proportion of “synchronous” answers to the Synchrony Judgments (SJs) for each experimental condition; (b) the amount of time participants took to provide an SJ after receiving visual feedback on the screen. This variable will be referred to as “Judgment Times” (JTs). Moreover, the reaction times (RTs) between target appearance and button press were analyzed to check the effectiveness of the staircase procedure.

The mean values of these variables were calculated for each participant for all 20 experimental conditions, which resulted from the manipulation of three factors: Movement (2 levels: M+/M−); Goal (2 levels: G+/G−); Delay (5 levels: 0/75/150/225/300 ms).
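As an illustration of how the 2 × 2 × 5 design yields 20 cells per participant, the following Python sketch computes the proportion of “synchronous” answers in each cell. The trial tuple layout and the name `proportion_synchronous` are assumptions made for this example, and ASCII signs replace the typographic minus.

```python
from collections import defaultdict
from itertools import product

MOVEMENT = ("M+", "M-")
GOAL = ("G+", "G-")
DELAYS_MS = (0, 75, 150, 225, 300)
CONDITIONS = list(product(MOVEMENT, GOAL, DELAYS_MS))  # 2 x 2 x 5 = 20 cells


def proportion_synchronous(trials):
    """Map each observed condition to its proportion of 'S' answers.

    trials: iterable of (movement, goal, delay_ms, answer), answer 'S' or 'A'.
    """
    counts = defaultdict(lambda: [0, 0])  # condition -> [n_synchronous, n_total]
    for movement, goal, delay, answer in trials:
        cell = counts[(movement, goal, delay)]
        cell[0] += answer == "S"
        cell[1] += 1
    return {
        cond: counts[cond][0] / counts[cond][1]
        for cond in CONDITIONS
        if cond in counts
    }
```

The same grouping applies to Judgment Times and reaction times, with a mean taken over the (transformed) times in each cell instead of a proportion.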

Normality was considered violated for a condition when both the Kolmogorov–Smirnov test was significant and the z-scores for Skewness and Kurtosis fell outside the range from −2.58 to +2.58 (Field et al. 2012). This was the case for some conditions. To correct for this, SJ mean values underwent an intra-subject standardization by means of an ipsatization procedure (Tieri et al. 2015). A reciprocal transformation (1/x) was applied to JTs and RTs, since several conditions were not normally distributed. After applying these transformations, we found no deviations from normality for the dependent variables. Transformed variables were then entered into separate 2 × 2 × 5 repeated measures analyses of variance (ANOVAs) with Movement, Goal and Delay as within-subjects factors. Tukey correction was applied for all post hoc comparisons.
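The two transformations can be sketched as follows. The text does not give the exact ipsatization formula, so this Python example assumes the common definition (each participant's condition means are centered on that participant's own mean and, optionally, divided by that participant's standard deviation). The function names and the `scale` flag are hypothetical.

```python
from statistics import mean, stdev


def ipsatize(scores, scale=True):
    """Standardize one participant's condition means within that participant.

    Assumed definition: subtract the participant's mean across conditions
    and, if scale is True, divide by the participant's standard deviation.
    """
    m = mean(scores)
    centered = [s - m for s in scores]
    if not scale:
        return centered
    sd = stdev(scores)
    if sd == 0:
        return centered  # no within-participant variability to scale by
    return [c / sd for c in centered]


def reciprocal(times):
    """Apply the 1/x transformation; larger values mean faster responses."""
    return [1.0 / t for t in times]
```

Note that after the reciprocal transformation the direction of the scale is inverted: a higher transformed RT or JT corresponds to a shorter time.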

Results

Staircase procedure

The 2 × 2 × 5 ANOVA on transformed RTs was performed to check that the staircase procedure was effective in splitting fast and slow presses and in associating the observation of erroneous feedback only with slow reaction times. According to the algorithm we set, fast RTs should have been followed by M+G+, while slow RTs should have been followed by one type of erroneous feedback. Hence, RTs should be, on average, faster in trials where M+G+ was viewed, compared to trials where one type of erroneous feedback (M+G−, M−G+, M−G−) was viewed, and the three types of erroneous feedback should not differ from one another. Indeed, we found significant main effects of factors Movement (F(1, 29) = 143.38, p < 0.001, ηp² = 0.832) and Goal (F(1, 29) = 223.59, p < 0.001, ηp² = 0.885). Importantly, a significant Movement × Goal interaction was also found (F(1, 29) = 183.44, p < 0.001, ηp² = 0.863). Post hoc comparisons on the reciprocal-transformed values (where larger values indicate faster responses) showed that M+G+ observations (mean ± SEM 2.14 ± 0.061) were preceded by faster RTs compared to M+G− (1.69 ± 0.058; p < 0.001; d = 1.37), M−G+ (1.71 ± 0.063; p < 0.001; d = 1.26) and M−G− (1.70 ± 0.059; p < 0.001; d = 1.33). No other comparison was significant (all ps > 0.845, all ds < 0.057). This pattern of results confirms that fast responses were followed by M+G+, while slow responses were followed by one of the three types of erroneous feedback.

The ANOVA did not show any other significant main or interaction effects (all Fs < 1.32; all ps > 0.266; all ηp² < 0.044).

Synchrony Judgments (SJs)

The 2 × 2 × 5 ANOVA on the mean scores of ipsatized SJs showed a significant main effect of factor Movement (F(1, 29) = 4.47, p = 0.043, ηp² = 0.134; Fig. 3a). Perceived synchrony was higher when participants viewed a congruent movement (M+: mean ± SEM 0.043 ± 0.021) compared to when they viewed an incongruent movement (M−: −0.043 ± 0.021; d = 0.772). Not surprisingly, the ANOVA also revealed a main effect of Delay (F(4, 116) = 81.445, p < 0.001, ηp² = 0.737; Fig. 3c), explained by higher SJs for shorter delays compared to longer delays (delay 0, 0.253 ± 0.026; delay 75, 0.158 ± 0.020; delay 150, 0.003 ± 0.014; delay 225, −0.165 ± 0.018; delay 300, −0.248 ± 0.024; all ps < 0.039; all ds > 0.737). The only exception was the comparison between delays 225 and 300, which did not differ from one another (p = 0.095; d = 0.717). Importantly, the ANOVA revealed a significant interaction between factors Goal and Delay (F(4, 116) = 4.06, p = 0.004, ηp² = 0.123; Fig. 3b). Post hoc comparisons revealed that at delay 0 participants perceived the feedback as more synchronous when the goal was achieved compared to when it was missed (p = 0.015; d = 0.358). The same comparison was marginally significant at delay 75 (p = 0.054; d = 0.380). SJs did not differ between achieved and missed goals at delays 150, 225 and 300 (all ps > 0.999, all ds < 0.087); see Table 1 for mean ± SEM for each Goal × Delay level. It is interesting to note that the analogous interaction between factors Movement and Delay was not significant (F(4, 116) = 0.806, p = 0.524, ηp² = 0.027). Thus, in contrast to information about goal achievement, incongruent movements were associated with lower perceived synchrony irrespective of the duration of the delay (main effect of factor Movement).

Fig. 3

This figure represents the mean ipsatized scores of Synchrony Judgments (SJs) a after the observation of a congruent (M+) or incongruent (M−) movement, b after the observation of feedback where the goal was achieved (G+) or missed (G−) for each delay (0, +75, +150, +225, +300 ms; only significant differences between G+ and G− within each delay are plotted) and c for each delay irrespective of the type of observed feedback. Vertical bars denote mean ± standard error of the mean (SEM).

Table 1 Mean ipsatized scores ± standard error of the mean (SEM) of Synchrony Judgments after the observation of feedback where goal was achieved (G+) or missed (G−) for each delay

The ANOVA did not show any other significant main or interaction effects (all Fs < 0.906; all ps > 0.349; all ηp² < 0.031).

Judgment Times (JTs)

The 2 × 2 × 5 ANOVA on transformed JTs revealed that participants were significantly faster in providing SJs when they observed visual feedback that was fully congruent with what they expected in terms of movement direction and goal achievement (M+G+).

The ANOVA showed a significant main effect of factor Delay (F(4, 116) = 4.748, p = 0.001, ηp2 = 0.141). Post hoc comparisons on factor Delay revealed that participants were significantly faster in providing an SJ when the delay was 300 ms (mean ± SEM 1.58 ± 0.112) compared, respectively, to 0 ms (1.39 ± 0.110; p = 0.003; d = 0.319), 75 ms (1.41 ± 0.105; p = 0.016; d = 0.283) and 150 ms (1.42 ± 0.097; p = 0.030; d = 0.273). No other comparison was significant (all ps > 0.110; all ds < 0.213).

We found a significant main effect of Movement (F(1, 29) = 15.91, p < 0.001, ηp2 = 0.354) and a main effect of Goal (F(1, 29) = 18.085, p < 0.001, ηp2 = 0.384). Importantly, the interaction between Movement and Goal was significant as well (F(1, 29) = 25.054, p < 0.001, ηp2 = 0.463; Fig. 4). Post hoc comparisons revealed that participants were significantly faster in providing an SJ when they viewed M+G+ (mean ± SEM 1.64 ± 0.116) compared to M+G− (1.41 ± 0.101; p < 0.001, d = 0.395), M−G+ (1.39 ± 0.010; p < 0.001; d = 0.434) and M−G− (1.41 ± 0.101; p < 0.001; d = 0.387) respectively. None of the other comparisons were significant (all ps > 0.887; all ds < 0.049).

Fig. 4

Graphical representation of the reciprocal mean values (1/x) of the time required to express a Judgment Time (JT) after the observation of each possible feedback. Vertical bars denote mean ± standard error of the mean (SEM)

The ANOVA did not show any other significant main effects or interactions (all Fs < 0.854; all ps > 0.494; all ηp2 < 0.029).
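Because the JT values reported above are reciprocals (1/x) of the judgment times, as stated in the caption of Fig. 4, larger values correspond to faster responses. The sketch below illustrates this direction of the transform. The raw times are invented for illustration and are assumed to be in seconds; the function name `reciprocal_transform` is hypothetical and does not reproduce the authors' preprocessing.

```python
def reciprocal_transform(times_s):
    """Map raw judgment times (assumed seconds) to 1/x; larger values mean faster responses."""
    return [1.0 / t for t in times_s]


# Hypothetical raw judgment times in seconds (made-up values, not data from the study).
hypothetical_raw_jts = [0.5, 0.8, 1.25]

transformed = reciprocal_transform(hypothetical_raw_jts)

for raw, value in zip(hypothetical_raw_jts, transformed):
    print(f"raw = {raw:.2f} s -> transformed = {value:.2f}")
```

Under this transform, the ordering of conditions is reversed relative to raw times: the shortest raw time yields the largest transformed value, which is why higher transformed JTs are described as faster responses.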

Discussion

We investigated how violating expectations about movement execution or goal achievement influences SoA. We reasoned that to understand their relative contribution to SoA, movement execution and goal achievement should be manipulated simultaneously in the context of an intuitive goal-directed action. To do this, we devised a novel paradigm that combined the execution of simple goal-directed actions (i.e., pressing a button of a target color by lifting or lowering the index finger) with the observation of virtual actions that fitted or violated the participants’ expectations about the performed and observed actions. Virtual actions could be congruent or incongruent with participants’ actions. Movement and/or goal incongruences could occur at different time delays, a feature of the task that allowed us to investigate the temporal dynamics of their effects on SoA. Participants were asked to evaluate the synchronicity between the executed and the observed actions, i.e., to provide Synchrony Judgments, which is equivalent to expressing an explicit judgment of agency.

Our results indicate that both movement and goal errors impair SoA. However, we show for the first time that movement monitoring may be a more constant source of modulation of SoA than goal monitoring.

Modulations of SoA: influence of movement, goal and delays between action execution and action observation

The analysis of SJs showed a significant main effect of Delay. This is consistent with the results of previous studies that found a reduction of SoA when the latency of events resulting from one’s own actions differs from what one expects (e.g., David et al. 2016; Franck et al. 2001; Weiss et al. 2014). In the specific case of our study this also indicates that participants correctly understood the task: they successfully identified increasing delays between their action and the visual feedback in the virtual scenario.

Interestingly, the analysis of SJs showed that participants tended to perceive the visual feedback in the virtual scenario as more synchronous with their own action when they observed a movement that was congruent with the one they executed, compared to when they observed an incongruent movement, as indicated by the main effect of factor Movement. This was true regardless of the specific delay we introduced between the observed and executed action and irrespective of whether the goal was or was not achieved (neither the Movement × Delay nor the Movement × Goal interaction was significant). The feedback in the virtual scenario was also perceived as more synchronous when the goal was attained compared to when it was missed. However, this happened only for simultaneous (0 ms) feedback or when a very short delay (75 ms) was introduced with respect to the button press, as revealed by the post hoc analysis run for the significant Goal × Delay interaction on the SJs.

These results suggest that information regarding the congruency of the movement and the achievement of the goal are both relevant for experiencing SoA. One may note that: (1) participants’ SoA decreased when they observed a movement that was incongruent with their own, irrespective of when they observed it; and (2) the observation of a failure to achieve the goal was also effective in reducing SoA, but only when the feedback was simultaneous with or immediately followed action execution. However, these results do not indicate that movement information is more relevant than goal information for SoA, since no interaction between factors Movement and Goal was found.

The time-limited sensitivity to goal manipulation found in our study, as compared with the constant reliance on movement information, may be compatible with the findings of two previous studies (Metcalfe et al. 2013; Metcalfe and Greene 2007). In Metcalfe et al. (2013) the experimenters asked their participants to play a videogame in which their task was to touch downward scrolling targets with a cursor controlled through a mouse. Success in touching the target was associated with a change in its visual appearance (“explosion”). The cursor responsiveness to commands (“proximal action”) and the probability that the target would “explode” after a hit (“distal outcome”) were manipulated. SoA was more influenced by introducing a perturbation that affected the responsiveness to commands than by diminishing the probability of causing the explosion of the target. Congruently, in a previous study that employed a similar procedure, Metcalfe and Greene (2007) found that SoA was modulated by the degree of control participants were allowed to exert over the outcomes. When no perturbation in the control of the cursor was introduced, their perceived control corresponded to their success in causing the distal outcome: judgments of control were high when participants hit many targets, and low when they did not succeed in the task. When noise was introduced in the control of the cursor, or when targets or distractors were “magically” hit despite the cursor being distant from them, people relied less on how often they succeeded in hitting the targets, and more on the monitoring of the performed action. Taken together, the results of these two studies suggest that people are generally capable of tracking information about their movements and that monitoring of proximal actions is at least as relevant as obtaining an expected outcome for generating SoA. As in the studies by Metcalfe and colleagues, our results suggest that information relative to one’s own movement is relevant for feeling control over actions which aim at attaining a goal.

Altogether, our results are in line with the hypothesis that the observation of an incongruence between the executed and the observed action, whether related to the movement or to the goal, will generally reduce SoA, but they do not support the hypothesis that a failure to achieve the goal will more strongly affect SoA than the observation of an incongruent movement. In fact, movement information induced a more constant modulation of SoA than goal information: the influence of the latter began to vanish when introducing very short delays. At first sight, our results may seem in contrast with findings by David et al. (2016), who found that SoA was crucially affected by the final outcome of the action more than by other features related to the action itself. However, we think that methodological differences may have played a role. In our case, participants could observe the virtual finger moving in the opposite direction (incongruent movement) and/or pressing the button of a different color than their target (missed goal). In the study by David et al. (2016), expectations about the course of the action were violated only by introducing delays between the executed and the observed movement, or between the executed action and the final outcome. In other words, the observed movements were always congruent and the final outcome was always obtained, but both were shown at different latencies. Additionally, in our study, manipulation of both movement and goal took place simultaneously, while in the study by David et al. manipulation of movement and goal could not occur in the same trial. Finally, in the study by David et al., participants were asked to explicitly express if they or someone else produced the action observed in the virtual scenario, while in our study SoA was assessed through judgments of correspondence (i.e., judgments about the synchronicity between the executed and the observed action).

Our results are in line with those reported by Caspar et al. (2016). In their study, binding between action and outcome was higher for congruent than for incongruent tones only if the robot moved the same finger used by the participant. Therefore, similarly to our findings, Caspar et al. reported that information about the execution of the movement and the outcome of the action contribute to SoA. Importantly, we expand their findings by adding that movement information may contribute to SoA for a more extended temporal window than goal information.

Salience of the goal

One possible limitation of our study concerns the seemingly low influence of the goal, which was time-limited as compared to the extended influence of movement manipulation. This unexpected result may be due to the fact that achieving (or missing) the goal was not associated with any relevant consequence for the participant (e.g., in the form of a monetary gain/loss), which might have reduced the “salience” of the goal. Notably, we deliberately selected a “neutral” goal to measure its contribution to SoA to avoid any emotional or rewarding effect associated with a salient outcome. This procedure may have reduced the influence of the goal on SoA compared to the other components of the action we manipulated here, i.e., movement and time. However, our procedure allowed us to compare our findings with those published by other groups. A neutral outcome, for example, was employed in the original version of the intentional binding paradigm (Haggard and Clark 2003; Haggard et al. 2002), and in the experiment by Metcalfe and Greene (2007), where targets simply disappeared after a hit. Moreover, a neutral outcome was also employed in more recent studies using procedures similar to the ones proposed here (Caspar et al. 2016; David et al. 2016).

Was then the goal we employed too neutral to the point that participants did not attend to it and therefore did not notice when the virtual finger pressed the wrong target? The idea that participants did not pay attention to the target while executing the task is very unlikely. In fact, locating the target color and pressing the correct button in the response box was fundamental for the correct execution of the task. Given the low number of incorrect responses with respect to the total amount of trials, we argue that participants could successfully identify the location of the target most of the time, and respond appropriately.

In support of this, the analysis of the amount of time participants took to provide a Synchrony Judgment (i.e., Judgment Time) revealed an important interaction between factors Movement and Goal. This interaction shows that when participants observed a fully congruent action (M+G+) they were faster in providing an SJ compared to all other types of feedback, which were associated with longer JTs and did not differ from one another. Thus, after erroneous feedback participants noticed a discrepancy between the executed and the observed action, which led them to wait longer to respond to the SJ question. This may be similar to the behavioral adjustments that occur after erroneous responses (e.g., post-error slowing; Rabbitt 1966) as reported in studies on performance monitoring (see Danielmeier and Ullsperger 2011; Ullsperger et al. 2014 for extensive reviews). Interestingly, this was also true for M+G−, where the observed movement was congruent and the goal was missed. If the goal was truly irrelevant, JTs for M+G− should not differ from JTs for M+G+ or should be at least lower than M−G−. However, this was not what we found. Indeed, our data support the idea that participants actually noticed when the goal was not attained. SJs at 0 ms delay were higher when the goal was achieved compared to when it was missed (and tended to be higher when the delay was equal to 75 ms), suggesting that participants recognized an unexpected change in the observed outcome. For all of these reasons, we believe that both movement and goal manipulations were salient for the participants and that both modulated SoA.

Conclusion

To explore how different components of actions modulate SoA, we devised a novel paradigm where the congruency between the expected and observed movement and the success in attaining the goal can be simultaneously manipulated. Previous investigations of SoA tended to focus on specific features of action (either movement execution or goal achievement). However, the actions we perform every day involve both: we use our bodies to achieve desired goals.

By combining the manipulation of movement and goal information within the same study, we confirm that they are both relevant for SoA, as previously reported. However, we expand current knowledge by showing that the former may be more constant than the latter in influencing SoA.

We suggest that the advantage of the paradigm presented here is that it allows a straightforward comparison of the contribution of different sub-components of action (e.g., movement, goal and time) to SoA (Sidarus et al. 2017b). The paradigm could be easily combined with other known measures of SoA, such as intentional binding (Haggard and Clark 2003; Haggard et al. 2002), to better specify the conditions under which this central feature of the Self is experienced, and, at times, lost.

Importantly, the paradigm could also help clarify which aspects of action monitoring are involved in conditions associated with an impairment of SoA, such as schizophrenia, utilization behavior, the alien-hand syndrome (Moore and Fletcher 2012) or obsessive–compulsive disorder (Gentsch et al. 2012).