Introduction

It is well known that grasping movements are sensitive to the features of their target object including its location, size, shape, and orientation (Jeannerod 1984; Jakobson and Goodale 1991; Jeannerod et al. 1995), confirming the importance of visual processing in the guidance of such actions. Object features are delivered by the visual system to the motor system prior to movement initiation and used in a feedforward manner such that movement kinematics are scaled to object features very early in the trajectory (Jeannerod 1984); furthermore, visual processing during the action can be incorporated in a feedback manner to adjust the movement as necessary (e.g., Goodale et al. 1986).

Oftentimes action plans must take into consideration the features of objects that are not the immediate target of a movement (Castiello 1999). In one such context, action plans might require sequential movements to multiple objects to complete a complex goal such as grasping a hammer and then driving a nail. At other times, it might be necessary to avoid an object that lies in the path of the optimal movement trajectory. In both of these situations, it makes sense for the motor system to incorporate the visual features of nontarget objects into the action plan in order to improve the accuracy and/or efficiency of movement; indeed, considerable evidence shows that grasping movements are influenced by the visual features of objects that are not the immediate target of action in sequential tasks (e.g., Henry and Rogers 1960; Hesse and Deubel 2010; Rosenbaum et al. 1990) and in obstacle avoidance tasks (e.g., Schindler et al. 2004).

In the case of sequential action tasks, it has been known for quite some time that the latency to initiate the first movement in a series increases with the complexity of the entire movement sequence (e.g., Henry and Rogers 1960; Khan et al. 2008). Specifically, changing the index of difficulty (ID) associated with the second movement (target size and movement amplitude) can influence the kinematics of the first movement (Rand et al. 1997). The authors demonstrated that as the ID of the second segment increased, the movement time of the first segment also increased (prolonged time to peak velocity and prolonged deceleration time) when performing an elbow extension–extension movement sequence. Similar findings were also shown for an elbow extension–flexion movement sequence, with the addition of a lowered peak velocity for the first movement as the ID of the second segment increased.

This evidence is consistent with the idea that actors are preparing an integrated movement plan that incorporates some, or all, of the features of the entire movement sequence before initiating the first component. Rosenbaum (e.g., Rosenbaum et al. 1990) has provided more direct support for this speculation by showing that actors anticipate the comfort of the final action in a sequence, accepting an initially awkward posture in earlier movement components in exchange for a more comfortable posture at the end of the sequence. In a possibly related finding, Hesse and Deubel (2010) have shown that the orientation of the second object in a sequential task can impact the orientation of the hand when grasping the first object, but this effect disappeared when increased spatial accuracy was required in the first movement.

It is not entirely clear, however, that Hesse and Deubel’s (2010) observations are due to mechanisms associated with intentional planning of an action sequence (“holistic planning processes”). In their task, participants were to grasp a cylinder across its circular axis, move it to a target location, and then grasp a rectangular object along its long axis; the orientation of the rectangular object was varied randomly from trial to trial. Arguably, participants could adopt a hand posture on the cylinder that anticipated the orientation of the second object in order to simplify the transition between the two actions; this was possible because the first object afforded all possible hand orientations, and it was economical because the orientation of the hand at the end of the first movement could be conserved en route to the second object. Alternatively, it is possible that the influence of the second object on the first action did not arise from intentional motor planning but rather from the mere act of paying attention to the second object regardless of any conscious intention to prepare an economical sequence of movements.

There have been many demonstrations that attention to nontarget objects can influence the kinematics of a primary action even though this creates no obvious advantage to the participant (Tipper et al. 1998). In the context of simple reaching movements, it has been shown that the timing (Tipper et al. 1992) and trajectory (Neyedli and Welsh 2012) of the reach can be affected by distractors that are not relevant to the task in any way and that do not have the potential to physically interfere with the movement. Effects like this have been interpreted within the context of an action-centered model of visuomotor processing, in which it is proposed that allocating attention to a distracting visual stimulus—whether voluntarily or involuntarily—can lead to the automatic planning of a movement to that stimulus that competes with the primary movement plan via spatial averaging or perhaps response inhibition (Welsh and Elliott 2004).

In the context of grasping, Castiello (1996) showed that the size of the grip used to pick up a piece of fruit was affected by the size of an adjacent piece of fruit if participants were required to count the number of times a light flickered on its surface before or during the primary action; this interference did not occur when the associated attention task was removed from the distractor fruit. Castiello argued that paying attention to an object elicits automatic action preparation, which can compete with a primary movement through an averaging process, much like Tipper proposed for reaching movements but in the context of intrinsic characteristics of the object (size) rather than the extrinsic feature of location.

Based on the observations from these distractor interference studies, it is possible that Hesse and Deubel’s result reflects a response averaging process arising from merely paying attention to the second object in the series rather than intentional movement integration. This is consistent with the elimination of the effect in the “spatially demanding” task context; here, the first task may have consumed all of the participant’s attentional resources, precluding attention to the second object and thus eliminating unintentional movement preparation. The purpose of the present study is to determine the role that attentional interference might play in sequential action tasks, by eliminating any incentive for participants to intentionally integrate movement plans to the first and second object in a sequential task and by varying the level and type of attention allocated to the second object.

We used a variation of Hesse and Deubel’s sequential action task in which the target objects varied in size rather than orientation, so that there was no obvious incentive for participants to strategically incorporate the second object’s features into the first action; the hand must always close to the size of the first object before grasping the second one, so opening the hand wider or narrower in anticipation of the second object’s size confers no obvious strategic benefit. Additionally, we varied the task required for the second object in the sequence so that effects could be compared when attention was deployed to the second object without explicitly requiring a grasping action. If interference in a primary grasping action arises from paying attention to another object—via averaging of the primary motor plan with the unintentional motor plan elicited by the second object—then interference should occur whenever the second object is associated with a specific task (action planning or perceptual judgment) but not when it is ignored.

Method

Participants

Eleven undergraduate students (2 males) at Dalhousie University participated in the current study in exchange for partial course credit. All were right-handed, had normal or corrected-to-normal vision, and no history of neurological deficit as ascertained by self-report. Each participant provided informed written consent prior to participation in accordance with guidelines established by the Dalhousie University Research Ethics Board.

Materials

For each trial of the experiment, two objects were presented simultaneously; the more proximal object was the stimulus for the first task, and the more distal object was the stimulus for the second task. A single white object (5 cm × 5 cm) was used throughout the experiment as the stimulus for the first task. Five rectangular wooden objects (all white) served as the variable stimuli for the second task. One object had the same dimensions as the object used in the first task (5 cm × 5 cm). The other objects were either larger in width than the first object (6 cm × 4.17 cm; 7 cm × 3.57 cm) or smaller in width (4 cm × 6.25 cm; 3 cm × 8.30 cm). Object dimensions were chosen to ensure equivalent surface area (25 cm²), thereby ensuring equivalent amounts of reflected light so that width was not correlated with overall brightness (Fig. 1). All objects had a height of 1 cm. The objects were placed on a table that was covered with a black cloth. The first object was always placed 10 cm away from the starting area along the midsagittal axis, and the second object was always 10 cm away from the first object, also along the midsagittal axis (Fig. 2). Marked zones were located 10 cm to the left of each object’s location; participants placed the object in its zone if the task required a grasping action (“grasp the object and move it to the adjacent placement area”).
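The constant-area constraint on the second-task objects can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the dimensions are simply those reported in the text. It multiplies each reported width by its length and also computes the length that would give exactly 25 cm² for each width.

```python
# Widths and lengths (cm) of the five second-task objects, as reported in the text.
OBJECTS_CM = [(3.0, 8.30), (4.0, 6.25), (5.0, 5.00), (6.0, 4.17), (7.0, 3.57)]
TARGET_AREA_CM2 = 25.0


def surface_area(width_cm, length_cm):
    """Top-surface area of a rectangular object."""
    return width_cm * length_cm


def length_for_width(width_cm, area_cm2=TARGET_AREA_CM2):
    """Length that yields the target area for a given width."""
    return area_cm2 / width_cm


areas = [surface_area(w, l) for w, l in OBJECTS_CM]

for (w, l), a in zip(OBJECTS_CM, areas):
    exact = length_for_width(w)
    print(f"{w:.0f} x {l:.2f} cm -> {a:.2f} cm^2 (exact length: {exact:.2f} cm)")
```

With the reported dimensions, every object falls within about 0.1 cm² of the 25 cm² target; the largest deviation is for the 3 cm object, whose exact constant-area length would be about 8.33 cm.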

Fig. 1

Visual stimulus set. The first object was always 5 × 5 cm, whereas the second object ranged in width from 3 to 7 cm, with the length negatively covarying to ensure a constant surface area of 25 cm² for all objects

Fig. 2

Stimulus layout. The starting switch was aligned with the participant’s midsagittal axis as were the first and second objects, each separated by 10 cm. “X” marks the placement area for each object

An Optotrak 3020 system (Northern Digital, Waterloo, ON, Canada) was used to record, at 200 Hz, the three-dimensional locations of infrared-emitting diodes (IREDs) placed on the distal phalanx of the thumb, the lateral surface of the distal phalanx of the index finger, and the styloid process of the radius of the right upper limb. Participants wore liquid-crystal occlusion glasses (PLATO, Translucent Technologies, Toronto, ON, Canada) to block visual input during the experiment as indicated below. A tone was presented as the signal for participants to initiate the first action (800 Hz; 250 ms).

Procedure

Participants were seated in front of a table during all experimental trials. Prior to the experimental trials, participants were shown a 15-cm ruler to familiarize them with the verbal size judgments required during the study. Each participant also performed several practice trials (one for each condition) to ensure they understood the requirements for each type of condition.

Participants were instructed to depress a release button using their pinched right index finger and thumb at the start of each trial. The LCD glasses were opaque at the start of each trial while the experimenter positioned the target objects. Prior to initiating the trial, the experimenter verbally announced the task required for the second object (“judge,” “grasp,” or “do nothing”), which varied randomly trial by trial. The task for the first object was always to grasp and move it to the marked placement region. Approximately 1.5 s after the task announcement the glasses turned transparent to reveal the environment, and the start tone was presented a further 500–1500 ms later (possible delays were 500, 750, 1000, 1250, or 1500 ms, with equal distribution and randomized trial by trial).

For the first task, which always involved grasping the 5 × 5 cm first object, participants were instructed to use their index finger and thumb to grasp the object along the “front-to-back” axis “as quickly and accurately as possible” and to place it in the marked location before beginning the second task.

For the second task, in the perception condition participants were instructed to verbally indicate the width of the second object “along the front-to-back axis” in centimeters. In the action condition, participants were instructed to grasp the second object “along the front-to-back axis” using the index finger and thumb “as quickly and accurately as possible” and then move it to the marked location. In the ignore condition, participants were told that the task was complete once the first object had been placed in its proper location. The LCD glasses returned to an opaque state 5000 ms after the initiation tone, such that vision was available during the entire task but occluded at the end of each trial before the stimuli for the next trial were arranged.

Participants performed 50 trials in each of the perception and action conditions (ten repetitions of each of the five sizes of the second object) and 20 trials in the control (“ignore”) condition (four repetitions of each of the five sizes of the second object). Trial types were randomly intermixed. Thus, each participant performed a total of 120 trials.
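The trial structure described above can be sketched as a randomized trial list. This is a minimal illustration, not the software used in the study: the names `build_trial_list`, `REPETITIONS`, and the dictionary keys are hypothetical, and balancing the five start-tone delays across all 120 trials is an assumption about what "equal distribution" means.

```python
import random
from collections import Counter

SIZES_CM = [3, 4, 5, 6, 7]                       # widths of the second object
REPETITIONS = {"perception": 10, "action": 10, "ignore": 4}
FOREPERIODS_MS = [500, 750, 1000, 1250, 1500]    # delay before the start tone


def build_trial_list(seed=None):
    """Return a shuffled list of trials with condition, size, and foreperiod."""
    rng = random.Random(seed)
    trials = [
        {"condition": condition, "size_cm": size}
        for condition, reps in REPETITIONS.items()
        for size in SIZES_CM
        for _ in range(reps)
    ]
    # Assumption: each foreperiod is used equally often over the whole session.
    foreperiods = FOREPERIODS_MS * (len(trials) // len(FOREPERIODS_MS))
    rng.shuffle(foreperiods)
    for trial, foreperiod in zip(trials, foreperiods):
        trial["foreperiod_ms"] = foreperiod
    rng.shuffle(trials)                          # intermix trial types
    return trials


trials = build_trial_list(seed=1)
print(len(trials), Counter(t["condition"] for t in trials))
```

Counting the entries confirms the totals given in the text: 50 perception, 50 action, and 20 control trials, for 120 trials per participant.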

Data collection

Offline, a custom Python routine was used to extract movement kinematics from the raw 3D data collected during the experiment. Measures extracted from the primary action (the movement to the first object) included peak grip aperture (PGA; the maximum distance between the index finger and thumb achieved during the movement), reaction time (RT; the time from the onset of the auditory go signal until the velocity of the IRED on the wrist exceeded 30 mm/s for 5 consecutive time samples), movement time (MT; the time from when the wrist IRED exceeded 30 mm/s for 5 consecutive time samples until it dropped below 30 mm/s for 5 consecutive time samples), and peak hand speed (PHS; the maximum speed calculated from the wrist IRED). Interactive routines enabled the experimenter to ensure the automated algorithms chose the appropriate values in cases of missing IRED positions. All dependent measures were analyzed within participants, and trials were rejected if any of the measures fell beyond ±3 standard deviations of the individual participant’s mean for that measure (2.2 % of trials were rejected from data analyses).
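The threshold rules described above can be sketched in a few functions. This is not the authors' routine, which is not reproduced in the text; the function names, the forward-difference speed estimate, and the choice to search for PGA between movement onset and offset are all assumptions made for illustration. Positions are taken to be in millimetres, sampled at 200 Hz.

```python
import math
import statistics


def speeds_mm_s(positions_mm, fs_hz=200.0):
    """Speed between consecutive 3D samples (forward difference, an assumption)."""
    return [math.dist(a, b) * fs_hz for a, b in zip(positions_mm, positions_mm[1:])]


def first_run(flags, run_len, start=0):
    """Index where the first run of run_len consecutive True flags begins."""
    count = 0
    for i in range(start, len(flags)):
        count = count + 1 if flags[i] else 0
        if count == run_len:
            return i - run_len + 1
    return None


def extract_measures(wrist, thumb, index, go_sample=0, fs_hz=200.0,
                     threshold_mm_s=30.0, run_len=5):
    """Return RT, MT, PHS, and PGA for one trial, or None if no movement is found."""
    speed = speeds_mm_s(wrist, fs_hz)
    onset = first_run([s > threshold_mm_s for s in speed], run_len, start=go_sample)
    if onset is None:
        return None
    offset = first_run([s < threshold_mm_s for s in speed], run_len, start=onset)
    if offset is None:
        return None
    aperture = [math.dist(t, i) for t, i in zip(thumb, index)]
    ms_per_sample = 1000.0 / fs_hz
    return {
        "RT_ms": (onset - go_sample) * ms_per_sample,
        "MT_ms": (offset - onset) * ms_per_sample,
        "PHS_mm_s": max(speed[onset:offset]),
        "PGA_mm": max(aperture[onset:offset + 1]),
    }


def within_three_sd(values, n_sd=3.0):
    """Flag values lying within n_sd sample standard deviations of the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [abs(v - mean) <= n_sd * sd for v in values]
```

In use, `extract_measures` would be called once per trial with the three marker trajectories, and `within_three_sd` would be applied to each measure separately for each participant, mirroring the ±3 SD rejection rule stated in the text.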

Each dependent measure was analyzed using a 2 (condition: action vs. perception) × 3 (object 2 size: 3, 5, 7 cm object) fully repeated measures ANOVA (α = 0.05). Note that while 5 sizes for the second object were included in the experiment, this was simply to guard against the possibility that participants could easily categorize the object as “larger” or “smaller” than the first object; only the 3, 5, and 7 cm sizes were analyzed statistically in order to increase power. Omnibus analyses did not include the control condition because fewer trials were included in this condition compared to the others. The effect of the second object’s size in the control condition was analyzed using a one-way repeated measures ANOVA (α = 0.05). Sphericity was evaluated using Mauchly’s test (α = 0.05), and the Greenhouse–Geisser correction was used where indicated (adjusted df are reported).
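For readers who want to see where the reported degrees of freedom and MSE values come from, the one-way repeated measures ANOVA used for the control condition can be sketched as follows. This is an illustrative implementation only: the analysis software used in the study is not stated, the function name is hypothetical, and the sketch covers neither the 2 × 3 factorial analysis, nor p values, nor the Greenhouse–Geisser correction.

```python
def one_way_rm_anova(data):
    """One-way repeated measures ANOVA.

    data: one row per participant, one column per level of the factor
    (e.g., mean PGA for the 3, 5, and 7 cm second object).
    Returns F, its degrees of freedom, and the error mean square (MSE).
    """
    n = len(data)            # participants
    k = len(data[0])         # levels of the within-participant factor
    grand_mean = sum(sum(row) for row in data) / (n * k)
    level_means = [sum(row[j] for row in data) / n for j in range(k)]
    subject_means = [sum(row) / k for row in data]

    ss_total = sum((x - grand_mean) ** 2 for row in data for x in row)
    ss_levels = n * sum((m - grand_mean) ** 2 for m in level_means)
    ss_subjects = k * sum((m - grand_mean) ** 2 for m in subject_means)
    ss_error = ss_total - ss_levels - ss_subjects

    df_levels = k - 1
    df_error = (k - 1) * (n - 1)
    ms_levels = ss_levels / df_levels
    ms_error = ss_error / df_error
    return {"F": ms_levels / ms_error, "df": (df_levels, df_error), "MSE": ms_error}


# Made-up numbers for three participants, purely to show the call.
example = one_way_rm_anova([[1.0, 2.0, 4.0], [2.0, 3.0, 4.0], [3.0, 5.0, 6.0]])
print(example["df"], round(example["F"], 2))
```

With eleven participants and three object sizes, the degrees of freedom are (2, 20), which matches the F (2, 20) values reported in the Results.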

Results

A summary of all kinematic measures can be found in Table 1.

Table 1 Mean (SD) of all dependent measures for each condition relative to the size of the second object

Peak grip aperture

The results revealed a significant interaction between condition and the size of the second object, F (2, 20) = 3.76, p = 0.04, MSE = 1.56, so analyses of simple effects were pursued. A significant simple effect of size was found for the perception condition, F (2, 20) = 4.96, p = 0.02, MSE = 2.5, but not for the action condition, F (2, 20) = 0.91, p > 0.05, MSE = 1.6. Specifically, in the perception condition participants grasped the first object with a larger PGA when the size of the second object was larger than the first object (i.e., 7 cm) compared to the same-sized (i.e., 5 cm) and smaller (i.e., 3 cm) second objects, which did not differ significantly (see Fig. 3). No main effect of the second object’s size was observed in the control condition, F (2, 20) = 1.35, p > 0.05, MSE = 6.91.

Fig. 3

Mean peak grip aperture performing a grasping action to the first object in relation to the size of the second object. Error bars indicate SEM. “*” denotes a statistically significant difference (p < 0.05)

Reaction time

A main effect of condition was found, F (1, 10) = 26.49, p < 0.001, MSE = 1492.2; participants were faster in the action condition (M = 413 ms) compared to the perception condition (M = 462 ms). No significant interaction was found between condition and object 2 size, F (2, 20) = 0.84, p > 0.05, MSE = 907.3. A main effect of size was found for the control condition, F (2, 20) = 4.47, p = 0.02, MSE = 3045.3. Specifically, participants were faster as the size of the second object increased (3 cm: M = 452 ms; 5 cm: M = 428 ms; 7 cm: M = 383 ms).

Movement time

No significant interaction between condition and the size of the second object was found for MT, F (2, 20) = 2.83, p > 0.05, MSE = 8784.85, and no main effect of size was found for MT in the control condition, F (2, 20) = 0.76, p > 0.05, MSE = 681.52.

Peak hand speed

No significant interaction between condition and size was found for PHS, F (2, 20) = 0.32, p > 0.05, MSE = 520.98, and no main effect of size was found for PHS in the control condition, F (2, 20) = 0.35, p > 0.05, MSE = 1135.1.

Time to PGA

No significant interaction between condition and size was found for time to PGA (TPGA), F (2, 20) = 0.27, p > 0.05, MSE = 365.3, and no main effect of size was found for TPGA in the control condition, F (2, 20) = 0.56, p > 0.05, MSE = 489.1.

Discussion

The purpose of this study was to test the hypothesis—motivated by Castiello (1996)—that interactions among the elements of a sequential action task might be due to the deployment of attention to the second object in the sequence regardless of strategic action intentions. If so, then participants’ peak grip aperture when grasping the first object should be positively correlated with the size of the second object in the perception and action conditions—which presumably engage attention to the second object—but not the “ignore” condition.

The results demonstrate a rather surprising pattern that did not completely support this prediction. The size of the second object influenced grip size in the perception condition, in which participants indicated the size of the second object after grasping the first object, but not in the action condition, in which participants grasped both objects in sequence. As expected, no effect of the second object was observed in the “ignore” condition. More specifically, in the perception condition participants’ peak grip aperture when grasping the first object was significantly larger when the second object was 7 cm than when it was 5 or 3 cm, which did not differ; the direction of this relationship is consistent with an averaging process involving the sizes of the first and second objects. It is noteworthy that an effect of the second object was observed only when that object was larger than the primary target and not when it was smaller. This is not particularly surprising, however, because an increase in grip aperture on the approach to the first object would not jeopardize the success of that action, whereas a decrease could lead to an insufficient grasp size and perhaps action failure.

Drawing on action-centered models of attention (e.g., Castiello 1996; Tipper et al. 1992), which propose that interference arises from an averaging of competing motor plans to attended objects, interference was predicted in both the action and perception conditions because both encourage attention to the second object in the sequence. The lack of interference in the action condition was particularly surprising because the preparation of two movement plans should lead to robust interference, given that this is the presumed mechanism of interference. Consequently, it is not clear why interference was observed in the perception condition.

Action preparation and attention in sequential tasks

Based on the literature on sequential action tasks, it was assumed of the “action condition” that participants would pay attention to both objects—and indeed prepare actions to both—before beginning the first action (Henry and Rogers 1960; Hesse and Deubel 2010; Khan et al. 2010). Accordingly, the absence of interference in the action condition implies that neither attention to the second object nor the preparation of an associated action is sufficient to create interference in the primary action. By extension, then, the interference observed in the perception condition cannot be attributed to attention or action preparation and instead must be due to the demands of making an explicit perceptual judgment.

Such a conclusion—that interference seen in an action might arise from explicit perceptual mechanisms—would be quite surprising given the considerable evidence supporting the anatomical and functional segregation of the visual systems that mediate object perception and object-directed action (e.g., Goodale and Westwood 2004). Framed within the two-visual-system hypothesis, this interpretation suggests that processing in the ventral stream, associated with judging the size of the second object, affects processing in the dorsal stream, associated with the primary action task. Moreover, within this framework the lack of interference observed in the action condition would imply that two action plans generated within the dorsal stream do not interfere with each other.

While it is tempting to ascribe all action-related processing to the dorsal visual stream, it is possible that the primary action in this task was instead mediated by the ventral visual stream. After all, the size of the first object did not vary during the experiment, so actions to it could be generated from memory rather than current visual input. Considerable evidence suggests that the control of memory-guided actions draws more heavily on the ventral stream than dorsal stream (e.g., Singhal et al. 2013; Westwood and Goodale 2003). According to this line of reasoning, the interference observed in the perceptual condition could be due to competition within the ventral visual stream between a memory-guided action (i.e., the first grasping task) and a perceptual judgment (i.e., the second perceptual task). By extension, the absence of interference in the action condition could reflect the absence of competition because the first action task and second action task are mediated by the ventral and dorsal streams, respectively.

It also remains possible that during the perception condition participants’ attentional system engaged in a statistical summary representation of the stimulus set. Specifically, it has been shown that when performing a perceptual task with a set of similar items, participants are more accurate at judging the mean size of the set than at identifying a specific member of the set (Ariely 2001). In the case of this experiment, it could be that during the perception condition participants averaged the overall size of the set (combining the two objects). Thus, when the second object was larger it increased the overall size of the set and led participants to reach out with a larger PGA to the first object. However, because the action condition did not require any perceptual judgment, the representation via statistical summary was not engaged.

Sequential action tasks might not engage attention and/or action planning to the second object

Despite the evidence discussed earlier that supports the advance preparation of both components of a two-stage action (implying both attention to the second object, and action preparation prior to starting the primary action), it is possible that participants did not pay attention to the second object in the action condition until the first action was completed. After all, the second object remained visible throughout the task and, by design, there was no strategic movement advantage to be gained by adjusting the first movement to anticipate the size of the second object. As such, participants may have simply decided to ignore the second object until the first action was completed and then engage movement preparation at that time. Of course, one could make a similar argument about the perceptual condition, but an interference effect was nevertheless observed in that case.

The reaction time results are partially consistent with the possibility that participants did not pay attention to the second object in the action condition, as reaction times were shorter in this condition compared to the perception and “ignore” conditions even though the initial action was identical in all three cases. The increased latency for the perception condition could indicate attention to, and processing of, the second object before commencing the first action, but if so then it is not clear why reaction time was similar for the “ignore” condition in which no attention was required to the second object and neither was interference observed.

According to this line of reasoning, the pattern of interference observed might be due to the level of attention paid to the second object in the sequence, which would support the predictions derived from action-centered models of attention. At the same time, this conclusion would also challenge the assumption that interactions between elements of an action sequence are necessarily due to holistic action planning and might instead arise simply from paying attention to objects that are not the current movement target.

The results of our present study have potential implications for studies that make use of pictorial illusions to study action and perception, because such visual displays incorporate multiple objects that could compete for attention and/or action planning, leading to response averaging in some circumstances. This possibility could be studied in future research by comparing conditions in which attention is explicitly directed to the nontarget elements in the visual display to see whether this changes the sensitivity of the action to the illusion.

Implications for previous studies

The study was motivated, in part, by Castiello’s influential (1996) study, which looked at grasping a fruit while paying attention to an unrelated, adjacent fruit. However, our study differs from Castiello’s in several obvious (e.g., sequential vs. parallel tasks) but also subtle ways.

Castiello’s study employed well-learned objects (fruit) as targets and distractors in contrast to the simple rectangles used here. Fruits are perceptually recognizable by many features other than just their size (e.g., shape and color), and indeed, the actions associated with them might not require the engagement of visuomotor systems, given that an associated grasping posture could be produced from long-term memory (e.g., Yoon et al. 2002). Perhaps the interference observed in Castiello’s study has little to do with visuomotor systems and more to do with confusion within long-term representations of familiar objects and their associated motor representations. Furthermore, qualitatively different grip types are associated with very small fruits like cherries (i.e., precision grasp) compared to an elongated fruit like a banana (i.e., power grasp), so interference between differently sized fruits may have little to do with averaging two grasping plans with different intended grip sizes and more to do with the joint activation of distinctly different grasping postures.

In favor of the possibility that action interference might be different for well-learned objects like fruits, compared to indistinct, unfamiliar objects like rectangles of varying sizes, it has been shown that the grasping of fruit can be affected by tasting different fruit flavors prior to the action. Specifically, peak grip aperture was greater when participants reached out and grasped a small fruit (e.g., strawberry) after a sip of juice with a “large” flavor (e.g., orange) than after a sip of juice with a “small” flavor (e.g., almond) (Parma et al. 2011). This observation implies that the interference does not arise from competition between visuoperceptual and visuomotor processes so much as from processes that connect object representations stored in long-term memory (with multisensory inputs) to action plans also stored in long-term memory (e.g., Pavese and Buxbaum 2002; Riddoch et al. 1998).

The second study that motivated this investigation was that of Hesse and Deubel (2010). Unlike their sequential action task, our task was designed to eliminate the incentive and indeed opportunity for participants to adjust their first movement strategically in anticipation of the characteristics of the second object (i.e., holistic action planning), allowing us to focus on the role that attention and interference might play in sequential tasks. The results of the study indicate that either attention to, or explicit perception of, the second object can induce interference in the primary action task. Thus, while it remains possible that Hesse and Deubel’s results arose from holistic action planning, our results indicate that a similar result could occur via interference arising from merely paying attention to the second object in the sequence.

Conclusion

The results indicate that interference can occur in a primary grasping task from an object that is the target of a subsequent perceptual judgment task. Interference was not observed when the subsequent task was another grasping action, or when the second object was ignored. Taken together, the results indicate that actions can be affected by paying attention to an object that is not the current target and that interactions between elements of a movement sequence do not necessarily imply holistic action planning.