Introduction

Distributed practice of a continuous motor skill leads to better performance of that skill than does massed practice (e.g., Baddeley and Longman 1978; Lee and Genovese 1989; Taub and Goldberg 1973; Whitley 1970; for comprehensive reviews see Donovan and Radosevich 1999 and Lee and Genovese 1988). In particular, distributing practice sessions across multiple days rather than within a single day is of substantial benefit to learning and performance (e.g., Shea et al. 2000). Recent research suggests a mechanism for this advantage: distributing practice across days may allow for the consolidation of memory over periods of either sleep or wake (e.g., Cohen et al. 2005; Korman et al. 2003; Walker et al. 2003; Plihal and Born 1997). Indeed, a number of investigators have demonstrated off-line performance improvements in explicitly learned motor skills (i.e., those associated with awareness) after delays in training that included a period of sleep (Robertson et al. 2004; Walker et al. 2003) and off-line performance improvement in implicitly learned motor skills (i.e., those not associated with awareness) over periods of either sleep or wake (Robertson et al. 2004). Our goal here was to determine whether the benefit of distributing practice sessions across days on a visuomotor adaptation task—mirror-tracing—depends on the amount of initial within-session training on that task.

Performance improvements associated with between-session delays that include a period of sleep have been demonstrated for procedural tasks (see Footnote 1) as diverse as repetition priming (e.g., Hauptmann and Karni 2002), visual discrimination (e.g., Karni et al. 1994; Stickgold et al. 2000), mirror reading (Ofen-Noy et al. 2003), motor and oculomotor sequence learning (e.g., Albouy et al. 2006; Fischer et al. 2002; Savion-Lemieux and Penhune 2005; Walker et al. 2002) and visuomotor adaptation (e.g., Goedert and Willingham 2002 Exp 2; Plihal and Born 1997; Tamaki et al. 2007). For a review and theory see Walker (2005). Notably, however, not all investigations of procedural skill learning have found off-line performance improvements. The extent of off-line performance improvements may depend on the amount of training that an individual has had with the task (Hauptmann and Karni 2002; Hauptmann et al. 2005), the complexity of the task (Kuriyama et al. 2004), and the nature of the task itself.

Work with different procedural tasks yields conflicting results on how the amount of within-session practice relates to between-session performance improvements. For the case of repetition priming, training beyond the point of saturation or “leveling off” of performance has been crucial for the observation of off-line performance improvements (Hauptmann and Karni 2002; Hauptmann et al. 2005). Thus, greater within-session training has led to greater off-line performance improvements in priming. For the case of explicit sequence learning, however, greater practice is not associated with greater off-line performance improvements. Subjects learning an explicit sequence of response locations experienced similar off-line performance improvements in speed and accuracy regardless of whether they initially received 12 or 24 trials of practice (Walker et al. 2003). Similarly, subjects learning an explicit timed movement sequence experienced off-line performance improvements in accuracy and timing regardless of whether they initially received 12, 36, or 72 trials of practice, but only those groups receiving either 12 or 36 trials experienced off-line performance improvements in the stability of their responses (Savion-Lemieux and Penhune 2005).

A more precise examination of the learning of specific components of an explicit response sequence suggests that the least-developed components of the skill will undergo the most off-line consolidation (Kuriyama et al. 2004). These researchers manipulated the complexity of the sequencing task and observed that all complexity levels of a sequenced finger-tapping task underwent off-line performance improvements, but the learning of a 9-unit bimanual sequence led to significantly greater improvements than learning a 5-unit bimanual, 5-unit unimanual, or 9-unit unimanual sequence. Furthermore, the authors observed that those finger transitions that proved most difficult during training (i.e., the slowest transitions) demonstrated the most off-line improvement. In contrast to the work with repetition priming (Hauptmann and Karni 2002; Hauptmann et al. 2005), this work with explicit sequence learning (Kuriyama et al. 2004) suggests that less off-line improvement will occur for better-learned tasks or task components.

Although researchers have found off-line performance improvements for visuomotor adaptations, including prism adaptation (Goedert and Willingham 2002 Exp 2) and computer-induced visuomotor transformations (Plihal and Born 1997; Tamaki et al. 2007), the relation between the amount of within-session practice and off-line performance improvements for visuomotor adaptation has yet to be explored. Most germane to the current study, researchers found sleep-related off-line performance improvements in a visuomotor transformation after six within-session trials, each with different mirror-rotated images (Plihal and Born 1997) or with different 90°-rotated images (Tamaki et al. 2007). Neither study manipulated the amount of training, nor investigated repeated practice on the same image, which would allow the learning of specific movements and not just learning of the transformation.

Differences in the relation between the amount of within-session practice and the observation of off-line performance improvements for different procedural tasks may result from the tasks’ recruitment of different cognitive and neural systems (e.g., Cohen and Robertson 2007). Learning of a movement sequence is associated with increased activity in primary motor (M1), premotor, supplementary motor, and inferior parietal cortices (Bischoff-Grethe et al. 2004). Moreover, movement-specific learning within the sequence learning task can be further dissociated from goal learning in the task, with the movement-specific learning recruiting M1 and goal learning recruiting premotor and inferior parietal cortices (Hikosaka et al. 2002; Grafton et al. 1998). In contrast, the performance benefit in repetition priming is associated with decreased neural activity in prefrontal, occipital, and temporal cortices (e.g., Lin and Ryan 2007; Ryan and Schnyer 2007; Wagner et al. 2000). Finally, the learning of a new visuomotor transformation, controlling for the recruitment of neural systems for error detection and correction, is associated with increased activity in premotor and posterior parietal cortices (Clower et al. 1996; Ghilardi et al. 2000; Krakauer et al. 2004).

In light of these behavioral and neural differences, the current study addressed two questions. First, is performance of a visuomotor adaptation task (mirror-tracing) affected by the point in training at which a 24-h between-session delay is introduced? Second, does the extent of off-line performance improvement on the mirror-tracing task depend on the amount of within-session practice with that task? All subjects completed ten trials of a mirror-tracing task across a period of 2 days. Subjects were assigned to one of three training conditions in which they received either one, three, or seven mirror-reversed training trials on Day 1. Subjects returned 24 h later, on Day 2, to finish the remainder of their mirror-reversed trials such that all groups had completed ten trials by the end of the two training sessions. Given repeated presentations of the same figure, this mirror-tracing task allows for the development of movement-specific performance improvements in addition to performance improvements associated with the learning of the mirror transformation. Thus, we expected the relation between the amount of practice and off-line performance improvement for this mirror-tracing task to be more similar to that of explicit sequence learning (Kuriyama et al. 2004) than to that of repetition priming (Hauptmann et al. 2005). In particular, we anticipated that those subjects who had more practice with the mirror-tracing task on Day 1 would be less likely to demonstrate off-line performance improvements.

Methods

Participants

Seventy-eight (62 female) subjects aged 18–32 (M = 19.2, SD = 2.5) participated in partial fulfillment of a course requirement. An equal number of subjects were randomly assigned to each of the three training conditions such that n = 26 in each condition. All subjects had normal or corrected-to-normal vision and all were right-handed as assessed by the Edinburgh Handedness Inventory (Oldfield 1971). All subjects gave informed consent in accordance with the Declaration of Helsinki.

Apparatus and procedure

The tracing task required subjects to draw a path within the lines of the running-man form depicted in Fig. 1 as quickly and as accurately as possible. Subjects sat at a desk in front of a 15-in. computer monitor controlled by an iMac G3. The running-man form, when drawn on the monitor, was 6 in. high and 5.75 in. wide. The distance between the seated subject and the monitor was approximately 30 in. Subjects performed the tracing task with a single-button mouse placed on a 2 ft × 2 ft mousepad positioned to the right of the monitor. Throughout the task, the monitor resolution was set to 800 × 600 and the mouse speed was set to slow. The gain between mouse movement and cursor movement was unitary across all locations. At the beginning of each trial, the experimenter centered the mouse on the mousepad and positioned the subject’s cursor on a red dot at the head of the figure (i.e., the black dot at the head of the figure in Fig. 1). Subjects initiated the trial by clicking the dot and used the mouse to trace the figure whilst looking only at the computer screen and not at their hand, clicking the dot once again to complete the trial and stop the time clock. Subjects’ tracing paths were represented in real-time on the computer screen by a red line. The black stray line on Fig. 1 is an example of this feedback line. All subjects completed a total of 14 trials with the tracing task, with a 30 s break between each trial. On ten trials a mirror transformation was applied to the relation between mouse movements and feedback on the computer screen. On both Day 1 and Day 2, all subjects performed one non-reversed trial immediately before and immediately after their series of mirror-reversed trials.
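The trial structure described above can be laid out explicitly. The following is a minimal sketch, not the software used in the study; the function name `build_schedule` and the trial labels are hypothetical and introduced only for illustration.

```python
def build_schedule(day1_mirror_trials, total_mirror_trials=10):
    """Return the ordered (day, trial_type) pairs for one subject.

    Assumes the structure described in the text: on each day, one
    non-reversed trial immediately before and immediately after that
    day's series of mirror-reversed trials.
    """
    if not 0 < day1_mirror_trials < total_mirror_trials:
        raise ValueError("Day 1 mirror trials must be between 1 and total - 1")
    day2_mirror_trials = total_mirror_trials - day1_mirror_trials
    schedule = []
    for day, n_mirror in ((1, day1_mirror_trials), (2, day2_mirror_trials)):
        schedule.append((day, "non-reversed"))
        schedule.extend((day, "mirror") for _ in range(n_mirror))
        schedule.append((day, "non-reversed"))
    return schedule


# The three training conditions: one, three, or seven mirror trials on Day 1.
for condition in (1, 3, 7):
    trials = build_schedule(condition)
    n_mirror = sum(1 for _, kind in trials if kind == "mirror")
    print(condition, len(trials), n_mirror)
```

Under these assumptions every condition yields 14 trials in total, ten of them mirror-reversed, which matches the counts given in the text.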

Fig. 1 Running-man figure traced on each trial

Results

We assessed both speed and accuracy of subjects’ mirror tracing performance. Speed was measured as the time subjects took to complete tracing the figure in seconds (movement time; MT) and accuracy was measured as the number of times the subject’s tracing path deviated outside the lines of the figure (as coded by two independent raters). These MT and error measures are depicted in Fig. 2 and in Table 1. A mixed-model multivariate analysis of variance (MANOVA) with trial and training condition as factors revealed a main effect of trial for both measures [F(9, 67) = 42.9, P < 0.001 for speed and F(9, 67) = 30.53, P < 0.001 for accuracy], a main effect of training condition only for errors, F(2, 75) = 3.77, P < 0.05, and no interaction. Although both speed and accuracy improved across mirror-reversed trials, the number of Day 1 training trials influenced only accuracy. Tukey’s HSD post-hoc analyses on the effect of training condition revealed that, overall, the group that received one trial of training on Day 1 had significantly fewer errors (M = 7.2, SE = 1.1) than the group that received seven trials of training on Day 1 (M = 11.3, SE = 1.1). This result must be interpreted with caution, however, given the speed-accuracy tradeoff we observed, as described below.

Fig. 2 Raw dependent measures of (a) movement time in s and (b) number of errors, depicted as a function of training trial and training group. MT, movement time. Error bars are 1 SE

Table 1 Average raw error and raw movement time (MT) across mirror-reversed trials for each training condition

A set of power polynomial contrasts fit separately for accuracy and MT for each of the three training conditions revealed that the highest order polynomial to fit MT (Fig. 2a) for the one-trial training group was a cubic component, F(1, 25) = 9.10, P < 0.01, and for the three- and seven-trial training groups it was a quadratic component, F(1, 25) = 35.50, P < 0.01 and F(1, 25) = 14.31, P < 0.01, respectively. The highest order polynomial to fit accuracy (Fig. 2b) for the one-trial training group was a fifth-order polynomial, F(1, 25) = 5.50, P < 0.05, and for the three- and seven-trial training groups it was cubic, F(1, 25) = 18.95, P < 0.01 and F(1, 25) = 10.35, P < 0.05, respectively.

Despite improvements on both performance measures, we observed a speed-accuracy tradeoff: individuals’ average number of errors across the ten trials negatively correlated with their average MT (r = −0.28, P < 0.01). Similar negative correlations between MT and errors were observed within each of the ten trials. Because of this speed-accuracy tradeoff, we created a composite measure of performance that was a modified product of the error and MT measures:

$$ {\text{MT}}/{\text{error composite}} = {\text{movement time}} \times (1 + {\text{number of errors}}) $$
(1)

Figure 3 depicts covariate-adjusted mean performance on the MT/error composite as a function of trial and training condition. We addressed the primary hypotheses of the current study using this composite measure.
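As a worked illustration of Eq. 1, the sketch below computes the composite for a few hypothetical trials. The function name and the input values are invented for illustration and are not data from the study.

```python
def mt_error_composite(movement_time_s, n_errors):
    """Eq. 1: movement time multiplied by (1 + number of errors)."""
    if movement_time_s < 0 or n_errors < 0:
        raise ValueError("movement time and error count must be non-negative")
    return movement_time_s * (1 + n_errors)


# Hypothetical trials (not data from the study).
fast_but_error_prone = mt_error_composite(40.0, 9)   # 40 s, 9 errors
slow_but_accurate = mt_error_composite(80.0, 4)      # 80 s, 4 errors
error_free = mt_error_composite(60.0, 0)             # 60 s, no errors

print(fast_but_error_prone, slow_but_accurate, error_free)
# prints: 400.0 400.0 60.0
```

Lower composite scores indicate better performance. Adding 1 to the error count means that an error-free trial scores its movement time rather than zero, and the two hypothetical strategies above (fast with many errors, slow with few) receive the same score.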

Fig. 3 Covariate-adjusted means for the movement time (MT)/error composite score as a function of trial and training condition. Legend indicates the number of mirror-reversed trials experienced on Day 1 prior to a 24-h between-session delay. Arrows indicate trials immediately preceded by that delay. Error bars are 1 SE

Timing of delay and overall performance

Our first research question was whether the point in training at which a between-session delay is introduced would affect performance on the mirror-tracing task. The answer to this question is an unequivocal “yes.” As can be seen in Fig. 3, groups receiving the 24-h delay early in training (after one or three trials) experienced a boost in performance relative to the group receiving the 24-h delay later in training (after seven trials). In particular, the performance of the one-trial group diverges from that of the other two groups at trial 2, immediately following that group’s 24-h delay. Similarly, performance of the three-trial group diverges from that of the seven-trial group at trial 4, immediately following the three-trial group’s 24-h delay. The seven-trial group, however, does not appear to benefit from the 24-h delay.

The impressions conveyed by the figure were confirmed by a mixed model MANCOVA and a series of planned comparisons on the composite scores. Although performance on trial one was similar for all three training conditions, F(2, 77) = 1.2, P = 0.30, we used the composite performance from trial one as a covariate in all subsequent analyses to improve statistical power. The MANCOVA with trial as the within-groups factor and training condition as the between-groups factor revealed a main effect of trial, F(9, 66) = 3.49, P < 0.01, a main effect of training condition, F(2, 74) = 6.23, P < 0.01, and a trial by condition interaction, F(18, 134) = 1.96, P < 0.05. On trial 2, performance of the one-trial group was significantly better than that of the three- and seven-trial groups, F(1, 74) = 8.77, P < 0.01, which did not differ from each other, F = 1.5, P = 0.23. On trial 4, performance of the three-trial group was significantly better than that of the seven-trial group, F(1, 74) = 6.21, P < 0.05, but did not differ from that of the one-trial group, F = 1.2, P = 0.27. On trial 8, rather than performing significantly better than the other groups after the 24-h delay, the seven-trial group performed significantly worse than the one- and three-trial groups, F(1, 74) = 15.9, P < 0.001. Similarly, at the end of training, on trial 10, both the one- and three-trial training groups performed significantly better than the seven-trial group, F(1, 74) = 10.05, P < 0.01, but did not differ from each other, F < 1. Thus, despite all groups receiving equivalent training with the mirror-tracing task, those groups that experienced a 24-h between-session delay early in practice achieved better performance at the end of training than the group that experienced the same delay late in practice.

Timing of delay and off-line performance improvements

Our second research question was whether the extent of off-line performance improvement depends on the amount of within-session practice. To answer this question we created three savings scores from the composite measure: savings from trials one to two, from trials three to four and from trials seven to eight. Each savings score was computed as follows:

$$ {\text{Savings}} = \left[ {\frac{{({\text{original}} - {\text{subsequent}})}}{{{\text{original}}}}} \right] \times 100 $$
(2)
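To make Eq. 2 concrete, the sketch below computes a savings score from two composite scores. The function name and the numbers are hypothetical and serve only as an illustration.

```python
def savings(original, subsequent):
    """Eq. 2: percentage change from the original to the subsequent score."""
    if original == 0:
        raise ValueError("original score must be non-zero")
    return (original - subsequent) / original * 100


# Hypothetical composite scores (not data from the study).
print(savings(200.0, 150.0))  # prints: 25.0
print(savings(200.0, 250.0))  # prints: -25.0
```

Because lower composite scores indicate better performance, a positive savings score reflects improvement from one trial to the next and a negative score reflects a decline.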

We defined off-line improvement as performance benefits associated with the 24-h delay beyond that experienced between trials within a single session. Using composite performance on trial one as a covariate, we performed a series of planned comparisons to test for off-line improvement in each of the three training groups. The covariate-adjusted savings scores appear in Fig. 4. Results of these planned comparisons were consistent with the hypothesis that off-line performance improvements would be observed early, but not late, in training. The one-trial training group, which experienced the 24-h delay between trials 1 and 2, experienced significantly more savings between these trials than did the three- or seven-trial training groups, F(1, 74) = 6.94, P < 0.05, which did not differ from each other, F < 1. Similarly, the three-trial training group, which experienced the 24-h delay between trials 3 and 4, experienced significantly more savings between these trials than did the one- and seven-trial training groups, F(1, 74) = 4.47, P < 0.05. Conversely, the savings of the seven-trial group between trials 7 and 8 was similar to that of the one- and three-trial groups, F < 1. However, none of the groups demonstrated savings between trials 7 and 8.

Perhaps most importantly, the lack of savings between trials 7 and 8 for the seven-trial group is not likely due to a floor effect because the performance of that group continued to improve across trials 7 through 10 [as is apparent in Fig. 3 and as confirmed by a MANCOVA comparing trial 7 to trial 10, F(1, 24) = 4.84, P < 0.05]. Although this continued performance improvement for the seven-trial group was observed in the composite measure, this effect was largely carried by continued improvements in MT for this group: separate MANCOVAs comparing trial 7 to trial 10 for the seven-trial group revealed a significant improvement in MT, F(1, 24) = 7.75, P = 0.01, but not in accuracy, F < 1 (see Fig. 2a, b). Overall, off-line performance improvements were observed in the groups that received either one or three trials of training prior to the 24-h between-session delay, but not in the group that received seven trials of training prior to the 24-h delay.

Fig. 4 Savings scores as a function of training condition and between-session break. Legend indicates the number of mirror-reversed trials experienced on Day 1 prior to a 24-h between-session delay. Error bars are 1 SE

Timing of delay and systematic differences in strategy

Although overall we observed a speed-accuracy tradeoff across trials, the different training groups may have adopted different thresholds for balancing speed and accuracy, perhaps favoring accuracy over speed or conversely favoring speed over accuracy (see Footnote 2). We assessed whether there were systematic differences in the speed-accuracy tradeoff adopted by subjects in the three training groups by first calculating the mean number of errors and the mean MT for each participant separately across the mirror-reversed trials that person experienced prior to and after the 24-h between-session delay. We then converted each of these measures to z scores. For each participant we created a speed-accuracy bias measure by subtracting her corresponding z score for error from her z score for MT. Negative values on this measure indicate that a person sacrificed accuracy in favor of speed and positive values on this measure indicate a person sacrificed speed in favor of accuracy. Values close to zero indicate a balance of speed and accuracy.
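The bias measure just described can be sketched as follows. This is an illustration under stated assumptions rather than the analysis code used in the study: the text does not say over which set of scores the z scores were standardized or which standard deviation was used, so the sketch standardizes across whatever list of participants it is given and uses the sample standard deviation. The function names and data are hypothetical.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize values using the sample mean and sample SD (an assumption)."""
    m = mean(values)
    s = stdev(values)
    return [(v - m) / s for v in values]


def speed_accuracy_bias(mean_mts, mean_errors):
    """Per-participant bias: z score for MT minus z score for errors.

    Negative values: accuracy sacrificed for speed (fast, many errors).
    Positive values: speed sacrificed for accuracy (slow, few errors).
    """
    z_mt = z_scores(mean_mts)
    z_err = z_scores(mean_errors)
    return [zm - ze for zm, ze in zip(z_mt, z_err)]


# Hypothetical per-participant means (not data from the study).
mts = [40.0, 60.0, 80.0, 60.0]     # mean movement time in s
errors = [12.0, 8.0, 4.0, 8.0]     # mean number of errors
bias = speed_accuracy_bias(mts, errors)
print([round(b, 2) for b in bias])
```

In this toy example the first participant is fast but error-prone and receives a negative score, the third is slow but accurate and receives a positive score, and the two participants at the group mean on both measures score zero.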

We subjected this speed-accuracy bias measure to a mixed MANOVA with training condition as a between-groups factor and time of trials (before vs. after delay) as a within-groups factor. The analysis revealed an interaction between training condition and time of trials, F(2, 75) = 4.50, P = 0.01, and no other effects (F = 2.9, P = 0.06 and F < 1 for the main effects of training condition and time of trial, respectively). As is apparent in Fig. 5, none of the groups demonstrated a speed-accuracy tradeoff prior to their 24-h between-session delay: separate single-sample t tests revealed that the speed-accuracy bias measure did not differ from zero prior to the delay in any of the conditions, Ps > 0.44. After the delay, however, the one-trial group demonstrated a significant accuracy bias, t(25) = 2.09, P < 0.05, while the seven-trial group demonstrated a significant speed bias, t(25) = −2.61, P < 0.05, and the three-trial group continued to demonstrate no tradeoff, P = 0.56. Given the differences in the speed-accuracy tradeoff adopted by the groups after their between-session delay, it is possible that manipulating the timing of the between-session delay induced participants to adopt different strategies for performing the task that ultimately led to performance differences among the groups.
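For readers unfamiliar with the single-sample t test used above, the sketch below shows how the statistic and its degrees of freedom are obtained when testing a set of bias scores against zero. The function name and values are hypothetical; obtaining a P value would additionally require the t distribution, which is omitted here.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(values, mu0=0.0):
    """Return (t, df) for a single-sample t test of the mean against mu0."""
    n = len(values)
    standard_error = stdev(values) / sqrt(n)
    t = (mean(values) - mu0) / standard_error
    return t, n - 1


# Hypothetical bias scores for four participants (not data from the study).
t, df = one_sample_t([0.5, 1.0, 1.5, 1.0])
print(round(t, 3), df)
```

With 26 participants per group, the degrees of freedom are 26 − 1 = 25, which is why the group-level tests are reported as t(25).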

Fig. 5 Measure of speed-accuracy bias as a function of training group. Negative values indicate performance favoring speed while sacrificing accuracy and positive values indicate performance favoring accuracy while sacrificing speed. Dark bars represent the average of trials occurring before the 24-h between-session delay and the light bars represent the average of those occurring after. Error bars are 1 SE

Non-reversed tracing trials

Although performance on the mirror-reversed training trials is critical for assessing our primary hypotheses, it is possible that the different training structures produced differences on the non-reversed tracing trials as well. For example, more training with the mirror-reversal could lead to an increase in error and MT on the non-reversed trials. As with the mirror-reversed trials, we observed a speed-accuracy tradeoff across the four non-reversed trials (r = −0.33, P < 0.01), with similar correlations observed within each trial. As such, we created a movement time/error composite score for the non-reversed trials using Eq. 1, as previously applied to the mirror-reversed trials. Four separate one-way ANOVAs with training condition as the between-groups factor were performed on each of the non-reversed trials. As would be expected, the performance of the three training groups did not differ on the first trial, at which point all groups had been treated identically, F(2, 77) = 1.0, P = 0.15. Nor did the performance of the groups differ on the non-reversed trials performed either before or after the mirror-reversed trials on Day 2, Fs < 1. However, as might be expected, the performance of the groups on the final non-reversed Day 1 trial varied as a function of the amount of mirror-reversed training they experienced, F(2, 77) = 5.0, P < 0.01. Tukey’s HSD post hoc analysis revealed that the performance of the one-trial group (M = 52.8, SD = 73.4) was significantly better than that of the seven-trial group (M = 113.7, SD = 87.7), with the three-trial group demonstrating intermediate performance (M = 71.5, SD = 88.7). Thus, more training with the mirror-reversal on Day 1 led to greater error on the non-reversed trial completed immediately after that training.

Discussion

The goal of the current study was to determine whether the extent of off-line performance improvement in a mirror-tracing task would be related to the amount of practice subjects had with that task prior to a 24-h delay. We found that one or three trials of within-session practice led to off-line performance improvements, but that seven trials of within-session practice did not. These off-line performance improvements for the one- and three-trial groups exceeded improvements expected from trial to trial within a single learning session, as is necessary for the demonstration of “off-line” improvement (see Krakauer and Shadmehr 2006). Moreover, after the tenth and final training trial, subjects in the one- and three-trial groups demonstrated better performance on the mirror-tracing task than did the seven-trial group, despite the equivalent amounts of training experienced by all three groups. These results are consistent with the superiority of distributed over massed practice for skills with a motor component (Donovan and Radosevich 1999) and suggest that at least for the case of mirror-tracing, introducing a between-session delay early in training benefits performance more than introducing a between-session delay later in training.

Our lack of off-line improvement in the seven-trial group is consistent with the work on explicit sequence learning demonstrating that well-learned components of a skill do not exhibit off-line improvement (Kuriyama et al. 2004). These results do not imply, however, that performance of a skill must be at asymptote before one fails to see off-line improvements. First, it is unclear in the movement sequencing study whether subjects were at asymptote in their reaction times to the easiest finger transitions (i.e., those that were fastest in the first session of training and did not exhibit off-line improvement). Second, in the current experiment, even though the subjects in the seven-trial group had not hit asymptote in their performance by trial seven, as evidenced by the continued practice-related improvements on the second day of training, these subjects failed to demonstrate off-line improvements.

Our results are inconsistent with studies investigating the effects of continued practice on off-line improvements in repetition priming (Hauptmann and Karni 2002; Hauptmann et al. 2005). Unlike repetition priming, the visuomotor task employed in the current study did not require saturation of performance to observe off-line performance improvements. Indeed, greater within-session practice impeded off-line performance improvements. The explicit movement sequence tasks (Kuriyama et al. 2004; Walker et al. 2003) and the mirror-tracing task employed here both allowed for movement-specific learning, whereas the repetition priming task affords only perceptual learning. As such, it is unsurprising to find greater consistency between the sequence learning and mirror-tracing tasks than between these tasks and repetition priming.

A number of previous studies have found off-line performance improvements specifically associated with a period of sleep but not with a corresponding period of wake (e.g., Fischer et al. 2002; Walker et al. 2003). Although all of the 24-h delays in our study included a night of sleep, we did not systematically investigate the effects of sleep. Therefore, it is possible that the mechanism of off-line performance improvement observed in the current study is either sleep-related or merely delay-related. Indeed, a single task may have multiple components, some of which consolidate over periods of wake and others of which consolidate over periods of sleep. For example, the movement-specific aspects of a sequence consolidate over a period of wake, whereas the goal-specific aspects of a sequence consolidate over a period of sleep (Cohen et al. 2005; Cohen and Robertson 2007).

A final observation is that the different training structures employed in the current study induced between-group differences in speed-accuracy tradeoff. Although none of the groups demonstrated a speed-accuracy tradeoff prior to their 24-h between-session delay, the one-trial and seven-trial groups demonstrated speed-accuracy tradeoffs in opposite directions after the delay, with the seven-trial group favoring speed over accuracy and the one-trial group favoring accuracy over speed. Previous research has found the adoption of a speed-accuracy tradeoff on a visuomotor transformation task only when subjects were able to adopt a direction-reversal strategy (Cunningham 1989). It is possible that after their initial experience with the task, subjects in our study adopted a different strategy for performing the task and that these strategy differences underlie the differences in performance improvement. Alternatively, it may be that speed and accuracy aspects of the mirror-tracing task are susceptible to either on-line or off-line learning at different stages of training. Assessing the viability of this latter alternative would require further experimentation.

An additional question stems from this initial investigation. Would we observe a similar relation between the amount of within-session practice and off-line performance improvement for transformation-specific aspects of the mirror-tracing task? In the current study, repeated exposure to the same mirror-transformed figure meant that both movement-specific learning and transformation-specific learning could benefit performance. Manipulating the exact figures presented on Day 2 could tease apart these two potential sources of performance improvement. Given repeated exposure to the same stimulus, it is possible that subjects in our experiment primarily learned movement-specific information that would be context dependent and fail to transfer to other figures. Clower and Boussaoud (2000) have shown that computer-induced transformations, as opposed to prism-induced transformations, result in learning that is context-specific, reflecting what they call visuomotor skill acquisition rather than perceptual recalibration. Nevertheless, learning of transformation information must be possible on mirror-tracing tasks given that others have found practice with one figure to transfer when the same transformation is applied to a different figure (e.g., Plihal and Born 1997). Thus, it would be of interest to assess the relation between within-session practice and off-line performance improvement for the transformation-specific component of the task.

In summary, our results suggest that too much within-session practice on a mirror-tracing task may actually be detrimental to performance on that task. After performing the same number of training trials on a mirror-reversed form, those groups that experienced a 24-h between-session delay early in training demonstrated off-line performance benefits and overall better performance at the end of training compared to the group experiencing the 24-h between-session delay late in training. Collectively, these results suggest that early distribution of training across days is optimal for learning a visuomotor skill.