Introduction

Learning of sequential movements, such as typing on a keyboard or driving, consists of explicit and implicit processes. Explicit knowledge may concern a sequence itself or a transfer rule between sequences. For improvement of sequential movements, explicit knowledge plays an important role (e.g., Benson et al. 2011). Explicit knowledge can be obtained in two ways: it can be detected spontaneously or provided by others. In the former, individuals spontaneously notice reportable rules while learning a new sequence or transferring a learned sequence. In the latter, an individual is instructed about the sequence itself or the rule by another person prior to learning or transfer.

Regardless of the method used to obtain explicit knowledge, a likely consensus is that explicit knowledge contributes to improvements in early phases of learning a sequence, while in some paradigms, such as rotary pursuit or mirror tracing, explicit knowledge of the transfer rules does not appear to contribute (Gabrieli et al. 1993; Heindel et al. 1989). For example, explicit knowledge by spontaneous detection can lead to faster sensorimotor adaptation (Werner and Bock 2007) and faster performance of sequential learning (e.g., Willingham et al. 2002). Similarly, explicit knowledge by instruction or strategy can lead to faster adaptation to walking on a split-belt treadmill (Malone and Bastian 2010), fewer errors with sensorimotor adaptation (Benson et al. 2011), and faster performance of sequential learning (Stefaniak et al. 2008). However, only a few studies have examined the effects of the manner in which learners obtain explicit knowledge on sequential performance (e.g., Curran and Keele 1993), while findings regarding the effects of spontaneous discovery and explicit instruction on learning have been a source of controversy in the field of education and learning science (e.g., Klahr and Nigam 2004; Dean and Kuhn 2006).

In a study of sequential learning, Curran and Keele (1993) asked participants to perform a serial reaction time task (Nissen and Bullemer 1987). In this task, participants observe visual stimuli successively presented at one of four or six horizontally aligned locations and respond with spatially compatible keys, as quickly and accurately as possible. A predetermined sequence, composed of 6–12 key presses, was repeated in the majority of trial blocks, while in a few trial blocks a randomly generated sequence was presented. Participants were assigned to one of the following three groups: intentional, more aware, and less aware groups. In the intentional group, the participants were instructed regarding the repeated sequence prior to starting the task. The participants who became aware of the repeated sequence without the explicit instruction were assigned to the more aware group, and those who were not aware of the repeated sequence were assigned to the less aware group. Curran and Keele (1993) found that the difference between reaction times to the repeated and random sequences (i.e., learning effect) was the largest in the intentional group, followed by the more aware group, with the smallest difference being observed in the less aware group. Next, a secondary task was introduced, following these single-task trial blocks, in which participants listened to a high- or low-pitch tone presented between each of the visual stimuli and were required to report the number of high-pitch tones at the end of each block. Interestingly, the learning effects did not differ among the three groups; the sequential representations already acquired in the single-task trial blocks could not be expressed in the dual-task trial blocks. In the dual-system model of sequence learning (Keele et al. 2003; see also Abrahamse et al. 2010), a set of unidimensional modules detects and utilizes all available regularity within sequence information, and a multidimensional module allows sequence learning across types of information. According to this model, the system isolated by dual-task conditions corresponds to a unidimensional system and learning within this system is implicit. Additional learning under single-task conditions corresponds to a multidimensional system, with explicit learning depending on this system (see also Grafton et al. 1995; Hazeltine et al. 1997). The latter system can also learn associations within the single dimension of single-task learning, regardless of the capacity of the multidimensional system, unless interference from a secondary task occurs. Thus, the explicit knowledge of the sequence obtained by instruction itself had a more favorable effect on the serial reaction time task than that observed with spontaneous discovery, which indicates that the instruction of the sequence rule contributes to development of the multidimensional system earlier. However, the benefit vanished when a secondary task, requiring cognitive or attentional resources, was introduced.

Even less is known about the effect of explicit instruction on transfer of transformation (e.g., Watanabe et al. 2006; Tanaka and Watanabe 2014a, b). For example, if spatial button configurations of a learning sequence were horizontally mirrored in a transfer session (e.g., Tanaka and Watanabe 2014b), and participants received an explicit instruction of the transformation rule (not the transformed sequence), participants would be required to transform the learning sequence. Given that this transformation requires additional cognitive processes, benefits of explicit instruction on transfer could be abolished, according to the dual-system model (Keele et al. 2003).

In the present study, we were interested in whether benefits of explicit knowledge on transfer of a visuomotor sequence could be observed, and whether the manner in which learners explicitly obtained a transfer rule would modulate their performance during transfer. For this, we employed a sequential button press task, called the m × n task (e.g., Hikosaka et al. 1995, 1996, 2002; Tanaka and Watanabe 2013, 2014a, b, 2016). The experimental device consisted of 16 light-emitting diode (LED) buttons mounted in a 4 × 4 matrix. Experimental trials consisted of a triad (m), comprising three sequential button presses ([1] [2] [3]), and a sequence, comprising seven consecutive triads (n; i.e., 3 × 7 task). In this task, participants were required to learn the sequence via trial and error. Following completion of the learning sequence, they were asked to perform another sequence in which the order of each triad was reversed ([3] [2] [1]; Tanaka and Watanabe 2014a). Before the transfer session, we classified participants into either the Non-instruction or the Instruction group. For the Non-instruction group, we did not provide the participants with any information regarding the reversal rule prior to the transfer session; therefore, some participants spontaneously noticed the reversal rule during the transfer session and some did not (i.e., Aware and Unaware groups). For the Instruction group, we informed the participants about the reversal rule prior to the transfer session with the second sequence. The Aware and Instruction groups were similar in that both obtained explicit knowledge of the rule; however, underlying differences between the groups may be present. Hence, we mainly focused on differences in performance between the Aware and Instruction groups. 
As both the Aware and Instruction groups were presumed to obtain the same explicit knowledge of the rule, we were interested in whether speed, not the number of error trials (i.e., accuracy), differed between the Aware and Instruction groups. Note that, in our study, the explicit knowledge concerned a transfer rule (e.g., the reversal rule between the sequences) rather than the sequence itself (e.g., Curran and Keele 1993). Here, we considered that the acquisition of explicit knowledge of a transfer rule likely corresponds to the secondary task in Curran and Keele (1993), because in the transfer session the participants mainly tried to decipher a given sequence by transforming the acquired sequence. Similar to the results of Curran and Keele (1993), one prediction was that the explicit instruction of a transfer rule would be beneficial, even in the transfer session, if it spares cognitive load. However, other possibilities could not be ignored. Specifically, instruction might not have a greater effect than that of spontaneous detection of the rule, or the explicit instruction could even have a detrimental effect on transfer, if it requires a larger cognitive load to transfer the acquired sequence.

Methods

Participants

Sixty-three individuals (38 men, 25 women; 18–28 years old; 57 right-handed, according to self-reports) participated in the study. All participants had normal or corrected-to-normal visual acuity, normal motor function, and were naïve to the purpose of this study. The experiment was approved by the institutional review board of The University of Tokyo and was conducted in accordance with the ethical standards in the 1964 Declaration of Helsinki. All participants provided informed consent prior to commencement of the experiment.

m × n task

Stimuli and procedure

We adopted an experimental paradigm similar to those used in previous studies (e.g., Hikosaka et al. 1995, 1996, 1999, 2002; Sakai et al. 1998, 2003, 2004; Tanaka and Watanabe 2013, 2014a, b, 2016; Watanabe et al. 2006, 2010). The experimental device consisted of 16 LED buttons, mounted in a 4 × 4 matrix, and another LED button (called the “home key”) positioned at the bottom of the device (Fig. 1a). The LED buttons were square in shape (10 × 10 mm) and positioned 8 mm apart.

Fig. 1

Experimental paradigms in the present study. a Experimental flow of the m × n task. Note that the numbers shown on the buttons (1, 2, and 3) were not displayed during experiments. b Illustration of the visuospatial working memory task. The sample arrays consisted of colored squares. Note that K, G, R, Y, B, and P indicate the black, green, red, yellow, blue, and pink colors of the squares during the experiment. In this example, the yellow square in the sample array was replaced with the violet square in the different test array.

When the home key was pressed for 500 ms, three buttons (i.e., a triad) were illuminated simultaneously. Seven triads were presented in a fixed order, which we simply called a “sequence.” Participants were asked to press the three illuminated buttons in a predetermined order with the index finger of their dominant hand. That is, they were required to uncover the correct order via trial and error. If button presses were successful, the LEDs deactivated, one by one, and the subsequent triad was activated. Participants were then required to discover the correct order of that triad in the same way. When a button press was incorrect, all LEDs were briefly illuminated, and the trial was regarded as an error trial. Then, participants were required to restart the sequence from the beginning, using the home key (i.e., repetition of the first triad). A trial was only regarded as successful when participants completed a sequence (i.e., seven triads) without error. The same sequence was repeated until participants completed it successfully for 20 cumulative trials. Participants were asked to perform the sequence as quickly and accurately as possible after they started a trial (i.e., pressed the home key).

We prepared two types of sequence: “Original” and “Reversed.” The Original sequence was generated randomly for each participant. In the Reversed sequence, the order of button presses was reversed from that used in the Original sequence (i.e., the spatial configurations of the triads were not changed). Tanaka and Watanabe (2014a) adopted the Reversed sequence in the transfer session and found that 50% of participants noticed the reversed rule; therefore, we expected that this sequence would likely lead to a balanced proportion regarding awareness of the transfer rule.

All participants performed a learning session using the Original sequence and, after a 5-min break, a transfer session using the Reversed sequence. Before commencement of the transfer session, participants were assigned to either the Non-instruction (n = 41) or the Instruction group (n = 22). Importantly, the Non-instruction group was not provided with any information regarding the reversal rule and was instructed that a new sequence would be randomly generated. By contrast, the Instruction group was informed of the reversal rule before the transfer session.

Interview

Following completion of the transfer session, the Non-instruction group was asked whether they had noticed anything peculiar during the transfer session. If participants reported the reversal rule, they were assigned to the Aware group. In order to determine whether the remaining participants did not report the rule despite noticing it, an experimenter explained the rule and asked them whether they had become aware of the rule while performing the transfer session. Those who reported that they had become aware of the rule were also assigned to the Aware group, and the remaining participants were assigned to the Unaware group. This procedure was identical to that in Tanaka and Watanabe (2014a, b).

Various methods to distinguish explicit and implicit knowledge have been used. Ashby et al. (1998) suggested that implicit learning could be assumed as the result when participants could not verbalize what they had learned. Ziori and Dienes (2006, 2008) adopted subjective measures, based on confidence ratings. Importantly, the present experiment itself is an explicit learning paradigm, and we only focused on whether or not participants noticed the rule during the transfer session and could report it. Therefore, we defined explicit knowledge as participants being able to verbalize the transfer rule.

Visuospatial working memory task

In order to observe individual differences in cognitive performance between the Unaware, Aware, and Instruction groups, we adopted a computerized visuospatial working memory (VSWM) task used in a previous study (Bo et al. 2011, which was modified from Luck and Vogel 1997). Bo and Seidler (2009) showed that VSWM capacity could predict the rate of explicit motor sequence learning. Thus, if the Aware group had cognitive superiority to the Unaware group, the difference could be represented by VSWM capacity.

All stimulus arrays were presented within a 9.8° × 7.3° region on a CRT monitor with a gray background (9.3 cd/m²). Nine discriminable colored squares (0.65° × 0.65°) were prepared (red, orange, yellow, green, blue, violet, pink, white, and black). Each trial began with a central fixation cross (1.0° × 1.0°), followed by a sample array for 100 ms, a 900-ms blank screen delay, and a 2000-ms presentation of a test array. The arrays consisted of 2–8 (array size) colored squares which were randomly selected from the nine colors. For each trial, the test array was either the same as the sample array (50% of the trials) or different from the sample array (50% of the trials). Note that, in the different array, the color of only one square was changed from the sample array, and no color was repeated within a trial. Thus, participants needed to detect a change in color at different locations. Participants were then prompted to indicate whether the test array was the same as or different from the sample array by key press (same, left key; different, right key; Fig. 1b). Participants performed 252 trials in total: 7 array sizes × 2 types of array (same or different) × 18 iterations. Participants were seated approximately 60 cm from the monitor. Before the commencement of the experiment, the experimenter showed participants all colored squares simultaneously and confirmed that all participants could discriminate them clearly.

Data analysis

We counted the number of error trials committed prior to completing each successful trial (i.e., accuracy). For example, the number of errors in the first successful trial (e.g., Fig. 2a) indicates the total number of errors committed before the first successful trial. We also measured the time that elapsed from the moment the home key was pressed to the moment the third button of the final (7th) triad was pressed, in each successful trial (i.e., speed). Specifically, the performance time in successful trials was fit with a power function for each participant (cf. the power law of practice, Speelman and Kirsner 2005): $y = \alpha x^{\beta}$, where y is the performance time and x is the successful trial number. α represents the overall (and initial) speed, with a smaller value indicating faster initial performance time. β represents the learning efficiency, with larger negative values indicating higher learning efficiency. This fitting was used in a previous work utilizing the m × n task (Watanabe et al. 2010).
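As a rough illustration of this fit, the sketch below estimates α and β for the power function y = αx^β by ordinary least squares on log-transformed data. This is an assumption made for illustration: the text does not state which fitting procedure was used, and a nonlinear least-squares fit on the raw times would weight the data differently. The function name `fit_power_law` and the example times are hypothetical.

```python
import math

def fit_power_law(times):
    """Fit y = alpha * x**beta to performance times.

    x is the 1-based index of the successful trial. The fit is a
    straight-line regression of log(y) on log(x), which is only one
    possible way to estimate the parameters (assumed here).
    """
    xs = [math.log(i) for i in range(1, len(times) + 1)]
    ys = [math.log(t) for t in times]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx                           # slope in log-log space
    alpha = math.exp(mean_y - beta * mean_x)   # intercept mapped back to seconds
    return alpha, beta

# Hypothetical, noise-free times for 20 successful trials, generated from
# the group-level values reported for the learning session.
times = [15.54 * i ** -0.09 for i in range(1, 21)]
alpha, beta = fit_power_law(times)
```

With noise-free input the fit recovers the generating parameters; with real data, a smaller α corresponds to a faster initial performance time and a more negative β to a steeper speed-up across trials.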

Fig. 2

Performances of the m × n task in the learning session and relationships between accuracy and speed of the m × n task and working memory capacity (K). Error bars indicate standard errors of the mean. a Mean number of errors before the successful completion of each trial in the learning session. b Mean performance time for successful trials in the learning session. The data of averaged performance time were fit with the power function: $y = \alpha x^{\beta}$. c Relationship between total number of error trials in the m × n task and K. d Relationship between overall speed in the m × n task and K. e Relationship between learning efficiency in the m × n task and K

For the number of errors, we conducted two-way mixed ANOVAs with the 20 successful trial sections as a within-subject factor and the three experimental groups as a between-subject factor. For overall speed and learning efficiency, we conducted separate one-way ANOVAs with the three experimental groups as a between-subject factor. Post hoc tests were performed using Shaffer’s method where appropriate. Effect sizes ($\eta_p^2$) are reported for all ANOVAs. Difference-adjusted pooled confidence intervals for independent means were also adopted, and 95% confidence intervals are reported for the main results (Baguley 2012). VSWM capacity was calculated using the formula: K = size of the array × (observed hit rate − false alarm rate) (Vogel and Machizawa 2004). Then, the mean K across all array sizes was computed to represent the working memory capacity for each participant.
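The capacity formula can be written out as a short sketch. It assumes the usual change-detection convention, in which the hit rate is the proportion of "different" trials answered "different" and the false alarm rate is the proportion of "same" trials answered "different"; the function names and the example rates are hypothetical.

```python
def capacity_k(array_size, hit_rate, false_alarm_rate):
    """K = array size x (hit rate - false alarm rate) for one array size."""
    return array_size * (hit_rate - false_alarm_rate)

def mean_capacity(rates_by_size):
    """Average K over all array sizes for one participant.

    rates_by_size maps array size -> (hit_rate, false_alarm_rate).
    """
    ks = [capacity_k(size, hit, fa) for size, (hit, fa) in rates_by_size.items()]
    return sum(ks) / len(ks)

# Hypothetical participant: near-perfect at size 2, less accurate at size 4.
example = {2: (1.0, 0.0), 4: (0.75, 0.25)}
participant_k = mean_capacity(example)
```

In the actual task the dictionary would hold one entry for each of the seven array sizes (2 to 8), with rates computed from the 18 "same" and 18 "different" trials per size.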

Results

Of the 63 participants, five were excluded from the following analyses: three because their overall speed in the learning session was more than two standard deviations from the mean overall speed, and two because their overall speed in the transfer session was more than two standard deviations from the mean overall speed of their group (one had been assigned to the Unaware group and the other to the Aware group).

Learning session in the m × n task

A one-way ANOVA on the number of error trials in the learning session revealed a significant main effect of successful trial section [F(19, 1083) = 220.09, p < 0.0001, $\eta_p^2$ = 0.79; Fig. 2a]. Post hoc tests indicated that the mean number of errors was larger before the first successful trial (27.01 trials) than in the other sections (fewer than 1.68 trials, ps < 0.01). As for speed, the power-function fit ($y = \alpha x^{\beta}$) showed that overall speed (α) was 15.54 (p < 0.0001) and learning efficiency (β) was −0.09 (p < 0.0001; Fig. 2b). The rapid improvement of accuracy (i.e., the number of errors) and relatively slow improvement of speed (i.e., learning efficiency) reflected parallel learning of the sequence, meaning that individuals first acquire effector-independent representations and then effector-dependent representations (e.g., Hikosaka et al. 1995; Sakai et al. 1998, 2003; Tanaka and Watanabe 2013, 2014a, b).

Relationship between sequence learning and visuospatial working memory

We examined the relationship between sequence learning performance and VSWM capacity (i.e., K). First, Pearson’s correlation analysis for the total number of error trials in the learning session and K revealed a significant correlation [t(56) = 3.03, r = −0.37, p < 0.01; Fig. 2c], which indicates that participants with a larger VSWM capacity can perform a sequence with fewer errors. By contrast, Pearson’s correlation analyses for overall speed and K, and for learning efficiency and K, did not show significant correlations [overall speed and K, t(56) = 0.22, r = −0.029, p = 0.82; learning efficiency and K, t(56) = 1.27, r = −0.16, p = 0.20; Fig. 2d, e]. These results may suggest that participants with a larger VSWM capacity can acquire a given sequence earlier (e.g., Bo and Seidler 2009). Therefore, the present results, so far, could be assumed to replicate and verify the previous results in both visuomotor sequence learning and VSWM tasks.
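The t values reported alongside each r follow from the standard test of a Pearson correlation against zero, t = r·sqrt(df / (1 − r²)) with df = n − 2. The sketch below applies that textbook relation to the rounded r reported for errors and K; the function name is hypothetical, and the sample size of 58 is taken from the 63 participants minus the five exclusions.

```python
import math

def t_from_r(r, n):
    """t statistic for a Pearson correlation r with n - 2 degrees of freedom."""
    df = n - 2
    return r * math.sqrt(df / (1.0 - r * r))

# Rounded correlation between total learning-session errors and K.
t_errors = t_from_r(-0.37, 58)
```

Using the rounded r = −0.37 gives |t| of about 2.98, close to the reported t(56) = 3.03; the small gap is plausibly due to rounding of r.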

Comparison of the Unaware, Aware, and Instruction groups

Through the interview for the Non-instruction group, we found that 19 participants spontaneously discovered the reversal rule during the transfer session (i.e., Aware group) and 17 participants did not (i.e., Unaware group). Therefore, the Unaware, Aware, and Instruction groups included 17, 19, and 22 participants, respectively.

Working memory capacity

Mean VSWM capacity of the Unaware, Aware, and Instruction groups was 2.60 [95% CI (2.37 2.84)], 3.05 [95% CI (2.78 3.32)], and 2.90 [95% CI (2.66 3.14)], respectively. A one-way ANOVA showed a significant main effect of group [F(2, 55) = 3.38, p < 0.05, $\eta_p^2$ = 0.10; Fig. 3a]. Post hoc tests showed that the VSWM capacity was larger in the Aware group than that in the Unaware group (p < 0.05), but it was not significantly different between the Aware and Instruction groups (p = 0.38), or between the Unaware and Instruction groups (p = 0.079). Interestingly, this result suggests that participants with a larger VSWM capacity tended to notice the hidden relationship between learning (first) and transfer (second) sequences. Those with a larger VSWM capacity, who can perform a sequence in a shorter time, typically may display this larger capacity in the multidimensional system (Bo et al. 2011; Keele et al. 2003). Given this assumption, the multidimensional system in those with larger VSWM capacities was not fully occupied by the transfer task (i.e., trial-and-error processes), and accepted an entry about the relationship between the learning and transfer sequences (i.e., transformation rule), resulting in the awareness of the rule.

Fig. 3

VSWM capacity and performances in the learning session. Error bars show the standard error of the mean. All participants performed the Original sequence in the learning session. a Mean working memory capacity in each group. Points indicate the VSWM capacity in each participant. b Mean number of errors before the successful completion of each trial in the learning session. c Mean performance time for successful trials in the learning session. The data of averaged performance time were fit with the power function: $y = \alpha x^{\beta}$

Note that, in the present study, we classified the participants in the Non-instruction group into the Aware and Unaware groups through interview and found a significant difference in K. By contrast, the participants in the Instruction group were randomly recruited and were not sorted by any criteria; therefore, the VSWM capacity in the Instruction group should resemble a mixture of those in the Aware and Unaware groups. In other words, even though we did not find significant differences between the Aware and Instruction groups, or between the Unaware and Instruction groups, it is reasonable to assume that original cognitive ability, such as VSWM capacity, might be generally different among the three groups.

Learning session

Mean total number of error trials in the Unaware, Aware, and Instruction groups was 38.64 trials [95% CI (27.49 49.79)], 27.26 trials [95% CI (21.96 32.55)], and 31.40 trials [95% CI (23.95 38.86)], respectively. A two-way ANOVA on the number of error trials (3 experimental groups × 20 successful trials) did not show a significant main effect of experimental group [F(2, 55) = 2.08, p = 0.13, Fig. 3b]. However, the interaction between experimental group and successful trial section was significant [F(38, 1045) = 27.93, p < 0.0001, $\eta_p^2$ = 0.10]. Post hoc tests revealed that the number of error trials before the first successful trial was larger in the Unaware group than that in the Aware group [t(55) = 2.61, p < 0.05], while it was not significantly different between the Unaware and Instruction groups, or between the Aware and Instruction groups [ts(55) < 1.50, ps > 0.13]. This difference likely reflects the discrepancy in VSWM capacity (i.e., participants with a larger VSWM capacity could perform a given sequence with fewer errors, Fig. 3b).

Mean performance time in successful trials in the Unaware, Aware, and Instruction groups was 12.47 s [95% CI (11.57 13.38)], 12.28 s [95% CI (11.46 13.11)], and 13.19 s [95% CI (12.57 13.81)], respectively. Overall speed (α) in the Unaware, Aware, and Instruction groups was 15.48 [95% CI (14.18 16.78)], 15.65 [95% CI (14.03 17.27)], and 15.67 [95% CI (14.51 16.83)], respectively. Learning efficiency (β) in the Unaware, Aware, and Instruction groups was −0.10 [95% CI (−0.12 −0.076)], −0.11 [95% CI (−0.14 −0.075)], and −0.078 [95% CI (−0.10 −0.051)], respectively. Neither of the one-way ANOVAs on the speed indices (i.e., overall speed and learning efficiency) revealed a significant main effect of experimental group [Fs(2, 55) < 1.44, ps > 0.24, Fig. 3c], which indicates non-significant differences in original speed among the Unaware, Aware, and Instruction groups.

Transfer session

Mean total number of error trials in the Unaware, Aware, and Instruction groups was 28.00 trials [95% CI (18.60 37.40)], 7.52 trials [95% CI (4.07 10.97)], and 2.90 trials [95% CI (1.26 4.55)], respectively. A two-way ANOVA on the number of error trials showed significant main effects of experimental group [F(2, 55) = 27.45, p < 0.0001, $\eta_p^2$ = 0.49, Fig. 4a] and successful trial section [F(19, 1045) = 56.67, p < 0.0001, $\eta_p^2$ = 0.50]. The interaction between experimental group and trial section was also significant [F(38, 1045) = 25.38, p < 0.0001, $\eta_p^2$ = 0.48]. Post hoc tests regarding experimental groups showed that the number of errors was larger in the Unaware group than those in both the Aware and Instruction groups [ts(55) > 5.61, ps < 0.001], while it was not significantly different between the Aware and Instruction groups [t(55) = 1.35, p = 0.18]. This difference could be mostly explained by the number of errors until the first successful trial (p < 0.001) and was simply due to a benefit of explicit knowledge of the sequence in the Aware and Instruction groups.

Fig. 4

Performances in the transfer session. Error bars show the standard error of the mean. All participants performed the Reversed sequence in the transfer session. a Mean number of errors before the successful completion of each trial in the transfer session. b Scatter plots of the total number of errors in the transfer session and K. c Mean performance time for successful trials in the transfer session. The data of averaged performance time were fit with the power function: $y = \alpha x^{\beta}$. d Mean overall speed in each group. Points indicate the overall speed of each participant. e Mean learning efficiency in each group. Points indicate learning efficiency of each participant. f Mean improvements in speed in the Aware and Instruction groups. Points indicate an improvement in speed in each participant

Since we found a significant correlation between VSWM capacity and the number of errors in the learning session, we also performed Spearman’s rank correlation analysis for the total number of error trials in the transfer session and K in each group (Fig. 4b). The results showed a significant correlation in the Unaware group (S = 1425.2, ρ = −0.74, p < 0.001), but not in the Aware and Instruction groups (Aware, S = 1317, ρ = −0.15, p = 0.52; Instruction, S = 1339.5, ρ = 0.24, p = 0.27). This result indicates that the correlation between the number of errors and K was retained, even in the transfer session, unless participants obtained explicit knowledge of the rule. That is, if participants believed that they performed the task with a randomly generated sequence in the transfer session, their accuracy was related to K. If they noticed or received explicit knowledge of the rule, their accuracy was not related to the VSWM capacity.

Mean performance time in the Unaware, Aware, and Instruction groups was 11.85 s [95% CI (11.01 12.69)], 12.50 s [95% CI (11.55 13.46)], and 14.19 s [95% CI (13.29 15.08)], respectively. Overall speed (α) in the Unaware, Aware, and Instruction groups was 15.11 [95% CI (13.33 16.90)], 17.04 [95% CI (15.29 18.79)], and 20.00 [95% CI (17.81 22.19)], respectively. Learning efficiency (β) in the Unaware, Aware, and Instruction groups was −0.11 [95% CI (−0.15 −0.067)], −0.14 [95% CI (−0.17 −0.11)], and −0.15 [95% CI (−0.19 −0.12)], respectively. A one-way ANOVA on overall speed revealed a significant main effect of experimental group [F(2, 55) = 6.93, p < 0.01, $\eta_p^2$ = 0.20; Fig. 4d], and post hoc tests showed that the overall speed was slower in the Instruction group than that in both the Aware and Unaware groups [ts(55) > 2.28, ps < 0.05], but was not significantly different between the Aware and Unaware groups [t(55) = 1.39, p = 0.17]. By contrast, a one-way ANOVA on learning efficiency did not reveal a significant main effect of experimental group [F(2, 55) = 1.75, p = 0.18; Fig. 4e]. The slower overall speed in the Instruction group than that in the Aware group, together with the non-significantly different learning efficiency between the Instruction and Aware groups, indicates that the slower speed in the Instruction group persisted even in late learning.

In addition, in order to consider individual differences, because original cognitive performance of the Aware and Instruction groups should be different, we calculated improvements in speed in the transfer session in the Aware and Instruction groups and compared them. In each participant, mean performance time in the transfer session was subtracted from that in the learning session, and the value was divided by the mean performance time in the learning session: $(P_{\mathrm{learn}} - P_{\mathrm{transfer}})/P_{\mathrm{learn}}$. A Welch unpaired t test showed a significant difference [t(37.22) = 2.08, p < 0.05, Fig. 4f], which indicates that the improvement in speed was also larger in the Aware group [mean −1.84%, 95% CI (−5.28 1.58)] than that in the Instruction group [mean −7.57%, 95% CI (−12.17 −2.97)]. Interestingly, improvements in speed in the Instruction group were less than zero, indicating the occurrence of interference between the learning and transfer sessions [one-sample t test for the improvements in the Instruction group against zero, t(21) = −3.42, p < 0.01], while those in the Aware group were not [t(18) = −1.13, p = 0.27].
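The improvement index is a relative change in mean performance time, so its sign carries the interpretation: positive values mean the transfer session was faster than the learning session, negative values mean it was slower. A minimal sketch follows, with a hypothetical function name and hypothetical times rather than data from the study.

```python
def speed_improvement(p_learn, p_transfer):
    """Relative improvement in speed: (P_learn - P_transfer) / P_learn.

    Both arguments are mean performance times in seconds for one
    participant. Multiply the result by 100 to express it in percent.
    """
    return (p_learn - p_transfer) / p_learn

# Hypothetical participant who slowed from 12.0 s to 13.5 s.
slower = speed_improvement(12.0, 13.5)   # negative: transfer was slower
# Hypothetical participant who sped up from 12.0 s to 10.5 s.
faster = speed_improvement(12.0, 10.5)   # positive: transfer was faster
```

Normalizing by the learning-session time puts participants with different baseline speeds on a common scale, which is why the index is used to compare the Aware and Instruction groups despite their presumed differences in original cognitive performance.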

Correlations between speed and accuracy in the transfer session

Although we did not find a significant difference in the number of errors in the transfer session between the Aware and Instruction groups, participants in the Aware group believed at the beginning of the transfer session that the sequence was newly generated, and they tried to decipher the correct order of the sequence. In order to confirm that the slower overall speed in the Instruction group than that in the Aware group was not related to the number of errors in the transfer session, we performed Spearman’s rank correlation analyses on the overall speed and the total number of errors, and on learning efficiency and the total number of errors, in the Aware and Instruction groups separately, as well as in the combined data of these two groups. The correlation analyses did not show any significant correlations between overall speed and the total number of errors (Aware group: S = 1366.3, ρ = −0.19, p = 0.41; Instruction group: S = 1864.8, ρ = −0.052, p = 0.81; Mixed data: S = 13,925, ρ = −0.21, p = 0.18), or between learning efficiency and the total number of errors (Aware group: S = 1216.4, ρ = −0.067, p = 0.78; Instruction group: S = 1824.6, ρ = −0.030, p = 0.89; Mixed data: S = 11,329, ρ = 0.013, p = 0.93).

General discussion

This study examined how the manner in which explicit knowledge of a transfer rule is acquired affects the transfer of a visuomotor sequence. We found that (1) explicit knowledge of the transfer rule generally led to fewer errors in the transfer session, regardless of how it was acquired (Unaware, 28.00 trials; Aware, 7.52 trials; Instruction, 2.90 trials), (2) the total number of error trials in the transfer session did not differ significantly between the Aware and Instruction groups, (3) explicit knowledge acquired through instruction led to a slower overall speed than explicit knowledge acquired through spontaneous detection, and (4) the total number of error trials in the transfer session was not significantly correlated with overall speed or learning efficiency in the transfer session in either the Aware or the Instruction group. These results suggest that explicit knowledge of the transfer rule may have helped reduce errors, irrespective of the manner of acquisition, but that it may have interfered with performance speed, particularly when the knowledge was provided rather than discovered spontaneously.

Hikosaka et al. (1999) proposed a model in which sequential learning is independently acquired by two parallel systems: spatial and motor (see also the neural network model of procedural learning, Nakahara et al. 2001). The spatial system is predominantly involved at the early learning stage (effector-independent learning) and is mostly evaluated by the number of errors committed in a task. The motor system is mainly involved at the late learning stage (effector-dependent learning) and is mostly evaluated by the performance time of a given sequence. In addition, people are capable of developing effector-dependent and effector-independent representations in parallel, but with different time courses (see also Bapi et al. 2000, 2006). Therefore, a shorter performance time in successful trials indicates that participants acquired effector-independent representations earlier and could then start to develop effector-dependent representations. The present results show that explicit knowledge can facilitate effector-independent learning but does not facilitate effector-dependent learning, which is congruent with the results of previous studies (e.g., Watanabe et al. 2006). Watanabe et al. (2006) used the 2 × 10 task and rotated the workspace for the transfer session without notifying participants. The participants who noticed the rotation rule did not improve their performance times compared to those who did not, although they were able to use their explicit knowledge of the rotation to reduce errors. There are two possible explanations for the non-facilitation of effector-dependent learning in the transfer task. The first is that the groups differed in the number of error trials before the first successful trial, and thus in the time spent working on the sequence. Since the Unaware group performed the transfer sequence by trial and error (i.e., with a greater number of errors), they might have been able to develop larger effector-dependent representations by the first successful trial than the Aware group. In addition, the Aware group may have developed effector-dependent representations of the sequence within a shorter time. Therefore, the speed in the Unaware and Aware groups looks similar, but the underlying processes are likely different. Alternatively, the results might reflect that explicit knowledge of the transformation rule required cognitive resources, resulting in deactivation of the multidimensional system and activation of only the unidimensional system (Keele et al. 2003). That said, since the number of errors in the transfer task differed greatly between the Aware and Unaware groups (7.52 vs. 28.00 trials), the first possibility is the more likely explanation for the non-facilitation of speed.

The most important finding in the present study was that the overall speed of the Instruction group was slower than that of the Aware group, indicating that explicit knowledge provided by another person slowed performance. This effect seemed to persist for the entire duration of the experiment. A few qualitatively similar effects have been reported previously. An explicit instruction reduced movement errors, but the reduced magnitude of errors caused interference with sensorimotor adaptation (Benson et al. 2011). Performance of well-practiced movements, such as golf putting, can be impaired by explicit instruction (e.g., Beilock and Carr 2001; Flegal and Anderson 2008). As mentioned in the previous paragraph, given that explicit knowledge of a transformation rule contributes to the development of speed in the multidimensional system, the most reasonable interpretation is that explicit knowledge provided by instruction requires more cognitive resources in the multidimensional system than knowledge that is spontaneously discovered. This interpretation is supported by the negative improvement in speed in the Instruction group (Fig. 4f). Some studies suggest that explicit instruction leads to an increase in cognitive load (e.g., Masters et al. 2008). Masters et al. (2008) asked novices to learn a table tennis shot with either an analogical instruction or an explicitly detailed instruction and to make a concurrent low- or high-complexity decision while performing the shot. They observed that performance was disrupted when the participants who had received the explicit instruction made the high-complexity decision, which indicated that increased cognitive load impaired motor performance. Using an experimental paradigm similar to that of the present study, previous studies have shown that the frontal areas of the brain are involved in explicit learning and transfer (e.g., Hikosaka et al. 2002; Sakai et al. 1998); the two frontal areas (i.e., the dorsolateral prefrontal cortex and the pre-supplementary motor area) were mainly activated in the earlier stages of learning, and the two parietal areas (i.e., the precuneus and intraparietal sulcus) were mainly activated in the later stages. Moreover, activity of the frontal areas (i.e., executive function, Miyake et al. 2000) decreases in tasks that impose a concurrent load (e.g., Lavie et al. 2004). Taken together, explicit knowledge provided by another person may automatically require more cognitive resources than knowledge obtained by spontaneous detection, thereby delaying learning of the effector-independent sequence and, in turn, the shift from effector-independent learning to effector-dependent learning. In other words, participants who obtained the explicit knowledge via instruction may have spent more time in the effector-independent learning phase.

The present study is the first to directly compare explicit instruction and spontaneous detection, and it showed that the manner of acquisition of explicit knowledge might play a key role in transfer. That said, the present study has several limitations. One limitation concerns the classification of participants, a limitation shared with Curran and Keele (1993). For the Non-instruction group, we classified the participants into the Aware and Unaware groups via interview and found that those in the Aware group had a larger VSWM capacity than those in the Unaware group. However, participants in the Instruction group were not classified by any criteria; therefore, the cognitive abilities of participants might not have been precisely controlled, even though we did not find a significant difference in VSWM capacity between the Aware and Instruction groups. Another limitation concerns the role of working memory in the task. In the present study, we adopted only one task in addition to the m × n task. Although we found a significant correlation between sequence learning (i.e., total number of errors) and VSWM capacity, other factors, such as motivation, could not be ruled out. However, Bo and Seidler (2009) adopted a sequence learning task, a visuospatial working memory task, and a continuous tapping task, and found that only VSWM capacity was correlated with the rate of motor sequence learning. In addition, since similar neural activation in frontal areas can be assumed during the earlier learning stage of the m × n task and during the VSWM task (e.g., Hikosaka et al. 2002; Sakai et al. 1998), the present results also likely reflect individual differences in cognitive capacity rather than in motivation.