Introduction

Practice via motor imagery (MI), the mental rehearsal of a motor task, has been shown to result in motor skill acquisition in conjunction with or apart from physical practice (PP). A long-standing explanation for why MI is effective for motor skill acquisition is the motor simulation theory (Jeannerod 2001), which posits that MI is functionally equivalent to PP, with the exception that MI encompasses only the planning (i.e., covert) stage of movement. Evidence supporting the notion of functional equivalence includes neuroimaging work showing that the network of brain areas underlying MI performance overlaps heavily with that utilized when physically executing the same movement (Burianová et al. 2013; Hétu et al. 2013; Kraeutner et al. 2014; Hardwick et al. 2018).

Notable exceptions to this shared neural representation, including inconsistent activation of primary motor cortex and a greater reliance on fronto-parietal network activation (both key areas involved in movement planning) in MI, coupled with less robust behavioural outcomes resulting from MI, have led some to suggest that MI is fundamentally different from PP, engendering learning distinct from that resulting from PP. One such proposition is that MI-based practice results primarily in the effector-independent encoding of a movement, as opposed to the effector-independent and effector-dependent encoding resulting from PP (Frank et al. 2014; Frank and Schack 2017; Kim et al. 2017). In this context, effector-independent components refer to the hierarchical organization of basic action concepts that control the motor system (Schack 2004). Emerging evidence supports the proposition that motor skill acquisition occurring via MI-based practice relies on building this effector-independent encoding of movement. Work shows that combined MI and PP improved outcomes on a golf task in comparison to PP alone, yet MI in comparison to PP alone also led to improved organization of the hierarchical framework representing the movement (Frank et al. 2014). That MI-based training is more perceptual in nature has been supported by work from Ingram et al., who, in employing a perceptual (switching from auditory to visual cue) or motor (switching effector) transfer following physical or MI-based practice of a serial reaction time paradigm, showed that learning via MI was negatively impacted to a greater extent following the perceptual as opposed to motor transfer, whereas the opposite was observed in PP (Ingram et al. 2016).
In line with the proposed perceptual nature of MI-based learning, disruption of activity in the parietal lobe, an area responsible for performing visuomotor transformations and perceptual mapping necessary to achieve a movement goal, has also been shown to impair one’s ability to perform MI, and ostensibly, one’s ability to learn via MI-based practice (Sirigu et al. 1996; Kraeutner et al. 2016a; McInnes et al. 2016; Oostra and Bladel 2016). In contrast, transient inhibition of the primary motor cortex, involved in effector-dependent encoding of a movement, did not impair MI-based learning (Kraeutner et al. 2017b).

While these studies provide evidence in support of the effector-independent nature of MI-based practice, they neglect a key region involved in the motor planning to execution continuum, the supplementary motor area (SMA; Nachev et al. 2008; Cona and Semenza 2017). The SMA is of specific interest as, consistent with physical execution, it is active during preparation for imagined movement in voluntary and cued actions (Deecke and Kornhuber 1978; Jahanshahi et al. 1995; Kuhtz‐Buschbeck et al. 2003; Hétu et al. 2013; Lara et al. 2018). The SMA can be divided into two spatially distinct areas: the pre-SMA, which is thought to encode information about the selection of movement parameters, and the SMA proper, which is believed to be involved in the timing of movements (Hoffstaedter et al. 2012). When the SMA is damaged or inhibited, ordering movements becomes more difficult and learning on sequence tasks is impaired (Halsband et al. 1993; Nakamura et al. 1999; Tanji 2001; Tanaka et al. 2010). These impairments could be attributed to a reduced ability to learn new stimulus–response associations, a function of the pre-SMA (Sakai et al. 1999; Cona et al. 2017). Alternatively, the deficits in acquiring new patterns of sequential movement could pertain to damage to the SMA proper, as the region has been linked with initiation and timing of voluntary movements (Tanji and Shima 1994; Cona and Semenza 2017; Lara et al. 2018).

With respect to sequence learning, SMA function appears to be vital for skill acquisition, whereby transient inhibition of the SMA increased reaction times (RTs) that accompany practice in these paradigms (Verwey et al. 2002). Sequence learning typically requires the mapping of perceptual cues to motor actions whereby participants start by associating individual responses to a single stimulus. During training, associations between successive responses are made, with single stimuli ultimately representing an increasing number of these successive responses over the course of training. The series of responses associated with a single stimulus is termed a ‘motor chunk’ (Verwey et al. 2015). Accordingly, it has been noted that the SMA activates selectively in response to upcoming sequences of learned movements implying that the area is involved to some extent with the consolidation of the sequential components of movement (Tanji and Shima 1994). Supporting this function, SMA inhibition during preparation for movement impairs participants’ ability to perform a trained (implicit) sequence after a motor transfer (Perez et al. 2008). This result suggests that the SMA is involved in reinforcing movement-based representations of motor chunks (i.e., effector-dependent encoding) but not goal-based representations of motor chunks (i.e., effector-independent encoding) for sequence learning-based tasks. However, given that these findings arise from studies using PP, the importance of the SMA in MI-based motor skill acquisition has yet to be elucidated, a noticeable gap in our knowledge in light of recent views of MI that prefer alternative explanations to the motor simulation theory that suggests functional equivalence of MI and PP.

The current study sought to address this gap in knowledge by examining the contribution of the SMA to MI-based skill acquisition, with the overarching goal of generating evidence related to the nature of MI-based skill acquisition. Similar to previous work from our laboratory, inhibitory transcranial magnetic stimulation (TMS) was applied prior to the performance of a serial reaction time paradigm, whereby learning is assessed by determining the differences in RTs between elements belonging to a repeated sequence (learned implicitly) versus random, non-repeating sequences (Kraeutner et al. 2016a, 2017b). Comparing RTs between those receiving real versus sham stimulation to the SMA was expected to elucidate the contribution of the SMA to skill acquisition in the given task in both PP and MI, addressing the study hypotheses that (1) inhibition of the SMA would impair skill acquisition when training occurs via PP, a finding consistent with previous work; and (2) skill acquisition following inhibition of the SMA would proceed as in the sham group when training occurs via MI, given its effector-independent nature.

Methodology

Participants

Sixty-four right-handed participants with normal or corrected-to-normal hearing and vision who reported no contraindications to TMS provided written informed consent to participate in this study. Handedness was assessed by a score of > 40 on the Edinburgh handedness inventory (Oldfield 1971). Dalhousie University’s health sciences research ethics board approved the study.

Group allocation

The study had four experimental groups: MI-based training on the serial reaction time paradigm (described below) preceded by either sham (MI sham) or real (MI stim) stimulation of the SMA, and PP-based training on the serial reaction time paradigm preceded by either sham (PP sham) or real (PP stim) stimulation of the SMA. To minimize attrition, participants were first randomly allocated into the MI-stim or PP-stim groups. Participants whose resting motor threshold (RMT, described below) could not be attained or exceeded 59% of stimulator output (SuperRapid2Plus1 magnetic stimulator, Magstim, Whitland, UK) were, following random assignment, moved into the sham group for their modality (MI sham or PP sham). Once the ‘stim’ groups for both modalities (MI and PP) reached their targeted size, participants were randomly assigned to the MI- or PP-sham groups until recruitment was completed.

Experimental procedure

Following written, informed consent and TMS screening, participants completed the Kinesthetic and Visual Imagery Questionnaire (KVIQ) to assess their ability to perform MI (Malouin et al. 2007). Participants then underwent TMS prior to engaging in the training blocks of the serial reaction time paradigm. Finally, participants completed the test block and verbal recall task (Fig. 1).

Fig. 1

Experimental timeline. Following informed consent, screening and questionnaires, participants underwent real (SMA) or sham (vertex of the head) TMS. A wait period (10 min) was included between application of TMS and the onset of the serial reaction time paradigm to ensure effects of the stimulation would coincide with the training period. Finally, participants completed the testing and the verbal recall task. The ratio of repeated (rep) sequence elements to random (ran) sequence elements was 4:1. MI motor imagery; PP physical practice

Experimental Task. Participants engaged in a serial reaction time paradigm, the details of which have been reported previously (Kraeutner et al. 2016b). Briefly, participants were seated comfortably in front of a monitor with their left hand resting on a keyboard oriented to the Z, X, C, V keys (numbered 4–1) representing their little, ring, middle and index finger, respectively. Prior to task onset, participants were familiarized to the task via an audio script delivered via headphones; in addition to the familiarization script, the MI groups also received a description of kinesthetic MI (i.e., to focus on the sensory aspects of the movement and how the movement would feel), and instructions to perform kinesthetic MI of the task, given it has been found to be conducive to motor skill acquisition (Stinear et al. 2006; Malouin et al. 2007). During the task, participants sat with their eyes closed while responding to a series of auditory cues (numerical digits 1, 2, 3 or 4, delivered via the headphones) by pressing the corresponding key using their left hand (PP groups) or to imagine pressing the corresponding key using their left hand (MI groups). An error tone was played when participants recorded an incorrect response in PP or made an inadvertent keypress in MI. The serial reaction time paradigm consisted of four training blocks of 250 trials each, performed using MI or PP based on group assignment, followed by a test block and a verbal recall task (described below, see Fig. 1). A single trial took approximately 2 s, comprised of the presentation of a single auditory cue (duration range: 390–480 ms) and a subsequent 1.5 s window in which participants were to respond appropriately using MI or PP. 
Unbeknownst to the participants, the cues (numerical digits 1, 2, 3 or 4) in each training block were elements belonging to either a repeating sequence of ten consecutive cues determined at the onset of the experiment, or randomly generated sequences of ten consecutive cues. The creation of the repeated sequence was pseudo-random as they were constrained to avoid obvious regularities (repeating digits [1, 1, 1, 1…], sequential digits [1, 2, 3, 4…] and patterns of digits [1, 2, 1, 2…]). During training, the repeated and random sequences were presented in random order, determined at the beginning of each block, at a ratio of 4:1 (repeated to random respectively).
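The sequence constraints and the 4:1 presentation ratio described above can be sketched in code. This is an illustration only: the specific rejection rules (no immediate repeats, no ascending or descending runs of three, no A-B-A-B alternation), the function names, and the assumption that each 250-trial block is built from 25 whole ten-cue sequences are ours, and the original generation procedure may have differed. Random sequences are drawn here without constraints, since the text only states that the repeated sequence was constrained.

```python
import random

DIGITS = (1, 2, 3, 4)
SEQ_LEN = 10  # ten consecutive cues per sequence, as described in the text


def has_obvious_regularity(seq):
    """Return True if seq contains a pattern we treat as an 'obvious regularity'.

    The three rules are illustrative assumptions, not the published criteria.
    """
    for i in range(len(seq) - 1):
        if seq[i] == seq[i + 1]:  # repeating digits, e.g. 1, 1
            return True
    for i in range(len(seq) - 2):
        step1 = seq[i + 1] - seq[i]
        step2 = seq[i + 2] - seq[i + 1]
        if step1 == step2 and abs(step1) == 1:  # runs, e.g. 1, 2, 3 or 4, 3, 2
            return True
    for i in range(len(seq) - 3):
        if seq[i] == seq[i + 2] and seq[i + 1] == seq[i + 3]:  # e.g. 1, 2, 1, 2
            return True
    return False


def make_repeated_sequence(rng):
    """Rejection-sample a ten-cue sequence free of the regularities above."""
    while True:
        seq = [rng.choice(DIGITS) for _ in range(SEQ_LEN)]
        if not has_obvious_regularity(seq):
            return seq


def make_training_block(repeated, rng, n_sequences=25, ratio=(4, 1)):
    """Return a shuffled list of (kind, sequence) pairs at the given ratio."""
    n_rep = n_sequences * ratio[0] // sum(ratio)
    kinds = ["rep"] * n_rep + ["ran"] * (n_sequences - n_rep)
    rng.shuffle(kinds)  # presentation order is re-drawn for every block
    block = []
    for kind in kinds:
        if kind == "rep":
            block.append((kind, list(repeated)))
        else:
            block.append((kind, [rng.choice(DIGITS) for _ in range(SEQ_LEN)]))
    return block
```

Under these assumptions, a block contains 20 repeated and 5 random sequences, for 250 cues in total.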

As indicated above, following training participants underwent a test block to assess the degree to which participants learned the implicit (repeated) sequence. For the test block, the repeated and random sequences were presented at a 1:1 ratio, consisting of a total of 200 trials. Regardless of group assignment, participants completed the test block by physically pressing the keys in response to the auditory cues to enable measurement of RTs for offline analysis of learning. Following the test block, participants completed a verbal recall task to determine if they had acquired explicit knowledge of their repeating sequence. The verbal recall task consisted of two questions. The first asked “Do you think you learned a sequence during the training blocks?”. If participants responded “yes” they answered a second question where they were asked to recall the repeating sequence. If the participants responded “no” they were asked to recall their repeated sequence and were told that “it was okay if they did not think they learned a sequence”. Recall of the repeated sequence was completed by writing the response on a form provided to the participant (i.e., writing out the 10 digits they recalled constituting the sequence).

Electromyography

Muscle activity was monitored throughout training on the serial reaction time paradigm for the MI-based groups (MI sham and MI stim). Briefly, electromyography (EMG) was obtained from the flexor and extensor muscles of the digits via self-adhering electrodes (1 × 3 cm; Q-Trace Gold; Kendall-LTP, USA) attached in a bipolar configuration with a 1 cm inter-electrode distance. EMG was sampled at 1961 Hz (1902 and Power 1401, Cambridge Electronic Design, UK) and stored for offline analysis.

Transcranial magnetic stimulation

Participants underwent TMS prior to engaging in the serial reaction time paradigm. Single and repetitive pulse TMS was delivered via an air-cooled 70-mm figure eight coil connected to a SuperRapid2Plus1 system (Magstim, Whitland, UK). A neuronavigation system (Brainsight2, Rogue Research Inc., Montreal, Canada) facilitated accurate localization and orientation of the coil over the cortical representation of the first dorsal interosseous (FDI) muscle and SMA (details below). To permit neuronavigation, each participant’s head was co-registered to a template MRI (MNI152_T1_1mm) by digitizing three anatomical landmarks (left and right pre-auricular points and the nasion) before undergoing TMS. Resting motor threshold (RMT) was determined by measuring the peak-to-peak amplitude of the motor evoked potential (MEP) induced by the application of TMS. Motor evoked potentials were obtained using a baseline corrected signal from EMG electrodes overlying the first dorsal interosseous (FDI) muscle using vendor supplied hardware (Brainsight2 EMG Isolation Unit and Amplifier Pod). Identification of RMT was facilitated by superimposing a 5 × 5 grid with 7.5 mm spacing on the template MRI with the centre of the grid (2, 2) overlying the ‘hand knob’ of the primary motor cortex. Stimulation intensity was set to 55% of the stimulator output and, starting at the centre point (2, 2), different points on the grid were stimulated to identify the location which produced the highest amplitude MEP on five out of ten stimuli. This location was determined to be the ‘hotspot’; RMT was then determined as the lowest stimulation intensity required to produce MEPs of > 50 μV on five out of ten stimuli (Kleim et al. 2007). Throughout, the TMS coil was positioned tangentially to the participant’s scalp at a 45º angle to the anterior–posterior axis. After RMT was determined, participants received either inhibitory stimulation to the SMA or sham stimulation based on group assignment. 
Inhibitory stimulation delivered to the SMA (x = −5, y = −10, z = 67) followed an established continuous theta burst stimulation (cTBS) protocol that produces an inhibitory effect extending up to 60 min post-stimulation (Huang et al. 2005; Oberman et al. 2011). The cTBS protocol consists of bursts of three pulses at 50 Hz, repeated at intervals of 200 ms, for a total of 600 pulses delivered at 90% of the participant’s RMT over 40 s (Nyffeler et al. 2008). Sham stimulation consisted of the same cTBS protocol with stimulation intensity set to 20% of stimulator output and the TMS coil placed over the vertex of the head (Fig. 2).
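The timing of the cTBS train follows directly from the stated parameters, and a short sketch makes the arithmetic explicit. The constants are taken from the text. The function names are hypothetical, and the code only computes a nominal pulse schedule. It is not stimulator control code.

```python
PULSES_PER_BURST = 3      # three pulses per burst
INTRA_BURST_HZ = 50       # pulses within a burst are delivered at 50 Hz
BURST_INTERVAL_S = 0.200  # bursts repeat every 200 ms (i.e., at 5 Hz)
TOTAL_PULSES = 600


def ctbs_pulse_times():
    """Return the nominal onset time (s) of each pulse in the cTBS train."""
    n_bursts = TOTAL_PULSES // PULSES_PER_BURST  # 600 / 3 = 200 bursts
    times = []
    for burst in range(n_bursts):
        for pulse in range(PULSES_PER_BURST):
            times.append(burst * BURST_INTERVAL_S + pulse / INTRA_BURST_HZ)
    return times


def ctbs_train_duration_s():
    """Nominal train duration: number of bursts times the burst interval."""
    return (TOTAL_PULSES // PULSES_PER_BURST) * BURST_INTERVAL_S


def ctbs_intensity(rmt_percent, fraction=0.90):
    """Stimulation intensity (% stimulator output) at 90% of a given RMT."""
    return fraction * rmt_percent
```

Two hundred bursts at 200 ms intervals give the 40 s duration reported above. For a participant at the 59% RMT ceiling used for group allocation, 90% of RMT corresponds to roughly 53% of stimulator output.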

Fig. 2

Averaged reaction times (RT) in milliseconds for random (triangle) versus repeated sequence (circle) elements collapsed across stimulation type and learning modality. Vertical bars represent 95% confidence intervals

Data analysis

Learning type. This analysis was performed to ensure participants learned their repeating sequence implicitly. If participants correctly recalled > 5 consecutive elements of their repeating sequence when responding to the second question of the verbal recall task they were determined to have learned the sequence explicitly and were removed from further analysis (Rünger and Frensch 2010).
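The explicit-knowledge criterion can be expressed as a longest-shared-run check between the recalled and the actual sequence. The sketch below is one plausible reading of "correctly recalled > 5 consecutive elements". It assumes a match may begin at any position in either sequence and does not wrap around the end of the sequence. The function names are hypothetical.

```python
def longest_shared_run(recalled, repeated):
    """Length of the longest run of consecutive elements common to both lists."""
    best = 0
    prev = [0] * (len(repeated) + 1)
    for r in recalled:
        curr = [0] * (len(repeated) + 1)
        for j, s in enumerate(repeated, start=1):
            if r == s:
                # Extend the run that ended at the previous element of each list.
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def learned_explicitly(recalled, repeated, threshold=5):
    """True if more than `threshold` consecutive elements were recalled."""
    return longest_shared_run(recalled, repeated) > threshold
```

With this rule, a participant who reproduces six consecutive elements is classified as an explicit learner, whereas one who reproduces exactly five is retained.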

Response and EMG analysis. Data from the MI groups (MI stim and MI sham) were reviewed to ensure the absence of physical performance during the training blocks. First, data were examined for the presence of key presses, and participants who physically responded on more than 2% of all trials (20/1000) were removed from further analysis. Next, EMG data obtained from the left extensor and flexor muscles were rectified and band-pass filtered from 1 to 10 Hz. To verify that participants did not engage in PP during training, the EMG signal amplitude in consecutive 15 s windows was compared to a 15 s envelope of EMG data obtained while participants were at rest (during familiarization). The threshold for an active muscle was defined as the mean EMG signal magnitude plus two standard deviations (Kraeutner et al. 2016b). If more than 15% of EMG windows obtained during the training block exceeded this threshold, the participant was removed from further analysis.
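The EMG screening rule can be sketched as follows. The sketch assumes the signal has already been rectified and band-pass filtered, summarizes each 15 s window by its mean amplitude, and uses the population standard deviation of the resting samples. These choices, and the function names, are assumptions made for illustration and are not details given in the text.

```python
from statistics import mean, pstdev

FS_HZ = 1961   # EMG sampling rate reported in the text
WINDOW_S = 15  # window length in seconds


def rest_threshold(rest_emg):
    """Active-muscle threshold: resting mean plus two standard deviations."""
    return mean(rest_emg) + 2 * pstdev(rest_emg)


def fraction_active(train_emg, threshold, fs=FS_HZ, window_s=WINDOW_S):
    """Fraction of complete, non-overlapping windows whose mean exceeds threshold."""
    n = fs * window_s
    windows = [train_emg[i:i + n] for i in range(0, len(train_emg) - n + 1, n)]
    if not windows:
        raise ValueError("training signal is shorter than one window")
    active = sum(1 for w in windows if mean(w) > threshold)
    return active / len(windows)


def exclude_participant(train_emg, rest_emg, max_fraction=0.15,
                        fs=FS_HZ, window_s=WINDOW_S):
    """True if more than 15% of training windows exceed the rest threshold."""
    threshold = rest_threshold(rest_emg)
    return fraction_active(train_emg, threshold, fs, window_s) > max_fraction
```

The same rule would be applied separately to the flexor and extensor channels. How the two channels were combined is not stated in the text and is left open here.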

Performance analysis

The first cue in each sequence (random or repeated) and RTs outside of the range of 100–1300 ms were removed from further analysis to control for cueing effects and anticipatory/outlier responses (Rüsseler et al. 2001). The number of erroneous responses (i.e., responses where the incorrect key was pressed after an auditory cue) during the test block was noted for each participant. Erroneous responses were excluded from the analysis.
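The trial-level exclusion rules above amount to a simple filter. In the sketch below, the `Trial` record and its field names are hypothetical, and treating the 100–1300 ms bounds as inclusive is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    position: int        # 0-based position of the cue within its ten-cue sequence
    rt_ms: float         # reaction time in milliseconds
    correct: bool        # False if the wrong key was pressed
    sequence_type: str   # "random" or "repeated"


def keep_trial(trial, rt_min=100.0, rt_max=1300.0):
    """Apply the three exclusion rules: first cue, RT range, erroneous response."""
    if trial.position == 0:   # first cue in each sequence is dropped
        return False
    if not trial.correct:     # erroneous responses are excluded
        return False
    return rt_min <= trial.rt_ms <= rt_max


def clean_trials(trials):
    """Return only the trials that enter the RT analysis."""
    return [t for t in trials if keep_trial(t)]


def count_errors(trials):
    """Number of erroneous responses, tallied before exclusion."""
    return sum(1 for t in trials if not t.correct)
```

Errors are counted before filtering so that the per-participant error totals remain available after the erroneous trials are removed from the RT analysis.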

Statistical analysis

A model comparison was performed to select the best fitting model for the data based on Akaike’s Information Criterion (AIC). The first was a linear model where participants’ RTs were predicted by sequence type (random and repeated), stimulation type (Stim and Sham) and learning modality (MI and PP). The second was a linear mixed effects (LME) model with the same fixed effects as the linear model but which also included a random effect for participant. The LME model was explored to determine if retaining the trial level data would provide additional sensitivity in the omnibus test. Akaike’s Information Criterion accounts for model complexity when comparing the goodness of fit of each model (AICs of 90,412.06 and 88,216.07 for the linear and LME models, respectively). The smaller AIC score for the LME model indicates that the random effect of participant improved the fit of the model and, as such, the LME model was selected for the analysis. The omnibus test was performed at α = 0.05. Post hoc testing of the significant main effects and interaction terms was performed using Bonferroni corrected t tests. The magnitude of learning in each group was further characterized by calculating an effect size (Cohen’s d) and quantifying the difference in RT (dRT; \(dRT = \overline{RT}_{random} - \overline{RT}_{repeated}\)) for the RTs within each group.
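The two learning measures can be illustrated with a short sketch. dRT follows the definition given in the text. For Cohen's d, the text does not state which standardizer was used, so the pooled standard deviation of the two sets of trial-level RTs is assumed here. The function names are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def d_rt(random_rts, repeated_rts):
    """dRT: mean RT to random elements minus mean RT to repeated elements (ms)."""
    return mean(random_rts) - mean(repeated_rts)


def cohens_d(random_rts, repeated_rts):
    """Cohen's d using the pooled standard deviation (an assumed standardizer)."""
    n1, n2 = len(random_rts), len(repeated_rts)
    pooled_var = (
        (n1 - 1) * variance(random_rts) + (n2 - 1) * variance(repeated_rts)
    ) / (n1 + n2 - 2)
    return d_rt(random_rts, repeated_rts) / sqrt(pooled_var)
```

A positive dRT indicates faster responses to repeated than to random elements, that is, learning of the repeated sequence. Because d is computed on trial-level RTs under this assumption, large trial-to-trial variability yields small effect sizes even when dRT is tens of milliseconds.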

Results

Of the 64 participants recruited, 40 (22 female, 22.8 ± 4.6 years) were included in the final analysis. Eight participants were removed prior to any form of analysis based on technical error (n = 5) and a lack of adherence to the task instructions (n = 3). From the analysis detailed above, 16 participants were removed for having explicit knowledge of the repeating sequence, including 13 from PP conditions (PP sham = 6, PP stim = 7) and 3 from MI conditions (MI sham = 1, MI stim = 2). No participants were removed for demonstrating evidence of physical execution during MI-based training, leaving a total of ten participants per group in the final analysis. Within the MI groups, KVIQ scores in the kinesthetic and visual domains were comparable to previous literature with values of 17.6 ± 3.4 and 18.1 ± 2.6 (kinesthetic and visual, respectively) in the MI-sham group and 17.7 ± 5.0 and 18.8 ± 4.7 (kinesthetic and visual, respectively) in the MI-stim group (Malouin et al. 2007).

Reaction time analysis

Descriptive statistics for the mean RTs from the test block of each group are reported in Table 1. The values obtained with relation to RTs and number of erroneous keypresses for the PP- and MI-sham groups are similar to those from previous work using the same serial reaction time paradigm (Kraeutner et al. 2016b).

Table 1 Reaction times (RT) [mean ± sd (ms)] and number of erroneous keypresses (error) [mean ± sd] for each group separated by sequence type

The interaction between sequence type and learning modality indicated that there was a larger difference between RTs to different sequence types in PP (t(3414) = 6.39, p < 0.0001, d = 0.22) than in MI (t(3474) = 3.57, p < 0.001, d = 0.12), indicating PP was the more effective modality for learning (Fig. 3). Additionally, RTs were slower in MI than in PP for both sequence types but the effect was larger for repeated sequence elements (t(3469) = 7.63, p < 0.0001, d = 0.26) than random sequence elements (t(3417) = 4.89, p < 0.0001, d = 0.16; Fig. 2). The interaction between sequence and stimulation type indicated that there was a smaller difference between RTs to different sequence types after participants received inhibitory TMS (t(3457) = 3.49, p < 0.0001, d = 0.12) in comparison to sham stimulation (t(3447) = 6.58, p < 0.0001, d = 0.22), indicating that inhibitory TMS impaired learning on the task (Fig. 4). Additionally, RTs were slower after receiving inhibitory stimulation than sham stimulation for both sequence types, but the effect was larger for repeated sequence elements (t(3433) = 5.27, p < 0.0001, d = 0.17) than random sequence elements (t(3406) = 2.78, p < 0.01, d = 0.09; Fig. 4). Analysis of group performance via effect sizes (Cohen’s d) as an indicator of the magnitude of learning demonstrated alignment with the significant two-way interactions above. The PP-sham group learned more (dRT = 44.9 ms, d = 0.30) in comparison to MI sham (dRT = 26.7 ms, d = 0.16) and inhibitory TMS reduced the magnitude of learning by approximately 50% in each modality (PP stim: dRT = 28.3 ms, d = 0.16 and MI stim: dRT = 13.9 ms, d = 0.08; see Fig. 5).

Fig. 3

Averaged reaction times (RTs) in milliseconds for random (triangles) versus repeated (circles) sequence elements for each learning modality collapsed across stimulation types. Vertical bars represent 95% confidence intervals

Fig. 4

Averaged reaction times (RT) in milliseconds for random (triangles) versus repeated (circles) sequence elements for each stimulation type collapsed across learning modality. Vertical bars represent 95% confidence intervals

Fig. 5

Group averaged reaction times (RT) in milliseconds for random (triangle) versus repeated (circle) sequence elements. Vertical bars represent 95% confidence intervals; horizontal dotted lines denote the effect size (Cohen’s d) of sequence type within group

Discussion

The purpose of the present study was to investigate the contribution of the SMA to motor skill acquisition occurring via MI-based practice and, in turn, to generate additional evidence related to the nature of MI-based learning. To address this purpose, participants engaged in a serial reaction time paradigm in which they trained using MI or PP after receiving either sham or real inhibitory TMS to the SMA. Learning on the task was determined by analyzing the RTs to randomly generated and repeating sequence elements. The faster participants were at responding to repeated sequence elements compared to random ones, the greater the magnitude of learning. Results indicated that overall there was a learning effect, where participants responded faster to repeated sequence elements compared to random ones (Fig. 2), although learning was better following training that occurred via PP relative to MI (Fig. 3), a finding consistent with previous work using a variety of tasks (Kraeutner et al. 2016b; Kim et al. 2017; Ingram et al. 2019). As it pertains to our hypotheses, inhibition of the SMA impaired learning in both PP and MI, evidenced by the interaction between stimulation and sequence type, whereby inhibitory TMS decreased the difference in RTs obtained post-training compared to sham stimulation. When looking at this effect within each experimental group, both the PP stim (dRT = 28.25 ms, d = 0.16) and MI stim (dRT = 13.94 ms, d = 0.08) groups demonstrated a lower magnitude of learning in comparison to their sham counterparts (PP sham: dRT = 44.94 ms, d = 0.30 and MI sham: dRT = 26.69 ms, d = 0.16; Fig. 5). It is important to note that while the RTs reported here are similar to those reported in previous work employing this paradigm, the effect sizes are smaller due to increased variability in the raw data (Ingram et al. 2016; Kraeutner et al. 2016a, b, 2017a, b).
This difference in effect size can be attributed to the use of LME modelling, which utilizes data from all trials, in turn increasing the variability in the data, whereas previous work performed statistical testing on participant means. The presence of a difference in learning between PP stim and PP sham confirmed our first hypothesis and represents an important finding as it replicates prior work of Verwey and colleagues and demonstrates that our cTBS protocol was effective at inducing inhibition of the SMA (Verwey et al. 2002). Contradicting our second hypothesis, results of the two-way interaction between stimulation and sequence type indicated that inhibition of the SMA also impaired learning in the MI-stim group.

It has been proposed that stimulus–response relationships in sequence learning and serial processes are mediated by a cognitive framework for sequential motor behavior (Verwey et al. 2015). The cognitive framework for sequential motor behavior indicates that once a series of motor chunks have been formed to represent a sequence, the process of selecting the upcoming response is mediated by three processors, a perceptual, central, and motor processor. In this model a perceptual processor interprets stimuli and informs the central processor, which is responsible for response selection and controls many of the cognitive aspects of movement. Once the response is selected, it is then executed by the motor processor. During training, individual responses are initially executed in response to a single stimulus (“reaction mode”). As training progresses participants decrease their RT to regularly presented stimuli by forming stimulus–response associations and effector specific representations of movement, defining improvements in the central and motor processors, respectively (“associative mode”; Deroost and Soetens 2006; Schwarb and Schumacher 2012; Verwey et al. 2015). With adequate practice, a representation of the sequence as a series of consecutive motoric responses is created in long-term memory, termed a motor chunk, and can be loaded by the central processor into the motor buffer as a single response (Verwey 1999, 2001; Tubau et al. 2007; Verwey et al. 2015). Depending on the sequence length, it may be represented as a motor chunk or series of motor chunks (Verwey and Eikelboom 2010). Once these motor chunks are formed, the end state of the cognitive framework for sequential motor behavior, termed the chunking mode, has been achieved. In this state, each processor, central or motor, competes for control of responses in sequence tasks, whereby the processor that enables a faster response time selects the upcoming response (Verwey et al. 2015). 
In this framework, effector-dependent and effector-independent components of learning a sequence can be identified, respectively, as the motor processor’s ability to execute commands loaded in the motor buffer more quickly and the central processor’s ability to create and load a motor chunk to the motor buffer as a single response.

The neural underpinnings of sequence learning as explained by the cognitive framework for sequential motor behavior are largely represented by the shift in neural activity from cortical areas involved in associative behaviors (including prefrontal, parietal and premotor regions) to regions involved in sensorimotor processing (including the primary motor and sensory cortices and SMA) as learning occurs (Verwey et al. 2015). Discussion of this shift in neural activity during learning is outside of the scope of this paper; however, the present findings pertaining to the PP-stim and PP-sham conditions reinforce the role of the SMA in the chunking mode. The pre-SMA is thought to play a role in loading motor chunks in the motor buffer, aligning with the role of the central processor (Kennerley et al. 2004; Halsband and Lange 2006). The loaded motor chunks are then passed on to the SMA proper with the purpose of initiating a motor command in the primary motor cortex that executes the series of movements represented in the motor chunk (Verwey et al. 2019).

The design of the serial reaction time paradigm used here allows for distinction between improvements made by the central and motor processors over the course of the training. As detailed in the methods, the only difference between the random and repeated elements is that participants had recurring exposure to the repeated sequence elements whereas there was no pattern to the presentation of random elements. Therefore, we theorize that the faster RTs observed in the test block in response to repeated relative to random sequence elements are due to the formation of motor chunks made by the central processor during training, reflecting improvements on the effector-independent components of the task. In contrast, the between-group comparisons of RTs allow us to make inferences about the effect of training modality (PP or MI) and of SMA inhibition (i.e., real or sham stimulation) on the motor processor. Regardless of sequence type (i.e., random or repeated), if the RT differed significantly between groups, the difference would indicate that the speed at which responses can be made from an already loaded motor chunk had been altered, which is the domain of the motor processor and reflects improvements on the effector-dependent components of the task.

In the PP conditions, the effect of stimulation replicates the earlier finding of Verwey et al. (2002), whereby inhibition of the SMA impaired learning. When comparing RTs, we observed that RTs to repeated and random elements were more similar in the PP-stim group than in the PP-sham group (Fig. 5), suggesting that the SMA is linked to the function of the central processor in PP and thus that the function of this processor is impeded by SMA inhibition. While the results show that both the PP-stim and PP-sham groups improved performance on repeated sequence elements, the size of the effect was almost halved in PP stim compared to PP sham, indicating that inhibition of the SMA impaired acquisition of the repeated sequence. In the context of the cognitive framework for sequential motor behavior, the reduction in effect size would indicate impaired function of the central processor to either form motor chunks during training or load the motor chunks from memory after they have been formed, a proposed function of the pre-SMA (Verwey et al. 2015, 2019; Cona and Semenza 2017). Two possible explanations exist for this finding: (i) the stimulation provided in this study was not focal enough to inhibit only the SMA proper and also had an effect on neighboring cortical regions including the pre-SMA, or (ii) the SMA also plays a role in loading motor chunks, rather than only providing input to the primary motor cortex that loaded motor chunks should be executed, a function of the motor processor. The latter possibility reinforces the potential role of the SMA in multiple networks, one for representing motor sequences through the formation of motor chunks and the other for reacting to stimuli through establishing and retrieving visuomotor associations (Nakamura et al. 1998; Sakai et al. 1999; Picard and Strick 2001; Cona and Semenza 2017; Verwey et al. 2019).
Consistent with previous work, there appears to be an upward shift in RTs to both repeated and random sequence elements in the PP-stim group relative to PP sham (Verwey et al. 2002). However, the interaction between modality and stimulation was not statistically significant. This lack of significance is surprising given that our results replicate previous findings from the same serial reaction time task where similar values were obtained in the PP-sham group (584.33 and 539.39 ms for random and repeated sequence elements, respectively, compared to 589 and 532 ms in our previous work; (Kraeutner et al. 2016b) and the PP-stim group was slower (611.42 and 583.17 ms for random and repeated sequence elements, respectively). In the current study, this non-result may be attributed to the additive effect of two separate interactions: modality with sequence type (suggesting that RTs to both sequence types are slower in MI in comparison to PP; Fig. 3) and stimulation with sequence type (suggesting that RTs are slower following inhibitory stimulation in comparison to sham; Fig. 4). Given the differences in the descriptive results and support from previous work, we believe that the low RTs in the PP-sham condition in comparison to the other groups results from improvements made by the motor processor, but we do not have statistical support for this claim. A study designed to specifically address this hypothesis, including only groups that train using PP, may provide clarity related to this non-significant result and its interpretation, representing a potentially fruitful avenue for future research.
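
As a rough arithmetic illustration of the descriptive values reported above, the following Python sketch computes the random-minus-repeated RT difference for each PP group from the reported group means. The dictionary, function, and variable names are illustrative assumptions, and this raw difference is not the effect size used in the statistical analysis; it only shows the direction of the descriptive pattern.

```python
# Group mean RTs (ms) as reported in the text. All names here are
# illustrative and are not taken from the study's analysis code.
MEAN_RT_MS = {
    "PP-sham": {"random": 584.33, "repeated": 539.39},
    "PP-stim": {"random": 611.42, "repeated": 583.17},
    "PP (Kraeutner et al. 2016b)": {"random": 589.0, "repeated": 532.0},
}


def sequence_advantage(rt):
    """Return random minus repeated mean RT (ms).

    Larger values mean a larger advantage for the repeated sequence.
    """
    return rt["random"] - rt["repeated"]


# Raw advantage per group, rounded to remove floating-point noise.
advantages = {
    group: round(sequence_advantage(rt), 2)
    for group, rt in MEAN_RT_MS.items()
}

# Advantage in PP-stim as a proportion of the advantage in PP-sham.
ratio = advantages["PP-stim"] / advantages["PP-sham"]

for group, adv in advantages.items():
    print(f"{group}: {adv:.2f} ms")
print(f"PP-stim / PP-sham advantage ratio: {ratio:.2f}")
```

On these means, the repeated-sequence advantage is roughly 45 ms in PP-sham and roughly 28 ms in PP-stim, and RTs to both sequence types are higher in PP-stim, in line with the upward shift described above.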

In the MI groups, our results show that inhibition of the SMA impairs skill acquisition, as RTs to elements of each sequence type were more similar in the MI-stim group, whereas participants in the MI-sham group were faster at responding to elements in the repeated sequence. In line with the cognitive framework for sequential motor behavior, this finding supports the idea that in MI only the central processor is affected by the stimulation, whereby inhibition of the SMA interferes with the process of loading the motor chunks while execution of the motor chunks, once loaded, is unaffected. The similarity in RTs between the MI-stim and MI-sham groups, irrespective of sequence type, could therefore reflect a fundamental property of MI: encoding of the movement at the level of the effector does not occur during learning via MI. Admittedly, MI may begin to engage in the effector-dependent encoding of movement, i.e., loading the associated motor chunks from the central processor, but the execution processes are never initiated. Anatomically, this phenomenon may be explained by a lack of sensorimotor activity in preparation for MI performance, a lack of motor cortex activity in MI, or a change in the function of SMA activity such that it inhibits rather than excites the motor cortex (Kasess et al. 2007; Burianová et al. 2013; Hétu et al. 2013; Kraeutner et al. 2014; Hardwick et al. 2018; Solomon et al. 2019).

The combination of behavioral and neurophysiological findings gives credence to the notion that MI and PP both rely on the SMA to help encode effector-independent components of movement. When extrapolating from the present results to more complex movements, the function of the central processor in creating and loading motor chunks is akin to the process of developing hierarchal frameworks that represent complex movements practiced using MI (Frank et al. 2014; O’Shea and Moran 2017). In this context, creating motor chunks from individual stimulus-response associations in sequence learning is analogous to how associations of basic action concepts are grouped together during practice of a motor task to form a hierarchy that controls the practiced movement.

Conclusion

In the present study, we sought to investigate the involvement of the SMA in MI-based skill acquisition and to generate additional evidence related to the nature of MI-based learning. We first replicated previous findings supporting the involvement of the SMA in PP-based skill acquisition, as inhibition of the SMA impaired skill acquisition in the post-training test block. MI-based skill acquisition was also impaired following inhibition of the SMA, given a significantly reduced difference in RTs between sequence element types (random and repeated) in MI-stim relative to MI-sham. While contrary to our hypothesis, the results do suggest that MI-based learning relies more heavily on developing the effector-independent components of the movement representation and, in relation to the present work, specifically on the development of motor chunks, which appears to depend to some extent on a component of the SMA.