Introduction

Early in its development, transcranial direct current stimulation (tDCS) was theorized to modulate neuronal firing patterns in a polarity-specific manner through the passing of a small electric current between two electrodes (an anode and cathode) placed on the scalp (Nitsche and Paulus 2000, 2001). Although the precise mechanisms by which this modulation may occur remain un-elucidated, it has been suggested the electric current generated by tDCS may serve to directly hypo-polarize axonal membrane potentials under the anode and hyper-polarize axonal membrane potentials under the cathode during stimulation (Stagg and Nitsche 2011). Furthermore, following long-duration stimulation (>7 min), lasting cortical excitability changes have been argued to occur due to synaptic strengthening under the anode and synaptic weakening under the cathode (Stagg and Nitsche 2011).

Recent evidence, however, suggests a more complex picture (Pelletier and Cicchetti 2015). More specifically, it has been demonstrated that different tDCS parameters may lead to different neuronal influences. For instance, increasing the duration of anodal stimulation has been demonstrated to lead to neuronal hyper-polarization (Monte-Silva et al. 2013), while increasing the intensity of cathodal stimulation has been demonstrated to lead to neuronal hypo-polarization (Batsikadze et al. 2013); that is, a reversal of the polarity-specific pattern described above. Furthermore, a number of non-experimental factors have recently been demonstrated to influence response to neuromodulatory protocols. For instance, age (Fujiyama et al. 2014), gender (Chaieb et al. 2008; Russell et al. 2014) and menstruation cycle (de Tommaso et al. 2014) have each demonstrated significant effects on cortical excitability and tDCS response. Accordingly, there is increasing need to better understand the specificity and predictability of neuromodulation using typical tDCS protocols.

Recent reviews of the neurophysiologic efficacy of tDCS (Stagg and Nitsche 2011; Horvath et al. 2015a; Parkin et al. 2015) suggest that the amplitude of the motor evoked potential (MEP) engendered by transcranial magnetic stimulation (TMS) is the neurophysiologic outcome measure most susceptible to, and most predictably modulated by, tDCS. Put simply, a sufficiently strong TMS pulse to the M1 representation of a particular muscle can cause that muscle to activate, generating a measurable electrical potential termed an MEP. As TMS is commonly thought to trans-synaptically activate pyramidal neurons, the amplitude of a TMS-generated MEP is believed to reflect the excitability of corticocortical and spinal motoneuronal circuits (Rotenberg et al. 2014). As tDCS is believed to modulate cortical excitability, this modulation should be reflected in a similar modulation of MEP amplitudes. In fact, a significant anode excite/cathode inhibit (Ae/Ci) pattern on MEP amplitude has been demonstrated at the group level (Nitsche and Paulus 2000; Monte-Silva et al. 2009; Jacobson et al. 2012).

Although the Ae/Ci pattern of tDCS on MEP amplitude has been demonstrated at the group level, questions remain about this pattern at the individual level. Although several studies have reported individual data (e.g., Nitsche and Paulus 2001; Roche et al. 2011; Tremblay et al. 2013), only four studies have explicitly examined and analyzed individual response to tDCS using MEP amplitude modulation as the outcome measure (López-Alonso et al. 2014; Wiethoff et al. 2014; Chew et al. 2015; López-Alonso et al. 2015). Across these studies, the reported proportion of individuals demonstrating a significant change in MEP amplitude following stimulation ranges from 20 to 55 %.

These studies are each important as they bring to light the variability in individual response patterns following tDCS. However, because none utilized a control/sham condition, it remains unclear whether the measured responses reflect variable effects of tDCS or fall within the range of typical neural fluctuations (as influenced by placebo/sham stimulation). Because MEP-to-MEP amplitudes are known to demonstrate large inter-pulse variability within individuals in the absence of an intercessory protocol (see: Amassian et al. 1989; Ellaway et al. 1998; Kiers et al. 1993; Nielsen 1996; Rosler et al. 2008; Roy Choudhury et al. 2011), these results may have been influenced by random fluctuations in neuronal excitability at the individual level throughout the testing period. A related question concerns whether an individual who demonstrates an effect following a single session of tDCS will consistently demonstrate the same effect following subsequent, identical protocols. To date, only two studies have directly explored the issue of intra-subject reliability (Chew et al. 2015; López-Alonso et al. 2015). Chew et al. (2015) explored MEP amplitude modulation reliability in 29 individuals following 2 different sessions of anodal tDCS separated by >1 week. These authors reported that ~30 % of participants demonstrated a reliable response following both stimulation sessions. López-Alonso et al. (2015) explored MEP amplitude reliability in 45 individuals following 2 different sessions of anodal tDCS separated by 6–12 months. These authors reported that ~55 % of participants demonstrated a reliable response in the 60 min following both stimulation sessions. However, again, as these studies did not include a sham/control condition, it remains unclear whether the measured responses reflect variable response patterns to tDCS or fall within the range of typical neural fluctuations. In addition, these studies did not include a cathodal condition. It therefore remains an open question whether cathodal (as opposed to anodal) stimulation is reliable at the individual level.

The aims of this investigation are twofold: first is to determine whether or not tDCS (using an often utilized set of parameters) generates reliable effects within individuals; second is to establish whether or not tDCS generates a significant impact on MEP amplitude at the group level (as compared to sham stimulation). A longitudinal, neuronavigated TMS study looking at the effects of tDCS on MEP amplitude modulation over repeated sessions was performed. Participants received three sessions each of anodal, cathodal and sham stimulation over the primary motor cortex with a contralateral orbit reference (a total of 9 sessions; minimum 48 h delay between sessions). MEPs were collected prior to, immediately following and at 5 min intervals for 30 min following stimulation. In addition, a series of non-experimental variables were collected (energy level, sleep patterns, etc.).

Materials and methods

Participants

A total of 14 individuals (7 male/7 female) between the ages of 18 and 31 (M = 22.58, SD = 4.15) participated in this study. Each was right handed (self-reported), had no personal or family history of epilepsy or neurological/neuropsychological illness, no medical implants or cranial surgery, and no actual or possible pregnancy. None was on any medication regimen and all were non-smokers (both self-reported). All subjects gave informed consent before participating in this experiment, which was approved by the local ethics committee (St. Vincent’s Hospital: Melbourne) and conformed to the standards set by the Declaration of Helsinki.

tDCS

tDCS was delivered through two saline-soaked sponges (35 cm²) using a battery-driven constant-current stimulator (Chattanooga Ionto 2). As the device was not commercially designed for tDCS purposes, we had it independently assessed by an electrical biomedical engineer with expertise in intracranial stimulation in the context of epilepsy seizure management. The device was confirmed to generate and maintain a constant current with a compliance of 56 V @ 1 mA, which falls well within the compliance voltage required for tDCS at that power (Hahn et al. 2013). Electrodes were held in place using flexible rubber straps adjustable via non-conductive clips. In each condition, the target electrode was centered over the M1 cortical representation of the right first dorsal interosseous (FDI) muscle. This location was determined using TMS and maintained using neuronavigation (see below for details). The reference electrode was fixed at the forehead over the right eyebrow. In each condition, current intensity was 1 mA and applied for 10 min with approximately 30-s ramp-up and ramp-down times. For sham stimulation, current flow was increased gradually (to 1 mA) then decreased gradually (to 0) over the course of approximately 30 s to mimic the sensation of stimulation. The 1-mA current intensity (0.02857 mA/cm² current density) with an M1/orbito-frontal montage was chosen as it is the most replicated parameter set used to date to explore the impact of tDCS on MEP amplitude modulation (Horvath et al. 2015a). Similarly, the post-stimulation impact of tDCS on cortical excitability has been shown to require >7 min of stimulation to manifest (Nitsche and Paulus 2001). Accordingly, we chose a 10 min duration to ensure post-stimulation excitability changes.
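The current density quoted above follows directly from the stimulation intensity and the electrode area. As a quick arithmetic check (a minimal sketch; the variable names are illustrative and not part of the original protocol):

```python
# Current density = current / electrode area, using the values given in the text.
current_ma = 1.0            # stimulation intensity (mA)
electrode_area_cm2 = 35.0   # sponge electrode area (cm^2)

current_density = current_ma / electrode_area_cm2  # mA/cm^2
print(f"{current_density:.5f} mA/cm^2")
```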

TMS and neuronavigation

TMS was delivered through a 70-mm figure-of-eight coil connected to a Magstim BiStim stimulator (Carmarthenshire, UK) with the second charging unit deactivated and a monophasic current flow. The center of the coil was placed flat against the scalp of the participant with the handle angled at 45° from the midline toward the occipital pole. The optimal site of stimulation was defined as the coil location that elicited the largest motor evoked potentials (MEPs) in the right FDI muscle. To ensure consistent TMS coil position and tDCS electrode placement, neuronavigation was performed using a Brainsight II system (Montreal, QC). Electromyography (EMG) was recorded through two self-adhesive electrodes positioned over the muscle body of the right FDI and the first phalanx of the right index finger using the same Brainsight II system. A self-adhesive ground electrode was positioned on the underside of the right forearm. The EMG signal was sampled at 3 kHz, band-pass-filtered (5–500 Hz: low pass 6th order [typ. 120 dB/decade]; high pass 3rd order [typ. 60 dB/decade]) and digitized; all signal processing was performed using Brainsight v2.2.9 software.

The motor ‘hot spot’ for the right FDI was determined anew for each session. This was done for two reasons: first, to account for any inter-session drift in the registration of either the participant or the TMS coil to the neuronavigation system, and second, to account for any subtle hot spot shift or motor cortical reorganization over the lengthy study duration (~6–9 weeks—see: Karni et al. 1998; Wolf et al. 2004; Plowman-Prine et al. 2008). Hot spot determination was initially guided by relative anatomy, while tracking the TMS coil to determine an area of strong MEP elicitation. This area was marked and a 3 × 8 circular grid was built around this location using the Brainsight system. Initially, two pulses were delivered to each location on the grid to determine which area generated the largest response (typically 2–4 neighboring grid points). Next, 10 pulses were delivered at each of the grid points within the defined area. The grid point to demonstrate at least 50 % maximal MEP response was chosen and set as the target (if more than 1 grid point demonstrated reliably large MEP amplitudes, the spot with the larger average MEP amplitude was chosen).

The resting motor threshold (rMT) value was determined anew for each session as well (see Supplemental Material Table S3 for session-by-session rMT values). This value was determined by applying an unbiased maximum-likelihood strategy using the TMS Motor Threshold Assessment Tool 2.0 (Knoxville, USA). TMS pulse intensity for the remainder of the session was set to 130 % of the individual’s rMT (unique to each session). For additional discussion of why this power was chosen, see ‘discussion’ section.

Experimental protocol

Each participant underwent 9 recording sessions of ~75 min each separated by a minimum of 48 h (range = 48–214 h; M = 135.85, SD = 47.71: Supplemental Material Table S2: analysis demonstrated no significant correlation between inter-session interval and outcome: Supplemental Material). Nine study sessions consisted of anodal (A1, A2, and A3), cathodal (C1, C2, and C3), and sham (S1, S2, and S3) stimulations, respectively. Session order was determined by first generating 14 random sequences that were then randomly selected for each participant such that each participant was blind to the condition. Although the TMS practitioner was also blinded to the stimulation sequence, after 30 sessions he was tested and able to guess stimulation type with 100 % accuracy; accordingly, this can only be considered a single-blind study.
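One way such a randomization could be implemented is sketched below. This is an illustration only: the source does not describe the software or procedure used, so the function names, the participant labels (`P01`, ...) and the seed are all hypothetical. Sessions of the same polarity are numbered in the order in which they occur (A1 before A2, and so on).

```python
import random

POLARITIES = ["A", "C", "S"]   # anodal, cathodal, sham
SESSIONS_PER_POLARITY = 3

def random_sequence(rng):
    """Return one random ordering of the nine sessions, e.g. ['S1', 'A1', 'C1', ...]."""
    order = POLARITIES * SESSIONS_PER_POLARITY
    rng.shuffle(order)
    counts = {p: 0 for p in POLARITIES}
    labelled = []
    for polarity in order:
        counts[polarity] += 1
        labelled.append(f"{polarity}{counts[polarity]}")
    return labelled

def assign_sequences(n_participants=14, seed=None):
    """Generate one sequence per participant, then allocate them at random."""
    rng = random.Random(seed)
    sequences = [random_sequence(rng) for _ in range(n_participants)]
    rng.shuffle(sequences)  # random allocation of sequences to participants
    return {f"P{i + 1:02d}": seq for i, seq in enumerate(sequences)}

allocation = assign_sequences(seed=0)  # hypothetical seed, for reproducibility only
for participant, sequence in allocation.items():
    print(participant, " ".join(sequence))
```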

At the start of each session, participants were asked to complete a study questionnaire asking how many hours of sleep they had the previous night, quality of sleep (1–10 scale, 1 = horrible), time since last meal, time since last caffeinated beverage, self-reported energy levels (1–10, 1 = none), level of satiety (hunger/thirst: 1–10 scale, 1 = not at all), and days since commencement of last period (females only).

Subsequently, EMG electrodes were applied and the participant head position was co-registered to the Neuronavigation system. The motor hotspot and rMT were determined as described above (unique to each session). The TMS device was then set to 130 % rMT and each participant received 15 single TMS pulses to the motor hotspot separated randomly by 6–10 s (baseline). Following this, tDCS was administered for 10 min. An additional 15 single TMS pulses were then delivered to the motor hotspot immediately post-tDCS, and at regular 5 min intervals for a total of 30 min (T0–T30). Any EMG trace showing a peak-to-peak amplitude level >9 µV in the 50 ms preceding the TMS pulse was discarded and not utilized in any analyses (total drop rate = 3.75 % or ~4.5 MEPs per session).
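The pre-stimulus rejection rule described above can be sketched as follows. The threshold, window and sampling rate are taken from the text; the function name, the array layout and the synthetic traces are assumptions made for illustration and do not reproduce the Brainsight processing pipeline.

```python
import numpy as np

FS_HZ = 3000         # EMG sampling rate reported in the text
WINDOW_MS = 50       # pre-stimulus window (ms)
THRESHOLD_UV = 9.0   # peak-to-peak rejection threshold (microvolts)

def is_contaminated(trace_uv, pulse_index, fs_hz=FS_HZ,
                    window_ms=WINDOW_MS, threshold_uv=THRESHOLD_UV):
    """True if pre-stimulus EMG peak-to-peak amplitude exceeds the threshold."""
    n_samples = int(round(fs_hz * window_ms / 1000))
    if pulse_index < n_samples:
        raise ValueError("trace does not contain a full pre-stimulus window")
    pre = np.asarray(trace_uv, dtype=float)[pulse_index - n_samples:pulse_index]
    return float(pre.max() - pre.min()) > threshold_uv

# Synthetic traces (hypothetical data): the TMS pulse occurs at sample 300.
rng = np.random.default_rng(0)
pulse_index = 300
quiet = rng.uniform(-2.0, 2.0, size=600)   # resting EMG, well under threshold
active = quiet.copy()
active[200:250] += 20.0                    # muscle activity inside the window

print(is_contaminated(quiet, pulse_index), is_contaminated(active, pulse_index))

# Pulses per session: baseline plus T0-T30 in 5-min steps, 15 pulses each.
pulses_per_session = 15 * (1 + 7)
expected_dropped = 0.0375 * pulses_per_session
```

With eight batches of 15 pulses, a session contains 120 pulses, so a 3.75 % drop rate corresponds to the ~4.5 discarded MEPs per session reported above.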

Data analysis

The peak-to-peak amplitude of each MEP was normalized to the average amplitude of the baseline MEPs (including the 15 MEPs generated during the baseline measure). The average change from baseline over each 30-min session was used to perform an intra-class correlation [ICC(2,1)] to assess the absolute agreement between the three sessions of each polarity. Typically, an ICC(2,1) of <0.3 indicates poor agreement, while an ICC(2,1) of >0.75 indicates good agreement and strong reliability (Fleiss 1986).
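The normalization and agreement analysis can be sketched as below. This assumes the conventional two-way random-effects, absolute-agreement, single-measure form of ICC(2,1) (the Shrout and Fleiss definition); the source does not state which software computed the statistic. The function names and the example data are hypothetical.

```python
import numpy as np

def percent_of_baseline(baseline_meps, post_meps):
    """Mean post-stimulation MEP amplitude as a percentage of the baseline mean."""
    return 100.0 * np.mean(post_meps) / np.mean(baseline_meps)

def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    scores: array of shape (n_participants, k_sessions).
    """
    x = np.asarray(scores, dtype=float)
    n, k = x.shape
    grand = x.mean()
    ss_rows = k * np.sum((x.mean(axis=1) - grand) ** 2)   # between participants
    ss_cols = n * np.sum((x.mean(axis=0) - grand) ** 2)   # between sessions
    ss_error = np.sum((x - grand) ** 2) - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )

# Hypothetical data: 14 participants x 3 sessions of percent-of-baseline values.
rng = np.random.default_rng(1)
session_means = rng.normal(loc=115.0, scale=30.0, size=(14, 3))
print(f"ICC(2,1) = {icc_2_1(session_means):.3f}")
```

Because absolute agreement is assessed, systematic shifts between sessions (the session mean square) lower the coefficient, not just inconsistent ranking of participants.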

A group analysis exploring the average effect of tDCS was conducted by pooling all anodal, cathodal and sham stimulation sessions and comparing them using a two-way ANOVA (session × time).

Neuronavigation accuracy

The three-dimensional x-, y- and z- coordinates for each of the 120 pulses delivered to each participant during each testing session were recorded (in mm) and the within-session variance of each coordinate determined. Variances were then averaged between all sessions (126 total). Average and standard deviation of pulse coordinate variance was: x = 0.44 mm (0.42); y = 0.04 mm (0.05); z = 0.90 mm (0.69). This means that, on average, pulse variance was less than 1 mm in any of the coordinates across three-dimensional space.
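A minimal sketch of this accuracy summary is given below. It assumes each session is stored as a 120 × 3 array of coil coordinates in mm and that the sample variance (`ddof=1`) was used; neither detail is stated in the source, and the function names and simulated data are hypothetical.

```python
import numpy as np

def within_session_variance(coords_mm):
    """Per-axis (x, y, z) sample variance of pulse coordinates in one session."""
    return np.asarray(coords_mm, dtype=float).var(axis=0, ddof=1)

def summarize_sessions(sessions):
    """Mean and SD, across sessions, of the within-session coordinate variances."""
    variances = np.array([within_session_variance(s) for s in sessions])
    return variances.mean(axis=0), variances.std(axis=0, ddof=1)

# Hypothetical data: 126 sessions of 120 pulses with sub-millimetre jitter.
rng = np.random.default_rng(2)
sessions = [rng.normal(0.0, 0.5, size=(120, 3)) for _ in range(126)]
mean_var, sd_var = summarize_sessions(sessions)
print(mean_var, sd_var)
```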

Results

Session variables

With regard to rMT values, a repeated measures ANOVA nested according to polarity (anode, cathode, sham) and participant revealed a significant difference in rMT values between participants [F(13,26) = 47.23, p < 0.001], but no significant difference between polarities [F(2,26) = 0.971, p = 0.391] or sessions [F(2,26) = 0.660, p = 0.523; anode: M = 37.40, SD = 5.92; cathode: M = 38.12, SD = 5.83; sham: M = 37.92, SD = 5.28; for absolute rMT values for each participant across each session, see Supplemental Material Table S3]. Additional non-experimental baseline variables are presented in Table 1 (Pocock et al. 2002; for additional individual measures, see Supplemental Material Table S2).

Table 1 Median and inter-quartile range for varied non-experimental variables collected at baseline

Anode

Individuals demonstrated large variability between anodal testing sessions (Fig. 1; Table 2). Absolute baseline values demonstrated moderate agreement between sessions [ICC(2,1) = 0.363]. Using the ±20 % change-from-normalized-baseline value to classify response patterns (Chew et al. 2015), zero participants demonstrated an excitatory or inhibitory response to all three sessions and one demonstrated ‘no response’ following all three sessions. Intra-class correlation revealed poor agreement between the outcomes of the three anodal sessions within individuals [ICC(2,1) = 0.062: Fig. 2].
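The ±20 % classification rule can be expressed as a short sketch. The criterion comes from the text; the function names and the decision to treat exactly ±20 % as a response are assumptions made for illustration.

```python
def classify_session(percent_of_baseline, criterion=20.0):
    """Label one session from its average percent-of-baseline value (100 = baseline)."""
    if percent_of_baseline >= 100.0 + criterion:
        return "excitatory"
    if percent_of_baseline <= 100.0 - criterion:
        return "inhibitory"
    return "no response"

def consistent_response(session_values):
    """Return the shared label if all sessions agree, otherwise None."""
    labels = {classify_session(value) for value in session_values}
    return labels.pop() if len(labels) == 1 else None

# Hypothetical participant: three sessions of the same polarity.
print(consistent_response([135.0, 92.0, 71.0]))   # sessions disagree
print(consistent_response([104.0, 97.0, 111.0]))  # all within +/-20 %
```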

Fig. 1 Temporal dynamics of each individual’s MEP amplitude patterns following the three unique anodal stimulation sessions. Markers represent the average and variance of the 15 TMS pulses generated at each time point (error bars ±1 SD)

Table 2 Average percent-of-baseline values during the 30 min following stimulation in each of the three anodal sessions (100 = baseline)

Fig. 2 Average MEP amplitude change-from-baseline over the 30-min period following each anodal stimulation session. Each line represents a different participant. ICC analysis suggests poor agreement between sessions

Cathode

Individuals demonstrated large variability between cathodal testing sessions (Fig. 3; Table 3). Absolute baseline values demonstrated moderate agreement between sessions [ICC(2,1) = 0.600]. Using the ±20 % change-from-normalized-baseline value to classify response patterns (Chew et al. 2015), two participants demonstrated an excitatory response to all three sessions, two demonstrated ‘no response’ to all three sessions and zero demonstrated inhibition to all three sessions. Intra-class correlation revealed poor agreement between the outcomes of the three cathodal sessions within individuals [ICC(2,1) = 0.055: Fig. 4].

Fig. 3 Temporal dynamics of each individual’s MEP amplitude patterns following the three unique cathodal stimulation sessions. Markers represent the average and variance of the 15 TMS pulses generated at each time point (error bars ±1 SD)

Table 3 Average percent-of-baseline values during the 30 min following stimulation in each of the three cathodal sessions (100 = baseline)

Fig. 4 Average MEP amplitude change-from-baseline over the 30-min period following each cathodal stimulation session. Each line represents a different participant. ICC analysis suggests poor agreement between sessions

Sham

Individuals demonstrated large variability between sham testing sessions (Fig. 5; Table 4). Absolute baseline values demonstrated moderate agreement between sessions [ICC(2,1) = 0.395]. Using the ±20 % change-from-normalized-baseline value to classify response patterns (Chew et al. 2015), two participants demonstrated an excitatory response to all three sessions, one demonstrated ‘no response’ to all three sessions and one demonstrated inhibition to all three sessions. Intra-class correlation revealed moderate agreement between the outcomes of the three sham sessions within individuals [ICC(2,1) = 0.433: Fig. 6].

Fig. 5 Temporal dynamics of each individual’s MEP amplitude patterns following the three unique sham stimulation sessions. Markers represent the average and variance of the 15 TMS pulses generated at each time point (error bars ±1 SD)

Table 4 Average percent-of-baseline values during the 30 min following stimulation in each of the three sham sessions (100 = baseline)

Fig. 6 Average MEP amplitude change-from-baseline over the 30-min period following each sham stimulation session. Each line represents a different participant. ICC analysis suggests moderate agreement between sessions

Inter-subject stimulation condition comparison

Absolute baseline values across all conditions demonstrated moderate agreement between sessions [ICC(2,1) = 0.738]. A repeated measures ANOVA nested according to polarity (anode, cathode, sham), session and participant revealed a significant difference in MEP values between participants [F(13,26) = 4.730, p < 0.001] and between time points within each session [F(6,78) = 3.84, p = 0.002], such that individuals demonstrated different MEP response patterns and all groups demonstrated a significant increase in MEP amplitude as compared to baseline over the 30 min following stimulation (Fig. 7). This analysis revealed no significant difference between polarities [F(2,26) = 0.681, p = 0.513] or sessions [F(2,26) = 0.422, p = 0.660]. Cohen’s d effect sizes for each comparison are anode versus sham: d = −0.034; cathode versus sham: d = 0.102; anode versus cathode: d = −0.147.
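For readers wishing to compute comparable effect sizes, a minimal sketch of Cohen’s d is given below. It assumes the common pooled-standard-deviation form for two groups; the source does not state which variant was used, and the function name and example values are hypothetical.

```python
import math
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Cohen's d for two samples, using the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)

# Hypothetical percent-of-baseline values for two stimulation conditions.
anode = [112.0, 98.0, 131.0, 120.0, 105.0]
sham = [118.0, 101.0, 125.0, 122.0, 109.0]
print(f"d = {cohens_d(anode, sham):.3f}")
```

A negative d here indicates that the first group’s mean is lower than the second’s, which matches the sign convention of the values reported above.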

Fig. 7 Group average of each stimulation condition (error bars ±1 SD)

Impact of menstrual cycle

It has been reported in the literature that menstruation cycle may impact cortical excitability (Smith et al. 1999) and response to neuromodulation (de Tommaso et al. 2014). We found no significant correlation or impact of menstrual cycle on tDCS response patterns, though this was not an explicit aim of this study and the sample of female participants (n = 7) is likely too small to provide adequate power to investigate such an effect (for details and statistics, please see Supplemental Material Table S1).

Discussion

In this experiment, we explored two questions: whether the effects of a commonly utilized set of tDCS parameters on cortical excitability are reliable at the individual level, and whether they are significant at the group level. TMS-evoked MEP amplitudes were measured in 14 individuals (7m/7f) prior to and for 30 min following 9 different sessions of tDCS (3 anodal, 3 cathodal, and 3 sham; each session separated by at least 48 h; average ~5.5 days).

Reliability

In this study, participants demonstrated large variability between the different testing sessions of like stimulation conditions. Contrary to the common Ae/Ci pattern often reported in tDCS MEP studies, no participant demonstrated a ≥20 % excitatory response following all three sessions of anodal stimulation and no participant demonstrated a ≥20 % inhibitory response following all three sessions of cathodal stimulation. Of the 14 participants, 9 demonstrated both enhancement and inhibition following different sessions of anodal stimulation, 10 demonstrated both enhancement and inhibition following different sessions of cathodal stimulation and 9 demonstrated both enhancement and inhibition following different sessions of sham stimulation. Furthermore, ICC(2,1) values suggested poor reliability between sessions of anodal and cathodal stimulation and moderate reliability between sessions of sham within individuals.

Although these findings are in line with Chew et al. (2015), who reported low test–retest reliability between two sessions of anodal tDCS on MEP amplitude, they differ from López-Alonso et al. (2015), who reported moderate test–retest reliability between two sessions of anodal tDCS. A possible explanation for this discrepancy concerns the number of testing sessions: Whereas we compared three different testing sessions, López-Alonso et al. compared two. The inclusion of an additional session may have allowed more opportunity for any sources of intra-individual variability to impact outcome. Another possible explanation for this discrepancy could be in different localization protocols: Whereas we utilized neuronavigation, López-Alonso et al. did not. As subtle deviations in TMS coil position and orientation have been shown to impact MEP amplitudes (Guggisberg et al. 2001; Julkunen et al. 2009; Bashir et al. 2013), it is possible that use of a stereotactic system differentially impacted the results from these studies. A final possible explanation concerns the time between testing sessions: Whereas the average duration between sessions in our experiment was ~5.5 days, López-Alonso et al. utilized a 6–12 month duration. Although there is no evidence of tDCS carryover effects for longer than 6 h (see: Nitsche and Paulus 2001; Nitsche et al. 2003; Monte-Silva et al. 2010; Kuo et al. 2013), it is possible our relatively shorter inter-session intervals led to interaction effects between stimulation sessions, thereby impacting our results in a manner that led to lower reliability.

Group level findings

At a group level, averaging of the normalized response patterns from all participants (the typical analytic technique utilized for this type of data) demonstrated no significant effect of anodal or cathodal stimulation as compared to sham or to each other. Although this finding may appear at odds with the literature, over 80 % of the reported studies that have explored the effect of tDCS on MEP amplitude modulation did not include a control/sham condition (see: Horvath et al. 2015a). Accordingly, although many papers have reported a significant effect of stimulation, the absence of a sham condition makes their results harder to interpret and not directly comparable with the current study (for review: Horvath et al. 2014). Of the 9 previous papers that did include a sham condition when measuring MEP amplitude following long duration (>7 min) 1 mA tDCS, none measured this outcome at 5-min increments over the entire 30-min period following stimulation: 5 measured only immediately following stimulation (Lang et al. 2004a, b; Siebner et al. 2004; Quartarone et al. 2005; Batsikadze et al. 2013; Simis et al. 2013), 1 measured immediately following, 5, and 10 min post-stimulation (Power et al. 2006), 1 measured immediately following and 10 min post-stimulation (Schabrun et al. 2013), 1 measured immediately following, 10, and 30 min post-stimulation (Suzuki et al. 2012) and 1 measured only 30 min post-stimulation (Bradnam et al. 2011). Accordingly, while our results may appear at odds with commonly held assumptions of tDCS effect, they do not contradict the existing literature. It is important to note that, across these 9 papers, the change in MEP amplitude following sham ranged between a decrease of 11 % (Quartarone et al. 2005) and an increase of 5 % (Suzuki et al. 2012); none demonstrated the ~15 % increase following sham that we found. Below, we discuss this finding in more detail in the context of non-experimental influences on outcome.

Interestingly, Labruna et al. (2016) recently reported that individual sensitivity to TMS demonstrated a significant correlation with anodal (but not cathodal) tDCS response patterns. We were unable to corroborate this finding, as our analysis demonstrated no significant intra-individual changes in rMT value (suggesting no TMS sensitivity shifts within individuals between sessions) yet highly variable intra-individual tDCS response patterns. Reasons for the different findings between these studies likely involve methodology. More specifically, Labruna et al. utilized the TMS output power required to elicit a 1 mV MEP as their proxy for TMS sensitivity. As we did not utilize a similar procedure, we were forced to utilize a different proxy (namely, rMT). It is possible that these differences in sensitivity measure can account for the discrepant findings. In addition, Labruna et al. included single-session data from different individuals (rather than multiple sessions, as included here). The utilization of single-session data does not allow for an exploration of the reliability of the influence of TMS sensitivity on individual response patterns, an influence our data suggest may not be highly reliable.

It is important to note that each condition in our study displayed a ~15 % increase in MEP amplitude over the 30 min following baseline measure (regardless of intervening stimulation). This could be due to any number of things: intra-session adaptation, an accumulation effect of single-pulse TMS, increasing fatigue or satiation, mood, previous experience, etc. In addition, participants were allowed to talk and make minor movements between MEP batches. It is conceivable that this led to increased motor cortical excitability over the course of the session (Kasai et al. 1997; Stinear and Byblow 2003; Royal et al. 2015). Although we are unable to clarify the source of this MEP amplitude increase over time, our results suggest that it is not due to tDCS (as the same pattern was seen regardless of stimulation condition). Our findings are, however, consistent with other studies mentioned previously that report changes in MEP amplitude over time following sham stimulation (ranging from a decrease of 11 % to an increase of 5 %). This finding, along with the previous literature, raises interesting questions concerning the utilization of MEP amplitude as an outcome measure for tDCS effects as both tools demonstrate strong intra-subject variability (see discussions above). If future studies show similar degrees of variability during sham conditions, this might suggest MEP amplitudes are not the best measure to use if the goal is to understand the true variability of tDCS stimulation itself.

Methodological considerations, limitations and future directions

A difference between this experiment and several others was our choice to utilize 130 % rMT as our TMS pulse intensity. This was chosen as it was similar to that utilized in many tDCS/MEP studies to date (e.g., Ardolino et al. 2005; Di Lazzaro et al. 2012; Lang et al. 2004b; Madhavan and Stinear 2010; Pellicciari et al. 2013; Scelzo et al. 2011; Suzuki et al. 2012; Teo et al. 2014). Practitioners often utilize this intensity as both human and animal data suggest the use of a TMS pulse of at least 130 % rMT both enhances the reliability of MEP measurements (Brasil-Neto et al. 1992) and increases the number of corticospinal volleys reflective of cortical excitability (Amassian and Cracco 1987; Di Lazzaro et al. 2004; Ziemann and Rothwell 2000). However, many other tDCS studies utilize a TMS intensity that engenders a baseline of ~1 mV. As the 1 mV baseline leads to TMS intensities independent of threshold values (e.g., in Wiethoff et al. 2014, obtaining 1-mV baseline MEPs in the anodal condition required an average intensity of ~130 % rMT, while obtaining 1-mV baseline MEPs in the cathodal condition required an average intensity of ~144 % rMT), it is possible our findings were influenced by the reference of our pulse intensity to rMT. More specifically, it is possible the subtle MEP increase seen in each condition at the group level may have been driven by our utilization of 130 % rMT. Further research clarifying the impact of threshold-based TMS intensity versus an average MEP amplitude-based intensity may be important. In addition, further research into this area may consider utilizing the non-MT-based protocol (e.g., TMS power that engenders ~1 mV) as this might offer more reliable outcomes than the oft-utilized percentage-of-MT protocol explored here.

Similarly, a difference between this study and several others in this field concerns the use of a variable baseline versus a static baseline. As can be inferred, when basing TMS intensity on rMT, the baseline MEP amplitude values will naturally be different within individuals over different sessions and between individuals, as was reflected in the moderate ICC scores for baseline conditions reported above. Conversely, when basing TMS intensity on baseline MEP amplitude, there is far less variability at baseline—though, this variability is shifted to the TMS stimulation intensity (see Wiethoff et al. 2014). The practice of adjusting the TMS intensity to standardize baseline values is often done to minimize the impact of non-experimental variables (e.g., energy, hunger, thirst). However, these variables were randomly distributed and not noticeably different between sessions within this study. Furthermore, we were able to account for variable baselines at the individual level by normalizing data and calculating a change from baseline, a practice long utilized in the field of noninvasive stimulation (Wassermann et al. 2008; Rotenberg et al. 2014). It is possible that these methodological decisions influenced our outcome and masked or reduced any possible tDCS effect. It may be important in future research to elucidate the impact of allowing for variable baseline values as compared to standardized baseline values in this field.

An important limitation involves our use of 15 MEPs per batch to establish our average MEP values. Again, although using 15 or fewer MEPs to measure MEP behavior is not uncommon in this field (e.g., Di Lazzaro et al. 2012; Madhavan and Stinear 2010; Miyaguchi et al. 2013; Nitsche et al. 2007; Power et al. 2006; Quartarone et al. 2004; Scelzo et al. 2011) and this number has been demonstrated to elicit highly reliable response patterns (Bastani and Jaberzadeh 2012), many studies utilize 20 or more MEPs per batch. Furthermore, a recent article suggests that, although 15 pulses per batch are sufficient for internal consistency at an α > 0.95, more than 20 pulses per batch are required for absolute consistency at α > 0.99 (Chang et al. 2016). It is therefore possible that collection of a greater number of MEPs per batch may have reduced the variability seen in our data. With respect to the baseline value, it is possible that 15 MEPs were not enough to establish a reliable baseline value for each session. However, it is worth noting that the baseline values obtained in this study appear to be neither systematically underestimated nor overestimated, given that 83 of our 126 sessions include MEP batches which demonstrate both enhancement and impairment compared to baseline in the 30 min following baseline determination. Nonetheless, in future experiments, it would be worthwhile collecting a larger number of MEPs per batch to directly assess the extent to which MEP number contributes to the observed variability of tDCS effects on MEP amplitude.
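The check described above (whether a session contains post-baseline batches both above and below baseline) can be sketched as follows. This is a hedged illustration, assuming batch means have already been normalized so that baseline equals 1.0; the function name `is_bidirectional`, the session labels and the values are hypothetical and do not reproduce the data behind the 83-of-126 figure.

```python
def is_bidirectional(normalized_batches, baseline=1.0):
    """Return True if at least one batch lies above baseline and one below it."""
    has_increase = any(value > baseline for value in normalized_batches)
    has_decrease = any(value < baseline for value in normalized_batches)
    return has_increase and has_decrease


# Hypothetical normalized batch means per session (not data from this study).
sessions = {
    "session_a": [1.2, 0.8, 1.1],  # batches on both sides of baseline
    "session_b": [1.3, 1.1, 1.4],  # all batches above baseline
}

bidirectional_count = sum(is_bidirectional(v) for v in sessions.values())
```

A session in which every batch falls on one side of baseline would be consistent with either a genuine effect or a misestimated baseline, whereas a session with batches on both sides offers less reason to suspect a systematically biased baseline.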

Another consideration involves blinding. Whereas participants were blinded during each session, we felt it important to also blind the TMS practitioner. This study followed a double-blind protocol; however, after each session we asked the practitioner to guess what type of stimulation he believed had occurred. Unfortunately, after 30 sessions, the practitioner had been able to guess the condition with 100 % accuracy. When asked how, he explained it was a combination of hair ‘wetness’ (post-sham hair appears to maintain the moisture of the saline solution more than active stimulation), vasodilatation (skin redness appears to be greater during active stimulation as compared to sham, and greater under the anode as compared to the cathode) and participant behavior (participants are more likely to scratch electrode sites following active stimulation). Because of this accuracy, it is clear our experiment cannot be considered double-blind. This places our work in the majority, as over 85 % of the articles exploring the effects of tDCS on MEP amplitude are not double-blind (see: Horvath et al. 2014). It is possible that this influenced our results. Interestingly, the only study published to date to directly explore assessor blinding similarly reported blinding was ineffective (O’Connell et al. 2012). Determining how best to blind practitioners may be a very important consideration for future research (Horvath 2015).

Though we were unable to obtain reliable (at the individual level) or significant (at the group level) results, it is important to remember that there remain a number of important variables left unexplored by this study. For instance, in this study, we utilized a single current strength (1 mA), a single electrode montage (M1/OrbFro) and measured outcome for only 30 min post-stimulation. There is evidence that different current densities (Bastani and Jaberzadeh 2013) and electrode locations (Bikson et al. 2010) engender variable results. In addition, there is emerging evidence that some populations (e.g., elderly individuals) may exhibit delayed excitability alterations from tDCS (Fujiyama et al. 2014). Accordingly, it is wholly possible that a different combination of density, montage and post-stimulation measurement duration may yield significantly more reliable outcome patterns. In addition, novel forms of tDCS (e.g., HD-tDCS; Kuo et al. 2013) and individualized current-flow modeling paradigms (Datta et al. 2011) are emerging. Again, with the utilization of these technological advances, it is possible more exacting protocols can be devised which may increase reliability and outcome predictability. A final consideration involves sample size. It is possible that our inclusion of only 14 participants did not allow for sufficient power to obtain accurate reliability estimates. Future studies should include a larger number of participants in order to explore this question in more detail.

Finally, as noted above, a number of non-experimental factors have been linked to stimulation outcome. It is possible that strictly controlling for and manipulating these variables may impact the outcome of tDCS, thereby leading to more reliable and robust outcomes. In short, although striking, this study represents a single finding utilizing a single protocol with a tool that has a varied and wide parameter base. Accordingly, our results must be interpreted within this larger milieu and will hopefully encourage future studies to explore within-subject reliability and elucidate which combination of parameters/variables will ensure optimal reliability of tDCS effects.

Conclusion

The results of this study did not demonstrate a reliable effect of tDCS on MEP amplitude at the individual level. In addition, at the group level, our results did not demonstrate a significant effect on cortical excitability (as compared to the sham stimulation condition). When these results are combined with a recent systematic review undertaken by this group which suggests MEP amplitude modulation is the outcome most susceptible to tDCS modulation (Horvath et al. 2015a, b), several important questions are raised, including how best to standardize protocols to ensure a reliable response and how other neurotechnologies might be utilized to obtain the best results using tDCS.