Introduction

Low doses of psychostimulants, such as methylphenidate (MPH), are the most commonly used treatment for attention deficit hyperactivity disorder (ADHD). Methylphenidate blocks the dopamine (DA) transporter (DAT) and the norepinephrine transporter (NET) and thereby increases the extracellular levels of these neurotransmitters (Volkow et al. 2002). While low doses of MPH, equivalent to those used for the treatment of ADHD, improve the ability to focus attention (e.g., Gamo et al. 2010; Rajala et al. 2012; Solanto and Wender 1989), it has been suggested that they may do so at the expense of hindering other higher order functions, such as cognitive flexibility (CF) (Rajala et al. 2012; Dyme et al. 1982; Robbins and Sahakian 1979; Tannock and Schachar 1992), defined as the ability to adapt behavior in response to changing circumstances, including reward value (Cools et al. 2002; Haber and Knutson 2010; Monchi et al. 2001; Nakahara et al. 2002), for which the prefrontal cortex (PFC) is thought to play a central role. In humans, PFC lesions impair switching, rule application (Milner 1963), and behavioral adjustment based on feedback (Hornak et al. 2004). In nonhuman primates (NHPs), neural correlates of rules have been documented in the discharges of single PFC neurons (Asaad et al. 2000; Everling and DeSouza 2005; Wallis et al. 2001; White and Wise 1999), and experimental lesions of the PFC impair the performance of set-shifting tasks (Dias et al. 1996). Cognitive flexibility requires the ability to adapt behavior in response to reward feedback. Reward-based learning is thought to depend on prediction error signals (Dayan and Balleine 2001; Schultz et al. 1997) encoded by DA neurons in the ventral tegmental area (VTA) (Schultz et al. 1993, 1997), which project to the PFC (Bannon and Roth 1983; Glowinski et al. 1984; Williams and Goldman-Rakic 1993) and the striatum (Haber and Knutson 2010). 
The striatum is intricately connected with the PFC through multiple parallel channels (Parent and Hazrati 1995). Reward processing signals are also encoded by PFC neurons, albeit in a different form. Some PFC neurons discharge in the absence of expected reward (Niki and Watanabe 1979; Mansouri et al. 2006; Watanabe 1989), and some dorsolateral PFC (dlPFC) neurons encode correctness regardless of reward (Watanabe 1989). This variety of signals could encode different contexts and outcomes, consistent with the PFC’s attributed role underlying CF (Kennerley and Wallis 2009; Marcos et al. 2018). Thus, do therapeutically relevant doses of MPH hinder CF, and does this hypothesized effect result from changes in reward-outcome signals in the PFC? We addressed these questions using NHPs trained to perform a CF task (Everling and DeSouza 2005), and single-unit recordings from neurons in the PFC before and after the oral administration of different doses of MPH. The doses tested, 3 and 6 mg/kg, were used in a previous study that included three of the four subjects in this study and produced an inverted U-shaped pattern of performance of a working memory task that required a high level of attention (Fig. 1). In the working memory task (Rajala et al. 2012), subjects had to detect a brief 100 ms visual target and withhold a timely and accurate saccadic eye movement response to it for a variable delay period. The delays were randomly mixed throughout the session, resulting in increased difficulty and increased attentional demands (Fig. 1 inset). Consistent with our prediction (Rajala et al. 2012; Dyme et al. 1982; Robbins and Sahakian 1979; Tannock and Schachar 1992), the behavioral results of the present study show that MPH hinders task-switching performance at the 3 mg/kg dose that had been shown to improve performance in the attentionally demanding working memory task (Fig. 1) in the same subjects, and the physiological results reveal a neural correlate of the behavioral effect in the form of a reduction in the amplitude of an outcome signal found in the discharge of some PFC neurons.

Fig. 1

Adapted from the Journal of Cognitive Neuroscience (Rajala et al. 2012). Dose-response function illustrating the inverted U-shaped effect of MPH on working memory task performance. Control-normalized percent correct is plotted as a function of MPH dose. Data from three subjects (all included in the current study) were averaged; error bars represent standard error of the mean. Significance was determined using 95% confidence intervals, *p < 0.05 (two-tailed). The task consisted of the following: (1) Upon the start of a trial, a fixation light located straight ahead was turned on, and the subjects were expected to look at it within 750 ms and maintain fixation until it was extinguished. (2) During fixation, a 100-ms visual target was presented. (3) The time between the onset of the target and the offset of the fixation light, the delay period, was varied randomly from trial to trial (1–6 s and no delay). (4) The subjects were required to withhold their response to the target until the fixation light was turned off, which was their cue to respond. The attentional requirements of the experimental task are illustrated in the inset, showing percent correct as a function of the number of delays presented in a session. As the number of randomly intermixed delays increased, performance decreased, indicative of increased difficulty and attentional demands. The study consisted of a mixture of visually guided saccades and six delays, 1–6 s, randomized across trials within each session, thus requiring a high level of attention to be successful.

Methods and materials

Subjects and experimental details

Four adult male rhesus monkeys (Macaca mulatta) were implanted with a head-post, eye coils (Judge et al. 1980) to measure eye position (Robinson 1963), and a recording chamber (Rajala et al. 2013, 2018). Rhesus monkeys were selected because they are the animal preparation with the closest homolog to the human frontal brain available for studies of this type (Wise 2008). All surgical procedures were approved by the University of Wisconsin Animal Care and Use Committee (IACUC) and were in accordance with the National Institutes of Health Guide for the Care and Use of Laboratory Animals. Subjects were trained on the reward-based switching task (Fig. 2), adapted from Everling and DeSouza (2005), in which pro- and antisaccades (Hallett 1978) were presented in alternating blocks with a neutral cue (orange) for both tasks. Visual stimuli were presented with red/green/orange light emitting diodes (LEDs). Specifically, on each trial, an orange fixation LED located straight ahead was turned on, and the subjects were required to look at it and maintain fixation while it was lit (900–1050 ms). Coincident with the offset of the fixation LED, a target was presented randomly with equal probability at ± 10° from the fixation point in either the horizontal or vertical directions. The offset of the fixation LED and the appearance of the target signaled the subject to make a saccadic eye movement. To obtain a reward, subjects had to make the correct response, i.e., a prosaccade or an antisaccade depending on the rule for the particular block, with its end position falling within an acceptance window of ± 4° × 4°, within 700 ms after target presentation. Saccadic eye movements that were of the incorrect type, that fell outside the acceptance window, or that were started more than 700 ms after target onset were considered failures. Subjects were allowed to perform the task until sated. Trials of either type were presented in blocks of randomly varying length, 26–42 successful trials. 
The 26–42 trials within a given block were equally assigned to two target locations, left and right or up and down. Incorrect trials were repeated at the end of each block until all trials within a block were completed successfully. Once the prespecified number of successful trials was reached, the task requirements changed from, e.g., antisaccade to prosaccade or vice-versa. The subjects were required to detect the rule change and switch from one task to the other, i.e., to flexibly adapt their behavior in response to identical external stimuli based solely on reward feedback.
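The block structure described above (a prespecified number of successful trials, with failed trials repeated at the end of the block, followed by a rule reversal) can be sketched in Python. This is only an illustration and not the authors' control software: the function names, the target labels, the choice of "pro" as the first rule, and the fixed success probability standing in for the animal's behavior are all assumptions.

```python
import random
from collections import deque


def run_block(n_trials, respond, rng):
    """Run one block and return the number of trials presented (repeats included).

    n_trials: successful trials required to finish the block (26-42 in the text).
    respond:  hypothetical callback, respond(target) -> True if the trial was rewarded.
    """
    # Split trials between two target locations; for odd n_trials the split
    # differs by one trial (an assumption, the text only says "equally").
    targets = ["A"] * (n_trials // 2) + ["B"] * (n_trials - n_trials // 2)
    rng.shuffle(targets)
    queue = deque(targets)
    presented = 0
    while queue:
        target = queue.popleft()
        presented += 1
        if not respond(target):
            queue.append(target)  # failed trial is repeated at the end of the block
    return presented


def run_session(n_blocks, p_correct, seed=0):
    """Alternate rules across blocks; returns (rule, required, presented) per block."""
    rng = random.Random(seed)
    rule = "pro"  # assumed starting rule
    log = []
    for _ in range(n_blocks):
        n_trials = rng.randint(26, 42)  # block length varies randomly
        presented = run_block(n_trials, lambda t: rng.random() < p_correct, rng)
        log.append((rule, n_trials, presented))
        rule = "anti" if rule == "pro" else "pro"  # rule reversal, no external cue
    return log
```

Because the rule reverses only after every queued trial has succeeded, the last trial before each switch is always a success in this sketch.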

Fig. 2

Switching task. Subjects were presented with alternating blocks of prosaccades and antisaccades. All trials appeared identical. Blocks consisted of 26–42 trials equally assigned to two target locations, left and right or up and down. Incorrect trials were repeated at the end of each block until all trials within a block were completed successfully. Once the prespecified number of successful trials was reached, the task switched from, e.g., antisaccade to prosaccade or vice-versa. On each trial, an orange fixation LED located straight ahead was turned on, and the subjects were required to look at it and maintain fixation while it was lit (900–1050 ms). Coincident with the offset of the fixation LED, a target was presented randomly with equal probability at ± 10° from the fixation point in either the horizontal or vertical directions. The offset of the fixation LED and the appearance of the target signaled the subject to make an eye movement. To obtain a reward, subjects had to make the correct response with a saccade end position falling within an acceptance window of ± 4° × 4°, within 650 ms after target presentation. Saccades that fell outside the acceptance window, or that were started more than 650 ms after target onset, were considered failures. Failed trials were added to the end of the vector that represented the order for the block and were repeated until all trials in the block were completed successfully, which was required to trigger the switch to the next block. By trial and error, the subject was expected to determine the correct type of response, and to continue executing it while successful, e.g., antisaccades. Lack of reward, signaling an incorrect response, was expected to trigger a switch from antisaccade to prosaccade.

Drug delivery and dosing

Methylphenidate hydrochloride (Sigma Aldrich, St. Louis, MO) was dissolved in a 1-mL mixture of grape and unsweetened cranberry juice, delivered orally 45 min before the start of each behavioral experiment while the subjects were in the primate chair. During behavioral/electrophysiological experiments, MPH was administered while the subject was in the primate chair after completing 4–5 blocks of the switching task and while holding a single neuronal recording, which allowed for the control of the timing of administration and confirmation that the subjects received the entire dose. Juice alone (0 mg/kg) was administered on control days. Drug dosing was calculated in mg/kg based on the animals’ weight on the day of testing. The doses 3 and 6 mg/kg were chosen based on previous work on the effects of MPH on an attentionally demanding working memory task (Rajala et al. 2012), which produced significant improvement and deterioration of performance, respectively (Fig. 1). An acute oral dose of 3 mg/kg MPH produces plasma levels equivalent to doses used therapeutically in humans (Doerge et al. 2000).

Behavioral data analysis

Switching performance was analyzed in part with the procedure reported in (Everling and DeSouza 2005). The proportion of correct responses was computed for each of the conditions (0, 3, and 6 mg/kg MPH) for each of the subjects separately and averaged. Data were plotted synchronized to the first trial following task reversal (trial 0), and the proportion correct on the ten trials preceding and twenty trials following the switch point was computed (Fig. 3). Subsequently, the portion of the resulting functions comprising the switch, i.e., trials 0–19, were fit with an exponential function using nonlinear least-squares of this form:

$$ \hat{y}=A\exp \left\{-\frac{1}{\tau }x\right\}+\upalpha $$
(1)

Fig. 3

Switching performance. a The proportion of correct responses is plotted for each condition separately, as a function of trial number synchronized to the start of each block (trial 0). The last trial of every block (t − 1) was always a success (proportion correct = 1) because the subjects were required to complete all trials within a block successfully before the task reversal. Note the large drop in performance at the transition between tasks, e.g., antisaccade to prosaccade or vice-versa. Exponential fits from each condition are overlaid. b The data following the switch, from trials 0–20, were fit with an exponential function \( \hat{y}=A\exp \left\{-\frac{1}{\tau }x\right\}+\upalpha \). The vertical broken line illustrates the group estimates of the time constant τ, and the horizontal broken line the offset α. Colors correspond to dose conditions as in part a. c, d Quantification of the group average of the offset α and the time constant τ, respectively, for each condition, *p < 0.05. Error bars represent standard error of the mean.

In Eq. 1, the estimated parameter τ defines the rate of recovery, and the offset term α corresponds to the steady-state level of performance, given the vector of post-switch trial numbers x and the proportion of correct responses y. Failed trials were sorted into six error types: (1) directional: e.g., the subject looked at the target in an antisaccade block, (2) inaccurate: the subject looked in the correct direction but the eye movement was inaccurate, (3) fixation: the subject failed to initiate a trial by not acquiring the fixation LED within 650 ms or not maintaining fixation for its full duration, (4) premature: the subject initiated a response before a target was presented, (5) no response: the subject did not respond, and (6) random: the subject looked neither to the target nor to its mirror image. Only the first three types were included in the analyses; the last three constituted less than 1% of all trials. In addition, trials in which saccade latency was < 50 ms or > 650 ms were excluded.
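As an illustration of the exponential recovery fit (Eq. 1), the sketch below estimates A, τ, and α by least squares in plain Python. The solver is an assumption: the text specifies nonlinear least squares but not the algorithm, so this version uses a grid search over τ and solves A and α in closed form for each candidate (the model is linear in those two terms once τ is fixed). The function name and the grid range are hypothetical.

```python
import math


def fit_exponential(x, y, taus=None):
    """Least-squares fit of y ~ A*exp(-x/tau) + alpha; returns (A, tau, alpha)."""
    if taus is None:
        taus = [0.1 * k for k in range(1, 501)]  # candidate tau: 0.1 to 50 trials
    n = len(x)
    mean_y = sum(y) / n
    best = None
    for tau in taus:
        e = [math.exp(-xi / tau) for xi in x]
        mean_e = sum(e) / n
        var_e = sum((ei - mean_e) ** 2 for ei in e)
        if var_e == 0.0:
            continue  # regressor is constant, A is not identifiable
        # Ordinary linear regression of y on e gives A (slope) and alpha (intercept).
        A = sum((ei - mean_e) * (yi - mean_y) for ei, yi in zip(e, y)) / var_e
        alpha = mean_y - A * mean_e
        sse = sum((A * ei + alpha - yi) ** 2 for ei, yi in zip(e, y))
        if best is None or sse < best[0]:
            best = (sse, A, tau, alpha)
    _, A, tau, alpha = best
    return A, tau, alpha
```

With performance that drops at the switch and then recovers, A is negative, α is the plateau that the curve approaches, and τ is the number of trials needed to close about 63% of the gap to that plateau.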

Permutation (randomization) tests of exponential parameter estimates

Both the time constant τ and the offset α were compared across conditions using a pairwise randomization test. Pairwise comparisons of the parameter estimates of the exponential recovery function (Eq. 1) were performed using a permutation method to construct null sampling distributions of the parameter estimate differences. Trials from each session were kept together as a series, and these series were randomly shuffled between dose conditions under the assumption of exchangeability. The exponential recovery parameters were then estimated for each condition (A, B), e.g., A = 0 mg/kg and B = 3 mg/kg, and the differences in the time constant, Δτ = τA − τB, and in the offset, Δα = αA − αB, were computed. This process was replicated 1000 times to obtain the empirical null sampling distribution. Two-tailed p-values were obtained from the proportion of randomized differences that were more extreme than the observed parameter difference. False discovery rate was controlled at the 0.05 level (Benjamini and Hochberg 1995) across all possible pair-wise comparisons.
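The randomization procedure can be sketched as follows. This is a generic two-tailed permutation test under stated assumptions, not the authors' code: sessions are treated as the exchangeable units, `statistic` is a hypothetical callable that maps a list of sessions to one number (in the study this would be the fitted τ or α), and `mean_correct` is a simple stand-in used only to keep the example self-contained. Ties are counted as extreme here, which is slightly more conservative than a strict "more extreme" rule.

```python
import random


def mean_correct(sessions):
    """Stand-in statistic: overall proportion correct across all trials."""
    trials = [t for s in sessions for t in s]
    return sum(trials) / len(trials)


def permutation_test(sessions_a, sessions_b, statistic, n_perm=1000, seed=0):
    """Return (observed difference, two-tailed p) for statistic(A) - statistic(B)."""
    rng = random.Random(seed)
    observed = statistic(sessions_a) - statistic(sessions_b)
    pooled = list(sessions_a) + list(sessions_b)
    n_a = len(sessions_a)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign whole sessions to conditions
        diff = statistic(pooled[:n_a]) - statistic(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1
    return observed, extreme / n_perm
```

Shuffling whole sessions rather than single trials keeps the within-session trial series intact, which is the reading of "maintained as a group series" adopted in this sketch.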

Test of rule type switch relationships

The relationship of the proportion of choice errors to MPH dose and rule switch conditions was examined by fitting the following mixed-effects general linear model (GLM):

$$ {\mathrm{PE}}_{\mathrm{jk}}={\beta}_0+{b}_{0j}+\left({\beta}_1+{b}_{1j}\right){\mathrm{MPH}}_k+\left({\beta}_2+{b}_{2j}\right){\mathrm{ANTIPRO}}_k+{\beta}_3{\mathrm{MPH}}_k\times {\mathrm{ANTIPRO}}_k+{e}_{\mathrm{jk}} $$
(2)

where β0, β1, β2, and β3 are the intercept and slope fixed effects for the MPH dose (0, 3, 6 mg/kg), ANTIPRO difference coding (− 1, 1), and their interaction for the kth observation, and b0j, b1j, and b2j are the random effects specific to subject j. Random effects are random values that represent deviations from associations described by fixed effects. Here they are random intercepts and coefficients representing random deviations for each subject. PEjk is the subject-specific proportion of errors (total error count/number of trials), and ejk is the random error term (Gelman and Hill 2007). ANTIPRO represents the difference in error rate between anti-to-prosaccade switches and pro-to-antisaccade switches for the ten trials after the rule switch. The mixed effects model was fit using the Matlab Statistics and Machine Learning Toolbox (fitlmematrix).
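As a reading aid, the fixed-effects part of Eq. 2 can be written out for the two levels of the difference-coded predictor. The random effects and the error term have mean zero and drop out of the expectation. Which switch direction is coded +1 is not stated in the text, so the assignment of signs to directions below is left open.

$$ \mathrm{ANTIPRO}=+1:\quad E\left[\mathrm{PE}\right]=\left({\beta}_0+{\beta}_2\right)+\left({\beta}_1+{\beta}_3\right)\mathrm{MPH} $$

$$ \mathrm{ANTIPRO}=-1:\quad E\left[\mathrm{PE}\right]=\left({\beta}_0-{\beta}_2\right)+\left({\beta}_1-{\beta}_3\right)\mathrm{MPH} $$

$$ E\left[\mathrm{PE}\mid +1\right]-E\left[\mathrm{PE}\mid -1\right]=2{\beta}_2+2{\beta}_3\mathrm{MPH} $$

Under this coding, β2 is half the difference in error proportion between the two switch directions at 0 mg/kg, and β3 is half the change in that difference per mg/kg of MPH. The same expansion applies to Eq. 3 with PREPOST in place of ANTIPRO.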

Tests of error type relationships

The relationship of the proportion of choice errors to MPH dose and pre-post switch condition was examined by fitting the following mixed-effects GLM:

$$ {\mathrm{PE}}_{\mathrm{jk}}^{\mathrm{type}}={\beta}_0+{b}_{0j}+\left({\beta}_1+{b}_{1j}\right){\mathrm{MPH}}_k+\left({\beta}_2+{b}_{2j}\right){\mathrm{PREPOST}}_k+{\beta}_3{\mathrm{MPH}}_k\times {\mathrm{PREPOST}}_k+{e}_{\mathrm{jk}} $$
(3)

where β0, β1, β2, and β3 are the intercept and slope fixed effects for the MPH dose (0, 3, 6 mg/kg), PREPOST difference coding (− 1, 1), and their interaction for the kth observation, and b0j, b1j, and b2j are the random effects specific to subject j. \( {\mathrm{PE}}_{\mathrm{jk}}^{\mathrm{type}} \) is the subject-specific proportion of errors for each of the three error types (directional, inaccurate, and fixation), and ejk is the random error term. PREPOST represents the difference in error rate between the trials before the switch point (steady state) and the ten trials after it (switch epoch). Mixed effects models for each error type were fit separately. False discovery rate (FDR) was controlled at the 0.05 level (Benjamini and Hochberg 1995) across the three independent fits.
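The Benjamini and Hochberg (1995) step-up procedure used for FDR control can be sketched in a few lines of Python. The function name is hypothetical, and the p-values in the usage note are made up for illustration; they are not results from this study.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans, True where the hypothesis is rejected at FDR level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) such that p_(k) <= (k / m) * q.
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * q:
            k_max = rank
    # Reject every hypothesis whose rank is at or below k_max.
    reject = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= k_max:
            reject[i] = True
    return reject
```

For three fits with illustrative p-values of 0.001, 0.2, and 0.03, the thresholds are 0.0167, 0.0333, and 0.05 for ranks 1 to 3, so the first and third hypotheses are rejected and the second is not.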

Electrophysiological recordings and data analysis

Single-unit recordings were taken from left area 46 using standard techniques (Rajala et al. 2013, 2018). Each dose of MPH studied (0, 3, and 6 mg/kg) was preceded with a control epoch within the same recording session. All data were collected from well-isolated single units. Following isolation, approximately 4–5 switches were recorded (~ 200–300 trials) and used as control. If the recording was stable and the subject seemed willing to continue working, the experiment was paused, and MPH (or vehicle) was administered while maintaining the recording. The shape of the action potential was saved on a storage oscilloscope before and after the administration of MPH (or vehicle) for comparison. Auditory feedback was also used to determine if the neuron was lost while administering the drug. If the recording remained stable, the experiment resumed, and the subject continued to work until sated. If the recording was lost, the data were discarded. Success was achieved in ~ 25–30% of experiments. The analysis was focused on the discharge rate within the reward epoch of correct and incorrect trials. While in successful trials, the onset of reward delivery was time stamped, such an event was not available for failed trials. Inspection of transient neural activity associated with lack of reward indicated that it began approximately 500–600 ms after the target onset. The length of the epoch was estimated by measuring the duration of the bursts observed following the absence of reward. Accordingly, a 1000 ms epoch was selected, starting 500 ms after the target presentation, to capture activity associated with reward, or lack thereof, in all trials. Only directional and inaccurate errors were included in this analysis because for fixation errors no target presentation occurred therefore those trials had no comparable activity. 
Because the sampling distributions were unknown, nonparametric methods were used both for hypothesis tests (p-values) and for estimating standard errors.
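The reward-epoch measure described above (a 1000 ms window starting 500 ms after target presentation) amounts to a spike count converted to a rate. A minimal sketch, assuming spike times are given in milliseconds relative to target onset and that the window is half-open; the function name and both of those conventions are assumptions:

```python
def epoch_rate(spike_times_ms, start_ms=500.0, duration_ms=1000.0):
    """Discharge rate in spikes/s within [start_ms, start_ms + duration_ms)."""
    end_ms = start_ms + duration_ms
    count = sum(1 for t in spike_times_ms if start_ms <= t < end_ms)
    return count / (duration_ms / 1000.0)  # convert window length to seconds
```

Applying the same window to correct and incorrect trials gives comparable rates even though failed trials have no reward time stamp.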

Permutation (randomization) tests of MPH-dependent discharge rates

The correct/incorrect difference in average discharge rate for each dose of MPH studied (0, 3, and 6 mg/kg) across trials was compared with the difference in the control condition using the test statistic

$$ {R}_{N_j}=\frac{\mathrm{abs}\left({\overline{\mathrm{r}}}_{\mathrm{incorrect}}^{\mathrm{treat}}-{\overline{\mathrm{r}}}_{\mathrm{correct}}^{\mathrm{treat}}\right)}{\mathrm{abs}\left({\overline{\mathrm{r}}}_{\mathrm{incorrect}}^{\mathrm{control}}-{\overline{\mathrm{r}}}_{\mathrm{correct}}^{\mathrm{control}}\right)} $$
(4)

where \( {\overline{\mathrm{r}}}_{\mathrm{outcome}}^{\mathrm{condition}} \) is the average discharge rate across correct or incorrect trials under one of the treatment conditions for the jth neuron Nj. An empirical null distribution was obtained by randomly shuffling trial discharge rates between the treatment and control conditions, independently for the incorrect and correct trials. The alternative hypothesis is that the absolute difference between correct and incorrect trials is smaller under treatment than under control, such that the average \( \overline{R} \) across neurons will be less than one. Trial discharge rates were permuted 10,000 times to obtain the empirical null distribution, and the number of surrogates \( {\overline{R}}_{\mathrm{null}}^{\ast } \) that exceeded the measured \( \overline{R} \) was used to calculate a p-value for the null hypothesis.
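The test statistic of Eq. 4 and its permutation null can be sketched as below. Several details are assumptions rather than the authors' implementation: each neuron is represented by a dictionary with the hypothetical keys `treat_inc`, `treat_cor`, `ctrl_inc`, and `ctrl_cor` holding trial discharge rates, and the p-value is taken as the fraction of surrogates at least as small as the observed average ratio, which is one reading of the one-sided alternative that the ratio is below one.

```python
import random


def mean(v):
    return sum(v) / len(v)


def outcome_ratio(treat_inc, treat_cor, ctrl_inc, ctrl_cor):
    """Eq. 4: |incorrect - correct| under treatment divided by the same under control."""
    den = abs(mean(ctrl_inc) - mean(ctrl_cor))
    num = abs(mean(treat_inc) - mean(treat_cor))
    return num / den if den > 0 else float("inf")  # guard against a zero denominator


def shuffle_split(a, b, rng):
    """Pool two groups, shuffle, and split back into groups of the original sizes."""
    pooled = list(a) + list(b)
    rng.shuffle(pooled)
    return pooled[:len(a)], pooled[len(a):]


def permutation_p(neurons, n_perm=10000, seed=0):
    """Return (observed mean ratio, one-sided p) across a list of neurons."""
    rng = random.Random(seed)
    keys = ("treat_inc", "treat_cor", "ctrl_inc", "ctrl_cor")
    observed = mean([outcome_ratio(*(n[k] for k in keys)) for n in neurons])
    hits = 0
    for _ in range(n_perm):
        ratios = []
        for n in neurons:
            # Shuffle treatment/control labels separately for each outcome.
            t_inc, c_inc = shuffle_split(n["treat_inc"], n["ctrl_inc"], rng)
            t_cor, c_cor = shuffle_split(n["treat_cor"], n["ctrl_cor"], rng)
            ratios.append(outcome_ratio(t_inc, t_cor, c_inc, c_cor))
        if mean(ratios) <= observed:
            hits += 1
    return observed, hits / n_perm
```

Shuffling within each outcome keeps the number of correct and incorrect trials per condition fixed while breaking any link between discharge rate and treatment.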

Bootstrapped estimates of error distributions

Trial discharge rates contributing to the calculation of \( {\overline{\mathrm{r}}}_{\mathrm{outcome}}^{\mathrm{condition}} \) were resampled with replacement to yield surrogates \( {\overline{R}}^{\ast } \). This calculation was replicated 10,000 times for each neuron and averaged across neurons to obtain the bootstrapped distribution of \( {\overline{R}}^{\ast } \) for each dose (Efron and Tibshirani 1993). The absolute value operation bounds the null distribution of \( {\overline{R}}^{\ast } \) below at 0, so that the distribution is positively skewed, which motivated the use of percentiles as error estimates of \( \overline{R} \).
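A percentile bootstrap of the ratio can be sketched in the same style. As before, the dictionary keys and function names are hypothetical, and the nearest-rank percentile rule is an assumption, since the text does not say how percentiles were computed.

```python
import random


def mean(v):
    return sum(v) / len(v)


def ratio(n):
    """Treatment-to-control ratio of |incorrect - correct| for one neuron."""
    den = abs(mean(n["ctrl_inc"]) - mean(n["ctrl_cor"]))
    num = abs(mean(n["treat_inc"]) - mean(n["treat_cor"]))
    return num / den if den > 0 else float("inf")


def bootstrap_ratios(neurons, n_boot=10000, seed=0):
    """Sorted bootstrap replicates of the across-neuron mean ratio."""
    rng = random.Random(seed)
    out = []
    for _ in range(n_boot):
        reps = []
        for n in neurons:
            # Resample trials with replacement within each outcome and condition.
            resampled = {k: rng.choices(v, k=len(v)) for k, v in n.items()}
            reps.append(ratio(resampled))
        out.append(mean(reps))
    return sorted(out)


def percentile(sorted_vals, q):
    """Nearest-rank percentile of an already sorted list, q in [0, 100]."""
    idx = round(q / 100 * (len(sorted_vals) - 1))
    return sorted_vals[min(len(sorted_vals) - 1, max(0, idx))]
```

Percentiles such as the 5th, 25th, 75th, and 95th of the replicates then describe the spread without assuming a symmetric distribution, which suits a ratio bounded below at zero.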

Results

Behavioral data

Behavioral data from 62,685 trials, taken from four subjects, three conditions (0, 3, and 6 mg/kg), and 40 experimental sessions are presented (subject MI: 4 sessions at 0 mg/kg, 2 at 3 mg/kg, 2 at 6 mg/kg; subject SH: 6 at 0 mg/kg, 3 at 3 mg/kg, 2 at 6 mg/kg; subject SW: 6 at 0 mg/kg, 3 at 3 mg/kg, 3 at 6 mg/kg; subject CO: 5 at 0 mg/kg, 2 at 3 mg/kg, 2 at 6 mg/kg). No sessions were excluded.

Steady state and task-switching performance

Figure 3a shows the average proportion of correct responses for the three conditions (0, 3, and 6 mg/kg MPH), plotted as a function of trial number relative to the point of task switching (trial 0). Each combined dataset was fit with a single exponential function (Eq. 1), which yielded the offset α, representing the group steady-state level of performance, and the group time constant τ, representing the rate of recovery. For the parameters α and τ, conditions were compared using the pairwise randomization test described above.

The steady state level of performance α was 0.8 in the 0 mg/kg condition, which indicates that the NHPs made errors throughout the blocks of trials, not simply during the switching epoch. Steady state performance was hindered in the 3 and 6 mg/kg MPH conditions compared with 0 mg/kg, and in the 6 mg/kg condition compared with the 3 mg/kg condition (Fig. 3b, c). In the 6 mg/kg MPH condition, steady-state performance was just above chance level, 0.56. As with steady-state performance, the rate of recovery, estimated by τ, was also hindered by the 3 and 6 mg/kg doses of MPH (Fig. 3d). Specifically, τ was lengthened in the 3 and 6 mg/kg MPH conditions compared with the control. Individual subject data are shown in Supplemental Figs. 1 and 2.

Previous studies that have used this type of switching task (e.g., Everling and DeSouza 2005) documented a difference in the ability to switch depending on direction: prosaccade to antisaccade, or vice versa. We found no consistent pattern across the four subjects in the number of errors during the 10 trials after a switch depending on direction, nor was there evidence of an interaction with MPH dose (Table 1). Consequently, anti-to-prosaccade and pro-to-antisaccade error counts were combined for subsequent analyses.

Table 1 Mixed-effects model results for all error types for switching from anti-to-pro vs pro-to-anti. Italics denote p < 0.05

Error types

The proportion of the three types of errors analyzed, averaged across the four subjects, is shown in Fig. 4. The corresponding results from the mixed effects model are shown in Table 2. Errors of fixation depended significantly on MPH dose, regardless of the steady-state vs. switch epoch. Thus, subjects made more fixation errors across the entire session with MPH in a dose-dependent manner. For directional errors, the main effects for MPH dose and for steady-state vs. switch were significant, without evidence of interaction (Table 2). Thus, the subjects made more directional errors in the switch epoch compared with the steady state, which was not differentially dependent on MPH dose, and they made more directional errors across the entire session with MPH in a dose-dependent manner. Errors in response accuracy were not dependent on the steady-state vs. the switch epochs, or MPH dose (Table 2) and thus were constant across all conditions.

Fig. 4

Effects of methylphenidate on switching task error types. The proportions of errors of direction, accuracy, and fixation are plotted in stacks separately for each condition for the 10 trials that preceded the switch and the 20 trials that followed, as a function of trial number, synchronized to the task switching point (trial number 0). The statistical analysis of these data is presented in Table 2.

Table 2 Mixed-effects model results for error types. Italics denote p < 0.05 FDR corrected.

Electrophysiology data

The behavioral results led us to hypothesize that MPH might have affected the representation of reward outcome signals, and that a neural correlate of the changes would be observed in the PFC. We recorded from single neurons in the PFC while the monkeys performed the switching task before and after the administration of MPH (n = 4 neurons, 2 per dose level) or the vehicle 0 mg/kg (n = 5 neurons). Figure 5 illustrates the effects of the two doses of MPH studied on the responses of example units. Conspicuous in their discharge was a burst of action potentials in incorrect trials during the reward epoch (Fig. 5a, d). The activity of the same units recorded 45 min after the administration of 3 and 6 mg/kg MPH is shown in Fig. 5b, e, respectively. The difference between incorrect and correct reward outcome signals, for the control (orange) and MPH (light blue) conditions for each unit, is shown in Fig. 5c, f. Methylphenidate produced a large decrease in the magnitude of this difference, which distinguished rewarded from non-rewarded trials, effectively attenuating an outcome signal that could inform a switch. To determine if MPH affected this outcome signal (gray boxes in Fig. 5c, f), the permutation test described above was carried out for each of the dose levels. The ratio of treatment to control in the difference between the discharge rate within the reward-related epoch of unsuccessful and successful trials, the outcome signal, within each neuron did not differ significantly from 1 at 0 mg/kg \( \left(\overline{R}=0.93,p=0.41\right) \), but did differ significantly at 3 mg/kg \( \left(\overline{R}=0.14,p<{10}^{-4}\right) \) and 6 mg/kg \( \left(\overline{R}=0.09,p<{10}^{-4}\right) \). Bootstrapped estimates of \( {\overline{R}}^{\ast } \) are shown in Fig. 6 for the three MPH doses. 
These results indicate that MPH significantly reduced the difference between correct and incorrect trials, and this effect is not a result of the act of delivering the juice itself.

Fig. 5

Effects of methylphenidate (MPH) on single PFC unit activity. a Single-unit activity from the control condition. Firing rate is plotted as a function of time, synchronized to the presentation of the target (time = 0 ms), represented by the solid vertical line. Dotted vertical line at time = − 2000 ms represents onset of the fixation. Spike density functions were generated by convolving the spike trains with a Gaussian function (σ = 40 ms) (MacPherson and Aldridge 1979; Richmond and Optican 1987); dot rasters of the raw trial-by-trial spike data are plotted underneath. Data from correct trials are plotted in blue, and incorrect trials are plotted in red. Only directional and inaccurate errors are included in the incorrect plots. The shaded areas represent 95% confidence intervals. The reward epoch is illustrated by a gray box extending from 500 to 1500 ms. b Effect of a 3-mg/kg dose of MPH on the same unit as in a, recorded 45 min after administration of the drug. Details as in a. c Difference in firing rate between incorrect and correct trials in the control and MPH conditions, illustrating the effect of MPH on the reward outcome signal. d, e, f Effect of a 6-mg/kg dose of MPH on a different single unit. Details as in a.

Fig. 6

Bootstrapped distributions of treatment-to-control ratios. Resampled trial neural discharge rates were used to generate surrogates \( {\overline{R}}^{\ast } \) for measures of spread as a function of MPH dose. We recorded from single neurons in the PFC while the monkeys performed the switching task before and after the administration of MPH (5 neurons 0 mg/kg, 2 neurons 3 mg/kg and 2 neurons 6 mg/kg). Box-whisker plots are shown for the three doses of MPH, where the box bounds correspond to the 25th and 75th percentiles, and whiskers bound the central 90%. A ratio \( {\overline{R}}^{\ast } \) of one would be expected if there is no difference between treatment and control. Bootstrap resampling was performed to estimate the variability of the distribution of \( {\overline{R}}^{\ast } \) using quantiles.

Discussion

The present data are consistent with the hypotheses that therapeutic doses of MPH hinder task-switching performance, and that such an effect is correlated with changes in reward outcome signals in PFC. Several methodological issues must be considered with regard to these results. First, the switching task (Everling and DeSouza 2005) was chosen because it relies solely on reward processing. Reward processing, in turn, relies on dopaminergic signals at some level of the network, which are a specific target of MPH: the drug blocks DAT, thereby increasing extracellular DA levels and directly affecting neurotransmission (Rolls et al. 1984). Second, oral administration of MPH is used for therapeutic treatment in humans (Greenhill 2001), and the doses selected were equivalent to those used for treating humans with ADHD (Doerge et al. 2000). The doses selected have also previously been shown to improve (3 mg/kg) and hinder (6 mg/kg) the performance of another attentionally demanding cognitive task (Rajala et al. 2012) requiring the detection of and response to a cue (Fig. 1) in three of the four subjects used in the current study. Third, we chose to record from single units in area 46 of the macaque brain because reward signals have been documented in this area (Niki and Watanabe 1979; Watanabe 1989; Kennerley and Wallis 2009; Marcos et al. 2018), which is also essential for CF (Milner 1963). Importantly, we have shown that MPH alters the functional connectivity between the PFC and the head of the caudate nucleus, an area where significant increases in extracellular DA were documented following the oral administration of MPH in rhesus monkeys using simultaneous PET/MR imaging (Birn et al. 2019). Fourth, we chose to administer MPH during recordings, despite the low yield of the approach (25–30% success rate), to document, with maximal specificity, the effects of the drug on single units. 
We addressed the inherently low number of units studied by using well-established numerical resampling methods to perform statistical inference based on trial-by-trial data.

The behavioral data show that the doses of MPH studied significantly impaired the ability to perform the switching task. We focused on three types of errors to understand the nature of this effect. Subjects were required to initiate the task on each trial. A significant increase in the proportion of fixation errors measured under MPH, and no effect of steady state vs. switch, suggests that the drug interfered with the subjects’ overall ability to engage with the task (Table 2). We observed a similar effect of MPH in subjects performing a working memory task (Rajala et al. 2012). Thus, fixation errors may not be task specific, but rather point to a more general impairment brought about by the drug. The most important type of error for evaluating performance in the switching task is the directional error because it is indicative of the subject’s ability to disengage from one task and engage in the other. Consistent with previous findings (Everling and DeSouza 2005), subjects made a significantly larger proportion of directional errors during the switching epoch compared with the steady state. Interestingly, the subjects did not switch tasks immediately after the large drop in the rate of success resulting from a switch (Fig. 3), requiring instead several trials to achieve the steady-state performance level. As hypothesized, directional errors also depended significantly on MPH dose (Table 2 and Fig. 4), suggesting that the drug, which blocks DAT, might have interfered with the integrity of reward outcome signals that are essential for recovery of performance. Contrary to previous findings (Everling and DeSouza 2005), the proportion of errors did not differ significantly between the pro- and antisaccade tasks. A post-hoc test of the random effects coefficients b2j for each subject revealed a balance in sign (+ and −); however, the magnitudes did not differ significantly from 0 (p > .05). 
Although the pro- versus antisaccade difference varied across individual subjects, the trend is weak. We conjecture that the subjects in this study may have been trained more extensively in the performance of the switching task than those in previous studies. The lack of significant differences in accuracy across conditions (Table 2) indicates that the actual motor component of the responses was consistent across MPH dose and contingency change. Therefore, the differences in performance can be attributed to effects on higher order aspects of the task and not to motoric effects. These data have important implications for the use of MPH and drugs of this type in the treatment of ADHD because, while therapeutic doses may bring about improvement in some behaviors or cognitive functions, they may hinder others.

Successful execution of the switching task employed here requires a negative outcome signal to represent the lack of reward for an executed action. Accordingly, we focused our analysis of the effects of MPH on PFC neuronal activity during the reward epoch. As indicated at the outset, there is a variety of neuronal signals in the PFC that can encode different contexts and outcomes (Kennerley and Wallis 2009; Marcos et al. 2018). For instance, two types of reward prediction errors, positive and negative (Asaad and Eskandar 2011; Matsumoto et al. 2007; Oemisch et al. 2019), have been postulated, but there is no consensus on how such signals might be combined and processed by neural circuitry to produce a decision. Under control conditions, the PFC neurons studied show a difference in discharge rate during the reward epoch of incorrect compared with correct trials. This difference could serve as an outcome indicator, signaling the need for a change in behavior. Following the administration of MPH, which significantly impaired behavioral performance, the data show a significant decrease in this difference (Fig. 5), consistent with the statistical analysis of all neurons studied (Fig. 6). This reduction in the magnitude of the neural signals leads us to hypothesize that it represents a neural correlate of the decrease in switching performance. The oral administration of therapeutically relevant doses of MPH seems to reversibly deprive cortical decision-making circuits of essential reward outcome signals, without which reward-guided performance could be severely affected.

The question arises as to where the locus of the effects of MPH lies. In rodents, the effects are thought to take place directly at the level of the PFC (Berridge et al. 2006), while in NHPs changes in extracellular DA resulting from oral MPH have been documented in the striatum using microdialysis (Kodama et al. 2017), and in the head of the caudate nucleus using [18F]fallypride positron emission tomography (Birn et al. 2019), but not in the PFC. Thus, the effects of MPH observed in the discharge rate of PFC neurons are likely the result of a network effect. There are two possible dopaminergic systems, the mesolimbic and mesocortical, that could exert changes in the PFC. The mesocortical pathway projects from the VTA to the PFC. Methylphenidate has been shown to inhibit VTA neurons, along with an induced increase in PFC membrane potential upstate duration, which shifts the functional coupling between PFC and DA neurons from negative to positive (dela Pena et al. 2018). This increased coupling with VTA could further blunt the neural signaling of an unexpected outcome by decreasing PFC firing during a prolonged upstate (Lewis and O’Donnell 2000).

The mesolimbic pathway, on the other hand, projects from the VTA to limbic structures such as the ventral striatum (VS). An overall reduction in discharge as a consequence of increased DA has been observed in the striatum (Rolls et al. 1984), which, in turn, could modulate the PFC. In fact, the effects of MPH, which increases extracellular DA in the striatum (Birn et al. 2019; Kodama et al. 2017), can be observed in area 46, where a variety of reward-related signals have been documented (Niki and Watanabe 1979; Mansouri et al. 2006; Watanabe 1989). The thalamus closes the loop, providing a recurrent corticostriatal pathway back to the PFC that propagates the effects of MPH (Haber and Knutson 2010; Haber et al. 1993). Further evidence supports a distributed encoding of feature-specific prediction errors, observed in the anterior cingulate and PFC as well as in connected areas in the head of the caudate and VS, in primates performing a reversal learning task (Oemisch et al. 2019).

Collectively, these parallel pathways reflect processes by which midbrain dopamine neurons could, in a distributed and recurrent framework (Cortese et al. 2019), affect the reward outcome signals observed in the PFC and mediate the attenuation of these signals by MPH.