Abstract
In dynamic environments, subjects often integrate multiple samples of a signal and combine them to reach a categorical judgment1. The process of deliberation can be described by a time-varying decision variable (DV), decoded from neural population activity, that predicts a subject’s upcoming decision2. Within single trials, however, there are large moment-to-moment fluctuations in the DV, the behavioural significance of which is unclear. Here, using real-time, neural feedback control of stimulus duration, we show that within-trial DV fluctuations, decoded from motor cortex, are tightly linked to decision state in macaques, predicting behavioural choices substantially better than the condition-averaged DV or the visual stimulus alone. Furthermore, robust changes in DV sign have the statistical regularities expected from behavioural studies of changes of mind3. Probing the decision process on single trials with weak stimulus pulses, we find evidence for time-varying absorbing decision bounds, enabling us to distinguish between specific models of decision making.
Main
When making a categorical decision about a noisy stimulus, subjects commonly fluctuate between levels of commitment to a choice before reporting a decision. In some instances, the fluctuations are sufficiently strong to lead to a change of mind (CoM)2,3,4,5,6,7,8. Because these within-trial fluctuations are different from trial to trial and not necessarily tied to an external event or stimulus feature, they can only be captured using a moment-to-moment neural readout of the decision state on single trials.
To obtain such a readout, we decoded an instantaneous DV in real time from neural population activity in dorsal premotor cortex (PMd) and M1 while two monkeys performed a motion-discrimination task9,10 (Fig. 1a, Supplementary Methods 3; all methods for this paper are provided in the Supplementary Information). We used a linear decoder, trained on previously obtained data, on multielectrode spiking data from the preceding 50–100 ms, updated every 10 ms throughout each trial (Fig. 1b, Supplementary Methods 9, 10). The sign of the DV indicated which choice was predicted, enabling calculation of the decoder’s prediction accuracy. The magnitude of the DV reflected the model’s prediction confidence in units of log odds for one versus the other decision (Supplementary Methods 9). Note that the DV defined here encompasses all choice-predictive signals that can be decoded from neural activity11, including but not limited to accumulated evidence as posited in classical models.
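The readout described above can be sketched as a linear projection of windowed spike counts, with the result read as log odds. This is a minimal illustration, not the authors' decoder: the weights, bias, channel count and the mapping of positive DV to a "right" choice are all hypothetical, and any normalization applied in the actual pipeline is omitted.

```python
import math

def decode_dv(spike_counts, weights, bias):
    """Linear readout: DV = bias + sum_i w_i * x_i, interpreted as log odds."""
    if len(spike_counts) != len(weights):
        raise ValueError("one weight per channel is required")
    return bias + sum(w * x for w, x in zip(weights, spike_counts))

def choice_probability(dv):
    """Logistic link: probability of the choice coded by positive DV."""
    return 1.0 / (1.0 + math.exp(-dv))

# Hypothetical example: 4 channels, spike counts from one sliding window.
weights = [0.30, -0.20, 0.10, 0.05]   # illustrative values, not fitted weights
bias = -0.10
counts = [5, 2, 3, 4]                 # spikes per channel in the window

dv = decode_dv(counts, weights, bias)
# Sign gives the predicted choice (the "right"/"left" coding is an assumption).
predicted_choice = "right" if dv > 0 else "left"
# Magnitude gives the model's confidence in that prediction.
confidence = choice_probability(abs(dv))
```

In a real-time setting the same function would be called on each new window (every 10 ms in the text), producing one DV sample per update.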
We demonstrate that this real-time DV can predict choices on single trials beginning approximately 250 ms after visual stimulus onset, and that prediction accuracy increases throughout the course of the trial, consistent with previous offline observations2. Moreover, we employ closed-loop, neurally contingent control over stimulus timing to directly probe the behavioural significance of within-trial DV fluctuations. We quantify the behavioural effects of previously covert DV variations (1) as a function of time and instantaneous DV (experiment 1), (2) during CoM-like DV fluctuations (experiment 2), and (3) in response to subthreshold stimulus pulses (experiment 3). Using this approach, we validate the behavioural relevance and computational implications of intra-trial DV fluctuations.
Real-time choice decoding
Psychophysical performance on the discrimination task11 was better for higher coherences and stimuli of longer duration (Extended Data Fig. 1a), as expected from previous studies9,12. We first measured the accuracy of our real-time decoder in predicting monkeys’ choices as a function of time during the trial. The average prediction accuracy started near chance during the targets epoch (Fig. 1c, Extended Data Fig. 1b). During stimulus presentation, average prediction accuracy quickly departed from baseline, rising monotonically to 99% correct for the longest stimulus presentations for monkey H and 98% for monkey F. Moreover, for all 4 epochs considered, the average accuracy of our real-time readout was within ±2% of an equivalent offline decoder (Extended Data Figs. 2a–d, Supplementary Methods 12.3; comparisons between PMd and M1 in monkey H and for decoders trained in different epochs are presented in Extended Data Figs. 3, 4). Thus, our real-time decoder reproduces the prediction accuracy of our own offline analyses and of an analogous study of the prearcuate cortex2.
Our real-time decoder also reproduced the average temporal dynamics and coherence dependence expected of the DV: it started at around 0 at dots onset, separated by choice after about 200 ms, and rose (or fell) faster for higher coherence trials (Fig. 1d, Extended Data Fig. 1c). As expected from previous results13, prediction accuracy was higher for correct trials than error trials (Extended Data Fig. 5) at constant stimulus coherence.
Our decoding method yielded stable performance across multiple days, justifying combination of data across sessions (Extended Data Fig. 6).
DV fluctuations track evolving decisions
We often observed large fluctuations (over 3 natural log units) in the DV on individual trials, even within single epochs (Fig. 1e). If moment-to-moment fluctuations in DV reflect fluctuations in the animal’s decision state, we expect larger absolute values of DV to be associated with stronger preference for one of the two choices, and hence higher prediction accuracy were a decision to be required at a given time during a single trial.
Because we decoded and tracked the DV in real time, we were able to test this expectation by terminating the visual stimulus in a neurally contingent manner and probing both neural activity and behaviour with high precision and negligible latency (less than 34 ms; Supplementary Methods 11.4). In the first closed-loop test (experiment 1), we imposed virtual decision boundaries at specific DV values that, if reached, triggered stimulus termination (Fig. 2a), prompting the subject to immediately report its decision (in trials with no delay period). For example, Fig. 2b shows 22 DV traces that reached a fixed DV boundary of magnitude 3 (tolerance of ±0.25 DV units), leading to stimulus termination and the subject’s decision. In this manner, we obtained a direct mapping between the nearly instantaneous readout of decision state and the likelihood of a given choice.
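The virtual-boundary rule can be sketched as a scan over the DV trace for the first sample whose magnitude lies within the tolerance of the boundary. This is an assumed reading of "boundary of magnitude 3 (tolerance of ±0.25 DV units)"; the function name and the 10 ms step (taken from the decoder update rate) are illustrative, and the actual trigger logic is described in the Supplementary Methods.

```python
def first_boundary_hit(dv_trace, boundary, tolerance=0.25, step_ms=10):
    """Return (time_ms, dv) for the first sample whose |DV| is within
    `tolerance` of `boundary`, or None if the boundary is never reached.
    Time is measured from the first sample of the trace."""
    for i, dv in enumerate(dv_trace):
        if abs(abs(dv) - boundary) <= tolerance:
            return i * step_ms, dv   # stimulus would be terminated here
    return None

# Hypothetical DV samples, one per 10 ms decoder update.
trace = [0.1, 0.8, 1.9, 2.6, 2.9, 3.4]
hit = first_boundary_hit(trace, boundary=3.0)
```

Because the test is on |DV|, the same rule catches excursions toward either choice.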
We systematically swept boundary heights from 0.5 to 5 DV units in increments of 0.5 (because the DV is expressed in natural-log odds, an increase of 1 DV unit multiplies the odds of choosing one target over the other by e ≈ 2.718). Figure 2c shows that prediction accuracy increases monotonically with the DV magnitude at termination, as expected. Using only 100 ms of data to estimate the terminating DV, the observed likelihood of a given choice (solid trace) differed from that predicted by the logistic function (dashed trace) by 1.7% for monkey H and by 1.9% for monkey F (mean absolute error; Extended Data Fig. 1d). Notably, prediction accuracy falls systematically as the time window for calculating DV is moved further than 100 ms into the past (Extended Data Fig. 1g). Thus, very recent neural population activity better reflects the current decision state than does activity from earlier time intervals. In further analyses, we performed the calculation in Fig. 2c on subsets of the aggregated data: high versus low stimulus coherences and short- versus long-duration trials. The result in Fig. 2c is robust across trial duration, but differs modestly for high versus low coherences (Extended Data Fig. 1e, f), revealing a significant effect of DV derivative on prediction accuracy (Supplementary Note 1, Supplementary Table 2).
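Since the DV is in units of log odds, the accuracy predicted by the logistic function at each boundary height follows directly from the DV magnitude. The sketch below tabulates that prediction for the swept boundaries and defines the mean absolute error used to compare it with observed accuracy; the function names are illustrative and no observed values are included.

```python
import math

def predicted_accuracy(dv_magnitude):
    """Logistic prediction: P(choice matches DV sign) = 1 / (1 + exp(-|DV|))."""
    return 1.0 / (1.0 + math.exp(-dv_magnitude))

# Boundary heights swept in the text: 0.5, 1.0, ..., 5.0 DV units.
boundaries = [0.5 * k for k in range(1, 11)]
table = {b: predicted_accuracy(b) for b in boundaries}

def mean_absolute_error(observed, predicted):
    """Mean absolute difference between observed and predicted accuracy."""
    if len(observed) != len(predicted):
        raise ValueError("sequences must have the same length")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)
```

For example, a terminating DV of magnitude 3 corresponds to a predicted accuracy of about 95%, and a magnitude of 1 to about 73%.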
Overall, these results show that moment-by-moment fluctuations in PMd and M1 neural population activity captured by our decoding model are indeed reflective of a fluctuating internal decision state—fluctuations that have been covert and thus uninterpretable until now.
To quantify how much additional predictive power is gained from the real-time DV readout compared with (1) the stimulus itself, and (2) the average DV for a given stimulus coherence and time-in-trial, we first built three nested logistic regression models, each using an additional regressor (Supplementary Methods 12.1). The first model, using only stimulus information (motion energy) plus an intercept, correctly predicted choice in 74.5% of trials for monkey H and 71.5% of trials for monkey F (Supplementary Table 1). Adding the average DV for the corresponding stimulus coherence and time in trial to this model increased prediction accuracy by 2–3%. By contrast, adding the single-trial DV at termination as a third regressor increased prediction accuracy by more than 10%. This effect is substantial for lower-coherence trials (Fig. 2d, Extended Data Fig. 1h). Second, as a complementary analysis, we built four logistic regression models, three each using only one of the above regressors (Supplementary Methods 12.1) and a fourth using signed motion coherence. Not only was single-trial DV by itself 10% more accurate than any other regressor, it was also only 1–2.5% less predictive than the model with all three regressors (Supplementary Table 1).
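The nested-model comparison can be illustrated on synthetic data. The sketch below is a toy under loud assumptions: the trials are simulated (not the study's data), signed coherence stands in for motion energy, the "average DV" is an arbitrary function of coherence and time, choice is made a deterministic function of the single-trial DV, and the fit uses plain gradient ascent rather than whatever routine the authors used.

```python
import math
import random

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_logit(w, x):
    """Intercept w[0] plus weighted sum of the regressors in x."""
    return w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))

def fit_logistic(X, y, lr=0.5, n_iter=300):
    """Fit intercept and weights by batch gradient ascent on the log likelihood."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(n_iter):
        grad = [0.0] * (d + 1)
        for x, yi in zip(X, y):
            err = yi - sigmoid(predict_logit(w, x))
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        w = [wj + lr * g / n for wj, g in zip(w, grad)]
    return w

def accuracy(w, X, y):
    """Fraction of trials whose choice is predicted by the sign of the logit."""
    hits = sum((predict_logit(w, x) > 0) == (yi == 1) for x, yi in zip(X, y))
    return hits / len(y)

# Synthetic toy trials (hypothetical generative model, fixed seed).
rng = random.Random(0)
rows, choices = [], []
for _ in range(400):
    coh = rng.uniform(-1.0, 1.0)               # signed stimulus strength
    t = rng.uniform(0.2, 1.0)                  # time in trial (s)
    mean_dv = 4.0 * coh * t                    # condition-averaged DV
    trial_dv = mean_dv + rng.gauss(0.0, 1.5)   # single-trial DV with fluctuations
    rows.append((coh, mean_dv, trial_dv))
    choices.append(1 if trial_dv > 0 else 0)

# Nested models: stimulus only; + average DV; + single-trial DV.
accuracies = []
for n_regressors in (1, 2, 3):
    X = [row[:n_regressors] for row in rows]
    w = fit_logistic(X, choices)
    accuracies.append(accuracy(w, X, choices))
```

In this toy, the third model gains the most because choice depends on the trial-specific fluctuation, which neither the stimulus nor the condition average can capture; the size of the gain here is a property of the simulation and says nothing about the reported values.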
We emphasize that our decoded DV is model based and thus only a proxy for the actual decision state in the brain. We are sampling from a relatively small number of neurons in only one brain region, over relatively short time bins, and the underlying mechanism is unlikely to be strictly linear. Despite these caveats, our ability to predict choice likelihood within a small margin of error confirms that DV is a reliable proxy for decision state.
Neurally detected changes of mind
Validation of the mapping between DV and choice likelihood (Fig. 2c) enabled us to perform a new closed-loop experiment (experiment 2) aimed at capturing robust DV fluctuations in which the sign of the DV changed mid-trial, suggestive of a behavioural CoM (Fig. 3a, b). We established neural criteria for a candidate CoM that, when met in real time, led to stimulus termination and the monkey’s decision (Fig. 3a, Supplementary Methods 11.2).
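A candidate-CoM detector of this kind can be sketched as a small state machine over the DV trace. The criteria used here (a minimum DV magnitude held for a minimum number of consecutive samples, first with one sign and later with the opposite sign) are hypothetical placeholders; the criteria actually used are given in Supplementary Methods 11.2.

```python
def detect_candidate_com(dv_trace, min_abs_dv=1.0, min_samples=3):
    """Return the sample index at which a candidate CoM is confirmed, else None.

    Hypothetical criteria: the DV must stay beyond +/-min_abs_dv with one sign
    for at least min_samples consecutive samples (initial preference), and
    later do the same with the opposite sign (final preference).
    """
    committed_sign = 0          # sign of the first robust preference (0 = none)
    run_sign, run_len = 0, 0    # current run of same-signed, supra-threshold DV
    for i, dv in enumerate(dv_trace):
        if dv >= min_abs_dv:
            sign = 1
        elif dv <= -min_abs_dv:
            sign = -1
        else:
            sign = 0
        if sign != 0 and sign == run_sign:
            run_len += 1
        else:
            run_sign, run_len = sign, (1 if sign != 0 else 0)
        if run_sign != 0 and run_len >= min_samples:
            if committed_sign == 0:
                committed_sign = run_sign
            elif run_sign == -committed_sign:
                return i        # stimulus termination would be triggered here
    return None

# Hypothetical trace: robust positive preference, then robust negative preference.
trace = [0.2, 1.2, 1.5, 1.3, 0.4, -0.3, -1.1, -1.4, -1.6, -2.0]
com_index = detect_candidate_com(trace)
```

Requiring a sustained run on each side is one way to ignore brief zero crossings caused by noise near DV = 0.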
We conceptually divide a CoM trial into two segments—the initial preference before the DV sign change, and the final (opposite) preference that leads to the observed choice. The interpretation of the initial preference relies on the mapping between the DV and choice likelihood obtained from experiment 1. The observed choices allow validation of the neural estimate of the final decision state in the second segment (Extended Data Fig. 7a, Supplementary Note 2).
For monkey F, the relationship between prediction accuracy and DV at stimulus termination was very similar for CoM and non-CoM trials (compare Extended Data Fig. 1d, right with Extended Data Fig. 7a, right; mean error between predicted and observed choice likelihood: 1.9% for non-CoM trials and 3.8% for CoM trials). This relationship was also lawful and monotonic for monkey H, although observed prediction accuracy was lower than expected from the logistic model (compare Extended Data Fig. 1d, left with Extended Data Fig. 7a, left; mean error between predicted and observed choice likelihood: 1.7% for non-CoM trials and 9.3% for CoM trials), suggesting that in addition to the measured DV at stimulus termination, the decisions of monkey H were influenced by some aspect of the DV trajectory history specifically related to the CoM (Extended Data Fig. 7a, Supplementary Note 3, Supplementary Table 3, Supplementary Methods 12.4).
We combined all 985 CoMs detected in monkey H (and all 1,727 CoMs detected in monkey F) to assess whether our neurally detected CoMs conformed to three statistical regularities of CoMs established in previous psychophysical3 and electrophysiological2 studies: (1) CoMs are more frequent for low- and intermediate-coherence trials compared with high-coherence trials; (2) CoMs are more likely to be corrective than erroneous; and (3) CoMs are more frequent early in the trial than later in the trial. All three predicted regularities are true in our real-time neural detection data (Fig. 3c–e, Extended Data Fig. 7b–d).
We also discovered a new regularity associated with CoMs: the average time of zero crossing was negatively correlated with stimulus coherence (Fig. 3f, Extended Data Fig. 7e). This observation probably results from the stronger corrective effect of higher-coherence stimuli (Fig. 3d, Extended Data Fig. 7c).
Of note, the statistical regularities in the neural CoMs were not foreordained since our decoder was trained on choices made at the end of trials, completely agnostic to rare CoMs during any given trial.
Probing decisions with motion pulses
In a final closed-loop experiment (experiment 3), we tested whether neural and behavioural responses to brief pulses of additional motion information varied with DV and/or the time of pulse onset. Decision-making models involving accumulation of evidence to a bound14,15,16,17 predict that termination of deliberation and commitment to a choice become more likely at high DV values2,3,9,17,18, resulting in decreased sensitivity to stimulus information beyond the point of commitment. We therefore hypothesized that additional pulses of sensory evidence would have less effect on neural DV and behavioural choices when triggered by high DV values.
To test this prediction, we imposed virtual DV boundaries (as in Fig. 2a, b) that, if reached, triggered a 200-ms pulse of additive dots coherence (randomly assigned to be rightward or leftward on each trial) followed by stimulus termination (Fig. 4a). We swept a subset of the previously used DV values for the boundary (integers from 1 to 4 DV units). Pulses were only presented on trials with motion coherences near or below the subject’s psychophysical threshold; pulse strength was calibrated to yield very small but significant effects on behaviour, to avoid making the pulses sufficiently salient to change the animals’ integration strategy (∆coherence = 2% (monkey H), 4.5% (monkey F)). Pulse information had no bearing on reward9,19. Motion pulses slightly but significantly biased the monkeys’ choices in the direction of the pulse (Fig. 4b).
We reasoned that, to detect the small effects of these weak motion pulses on DV, and to best estimate the DV at the time when pulse information could actually influence the momentary decision state, we should account for a processing delay between pulse presentation and measured effects on our recorded neural populations in PMd and M1. We refer to this delay, estimated from an independent set of open-loop trials, as the evidence representation latency (ERL), which is 170 ms for monkey H and 180 ms for monkey F (Supplementary Methods 16). For each pulse trial, we then measured the initial DV at the time of pulse onset plus the ERL (DVPERL) (Supplementary Methods 16), as well as the change in DV (𝛿DV) for each subsequent time bin. On average, motion pulses slightly but significantly biased 𝛿DV in the direction of the pulse (Fig. 4c, Extended Data Fig. 8a).
In simple, unbounded linear integration, we expect the magnitude of DV change in response to a fixed motion pulse to remain constant regardless of the initial state of the DV, as suggested above. By contrast, Fig. 4d (Extended Data Fig. 8b) shows that motion pulses led to larger DV changes when triggered by low DV values compared with high DV values, consistent with the presence of an absorbing decision bound.
We next addressed whether the decision bound is stationary or changing with time. For models with stationary bounds, the effect of the motion pulse would depend solely on the state of the DV at the time of the pulse, whereas for models with time-varying bounds16,17,18,20,21, the pulse effect would also depend on the pulse time. Devising an analysis that disentangles the effects of the DVPERL from pulse time also addresses a potential confound in the 𝛿DV analysis presented above: the motion pulse was always delivered at the end of the stimulus and, on average, longer stimulus durations were required to generate higher DVs in our experiment, as expected from standard evidence-accumulation models. Thus the 𝛿DV analysis results could have been partially shaped by elapsed time. We therefore conducted an additional analysis to determine whether the reduced pulse effect was attributable specifically to higher DVs, to later pulse times (longer stimulus durations), or both.
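The contrast between the model classes can be made concrete with a deterministic toy. The sketch below compares the DV change produced by a fixed pulse under unbounded integration, a stationary absorbing bound, and a collapsing bound; the exponential form of the collapse and all parameter values are assumptions chosen for illustration, and noise is ignored.

```python
import math

def pulse_effect(dv, pulse, bound=None):
    """Change in DV caused by a pulse of size `pulse` arriving at state `dv`.

    bound=None models unbounded linear integration (constant effect).
    Otherwise the DV is absorbed once |DV| reaches `bound`.
    """
    if bound is None:
        return pulse
    if abs(dv) >= bound:
        return 0.0                                  # already absorbed
    new_dv = max(-bound, min(bound, dv + pulse))    # clip at the bound
    return new_dv - dv

def collapsing_bound(t_ms, start=4.0, floor=1.0, tau_ms=500.0):
    """Hypothetical bound height that decays exponentially with elapsed time."""
    return floor + (start - floor) * math.exp(-t_ms / tau_ms)

# Same DV and pulse, delivered early versus late in the trial.
early = pulse_effect(2.5, 0.5, bound=collapsing_bound(0.0))
late = pulse_effect(2.5, 0.5, bound=collapsing_bound(1000.0))
```

With a stationary bound the effect depends only on `dv`; with a collapsing bound the same `dv` yields a smaller effect at later times, which is the signature the subsequent analysis looks for.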
We adopted a data-driven approach to separate the effects of DVPERL and stimulus duration (Supplementary Methods 16.1). In brief, to isolate the effect of the magnitude of DVPERL (|DVPERL|), we (1) divided trials into eight quantiles for stimulus duration, (2) calculated a residual pulse effect for each trial by subtracting the mean pulse effect for each combination of stimulus-duration quantile and baseline motion strength, (3) recombined the data across duration quantiles to obtain statistical power, and (4) analysed how the residuals varied with |DVPERL|. We refer to this as the time-adjusted effect of DVPERL, that is the effect of DVPERL that cannot be accounted for by stimulus duration or baseline motion strength (Fig. 4e, f). Conversely, to isolate the DV-adjusted effect of stimulus duration, we (1) divided trials into |DVPERL| bins, (2) calculated residual pulse effects by subtracting the mean effects for each combination of |DVPERL| bin and baseline motion strength, (3) recombined the data across all |DVPERL| bins, and (4) analysed how the single-trial residuals varied with stimulus duration (Fig. 4g, h).
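The four steps of the time-adjusted analysis can be sketched as follows. This is an illustrative reading of the procedure, not the authors' code: binning is by rank, the final step is summarized with an ordinary least-squares slope, and all function names are hypothetical.

```python
from collections import defaultdict

def quantile_bins(values, n_bins):
    """Assign each value a bin index 0..n_bins-1 by rank (near-equal counts)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * n_bins // len(values)
    return bins

def residualize(effects, group_keys):
    """Subtract from each trial's effect the mean effect of its group."""
    sums, counts = defaultdict(float), defaultdict(int)
    for e, k in zip(effects, group_keys):
        sums[k] += e
        counts[k] += 1
    return [e - sums[k] / counts[k] for e, k in zip(effects, group_keys)]

def time_adjusted_residuals(effects, duration, coherence, n_quantiles=8):
    """Steps 1-3: bin by stimulus duration, remove the mean effect for each
    (duration quantile, baseline coherence) cell, and pool across quantiles."""
    dur_bin = quantile_bins(duration, n_quantiles)
    return residualize(effects, list(zip(dur_bin, coherence)))

def slope(x, y):
    """Step 4 (one possible summary): least-squares slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx
```

The DV-adjusted effect of stimulus duration follows by swapping roles: bin trials by |DVPERL| instead of duration, residualize within each (|DVPERL| bin, coherence) cell, and relate the residuals to stimulus duration.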
The time-adjusted magnitudes of both behavioural and neural pulse effects decreased systematically with |DVPERL| (Fig. 4e, f, Extended Data Fig. 8c, d, g–j), and the DV-adjusted magnitudes of both behavioural and neural pulse effects decreased systematically with stimulus duration (Fig. 4g, h, Extended Data Fig. 8e, f).
Discussion
In this study, we have combined neural population recordings with closed-loop, neurally contingent stimulus control to probe moment-to-moment fluctuations in decision states and validate their significance for behaviour. We show that large fluctuations in a decoded DV in premotor and primary motor cortices are nearly instantaneously (<100 ms) predictive of choices. Notably, these intra-trial DV fluctuations are not driven predominantly by intra-trial fluctuations in stimulus strength, as quantified by motion energy, even in CoM trials (Extended Data Fig. 9, Supplementary Note 4). This advance enabled real-time detection of covert cognitive events (such as CoM) at the neural level.
We exploited this approach to test current models of evidence accumulation and termination in decision making. We introduced weak motion pulses at known DV values during naturally evolving decisions. Strictly linear, unbounded accumulation models predict a constant effect of stimulus pulses irrespective of the momentary decision state or the time of pulse presentation during the trial. By contrast, we found that the neural and behavioural effects of stimulus pulses were strongest when delivered at low DV values or short stimulus durations.
Each of these two results establishes constraints on models of decision making. First, the decreased efficacy of stimulus pulses at higher DV values constitutes direct evidence for absorbing bounds, a feature of the decision-making process that is widely assumed in many models of decision formation15,22,23,24,25. Our result indicates that the system becomes resistant to further motion input as the DV becomes larger, reflecting a stronger state of commitment to a choice. Evidence for a decision bound also emerged from experiment 1: DV variability decreases late in the trial, consistent with inferences from previous studies9,19 that the system becomes more resistant to new stimulus input at longer stimulus durations (Extended Data Fig. 10).
Second, the decreased efficacy of pulses with longer stimuli suggests that the amplitude of the terminating bound decreases with time during the trial. Two large groups of models that lack a time-dependent termination mechanism cannot explain our data, because they predict that the pulse effect will be determined by the strength of the pulse and the state of the model when the pulse is delivered, with no independent effect of time: (1) models that assume a static termination mechanism, including commonly used drift diffusion models with fixed decision bounds; and (2) models that lack a termination criterion for fixed and variable duration tasks, relying instead on dynamic competition between two alternatives to determine the final choice15,26. Extensions of the drift diffusion models—and the broader class of bounded accumulation models—that include time-dependent decision bounds27 or an urgency signal16,17,18,20,21,28 are compatible with our experimental observations (Supplementary Note 5).
Our study builds on a substantial literature of single-unit9,19 and neural population2,4,5,8,29 studies of decision mechanisms, and leverages the technical power of intracortical brain–computer interfaces developed for real-time control of prosthetic devices30,31,32,33,34,35,36. Our findings were enabled by the ability to accurately decode decision states in real time (Supplementary Note 6), which could bring the concept of cognitive prostheses37,38,39,40,41 closer to reality by providing another means of decoding subjects’ goals for use in prosthetic control. More broadly, our real-time closed-loop approach may also be applicable to other cognitive phenomena such as working memory and attention42,43, and even to affective processes41.
Reporting summary
Further information on research design is available in the Nature Research Reporting Summary linked to this paper.
Data availability
The data that support the findings of the current study are available from the corresponding authors upon reasonable request. Source data are provided with this paper.
Code availability
The analysis code was developed in MATLAB (Mathworks) and is available from the corresponding authors upon reasonable request.
References
Shadlen, M. N. & Kiani, R. Decision making as a window on cognition. Neuron 80, 791–806 (2013).
Kiani, R., Cueva, C. J., Reppas, J. B. & Newsome, W. T. Dynamics of neural population responses in prefrontal cortex indicate changes of mind on single trials. Curr. Biol. 24, 1542–1547 (2014).
Resulaj, A., Kiani, R., Wolpert, D. M. & Shadlen, M. N. Changes of mind in decision-making. Nature 461, 263–266 (2009).
Kaufman, M. T., Churchland, M. M., Ryu, S. I. & Shenoy, K. V. Vacillation, indecision and hesitation in moment-by-moment decoding of monkey motor cortex. eLife 4, e04677 (2015).
Bollimunta, A., Totten, D. & Ditterich, J. Neural dynamics of choice: single-trial analysis of decision-related activity in parietal cortex. J. Neurosci. 32, 12684–12701 (2012).
van den Berg, R. et al. A common mechanism underlies changes of mind about decisions and confidence. eLife 5, e12192 (2016).
Lemus, L. et al. Neural correlates of a postponed decision report. Proc. Natl Acad. Sci. USA 104, 17174–17179 (2007).
Rich, E. L. & Wallis, J. D. Decoding subjective decisions from orbitofrontal cortex. Nat. Neurosci. 19, 973–980 (2016).
Kiani, R., Hanks, T. D. & Shadlen, M. N. Bounded integration in parietal cortex underlies decisions even when viewing duration is dictated by the environment. J. Neurosci. 28, 3017–3029 (2008).
Britten, K. H., Shadlen, M. N., Newsome, W. T. & Movshon, J. A. The analysis of visual motion: a comparison of neuronal and psychophysical performance. J. Neurosci. 12, 4745–4765 (1992).
Peixoto, D. et al. Population dynamics of choice representation in dorsal premotor and primary motor cortex. Preprint at https://doi.org/10.1101/283960 (2018).
Kiani, R. & Shadlen, M. N. Representation of confidence associated with a decision by neurons in the parietal cortex. Science 324, 759–764 (2009).
Shadlen, M. N. & Newsome, W. T. Neural basis of a perceptual decision in the parietal cortex (area LIP) of the rhesus monkey. J. Neurophysiol. 86, 1916–1936 (2001).
Smith, P. L. & Ratcliff, R. Psychology and neurobiology of simple decisions. Trends Neurosci. 27, 161–168 (2004).
Usher, M. & McClelland, J. L. The time course of perceptual choice: the leaky, competing accumulator model. Psychol. Rev. 108, 550–592 (2001).
Ditterich, J. Evidence for time-variant decision making. Eur. J. Neurosci. 24, 3628–3641 (2006).
Cisek, P., Puskas, G. A. & El-Murr, S. Decisions in changing conditions: the urgency-gating model. J. Neurosci. 29, 11560–11571 (2009).
Hanks, T., Kiani, R. & Shadlen, M. N. A neural mechanism of speed–accuracy tradeoff in macaque area LIP. eLife 3, e02260 (2014).
Huk, A. C. & Shadlen, M. N. Neural activity in macaque parietal cortex reflects temporal integration of visual motion signals during perceptual decision making. J. Neurosci. 25, 10420–10436 (2005).
Churchland, A. K., Kiani, R. & Shadlen, M. N. Decision-making with multiple alternatives. Nat. Neurosci. 11, 693–703 (2008).
Thura, D., Beauregard-Racine, J., Fradet, C.-W. & Cisek, P. Decision making by urgency gating: theory and experimental support. J. Neurophysiol. 108, 2912–2930 (2012).
Wong, K.-F. & Wang, X.-J. A recurrent network mechanism of time integration in perceptual decisions. J. Neurosci. 26, 1314–1328 (2006).
Wong, K.-F., Huk, A. C., Shadlen, M. N. & Wang, X.-J. Neural circuit dynamics underlying accumulation of time-varying evidence during perceptual decision making. Front. Comput. Neurosci. 1, 6 (2007).
Inagaki, H. K., Fontolan, L., Romani, S. & Svoboda, K. Discrete attractor dynamics underlies persistent activity in the frontal cortex. Nature 566, 212–217 (2019).
Ratcliff, R. & McKoon, G. The diffusion decision model: theory and data for two-choice decision tasks. Neural Comput. 20, 873–922 (2008).
Wang, X.-J. Probabilistic decision making by slow reverberation in cortical circuits. Neuron 36, 955–968 (2002).
Drugowitsch, J., Moreno-Bote, R., Churchland, A. K., Shadlen, M. N. & Pouget, A. The cost of accumulating evidence in perceptual decision making. J. Neurosci. 32, 3612–3628 (2012).
Standage, D., You, H., Wang, D. H. & Dorris, M. C. Gain modulation by an urgency signal controls the speed-accuracy trade-off in a network model of a cortical decision circuit. Front. Comput. Neurosci. 5, 7 (2011).
Seidemann, E., Meilijson, I., Abeles, M., Bergman, H. & Vaadia, E. Simultaneously recorded single units in the frontal cortex go through sequences of discrete and stable states in monkeys performing a delayed localization task. J. Neurosci. 16, 752–768 (1996).
Andersen, R. A., Aflalo, T. & Kellis, S. From thought to action: The brain-machine interface in posterior parietal cortex. Proc. Natl Acad. Sci. USA 116, 26274 (2019).
Collinger, J. L. et al. High-performance neuroprosthetic control by an individual with tetraplegia. Lancet 381, 557–564 (2013).
Hochberg, L. R. et al. Reach and grasp by people with tetraplegia using a neurally controlled robotic arm. Nature 485, 372–375 (2012).
Moritz, C. T., Perlmutter, S. I. & Fetz, E. E. Direct control of paralysed muscles by cortical neurons. Nature 456, 639–642 (2008).
Ethier, C., Oby, E. R., Bauman, M. J. & Miller, L. E. Restoration of grasp following paralysis through brain-controlled stimulation of muscles. Nature 485, 368–371 (2012).
Pandarinath, C. et al. High performance communication by people with paralysis using an intracortical brain-computer interface. eLife 6, e18554 (2017).
Willett, F. R. et al. Hand knob area of premotor cortex represents the whole body in a compositional way. Cell 181, 396–409 (2020).
Musallam, S., Corneil, B. D., Greger, B., Scherberger, H. & Andersen, R. A. Cognitive control signals for neural prosthetics. Science 305, 258–262 (2004).
Pesaran, B., Musallam, S. & Andersen, R. A. Cognitive neural prosthetics. Curr. Biol. 16, R77–R80 (2006).
Andersen, R. A., Hwang, E. J. & Mulliken, G. H. Cognitive neural prosthetics. Annu. Rev. Psychol. 61, 169–190 (2010).
Golub, M. D., Chase, S. M., Batista, A. P. & Yu, B. M. Brain–computer interfaces for dissecting cognitive processes underlying sensorimotor control. Curr. Opin. Neurobiol. 37, 53–58 (2016).
Shanechi, M. M. Brain–machine interfaces from motor to mood. Nat. Neurosci. 22, 1554–1564 (2019).
Wallis, J. D. Decoding cognitive processes from neural ensembles. Trends Cogn. Sci. 22, 1091–1102 (2018).
Schafer, R. J. & Moore, T. Selective attention from voluntary control of neurons in prefrontal cortex. Science 332, 1568–1571 (2011).
Acknowledgements
We thank all members of the Newsome and Shenoy labs at Stanford University for comments on the methods and results throughout the execution of the project. D.P. was supported by the Champalimaud Foundation, Portugal and the Howard Hughes Medical Institute. J.R.V. was supported by Stanford MSTP NIH training grant 4T32GM007365 and by the Howard Hughes Medical Institute. R.K. was supported by Simons Collaboration on the Global Brain grant 542997, a Pew Scholarship in Biomedical Sciences, National Institutes of Health Award R01MH109180 and a McKnight Scholars Award. J.C.K. was supported by an NSF graduate research fellowship. P.N. was supported by NIDCD award R01DC014034. C.C. was supported by awards K99NS092972 and 4R00NS092972-03 from the NINDS and as a research specialist by the Howard Hughes Medical Institute. J.B. and S.F. were supported by the Howard Hughes Medical Institute. K.V.S. was supported by the following awards: NIH Director’s Pioneer Award 8DP1HD075623, DARPA-BTO ‘NeuroFAST’ award W911NF-14-2-0013, the Simons Foundation Collaboration on the Global Brain awards 325380 and 543045, and ONR award N000141812158. W.T.N. and K.V.S. were supported by the Howard Hughes Medical Institute.
Author information
Contributions
D.P., J.R.V., R.K., S.F., K.V.S. and W.T.N. designed the experiments. D.P., J.R.V. and S.F. trained the animals and collected the data. D.P., J.R.V. and W.T.N. wrote the initial draft of the paper. S.I.R., D.P. and R.K. performed the surgical procedures. D.P., J.C.K., P.N., C.C. and J.B. implemented the real-time decoding setup. D.P., R.K. and C.C. designed, and D.P. and J.R.V. implemented, the decoder training algorithm to obtain the decoder weights and normalization matrices. D.P. and J.R.V. analysed the data. All authors contributed analytical insights and commented on statistical tests, discussed the results and implications, and contributed extensively to the multiple subsequent drafts of the paper.
Ethics declarations
Competing interests
K.V.S. consults for Neuralink Corp and CTRL-Labs Inc (part of Facebook Reality Labs) and is on the scientific advisory boards of MIND-X Inc, Inscopix Inc and Heal Inc. These entities did not support this work. The remaining authors declare no competing interests.
Additional information
Peer review information Nature thanks Timothy Hanks, Hendrikje Nienborg and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data figures and tables
Extended Data Fig. 1 Monkey and decoder performance.
a, Psychophysical performance, motion discrimination task. Trials were sorted for stimulus duration in quartiles from long (dark green) to short (light green). Data points (black) correspond to mean accuracy ± s.e.m. Data from each quartile were fit separately by a Weibull curve (Supplementary Methods 6). Inset: fit parameter α (psychophysical threshold) for each quartile. For monkey H (F): data from 12,516 (12,365) open-loop trials. For both subjects, median threshold for short duration stimuli (Q1, Q2 combined) was higher than for longest duration stimuli (Q3, Q4 combined; two-sided Wilcoxon rank-sum test: P = 6.188 × 10^−30 for monkey H, P = 2.136 × 10^−65 for monkey F). The x axis shows natural log scale spacing. b, Real-time choice prediction accuracy. Same as Fig. 1c, for individual monkeys (16,468, 15,286 trials for monkey H, F). Accuracy departed from baseline 174.5 ± 18.8 ms (214.5 ± 8.09 ms) after dots onset for monkey H (F). c, Average DV during dots. Same as Fig. 1d, for individual monkeys. For monkey H (F) coherence is a significant regressor of DV (for at least one of the choices) for the period between [190, 870] ms ([230, 970] ms) aligned to dots onset. Minimum 1,220 (1,332) trials per condition shown for monkey H (F). d, Prediction accuracy as a function of DV magnitude. Same as Fig. 2c, for individual monkeys. Data from 2,973 (2,518) trials – minimum 495 (484) trials per condition shown – from monkey H (F). e, Prediction accuracy as a function of DV magnitude and stimulus coherence. Same data and conventions as in Fig. 2e, for individual monkeys and pre-sorted by high versus low coherences. Minimum 234 (238) trials per condition shown for monkey H (F). f, Prediction accuracy as a function of DV magnitude and stimulus duration. Same data and conventions as Fig. 2f, for individual monkeys and pre-sorted by stimulus duration (median split). Minimum 239 (235) trials per condition shown for monkey H (F).
g, Prediction accuracy as a function of DV magnitude and time before stimulus termination. Prediction accuracy as a function of DV at time t + Δt before termination. Each curve corresponds to a different Δt from 0 ms (dark blue) to −400 ms (light blue). Accuracy: percentage of correctly predicted choices. Data from 2,973 (2,518) trials from monkey H (F). h, Single trial DVs substantially increase choice prediction accuracy. Same as Fig. 2d, for individual monkeys.
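Panel a reports a threshold α from a Weibull fit of accuracy against coherence. The exact parameterization is given in Supplementary Methods 6 and is not reproduced here. As a minimal sketch, the code below assumes a commonly used two-alternative form, P(c) = 1 − 0.5·exp(−(c/α)^β), under which α is the coherence at which accuracy reaches roughly 82%. The function name, the slope parameter β and the example values are illustrative assumptions, not values from the paper.

```python
import math

def weibull_accuracy(coherence, alpha, beta):
    """Assumed two-alternative Weibull form (not taken from the paper).

    Returns 0.5 (chance) at zero coherence and approaches 1.0 at high coherence.
    alpha is the threshold; beta controls the slope.
    """
    return 1.0 - 0.5 * math.exp(-((coherence / alpha) ** beta))

# Hypothetical parameters chosen only for illustration.
alpha, beta = 10.0, 1.2
for c in (0.0, 3.2, 10.0, 51.2):
    acc = weibull_accuracy(c, alpha, beta)
    print(f"coherence {c:5.1f}% -> predicted accuracy {acc:.3f}")
```

Under this assumed form, a higher α for short-duration quartiles corresponds to a psychometric curve shifted towards stronger coherences, in line with the threshold comparison reported in the legend.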
Extended Data Fig. 2 Prediction accuracy online versus offline.
a, Online and offline classifiers result in similar performance, on average, for targets, dots, delay, and post-go epochs – monkey H. Average prediction accuracy (Supplementary Methods 9, 10, 12.3) over time ± s.e.m. (across sessions) for monkey H. Online/offline classifier results are plotted in black/red. Data in black are same as Fig. 1c, but for monkey H only. Prediction accuracy is very similar online and offline across the trial (see c). Data from 17 sessions (16,468 trials). b, Same as a, but for monkey F. Data from 15 sessions (15,286 trials). c, Summary of performance difference between online and offline classifiers within each epoch – monkey H. Average performance difference between online and offline classifiers (accuracy difference in proportion correct) for each of the epochs plotted in a (same sessions). Data points (black dots) correspond to mean accuracy difference ± s.e.m. (across sessions). Positive numbers correspond to better online classifier performance and negative numbers to better offline classifier performance. Black asterisks correspond to windows for which the differences were significantly different from zero (Wilcoxon signed-rank test, P < 0.01 two-sided, P values: 0.0004, 0.00009, 0.00009, 0.00009). d, Same as c, for monkey F (P values: 0.0051, 0.0001, 0.0017, 0.0001).
Extended Data Fig. 3 Choice prediction accuracy calculated offline on single trials: PMd versus M1, multiple versus single classifiers.
a, PMd predicts choices slightly better than M1 during stimulus presentation using a single classifier per epoch (monkey H). Mean prediction accuracy (Supplementary Methods 12.3) over time ± s.e.m. across sessions. Black dots denote time bins for which prediction accuracy was significantly different between the two areas (two-sided Wilcoxon signed-rank test, P < 0.05 Holm–Bonferroni correction for multiple comparisons). Same data as in c, d (dark traces). Data from 17 sessions (16,468 trials). b, Same as a, but using a different classifier for each 50 ms window. Same data as in c, d (light traces). c, Single and multiple classifiers yield similar performance for targets, dots and go epochs but not for reach epoch for PMd (monkey H). Same trials and statistical conventions as a. Average prediction accuracy ± s.e.m. across sessions for PMd using a single classifier (trained on data across all time points within an epoch) or multiple classifiers (a new decoder trained for every 50 ms window) per epoch. In the dots and go cue periods, performance is nearly identical with single versus multiple classifiers, reflecting the stability of choice representation during these periods. In contrast, multiple classifiers (trained for every time point) perform better during the movement period when choice representation changes rapidly with time. d, Equivalent to c, but for M1. e, Same as c, but for monkey F. Data from 15 sessions (15,286 trials).
Extended Data Fig. 4 Neural population choice prediction accuracy calculated offline on single trials when applying classifiers across epochs.
a, Only dots and go classifiers perform well across epochs. Average prediction accuracy (see Supplementary Methods 12.3) over time ± s.e.m. (across sessions) for monkey H for decoders trained in the targets (cyan), dots (dark yellow), go (magenta) and reach (black) periods. If the choice subspaces for two independent epochs are similar, the decoder from one epoch ought to accurately predict choice in the other epoch. The dots decoder performs well during the go period and vice versa. The targets and reach decoders perform poorly across other epochs. Data from 17 sessions (16,468 trials). b, Equivalent to a, but for monkey F. Same conventions apply. Data from 15 sessions (15,286 trials). c, Summary of performance difference between single and multiple classifiers within each epoch. Average performance difference between within-epoch classifier and across-epoch classifiers for each of the epochs plotted in a (same sessions). Error bars correspond to ± s.e.m. across sessions. Zero difference corresponds to the performance of the classifier trained and tested within the same epoch. d, Same as c, for monkey F. For both subjects in the dots and go periods the loss in decoding accuracy across epochs was very small, suggesting similar choice representation in both periods.
Extended Data Fig. 5 Choice prediction accuracy for correct and incorrect trials as a function of coherence.
Choice prediction accuracy obtained from real-time readout for correct and incorrect trials for each level of coherence. Prediction accuracy during the dots epoch for each coherence level is plotted for correct (black) and error (magenta) trials. Red dashed line corresponds to chance level. Insets show total number of correct (C) and error (E) trials used in the analysis (correct and incorrect designation was randomly assigned for 0% coherence stimuli). Data for monkey H and F are shown in top and bottom panels, respectively. Mean prediction accuracy for error trials after neural latency (180 ms after stimulus presentation) is outside (and lower than) the 95% CI for correct trials for 1.6%, 3.2%, 6.4%, 12.8% and 25.6% coherences for monkey H and for 12.8%, 25.6% and 51.2% coherences for monkey F (1,000 bootstrap iterations). Results for the highest coherence for each monkey should be interpreted carefully due to the extremely low number of error trials for these conditions resulting from excellent behavioural performance. (Dashed pink lines represent individual error trials at the highest coherence for monkey F.)
Extended Data Fig. 6 Real-time decoding: performance reliability, decoder weights, and Mu and Sigma stability.
a, Decoding performance is stable across sessions. Mean prediction accuracy late in the stimulus presentation (600–1,200 ms) across all sessions for monkey H (top panels) and monkey F (bottom panels). D1–D23 denote different decoders (sets of beta weights) used for the recorded sessions. For monkey H the same decoder (D1) was used for the first 14 sessions. The breaks on the x axis correspond to sessions that occurred on non-consecutive days. b, Real-time decoder β weights. β weights during the dots period (left panel) ranked by absolute magnitude for an example decoder (D1 from monkey H) used in real-time experiments. Channels with little or no choice-predictive activity during this period had their weights set to zero using LASSO regularization to prevent overfitting. Delay period and post go cue β weights are shown in the middle and right panels, respectively. c, Mu and Sigma matrices are very stable over dozens of sessions – monkey H. Average spike counts within a 50-ms window (Mu, left panel) and standard deviation of spike counts (Sigma, right panel) are plotted as a function of channel (y axis) and trial (x axis) for the sessions comprising closed-loop experiments 1 and 2 for monkey H. Red lines, breaks between sessions. d, Same as c, for monkey F.
Extended Data Fig. 7 Validation of putative changes of mind.
a, Choice prediction accuracy for all trials collected during the CoM detection experiment. Trials were split in 6 quantiles sorted by DV magnitude (absolute value) at termination. Prediction accuracy and median DV magnitude were calculated and plotted separately for each quantile (blue line with black data points). Blue error bars show standard error of the mean for a binomial distribution. Dashed black line shows predicted accuracy from log-odds equation used to fit the DV model, and red dashed line shows chance level. Left: Data from 985 CoM trials (164 trials per condition) from monkey H. Right: Data from 1,727 CoM trials (287 trials per condition) from monkey F. b, CoM frequency as a function of coherence. Same as Fig. 3c for individual monkeys. c, CoM frequency as a function of coherence and direction. Same as Fig. 3d for individual monkeys. Median corrective and erroneous CoM counts: 530 and 242 for monkey H and 1,046 and 443 for monkey F, respectively (Wilcoxon rank-sum test P < 0.001). d, CoM frequency as a function of time in the trial. Same as Fig. 3e for individual monkeys. e, CoM time as a function of coherence. Same as Fig. 3f for individual monkeys. CoM time was negatively correlated with stimulus coherence (monkey H: P = 1.8 × 10⁻¹⁷; monkey F: P = 3.0 × 10⁻³⁰).
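The comparison in panel a can be illustrated with a short sketch. Because the DV is expressed in log odds, a calibrated decoder with magnitude |DV| should predict the choice with probability 1 / (1 + exp(−|DV|)). The code below bins trials into quantiles of |DV| at termination and returns, for each bin, the median |DV|, the observed accuracy, its binomial standard error and the log-odds prediction. Function names, the handling of leftover trials and the toy data are assumptions for illustration, not the authors' analysis code.

```python
import math
import statistics

def predicted_accuracy(dv_magnitude):
    """Accuracy expected if |DV| is a calibrated log odds for the predicted choice."""
    return 1.0 / (1.0 + math.exp(-dv_magnitude))

def binomial_sem(p, n):
    """Standard error of a proportion p estimated from n trials."""
    return math.sqrt(p * (1.0 - p) / n)

def accuracy_by_quantile(dv_at_end, choices, n_quantiles=6):
    """dv_at_end: DV at termination per trial; choices: +1 or -1 per trial.

    Returns one tuple per quantile:
    (median |DV|, observed accuracy, binomial s.e.m., log-odds prediction).
    Assumption: leftover trials that do not fill a quantile are dropped.
    """
    order = sorted(range(len(dv_at_end)), key=lambda i: abs(dv_at_end[i]))
    size = len(order) // n_quantiles
    rows = []
    for q in range(n_quantiles):
        idx = order[q * size:(q + 1) * size]
        median_mag = statistics.median(abs(dv_at_end[i]) for i in idx)
        # A prediction is correct when the DV sign matches the choice sign.
        hits = sum(1 for i in idx if (dv_at_end[i] > 0) == (choices[i] > 0))
        acc = hits / len(idx)
        rows.append((median_mag, acc, binomial_sem(acc, len(idx)),
                     predicted_accuracy(median_mag)))
    return rows

# Toy data, purely illustrative.
dv = [0.1, -0.2, 0.5, -0.7, 1.5, -2.0]
choice = [1, 1, 1, -1, 1, -1]
for row in accuracy_by_quantile(dv, choice, n_quantiles=3):
    print(row)
```

Observed accuracies that track the predicted curve across quantiles indicate that the DV magnitude is well calibrated as a log odds.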
Extended Data Fig. 8 Motion pulse effects over DV_PERL and time.
a, Average change in post-pulse DV, time-locked to estimated Pulse Evidence Representation Latency (PERL), mean subtracted. Same as Fig. 4c for individual monkeys. PERL = 170 ms (180 ms) for monkey H (F). b, Average change in post-pulse DV for each DV boundary, time-locked to PERL, mean subtracted. Same as Fig. 4d for individual monkeys. Minimum 1,507 (1,731) trials per condition shown for monkey H (F). c, Residual behavioural pulse effects over |DV_PERL|. Same as Fig. 4e for individual monkeys. Minimum 501 (504) trials per condition shown for monkey H (F). d, Residual neural pulse effects over |DV_PERL|. Same as Fig. 4f for individual monkeys. e, Residual behavioural pulse effects over time. Same as Fig. 4g for individual monkeys. Minimum 1,122 (1,217) trials per condition shown for monkey H (F). f, Residual neural pulse effects over time. Same as Fig. 4h for individual monkeys. g, Pooled residual behavioural pulse effects over signed DV†_PERL (Supplementary Methods 16.1, step 4). Black: mean residual pulse effects on choice for trials in each DV†_PERL bin, ± s.e.m.; asterisks denote significantly non-zero means at 95% confidence (Supplementary Methods 16.1). Blue: nonlinear regression model fit (MATLAB fitnlm function) of the residuals to a Gaussian over DV†_PERL, including the P value (two-sided t-statistic) for the fit amplitude coefficient. h, Pooled residual neural pulse effects over signed DV_PERL. Black: mean residual pulse effects on ∆DV for trials in each DV†_PERL bin, ± s.e.m.; asterisks denote significantly non-zero means at 95% confidence (Supplementary Methods 16.1). Blue: nonlinear regression model fit (MATLAB fitnlm function) of the residuals to a Gaussian over DV†_PERL, including the P value (two-sided t-statistic) for the fit amplitude coefficient. i, Residual behavioural pulse effects over DV†_PERL, single subjects. Same as g, for individual monkeys. Minimum 149 (151) trials per condition shown for monkey H (F).
j, Residual neural pulse effects over DV†_PERL, single subjects. Same as h, for individual monkeys.
Extended Data Fig. 9 Correlation analysis between DV and stimulus motion energy (ME).
a, Correlation between ME and DV across coherences – monkey H. Proportion of variance explained when regressing DV as a function of signed stimulus coherence (grey) or ME. Each green trace corresponds to a separate regression between DV and ME, offset by 180 ms to compensate for neural response delay (Supplementary Methods 14). Darker traces correspond to regressions in which ME was averaged over a longer period of time within each trial. Rises to peak in green traces appear right-shifted by approximately 60 ms due to edge effects from filtering ME at the beginning of each trial. Across all coherence levels ME and signed coherence explain a large fraction of DV variance. b, Same as a, for monkey F. c, Correlation between ME and DV within each signed stimulus coherence level – monkey H. Proportion of variance explained when regressing DV for each time point and within each level of signed coherence as a function of the ME (offset by 180 ms; see a). Within each level of signed coherence, the DV fluctuations are not explained by the ME traces. d, Same as c, for monkey F. e, Correlation between ME and DV slope during putative changes of mind within each signed coherence level – monkey H. Proportion of variance explained when regressing signed DV slope during the CoM for each level of signed coherence as a function of ME. ME was averaged over the 100 ms preceding the CoM, offset by 180 ms. DV slope was calculated over a 100 ms window centred around the zero-crossing defining the CoM. Within each level of signed coherence, the direction and magnitude of the CoM zero-crossings are not explained by ME. P values displayed are uncorrected (linear regression); none of the fit coefficients are significantly non-zero after correction for multiple comparisons. Data from 985 CoM trials from monkey H. f, Same as e, for monkey F. None of the fit coefficients are significantly non-zero after correction for multiple comparisons. Data from 1,727 CoM trials from monkey F.
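The time-resolved regressions in panels a–d can be sketched as follows. For each time bin, the DV across trials is regressed on the ME from an earlier bin, and the proportion of variance explained (R², which for a single regressor equals the squared Pearson correlation) is recorded. The sketch assumes 10 ms bins, so that the 180 ms offset corresponds to 18 bins. The data layout, function names and toy data are hypothetical, and the ME filtering and averaging windows described in Supplementary Methods 14 are not reproduced.

```python
def variance_explained(x, y):
    """R^2 of the least-squares line y = a + b*x across trials."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0  # no variance to explain, or no variation in the regressor
    return (sxy * sxy) / (sxx * syy)

def lagged_variance_explained(me, dv, lag_bins=18):
    """me[trial][bin] and dv[trial][bin] hold one value per time bin.

    For each bin t >= lag_bins, regress dv[:, t] on me[:, t - lag_bins]
    across trials and return the list of R^2 values.
    """
    n_bins = len(dv[0])
    out = []
    for t in range(lag_bins, n_bins):
        x = [trial[t - lag_bins] for trial in me]
        y = [trial[t] for trial in dv]
        out.append(variance_explained(x, y))
    return out

# Toy example: DV is an exact linear function of ME one bin earlier.
me = [[1.0, 2.0, 0.5, 1.5], [-1.0, 0.0, 2.0, -0.5], [0.5, -1.5, 1.0, 3.0]]
dv = [[0.0] + [2.0 * m + 1.0 for m in trial[:-1]] for trial in me]
print(lagged_variance_explained(me, dv, lag_bins=1))
```

Running the same function on trials restricted to a single signed coherence level would correspond to the within-coherence analysis of panels c and d, where the legend reports that ME does not explain the DV fluctuations.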
Extended Data Fig. 10 Within trial DV variability decreases over time for long duration stimuli.
a, Average DV derivative as a function of time and choice – monkey H. DV derivative was calculated for each trial as the difference between consecutive DV estimates spaced out by 10 ms (Supplementary Methods 13). Traces show average DV derivative ± s.e.m. for right choices (red trace) and left choices (blue trace) during stimulus presentation. DV derivative initially starts increasing around the expected stimulus latency (170 ms) but progressively decreases for long (>600 ms) stimulus presentations. Minimum 8,176 trials per condition shown. b, Same as a, but for monkey F. Minimum 8,685 trials per condition shown. c, Average DV derivative as a function of time, coherence, and choice (correct trials only) – monkey H. Same data as in a, but with DV derivative averaged separately for each choice and motion coherence level. Right choices are plotted in red and left choices in blue as in a. Darker traces correspond to stronger coherences. Minimum 557 trials per condition shown. d, Same as c, but for monkey F. Minimum 692 trials per condition shown.
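As a minimal sketch of the quantity plotted here, the DV derivative for one trial is the difference between consecutive DV estimates (one every 10 ms), and the traces show its mean ± s.e.m. across trials at each time step. The function names and toy data are illustrative assumptions, and equal-length trials are assumed for simplicity.

```python
import math

def dv_derivative(dv_trace):
    """Difference between consecutive DV samples (spaced 10 ms apart)."""
    return [b - a for a, b in zip(dv_trace, dv_trace[1:])]

def mean_and_sem(values):
    """Mean and standard error of the mean across trials."""
    n = len(values)
    mean = sum(values) / n
    if n < 2:
        return mean, float("nan")
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(var / n)

def average_derivative(trials):
    """trials: list of equal-length DV traces.

    Returns one (mean, s.e.m.) pair per time step of the derivative.
    """
    derivs = [dv_derivative(t) for t in trials]
    n_steps = len(derivs[0])
    return [mean_and_sem([d[t] for d in derivs]) for t in range(n_steps)]

# Toy traces, purely illustrative.
trials = [[0.0, 1.0, 3.0], [0.0, 3.0, 3.0]]
print(average_derivative(trials))
```

Splitting the trials by choice, or by choice and coherence, before calling `average_derivative` would give the groupings shown in panels a–d.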
Supplementary information
Supplementary Information
This file contains the full Methods section, six supplementary notes, and five supplementary tables. Supplementary notes contain additional analyses and discussion. Supplementary tables detail regression results and experimental parameters.
Cite this article
Peixoto, D., Verhein, J.R., Kiani, R. et al. Decoding and perturbing decision states in real time. Nature 591, 604–609 (2021). https://doi.org/10.1038/s41586-020-03181-9