Main

The LC serves as the primary source of noradrenaline in the brain, with a highly divergent set of projections to cortical and subcortical areas1,2,3,4. The LC-NA system has been generally linked to sleep and arousal5,6,7,8, and stress-related behaviours13,17. In addition, at least two distinct roles have emerged with respect to learned behaviour2,18,19,20. First, LC activity is co-regulated with behavioural response during goal-directed behaviour9,10,21,22,23—LC activity correlates with overall subject performance9,24, and manipulating noradrenaline activity affects task performance by enhancing sensory detection and responses11,25,26,27. These observations suggest a role in the execution of a task via sensory–motor gain modulation. Second, LC activity correlates with unexpected stimuli7,15,28 or surprising outcomes10,12,14,19,20, and LC activity is linked with learning12,14,15,16 or switches in decision-making strategies29,30,31, indirectly suggesting a role for the LC in strategy optimization, arousal-mediated learning and memory formation. Whether and under what conditions this relatively small, globally projecting nucleus can simultaneously support these distinct cognitive roles remains unknown.

The functions of LC-NA have been examined in different tasks, under different conditions, leaving open the question of whether the LC and its neurons indeed facilitate multiple components of a single behaviour. If the LC has multiple functions, one way to reconcile the different roles for LC-NA activity is suggested by recent evidence of spatial modularity within the LC-NA neuronal population1. Anatomical evidence indicating that the axonal distribution of single LC-NA neurons is target-specific12,32,33,34,35 breaks from the historical view of the LC as a homogeneous nucleus. This modular view of noradrenaline forebrain neuromodulation has been supported by the observation of differential cognitive effects of noradrenaline manipulations in distinct brain areas12,16,25,26,36. Moreover, LC-NA neurons may also display functional modularity, as recently exemplified by recording of LC neuron activity in anaesthetized rats37. However, whether different LC-NA outputs carry different types of information and whether the behavioural roles of noradrenaline are refined through selective targeting of LC outputs remains unknown.

Dual roles of LC-NA in learned behaviour

To evaluate the distinct cognitive roles of LC-NA and measure its underlying activity, we designed a go/no-go task with graded auditory stimulus evidence and performance. We trained water-restricted mice to respond to a ‘go’ tone by pushing a lever and to a ‘no-go’ tone by holding the lever still (Fig. 1a, b, Extended Data Fig. 1a, b). Correct lever pushes (‘hits’) resulted in a water reward, whereas lever pushes in response to the no-go tone (‘false alarms’) resulted in an air-puff punishment (Fig. 1b). Other trial outcomes—refraining from pushing the lever at go (‘miss’) or at no-go (‘correct reject’) tones—were unreinforced. To vary stimulus evidence, we used tones of different intensities. Increasing the intensity of the go tone (sgo) resulted in an increased probability of lever press, increased sensitivity (d-prime), increased speed of lever press and decreased reaction time, whereas increasing the intensity of no-go tones (sno-go) resulted in a slight decrease in lever press probability (Fig. 1c, Extended Data Fig. 1c–j).
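As an illustration of the sensitivity measure used here, d-prime is conventionally computed as the difference between the z-transformed hit rate and the z-transformed false alarm rate. The sketch below follows that textbook definition; the function name `d_prime` and the log-linear correction (adding 0.5 to each count) are assumptions made for this example and are not taken from the Methods.

```python
from statistics import NormalDist


def d_prime(n_hit, n_miss, n_fa, n_cr):
    """Sensitivity index d' = z(hit rate) - z(false alarm rate)."""
    # Assumption: a log-linear correction keeps both rates away from 0 and 1,
    # where the inverse normal CDF is undefined.
    hit_rate = (n_hit + 0.5) / (n_hit + n_miss + 1.0)
    fa_rate = (n_fa + 0.5) / (n_fa + n_cr + 1.0)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)


# Hypothetical trial counts for one session (not data from the study).
example = d_prime(n_hit=90, n_miss=10, n_fa=10, n_cr=90)
```

Equal hit and false alarm rates give a d-prime of zero, which is why a manipulation that lowers lever pressing equally on go and no-go trials can leave d-prime unchanged.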

Fig. 1: LC-NA activity facilitates behavioural responses to low-evidence stimulus.

a, Behavioural apparatus for the head-fixed go/no-go auditory detection task. b, Summary of the trial sequence and its trial outcomes. FA, false alarm; CR, correct rejection. Freqgo, frequency of go stimulus; Freqnogo, frequency of no-go stimulus. c, Probability of lever press (P(press)) for different go (sgo) or no-go (sno-go) tone intensities. Single dots correspond to the average performance for each of four tone intensities for either no-go (descending order) or go (ascending order) frequency. Single lines correspond to the fitted P(press) using logistic regression for sgo or sno-go (Methods). d, Methods for photoinhibition of LC-NA activity. e, Example trial sequence showing the trial type, lever presses and ‘laser on’ trials. Top, timing of photoinhibition with respect to task epochs. f, P(press) for different go and no-go tone intensities for trials with laser on or off in a mouse during one example session. g, Average false alarm, hit rate and d-prime for laser-off versus laser-on trials for high and low stimulus intensity trials. h, P(press) at 0 dB intensity, obtained by fitting the behaviour with a logistic regression. P values in g and h calculated using two-tailed Wilcoxon test. i, Change in P(press) for different go/no-go tone intensities for trials where laser was turned on for LC-ArchT-tdTomato mice (green) and for LC-tdTomato controls (grey). P value calculated using two-way ANOVA. n = 19 mice in c, 7 LC-ArchT mice in g–i, and 13 LC-tdTomato mice in i. Data are mean ± 95% confidence interval determined by bootstrapping (c) or mean ± s.e.m. (i).

Using this learned behaviour combined with photoinhibition, we investigated the necessity of LC-NA activity for behavioural performance. We used Dbh-cre mice to specifically express archaerhodopsin (ArchT) in LC-NA neurons and implanted an optic fibre above the LC in each hemisphere for bilateral inhibition (Fig. 1d, Extended Data Fig. 1k). By connecting each fibre to a green laser, we silenced LC-NA activity throughout tone presentation, behavioural response (lever press) and reinforcement delivery on a subset of trials (Fig. 1e). Photoinhibition of LC-NA activity decreased the lever press probability (Fig. 1f, Extended Data Fig. 1l), resulting in lower hit and false alarm rates for low sgo and sno-go tone intensities (Fig. 1g), and in an overall decrease in press at 0 dB, calculated with a logistic regression fit (Fig. 1h). Since lever press probability decreased for both go and no-go trials, LC-NA photoinhibition had no effect on response sensitivity or d-prime (Fig. 1g). Calculating the change in lever press probability for all sgo and sno-go tone intensities showed a significant decrease in presses for LC-NA photoinhibition trials compared with fluorophore controls (Fig. 1i, Extended Data Fig. 1m, n). Silencing LC-NA activity did not affect premature early presses—a lever press occurring before go/no-go tone presentation—or reaction time (Extended Data Fig. 1o, p). For each mouse, we verified the efficiency of LC-NA photoinhibition by measuring pupil constriction (Extended Data Fig. 1q, r). We measured the effect on behavioural performance as a function of pupil constriction and found no clear relationship, suggesting that the effects of LC-NA activity on task execution are independent of changes in general arousal levels that might be affected by LC-NA inhibition (Extended Data Fig. 1s). 
Together, these results suggest that LC-NA activity facilitates behavioural responses to low-evidence stimuli, in effect promoting guesses to obtain reward at the risk of punishment.
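The press probability at 0 dB reported above comes from a logistic regression fit. A minimal sketch of such a fit is given below, assuming a single predictor (tone intensity) and a model of the form P(press) = sigmoid(b0 + b1 × intensity), so that P(press) at 0 dB is sigmoid(b0). The function names, the Newton solver and the example counts are illustrative assumptions; the exact model specification is described in the Methods and may differ.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def fit_logistic(intensities, n_press, n_trials, n_iter=25):
    """Fit P(press) = sigmoid(b0 + b1 * intensity) by Newton's method."""
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, k, n in zip(intensities, n_press, n_trials):
            p = sigmoid(b0 + b1 * x)
            r = k - n * p          # observed minus expected number of presses
            w = n * p * (1.0 - p)  # binomial variance weight
            g0 += r
            g1 += r * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system H * delta = g in closed form.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Hypothetical counts per tone intensity (dB); not data from the study.
intensities = [5, 10, 20, 40]
n_trials = [100, 100, 100, 100]
n_press = [38, 50, 73, 95]
b0, b1 = fit_logistic(intensities, n_press, n_trials)
p_press_at_0db = sigmoid(b0)  # press probability extrapolated to 0 dB
```

In this parameterization a drop in the intercept b0 lowers pressing at every intensity, which is one way to read the laser-on decrease in press probability at 0 dB.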

LC-NA activation may signal unexpected stimuli7,10,12,14,19,20 which has been linked to promoting arousal-mediated behavioural shifts and learning19,20,29,30,31, but which we postulated acts through the timing, magnitude and location of LC-NA release to facilitate reinforcement learning. We examined this possible second role for LC-NA activity in our task by studying the effect of different trial outcomes—air-puff punishment, water reward or no reinforcement—on lever press probability in the next trial (Fig. 2a). We observed a shift in the press probability bias that was dependent on the outcome of the previous trial (Fig. 2b, c, Extended Data Fig. 2a). Unreinforced trials produced an overall decrease in behavioural response on the next trial, characterized by a decrease in hit and false alarm rate and a lower value for the regression intercept (β0) (Methods) (Fig. 2c, Extended Data Fig. 2a, b), whereas punishment trials produced an increase in hit rate, the regression slope of lever press probability versus go tone intensity (βgo) and in d-prime (Fig. 2b, c, Extended Data Fig. 2a, b). These changes in hit rate and d-prime were relatively independent of the no-go tone intensity of the previous trial (Extended Data Fig. 2c–e). After rewarded trials, we observed a change in hit rate that was dependent on the go tone intensity of the previous trial (Extended Data Fig. 2a, d) but no change when effects were pooled across go tone intensity (Fig. 2c). We next determined whether LC-NA activity during a certain trial outcome was necessary for producing the serial response bias effect (Fig. 2d). Silencing LC-NA activity during a punishment (false alarm) trial abolished the increase in hit rate and response bias on the next trial (Fig. 2e, f), but silencing during a rewarded (hit) or unreinforced trial did not, on average, affect the bias on the next trial (Extended Data Fig. 2f–i). 
To test whether the effect of whole-trial LC-NA silencing was owing to the role of LC-NA encoding a punishment response or to an overall decrease in arousal, we performed the same experiments while limiting the inhibition period to the reinforcement epoch (Fig. 2d, g, Extended Data Fig. 2j, k). Silencing LC-NA during the reinforcement recapitulated the effects of whole-trial inhibition on the hit response in the subsequent trial (Fig. 2g, Extended Data Fig. 2j, k). We next evaluated whether this effect of previous trial outcome diminished over training as the punishment and reward became less novel, but found no clear relationship between training session number and rate of false alarms or hits following punishment or reward (Extended Data Fig. 2l, m). These results thus provide direct evidence for the role of LC-NA activity in integration of reinforcement signals to increase performance accuracy on the subsequent trial.
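The serial response bias described above compares press probability after a given outcome with the same quantity computed on shuffled trial order. The sketch below shows one simple way to compute it, assuming per-trial outcome labels and binary press indicators. The function names, the joint shuffling of trials and the number of shuffles are assumptions for illustration, and the sketch ignores tone intensity, which the analysis in the paper takes into account.

```python
import random


def p_press_after(outcomes, presses, outcome):
    """P(press) on trials that immediately follow a trial with `outcome`."""
    following = [presses[t + 1] for t in range(len(outcomes) - 1)
                 if outcomes[t] == outcome]
    return sum(following) / len(following) if following else None


def serial_bias(outcomes, presses, outcome, n_shuffles=1000, seed=0):
    """Observed P(press | previous outcome) minus its shuffled-order mean.

    Assumes `outcome` occurs on at least two trials, so that the observed
    value and at least one shuffled value are defined.
    """
    observed = p_press_after(outcomes, presses, outcome)
    rng = random.Random(seed)
    order = list(range(len(outcomes)))
    shuffled = []
    for _ in range(n_shuffles):
        rng.shuffle(order)  # permute trials, keeping outcome and press paired
        value = p_press_after([outcomes[i] for i in order],
                              [presses[i] for i in order], outcome)
        if value is not None:
            shuffled.append(value)
    return observed - sum(shuffled) / len(shuffled)


# Hypothetical short session (labels are illustrative, not study data).
outcomes = ["punish", "none", "punish", "reward", "punish", "none"]
presses = [1, 0, 1, 1, 1, 0]
bias_after_punish = serial_bias(outcomes, presses, "punish", n_shuffles=200)
```

A positive value means the animal pressed more often after that outcome than expected from its overall press rate; a value near zero means no serial dependence.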

Fig. 2: LC-NA activity promotes serial response bias.

a, Serial response bias was calculated as the change in P(press) on the subsequent trial following either punishment, reward or no reinforcement. P(press) for trials following punishment (red), reward (blue) and no reinforcement (grey) are shown in comparison to P(press) following a shuffled order for one mouse. b, P(press) bias calculated by subtracting shuffled data from P(press) of trials following punishment. c, Change in false alarm, hit rate and d-prime following punishment, reward and no reinforcement. P values calculated using two-tailed Wilcoxon test of data versus shuffled. d, Timing of LC-NA photoinhibition during full-trial or reinforcement-only inactivation experiments. e, Effect of LC-NA full-trial inactivation on the P(press) bias of the next trial following a punishment. Data are displayed as in b. P values calculated using one-tailed Mann–Whitney U-test compared with control bias (b) for sgo intensities greater than 5 dB. f, g, Effect of LC-NA whole-trial (f) or punishment-only (g) photoinhibition on the change in false alarm and hit rate, calculated as in c following punishment trials. P values in f and g calculated using one-tailed Wilcoxon test. Data comparable to d–g for other trial types are shown in Extended Data Fig. 2g–k. n = 18 mice (b, c), 6 mice (e) and 5 mice (f, g). Data are mean ± s.e.m. (b, e) and mean ± 95% confidence intervals determined by bootstrapping (c).

To further investigate the role of LC-NA in signalling reinforcement, we next tested whether giving an unexpected reward on a random subset of correct-rejection trials influenced performance (Extended Data Fig. 3a). Following a rewarded correct-rejection trial, the false alarm rate increased compared with an unreinforced correct-rejection trial (Extended Data Fig. 3a, b). Photoinhibition of the LC during a rewarded correct-rejection trial reversed this increase in false alarm rate (Extended Data Fig. 3c), suggesting that LC-NA activity following a surprising outcome, regardless of valence, contributes to serial response biases. Consistent with a role for LC-NA in encoding unexpected reward, silencing LC-NA activity during the reinforcement epoch at the first stage of training (go trials only; Extended Data Fig. 1a), when receiving a reward is unexpected, slowed the acquisition of the association between lever press and reward (Extended Data Fig. 2n, o). Together, these data suggest a role for LC-NA in encoding unexpected outcomes to influence task performance and learning.

Two components of LC-NA phasic activity

To investigate how LC-NA activity supports both behavioural execution and performance optimization, we recorded the spiking activity of LC-NA neurons in mice performing the task. Using photo-tagging, a combination of single-unit electrophysiology and optogenetics7, we recorded identified LC-NA neurons (Fig. 3a, b, Extended Data Fig. 4a–f). By aligning the spiking activity of LC-NA photo-tagged units to the timing of press for either false alarm or hit trials, we observed two transient peaks in LC-NA activity: the first peak preceded the timing of the lever press and the second peak followed the timing of reinforcement delivery (Fig. 3c, Extended Data Fig. 4g). Comparing the firing rates during hit trials versus miss trials or false alarm versus correct-rejection trials, we found that the first LC-NA peak was absent in trials with only the go or no-go tone and no action, indicating that this LC-NA activity was not simply due to the presence of the tone (Fig. 3d, Extended Data Fig. 4h). Untargeted electrophysiological recordings of LC neurons have indicated that baseline or tonic activity could be related to different levels of cognitive performance9. However, our targeted recordings of LC-NA activity did not show any relationship between tonic activity and task performance, suggesting that reinforcement during the task does not affect behaviour in subsequent trials through changes in tonic LC-NA activity (Fig. 3e, Extended Data Fig. 4i). Using a delay between lever press and reinforcement delivery clearly revealed the reward response of LC-NA neurons (Extended Data Fig. 4j). LC-NA activity before the press was not significantly different in false alarm versus hit trials (Fig. 3f), but was larger following punishment than following reward (Fig. 3g). Thus, LC-NA spiking activity is tightly correlated with its behavioural function by signalling both behavioural execution and positive or negative reinforcement.

Fig. 3: Transient LC-NA neuronal activity is linked with execution and reinforcement surprise.

a, Recording the spiking activity of LC-NA neurons using photo-tagging. b, Units included for analysis spiked reliably after the onset of laser illumination, and their light-elicited waveform matched non-light-evoked spikes (blue versus black lines in inset). c, Mean firing rate of LC-NA photo-tagged units aligned to lever press during false alarm and hit trials. The population average (solid line) and the corresponding s.e.m. (shaded area) are shown at the bottom. d, Average firing rate activity during a 300 ms window after the tone. P values are shown; two-tailed Wilcoxon signed-rank test with Bonferroni correction. BL, background level. e, Pre-trial tonic activity calculated over a 1 s window before the light cue, preceding any of the 4 trial types. Post-trial tonic activity calculated over a 2 s window, 3 s after the tone. P values calculated using a Kruskal–Wallis test. f, g, Spike rate of LC-NA neurons during hit or false alarm trials averaged before press (f) and after reinforcement (g) for all tone intensities. P calculated using two-tailed Mann–Whitney U-test. h, Average firing rate as a function of tone intensity for three LC-NA neurons. Single dots represent average for each tone intensity. Solid lines were obtained with least-squares linear regression. Shaded areas indicate the 95% confidence interval of the regression. P values test the significance of the correlation. i, Slope of the spike rate versus no-go (left) or go (right) tone intensities for different behavioural epochs. Slope of baseline activity is shown as control. Punish., punishment. P values calculated using two-tailed Mann–Whitney U-test with Bonferroni correction. n = 45 units from 9 mice (d–f and calculation of pre-press activity in i), and 27 units from 5 mice (calculation of reinforcement activity in g, i). Box and whisker plots indicate the median, the 25th and 75th percentile and the minimum to maximum values of the distribution (d–g, i).

We next examined the relationship between phasic LC-NA spiking activity and the level of stimulus evidence. For many individual LC-NA neurons, as well as LC-NA neurons on average, pre-press spiking rate correlated positively, whereas post-reward spiking rate correlated negatively, with go tone intensity (Fig. 3h, i, Extended Data Fig. 4k–n). Thus, pre-press LC-NA activity seems to encode evidence uncertainty, whereas post-reward LC-NA activity encodes the degree of unexpected reinforcement. In this respect, we found a modest relationship between LC-NA activity and the level of training for the post-reward LC-NA response, indicating a decrease in activity when expectedness of reward increases (Extended Data Fig. 4o). We did not observe any correlation between no-go tone intensity and post-punishment spike rate, demonstrating that in our task a reward is expected upon movement and punishment is unexpected regardless of no-go tone intensity (Fig. 3h, i, Extended Data Fig. 4n). Because aversive stimuli have been shown to elicit strong global LC-NA activation, we questioned whether the high levels of LC-NA activity observed after a false alarm were owing to the aversive nature of the punishment, or were a result of the surprise of the reinforcement. We therefore measured LC-NA activity following an unexpected water reward during correct-rejection trials, which we previously showed leads to behavioural changes on the subsequent trial (Extended Data Fig. 3a–c). We observed phasic activation following rewarded correct-rejection trials, with activity levels similar to those of the same units on a false alarm trial (Extended Data Fig. 3d–f). Thus, LC-NA activity reflects post-reinforcement surprise. 
Together, these data demonstrate that LC-NA neurons encode behavioural execution through reward expectation, as revealed by the relationship between pre-press spike rate and tone intensity, as well as unexpected reinforcement, as revealed by the high post-reward spike rate for low go tone intensity and high post-punishment spike rates for no-go tones regardless of intensity.

Modular response of LC-NA neurons

Next, we tested the extent to which the observed spiking activity in the LC during our task is represented homogeneously across LC-NA neurons. By examining the signal during false alarm or hit trials in our targeted spike recordings of LC-NA neurons, we found subpopulations of LC-NA neurons exhibiting heterogeneous activity pre-press or post-reinforcement (Extended Data Fig. 4p–r). Ten out of ten LC-NA neurons showed phasic post-punishment responses, 43 out of 45 neurons showed different levels of pre-press responses and 16 out of 27 neurons showed post-reward responses (Extended Data Fig. 4q). Trial-to-trial reliability was similar for non-responsive and responsive neurons (Extended Data Fig. 4r). To further characterize the level of heterogeneous activity among LC-NA neurons, we used two-photon micro-endoscopy to image the population activity of LC-NA neurons expressing the genetically encoded calcium indicator GCaMP6m (Fig. 4a, b, Extended Data Fig. 5a, b). Simultaneous recordings of LC-NA neurons during the go/no-go task showed that some cells had decorrelated activity, especially during hit trials (Fig. 4c, Extended Data Fig. 5c, d). We measured the level of signal correlations for all of the 197 pairs of LC-NA neurons recorded during the task for different trial types and found that the false alarm signal was much more highly correlated than the hit signal among LC-NA neurons (Fig. 4d). The higher level of decorrelation observed during hit trials could not be explained by differences in licking onset (Extended Data Fig. 5e). These data suggest that the reward signal is discretely encoded in a subset of LC-NA neurons, whereas the false alarm response is globally represented in the LC-NA population.
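The signal correlation used above is the Pearson correlation between the trial-averaged responses of two simultaneously recorded cells. A minimal sketch follows, assuming each cell is stored as a list of equal-length single-trial traces aligned to the press. The data layout and the function names are assumptions for this example.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def trial_average(trials):
    """Average single-trial traces time point by time point."""
    return [sum(col) / len(col) for col in zip(*trials)]


def signal_correlations(traces_by_cell):
    """Pearson correlation of trial-averaged traces for every pair of cells."""
    means = {cell: trial_average(trials)
             for cell, trials in traces_by_cell.items()}
    return {(a, b): pearson(means[a], means[b])
            for a, b in combinations(means, 2)}


# Hypothetical press-aligned traces for three cells (not study data).
traces = {
    "cell_1": [[0.0, 1.0, 2.0], [0.0, 3.0, 4.0]],
    "cell_2": [[0.0, 4.0, 6.0]],
    "cell_3": [[3.0, 2.0, 0.0]],
}
pair_corr = signal_correlations(traces)
```

Computing this separately for false alarm trials, hit trials and an inter-trial baseline yields one distribution of pairwise values per condition, which can then be compared as in Fig. 4d.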

Fig. 4: LC-NA activity before press and after reward is modular, whereas LC-NA activity after punishment is global.

a, Calcium activity of LC-NA neurons was imaged by implanting a micro-endoscope above the LC of Dbh-cre mice injected with Flox-GCaMP6m virus. b, Example two-photon image of GCaMP6m and mCherry signals obtained through the implanted micro-endoscope. Scale bar, 50 μm. c, Session average ΔF/F traces aligned to the timing of press for false alarm and hit trials. Pairs of columns represent two simultaneously recorded cells (LC-NA+ 1 vs LC-NA+ 2 or LC-NA+ 3 vs LC-NA+ 4) recorded from two mice (session 1 vs session 2). Data are mean ± s.e.m. d, Signal correlation obtained by calculating the Pearson correlation of the average signal of all pairs of LC-NA neurons recorded during false alarm trials, hit trials or a baseline period taken during the inter-trial interval (ITI). Kruskal–Wallis test for comparing the three distributions. CDF, cumulative density function. e, Left, multiple regression linear model to predict the behavioural correlates of LC-NA neurons during the task (Extended Data Fig. 6). Right, average calcium activity for all 65 LC-NA neurons recorded separated by each cluster. EV, explained variance. f, Average activity aligned to lever press for each cluster. g, Area under the curve (AUC) of the normalized ΔF/F curve during press and after reward or punishment for each cluster. P values calculated using one-way ANOVA. n = 197 pairs from 3 mice (d). n = 26, 15 and 24 LC-NA cells from 3 mice for press, reward, and punishment clusters respectively (e–g). Box and whisker plots indicate the median, the 25th and 75th percentile and the minimum to maximum values of the distribution.

To further explore this heterogeneity, we analysed the behavioural correlates of the activity of single LC-NA neurons using a multiple linear regression model. Task-relevant variables, including the timing of the light cue, tone, lever press and reinforcement were used as regressors for model fitting (Extended Data Fig. 6a–c) and were evaluated using fivefold cross-validation (Extended Data Fig. 6d, e). We determined the contribution of each of these variables by measuring the change in explained variance of the model when removing one regressor at a time (Fig. 4e, Extended Data Fig. 6f). By sorting the contribution of different regressors for each neuron, we found three clusters characterized by having a disproportionate fraction of their explained variance attributed to the pre-press, reward or punishment activity (Fig. 4e). As predicted by signal correlation analyses (Fig. 4d), the largest difference in signal among these three clusters was during press and reward, whereas the punishment signal remained similar regardless of cluster identity (Fig. 4f, g). To examine whether this heterogeneity is a result of neuronal identity, or if the responses of individual neurons are themselves heterogeneous across days, we tracked the responses of the same neurons over multiple sessions (Extended Data Fig. 5f–l). We found that the response profiles of LC-NA neurons were stable across sessions, showing little change in within- or between-session trial-to-trial correlations, or in signal drift index across days (Extended Data Fig. 5o, p). These results suggest that LC-NA neurons form distinct groups with respect to encoding of action execution and positive reinforcement, whereas the negative reinforcement signal is globally encoded in LC-NA neurons.
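The contribution of each task variable is measured as the drop in explained variance when that regressor is removed from the model. The sketch below illustrates this leave-one-regressor-out logic with an ordinary least-squares fit. It is a simplified assumption-laden example: regressors are plain per-sample columns rather than event-aligned kernels, the fivefold cross-validation is omitted, and all function names are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def explained_variance(regressors, y):
    """Fraction of variance in y explained by a least-squares fit."""
    cols = [[1.0] * len(y)] + [list(c) for c in regressors]  # intercept first
    XtX = [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]
    Xty = [sum(a * b for a, b in zip(ci, y)) for ci in cols]
    beta = solve(XtX, Xty)  # normal equations: (X'X) beta = X'y
    pred = [sum(bk * col[t] for bk, col in zip(beta, cols))
            for t in range(len(y))]
    mean_y = sum(y) / len(y)
    ss_res = sum((a - p) ** 2 for a, p in zip(y, pred))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


def regressor_contributions(named_regressors, y):
    """Drop in explained variance when each regressor is removed in turn."""
    full = explained_variance(named_regressors.values(), y)
    return {name: full - explained_variance(
                [c for n, c in named_regressors.items() if n != name], y)
            for name in named_regressors}


# Hypothetical regressors and a signal driven only by "press".
regs = {"press": [1, 0, 1, 0, 1, 0, 0, 1], "reward": [0, 0, 1, 1, 0, 0, 1, 1]}
signal = [2.0 * p for p in regs["press"]]
contrib = regressor_contributions(regs, signal)
```

A neuron whose explained variance falls mostly when the press regressor is removed would, under this scheme, be assigned to the press cluster.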

Spatial dynamics of LC cortical outputs

Although neuronal activity in cortex has been linked to task execution38,39,40 and response bias41,42,43,44, the cellular mechanisms producing this activity are unknown. We therefore investigated how the heterogeneous activity at the level of LC neurons maps onto distinct LC-NA cortical outputs during our task to facilitate behavioural performance. Retrograde and anterograde tracings involving the motor cortex (MC) and the prefrontal cortex (PFC) have suggested that partially overlapping sets of LC neurons target these two areas34,35. Our dual retrograde tracing experiments combining retrograde virus transport and ‘retrobeads’ showed that only about half (48.8 ± 5.6%) of LC-NA neurons that project to either the posterior forelimb area of the MC or the dorsomedial PFC (dmPFC) also projected to the other area (Extended Data Fig. 7), consistent with modularity of LC projections to discrete cortical targets. To examine whether these two regions receive similar LC-NA activity, we used two-photon axonal imaging of calcium dynamics of LC-NA projections through a cranial window located above either MC or dmPFC (Fig. 5a–c, Extended Data Fig. 8a–i). To validate the technique, we compared axonal GCaMP7b activity with the activity of the genetically encoded fluorescent noradrenaline sensor (GRABNE) and found that LC-NA axonal calcium signals reflect the underlying noradrenaline release in the cortex (Extended Data Fig. 8j–o). By comparing the activity of LC-NA axons projecting to MC or dmPFC (LC-NA:MC versus LC-NA:dmPFC), we observed a significant increase in activity at the time of press for axons targeting the MC (Fig. 5d, e). To measure the behavioural correlates of single LC-NA axons, we used a multiple linear regression model as described above (Fig. 5f; see also Fig. 4e, Extended Data Fig. 6). The linear model contribution of the press was larger in LC-NA:MC axons, whereas the contribution of punishment was larger in LC-NA:dmPFC axons (Fig. 5g, h).

Fig. 5: LC-NA cortical outputs are modular.

a, Experimental strategy to record LC-NA axonal activity in the cortex. b, Two-photon image in the MC or dmPFC of LC-NA+ axons expressing GCaMP7b. Scale bar, 50 μm. c, Example regions of interest (ROIs) and calcium traces of LC-NA axonal segments in fields of view shown in b. d, Comparison of the average LC-NA axonal activity in dmPFC or MC aligned to the timing of lever press during false alarm and hit trials. e, Area under the curve of the normalized ΔF/F signal during press and after reward or punishment for LC-NA:dmPFC or LC-NA:MC axonal segments. f, Left, the fraction of explained variance for each axon was sorted into three clusters as in Fig. 4e. Right, average calcium activity for all recorded LC-NA axons. Each cluster is separated by a dashed line within LC-NA:dmPFC (top) or LC-NA:MC (bottom) groups. g, Comparison of the linear model contribution to LC-NA cortical axons for press, reward and punishment predictors. h, Fraction of LC-NA axons from each cluster in LC:dmPFC and LC:MC. i, Experimental strategy to silence LC-NA axonal activity using photoinhibition. j, Average change in false alarm, hit and d-prime for trials in which the laser was turned on for LC:dmPFC (top) or LC:MC (bottom). k, Change in P(press) for different go/no-go tone intensities for trials where laser was turned on for LC:dmPFC (top) or LC:MC (bottom). n = 34 LC:dmPFC and 43 LC-NA:MC axons, 4 mice each (d–h), and 7 and 5 mice for LC:dmPFC and LC:MC photoinhibition (j, k). P values calculated with Mann–Whitney U-test using Bonferroni correction for multiple comparisons (e, g), normal approximation to binomial test (h), and hierarchical bootstrapping (j, k). Box and whisker plots indicate the median, the 25th and 75th percentile and the minimum to maximum values of the distribution (e, g). Data are mean ± s.e.m. (d) and mean ± 95% confidence intervals determined by bootstrapping (j, k).

Finally, since LC-NA activity before the press is disproportionally represented in the two cortical areas, we measured the effect of silencing LC-NA axonal activity in MC versus dmPFC. First, we examined the role of MC in the task by pharmacologically silencing MC with muscimol (Extended Data Fig. 9a–c). Consistent with the known involvement of MC in regulating movement, focal inactivation of MC caused significant impairment of behavioural responses, affecting all behavioural metrics (Extended Data Fig. 9d–f). We then locally photoinhibited LC-NA axons in MC or dmPFC (Fig. 5i, Extended Data Fig. 9g, h). Photoinhibition of LC-NA axons in MC decreased average hit rate while leaving false alarm rate and d-prime intact, whereas photoinhibition of LC-NA axons in dmPFC produced no significant effect (Fig. 5j). The decrease in hit rate for LC-NA:MC inactivation was mainly owing to a decrease in lever press probability for low intensity go stimuli (Fig. 5k). These results show that, consistent with the predominance of pre-movement LC-NA activity in MC, inactivating LC-NA outputs in this area affects movement execution particularly with low-evidence stimulus. Inactivating LC-NA axons in MC or dmPFC during the punishment signal did not impair the increase in performance accuracy on the subsequent trial observed after punishment (Extended Data Fig. 9i). These results are consistent with the observation that the punishment signal is distributed globally across cortex, and silencing LC-NA axons in one area alone does not reduce the full effect of the signal.

Discussion

Here, using a learned behaviour dependent on LC-NA activity, we demonstrate two concurrently encoded functions for the LC-NA system: task execution and performance optimization. Furthermore, we provide evidence that—at the level of LC-NA outputs—functional modularity exists and supports, at least partially, distinct aspects of learned behaviour. Recordings of LC-NA neurons demonstrate the temporal signatures of noradrenaline activity during the behaviour, characterized by two transient peaks, one preceding behavioural execution and another following reinforcement (Extended Data Fig. 10a). We demonstrate that this activity is projected heterogeneously to the cortex such that pre-movement noradrenaline release primarily targets motor regions, facilitating its role in behavioural execution, whereas the negative reinforcement or punishment signal produces broad neuromodulation that is probably used simultaneously by several regions to bias subsequent behaviour (Extended Data Fig. 10b).

LC-NA activity prior to task execution is low when stimulus evidence is low (Extended Data Fig. 10a). This pre-execution activity promotes reward-seeking actions, as demonstrated by the decreased behavioural response to low sensory evidence during LC-NA photoinhibition. Given that increasing LC-NA activity improves sensory–motor responses11,23,25,27,45, LC-NA activity probably provides the necessary gain modulation in target areas such as the MC (Extended Data Fig. 10b) to increase the probability of lever press with low-evidence stimulus. Since LC-NA activity is most critical for low-evidence stimulus trials, which elicit only modest increases in LC-NA activity, the effects of LC-NA on behavioural execution appear to reflect the encoding of stimulus uncertainty, potentially spanning encoding of effort or engagement, as recently suggested23.

LC-NA activity following a reward is high with low-evidence stimulus, and the activity following a punishment is highest in magnitude and relatively independent of the level of stimulus evidence (Extended Data Fig. 10a). In this surprise-encoding model of LC-NA activity, punishment following lever movement produces a large increase in noradrenaline for a wide range of no-go tone intensities, since a behavioural response is associated with expectation of reward and a punishment is always unexpected. This role of LC-NA in signalling surprise is consistent with its proposed role in implementing unexpected uncertainty19. Although we cannot dissociate the surprising from the aversive nature of punishment in our task, our data showing high LC-NA activity following a surprising reward—with different effects on behaviour than an equally high punishment signal—indicate task-specific signalling related to reward encoding. The encoding of reinforcement surprise has also been suggested for acetylcholine46 and serotonin47, and parallels reward prediction error for dopamine48,49,50. Thus, LC-NA activity would be part of a larger network involving several neuromodulators to facilitate outcome evaluation and implement reinforcement learning.

The LC-NA punishment signal is widely distributed (Extended Data Fig. 10b), and inhibiting this signal impairs performance accuracy on the subsequent trial. Notably, our results suggest that locally depleting a global LC-NA punishment signal in one target region does not produce a significant effect on behaviour. This finding is consistent with the view that serial response bias leading to task optimization might be expressed in multiple brain areas, including different cortices41,42,43,44, the striatum51 and the hippocampus44. Thus, depleting LC-NA in only one of these areas is probably insufficient to undermine the synergistic effect that widespread noradrenaline release has on the multiple brain regions that are responsible for shifting decision strategies that underlie performance optimization. As a possible mechanism, LC-NA release may enable persistent activity41,52 in multiple target areas to represent information about the erroneous action in time and to momentarily increase goal-directed attention.

Methods

Mice

All procedures performed in this study were approved by the Massachusetts Institute of Technology’s Animal Care and Use Committee and conformed to the Guide for the Care and Use of Laboratory Animals published by the National Institutes of Health. Male and female mice more than 2 months old were used in this study. Mice were housed in a room with reversed light/dark cycle (light off from 09:00 to 21:00) with controlled temperature and ventilation (20–22 °C; 40–60% humidity). All experiments were performed during the dark period of the cycle. The Dbh-cre line (B6.FVB(Cg)-Tg(Dbh-cre)KH212Gsat/Mmucd, MMRRC) was used for specific expression of various viruses in noradrenaline-expressing (NA+) neurons of the LC. We used the Gad2-IRES-cre (stock no. 019022, Jackson Laboratory) or the Vgat-IRES-cre (Stock No 028862, Jackson Laboratory) lines for sparse expression of GRABNE in cortex. Some anatomical and behavioural experiments were carried out on C57BL/6 wild-type mice.

List of viral vectors

For LC-NA photoinhibition experiments, we used AAV2-CAG-Flex-ArchT-tdTomato (UNC Vector Core) or AAV5-CAG-Flex-ArchT-tdTomato (Addgene #28305-AAV5) viruses. For axonal inhibition in the cortex, we injected an AAV8-CAG-Flex-Jaws-tdTomato (UNC Vector Core) virus. For control optogenetics experiments, we used an AAV1-Flex-tdTomato (Addgene #28306-AAV1) virus. For LC-NA photo-tagging experiments, we injected an AAV1-EF1a-double floxed-hChR2(H134R)-EYFP (Addgene #20298-AAV1) virus. For two-photon micro-endoscopy experiments, we simultaneously injected an AAV5-CAG-Flex-GCaMP6m (Addgene #100839-AAV5) and an AAV9-CB7-CI-mCherry (Addgene #105544-AAV9) virus. For two-photon calcium imaging of LC-NA axons in the cortex, we injected the enhanced genetically encoded calcium indicator with brighter baseline, GCaMP7b53 (AAV1-syn-FLEX-jGCaMP7b; Addgene #104493-AAV1). For retrograde tracing from different cortical areas, we used an rgAAV-hSyn-Cre virus (Addgene #105553-AAVrg). Finally, to measure noradrenaline release in the MC, we used an AAV9-hSyn-DIO-GRABNE2m virus54 (courtesy of Y. Li and packaged by Vigene Biosciences).

Stereotactic surgeries

Animals were prepared similarly for all surgical procedures. Mice were anaesthetized using isoflurane (3% for induction, 1–1.5% for maintenance) while maintaining a body temperature of 37.5 °C using a heating pad (ATC2000, World Precision Instruments). Mice were given pre-operative slow-release buprenorphine (1 mg kg−1, subcutaneous injection) and post-operative meloxicam (1 mg kg−1, subcutaneous injection). Mice were placed in a stereotaxic frame, scalp hair was removed, and the incision site was sterilized using betadine and 70% ethanol. The skull was exposed and the connective tissue removed using hydrogen peroxide. The skull was positioned such that the lambda and bregma marks were aligned on the anteroposterior and dorsoventral axes. For all surgeries, anti-inflammatory (meloxicam) injections were continued for 3 days following surgery.

For virus delivery, we first drilled a small craniotomy (0.5 mm) above the region of interest. For delivering Cre-dependent viruses in the LC, we injected a volume of 300–400 nl of virus (rate: 200 nl min−1), using a glass pipette with a 50 μm diameter tip. Coordinates for targeting the LC virally were (in mm): anterior-posterior (AP) −5.2 to −5.0, medial-lateral (ML): ±0.9, dorsal-ventral (DV) −2.8. For retrograde labelling of LC-NA neurons from the MC or dmPFC, a volume of 200 nl of undiluted red retrobeads (Lumafluor) or retrograde AAV-Cre virus was injected in either MC or dmPFC (rate: 50 nl min−1). Coordinates were (in mm): MC: AP 0 to 0.5; ML 1.5; DV 0.7 and dmPFC: AP 2 to 2.25; ML 0.3; DV 0.8. Note, we defined dmPFC based on previous literature that included secondary motor and anterior cingulate cortex as part of PFC in rodents55,56. For GRABNE cortical injections, we made 3 × 100 nl injections (rate: 50 nl min−1) in various locations within the 3-mm craniotomy above the MC. All injections were performed using an infuser system (QSI 53311, Stoelting) attached to the stereotaxic frame. For tracing experiments, the skin was sutured after injection and we let the mice recover for 14 days. For experiments using opsins, we let the virus express for a minimum period of four weeks. For calcium imaging experiments, we imaged as early as two weeks. For GRABNE experiments, longer incubations of four to six weeks were required for optimal sensor expression.

To deliver light into the LC, 200-μm two-ferrule cannulas (TFC_200/245-0.37_4mm_TS2.0_FLT, Doric Lenses) were implanted above the LC (in mm: AP −5.2 to −5.0; ML ±1.0; DV 2.5). To deliver light into the cortex, we used single-ferrule cannulas with a large (400 μm) diameter and high numerical aperture (0.5 NA) (Thorlabs, CFML15L02). We implanted these single-ferrule cannulas bilaterally above the MC or dmPFC using the following coordinates (in mm): MC: AP −0.5; ML ±2; DV 0.3 at 10° in the AP axis; or dmPFC: AP 1.5; ML ±0.6; DV 0.4 at 15° in the ML axis. After implantation, dental cement (Teets Denture Material) and Metabond (C&B Metabond, Parkell) were applied to affix the implant to the skull. To avoid light reflection and absorption, the transparent Metabond was mixed with black ink pigment (Black Iron Oxide 18727, Schmincke). A custom-designed head plate40 was then positioned over the implant and affixed to the skull using Metabond.

To perform LC single unit recording or pharmacological inhibition in awake head-fixed mice, we implanted a head plate parallel to the bregma–lambda axis of the skull. We used a custom design stereotactic arm to align the head plate parallel to the median and dorsal line of the skull during implantation. The head plate was attached to the skull using dental cement. The exposed skull was protected using rapid curing silicone elastomer (Kwik-Cast, WPI) topped with a fine layer of dental cement.

Two-photon imaging of LC-NA somas was performed using a gradient index (GRIN) lens of 500 μm diameter (GRINTECH, part: NEM-050-25-10-860-DS). After drilling a craniotomy and injecting Flex-GCaMP6m and mCherry viruses, a 27G needle was lowered above the LC to make space for lens implantation. The lens was glued to a custom-made 3D-printed implant guide with ultraviolet adhesive (NOA 61 UV adhesive, Norland Products). The GRIN lens was lowered slowly (~1 mm min−1) above the LC to a depth of 2.7 mm from the surface of the brain. After implanting, the GRIN lens and its implant guide were attached to the skull with Metabond mixed with black ink pigment. A head plate parallel to the surface of the GRIN lens was attached to the head (see paragraph on preparing for single-unit recordings). Finally, the lid from a cut Eppendorf tube was attached on top of the GRIN lens for protection.

Two-photon calcium imaging in the cortex was done through a cranial window. We drilled a 3-mm circular window centred over the forelimb part of MC (0 mm posterior and 1.5 mm lateral to bregma) or the medial PFC (~2 mm anterior to bregma and centred on the midline). A 3-mm coverslip centred on a 5-mm coverslip (CS-5R and CS-3R, Warner Instruments), glued together with ultraviolet adhesive, was positioned over the craniotomy and attached to the skull using dental cement. For axonal imaging, Flex-GCaMP7b was injected in the LC of Dbh-cre mice, and for GRABNE imaging, DIO-GRABNE2m was injected within the craniotomy of Gad2-cre or Vgat-cre mice. A head plate was also attached to the skull for head fixation.

Behavioural setup

Mice were head-fixed on a behaviour rig and confined in a polypropylene tube to limit body movements. Their left forepaw was able to move a lever built with a 1/16-mm-thick brass rod attached to a piezoelectric flexible force transducer (LCL-005, Omega Engineering). A metallic lick spout placed near the mouse’s mouth and connected to a custom-made lick detector57 was used to deliver water rewards (~5 μl drop of water). A small tube, pointing toward the mouse facial area and at a distance of 3 cm, was used to deliver air-puff punishment (compressed air at 40 psi for 0.3 s). Voltage signals from the transducer and lick detector were recorded through a microcontroller board (Arduino UNO Rev3). Voltage signals from the transducer were converted to lever movement in degrees using calibration data from video analysis. A second microcontroller board was used to control a 5-mm yellow LED light placed 8 cm in front of the mouse, and two solenoid valves (Parker 003-0141-900) for water and air-puff delivery. Four or twelve kilohertz sound stimuli of 0.5 s duration were delivered using a single speaker located at a distance of 30 cm from the mouse. The speaker frequency range was calibrated using a USB calibrated measurement microphone (UMIK-1, Mini DSP) and the Room EQ Wizard software (version 5.19). The sound stimulus intensities were established by a sound level meter. We used four behaviour rigs (two for general behaviour and optogenetics, one for electrophysiology and one for two-photon imaging). Noise levels were comparable across all 4 rigs (in dB with Z noise frequency weightings): 7.8 ± 1.1, 8.8 ± 1.0, 14.3 ± 0.8, and 14.7 ± 0.9 for 4 kHz; and −4.0 ± 1.2, −1.7 ± 1.1, −1.9 ± 0.9, and 0.3 ± 0.7 for 12 kHz. The behavioural setup was connected to a computer running a custom-written MATLAB (Mathworks) script that was able to record lick rate and lever voltage, while controlling the timing of light cue, sound (using Psychtoolbox), water, and reward. Behaviour rigs were assembled primarily with optomechanical components (Thorlabs).

Behavioural task and training

Upon recovery from surgical procedure, mice were gradually put on a water restriction schedule, receiving eventually 1–1.5 ml of water in total per day. Body weight was maintained above 90% of the pre-restriction weight.

Mice were trained to hold still for 1 s during the cue period (LED on), to wait for a delay to hear a tone, and to push the lever depending on stimulus identity to obtain a reward or to refrain from pushing to avoid a punishment. Mice learned to push the lever when they heard a go tone (12 kHz frequency) and hold still when they heard a no-go tone (4 kHz). After the onset of the 0.5 s sound stimulus, mice had 0.8 s to respond or hold still. If they pressed the lever during go trials they received a water reward. If they pressed during a no-go trial they received a mild air-puff punishment. Absence of response on go trials (misses) or holding still during no-go trials (correct rejections) was not reinforced. To vary the level of stimulus evidence, 4 intensities were used per frequency for a total of 8 different stimuli. Tone intensities used were 5, 15, 25 and 35 dB. These values were calculated by measuring the sound pressure level for either go or no-go frequency and subtracting the noise level of that given frequency. A lever press (hit or false alarm) was determined when the lever position reached a threshold value of 3 to 4° (depending on animal) from the position at the beginning of the trial. Absence of lever press (miss or correct rejection) was determined if the lever absolute position stayed below a value of 2.2°. Premature lever presses, occurring in the delay period between light cue off and tone onset, were considered early presses and the trial was aborted. The delay between light cue off and tone onset was randomized following a gaussian distribution (mean: 0.65 s and standard deviation 0.15 s). Trial order was pseudo-randomized to ensure that the same number of go and no-go trials was presented every four trials and that each tone intensity was presented every eight trials. Each trial was followed by a 4-s-long inter-trial interval.
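The trial structure described above can be illustrated with a minimal Python sketch. This is not the custom MATLAB script used in the study. The function names (`make_block`, `draw_delay`, `classify_trial`) are hypothetical, and several details are assumptions: trials are grouped in non-overlapping blocks of eight, the gaussian delay is floored at zero, the press threshold is fixed at 3° (the text gives 3 to 4° depending on the animal), and lever excursions between 2.2° and the press threshold are left unclassified because the text does not say how they were handled.

```python
import random

GO_KHZ, NOGO_KHZ = 12, 4
INTENSITIES_DB = (5, 15, 25, 35)


def make_block(rng):
    """Return 8 (frequency_kHz, intensity_dB) trials.

    Each of the 8 stimuli appears once, and each half of the block
    (4 trials) holds 2 go and 2 no-go trials. The block layout is an
    assumption about how the pseudo-randomization was implemented.
    """
    go = [(GO_KHZ, db) for db in INTENSITIES_DB]
    nogo = [(NOGO_KHZ, db) for db in INTENSITIES_DB]
    rng.shuffle(go)
    rng.shuffle(nogo)
    block = []
    for half in range(2):
        trials = go[2 * half:2 * half + 2] + nogo[2 * half:2 * half + 2]
        rng.shuffle(trials)
        block.extend(trials)
    return block


def draw_delay(rng, mean_s=0.65, sd_s=0.15):
    """Delay between cue offset and tone onset (floor at 0 s is an assumption)."""
    return max(0.0, rng.gauss(mean_s, sd_s))


def classify_trial(freq_khz, peak_lever_deg, press_deg=3.0, no_press_deg=2.2):
    """Label a trial from the peak lever displacement in the response window."""
    if peak_lever_deg >= press_deg:
        return "hit" if freq_khz == GO_KHZ else "false alarm"
    if peak_lever_deg < no_press_deg:
        return "miss" if freq_khz == GO_KHZ else "correct rejection"
    return "unclassified"  # assumption: intermediate excursions are not scored
```

A session would then be built by concatenating blocks, for example `[t for _ in range(50) for t in make_block(random.Random())]`.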

Mice were taken through two stages of training until they became proficient at the task. During the first phase of training, mice learned to associate a lever press with reward and to detect a go tone. In this phase, only go tones (12 kHz at 35 dB for 0.5 s) were used. The same trial sequence as in the full task was used, but we extended the duration of the response window (30 s instead of 0.8 s). We switched the animal to the next stage once they made more than 80% of lever presses for 50 consecutive trials, within a period of 0.8 s after tone onset. This initial stage of training lasted 3.9 ± 0.3 sessions. During the second phase of training, no-go trials (tone: 4 kHz at 35 dB for 0.5 s) were introduced and the response window was reduced to 0.8 s after tone onset. Training in the second phase lasted until mouse performance reached 85% hit and less than 30% false alarm for two consecutive sessions. This second stage of training lasted 11 ± 2 sessions. The last stage was considered the full task in which various intensities were introduced. For physiological recordings, a 0.25 s delay between the timing of lever press and reward or punishment was introduced at the last stage. For correct rejection with surprising reward experiments, expert mice received water reward randomly on a quarter of correct-rejection trials on sessions after the final stage of training.

Optogenetic inhibition of LC-NA activity

We used solid state laser illumination at 532 and 593 nm for activating ArchT and Jaws, respectively (Opto Engine, MGL-III-532/1 ~300 mW and YL-589-00100-CWM-SD-05-LED-F). A 200-μm/0.39 numerical aperture patch cable (Thorlabs, M72L02) was connected to the laser output and to an intensity division cube (Doric Lenses, DMC_1 × 2i_VIS_FC) for bilateral LC modulation. The patch cable (Doric Lenses, MFP_200/230/900-0.37_1m_FC-ZF1.25(F)) was attached to the animal ferrule implant using corresponding ceramic mating sleeves. Care was taken to block any light emitting from the interface between the patch cable and the implanted ferrule, using a piece of black electrical tape or rubber wrapped around the connection. The laser pulse duration, frequency, and shape were controlled by a data acquisition system (Molecular Devices, Digidata 1440A) connected directly to the analogue port of the laser power supply. Laser activation was performed on a subset (one-third or one-quarter) of trials. We pseudo-randomized the order of laser-on trials to ensure that photoinhibition never occurred on two consecutive trials. For correct rejection with unexpected reward experiments, laser inactivation of LC was performed on half of correct-rejection trials with reward. For LC-NA inhibition during learning experiments, a 0.25 s delay was added pre-reinforcement, and LC-NA inhibition was performed on every trial during the reinforcement epoch, while the mice received the water reward. A power of 15–17 mW was applied for 2.5 s or 2 s, for whole-trial and reinforcement epoch inhibition, respectively, followed by a 0.5 s ramp-down of the laser power to avoid rebound of neuronal firing. The onset of laser activation occurred during the period between the cue and tone presentation (~0.5 s before tone) for whole-trial inhibition, and after the lever press for reinforcement epoch inhibition, and lasted until the ITI period.
At the end of each experiment, the location of optic fibres was verified with respect to neurons or axons expressing the opsin. For control optogenetics experiment, we pooled mice injected with a Cre-dependent tdTomato virus (N = 5) together with mice injected with ArchT that had misplaced optic fibres, identified using histological verifications (N = 8).
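As an illustration of the laser protocol described in this section, the following Python sketch generates a laser-on schedule with no two consecutive laser trials and evaluates the commanded power over time. It is a sketch under stated assumptions, not the acquisition-system protocol used in the study: the function names are hypothetical, the ramp-down is assumed to be linear (the text gives only its 0.5 s duration), and the sampling scheme for laser-on trials is one of several that satisfy the non-adjacency constraint.

```python
import random


def choose_laser_trials(n_trials, fraction, rng):
    """Return a list of booleans marking laser-on trials.

    About `fraction` of trials are laser-on and no two laser-on trials
    are adjacent. Positions are drawn without replacement from a reduced
    range and then spread out by their rank, which guarantees gaps.
    """
    n_on = int(round(fraction * n_trials))
    slots = n_trials - n_on + 1
    if n_on > slots:
        raise ValueError("fraction too high for non-consecutive laser trials")
    picks = sorted(rng.sample(range(slots), n_on))
    on = {p + rank for rank, p in enumerate(picks)}
    return [i in on for i in range(n_trials)]


def laser_power_mw(t_s, on_s=2.5, ramp_s=0.5, power_mw=15.0):
    """Commanded power at time t_s after laser onset.

    Constant power for `on_s`, then a ramp to zero over `ramp_s`.
    The linear ramp shape is an assumption.
    """
    if t_s < 0 or t_s >= on_s + ramp_s:
        return 0.0
    if t_s <= on_s:
        return power_mw
    return power_mw * (1.0 - (t_s - on_s) / ramp_s)
```

For reinforcement-epoch inhibition, the same function would be called with `on_s=2.0`.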

Pupillometry

Pupil tracking was performed using a modified version of our previous set up7,58. A high-resolution CMOS camera (DCC1545M, Thorlabs) equipped with a 1.0× telecentric lens (Edmund Optics 58–430) was pointed at either the left or right eye depending on the experimental set up. Infrared illumination at 780 nm was provided by a light-emitting diode array light source (Thorlabs LIU780A). Video acquisition of eye images (240 × 184 pixels) was performed at 20 Hz by a custom-made MATLAB script. Pupil diameter was calculated online during acquisition with a least-squares ellipse fit of the binarized pupil image. Timing of laser activation was recorded using a microcontroller board (Arduino UNO Rev3) connected to the pupil tracking computer. The pattern of light activation was the same as for optogenetic inhibition of LC-NA activity during behaviour (on for 2.5 s + 0.5 s ramp down). As shown previously6,7,59, LC-NA photoinhibition causes pupil constriction. We thus included only mice displaying clear pupil constriction following optogenetic silencing of LC-NA activity.

Spike recordings of photo-tagged LC-NA units

After training to proficiency on the task, Dbh-cre mice, previously injected with Flox-ChR2 virus, were anaesthetized with isoflurane and the dental cement and silicone elastomer on the skull were removed. A 500-μm diameter craniotomy was performed on top of the inferior colliculus (from bregma: −4.9 to −5.4 mm anteroposterior and 0.6–1.1 mm mediolateral). The dura was punctured and the craniotomy was protected with saline and a piece of gel foam (Pfizer). The skull was covered again with silicone and the mouse was allowed to recover for at least 2–3 h for the anaesthesia effect to wash out completely.

The awake animal was then head-fixed and the silicone and gel foam removed gently. A 0.9% NaCl solution was used to keep the surface of the brain wet for the duration of the recordings. After placing the animal in the recording set up, we submerged a reference silver wire in the saline solution on the skull surface. The position of the 16-channel silicone probe (Neuronexus, A1x16-Poly2-5mm-50s-177-OA16LP) was referenced on bregma and the surface of the brain. The probe was then lowered slowly (1 min per mm), using a motorized micromanipulator (MP-285, Sutter Instrument Company), until units responding to photo-activation were found, or until a depth of 3.5 mm was reached. If no clear photo-tagged units were found in this AP/ML location, the optrode was retracted slowly and the probe was inserted in another location within the craniotomy. We used a solid-state blue laser (Opto Engine, MBL-III-473/1 ~200 mW) connected via a 105-μm/0.22 numerical aperture patch cable (M61L01, Thorlabs) to the optrode. The extracellular signal was amplified using a 1× gain headstage (model E2a, Plexon) connected to a 50× preamp (PBX-247, Plexon) and digitized at 50 kHz. The signal was high-pass filtered at 300 Hz. Time stamps of laser and trial start were also recorded by the Plexon system for alignment.

At the beginning and end of each recording session, light pulses of 2–5 ms at various light intensities (0.1–1 mW) were repeatedly delivered in the tissue (frequency: 2 Hz), to perform post hoc comparison of spontaneous and light-evoked waveforms for each sorted unit. Units were considered light-responsive if they responded significantly using the SALT algorithm60. We also only kept units responding within an 8-ms period after light stimulus onset, and whose light-evoked waveforms closely matched the spontaneous ones. Recording sessions without light-responsive units were excluded from analysis. Spikes were sorted offline using a fully automated spike-sorting algorithm61. Manual curation was performed to remove artifacts picked up by the algorithm (ill-shaped units), units with low-amplitude spikes, units with low spike rate (<0.1 spikes s−1), or units without a clear refractory period (more than 0.5% of spikes in the <1 ms refractory period of another spike). We verified spike times with cross-correlograms to combine units or eliminate duplicates. For each unit, we excluded parts of the recordings with obvious drift (unit firing rate abruptly decreasing).
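The two quantitative curation criteria above (minimum spike rate and refractory-period violations) can be expressed as a short Python sketch. This is only an illustration; curation in the study was manual, and the function names are hypothetical. The violation fraction is computed here as the number of inter-spike intervals shorter than 1 ms divided by the total spike count, which is one reasonable reading of the criterion and is an assumption.

```python
def refractory_violation_fraction(spike_times_s, refractory_s=0.001):
    """Fraction of spikes that follow another spike within `refractory_s`."""
    times = sorted(spike_times_s)
    if len(times) < 2:
        return 0.0
    violations = sum(1 for a, b in zip(times, times[1:]) if b - a < refractory_s)
    return violations / len(times)


def passes_curation(spike_times_s, duration_s,
                    min_rate_hz=0.1, max_violation_fraction=0.005):
    """Apply the spike-rate and refractory-period criteria to one unit."""
    rate_hz = len(spike_times_s) / duration_s
    if rate_hz < min_rate_hz:
        return False
    return refractory_violation_fraction(spike_times_s) <= max_violation_fraction
```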

At the end of each session, the craniotomy was covered again with Kwik-Cast to allow recording on the next day. For verifying the probe location on the last day of recording, the silicone probe was gently retracted and the recording tract was marked by re-entering the DiI-coated probe (2 mg ml−1; D3911, ThermoFisher) at the same location. The brain was collected post-experiment and immunohistochemistry for confirming the probe location was performed.

Two-photon microscopy

After training GRIN-lens-implanted or window-implanted mice, the fluorescence sensor signal (GCaMP or GRABNE) was imaged using resonant-galvo scanning with a Prairie Ultima IV two-photon microscopy system. We used the following list of objectives: CFI Plan Apochromat Lambda 4× 0.20NA (Nikon) (micro-endoscopy experiment); XLUMPlanFL N 20× 1.00NA (Olympus) (GRABNE experiment); and a XLPlan N 25× 1.05NA (Olympus) for axonal imaging. Two-photon excitation of GCaMP or GRABNE at a wavelength of 920 nm was provided by a Ti:sapphire tunable laser (Mai-Tai eHP, Spectra-Physics). Power at the objective ranged from 10 to 30 mW depending on depth and expression levels. We used 5.5× optical zoom for micro-endoscopy, 2× optical zoom for GRABNE imaging, and 4× optical zoom for axonal imaging. Images were acquired at 10 frames per second for micro-endoscopy and GRABNE experiments and 20 frames per second for axonal imaging. A voltage signal indicating the start of each trial was recorded by the prairie system for alignment with behaviour.

To increase the number of simultaneously recorded cells for LC imaging with micro-endoscopy, along with extracting the fluorescence signal from the ROI around somas, we also used ROIs from portions of dendrites emanating from somas located outside the GRIN lens field of view. Three to five sessions were collected at different depths (from 50 to 250 μm) below the GRIN lens. Somas or dendrites with high signal-to-noise ratio were selected for analysis. We obtained 65 ROIs using this method. To track the same ROIs over multiple sessions, we used sessions with matching fields of view. Since it can be challenging to obtain the same field of view from one session to another, we only selected ROIs (N = 9) that were easily traceable across sessions for this experiment. The maximal number of days a ROI could be tracked was 16, and was on average 8 ± 2 for the 9 ROIs tracked. For GRABNE imaging, the average fluorescence signal for a 450 × 450 μm area was extracted for analysis. For axonal imaging, axons with high signal-to-noise ratio were selected for analysis. Axonal ROIs were extracted by delineating the whole axonal process visible in a field of view. The area of an axonal ROI was on average 880.5 ± 65.9 and 1057.8 ± 96.0 μm2 for LC-MC and LC-dmPFC axons, respectively (data ± s.e.m.). Using these large-area ROIs provides more accurate signal extraction that is less dependent on micro-movements of axons during imaging. After recording one field of view, we moved at least 1 mm away to find new axons in the next imaging session. Care was taken to select axons from different branches. After acquisition, time-lapse imaging sequences were corrected for x/y movement using template-matching ImageJ plugins to align images with normalized cross-correlation62. For LC micro-endoscopy, we used the static mCherry signal for x/y drift correction. For GRABNE and axonal imaging, a stack of the average of all time points was used as a reference for motion correction.
For GRIN, axonal and GRABNE imaging, animals with uncorrectable level of motion, especially in the z-axis, were excluded from analysis. The ΔF/F = (F − F0)/F0 signal was calculated for each ROI extracted. Average fluorescence intensity was used as the reference value (F0) for GRABNE experiments, and the tenth percentile of fluorescence intensity was used for F0 for micro-endoscopy and axonal imaging experiments.
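The ΔF/F normalization with the two baseline definitions can be sketched in Python as follows. This is an illustrative sketch, not the analysis code used in the study. The function names and the `baseline` argument are hypothetical, and the percentile uses linear interpolation between ranks, which is an assumption (other percentile conventions give slightly different F0 values).

```python
def percentile(values, pct):
    """Percentile with linear interpolation between sorted samples."""
    xs = sorted(values)
    if not xs:
        raise ValueError("empty trace")
    k = (len(xs) - 1) * pct / 100.0
    lo = int(k)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (k - lo)


def delta_f_over_f(trace, baseline="p10"):
    """Return (F - F0) / F0 for each sample of a fluorescence trace.

    baseline="p10": F0 is the tenth percentile (micro-endoscopy, axons).
    baseline="mean": F0 is the average fluorescence (GRABNE).
    """
    if baseline == "p10":
        f0 = percentile(trace, 10)
    elif baseline == "mean":
        f0 = sum(trace) / len(trace)
    else:
        raise ValueError("baseline must be 'p10' or 'mean'")
    return [(f - f0) / f0 for f in trace]
```

With the mean baseline, ΔF/F values sum to zero over the trace by construction, whereas the tenth-percentile baseline keeps most of the signal positive.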

Histology

Mice were transcardially perfused with cold 0.9% NaCl followed by 4% paraformaldehyde (PFA). Brains were collected and post-fixed in 4% PFA overnight at 4 °C. Brains were then sectioned with a vibratome at 100-μm thickness.

Before antibody labelling, sections were incubated in blocking solution (0.1% Triton X-100, 3% BSA in PBS) for 2 h, with shaking at room temperature. Sections were then incubated in the blocking solution containing primary antibodies overnight at 4 °C. The following primary antibodies were used: 1:1,000 chicken anti-tyrosine hydroxylase (Aves Labs TYH) and 1:500, rabbit anti-GFP–Alexa Fluor 488 conjugated (ThermoFisher A-21311) and mCherry Alexa Fluor 594 conjugated (Life Technologies M11240). Sections were then washed in the blocking solution and incubated in the blocking solution containing secondary antibodies for 2–3 h at room temperature. For the secondary antibodies, we used goat anti chicken 647 nm (ThermoFisher) at a dilution of 1:500. Sections were then washed and mounted in Vectashield hard set mounting medium with DAPI (Vector Laboratories H-1500). Images of stained sections were acquired using a Leica confocal microscope with 10× or 20× objective lens. Confocal images were processed with the ImageJ software. Since the retrobead labelling appeared to infect more LC neurons, to measure the overlap between LC:dmPFC projecting and LC:MC projecting neurons we quantified the population of LC-NA neurons projecting to both MC and dmPFC as the percent of rgAAV+ cells that also contained retrobeads. We alternated the region injected with retrobeads versus rgAAV-Cre (MC or dmPFC) to make two groups and calculated the proportion for both groups. Sections were imaged using a confocal system (TCS SP8, Leica) running the Leica Application Suite X (v3.1.5.16308) with 10×/0.40 numerical aperture or 20×/0.75 numerical aperture objectives (Leica).

Reversible pharmacological inhibition of cortical activity

Mice were trained on the go/no-go behaviour as described previously. A day before pharmacological inhibition, a bilateral craniotomy was performed above the forelimb MC (AP: 0; ML: ±1.5 in mm) or dmPFC (AP: 2.0; ML: ±0.3 in mm) and covered with Kwik-Cast. On the day of experiment, 40 nl of a saline solution (0.9% wt./vol. NaCl), with or without the GABAA receptor agonist muscimol (5 μg μl−1; Sigma M1523-5MG), was injected (rate: 40 nl min−1) at a depth of 0.5 mm in one of the two regions. The bilateral injections were performed with a glass pipette with a 30–50 μm diameter. Behaviour was tested 90 to 120 min after the injection. The same mouse was tested again after saline or muscimol injection on consecutive days in a counter-balanced design. The order of saline versus muscimol sessions was randomized across mice. For analysis, we compared the behavioural performance during muscimol versus control (saline). To measure the extent of our injection, we injected 40 nl of low-molecular-weight fluorescein (Sigma F6377-100G) at the same concentration as muscimol (44 mM) in either MC or dmPFC in some mice. We estimated the spread of our injection to be ~1 mm3.

Analysis of behaviour, optogenetics and pharmacological manipulations

To quantify behaviour, probability of pressing for each go and no-go intensity was fitted with a logistic regression model:

$${\rm{ln}}({P}_{{\rm{press}}}/(1-{P}_{{\rm{press}}}))={\beta }_{0}+{\beta }_{{\rm{go}}}{S}_{{\rm{go}}}+{\beta }_{{\rm{no}}-{\rm{go}}}{S}_{{\rm{no}}-{\rm{go}}}$$
(1)

where Ppress (or P(press)) is the probability of pressing the lever for a given tone intensity, and sgo and sno-go are the intensities of the go and no-go tones, respectively. Parameters β0, βgo, and βno-go are the bias, the slope of the go curve, and the slope of the no-go curve, respectively. Alternatively, we also quantified mouse sensitivity to sgo using d-prime, calculated as norminv(hit rate) − norminv(false alarm rate). For the d-prime calculation, we pooled the false alarm rate for the 4 sno-go tone intensities. The average d-prime was computed by calculating the mean d-prime for all 4 sgo tone intensities.
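The d-prime calculation can be sketched in Python, where `statistics.NormalDist().inv_cdf` plays the role of MATLAB's `norminv`. This is a minimal sketch, not the study's analysis code. The function names are hypothetical, and the clipping of rates away from 0 and 1 (parameter `eps`) is an assumption added to avoid infinite z-scores; the text does not state how extreme rates were handled.

```python
from statistics import NormalDist


def d_prime(hit_rate, fa_rate, eps=0.01):
    """d' = z(hit rate) - z(false alarm rate).

    Rates are clipped to [eps, 1 - eps]; this clipping is an assumption.
    """
    z = NormalDist().inv_cdf
    h = min(max(hit_rate, eps), 1.0 - eps)
    f = min(max(fa_rate, eps), 1.0 - eps)
    return z(h) - z(f)


def mean_d_prime(hit_rate_by_intensity, pooled_fa_rate):
    """Average d' over go-tone intensities, using one pooled false alarm rate."""
    values = [d_prime(h, pooled_fa_rate) for h in hit_rate_by_intensity]
    return sum(values) / len(values)
```

For example, `mean_d_prime([0.6, 0.8, 0.9, 0.95], 0.2)` averages the four per-intensity d-prime values computed against a pooled false alarm rate of 20%.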

To quantify the effect of photoinhibition on behavioural response, we extended the model to include the effect of laser activation:

$${\rm{ln}}({P}_{{\rm{press}}}/(1-{P}_{{\rm{press}}}))={\beta }_{0}+{\beta }_{{\rm{go}}}{S}_{{\rm{go}}}+{\beta }_{{\rm{no}}-{\rm{go}}}{S}_{{\rm{no}}-{\rm{go}}}+L({\beta }_{{\rm{L}}}+{\beta }_{{\rm{go}}-{\rm{L}}}{S}_{{\rm{go}}}+{\beta }_{{\rm{no}}-{\rm{go}}-{\rm{L}}}{S}_{{\rm{no}}-{\rm{go}}})$$
(2)

where L equals 1 on laser activation trials and 0 otherwise. The effect of laser activation was then measured by the change in Ppress for sgo or sno-go for laser-off versus laser-on trials. We also compared β parameters for laser-off versus laser-on trials. We excluded portions of behaviour where the animal early-pressed (a press during the fore-period delay) on more than 40% of trials, calculated with a 50-trial moving average. For pharmacological inhibition experiments, we fitted Ppress during separate sessions with equation (1), and compared the fitted data for control (saline) versus muscimol-injected sessions. To quantify Ppress at s = 0 dB (P0) we used the following equation:

$${P}_{0}=1/(1+{e}^{-{\beta }_{0}})$$
(3)

where β0 is calculated using equation (1) or (2). The effect of LC-NA photoinhibition on false alarm rate, hit rate and d-prime during low or high tone intensities was calculated by averaging these metrics for 5–15 dB (low intensity) or 25–35 dB (high intensity).
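To make the logistic model concrete, the sketch below evaluates the press probability predicted by equations (1) and (2) for a given set of coefficients. It does not fit the model, and it is not the study's code. The function and argument names are hypothetical. It assumes that the intensity of the tone not presented on a trial is coded as 0 (for example, sno-go = 0 on go trials); the text does not state the coding explicitly. The press probability at 0 dB follows by setting both intensities to zero in the laser-off model.

```python
import math


def p_press(s_go, s_nogo, b0, b_go, b_nogo,
            laser=0, b_l=0.0, b_go_l=0.0, b_nogo_l=0.0):
    """Press probability from the logistic model.

    With laser=0 this is equation (1); with laser=1 the laser terms of
    equation (2) are added to the log-odds.
    """
    log_odds = (b0 + b_go * s_go + b_nogo * s_nogo
                + laser * (b_l + b_go_l * s_go + b_nogo_l * s_nogo))
    return 1.0 / (1.0 + math.exp(-log_odds))


def p_zero(b0):
    """Press probability at 0 dB, laser off."""
    return p_press(0.0, 0.0, b0, 0.0, 0.0)
```

The coefficient values used when calling these functions would come from a separate fitting step, which is outside the scope of this sketch.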

To quantify serial response bias, we measured the change in hit, false alarm and d-prime following a reward (hits), punishment (false alarms), or no reinforcement (combined misses and correct rejections). We also estimated Ppress on the following trials using equation (1). The fitted (Ppress) or unfitted (hit, false alarm and d-prime) data was compared to selecting the same trial type from a shuffled trial sequence (shuffled 50 times). The serial response bias, or press probability bias, was calculated by subtracting subsequent hit, false alarm, d-prime or Ppress of the normal sequence from those values calculated from the shuffled sequence. We excluded parts of a session where the hit rate was lower than 20% and false alarm rate was higher than 70%, calculated using a 50-trial averaging window. To evaluate the effect of silencing LC-NA neurons on serial response bias, we compared the shuffled-subtracted hit, false alarm, d-prime and β parameters for laser-off versus laser-on trials. Since the βno-go parameter was not affected by trial history, we removed it from equation (1) to quantify the effect of LC-NA photoinhibition.

Analysis of LC-NA single unit data

Spike delay to laser activation for photo-tagged LC-NA units was calculated as the average timing for the first peak after the light onset. The photo-evoked jitter was defined as the standard deviation of this peak onset distribution. Session averages and population averages were displayed using a spike density function:

$$r(t)={\sum }_{i}{f}_{\sigma }(t-{t}_{i})$$
(4)

where r(t) is the instantaneous spike rate and ti is the time of the ith spike. The sum is over the total number of spikes. fσ represents the following gaussian kernel:

$${f}_{\sigma }\left(t-{t}_{i}\right)=\frac{1}{\sigma {\left(2{\rm{\pi }}\right)}^{1/2}}\times {\rm{\exp }}(-{(t-{t}_{i})}^{2}/2{\sigma }^{2})$$
(5)

The parameter σ was set to 50 ms. To calculate the response for different behavioural events (press or reinforcement), we averaged the spike count during a window preceding or following the event for different trial types. We used a window from −0.25 to −0.05 s before press, from 0.05 to 0.15 s after water reward delivery, and from 0 to 0.1 s after air-puff delivery. Note that we used a different window to calculate reward versus punishment activity. Indeed, transient activity after a reward is delayed in time, since water has to come out of the spout and the animal has to initiate licking, whereas, for punishment, the air puff is almost instantaneous. For calculating the amplitude of press, reward, and punishment related spiking activity, we used a baseline window of −2.5 to −1 s before press. To test if the response of a neuron was significant, we used an unpaired Student’s t-test comparing the spike rate distribution of baseline versus different epochs of the task as described above. To do so, we used neurons that were recorded for at least 10 repetitions of the same trial type. To compare the activity after tone for hit, miss, false alarm, and correct-rejection trials, we used a window of 0 to 0.3 s after tone onset and compared it to a baseline window of −0.6 to −0.3 s before tone onset. For calculating baseline tonic activity, we used a 1s window before the light cue or a 2 s window taken 3 s after the tone. To evaluate the relationship between go/no-go tone intensity and spike rate, we fitted a least-square slope to the spike count obtained for each tone intensity and compared with the slope of the baseline period of −2.5 to −1 s before press. Fano factor, a measure of variability of spiking, was calculated using the variance/mean of the number of spikes during the pre-press or post-reinforcement windows defined above.
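The spike density function and the Fano factor described in this section can be sketched in Python. This is an illustration rather than the study's code, and the function names are hypothetical. Each spike contributes a gaussian kernel centred on its spike time, with σ = 50 ms as stated above. For the Fano factor, the sample variance (n − 1 denominator) is used, which is an assumption because the text does not specify the variance estimator.

```python
import math


def spike_density(spike_times_s, t_s, sigma_s=0.05):
    """Instantaneous rate r(t): sum of gaussian kernels centred on each spike."""
    norm = 1.0 / (sigma_s * math.sqrt(2.0 * math.pi))
    return sum(norm * math.exp(-((t_s - ti) ** 2) / (2.0 * sigma_s ** 2))
               for ti in spike_times_s)


def fano_factor(spike_counts):
    """Variance / mean of spike counts across trials in a fixed window."""
    n = len(spike_counts)
    mean = sum(spike_counts) / n
    variance = sum((c - mean) ** 2 for c in spike_counts) / (n - 1)
    return variance / mean
```

Because each kernel integrates to 1, the density has units of spikes per second for a single trial; a trial average would divide the summed density by the number of trials.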

In some experiments, we did not use a delay between press and reinforcement (n = 18 units recorded without delay versus 27 with a 0.25 s pre-reinforcement delay) (Extended Data Fig. 3i). We included both delay and non-delay experiments for calculating pre-press or post-tone LC-NA single-unit activity. For calculating activity following reinforcement, we included only experiments in which we used a 0.25 s pre-reinforcement delay.

Analysis of calcium and GRABNE signals

For LC somas, LC axons, and cortical GRABNE imaging, the ΔF/F signals from different ROIs were made comparable by scaling each signal to its maximum value. To do so, we calculated the session average aligned to the timing of lever press for hit and false alarm trials, measured the peak intensity across these trial types, and divided the session average by this peak. To measure the response to different behavioural epochs, we calculated the AUC for a window of −0.5 to 0.2 s for press and 0.2 to 1 s for outcome (reward or punishment). To calculate signal correlations of LC-NA neurons, we computed the Pearson correlation coefficient of the signal during a −1 to 2.5 s window aligned to lever press for each pair of simultaneously recorded LC-NA neurons. For comparison, we also measured the signal correlation during the inter-trial interval. To compare signal reliability across sessions, we used the 9 ROIs that were tracked over multiple sessions. We set the first day of tracking an ROI as day 0 and calculated the signal drift index for subsequent sessions relative to the signal obtained at day 0. The signal drift index (SDI) allows us to measure trial-to-trial correlations across sessions and compare them between ROIs63. It was calculated using the following equation:

$${\rm{S}}{\rm{D}}{\rm{I}}=({{\rm{C}}{\rm{C}}}_{{\rm{w}}{\rm{s}}}-{{\rm{C}}{\rm{C}}}_{{\rm{b}}{\rm{s}}})/({{\rm{C}}{\rm{C}}}_{{\rm{w}}{\rm{s}}}+{{\rm{C}}{\rm{C}}}_{{\rm{b}}{\rm{s}}})$$
(6)

where CCws and CCbs represent the average trial-to-trial correlation within session and between the current and day 0 sessions, respectively. For fields of view with multiple axons, the trial-by-trial correlation was calculated for all trial types. The centre of mass of each axon was used to calculate the distance between axons. To measure the within-axon correlation, we selected two segments of an axon (average size: 310 ± 20 μm2) and calculated the correlation coefficient between the average signal from each segment and the signal from the whole axon.
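As an illustration of equation (6), the sketch below computes the SDI for one ROI from lists of single-trial traces. It rests on assumptions that the text does not settle: CCws is taken as the mean Pearson correlation over all pairs of trials within the current session, CCbs as the mean over all pairs formed by one trial from the current session and one from day 0, and all traces are assumed to have the same length. The function names are hypothetical.

```python
from itertools import combinations, product
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length traces."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def signal_drift_index(trials_day0, trials_current):
    """SDI = (CC_ws - CC_bs) / (CC_ws + CC_bs) for a single ROI."""
    # Within-session: every pair of trials from the current session.
    cc_ws = mean([pearson(a, b) for a, b in combinations(trials_current, 2)])
    # Between-session: every current-session trial against every day 0 trial.
    cc_bs = mean([pearson(a, b) for a, b in product(trials_current, trials_day0)])
    return (cc_ws - cc_bs) / (cc_ws + cc_bs)


# Hypothetical traces: a stable ROI (same response shape on both days)
# and a drifted ROI (day 0 response uncorrelated with the current one).
current = [[0, 1, 2, 1], [0, 2, 4, 2]]
stable_day0 = [[0, 3, 6, 3], [1, 2, 3, 2]]
drifted_day0 = [[1, 0, 1, 2]]
sdi_stable = signal_drift_index(stable_day0, current)
sdi_drifted = signal_drift_index(drifted_day0, current)
```

Under these assumptions, an SDI near 0 indicates that between-session correlations match within-session correlations, and values approaching 1 indicate that the response has drifted away from its day 0 shape.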

To compare axonal calcium imaging to the GRABNE signal, we first computed the average GRABNE signal from the MC aligned to lever press for all four mice tested. We then compared the session average of each of the LC-NA:MC axons imaged (n = 43) to the average GRABNE signal. To measure the timing of the correlation of axonal calcium with GRABNE, we computed the normalized cross-correlation. To measure the overall correlation between axonal calcium and noradrenaline release, we computed Pearson’s linear correlation coefficient between each axon and GRABNE.

Multiple regression linear model

We modelled the LC-NA signal during behaviour by using a multiple regression linear model64,65,66. In this model, we assumed that LC-NA activity can be explained by a combination of temporal filters aligned to the timing of different task events. These temporal filters were fitted by creating a design matrix using the timing of light cue, tone onset, lever press, reward, and punishment as regressors. Each regressor was convolved with a set of basis functions, which consisted of a pulse function centred at the time of the event. Multiple copies of this function were created, each shifted in time by one time point, to cover an appropriate time period for each behavioural event. We used a period from 0 to 1.5 s for light cue, from −0.2 to 1.3 s for tone, from −1.1 to 0.3 s for press, and from −0.1 to 1.4 s for both reward and punishment predictors. Our design matrix used a total of 79 predictors.
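The time-shifted pulse predictors can be sketched as below, assuming the 10 Hz sampling rate to which the ΔF/F signal is resampled. With one predictor per time point in each window, the five events contribute 16 + 16 + 15 + 16 + 16 = 79 columns, which matches the total stated above. The function names and the input format (event times given as sample indices) are illustrative assumptions, and z-scoring of the predictors is omitted.

```python
FS = 10  # samples per second (the ΔF/F signal is resampled to 10 Hz)

# Time window (s) covered by the temporal filter of each event, from the text.
WINDOWS = {
    "light_cue": (0.0, 1.5),
    "tone": (-0.2, 1.3),
    "press": (-1.1, 0.3),
    "reward": (-0.1, 1.4),
    "punishment": (-0.1, 1.4),
}


def lags_for(window, fs=FS):
    """Integer sample lags spanning the window, both ends included."""
    start, stop = (round(edge * fs) for edge in window)
    return list(range(start, stop + 1))


def build_design_matrix(event_samples, n_samples):
    """Return (rows, labels): rows are time points, columns are predictors.

    event_samples maps an event name to the sample indices at which it occurred.
    """
    columns, labels = [], []
    for name, window in WINDOWS.items():
        for lag in lags_for(window):
            col = [0.0] * n_samples
            for s in event_samples.get(name, []):
                if 0 <= s + lag < n_samples:
                    col[s + lag] = 1.0  # unit pulse shifted by `lag` samples
            columns.append(col)
            labels.append((name, lag))
    rows = [list(r) for r in zip(*columns)]  # transpose to time x predictors
    return rows, labels


# Hypothetical trial: press at sample 50, reward at sample 53, 100 samples total.
X, labels = build_design_matrix({"press": [50], "reward": [53]}, 100)
```

Each fitted coefficient then corresponds to one time point of an event's temporal filter, so the coefficients of one event read in lag order trace out its kernel.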

To calculate the different temporal filters, we resampled the ΔF/F signal to a resolution of 10 Hz. We filtered the calcium data with a second-order lowpass Butterworth filter with a 4 Hz cut-off frequency. Predictors were z-scored before fitting. We then obtained the maximum-likelihood fitted coefficients for each predictor of the design matrix by using elastic net regression (MATLAB’s lassoglm function, with the distribution parameter set to normal, alpha set to 0.5, and lambda set to 5 × 10⁻⁴). To quantify the explanatory power of each task event, we computed the overall explained variance using fivefold cross-validation. Cross-validation folds were balanced to contain similar numbers of each trial type (hit, miss, correct-rejection and false alarm trials), and each fold in turn was left out of the fitting procedure. Thus, each model was fitted and tested on separate sets of data, reducing concerns of overfitting. The overall explained variance was calculated by averaging the 5 values of explained variance obtained with cross-validation.

To assess the contribution of each behavioural epoch, we created reduced models in which one of the behavioural variables was removed. To do so, we set all predictors representing that variable to zero in the design matrix. We computed the explained variance using fivefold cross-validation of that reduced model. The linear model contribution (LMC) was calculated by:

$${\rm{L}}{\rm{M}}{\rm{C}}=1-{{\rm{E}}{\rm{V}}}_{{\rm{R}}{\rm{e}}{\rm{d}}{\rm{u}}{\rm{c}}{\rm{e}}{\rm{d}}{\rm{m}}{\rm{o}}{\rm{d}}{\rm{e}}{\rm{l}}}\,/\,{{\rm{E}}{\rm{V}}}_{{\rm{F}}{\rm{u}}{\rm{l}}{\rm{l}}{\rm{m}}{\rm{o}}{\rm{d}}{\rm{e}}{\rm{l}}}$$
(7)

where EVReduced model and EVFull model are the explained variance of the reduced and full models, respectively. LMC values for the five behavioural variables were calculated for each cell individually. To identify clusters of LC-NA neurons based on the LMC of each of the five variables, we ranked cells by their peak linear model contribution.
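The reduced-model comparison in equation (7) can be sketched as follows. This is a simplified stand-in rather than the fitting procedure described above: ordinary least squares (NumPy's `lstsq`) replaces the elastic net fit, folds are contiguous blocks of samples rather than trial-balanced folds, and explained variance is assumed to be 1 − var(residual)/var(signal). The function names and the synthetic data are hypothetical.

```python
import numpy as np


def explained_variance(y, y_hat):
    """Fraction of the variance of y captured by the prediction y_hat."""
    return 1.0 - np.var(y - y_hat) / np.var(y)


def cv_explained_variance(X, y, n_folds=5):
    """Mean held-out explained variance over n_folds (plain least squares)."""
    folds = np.array_split(np.arange(len(y)), n_folds)
    scores = []
    for test_idx in folds:
        train = np.ones(len(y), dtype=bool)
        train[test_idx] = False
        # Append an intercept column, fit on training data only.
        A_train = np.column_stack([X[train], np.ones(train.sum())])
        beta, *_ = np.linalg.lstsq(A_train, y[train], rcond=None)
        A_test = np.column_stack([X[test_idx], np.ones(len(test_idx))])
        scores.append(explained_variance(y[test_idx], A_test @ beta))
    return float(np.mean(scores))


def linear_model_contribution(X, y, columns):
    """LMC = 1 - EV_reduced / EV_full, zeroing the predictors of one variable."""
    X_reduced = X.copy()
    X_reduced[:, columns] = 0.0
    return 1.0 - cv_explained_variance(X_reduced, y) / cv_explained_variance(X, y)


# Synthetic check: the signal depends on predictor 0 only.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = 2.0 * X[:, 0] + 0.1 * rng.normal(size=500)
lmc_driver = linear_model_contribution(X, y, [0])   # close to 1
lmc_other = linear_model_contribution(X, y, [1])    # close to 0
```

In the full model, `columns` would hold the indices of all time-shifted predictors belonging to one behavioural variable, so that the whole temporal filter of that variable is removed at once.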

Statistics and reproducibility

Throughout the paper we used the non-parametric two-sided Wilcoxon test or Mann–Whitney test to evaluate P values for paired and unpaired populations, respectively. P values for experiments with multiple conditions were computed using the Kruskal–Wallis test or one-way ANOVA with Tukey post hoc test. For P values computed using ANOVA, data distribution was assumed to be normal, but this was not formally tested. P values were adjusted with Bonferroni correction when using the Wilcoxon test for multiple comparisons. P values for binomial distributions were obtained using the normal approximation to the binomial test. For measuring the effect of photoinhibition on the behavioural response, or Ppress, we used hierarchical bootstrapping. The null distribution of ΔP (Ppress_LaserOff − Ppress_LaserOn) was calculated by resampling the mice and sessions with replacement 10⁵ times. Two-sided P values were defined as the likelihood of obtaining a ΔP lower or higher than the actual ΔP, under the null hypothesis that photoinhibition did not change the probability of lever press. Significance levels were set as *P < 0.05, **P < 0.01 and ***P < 0.001. To calculate the 95% confidence interval of a distribution we used bootstrapping, in which we resampled the data with replacement 10⁵ times. Sample sizes were not pre-determined before data acquisition. Data collection and analysis were not performed blind to the conditions of the experiments.
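One way to implement a two-level (mice, then sessions) bootstrap of ΔP is sketched below. Several details are assumptions, because the text does not specify them: trial counts are pooled across the resampled sessions, the two-sided P value is taken as twice the smaller tail of the resampled ΔP distribution relative to zero, and the number of resamples is kept small for speed. The data layout and function names are hypothetical.

```python
import random


def delta_p(sessions):
    """Pooled P(press | laser off) minus P(press | laser on)."""
    off = sum(s["off_press"] for s in sessions) / sum(s["off_n"] for s in sessions)
    on = sum(s["on_press"] for s in sessions) / sum(s["on_n"] for s in sessions)
    return off - on


def hierarchical_bootstrap(data, n_boot=1000, seed=0):
    """data maps each mouse to a list of session dicts; returns resampled ΔP."""
    rng = random.Random(seed)
    mice = list(data)
    resampled = []
    for _ in range(n_boot):
        sampled = []
        for mouse in rng.choices(mice, k=len(mice)):  # level 1: mice
            sessions = data[mouse]
            # level 2: sessions within the selected mouse
            sampled.extend(rng.choices(sessions, k=len(sessions)))
        resampled.append(delta_p(sampled))
    return resampled


def two_sided_p(resampled, null_value=0.0):
    """Twice the smaller tail of the resampled values around null_value."""
    below = sum(v <= null_value for v in resampled) / len(resampled)
    above = sum(v >= null_value for v in resampled) / len(resampled)
    return min(1.0, 2 * min(below, above))


# Hypothetical counts: presses and trial numbers with the laser off and on.
data = {
    "mouse1": [
        {"off_press": 80, "off_n": 100, "on_press": 50, "on_n": 100},
        {"off_press": 82, "off_n": 100, "on_press": 55, "on_n": 100},
    ],
    "mouse2": [
        {"off_press": 78, "off_n": 100, "on_press": 48, "on_n": 100},
        {"off_press": 85, "off_n": 100, "on_press": 60, "on_n": 100},
    ],
}
boot = hierarchical_bootstrap(data, n_boot=200)
p_value = two_sided_p(boot)
```

Resampling mice before sessions keeps sessions from the same animal together, so that the spread of the resampled ΔP reflects variability between animals as well as between sessions.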

Representative in vivo images as well as histological experiments were repeated independently in different mice with similar results for Fig. 4b (n = 11 imaging sessions), Fig. 5b (n = 18 LC:dmPFC and n = 18 LC:MC imaging sessions), Extended Data Fig. 1k (n = 7 mice), Extended Data Fig. 4a (n = 9 mice), Extended Data Fig. 5a (n = 3 mice), Extended Data Fig. 7b (n = 8 mice), Extended Data Fig. 8d, g (n = 18 LC:dmPFC and n = 18 LC:MC imaging sessions), Extended Data Fig. 9c (n = 6 mice) and Extended Data Fig. 10a, b (n = 7 mice).

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this paper.