1 Introduction

Episodic memory is a fundamental cognitive operation that links together the contents of a present experience (spatial, temporal, sensory, and emotional) for future recall (Eichenbaum, 2004; Pfeiffer, 2020; Dragoi & Tonegawa, 2011; Stachenfeld et al., 2017). The hippocampal formation (HPF) is a critical substrate for episodic memory formation and retrieval, with area Cornu Ammonis 3 (CA3) crucial for auto-associative memories (Rolls, 2018). Auto-association and pattern completion are two circuit functions that involve the storage of individual experiences and their recall from a partial cue, respectively (Rebola et al., 2017). Neurophysiological studies highlight that these experiences are represented by the concurrent firing of a group or groups of excitatory pyramidal cells (PCs), known as neuronal ensembles or cell assemblies (Buzsáki, 2010; Farooq et al., 2019). Additionally, empirical evidence reveals a synaptic basis for these experiences, in which the order and timing of spikes, acting through long-term spike-timing-dependent plasticity (STDP), are key factors in strengthening synaptic conductance at PC-PC synapses (Feldman, 2012).

Open questions stemming from CA3 as a substrate for memory concern the quality of the experiences remembered and the number of experiences stored: how well does CA3 recall experiences, and what is its memory capacity? Recall quality may depend on how the learned experience is encoded by cell assemblies and their corresponding connections, where changes can occur in the amplitude of excitatory postsynaptic potentials (Perez-Rosello et al., 2011) and in the number of AMPA receptors at the terminals of postsynaptic PCs (Feldman, 2012; Debanne et al., 1998; Mishra et al., 2016; Kakegawa et al., 2004). Additionally, the amount of information provided as a cue to these cells can lead to graded (weak, moderate, or strong) re-activation of the memory through pattern completion (Neunuebel & Knierim, 2014). From a dynamical systems lens, this may involve the CA3 network exhibiting attractor dynamics in response to a pertinent cue (Treves & Rolls, 1994; Hasselmo et al., 1995; Menschik & Finkel, 1998). These mechanisms depend not only on the specific input-output properties of CA3 PCs (Lazarewicz et al., 2002; Hemond et al., 2008, 2009), but also on a considerably diverse population of inhibitory interneurons (Ascoli et al., 2009).

Concerning network memory capacity, theoretical and empirical evidence suggests that four key factors determine the number of memories stored in CA3: the number of PCs, the probability of connection between PCs, the size of cell assemblies, and the amount of overlap between cell assemblies. Estimates for the number of neurons, the PC-PC connection probability, and the size of cell assemblies have been offered based on various assumptions (Almeida et al., 2007; Guzman et al., 2016; Treves & Rolls, 1991). Estimates have also been provided for the percentage of cells shared between cell assemblies; these shared cells provide a neural substrate for associations that enable representations of specific episodic memories (Gastaldi et al., 2021; Quian Quiroga, 2023). Based on these factors, estimates for the memory capacity of CA3 have been offered in rats, though, to our knowledge, not in mice. However, these estimates relied on network models that did not reflect the neural and connection type diversity of the CA3 circuit (Almeida et al., 2007; Guzman et al., 2016; Treves & Rolls, 1991).

Hippocampome.org is an open-access knowledge base of distinct neuron types in the rodent HPF (Wheeler et al., 2015, 2024). This resource identifies neuron types based on their primary neurotransmitter (glutamate or GABA) and the presence of axons and dendrites across distinct layers of each cytoarchitectonic area of the HPF: entorhinal cortex, dentate gyrus, CA3, CA2, CA1, and subiculum. Hippocampome.org provides for each neuron type experimental data regarding the expression of specific molecules (White et al., 2020), biophysical membrane properties (Ascoli & Wheeler, 2016), electrophysiological firing patterns in vitro and in vivo (Komendantov et al., 2019; Sanchez-Aguilera et al., 2021), and population size (Attili et al., 2019, 2022). Additionally, Hippocampome.org quantifies the connection probability and synaptic signals of directional pairs formed between a pre- and post-synaptic neuron type, known as potential connections, which are based on their axonal and dendritic distributions (Rees et al., 2017; Moradi & Ascoli, 2018, 2020; Tecuatl et al., 2021a, b). Also available on this web portal are computational models of neuronal excitability (Venkadesh et al., 2019) and short-term synaptic plasticity (Moradi et al., 2022) using the Izhikevich and Tsodyks-Markram formalisms, respectively.

Utilizing Hippocampome.org, we previously created a computational circuit model of the mouse CA3 that featured a selection of neuron types and potential connections chosen to represent the neural diversity of this area (Kopsick et al., 2023). Additionally, the in silico implementation of this model as a spiking neural network (SNN) in the GPU-based simulation environment CARLsim6 can capture the individual spike times of every neuron and can track changes in synaptic weight at each connection (Niedermeier et al., 2022). This makes the Hippocampome-derived CA3 SNN particularly useful for elucidating mechanisms of auto-association and pattern completion.

The present work investigates whether an SNN that reflects the scale, diversity, and biological properties of the mouse CA3 can form and retrieve patterns via cell assemblies. We demonstrate that this SNN has activity consistent with what has been observed in vivo, and that cell assembly formation and retrieval enable patterns to be auto-associated and then robustly completed from minimally informative cues. Additionally, we report that a range of assembly sizes can support pattern completion after a limited number of repeated presentations. Furthermore, when cells are shared between assemblies, auto-association and pattern completion remain nearly unaltered, suggesting that individual representations can be strongly retrieved while still providing a basis for overlapping experiences. Moreover, this finding offers a potential mechanism supporting a substantial expansion of memory capacity in the CA3 circuit.

2 Results

2.1 Can a full-scale CA3 SNN store and retrieve patterns via cell assemblies?

To answer this first research question, we utilize our full-scale SNN of the mouse CA3, which exhibited rhythmic network activity that was stable and robust in response to synchronous or asynchronous transient inputs, reflecting resting-state behaviors (Kopsick et al., 2023). This model consisted of 8 neuron types and 51 connection types and was instantiated with 84,053 neurons and 176 million connections (Fig. 1a; Tables 1 and 2). Starting from this architecture, we sought to understand how CA3 could embed experiences occurring during wakefulness via cell assemblies for later recall. To create cell assemblies, a symmetric STDP learning rule was implemented in the SNN (Mishra et al., 2016): \(\Delta w = A e^{-|\Delta t|/\tau}\), where \(A\) determines the peak amplitude of weight change, \(\tau\) is the decay time constant, and \(\Delta t\) is the time difference between the post- and pre-synaptic spikes. Values for each parameter were set to best approximate the symmetric exponential decay curve observed experimentally (Mishra et al., 2016) (Materials and Methods; Fig. 1b).
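As a minimal Python sketch of this learning rule (for illustration only; in the model the rule runs inside CARLsim6), the weight update can be written as follows, where the value of A follows from the Materials and Methods formula for an assembly size of 300:

```python
import numpy as np

def symmetric_stdp(delta_t_ms, A, tau_ms=20.0):
    """Symmetric STDP weight change (Mishra et al., 2016).

    delta_t_ms : post- minus pre-synaptic spike time, in ms
    A          : peak weight change at exact coincidence (assembly-size
                 dependent; see Table 4)
    tau_ms     : decay time constant (20 ms in this model)
    """
    return A * np.exp(-np.abs(delta_t_ms) / tau_ms)

# Potentiation depends only on the magnitude of the time difference,
# not on spike order (A = 0.0484 corresponds to assembly size 300):
print(symmetric_stdp(0.0, 0.0484))    # peak weight change
print(symmetric_stdp(+5.0, 0.0484))   # post fires after pre
print(symmetric_stdp(-5.0, 0.0484))   # pre fires after post: same magnitude
```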

Fig. 1

Full-scale CA3 SNN with long-term excitatory synaptic plasticity. (a) Circuit schematic of the CA3 SNN. Cell counts for each neuron type are displayed in the corresponding soma symbol, and probabilities of connection between pairs of neuron types are listed at points of axonal-dendritic overlap. (b) A broad symmetric STDP window promotes synaptic potentiation between concomitantly firing Pyramidal cells, reflecting each pattern. PC = Pyramidal cell; AAC = Axo-axonic cell; BC CCK+ = Basket CCK+ cell; BC = Basket cell; QuadD = QuadD-LM; BiC = Bistratified cell; MFA ORDEN = Mossy Fiber-Associated ORDEN

Table 1 Izhikevich parameters by neuron type
Table 2 Tsodyks-Markram parameters for each connection type in the model

We presented input patterns during a training phase that elicited concomitant firing in distinct subsets of PCs. This approach was inspired by a recent study (Guzman et al., 2016), which demonstrated through functional connectivity analysis and network modeling that cell assemblies formed within CA3 upon the application of different input patterns to subsets of CA3 PCs. In this work, each pattern lasted the length of a gamma cycle (20 ms) and was activated within an overarching theta cycle (200 ms), inspired by how cell assemblies are theorized to form in vivo according to a theta-gamma neural code (Buzsáki, 2010; Lisman & Jensen, 2013; Fig. 2a, c). After training, a degraded form of each input pattern was provided during a testing phase to evaluate the pattern completion capability of the SNN. Pattern degradation consisted of eliciting concomitant firing in a smaller subset of PCs than the subset used during training; the test consisted of ascertaining whether this subset could retrieve the full pattern during the second half of the gamma cycle through activation of recurrent PC connections (Fig. 2b, d).

Fig. 2

A theta-gamma training and testing protocol to investigate pattern completion within the CA3 SNN. (a) Training the SNN to store patterns involves the concomitant firing of (in this example) 275 PCs (red) during a theta time window. Two repetitions of a pattern are shown. Activity from a random selection of 500 PCs (black) and 10 interneurons of each type (spikes colored neither red nor black) are also shown. (b) Testing pattern completion involves activating a subset of PCs (red) which leads to pattern completion of the remaining subset (blue) during a theta time window. (c) Concomitant firing of PCs in (a) occurs during 20 ms gamma time windows. Inset: sparse firing of two representative neurons during pattern presentation. (d) Activation of the same subset of PCs and resultant pattern completion of the remaining subset in (b) during a 20 ms gamma time window. The time window utilized for computing pattern reconstruction accuracy is highlighted by a gold rectangle (Supplementary Fig. 4)

The full-scale network exhibited asynchronous population activity while patterns were not presented, with each neuron type firing at rates consistent with those observed for these types in vivo (Table 3). When patterns were presented, sparse firing of PCs was relegated primarily to assembly members, while the activity of each interneuron type remained similar to non-presentation periods (Fig. 3a, c). Between training and testing, all PC-PC synaptic weights were re-normalized via synaptic divisive downscaling based on the synaptic homeostasis hypothesis (Tononi & Cirelli, 2003, 2006, 2014). In order to test the specificity of auto-association and pattern completion, we trained the network with three distinct input patterns. Training (with 65 repetitions in this example) induced strong auto-association through the synaptic weights of PCs within the subset of PCs stimulated by each input pattern, thereby forming three cell assemblies. Synaptic weights between members of different assemblies and between PCs that did not belong to any assembly were similar to the synaptic weights before training had commenced (Fig. 3b). Strong auto-association within a subset of PCs stimulated by a given input pattern is indeed consistent with and expected from the cell assembly theory (Hebb, 1949; Supplementary Fig. 1). Stimulation of 50% of the input patterns provided during training (50% pattern degradation) led to robust activation of each assembly (Fig. 3d). Importantly, utilizing a normal distribution of starting PC-PC synaptic weights (consistent with the Hippocampome.org ranges (Moradi et al., 2022)), as opposed to a fixed value, did not alter these results (Supplementary Fig. 2). Furthermore, our CA3 model exhibited fixed point attractor dynamics in response to each of the three input patterns, as visualized by Principal Component Analysis (PCA; see Materials and Methods) (Supplementary Fig. 3), consistent with previously theorized roles of CA3 as an attractor network (Rolls & Treves, 2024).

Table 3 Firing rates (mean ± s.d.) for each neuron type as recorded in our model and in vivo
Fig. 3

Pattern completion in the CA3 SNN. (a) Activity from the entire CA3 SNN during one second of training. (b) Kernel density estimates of PC-PC synaptic weights (after synaptic downscaling) within assembly (red), between members of different assemblies (blue), between non-assembly members (green), and the initial (uniform) synaptic weights before training (dashed black). (c) Activity from the entire CA3 SNN during one second of testing the recall of three patterns. Degraded patterns are presented at the five hundred millisecond mark (orange window). (d) Activity from 825 Pyramidal cells (PC) and 10 interneurons of each interneuron type (spikes that are neither red nor blue) during the orange window in (c). Input to 138 PCs (50% pattern degradation) in each assembly (red) leads to robust activation of the remaining assembly members (blue)

In summary, we extended a previous data-driven, full-scale SNN of the mouse CA3 with experimentally-derived STDP and showed that (1) the network could store patterns via cell assemblies when trained with a biologically realistic stimulation protocol; and (2) cell assemblies retrieved their activity patterns when provided with only half of the original cue. This result allowed us to investigate the robustness of cell assembly retrieval across a variety of scenarios.

2.2 Can robust cell assembly retrieval occur across learning and with increasingly degraded cues?

A CA3 SNN capable of pattern storage and retrieval allows the characterization of two central aspects of auto-associative memory: the amount of repetition (learning) required for an experience to be stored and appropriately recalled, and the impact on performance when cues are degraded. Addressing these issues requires a metric to quantify the extent of pattern recall. To this aim, we defined pattern reconstruction based on a previously developed approach (Guzman et al., 2021) relying on Pearson correlation coefficients (PCCs): if the output pattern PCCs were greater than the input pattern PCCs, then pattern completion occurred (Materials and Methods). Our pattern reconstruction metric adapts this index to capture the degree of pattern completion by scaling the PCCs relative to the maximum value of 1, and converting the result to a percentage to obtain an intuitive expression of performance accuracy (Supplementary Fig. 4).
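In Python, the metric reduces to a one-line transformation of the two PCCs (a sketch; pcc_input and pcc_output are assumed to be computed beforehand as in Guzman et al., 2021):

```python
def reconstruction_accuracy(pcc_input, pcc_output):
    """Pattern reconstruction accuracy (%).

    pcc_input  : PCC between the training input and the (degraded) testing input
    pcc_output : PCC between the training output and the testing output
    """
    return 100.0 * (pcc_output - pcc_input) / (1.0 - pcc_input)

# Illustrative numbers: an input PCC of 0.5 and an output PCC of 0.9
# correspond to recovering 80% of the remaining distance to a perfect match.
print(reconstruction_accuracy(0.5, 0.9))  # 80.0
```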

With pattern reconstruction defined, we turn to the first question. We trained the CA3 SNN in sets of 5 presentations of the same three distinct input patterns as above, which created three corresponding cell assemblies. After each set of 5 presentations, we stored the synaptic weight matrices of the network to enable separate testing with 50% degraded input patterns. Interestingly, non-zero pattern reconstruction occurred with as few as 15 presentations of input patterns (Fig. 4a). Based on the second derivative of the reconstruction accuracy, 40 pattern presentations corresponded to the inflection point of most effective learning. Furthermore, a pattern completion plateau emerged at 55 presentations, with 65 and 95 presentations providing the strongest reconstruction accuracies, indicating the best pattern retrieval.

Fig. 4

The CA3 SNN is robust to pattern degradation across learning. (a) Pattern completion accuracy, quantified by pattern reconstruction with 50% pattern degradation, as a function of training. The star denotes the inflection point for most effective learning as defined by the second derivative of the accuracy curve, and the diamond and square denote the two best accuracy values on the plateau. (b) Reconstruction accuracy as a function of pattern degradation. With increased training, cell assemblies can withstand greater degradation of input patterns, but only up to the initial plateau. Results in both panels are from an assembly size of 275

Turning to the second question, we utilized network structures trained on 40, 65, and 95 pattern presentation sets to assess how increased pattern degradation (i.e., increasingly diminished pattern cues) impacted pattern retrieval. Remarkably, pattern reconstruction remained substantial until a steep drop-off at 70% pattern degradation, and only weak pattern reconstruction occurred with 95% pattern degradation for each of the three network structures (Fig. 4b). Additionally, the similar performance of networks trained on 65 and 95 input repetitions highlighted that training beyond the initial plateau does not improve performance at more extreme pattern degradations.

Taken together, these results show that the CA3 SNN reliably encoded and retrieved patterns after as few as 40 presentation sets and upon reactivation of only a minority of PCs belonging to a cell assembly.

2.3 What assembly sizes can support pattern completion?

Another fundamental question is that of memory capacity: how many experiences can the network store and recall without interference? To address this question, we first consider the simple scenario in which all cell assemblies are fully segregated, that is, no neuron belongs to more than one assembly. In this case, the number of cell assemblies supported by the CA3 network is given by the total number of CA3 pyramidal cells divided by the assembly size, i.e., the number of CA3 pyramidal cells constituting each assembly. This factor is related to the sparseness ratio (γ), defined as the percentage of cells activated during an experience (Almeida et al., 2007). Theoretical insights and experimental evidence from humans and rats offered constraints for γ; using these constraints as a guide, we tested cell assembly sizes between 50 and 600 (0.067% ≤ γ ≤ 0.8%; Almeida et al., 2007; Guzman et al., 2016; Treves & Rolls, 1991; Waydo et al., 2006; Bennett et al., 1997; Materials and Methods).
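The correspondence between assembly size and γ is simple arithmetic; the sketch below assumes roughly 75,000 CA3 PCs, the count implied by the stated γ bounds (exact per-type counts appear in Fig. 1a):

```python
# Sparseness ratio and fully segregated capacity for the tested bounds.
n_pc = 75_000  # assumed mouse CA3 PC count implied by the gamma bounds

for size in (50, 600):
    gamma_pct = 100.0 * size / n_pc     # sparseness ratio, in percent
    n_assemblies = n_pc // size         # capacity if no neuron is shared
    print(f"size={size}: gamma={gamma_pct:.3f}%, "
          f"segregated assemblies={n_assemblies}")
# size=50:  gamma=0.067%, segregated assemblies=1500
# size=600: gamma=0.800%, segregated assemblies=125
```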

We trained networks on 40, 65, and 95 presentation sets to create assemblies of variable size and tested on patterns degraded by 50%. Interestingly, smaller assemblies performed best with fewer presentations (40 sets), while larger assemblies performed best with more presentations (65 and 95 sets) (Fig. 5a). Additionally, there was a stable range of assembly sizes between 150 and 600 in which reconstruction accuracy improved with more training; the best performance occurred for an assembly size of 275 (0.33% of the total network size). Assembly sizes smaller than 150 performed worse with additional training due to pattern interference (Supplementary Fig. 5). Furthermore, as observed in the previous section, the choice of either 65 or 95 presentation sets within this range conferred similar pattern reconstruction accuracy. Moreover, application of a different synaptic downscaling method (subtractive normalization) or not normalizing synaptic weights at all led to comparable reconstruction accuracies; however, without downscaling, the stable range of assembly sizes was narrower, as accuracy decreased with additional training (Supplementary Fig. 6).

Fig. 5

A range of cell assembly sizes can support robust pattern completion. (a) Reconstruction accuracies for a range of cell assembly sizes throughout learning with 50% pattern degradation. (b) Effect of assembly size on reconstruction accuracy with increased pattern degradation levels of 30, 70, and 97.5%

Utilizing SNNs trained on 40 and 65 presentation sets, we further tested pattern completion for the range of assembly sizes with increased pattern degradation percentages of 70 and 97.5% (Fig. 5b). Notably, assembly sizes of 100 and 150 displayed the best pattern completion in response to these highly degraded input patterns and exhibited weak pattern completion even when only 2.5% of an input pattern was provided. Therefore, in the presence of severely degraded input patterns, smaller assembly sizes (100 and 150) performed best in the SNN, whereas across moderate to high degradation levels an assembly size of 275 offered the best performance.

2.4 Can a full-scale CA3 SNN store and recall overlapping cell assemblies?

Our analysis so far assumed that no neuron could belong to more than a single cell assembly, but this is not necessarily the case in biological circuits. In fact, the extent of assembly overlap constitutes another key factor in determining memory capacity, because sharing neurons between cell assemblies can increase the number of experiences the network can encode (Quian Quiroga, 2023). Moreover, neurons shared between cell assemblies may facilitate hetero-association between episodic memories in CA3 (Gastaldi et al., 2021). Therefore, we investigated the storage and retrieval of patterns in the CA3 SNN when cell assemblies shared a subset of neurons (Fig. 6a).

Fig. 6

Overlapping cell assemblies support robust pattern completion. (a) Schematic of cell assembly overlaps. Three assemblies (red, blue, and yellow) of neurons (circles) and connections (lines), with shared cells (green, purple, and orange circles). Black circles and lines represent non-assembly neurons. The external arcs indicate the extent of overlaps. (b) Overlapping cell assemblies display similar reconstruction accuracy to assemblies without overlap throughout learning at 50% pattern degradation. (c) Overlapping cell assemblies perform comparably in reconstruction accuracy to assemblies without overlap when pattern degradation is increased. Results from (b) and (c) are from an assembly size of 275. (d) Overlapping cell assemblies have comparable reconstruction accuracy to assemblies without overlap across a range of assembly sizes at 50% pattern degradation. Bars in (b-d) reflect standard deviation of accuracy across three simulations with randomized selection of overlapping cells

To create (initially modest) overlaps between the three cell assemblies, we randomly selected 5% of neurons as shared between each pair of assemblies before training commenced (Materials and Methods). Following the usual procedure for storing cell assemblies and degrading input patterns by 50% during testing, overlapping cell assemblies retrieved patterns comparably to cell assemblies without overlaps (Fig. 6b). In particular, pattern reconstruction accuracy followed a similar trajectory with overlapping cell assemblies and had the same optimal point for learning and the same point of highest accuracy, which occurred at 40 and 65 presentations, respectively. Additionally, testing the overlapping cell assemblies in the presence of increased pattern degradation after training with 40 and 65 pattern presentation sets yielded reconstruction accuracies similar to those without overlap (Fig. 6c). Furthermore, in the presence of 5% overlap, cell assembly sizes between 200 and 600 supported strong pattern completion, again consistent with the range found for assemblies without shared cells (Fig. 6d); notably, however, overlap reduced the performance of smaller cell assemblies in the 50–150 range.

Next, we varied the overlap percentage from 1 to 50% for an assembly size of 275 trained on 40 presentation sets and tested with 50% degraded input patterns. Remarkably, pattern reconstruction accuracy changed only minimally up to 20% overlap and remained above 30% even at 50% overlap. However, pattern specificity, which we defined as the firing within the cued assembly relative to the firing within the non-cued assemblies (see Materials and Methods), substantially decreased with overlap (Fig. 7).

Fig. 7

Overlaps decrease both pattern reconstruction accuracy and pattern specificity. Testing with 50% pattern degradation for an assembly size of 275 trained on 40 presentations. As overlap increases, assemblies become less able to withstand moderate pattern degradation, as measured by reconstruction accuracy (blue), and less likely to retrieve a specific pattern (red)

Auto-association and pattern completion of cell assemblies reflect the structural and functional components of memory formation and recall within the CA3 circuit, respectively, and SNNs can help reveal the underlying link between structure and function (Buzsáki, 2010; Lisman, 1999). We investigated this relationship by tracking two characteristics of PC-PC synapses throughout training: the auto-association signal-to-noise ratio (SNR) and the percentage of assembly synapses that had reached the maximum weight (Materials and Methods). It is especially interesting to analyze if and how these characteristics relate to the observed pattern completion performance. In this regard, we observed an auto-association SNR plateau in assemblies trained both with and without overlap: further improvements in reconstruction accuracy became inconsequential above 94% of the maximum SNR (Fig. 8a). This is consistent with the influence of the number of presentations on pattern completion, where training beyond 60 presentation sets did not significantly improve retrieval (cf. Fig. 6b). Furthermore, at both 50% and 70% pattern degradation, reconstruction accuracy reached values close to optimal performance when only 10% of assembly synapses had reached their maximum weight, with or without overlap (Fig. 8b). This indicates that effective learning in the CA3 SNN does not require synaptic saturation.

Fig. 8

Relationship between pattern completion performance and synaptic characteristics of the CA3 SNN. (a) Auto-association of cell assemblies throughout learning highlights comparable reconstruction accuracy as a function of maximum signal-to-noise ratio (SNR) regardless of overlap percentage. Inset: Zoomed-in view near maximum auto-association SNR. (b) Pattern completion as a function of the percentage of synapses at maximum weight demonstrates that optimal performance does not require synaptic saturation

Taken together, these results highlight that strong retrieval occurs at moderate SNRs and when most assembly synapses are below their maximum weight. Moreover, overlapping cell assemblies retrieve patterns comparably to non-overlapping assemblies, supporting the use of shared neurons to enhance auto-associative memory capacity.

3 Discussion

The present work demonstrates that a biologically realistic SNN of the mouse CA3, with cell type-specific parameters of neuronal excitability, connection probabilities, and synaptic signaling all extracted from experimental measurements, can store and recall auto-associative memories via cell assemblies. Notably, cell assembly formation and retrieval rely on a training and testing paradigm grounded in in vivo neurophysiology (Lisman & Jensen, 2013). In particular, strong pattern reconstruction reliably occurs in the SNN in response to heavily incomplete or degraded input cues. Furthermore, auto-associative pattern completion in our model is robust across a broad range of assembly sizes and in the presence of assembly overlap, two critical factors in determining auto-associative memory capacity in CA3.

Training our CA3 SNN to an optimum point enabling strong pattern completion, yet well before most assembly synapses reach their maximum weights, may reflect how the real CA3 stores and recalls memories. Rather than maximizing post-synaptic conductance in pyramidal cells, a tradeoff with synaptic downscaling (possibly during slow-wave sleep) could support the storage of many patterns with minimal pattern interference. Additionally, these insights may be useful in training artificial neural networks, where training a network on multiple tasks in parallel with reasonable performance, instead of optimizing accuracy on a single task, could prevent catastrophic forgetting (Kemker et al., 2018; Kumaran et al., 2016).

The hippocampus may facilitate “one-shot” learning, i.e., rapid memory encoding from just a single experience (Moser & Moser, 2003). In a previous study, training a rat CA3 network model to store patterns with a clipped Hebbian plasticity rule enabled encoding of these memories in a one-shot manner (Guzman et al., 2016). However, one-shot encoding may not be prominent in the real rodent hippocampus, as animals typically spend weeks learning a spatial or novel object location memory task before testing begins, and even then strong performance often requires multiple trials (Pfeiffer, 2020; Nakazawa et al., 2002; Montgomery & Buzsáki, 2007; Neves et al., 2022; Siegle & Wilson, 2014). During this time the hippocampus goes through many encoding, consolidation, and retrieval phases, in which theta, gamma, and sharp-wave ripples contribute to cell assembly formation, refinement, and recall (Buzsáki, 2019). Therefore, our simulation design subjected the CA3 SNN to a training phase representative of encoding during experience through theta-nested gamma oscillations (Lisman & Jensen, 2013). Moreover, modification of synaptic weights within assemblies between training and testing reflected synaptic downscaling during slow-wave sleep (González-Rueda et al., 2018). With this protocol, heavily degraded or incomplete cues reliably triggered strong pattern completion-mediated recall of experiences, in line with the expected role of CA3 in auto-associative memory.

Our results of robust pattern completion using circuit parameters measured from anatomical and physiological experiments complement and extend previous modeling work. A network model consisting of CA3 PCs and two interneuron types receiving inputs from the entorhinal cortex and dentate gyrus showed that, when patterns were strongly degraded, pattern completion could still occur within one recall cycle, known as simple recall (Bennett et al., 1997; Hummos et al., 2014). Neither that network model nor the rat CA3 network model mentioned above (Guzman et al., 2016), however, constrained the simulation based on both the size and the diversity of the CA3 circuit. Another recent model of pattern completion in CA3 reflected the mouse network size, but again not the neuronal and synaptic diversity (Sammons et al., 2024). Therefore, to our knowledge this work provides clear evidence of robust pattern completion in the most realistic full-scale SNN model of the mouse CA3 to date, one that includes data-driven, cell type-specific parameters of neuronal excitability, connection probabilities, and synaptic signaling.

The cell assemblies formed and retrieved in this work involved zero to 50% shared cells between them. It is likely that cell assemblies have at least some level of overlap, as randomly creating assemblies with a coding sparseness ratio of γ in a network of N cells would lead them to share \(\gamma^{2}N\) cells in common (Gastaldi et al., 2021). Our results with substantial overlap demonstrate that the neuronal and synaptic physiology of the CA3 circuit are well suited to support pattern completion even if non-zero overlap exists between assemblies in the mouse hippocampus in vivo, as recent empirical evidence demonstrates in the mouse primary visual cortex (Carrillo-Reid et al., 2019). Additionally, our results highlighting weak pattern specificity with high (≥ 40%) overlap might be improved by cholinergic presynaptic inhibition to prevent interference during encoding, as demonstrated in previous studies of pattern completion (Hasselmo et al., 1995; Hasselmo & Wyble, 1997).

The strong retrieval by smaller assemblies below 40 presentations, and their poor retrieval beyond 65 presentations, were due to the interplay of cell assembly dynamics with the STDP learning rule. Specifically, in our model, smaller assemblies have a higher learning rate (parameter A in Table 4), so as to balance out the within-assembly maximum synaptic weights after 100 presentations (cf. ‘Long-term synaptic plasticity’ in Materials and Methods). The STDP rule, however, is applied to all PC-PC synapses, including not only those between assembly members, but also those between non-assembly members. Thus, all PC-PC synapses are more strongly potentiated in the training of smaller assemblies than of their larger counterparts. As a consequence, assemblies with fewer than 100 cells reach a critical point during training with the emergence of large network activity due to strongly increasing weights at non-assembly member synapses. This activity in turn causes a further increase in the weights of all PC-PC synapses. For assembly size 75, this happens at 95 presentation sets, while for size 50, it happens at 65 sets, consistent with Fig. 5a. However, this never happens at the lower learning rates associated with larger cell assemblies. With the synaptic weights excessively strengthened throughout the network, application of downscaling, which brings the mean of all PC-PC synapses back to the before-training value, leaves the smaller cell assemblies without sufficient excitatory drive to complete the cued pattern, effectively preventing storage and retrieval. Notice that this mechanism is not binary, but continuous. For example, even after 65 presentations with assembly size 75, the synaptic weights between assembly and cross-assembly members (using the terminology of Fig. 3b and Supplementary Fig. 2a, b) became larger. This leads to more noise, reflected in partial interference between assemblies when recalling the patterns (Supplementary Fig. 5).

Table 4 Maximum synaptic conductances, weights, and learning rates for each assembly size

The best performance across the range of assembly sizes examined in this study, when considering varying levels of cue degradation, lengths of training, and overlap, occurred with an assembly size between 250 and 300 neurons. Intriguingly, 275 is approximately the square root of the number of PCs in the mouse CA3 network. It is tempting to speculate that hippocampal cell assemblies in vivo optimally form in accordance with the square root of the number of PCs, at least in rodents: based on the values of γ reported in previous studies, the square root relation would hold for rats, but not for humans. However, these estimates for γ are based on indirect evidence, including the number of hippocampal place cells active in each environment and the number of concept cells active when a concept is presented, based on simultaneous single- and few-neuron recordings (Almeida et al., 2007; Quiroga, 2012).

Estimation of memory capacity in CA3 has previously involved the use of the connection probability between CA3 PCs, c, and γ. Utilizing Willshaw’s formula, which estimates capacity for both non-overlapping and overlapping assemblies as \(P = c/\gamma^{2}\), the capacity of the mouse CA3 would be on the order of 2,000 patterns (Almeida et al., 2007). Another formula was proposed by Treves and Rolls, which considers the number of recurrent collateral (RC) connections onto each PC, \(C^{RC}\), a scaling factor reflecting the total amount of information that can be stored and retrieved from the RCs, k, and γ (Rolls, 2018; Treves & Rolls, 1991). Estimating capacity with their formula \(P = \frac{C^{RC}}{\gamma \ln(1/\gamma)} k\), the mouse CA3 could store on the order of 18,000 patterns. However, these formulas do not consider overlap directly as a variable, which may in principle allow a substantial increase in storage capacity. Collaborating with a different group of colleagues, one of the authors recently discovered an exact, closed-form solution for how memory capacity in our model varies as a function of network size, cell assembly size, and cell assembly overlap. The expression, derivation, proof, and analysis of such a mathematical formula will be the subject of a separate, forthcoming manuscript.
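The two published estimates can be reproduced to order of magnitude with a short sketch; the parameter values below are illustrative assumptions chosen to match the quoted figures, not the exact values used in the cited studies:

```python
import math

gamma = 0.0033   # sparseness ratio (~0.33%); assumption for illustration
c = 0.02         # PC-PC connection probability; assumption for illustration
n_pc = 75_000    # assumed PC count, so c * n_pc recurrent inputs per PC
C_RC = c * n_pc
k = 0.23         # information scaling factor; assumption for illustration

P_willshaw = c / gamma**2
P_treves_rolls = C_RC / (gamma * math.log(1.0 / gamma)) * k

print(f"Willshaw:     ~{P_willshaw:,.0f} patterns")     # ~1,800 (order 2,000)
print(f"Treves-Rolls: ~{P_treves_rolls:,.0f} patterns") # ~18,000
```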

The sigmoidal relationship between training amount and pattern reconstruction accuracy provides a means of comparison with how a real rodent learns and retrieves meaningful representations during memory-related tasks. During both goal-directed navigation tasks in an open field and in a four-arm maze, four task trials enabled performance above chance (Pfeiffer, 2022; Belchior et al., 2014). With 5 theta cycles usually occurring while a place cell occupies its preferred place (Lisman & Jensen, 2013), and given the equivalence in our model between one presentation and one theta cycle, performing above chance level at 4 task trials would correspond to 20 presentations, which intriguingly is the case in our model (Fig. 4a). When a rodent reaches its best performance likely depends on task complexity, as best performance in the open field and the four-arm maze occurs within 10 and 50 trials, respectively. Our result of best performance at 65 presentations (13 trials) may be more representative of open field foraging as opposed to navigation in a four-arm or an H-maze (Siegle & Wilson, 2014). Furthermore, as mentioned above, representations are also strengthened by sharp-wave ripples (Malerba et al., 2019), which occur predominantly between trials during quiet wakefulness and during sleep (Buzsáki, 2015). Thus, the time between trials must not be discounted when making future predictions of rodent memory task performance.

While our CA3 SNN supported pattern completion via strong cell assembly retrieval, it did so through simple recall, i.e., recall that occurs from a degraded cue within a single gamma cycle (Bennett et al., 1997). The degraded patterns we presented to our network thus served as cues that could be successfully recalled within a single cycle and consisted only of the activation of assembly members. Our work could be expanded to account for a different type of pattern degradation through progressive recall, i.e., the activation of both assembly and non-assembly member PCs in the testing cue, with retrieval extending beyond a single gamma cycle. Testing pattern degradation in this manner in future work would allow for more direct comparison with other CA3 models (Guzman et al., 2016).

The advent of large-scale recording technologies, including two-photon calcium imaging, Neuropixel probes, and hundred Stimulation Targets Across Regions (HectoSTAR), enabling the simultaneous monitoring of thousands of neurons, may soon make it feasible to measure more directly the size of hippocampal assemblies (Steinmetz et al., 2021; Zong et al., 2022; Vöröslakos et al., 2022). Such evidence might show that the size of assemblies in vivo could vary depending on the represented cognitive content, providing further guidance for how to extend our SNN model. Additionally, these recordings during cue mismatch tasks would pinpoint how many neurons are typically reactivated in response to degraded cues (Neunuebel & Knierim, 2014; Knierim, 2002), allowing a quantitative comparison with our results. Last but not least, large-scale recordings will likely highlight the variation in neuronal overlap between assemblies, facilitating the estimation of key factors determining the memory capacity of the CA3 circuit.

4 Materials and methods

4.1 Full-scale CA3 SNN

The selection of the neuron types constituting the CA3 SNN and the model parameters, including neuron type-specific excitability, population size, connection probabilities, and synaptic signaling, were developed and validated in prior work (Kopsick et al., 2023). Briefly, the SNN consists of PCs and seven interneuron types: Axo-axonic, Basket, Basket CCK+, Bistratified, Ivy, Mossy Fiber-Associated ORDEN (MFA-ORDEN), and QuadD-LM cells. The perisomatic targeting and axonal-dendritic overlaps between these eight neuron types give rise to 51 directional connections (Fig. 1a).

For each neuron type, we utilized experimentally-derived parameters from Hippocampome.org for both the neuronal input-output function, i.e., the spiking pattern produced in response to a given stimulation, and the neuron count. In particular, to balance biological realism with computational efficiency, we chose the Izhikevich 9-parameter, single-compartment dynamical systems framework (Izhikevich, 2007). The parameters reflect the following neuron type-specific properties: membrane capacitance (C), a constant that reflects conductance during spike generation (k), resting membrane potential (vr), instantaneous threshold potential (vt), a recovery time constant (a), a constant that reflects conductance during repolarization (b), spike cutoff value (vpeak), reset membrane potential (vmin), and a constant that reflects the currents activated during a spike (d). Hippocampome.org reports the parameter values that best fit the firing patterns reported in the literature for the corresponding neuron types (Venkadesh et al., 2019).
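For concreteness, a forward-Euler sketch of this 9-parameter model follows (the actual simulations integrate with 4th-order Runge-Kutta inside CARLsim6; parameter values per neuron type are in Table 1):

```python
def izhikevich_step(v, u, I, p, dt=0.2):
    """One 0.2 ms Euler step of the 9-parameter Izhikevich model.

    v, u : membrane potential (mV) and recovery variable
    I    : input current
    p    : dict with keys C, k, vr, vt, a, b, vpeak, vmin, d (Table 1)
    """
    dv = (p["k"] * (v - p["vr"]) * (v - p["vt"]) - u + I) / p["C"]
    du = p["a"] * (p["b"] * (v - p["vr"]) - u)
    v, u = v + dt * dv, u + dt * du
    fired = v >= p["vpeak"]       # spike cutoff reached
    if fired:
        v = p["vmin"]             # reset membrane potential
        u += p["d"]               # spike-triggered recovery increment
    return v, u, fired
```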

For neuron counts, we considered each neuron type in our network as a representative of its supertype family (hippocampome.org/morphology). Thus, the population size of each neuron type in the SNN is the sum of all neuron types of the given supertype. For example, the number of instantiated CA3 Axo-axonic cells in the model (i.e., the population size parameter value for this particular neuron type) consisted of the sum of Axo-axonic proper and Horizontal Axo-axonic cells (two variants of Axo-axonic neurons in CA3), which Hippocampome.org reports as 1,482 for the mouse. The population sizes and the 9 Izhikevich parameters for each of the 8 CA3 neuron types are shown in Fig. 1a and listed in Table 1, respectively.

Modeling neuron type-specific communication involves a description of the postsynaptic signal caused by a presynaptic spike and related short-term plasticity (STP), as well as the connection probability and delay between the presynaptic and the postsynaptic neuron types. We modeled synaptic dynamics with the 5-parameter Tsodyks-Markram framework (Tsodyks et al., 1998), for which Hippocampome.org reports experimentally-derived pre- and post-synaptic neuron type-specific values (Table 2): synaptic conductance (g), decay time constant (τd), resource recovery time constant (τr), resource utilization reduction time constant (τf), and portion of available resources utilized on each synaptic event (U). Note that this formalism captures unitary synaptic communication. As such, it reflects the total somatic effect of all synapses corresponding to connected neuron pairs. In the simulations involving a normal (as opposed to constant) distribution of initial weights, we derived the mean (0.5531 nS), standard deviation (0.1201 nS), minimum (0.3372 nS), and maximum (0.8971 nS) conductance values (g) for the PC-PC synapses from the ranges provided by Hippocampome.org (Moradi et al., 2022). Given the local scope of the CA3 circuit, all connections were modeled with a synaptic delay of 1 ms. Hippocampome.org also provides morphologically derived connection probabilities for each directional pair of rat neuron types (Tecuatl et al., 2021a, b), which we scaled for the mouse according to a fixed anatomical sizing ratio (Tecuatl et al., 2021a, b). The probabilities for all 51 connection types in the circuit are reported in Fig. 1a.
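A minimal sketch of this 5-parameter synapse model follows (illustrative Euler integration; the g, τd, τr, τf, and U values in the example call are placeholders rather than Table 2 entries):

```python
import numpy as np

def tm_synapse(spike_times_ms, g, tau_d, tau_r, tau_f, U,
               dt=0.2, t_max_ms=200.0):
    """Conductance trace of a Tsodyks-Markram synapse driven by a spike train."""
    x, u, g_syn = 1.0, 0.0, 0.0    # resources, utilization, conductance
    spike_bins = {int(round(t / dt)) for t in spike_times_ms}
    trace = np.empty(int(t_max_ms / dt))
    for i in range(trace.size):
        if i in spike_bins:
            u += U * (1.0 - u)     # utilization (facilitation) jump
            g_syn += g * u * x     # postsynaptic conductance increment
            x -= u * x             # resource depletion
        x += dt * (1.0 - x) / tau_r    # resource recovery
        u -= dt * u / tau_f            # utilization decay
        g_syn -= dt * g_syn / tau_d    # conductance decay
        trace[i] = g_syn
    return trace

# A 50 Hz train of five presynaptic spikes; depression vs. facilitation
# depends on the connection type's parameters.
trace = tm_synapse([0, 20, 40, 60, 80], g=0.55, tau_d=5.0,
                   tau_r=500.0, tau_f=50.0, U=0.25)
```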

Every instantiation of the simulation thus contained 84,053 neurons and 176 million synaptic connections on average. To elicit activity in the SNN, each neuron received a lognormal background current to model the upstream inputs CA3 receives from the dentate gyrus and entorhinal cortex (Mizuseki et al., 2013; Buzsáki & Mizuseki, 2014). The inputs were constrained to match the mean firing rates of each neuron type in the model with those observed in vivo (Table 3).

4.2 Range of assembly sizes

In order to define a range of assembly sizes with which to evaluate auto-association and pattern completion, we first considered the sparseness ratio of neural coding, γ, which is the average fraction of cells activated during an experience (Almeida et al., 2007). Available estimates for γ in humans and rats varied only slightly, from 0.1% (Guzman et al., 2016), through 0.23% (Waydo et al., 2006), to 0.3% (Almeida et al., 2007). This would correspond, for the number of PCs in mouse CA3, to a range of sizes between 75 and 225. The authors of the latter cited study, however, accompanied their estimate for assembly size (225) with a wider range (150–300), as well as cautionary lower and upper bounds of a factor of 2 in either direction (Neunuebel & Knierim, 2014). Furthermore, in the absence of precise experimental determinations, smaller values of γ could allow for larger storage capacity as long as recall from partial input could be maintained (Bennett et al., 1997). Based on these lines of reasoning, we set bounds of 0.067% ≤ γ ≤ 0.8%, corresponding to a range of assembly sizes between 50 and 600.

4.3 Long-term synaptic plasticity

In line with the notion that cell assemblies form via long-term plasticity (Miles et al., 2014), we adopted a symmetric (Hebbian) spike-timing dependent plasticity (STDP) learning rule between PCs (Mishra et al., 2016). Importantly, this symmetric STDP rule was observed between CA3 PCs in hippocampal CA3 slices of adult rodents, as opposed to a previous study of STDP that reported anti-symmetric STDP in cultured hippocampal neurons (Bi & Poo, 1998). The symmetric STDP was implemented as \(\Delta w = A e^{-|\Delta t|/\tau}\). Here, \(\Delta w\) is the change in synaptic weight, \(A\) determines the weight change when the pre- and post-synaptic neurons fire at exactly the same time, \(\tau\) is the plasticity decay time constant, and \(\Delta t\) is the temporal difference between the post- and pre-synaptic spikes. The value for \(\tau\) was set to 20 ms, which best approximated the symmetric exponential decay curve observed experimentally for CA3 PCs (Mishra et al., 2016) (Fig. 1b). The values for \(A\) varied based on the maximum CARLsim6 synaptic weight \((w_{max}^{*})\) between PCs, which in our model depended on cell assembly size. Specifically, since the firing of each PC is triggered by the convergent integration of all activated presynaptic PCs, we reasoned that the maximum synaptic weight of each synapse should be inversely proportional to the number of PCs in an assembly.

In initial pilot testing with an assembly size of 300, we found that a value \(w_{max}^{*} = 20\) induced strong auto-association after 100 input pattern presentations. Therefore, we anchored the scaling of the maximum synaptic weight with assembly size to this value: for instance, SNNs with an assembly size of 150 or 600 would have a \(w_{max}^{*}\) of 40 or 10, respectively. We then derived \(A\) so as to allow the synaptic weight to increase from the initial value before training (\(w_{init}^{*} = 0.625\) in all our simulations) to \(w_{max}^{*}\) if all pre- and post-synaptic spikes were exactly coincident during training in the initial pilot settings. Since each of the 100 randomized spike trains during training contains 4 spikes on average, the resulting formula was \(A = \frac{w_{max}^{*} - 0.625}{400}\), where \(w_{max}^{*} = \frac{6000}{size}\). Table 4 reports the maximum total synaptic conductance \((g_{max} = w_{max}^{*} \times g)\) and \(w_{max}^{*}\) between PCs and \(A\) for each assembly size used.
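As a worked example of these formulas: for the pilot assembly size of 300, \(w_{max}^{*} = 6000/300 = 20\) and \(A = (20 - 0.625)/400 \approx 0.0484\); halving the assembly size to 150 doubles \(w_{max}^{*}\) to 40 and raises \(A\) to \((40 - 0.625)/400 \approx 0.0984\), which is why smaller assemblies carry higher learning rates (cf. Discussion).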

4.4 Network training and testing protocol

Formation and retrieval of cell assemblies occurred in the CA3 SNN through dedicated training and testing phases. During the training phase, the SNN was presented with three input patterns, each of which consisted of the injected current required to activate firing in a specific subset of PCs matching the size set for an assembly. The current injections triggered in each PC a randomized train of four spikes during a 20 ms (gamma) time window, with 200 ms (theta) time windows separating successive pattern presentations. This protocol of patterns presented at 50 Hz within an encompassing 5 Hz rhythm (“theta-gamma neural code” (Buzsáki, 2010; Lisman & Jensen, 2013; Bezaire et al., 2016)) resulted in the formation of three unique cell assemblies. After the initial randomization of spike trains in the first input pattern (the first presentation), the same pattern was provided to each subset of PCs in every subsequent presentation, i.e., each stimulation pattern provided to a given subset of PCs was identical across all presentations.
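A sketch of this stimulation protocol (hypothetical helper names; in the model itself the spike trains are realized as injected currents in CARLsim6):

```python
import numpy as np
rng = np.random.default_rng(seed=0)  # drawn once, then reused verbatim

def make_presentation(assemblies, gamma_ms=20.0, theta_ms=200.0, n_spikes=4):
    """One presentation: each assembly fires inside its own 20 ms gamma
    window, with successive patterns separated by 200 ms theta windows.

    Returns (pc_id, spike_time_ms) pairs; each PC gets its own randomized
    four-spike train, frozen after the first presentation.
    """
    return [(pc, i * theta_ms + t)
            for i, assembly in enumerate(assemblies)
            for pc in assembly
            for t in np.sort(rng.uniform(0.0, gamma_ms, n_spikes))]

# Three assemblies of 275 PCs each (the example size used in Results).
assemblies = [range(0, 275), range(275, 550), range(550, 825)]
presentation = make_presentation(assemblies)
```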

Between training and testing, for network structures trained on successive presentations of five patterns, e.g., structures trained on 5, 10, …, 100 pattern sets, each synaptic weight \((w^{*})\) between PCs was divided by the same factor such that the average \(w^{*}\) across all PC-PC synapses returned to \(w_{init}^{*}\). Rescaling synaptic weights in this manner is theorized to occur during slow-wave sleep, preserving synaptic weight distributions without eliminating the auto-association between assembly member PCs (Tononi & Cirelli, 2003, 2006, 2014). Additionally, re-scaling the network after every 5th pattern presentation was consistent with evidence that 5 theta cycles (corresponding to 5 pattern presentations) elapse while a mouse occupies its preferred place field when performing a spatial memory task (e.g., a T-maze, Y-maze, or open field foraging task) (Lisman & Jensen, 2013). This design ensures a proper training interval equivalent to one task trial between periods of rescaling.
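Divisive downscaling is a one-line operation, sketched below:

```python
import numpy as np

def divisive_downscale(w, w_init=0.625):
    """Divide every PC-PC weight by a common factor so that the mean
    returns to its before-training value, preserving the relative
    (assembly vs. non-assembly) weight structure."""
    return w * (w_init / np.mean(w))

# Illustrative weights after training: the potentiated assembly synapse
# remains relatively strong, yet the mean returns exactly to 0.625.
w = np.array([0.625, 0.65, 2.5, 0.7])
print(divisive_downscale(w).mean())  # 0.625
```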

Testing pattern completion involved providing degraded input patterns to the SNN during gamma and theta time windows as performed during training. Degradation of input patterns consisted of decreasing the percentage of assembly PCs firing together within the designated 20 ms period. The percentage of pattern degradation in this work ranged from 25 to 97.5%.

To visualize the network attractor dynamics during the testing phase (Supplementary Fig. 3), we first divided the spikes for each neuron in the network into successive, nonoverlapping 10 ms bins, yielding an 84,053 (# of neurons) × 100 (# of time steps) activity matrix. We then performed Principal Component Analysis (PCA) on this matrix to project the high-dimensional activity onto a 3D principal component space (Mazor & Laurent, 2005; Cunningham & Yu, 2014).
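A sketch of this projection (assuming spike times are available as a dict from neuron id to spike times; scikit-learn is used here for PCA, though any implementation would do):

```python
import numpy as np
from sklearn.decomposition import PCA

def attractor_trajectory(spike_times, n_neurons=84_053,
                         bin_ms=10.0, t_max_ms=1_000.0):
    """Bin spikes into nonoverlapping 10 ms bins and project each
    population vector onto the first three principal components."""
    n_bins = int(t_max_ms / bin_ms)                 # 100 time steps
    X = np.zeros((n_neurons, n_bins))
    for nid, times in spike_times.items():
        idx = (np.asarray(times) / bin_ms).astype(int)
        np.add.at(X[nid], idx[idx < n_bins], 1)     # spike counts per bin
    # Each 10 ms population vector becomes one 3D point of the trajectory.
    return PCA(n_components=3).fit_transform(X.T)   # shape (100, 3)
```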

4.5 Quantification of auto-association and pattern completion

The capability of the SNN to form cell assemblies was investigated by quantifying two features of PC-PC synapses. The first one was the auto-association signal-to-noise ratio (SNR), defined as the mean synaptic weight between assembly member PCs divided by the mean synaptic weight between non-assembly member PCs. Thus, the higher the ratio, the stronger the auto-association of the formed cell assemblies relative to the rest of the CA3 network. The maximum auto-association SNR for each network structure investigated would occur if all assembly and non-assembly members had reached the downscaled maximum and minimum synaptic weights, respectively. The second quantified feature was the percentage of all assembly member synapses that had reached the maximum synaptic weight.
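Both synaptic readouts reduce to simple statistics over the PC-PC weight matrix, as sketched below:

```python
import numpy as np

def assembly_weight_stats(w_within, w_outside, w_max):
    """Auto-association SNR and synaptic saturation (cf. Fig. 8).

    w_within  : weights between PCs of the same assembly
    w_outside : weights between non-assembly member PCs
    w_max     : maximum (downscaled) synaptic weight
    """
    snr = np.mean(w_within) / np.mean(w_outside)
    pct_saturated = 100.0 * np.mean(np.isclose(w_within, w_max))
    return snr, pct_saturated
```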

Pattern completion via cell assembly retrieval was assessed with a metric we called pattern reconstruction. First, Pearson correlation coefficients (PCCs) were computed from the training and testing inputs and from the training and testing outputs as described previously (Guzman et al., 2021). Pattern reconstruction accuracy was then computed as the difference between the output and input PCCs divided by the difference between the maximum PCC (1) and the input PCC, multiplied by 100 to obtain a percentage (Supplementary Fig. 4). Therefore, a non-zero pattern reconstruction accuracy would mean a cell assembly was retrieved, with 100% accuracy meaning perfect assembly retrieval. For the extended analysis of overlap (Fig. 7) we introduce a second metric, pattern specificity, meant to capture the activity in the cued assembly (Ac) relative to the activity in the non-cued assemblies (An). Specifically, we define pattern specificity (expressed as a percentage) as \(100 \times \frac{\#PC(A_{c}) - \sum \#PC(A_{n})/\#A_{n}}{\#PC(A_{c})}\), where \(\#PC(A_{c})\) is the number of pyramidal cells firing in the cued assembly, \(\#PC(A_{n})\) is the number of pyramidal cells firing in the non-cued assemblies, and \(\#A_{n}\) is the number of non-cued assemblies.
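A sketch of this second metric:

```python
import numpy as np

def pattern_specificity(n_fired_cued, n_fired_noncued):
    """Pattern specificity (%): firing in the cued assembly relative to
    the mean firing across the non-cued assemblies.

    n_fired_cued    : # of PCs firing in the cued assembly
    n_fired_noncued : list of # of PCs firing in each non-cued assembly
    """
    return 100.0 * (n_fired_cued - np.mean(n_fired_noncued)) / n_fired_cued

# Example: 250 PCs fire in the cued assembly while the two non-cued
# assemblies average 25 spuriously active PCs -> 90% specificity.
print(pattern_specificity(250, [20, 30]))  # 90.0
```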

4.6 Overlap of cell assemblies

Associations between episodic memories in CA3 may be encoded by neurons shared between cell assemblies (Gastaldi et al., 2021; Quian Quiroga, 2023). An overlap of 5% was selected based on the finding that an overlap of 4–5% is suitable for recall of individual and overlapping assemblies (Gastaldi et al., 2021), and on the observation that pattern reconstruction accuracy at 97.5% pattern degradation was close to zero for assembly size 275 (1.68%), meaning that overlapping memories would not interfere with one another. Shared cells between each of the three assemblies were randomly selected before training commenced, and the same training and synaptic downscaling procedures used in the absence of shared cells (0% overlap) were applied to obtain and normalize, respectively, the weights between overlapping cell assemblies. For testing pattern completion with overlaps, equal proportions of overlapping and non-overlapping cell assembly members were selected for stimulation, e.g., for an assembly size of 300 tested with a degraded pattern of 50%, 135 non-overlapping members and 15 overlapping members were randomly selected to activate each of the three assemblies.
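A sketch of this cue-selection rule (hypothetical helper; ids are drawn without replacement and in equal proportions from both pools):

```python
import numpy as np
rng = np.random.default_rng(seed=0)

def build_cue(non_overlap_ids, overlap_ids, degradation=0.5):
    """Select the stimulated subset of one assembly at a given degradation
    level, keeping overlapping and non-overlapping members in equal
    proportion."""
    keep = 1.0 - degradation
    pick = lambda ids: rng.choice(ids, int(keep * len(ids)), replace=False)
    return np.concatenate([pick(non_overlap_ids), pick(overlap_ids)])

# Assembly size 300 with 5% overlap per pair (30 shared cells) at 50%
# degradation: 135 non-overlapping + 15 overlapping members are cued.
cue = build_cue(np.arange(270), np.arange(270, 300), degradation=0.5)
print(len(cue))  # 150
```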

4.7 Model implementation and execution

The CA3 model was implemented in CARLsim6 (Niedermeier et al., 2022), which utilized the 4th-order Runge-Kutta numerical integration method with a fixed time step of 0.2 ms (Butcher, 1996). The durations of the simulations that trained and tested the networks were 70 s and 1 s, respectively. Instantiation and execution of the network model were performed on single 40 and 80 GB VRAM Tesla A100 GPUs on the George Mason University High Performance Computing Cluster (Hopper). Hopper, which contained more than one hundred such GPUs, allowed for efficient and flexible simulation that greatly reduced the time needed to test different training and testing paradigms. Simulation results were loaded and visualized in MATLAB with CARLsim6’s Offline Analysis Toolbox (OAT). Additional custom-built functions for data analysis were written in Python and MATLAB. All scripts developed are available open source at github.com/jkopsick/cell_assembly_formation_retrieval.