The role of higher-order interactions in neuroscience has been actively debated at both microscopic and macroscopic levels over the last decade. In both cases, there is evidence that higher-order terms are present, yet in many instances it is still unclear to what degree such interactions dominate, or are dominated by, pairwise interactions. To further complicate the matter, there is some confusion about what exactly is meant by higher-order interactions or effects, since they might be encoded as many-body coefficients in spin models of neuron firing, as hypergraphical structures for population coding, or even as homological properties of the functional spaces of whole-brain activations. The landscape of higher-order effects in neuroscience includes contributions in which higher-order interactions feature as a fundamental dynamical unit, as a methodological tool, and, at times, as both. In this chapter, we discuss these different aspects moving from the microscopic to the macroscopic scale, while explicitly highlighting the role that higher-order interactions take in the different cases.

Specifically, we describe how higher-order interactions have been introduced and measured in the context of neuronal populations and in coding theory. We then discuss recent applications of topological data analysis to whole-brain data, and finally highlight the challenges related to promoting signals from low-order to higher-order interactions.

17.1 Higher-Order Interactions and Descriptions at the Neuronal Scale

Neuronal activation is the atomic unit of brain activity, and the firing patterns of groups of neurons underpin human cognition and behaviour. These patterns are not unique and do reoccur, showing that neurons communicate and generate spatio-temporal correlations in their firing activity. Interestingly, even cortical slices in a Petri dish display non-trivial spatio-temporal correlation patterns [5] and can be used to show the delicate neurochemical balance underpinning neural activation [59]. Measuring neural activity is inherently difficult. The size and density of neurons make it difficult to isolate the activation of a single neuron, and electrode arrays typically measure the firing activity of a group of neurons. Measurements taken from live subjects are invasive, as they require implanting electrodes. Most experimental data therefore come from animal studies, with the exception of measurements obtained from subjects suffering from certain forms of epilepsy or neurodegenerative disorders. Computational models are commonly used to generate data, but are, of course, no substitute for the real thing. Despite these limitations on data size (number of electrodes, length of clean time series) and on data quality (the artificiality of simulated data), important work has emerged on the role of higher-order interactions in neural coding.

Neuronal activity can be encoded as a two-state variable, inactive or firing. A paradigmatic model for interacting binary variables is the Ising spin model, which can easily be extended to include interactions of any order, i.e. pairwise, three-way, and higher-order analogues. The probability of a given configuration \((\sigma _1,\ldots ,\sigma _N)\) of a population of N neurons is given by the maximum entropy distribution, also known as the Gibbs distribution:

$$\begin{aligned} p(\sigma _1,\ldots ,\sigma _N)\propto \exp \left( \sum _i \alpha _i\sigma _i+\sum _{i<j} \beta _{ij}\sigma _i\sigma _j+\sum _{i<j<k} \gamma _{ijk}\sigma _i\sigma _j\sigma _k+\cdots \right) , \end{aligned}$$
(17.1)

where the coupling parameters \(\alpha _i\), \(\beta _{ij}\), \(\gamma _{ijk}\) are to be estimated from experimental data. In practice, the reliability of the estimation of the coupling parameters limits the order that can be considered [54], and early results found that, in some setups, pairwise interactions were sufficient to capture most of the structure of the firing patterns [58].
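As a concrete illustration, the sketch below evaluates Eq. (17.1) truncated at third order for a small population by exhaustive enumeration. The coupling values are made up for illustration (not fitted to data), and the convention \(\sigma_i \in \{0,1\}\) is one possible choice; for realistically sized populations, exhaustive enumeration is of course infeasible and approximate inference is required.

```python
import itertools
import numpy as np

def gibbs_distribution(alpha, beta, gamma):
    """Probabilities of all 2^N binary firing patterns under Eq. (17.1),
    truncated at third-order interactions (exhaustive enumeration, small N only)."""
    n = len(alpha)
    states = np.array(list(itertools.product([0, 1], repeat=n)))   # all 2^n patterns
    log_w = states @ alpha                                          # sum_i alpha_i sigma_i
    for (i, j), b in beta.items():                                  # pairwise terms
        log_w += b * states[:, i] * states[:, j]
    for (i, j, k), g in gamma.items():                              # three-way terms
        log_w += g * states[:, i] * states[:, j] * states[:, k]
    w = np.exp(log_w)
    return states, w / w.sum()

# toy couplings (illustrative values, not estimated from data)
alpha = np.array([-1.0, -1.0, -0.5])
beta = {(0, 1): 0.8, (1, 2): 0.3}
gamma = {(0, 1, 2): 0.6}
states, p = gibbs_distribution(alpha, beta, gamma)
```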

However, investigations of larger neuronal ensembles (of the order of 100 neurons) showed that higher-order interactions can easily encode responses to stimuli [19, 70] and, additionally, that the structure of the interactions is hierarchical and modular, suggesting scalability [19]. Further work has shown that including higher-order terms to encode the firing patterns elicited in response to stimuli improves the goodness of fit when a state space for patterns is introduced [7, 60]. Simultaneous silence, i.e. patterns of inactivity, also characterises higher-order interactions and highlights the role of inhibitory neurons in creating spatio-temporal interaction patterns [61]. Moreover, higher-order models reveal activity patterns closely related to the underlying structure of cortical columns [31], indicating a relationship between structure and function.

While useful to capture the statistics of neuronal firing patterns in response to stimuli, these models suffer from several limitations. Their scalability is a problem, as obtaining good and reliable estimates of the model parameters requires long time series, even for small systems [54]. They also remain “fitting” models that make assumptions about the process generating the data which, albeit intuitively reasonable, lack theoretical or empirical foundation. The last limitation is built into the model class: such models inherently lack the temporal dimension and dynamics that are central to spatio-temporal neural coding.

Fig. 17.1

Figures reproduced from Ref. [1]

Coactivation complexes can explain sustained and robust representations of space. a A simulated place cell field map \(M(\epsilon )\) of a small planar environment \(\epsilon \) with a hole in the center, shown together with temporal snapshots of the dynamics of the coactivity complex, which evolves from a small and fragmented complex early during exploration to a stable representation of the underlying environment later on. b The timelines, encoded as persistence barcodes, of the persistent \(H_0\) and \(H_1\) cycles of the coactivity simplicial complexes: 0-dimensional persistent generators are shown as light-blue lines, 1-dimensional ones as light-green lines. Most 1-dimensional cycles correspond to noise, while the persistent topological loops (red dots) encode true physical features of the environment. The time needed to eliminate the spurious cycles is a proxy for the minimal time needed to learn the path connectivity of \(\epsilon \). c Since simplices can also disappear due to noise and unstable neuronal firing, the coactivity complex can flicker, resulting in d the timelines of the topological cycles being interrupted by opening and closing topological gaps.

To continue the study of neural activation patterns driven by higher-order interactions while alleviating some of these model-based limitations, we turn to model-free, data-driven methods from topological data analysis, focusing in particular on place cells [41]. Place and grid cells have complementary roles in encoding and memorising spatial information in the hippocampus and the entorhinal cortex [24, 41]. They also display reliable and long-lasting transient patterns [28], making them ideal candidates for detecting structure in neural activation patterns and for understanding the function of neural circuits. Although there are arguments for a common neural mechanism across species [4, 30], the experimental data in the works we discuss come from rodents.

Remarkably, the firing patterns of hippocampal place cells have been shown to encode the topology of an animal’s environment, rather than its exact geometry, as well as the animal’s position within that environment [12,13,14]. Place cell activations therefore reflect the environment an animal is moving in. How the brain activates the appropriate “environmental” map is currently unknown, but research has shed light on possible mechanisms that allow maps to be consistent and robust over time [1]. Coactivation complexes are constructed by building simplices from coactive place cells. Over time, as the environment is explored, the coactivation simplices progressively become a better topological representation of the physical space (Fig. 17.1). It is not clear that this mechanism is enough to ensure that the maps are committed to memory once the animal is removed from the test environment, so that they can be reused in the future or in mental exploration, i.e. a trip through the remembered environment. A potential such mechanism is proposed in a computational study [2] in the form of replays, where the cells regularly and autonomously reproduce firing sequences corresponding to specific maps, reinforcing existing patterns in a Hebbian fashion. One might conjecture that replays happen during sleep as part of a memory consolidation process [49].
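A simplified sketch of how a coactivity complex can be grown from spike trains is given below; the window length, the maximal dimension and the toy spike trains are arbitrary illustrative choices, not the parameters used in [1]. Cells that fire within the same time window span a simplex, and the complex accumulates simplices as exploration proceeds.

```python
import numpy as np
from itertools import combinations

def coactivity_simplices(spikes, window=0.25, max_dim=3):
    """Collect simplices of cells that fire together within the same time window.
    `spikes` maps cell id -> array of spike times (seconds).
    All faces are added explicitly, so the result is a simplicial complex."""
    t_max = max(s.max() for s in spikes.values())
    edges = np.arange(0.0, t_max + window, window)
    simplices = set()
    for t0, t1 in zip(edges[:-1], edges[1:]):
        active = sorted(c for c, s in spikes.items() if np.any((s >= t0) & (s < t1)))
        for d in range(1, min(len(active), max_dim + 1) + 1):
            simplices.update(combinations(active, d))
    return simplices

# toy spike trains for three hypothetical place cells
rng = np.random.default_rng(0)
spikes = {c: np.sort(rng.uniform(0, 60, size=40)) for c in range(3)}
cplx = coactivity_simplices(spikes)
```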

Furthermore, [22] studied how the correlations of spike trains can be used to detect intrinsic structures in place cell activity, without resorting to external stimuli, and how these structures relate to the topology and geometry of the animal’s space. Each correlation matrix was transformed into an order complex, a filtration of simplicial complexes obtained by adding, at each filtration step, a new edge corresponding to the next highest off-diagonal value of the correlation matrix; the clique complex corresponding to this filtration was then built. They found that the Betti curves, which encode the homological properties of the cell activation patterns measured during free roaming, have consistently lower values than those obtained from reshuffled versions of the correlation matrices. These observations suggest that the correlation structure of hippocampal neurons intrinsically represents the low dimensionality of the ambient space.
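A hedged sketch of this pipeline is shown below, using the ripser package as one possible persistent-homology backend. Since adding edges in decreasing order of correlation is equivalent, up to a monotone reparametrisation, to a Rips filtration of the dissimilarity \(1 - C\), the latter is used here; the toy data and the choice of dissimilarity are illustrative, not those of [22].

```python
import numpy as np
from ripser import ripser   # one possible persistent-homology backend

def barcodes_from_correlations(C, maxdim=1):
    """Persistent homology of the order complex induced by a correlation matrix C:
    edges are added from the strongest correlation downwards, which corresponds
    (up to reparametrisation) to a Rips filtration of D = 1 - C."""
    D = 1.0 - C
    np.fill_diagonal(D, 0.0)
    return ripser(D, distance_matrix=True, maxdim=maxdim)["dgms"]

# toy correlation matrix for 20 hypothetical units
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 20))
C = np.corrcoef(X, rowvar=False)
dgm_h0, dgm_h1 = barcodes_from_correlations(C)
```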

While the geometry of place cells is constrained by the nature of the information they encode, [50] investigated the topology of excitation networks built from simulated activity on reconstructed generic cortical micro-circuitry. The homological structure of such networks was strikingly complex, showing a surprisingly large number of high-dimensional cliques and a wide variety of high-dimensional homological holes. Further simulations on synthetic and null models found different organisations, suggesting that the topological properties of the activation patterns are not purely driven by the neuron interaction topology, but also by their particular function.

Due to the difficulty of obtaining precise measurements of the physical connections among large collections of neurons, topology-based higher-order methods are currently limited to decoding the activity of neurons rather than their structure. At the single-neuron level, however, there is a correlation between a neuron’s topology and its function [29], opening the door to investigating the topological properties of groups of neurons and their function.

So far we have only discussed the role of neural activity at the very small scale, focusing on small neuronal ensembles or very specific functions, e.g. spatial representation. However, one of the great challenges of neuroscience is to understand how behaviour emerges from neural activity, that is, to bridge the scales at which brain activity can be measured [66]. A unified model of brain function remains elusive, despite progress being made, such as models relating interneuron dysfunction to schizophrenia [65], which have found some experimental confirmation [8, 42], or gene co-expression maps being correlated with fMRI brain activity and neurotransmitter pathways [43, 51]. One often focuses on the difference between micro-scale dynamics, i.e. neuronal activity, and macro-scale dynamics, i.e. brain activity measured with neuroimaging techniques such as fMRI, EEG, or MEG. There are however similarities, such as the spatio-temporal statistics of neuronal [5] and voxel [64] activity. Moreover, the aim at both scales is to link spatio-temporal activity patterns to behaviour. The same set of methods can therefore be applied in both cases, as they are agnostic to the source of the data, and they can be used to bridge across scales.

For example, binarized fMRI signals [64] can also be used as input for the extended Ising spin model. The structure of the energy landscape defined by the spin-voxel configurations (Eq. (17.1)) reveals transition dynamics between tasks [17, 68, 69]. However, the models fitted in these studies do not include higher-order terms, as the length of fMRI time series is typically too short for a reliable estimation of the model parameters. Methods relying on signal correlation analysis and topological data analysis are less sensitive to this limitation and have seen their popularity increase for the analysis of macroscopic brain function [16]. We discuss a selection of relevant results in the next section.
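As an illustration of the binarization step, the snippet below thresholds region-wise BOLD time series at each region's own mean, which is one common (though not unique) choice, yielding the ±1 spin configurations whose empirical frequencies feed the energy-landscape analysis; the data here are random and purely illustrative.

```python
import numpy as np

def binarize_bold(bold):
    """Binarize region-wise BOLD time series at each region's own mean,
    giving +/-1 spin configurations usable as input to Eq. (17.1).
    `bold` has shape (T, n_regions)."""
    return np.where(bold >= bold.mean(axis=0), 1, -1)

# toy data: 300 volumes, 7 hypothetical regions of interest
rng = np.random.default_rng(5)
spins = binarize_bold(rng.normal(size=(300, 7)))
patterns, counts = np.unique(spins, axis=0, return_counts=True)  # empirical state frequencies
```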

17.2 Higher-Order Topology in Whole-Brain Descriptions

At the whole-brain level [15], the question of the importance of higher-order interactions is faced with contrasting evidence. For example, it has been suggested that weak higher-order interactions exist in large-scale functional networks, but are dominated by the pairwise interactions, which would therefore be the main shapers of brain function [25]. From this perspective, the higher-order terms could be neglected and functional connectivity descriptions based purely on network properties would be enough to fully characterize brain function.

On the other hand, higher-order observables have been identified as important in multiple studies, e.g. test-retest analyses [72], aberrant connectivity in mental disorders [48] and mild cognitive impairment [71], as well as model inference for EEG signals [32]. Further evidence in this direction has recently come from the study of the shape of the functional spaces described by whole-brain structural and functional data, using tools borrowed from topological data analysis [16, 21]. In structural networks, typically obtained from DTI measurements, persistent homology was used to distinguish healthy and pathological states in developmental [34, 37, 57] and neurodegenerative diseases [35]. For example, by considering the white matter fibers between brain regions as a weighted network, it was possible to detect loops and cavities between regions that were consistent with biologically-inspired principles of parsimonious wiring (Fig. 17.2a) [62]. Such cavities act as obstructions to information flow and were surrounded by large cliques, which can be interpreted as local dense units able to perform rapid processing. The cavities were reproducible across subjects and connected regions belonging to different phases of the brain’s evolutionary history (Fig. 17.2b).

At the functional level, topological differences were found between healthy and pathological subjects [34, 36]. Higher-dimensional topological features in these cases correspond to the homological structure of the correlation spaces extracted from functional connectivity analysis, e.g. spaces equipped with a Pearson-correlation metric. They were useful to discriminate between brain functional configurations in neuropsychiatric disorders and altered states of consciousness relative to controls [11], and to characterize intrinsic geometric structures in neural correlations [22, 55]. One of the best-known examples of this type of analysis compared the topology of the functional connectivity of subjects under the effect of psilocybin, a psychedelic drug, with that of the same subjects under placebo [44], finding that the topological structure differed markedly between the two conditions, and that the difference could be quantified at the level of persistence diagrams (Fig. 17.2c). To improve the interpretability of the \(H_1\) topological summaries extracted from the data, homological scaffolds were introduced to map the topological information back onto brain regions. Such scaffolds can be understood as topological backbones, built from approximated minimal homological generators (Fig. 17.2d), and showed that the altered states of consciousness induced by psilocybin (and likely by other psychedelics) arise from patterns of information integration and importance across brain regions [38] that differ from those of the normal state (Fig. 17.2e).

Fig. 17.2

Figures adapted from Ref. [44]

Structural and functional brain topology. a Distribution of maximal cliques in the average DSI (black) and individual minimally wired (gray) networks, thresholded at an edge density of \(\rho = 0.25\). Heat maps of node participation are shown on the brain surfaces for clique degrees 4–6 (left), 8–10 (middle), and 12–16 (right). b Minimal cycles representing each persistent cavity at its birth density, shown on the brain (top) and as a schematic (bottom) (adapted from [62]). c Comparison of persistence p and birth b distributions. Left, persistence distributions of \(H_1\) generators for the placebo and psilocybin groups. Right, distributions of the births of homology cycles. d Statistical features of group-level homological scaffolds. Left, probability distributions of the edge weights in the persistence homological scaffolds (main plot) and the frequency homological scaffolds (inset). Right, scatter plot of scaffold edge frequency versus total persistence for both the placebo and psilocybin scaffolds. e Simplified visualization of the persistence homological scaffolds for subjects injected with placebo (left) and with psilocybin (right). Colours represent communities obtained by modularity optimization on the placebo scaffold and display the departure of the psilocybin connectivity structure from the placebo baseline.

Other examples of the application of homology can be found in the literature. Lee et al. [34] proposed methods to discriminate between cohorts of children with attention deficit hyperactivity disorder, children with autism spectrum disorder, and pediatric control subjects on the basis of their functional topology. In subsequent work [33], the topological substructure of brain networks was represented through the eigenvectors of the corresponding Hodge Laplacians and used to discriminate between mild cognitive impairment, progressive cognitive impairment, and Alzheimer’s disease; related approaches have been used to describe the heritability of differences in whole-brain functional topology in a cohort of twins [10], and to relate the topological functional structure of EEG data during imagery to functional equivalence in a population of skilled versus unskilled imagers [26, 27].
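To give a flavour of the object involved, the sketch below builds the first Hodge Laplacian \(L_1 = B_1^{\top} B_1 + B_2 B_2^{\top}\) of a toy simplicial complex from its boundary matrices; the complex and its size are purely illustrative, and the studies cited above of course work with far richer brain-derived complexes.

```python
import numpy as np

def hodge_laplacian_1(edges, triangles, n_nodes):
    """First Hodge Laplacian L1 = B1^T B1 + B2 B2^T of a simplicial complex,
    with edges and triangles given as sorted vertex tuples."""
    edge_index = {e: i for i, e in enumerate(edges)}
    B1 = np.zeros((n_nodes, len(edges)))
    for j, (u, v) in enumerate(edges):                 # node-to-edge boundary
        B1[u, j], B1[v, j] = -1.0, 1.0
    B2 = np.zeros((len(edges), len(triangles)))
    for k, (u, v, w) in enumerate(triangles):          # edge-to-triangle boundary
        B2[edge_index[(u, v)], k] = 1.0
        B2[edge_index[(v, w)], k] = 1.0
        B2[edge_index[(u, w)], k] = -1.0
    return B1.T @ B1 + B2 @ B2.T

# toy complex: a filled triangle (0,1,2) sharing vertex 2 with an empty triangle on (2,3,4)
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (2, 4), (3, 4)]
triangles = [(0, 1, 2)]
L1 = hodge_laplacian_1(edges, triangles, n_nodes=5)
n_holes = np.sum(np.isclose(np.linalg.eigvalsh(L1), 0))  # dim ker L1 = number of 1-d holes (= 1)
```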

17.3 Beyond Functional Connectivity

The analysis methods presented so far mostly focus on notions of functional connectivity, the prototypical example being canonical Pearson correlation matrices. Using topological descriptions, it is however possible to investigate different features of the spaces in which brain activity can be represented. An interesting example is a topological simplification analysis [56] which extracts new network representations from temporally resolved fMRI signals. This approach starts by considering each instantaneous BOLD measurement as a point in a high-dimensional space. This activation space is then filtered using a PCA-based function, which is used to bin the time points into overlapping bins. Inside each bin, points are grouped using standard clustering algorithms. The resulting clusters constitute the node set of a new graph, typically called a shape graph or Mapper graph [43]. Since the bins overlap, the same time point can belong to multiple clusters in different bins; whenever two clusters share a time point, an edge is added between the corresponding Mapper graph nodes. In this way, it is possible to build a Mapper graph for each subject, capturing the topology of a simplified representation of the landscape of that individual’s activation space (Fig. 17.3a). Interestingly, Saggar and collaborators [56] found that the properties of the individual Mapper graphs were predictive of changes in performance over multiple tasks: Mapper graphs with large modularity were linked to higher accuracy and smaller response times (Fig. 17.3b). This suggests that a brain activation space with more diverse and specialised representations of tasks guarantees better multitasking performance [45], as opposed to representations shared across multiple tasks, which have instead been linked to generalization. Moreover, it also suggests that changes in function can be both localized, i.e. specific altered states that induce functional change, and global, i.e. affecting the whole dynamical landscape of brain function rather than only specific configurations. Results supporting this picture were also obtained using related embedding techniques, e.g. low-dimensional projections based on topological distances [6] or persistent homological features obtained from spatial activation patterns [52].
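A minimal Mapper-style sketch is given below; the filter, binning scheme, clustering algorithm and all parameter values are simplifications chosen for illustration (using scikit-learn and networkx as assumed tooling), and the pipeline of [56] is more elaborate.

```python
import itertools
import numpy as np
import networkx as nx
from sklearn.decomposition import PCA
from sklearn.cluster import AgglomerativeClustering

def mapper_graph(X, n_bins=10, overlap=0.5, n_clusters=2):
    """Minimal Mapper-style shape graph: filter time points (rows of X) with the
    first principal component, cover its range with overlapping bins, cluster
    inside each bin, and connect clusters that share time points."""
    f = PCA(n_components=1).fit_transform(X).ravel()          # filter values
    lo, hi = f.min(), f.max()
    width = (hi - lo) / n_bins
    step = width * (1.0 - overlap)
    node_members, start = [], lo
    while start < hi:
        idx = np.where((f >= start) & (f <= start + width))[0]
        if len(idx) > 1:
            labels = AgglomerativeClustering(
                n_clusters=min(n_clusters, len(idx))).fit_predict(X[idx])
            for lab in np.unique(labels):
                node_members.append(set(idx[labels == lab]))
        elif len(idx) == 1:
            node_members.append(set(idx))
        start += step
    G = nx.Graph()
    G.add_nodes_from(range(len(node_members)))
    for a, b in itertools.combinations(range(len(node_members)), 2):
        if node_members[a] & node_members[b]:                 # shared time points -> edge
            G.add_edge(a, b)
    return G, node_members

# toy usage: 300 time points, 50 hypothetical parcels
rng = np.random.default_rng(6)
G, members = mapper_graph(rng.normal(size=(300, 50)))
```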

17.4 Higher-Order Signals and Reconstruction

An open and interesting question regarding higher-order interactions in neuroscience is how to measure and, in some cases, even define them. In the case of co-firing neurons, it is natural to identify their joint firing patterns as a higher-order interaction, as is done, for example, in the co-activation simplicial complexes of [3]. In such cases, the interactions also have a natural downward closure (groups of four co-firing neurons also co-fire in groups of three and in pairs), making simplices and simplicial complexes natural descriptions of the system. Moreover, it is straightforward and natural to define binary activation signals for these higher-order interactions by considering when they are and are not present.
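For instance, given binarized spike trains, the activation signal of a candidate higher-order interaction can be written as the logical AND of its members' signals, as in the hypothetical sketch below; downward closure then holds by construction.

```python
import numpy as np

def group_activation(binary, group):
    """Binary signal of a higher-order interaction: 1 at times when every
    neuron in `group` is active. Any subset of an active group is also active,
    so downward closure is automatic."""
    return np.all(binary[list(group)], axis=0).astype(int)

# toy data: (n_neurons, T) array of 0/1 activations
rng = np.random.default_rng(2)
binary = (rng.random((5, 200)) < 0.3).astype(int)
triplet_signal = group_activation(binary, (0, 1, 2))
assert np.all(triplet_signal <= group_activation(binary, (0, 1)))  # downward closure
```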

Fig. 17.3

Adapted from [56]

Mesoscale properties of graph representations of brain activation predict individual task performance. Panel a shows the Mapper graphs obtained for two subjects [56] (labeled S14 and S07). The pie charts on the nodes show the fraction of time points corresponding to each task in each graph node. The Mapper graph of S14 has a low modularity score [40], while that of S07 shows a highly modular structure, with nodes that are most often connected to other nodes of the same task type. Panel b reports the correlations found between the graphs’ modularity scores (Qmod) and task performances.

On the other hand, when dealing with large-scale brain dynamics, signals for higher-order interactions are almost never directly available. In general, neuroimaging signals recorded from regions of interest, i.e. 0-th order signals, are encoded through their correlation properties as metric spaces [11, 44] or weighted clique simplicial complexes [46]. Filtrations of simplicial complexes are then extracted from these representations to compute persistent homological features [47]. While this allows one to capture information that is not available from a network representation, the dynamics of higher-order interactions obtained in this way strongly depends on the structure of the pairwise interactions. Although this type of analysis has already encountered considerable success [16], it would be very valuable to be able to measure or, at least, construct higher-order signals from low-order ones in a principled and controlled way.

A first possibility in this direction is to explicitly leverage low-order signals to define higher-order ones. An example is the edge-level signals and the corresponding edge-centric connectivity introduced by Faskowitz et al. [18]. In standard functional connectivity studies, after z-scoring each time series, the correlation \(r_{ij}\) between regions (nodes) i and j is computed as

$$\begin{aligned} r_{ij} = \frac{1}{{T - 1}} \sum _t \left[ z_i\left( t \right) \cdot z_j\left( t \right) \right] \end{aligned}$$
(17.2)

where \(z_{i}\) and \(z_{j}\) are the z-scored time series. The correlation coefficient \(r_{ij}\) is by definition time independent; however, if one discards the sum over t and the normalization, it is possible to consider its time evolution

$$\begin{aligned} c_{ij}(t) = z_i\left( t \right) \cdot z_j\left( t \right) \end{aligned}$$
(17.3)

as the time series describing the coherent fluctuations of the functional edge ij, and therefore as a genuine higher-order signal. In [18] the authors used this construction to define an edge-based functional connectivity \(eFC_{ij,uv}\) among pairs of edges (Fig. 17.4a):

$$\begin{aligned} eFC_{ij,uv} = \frac{ \sum _t c_{ij}( t ) \cdot c_{uv}( t )}{{\sqrt{\sum _t c_{ij}(t)^2} \sqrt{\sum _t c_{uv}(t)^2} }} \end{aligned}$$
(17.4)

and then studied it using conventional network-based observables. The construction can be generalized to arbitrary orders, which could provide a way to build weighted and temporally resolved higher-order representations of brain neuroimaging data. Note, however, that there is no notion of what type of information is being encoded, e.g. whether the co-fluctuations of a set of regions are due to the effect of yet another region, in whose absence they would be conditionally independent.
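The construction of Eqs. (17.2)–(17.4) translates directly into code; the sketch below (with random data standing in for real BOLD time series) computes the edge time series \(c_{ij}(t)\) and the edge functional connectivity matrix.

```python
import numpy as np
from itertools import combinations

def edge_functional_connectivity(ts):
    """Edge time series and edge functional connectivity (Eqs. 17.3-17.4).
    `ts` is a (T, N) array of node time series."""
    z = (ts - ts.mean(axis=0)) / ts.std(axis=0, ddof=1)            # z-score each region
    pairs = list(combinations(range(ts.shape[1]), 2))
    c = np.stack([z[:, i] * z[:, j] for i, j in pairs], axis=1)    # c_ij(t), shape (T, n_edges)
    norm = np.sqrt((c ** 2).sum(axis=0))
    eFC = (c.T @ c) / np.outer(norm, norm)                         # Eq. (17.4)
    return pairs, c, eFC

# toy data: 200 time points, 10 hypothetical regions
rng = np.random.default_rng(3)
pairs, c, eFC = edge_functional_connectivity(rng.normal(size=(200, 10)))
```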

Fig. 17.4

Adapted from Gatica and Cofré [20]

Construction of higher-order time series from low-order ones. a The construction in the case of edge-centric connectivity. Adapted from Faskowitz et al. [18]. b Results for redundancy and synergy based on the O-information for different age groups: \(I_1\), 30 subjects, ages 10–20; \(I_2\), 46 subjects, ages 20–40; \(I_3\), 29 subjects, ages 40–60; \(I_4\), 59 subjects, ages 60–80.

A second recent approach to the inference of higher-order interactions offers a possible solution to this problem by adopting an information-theoretic point of view. The O-information [53] is a (real-valued) observable that discriminates between redundant and synergistic components of the information in systems composed of multiple variables. Redundant information here means information that is present in the low-order marginals, e.g. at the node level, while synergistic refers to information that is absent at low orders and is only present at the group level. Formally, for a system composed of n discrete variables, \(\mathbf {X}^n = (X_1 , \ldots , X_n )\), the O-information \(\Omega (\mathbf {X}^n)\) is defined as:

$$\begin{aligned} \Omega (\mathbf {X}^n) = TC(\mathbf {X}^n) - DTC(\mathbf {X}^n) = (n-2) H(\mathbf {X}^n) + \sum _{j=1}^n \big [ H(X_j) - H(\mathbf {X}_{-j}^n) \big ] \end{aligned}$$
(17.5)

where TC and DTC are, respectively, the total correlation [67] and the dual total correlation [63], and \(\mathbf {X}^n_{-j}\) is the vector \(\mathbf {X}^n\) with the jth variable omitted. A positive value of \(\Omega (\mathbf {X}^n)\) implies that the interdependencies are mostly dominated by redundancy, while a negative value implies that synergistic effects are dominant. A further advantage of this quantity over previous multivariate measures of dependency is that it does not require a division between predictor and target variables, but rather provides a genuine measure of group synergy. As a proof of principle, the O-information was used to quantify the changes in relevance of interactions of different orders in groups of different ages [20]. In particular, the authors found significant increases in redundancy in older participants for all interaction orders, and also that synergy and redundancy display different functional forms across all age groups and interaction orders (Fig. 17.4b).
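A plug-in estimate of Eq. (17.5) for discretized signals can be written in a few lines, as in the sketch below. The toy variables are chosen so that the first example is redundancy-dominated (two noisy copies of the same bit) and the second synergy-dominated (an XOR); no bias correction of the entropy estimates is attempted.

```python
import numpy as np

def entropy(rows):
    """Plug-in Shannon entropy (bits) of the empirical joint distribution of `rows`."""
    _, counts = np.unique(rows, axis=0, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def o_information(X):
    """O-information of Eq. (17.5) for discrete samples X of shape (T, n):
    positive means redundancy-dominated, negative means synergy-dominated."""
    n = X.shape[1]
    omega = (n - 2) * entropy(X)
    for j in range(n):
        omega += entropy(X[:, [j]]) - entropy(np.delete(X, j, axis=1))
    return omega

rng = np.random.default_rng(7)
x1 = rng.integers(0, 2, 5000)
noisy_copy = lambda x, p=0.9: np.where(rng.random(x.size) < p, x, 1 - x)
redundant = np.column_stack([x1, noisy_copy(x1), noisy_copy(x1)])
x2 = rng.integers(0, 2, 5000)
synergistic = np.column_stack([x1, x2, x1 ^ x2])
print(o_information(redundant))    # > 0: redundancy dominates
print(o_information(synergistic))  # < 0 (about -1 bit): synergy dominates
```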

Although promising, these two approaches still face challenges before they can be widely adopted in conjunction with the existing topological and network tools. For example, when generalizing edge connectivity to higher-order interactions, the sign of the co-fluctuations, being the result of the multiplication of more than two terms, can be misleading and could result in misinterpretations if not properly accounted for. On the other hand, information-theoretic observables typically require a discretization of the signals into states, which is non-trivial already in simpler applications, e.g. multivariate mutual information [39]. Finally, in both cases, while it is possible to compute the strength of interactions at all orders, it is unclear how values of interactions at different orders could be compared directly, making the definition of a valid filtration on a weighted simplicial complex non-trivial.

17.5 Outlook for the Future

There is little doubt, in our opinion, that higher-order interactions play a central role in the brain’s dynamical organisation and in cognition. They might be weak and, at the moment, difficult to quantify in general, but in the context of complex systems weak does not imply negligible [9, 23, 66]. They are likely to play a central role in multitasking [45], focus [27] and neural coding [12, 22], among other things. They have also been shown to improve model fitting [54, 68], prediction [3, 13] and separation [56]. Hence, higher-order interactions and related analysis methods might be good candidates for biomarkers. It is, however, difficult at the moment to know whether they are simply good tools and signal representations for analysis, or have a deeper, more fundamental role in brain theory. Information-based signal analysis [53] might be a good candidate for a first investigation of the precise role of higher-order interactions and structure in brain organisation. However, as future steps, we envision the inclusion of such interactions in testable theoretical brain models, so that theory and experiments can feed on each other.