1 Introduction

Diffusion MRI and tractography algorithms have enabled the mapping of the macro-scale connectome of the entire brain [23]. This network representation enables the application of powerful tools from graph theory and graph algorithms in the study of the brain’s structure and function. Earlier work has focused on various important network properties of the brain, such as small-worldness [1], the presence of hubs [12], and modularity [22]. These studies have revealed that seemingly local pathologies in specific regions can have far-reaching global effects on other parts of the brain [19, 24].

Probably the simplest way to study the dynamics of brain activity at the macro-scale is to compute the “activation cascade” that is generated by the artificial stimulation of a source region. Activation cascades, represented in the form of directed acyclic graphs (DAGs), describe how an activation starting from one region (i.e., source node) propagates to the rest of the brain, activating other brain regions along the way. Previous work has applied the Asynchronous Linear Threshold (ALT) model on the mouse meso-scale connectome to simulate the propagation and integration of sensory signals through activation cascades [21]. Those modeling results were validated with functional data from cortical voltage-sensitive dye imaging, showing that the order of node activations in the model matches quite well with the empirical activation order observed experimentally [21].

The question that we focus on in this study is: suppose we are given two groups with significant differences in the activation cascades generated in their brain networks, what is the smallest set of brain connectivity (i.e., graph edge weight) changes that are sufficient to explain the observed differences in the activation cascades between the two groups? Answering this question can be valuable in many studies when two groups should be compared, not only in terms of structural connectome differences, but also in terms of functional dynamics. For example, we can identify a (generally small) set of brain connectivity changes that appear to cause the functional activation differences in a given disorder, by comparing the corresponding activation cascades with healthy controls. Further, the corresponding connections can be used as possible targets in interventions and treatments such as deep brain stimulation [20, 26].

We have developed an algorithm named TRACED (The Root-cause of Activation Cascade Differences) to solve the previous problem, as illustrated in Fig. 2. TRACED starts by identifying node membership differences between the two groups (say A and B) within the activation cascade of each source. Then, for each source, we identify the smallest set of edges such that, if their weights in group A are set equal to their weights in group B, the corresponding activation cascades become the same in both groups. We have computationally validated TRACED across many test cases. Additionally, we have applied TRACED in the comparison between a group of patients with major depressive disorder (MDD) and a group of controls. This paper focuses on the proposed computational method; a more comprehensive MDD-focused study of the two groups will be presented in a different article.

Previous work detected significant topological differences in terms of network metrics such as edge weights and centrality measures for various neurological disorders, including multiple sclerosis [7, 15], Alzheimer’s disease [6], Parkinson’s disease [27], and schizophrenia [8]. We argue that the activation cascade approach to comparing the connectomes of two groups is more insightful than simply identifying such “static” network differences. The former makes some clear and simple assumptions about the processing and propagation of information in the brain, and it creates a causal connection between structural changes and functional effects. Therefore, the identified abnormalities are more interpretable and robust to subject variability.

2 Linear Threshold Model and Activation Cascades

Our starting point is a structural macro-scale brain network. In this network representation, the graph is denoted by \(G = (V, E)\), each node in V corresponds to a brain region, and E contains edges that correspond to connectivity between brain regions. For structural networks constructed with diffusion tensor imaging (DTI), the edges are undirected. Each edge (x, y) in E is associated with a weight w(x, y) that represents the strength of the corresponding connection.

In the linear threshold model, each node can be either active or inactive. Initially, all nodes are inactive, except a single source node. If a neighbor y of a node x is active, then we say that x “receives an activation” from y with strength w(y, x). Node x becomes active if the cumulative activation it receives from all its active neighbors is at least a threshold \(\theta \).

More formally, a node x at time t is associated with a binary state variable A(x, t) indicating whether x is active (1) or not (0). For the source node s, we have that \(A(s, t=0) = 1\) and for all other nodes:

$$\begin{aligned} A(x, t+1) = 1 \text { if } \sum _{y \mid (y, x) \in E} {w(y, x) A(y, t)} \ge \theta \end{aligned}$$
(1)

for \(t\ge 0\). If x becomes active in the cascade of source s, \(t_s(x)\) is the time of its activation. By convention, \(t_s(x) = \infty \) if node x never gets active.
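The update rule in Eq. (1) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function name `activation_times`, the adjacency-dictionary representation, and the toy weights are all hypothetical, and the sketch assumes that a node stays active once it has been activated.

```python
from math import inf

def activation_times(graph, source, theta):
    """Return t_s(x) for every node x under the linear threshold model.

    graph: dict mapping node -> {neighbor: weight}; undirected edges are
    stored in both directions. Assumption: active nodes never deactivate.
    """
    times = {x: inf for x in graph}
    times[source] = 0
    t = 0
    while True:
        # Inactive nodes whose input from nodes active at time t reaches theta.
        newly_active = [
            x for x in graph
            if times[x] == inf
            and sum(w for y, w in graph[x].items() if times[y] <= t) >= theta
        ]
        if not newly_active:
            return times  # nodes never activated keep t_s(x) = inf
        t += 1
        for x in newly_active:
            times[x] = t

# Hypothetical toy network: source B, theta = 2.
toy = {
    "A": {"B": 2, "C": 1},
    "B": {"A": 2, "C": 1},
    "C": {"A": 1, "B": 1, "D": 1},
    "D": {"C": 1},
}
times = activation_times(toy, "B", theta=2)
```

With these assumed weights, A is activated by B alone, C is activated only once both A and B are active, and D never reaches the threshold.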

An activation cascade, in the form of a directed acyclic graph (DAG), shows whether as well as how each node becomes active. The nodes in the activation cascade of source s form the following set:

$$\begin{aligned} U(s) = \left\{ x\in V \mid t_s(x) < \infty \right\} \end{aligned}$$
(2)

The edges in the activation cascade include \((x, y) \in E\) if node x becomes active before y. So, the presence of this edge in the cascade DAG means that x participates in the activation of y. Mathematically,

$$\begin{aligned} F(s) = \{(x, y) \in E \mid t_s(x) < t_s(y) \} \end{aligned}$$
(3)

We denote the activation cascade as \(H(s)=\{U(s), F(s)\}\). In Fig. 1 we show a simple example illustrating an activation cascade generated in a toy network using the linear threshold model.
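As a small sketch of how the cascade \(H(s)\) could be assembled from activation times, the snippet below builds the node set and the directed edge set. The function name `cascade` and the example data are hypothetical. The sketch also assumes that only edges whose target node is eventually activated are kept, which is our reading of "x participates in the activation of y".

```python
from math import inf

def cascade(graph, times):
    """Build (U, F) from activation times t_s(x).

    graph: dict mapping node -> {neighbor: weight} (undirected, stored both ways).
    times: dict mapping node -> activation time, inf if never activated.
    """
    # U(s): nodes with a finite activation time.
    U = {x for x, t in times.items() if t < inf}
    # F(s): edges directed from the earlier-activated to the later-activated
    # endpoint. Assumption: edges into never-activated nodes are dropped.
    F = {(x, y) for x in U for y in graph[x] if times[x] < times[y] < inf}
    return U, F

# Hypothetical toy network and activation times for source B.
toy = {
    "A": {"B": 2, "C": 1},
    "B": {"A": 2, "C": 1},
    "C": {"A": 1, "B": 1, "D": 1},
    "D": {"C": 1},
}
U, F = cascade(toy, {"A": 1, "B": 0, "C": 2, "D": inf})
```

Because every edge points from a strictly earlier to a strictly later activation time, the resulting edge set cannot contain a cycle, which is why the cascade is a DAG.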

Fig. 1. An illustrative example of an activation cascade obtained using the linear threshold model. Node B is the source of the cascade. The threshold \(\theta =2\). Node A gets active through the edge (A, B), and node C becomes active after both A and B are active. The rest of the nodes stay inactive in this cascade.

For a given \(\theta \), different source nodes may give different cascade sizes. Some source nodes do not activate any other node giving rise to empty cascades, while other source nodes may activate every node in the network causing a full cascade. The third case is that of a partial cascade, which is more likely in practice. It would be unrealistic to set the threshold \(\theta \) so high that we get many empty cascades – that would correspond to a comatose brain! However, it would also be unrealistic to set \(\theta \) so low that we get many full cascades. The previous observations guide us to choose a range of \(\theta \) values that result in more partial cascades, across different source nodes.
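One way to make this selection concrete is to count, for each candidate \(\theta \), the fraction of sources whose cascade is partial. The helper names below are hypothetical, and the sketch assumes that cascade sizes (number of active nodes, source included) have already been computed for every source.

```python
def cascade_type(size, n_nodes):
    """Classify a cascade by its number of active nodes (source included)."""
    if size <= 1:
        return "empty"    # the source activated no other node
    if size == n_nodes:
        return "full"     # every node in the network was activated
    return "partial"

def partial_fraction(sizes_by_source, n_nodes):
    """Fraction of sources whose cascade is neither empty nor full."""
    kinds = [cascade_type(size, n_nodes) for size in sizes_by_source.values()]
    return kinds.count("partial") / len(kinds)

# Hypothetical cascade sizes in a 5-node network for one value of theta.
fraction = partial_fraction({"a": 1, "b": 3, "c": 5, "d": 2}, n_nodes=5)
```

Repeating this count over a grid of \(\theta \) values and keeping those with a high partial fraction is one possible reading of the guideline above, not a procedure stated in the text.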

When comparing the structural brain networks of two subjects, or two groups, we rely on the membership of each source’s cascade: If a node x is active in the cascade of source s in one network, is x also active in the corresponding cascade of the other network? The similarity between the node membership of two cascades is quantified using the Jaccard similarity metric, applied on the set of active nodes in the two cascades. A small Jaccard similarity represents a large difference between the two cascades. If U(s) and \(U'(s)\) denote the set of nodes activated from source s in networks G and \(G'\), respectively, the difference between the two cascades is quantified by:

$$\begin{aligned} d\{U(s), U'(s)\} = 1 - J\{U(s), U'(s)\} = 1 - \frac{|U(s) \cap U'(s)|}{|U(s) \cup U'(s)|} \end{aligned}$$
(4)

where \(J\{U(s), U'(s)\}\) is the Jaccard similarity of the two cascades.
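Eq. (4) translates directly into set operations. The following is a minimal sketch; the function name `cascade_distance` and the example sets are hypothetical.

```python
def cascade_distance(U, U_prime):
    """Jaccard distance between two sets of active nodes, as in Eq. (4)."""
    union = U | U_prime
    if not union:
        # Cannot happen for real cascades (the source is always active),
        # but guards against division by zero on empty inputs.
        return 0.0
    return 1.0 - len(U & U_prime) / len(union)

# Two hypothetical cascades sharing 2 of 4 distinct nodes.
d = cascade_distance({"A", "B", "C"}, {"A", "B", "D"})  # 1 - 2/4 = 0.5
```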

Fig. 2. Method overview: the abnormal and control networks may have several edges with different weights. We generate the activation cascade for each source using the linear threshold model, and identify the cascade membership differences across the two networks. We identify a subset of edges (containing only edge BD in this example) whose weight change can explain the majority of the observed cascade differences. In other words, if we restore the weights of this subset of edges in the abnormal network to be equal to the corresponding weights in the control network, the majority of the cascade differences between two networks no longer exist.

3 TRACED Algorithm

We expect that a mental disorder (or any other genuine distinction in the structural brain networks of a group) would cause cascade membership differences for several different sources [25, 28]. Additionally, it is reasonable to expect that these cascade membership differences will be caused by a rather small set of brain connectivity abnormalities (a larger set of abnormalities would probably be lethal). Under these assumptions, we aim to detect the smallest set of edge weight changes that can explain the observed cascade membership differences between the two groups.

The Case of a Single Source Node: The problem of finding the root-cause for the activation cascade differences of a single source s can be formulated as follows: We are given the cascade of s in the control and the abnormal networks. Compute the minimum set of edges C in the abnormal network so that, if we restore the weights of those edges to be equal to the corresponding weights in the control network, the activation cascade of s will be identical in the two networks. We create the C-restored network by replacing the weight of each edge \(e \in C\) in the abnormal network with the weight of e in the control network.

The mathematical formulation of the previous problem is:

$$\begin{aligned} \hat{C} = \mathop {\mathrm {arg\,min}}\limits _{C \subseteq E \cup E'} |C| \ \text {s.t.}\ U'_C(s) = U(s) \end{aligned}$$
(5)

where the set of active nodes in the control cascade of s is denoted by U(s), the set of active nodes in the abnormal cascade of s is \(U'(s)\), and the set of active nodes in the C-restored network of s is \(U'_C(s)\). By convention, we take the weight of any edges that are not present as 0.

A naive algorithm would be to search among all \(2^m\) solutions (\(m = |E \cup E'|\)) but that would be computationally infeasible for the scale of structural brain networks.

Instead, the TRACED algorithm starts from an empty set C and gradually “grows” the solution by adding one edge at a time. The original empty set C can grow into m different sets, each with a distinct edge. In the next step, each of these m sets can include one of the remaining \(m - 1\) edges, creating a total of \(m(m-1)\) sets with two edges each. This way, when \(\hat{C}\) is found, the number of candidate solutions generated is \(O(m^{k})\), where \(k = |\hat{C}|\). Since we are adding edges step by step following an approach similar to breadth-first-search, the solution is guaranteed to be optimal. Note that even though the run-time of this approach grows exponentially with the solution size k, we expect (as previously mentioned) that k will be small in practice.

The run-time of the algorithm can be improved, however, based on the following observation. Let us define as “candidate edges” the edges that point from \(U(s) \cap U'_C(s)\) (nodes active in both cascades) to \(U(s) \triangle U'_C(s)\) (nodes active in one cascade but not the other). We know that at each “growth” step at least one of the candidate edges must be added to the solution. Otherwise, it is impossible to change the activation status of the nodes in \(U(s) \triangle U'_C(s)\). Therefore, in each step we only consider candidate edges, and thus limit the number of new possible solutions created. If b is an upper bound on the number of candidate edges, the number of total solutions generated during the search is at most \(b^{k}\).

Figure 3 illustrates the execution of the TRACED algorithm with a small example. We start with an empty solution C and with the two activation cascades (control and abnormal) for a single source s. Then, we identify the candidate edges between the two cascades. For each candidate edge we “grow” a new branch of the solution tree. We repeat these steps until \(U(s) = U'_C(s)\).

TRACED has a time complexity of \(O(b^{k} (|V|+|E'|))\) because it iterates through \(b^{k}\) candidate solutions and executes the linear threshold model once for each possible solution.
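The single-source search described above can be sketched as a level-by-level (breadth-first) exploration of candidate edge sets. This is a simplified illustration under stated assumptions, not the authors' implementation: the names `edge`, `active_set`, and `traced_single_source` are hypothetical, edges are stored as `frozenset` pairs, active nodes are assumed to stay active, and candidate edges are additionally required to have different weights in the two networks (restoring an unchanged edge cannot alter the cascade).

```python
def edge(x, y):
    """Undirected edge key (hypothetical helper)."""
    return frozenset((x, y))

def active_set(nodes, weights, source, theta):
    """Nodes activated from `source` under the linear threshold model."""
    active = {source}
    while True:
        new = {
            x for x in nodes - active
            if sum(w for e, w in weights.items()
                   if x in e and (e - {x}) <= active) >= theta
        }
        if not new:
            return active
        active |= new

def traced_single_source(nodes, control, abnormal, source, theta):
    """Return all smallest edge sets C such that U'_C(s) equals U(s)."""
    target = active_set(nodes, control, source, theta)
    all_edges = control.keys() | abnormal.keys()
    frontier, seen = [frozenset()], {frozenset()}
    while frontier:
        solutions, next_frontier = [], []
        for C in frontier:
            # C-restored network: absent edges take weight 0 by convention.
            restored = dict(abnormal)
            for e in C:
                restored[e] = control.get(e, 0.0)
            current = active_set(nodes, restored, source, theta)
            if current == target:
                solutions.append(C)
                continue
            shared, differing = target & current, target ^ current
            for e in all_edges - C:
                # Candidate edge: joins a shared node to a differing node.
                # Assumption: it must also differ in weight between networks.
                if (len(e & shared) == 1 and len(e & differing) == 1
                        and control.get(e, 0.0) != abnormal.get(e, 0.0)):
                    grown = C | {e}
                    if grown not in seen:
                        seen.add(grown)
                        next_frontier.append(grown)
        if solutions:
            return solutions  # first level with a solution has minimum size
        frontier = next_frontier
    return []

# Hypothetical example: only edge a-b explains the cascade difference.
nodes = {"s", "a", "b", "c"}
control = {edge("s", "a"): 2.0, edge("a", "b"): 2.0, edge("s", "c"): 1.0}
abnormal = {edge("s", "a"): 2.0, edge("a", "b"): 1.0, edge("s", "c"): 0.5}
solutions = traced_single_source(nodes, control, abnormal, "s", theta=2.0)
```

In this example the weight of edge s-c also differs between the networks, but it is never considered because node c is inactive in both cascades. The sketch runs the linear threshold model once per candidate set, which mirrors the cost model behind the stated complexity.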

In Sect. A.1 we introduce an improvement that further reduces the average run-time and allows multiple optimal solutions to be found, by adding more than one edge into a candidate solution at each step. That improvement does not change the algorithm’s main idea or its worst-case run time.

Fig. 3. Illustration of TRACED: the tree structure shows how the solution is gradually computed one edge at a time; different branches of the tree can lead to different solutions. The final solutions are marked in red. Along with each candidate solution C, we present the corresponding cascade \(H'_C(s)\). In this example, two solutions can explain equally well the observed differences between the two cascades that originate from source C. (Color figure online)

To computationally validate the correctness of the algorithm, we created pairs of small-scale graphs for which we know the edges that cause activation cascade differences between the two networks. These examples are designed so that they vary in several factors: they can have one or multiple optimal solutions, only one edge or multiple edges in one solution, and edges in a solution that are dependent on each other (i.e., an edge included in the cascades only when the weight of another edge is restored). TRACED produces the correct results in all cases, identifying one or multiple optimal solutions correctly.

Aggregation Across Different Source Nodes: The previous algorithm may produce different sets of edges for different source nodes. Some of these edges may be the result of noise in the data or other artifacts. We select a subset of these edges based on the following argument: if TRACED identifies a certain edge as causal, not only for one source but for multiple, it is likely that edge represents a genuine and important difference between the control and abnormal networks.

We use the coverage metric to measure the number of sources for which an edge e has been identified as causal for the cascade membership differences. Edges with higher coverage play a more central role in the observed differences between the two networks.

To test whether the coverage of an edge is significant, we construct a null hypothesis that all edges in the network have the same probability (\(\frac{|\hat{C}(s)|}{|E|}\), where \(\hat{C}(s)\) refers to the set of edges identified as causal for the cascade membership differences with source node s) of being reported as causal for source s. Under that assumption, each of the \(\sum _s |\hat{C}(s)|\) reported edges is treated as an independent draw that coincides with a given edge e with probability \(\frac{1}{|E|}\), so the coverage metric follows a binomial distribution:

$$\begin{aligned} coverage'(e) \sim B\left( \sum _s |\hat{C}(s)|, \frac{1}{|E|}\right) \end{aligned}$$
(6)

So, the final output of TRACED is the set of edges for which the coverage value is much higher than expected based on chance (\(p<0.05\) in the binomial distribution).
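The aggregation step can be sketched as follows. The function names and example data are hypothetical, and the sketch assumes a one-sided (upper-tail) test, since the text asks for coverage that is higher than expected by chance.

```python
from math import comb

def binomial_tail(k, n, p):
    """P[X >= k] for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def significant_edges(causal_by_source, n_edges, alpha=0.05):
    """Keep edges whose coverage is unlikely under the null of Eq. (6).

    causal_by_source: dict mapping source -> set of edges reported as causal.
    n_edges: |E|, the number of edges in the network.
    """
    n_trials = sum(len(C) for C in causal_by_source.values())
    coverage = {}
    for C in causal_by_source.values():
        for e in C:
            coverage[e] = coverage.get(e, 0) + 1
    # Assumption: one-sided test on the upper tail of the binomial.
    return {e: c for e, c in coverage.items()
            if binomial_tail(c, n_trials, 1 / n_edges) < alpha}

# Hypothetical output of the single-source step for three sources.
causal = {
    "s1": {("a", "b"), ("c", "d")},
    "s2": {("a", "b")},
    "s3": {("a", "b")},
}
kept = significant_edges(causal, n_edges=20)
```

Here edge ("a", "b") is reported for three sources and passes the test, while ("c", "d") is reported once and does not.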

This final step makes the TRACED algorithm heuristic: the set of edges that we finally report is no longer guaranteed to explain all differences in the activation cascades of all sources. Nevertheless, the result captures edges that have influenced the activation cascades across many source nodes, and is therefore more reliable.

4 A Case Study on Major Depressive Disorder

The focus of this paper is on the analysis method presented in the previous section, rather than a specific application. To illustrate one potential application of this method, however, we summarize here the results of a comparison between a group of severe MDD patients and a group of healthy controls. The DTI data for this comparison were provided to us by Dr. Helen Mayberg’s group and were originally used in the PReDICT study [3, 4]. The PReDICT study was approved by Emory’s Institutional Review Board and the Grady Hospital Research Oversight Committee. We constructed structural brain networks by applying probabilistic tractography on diffusion MRI scans of 90 MDD patients and 18 control subjects. The brain was parcellated into 396 regions (198 regions for each hemisphere) using the multi-modal cortical parcellation of Glasser et al. [9], and the Brainnetome Atlas [5] for sub-cortical regions. We applied the linear threshold model and generated an activation cascade for each source node, and measured the cascade membership differences between the two groups. The threshold that we used ranges from 0.1 to 0.3 among different source nodes, and is determined for each source node as the one associated with the most significant cascade membership differences. We then applied TRACED to identify the minimal set of connections that can explain the observed cascade differences.

Table 1 lists the connections that we identified as causal for the cascade membership differences between the two groups. These connections have a significant overlap with findings of earlier studies reporting MDD-related structural/functional changes. The connections identified as causal are adjacent to parts of Brodmann area 24 [14], area 32 [10], area 9 [13], area 10 [16], and the orbitofrontal region [18]. All of these regions have been reported to be pathologically relevant for MDD in earlier studies. Some of the reported connections are also in the default mode network (DMN), which has been shown to be heavily affected by MDD [14], with increased functional connectivity [11]. We are going to further analyze this dataset and also compare our findings with those of other network analysis methods in a follow-up MDD-specific article.

Table 1. The connections that can explain the cascade differences between a group of MDD patients and a group of controls. The name of each node is based on the parcellation of Glasser et al. [9], followed by a brief description of the location of that region (L: left hemisphere, R: right hemisphere).

5 Discussion

Various network analysis metrics and methods have been proposed in the past to compare structural brain networks. For instance, earlier work has investigated the differences between brain networks in terms of small-worldness [1], efficiency [2], and modularity [22]. At the node level, the clustering coefficient, participation coefficient, and different node centrality metrics (especially the betweenness centrality) have been widely adopted [17, 29]. At the edge level, researchers have investigated the edges with significant weight differences and the subnetwork they form [14].

TRACED falls in the spectrum of the edge-level analysis, and the resulting set of connections is a subset of edges that have significant weight differences between the two groups. In addition, TRACED incorporates the information flow across the entire network in varied paths (because of all the source nodes considered). We aggregate this topological information across the entire network to describe the role that a specific network element (node or edge) plays in the network, and how that role is different between the two groups.

Fig. 4. Earlier work has mostly focused on brain connectivity differences using graph-theoretic metrics (e.g., node centrality metrics). TRACED associates connectivity changes with their impact on information transfer in the brain. It measures the impact of such changes on activation cascade differences, and identifies the specific connections that cause these differences through root-cause analysis.

Figure 4 illustrates typical node-level and edge-level network analysis metrics and compares them with TRACED. Compared to identifying solely edges with significant weight changes, TRACED associates a structural change (i.e., restoring the weight of a connection to its value in the other group) with functional changes (the node membership of the corresponding activation cascades). This is favorable for two reasons: it makes the results more interpretable, and less sensitive to variability across subjects. A significant difference in the weight of a connection between two networks may be simply due to subject variability. With TRACED, a connection is identified as causal not only based on its weight but also based on the topological role of that edge in the propagation of information (activation cascades) from different source nodes.

Compared to node-level analysis metrics, TRACED can provide higher spatial resolution because it identifies specific connections instead of entire brain regions. Additionally, some network analysis metrics make implicit assumptions about information transfer in the brain (e.g., the betweenness centrality metric assumes that information travels through shortest paths, while the communicability metric assumes that information follows random walks). These assumptions may not be realistic (e.g., shortest path routing requires information about the complete network stored in every node). It is also harder to interpret these metrics in terms of their associated localities in the brain (e.g., a node may have much lower communicability in one group but what is the corresponding set of affected information pathways?). TRACED makes an explicit assumption about information transfer, namely activation cascades based on the linear threshold model, and it associates structural connectivity changes with corresponding functional changes, making the results more transparent and informative.