17.1 Emergence in the Structure and Function of Complex Systems

In the observable world, some of the most beautiful and most puzzling phenomena arise in physical and biological systems characterized by heterogeneous interactions between constituent elements. For example, in materials physics, heterogeneous interactions between particles in granular matter (such as a sand pile) constrain whether the matter acts as a liquid (flowing with gravity) or a solid (bearing load) [1, 2]. In sociology, heterogeneous interactions between humans in a society are thought to be responsible for surges in online activity, peaks in book sales, traffic jams, and correlated spikes in demand for emergency services [3]. In biology, heterogeneous interactions between computational units in the brain are thought to support a divergence of the correlation length, an anomalous scaling of correlation fluctuations, and the manifestation of mesoscale structure in patterns of functional coupling between units, all features that allow for a diversity of dynamics underlying a diversity of cognitive functions [4, 5]. The feature of these systems that often drives our fascination is the capacity for heterogeneous interactions to produce surprising dynamics, in the form of drastic state transitions, spikes of collective activity, and multiple accessible dynamical regimes.

Because element-element interactions are heterogeneous in such systems, traditional approaches from statistical mechanics – such as continuum models and mean-field approximations – fail to offer satisfying explanations for system function. There exists a critical need to develop alternative approaches to understand how interactions map to emergent behavior. The need is particularly salient in the context of neural systems, where such an understanding could directly inform models of neurological disease and psychiatric disorders [6, 7]. Moreover, gaining such an understanding is a prerequisite for the well-reasoned development of interventions [8], whether in the form of brain stimulation [9, 10], pharmacological agents [11, 12], or other therapies [13]. Technically, such interventions in systems characterized by heterogeneous interactions can be parsimoniously considered as forms of network control, thus motivating extensive recent interest in the utility of network control theory for neural systems [8].

Despite the generic importance of understanding how interactions map to emergent properties, and the specific importance of understanding that mapping in the human brain, progress toward that understanding has remained surprisingly slow. Some efforts have sought to develop detailed multiscale computational models [14]. Yet such efforts are faced with the ever-present quandary that, in point of fact, “the best material model of a cat is another, or preferably the same, cat” [15]. Detailed models are difficult to construct and intractable to analytic approaches, require extensive time to simulate, contain parameters that are frequently underconstrained by experimental data, and in the end produce dynamics that are themselves difficult to understand or to explain from any specific choices in the model. In contrast, approaches from physics consider natural phenomena as if dynamics at macroscopic length scales were almost independent of the underlying, shorter length scale details [16]. A hallmark of effective physical theories is a marked compression of the full parameter space into a few governing variables that are sufficient to describe the observables of interest at the scale of interest. Interestingly, recent theoretical work demonstrates that such simple models are the natural culmination of processes maximizing the information learned from finite data [17].

Here we embrace simplicity by considering the utility of linear systems theory for the understanding and control of neural systems composed of computational units coupled by heterogeneous interactions. We begin by placing our remarks within the context of quantitative dynamical models of neurons and their interactions, as well as the spatial and temporal considerations inherent in choosing such models. We then turn to a discussion of approximations to those dynamical models, the incorporation of exogenous control input, and model linearization. Our treatment then naturally brings us to a discussion of the theory of linear systems, their response to perturbative impulses, and explicit control strategies. We lay out the formalism for probing state transitions, controllability, and the minimum control energy needed for a given state transition. After completing our formal treatment, we discuss the application of linear systems theory to neural systems, and efforts to map network architecture to control properties. We close with a description of several particularly pertinent methodological considerations and limitations, before outlining emerging frontiers.

17.2 Quantitative Dynamical Models of Neural Systems and Interactions

Historically, many neural behaviors and mechanisms have been successfully modeled quantitatively. Here we briefly describe several illustrative examples of such models. The classic fundamental biophysical model of a single neuron (Fig. 17.1, left) was developed by Alan Hodgkin and Andrew Huxley in 1952 (see [18] for details). The model is now known as the Hodgkin-Huxley model. It treats a segment of a neuron as an electrical circuit, where the membrane (capacitor) and voltage-gated ion channels (resistors) are parallel circuit elements. The time evolution of membrane voltage, V m, between the inside and the outside of the neuron is given by

$$\displaystyle \begin{aligned} C_m \dot{V}_m(t) &= \bar{g}_Kn^4(t)(V_K-V_m) + \bar{g}_{Na}m^3(t)h(t)(V_{Na}-V_m) + \bar{g}_l(V_l-V_m) + I(t), \end{aligned} $$

where C m is the membrane capacitance; \(\bar {g}_K, \bar {g}_{Na},\) and \(\bar {g}_l\) are maximum ion conductances for potassium, sodium, and passive leaking ions; and I is an external stimulus current, all per unit area. In addition, V K, V Na, and V l represent the reversal potential of these ions. The variables n, m, and h vary between 0 and 1 and model the ion channel gate kinetics to determine the fraction of open sodium (m, h) and potassium (n) channels:

$$\displaystyle \begin{aligned} \dot{n}(t) &= \alpha_n(V_m(t))(1-n(t)) - \beta_n(V_m(t))n(t)\\ \dot{m}(t) &= \alpha_m(V_m(t))(1-m(t)) - \beta_m(V_m(t))m(t)\\ \dot{h}(t) &= \alpha_h(V_m(t))(1-h(t)) - \beta_h(V_m(t))h(t), \end{aligned} $$

where the functions α i(V m) and β i(V m) are empirically determined. These segments are then spatially connected together, such that the propagation of an action potential across a neuron is modeled by a set of partial differential equations. Due to the biophysical realism of variables and parameters, this model can make powerful and accurate predictions of neuron activity in different environments and stimulation regimes [19,20,21]. Simplified versions of this model, such as the FitzHugh-Nagumo model [22], can also produce many of the same neuronal dynamics.
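As a concrete illustration, the coupled Hodgkin-Huxley equations above can be integrated numerically. The sketch below uses forward Euler with the standard empirical rate functions and parameter values (these are textbook fits, not values given in this chapter, so treat them as assumptions); a sustained current step elicits action potentials:

```python
import numpy as np

# Minimal Hodgkin-Huxley simulation (forward Euler). Units: V in mV, t in ms,
# conductances in mS/cm^2, currents in uA/cm^2. Parameters and rate functions
# are the standard empirical fits, assumed here for illustration.
C_m, g_K, g_Na, g_l = 1.0, 36.0, 120.0, 0.3
V_K, V_Na, V_l = -77.0, 50.0, -54.4

def a_n(V): return 0.01 * (V + 55.0) / (1.0 - np.exp(-(V + 55.0) / 10.0))
def b_n(V): return 0.125 * np.exp(-(V + 65.0) / 80.0)
def a_m(V): return 0.1 * (V + 40.0) / (1.0 - np.exp(-(V + 40.0) / 10.0))
def b_m(V): return 4.0 * np.exp(-(V + 65.0) / 18.0)
def a_h(V): return 0.07 * np.exp(-(V + 65.0) / 20.0)
def b_h(V): return 1.0 / (1.0 + np.exp(-(V + 35.0) / 10.0))

def simulate(I_ext=10.0, T=50.0, dt=0.01):
    V = -65.0
    # start each gate at its steady-state value for the resting potential
    n = a_n(V) / (a_n(V) + b_n(V))
    m = a_m(V) / (a_m(V) + b_m(V))
    h = a_h(V) / (a_h(V) + b_h(V))
    trace = []
    for _ in range(int(T / dt)):
        dV = (g_K * n**4 * (V_K - V) + g_Na * m**3 * h * (V_Na - V)
              + g_l * (V_l - V) + I_ext) / C_m
        n += dt * (a_n(V) * (1 - n) - b_n(V) * n)
        m += dt * (a_m(V) * (1 - m) - b_m(V) * m)
        h += dt * (a_h(V) * (1 - h) - b_h(V) * h)
        V += dt * dV
        trace.append(V)
    return np.array(trace)

trace = simulate()   # sustained 10 uA/cm^2 step; spikes overshoot 0 mV
print(trace.max())
```

The gating variables n, m, and h remain in [0, 1], and the voltage trace shows repetitive firing under the sustained stimulus.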

Fig. 17.1

Schematic of neural models and controlling perturbations at different scales. Here, the Hodgkin-Huxley model describes the biophysical behavior of single neurons (left) that may be excitatory (blue) or inhibitory (gray). The artificial neuron models describe the simplified weighted connections and binary states of many neurons (center). The Wilson-Cowan model describes the activity of large neural populations in a region (right) or in a cortical column by modeling the excitatory and inhibitory connections of each population. In each case, a controlling perturbation (yellow) can affect the neural system at different scales

However, many complex behaviors of neural systems arise from interactions between multiple neurons. With four variables (membrane voltage and three gating variables) and even more parameters needed to model the behavior of a single neuron, the space of models of interacting neurons quickly becomes intractable to both analytical and numerical interrogation. An alternative approach is to capture the simplest aspects of neural interactions that are crucial for the phenomenon of interest. Such was the approach taken by Warren McCulloch and Walter Pitts [23], who developed what would later become a canonical model of an artificial neuron. In this model, each neuron i at any point in time t exists in one of two states: firing x i(t) = 1 or not firing x i(t) = 0. The state of the neuron is determined by a weighted sum of inputs from connected neurons j at the previous time step. Then, neuron i in a system of N neurons evolves in time as

$$\displaystyle \begin{aligned} x_i(t+1) = f_i\left(\sum_{j=1}^N w_{ij}x_j(t)\right),\end{aligned} $$

where w ij is the strength of excitation (w ij > 0) or inhibition (w ij < 0) from neuron j to neuron i and function f i is typically a thresholding function (Fig. 17.1, center). Instantiations and extensions of this model are used to study associative memory (Hopfield [24]), machine learning (perceptron [25]), and cellular automata [26].
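A minimal sketch of such a network is easy to write down; the weights, thresholds, and initial state below are arbitrary illustrative choices:

```python
import numpy as np

# Sketch of a McCulloch-Pitts network update: binary states, weighted sums,
# and a threshold nonlinearity. Weights W and thresholds theta are arbitrary
# illustrative values, not taken from the chapter.
W = np.array([[ 0.0, 1.0, -0.5],
              [ 0.5, 0.0,  1.0],
              [-1.0, 0.5,  0.0]])   # w_ij: input from neuron j to neuron i
theta = np.array([0.4, 0.4, 0.4])   # firing thresholds

def step(x):
    """One synchronous update: x_i(t+1) = 1 if sum_j w_ij x_j(t) >= theta_i."""
    return (W @ x >= theta).astype(int)

x = np.array([1, 0, 1])
for t in range(4):
    x = step(x)
    print(t, x)
```

For these particular weights the state [1, 0, 1] maps to [0, 1, 0] and back, so the network settles into a period-2 cycle.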

In many cases, the sheer number of neurons and interactions renders even these simple models difficult to study. A typical solution is to instead model the average activity of a population of neurons. This is the approach taken by Hugh Wilson and Jack Cowan [27] in the Wilson-Cowan model. Here, a group of neurons is separated into excitatory and inhibitory populations, where the fractions of cells firing at time t in each population, E(t) and I(t) respectively, evolve in time as

$$\displaystyle \begin{aligned} \tau_e \dot{E}(t) &= -E(t) + (k_e-r_eE(t))S_e\left(c_1E(t)\right.\\ &\qquad \left.- c_2I(t) + P(t)\right)\\ {} \tau_i \dot{I}(t) &= -I(t) + (k_i-r_iI(t))S_i\left(c_3E(t)\right.\\ &\qquad \left. - c_4I(t) + Q(t)\right). \end{aligned} $$

Here, c 1, c 2 > 0 represent connection strengths into the excitatory population; c 3, c 4 > 0 represent connection strengths into the inhibitory population; r e, r i are the refractory periods; and S e, S i are sigmoid functions derived from the distribution of neuron input thresholds for firing. Such models produce oscillations like those observed in noninvasive measurements of large-scale brain activity (Fig. 17.1, right) in patients with epilepsy [28].
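A forward-Euler sketch of these equations might look as follows; the parameter values and the single shared sigmoid S (standing in for both S_e and S_i) are illustrative assumptions rather than values from the chapter:

```python
import numpy as np

# Forward-Euler sketch of the Wilson-Cowan equations. All parameter values
# are illustrative assumptions; S is a logistic sigmoid shifted so S(0) = 0.
tau_e, tau_i = 1.0, 1.0
k_e, k_i, r_e, r_i = 1.0, 1.0, 1.0, 1.0
c1, c2, c3, c4 = 16.0, 12.0, 15.0, 3.0

def S(x, a=1.3, th=4.0):
    return 1.0 / (1.0 + np.exp(-a * (x - th))) - 1.0 / (1.0 + np.exp(a * th))

def simulate(P=1.25, Q=0.0, T=100.0, dt=0.01):
    E, I = 0.1, 0.05          # initial firing fractions
    traj = []
    for _ in range(int(T / dt)):
        dE = (-E + (k_e - r_e * E) * S(c1 * E - c2 * I + P)) / tau_e
        dI = (-I + (k_i - r_i * I) * S(c3 * E - c4 * I + Q)) / tau_i
        E += dt * dE
        I += dt * dI
        traj.append((E, I))
    return np.array(traj)

traj = simulate()
print(traj[-1])   # E(t), I(t) remain bounded fractions
```

Because the refractory factors (k − rE) shut off growth as the firing fraction rises, both populations stay within a bounded range for any input drive P, Q.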

In these and many other models, a common theme is the tradeoff between realism and tractability. We desire sufficient realism to study crucial features of neural systems such as the activity of each unit, the interaction strength between units, the connection topology, and the effect of external stimulation. We also desire sufficient tractability (either to analytical or numerical interrogation) to make consistent and meaningful predictions about our neural system by understanding relations between the model parameters and the model behavior. In this chapter, we will discuss one such model from the theory of linear dynamical systems.

17.2.1 Spatial and Temporal Considerations

When modeling neural systems, an immediately salient consideration is the vast range of spatial and temporal scales at which nontrivial – and thus quite interesting – dynamics occur. It stands to reason that the most relevant type of model for understanding a given phenomenon depends on the spatiotemporal scale at which that phenomenon is observed. For example, consider the fact that while it is generally known that certain sensory regions such as the visual cortex are both anatomically linked to and functionally responsible for sensory inputs, it is more difficult to assign a set of neurons that are necessary for distributed cognitive processes such as attention and cognitive control. Thus, biophysical models at the level of single neurons may be viable for simulating receptive fields in visual processing, but may be less useful for studies of task-switching or gating. Similarly, consider the fact that a single neuron may fire every few milliseconds, while human reaction times are on the order of hundreds of milliseconds, and brain-wide fluctuations in activity occur on the order of seconds. Thus, the form of the model considered should match the temporal scales of the behavior to be studied.

From a modeling perspective, balancing these considerations of spatial and temporal scales with model realism impacts the category of model that has the greatest utility. If one wishes to consider small spatial scales, then a rather simplistic neuron-level model such as the McCulloch-Pitts model may be particularly useful, where each neural unit has discrete states such that each neuron i is either firing x i(t) = 1 or not x i(t) = 0. In contrast, if one wishes to consider larger spatial scales characteristic of distributed cognitive processes, it may be more appropriate to consider models in which each neural unit reflects the average population activity of a brain region as a continuous state, where x i(t) is a real number. Similar considerations are relevant and important in the time domain. For models that assume fairly uniform delays in neuronal interactions, such as the McCulloch-Pitts model, a discrete time model where time evolves in integer increments may be appropriate. In contrast, if transmission delays between neural units are heterogeneous (as with myelinated versus unmyelinated axons), a continuous time model may be more suitable, where time t is a real number.

In addition to affecting the definition of neural activity and the nature of its propagation, these considerations also affect the meaning of interactions between units. In a neuron-level model whose units reflect neurons, the unit-to-unit interactions may represent structural synapses between neurons. In contrast, in a population model whose units reflect average neural activity of a brain region, unit-to-unit interactions may represent a summary measure of the collective strength or extent of structural connections between regions. Both types of connections can be empirically measured using either invasive (staining, fluorescence imaging, tract tracing [29]) or noninvasive (tractography [30]) methods. The specific type of interaction studied constrains the sorts of inferences that one can draw from the subsequent model, as well as the types of model-generated hypotheses that one can test in new experiments.

17.2.2 Dynamical Model Approximations

Both here and in the following sections, we will consider systems with both continuous state and time. However, we note that the theory of linear systems extends naturally to discrete time systems as well. We begin our formulation with a set of N neural units, where each unit has an associated real-valued level of activity x i(t) at each real-valued time t ≥ 0. The collection of the activity of all units into the column vector x(t) = [x 1(t);x 2(t);⋯ ;x N(t)] is called the state of our system at time t. For example, in the Hodgkin-Huxley equations, our state vector is x = [V m;n;m;h]. In many models, including Hodgkin-Huxley, the time evolution of the system states can be written as a vector differential equation:

$$\displaystyle \begin{aligned} \underbrace{ \begin{bmatrix} \dot{x}_1(t)\\ \dot{x}_2(t)\\ \vdots\\ \dot{x}_N(t) \end{bmatrix}}_{\dot{\boldsymbol{x}}(t)} = \underbrace{ \begin{bmatrix} f_1(\boldsymbol{x}(t))\\ f_2(\boldsymbol{x}(t))\\ \vdots\\ f_N(\boldsymbol{x}(t)) \end{bmatrix}}_{\boldsymbol{f}(\boldsymbol{x}(t))},\end{aligned} $$

where f, the vector of functions f i, determines how the system states change, \(\dot {\boldsymbol {x}}\), at every particular state x. We can think of these equations as generating a vector field, where at each point x, we draw an arrow with magnitude and direction equal to f(x). As an example, consider the following two neuron system x 1, x 2 that evolves in time as:

$$\displaystyle \begin{aligned} \dot{x}_1(t) &= 2x_2(t) - \sin{}(x_1(t))\\ \dot{x}_2(t) &= x_1^2(t) - x_2(t),\end{aligned} $$

where the vector field and example trajectory from initial state x(0) = [−0.3;−0.4] are shown (Fig. 17.2, top). Note how at every point x 1, x 2 the above equation determines a vector of motion \(\dot {\boldsymbol {x}}\) that the system traces from the initial point. This quantitative modeling of neural dynamics allows us to study and predict the response of our neural system to changes in interaction strength or external stimulation.
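The trajectory in Fig. 17.2 (top) can be reproduced by integrating this vector field numerically; a simple forward-Euler sketch (the step size is an arbitrary choice):

```python
import numpy as np

# Euler integration of the two-unit example from x(0) = [-0.3, -0.4];
# a sketch of how the vector field f generates a trajectory in state space.
def f(x):
    return np.array([2 * x[1] - np.sin(x[0]),
                     x[0]**2 - x[1]])

def trajectory(x0, T=2.0, dt=1e-3):
    x = np.array(x0, dtype=float)
    for _ in range(int(T / dt)):
        x = x + dt * f(x)   # follow the arrow f(x) for a small step dt
    return x

print(trajectory([-0.3, -0.4]))
```

At each step the state moves a small distance along the local arrow of the vector field, tracing out the curve shown in the figure.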

Fig. 17.2

Vector fields and trajectories, with and without control inputs. Example simple vector field of two states with a particular trajectory from initial condition x(0) = [−0.3;−0.4] (top left) in state space, with the corresponding plot of each state over time (top right) and the corresponding vector field and trajectory with control input u(t) = 0.5 (bottom left) with corresponding states over time (bottom right)

17.2.3 Incorporating Exogenous Control

While modeling intrinsic system behavior is already a broad topic of current research, there is an increasing need for the principled study of therapeutic interventions to correct dysfunctional neural activity. These interventions may take the form of targeted invasive (deep brain stimulation) or noninvasive (transcranial magnetic stimulation) inputs, or more diffuse drug treatments. Hence, in our modeling efforts, we also often desire to incorporate the effect of some external stimuli u 1(t), ⋯ , u k(t). We collect these stimuli into a vector u(t) = [u 1(t);u 2(t);⋯ ;u k(t)] and include their effect on the rates of change of system states in our function:

$$\displaystyle \begin{aligned} \underbrace{ \begin{bmatrix} \dot{x}_1(t)\\ \dot{x}_2(t)\\ \vdots\\ \dot{x}_N(t) \end{bmatrix}}_{\dot{\boldsymbol{x}}(t)} = \underbrace{ \begin{bmatrix} f_1(\boldsymbol{x}(t),\boldsymbol{u}(t))\\ f_2(\boldsymbol{x}(t),\boldsymbol{u}(t))\\ \vdots\\ f_N(\boldsymbol{x}(t),\boldsymbol{u}(t)) \end{bmatrix}}_{\boldsymbol{f}(\boldsymbol{x}(t),\boldsymbol{u}(t))}. \end{aligned} $$

As an example in our two-unit system, we can apply an input to the first unit

$$\displaystyle \begin{aligned} \dot{x}_1(t) &= 2x_2(t) - \sin{}(x_1(t)) + u(t)\\ \dot{x}_2(t) &= x_1^2(t) - x_2(t),\end{aligned} $$

thereby changing our system of equations. We plot the vector field and trajectory of our system under some constant input u(t) = 0.5 (Fig. 17.2, bottom). Notice how the control input changes the trajectory and final state of our system by modifying the vector field. Also notice that our input only shifts the x 1 component of our vectors because we only stimulate x 1. These abilities to map neural interactions f to the full trajectory of activity x(t) and to find control inputs u(t) that drive our neural system to a desired final state x(T) are among the core contributions of linear systems theory.
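A small extension of the same Euler sketch adds the constant input u(t) = 0.5 to the first unit, as in Fig. 17.2 (bottom), and shows that the final state changes (step size and horizon are arbitrary choices):

```python
import numpy as np

# The two-unit example with a constant input u(t) applied to the first unit:
# the input adds to dx1/dt only, shifting the x1 component of the vector field.
def f(x, u):
    return np.array([2 * x[1] - np.sin(x[0]) + u,
                     x[0]**2 - x[1]])

def trajectory(x0, u=0.0, T=2.0, dt=1e-3):
    x = np.array(x0, dtype=float)
    for _ in range(int(T / dt)):
        x = x + dt * f(x, u)
    return x

free   = trajectory([-0.3, -0.4], u=0.0)
driven = trajectory([-0.3, -0.4], u=0.5)
print(free, driven)   # the control input changes the final state
```

Only the first component of the vector field is shifted, because only the first unit is stimulated, yet both components of the final state differ through the coupling.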

17.2.4 Model Linearization

While we have a quantitative framework for the evolution of a controlled neural system, there are no general principles for determining the full trajectory x(t) or control input u(t) to reach a desired final state for a general nonlinear system. In systems of only a few neural units, there exist several powerful numerical and analytic tools. However, the study and control of large neural systems is made difficult by our inability to know how a stimulus will affect our system without first simulating the full trajectory. Further, for multiple stimuli, the number of possible stimulus patterns grows exponentially.

A special class of simplified systems called linear systems circumvents this issue. In our state representation, a linear system is described by

$$\displaystyle \begin{aligned} \underbrace{ \begin{bmatrix} \dot{x}_1(t)\\ \dot{x}_2(t)\\ \vdots\\ \dot{x}_N(t) \end{bmatrix}}_{\dot{\boldsymbol{x}}(t)} &= \underbrace{ \begin{bmatrix} a_{11} & a_{12} & \dotsm & a_{1N}\\ a_{21} & a_{22} & \dotsm & a_{2N}\\ \vdots & \vdots & \ddots & \vdots\\ a_{N1} & a_{N2} & \dotsm & a_{NN} \end{bmatrix}}_A \underbrace{ \begin{bmatrix} x_1(t)\\ x_2(t)\\ \vdots\\ x_N(t) \end{bmatrix}}_{\boldsymbol{x}(t)} + \underbrace{ \begin{bmatrix} b_{11} & b_{12} & \dotsm & b_{1k}\\ b_{21} & b_{22} & \dotsm & b_{2k}\\ \vdots & \vdots & \ddots & \vdots\\ b_{N1} & b_{N2} & \dotsm & b_{Nk} \end{bmatrix}}_B \underbrace{ \begin{bmatrix} u_1(t)\\ u_2(t)\\ \vdots\\ u_k(t) \end{bmatrix}}_{\boldsymbol{u}(t)}, \end{aligned} $$
(17.1)

that is characterized by the time evolution of any state \(\dot {x}_i(t)\) being a weighted sum of current states \(\sum _{j=1}^N a_{ij}x_j(t)\) and external inputs \(\sum _{j=1}^k b_{ij}u_j(t)\). Here, a ij is a real number that determines how activity in state x j influences the rate of change of state x i, and b ij is a real number that determines how external input u j influences the rate of change of state x i. We see that our example two-unit system is not linear, because \(\dot {x}_1(t)\) depends on \(\sin {}(x_1(t))\) and \(\dot {x}_2(t)\) depends on \(x_1^2(t)\).

To transform the nonlinear system \(\dot {\boldsymbol {x}} = \boldsymbol {f}(\boldsymbol {x},\boldsymbol {u})\) into a linear system \(\dot {\boldsymbol {x}} = A\boldsymbol {x} + B\boldsymbol {u}\), we can create an approximate model of our vector field about a particular constant operating state x^* and input u^*. We first evaluate the dynamics at this operating point, f(x^*, u^*). Then we approximate the vector field along small deviations from this point by computing the derivative of f(x, u) with respect to the states to get matrix A and with respect to the control inputs to get matrix B:

$$\displaystyle \begin{aligned} A = \left.\frac{\partial \boldsymbol{f}}{\partial \boldsymbol{x}}\right|_{\boldsymbol{x}^*,\boldsymbol{u}^*}, \qquad B = \left.\frac{\partial \boldsymbol{f}}{\partial \boldsymbol{u}}\right|_{\boldsymbol{x}^*,\boldsymbol{u}^*}. \end{aligned} $$

Then, for states near x^* and inputs near u^*, the vector field is approximately

$$\displaystyle \begin{aligned} \dot{\boldsymbol{x}}(t) &= \boldsymbol{f}(\boldsymbol{x},\boldsymbol{u}) \end{aligned} $$
(17.2)
$$\displaystyle \begin{aligned} &\approx \boldsymbol{f}(\boldsymbol{x}^*,\boldsymbol{u}^*) + A(\boldsymbol{x}(t) - \boldsymbol{x}^*) + B(\boldsymbol{u}(t) - \boldsymbol{u}^*). \end{aligned} $$
(17.3)

A typical operating point for the input is u^* = 0, corresponding to no input, because neural stimulation is viewed as a perturbation to the natural, unstimulated dynamics. A typical operating point for the state is a fixed point x^* where f(x^*, u^*) = 0, because then the evolution of our system (Eq. 17.2) depends only on deviations from the point, and not on its actual value. Finally, we can write the linearized equation explicitly as a function of these deviations through the change of variables y(t) = x(t) − x^*:

$$\displaystyle \begin{aligned} \dot{\boldsymbol{y}}(t) = \dot{\boldsymbol{x}}(t) \approx A\boldsymbol{y}(t) + B\boldsymbol{u}(t). \end{aligned} $$

We will continue to use the variable x instead of y with the understanding that it represents deviations from the fixed point. For example, in our two-unit system, we can linearize about \(x_1^* = 0, x_2^* = 0,\) and u^* = 0 to yield

$$\displaystyle \begin{aligned} \underbrace{ \begin{bmatrix} \dot{x}_1(t)\\ \dot{x}_2(t) \end{bmatrix}}_{\dot{\boldsymbol{x}}(t)} \approx \underbrace{ \begin{bmatrix} -1 & 2\\ 0 & -1 \end{bmatrix}}_{A} \underbrace{ \begin{bmatrix} x_1(t)\\ x_2(t) \end{bmatrix}}_{\boldsymbol{x}(t)} + \underbrace{ \begin{bmatrix} 1\\ 0 \end{bmatrix}}_{B} u(t). \end{aligned} $$

We show the vector fields and trajectories for both the nonlinear and linear equations without control, where u(t) = 0 (Fig. 17.3, top), and with control, where u(t) = 0.5 (Fig. 17.3, bottom), from the same initial condition, and we notice that in the neighborhood of \(x_1^* = 0, x_2^* = 0\), the fields and trajectories are similar. Hence, by linearizing our neural dynamics about x^*, u^*, we preserve the behavior of our neural system for states x(t) and inputs u(t) near this point while enabling the use of the powerful tools developed in the next section.
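The matrices A and B above can also be recovered numerically by finite-difference differentiation of f at the operating point; a sketch (the step size eps is an arbitrary choice):

```python
import numpy as np

# Central-difference Jacobians of the two-unit example at the fixed point
# x* = 0, u* = 0; these should recover A = [[-1, 2], [0, -1]] and B = [1, 0].
def f(x, u):
    return np.array([2 * x[1] - np.sin(x[0]) + u,
                     x[0]**2 - x[1]])

def jacobians(x_star, u_star, eps=1e-6):
    n = len(x_star)
    A = np.zeros((n, n))
    for j in range(n):
        dx = np.zeros(n)
        dx[j] = eps
        A[:, j] = (f(x_star + dx, u_star) - f(x_star - dx, u_star)) / (2 * eps)
    B = (f(x_star, u_star + eps) - f(x_star, u_star - eps)) / (2 * eps)
    return A, B

A, B = jacobians(np.zeros(2), 0.0)
print(np.round(A, 6))   # [[-1, 2], [0, -1]]
print(np.round(B, 6))   # [1, 0]
```

The sin term contributes −cos(0) = −1, the quadratic term contributes 2x1 = 0 at the origin, and the input enters only the first equation, matching the linearization in the text.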

Fig. 17.3

Vector fields and trajectories for a nonlinear system and its linearized form. Example vector field of two states with a particular trajectory from initial condition x(0) = [−0.3;−0.4] for the uncontrolled nonlinear system (top left), the uncontrolled linear system (top right), the controlled nonlinear system (bottom left), and the controlled linear system (bottom right)

17.3 Theory of Linear Systems

A useful model for therapeutic intervention in a neural system should capture both how the activity over time depends on the connections between neural units and how to change the activity in a desired way through stimulation. Now that we have a model that captures features of neural activity and connectivity in a linearized form, we will develop equations that yield precisely these features. Specifically, we will first determine the system’s response to control through mathematical relations as opposed to simulations. Then we will use these principles to design stimuli that optimally guide our system from some initial state x(0) to some final state x(T).

17.3.1 Impulse Response

First, we find the natural evolution of system states from some initial neural state x(0) without any external input. This task amounts to finding the state trajectory x(t) that solves our dynamic equation \(\dot {\boldsymbol {x}}(t) = A\boldsymbol {x}(t)\). For scalar systems where x(t) is not a vector, we are reminded of the solution to \(\dot {x} = ax\):

$$\displaystyle \begin{aligned} \frac{dx}{dt} &= ax & &\text{differential }\ \text{equation,}\\ \frac{1}{x}dx &= adt & &\text{divide by } x,\\ \int\frac{1}{x}dx &= \int a dt + c & & \text{integrate both sides,}\\ \ln|x| &= at+c\\ x(t) &= Ce^{at} & & \text{solution to differential equation}, \end{aligned} $$

where the constant is the initial condition C = x(0). We can prove that this solution satisfies \(\dot {x} = ax\) by using a Taylor series of the exponential function \(e^{at} = \sum _{k=0}^\infty \frac {(at)^k}{k!}\). Taking the time derivative of x(t) = e at, we see \(\dot {x} = ax\):

$$\displaystyle \begin{aligned} \frac{d}{dt}e^{at} &= \frac{d}{dt}\left(1 + \frac{at}{1!} + \frac{a^2t^2}{2!} + \frac{a^3t^3}{3!} + \dotsm + \frac{a^kt^k}{k!} + \dotsm\right) & & \text{Taylor series of }e^{at},\\ &= 0 + \frac{a}{1!} + 2\frac{a^2t}{2!} + 3\frac{a^3t^2}{3!} + \dotsm + k\frac{a^kt^{k-1}}{k!} + \dotsm & & \text{differentiate each term,}\\ &= a\left(1 + \frac{at}{1!} + \frac{a^2t^2}{2!} + \dotsm + \frac{a^kt^k}{k!} + \dotsm\right) & & \text{factor out scalar }a,\\ &= ae^{at} & & \text{substitute Taylor series}. \end{aligned} $$

A matrix exponential is defined analogously, \(e^{At} = \sum _{k=0}^\infty \frac {(At)^k}{k!}\), and we again show that the time derivative satisfies the vector relation \(\dot {\boldsymbol {x}}(t) = A\boldsymbol {x}(t)\):

$$\displaystyle \begin{aligned} \frac{d}{dt}e^{At} &= \frac{d}{dt}\left(I + \frac{At}{1!} + \frac{A^2t^2}{2!} + \frac{A^3t^3}{3!} + \dotsm + \frac{A^kt^k}{k!} + \dotsm\right) & & \text{Taylor series of }e^{At},\\ &= 0 + \frac{A}{1!} + 2\frac{A^2t}{2!} + 3\frac{A^3t^2}{3!} + \dotsm + k\frac{A^kt^{k-1}}{k!} + \dotsm & & \text{differentiate each term,}\\ &= A\left(I + \frac{At}{1!} + \frac{A^2t^2}{2!} + \dotsm + \frac{A^kt^k}{k!} + \dotsm\right) & & \text{factor out matrix }A,\\ &= Ae^{At} & & \text{substitute Taylor series}. \end{aligned} $$

Hence, we see that the following solution

$$\displaystyle \begin{aligned} \boldsymbol{x}(t) = e^{At}\boldsymbol{x}(0) \end{aligned} $$
(17.4)

satisfies our dynamic equation. Here, the matrix exponential e^{At} is called the state transition matrix, and Eq. 17.4 is called the impulse response of our system. Hence, we can find the state at any time T without solving for intermediate states 0 < t < T.

As an example in our linearized two-unit model, to find the state of our system at T = 2 given an initial state x(0) = [−0.3;−0.4], we can use software to numerically compute the matrix exponential and multiply by the initial state, following Eq. 17.4:

$$\displaystyle \begin{aligned} \boldsymbol{x}(2) = e^{2A}\boldsymbol{x}(0) = \begin{bmatrix} 0.1353 & 0.5413\\ 0 & 0.1353 \end{bmatrix} \begin{bmatrix} -0.3\\ -0.4 \end{bmatrix} = \begin{bmatrix} -0.2571\\ -0.0541 \end{bmatrix}, \end{aligned} $$

which agrees with the simulation results (Fig. 17.3).
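This computation is easy to reproduce; a sketch using SciPy's matrix exponential:

```python
import numpy as np
from scipy.linalg import expm

# Impulse response of the linearized two-unit system: x(2) = e^{2A} x(0),
# reproducing the worked example in the text.
A = np.array([[-1.0, 2.0],
              [ 0.0, -1.0]])
x0 = np.array([-0.3, -0.4])

x2 = expm(2 * A) @ x0
print(np.round(x2, 4))   # [-0.2571 -0.0541]
```

Because A is upper triangular here, e^{2A} can also be checked by hand: e^{At} = e^{−t}[[1, 2t], [0, 1]], which at t = 2 gives the matrix entries 0.1353 and 0.5413 shown above.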

17.3.2 Control Response

Next, we derive the system response from an initial state x(0) to some controlling input u(t) through some algebraic manipulation and calculus. We begin with our system equations \(\dot {\boldsymbol {x}}(t) - A\boldsymbol {x}(t) = B\boldsymbol {u}(t)\) and multiply both sides by a matrix exponential

$$\displaystyle \begin{aligned} e^{-At}\dot{\boldsymbol{x}}(t) - e^{-At}A\boldsymbol{x}(t) = e^{-At}B\boldsymbol{u}(t).\end{aligned} $$

Next, we see that the left-hand side is the result of a product rule where \(\frac {d}{dt}(e^{-At}\boldsymbol {x}(t)) = e^{-At}\dot {\boldsymbol {x}}(t) - Ae^{-At}\boldsymbol {x}(t)\), recalling that functions of matrices can switch orders of multiplication, such that Ae At = e AtA. Hence, we can write our equation as

$$\displaystyle \begin{aligned} \frac{d}{dt}(e^{-At}\boldsymbol{x}(t)) = e^{-At}B\boldsymbol{u}(t),\end{aligned} $$

and integrate both sides from t = 0 to t = T to yield

$$\displaystyle \begin{aligned} e^{-AT}\boldsymbol{x}(T) - \boldsymbol{x}(0) = \int_0^T e^{-At}B\boldsymbol{u}(t)dt. \end{aligned} $$

We note that the matrix exponential at t = 0 becomes e^{A⋅0} = I from the Taylor series. Next, we move the initial state x(0) to the right-hand side and multiply both sides by e^{AT}:

$$\displaystyle \begin{aligned} e^{AT}e^{-AT}\boldsymbol{x}(T) = e^{AT}\boldsymbol{x}(0) + e^{AT}\int_0^T e^{-At}B\boldsymbol{u}(t)dt. \end{aligned} $$

Finally, we use the fact that e^{AT} and e^{-AT} are inverses of each other, e^{AT}e^{-AT} = I, and we bring e^{AT} into the integral to derive the system’s response to control input:

$$\displaystyle \begin{aligned} \boldsymbol{x}(T) = \underbrace{e^{AT}\boldsymbol{x}(0)}_{\mathrm{natural}} + \underbrace{\int_0^T e^{A(T-t)}B\boldsymbol{u}(t)dt}_{\mathrm{controlled}}. \end{aligned} $$
(17.5)

Intuitively, we see that the first part of the response, e^{AT}x(0), is just the natural evolution of our system from an initial state, and that the second part of the response is a convolution of our mapped inputs, Bu(t), with the impulse response. We will next take advantage of the linearity of the convolution to draw powerful relations between the state evolution, control input, and system structure.
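Equation 17.5 can be checked numerically; the sketch below evaluates the natural and controlled terms for a constant input by quadrature and compares them against a direct Euler simulation of the linear system (step sizes are arbitrary choices):

```python
import numpy as np
from scipy.linalg import expm

# Check of Eq. 17.5 for the linearized two-unit system with constant
# input u(t) = 0.5: natural term plus convolution term should match a
# direct Euler simulation of dx/dt = Ax + Bu.
A = np.array([[-1.0, 2.0], [0.0, -1.0]])
B = np.array([[1.0], [0.0]])
x0 = np.array([-0.3, -0.4])
T, u = 2.0, 0.5

# Eq. 17.5: natural response plus convolution (trapezoid rule, 2000 panels)
n = 2000
ts = np.linspace(0.0, T, n + 1)
vals = np.stack([(expm(A * (T - t)) @ B).ravel() * u for t in ts])
vals[0] *= 0.5
vals[-1] *= 0.5
x_T = expm(A * T) @ x0 + vals.sum(axis=0) * (T / n)

# direct Euler simulation of the controlled linear system, for comparison
dt = 1e-4
x = x0.copy()
for _ in range(int(T / dt)):
    x = x + dt * (A @ x + B.ravel() * u)

print(x_T, x)   # the two agree to numerical precision
```

The agreement confirms that the closed-form response replaces step-by-step simulation: x(T) is obtained directly from A, B, x(0), and u(t).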

17.3.3 Linear Relation Between the Convolution and Control Input

Previously, we focused on the evolution of a neural system in response to a known control input u(t) in Eq. 17.5. However, our goal is to design a control input that drives our neural system to some desired final state that may stabilize an epileptic seizure [31], or aid in memory recall [32]. In this scenario, we fix the initial state x(0) = x 0 and the final state x(T) = x T as constants and rewrite Eq. 17.5 to move the variables u(t) to the left-hand side and the constants to the right-hand side:

$$\displaystyle \begin{aligned} \int_0^T e^{A(T-t)}B\underbrace{\boldsymbol{u}(t)}_{\mathrm{variable}}dt = \underbrace{\boldsymbol{x}(T) - e^{AT}\boldsymbol{x}(0)}_{\mathrm{constant}}. \end{aligned} $$

This formulation is a linear equation with a structure that is similar to a typical system of linear equations used in regression, Mv = b, where v is the variable, b is a constant vector, and matrix M is the linear function acting on v. Here, the control input u(t) is the variable, x(T) − e ATx(0) is the constant vector, and the convolution

$$\displaystyle \begin{aligned} \mathcal{L}(\boldsymbol{u}(t)) = \int_0^T e^{A(T-t)}B\boldsymbol{u}(t)dt \end{aligned} $$

is the linear function acting on our control inputs. By linear function, we mean that for two control inputs u 1(t) and u 2(t), if \(\mathcal {L}(\boldsymbol {u}_1(t)) = \boldsymbol {c}_1\), and \(\mathcal {L}(\boldsymbol {u}_2(t)) = \boldsymbol {c}_2\), then a weighted sum of inputs yields the same weighted sum of outputs, such that

$$\displaystyle \begin{aligned} \mathcal{L}(a\boldsymbol{u}_1(t) + b\boldsymbol{u}_2(t)) = a\boldsymbol{c}_1 + b\boldsymbol{c}_2. \end{aligned} $$
(17.6)

This linearity allows us to treat solutions to our control problem in the same way as solutions to our linear system of equations. Specifically, suppose the control input \(\boldsymbol{u}^*(t)\) is a particular solution to our control problem such that \(\mathcal {L}(\boldsymbol {u}^*(t)) = \boldsymbol {x}_T - e^{AT}\boldsymbol {x}_0\). Further, suppose that inputs \(\boldsymbol{u}_1(t), \boldsymbol{u}_2(t), \dotsc \) are homogeneous solutions such that \(\mathcal {L}(\boldsymbol {u}_i(t)) = \mathbf {0}\). If we construct a control input that is the particular solution added to a weighted sum of homogeneous solutions

$$\displaystyle \begin{aligned} \boldsymbol{u}(t) = \underbrace{\boldsymbol{u}^*(t)}_{\mathrm{particular}} + \sum_i a_i\underbrace{\boldsymbol{u}_i(t)}_{\mathrm{homogeneous}}, \end{aligned} $$

then the convolution of this combined input yields the desired output:

$$\displaystyle \begin{aligned} \mathcal{L}(\boldsymbol{u}(t)) &= \mathcal{L}\left(\boldsymbol{u}^*(t) + \sum_i a_i\boldsymbol{u}_i(t)\right)\\ &= \mathcal{L}(\boldsymbol{u}^*(t)) + \sum_i \mathcal{L}(a_i\boldsymbol{u}_i(t))\\ &= \boldsymbol{x}_T - e^{AT}\boldsymbol{x}_0 + \sum_i a_i\mathbf{0}\\ &= \boldsymbol{x}_T - e^{AT}\boldsymbol{x}_0. \end{aligned} $$

Hence, if we have a particular control input \(\boldsymbol{u}^*(t)\) that drives our system to a desired final state, then the homogeneous control inputs \(\boldsymbol{u}_i(t)\) give us the flexibility to design less costly, more energy-efficient inputs.
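The linearity property in Eq. 17.6 can also be verified numerically by applying a discretized version of \(\mathcal L\) to two arbitrary inputs; the system matrices and the inputs below are illustrative choices of our own:

```python
import numpy as np
from scipy.linalg import expm

# Numerical check of the linearity property in Eq. 17.6; the system
# matrices and the two inputs are illustrative choices.
A = np.array([[-1.0, 2.0], [0.0, -1.0]])
B = np.array([[1.0], [0.0]])
T, dt = 1.0, 1e-3
ts = np.arange(0.0, T, dt)

def L(u):
    """Riemann-sum approximation of L(u) = int_0^T e^{A(T-t)} B u(t) dt."""
    return sum(expm(A * (T - t)) @ B @ u(t) * dt for t in ts)

u1 = lambda t: np.array([np.sin(t)])
u2 = lambda t: np.array([t ** 2])
a, b = 2.0, -3.0

lhs = L(lambda t: a * u1(t) + b * u2(t))
rhs = a * L(u1) + b * L(u2)
print(np.allclose(lhs, rhs))  # True: weighted sum in, weighted sum out
```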

17.3.4 Controllability

For any system, we would first like to know whether a particular solution to the control problem described above exists. A system is controllable if there is a control input that brings the system from any initial state to any final state in finite time. For nonlinear systems, even if we know that the input \(\boldsymbol{u}^*(t)\) brings our system from the initial state \(\mathbf{0}\) to some final state \(\boldsymbol{x}_T\), there is in general no way to know what input will take our system to a scaled final state \(a\boldsymbol{x}_T\).

In contrast, due to the linearity of our convolution operator, we know that a scaled input \(a\boldsymbol{u}^*(t)\) will produce a scaled output \(\mathcal {L}(a\boldsymbol {u}^*(t)) = a\boldsymbol {x}_T\). Further, any N-dimensional vector can be written as a weighted sum of N linearly independent vectors \(\boldsymbol{v}_1, \boldsymbol{v}_2, \dotsc , \boldsymbol{v}_N\). Here, linear independence means that no vector \(\boldsymbol{v}_i\) in the set can be written as a weighted sum of the remaining vectors \(\boldsymbol{v}_{j\neq i}\). For example, a column vector \(\boldsymbol{a} = [a_1; a_2; \dotsc ; a_N]\) can be written as the weighted sum

$$\displaystyle \begin{aligned} \underbrace{\begin{bmatrix} a_1\\ a_2\\ \vdots\\ a_N \end{bmatrix}}_{\boldsymbol{a}} = a_1 \underbrace{\begin{bmatrix} 1\\0\\\vdots\\0 \end{bmatrix}}_{\boldsymbol{v}_1} + a_2 \underbrace{\begin{bmatrix} 0\\1\\\vdots\\0\\ \end{bmatrix}}_{\boldsymbol{v}_2} + \dotsm + a_N \underbrace{\begin{bmatrix} 0\\0\\\vdots\\1 \end{bmatrix}}_{\boldsymbol{v}_N}, \end{aligned} $$

where none of the vectors \(\boldsymbol{v}_i\) can be written as a weighted sum of the remaining vectors \(\boldsymbol{v}_{j\neq i}\). Hence, our system is controllable if we can find input functions \(\boldsymbol{u}_1(t), \dotsc , \boldsymbol{u}_N(t)\) that reach N linearly independent vectors \(\mathcal {L}(\boldsymbol {u}_1(t)), \dotsm , \mathcal {L}(\boldsymbol {u}_N(t))\), because then we can always reach any final state from any initial state through the weighted sum

$$\displaystyle \begin{aligned} \underbrace{\boldsymbol{x}_T - e^{AT}\boldsymbol{x}_0}_{\boldsymbol{a}} = a_1 \underbrace{\mathcal L(\boldsymbol{u}_1(t))}_{\boldsymbol{v}_1} + a_2 \underbrace{\mathcal L(\boldsymbol{u}_2(t))}_{\boldsymbol{v}_2} + \dotsm + a_N \underbrace{\mathcal L(\boldsymbol{u}_N(t))}_{\boldsymbol{v}_N}, \end{aligned} $$

through the control input \(\boldsymbol{u}(t) = a_1\boldsymbol{u}_1(t) + a_2\boldsymbol{u}_2(t) + \dotsb + a_N\boldsymbol{u}_N(t)\). This information about reachable states is encoded in the controllability matrix

$$\displaystyle \begin{aligned} \mathcal{C} = \begin{bmatrix} B, & AB, & A^2B, & \dotsm, & A^{N-1}B \end{bmatrix}, \end{aligned} $$
(17.7)

where the rank of this matrix (given by the number of linearly independent columns of \(\mathcal C\)) tells us how many of these N independent vectors can be reached using control input. If this rank equals N, then the system is controllable and can reach all states. However, even if the rank is less than N, there still exists a control input that drives the system from \(\boldsymbol{x}_0\) to \(\boldsymbol{x}_T\) if the vector \(\boldsymbol{x}_T - e^{AT}\boldsymbol{x}_0\) can be written as a weighted sum of the columns of \(\mathcal C\). The set of vectors spanned by the columns of \(\mathcal C\) is called the controllable subspace, and the remaining set of vectors the uncontrollable subspace.

As an example in our linearized two-unit system, A, B, and \(\mathcal C\) are written as

$$\displaystyle \begin{aligned} A &= \begin{bmatrix} -1 & 2\\ 0 & -1 \end{bmatrix},\quad B = \begin{bmatrix} 1\\ 0 \end{bmatrix},\\ \mathcal C &= \begin{bmatrix} B, AB \end{bmatrix} = \begin{bmatrix} 1 & -1\\ 0 & 0 \end{bmatrix}, \end{aligned} $$

which is not controllable, because the rank of \(\mathcal C\) is 1. To characterize the controllable subspace, notice that the columns of \(\mathcal C\) have non-zero entries only in the first row. Hence, the controllable subspace contains any desired value of \(x_1(T)\), but excludes all values of \(x_2(T)\). Intuitively, this loss of controllability arises because \(x_2\) neither receives an input nor is affected by \(x_1\). Hence, there is no way to influence the activity of \(x_2\) in a desired way.
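A minimal sketch of this computation constructs the controllability matrix of Eq. 17.7 for the two-unit example and confirms that its rank is 1 (the helper function `ctrb` is our own naming):

```python
import numpy as np

def ctrb(A, B):
    """Controllability matrix [B, AB, ..., A^{N-1}B] from Eq. 17.7."""
    blocks = [B]
    for _ in range(A.shape[0] - 1):
        blocks.append(A @ blocks[-1])
    return np.hstack(blocks)

# The linearized two-unit example from the text
A = np.array([[-1.0, 2.0], [0.0, -1.0]])
B = np.array([[1.0], [0.0]])
C = ctrb(A, B)
print(C)                         # [[ 1. -1.] [ 0.  0.]]
print(np.linalg.matrix_rank(C))  # 1, so the system is not controllable
```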

17.3.5 Minimum Energy Control

Once we know a system is controllable, we would like to determine the control input function \(\boldsymbol{u}(t)\) that transitions our system from an initial state \(\boldsymbol{x}_0\) to a final state \(\boldsymbol{x}_T\). However, there are often limitations on the input magnitude, such as electrical and thermal damage of neural tissue or the battery life of chronically implanted stimulators. Due to the system’s linearity, we can find not just any input function but an optimal one, \(\boldsymbol{u}^*(t)\), that minimizes the input cost.

First, we must define a measure of the size of our control input functions \(\boldsymbol{u}(t)\). In many applications of electrical stimulation, the cost of control scales quadratically with the input, as with resistive heating. This quadratic measure of size is mathematically and intuitively defined using the inner product. For N-dimensional column vectors of numbers, \(\boldsymbol{a}\), the inner product is the well-known dot product

$$\displaystyle \begin{aligned} <\boldsymbol{a},\boldsymbol{a}> = a_1^2 + a_2^2 + \dotsm + a_N^2 = \boldsymbol{a}^\top\boldsymbol{a}, \end{aligned} $$

where \(\boldsymbol{a}^\top \) is the transpose that turns the column vector \(\boldsymbol{a}\) into a row vector. We see that doubling \(\boldsymbol{a}\) will quadruple the inner product. For N-dimensional column vectors of functions \(\boldsymbol{a}(t)\) from time t = 0 to t = T, the inner product is similarly defined as

$$\displaystyle \begin{aligned} <\boldsymbol{a}(t),\boldsymbol{a}(t)> = \int_0^T a_1^2(t) + a_2^2(t) + \dotsm + a_N^2(t)dt = \int_0^T \boldsymbol{a}(t)^\top\boldsymbol{a}(t)dt \end{aligned} $$

which has the same quadratic relation. Hence, we define the control energy as

$$\displaystyle \begin{aligned} E = <\boldsymbol{u}(t),\boldsymbol{u}(t)>. \end{aligned} $$
(17.8)
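As a quick numerical sanity check of Eq. 17.8, the energy of an illustrative scalar input u(t) = sin(t) (our own choice) can be approximated by quadrature and compared against the closed-form integral:

```python
import numpy as np

# Quadrature sketch of the control energy in Eq. 17.8 for an
# illustrative scalar input u(t) = sin(t) on [0, T]
T, dt = 1.0, 1e-4
ts = np.arange(0.0, T, dt)
u = lambda t: np.array([np.sin(t)])

E = sum(u(t) @ u(t) * dt for t in ts)          # E = <u(t), u(t)>
closed_form = T / 2 - np.sin(2 * T) / 4        # int_0^T sin^2(t) dt
print(np.isclose(E, closed_form, atol=1e-3))   # True
```

Doubling u(t) quadruples E, matching the quadratic scaling discussed above.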

Now that we have a measure of how large an input is, we wish to find the input \(\boldsymbol{u}^*(t)\) that minimizes the control energy. This task is analogous to a typical linear system of equations, \(M\boldsymbol{v} = \boldsymbol{b}\), where we want to find the \(\boldsymbol{v}^*\) that solves the equation with the smallest cost \(<\boldsymbol{v}^*,\boldsymbol{v}^*>\). If \(M\) has full row rank (i.e., the rows of \(M\) are linearly independent), then the minimum solution is given by the least-squares equation \(\boldsymbol{v}^* = M^\top (MM^\top )^{-1}\boldsymbol{b}\). Here, \(M^\top \) is the transpose, or adjoint, of \(M\).

This same principle holds for our linear system \(\mathcal L(\boldsymbol {u}(t)) = \boldsymbol {x}_T - e^{AT}\boldsymbol {x}_0\), where we want to find the \(\boldsymbol{u}^*(t)\) that solves the equation with the smallest cost \(<\boldsymbol{u}^*(t), \boldsymbol{u}^*(t)>\). However, while the matrix \(M\) takes as input a vector of numbers \(\boldsymbol{v}\) and outputs a vector of numbers \(\boldsymbol{b}\), our linear function \(\mathcal L\) takes as input a vector of functions and outputs a vector of numbers. Hence, we need to carefully define the adjoint of \(\mathcal L\); because \(\mathcal L\) is not a finite matrix, we cannot use \(\mathcal L^\top \) to denote the adjoint. Instead, we will use \(\mathcal L^*\) to denote the adjoint of \(\mathcal L\). In the case of the matrix \(M\), the adjoint preserves the inner product between inputs and outputs such that

$$\displaystyle \begin{aligned} <M\boldsymbol{v},\boldsymbol{b}>&= <\boldsymbol{v},M^\top\boldsymbol{b}>\\ (M\boldsymbol{v})^\top\boldsymbol{b} &= \boldsymbol{v}^\top (M^\top\boldsymbol{b}). \end{aligned} $$

Analogously, for the state transition \(\boldsymbol{x} = \boldsymbol{x}_T - e^{AT}\boldsymbol{x}_0\), the adjoint of \(\mathcal L\) preserves the inner product between the vector of input functions \(\boldsymbol{u}(t)\) and the vector of output numbers \(\boldsymbol{x}\) as

$$\displaystyle \begin{aligned} <\mathcal L(\boldsymbol{u}(t)), \boldsymbol{x}> &= <\boldsymbol{u}(t), \mathcal L^*(\boldsymbol{x})>\\ \left(\int_0^T e^{A(T-t)}B\boldsymbol{u}(t)dt\right)^\top \boldsymbol{x} &= \int_0^T \boldsymbol{u}^\top(t)(B^\top e^{A^\top(T-t)}\boldsymbol{x})dt. \end{aligned} $$

Notice that the inner product on the left is over vectors of numbers, while the inner product on the right is over vectors of functions. Then, we see that our adjoint is

$$\displaystyle \begin{aligned} \mathcal L^*(\boldsymbol{x}) = B^\top e^{A^\top(T-t)}\boldsymbol{x} \end{aligned} $$

and takes as input a vector of numbers and outputs a vector of functions. Then, just as for our system \(M\boldsymbol{v} = \boldsymbol{b}\), the minimum input \(\boldsymbol{u}^*(t)\) is given by

$$\displaystyle \begin{aligned} \boldsymbol{u}^*(t) = \mathcal L^*(\mathcal L \mathcal L^*)^{-1}(\boldsymbol{x}_T - e^{AT}\boldsymbol{x}_0). \end{aligned} $$
(17.9)
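The adjoint identity \(<\mathcal L(\boldsymbol{u}(t)), \boldsymbol{x}> = <\boldsymbol{u}(t), \mathcal L^*(\boldsymbol{x})>\) can itself be checked numerically; the system, the input, and the vector \(\boldsymbol{x}\) below are illustrative choices of our own:

```python
import numpy as np
from scipy.linalg import expm

# Numerical check that the adjoint preserves the inner product,
# <L(u), x> = <u, L*(x)>; A, B, u, and x are illustrative choices.
A = np.array([[-1.0, 2.0], [0.0, -1.0]])
B = np.array([[1.0], [0.0]])
T, dt = 1.0, 1e-3
ts = np.arange(0.0, T, dt)

u = lambda t: np.array([np.cos(t)])
x = np.array([1.0, -2.0])

Lu = sum(expm(A * (T - t)) @ B @ u(t) * dt for t in ts)   # vector of numbers
Lstar = lambda t: B.T @ expm(A.T * (T - t)) @ x           # vector of functions

lhs = Lu @ x                                  # inner product of number vectors
rhs = sum(u(t) @ Lstar(t) * dt for t in ts)   # inner product of function vectors
print(np.isclose(lhs, rhs))  # True
```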

Finally, through substitution into Eq. 17.8, we can write the minimum control energy as

$$\displaystyle \begin{aligned} E_{\mathrm{min}} = (\boldsymbol{x}_T-e^{AT}\boldsymbol{x}_0)^\top (\mathcal L \mathcal L^*)^{-1} (\boldsymbol{x}_T-e^{AT}\boldsymbol{x}_0). \end{aligned} $$
(17.10)

In conclusion, we point out a crucially important term in the minimum energy, \(\mathcal L \mathcal L^*\), known as the controllability Gramian, written as

$$\displaystyle \begin{aligned} W_c(T) = \mathcal L \mathcal L^* = \int_0^T e^{A(T-t)}BB^\top e^{A^\top(T-t)}dt. \end{aligned} $$
(17.11)

First, we notice that this Gramian is a function only of the underlying neural relationships, A; the matrix determining where the inputs are placed, B; and the time T. Next, we notice that \(W_c(T)\) is actually an N × N matrix and can therefore be numerically evaluated and analytically studied. Finally, we see that if our system begins at an initial state of \(\boldsymbol{x}_0 = \mathbf{0}\), then the minimum energy can be written as

$$\displaystyle \begin{aligned} E_{\mathrm{min}} = \boldsymbol{x}_T^\top W_c^{-1}(T) \boldsymbol{x}_T, \end{aligned} $$

where the role of neural interactions and stimulation parameters in our ability to control the system is fully encapsulated in the Gramian. This ability to decouple the states \(\boldsymbol{x}_T\) from the neural interactions and stimulation parameters A, B, and T is a powerful tool for studying and designing the control properties of neural systems.
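Putting Eqs. 17.9–17.11 together, a minimal numerical sketch builds the Gramian by quadrature, forms the minimum-energy input, and verifies that it drives the system to the target state. The two-node system with inputs to both nodes is an illustrative choice of our own, made so that \(W_c\) is invertible:

```python
import numpy as np
from scipy.linalg import expm

# Sketch of Eqs. 17.9-17.11: build the Gramian by quadrature, form the
# minimum-energy input, and verify it reaches the target. The two-node
# system with inputs to both nodes is an illustrative choice.
A = np.array([[-1.0, 2.0], [0.0, -1.0]])
B = np.eye(2)          # inputs to both nodes, so W_c is invertible
T, dt = 1.0, 1e-3
ts = np.arange(0.0, T, dt)

# Controllability Gramian W_c(T) = int_0^T e^{A(T-t)} B B^T e^{A^T(T-t)} dt
W = sum(expm(A * (T - t)) @ B @ B.T @ expm(A.T * (T - t)) * dt for t in ts)

x0 = np.zeros(2)
xT = np.array([1.0, 2.0])
v = np.linalg.solve(W, xT - expm(A * T) @ x0)  # W_c^{-1}(x_T - e^{AT}x_0)

# Minimum-energy input u*(t) = B^T e^{A^T(T-t)} W_c^{-1}(x_T - e^{AT}x_0)
u_star = lambda t: B.T @ expm(A.T * (T - t)) @ v

# Minimum energy E_min = (x_T - e^{AT}x_0)^T W_c^{-1} (x_T - e^{AT}x_0)
E_min = (xT - expm(A * T) @ x0) @ v

# Drive the system with u*(t): the response lands on x_T
reached = expm(A * T) @ x0 + sum(expm(A * (T - t)) @ B @ u_star(t) * dt for t in ts)
print(np.allclose(reached, xT))  # True
```

Because the same quadrature grid is used for the Gramian and the verification, the final state matches the target to floating-point precision; with an independent integrator, the agreement would be limited by the discretization error.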

17.4 Mapping Network Architecture to Control Properties

By formulating our neural system in a linear way, we can solve difficult problems – such as predicting the system’s response to control, finding the set of states that the system can reach, and designing efficient input stimuli – without the need to try every control input and simulate every trajectory. Further, by directly mapping control properties to neural activity and network architecture in an algebraic way, we can study how features of interaction patterns impact our ability to control neural activity [8]. As an active area of research, the variety of questions being asked and systems being studied is very large and requires simultaneous innovations in experiment, computation, and theory. In this section, we describe a few recent applications and advances.

17.4.1 Neuronal Control in Model Organisms

While most neural systems are too large to empirically measure activity and connectivity or to analyze numerically, there do exist a few sufficiently simple model organisms. Among these is the worm Caenorhabditis elegans [33] with several hundred neurons that can be recorded from simultaneously [34]. Even for such a small system, it is difficult to map the functional form of how activity in neuron i affects the activity in neuron j. However, the presence or absence of connections between neurons in this organism, and by consequence the presence or absence of elements in the connectivity matrix A, is well known.

Advances in the study of structural controllability [35] allow us to ask questions about our ability to control a system given only the binary presence or absence of edges. Colloquially, this framework focuses on connectivity matrices A whose non-zero entries can exist only where a binary edge is present, and it can be used to determine whether the system is controllable for almost all numerical values of those entries. Using this framework, recent work has sought to determine whether the removal of certain neurons in C. elegans reduces structural controllability [36]. Specifically, the model treats input to the sensory receptor neurons as the control input, mapped to the system through a matrix B, and encodes the connectivity between neurons and muscle cells in a matrix A. Further, instead of recording the activity of each neuron, the motion of muscles was recorded. This setup involves the appended control framework

$$\displaystyle \begin{aligned} \dot{\boldsymbol{x}}(t) &= A\boldsymbol{x}(t) + B\boldsymbol{u}(t)\\ \boldsymbol{y}(t) &= C\boldsymbol{x}(t), \end{aligned} $$

where y(t) represents the states (muscles) that are measured and C is the map from neurons and muscles x(t) to the measured output [37]. Here, the authors find that the ablation of a neuron not previously implicated in motion, PDB, decreased structural controllability, significantly reducing ventral bias in deep body bends in C. elegans.

17.4.2 State Transitions in the Human Brain

While neuron-level structural synapses map most directly to functional relationships between neurons, there are also well-characterized structural connections between larger-scale brain regions. These connections contain thick bundles of myelinated axonal fibers that run throughout the brain and are thought to play a crucial role in coupling the activity of distant brain regions [38]. These fibers are resolved by measuring water diffusion throughout the brain using magnetic resonance [39] and tracing fibers along this diffusion field using computational algorithms [30]. The whole brain is typically divided into hundreds to thousands of discrete brain regions using a variety of parcellation schemes [40, 41], and the strength of fibers between these regions comprises the connectivity matrix A [42].

Such region-level study of brain dynamics has led to the discovery of macroscopic functional organization in the human brain at rest [43] and during various cognitively demanding tasks [44]. Here, brain activity can be empirically measured through methods such as magnetic resonance imaging (blood oxygen level dependent) or electrophysiology (aggregate electrical activity). Of particular interest are large-scale functional brain networks that display stereotyped changes in activity patterns during tasks that demand certain cognitive or sensorimotor processes [45]. Here, it is thought that the brain uses underlying structural connections to support circuit-level coordination, as well as to guide itself to specific patterns of activity using cognitive control [46, 47].

Recent work has begun formulating cognitive control as a linear systems problem [46, 48,49,50,51], where the matrix A is the network of white matter connections between brain regions, B represents the regions chosen to be responsible for control, and x(t) represents the activity of each region over time. Specifically, in [48, 50], the authors quantify cognitive states as vectors corresponding to activity in the brain regions during cognitive tasks and compute the minimum control energy (Eq. 17.10) needed to transition between cognitive states for various sets of control regions. Colloquially, if a set of regions requires less input energy to transition between cognitive states, then those regions – were they responsible for cognitive control – could easily transition the whole brain between these states along an optimal trajectory. Moreover, individual differences in the minimal control energy are correlated with individual differences in performance on cognitive control tasks [52]. In complementary studies, individual differences in controllability statistics calculated for distinct regions of the brain are correlated with individual differences in measures of cognitive control assessed with common neuropsychological test batteries [49, 51].

17.5 Methodological Considerations and Limitations

While the theory of linear systems is a powerful quantitative framework for studying and controlling dynamical neural systems, there are several important caveats. Here we mention three: dimensionality and numerical stability, model validation and experimental data, and the assumption of linearity.

17.5.1 Dimensionality and Numerical Stability

The benefit of studying linear systems is that we take difficult and largely intractable questions of controllability and control input design and greatly simplify them into algebraic problems of computing objects like the controllability matrix Eq. 17.7 and the controllability Gramian Eq. 17.11. However, these matrices scale quadratically with the number of neural units, and numerical calculations and manipulations using these matrices quickly face computational issues.

Most viable approaches to dealing with these issues involve numerically representing the elements of our matrices and performing algebraic operations. However, these representations are imperfect, as it is impossible to exactly represent irrational numbers such as π. Hence, the matrices are truncated to numerical precision, and this truncation error propagates with each computation. Further, the propagation of error tends to scale faster than the number of dimensions. This issue is prevalent in the computation of the state-transition matrix [53], as well as in the calculation of the controllability Gramian and its inverse. With the application of this theory to high-dimensional neural systems, the study of useful controllability metrics is an active area of research [54].
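As a small illustration of this conditioning problem (the random network below is our own construction, not an example from the text), the Gramian of even a modest system driven from a single node can be nearly singular, so inverting it amplifies truncation error:

```python
import numpy as np
from scipy.linalg import expm

# Illustration of the conditioning issue (the random network here is our
# own construction): with a single control node, the Gramian of even a
# 20-node system is nearly singular, so its inverse amplifies error.
rng = np.random.default_rng(0)
N = 20
A = rng.normal(0.0, 1.0 / np.sqrt(N), (N, N)) - 1.5 * np.eye(N)  # stable random network
B = np.zeros((N, 1))
B[0, 0] = 1.0          # control input to a single node
T, dt = 1.0, 1e-2
ts = np.arange(0.0, T, dt)

W = sum(expm(A * (T - t)) @ B @ B.T @ expm(A.T * (T - t)) * dt for t in ts)
print(np.linalg.cond(W))  # enormous: small truncation errors blow up in W^{-1}
```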

17.5.2 Model Validation and Experimental Data

A fundamental limitation for modeling any neural system is the ability to empirically and accurately measure model parameters and variables. A crucial parameter is the network of connectivity encoded by our adjacency matrix A, where the element in the i-th column and j-th row models the effect of unit i on the rate of change of unit j. While we typically use the structural connections in synapses between neurons, or bundles of axons between brain regions as a proxy for A, it is very difficult to measure the true functional effect that activity in unit i has on activity in unit j, particularly for large systems. This problem is exacerbated by further methodological limitations such as the inability to resolve directionality of connections in diffusion tractography. Along these lines, many statistical and autoregressive methods have been developed to infer functional relationships from recordings of neural activity [55,56,57,58,59] and to use that inferred activity to better understand control [60]. However, the degree of causality in these methods as measured by true response to external stimuli remains controversial.

Another such fundamental limitation is our inability to fully measure every state of the system. The state-space representation of our model requires that every state is observed. However, it is impossible to simultaneously record the activity of every neuron in almost all biological systems, although this recording has been achieved in sufficiently simple organisms [34]. As a result of only being able to observe a small subset of the full state-space, these models of interactions may become largely descriptive and phenomenological in nature. In response, there is a continuing effort to improve the spatial and temporal resolution of neuroimaging methods [61].

17.5.3 Assumption of Linearity

An inherent limitation is the lack of generality in our linear approximation of the full nonlinear neural dynamics. In response, there is a sizable quantity of research studying the control properties of nonlinear dynamical systems [62]. An interesting bridge between these two disciplines exists in the theory of the Koopman or composition operator [63]. The underlying benefit of this theory is that, while our system of equations may evolve nonlinearly in time given the current set of N states, there may exist a higher-dimensional set of M > N state variables in which the dynamical system does evolve linearly [64]. While the extension of linear systems theory to actually controlling this higher-dimensional system may be limited, it remains a promising future area of research.

17.6 Open Frontiers

Many exciting and open frontiers exist in the study of brain network dynamics using linear systems theory. Here we constrain our remarks to three main topic areas, but freely admit that this discussion is far from comprehensive. First, we describe opportunities in the further development of useful controllability statistics as well as in the development of foundational theory linking control profiles to the system’s underlying network architecture. Second, we underscore the need for a better understanding of how control is implemented in the brain, how control strategies might depend on context, and how control processes could facilitate the effective manipulation of information. Third, we describe the relevance of the modeling efforts we discussed here for our understanding of neurological disease and psychiatric disorders as well as the development of personalized and targeted therapeutic interventions for alterations in mental health.

17.6.1 Theory and Statistics

Linear systems theory has its basis in a rich literature stemming from now well-developed areas of mathematics, physics, and engineering [65]. Yet, much is still unknown about exactly how the network topology of a given unit-to-unit interaction pattern impacts the capacity for control, the trajectories accessible to the systems, and the minimum control energy. Some preliminary efforts have begun to make headway by using linear network control theory to derive accurate closed-form expressions that relate the connectivity of a subset of structural connections (those linking driver nodes to non-driver nodes) to the minimum energy required to control networked systems [66]. Further work is needed to gain an intuition for the role of higher-order structures (e.g., cycles) in the control of the networked system and any dependence on edge directionality [67]. Moreover, it would be fruitful in the future to further develop a broader set of controllability statistics, extending beyond node controllability [54], and edge controllability [68], to the control of motifs [69]. Finally, throughout such investigations, it will be useful to understand which features of control are shared across networks with various topologies, versus those features which are specific to networks with a particular topology [70,71,72].

17.6.2 Context, Computations, and Information Processing

Despite the emerging appreciation that linear systems theory has considerable utility in the study of cognitive function, we still know very little about exactly how control is implemented in the brain, across spatial scales, and capitalizing on the unit-to-unit interaction patterns at each of those scales. Some initial evidence suggests that features of synaptic connectivity – and particularly autaptic connections – can serve to tune the excitability of the neural circuit, altering its controllability profile and propensity to display synchronous bursts of activity [73]. Complementary evidence also at the cellular scale demonstrates how intrinsic network structure and exogenous stimulus patterns together determine the manner in which a stimulus propagates through the network, with important implications for cognitive faculties that require persistent activation of neuronal patterns such as working memory and attention [74]. There are interesting similarities between these observations and evidence at larger spatial scales, which suggests that the architecture of white matter tracts connecting brain areas can be used to infer the probability with which the brain persists in certain states [75]. Such conceptual similarities motivate concerted efforts to better understand how the architecture of brain networks across spatial scales supports information processing and cognitive computations and how those processes and computations might depend on the context in which the brain is placed. Formally, it would be interesting to consider context as a form of exogenous input to the system, in a manner reminiscent of how we currently consider brain stimulation [8]. We speculate that such a formulation of the problem could help to explain a range of observations, such as the ability of cognitive effort to suppress epileptic activity [76].

17.6.3 Disease and Intervention

The fact that controllability can depend on network topology [66, 70] and can be altered by edge pruning [77] suggests that it might also be a useful biomarker in some neurological diseases and psychiatric disorders, many of which are associated with changes in the structural topology of neural circuitry at various spatial scales [6, 7]. Indeed, recent studies have reported differences in controllability statistics estimated in brain networks of patients with bipolar disorder [78], temporal lobe epilepsy [79], and mild traumatic brain injury [50]. In a complementary line of work, studies are beginning to ask whether the altered controllability profiles of brain networks in these patients could help to inform the development of more targeted interventions for their illness, in the form of brain stimulation [31, 80], pharmacological agents, or cognitive behavioral therapy. Other efforts have begun to consider symptoms of a given disease as a network and to identify symptoms predicted to have high impulse response in the patient’s daily life [81]. It would be interesting in the future to determine whether the linear systems approach could be useful in more carefully formalizing that problem as a network control problem, which in turn could be used to determine which symptom to treat in order to move the entire symptom network toward a healthier state [82].

17.7 Homework

  1.

    Linearize the following system about the point \(x_1^* = 1, x_2^* = -1, x_3^* = 0\),

    $$\displaystyle \begin{aligned} \begin{bmatrix} \dot{x}_1(t)\\ \dot{x}_2(t)\\ \dot{x}_3(t) \end{bmatrix} = \begin{bmatrix} -x_1^2(t) - 2x_2(t) + x_3(t) - 1\\ 2x_1(t) - 2x_2^2(t) + 2x_3(t)\\ x_1(t)x_2(t) - x_3(t) + 1 \end{bmatrix}. \end{aligned} $$

    Then demonstrate that this point is a fixed point where \(\dot {x}_1 = \dot {x}_2 = \dot {x}_3 = 0\).

  2.

    Prove that the matrix exponential of \(A = \begin {bmatrix}a & 0\\ 0 & b\end {bmatrix}\) is

    $$\displaystyle \begin{aligned} e^{A} = \begin{bmatrix} e^a & 0\\ 0 & e^b \end{bmatrix}, \end{aligned} $$

    using the Taylor series of the scalar and matrix exponentials.

  3.

    Prove that the system response to control

    $$\displaystyle \begin{aligned} \boldsymbol{x}(t) = e^{At}\boldsymbol{x}_0 + \int_0^t e^{A(t-\tau)}B\boldsymbol{u}(\tau)d\tau \end{aligned} $$

    satisfies the dynamical equation \(\dot {\boldsymbol {x}}(t) = A\boldsymbol {x}(t) + B\boldsymbol {u}(t)\) by substitution.

  4.

    Prove that the convolution operator

    $$\displaystyle \begin{aligned} \mathcal L(\boldsymbol{u}(t)) = \int_0^Te^{A(T-\tau)}B\boldsymbol{u}(\tau)d\tau \end{aligned} $$

    is linear according to Eq. 17.6; that is, if \(\mathcal L(\boldsymbol {u}_1(t)) = \boldsymbol {c}_1\), and \(\mathcal L(\boldsymbol {u}_2(t)) = \boldsymbol {c}_2\), then demonstrate that \(\mathcal L(a\boldsymbol {u}_1(t) + b\boldsymbol {u}_2(t)) = a\boldsymbol {c}_1 + b\boldsymbol {c}_2\).

  5.

    Determine if the following system is controllable

    $$\displaystyle \begin{aligned} \begin{bmatrix} \dot{x}_1(t)\\ \dot{x}_2(t)\\ \dot{x}_3(t) \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0\\ 0 & 0 & 1\\ 1 & 0 & 0 \end{bmatrix} \begin{bmatrix} x_1(t)\\ x_2(t)\\ x_3(t) \end{bmatrix} + \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} u(t), \end{aligned} $$

    by constructing the controllability matrix.

  6.

    Determine for what value of a the system is not controllable

    $$\displaystyle \begin{aligned} \begin{bmatrix} \dot{x}_1(t)\\ \dot{x}_2(t)\\ \dot{x}_3(t) \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0\\ 1 & 1 & 0\\ 1 & 0 & a \end{bmatrix} \begin{bmatrix} x_1(t)\\ x_2(t)\\ x_3(t) \end{bmatrix} + \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} u(t), \end{aligned} $$

    by constructing the controllability matrix.

  7.

    Derive the minimum energy equation Eq. 17.10

    $$\displaystyle \begin{aligned} E_{\mathrm{min}} = (\boldsymbol{x}_T {-} e^{AT}\boldsymbol{x}_0)^\top (\mathcal L \mathcal L^*)^{-1} (\boldsymbol{x}_T {-} e^{AT}\boldsymbol{x}_0), \end{aligned} $$

    by substituting the minimum input u (t) into the control energy Eq. 17.8

    $$\displaystyle \begin{aligned} E = <\boldsymbol{u}(t),\boldsymbol{u}(t)>. \end{aligned} $$
  8.

    Show that the controllability Gramian can be written as

    $$\displaystyle \begin{aligned} W_C(T) &= \int_0^T e^{A(T-t)}BB^\top e^{A^\top(T-t)}dt\\ & = \int_0^T e^{A\tau}BB^\top e^{A^\top\tau}d\tau, \end{aligned} $$

    using the substitution τ = T − t.

  9.

    Show that the controllability Gramian for system

    is

    $$\displaystyle \begin{aligned} W_C(T) = \begin{bmatrix} \frac{1}{2a}\left(e^{2aT}-1\right) & 0\\ 0 & \frac{1}{2b}\left(e^{2bT}-1\right) \end{bmatrix} \end{aligned} $$
  10.

    Compute the minimum energy required for the system

    to transition from initial state \(\boldsymbol {x}(0) = \begin {bmatrix}0 \\ 0\end {bmatrix}\) to final state \(\boldsymbol {x}(T) = \begin {bmatrix}1 \\ 2\end {bmatrix}\) in time T = 1.