1 Introduction

In this paper we investigate, theoretically and numerically, the minimal time control, via Optogenetics, of some widely used finite-dimensional deterministic neuron models such as the Hodgkin–Huxley model (Hodgkin and Huxley 1952), the Morris–Lecar model (Lecar and Morris 1981) and the FitzHugh–Nagumo model (FitzHugh 1961). Control of neuron models has been addressed in the literature in different ways. One popular way to investigate this problem is to look at phase reductions of non-linear evolution systems, consisting of reducing the system of equations to a single first-order differential equation (Brown et al. 2004; Nabi and Moehlis 2011). Integrate-and-fire models, which are also a simplification of nonlinear systems to single first-order linear differential equations, receiving stochastic inputs, have been studied in Feng and Tuckwell (2003) in order to minimize the variance of the membrane potential, arguably linked to the variance of the final time, while reaching a given membrane potential threshold in fixed time. These simplifications allow the authors to obtain a nice analytic expression for the optimal control. Stochastic integrate-and-fire models have also been used in Lolov et al. (2014) to find an optimal electrical stimulation to spike in a desired time, a problem close to ours, with numerical computation purposes.

All these studies were exclusively based on control via electrical stimulation. Optogenetics allows a control of excitable cells of a different nature. This recent and thriving technique is based on light stimulation (Deisseroth 2011, 2015; Boyden 2015). Its cornerstone is the genetic modification of excitable cells so that they express new ion channels whose opening and closing are triggered by the absorption of photons. In particular, it is able to target specific populations of neurons. Indeed, by designing viruses that aim at these populations only, the light stimulation has no effect on the other populations, which do not express the new ion channels. This makes Optogenetics a noninvasive technique, in contrast to electrical stimulation, which reaches a whole volume of tissue, regardless of the types of neurons that populate this volume. Furthermore, optical devices such as optic fibers and lasers allow light to reach deeply embedded populations of neurons. This gives Optogenetics a tremendous advantage over electrical stimulation in the exploration of neural tissues and neural functions. The risk of tissue damage is also decreased with this technique. The prospects for applications in medicine are thus considerable, with, among others, the promise to help understand and treat Alzheimer’s disease (Ryan et al. 2015), Parkinson’s disease (Chen et al. 2015), epilepsy (Paz and Huguenard 2015), vision loss (Gaub 2015), narcolepsy (Adamantidis et al. 2007) and even depression (Lobo et al. 2012).

Our work is based on one of these light-gated ion channels, called Channelrhodopsin-2 (ChR2). It is a depolarizing non-selective cation channel that opens upon stimulation with blue light. One of the neural events that carries a lot of information is the latency time between two consecutive action potentials or spikes (a spike being a large depolarization of the membrane potential that occurs when it goes beyond some threshold). Here we want to specifically address the time optimal control of the first spike in various neuron models, for two different mathematical models of ChR2 introduced in Nikolic et al. (2009). Indeed, the mathematical formulation of this problem is very close to that of the optimal control of the latency time between two spikes. In particular, the investigation of singular trajectories is the same. To the best of our knowledge, this optimal control problem has never been studied before, either in terms of electrical stimulation or in terms of light stimulation.

In Sect. 2 we set the mathematical framework of conductance-based neuron models and we recall some results of minimal time control problems for affine control systems, and the role of singular controls. We then present in Sect. 3 the mathematical model of ChR2 and how the resulting models can be incorporated in conductance-based models. We apply our results to various neuron models in Sect. 4. For the ChR2-3-state model, we prove that there are no singular optimal controls for two-dimensional models (FitzHugh–Nagumo, Morris–Lecar, reduced Hodgkin–Huxley models) and we give the expression of the bang–bang optimal control. We illustrate these results with numerical computations of the optimal controls by means of a direct method. For the ChR2-4-states model, we numerically observe optimal bang–bang controls. Along the review of the different models, we insist on how optimal control appears as a great tool to discuss and compare neuron models. In particular, it emphasizes a peculiar behavior of the Morris–Lecar model, compared to the other ones, and gives a new argument in favor of the reduced Hodgkin–Huxley model.

Although we focus in this paper on neuron models, our treatment of conductance-based models can be applied to any excitable cell, such as a cardiac cell [see Wong et al. (2012) for a work on the application of Optogenetics to cardiac cells for simulation purposes].

2 Preliminaries

2.1 Conductance based models

Conductance-based models form a popular class of simple biophysical models used to represent the activity of an excitable cell, such as a neuron or a cardiac cell. The principle is to give an equivalent circuit representation of the cell by assigning an electrical component to each meaningful biological component of the cell. Finite-dimensional conductance-based models represent the cell as a single isopotential electrical compartment. The lipid bilayer membrane of the cell is represented by a capacitance \(C>0\). Voltage-gated ion channels are distributed across the membrane and are represented by conductances \(g_x>0\) whose values depend on the type x of the channel. An ion channel is a protein that constitutes a gate across the membrane. It has the ability to let ions flow across the membrane or to prevent them from doing so. Ion channels are said to be selective in the sense that they act as a filter for certain types of ions. The main types of ion channels are potassium (\(K^+\)) channels, sodium (\(Na^+\)) channels and calcium (\(Ca^{2+}\)) channels. The ion flows are driven by electrochemical gradients represented by batteries whose voltages \(E_x\in \mathbb {R}\) equal the membrane potential corresponding to the absence of ion flow of type x. For that reason, they are called equilibrium potentials. The sign of the difference between the membrane potential and \(E_x\) gives the direction of the driving force. The channels are called voltage-gated because their opening and closing depend on the potential difference across the membrane. This means that the conductances \(g_x\) are variable conductances, depending on the membrane potential.

The ion flow across the membrane generates an electrical current in the circuit, the possible movements of ions inside the cell being neglected. To each type x of ion channel is associated a macroscopic ion current \(I_x\). The total membrane current is the sum of the capacitive current and all the ionic currents considered. In all the models we consider in this paper, the ionic currents include a leakage current that accounts for the passive flow of some other ions across the membrane. This current is associated with a fixed conductance \(g_L\) and is always denoted by \(I_L\).

Every macroscopic ion current \(I_x\) is the result of the ion flow through all the ion channels of type x. Since the number of ion channels in an excitable cell is very large, the macroscopic conductance \(g_x\) is a function of the probability \(n_x\in [0,1]\) that a channel of type x opens. In fact, the channels of type x are made up of several subpopulations of gates that have different dynamics. Let \(k_x\in \mathbb {N}^*\) be the number of subpopulations of the channels of type x and denote by \((n_{x_1},\ldots ,n_{x_{k_x}})\in [0,1]^{k_x}\) the probabilities that each gate of the subpopulation opens, that is, \(n_{x_i}\) represents the probability that a gate of type \(x_i\) opens. The time evolution of these probabilities in each subpopulation depends on the membrane potential and is of first order. For \(i\in \{1,\ldots ,k_x\}\), it is represented in Fig. 1, and the dynamical system governing \(n_{x_i}\) is the following

$$\begin{aligned} \dot{n}_{x_i}(t) = \alpha _{x_i}(V)(1-n_{x_i}) - \beta _{x_i}(V)n_{x_i}, \end{aligned}$$
(2.1)

where \(\alpha _{x_i}\) and \(\beta _{x_i}\) are smooth functions of the membrane potential V.

Fig. 1 Ion channel of type \({x_i}\) with C a closed state and O an open state

These dynamics can be easily interpreted as follows: when the potential across the membrane is equal to V, ion channels in the subpopulation of type \(x_i\) open at rate \(\alpha _{x_i}(V)\) and close at rate \(\beta _{x_i}(V)\).
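To make the first-order kinetics (2.1) concrete, the following minimal Python sketch evaluates the solution when the membrane potential is held fixed, so that the two rates are constants with positive sum. The function names and the numerical rates are illustrative assumptions and are not taken from the models discussed in this paper.

```python
import math


def gating_steady_state(alpha, beta):
    """Equilibrium of n' = alpha*(1 - n) - beta*n for constant rates."""
    return alpha / (alpha + beta)


def gating_at_time(n0, alpha, beta, t):
    """Closed-form solution of (2.1) when V (hence alpha, beta) is held fixed."""
    n_inf = gating_steady_state(alpha, beta)
    return n_inf + (n0 - n_inf) * math.exp(-(alpha + beta) * t)


# Hypothetical rates: the open probability relaxes towards
# alpha / (alpha + beta) with time constant 1 / (alpha + beta).
alpha, beta = 0.3, 0.1
n_late = gating_at_time(0.0, alpha, beta, 100.0)
```

With these assumed rates the open probability starts at 0 and relaxes towards \(\alpha /(\alpha +\beta )\), which is the same expression that defines the resting values of the gating variables later in this section.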

The macroscopic conductance \(g_x\) is then given by

$$\begin{aligned} g_x(n_x) = \bar{g}_x f_x(n_{x_1},\ldots ,n_{x_{k_x}}), \end{aligned}$$

where \(\bar{g}_x\) is the maximum conductance of the channel (i.e., the conductance when all the channels of type x are open) and \(f_x\) is a smooth function depending on the type of the channel.

The macroscopic current \(I_x\) of type x is given by Ohm’s law. Taking into account the equilibrium potential \(E_x\), we get

$$\begin{aligned} I_x&= g_x (V-E_x)\\&= \bar{g}_x f_x(n_{x_1},\ldots ,n_{x_{k_x}})(V-E_x). \end{aligned}$$

In Fig. 2 below we give the example of a conductance-based model with two types of channels with conductances \(g_1\) and \(g_2\).

Fig. 2 Equivalent circuit for a conductance-based model with two types of channels

The total current \(I_{tot}\) is given by

$$\begin{aligned} I_{tot} = I + I_1 + I_2 + I_L, \end{aligned}$$

where \(I = C\frac{\mathrm {d}V}{\mathrm {d}t}\), \(I_{1,2}=g_{1,2}(V)(V-E_{1,2})\) and \(I_L=g_L(V-E_L)\).

The first conductance-based model dates back to the seminal work of Hodgkin and Huxley (1952) on the squid giant axon. In voltage-clamp experiments (i.e., experiments in which the membrane potential was held fixed), they showed how the ionic currents could be interpreted in terms of changes in \(Na^+\) and \(K^+\) conductances. From the experimental data, they inferred how these conductances depend on the membrane potential and on time. The resulting mathematical model became very popular because it was able to reproduce all key biophysical properties of an action potential. The \(K^+\) channels are composed of a single population. Let us denote by n the probability that a channel of type \(K^+\) opens. The \(K^+\) conductance is given by

$$\begin{aligned} g_K = \bar{g}_Kn^4. \end{aligned}$$

The population of \(Na^+\) channels is composed of two subpopulations and we denote by m and h the corresponding probabilities that a gate of each type opens. The \(Na^+\) conductance is given by

$$\begin{aligned} g_{Na} = \bar{g}_{Na}m^3h. \end{aligned}$$

The total membrane current \(I_{tot}\) is then given by

$$\begin{aligned} I_{tot} = C\frac{\mathrm {d}V}{\mathrm {d}t} + \bar{g}_Kn^4(V-E_K) + \bar{g}_{Na}m^3h(V-E_{Na}) + g_L(V-E_L), \end{aligned}$$

with V the membrane potential. If an external current \(I_{ext}\) is applied to the cell, we can write the dynamical system (HH) for the evolution of the membrane potential

$$\begin{aligned} (HH)\left\{ \begin{aligned} C \dot{V}(t)&= \bar{g}_Kn^4(t)(E_K - V(t)) +\bar{g}_{Na}m^3(t)h(t)(E_{Na}-V(t))\\&\qquad + g_L(E_L-V(t)) + I_{ext}(t),\\ \dot{n}(t)&= \alpha _n(V(t))(1-n(t)) - \beta _n(V(t))n(t),\\ \dot{m}(t)&= \alpha _m(V(t))(1-m(t)) - \beta _m(V(t))m(t),\\ \dot{h}(t)&= \alpha _h(V(t))(1-h(t)) - \beta _h(V(t))h(t). \end{aligned} \right. \end{aligned}$$

The expression of the functions \(\alpha _x\) and \(\beta _x\) and the numerical values of the constants can be found in “Appendix 2”.
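As an illustration of the structure of (HH), the sketch below codes its right-hand side in Python. The rate functions and constants are the commonly quoted textbook values (potentials in mV, time in ms, rest near \(-65\) mV); they are an assumption made for this example and may differ from those of "Appendix 2". The names `rates`, `hh_rhs` and `resting_gates` are hypothetical.

```python
import math

# Assumed textbook constants (uF/cm^2, mS/cm^2, mV); not taken from Appendix 2.
C, G_K, G_NA, G_L = 1.0, 36.0, 120.0, 0.3
E_K, E_NA, E_L = -77.0, 50.0, -54.387


def rates(v):
    """Return {gate: (alpha, beta)} at potential v.

    The n and m formulas are 0/0 at v = -55 and v = -40; avoid those points.
    """
    return {
        "n": (0.01 * (v + 55.0) / (1.0 - math.exp(-(v + 55.0) / 10.0)),
              0.125 * math.exp(-(v + 65.0) / 80.0)),
        "m": (0.1 * (v + 40.0) / (1.0 - math.exp(-(v + 40.0) / 10.0)),
              4.0 * math.exp(-(v + 65.0) / 18.0)),
        "h": (0.07 * math.exp(-(v + 65.0) / 20.0),
              1.0 / (1.0 + math.exp(-(v + 35.0) / 10.0))),
    }


def hh_rhs(v, n, m, h, i_ext=0.0):
    """Right-hand side of (HH): returns (dV/dt, dn/dt, dm/dt, dh/dt)."""
    r = rates(v)
    dv = (G_K * n**4 * (E_K - v) + G_NA * m**3 * h * (E_NA - v)
          + G_L * (E_L - v) + i_ext) / C
    dn = r["n"][0] * (1.0 - n) - r["n"][1] * n
    dm = r["m"][0] * (1.0 - m) - r["m"][1] * m
    dh = r["h"][0] * (1.0 - h) - r["h"][1] * h
    return dv, dn, dm, dh


def resting_gates(v):
    """Gating values alpha / (alpha + beta) at a clamped potential v."""
    return {gate: a / (a + b) for gate, (a, b) in rates(v).items()}


gates = resting_gates(-65.0)
derivatives_at_rest = hh_rhs(-65.0, gates["n"], gates["m"], gates["h"])
```

With these assumed constants, all four components of `derivatives_at_rest` are close to zero, which is the sense in which \(V\approx -65\) mV is a resting state of the uncontrolled system.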

To end this section, we give a formal mathematical definition of what we will refer to as a conductance-based model in the sequel.

Definition 1

Conductance based model. Let \(n\in \mathbb {N}^*\). Let also \(k\in \mathbb {N}^*\) and for all \(i\in \{1,\ldots ,k\}\), let \(j_i\in \mathbb {N}^*\) such that \(\sum _{i=1}^k j_i = n-1\). We call n-dimensional conductance-based model the following dynamical system in \(\mathbb {R}^n\)

$$\begin{aligned} \dot{x}_1(t) = \frac{1}{C}\left( \sum _{i=1}^k \bar{g}_if_i(x_{1+j_1+\cdots +j_{i-1}+1}(t), \ldots ,x_{1+j_1+\cdots +j_{i-1}+j_i}(t))\left( E_i-x_1(t)\right) \right) , \end{aligned}$$

with the convention that \(1+j_1+\cdots +j_{i-1}+1 = 2\) and \(1+j_1+\cdots +j_{i-1}+j_i =1+ j_1\) for \(i=1\), and that \(1+j_1+\cdots +j_{i-1}+1 = 1+j_1+1\) and \(1+j_1+\cdots +j_{i-1}+j_i =1+ j_1+j_2\) for \(i=2\). The dynamics of the gating variables is then defined, for \(l\in \{2,\ldots ,n\}\), by

$$\begin{aligned} \dot{x}_{l}(t) = \alpha _{l}(x_1(t))(1-x_l(t)) - \beta _{l}(x_1(t))x_l(t), \end{aligned}$$

where \(C>0\) and for all \(i\in \{1,\ldots ,k\}\) and \(l\in \{2,\ldots ,n\}\)

  • \(\bar{g}_i>0\), \(f_i : [0,1]^{j_i}\rightarrow \mathbb {R}_+\) is a smooth function,

  • \(\alpha _l,\beta _l : \mathbb {R}\rightarrow \mathbb {R}\) are smooth functions such that for all \(v\in \mathbb {R}, \alpha _l(v)+\beta _l(v)\ne 0\).

We finally require that the previous dynamical system exhibits an equilibrium point \(x^{\infty }\in \mathbb {R}^n\), that we call resting state, defined by the following equations

$$\begin{aligned} x_l^{\infty } = \frac{\alpha _l(x_1^{\infty })}{\alpha _l(x_1^{\infty })+\beta _l(x_1^{\infty })}, \quad \forall l\in \{2,\ldots ,n\}, \end{aligned}$$

and

$$\begin{aligned} 0=\sum _{i=1}^k \bar{g}_if_i\left( x_{1+j_1+\cdots +j_{i-1}+1}^{\infty },\ldots ,x_{1+j_1+\cdots +j_{i-1}+j_i}^{\infty }\right) \left( E_i-x_1^{\infty }\right) . \end{aligned}$$

Conductance based models are uniquely defined on \(\mathbb {R}_+\). The initial conditions \(y\in \mathbb {R}^n\) that we consider are physiological conditions with \(y_1\) in a physiological range for the membrane potential of the cell considered, basically \(y_1\in [V_{min},V_{max}]\) with \(-\infty<V_{min}<V_{max}<+\infty \), and \(y_i\in [0,1]\) for all \(i\in \{2,\ldots ,n\}\).

Based on Definition 1, the Hodgkin–Huxley model (HH) defined above is a 4-dimensional conductance-based model with 2 types of ion channels (\(k=2\), and \(j_1=1\) for the potassium channel, and \(j_2=2\) for the sodium channel). We also have \(f_1(n)=f_1(x_2)=x_2^4\), and \(f_2(m,h)=f_2(x_3,x_4)=x_3^3x_4\).
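The index bookkeeping of Definition 1 can be checked mechanically. The short Python sketch below (the helper name `gating_index_ranges` is hypothetical) returns, for each channel type i, the first and last indices \(1+j_1+\cdots +j_{i-1}+1\) and \(1+j_1+\cdots +j_i\) of the state components that enter \(f_i\).

```python
def gating_index_ranges(j):
    """For subpopulation counts j = [j_1, ..., j_k], return the 1-based,
    inclusive index range (first, last) of the components feeding each f_i."""
    ranges, first = [], 2  # x_1 is the membrane potential
    for j_i in j:
        ranges.append((first, first + j_i - 1))
        first += j_i
    return ranges


# Hodgkin-Huxley: potassium uses x_2; sodium uses x_3 and x_4.
hh_ranges = gating_index_ranges([1, 2])
hh_dimension = 1 + sum([1, 2])
```

For the Hodgkin–Huxley model this gives the ranges (2, 2) and (3, 4) and the dimension \(n=4\), in agreement with the identification above.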

2.2 The Pontryagin Maximum Principle for minimal time single-input affine problems

In this section we recall how the necessary optimality conditions of the Pontryagin Maximum Principle conveniently read for the specific affine problem that we investigate in the sequel. We first define the minimal time affine control problem and embed it in a general optimal control problem. We then show how the first-order optimality conditions of the Pontryagin Maximum Principle can be understood under the light of the well-known Lagrange multipliers rule. We refer the reader to Trélat (2012) for a short survey on optimal control, from theoretical and numerical points of view. Finally, we apply these first-order conditions to the minimal time affine control problem.

2.2.1 The minimal time single-input affine control problem

Consider the minimal time problem for a smooth single-input affine system:

$$\begin{aligned} \dot{x}(t) = F_0(x(t))+u(t)F_1(x(t)),\quad x(0) = x_{eq}\in \mathbb {R}^n, \end{aligned}$$
(2.2)

where \(x(t)\in \mathbb {R}^n\) and \(x_{eq}\) is a solution of \(F_0(x)=0\) (i.e., an equilibrium point for the uncontrolled system). The control domain \(U:=[0,u_{max}]\) is a segment of \(\mathbb {R}_+\), with \(u_{max}>0\). The state variable must satisfy the final condition \(x(t_f)\in M_f\) where

$$\begin{aligned} M_f := \{x\in \mathbb {R}^n | x_1 = V_f\}, \end{aligned}$$

with \(V_f > 0\) a given constant that will later correspond to the potential of a spike. The set of admissible controls, denoted by \(\mathcal {U}_{ad}\), is the subset of the measurable functions from \(\mathbb {R}_+\) to U, denoted by \(\mathcal {L}(\mathbb {R}_+,U)\), such that (2.2) has a unique solution on \(\mathbb {R}_+\).

We introduce the Hamiltonian \(\mathcal {H} : \mathbb {R}^n\times \mathbb {R}^n\times \mathbb {R}_-\times U \rightarrow \mathbb {R}\) defined for \((x,p,p^0,u)\in \mathbb {R}^n\times \mathbb {R}^n\times \mathbb {R}_-\times U \) by

$$\begin{aligned} \mathcal {H}(x,p,p^0,u) := \langle p,F_0(x) \rangle + u \langle p,F_1(x) \rangle + p^0, \end{aligned}$$
(2.3)

where \(\langle \cdot ,\cdot \rangle \) is the scalar product on \(\mathbb {R}^n\), \(p\in \mathbb {R}^n\) is the adjoint vector and \(p^0\le 0\) a non-positive number.

2.2.2 The Pontryagin Maximum Principle for general optimal control problems

The control problem of the previous section is a particular case of the following optimal control problem. Let n and m be nonzero integers. Consider on \(\mathbb {R}^n\) the control system

$$\begin{aligned} \dot{x}(t) = f(t,x(t),u(t)), \end{aligned}$$
(2.4)

where \(f:\mathbb {R}\times \mathbb {R}^n\times \mathbb {R}^m \rightarrow \mathbb {R}^n\) is \(C^1\), and where the controls are bounded and measurable functions, defined on intervals [0, T(u)] of \(\mathbb {R}_+\) and taking their values in a subset U of \(\mathbb {R}^m\). Let \(M_0\) and \(M_1\) be two subsets of \(\mathbb {R}^n\). Denote by \(\mathcal {U}_{ad}\) the set of admissible controls such that the corresponding trajectories steer the system from an initial point of \(M_0\) to a final point in \(M_1\). For such a control u, the cost of the corresponding trajectory \(x_u(\cdot )\) is defined by

$$\begin{aligned} C(t_f,u):= \int _0^{t_f} f^0(t,x_u(t),u(t)) \mathrm {d}t + g(t_f,x_u(t_f)), \end{aligned}$$
(2.5)

where \(f^0:\mathbb {R}\times \mathbb {R}^n\times \mathbb {R}^m \rightarrow \mathbb {R}\) and \(g:\mathbb {R}\times \mathbb {R}^n \rightarrow \mathbb {R}\) are \(C^1\). We investigate the optimal control problem of determining a trajectory \(x_u(\cdot )\) solution of (2.4), associated with a control u on \([0,t_f]\), such that \(x_u(0)\in M_0,x_u(t_f)\in M_1\), and minimizing the cost C. The final time \(t_f\) can be fixed or not. The affine minimal time problem can then be defined by \(f(t,x,u) = F_0(x)+uF_1(x)\), and \(f^0=1\), and \(g=0\).

Definition 2

The end-point mapping \(E:\mathbb {R}^n\times \mathbb {R}_+\times \mathcal {U}_{ad} \rightarrow \mathbb {R}^n\) of the system is defined by \(E(x_0,T,u)=x(x_0,T,u)\), where \(t\rightarrow x(x_0,t,u)\) is the trajectory solution of (2.4), corresponding to the control u, such that \(x(x_0,0,u)=x_0\).

In terms of the end-point mapping, the optimal control problem under consideration can be written as the infinite-dimensional minimization problem

$$\begin{aligned} \min \{C(t_f,u)\mid x_0\in M_0, E(x_0,t_f,u)\in M_1, u \in L^\infty (0,t_f;U)\}, \end{aligned}$$
(2.6)

where \(L^\infty (0,t_f;U)\) denotes the set of measurable and bounded functions \(u:(0,t_f)\rightarrow U\).
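For intuition, the end-point mapping can be approximated numerically by integrating (2.4) under a given control. The Python sketch below does this with the explicit Euler scheme; the scheme, the step count and the toy double integrator are assumptions chosen for brevity, and the names `end_point` and `double_integrator` are hypothetical.

```python
def end_point(f, x0, T, u, steps=1000):
    """Approximate E(x0, T, u) for x' = f(t, x, u(t)) with explicit Euler."""
    dt = T / steps
    x = list(x0)
    for k in range(steps):
        t = k * dt
        dx = f(t, x, u(t))  # evaluated at the current state before updating
        x = [xi + dt * dxi for xi, dxi in zip(x, dx)]
    return x


def double_integrator(t, x, u):
    """Toy control system x1' = x2, x2' = u (not a neuron model)."""
    return [x[1], u]


# Constant control u = 1 on [0, 1]; the exact end point is (0.5, 1.0).
final_state = end_point(double_integrator, [0.0, 0.0], 1.0, lambda t: 1.0)
```

In this finite-dimensional picture, a direct numerical method amounts to minimizing the cost over a discretized control subject to the constraint that the approximate end point belongs to the target.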

We also introduce here a formal definition for singular controls, a notion that we will deeply investigate in the sequel.

Definition 3

A trajectory \(x(\cdot )\), associated with a control \(u(\cdot )\) on \([0,t_f]\), is said to be singular if it is a singular point of the end-point mapping, that is, if the rank of the linear continuous mapping

$$\begin{aligned} \frac{\partial E}{\partial u} ( x(0), t_f, u ) : L^\infty (0,t_f; \mathbb {R}^m) \longrightarrow \mathbb {R}^n \end{aligned}$$

is less than n.

Assume for a moment that we are in the simplified situation where \(M_0=\{x_0\}\), \(M_1=\{x_1\}\), T is fixed, and \(U=\mathbb {R}^m\). That is, we consider the optimal control problem of steering system (2.4) from the initial point \(x_0\) to the final point \(x_1\) in time T and minimizing the cost (2.5) among controls \(u\in L^\infty ([0,T],\mathbb {R}^m)\). In that case, the optimization problem (2.6) reduces to

$$\begin{aligned} \min _{E(x_0,T,u)=x_1}C(T,u). \end{aligned}$$
(2.7)

Now, according to the Lagrange multipliers rule (and using the \(C^1\) regularity of our data), if u is optimal, then there exists \((\psi ,\psi ^0)\in (\mathbb {R}^n\times \mathbb {R})\setminus \{(0,0)\}\) such that

$$\begin{aligned} \psi \cdot dE_{x_0,T}(u) = -\psi ^0dC_T(u). \end{aligned}$$
(2.8)

Note that, if one defines the Lagrangian \(L_T(u,\psi ,\psi ^0):=\psi \cdot E_{x_0,T}(u) + \psi ^0 C_T(u) \), then this first-order necessary condition for optimality is written in the usual form as

$$\begin{aligned} \frac{\partial L_T}{\partial u}\left( u,\psi ,\psi ^0\right) =0. \end{aligned}$$
(2.9)

The first-order condition (2.9) is, in this form, not very tractable for practical purposes. The general version of the Pontryagin maximum principle, which is valid without the above restrictions (and in particular, which is valid under control constraints), parametrizes in some sense the condition (2.9) along the trajectory for the general control problem and reads as follows (see Pontryagin et al. 1974; Trélat 2008).

Theorem 1

If the trajectory \(x(\cdot )\), associated with the control u on \([0,t_f]\), is optimal, then it is the projection of an extremal \((x(\cdot ),p(\cdot ),u(\cdot ))\) (called extremal lift), where \(p^0\le 0\), and \(p(\cdot ):[0,t_f]\rightarrow \mathbb {R}^n\) is an absolutely continuous mapping, called adjoint vector, with \((p(\cdot ),p^0) \ne (0,0)\), such that

$$\begin{aligned} \dot{x}(t) = \frac{\partial \mathcal {H}}{\partial p}(t,x(t),p(t),p^0,u(t)), \quad \dot{p}(t) = -\frac{\partial \mathcal {H}}{\partial x}(t,x(t),p(t),p^0,u(t)) \end{aligned}$$

almost everywhere on \([0,t_f]\), where

$$\begin{aligned} \mathcal {H}(t,x,p,p^0,u) := \langle p,f(t,x,u) \rangle + p^0f^0(t,x,u) \end{aligned}$$

is the Hamiltonian, and there holds

$$\begin{aligned} \mathcal {H}(t,x(t),p(t),p^0,u(t)) = \max _{v\in U} \mathcal {H}(t,x(t),p(t),p^0,v) \end{aligned}$$
(2.10)

almost everywhere on \([0,t_f]\). If moreover the final time \(t_f\) to reach the target \(M_1\) is not fixed, then one has the following condition at the final time \(t_f\):

$$\begin{aligned} \max _{v\in U} \mathcal {H}(t_f,x(t_f),p(t_f),p^0,v) = -p^0\frac{\partial g}{ \partial t}(t_f,x(t_f)). \end{aligned}$$
(2.11)

Additionally, if \(M_0\) and \(M_1\) (or just one of them) are submanifolds of \(\mathbb {R}^n\) locally around \(x(0)\in M_0\) and \(x(t_f)\in M_1\), then the adjoint vector can be built in order to satisfy the transversality conditions at both extremities (or just one of them)

$$\begin{aligned} p(0) \perp T_{x(0)}M_0, \quad p(t_f) - p^0 \frac{\partial g}{\partial x}(t_f,x(t_f))\perp T_{x(t_f)}M_1, \end{aligned}$$
(2.12)

where \(T_xM_i\) denotes the tangent space to \(M_i\) at the point x.

The relation between the Lagrange multipliers and \((p(\cdot ),p^0)\) is that the adjoint vector can be constructed so that \((\psi ,\psi ^0)=(p(t_f),p^0)\) up to some multiplicative scalar. In particular, the Lagrange multiplier \(\psi \) is unique (up to a multiplicative scalar) if and only if the trajectory \(x(\cdot )\) admits a unique extremal lift (up to a multiplicative scalar).

2.2.3 Application to the minimal time single-input affine problem

We now come back to the affine minimal time problem that we defined in Sect. 2.2.1. For this problem, the Pontryagin Maximum Principle states that if the trajectory \(t\rightarrow x(t)\), \(t\in [0,t_f]\), associated with the admissible control \(u \in \mathcal {U}_{ad}\), is optimal on \([0,t_f]\), then there exist \(p:[0,t_f]\rightarrow \mathbb {R}^n\) absolutely continuous and \(p^0\in \mathbb {R}_-\) such that \((p,p^0)\) is nonzero and such that \((x,p)\) satisfies the following equations, almost everywhere in \([0,t_f]\):

$$\begin{aligned} \dot{x}(t) = \frac{\partial \mathcal {H}}{\partial p}(x(t),p(t),p^0,u(t)),\quad \dot{p}(t) = -\frac{\partial \mathcal {H}}{\partial x}(x(t),p(t),p^0,u(t)). \end{aligned}$$

Moreover, the following maximum condition must be satisfied on \([0,t_f]\):

$$\begin{aligned} \mathcal {H}(x(t),p(t),p^0,u(t)) = \max _{v\in U} \mathcal {H}(x(t),p(t),p^0,v). \end{aligned}$$
(2.13)

In view of the initial and final conditions on the state variable, the transversality condition on p(0) is empty and the one on \(p(t_f)\) gives

$$\begin{aligned} p_1(t_f) = \lambda _1 \in \mathbb {R}, \quad p_i(t_f) = 0, \quad \forall i \in \{2,\ldots ,n\}. \end{aligned}$$

In our particular setting, the augmented system does not depend on the time variable. This implies that the right hand side of (2.13) is constant on \([0,t_f]\). Now since there is no final cost and because the final time is not fixed, we get from (2.11)

$$\begin{aligned} \max _{v\in U} \mathcal {H}(x(t_f),p(t_f),p^0,v) = 0. \end{aligned}$$

The two latter remarks imply that for all \(t\in [0,t_f]\)

$$\begin{aligned} \mathcal {H}(x(t),p(t),p^0,u(t)) = 0 = \max _{v\in U} \mathcal {H}(x(t),p(t),p^0,v), \end{aligned}$$
(2.14)

which can be written, in view of (2.3):

$$\begin{aligned}&\langle p(t),F_0(x(t)) \rangle + u(t) \langle p(t),F_1(x(t)) \rangle + p^0 = 0 \end{aligned}$$
(2.15)
$$\begin{aligned}&\qquad = \langle p(t),F_0(x(t)) \rangle + \max _{v\in U} v\langle p(t),F_1(x(t)) \rangle + p^0. \end{aligned}$$
(2.16)

In the case of single-input affine systems, the maximum condition (2.16) gives the expression of the optimal control:

$$\begin{aligned} u(t):=\left\{ \begin{aligned}&u_{max},&\quad \text {if } \langle p(t),F_1(x(t)) \rangle > 0,\\&0,&\quad \text {if } \langle p(t),F_1(x(t)) \rangle < 0,\\&\text {undetermined},&\quad \text {if } \langle p(t),F_1(x(t)) \rangle = 0. \end{aligned} \right. \end{aligned}$$

The function \(\varphi (t) := \langle p(t),F_1(x(t)) \rangle \), whose sign gives the expression of the optimal control is called the switching function. If it does not vanish on any subinterval I of \([0,t_f]\), the optimal control is a succession of constant controls called bang–bang control. The switching times between the two constant modes are given by the change of sign of the switching function \(\varphi \). This conclusion fails if there exists a subinterval I of \([0,t_f]\) along which the switching function vanishes and this situation has to be further investigated. It can easily be proved (Hamiltonian characterisation of singular trajectories, see, e.g., Trélat 2008, 2012) that, for the minimal time problem for control-affine systems, an arc along which the switching function vanishes identically is singular, in the sense of Definition 3.
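The control law above translates directly into code. The following Python sketch evaluates the switching function at one time instant and returns the corresponding control value; the function names and the optional tolerance `tol` are assumptions introduced for this illustration.

```python
def switching_function(p, f1_x):
    """phi = <p, F_1(x)> for the adjoint p and the vector F_1(x), given as sequences."""
    return sum(p_i * f_i for p_i, f_i in zip(p, f1_x))


def bang_bang_control(p, f1_x, u_max, tol=0.0):
    """Control maximizing u * phi over [0, u_max].

    Returns None when |phi| <= tol: the maximum condition alone does not
    determine the control there (possible singular arc).
    """
    phi = switching_function(p, f1_x)
    if phi > tol:
        return u_max
    if phi < -tol:
        return 0.0
    return None
```

In a numerical setting the switching function is never exactly zero, which is why a small positive `tol` may be useful to flag candidate singular arcs; the choice of its value is left open here.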

Finally, the non-triviality of \((p,p^0)\) reduces in fact to the one of p because if \(p(t)=0\) for a given \(t\in [0,t_f]\) then \(p^0 = 0\) because of (2.15).

The investigation of the existence of singular trajectories will be done later for our different models but for now let us state that if there exists a subinterval I on which the switching function vanishes, with u the corresponding control, then from the Pontryagin Maximum Principle, \((x,p,u)\) is the solution, on I, of the following equations:

$$\begin{aligned} \dot{x}(t)= & {} \frac{\partial \mathcal {H}}{\partial p}(x(t),p(t),p^0,u(t)), \\ \dot{p}(t)= & {} -\frac{\partial \mathcal {H}}{\partial x}(x(t),p(t),p^0,u(t)), \quad \langle p(t),F_1(x(t)) \rangle =0. \end{aligned}$$
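A standard first step in exploiting the constraint \(\langle p(t),F_1(x(t)) \rangle =0\), sketched here as a complement, is to differentiate the switching function along the extremal. Writing \(\mathrm {d}F_i\) for the Jacobian matrix of \(F_i\), the adjoint equation reads \(\dot{p}=-(\mathrm {d}F_0(x)+u\,\mathrm {d}F_1(x))^{T}p\), and the terms containing u cancel:

$$\begin{aligned} \dot{\varphi }(t)&= \langle \dot{p}(t),F_1(x(t)) \rangle + \langle p(t),\mathrm {d}F_1(x(t))\dot{x}(t) \rangle \\&= \langle p(t),\mathrm {d}F_1(x(t))F_0(x(t)) - \mathrm {d}F_0(x(t))F_1(x(t)) \rangle = \langle p(t),[F_0,F_1](x(t)) \rangle , \end{aligned}$$

where the Lie bracket is taken with the sign convention \([F_0,F_1]:=\mathrm {d}F_1\,F_0-\mathrm {d}F_0\,F_1\) (an assumption of this sketch; the opposite convention only changes the sign). Since \(\varphi \) vanishes identically on I, one also has \(\langle p(t),[F_0,F_1](x(t)) \rangle =0\) on I, a condition that does not involve the control.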

3 Control of conductance-based models via optogenetics

In this section we consider a general conductance-based model in \(\mathbb {R}^n\), with \(n\in \mathbb {N}^*\), of the form

$$\begin{aligned} \dot{x}(t) = f_0(x(t)), \quad t\in \mathbb {R}_+,\quad x(0) = x_0 \in \mathcal {D}\subset \mathbb {R}^n, \end{aligned}$$
(3.1)

with \(f_0\) a smooth vector field in \(\mathbb {R}^n\) and \(\mathcal {D}\) a physiological domain.

Optogenetics is a recent and innovative technique which allows one to induce or prevent electric shocks in living tissue, by means of light stimulation. Successfully demonstrated in mammalian neurons in 2005 (Boyden et al. 2005), the technique relies on the genetic modification of cells in order for them to express particular ion channels, called rhodopsins, whose opening and closing are directly triggered by light stimulation. One of these rhodopsins comes from a unicellular flagellate alga, Chlamydomonas reinhardtii, and has been christened Channelrhodopsin-2 (ChR2). It is a cation channel that opens when illuminated with blue light.

Since the field is very young, the mathematical modeling of the phenomenon is quite scarce. Some models have been proposed, based on the study of the photocycles that the channel goes through when it absorbs a photon (see Nikolic et al. (2006) and Nikolic et al. (2009) for a 3-states model and Hegemann et al. (2005) for a 4-states model). In Nikolic et al. (2009), the authors study two models for ChR2 that are able to reproduce the photocurrents generated by the light stimulation of the channel. Those models consist of several states that can be either conductive (the channel is open) or non-conductive (the channel is closed). Transitions between those states are spontaneous, depend on the membrane potential or are triggered by the absorption of a photon. This kind of model has already been used to simulate photocurrents in cardiac cells. In Wong et al. (2012), the authors include ChR2 photocurrents into an infinite dimensional model and use finite differences and elements to simulate the system. The optimal control of such a system is not investigated in this paper. Here we are interested in both the 3-states and the 4-states models of Nikolic et al. (2009). The 3-states model has one open state o and two closed states c and d while the 4-states model has two open states \(o_1\) and \(o_2\), and two closed states \(c_1\) and \(c_2\). Their transitions are represented in Figs. 3 and 4.

Fig. 3 ChR2 three states model

Fig. 4 ChR2 four states model

In the 3-states model, the transition from the dark-adapted closed state c to the open state o is controlled by a function u(t), proportional to the intensity of the light applied to the neuron. In our model, the light intensity is thus the control variable. The transition from the open state to the light-adapted closed state d is spontaneous and has a time constant that is very small compared with that of the transition from d to c (i.e. \(1/K_d\ll 1/K_r\)). This last transition represents the fact that the protein has to regenerate before being able to go through a new cycle. The 4-states model can be similarly interpreted. The transitions from closed states to open states are triggered by light stimulation and all the other transitions are independent of the intensity of the light applied to the neuron. Hence, \(\varepsilon _1\), \(\varepsilon _2\), \(e_{12}\), \(e_{21}\), \(K_{d1}\), \(K_{d2}\) and \(K_r\) are all positive constants. This constitutes our general assumption on the models we study: the transitions from closed states to open states depend linearly on the light intensity and all the others are independent of it. This assumption is not too strong since it leads to models that still reproduce the shape of the experimentally measured photocurrents produced by the channel. Furthermore, it makes our control system affine. The dynamical systems corresponding to Figs. 3 and 4 are given, respectively, by

$$\begin{aligned} \left\{ \begin{aligned} \dot{o}(t)&= u(t)(1-o(t)-d(t)) - K_do(t),\\ \dot{d}(t)&= K_do(t) - K_rd(t), \end{aligned} \right. \end{aligned}$$
(3.2)

and

$$\begin{aligned} \left\{ \begin{aligned} \dot{o}_1(t)&= \varepsilon _1u(t)(1-o_1(t)-o_2(t)-c_2(t)) - (K_{d1} + e_{12})o_1(t) + e_{21}o_2(t),\\ \dot{o}_2(t)&= \varepsilon _2u(t)c_2(t) + e_{12}o_1(t) - (K_{d2} + e_{21})o_2(t),\\ \dot{c}_2(t)&= K_{d2}o_2(t) - (\varepsilon _2u(t)+K_r)c_2(t). \end{aligned} \right. \end{aligned}$$
(3.3)
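As a quick consistency check on the 3-states system (3.2), the Python sketch below codes its right-hand side and the equilibrium reached under a constant light intensity u, obtained by setting both derivatives to zero: \(d=K_do/K_r\) and \(o=u/(K_d+u(1+K_d/K_r))\). The function names and the numerical values of u, \(K_d\) and \(K_r\) are hypothetical and chosen only so that \(K_d\) is larger than \(K_r\).

```python
def chr2_three_state_rhs(o, d, u, k_d, k_r):
    """Right-hand side of (3.2); the closed dark-adapted fraction is c = 1 - o - d."""
    do = u * (1.0 - o - d) - k_d * o
    dd = k_d * o - k_r * d
    return do, dd


def chr2_three_state_equilibrium(u, k_d, k_r):
    """Equilibrium (o, d) of (3.2) under a constant light intensity u >= 0."""
    o = u / (k_d + u * (1.0 + k_d / k_r))
    d = k_d * o / k_r
    return o, d


# Hypothetical constants with K_d > K_r (slow recovery from d to c).
u_const, k_d, k_r = 0.5, 0.1, 0.02
o_eq, d_eq = chr2_three_state_equilibrium(u_const, k_d, k_r)
residual = chr2_three_state_rhs(o_eq, d_eq, u_const, k_d, k_r)
```

With these assumed constants most channels end up in the state d under sustained illumination, because recovery from d to c is slow compared with the closing of the open state.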

In the 3-states model, the conductance of the ChR2 channel is assumed to be proportional to the probability o(t) that the channel opens, so that the ion current associated to ChR2 channels is given by

$$\begin{aligned} I_{ChR2}(t) = g_{ChR2}o(t)(V_{ChR2}-v(t)), \end{aligned}$$

with v the membrane potential of the channel, \(g_{ChR2}\) the maximal conductance of the channel and \(V_{ChR2}\) the equilibrium potential of the channel. See “Appendix 3” for the numerical computation of these constants. In the 4-states model, the open states are assumed to be of different conductivity so that

$$\begin{aligned} I_{ChR2}(t) = g_{ChR2}(o_1(t)+\rho o_2(t))(V_{ChR2}-v(t)), \end{aligned}$$

with \(\rho \in (0,1)\). We can now include these two models of ChR2 in a conductance-based model defined in the previous section.

Definition 4

  1. (i)

    We call ChR2-3-states controlled conductance-based model, the system given by

    $$\begin{aligned} \left\{ \begin{aligned} \dot{x}(t)&= f_0(x(t)) + \frac{1}{C}g_{ChR2}o(t)(V_{ChR2}-x_1(t))\mathbf {e}_1,\\ \dot{o}(t)&= u(t)(1-o(t)-d(t)) - K_do(t),\\ \dot{d}(t)&= K_do(t) - K_rd(t), \end{aligned} \right. \end{aligned}$$
    (3.4)

    with \(\mathbf {e}_1=(1,0,\ldots ,0)\in \mathbb {R}^n\). We rewrite this system in \(\mathbb {R}^{n+2}\) in the affine form

    $$\begin{aligned} \dot{y}(t) = \tilde{f}_0(y(t)) + u(t)f_1(y(t)), \quad t\in \mathbb {R}_+, \end{aligned}$$
    (3.5)

    with \(y(\cdot )=(x(\cdot ),o(\cdot ),d(\cdot ))\), \(\tilde{f}_0(y) = (f_0(x)+ \frac{1}{C}g_{ChR2}o(V_{ChR2}-x_1)\mathbf {e}_1,-K_do,K_do-K_rd)\), and \(f_1(y) = (1-o-d)\partial _o\), where \(\partial _o\) is the derivative with respect to the variable o.

  2. (ii)

    We call ChR2-4-states controlled conductance-based model, the system given by

    $$\begin{aligned} \left\{ \begin{aligned} \dot{x}(t)&= f_0(x(t)) + \frac{1}{C}g_{ChR2}(o_1(t)+\rho o_2(t))(V_{ChR2}-x_1(t))\mathbf {e}_1,\\ \dot{o}_1(t)&= \varepsilon _1u(t)(1-o_1(t)-o_2(t)-c_2(t)) - (K_{d1} + e_{12})o_1(t) + e_{21}o_2(t),\\ \dot{o}_2(t)&= \varepsilon _2u(t)c_2(t) + e_{12}o_1(t) - (K_{d2} + e_{21})o_2(t),\\ \dot{c}_2(t)&= K_{d2}o_2(t) - (\varepsilon _2u(t)+K_r)c_2(t). \end{aligned} \right. \end{aligned}$$
    (3.6)

    We also rewrite the system in \(\mathbb {R}^{n+3}\),

    $$\begin{aligned} \dot{z}(t) = \hat{f}_0(z(t)) + u(t)f_2(z(t)), \quad t\in \mathbb {R}_+, \end{aligned}$$
    (3.7)

    with \(z(\cdot )=(x(\cdot ),o_1(\cdot ),o_2(\cdot ),c_2(\cdot ))\),

    $$\begin{aligned} \hat{f}_0(z)&= \Big (f_0(x)+ \frac{1}{C}g_{ChR2}(o_1+\rho o_2)(V_{ChR2}-x_1)\mathbf {e}_1,\\&\quad - (K_{d1} + e_{12})o_1 + e_{21}o_2,\ e_{12}o_1 - (K_{d2} + e_{21})o_2,\ K_{d2}o_2-K_rc_2\Big ), \end{aligned}$$

    and

    $$\begin{aligned} f_2(z) = \varepsilon _1(1-o_1-o_2-c_2)\partial _{o_1} + \varepsilon _2c_2\partial _{o_2}- \varepsilon _2c_2\partial _{c_2}. \end{aligned}$$
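As a sanity check of the affine form (3.7), the sketch below compares the ChR2 components of (3.6) with the sum of the light-independent terms (the drift) and u times the coefficients of u (the control field) at random points. All numerical values and function names are hypothetical and chosen for illustration only. Note that the \(c_2\) component of the drift is \(K_{d2}o_2-K_rc_2\), as read off from the last line of (3.6).

```python
import random

# Hypothetical parameter values, for illustration only.
EPS1, EPS2 = 0.8, 0.1
E12, E21 = 0.05, 0.01
K_D1, K_D2, K_R = 0.1, 0.05, 0.001


def rhs_chr2(o1, o2, c2, u):
    """ChR2 part of (3.6), i.e. system (3.3), written as in the text."""
    c1 = 1.0 - o1 - o2 - c2
    return (
        EPS1 * u * c1 - (K_D1 + E12) * o1 + E21 * o2,
        EPS2 * u * c2 + E12 * o1 - (K_D2 + E21) * o2,
        K_D2 * o2 - (EPS2 * u + K_R) * c2,
    )


def drift(o1, o2, c2):
    """ChR2 components of the drift: the light-independent terms."""
    return (
        -(K_D1 + E12) * o1 + E21 * o2,
        E12 * o1 - (K_D2 + E21) * o2,
        K_D2 * o2 - K_R * c2,
    )


def control_field(o1, o2, c2):
    """ChR2 components of f_2: the coefficients of u."""
    return (EPS1 * (1.0 - o1 - o2 - c2), EPS2 * c2, -EPS2 * c2)


def max_split_error(trials=1000, seed=0):
    """Largest gap between (3.6) and drift + u * control field on random samples."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        o1, o2, c2 = (rng.uniform(0.0, 1.0) for _ in range(3))
        u = rng.uniform(0.0, 1.0)
        full = rhs_chr2(o1, o2, c2, u)
        split = [a + u * b
                 for a, b in zip(drift(o1, o2, c2), control_field(o1, o2, c2))]
        worst = max(worst, max(abs(p - q) for p, q in zip(full, split)))
    return worst
```

The gap returned by `max_split_error()` should be at the level of floating-point rounding, since the decomposition is an algebraic identity.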

Notation

Let \(k\in \mathbb {N}^*\). We use two ways to write a vector field \(F:\mathbb {R}^k\rightarrow \mathbb {R}^k\). For \(x\in \mathbb {R}^k\), we write either

  • \(F(x) = (F_1(x),\ldots ,F_k(x))\), or

  • \(F(x) = F_1(x)\partial _1 + \cdots + F_k(x)\partial _k\),

where \(F_i:\mathbb {R}^k\rightarrow \mathbb {R}\) is the \(i{\mathrm {th}}\) coordinate of F and \(\partial _i\) is the partial derivative along the \(i^{\mathrm {th}}\) direction, for \(i\in \{1,\ldots ,k\}\).

We already used this mixed notation in Definition 4 above. The second notation will be useful for the computation of Lie brackets later in this paper.

Note that for a bounded measurable function \(u:\mathbb {R}_+\rightarrow \mathbb {R}\) and a starting point \(((o_0,d_0),(o_1,o_2,c_2))\in \mathbb {R}^2\times \mathbb {R}^3\), the systems (3.2) and (3.3) admit a unique solution, absolutely continuous on \(\mathbb {R}_+\). Thus, for every bounded measurable function \(u:\mathbb {R}_+\rightarrow \mathbb {R}\) and all initial conditions \(y_0\in \mathcal {D}\times \mathbb {R}^2\) and \(z_0\in \mathcal {D}\times \mathbb {R}^3\), the systems (3.4) and (3.6) have a unique solution, defined on \(\mathbb {R}_+\) and such that \(x(\cdot )\) is of class \(C^1\) and \((o(\cdot ),d(\cdot ))\) and \((o_1(\cdot ),o_2(\cdot ),c_2(\cdot ))\) are absolutely continuous on \(\mathbb {R}_+\).

3.1 The minimal time spiking problem

The control problem we are interested in here can be formulated for both ChR2 models. Consider a conductance-based neuron model in its resting state. If no light is applied to the neuron (i.e. \(u\equiv 0\)) then the system stays in this resting state. We want to find the optimal control that triggers a spike in minimum time when starting from the resting state. To do so, let \(V_s > 0\) be the membrane potential that we take to define a spike. Since the control is proportional to the intensity of the light applied to the neuron, the control space U will be a segment \([0,u_{max}]\), with \(u_{max}>0\). Let \(x_{eq}\in \mathbb {R}^n\) be a resting state of the conductance-based model. In the next two sections, we formulate the mathematical problem for both ChR2 models.

3.1.1 The ChR2 3-states model

Let \(y_0 = (x_{eq},0,0) \in \mathbb {R}^{n+2}\) be our starting point. The state (0, 0) for the system (3.2) corresponds to a neuron being in the dark for quite a long period of time (i.e. all the ChR2 channels are in the dark adapted closed state c). From \(y_0\), we then want to reach in minimal time (denoted \(t_f\)) the manifold

$$\begin{aligned} M_s := \{y\in \mathbb {R}^{n+2} | y_1 = V_s\}. \end{aligned}$$

As in Sect. 2.2 we define \(\mathcal {H} :\mathbb {R}^{n+2} \times \mathbb {R}^{n+2}\times \mathbb {R}_-\times U \rightarrow \mathbb {R}\) the Hamiltonian of the system for \((y,p,p^0,u)\in \mathbb {R}^{n+2}\times \mathbb {R}^{n+2}\times \mathbb {R}_-\times U\) by

$$\begin{aligned} \mathcal {H}(y,p,p^0,u) := \langle p,\tilde{f}_0(y) \rangle + u \langle p,f_1(y) \rangle + p^0. \end{aligned}$$
(3.8)

This control problem falls into the framework of Sect. 2.2. If there is no singular extremal, the optimal control is bang–bang and is given by the sign of the switching function. Let \(p : \mathbb {R}_+ \rightarrow \mathbb {R}^{n+2}\) be the adjoint vector of the Pontryagin Maximum Principle. The switching function reads, for \(t\in [0,t_f]\),

$$\begin{aligned} \varphi (t):=(1-o(t)-d(t))p_o(t) = (1-y_{n+1}(t)-y_{n+2}(t))p_{n+1}(t). \end{aligned}$$

In the absence of singular extremals, if we denote by \(u^* : [0,t_f]\rightarrow U\) the optimal control, then

$$\begin{aligned} u^*(t) = u_{max}\mathbf {1}_{\varphi (t) > 0}, \quad \forall t\in [0,t_f]. \end{aligned}$$

3.1.2 The ChR2 4-states model

We define here the same quantities for the 4-states model. Let \(z_0 = (x_{eq},0,0,0) \in \mathbb {R}^{n+3}\) be our starting point. From \(z_0\), we then want to reach in minimal time (denoted \(t_f\)) the manifold

$$\begin{aligned} M_s := \{z\in \mathbb {R}^{n+3} | z_1 = V_s\}. \end{aligned}$$

The Hamiltonian \(\mathcal {H} :\mathbb {R}^{n+3} \times \mathbb {R}^{n+3}\times \mathbb {R}_-\times U \rightarrow \mathbb {R}\) is defined for \((z,q,q^0,u)\in \mathbb {R}^{n+3}\times \mathbb {R}^{n+3}\times \mathbb {R}_-\times U\) by

$$\begin{aligned} \mathcal {H}(z,q,q^0,u) := \langle q,\hat{f}_0(z) \rangle + u \langle q,f_2(z) \rangle + q^0. \end{aligned}$$
(3.9)

Let \(q : \mathbb {R}_+ \rightarrow \mathbb {R}^{n+3}\) be the adjoint vector of the Pontryagin Maximum Principle. The switching function reads, for \(t\in [0,t_f]\),

$$\begin{aligned} \psi (t):=\varepsilon _1(1-o_1(t)-o_2(t)-c_2(t))q_{o_1}(t) + \varepsilon _2c_2(t)q_{o_2}(t)- \varepsilon _2c_2(t)q_{c_2}(t). \end{aligned}$$

Singular extremals correspond to vanishing switching functions. We will treat the two ChR2 models in different ways. Indeed, the 3-states model is theoretically tractable and is the object of the following section. The 4-states model will be investigated numerically.

3.2 The Goh transformation for the ChR2 3-states model

We state and prove here our main reduction result regarding the existence of optimal singular controls for the ChR2-3-states control problem.

Theorem 2

The existence of optimal singular extremals in the spiking problem in minimal time for the control system (3.4) is equivalent to the existence of optimal singular extremals in the same problem but for the reduced system on \(\mathbb {R}^n\)

$$\begin{aligned} \dot{x} = f_0(x) + o\tilde{f_1}(x), \end{aligned}$$

where o is the control variable and \(\tilde{f_1}(x) =\frac{1}{C}g_{ChR2}(V_{ChR2}-x_1)\mathbf {e}_1 \).

Every nonlinear control system of the form \(\dot{x}=f(x,u)\) can be interpreted as an affine one by making the transformation \(\dot{u}=v\) and considering the variable v as the new control and the variable (x, u) as the new state variable. The inverse transformation, called the Goh transformation, is a powerful tool for the investigation of singular extremals and will prove fundamental here to show the absence of optimal singular trajectories in the models we will consider later.

Notations

To every couple of points \(y:=(x,o,d)\in \mathbb {R}^{n+2}\) and \(p:=(p_x,p_o,p_d)\in \mathbb {R}^{n+2}\) we associate a couple of points of \(\mathbb {R}^{n+1}\) defined by \(\tilde{y}:=(x,d)\) and \(\tilde{p}:=(p_x,p_d)\). Moreover, we write the corresponding reduced Hamiltonian \(\tilde{\mathcal {H}}\) defined for \((\tilde{y},\tilde{p},p^0)\in \mathbb {R}^{n+1}\times \mathbb {R}^{n+1}\times \mathbb {R}_-\) and \(o\in \mathbb {R}\) by \(\tilde{\mathcal {H}}(\tilde{y},\tilde{p},p^0,o):= \langle \tilde{p},\tilde{f}_0(\tilde{y})\rangle + o \langle \tilde{p},\tilde{f}_1(\tilde{y})\rangle + p^0\), where, with a slight abuse of notation, the vector field \(\tilde{f}_0\) now denotes the part of the drift that does not depend on the variable o, \(\tilde{f}_0(\tilde{y}) := (f_0(x),-K_rd)\), and the vector field \(\tilde{f}_1\) is defined, for all \(\tilde{y}\in \mathbb {R}^{n+1}\), by \(\tilde{f}_1(\tilde{y}) := g_{ChR2}(V_{ChR2}-\tilde{y}_1)\partial _1 + K_d\partial _d\), in agreement with system (3.12) below.

The following lemma is the first step to reduce the dimension of the system that has to be considered to investigate the existence of singular extremals.

Lemma 1

(y, p) is the projection, on the space of continuous functions from \(\mathbb {R}_+\) to \(\mathbb {R}^{n+2}\times \mathbb {R}^{n+2}\), of a solution (y, p, u) of

$$\begin{aligned} \dot{y}(t)= & {} \frac{\partial \mathcal {H}}{\partial p}(y(t),p(t),p^0,u(t)), \nonumber \\ \dot{p}(t)= & {} -\frac{\partial \mathcal {H}}{\partial y}(y(t),p(t),p^0,u(t)), \quad \langle p(t),f_1(y(t)) \rangle =0. \end{aligned}$$
(3.10)

if and only if \(p_o\equiv 0\), \(\dot{o}=(1-o-d)u-K_do\), and \((\tilde{y},\tilde{p})\) is a solution of

$$\begin{aligned} \dot{\tilde{y}}(t)= & {} \frac{\partial \tilde{\mathcal {H}}}{\partial \tilde{p}}(\tilde{y}(t),\tilde{p}(t),p^0,o(t)), \nonumber \\ \dot{\tilde{p}}(t)= & {} -\frac{\partial \tilde{\mathcal {H}}}{\partial \tilde{y}}(\tilde{y}(t),\tilde{p}(t),p^0,o(t)), \quad \langle \tilde{p}(t),\tilde{f}_1(\tilde{y}(t)) \rangle =0. \end{aligned}$$
(3.11)

This lemma shows that singular extremals of (3.4) are directly related to singular extremals of the following control system, which is still affine:

$$\begin{aligned} \left\{ \begin{aligned} \dot{x}(t)&= f_0(x(t)) + g_{ChR2}o(t)(V_{ChR2}-x_1(t))\mathbf {e}_1,\\ \dot{d}(t)&= K_do(t) - K_rd(t), \end{aligned} \right. \end{aligned}$$
(3.12)

where the control is now the variable o.

In the models that we are going to study in the sequel, we will see that this transformation allows us to conclude that there are no optimal singular extremals.

Proof of Lemma 1

The proof comes from the general result of Section 1.9.4 of Bonnard and Kupka (1993) and the structure of our particular model. If we keep on writing \(y=(x,o,d)\), system (3.10) gives on an interval I of \([0,t_f]\):

$$\begin{aligned} \left\{ \begin{aligned} \dot{x}&= f_0(x) + g_{ChR2}o(V_{ChR2}-x_1)\mathbf {e}_1,\\ \dot{d}&= K_do-K_rd,\\ \dot{o}&=(1-o-d)u-K_do,\\ \dot{p}_x&= -J_{f_0}^tp_x + g_{ChR2}o\,p_{x_1}\mathbf {e}_1,\\ \dot{p}_d&= up_o +K_rp_d,\\ \dot{p}_o&= -g_{ChR2}(V_{ChR2}-x_1)p_{x_1} -K_dp_d+(u+K_d)p_o,\\ 0&=(1-o-d)p_o, \end{aligned} \right. \end{aligned}$$
(3.13)

where \(J_{f_0}^t\) is the transpose of the Jacobian matrix of \(f_0\). For continuity reasons, we get that either \(p_o\equiv 0\) or \((1-o-d)\equiv 0\) on I. If \((1-o-d)\equiv 0\) then \(-K_rd = \dot{o}+\dot{d}\equiv 0\) so that \(d\equiv 0\) and \(o\equiv 1\). But \(d\equiv 0 \Rightarrow \dot{d}\equiv 0\) so that \(\dot{o}\equiv 0\) which is incompatible with \(o\equiv 1\), since \(\dot{o}=-K_do\). We conclude that, necessarily, \(p_o\equiv 0\) on I. This equality implies that \(\dot{p}_o\equiv 0\) and from the penultimate equation of (3.13) we get \(-g_{ChR2}(V_{ChR2}-x_1)p_{x_1} -K_dp_d\equiv 0\), which can also be written \(\langle \tilde{p},\tilde{f}_1(\tilde{y}) \rangle \equiv 0\). Now the first two equations of (3.13) correspond to

$$\begin{aligned} \dot{\tilde{y}}(t) = \frac{\partial \tilde{\mathcal {H}}}{\partial \tilde{p}}(\tilde{y}(t),\tilde{p}(t),p^0,o(t)), \end{aligned}$$

and the \(4{\mathrm {th}}\) and \(5{\mathrm {th}}\) equations correspond to

$$\begin{aligned} \dot{\tilde{p}}(t) = -\frac{\partial \tilde{\mathcal {H}}}{\partial \tilde{y}}(\tilde{y}(t),\tilde{p}(t),p^0,o(t)). \end{aligned}$$

We just showed that (3.10) \(\Rightarrow \) (\(p_o\equiv 0\) and (3.11)).

Suppose now that \(p_o\equiv 0\) on I and that (3.11) is satisfied and let us show that (3.13) is satisfied. The first two equations of (3.11) give the \(1{\mathrm {st}}\), \(2{\mathrm {nd}}\), \(4{\mathrm {th}}\) and \(5{\mathrm {th}}\) equations of (3.13). Moreover, \(p_o\equiv 0\) implies that the last equation of (3.13) is satisfied and that \(\dot{p}_o \equiv 0\). Taking into account that \(\langle \tilde{p},\tilde{f}_1(\tilde{y}) \rangle \equiv 0\), that is, \(-g_{ChR2}(V_{ChR2}-x_1)p_{x_1} -K_dp_d\equiv 0\), we obtain the \(6{\mathrm {th}}\) equation of (3.13). Finally, the \(3{\mathrm {rd}}\) equation of (3.13) is satisfied as a hypothesis, which ends the proof. \(\square \)

Proof of Theorem 2

The result of Lemma 1 is the first step of the proof. To complete it, consider the spiking problem in minimum time for the reduced system (3.12):

$$\begin{aligned} \left\{ \begin{aligned} \dot{x}(t)&= f_0(x(t)) + g_{ChR2}o(t)(V_{ChR2}-x_1(t))\mathbf {e}_1,\\ \dot{d}(t)&= K_do(t) - K_rd(t). \end{aligned} \right. \end{aligned}$$

We remark that the dynamics of the variable x does not depend on the variable d. Furthermore, the targeted manifold is only defined by the location of variable \(x_1\). These two remarks imply that an optimal control for system (3.12) has to be optimal for the even more reduced control system:

$$\begin{aligned} \dot{x}(t) = f_0(x(t)) + g_{ChR2}o(t)(V_{ChR2}-x_1(t))\mathbf {e}_1. \end{aligned}$$

\(\square \)

3.3 Lie bracket configurations for the ChR2 4-states model

In the case of the ChR2 4-states model, we will observe numerically that the optimal control is bang–bang for various values of the maximum intensity \(u_{max}\). Here we define Lie brackets, which are the appropriate tool to investigate singular extremals, and give the expression of the first ones. We give two equivalent definitions, depending on the notation used for the vector fields.

Let \(k\in \mathbb {N}^*\) and let \(g,h : \mathbb {R}^k\rightarrow \mathbb {R}^k\) be two vector fields of class \(C^1\). Let \((g_1,\ldots ,g_k)\) and \((h_1,\ldots ,h_k)\) be their coordinate mappings. The Lie bracket \([g,h]: \mathbb {R}^k\rightarrow \mathbb {R}^k\) of g and h is the vector field defined for \(x\in \mathbb {R}^k\) by

$$\begin{aligned} {[}g,h](x) = J_h(x)g(x) - J_g(x)h(x), \end{aligned}$$

or equivalently by

$$\begin{aligned} {[}g,h](x) = \sum _{i=1}^k\sum _{j=1}^k \big (g_j(x)\partial _jh_i(x) - h_j(x)\partial _jg_i(x)\big )\partial _i, \end{aligned}$$

where \(J_h\) and \(J_g\) are the Jacobian matrices of h and g. The expression \(J_h(x)g(x)\) has to be understood as the product of the \(k\times k\)-matrix by the k-vector. Further in this paper we will use the convenient notation

$$\begin{aligned} \mathrm {ad}_h g := [h,g] \end{aligned}$$

which shortens expressions involving iterated Lie brackets. Finally, one important relation for the computation of singular controls is the following. Consider an affine control system \(\dot{x} = F_0(x) + uF_1(x)\) on \(\mathbb {R}^k\) and let \((x^u,p)\) be an extremal pair of the Pontryagin maximum principle associated to a control u. Then for any smooth vector field \(h:\mathbb {R}^k\rightarrow \mathbb {R}^k\) and all \(t\in [0,t_f]\),

$$\begin{aligned} \frac{\mathrm {d}}{\mathrm {d}t} \langle p(t), h(x^u(t))\rangle = \langle p(t), [F_0,h](x^{u}(t))\rangle + u(t)\langle p(t), [F_1,h](x^{u}(t))\rangle . \end{aligned}$$
(3.14)

In most cases, a singular optimal control \(\bar{u}\) would have the expression

$$\begin{aligned} \bar{u}(t) = \frac{\langle q(t) , \mathrm {ad}^2_{\hat{f}_0}f_2(z(t))\rangle }{\langle q(t) , \mathrm {ad}^2_{f_2}\hat{f}_0(z(t))\rangle }. \end{aligned}$$

Indeed, if I is an interval of \([0,t_f]\) on which the switching function \(\psi \) vanishes, then for \(t\in I\),

$$\begin{aligned} \psi (t)&= 0,\\ \dot{\psi }(t)&= \langle q(t), [\hat{f}_0,f_2](z(t)) \rangle = 0 ,\\ \ddot{\psi }(t)&= \langle q(t) , \mathrm {ad}^2_{\hat{f}_0}f_2(z(t))\rangle -\bar{u}(t)\langle q(t) , \mathrm {ad}^2_{f_2}\hat{f}_0(z(t))\rangle = 0. \end{aligned}$$

We will not give the expression of \(\mathrm {ad}^2_{\hat{f}_0}f_2\) because it is too long and of little interest since we will treat the problem numerically. Let us just mention that it has nonzero components on all the directions of the state space \(\mathbb {R}^{n+3}\). The expressions of \([\hat{f}_0,f_2]\) and \(\mathrm {ad}^2_{f_2}\hat{f}_0\) are simpler since these brackets have nonzero components only on the directions \(z_1\), \(z_{n+1}\), \(z_{n+2}\) and \(z_{n+3}\) (independently of \(n\in \mathbb {N}^*\)), which we also write v, \(o_1\), \(o_2\) and \(c_2\). They read

$$\begin{aligned} {[}\hat{f}_0,f_2](z)&= -\Big (\varepsilon _1(1-o_1-o_2-c_2)+\varepsilon _2\rho c_2\Big )\frac{1}{C}g_{ChR2}(V_{ChR2}-v)\partial _v\\&\quad +\Big (\varepsilon _1(1-o_1-o_2-c_2)(e_{12}+K_{d1})+\varepsilon _1K_{d1}o_1\\&\quad +(\varepsilon _1K_r-\varepsilon _2e_{21})c_2\Big )\partial _{o_1}\\&\quad + \Big (-\varepsilon _1(1-o_1-o_2-c_2)e_{12}+\varepsilon _2K_{d2}o_2\\&\quad + \varepsilon _2(e_{21}+K_{d2}-K_r)c_2\Big )\partial _{o_2}\\&\quad - \varepsilon _2K_{d2}(o_2+c_2)\partial _{c_2}, \end{aligned}$$

and

$$\begin{aligned} \mathrm {ad}^2_{f_2}\hat{f}_0(z)&= -\Big ((\varepsilon _1)^2(1-o_1-o_2-c_2)+(\varepsilon _2)^2\rho c_2\Big )\frac{1}{C}g_{ChR2}(V_{ChR2}-v)\partial _v\\&\quad -\varepsilon _1\Big (\varepsilon _1(1-o_1-o_2-c_2)(e_{12}+K_{d1})\\&\quad +\varepsilon _1K_{d1}o_1-(\varepsilon _1K_r-\varepsilon _2e_{21})c_2\Big )\partial _{o_1}\\&\quad - \Big ((\varepsilon _1)^2(1-o_1-o_2-c_2)e_{12}+(\varepsilon _2)^2K_{d2}o_2\\&\quad + (\varepsilon _2)^2(-e_{21}+K_{d2}-K_r)c_2\Big )\partial _{o_2}\\&\quad + (\varepsilon _2)^2K_{d2}(o_2+c_2)\partial _{c_2}. \end{aligned}$$
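The expression of the first bracket \([\hat{f}_0,f_2]\) can be checked numerically from the definition \([g,h](x) = J_h(x)g(x) - J_g(x)h(x)\). The following sketch does so for \(n=1\), with a toy drift \(f_0(v)=-v\) and hypothetical parameter values; both are assumptions made for illustration and do not correspond to any model of this paper. Jacobian-vector products are approximated by central differences, which are exact up to rounding here because the vector fields are polynomials of degree at most two.

```python
# Hypothetical parameters, for illustration only.
EPS1, EPS2, RHO = 0.8, 0.1, 0.05
E12, E21 = 0.05, 0.01
K_D1, K_D2, K_R = 0.1, 0.05, 0.001
G_OVER_C, V_CHR2 = 0.4, 0.0   # g_ChR2 / C and V_ChR2


def f0_hat(z):
    """Drift of (3.7) for n = 1, with the toy choice f_0(v) = -v."""
    v, o1, o2, c2 = z
    return [
        -v + G_OVER_C * (o1 + RHO * o2) * (V_CHR2 - v),
        -(K_D1 + E12) * o1 + E21 * o2,
        E12 * o1 - (K_D2 + E21) * o2,
        K_D2 * o2 - K_R * c2,
    ]


def f2(z):
    """Control vector field of (3.7)."""
    v, o1, o2, c2 = z
    return [0.0, EPS1 * (1 - o1 - o2 - c2), EPS2 * c2, -EPS2 * c2]


def jvp(h, z, w, step=1e-3):
    """Central-difference approximation of the product J_h(z) w."""
    zp = [zi + step * wi for zi, wi in zip(z, w)]
    zm = [zi - step * wi for zi, wi in zip(z, w)]
    return [(a - b) / (2 * step) for a, b in zip(h(zp), h(zm))]


def lie_bracket(g, h, z):
    """[g, h](z) = J_h(z) g(z) - J_g(z) h(z)."""
    return [a - b for a, b in zip(jvp(h, z, g(z)), jvp(g, z, h(z)))]


def bracket_closed_form(z):
    """Closed-form expression of [f0_hat, f2] given in the text."""
    v, o1, o2, c2 = z
    c1 = 1 - o1 - o2 - c2
    return [
        -(EPS1 * c1 + EPS2 * RHO * c2) * G_OVER_C * (V_CHR2 - v),
        EPS1 * c1 * (E12 + K_D1) + EPS1 * K_D1 * o1
        + (EPS1 * K_R - EPS2 * E21) * c2,
        -EPS1 * c1 * E12 + EPS2 * K_D2 * o2
        + EPS2 * (E21 + K_D2 - K_R) * c2,
        -EPS2 * K_D2 * (o2 + c2),
    ]
```

The toy drift does not influence the result: since \(f_2\) has no component along v and \(f_0\) does not depend on the ChR2 variables, \(f_0\) drops out of the bracket.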

4 Application to some neuron models with numerical results

In this section, we apply the reduction results of Sect. 3.2 to some widely used models and support our theoretical results with numerical results. These theoretical results regard the ChR2-3-states model and we also investigate numerically the associated ChR2-4-states models. The numerical results are obtained by direct methods based on the ipopt routine (Wächter and Biegler 2006) to solve nonlinear optimization problems, and implemented with the ampl language (Fourer et al. 2002). For a survey on numerical methods in optimal control, see Trélat (2012). The numerical values used for the ChR2-3-states and 4-states models are those in Appendix “The 3-state model” and “The 4-states model” sections. For each neuron model that we study, namely the FitzHugh–Nagumo model, the Morris–Lecar model and the reduced and complete Hodgkin–Huxley models, we implement the direct method for the ChR2-3-states and 4-states models and compare them. We repeat the computation for several values of the maximum control value in order to try to detect possible singular optimal controls. Indeed, a singular optimal control could appear only above some threshold of the maximal control value. Nevertheless, no model numerically displays such controls. We then compare the neuron models in terms of their behavior with respect to optogenetic control. Physiologically, Channelrhodopsin has a depolarizing effect on a neuron membrane, so it is intuitive to expect that we need to switch on the light to obtain a spike, and that the more light we put in the system, the faster the spike will occur. We propose to distinguish between two classes of models. The first class comprises neuron models that display the intuitive physiological response to optogenetic stimulation and the second class comprises neuron models that display an unexpected response.

4.1 The FitzHugh–Nagumo model

The FitzHugh–Nagumo model is not exactly a conductance-based model but a two-dimensional simplification of the Hodgkin–Huxley model. This model takes its name from the initial work of FitzHugh (1961) who suggested the system and Nagumo et al. (1962) who gave the equivalent circuit. The idea was to find a simpler model that still featured the mathematical properties of excitation and propagation.

The ChR2-3-states model

The ChR2-3-states controlled FitzHugh–Nagumo model is

$$\begin{aligned} (FHN)\left\{ \begin{aligned} \dot{v}(t)&= v(t)-\frac{1}{3}v^3(t)-w(t) +\frac{1}{C}g_{ChR2}o(t)(V_{ChR2}-v(t)),\\ \dot{w}(t)&= c(v(t) + a -bw(t)),\\ \dot{o}(t)&= u(t)(1-o(t)-d(t)) - K_do(t),\\ \dot{d}(t)&= K_do(t) - K_rd(t), \end{aligned} \right. \end{aligned}$$

where v is the membrane potential and w is a conductance-like variable that provides a negative feedback, and a, b and c are constants. In the original model, the numerical values of these constants were \(a=0.7\), \(b=0.8\) and \(c=0.08\). The adjoint equations read

$$\begin{aligned} (FHN_{adj})\left\{ \begin{aligned} \dot{p}_v(t)&= -p_v(t)(1-v^2(t)-\frac{1}{C}g_{ChR2}o(t))-cp_w(t),\\ \dot{p}_w(t)&= p_v(t) + bcp_w(t),\\ \dot{p}_o(t)&= -\frac{1}{C}g_{ChR2}(V_{ChR2}-v(t))p_v(t) + (u(t)+K_d)p_o(t) -K_dp_d(t),\\ \dot{p}_d(t)&= u(t)p_o(t) + K_rp_d(t), \end{aligned} \right. \end{aligned}$$

and the switching function is \(\varphi (t) = (1-o(t)-d(t))p_o(t)\). The following proposition gives the optimal control for the minimal time control of the ChR2-controlled FitzHugh–Nagumo model.

Proposition 1

The optimal control \(u^*:\mathbb {R}_+\rightarrow U\) for the minimal time control of the FitzHugh–Nagumo model is bang–bang and given by

$$\begin{aligned} u^*(t) = u_{max} \mathbf {1}_{p_o(t)>0}, \quad \forall t\in [0,t_f]. \end{aligned}$$

Furthermore, the optimal control begins with a bang arc of maximal value, i.e.

$$\begin{aligned} \exists t_1\in [0,t_f], u^*(t)=u_{max}, \quad \forall t\in [0,t_1]. \end{aligned}$$

Proof

Let us show that there are no optimal singular extremals. The results for conductance-based models given in Sect. 3.2 are straightforwardly applicable to the FitzHugh–Nagumo model, and the reduced control system is the following

$$\begin{aligned} (FHN')\left\{ \begin{aligned} \dot{v}(t)&= v(t)-\frac{1}{3}v^3(t)-w(t) +\frac{1}{C}g_{ChR2}u(t)(V_{ChR2}-v(t)),\\ \dot{w}(t)&= c(v(t) + a -bw(t)).\\ \end{aligned} \right. \end{aligned}$$

The adjoint equations for this system are

$$\begin{aligned} (FHN'_{adj})\left\{ \begin{aligned} \dot{p}_v(t)&= -p_v(t)(1-v^2(t)-\frac{1}{C}g_{ChR2}u(t))-cp_w(t),\\ \dot{p}_w(t)&= p_v(t) + bcp_w(t).\\ \end{aligned} \right. \end{aligned}$$

The vector fields defining the affine system (\(FHN'\)) are

$$\begin{aligned} \begin{aligned} f_0(v,w)&= (v-\frac{1}{3}v^3-w)\partial _v + c(v + a -bw)\partial _w,\\ f_1(v,w)&= \frac{1}{C}g_{ChR2}(V_{ChR2}-v)\partial _v.\\ \end{aligned} \end{aligned}$$

For the reduced system, the switching function is given by

$$\begin{aligned} \phi (t) = \langle p(t), f_1(v(t),w(t))\rangle = \frac{1}{C}g_{ChR2}(V_{ChR2}-v(t))p_v(t). \end{aligned}$$

Investigation of singular trajectories

Assume that there exists an open interval I along which the switching function vanishes. Then for all \(t\in I\),

$$\begin{aligned} \langle p(t), f_1(v(t),w(t))\rangle = 0. \end{aligned}$$

By continuity, this means that either v is constant and equal to \(V_{ChR2}\) on I or \(p_v\) vanishes on I. The constant case is not possible since it implies, from the dynamical system (\(FHN'\)), that w would also be constant on I, but \((V_{ChR2},w)\) is not an equilibrium point of the uncontrolled system, for any \(w\in \mathbb {R}\). Then, necessarily, \(p_v\) vanishes on I. This implies that \(\dot{p}_v\) also vanishes and, from (\(FHN'_{adj}\)), that \(p_w\) vanishes on I. This is incompatible with the Pontryagin maximum principle, since the adjoint vector p cannot vanish on I.

We showed that the reduced system does not present any singular extremals and from Theorem 2, the original system (FHN) does not either. The optimal control is then bang–bang and is given by the sign of the switching function of the original system. Taking into account that for all \(t\in [0,t_f]\), \(1-o(t)-d(t) > 0\) we get

$$\begin{aligned} u^*(t) = u_{max} \mathbf {1}_{p_o(t)>0}, \quad \forall t\in [0,t_f]. \end{aligned}$$

Finally, to show that the first arc corresponds to a maximal control, suppose that the optimal control starts with an arc \(u^*\equiv 0\) on some interval \([0,t_1]\). Then system (FHN) stays in its resting state on this interval, contradicting time optimality. \(\square \)

We implement the direct method for this problem with a targeted action potential \(V_s := 1.5\) mV and a control evolving in [0, 0.1]. The numerical values of the constants (a, b, c) are set to the usual values (0.7, 0.8, 0.08). Since this model is not physiological, we chose the values for the constants C, \(g_{ChR2}\), \(V_{ChR2}\) and \(u_{max}\) quite arbitrarily, with the constraint that the behavior of the controlled system should not stray away from the uncontrolled system. When the control is off, the system stays at rest, as seen in Fig. 5.

Fig. 5 In the absence of stimulation, the neuron stays in its resting state
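The resting behavior can be reproduced with a few lines of code. The sketch below computes the resting state of the uncontrolled FitzHugh–Nagumo model by bisection and integrates system (FHN) under constant light. The constants (a, b, c) are those quoted above; the values of `G_OVER_C`, `V_CHR2`, `K_D` and `K_R`, as well as the Runge–Kutta integrator, are hypothetical choices made for illustration and are not the ones used for the figures of this paper.

```python
A, B, C_FHN = 0.7, 0.8, 0.08      # FitzHugh-Nagumo constants quoted in the text
G_OVER_C, V_CHR2 = 0.5, 0.0       # hypothetical g_ChR2 / C and V_ChR2
K_D, K_R = 0.1, 0.005             # hypothetical ChR2 rate constants


def resting_state():
    """Equilibrium (v, w) of the uncontrolled model, found by bisection."""
    def residual(v):
        # dv/dt evaluated on the w-nullcline w = (v + a) / b; decreasing in v
        return v - v ** 3 / 3.0 - (v + A) / B
    lo, hi = -3.0, 3.0            # residual(lo) > 0 > residual(hi)
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    v = 0.5 * (lo + hi)
    return v, (v + A) / B


def rhs(y, u):
    """Right-hand side of (FHN) for the state y = (v, w, o, d)."""
    v, w, o, d = y
    return [
        v - v ** 3 / 3.0 - w + G_OVER_C * o * (V_CHR2 - v),
        C_FHN * (v + A - B * w),
        u * (1.0 - o - d) - K_D * o,
        K_D * o - K_R * d,
    ]


def simulate(u, t_end, dt=0.01):
    """RK4 integration of (FHN) from the resting state under constant light u."""
    v_eq, w_eq = resting_state()
    y = [v_eq, w_eq, 0.0, 0.0]
    for _ in range(int(round(t_end / dt))):
        k1 = rhs(y, u)
        k2 = rhs([a + 0.5 * dt * b for a, b in zip(y, k1)], u)
        k3 = rhs([a + 0.5 * dt * b for a, b in zip(y, k2)], u)
        k4 = rhs([a + dt * b for a, b in zip(y, k3)], u)
        y = [a + dt * (p + 2 * q + 2 * r + s) / 6.0
             for a, p, q, r, s in zip(y, k1, k2, k3, k4)]
    return y
```

With \(u\equiv 0\) the state does not move away from rest, whereas switching the light on opens channels and, for a reversal potential above the resting potential as assumed here, depolarizes the membrane.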

Figure 6 shows the optimal trajectory of the membrane potential and the optimal control. As predicted, the optimal control is bang–bang and starts with a maximal arc. It has a unique switching time, which means that there is no need to keep the light on all the way to the spike, an interesting fact for the controller. This optimal control can be qualified as physiological: the light must stay on until the system is “launched” toward the spike, after which no further illumination is required.

Fig. 6 Optimal trajectory and control for the FHN-ChR2-3-states model with \(u_{max}=0.5\,\mathrm {ms}^{-1}\)

The ChR2-4-states model

The ChR2-4-states model gives the same shape of optimal trajectory and control. We can compare the two ChR2 models and observe the results for different values of \(u_{max}\) in Fig. 7 and Table 1. Figure 7 shows the optimal membrane potential and the optimal control trajectories for \(u_{max} = 0.5\) ms\(^{-1}\) and Table 1 gathers the time to the first spike and the time to the switch of the optimal control, as a percentage of the time to the spike. The ChR2-4-states model fires faster than the ChR2-3-states model. Furthermore, the ChR2-4-states model requires less time in the light to fire than the ChR2-3-states model. This phenomenon seems to be independent of the maximal value of the control. The gain is around \(6\%\) in the four cases.

Fig. 7 Optimal trajectory and bang–bang optimal control for the FHN-ChR2-3-states and FHN-ChR2-4-states models with \(u_{max}=0.5\,\mathrm {ms}^{-1}\)

Table 1 Comparison of the FHN-ChR2-3-states and FHN-ChR2-4-states models for different values of the maximum value of the control

4.2 The Morris–Lecar model

The Morris–Lecar model is a reduced conductance-based model taking into account a \(Ca^{2+}\) current for excitation and a \(K^+\) current for recovery (Lecar and Morris 1981). It comes from the experimental study of the oscillatory behavior of the membrane potential in the barnacle muscle. The original model is of dimension 3, but it is conveniently and commonly reduced to a two-dimensional model by invoking the fast dynamics of the \(Ca^{2+}\) conductance with respect to the other variables. This conductance is then replaced by its steady-state.

The ChR2-3-states model

The ChR2-3-states controlled Morris–Lecar model is given by

$$\begin{aligned} (ML)\left\{ \begin{aligned} \dot{\nu }(t)&= \frac{1}{C}\Big (g_K\omega (t)(V_K-\nu (t)) + g_{Ca}m_{\infty }(\nu (t))(V_{Ca}-\nu (t)) \\&\quad +g_{ChR2}o(t)(V_{ChR2}-\nu (t)) + g_L(V_L-\nu (t))\Big ),\\ \dot{\omega }(t)&= \alpha (\nu (t))(1-\omega (t)) - \beta (\nu (t))\omega (t),\\ \dot{o}(t)&= u(t)(1-o(t)-d(t)) - K_do(t),\\ \dot{d}(t)&= K_do(t) - K_rd(t), \end{aligned} \right. \end{aligned}$$

with

$$\begin{aligned} m_{\infty }(\nu )&= \frac{1}{2}\left( 1+\tanh \left( \frac{\nu -V_1}{V_2}\right) \right) ,\\ \alpha (\nu )&=\frac{1}{2}\phi \cosh \left( \frac{\nu -V_3}{2V_4}\right) \left( 1+\tanh \left( \frac{\nu -V_3}{V_4}\right) \right) ,\\ \beta (\nu )&=\frac{1}{2}\phi \cosh \left( \frac{\nu -V_3}{2V_4}\right) \left( 1-\tanh \left( \frac{\nu -V_3}{V_4}\right) \right) , \end{aligned}$$

where \(\nu \) is the membrane potential, \(\omega \) is the probability of opening of a \(K^+\) channel and \(m_{\infty }(\nu )\) represents the steady state of the probability of opening of a \(Ca^{2+}\) channel. The numerical constants of the model are given in “Appendix 1”. The adjoint equations read

$$\begin{aligned} (ML_{adj})\left\{ \begin{aligned} \dot{p}_{\nu }(t)&= \frac{1}{C}p_{\nu }(t)\Big (g_K\omega (t)+g_{Ca}m_{\infty }(\nu (t))+g_{ChR2}o(t)+g_L-g_{Ca}m'_{\infty }(\nu (t))(V_{Ca}-\nu (t))\Big )\\&\qquad -p_{\omega }(t)\Big (\alpha '(\nu (t))(1-\omega (t))-\beta '(\nu (t))\omega (t)\Big ),\\ \dot{p}_{\omega }(t)&= -\frac{1}{C}g_K(V_K-\nu (t))p_{\nu }(t) + \Big (\alpha (\nu (t))+\beta (\nu (t))\Big )p_{\omega }(t),\\ \dot{p}_o(t)&= -\frac{1}{C}g_{ChR2}(V_{ChR2}-\nu (t))p_{\nu }(t) + (u(t)+K_d)p_o(t) -K_dp_d(t),\\ \dot{p}_d(t)&= u(t)p_o(t) + K_rp_d(t), \end{aligned} \right. \end{aligned}$$

and the switching function is again \(\varphi (t) = (1-o(t)-d(t))p_o(t)\). Proposition 2 gives the same conclusion as Proposition 1 for the ChR2-controlled Morris–Lecar model.

Proposition 2

The optimal control \(u^*:\mathbb {R}_+\rightarrow U\) for the minimal time control of the Morris–Lecar model is bang–bang and given by

$$\begin{aligned} u^*(t) = u_{max} \mathbf {1}_{p_o(t)>0}, \quad \forall t\in [0,t_f]. \end{aligned}$$

Furthermore, the optimal control begins with a bang arc of maximal value

$$\begin{aligned} \exists t_1\in [0,t_f], u^*(t)=u_{max}, \quad \forall t\in [0,t_1]. \end{aligned}$$

Proof

We apply the result of Theorem 2 and study the existence of singular extremals for the following reduced system

$$\begin{aligned} (ML')\left\{ \begin{aligned} \dot{\nu }(t)&= \frac{1}{C}\Big (g_K\omega (t)(V_K-\nu (t)) + g_{Ca}m_{\infty }(\nu (t))(V_{Ca}-\nu (t)) \\&\quad +g_{ChR2}u(t)(V_{ChR2}-\nu (t)) + g_L(V_L-\nu (t))\Big ),\\ \dot{\omega }(t)&= \alpha (\nu (t))(1-\omega (t)) - \beta (\nu (t))\omega (t). \end{aligned} \right. \end{aligned}$$

The adjoint equations for this system are

$$\begin{aligned} (ML'_{adj})\left\{ \begin{aligned} \dot{p}_{\nu }(t)&= \frac{1}{C}p_{\nu }(t)\Big (g_K\omega (t)+g_{Ca}m_{\infty }(\nu (t))+g_{ChR2}u(t)+g_L-g_{Ca}m'_{\infty }(\nu (t))(V_{Ca}-\nu (t))\Big )\\&\quad -p_{\omega }(t)\Big (\alpha '(\nu (t))(1-\omega (t))-\beta '(\nu (t))\omega (t)\Big ),\\ \dot{p}_{\omega }(t)&= -\frac{1}{C}g_K(V_K-\nu (t))p_{\nu }(t) + (\alpha (\nu (t))+\beta (\nu (t)))p_{\omega }(t). \end{aligned} \right. \end{aligned}$$

The vector fields defining the affine system (\(ML'\)) are

$$\begin{aligned} \begin{aligned} f_0(\nu ,\omega )&= \frac{1}{C}\Big (g_K\omega (V_K-\nu ) + g_{Ca}m_{\infty }(\nu )(V_{Ca}-\nu ) + g_L(V_L-\nu )\Big )\partial _{\nu }\\&\quad + \Big (\alpha (\nu )(1-\omega ) - \beta (\nu )\omega \Big )\partial _{\omega },\\ f_1(\nu ,\omega )&= \frac{1}{C}g_{ChR2}(V_{ChR2}-\nu )\partial _{\nu }. \end{aligned} \end{aligned}$$

For the reduced system, the switching function is given by

$$\begin{aligned} \phi (t) = \langle p(t), f_1(\nu (t),\omega (t))\rangle = \frac{1}{C}g_{ChR2}(V_{ChR2}-\nu (t))p_{\nu }(t). \end{aligned}$$
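A bang–bang law driven by the sign of a switching function can be evaluated on sampled data as in the short Python sketch below. The helper names `bang_bang` and `switching_times` are hypothetical, and the switching function is assumed to be available on a time grid, for instance from a numerical solution of the state and adjoint equations.

```python
def bang_bang(phi, u_max):
    """Bang-bang law: maximal light when the switching function is positive, else off."""
    return u_max if phi > 0.0 else 0.0


def switching_times(ts, phis):
    """Locate sign changes of a sampled switching function by linear interpolation.

    Samples that are exactly zero are not treated as switches in this sketch.
    """
    times = []
    for (t0, f0), (t1, f1) in zip(zip(ts, phis), zip(ts[1:], phis[1:])):
        if f0 * f1 < 0.0:
            times.append(t0 - f0 * (t1 - t0) / (f1 - f0))
    return times


# Toy samples with two sign changes, at t = 0.5 and t = 2.5.
times = switching_times([0, 1, 2, 3, 4], [1, -1, -1, 1, 1])
```

Counting the entries of `times` gives the number of switches of the corresponding bang–bang control on the sampled interval.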

Investigation of singular trajectories

Assume that there exists an open interval I along which the switching function vanishes. Then for all \(t\in I\),

$$\begin{aligned} \langle p(t), f_1(\nu (t),\omega (t))\rangle = 0. \end{aligned}$$

As for the FitzHugh–Nagumo model, there is no \(\omega \in [0,1]\) such that \((V_{ChR2},\omega )\) is an equilibrium point of the uncontrolled Morris–Lecar model, so \(\nu \) cannot remain equal to \(V_{ChR2}\) on I and necessarily \(p_{\nu }\) vanishes on I. Then \(\dot{p}_{\nu }\) also vanishes on I, and from (\(ML'_{adj}\)) we deduce that for all \(t\in I\),

$$\begin{aligned} p_{\omega }(t)\Big (\alpha '(\nu (t))(1-\omega (t))-\beta '(\nu (t))\omega (t)\Big ) = 0, \end{aligned}$$

and since p cannot vanish on I while \(p_{\nu }\equiv 0\), \(p_{\omega }\) does not vanish on I, hence

$$\begin{aligned} \alpha '(\nu (t))(1-\omega (t))-\beta '(\nu (t))\omega (t) = 0. \end{aligned}$$

This means that the singular extremal is localized in the domain A of \(\mathbb {R}^2\) given by

$$\begin{aligned} A:= \{(\nu ,\omega )\in \mathbb {R}^2 | \alpha '(\nu )(1-\omega )-\beta '(\nu )\omega = 0 \}. \end{aligned}$$

We can rewrite it in a more convenient way

$$\begin{aligned} A= \left\{ (\nu ,\omega )\in \mathbb {R}^2 | \omega = \frac{\alpha '(\nu )}{\alpha '(\nu )+\beta '(\nu )} \text { and } \nu \ne V_3 \right\} . \end{aligned}$$

Domain A is represented in Fig. 8 below. Any trajectory of the dynamical system (\(ML'\)) has an empty intersection with A because, for all \((\nu ,\omega )\in A\), \(\omega \in ]-\infty ,0[\cup ]1,+\infty [\), whereas the second component of the trajectory always stays in [0, 1].
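The location of A relative to the strip \(\omega \in [0,1]\) can be checked numerically. The Python sketch below evaluates \(\omega =\alpha '(\nu )/(\alpha '(\nu )+\beta '(\nu ))\) on a grid of membrane potentials. It assumes the standard Morris–Lecar rates \(\alpha (\nu )=\frac{\phi }{2}\cosh (\xi /2)(1+\tanh \xi )\) and \(\beta (\nu )=\frac{\phi }{2}\cosh (\xi /2)(1-\tanh \xi )\) with \(\xi =(\nu -V_3)/V_4\), together with illustrative values of \(V_3\), \(V_4\) and \(\phi \) that are not those of the appendix.

```python
import math

V3, V4, PHI = 2.0, 30.0, 0.04  # illustrative values (assumptions)


def alpha(v):
    xi = (v - V3) / V4
    return 0.5 * PHI * math.cosh(xi / 2.0) * (1.0 + math.tanh(xi))


def beta(v):
    xi = (v - V3) / V4
    return 0.5 * PHI * math.cosh(xi / 2.0) * (1.0 - math.tanh(xi))


def diff(f, x, h=1e-4):
    """Central finite difference of a scalar function."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def omega_on_A(v):
    """Second coordinate of the point of A with first coordinate v (v != V3)."""
    da, db = diff(alpha, v), diff(beta, v)
    return da / (da + db)


# Integer grid of potentials in mV, excluding v = V3 where alpha' + beta' = 0.
grid = [v for v in range(-80, 81) if v != V3]
values = [omega_on_A(float(v)) for v in grid]
outside_unit_interval = all(w < 0.0 or w > 1.0 for w in values)
```

Under these assumptions, the points of A lie below 0 for \(\nu <V_3\) and above 1 for \(\nu >V_3\) on the sampled grid, in agreement with the argument of the proof.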

The end of the proof is similar to the proof of Proposition 1.

Fig. 8 Representation of the manifold in which a singular trajectory must evolve

Remark 1

Let us briefly show how the investigation of singular trajectories for the complete system before reduction is much more difficult. To do so, consider the controlled Morris–Lecar model (ML) with its system of adjoint equations (\(ML_{adj}\)) and the vector fields defined for \(x=(\nu ,\omega ,o,d)\in \mathbb {R}^4\) by

$$\begin{aligned} F_0(x)&:= \frac{1}{C}\Big (g_K\omega (V_K-\nu ) + g_{Ca}m_{\infty }(\nu )(V_{Ca}-\nu ) \\&\quad + og_{ChR2}(V_{ChR2}-\nu ) + g_L(V_L-\nu )\Big )\partial _{\nu } \\&\quad + \Big (\alpha (\nu )(1-\omega ) - \beta (\nu )\omega \Big )\partial _{\omega } -K_do\partial _o + (K_do-K_rd)\partial _d, \end{aligned}$$

and

$$\begin{aligned} F_1(x) =(1-o-d)\partial _{o}. \end{aligned}$$

Proposition 3

Let (x, p, u) be a singular extremal of \((ML)-(ML_{adj})\) on an open interval I of \([0,t_f]\). Then, without any further assumption,

$$\begin{aligned} \langle p(t),\mathrm {ad}^k_{F_0}F_1(x(t))\rangle&\equiv 0,\\ \langle p(t),\mathrm {ad}^k_{F_1}F_0(x(t)) \rangle&\equiv 0, \\ \langle p(t),[F_1,\mathrm {ad}^2_{F_0}F_1](x(t)) \rangle&\equiv 0, \end{aligned}$$

on I for all \(k\in \{1,2,3\}\).

Recall that we have already proved that there is no optimal singular control. If we nevertheless work with the system before reduction, Proposition 3 means that ruling out optimal singular extremals requires considering the following system of equations

$$\begin{aligned} \langle p,[F_0,\mathrm {ad}^3_{F_1}F_0]\rangle + u\langle p,\mathrm {ad}^4_{F_1}F_0 \rangle&\equiv 0,\\ \langle p,[F_0,[F_1,\mathrm {ad}^2_{F_0}F_1]]\rangle + u\langle p,\mathrm {ad}^2_{F_1}(\mathrm {ad}^2_{F_0}F_1) \rangle&\equiv 0,\\ \langle p,\mathrm {ad}^4_{F_0}F_1\rangle + u\langle p,[F_1,\mathrm {ad}^3_{F_0}F_1] \rangle&\equiv 0, \end{aligned}$$

on I.

Proof

Let \(t\in I\). From the equalities \(\langle p(t),F_1(x(t)) \rangle = 0\) and \(\langle p(t),[F_0,F_1](x(t)) \rangle = 0\) we infer that

$$\begin{aligned} \left\{ \begin{aligned}&p_o(t) = 0,\\&\frac{1}{C}g_{ChR2}(V_{ChR2}-\nu (t))p_{\nu }(t) + K_dp_d(t) = 0. \end{aligned} \right. \end{aligned}$$
(4.1)

It can also be proved that \(\mathrm {ad}^3_{F_1}F_0 = -[F_0,F_1]\). The rest of the equalities are all given by (4.1). \(\square \)
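The two computations used in this proof can be sketched as follows, assuming the bracket convention \([X,Y]=DY\cdot X-DX\cdot Y\) (the opposite sign convention leads to the same identities). The shorthands \(s:=1-o-d\) and \(B(\nu ):=\frac{1}{C}g_{ChR2}(V_{ChR2}-\nu )\) are introduced here only for convenience. In the coordinates \((\nu ,\omega ,o,d)\), since \(F_1=s\,\partial _o\) and \(F_0\) depends on o only through the terms \(oB(\nu )\partial _{\nu }\), \(-K_do\,\partial _o\) and \(K_do\,\partial _d\),

$$\begin{aligned} {[}F_1,F_0]&= sB(\nu )\partial _{\nu } - (K_ds+K_rd)\partial _o + K_ds\,\partial _d,\\ \mathrm {ad}^2_{F_1}F_0&= -sB(\nu )\partial _{\nu } + (K_ds-K_rd)\partial _o - K_ds\,\partial _d,\\ \mathrm {ad}^3_{F_1}F_0&= sB(\nu )\partial _{\nu } - (K_ds+K_rd)\partial _o + K_ds\,\partial _d = -[F_0,F_1]. \end{aligned}$$

Pairing the first line with p and using \(p_o=0\) gives

$$\begin{aligned} \langle p,[F_0,F_1]\rangle = -s\Big (B(\nu )p_{\nu } + K_dp_d\Big ), \end{aligned}$$

which yields the second equation of (4.1) wherever \(s\ne 0\).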

For this model, we implemented the direct method with the numerical values of “Appendix 1” and “The 3-states model” section. The targeted action potential has been fixed to 30 mV.

Fig. 9 Optimal trajectory and bang–bang optimal control for the ML-ChR2-3-states model

Fig. 10 Optimal trajectory and bang–bang optimal control for the ML-ChR2-3-states model with numerical values of (Saint-Hilaire and Longtin 2004, Table 1). The constant stimulation fails to trigger a spike

Fig. 11 Optimal trajectory and bang–bang optimal control for the ML-ChR2-3-states model with numerical values of “Appendix 1” and \(V_{ChR2}=20\) mV. The optimal control has only two switches

Fig. 12 Optimal trajectory and bang–bang optimal control for the ML-ChR2-3-states and ML-ChR2-4-states models with \(u_{max}=0.028\,\mathrm {ms}^{-1}\) (physiological value)

The optimal control for the ChR2-3-states model is bang–bang and begins with a maximal arc. For the numerical values of “Appendix 1” and “The 3-states model” section, it displays three switching times. Figure 9 shows the optimal trajectory of the membrane potential and the optimal control for the physiological value of the maximal control, computed in Appendix “The 3-states model” section, together with the trajectory obtained under constant maximal stimulation. Although the difference is very small, of the order of a millisecond, the computed stimulation still outperforms the constant maximal stimulation. To show that this difference can be much larger, we implement the direct method on a system with different numerical values for the constants of the Morris–Lecar model (the Type I neuron of Saint-Hilaire and Longtin 2004, Table 1; see Table 6 in Appendix “The 3-states model” section), the values for the ChR2-3-states model remaining unchanged except for \(V_{ChR2}=0.1\) mV. The result is striking: the constant stimulation even fails to trigger a spike, while the stimulation with three switching times makes the neuron fire (see Fig. 10). It is important to note that the presence of three switching times is not an intrinsic characteristic of the Morris–Lecar model itself. Indeed, we can find optimal controls with only two switches if we change the value of the equilibrium potential of the ChR2, keeping all the other constants of the model unchanged (Fig. 11).
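To experiment with such comparisons, the controlled system \(\dot{x}=F_0(x)+uF_1(x)\) can be integrated for any piecewise-constant light signal. The Python sketch below uses plain explicit Euler integration. It is not the direct method used in this paper and it does not compute an optimal control. All constants are illustrative assumptions rather than the values of the appendices, the standard Morris–Lecar forms of \(m_{\infty }\), \(\alpha \) and \(\beta \) are assumed, the pulsed pattern is arbitrary, and the names `rhs` and `simulate` are hypothetical.

```python
import math

# Illustrative constants (assumptions, NOT the values of the paper's appendices).
P = dict(C=20.0, gK=8.0, gCa=4.4, gL=2.0, gChR2=2.0,
         VK=-84.0, VCa=120.0, VL=-60.0, VChR2=0.0,
         V1=-1.2, V2=18.0, V3=2.0, V4=30.0, phi=0.04,
         Kd=0.1, Kr=0.01)


def rhs(x, u, p=P):
    """Right-hand side F_0(x) + u F_1(x) for x = (nu, omega, o, d)."""
    nu, om, o, d = x
    xi = (nu - p["V3"]) / p["V4"]
    m_inf = 0.5 * (1.0 + math.tanh((nu - p["V1"]) / p["V2"]))
    a = 0.5 * p["phi"] * math.cosh(xi / 2.0) * (1.0 + math.tanh(xi))
    b = 0.5 * p["phi"] * math.cosh(xi / 2.0) * (1.0 - math.tanh(xi))
    dnu = (p["gK"] * om * (p["VK"] - nu) + p["gCa"] * m_inf * (p["VCa"] - nu)
           + o * p["gChR2"] * (p["VChR2"] - nu) + p["gL"] * (p["VL"] - nu)) / p["C"]
    dom = a * (1.0 - om) - b * om
    do = u * (1.0 - o - d) - p["Kd"] * o
    dd = p["Kd"] * o - p["Kr"] * d
    return (dnu, dom, do, dd)


def simulate(control, x0, t_end, dt=0.01, threshold=None):
    """Explicit Euler integration; stops early if nu reaches `threshold`.

    Returns (trajectory, t_hit), with t_hit = None if the threshold is not reached.
    """
    x, t, traj = tuple(x0), 0.0, [tuple(x0)]
    while t < t_end:
        dx = rhs(x, control(t))
        x = tuple(xi + dt * dxi for xi, dxi in zip(x, dx))
        t += dt
        traj.append(x)
        if threshold is not None and x[0] >= threshold:
            return traj, t
    return traj, None


U_MAX = 0.028  # maximal control value quoted in the text


def constant(t):
    return U_MAX


def pulsed(t):
    # Arbitrary on/off pattern with three switches, for illustration only.
    return U_MAX if (t < 20.0 or 30.0 <= t < 50.0) else 0.0


x0 = (-60.0, 0.0, 0.0, 0.0)  # assumed initial state: channels closed, no light
traj_c, hit_c = simulate(constant, x0, 100.0, threshold=30.0)
traj_p, hit_p = simulate(pulsed, x0, 100.0, threshold=30.0)
```

Comparing `hit_c` and `hit_p` indicates which of the two candidate signals reaches the 30 mV target first, if either does. With these assumed constants the outcome need not match the figures of this section.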

The ChR2-4-states model

The shape of the optimal trajectory and control of the ChR2-4-states model corresponds to that of the ChR2-3-states model. Nevertheless, for small values of \(u_{max}\), including the physiological value computed in Appendix “The 3-states model” section, the ChR2-3-states model fires faster than the ChR2-4-states model, whereas for larger values of \(u_{max}\) the opposite happens (see Fig. 12; Table 2). The threshold at which this reversal occurs is around \(u_{max} = 0.1\). Furthermore, the difference grows larger as \(u_{max}\) increases. This is an unusual behavior that suggests that the Morris–Lecar model is less robust than the FitzHugh–Nagumo model or the Hodgkin–Huxley models, as we are going to see.

Table 2 Comparison of the ML-ChR2-3-states and ML-ChR2-4-states models for different values of the maximum value of the control

4.3 The reduced Hodgkin–Huxley model

Similarly to the reduction of the initial Morris–Lecar model, there exists a popular reduction of the Hodgkin–Huxley model to a 2-dimensional conductance-based model. This reduction is based on the observation that, on the one hand, the variable m is much faster than the other two gating variables n and h, and on the other hand, the variable h is almost a linear function of the variable n (\(h\simeq a+bn\), with \(a=0.89\), \(b=-1.1\) being a good fit, see Figs. 14 and 15). These observations lead to a new system of equations derived from (HH) by setting the variable m to its stationary value \(m(t)=m_{\infty }(V(t))\) and taking the variable h as above.

$$\begin{aligned} (HH_{2D})\left\{ \begin{aligned} C\frac{\mathrm {d}V}{\mathrm {d}t}&= g_Kn^4(t)(V_K-V(t)) + g_{Na}m_{\infty }^3(V)(a+bn(t))(V_{Na}-V(t))\\&\quad + g_L(V_L-V(t)),\\ \frac{\mathrm {d}n}{\mathrm {d}t}&= \alpha _n(V(t))(1-n(t)) - \beta _n(V(t))n(t), \end{aligned} \right. \end{aligned}$$

with \(m_{\infty }(v) = \frac{\alpha _m(v)}{\alpha _m(v)+\beta _m(v)}\). It is important to note that, although the time constants of the ion channels have been mathematically investigated (see for example Rubin and Wechselberger 2008), the approximation of the variable h is purely based on observation, and not on a rigorous mathematical reduction. Nevertheless, although the linear approximation seems questionable when the membrane potential is held fixed (Fig. 13), it becomes remarkably accurate when the whole system (HH) is considered, as in Fig. 14 for a periodic behavior and Fig. 15 for a transitory behavior, with different initial membrane potentials \(V_0\). The different behaviors are obtained by tuning the external current \(I_{ext}\) that is applied.
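The reduction and the quality of the linear fit can be explored with the Python sketch below. It implements the right-hand side of (\(HH_{2D}\)), integrates the full system (HH) with explicit Euler, and fits \(h\simeq a+bn\) by least squares along the trajectory. It assumes the classical Hodgkin–Huxley rate functions and constants with the resting potential shifted to 0 mV, and an external current of 10 \(\mu \)A/cm\(^2\); none of these values are taken from “Appendix 2”, and the function names are hypothetical.

```python
import math


def _x_over_expm1(x):
    """x / (exp(x) - 1), extended by its limit 1 at x = 0."""
    return 1.0 if abs(x) < 1e-9 else x / math.expm1(x)


# Classical Hodgkin-Huxley rates, resting potential at 0 mV (assumed convention).
def alpha_n(v): return 0.1 * _x_over_expm1((10.0 - v) / 10.0)
def beta_n(v): return 0.125 * math.exp(-v / 80.0)
def alpha_m(v): return _x_over_expm1((25.0 - v) / 10.0)
def beta_m(v): return 4.0 * math.exp(-v / 18.0)
def alpha_h(v): return 0.07 * math.exp(-v / 20.0)
def beta_h(v): return 1.0 / (math.exp((30.0 - v) / 10.0) + 1.0)


# Classical constants (assumptions), with C = 1.
G_K, G_NA, G_L = 36.0, 120.0, 0.3
V_K, V_NA, V_L = -12.0, 115.0, 10.6


def m_inf(v):
    return alpha_m(v) / (alpha_m(v) + beta_m(v))


def hh2d_rhs(v, n, a=0.89, b=-1.1):
    """Right-hand side of the reduced system: m -> m_inf(V), h -> a + b n."""
    dv = (G_K * n**4 * (V_K - v) + G_NA * m_inf(v)**3 * (a + b * n) * (V_NA - v)
          + G_L * (V_L - v))
    dn = alpha_n(v) * (1.0 - n) - beta_n(v) * n
    return dv, dn


def simulate_hh(i_ext, t_end=100.0, dt=0.01):
    """Explicit Euler integration of the full model; returns the (n, h) samples."""
    v = 0.0
    n = alpha_n(v) / (alpha_n(v) + beta_n(v))
    m = m_inf(v)
    h = alpha_h(v) / (alpha_h(v) + beta_h(v))
    samples = []
    for _ in range(int(round(t_end / dt))):
        dv = (G_K * n**4 * (V_K - v) + G_NA * m**3 * h * (V_NA - v)
              + G_L * (V_L - v) + i_ext)
        dn = alpha_n(v) * (1.0 - n) - beta_n(v) * n
        dm = alpha_m(v) * (1.0 - m) - beta_m(v) * m
        dh = alpha_h(v) * (1.0 - h) - beta_h(v) * h
        v, n, m, h = v + dt * dv, n + dt * dn, m + dt * dm, h + dt * dh
        samples.append((n, h))
    return samples


def fit_line(samples):
    """Least-squares fit h = intercept + slope * n."""
    k = len(samples)
    mean_n = sum(n for n, _ in samples) / k
    mean_h = sum(h for _, h in samples) / k
    cov = sum((n - mean_n) * (h - mean_h) for n, h in samples)
    var = sum((n - mean_n) ** 2 for n, _ in samples)
    slope = cov / var
    return mean_h - slope * mean_n, slope


samples = simulate_hh(i_ext=10.0)
intercept, slope = fit_line(samples)
# Largest deviation of the trajectory from the fit quoted in the text.
worst = max(abs(h - (0.89 - 1.1 * n)) for n, h in samples)
```

The fitted `intercept` and `slope` can be compared with \(a=0.89\) and \(b=-1.1\), and `worst` measures how far the simulated trajectory departs from that line. The exact numbers depend on the assumed constants and on the applied current.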

Fig. 13 Linear approximation of the variable h when the membrane potential is held fixed at \(-30\), 0, 30 and 60 mV

Fig. 14 Linear approximation of the variable h for a periodic behavior of system (HH) and initial membrane potential of \(-30\), 0, 30 and 60 mV

Fig. 15 Linear approximation of the variable h for a transitory behavior of system (HH) and initial membrane potential of \(-30\), 0, 30 and 60 mV

Fig. 16 Optimal trajectory and bang–bang optimal control for the HH2D-ChR2-3-states model

Fig. 17 Optimal trajectory and bang–bang optimal control for the HH2D-ChR2-3-states and HH2D-ChR2-4-states models with \(u_{max}=0.028\,\mathrm {ms}^{-1}\) (physiological value)

The ChR2-3-states model

In terms of singular controls, this model behaves similarly to the Morris–Lecar model: there is no singular extremal, for the same reasons, and the optimal control is bang–bang with the same expression (the proof is exactly the same). The direct method is implemented with the numerical values of “Appendix 2” and “The 3-states model” section, and the targeted action potential is fixed to 90 mV. The optimal control is physiological here and in fact has no switching time: the light has to stay on all the way to the spike (see Fig. 16).

The ChR2-4-states model

The ChR2-4-states model is interesting because it shows that the Hodgkin–Huxley model behaves in the opposite way to the Morris–Lecar model. Indeed, the ChR2-4-states model fires slightly faster than the ChR2-3-states model, and requires less light, for small values of \(u_{max}\), including the physiological value \(u_{max}=0.028\). Furthermore, when \(u_{max}\) increases, the 3-states and 4-states models exactly match, both in terms of optimal trajectory and optimal control (see Fig. 17; Table 3). This means that the ChR2-3-states model is a good approximation of the ChR2-4-states model, in terms of optimal control, for the reduced Hodgkin–Huxley model. This is a nice property since the ChR2-3-states model is theoretically tractable in terms of singular controls.

4.4 The complete Hodgkin–Huxley model

The ChR2-3-states model

The complete Hodgkin–Huxley model is more difficult to analyze mathematically, and optimal singular controls cannot be excluded a priori as for the previous models. Nevertheless, singular controls do not appear in our numerical simulations. Figure 18 shows the optimal trajectory and control for numerical values taken in “Appendix 2” and “The 3-states model” section.

Table 3 Comparison of the HH2D-ChR2-3-states and HH2D-ChR2-4-states models for different values of the maximum value of the control

Fig. 18 Optimal trajectory and bang–bang optimal control for the HH-ChR2-3-states model

Fig. 19 Optimal trajectory and bang–bang optimal control for the HH-ChR2-3-states and HH-ChR2-4-states models with \(u_{max}=0.028\,\mathrm {ms}^{-1}\) (physiological value)

Table 4 Comparison of the HH-ChR2-3-states and HH-ChR2-4-states models for different values of the maximum value of the control

The ChR2-4-states model

We observe the same phenomenon as for the reduced Hodgkin–Huxley model: for small values of \(u_{max}\), the ChR2-4-states model fires slightly faster than the ChR2-3-states model, and when \(u_{max}\) increases, both models match (see Fig. 19; Table 4). This constitutes a new argument in favor of the reduced Hodgkin–Huxley model, since it captures the features of the complete model in terms of optimal control. Finally, the fact that both Hodgkin–Huxley models have almost the same behavior for the two ChR2 models means that they can be qualified as robust with regard to the mathematical modeling of ChR2.

4.5 Conclusions on the numerical results

We begin with comments on the two versions of the ChR2 models for each neuron model. For every neuron model that we treat numerically, the ChR2-3-states and the ChR2-4-states versions behave qualitatively the same. We observe no optimal singular controls, and the shapes of the optimal controls and optimal trajectories are similar. Nevertheless, we can note some distinctions between the neuron models. For the FitzHugh–Nagumo model, the ChR2-4-states version always fires faster than the ChR2-3-states version. This is also the case for the two Hodgkin–Huxley models, with the important difference that, when the maximal control value increases, the optimal trajectories and optimal controls quantitatively match. The Hodgkin–Huxley models are thus very robust with respect to the ChR2 modeling. The Morris–Lecar model displays an unusual behavior when we compare the ChR2-3-states and the ChR2-4-states versions. Indeed, for low maximal control values, including the physiological value computed in Appendix “The 3-states model” section, the ChR2-3-states version fires faster than the ChR2-4-states version, and the opposite happens when the maximal control value increases.

As announced at the beginning of Sect. 4, the numerical results invite one to distinguish between two main behaviors of neuron models with respect to optogenetic control. All the models except the Morris–Lecar model behave as physiologically expected: the optimal control is bang–bang, begins with a maximal arc, and has at most one switch. For the Morris–Lecar model, the optimal control has more than one switch. This means that it is more efficient to switch the light on and off several times than to keep the light on almost all the way up to the spike. That is why we qualify this model as nonphysiological. Moreover, by changing only the value of the ChR2 equilibrium potential (\(V_{ChR2}\)), we can observe a change in the number of switches. Finally, the behavior of the Morris–Lecar model emphasizes the critical importance of optimal control, since it makes it possible to find a control that triggers a spike when the expected physiological stimulation (with at most one switch) fails to do so.