
1 Introduction

Ion channels are the molecular building blocks of cellular excitability, forming highly specific and efficient pores in the membrane. Gated by various types of stimuli (chemical ligands, electricity, mechanical force, temperature, or light), ion channels form a superfamily of transmembrane proteins that underlie a vast number of physiological and pathological events [1]. Within this superfamily, voltage-gated ion channels [2] play a uniquely important role: they detect changes in membrane potential using specialized voltage-sensing structures [3], and further modify the membrane potential by allowing ions to flow through the lipid bilayer. To perform this function, the channel molecule undergoes conformational transitions within a set of conducting and non-conducting states, governed by specific kinetic mechanisms [4–6].

The relationship between membrane voltage and ionic current is simple and can be derived from basic principles [7]. Electrically, a patch of membrane is equivalent to a capacitor (the lipid bilayer) connected in parallel with a variable conductance (the ion channels, swinging between closed and open states), which is connected in series with a battery (the electrochemical potential of permeant ions). If no external current is injected in this circuit, the current flowing through the conductance and the current charging the capacitor sum to zero. Thus, ignoring spatial effects, the change in the membrane potential vs. time is proportional to the net ionic current flowing through the membrane, as described by the differential equation:

$$ C\frac{\mathrm{d}V}{\mathrm{d}t}=-I, $$
(1)

where C is membrane capacitance, V is membrane potential, \( C\times \mathrm{d}V/\mathrm{d}t \) is the capacitive current that charges the membrane, I is the ionic current flowing through the membrane, and t is time.
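As a minimal numerical sketch of Eq. 1, the membrane potential can be advanced in small time steps with the forward Euler method. The passive leak current, its parameter values, and all function names below are illustrative assumptions, not part of the models discussed in this chapter.

```python
def integrate_membrane_potential(v0, capacitance, ionic_current, dt, n_steps):
    """Forward-Euler integration of C * dV/dt = -I(V) (Eq. 1)."""
    v = v0
    trace = [v]
    for _ in range(n_steps):
        v += -ionic_current(v) / capacitance * dt
        trace.append(v)
    return trace

# Assumed passive leak current, I = g_leak * (V - E_leak), in SI units.
G_LEAK = 10e-9     # S
E_LEAK = -70e-3    # V
C_M = 100e-12      # F, so the membrane time constant is C_M / G_LEAK = 10 ms

def leak_current(v):
    return G_LEAK * (v - E_LEAK)

# Start 20 mV above the leak reversal potential and integrate for 100 ms.
trace = integrate_membrane_potential(-50e-3, C_M, leak_current, dt=1e-5, n_steps=10000)
```

With only a leak current, V relaxes towards the leak reversal potential, where I is zero and the membrane potential stops changing. Replacing `leak_current` with a voltage-gated current closes the causal loop described in the text.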

All the ion channels within the membrane contribute to I, which is the algebraic sum of all the single-channel currents. Thus, any individual ion channel that opens or closes will cause an immediate and finite change in the net current I, unless V happens to be equal to the reversal potential for that channel. From this perspective, a closing of a channel is as significant as an opening. In turn, this change in the current modifies the rate at which the membrane potential changes over time. Then, as V evolves in time, the driving forces for the permeating ions and the kinetic properties of voltage-gated channels will also change. These changes will again modify I, closing the causal loop between membrane potential and ionic current.

Because they both sense and control membrane potential, voltage-gated ion channels play a key role in action potential generation and propagation, in neurons and other excitable cells [8]. Neurons, in particular, spend considerable amounts of chemical energy to create and maintain the electrochemical gradients necessary for action potentials to work [9], and thus to establish communication within the nervous system. Different types of neurons display unique patterns of cellular excitability [10] and assemble into brain circuits with distinct network properties [11]. The firing properties of individual neurons and neuronal circuits, and ultimately the function of the entire nervous system, are largely determined by the kinetic properties of voltage-gated ion channels [12–15]. Considering the rich variety of excitable behavior at cellular and system levels, it’s not surprising that voltage-gated ion channels have their own impressive repertoire of molecular properties [16]. Understanding these properties, particularly the dynamics of state transitions and their voltage sensitivity, is key to understanding how neurons and circuits work.

1.1 Target Audience and Expectations

The aim of this chapter is to guide the reader through the most important aspects of modeling and testing the kinetic mechanisms of voltage-gated ion channels. We focus on deriving biophysically realistic models from macroscopic currents obtained in whole-cell voltage-clamp experiments [17] and testing these models in live neurons using dynamic clamp [18]. The reader is expected to have a basic understanding of ion channel and membrane biophysics, and some experience with electrophysiology experiments. We tried to keep the discussion general, without relying on a specific computer program, hoping that the readers will be able to take the basic principles learned here and implement them in their preferred software. Nevertheless, for some of the examples presented here we have used a version of the QuB software (www.qub.buffalo.edu), as developed and maintained by our lab (http://milesculabs.biology.missouri.edu/QuB).

2 Ion Channel Models

Modeling ion channel kinetics is fun. However, when taken beyond exponential curve fitting for time constants, or sigmoid fitting of conductance curves, modeling becomes quite challenging. More experimental data and more sophisticated computational algorithms are necessary, and results are not so easy to interpret. Whether this effort is worthwhile depends on the specific goals of the investigator. For example, one may want to find a model that can be used as a computational building block in large-scale simulations of neuronal networks. For this application, simplified phenomenological models will compute faster and would probably work just as well [19]. However, one could set a more mechanistically oriented goal, where the biophysical knowledge available on a particular ion channel is assembled into a detailed computational model [20], which is then tested and refined against new experimental data, and then further used to quantitatively test various hypotheses.

Starting with the seminal work of Hodgkin and Huxley [21, 22], most ion channel models fall somewhere in the range defined by these two examples. Although phenomenological models that simply describe the data are useful, the ultimate goal would be to quantitatively understand how the ion channel works at the molecular level and how it interacts with its environment at the cellular level. A biophysically realistic model must agree with existing theory and experimental data [23–30], but it should also remain computationally tractable. Above all, keep in mind that “all models are wrong but some are useful” [31].

2.1 Kinetic Mechanisms

First, a kinetic mechanism is defined by a set of possible conformational states. Although in principle a protein can assume a continuum of structural conformations, statistically, the molecule will reside most of the time in a relatively small subset of high-occupancy states. The time spent continuously in a given state (the “lifetime”) is a random quantity with an exponential probability distribution [32]. For voltage-gated ion channels, the high-occupancy states are the various conformations that correspond to functional and structural elements, such as resting or activated voltage sensors, closed or open pore, inactivated or non-inactivated channel, etc. [3]. Other states may be characterized by more subtle or less understood conformational changes. A state can be identified experimentally if it is associated with a measurable change in properties (e.g., conductance, fluorescence), or it can be inferred statistically from the data.

A kinetic mechanism is further defined by a set of allowed transitions between states. Powered by thermal energy or other sources, the channel undergoes conformational changes at random times. Which state is next is also a random event, with the average frequency of a given transition being inversely related to the energy barrier separating the two states. Transition frequencies are quantified by rate constants. According to rate theory [33], a voltage-dependent rate constant, \( k_{ij} \), corresponding to the transition from state i to state j, has the following expression:

$$ {k}_{ij}={k}_{ij}^0\times {\mathrm{e}}^{k_{ij}^1\times V}, $$
(2)

where V is the membrane potential and \( k_{ij}^0 \) is the value of the rate constant at zero membrane potential (V = 0). \( k_{ij}^1 \) is a factor that indicates how sensitive the rate constant is to the membrane potential, as follows:

$$ {k}_{ij}^1=\left({\delta}_{ij}\times {z}_{ij}\times F\right)/\left(R\times T\right), $$
(3)

where \( z_{ij} \) is the electrical charge moving over the fraction \( \delta_{ij} \) of the electric field, F is Faraday’s constant, R is the gas constant, and T is the absolute temperature [34]. \( k_{ij}^1 \) is zero for voltage-insensitive rates, while \( k_{ij}^0 \) is zero for non-allowed transitions. Together, the set of possible states and the set of possible transitions describe the topology of a kinetic mechanism. The rate constants and their voltage dependence define the kinetic parameters of the mechanism.
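Equations 2 and 3 translate directly into two small functions. The sketch below assumes SI units (V in volts, rates in 1/s); the function names and the example values of the charge, field fraction, and temperature are hypothetical.

```python
import math

FARADAY = 96485.33212        # C/mol
GAS_CONSTANT = 8.314462618   # J/(mol K)

def voltage_sensitivity(delta, z, temperature_k):
    """Eq. 3: k1 = delta * z * F / (R * T), in 1/V."""
    return delta * z * FARADAY / (GAS_CONSTANT * temperature_k)

def rate_constant(k0, k1, v):
    """Eq. 2: k = k0 * exp(k1 * V)."""
    return k0 * math.exp(k1 * v)

# Assumed example: two elementary charges moving across half of the field at 22 degrees C.
k1 = voltage_sensitivity(delta=0.5, z=2.0, temperature_k=295.15)
k_at_rest = rate_constant(k0=100.0, k1=k1, v=-0.08)
k_at_zero = rate_constant(k0=100.0, k1=k1, v=0.0)
```

A rate with `k1 = 0` is voltage-insensitive, and a rate with `k0 = 0` stays at zero at every voltage, which is how a non-allowed transition is represented.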

2.2 Markov Formalism

The mathematical properties of a kinetic mechanism (a finite set of discrete conformations, exponentially distributed lifetimes, random conformational changes) are beautifully captured by Markov models. Originally developed for stochastic processes, the Markov formalism can be directly applied to ion channels [35], by mapping each known or hypothesized conformation of the channel into a state of the Markov model. The rate constants associated with a Markov model can be compactly expressed as a rate matrix Q, of dimension \( N_{\mathrm{S}}\times N_{\mathrm{S}} \), where \( N_{\mathrm{S}} \) is the total number of states. The Q matrix has each off-diagonal element, \( q_{ij} \), equal to the rate constant \( k_{ij} \), and each diagonal element, \( q_{ii} \), equal to the negative sum of the off-diagonal elements of row i, so that the sum of each row of Q is zero. If a transition is not allowed between states i and j, \( q_{ij} \) is zero.
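The construction of Q can be sketched in a few lines. The three-state topology (closed, open, inactivated) and the rate values used here are assumptions chosen only for illustration.

```python
def build_rate_matrix(n_states, rates):
    """Build Q from a dict {(i, j): k_ij}; missing pairs are non-allowed transitions."""
    q = [[0.0] * n_states for _ in range(n_states)]
    for (i, j), k in rates.items():
        q[i][j] = k
    for i in range(n_states):
        # Diagonal element: negative sum of the off-diagonal elements of row i.
        q[i][i] = -sum(q[i][j] for j in range(n_states) if j != i)
    return q

# Assumed rates (1/s) for closed (0) <-> open (1) <-> inactivated (2).
rates = {(0, 1): 2000.0, (1, 0): 500.0, (1, 2): 1000.0, (2, 1): 10.0}
q = build_rate_matrix(3, rates)
```

For a voltage-dependent model, each entry of `rates` would be recomputed from Eq. 2 whenever the voltage changes, and Q rebuilt.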

The state of the model as a function of time can be conveniently expressed as a probability vector, P. At any time t, each element of P represents the occupancy of that state, or the fraction of channels that reside in that state. Under stationary conditions, the average fraction of the total time spent by the channel in each state can be calculated as an equilibrium state occupancy. For an ensemble of channels, the average number of channels residing in state i at equilibrium is equal to \( p_i\times N_{\mathrm{C}} \), where \( p_i \) is the equilibrium occupancy of state i and \( N_{\mathrm{C}} \) is the total channel count.

When conditions change (e.g., when a voltage step is applied in a voltage-clamp experiment), the energy landscape of the channel changes as well. All the voltage-sensitive rate constants take different values, and thus the rate matrix Q will change as well. As a result, the equilibrium state occupancies will also be different. For an ensemble of channels, if a state becomes less likely to be occupied under the new conditions, the fraction of channels residing in that state will decrease over time, at a rate that depends on the average lifetime of that state. The same behavior would be observed from repeated trials of a single channel. However, in a single trial, the channel will simply continue its stochastic behavior, just with different transition frequencies.

The process of relaxation towards a new state of equilibrium is described by the ordinary differential equation (ODE):

$$ \frac{\mathrm{d}\mathbf{P}}{\mathrm{d}t}=\mathbf{P}\times \mathbf{Q}. $$
(4)

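The state occupancies corresponding to equilibrium, \( \mathbf{P}_{\mathrm{eq}} \), can be obtained by setting the time derivative of P equal to zero and solving the resulting algebraic equation, subject to the constraint that the elements of \( \mathbf{P}_{\mathrm{eq}} \) sum to one: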

$$ \frac{\mathrm{d}{\mathbf{P}}_{\mathrm{eq}}}{\mathrm{d}t}={\mathbf{P}}_{\mathrm{eq}}\times \mathbf{Q}=0. $$
(5)
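One way to solve Eq. 5 numerically is to replace one of the (redundant) balance equations with the normalization condition and solve the resulting linear system. The sketch below uses plain Gauss-Jordan elimination as an illustrative choice; it assumes every state can be reached from every other state, and the three-state rate matrix is a hypothetical example.

```python
def equilibrium_occupancies(q):
    """Solve P x Q = 0 with sum(P) = 1 by Gauss-Jordan elimination."""
    n = len(q)
    # Row j of the system is the balance equation sum_i p_i * q_ij = 0.
    a = [[q[i][j] for i in range(n)] + [0.0] for j in range(n)]
    # The balance equations are linearly dependent; swap the last one for sum(p) = 1.
    a[n - 1] = [1.0] * n + [1.0]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(n):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return [a[i][n] / a[i][i] for i in range(n)]

# Assumed closed (0) <-> open (1) <-> inactivated (2) model, rates in 1/s.
q = [[-2000.0, 2000.0, 0.0],
     [500.0, -1500.0, 1000.0],
     [0.0, 10.0, -10.0]]
p_eq = equilibrium_occupancies(q)
```

For this linear scheme the result can be checked by hand: the ratio of open to closed occupancy is 2000/500, and the ratio of inactivated to open occupancy is 1000/10, so the occupancies are 1/405, 4/405, and 400/405.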

When conditions are stationary and the rate matrix Q is constant, the differential equation has a simple analytical solution:

$$ {\mathbf{P}}_t={\mathbf{P}}_0\times {\mathrm{e}}^{\mathbf{Q}\times t}, $$
(6)

where \( \mathbf{P}_t \) and \( \mathbf{P}_0 \) are the state occupancies at an arbitrary time t and at time zero, respectively. The exponential of \( \mathbf{Q}\times t \) is another matrix, A, that contains the conditional state transition probabilities. Each element of A, \( a_{ij} \), is the conditional probability that the channel will be in state j at time t, given that it was in state i at time zero. No assumption is made about what other transitions would have occurred in that time interval. The transition probability matrix A for a given time t can be calculated numerically using the spectral expansion method [35], as follows:

$$ {\mathbf{A}}_t={\mathrm{e}}^{\mathbf{Q}\times t}={\displaystyle \sum_k{\mathbf{B}}_k\times {\mathrm{e}}^{\lambda_k\times t}}, $$
(7)

where the \( \mathbf{B}_k \) values are the spectral matrices derived from the eigenvectors of Q, and the \( \lambda_k \) values are the eigenvalues of Q, with \( \lambda_0 \) always equal to zero.
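As a worked illustration of Eq. 7, consider the simplest possible case, a two-state (closed and open) channel with opening rate \( \alpha \) and closing rate \( \beta \). This example and its symbols are assumptions introduced here for clarity; it is not one of the models discussed in this chapter. The rate matrix, its eigenvalues, and its spectral matrices are:

$$ \mathbf{Q}=\left(\begin{array}{cc} -\alpha & \alpha \\ \beta & -\beta \end{array}\right),\quad {\lambda}_0=0,\quad {\lambda}_1=-\left(\alpha +\beta \right), $$

$$ {\mathbf{B}}_0=\frac{1}{\alpha +\beta}\left(\begin{array}{cc} \beta & \alpha \\ \beta & \alpha \end{array}\right),\quad {\mathbf{B}}_1=\frac{1}{\alpha +\beta}\left(\begin{array}{cc} \alpha & -\alpha \\ -\beta & \beta \end{array}\right). $$

One can verify that \( \mathbf{B}_0+\mathbf{B}_1 \) is the identity matrix, so that no transition has occurred at t = 0, and that \( \lambda_1\times \mathbf{B}_1=\mathbf{Q} \). As t becomes large, the transition probability matrix approaches \( \mathbf{B}_0 \), whose identical rows contain the equilibrium occupancies \( \beta /\left(\alpha +\beta \right) \) and \( \alpha /\left(\alpha +\beta \right) \).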

The \( \mathbf{B}_k \) and \( \lambda_k \) values can be calculated easily with a numerical library or with specialized software, such as Matlab or QuB. For analysis of macroscopic currents, it is convenient to calculate the transition probability matrix \( \mathbf{A}_{\delta t} \) that corresponds to the data sampling interval, \( \delta t \). Then, the state occupancies can be calculated recursively, starting with some initial solution, using a simple vector–matrix multiplication:

$$ {\mathbf{P}}_{t+\delta t}={\mathbf{P}}_t\times {\mathbf{A}}_{\delta t}. $$
(8)
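The recursion in Eq. 8 can be sketched as follows. To keep the example free of external libraries, the transition probability matrix is approximated here with a scaled Taylor series followed by repeated squaring, which is a substitute for the spectral expansion of Eq. 7 and not the method described in the text. The two-state model and its rates are hypothetical.

```python
import math

def mat_mul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transition_matrix(q, dt, squarings=10, terms=12):
    """Approximate A = exp(Q * dt): Taylor series of a scaled matrix, then squaring."""
    n = len(q)
    scale = dt / 2.0 ** squarings
    x = [[v * scale for v in row] for row in q]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[v / k for v in row] for row in mat_mul(term, x)]
        result = [[r + s for r, s in zip(rr, tr)] for rr, tr in zip(result, term)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result

def propagate(p0, a_dt, n_samples):
    """Eq. 8: P(t + dt) = P(t) x A(dt), applied once per sample."""
    p = list(p0)
    trajectory = [p]
    for _ in range(n_samples):
        p = [sum(p[i] * a_dt[i][j] for i in range(len(p))) for j in range(len(p))]
        trajectory.append(p)
    return trajectory

# Assumed two-state channel (closed <-> open), rates in 1/s, sampled every 0.1 s.
alpha, beta = 3.0, 1.0
q = [[-alpha, alpha], [beta, -beta]]
a_dt = transition_matrix(q, dt=0.1)
trajectory = propagate([1.0, 0.0], a_dt, n_samples=200)
# Closed-form value of a_00 for a two-state model, used as a sanity check.
expected_a00 = (beta + alpha * math.exp(-(alpha + beta) * 0.1)) / (alpha + beta)
```

The matrix `a_dt` is computed once per voltage, and each subsequent sample costs only one vector–matrix multiplication.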

In summary, the Markov formalism has the outstanding convenience of encapsulating all the properties of a kinetic mechanism, as well as the state of the channel, in a few matrices and vectors. The same mathematical and computational operations will apply to any ion channel model, regardless of its topology (how many states, which transitions are allowed) and kinetic properties (rate constant values). Furthermore, Markov models can be used at both single molecule [36–42] and macroscopic levels [43–48].

2.3 Hodgkin-Huxley-Type Models

The ion channel models originally proposed by Hodgkin and Huxley [21] can also be formulated as Markov models, as they explicitly represent the closed, open, or inactivated states of the channel. Although they were empirical when first proposed, HH models remain reasonably realistic to this day. Their main limitation (but also their strength, depending on the application) lies in strongly simplifying assumptions about the channel that are now outdated (e.g., equal and independent “activation particles,” or independent activation and inactivation processes). One should also keep in mind that HH models disagree with biophysical theory when their rate constants do not follow the Eyring rate theory [33], but instead are formulated as arbitrary functions of voltage. While their limited number of states and transitions inherently reduces their ability to explain experimental data, HH models can gain more flexibility through these arbitrary rate functions.

3 Experimental Data

3.1 What Is in the Data?

A good way to understand the experimental data is to run simulations. Let’s consider the simple ion channel model shown in Fig. 1a. For illustration purposes, this model is a very crude approximation of a voltage-gated sodium (Nav) channel, featuring closed, open, and inactivated states. A single-channel stochastic simulation of a voltage-clamp recording is shown in Fig. 1b, where a noisy signal randomly jumps between zero and a tiny negative current. The noise in the trace is mostly caused by instrumentation, though the open state has its own intrinsic fluctuation in current [49]. The average single-channel current corresponding to the open state can be calculated as follows:

$$ i=g\times \left(V-{V}_{\mathrm{R}}\right), $$
(9)

where g is the single-channel conductance, V is the membrane potential, and \( V_{\mathrm{R}} \) is the reversal potential for the permeant ion. Note that Eq. 9 is an approximation: the current is a nonlinear function of voltage when the permeant ion has unequal intra- and extracellular concentrations, as described by the Goldman-Hodgkin-Katz current equation [1].

Fig. 1

From model to data. A simple ion channel model (a) was used to simulate single-channel (b) and macroscopic (c) currents in response to a voltage step (d). The macroscopic current was simulated with an ensemble of 1000 channels, either deterministically (black trace) or stochastically (red trace). The inset shows a fit of the stochastic macroscopic current (red) with a two-exponential function. The individual exponential components of the fit line are also shown (green and blue)

For channels that have several conducting states, we can make the unitary current equation more general by introducing a conductance vector g, with each element \( g_i \) equal to the conductance of state i, or equal to zero for non-conducting states. The dot product between the state occupancy vector P and the conductance vector g can be used to calculate the unitary current for an arbitrary set of state occupancies, as a function of time:

$$ {i}_t=\left({\mathbf{P}}_t\cdot \mathbf{g}\right)\times \left(V-{V}_{\mathrm{R}}\right). $$
(10)

When a single-channel trace is simulated, at any given time only one element of P is equal to one, and the rest are zero. As the channel changes state during the simulation, a different element of P becomes equal to one, and thus a different conductance is “selected” by the dot product P · g.
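A single-channel trace of this kind can be sketched with a standard stochastic simulation scheme: draw an exponentially distributed lifetime for the current state, then choose the next state with probability proportional to its rate constant. The three-state rate matrix, the conductance, and the voltages below are assumed values, and instrumentation noise is not included.

```python
import random

def simulate_single_channel(q, start_state, duration, rng):
    """Return a list of (time, state) events for one channel at constant voltage."""
    t, state = 0.0, start_state
    events = [(t, state)]
    while True:
        destinations = [(j, rate) for j, rate in enumerate(q[state])
                        if j != state and rate > 0.0]
        exit_rate = sum(rate for _, rate in destinations)
        if exit_rate == 0.0:
            break  # absorbing state: no allowed transition out
        t += rng.expovariate(exit_rate)  # exponentially distributed lifetime
        if t >= duration:
            break
        # Pick the destination with probability rate / exit_rate.
        threshold = rng.random() * exit_rate
        cumulative = 0.0
        for next_state, rate in destinations:
            cumulative += rate
            if threshold < cumulative:
                break
        state = next_state
        events.append((t, state))
    return events

# Assumed closed (0) <-> open (1) <-> inactivated (2) model, rates in 1/s.
q = [[-2000.0, 2000.0, 0.0],
     [500.0, -1500.0, 1000.0],
     [0.0, 10.0, -10.0]]
g = [0.0, 20e-12, 0.0]        # assumed state conductances, S
v, v_rev = -0.02, 0.06        # assumed step and reversal potentials, V
events = simulate_single_channel(q, start_state=0, duration=0.05, rng=random.Random(1))
# Eq. 10 with a one-hot P: the occupied state selects its own conductance.
unitary_current = [(t, g[s] * (v - v_rev)) for t, s in events]
```

Each event marks the time at which the channel enters a new state; between events the current stays constant at the value selected by that state.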

To calculate the total ionic current, \( I_t \), given by an ensemble of identical channels, we simply multiply the unitary current by the total number of channels, \( N_{\mathrm{C}} \):

$$ {I}_t=\left({\mathbf{P}}_t\cdot \mathbf{g}\right)\times \left(V-{V}_{\mathrm{R}}\right)\times {N}_{\mathrm{C}}. $$
(11)

Computationally, \( I_t \) can be efficiently calculated in two steps: first, calculate the state occupancies \( \mathbf{P}_t \), using the recursive Eq. 8; then, calculate \( I_t \) as a function of \( \mathbf{P}_t \), using Eq. 11. The time-invariant vector \( \mathbf{g}\times \left(V-V_{\mathrm{R}}\right)\times N_{\mathrm{C}} \) needs to be recalculated only when the voltage changes.
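The second step of this calculation is a single dot product. The sketch below shows Eq. 11 computed directly and through the precomputed time-invariant vector; the occupancies, conductances, voltages, and channel count are made-up values for illustration.

```python
def macroscopic_current(p, g, v, v_rev, n_channels):
    """Eq. 11: I = (P . g) * (V - V_R) * N_C."""
    return sum(pi * gi for pi, gi in zip(p, g)) * (v - v_rev) * n_channels

def current_weights(g, v, v_rev, n_channels):
    """Time-invariant vector g * (V - V_R) * N_C; recompute only when V changes."""
    return [gi * (v - v_rev) * n_channels for gi in g]

# Assumed values: three states, only the middle one conducting (20 pS).
p = [0.2, 0.5, 0.3]
g = [0.0, 20e-12, 0.0]
v, v_rev, n_channels = -0.02, 0.06, 1000

i_direct = macroscopic_current(p, g, v, v_rev, n_channels)
weights = current_weights(g, v, v_rev, n_channels)
i_from_weights = sum(pi * wi for pi, wi in zip(p, weights))
```

Both routes give about -0.8 nA for these values; within a constant-voltage segment, only the dot product with `weights` has to be evaluated per sample.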

As shown earlier in Eq. 6, for a time interval where conditions are constant (e.g., during a voltage step), \( \mathbf{P}_t \) can be calculated as a function of some initial state occupancies, \( \mathbf{P}_0 \). For a typical voltage-clamp protocol, the \( \mathbf{P}_0 \) at the beginning of a sweep can be calculated as the equilibrium occupancies corresponding to the holding membrane potential. For this calculation to be accurate, the holding voltage should be maintained long enough to allow channels to reach equilibrium. If the protocol consists of a sequence of voltage step commands, the \( \mathbf{P}_0 \) of one step can be calculated as being equal to the \( \mathbf{P}_t \) at the end of the previous step. This idea could also be applied to protocols where the command voltage varies continuously (e.g., during a “ramp”). In this case, a continuously varying episode can be approximated with a sequence of discrete steps of constant voltage. At the limit, each of these steps is as short as one acquisition sample.
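The chaining of steps described above can be sketched with a two-state (closed and open) channel, for which Eq. 6 has a simple closed form. The model, its rate parameters, and the protocol values are all hypothetical; a model with more states would use the matrix form of Eq. 6 instead.

```python
import math

def rates_at(v):
    """Assumed opening and closing rates (1/s) following Eq. 2, with V in volts."""
    return 100.0 * math.exp(40.0 * v), 100.0 * math.exp(-40.0 * v)

def open_probability_after(p_open_0, v, duration):
    """Closed-form solution of Eq. 6 for a closed <-> open channel at constant voltage."""
    alpha, beta = rates_at(v)
    p_inf = alpha / (alpha + beta)
    return p_inf + (p_open_0 - p_inf) * math.exp(-(alpha + beta) * duration)

def run_protocol(v_hold, steps):
    """Chain constant-voltage steps: the final occupancy of one step seeds the next."""
    alpha, beta = rates_at(v_hold)
    p = alpha / (alpha + beta)   # equilibrium at the holding potential
    end_points = []
    for v, duration in steps:
        p = open_probability_after(p, v, duration)
        end_points.append(p)
    return end_points

def ramp_as_steps(v_start, v_end, duration, dt):
    """Approximate a ramp by constant steps, one per sample, at the midpoint voltage."""
    n = max(1, round(duration / dt))
    return [(v_start + (v_end - v_start) * (k + 0.5) / n, duration / n)
            for k in range(n)]

step_result = run_protocol(-0.08, [(0.0, 0.002), (-0.08, 0.005)])
ramp_result = run_protocol(-0.08, ramp_as_steps(-0.08, 0.02, 0.01, 1e-4))
```

Splitting one constant-voltage step into two shorter ones gives the same final occupancy, which is why a protocol can be processed segment by segment.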

Although very compact, Eq. 11 is not easy to interpret. To clarify its properties, we first replace \( \mathbf{P}_t \) with its solution as a function of \( \mathbf{P}_0 \):

$$ {I}_t=\left(\left({\mathbf{P}}_0\times {\mathrm{e}}^{\mathbf{Q}\times t}\right)\cdot \mathbf{g}\right)\times \left(V-{V}_{\mathrm{R}}\right)\times {N}_{\mathrm{C}}. $$
(12)

Then, we replace \( \mathrm{e}^{\mathbf{Q}\times t} \) with its spectral expansion:

$$ {I}_t=\left(\left({\mathbf{P}}_0\times \left({\displaystyle \sum_{k=0}^{N_{\mathrm{S}}-1}{\mathbf{B}}_k\times {\mathrm{e}}^{\lambda_k\times t}}\right)\right)\cdot \mathbf{g}\right)\times \left(V-{V}_{\mathrm{R}}\right)\times {N}_{\mathrm{C}}. $$
(13)

We rearrange the terms and obtain:

$$ {I}_t={\displaystyle \sum_{k=0}^{N_{\mathrm{S}}-1}\left(\left(\left({\mathbf{P}}_0\times {\mathbf{B}}_k\right)\cdot \mathbf{g}\right)\times \left(V-{V}_{\mathrm{R}}\right)\times {N}_{\mathrm{C}}\times {\mathrm{e}}^{\lambda_k\times t}\right)}. $$
(14)

In the sum above, the term corresponding to k = 0 is a constant, because \( \lambda_0 \) is always equal to zero. That term is the current generated once the channels have reached equilibrium under those conditions. The explanation is that all eigenvalues are negative, except \( \lambda_0 \), which makes all terms in Eq. 14 become vanishingly small when t is sufficiently large, with the exception of the \( \lambda_0 \) term, which remains constant:

$$ {I}_{t\to \infty }=\left(\left({\mathbf{P}}_0\times {\mathbf{B}}_0\right)\cdot \mathbf{g}\right)\times \left(V-{V}_{\mathrm{R}}\right)\times {N}_{\mathrm{C}}. $$
(15)

Since channels are at equilibrium when t is sufficiently large, one can recognize that the vector \( \left(\mathbf{P}_0\times \mathbf{B}_0\right) \) must be equal to \( \mathbf{P}_{\mathrm{eq}} \). Therefore, the current flowing at equilibrium has the expression:

$$ {I}_{\mathrm{eq}}=\left({\mathbf{P}}_{\mathrm{eq}}\cdot \mathbf{g}\right)\times \left(V-{V}_{\mathrm{R}}\right)\times {N}_{\mathrm{C}}, $$
(16)

where:

$$ {\mathbf{P}}_{\mathrm{eq}}={\mathbf{P}}_0\times {\mathbf{B}}_0. $$
(17)

In the equation above, \( \mathbf{P}_0 \) can be any arbitrary probability vector. With these results, the macroscopic current can be written as:

$$ {I}_t={I}_{\mathrm{e}\mathrm{q}}+{\displaystyle \sum_{k=1}^{N_{\mathrm{S}}-1}\left({I}_k\times {\mathrm{e}}^{\lambda_k\times t}\right)}, $$
(18)

where \( I_k \) is a scalar quantity with dimension of current:

$$ {I}_k=\left(\left({\mathbf{P}}_0\times {\mathbf{B}}_k\right)\cdot \mathbf{g}\right)\times \left(V-{V}_{\mathrm{R}}\right)\times {N}_{\mathrm{C}}. $$
(19)

The eigenvalues, \( \lambda_k \), can be replaced with time constants, \( \tau_k \), obtaining the final current equation:

$$ {I}_t={I}_{\mathrm{e}\mathrm{q}}+{\displaystyle \sum_{k=1}^{N_{\mathrm{S}}-1}\left({I}_k\times {\mathrm{e}}^{-t/{\tau}_k}\right)}, $$
(20)

where \( \tau_k=-1/\lambda_k \). The macroscopic current described by Eq. 20 as a function of time is a sum of \( N_{\mathrm{S}}-1 \) exponentials, plus a constant term. Each exponential component is parameterized by a time constant \( \tau_k \) and an amplitude \( I_k \).
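Eq. 20 is straightforward to evaluate once the amplitudes and time constants are known. In the sketch below, the two components are made-up numbers chosen to mimic a current that activates quickly and inactivates more slowly; they are not derived from any model in this chapter.

```python
import math

def current_from_exponentials(t, i_eq, amplitudes, taus):
    """Eq. 20: I(t) = I_eq + sum_k I_k * exp(-t / tau_k)."""
    return i_eq + sum(a * math.exp(-t / tau) for a, tau in zip(amplitudes, taus))

# Assumed three-state example: two exponential components, amplitudes in A, taus in s.
i_eq = 0.0
amplitudes = [1e-9, -1e-9]    # fast (activation) and slow (inactivation) components
taus = [0.2e-3, 2e-3]

i_start = current_from_exponentials(0.0, i_eq, amplitudes, taus)
i_peak_region = current_from_exponentials(0.5e-3, i_eq, amplitudes, taus)
i_late = current_from_exponentials(0.1, i_eq, amplitudes, taus)
```

Because the two amplitudes cancel at t = 0, the current starts at zero, becomes transiently negative as the fast component decays, and then returns to the equilibrium value as the slow component decays.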

These results are general: any voltage-gated ion channel that has \( N_{\mathrm{S}} \) high-occupancy conformations will in principle generate a macroscopic current with \( N_{\mathrm{S}}-1 \) exponentials, when subjected to a step change in membrane potential. This is illustrated in Fig. 1c for the simple three-state model: \( I_t \) (the red trace) has the expected profile of rise (activation) followed by decay (inactivation). This time course is a sum of two simple exponential components that vanish to zero with different time constants. In this particular case, \( I_{\mathrm{eq}} \) is almost zero.

In the above equations, the macroscopic current \( I_t \) was calculated as a deterministic function of some initial conditions \( \mathbf{P}_0 \). However, one should keep in mind that \( I_t \) is the sum of many unitary currents, each generated by an individual ion channel that makes random transitions between states. These stochastic events at the single-channel level will make the macroscopic current a stochastic process as well. Therefore, the state occupancy at time t is a random quantity, characterized by a probability distribution [44, 46, 47, 50]. The state at time t can be statistically predicted from some known previous state, but the uncertainty of the prediction increases with the time from the reference point. In contrast, the initial state of a deterministic process can predict any future state with equal precision. The difference between stochastic and deterministic processes is illustrated in Fig. 1c, where the trajectory of the stochastically simulated macroscopic current (red trace) consistently diverges from the deterministically calculated current (black trace).

3.2 Protocol Design

As discussed above, an ion channel kinetic mechanism is fully characterized by its number of states, connectivity matrix of allowed state transitions, and rate constants quantifying transition frequency and voltage dependence. This information is encoded in single-channel or macroscopic voltage-clamp recordings as a stochastically fluctuating current, mixed with noise and artifacts. In single-channel recordings, the mean value of the current randomly jumps between two (or more) levels, corresponding to molecular transitions between conducting and non-conducting channel conformations. For example, in the single-channel trace shown in Fig. 1b, there happen to be four conductance changes over 50 ms. A channel with faster kinetics would result in more transitions per second, or, equivalently, in shorter average lifetimes in each state. Furthermore, a channel with greater voltage sensitivity would exhibit transition frequencies that change more substantially with voltage. Overall, the statistical properties of single-channel data can be analyzed with a variety of mathematical methods and computational algorithms, to extract the kinetic mechanism of the underlying ion channel [36–42, 51].

A macroscopic current is also a stochastic sequence of events, where individual channels randomly change state. Thus, stochastic fluctuations in the macroscopic current [52, 53] are also a potential source of information that could be used to extract the kinetic mechanism [44, 46, 47, 50]. However, these fluctuations are more difficult to separate from experimental noise and artifacts. First, depending on the recording technique, the experimental preparation, and the noise levels of the recording system, a change in the conductance state of a single channel may be very difficult or impossible to detect experimentally. Second, the frequency of transitions in the overall state of the ensemble is proportional to the total number of channels, and it may exceed the bandwidth of the recording system. For example, if the average single-channel transition frequency is \( 10\ \mathrm{s}^{-1} \), an ensemble of 10,000 channels would exhibit 100,000 transitions per second, while the recording bandwidth may be smaller by an order of magnitude. Thus, information encoded in the magnitude and frequency of stochastic current fluctuations may be lost.

A generally more reliable source of information is the mean value of the macroscopic current, as a function of time and voltage. Even in this case, although the mean value can be easily extracted from noisy data, decoding the kinetic mechanism is far from trivial. The main difficulty lies in the ambiguous relationship between the exponential parameters describing the macroscopic current (time constants \( \tau_k \) and amplitudes \( I_k \)) and the kinetic parameters of the channel (rate constant factors \( k_{ij}^0 \) and \( k_{ij}^1 \)). Following a step change in conditions, the overall state of the ensemble relaxes exponentially towards a new equilibrium. For a channel with \( N_{\mathrm{S}} \) states, this relaxation process is quantified by a set of \( 2\times N_{\mathrm{S}}-1 \) parameters, as described by Eq. 20: \( N_{\mathrm{S}}-1 \) time constants \( \tau_k \), \( N_{\mathrm{S}}-1 \) amplitudes \( I_k \), and the equilibrium current \( I_{\mathrm{eq}} \). Every one of these exponential parameters, including \( I_{\mathrm{eq}} \), is a mathematical function of all the rate constant parameters, and implicitly a function of membrane potential. Thus, while calculating the exponential parameters from the kinetic mechanism is straightforward, the inverse calculation is not. This also implies that no more than \( 2\times N_{\mathrm{S}}-1 \) kinetic parameters can be extracted from a macroscopic current generated in response to a single voltage step. In reality, kinetic mechanisms may have more parameters than that. For example, the three-state model in Fig. 1a has eight kinetic parameters but only five exponential parameters. Even with this unrealistically simple model, it is clear that the kinetic parameters of the model cannot be unequivocally determined from the mean value of the macroscopic current, unless the voltage-clamp protocol is expanded to more than one voltage step.
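The parameter count in this argument can be written out explicitly. The sketch assumes that the three-state model is a linear closed, open, inactivated scheme with four directed transitions, which is consistent with the eight kinetic parameters quoted above; the function names are hypothetical.

```python
def exponential_parameter_count(n_states):
    """Parameters of Eq. 20: N_S - 1 time constants, N_S - 1 amplitudes, one I_eq."""
    return 2 * n_states - 1

def kinetic_parameter_count(transitions):
    """Two parameters (k0 and k1 of Eq. 2) for each directed transition."""
    return 2 * len(transitions)

# Assumed topology: closed (0) <-> open (1) <-> inactivated (2).
transitions = [(0, 1), (1, 0), (1, 2), (2, 1)]
n_exponential = exponential_parameter_count(3)
n_kinetic = kinetic_parameter_count(transitions)
# More unknowns than observables: a single voltage step cannot determine the model.
underdetermined = n_kinetic > n_exponential
```

This is only a counting argument; having at least as many exponential parameters as kinetic parameters is necessary, but it does not by itself guarantee that the kinetic parameters are well determined.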

A second difficulty is related to the theoretical and experimental observability of all the exponential components, given the limited resolution of the recording system. The idea is that, although each pair of exponential parameters (\( \tau_k \), \( I_k \)) depends on all the rate constants, fast or slow exponential components will be influenced most by similarly fast or slow rates, respectively. Then, if a certain exponential component is weakly represented in the data, some of the kinetic parameters will also be weakly determined. The contribution of an exponential component to the data, given a set of kinetic and conductance properties, depends on two factors. First, the amplitudes \( I_k \) depend on the initial state occupancies \( \mathbf{P}_0 \). Thus, depending on the voltage-clamp protocol, some components may have very small, or even zero, amplitude, and can be undetectable relative to the experimental resolution. Overall, a change in state occupancy is accompanied by a change in current only if the total occupancy of the conducting states changes. If this fraction doesn’t change, or if the change is small relative to the resolution of the recording system, the mean current value will remain approximately constant, even though the properties of the stochastic fluctuations may change. Second, an exponential component can be observed experimentally only if the bandwidth of the recording system is adequate. Thus, very fast exponentials may be distorted or filtered out, while very slow components may not be detected in short protocols. A property worth remembering is that these exponential components vanish in order, from the smallest to the largest time constant. As a result, the fastest components will be affected the most by experimental artifacts associated with abrupt changes in the command voltage.

In conclusion, voltage-clamp protocols must be designed carefully and optimized to minimize these issues. Overall, the most important practical recommendation we can make is to design and apply as many types of stimuli as feasible, to force the channel to visit as many states as possible, which should result in well-observed exponential components and well-determined kinetic parameters. Ultimately, designing a good set of stimulation protocols is an iterative process, without a priori solutions. It may well happen that applying yet another protocol exposes a new behavior of the channel, which then needs to be investigated with new or refined stimuli.

An example of a typical set of voltage-clamp protocols is given in Fig. 2, as applied to recording whole-cell Nav currents from mammalian neurons in brain slices [54]. A minimum of four protocols is necessary to investigate the kinetic properties of Nav channels, as illustrated in Fig. 2a–d. Each of these voltage-clamp protocols forces the channel on a different state trajectory, thus exposing a different set of kinetic properties. For example, the protocol in Fig. 2a starts the channel in a state of deactivation and takes it through activation, opening, and inactivation. Several exponential components are well defined in the data, particularly the two time constants of inactivation. In contrast, the protocol in Fig. 2c starts the channel in a state of deactivation as well, but the channel is taken directly into inactivation, without opening. Two time constants of inactivation can also be detected in the data, but the exponential components have lower amplitude and thus are slightly less well defined.

Fig. 2

Designing voltage-clamp protocols for Na+ currents. To gather information about the kinetic mechanism, the channels are forced to make transitions between different sets of states, as follows: deactivated to open to inactivated (a), deactivated to inactivated to open (b), non-inactivated to inactivated (c), and inactivated to non-inactivated (d). Raw data are further processed to extract state occupancies as a function of time and voltage (e and f). Adapted from [54]

With some of these protocols, the raw data can be used directly to determine the kinetic parameters (e.g., the time course of activation and inactivation in Fig. 2a). With others, the raw data are first processed to extract some empirical measure of state occupancy, which is then used to estimate kinetic parameters. Examples are the (pseudo) steady-state activation and inactivation in Fig. 2e, and the time course of recovery from inactivation and the subthreshold inactivation in Fig. 2f. Generally, the raw data are used directly when an exponential time course is experimentally observable in the macroscopic current, for example, when the channel activates, opens, and then inactivates (Fig. 2a). When state changes are not associated with changes in conductance, information is obtained from two-pulse protocols, for example, when the channel inactivates at membrane potentials where it cannot activate and open (Fig. 2c). In this case, the peak of the current is used as an empirical measure for the total occupancy of non-inactivated states available to generate current upon activation.

3.3 Experimental Artifacts

The data recorded in voltage-clamp experiments do not contain just the current of interest but are contaminated by a variety of artifacts, including other currents active in the preparation, experimental noise, voltage-clamp errors, etc. [17]. All of these artifacts will negatively affect fitting algorithms and can result in a distorted model. Although artifacts cannot be eliminated, they can be reduced to acceptable levels. Thus, the effects of random measurement noise, which lowers the precision of parameter estimates, can always be reduced by collecting more data, and generally are not an issue. Deterministic power line interference (50 or 60 Hz, or harmonics) can be easily removed online or offline. Uncompensated brief transients that occur when the command voltage changes abruptly can simply be excluded from the fit, provided that they don’t overlap with significant channel activity. However, longer transients must be somehow separated from the signal. Voltage-clamp errors caused by incomplete compensation of the series resistance could be significant. However, the actual voltage at the membrane can be either measured directly in some techniques (e.g., two-electrode voltage clamp) or calculated from the measured series resistance and the recorded current. Then, a corrected version of the command voltage protocol can be constructed and used in the data fitting procedure. As explained above, an arbitrary voltage waveform can be approximated with a sequence of constant voltage steps. Another artifact is imperfect space clamp, which can occur when recording from neurons in vivo or in brain slices [55]. In this case, the current recorded from the soma can be contaminated with action potentials backpropagating from the axon [56], which usually escapes voltage-clamp control. Space-clamp errors can be reduced with a simple technique that selectively inactivates axonal sodium channels and thus makes the axon a passive compartment [57].
Finally, the bandwidth of the recording system should be sufficiently wide for the kinetics of the ion channel investigated. Initially, the cutoff frequency of the low-pass filter and the sampling rate of the digitizer should be set as high as possible to identify the fastest time constant present in the data. Then, acquisition parameters can be set to the values recommended by Nyquist’s sampling theorem [58]. In many cases, the fastest time constant corresponds to channel activation or deactivation.
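
As a minimal sketch of the series resistance correction described above, the membrane voltage can be estimated sample by sample from the command voltage and the recorded current. The function name, the units (mV, nA, MΩ), the sign convention (inward current negative), and the example values are assumptions made for illustration only.

```python
def corrected_voltage(v_cmd_mv, current_na, rs_mohm, compensation=0.0):
    """Estimate the membrane voltage (mV) at each sample.

    v_cmd_mv     : command voltage samples, in mV
    current_na   : recorded current samples, in nA (inward current negative)
    rs_mohm      : measured series resistance, in MOhm
    compensation : fraction of Rs already compensated by the amplifier (0..1)
    """
    rs_residual = rs_mohm * (1.0 - compensation)
    # nA * MOhm = mV, so no unit conversion is needed.
    return [v - i * rs_residual for v, i in zip(v_cmd_mv, current_na)]


# Hypothetical step to -10 mV that evokes an inward current of up to -2 nA.
v_cmd = [-80.0, -10.0, -10.0, -10.0]
i_rec = [0.0, -2.0, -1.0, -0.1]
v_mem = corrected_voltage(v_cmd, i_rec, rs_mohm=5.0, compensation=0.8)
```

With 1 MΩ of residual series resistance, the inward current of 2 nA depolarizes the membrane by about 2 mV beyond the command value. The corrected sequence can then replace the command voltage when the model response is computed.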

Even when these artifacts cannot be eliminated, in principle they can be parameterized and included into the fitting algorithm. Unfortunately, contamination with other ionic currents that are active in the preparation cannot be easily encoded in the algorithm. These currents are generally unknown quantities and cannot be compensated computationally. The ideal solution is to isolate the current of interest pharmacologically, with a very specific blocker. If this is available, the same protocols can be repeated under control conditions and with the blocker applied, and the two sets of data can be subtracted from each other, giving the current of interest. This subtraction not only eliminates all other currents, including leak, but will also remove capacitive transients. However, not all channels can be completely isolated by pharmacological subtraction. The solution is to reduce all other currents as much as possible. Some currents can be blocked pharmacologically, while others can be rendered inactive by exploiting their kinetic properties, particularly voltage dependence. Furthermore, the background leak currents and possibly other currents left unblocked can be subtracted using the P/n technique [59], which will also remove capacitive transients. However, one should be aware that the P/n method relies on the assumption that leak currents are linear with voltage and thus cannot subtract voltage-sensitive currents.
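
The arithmetic behind P/n subtraction can be sketched as follows. This is a simplified illustration, not the procedure of [59]: it assumes that all traces are expressed relative to the holding current, that the n sub-pulses have the same polarity as the test pulse and amplitude P/n, and that the leak is perfectly linear. The function name and the synthetic numbers are hypothetical.

```python
def p_over_n_subtract(test_trace, subpulse_traces):
    """Subtract the summed responses to n sub-pulses (each of amplitude P/n)
    from the response to the full test pulse P."""
    n_samples = len(test_trace)
    # For a linear membrane, the n sub-pulse responses add up to the
    # leak (and capacitive) component of the test-pulse response.
    linear = [sum(trace[k] for trace in subpulse_traces) for k in range(n_samples)]
    return [test_trace[k] - linear[k] for k in range(n_samples)]


# Synthetic example: linear leak plus a voltage-gated current.
g_leak = 0.01                      # leak conductance, nA/mV (assumed)
step_mv = 60.0                     # test pulse amplitude P, mV (assumed)
n = 4                              # number of sub-pulses
channel = [0.0, -1.5, -0.7, -0.2]  # channel current during the test pulse, nA
test_trace = [g_leak * step_mv + i for i in channel]
subpulses = [[g_leak * step_mv / n] * len(channel) for _ in range(n)]
recovered = p_over_n_subtract(test_trace, subpulses)
```

In this idealized case the subtraction returns the channel current. Any component that does not scale linearly with voltage would remain in the subtracted trace.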

When designing protocols to isolate the current of interest, pharmacologically or via the P/n technique, one should keep in mind that these procedures will extend the total acquisition time, and recording parameters may change over time. For example, the seal resistance may deteriorate, causing an increase in leak, the level of solution in the bath may change and alter pipette capacitance and transients, and series resistance may fluctuate and change the amplitude of recorded currents. Generally, currents may run down over time. All these changes will distort the subtracted current. One other problem with subtraction methods is that uncompensated series resistance errors depend on the total current flowing. Then, if the total current takes significantly different values under control versus pharmacological block or P/n conditions, the actual voltage at the membrane will also differ. As a result, the subtracted current will be contaminated with some leftover current. Thus, even if a good blocker is available for the channel of interest, reducing all the other currents pharmacologically is still recommended. A similar artifact occurs when the current of interest is functionally coupled with other currents (e.g., Ca2+-activated K+ currents). These would no longer be activated when the current of interest is blocked. Finally, one should keep in mind that when two random variables are subtracted from each other, their mean values subtract but their variances add. Thus, subtracting two sets of currents will result in a signal with greater noise, which would make it difficult to apply fitting methods that rely on the properties of current fluctuations.
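
The last point can be written out explicitly. As a sketch, let \( y_{\mathrm{c}} \) and \( y_{\mathrm{b}} \) denote a sample recorded under control conditions and with the blocker applied, respectively (symbols introduced here for illustration), and assume that the noise in the two recordings is independent:

$$ \mathrm{E}\left[{y}_{\mathrm{c}}-{y}_{\mathrm{b}}\right]=\mathrm{E}\left[{y}_{\mathrm{c}}\right]-\mathrm{E}\left[{y}_{\mathrm{b}}\right],\kern2em \mathrm{Var}\left[{y}_{\mathrm{c}}-{y}_{\mathrm{b}}\right]=\mathrm{Var}\left[{y}_{\mathrm{c}}\right]+\mathrm{Var}\left[{y}_{\mathrm{b}}\right]. $$

Under this assumption, if the two recordings have equal noise variance, the standard deviation of the subtracted trace is larger by a factor of \( \sqrt{2} \) than in either recording.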

4 Fitting the Data

The objective is to find the kinetic mechanism that best explains the experimental data, but also agrees with prior knowledge. As discussed, a kinetic mechanism is fully defined by its topology, given by the number of states and the connectivity matrix of possible transitions, and by the parameters quantifying rate constants and voltage dependence. Finding the topology and finding the parameters are generally approached as separate problems, and our focus here is on estimating parameters for a given topology. A computational procedure for finding the best parameters combines a parameter optimization engine (the optimizer) and an algorithm that calculates how well the model explains the data (the cost function, or the goodness of fit). The optimizer starts with a set of initial values and iteratively explores the parameter space, according to a defined search strategy, until it finds a set of parameters that maximizes the goodness of fit. For each point sampled in the parameter space, the optimizer calls the estimation algorithm to evaluate the goodness of fit. Typically, the optimizer is data- and model-blind, although it can be tweaked for a particular problem. A variety of general optimization algorithms that have been described in the literature [60] and are available in numerical libraries can be applied to ion channels. In contrast, the function that calculates the goodness of fit is very specialized and can be quite complicated.

4.1 Goodness of Fit

In general, how well the model explains the data can be defined in different ways, depending on the data and the model. For a deterministic time series contaminated by measurement noise, the goodness of fit is typically given by the sum of squared errors, S, between the experimental data points and the fit curve:

$$ S={\displaystyle \sum_t{\left({y}_t-{f}_t\left(\mathbf{M},\mathbf{K}\right)\right)}^2}, $$
(21)

where \( y_t \) is the data point measured at time t and \( f_t\left(\mathbf{M},\mathbf{K}\right) \) is the calculated value at time t, given a structural model M and a set of parameters K. The best fit parameters are those that correspond to the lowest S value that could be reached.

Curve fitting is not the ideal method for data generated by ion channels or other stochastic processes. These data are not defined by a deterministic function that can be calculated for every time point. Instead, they are a stochastic sequence of channel states, contaminated with random measurement noise. Nevertheless, this stochastic sequence is generated by the ion channel according to a probability distribution, which is determined by the kinetic mechanism [32]. For single-channel data, this probability distribution can be used to calculate the likelihood of the data, L [36]:

$$ L=p\left(\mathbf{Y}\Big|\mathbf{M},\mathbf{K}\right), $$
(22)

where p is the conditional probability of the data sequence Y, given a model topology M and a set of parameters K. The best fit parameters correspond to the highest L value that could be reached. In practice, the logarithm of the likelihood function is used instead of the likelihood itself, because L may reach intractably small or large values.
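
The numerical motivation for using the logarithm can be seen in a toy calculation. The sketch below is only an illustration under a strong simplifying assumption: the likelihood is treated as a plain product of per-sample probabilities, all equal to a hypothetical value of 0.01, which is not how the likelihood of a single-channel record is structured in general.

```python
import math

# Hypothetical per-sample probabilities: 2000 samples, each with p = 0.01.
probs = [0.01] * 2000

# Direct product: the true value, 1e-4000, is far below the smallest
# positive double-precision number, so the result underflows to zero.
likelihood = 1.0
for p in probs:
    likelihood *= p

# Sum of logarithms: stays finite and well scaled.
log_likelihood = sum(math.log(p) for p in probs)
```

The direct product carries no usable information once it underflows, whereas the sum of logarithms can still be compared between parameter sets.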

Ideally, macroscopic currents should also be approached as a stochastic process, using a likelihood-based goodness of fit. A variety of mathematical and computational algorithms have been designed to calculate the likelihood of a macroscopic current [44, 46, 47, 50], all making various approximations to speed up the computation. Ultimately, the fastest but theoretically the least accurate approximation that can be made is to completely ignore the stochastic nature of the macroscopic current. Essentially, the goodness of fit in this case is calculated as the sum of squared errors between the experimental data and the calculated macroscopic current, \( I_t \):

$$ S={\displaystyle \sum_t{\left({y}_t-{I}_t\right)}^2}. $$
(23)

This approximation is most suitable when the analyzed current is generated by many channels, when stochastic fluctuations are small relative to the mean value and comparable to the measurement noise. All other methods that make more accurate assumptions exploit in some way the fluctuations of the current, and theoretically should produce more accurate or more precise parameter estimates. However, as discussed above, many experimental data are not clean enough for noise analysis and the mean of the current may be the only reliable source of information. This condition describes well the macroscopic currents generated by voltage-gated ion channels in whole-cell patch-clamp experiments. In the following section, we assume that the goodness of fit is calculated as the sum of squared errors, S.

4.2 Computing the Cost Function

A variety of voltage-clamp protocols can be applied to determine the kinetic mechanism, as illustrated in Fig. 2. For each data set that is included in the fitting procedure, the estimation algorithm must calculate the goodness of fit. When the cost function is the sum of squared errors, S, then the mean current \( I_t \) must be calculated for every point in the data. Essentially, the algorithm must simulate a macroscopic current in response to the same voltage-clamp protocol as was used to record the data, given the set of parameters proposed by the optimizer in that iteration. For two-pulse protocols, such as those shown in Fig. 2, the simulated current sequence must also be processed in the same way as the experimental data. For example, the experimental recovery from inactivation (Fig. 2d) is calculated, as a function of time and recovery potential, as the ratio between the peak current obtained with the test pulse and the peak current obtained with the conditioning pulse. Although it might be tempting, it is a bad idea to calculate the theoretical recovery from inactivation using the sum of non-inactivated state occupancies, because the peak current is only an empirical measure of that occupancy. Instead, it should be calculated as for the experimental curve: first, simulate the response of the model to the two-pulse protocol, then, from this simulation, calculate the ratio of the two peaks.

\( I_t \) is most efficiently computed recursively, using Eq. 8 to calculate \( \mathbf{P}_{t+1} \) from \( \mathbf{P}_t \), where t and t + 1 refer to consecutive samples. The computation is initialized with \( \mathbf{P}_0 \), which is calculated as the equilibrium state occupancies that correspond to the holding voltage. The entire sequence of operations can be summarized as follows:

$$ \begin{array}{l}{\mathbf{P}}_{\mathrm{eq}}=\mathbf{1}\cdot {\mathbf{B}}_{0,{V}_{\mathrm{H}}},\\ {}{\mathbf{P}}_0={\mathbf{P}}_{\mathrm{eq}},\\ {}{\mathbf{P}}_1={\mathbf{P}}_0\times {\mathbf{A}}_{\delta t,{V}_1},\\ {}\dots \\ {}{\mathbf{P}}_t={\mathbf{P}}_{t\hbox{-} 1}\times {\mathbf{A}}_{\delta t,{V}_t},\\ {}{I}_t={\mathbf{P}}_t\times {\mathbf{I}}_{V_t},\\ {}{S}_t={\left({y}_t-{I}_t\right)}^2,\end{array} $$
(24)

where 1 is the normalized unity vector, with each element equal to \( 1/{N}_{\mathrm{S}} \); \( {\mathbf{B}}_{0,{V}_{\mathrm{H}}} \) is the spectral matrix corresponding to \( {\lambda}_0 \) and calculated for a voltage equal to the holding potential, \( {V}_{\mathrm{H}} \); \( {\mathbf{A}}_{\delta t,{V}_t} \) is the transition probability matrix calculated for \( \delta t \) and a voltage equal to the command potential at time t, \( {V}_t \); \( {S}_t \) is the squared error at time t. Finally, \( {\mathbf{I}}_{V_t} \) is a vector with dimension of current, with each element equal to the maximum current that would be generated if all the channels resided in that state:

$$ {\mathbf{I}}_{V_t}=\mathbf{g}\times \left({V}_t-{V}_{\mathrm{R}}\right)\times {N}_{\mathrm{C}}. $$
(25)

When the command voltage changes during a protocol sequence, the spectral matrix \( {\mathbf{B}}_0 \) and the transition probability matrix \( {\mathbf{A}}_{\delta t} \) are replaced with the matrices calculated for that voltage value. As discussed, instead of the command voltage, one could use the actual voltage measured at the membrane, when available, or a voltage corrected for errors caused by the uncompensated series resistance.
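
The recursion in Eqs. 24 and 25 can be sketched for the simplest possible case, a two-state closed-open (C ↔ O) model, for which the spectral matrices have a closed form. Everything specific in this example is an assumption made for illustration: the two-state topology, the exponential voltage dependence \( k={k}^0\times \exp \left({k}^1V\right) \), the parameter names and values, and the units (ms, mV, nS, pA). A model with more states would require a numerical eigendecomposition instead.

```python
import math

def rate(k0, k1, v):
    # Assumed voltage dependence of a rate constant: k = k0 * exp(k1 * V).
    return k0 * math.exp(k1 * v)

def spectral(v, p):
    """Spectral matrices B0, B1 and nonzero eigenvalue of a two-state C<->O model."""
    a = rate(p["k0_co"], p["k1_co"], v)      # C -> O rate
    b = rate(p["k0_oc"], p["k1_oc"], v)      # O -> C rate
    s = a + b
    b0 = [[b / s, a / s], [b / s, a / s]]    # belongs to eigenvalue 0
    b1 = [[a / s, -a / s], [-b / s, b / s]]  # belongs to eigenvalue -(a + b)
    return b0, b1, -s

def transition_matrix(v, dt, p):
    # A(dt) = B0 + B1 * exp(lambda_1 * dt)
    b0, b1, lam1 = spectral(v, p)
    e = math.exp(lam1 * dt)
    return [[b0[i][j] + b1[i][j] * e for j in range(2)] for i in range(2)]

def vec_mat(vec, mat):
    # Row vector times matrix.
    return [sum(vec[i] * mat[i][j] for i in range(len(vec)))
            for j in range(len(mat[0]))]

def simulate_current(v_hold, v_cmd, dt, p):
    """Mean current at each sample of the command voltage sequence v_cmd."""
    b0, _, _ = spectral(v_hold, p)
    occ = vec_mat([0.5, 0.5], b0)            # P_eq = 1 . B0 (normalized unity vector)
    g = [0.0, p["g"]]                        # conductance of states C and O
    cache = {}                               # one transition matrix per voltage
    current = []
    for v in v_cmd:
        if v not in cache:
            cache[v] = transition_matrix(v, dt, p)
        occ = vec_mat(occ, cache[v])         # P_t = P_(t-1) x A
        i_vec = [gi * (v - p["v_rev"]) * p["n_c"] for gi in g]
        current.append(sum(o * i for o, i in zip(occ, i_vec)))
    return current

def sum_squared_errors(data, model):
    return sum((y - i) ** 2 for y, i in zip(data, model))

# Hypothetical parameters: rates in 1/ms and 1/mV, g in nS, voltages in mV.
params = {"k0_co": 1.0, "k1_co": 0.05, "k0_oc": 0.1, "k1_oc": -0.05,
          "g": 0.01, "v_rev": -90.0, "n_c": 1000}
v_cmd = [0.0] * 200                          # 20 ms step to 0 mV, sampled at 0.1 ms
i_model = simulate_current(-80.0, v_cmd, 0.1, params)
```

Caching the transition matrix per voltage reflects the fact that the matrices only need to be recalculated when the command voltage changes. In a fit, `sum_squared_errors` would be called with the recorded trace and `i_model`, once for every parameter set proposed by the optimizer.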

The total sum of squared errors, S, is the sum of squared errors for all data points used in the analysis. S could be divided in components corresponding to individual data sets, each multiplied by a weighting factor:

$$ S={\displaystyle \sum_i{w}_i\times {S}_i}. $$
(26)

These weighting factors can be chosen empirically, to establish the relative contribution of each data component to the cost function.

4.3 Model Parameters

For a given model topology, the unknown parameters to be determined are the rate constant factors \( {k}_{ij}^0 \) and \( {k}_{ij}^1 \). However, the macroscopic current depends on additional quantities: the unitary conductance, g, and the total number of channels, \( {N}_{\mathrm{C}} \). Calculating \( I_t \) in the cost function requires these quantities. Normally, for a given ion channel type, the unitary conductance has the same value in every recording and can be estimated directly from single-channel data, or via noise analysis from macroscopic currents [53, 61]. Although \( {N}_{\mathrm{C}} \) can also be determined through noise analysis, it takes a different value in each experimental preparation and it cannot always be known. One possibility is to make \( {N}_{\mathrm{C}} \) a parameter to be estimated [44]. If the data used in the fit were obtained from multiple experiments, then there will be multiple \( {N}_{\mathrm{C}} \) parameters, one for each preparation. The downside with this approach is a potentially large increase in the dimensionality of the parameter space, which would slow down the optimizer. Another possibility is to normalize the current in each data set to the local maximum value. The disadvantage in this case is a greater ambiguity in the estimated kinetic parameters. Furthermore, it can be problematic to analyze fluctuations. With some models, distinct combinations of rate constants and channel count values can generate the same macroscopic current in response to a voltage-clamp protocol [44]. However, in principle this ambiguity could be resolved by adding more protocols to the fit.

4.4 Prior Knowledge

Including prior knowledge in the model is necessary: although it restricts the freedom of the optimizer to search for parameters, it ensures that the parameters that best explain the new data are also in agreement with previous experiments and theory. Prior knowledge can be encoded in the topology of the model, but also in the kinetic parameter values. For example, the number and sequence of distinct conformational states that can be assumed by voltage sensors is defined by topology, while their degree of cooperativity is defined by rate constants. One could also encode in the model a hypothesis about the kinetic mechanism, and test it by fitting the data with this model. If the optimizer can find parameters that explain the data well, then the hypothesis is potentially correct, and vice versa.

A good example of including prior knowledge in the model is the study of inactivation in Nav channels by Kuo and Bean [62]. Phenomenologically, the steady-state and transient inactivation properties of Nav currents appear to be strongly voltage sensitive, as initially determined and modeled by Hodgkin and Huxley [21]. Yet, in more recent studies, very little electrical charge was detected to move during inactivation [26, 59, 63–66]. While Hodgkin and Huxley postulated that activation and inactivation are independent processes in Nav currents, Kuo and Bean proposed instead an allosteric coupling of inactivation to activation, which makes inactivation apparently but not intrinsically voltage dependent. Their model is shown in Fig. 3.

Fig. 3

Representing ion channel kinetic mechanisms with state models. This model has been formulated for Nav channels [62]. States C1…C5 and O6 represent the non-inactivated channel, whereas I7…I12 are inactivated states. O6 is the only conducting (open) state. The pathway either from C1 to C5 or from I7 to I12 corresponds to the activation of the four voltage sensors, assumed to be equal and independent. This assumption is denoted by the 4:3:2:1 or 1:2:3:4 ratios in the factors multiplying the \( {\alpha}_m \) or \( {\beta}_m \) rates. The C5 to O6 transition corresponds to the opening of the channel. The model allows the channel to inactivate without opening, from any of the closed states C1…C5. However, the channel is most likely to inactivate from the open state O6 or when more voltage sensors are activated (e.g., from C5). This property is implemented via the allosteric factors a and b, which control the equilibrium and transition frequency between closed and inactivated states. The I7…I11 transition rates also include the allosteric factors a and b, to satisfy microscopic reversibility. When the channel reaches the open state O6 upon membrane depolarization, it is quickly and completely absorbed into the inactivated states I12 and I11. Inactivation from closed states happens more slowly, which gives the channel a chance to open before it inactivates during an action potential [54]. The only rates with significant voltage sensitivity are \( {\alpha}_m \) and \( {\beta}_m \). The allosteric coupling between activation and inactivation can explain the apparent voltage sensitivity of inactivation but also the minimal electrical charge detected to move within the channel during inactivation

4.5 Model Constraints and Free Parameters

Knowledge (or hypotheses) about rate constants can be implemented either as a set of mathematical constraints or as a penalty term added to the cost function. Generally, constraints are defined as an invertible transformation, c, between a set of model parameters K and a set of free parameters F:

$$ \mathbf{F}=c\left(\mathbf{K}\right), $$
(27)
$$ \mathbf{K}={c}^{-1}\left(\mathbf{F}\right). $$
(28)

The model is defined by the K parameters, which are subject to the constraints implemented by c. In contrast, the optimizer is model-blind and operates with the F parameters, which are “free” to take any value in the (−∞, +∞) range. The optimizer searches in the free parameter space and finds a set F* that is converted, via the \( {c}^{-1} \) transformation, into a set K* that best explains the data. The search is initialized with a set of free parameters \( {\mathbf{F}}_0 \), obtained via the c transformation from an initial set of model parameters \( {\mathbf{K}}_0 \).

If no constraints are formulated, c is the identity transformation. In this case, the number of free parameters is equal to the number of model parameters. However, at least one type of constraint has to be applied, which is to keep the rate constant factors \( {k}_{ij}^0 \) greater than zero. This constraint can be implemented easily by using \( \ln {k}_{ij}^0 \) as the free variable, which can take any value in the (−∞, +∞) range, but restricts \( {k}_{ij}^0 \) to (0, +∞). This transformation keeps the numbers of free and model parameters equal. However, each additional constraint of any other type reduces the number of free parameters by one. A variety of useful model constraints can be implemented with c as a simple linear transformation. For example, one can constrain a rate to have a constant value, or two rates to have a constant product, or one can enforce microscopic reversibility. Detailed explanations of how to implement linear constraints are available in the literature [39, 44, 67].
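
As a minimal sketch of these transformations, the logarithmic mapping and a constant-product constraint can be written as follows. The function names are hypothetical, and the constant-product example is only one possible way to express such a constraint (it becomes linear in the logarithms of the two rates); the cited literature describes the general treatment.

```python
import math

def to_free(k0_values):
    """c: model parameters K -> free parameters F, with F = ln K."""
    return [math.log(k) for k in k0_values]

def to_model(free_values):
    """c^-1: free parameters F -> model parameters K, with K = exp F."""
    return [math.exp(f) for f in free_values]

def to_model_constant_product(f_a, product):
    """Two rates constrained to ka * kb = product.

    Only one free parameter remains, f_a = ln ka; kb follows from the constraint.
    """
    ka = math.exp(f_a)
    return ka, product / ka


# Any real-valued free parameter maps to a strictly positive rate factor.
rates = to_model([-50.0, 0.0, 50.0])
ka, kb = to_model_constant_product(1.5, product=20.0)
```

The optimizer can move the free parameters anywhere on the real line, yet the model never receives a negative rate, and the constrained pair uses one free parameter instead of two.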

Sometimes, a model parameter is allowed to take any value in a defined range, as implemented via the c transformation, but some of these values may be considered “better” than others. For example, a rate constant estimated at 100,000 \( {\mathrm{s}}^{-1} \) by the (model-blind) optimizer, although physically acceptable, could be considered quite unlikely when obtained from data sampled at 1 kHz. In this case, a penalty term can be factored into the cost function \( {S}^{*} \):

$$ {S}^{*}=S\times p\left(\mathbf{K}\right), $$
(29)

where the penalty p(K) is a function of model parameters K. In principle, constraints implemented via the c transformation can also be formulated as a penalty, by making p(K) equal to one when K satisfies the constraints, or equal to a very large number when not. However, the advantage of having a reduced number of free parameters would be lost, and defining p(K) is not trivial.
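
One possible form of such a penalty is sketched below. The shape of the function is entirely an assumption: it equals one while every rate stays below a chosen multiple of the sampling rate and grows quadratically above that limit. The function names, the factor of 10, and the quadratic growth are hypothetical choices, not a recommendation from the literature.

```python
def penalty(rates_per_s, sampling_hz, factor=10.0):
    """Hypothetical penalty p(K): 1 when all rates are below factor * sampling
    rate, increasing quadratically with the excess above that limit."""
    limit = factor * sampling_hz
    p = 1.0
    for k in rates_per_s:
        if k > limit:
            p *= (k / limit) ** 2
    return p

def penalized_cost(s, rates_per_s, sampling_hz):
    # S* = S x p(K)
    return s * penalty(rates_per_s, sampling_hz)


# A rate of 100,000 per second with data sampled at 1 kHz is penalized.
p_ok = penalty([500.0, 2000.0], sampling_hz=1000.0)
p_fast = penalty([100000.0], sampling_hz=1000.0)
```

With these choices, the unlikely rate multiplies the cost by 100, which steers the optimizer away from it without forbidding it outright.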

Fundamentally, any assumption about the kinetic mechanism results in adding one or more computational constraints to the model, and thus reduces the number of free parameters. While having fewer parameters makes the fitting easier, the model will also have less flexibility in explaining the data. For example, the model proposed by Kuo and Bean [62], although making a radical departure from the classical dogma, makes the simplifying assumption that the channel has identical and independent voltage sensors, which dramatically limits the number of free parameters. This assumption was later invalidated by clever biophysical experiments [68]. Another study relaxed the constraint that the inactivation rates in the Kuo and Bean model are voltage independent and found that a small but finite charge can explain their data better [54]. Ideally, one should always start with a well-constrained model that has relatively few free parameters and gradually remove the constraints until the fit no longer improves.

4.6 Searching for Best Parameters

Let’s take an intuitive look at how the optimizer searches for parameters. If we had a model that has only two parameters, we could imagine a 2D surface that represents on the z axis the error of the model relative to the data, as a function of the two parameter values represented on the x and y axes. The greater the error, the greater the z value. Hopefully, somewhere on this surface there is a single point where the error is the lowest. That is the solution that the optimizer has to find, corresponding to the set of best parameters. If we could apply a grid over the parameter space and calculate the error at every node, we could simply identify the parameter values where the error is lowest. As a further refinement, we could apply a smaller and finer grid on that point and improve the estimate.

Unfortunately, a grid search is prohibitive, because the number of free parameters can be large and the cost function can be computationally expensive. For example, for only two parameters, a 100 × 100 search grid requires 10,000 evaluations of the cost function. For some of these parameter values, the cost function may even be impossible to calculate, due to numerical instability, particularly for large \( {k}_{ij}^1 \) values. For ten free parameters, which is not an unusually large number, we would need \( {100}^{10} \) evaluations. Assuming an optimistic 1 ms computation time per cost function evaluation, this search would take a long, long, long time. Essentially, the optimizer is in the situation of a tourist trapped in a multidimensional universe, having to find the best restaurant in the city, in complete darkness and without map or smart phone, just guided by smell.
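
To put a number on this estimate, the arithmetic can be carried out directly, using the grid size and the 1 ms evaluation time quoted above (the variable names are ours):

```python
evaluations = 100 ** 10                  # 100 grid points per parameter, ten parameters
seconds = evaluations * 1e-3             # 1 ms per cost function evaluation
years = seconds / (365.25 * 24 * 3600)   # convert seconds to years
```

The result is on the order of \( {10}^{17} \) s, or roughly three billion years.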

Clearly, the optimizer must use a clever and more efficient strategy. A solution is to mimic the effects of gravity. If we place a ball on our error surface, the ball will start rolling downhill, eventually settling at the bottom. Like our tourist, the ball is in darkness: it does not see the whole map, but it’s simply taken by gravity down the local gradient. An optimization algorithm can use the same search strategy, following the error function down its gradient in the multidimensional parameter space. Because of noise, the error is never zero at the bottom of the surface. Instead, the search is terminated when the gradient is effectively zero for all parameters.

We have had good experience with the Davidon-Fletcher-Powell method [69], implemented in code as dfpmin [70]. Compared to other search strategies, such as simplex [71], dfpmin is very quick and efficient. As with any gradient-based method, dfpmin requires the gradients of the cost function with respect to each parameter. In some cases, the gradients can be calculated analytically [44]. When they cannot, the gradients could be approximated numerically by evaluating the cost function at two points in the parameter space that are separated by a very small distance, for each parameter. Due to potential numerical errors in the calculation, this distance must be chosen carefully, to be sure that the cost function does actually change over that small distance.
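
A numerical gradient of this kind can be sketched as a central difference, where the two evaluation points lie on either side of the current parameter value. The function name, the relative step size, and the toy quadratic cost are assumptions for illustration; this is not the gradient routine used with dfpmin.

```python
def numerical_gradient(cost, params, rel_step=1e-6):
    """Central-difference approximation of the gradient of cost at params."""
    grad = []
    for i, x in enumerate(params):
        # Scale the step with the parameter so the change is neither lost
        # to rounding nor too coarse; fall back to rel_step near zero.
        h = rel_step * max(abs(x), 1.0)
        up = list(params)
        down = list(params)
        up[i] = x + h
        down[i] = x - h
        grad.append((cost(up) - cost(down)) / (2.0 * h))
    return grad


# Toy cost with a known gradient: [2 (p0 - 1), 6 (p1 + 2)].
def toy_cost(p):
    return (p[0] - 1.0) ** 2 + 3.0 * (p[1] + 2.0) ** 2

g = numerical_gradient(toy_cost, [0.0, 0.0])
```

Each gradient costs two evaluations of the cost function per parameter, which is why analytical gradients are preferred when they are available.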

4.7 Interpreting Fitting Results

An example of fitting macroscopic currents with an ion channel kinetic mechanism is shown in Fig. 4. Although the fit is clearly very good, the best parameters found by the optimizer should be taken with a grain of salt. First, it’s possible that the parameters are not the very best, but only a local solution. Following the smell of food, our tourist will eventually find a restaurant, yet a much better one may be just around the corner. How does he find the very best place to eat, without trying them all? Indeed, finding the global minimum is a difficult problem in optimization [72]. A poor man’s global search strategy is simply to restart the optimizer at different points in the parameter space and take the overall best. While there is no theoretical guarantee, the more restarting points tried, the more likely it is that the solution is truly global. Alternatively, one could use algorithms specifically designed for global search, which are slower but more exhaustively explore the parameter space and can even search across model topologies [73, 74].
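
The restart strategy can be illustrated with a toy one-parameter error surface that has one local and one global minimum. The cost function, the starting points, and the use of plain gradient descent as the local optimizer are all assumptions chosen to keep the sketch short; in practice the local search would be the optimizer used for the fit.

```python
def cost(x):
    # Toy error surface: local minimum near x = 0.96, global minimum near x = -1.04.
    return (x * x - 1.0) ** 2 + 0.3 * x

def descend(f, x0, step=0.01, h=1e-6, iterations=5000):
    """Plain gradient descent with a central-difference gradient."""
    x = x0
    for _ in range(iterations):
        grad = (f(x + h) - f(x - h)) / (2.0 * h)
        x -= step * grad
    return x


# Restart the local search from several points and keep the overall best.
starts = [-2.0, -0.5, 0.5, 2.0]
solutions = [descend(cost, x0) for x0 in starts]
best_x = min(solutions, key=cost)
```

Searches started at positive values settle in the local minimum, while those started at negative values reach the lower one. Only the comparison across restarts reveals which solution is better.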

Fig. 4

Fitting the data. This is an example where macroscopic data obtained with several voltage-clamp protocols, as discussed in Fig. 2, were pooled together and fitted with the kinetic mechanism shown in Fig. 3, using a computational algorithm that minimized the sum of squared errors. The response of the model to the same voltage-clamp protocols as used to record the data is represented by the red trace, which corresponds to the best parameters found by the optimizer. Adapted from [54]

A second issue to consider is the theoretical (a priori) identifiability of the model, which has two related aspects [75]. First, there may be distinct model topologies that explain the data equally well [76, 77]. Second, for a given model structure, there may be multiple parameter sets that are equally good [44], either as a continuum or as discrete points in the parameter space. As a result, even when the optimizer finds the global best, there may be other solutions that are just as good (although the corresponding physical models may not be equally plausible). The theoretical identifiability of a model does not depend on the quality of experimental data. Instead, it depends on the voltage-clamp protocols that were included in the analysis and on the mathematical criteria that were used to calculate the goodness of fit. Essentially, both of these narrow the range of equivalent solutions in the parameter space, which can lead to unique solutions. For example, if we tried to fit a stationary macroscopic current using the sum of squared errors as the goodness of fit, there would be a large continuum of solutions that could all explain the data equally well. However, if we then switched to a maximum likelihood method, the range of equivalent solutions in the parameter space would be narrower, because a solution must explain not only the mean value of the current but also the properties of fluctuations. If we further add a voltage step protocol, the range of equivalent parameters would narrow even more. Finally, including prior knowledge implemented as model constraints or as penalty would further restrict parameter space and increase model identifiability.

How do we know when a parameter solution is unique? A simple empirical technique can be used to determine whether a continuum of equivalent solutions exists. Starting from the solution found by the optimizer, one parameter can be slightly perturbed, and then constrained to a constant value. The resulting parameter set should give a higher error. Then, we restart the optimizer and test whether it is able to reach the same goodness of fit as before, by adjusting the other parameters to compensate for the change in the one that was constrained. If the optimizer returns to the same goodness of fit, though with a different set of parameters, we have proof that a continuum of solutions exists around that point. In this case, one could either add constraints to make the model simpler or add stimulation protocols to make the data more informative.
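
The perturb-and-refit test can be sketched with a deliberately non-identifiable toy cost that depends only on the product of two parameters. The cost function, the parameter values, and the ternary search used to refit the remaining parameter are hypothetical stand-ins for a real model and optimizer.

```python
def cost(a, b):
    # Toy cost that depends only on the product a * b, so any pair with
    # a * b = 6 fits equally well (a continuum of solutions).
    return (a * b - 6.0) ** 2

def minimize_over_b(a_fixed, lo=0.0, hi=10.0, iterations=200):
    """Refit b with a held constant, using a ternary search on a unimodal cost."""
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if cost(a_fixed, m1) < cost(a_fixed, m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


a_best, b_best = 2.0, 3.0                  # solution found by the optimizer
s_best = cost(a_best, b_best)

a_perturbed = 1.1 * a_best                 # perturb one parameter and fix it
s_perturbed = cost(a_perturbed, b_best)    # the error increases

b_refit = minimize_over_b(a_perturbed)     # let the other parameter compensate
s_refit = cost(a_perturbed, b_refit)       # the error returns to its minimum
```

Here the refit recovers the original goodness of fit with a different value of b, which is the signature of a continuum of equivalent solutions.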

Another important issue is the practical (a posteriori) identifiability of the model, which depends on the properties of the data used in analysis [78]. Each time a new data set is recorded and analyzed, the parameter estimates and the cost function will be different. Because of stochastic fluctuations and experimental noise and artifacts, these quantities are not deterministic, but vary from trial to trial. Moreover, even a good model would never give a perfect fit. The variance of a parameter estimate depends on how much information is contained in the data about that parameter, relative to noise and fluctuation levels. A parameter that has large variance can take a broad range of values without changing the fit significantly, and thus cannot be trusted much. Ultimately, it comes down to how many times the channel visits a state, and how long-lived that state is. For states that are rarely visited or are very long-lived, there will be little information in the data about the transitions connecting that state to others. For example, in the limit case of a single-channel recording, a state simply cannot be identified if it is never visited in a data set. In general, certain features of the model that are theoretically observable may, in practice, be buried in the noise and hidden or distorted by artifacts. To reduce parameter variance, one can include more data in the analysis, but can also optimize the voltage-clamp protocols. The variance in the cost function is a factor to consider when different models are tested. If two models are comparable, their cost function probability distributions may overlap, which means that, statistically, the wrong model will sometimes give a better fit than the correct model [38, 79].

What if the best fit is still bad? This could be an indication that the topology of the model is wrong or that the optimization starting point was not appropriate and the optimizer has been trapped in a local minimum. As a solution, the optimization can be restarted with different initial values. If this doesn’t improve the fit, one could try different model topologies. One should be aware, however, that when the data included in the fit are collected with multiple voltage-clamp protocols, it becomes more difficult for the optimizer to explain all the data collectively. To improve the fit, there are simple changes that can be made to a model, such as inserting a state or connecting or disconnecting two states. Parameter estimates should always be inspected. For example, if the optimizer estimates a rate constant at one million per second, that could indicate something wrong with the model or with the data. Or, if rate constants that lead to an end state are very small while the opposite rates are very large, this could indicate a lack of evidence in the data for that state. Another example is when the rate constants connecting two states have very large values, which could indicate that the two states are in fact just one.

5 Testing Models in Live Neurons

As shown in the previous section, it is possible to find an ion channel model that explains voltage-clamp data well and gives insight into the biophysical mechanism of the channel. However, in an excitable cell, there are many ion channel types that work together to generate specific patterns of firing activity. A cell is a complex system where multiple components interact nonlinearly [80]. In contrast, voltage-clamp experiments isolate the channel of interest from this system and test it with predefined voltage waveforms. It is quite possible that some features of the kinetic mechanism that are critically important to the function of the cell may not be revealed in the voltage-clamp data and may not be captured by the model. Ideally, the model should also be tested functionally, in a cellular context.

A powerful tool for studying the function of voltage-gated ion channels in live neurons is dynamic clamp [81–83]. The principle is to pharmacologically block the channel of interest and then functionally replace it with an injected current, dynamically calculated on the basis of a kinetic model [18]. As a first-order approximation, where we ignore the potential regulatory function of the permeant ions, Ca2+ in particular [84], the neuron makes no distinction between the native current and the model-based current, which are not necessarily carried by the same ions. Then, if the model is accurate, the neuron would exhibit the same firing pattern as with the actual channel. The sensitivity of the firing pattern to channel properties and the contribution of that particular current to spiking can be easily studied by varying the properties of the model and manipulating the model-based current in real time. The major advantage of this hybrid experimental-computational approach is that a channel can be investigated within a live cell without any knowledge about other conductances or cell properties.

5.1 Solving the Model in Real Time

Dynamic clamp can be understood within the context of a cellular model. To make it easier to explain the concepts, we make several simplifying assumptions: (1) besides voltage-independent leak channels, the neuron contains only Nav and Kv channels, with kinetic mechanisms described by Markov models; (2) the neuron has a single compartment and the membrane is isopotential; and (3) the model corresponds to an ideal whole-cell recording (zero access resistance, no pipette capacitance, etc.) [17]. The state of the model as a function of time is completely described by three variables: the membrane potential V, and the state occupancies of the Nav and Kv channels, \( {\mathbf{P}}_{\mathrm{Na}} \) and \( {\mathbf{P}}_{\mathrm{K}} \). These state variables evolve in time according to the following ordinary differential equations:

$$ C\frac{\mathrm{d}V}{\mathrm{d}t}=-\left({I}_{\mathrm{Na}}+{I}_{\mathrm{K}}+{I}_{\mathrm{leak}}\right)+{I}_{\mathrm{app}}, $$
(30)
$$ \frac{\mathrm{d}{\mathbf{P}}_{\mathrm{K}}}{\mathrm{d}t}={\mathbf{P}}_{\mathrm{K}}\times {\mathbf{Q}}_{\mathrm{K}}, $$
(31)
$$ \frac{\mathrm{d}{\mathbf{P}}_{\mathrm{Na}}}{\mathrm{d}t}={\mathbf{P}}_{\mathrm{Na}}\times {\mathbf{Q}}_{\mathrm{Na}}, $$
(32)

where \( {I}_{\mathrm{app}} \) is the current injected into the neuron through the patch-clamp pipette. To run a computer simulation of our model neuron, we would have to integrate these equations with an ODE solver. A real neuron “integrates” a similar set of differential equations, just more complex, to account for multiple cellular compartments, ion channel stochasticity, etc.

In voltage clamp, \( {I}_{\mathrm{app}} \) is in principle equal to the sum of all ionic currents, so as to keep V equal to the command voltage and dV/dt equal to a predefined value (e.g., zero for a constant voltage step, or a finite value for a voltage ramp). In this sense, \( {I}_{\mathrm{app}} \) becomes a measure of the total ionic currents active in the cell. In current clamp, \( {I}_{\mathrm{app}} \) is typically used to test the firing properties of a neuron under a range of conditions. For example, \( {I}_{\mathrm{app}} \) can be a constant value to bias the membrane potential or it can be a predefined waveform that mimics excitatory or inhibitory synaptic input. In dynamic clamp, \( {I}_{\mathrm{app}} \) is not predefined. Instead, \( {I}_{\mathrm{app}} \) is calculated in real time, as a function of the membrane potential V and some other quantities.

How can dynamic clamp be used to test a voltage-gated ion channel model in a live neuron? Let’s consider the case of Nav channels. First, we pharmacologically block the channel, with TTX in this case. As the Nav current was eliminated, the equations “integrated” by the cell simplify to just two:

$$ C\frac{\mathrm{d}V}{\mathrm{d}t}=-\left({I}_{\mathrm{K}}+{I}_{\mathrm{leak}}\right)+{I}_{\mathrm{app}}, $$
(33)
$$ \frac{\mathrm{d}{\mathbf{P}}_{\mathrm{K}}}{\mathrm{d}t}={\mathbf{P}}_{\mathrm{K}}\times {\mathbf{Q}}_{\mathrm{K}}. $$
(34)

Next, we replace the blocked current with a current generated by a Nav model that is solved on the computer. Effectively, we now have a hybrid biological-computational model that has the same set of ODEs, but two equations are “integrated” by the cell, and one is integrated on the computer:

$$ C\frac{\mathrm{d}V}{\mathrm{d}t}=-\left({I}_{\mathrm{K}}+{I}_{\mathrm{leak}}\right)+{I}_{\mathrm{app}},\kern.5em \text{``integrated'' by the neuron} $$
(35)
$$ \frac{\mathrm{d}{\mathbf{P}}_{\mathrm{K}}}{\mathrm{d}t}={\mathbf{P}}_{\mathrm{K}}\times {\mathbf{Q}}_{\mathrm{K}},\kern.5em \text{``integrated'' by the neuron} $$
(36)
$$ \frac{\mathrm{d}{\mathbf{P}}_{\mathrm{Na}}}{\mathrm{d}t}={\mathbf{P}}_{\mathrm{Na}}\times {\mathbf{Q}}_{\mathrm{Na}},\kern.5em \text{integrated on the computer} $$
(37)

where the injected current, \( {I}_{\mathrm{app}} \), is now equal to the negative of the current generated by the Nav model, \( -{I}_{\mathrm{Na}}^{\mathrm{C}} \), plus a constant component, \( {I}_{\mathrm{inj}} \), that can be used to apply current steps or for other functions:

$$ {I}_{\mathrm{app}}=-{I}_{\mathrm{Na}}^{\mathrm{C}}+{I}_{\mathrm{inj}}. $$
(38)

The channel model is solved on the computer over discrete time steps, using the recursive method:

$$ {\mathbf{P}}_{\mathrm{Na},\ t+\delta t}={\mathbf{P}}_{\mathrm{Na},t}\times {\mathbf{A}}_{\mathrm{Na},\mathrm{V}}, $$
(39)

where \( {\mathbf{A}}_{\mathrm{Na},\mathrm{V}} \) is the transition probability matrix calculated for a given membrane potential V. \( {\mathbf{A}}_{\mathrm{Na},\mathrm{V}} \) can be precalculated over a voltage range (e.g., −100 to +100 mV, every 0.1 mV) and stored in a look-up table. As illustrated in Fig. 5, the model is solved in a real-time computational loop, where every iteration corresponds to one integration step. For each iteration, V is read from the amplifier through the digital acquisition card (DAQ). Depending on V, the appropriate A matrix is selected from the look-up table and used to update \( {\mathbf{P}}_{\mathrm{Na}} \). Then, \( {I}_{\mathrm{Na}}^{\mathrm{C}} \) is recalculated and injected into the neuron via \( {I}_{\mathrm{app}} \). This loop must execute fast enough so that the voltage at the membrane does not change significantly within one iteration, which would invalidate both the selected \( {\mathbf{A}}_{\mathrm{Na},\mathrm{V}} \) and the injected current \( {I}_{\mathrm{Na}}^{\mathrm{C}} \). The update rate should be matched to the fastest voltage change, which typically occurs during the rising phase of the action potential. An update every 20 μs (50 kHz), or faster, is generally adequate. Once an update interval is chosen, every iteration of the loop should be completed precisely within that time. To ensure predictable time steps with minimum variability, the code should run with real-time priority on the computer.
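The loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in QuB: the three-state C ↔ O ↔ I scheme, its rate constants, the channel count, and the mock passive cell that stands in for the neuron and the DAQ are all hypothetical, and the matrix exponential is computed here with a simple scaling-and-squaring Taylor series. Only the structure (precalculated look-up table, read V, select A, update the occupancies, recalculate and inject the current) follows the text.

```python
import numpy as np

DT_MS = 0.02                                   # 20 us update interval (50 kHz)
V_GRID_MV = np.linspace(-100.0, 100.0, 2001)   # -100 to +100 mV, every 0.1 mV
OPEN_STATE = 1                                 # toy model states: C, O, I


def q_matrix(v_mV):
    """Rate matrix (per ms) of a hypothetical 3-state C <-> O <-> I model."""
    k_co = 2.0 * np.exp(v_mV / 20.0)
    k_oc = 0.5 * np.exp(-v_mV / 20.0)
    k_oi = 1.0
    k_io = 0.01 * np.exp(-v_mV / 30.0)
    Q = np.array([[0.0, k_co, 0.0],
                  [k_oc, 0.0, k_oi],
                  [0.0, k_io, 0.0]])
    np.fill_diagonal(Q, -Q.sum(axis=1))        # rows sum to zero
    return Q


def expm(M, terms=20):
    """Matrix exponential: scaling and squaring with a truncated Taylor series."""
    norm = np.linalg.norm(M, ord=np.inf)
    squarings = max(0, int(np.ceil(np.log2(norm))) + 1) if norm > 0 else 0
    X = M / (2.0 ** squarings)
    E = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms + 1):
        term = term @ X / k
        E = E + term
    for _ in range(squarings):
        E = E @ E
    return E


def build_lookup_table(dt_ms=DT_MS):
    """Precalculate A = exp(Q * dt) for every voltage on the grid."""
    return np.array([expm(q_matrix(v) * dt_ms) for v in V_GRID_MV])


def lookup(table, v_mV):
    """Select the A matrix for the grid voltage nearest to v_mV."""
    idx = int(round((v_mV - V_GRID_MV[0]) / (V_GRID_MV[1] - V_GRID_MV[0])))
    return table[min(max(idx, 0), len(table) - 1)]


def simulate(table, n_steps, i_inj_pA=30.0):
    """Run the loop against a mock passive cell standing in for the neuron."""
    c_pF, g_leak_nS, e_leak_mV = 20.0, 1.0, -70.0      # mock cell (hypothetical)
    n_channels, gamma_nS, e_na_mV = 5000, 0.01, 50.0   # current scaling (hypothetical)
    v = e_leak_mV
    p = np.array([1.0, 0.0, 0.0])                      # all channels closed
    for _ in range(n_steps):
        # 1. "Read" V (in a real setup, from the DAQ analog input).
        # 2. Advance the state occupancies with the precalculated matrix (Eq. 39).
        p = p @ lookup(table, v)
        # 3. Recalculate the model current and the applied current (Eq. 38).
        i_na = n_channels * gamma_nS * p[OPEN_STATE] * (v - e_na_mV)
        i_app = -i_na + i_inj_pA
        # 4. "Inject" I_app; here the mock cell is advanced by one Euler step.
        v += DT_MS * (-(g_leak_nS * (v - e_leak_mV)) + i_app) / c_pF
    return v, p
```

Building the table is the slow part and is done once, before the loop starts; each iteration then costs a single vector–matrix multiplication. In a real experiment, steps 1 and 4 would be replaced by hardware reads and writes, and the loop would be paced by a real-time clock.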

Fig. 5 Testing ion channel models in live neurons with dynamic clamp. As illustrated here, a Nav current is blocked with TTX and replaced with a model-based current, which is calculated in real time on the basis of a model and injected into the neuron via the patch-clamp pipette [18]

5.2 Equipment and Software

While dynamic clamp can be performed under a variety of electrophysiology paradigms, we focus here on whole-cell patch-clamp experiments in neurons [17]. Thus, in addition to the equipment and software used for patch clamp, one also needs a digital acquisition card, a dedicated computer, and specialized software for real-time computation. Ideally, the patch-clamp amplifier should feature “true” current clamp. We had good results with HEKA’s EPC 9 and 10 amplifiers, as well as with Molecular Devices’ Multiclamp 700B. The EPC 10 is more convenient, because it allows summation of external and internal current in current-clamp mode. In contrast, the Multiclamp amplifier has only one input connection for applied current, which is normally used by the external DigiData digitizer. A solution in this case is to use an electronic summation circuit or a mechanical switch box. Although we don’t have first-hand experience with other instruments, there are several commercially available patch-clamp amplifiers that feature true current clamp, e.g., those made by A-M Systems, NPI, and Warner Instruments.

Although some patch-clamp amplifiers already include an internal (EPC 10) or external (Multiclamp 700B) digitizer, these are not necessarily optimized for real-time feedback acquisition, where, in a very short time (tens of microseconds), a sample is read from the analog input, processed on the CPU, and another sample is written to the analog output. With a few exceptions (e.g., the hardware-based dynamic-clamp device commercialized by Cambridge Electronic Design), all dynamic-clamp applications use digitizers made by National Instruments. At the time of writing, we recommend the NI PCIe-6351 or NI PCIe-6361 (slightly faster) boards, which have been optimized for very low latency. One should be aware that the manufacturer typically specifies the maximum rate for buffered acquisition, not for real-time applications. Transferring a single sample across the PCIe bus has a finite latency that dramatically limits the throughput rate in real-time acquisition. For example, the maximum rate that we could obtain with an NI PCIe-6361 card is ~220,000 acquisition cycles per second, even though the board can acquire two million samples per second in buffered mode. Nevertheless, a throughput rate like this is excellent, being comparable with the bandwidth of the patch-clamp amplifier in current clamp.
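As a rough illustration of what this throughput means for the real-time loop, the quoted rate can be converted into a time per acquisition cycle and compared with the 20 μs update interval of a 50 kHz loop. The comparison is simple arithmetic on the numbers given in the text, not an additional measurement:

$$ {t}_{\mathrm{cycle}}\approx \frac{1}{220{,}000\ {\mathrm{s}}^{-1}}\approx 4.5\ \mu \mathrm{s},\kern2em \frac{4.5\ \mu \mathrm{s}}{20\ \mu \mathrm{s}}\approx 0.23. $$

Under this reading, one read-write cycle would take up roughly a quarter of each 20 μs iteration, leaving about 15 μs for the model calculations.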

Historically, the first dynamic-clamp programs were coded in some flavor of real-time Linux [85–87]. At the time, obtaining acceptable real-time performance under the MS Windows operating system (or any other non-real-time OS) was simply not possible. On such a system, user programs can be interrupted at random by other programs or by the operating system itself. Another limitation at the time was the driver provided by National Instruments for programming their boards, which was far too slow for real-time applications (~1,000 cycles per second, according to our tests). However, the situation has completely changed over the last ten years, with the development by National Instruments of new digitizers and optimized drivers, with the advent of multicore processors and an improved PCIe bus, and with the general increase in CPU speeds. Today, dynamic-clamp software can be run in MS Windows with excellent real-time performance, on par with what is achieved under real-time Linux.

Dynamic-clamp programs are available for both real-time Linux [88] and MS Windows [18, 89–91]. For the more biophysically inclined user, we recommend our own implementation of dynamic clamp in the QuB software, which runs under MS Windows (http://milesculabs.biology.missouri.edu/QuB). The major advantages are integration with a variety of ion channel modeling algorithms, a powerful scripting language for customized models and protocols, and sophisticated methods for solving Markov models of ion channels, deterministically or stochastically. The only method that can be used to solve large Markov models accurately is the matrix method described in this chapter, which is available in our software. Once a few quantities are precalculated and stored in look-up tables, very large Markov models can be solved using only vector–matrix multiplications, which can be executed very quickly on modern CPUs or on graphics processors (GPUs). For example, we were able to run models with as many as 26 states at 50 kHz or faster [18]. The matrix method is also very stable and accurate, even with long sampling intervals, which is generally not the case with methods that rely exclusively on ODE solvers to advance the state probabilities. In particular, integration with the Euler method, which is practically the only one that is fast enough for real-time applications, is bound to fail with even small Markov models [18].
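The difference in stability between the two approaches can be illustrated with a toy example. The sketch below is an assumption-laden illustration rather than the analysis in [18]: it uses a hypothetical two-state C ↔ O model with fast rate constants chosen so that the product of the step size and the summed rates exceeds 2 (here it is 3), which is where the forward Euler update of a two-state model becomes unstable. The transition probability matrix is computed by eigenvalue decomposition of Q, one possible way of evaluating the matrix exponential.

```python
import numpy as np

DT_MS = 0.02                      # 20 us step (50 kHz)
K_CO, K_OC = 100.0, 50.0          # hypothetical rate constants, per ms

# Row-vector convention, dP/dt = P x Q, with rows of Q summing to zero.
Q = np.array([[-K_CO, K_CO],
              [K_OC, -K_OC]])


def euler_step_matrix(Q, dt):
    """One forward-Euler step: P(t + dt) ~ P(t) x (I + Q * dt)."""
    return np.eye(Q.shape[0]) + Q * dt


def transition_matrix(Q, dt):
    """A = exp(Q * dt), computed by spectral (eigenvalue) decomposition."""
    eigvals, eigvecs = np.linalg.eig(Q)
    A = (eigvecs * np.exp(eigvals * dt)) @ np.linalg.inv(eigvecs)
    return A.real


def propagate(step, n_steps):
    """Apply the same one-step matrix n_steps times, starting in state C."""
    p = np.array([1.0, 0.0])
    for _ in range(n_steps):
        p = p @ step
    return p


p_euler = propagate(euler_step_matrix(Q, DT_MS), 50)
p_matrix = propagate(transition_matrix(Q, DT_MS), 50)
p_equilibrium = np.array([K_OC, K_CO]) / (K_CO + K_OC)

print("Euler :", p_euler)         # oscillates with growing amplitude
print("Matrix:", p_matrix)        # settles at the equilibrium occupancies
print("Exact :", p_equilibrium)
```

With these values, the Euler update already produces a negative occupancy after the first step and then diverges, whereas the matrix update remains a valid probability vector for any step size.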

In principle, any desktop computer can be used for dynamic clamp. However, for high-performance applications (large models and high-throughput rates), we recommend a fast computer that is used exclusively to run the dynamic-clamp engine. We had the best results with multicore Intel Xeon CPUs, installed in dual-processor server-grade systems. For example, the computer that we currently use in the lab has two Intel Xeon E5-2667 v2 8-core processors, clocking at 3.3 GHz, and runs Windows 7 Pro 64-bit. Our system was built by ASL, Inc., but many other computer integrators sell configurable systems. Of all the components, the most critical are the CPU and the motherboard.

5.3 Preparing and Running a Dynamic-Clamp Experiment

Setting up a dynamic-clamp experiment involves a few steps. First, the voltage monitor output of the patch-clamp amplifier should be connected to one of the analog inputs of the National Instruments digitizer, while the external current input of the amplifier should be connected to one of the analog outputs of the digitizer. Next, one needs to zero the offsets and calibrate the scaling factors between the amplifier and digitizer, for both input and output. The calibration procedure depends on the specific dynamic-clamp software, but the idea is to make sure that the membrane voltage value read into the dynamic-clamp software is the same as the value reported by the patch-clamp amplifier. Likewise, the external current reported by the amplifier should match the current sent by the dynamic-clamp software. In our experience, there are always slight offsets of a few mV in membrane potential and a few pA in injected current, between the amplifier and the digitizer. These offsets must be compensated for in the software. In particular, one should be careful that the amplifier receives no unwanted external current when \( {I}_{\mathrm{app}} \) is equal to zero, as even a small current of a few pA can alter the firing pattern of a neuron. Most amplifiers have adjustable gain in current clamp (e.g., 1 pA/mV). The smallest gain should be selected that still allows the injection of the largest current that might be predicted by the model. For example, a model-based Nav current ranges from a few pA in the interspike interval, small but sufficient to influence neuronal firing properties, to several nA during an action potential.
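The scaling and offset corrections described above amount to two linear conversions, sketched below. All names and numerical values are hypothetical: the voltage monitor scaling (10 mV per mV) and the ±10 V analog output range are assumptions that must be checked against the actual amplifier and digitizer, and the offsets are placeholders for values measured during calibration. Only the 1 pA/mV current-clamp gain is taken from the example in the text.

```python
VM_MONITOR_V_PER_MV = 0.010      # assumed: voltage monitor outputs 10 mV per mV
COMMAND_PA_PER_MV = 1.0          # current-clamp gain from the text's example
DAC_RANGE_V = 10.0               # assumed: analog output spans +/-10 V


def membrane_potential_mV(adc_volts, v_offset_mV=0.0):
    """Convert a digitizer reading to membrane potential, removing the offset.

    v_offset_mV is what the software reads when the amplifier reports 0 mV.
    """
    return adc_volts / VM_MONITOR_V_PER_MV - v_offset_mV


def command_volts(i_app_pA, i_offset_pA=0.0):
    """Convert a desired current into the analog output voltage.

    i_offset_pA is the current the amplifier reports when the software sends 0 pA.
    """
    volts = (i_app_pA - i_offset_pA) / (COMMAND_PA_PER_MV * 1000.0)
    if abs(volts) > DAC_RANGE_V:
        raise ValueError("current exceeds the range available at this gain")
    return volts


def max_injectable_pA():
    """Largest current that the selected gain allows."""
    return DAC_RANGE_V * 1000.0 * COMMAND_PA_PER_MV
```

Under these assumed settings, a gain of 1 pA/mV allows up to ±10 nA, which would accommodate a Nav current of several nA; a smaller gain would give finer current resolution but might clip the current during the action potential.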

The pipette capacitance should be reduced as much as possible by coating with Sylgard or other agents. In our experience, the residual capacitance should be no more than 5–6 pF; otherwise, ringing may occur in dynamic clamp when large currents are injected, particularly with Nav currents during action potentials. Once a patch is obtained, the typical artifact estimation and compensation procedures should be applied for series resistance and pipette capacitance, as well as for membrane capacitance. Then, upon switching to current clamp, the pipette capacitance compensation should be slightly reduced (by 10–20%) to avoid ringing, while series resistance should be compensated 100%.

An example of a dynamic-clamp experiment is shown in Fig. 6, adapted from a previous study on serotonergic Raphé neurons in the brainstem [54]. In that study, the voltage-gated sodium current was investigated with voltage clamp to determine its kinetic mechanism and tested with dynamic clamp to determine its contribution to the neuronal firing. These neurons exhibit a pattern of spontaneous spiking with low frequency (1–5 Hz) and broad action potentials (4–5 ms), as shown in Fig. 6a, top traces. Other properties that are visible during a current step injection are a slow reduction in spiking frequency, a broadening of the action potential shape, and a reduction in spike height. A slow kinetic process was also observed in the voltage-clamp data [54, 92–94], which has been explained by adding an inactivated state to the Kuo and Bean Nav model, as shown in Fig. 7. After blocking Nav channels with TTX and injecting the current generated by this model into the cell, a firing pattern was obtained that exhibited the same spike frequency and shape as the control, maintained under a range of injected current values (Fig. 6a, b, lower traces). For this experiment, a unitary conductance of 10 pS was used and the total number of Nav channels, \( {N}_{\mathrm{C}} \), was chosen so as to match the maximum slope of the action potential rising phase between control and dynamic-clamp experiments. In most of the examined cells, \( {N}_{\mathrm{C}} \) was ≈ 20,000.
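To give a sense of scale, these numbers can be combined under the assumption, not stated explicitly in this section, that the model-based current has the usual ohmic form \( {I}_{\mathrm{Na}}^{\mathrm{C}}={N}_{\mathrm{C}}\,\gamma\,{P}_{\mathrm{O}}\left(V-{E}_{\mathrm{Na}}\right) \), where γ is the unitary conductance, \( {P}_{\mathrm{O}} \) is the open-state occupancy, and \( {E}_{\mathrm{Na}} \) is the reversal potential (symbols introduced here only for illustration). The maximum conductance is then:

$$ {N}_{\mathrm{C}}\,\gamma =20{,}000\times 10\ \mathrm{pS}=200\ \mathrm{nS}. $$

With hypothetical values of \( {P}_{\mathrm{O}}=0.1 \) and \( V-{E}_{\mathrm{Na}}=-60\ \mathrm{mV} \), this would give 200 nS × 0.1 × (−60 mV) = −1.2 nA, a current in the nA range.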

Fig. 6 Validating the kinetic mechanism in a live neuron. The figure compares the spiking patterns of a neuron under control conditions and with the Nav current blocked with TTX and replaced with a model-based current injected with dynamic clamp. A Nav model based on the kinetic mechanism shown in Fig. 7 was able to explain well the firing properties, including the slow adaptation in action potential shape and frequency (a and b, lower traces, and c, upper traces). In contrast, the Nav model shown in Fig. 3 could not reproduce the slow adaptation (c, lower traces). Adapted from [54]

Fig. 7 A Nav channel kinetic mechanism obtained from voltage-clamp data and validated in live neurons. An inactivated state (I13) with special properties was added to the Kuo and Bean model [62] to explain voltage-clamp data and neuronal firing behavior [54]

The model with an additional inactivated state also reproduced the slow adaptation in frequency and action potential shape. A logical follow-up question is whether the adaptation in firing properties is caused by the slow inactivation detected within the kinetic mechanism. If true, then a dynamic-clamp-injected current based on a model that lacks slow inactivation should not result in adaptation. Indeed, as shown in Fig. 6c, this model generates a spiking pattern that clearly differs from the control: the spike height remains constant and only the falling phase of the action potential is slowed down. One should remember that the slope of the rising phase is proportional to the total number of Nav channels that are available to generate current, whereas the falling phase depends mostly on Kv channels. The explanation for the observed spike shape adaptation is that more and more Nav channels enter the slow inactivated state after each action potential, leaving progressively fewer channels available to contribute current to the rising phase.

Clearly, in this case the kinetic model obtained from voltage-clamp data was well validated by the dynamic-clamp experiment. However, the keen observer will have noticed a small but important difference: the action potential starts more abruptly in control than with the model (Fig. 6a2, b2). In fact, this is not a shortcoming of the kinetic model but a technical limitation of the brain slice preparation, where neurons maintain their processes intact. Although Nav channels are distributed throughout the cell in the soma, dendrites, and axon [95, 96], the axonal initial segment [97] has special properties that make it the site of action potential initiation [98–100]. From the axonal initial segment, the action potential backpropagates to the soma, causing the abrupt onset [101–104]. In contrast, the model-based current is injected strictly in the soma, which becomes the site of action potential initiation. This configuration resembles the case of dissociated neurons that have lost their axon and exhibit similarly smoothly rising action potentials. Nevertheless, this abrupt action potential onset can be reproduced in a dynamic-clamp experiment by adding a virtual axonal compartment that generates its own spike and thus contributes additional current to the rising phase of the somatic action potential [18]. Overall, it is quite remarkable that basic spiking properties are reproduced so well by a soma-injected model-based current.