12.1 Introduction

Since the first simulations of elastic collisions between rigid spheres, the molecular dynamics (MD) simulation technique has developed greatly as a means of obtaining atomic-level information. It has become a critical component of the widely used computational tool set and has been applied to both materials science and biological systems, such as proteins, nucleic acids, and lipid membranes. MD simulations not only allow experimental findings to be examined at the atomic level, which can be used to test new hypotheses, but also provide data that cannot be obtained from experiments, such as the pressure profile of membranes [1]. Cells are normally surrounded and protected by plasma membranes, which consist of different types of lipids, proteins, and carbohydrates. In the last decade, computer simulation has opened new ways to study bilayers at the atomic level, yielding a detailed picture of the structure and dynamics of membranes and membrane proteins [2,3,4].

Membranes serve many critical biological functions, such as forming barriers between intracellular and extracellular environments, regulating the transport of substances [5], detecting and transmitting electrical and chemical signals through protein receptors [6], and mediating communication between cells [7]. In addition, genes encoding membrane proteins have been found to make up approximately one-third of the human genome [8], and over half of these proteins are known drug targets. Thus, the biological functions of membrane proteins have become an important focus of fundamental research. Unfortunately, even with advanced experimental techniques, it is still difficult to obtain sufficiently detailed protein structures at the molecular or even atomic level, let alone the relationship between structure and function. To address this problem, MD simulation is now accepted as an indispensable tool for obtaining structural and dynamical information not available from experiments.

In principle, all details of molecular structures and interactions can be described from first principles using quantum mechanics. Unfortunately, most problems involving membrane proteins cannot be handled by quantum mechanics because of its high computational cost. Hence, MD simulation applies molecular force fields, which are based on potential energy descriptions at the atomic and molecular levels, to describe the topological structures and dynamic behaviors of membrane protein molecules. Molecular force fields calculate the energies of molecules from the positions of their atoms and greatly speed up calculations compared to quantum mechanics. Thus, they can be used to study systems that contain tens of thousands of atoms. Many studies have shown that molecular force fields can help to explain physical problems. One of the main approximations of additive all-atom force fields concerns the description of electrostatic properties. Additive all-atom lipid force fields, such as AMBER [9, 10], CHARMM [11,12,13], and OPLS-AA [14], and united-atom lipid force fields, such as GROMOS [15], treat electrostatic interactions with fixed atomic charges. A fixed partial charge is placed at the nucleus of each atom to represent its electrostatic properties. In additive force fields, the charge is a parameter that can be tuned to represent, in an average way through a mean-field approximation, the atomic polarization that arises in response to the different environments a molecule might experience [16]. For lipid bilayers, the polar headgroups of lipids face the high-dielectric water environment on one side and interact with the low-dielectric hydrocarbon core on the other side [17]. The electronic polarization experienced by an embedded molecule therefore differs greatly between the exterior and the interior of the membrane [18, 19]. Roux et al. [20] investigated the ion selectivity of several membrane-bound channels and transporters. Their results indicated that, although the fundamental physical properties could be described using non-polarizable models, a more detailed understanding of the conformation-driven super-selectivity depended on improved force field models that consider explicit polarizability.

Several advances in both software and hardware for simulating membranes and membrane proteins have been made in the last decade, including easy-to-use software for setting up MD simulations, massively parallel algorithms, and GPU-accelerated computing.

In the following sections, we first introduce the technical principles of Monte Carlo (MC) and MD simulations. We then introduce applications of both coarse-grained (CG) and all-atom models. Finally, the protocol is presented.

12.2 Technical Principle

12.2.1 Monte Carlo Simulations and Molecular Dynamics Simulation

12.2.1.1 Monte Carlo Simulations

Monte Carlo simulation was the first technique used to perform computer simulations of a molecular system, and it therefore occupies a special position in the history of molecular modeling. A Monte Carlo simulation generates conformations of a system through random changes to the positions of its atoms, together with appropriate changes in orientation and conformation. More generally, Monte Carlo methods are a broad class of computational algorithms that rely on repeated random sampling to obtain numerical results. Their essential idea is to use randomness to solve problems that might be deterministic in principle. They are often used in physical and mathematical problems, which can be summarized into three distinct classes [21]: optimization, numerical integration, and generating draws from a probability distribution. From the positions of the atoms, the potential energy of each conformation and the values of other properties can be calculated. Thus, a Monte Carlo simulation samples from the 3N-dimensional space of the particle positions.

The classical expression for the partition function Q is:

$$ Q = c\iint {\text{d}}p^{N} {\text{d}}r^{N} \exp \left[ { - H\left( {r^{N} ,p^{N} } \right)/k_{\text{B}} T} \right] $$
(12.1)

where r^N denotes the coordinates of all N particles, p^N the corresponding momenta, k_B the Boltzmann constant, T the temperature, and c a constant of proportionality. H(r^N, p^N) is the Hamiltonian of the system, which depends on the 3N positions and 3N momenta of the particles in the system. It can be written as the sum of the kinetic and potential energies of the system, with m_i the mass of particle i:

$$ H\left( {r^{N} ,p^{N} } \right) = \sum\limits_{i = 1}^{N} \frac{{\left| {p_{i} } \right|^{2} }}{{2m_{i} }} + V\left( {r^{N} } \right) $$
(12.2)

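From the above two equations, the canonical partition function can be factored into two separate integrals, one over the positions and the other over the momenta. Because the momentum integral can be evaluated analytically, Monte Carlo sampling only needs to explore the positions.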

Although Monte Carlo methods vary, they tend to follow a particular pattern: (1) defining a domain of possible inputs; (2) generating inputs randomly from a probability distribution over the domain; (3) performing a deterministic computation on the inputs; and (4) aggregating the results.

As an example, consider a circle inscribed in a unit square. Given that the ratio of the area of the circle to that of the square is π/4, the value of π can be approximated using a Monte Carlo method [22]: (1) draw a square and inscribe a circle within it; (2) uniformly scatter grains of uniform size over the square; (3) count the number of grains inside the circle and the total number of grains; (4) take the ratio of the two counts as an estimate of the ratio of the two areas, which is π/4; (5) multiply the result by 4 to estimate π. In this procedure, the domain of inputs is the square that circumscribes the circle. Random inputs are generated by scattering grains over the square, and a computation is then performed on each input (testing whether it falls within the circle). Finally, the results are aggregated to obtain the final output, the approximation of π.

There are two important points to be noted here. Firstly, if the grains are not uniformly distributed, our approximation will be poor. Secondly, there should be a large number of inputs, because the approximation is usually poor if only a few grains are randomly dropped into the whole square. Generally, the approximation improves as more grains are dropped.
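
The procedure can be sketched in a few lines of Python. This is only a minimal illustration, not an implementation taken from reference [22]: the function name `estimate_pi`, the fixed random seed, and the number of points are assumptions made here, and pseudo-random points stand in for the physical grains.

```python
import random


def estimate_pi(n_points, seed=0):
    """Estimate pi by uniformly sampling points in the unit square."""
    rng = random.Random(seed)  # fixed seed (assumption) so the run is repeatable
    inside = 0
    for _ in range(n_points):
        x, y = rng.random(), rng.random()
        # Circle of radius 0.5 inscribed in the unit square, centred at (0.5, 0.5).
        if (x - 0.5) ** 2 + (y - 0.5) ** 2 <= 0.25:
            inside += 1
    # inside / n_points estimates the area ratio pi/4.
    return 4.0 * inside / n_points


print(estimate_pi(200_000))
```

With uniform sampling, the statistical error shrinks roughly as the inverse square root of the number of points, which is why a handful of grains gives a poor estimate.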

12.2.1.2 Molecular Dynamics Simulation

Molecular dynamics (MD) is a computer simulation method for studying the physical movements of atoms and molecules, and it is a type of many-body simulation. The atoms and molecules are allowed to interact for a fixed period of time, giving a view of the dynamical evolution of the system. In the most common version, the trajectories of atoms and molecules are determined by numerically solving Newton’s equations of motion for a system of interacting particles, where the forces between the particles and their potential energies are calculated using interatomic potentials or molecular mechanics force fields. The method was originally used in the field of theoretical physics in the late 1950s [23, 24], and it is now widely applied in various fields, such as chemical physics, materials science, and the modeling of biomolecules. For example, MD is frequently used to refine three-dimensional structures of proteins and other macromolecules based on experimental constraints from X-ray crystallography or NMR spectroscopy. In biophysics and structural biology, the method is used to study the motions of biological macromolecules such as proteins and nucleic acids, which is useful for interpreting the results of certain biophysical experiments and for modeling interactions between molecules.

In principle, MD can be used for ab initio prediction of protein structure by simulating the folding of the polypeptide chain from a random coil. The trajectory is obtained by solving the differential equations embodied in Newton’s second law (F = ma):

$$ \frac{{{\text{d}}^{2} x_{i} }}{{{\text{d}}t^{2} }} = \frac{{F_{x_{i} } }}{{m_{i} }} $$
(12.3)

The equation describes the motion of a particle of mass m_i along one coordinate (x_i), with F_{x_i} being the force on the particle in that direction. The force on particle i is the negative gradient of the potential energy with respect to its position:

$$ F_{i} = - \nabla_{i} U\left( {r_{1} , \ldots ,r_{N} } \right) $$
(12.4)

where U(r_1, …, r_N) is the potential energy function of the N particles, which contains the bonded and non-bonded interactions. The bonded interactions describe the covalently bound atoms in protein and lipid molecules, and the non-bonded interactions can be decomposed into four parts: the Coulomb energy between two atoms, the polarization interaction between atoms, the dispersion (van der Waals) potential, and the short-range repulsion. The Coulomb energy can be calculated as:

$$ U_{\text{Coul}} = \frac{{q_{i} q_{j} }}{{4\uppi \upvarepsilon _{0} r_{ij} }} $$
(12.5)

And the van der Waals potential is calculated based on the Lennard-Jones potential:

$$ U_{\text{LJ}} = 4\upvarepsilon\left[ {\left( {{\upsigma \mathord{\left/ {\vphantom {\upsigma r}} \right. \kern-0pt} r}} \right)^{12} - \left( {{\upsigma \mathord{\left/ {\vphantom {\upsigma r}} \right. \kern-0pt} r}} \right)^{6} } \right] $$
(12.6)

Here ε is the depth of the potential at its minimum (r = 2^{1/6}σ), and the potential vanishes at r = σ.
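
Equations (12.5) and (12.6) translate directly into code. The sketch below is illustrative only: the function names are ours, and the unit system (nm, kJ/mol, elementary charges, with the corresponding value of 1/(4πε0)) is an assumption rather than something fixed by the text.

```python
# Assumed unit system: distances in nm, energies in kJ/mol, charges in units of e.
COULOMB_CONSTANT = 138.935458  # 1/(4*pi*eps0) in kJ mol^-1 nm e^-2


def coulomb_energy(q_i, q_j, r_ij):
    """Pairwise Coulomb energy, Eq. (12.5)."""
    return COULOMB_CONSTANT * q_i * q_j / r_ij


def lennard_jones(r, epsilon, sigma):
    """Lennard-Jones energy, Eq. (12.6)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


# Illustrative parameters (not taken from any particular force field).
epsilon, sigma = 0.65, 0.32
r_min = 2.0 ** (1.0 / 6.0) * sigma
print(lennard_jones(sigma, epsilon, sigma))   # zero crossing at r = sigma
print(lennard_jones(r_min, epsilon, sigma))   # well depth, close to -epsilon
```

The two printed values check the statements above: the potential vanishes at r = σ and reaches −ε at r = 2^{1/6}σ.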

Implementation of the Coulomb and LJ terms is straightforward, but calculating the induced polarization requires iterating the polarization equations, which increases the computational cost several-fold.

Integration Algorithms. Given the positions and velocities of N particles at time t, a straightforward integration of Newton’s equations of motion yields the following at t + Δt:

$$ v_{i} \left( {t +\Delta t} \right) = v_{i} \left( t \right) + \frac{{F_{i} \left( t \right)}}{{m_{i} }}\Delta t $$
(12.7)
$$ r_{i} \left( {t +\Delta t} \right) = r_{i} \left( t \right) + v_{i} \left( t \right)\Delta t + \frac{{F_{i} \left( t \right)}}{{2m_{i} }}\Delta t^{2} $$
(12.8)

In the popular Verlet algorithm, one eliminates velocities by adding the time-reversed position at t − Δt:

$$ r_{i} \left( {t -\Delta t} \right) = r_{i} \left( t \right) - v_{i} \left( t \right)\Delta t + \frac{{F_{i} \left( t \right)}}{{2m_{i} }}\Delta t^{2} $$
(12.9)

Adding Eqs. (12.8) and (12.9) and solving for r_i(t + Δt) gives:

$$ r_{i} \left( {t + \Delta t} \right) = 2r_{i} \left( t \right) - r_{i} \left( {t - \Delta t} \right) + \frac{{F_{i} \left( t \right)}}{{m_{i} }}\Delta t^{2} $$
(12.10)

This is especially useful in situations where one is interested only in the positions of the atoms. If required, velocities can be calculated from

$$ v_{i} \left( t \right) = \frac{1}{2\Delta t}\left[ {r_{i} \left( {t + \Delta t} \right) - r_{i} \left( {t - \Delta t} \right)} \right] $$
(12.11)
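
A minimal sketch of Eqs. (12.10) and (12.11) for a single particle in one dimension is given below. The harmonic test force, the parameter values, and the Taylor-expansion start-up step are assumptions chosen for illustration; production MD codes organize this differently.

```python
def verlet_trajectory(force, mass, x0, v0, dt, n_steps):
    """Positions from the Verlet recursion, Eq. (12.10), in one dimension."""
    # The recursion needs x(t - dt); build it once from the Taylor
    # expansion of Eq. (12.9), because Verlet is not self-starting.
    x_prev = x0 - v0 * dt + 0.5 * force(x0) / mass * dt ** 2
    x = x0
    xs = [x0]
    for _ in range(n_steps):
        x_next = 2.0 * x - x_prev + force(x) / mass * dt ** 2
        x_prev, x = x, x_next
        xs.append(x)
    return xs


def central_velocity(xs, step, dt):
    """Velocity at a given step from Eq. (12.11); needs the next position too."""
    return (xs[step + 1] - xs[step - 1]) / (2.0 * dt)


# Harmonic oscillator with k = m = 1 (illustrative): x(t) = cos(t).
xs = verlet_trajectory(lambda x: -x, mass=1.0, x0=1.0, v0=0.0, dt=0.01, n_steps=628)
print(xs[-1])                           # near cos(6.28), i.e. close to 1
print(central_velocity(xs, 314, 0.01))  # near -sin(3.14), i.e. close to 0
```

Note that `central_velocity` can only be evaluated once the position at the following step is known, which is the second drawback discussed next.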

The Verlet algorithm has several drawbacks: (1) positions are obtained by adding a small quantity to large ones, which may lead to a loss of precision; (2) the velocity at time t is available only after the next time step, t + Δt; (3) it is not self-starting, i.e., at t = 0 there is no position at t − Δt. These drawbacks can be avoided in the leap-frog algorithm, where the positions and velocities are calculated at different times separated by Δt/2:

$$ v_{i} \left( {t + {{\Delta t} \mathord{\left/ {\vphantom {{\Delta t} 2}} \right. \kern-0pt} 2}} \right) = v_{i} \left( {t - {{\Delta t} \mathord{\left/ {\vphantom {{\Delta t} 2}} \right. \kern-0pt} 2}} \right) + \frac{{F_{i} \left( t \right)}}{{m_{i} }}\Delta t $$
(12.12)
$$ r_{i} \left( {t + \Delta t} \right) = r_{i} \left( t \right) + v_{i} \left( {t + {{\Delta t} \mathord{\left/ {\vphantom {{\Delta t} 2}} \right. \kern-0pt} 2}} \right)\Delta t $$
(12.13)
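
The leap-frog update of Eqs. (12.12) and (12.13) can be sketched in the same one-dimensional setting. The backward half step used to obtain the first half-step velocity is an assumption of this sketch, as are the function name and the harmonic test force.

```python
def leapfrog_trajectory(force, mass, x0, v0, dt, n_steps):
    """Leap-frog integration, Eqs. (12.12) and (12.13), in one dimension."""
    # Start-up (assumption): estimate v(-dt/2) with a backward half step.
    v_half = v0 - 0.5 * force(x0) / mass * dt
    x = x0
    xs = [x0]
    for _ in range(n_steps):
        v_half = v_half + force(x) / mass * dt  # Eq. (12.12): v(t + dt/2)
        x = x + v_half * dt                     # Eq. (12.13): r(t + dt)
        xs.append(x)
    return xs


# Harmonic oscillator with k = m = 1 (illustrative): x(t) = cos(t).
xs = leapfrog_trajectory(lambda x: -x, mass=1.0, x0=1.0, v0=0.0, dt=0.01, n_steps=628)
print(xs[-1])  # close to 1 after roughly one period
```

Each position update adds a velocity term of comparable size rather than a small second-order correction, which is the precision advantage over the Verlet form.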

The initial coordinates can be taken from the Protein Data Bank. After energy minimization, these coordinates give the atom positions at t = 0. The initial velocities are sampled from a Maxwell-Boltzmann distribution:

$$ P\left( {v_{ix} } \right) = \left( {\frac{{m_{i} }}{{2\uppi kT}}} \right)^{1/2} \exp \left[ { - \frac{{m_{i} v_{ix}^{2} }}{2kT}} \right] $$
(12.14)
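
Each Cartesian velocity component in this distribution is Gaussian with zero mean and variance kT/m_i, so sampling is straightforward. The sketch below assumes masses in g/mol and velocities in nm/ps (so that kinetic energies come out in kJ/mol); the function names and the fixed seed are illustrative choices.

```python
import math
import random

K_B = 0.0083144626  # Boltzmann constant in kJ mol^-1 K^-1 (assumed unit system)


def sample_velocities(masses, temperature, seed=0):
    """Draw one 3D velocity per particle from the Maxwell-Boltzmann distribution."""
    rng = random.Random(seed)
    velocities = []
    for m in masses:
        sigma = math.sqrt(K_B * temperature / m)  # std dev of each component
        velocities.append(tuple(rng.gauss(0.0, sigma) for _ in range(3)))
    return velocities


def instantaneous_temperature(masses, velocities):
    """Temperature from the kinetic energy, using <K> = (3/2) N k T."""
    kinetic = sum(0.5 * m * sum(c * c for c in v) for m, v in zip(masses, velocities))
    return 2.0 * kinetic / (3.0 * len(masses) * K_B)


masses = [16.0] * 20000
velocities = sample_velocities(masses, 300.0)
print(instantaneous_temperature(masses, velocities))  # close to, but not exactly, 300
```

The sampled temperature fluctuates around the target, so simulation packages commonly rescale the initial velocities and remove any net centre-of-mass motion; neither step is shown here.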

Boundaries and Ensembles. In MD simulations, the system size is so small that boundary effects must be considered. Surrounding the system with vacuum is not realistic for bulk simulations because a vacuum interface orders the surface waters, which could influence the dynamics of a biomolecule separated from the surface by only a few layers of water. The most common solution is to use periodic boundary conditions, that is, the simulation box is replicated in all directions just like in a crystal. The cube and rectangular prism are the obvious choices for a box shape, though other shapes are also possible. Application of periodic boundary conditions results in an infinite system, which, in turn, raises the question of how to calculate the long-range Coulomb interactions accurately. This problem has been resolved using Ewald summation, in which the long-range part is evaluated separately in reciprocal (Fourier) space.
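
For a rectangular box, periodic boundary conditions are commonly implemented by wrapping coordinates back into the primary box and by measuring separations to the nearest periodic image. The minimum-image convention used below is a standard companion technique that is not described in the text above, and the function names are ours; the sketch covers short-range distances only and does not address the Ewald treatment of long-range electrostatics.

```python
import math


def wrap_coordinate(x, box_length):
    """Map a coordinate back into the primary box [0, box_length)."""
    return x - box_length * math.floor(x / box_length)


def minimum_image(dx, box_length):
    """Separation along one axis to the nearest periodic image."""
    return dx - box_length * round(dx / box_length)


def periodic_distance(r_a, r_b, box):
    """Distance between two points in a rectangular periodic box."""
    return math.sqrt(
        sum(minimum_image(a - b, length) ** 2 for a, b, length in zip(r_a, r_b, box))
    )


box = (1.0, 1.0, 1.0)
# Two particles near opposite faces are close neighbours through the boundary.
print(periodic_distance((0.05, 0.0, 0.0), (0.95, 0.0, 0.0), box))
print(wrap_coordinate(-0.1, 1.0))
```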

MD simulations are typically performed in the NVE ensemble, where all three quantities (number of atoms, volume, and energy) are constant. Due to truncation errors, keeping the energy constant in long MD simulations can be problematic. To avoid this problem, the alternative NVT and NPT ensembles are employed. The temperature of the system is obtained from the average kinetic energy:

$$ \left\langle K \right\rangle = \frac{3}{2}NkT $$
(12.15)

Thus, an obvious way to keep the temperature constant at T_0 is to scale the velocities as:

$$ v_{i} \left( t \right) \to\uplambda{v}_{i} \left( t \right),\;\uplambda = \sqrt {{{T_{0} } \mathord{\left/ {\vphantom {{T_{0} } {T\left( t \right)}}} \right. \kern-0pt} {T\left( t \right)}}} $$
(12.16)

Because the kinetic energy has considerable fluctuations, this is a rather crude method. A better method, which achieves the same result more smoothly, is the Berendsen thermostat, where the atoms are weakly coupled to an external heat bath with the desired temperature T_0:

$$ m_{i} \frac{{{\text{d}}^{2} }}{{{\text{d}}t^{2} }}r_{i} = F_{i} + m_{i}\upgamma_{i} \left[ {\frac{{T_{0} }}{T\left( t \right)} - 1} \right]\frac{{{\text{d}}r_{i} }}{{{\text{d}}t}} $$
(12.17)

If T(t) > T_0, the coefficient of the coupling term is negative, which invokes a viscous force that slows the atoms, and vice versa for T(t) < T_0. Similarly, in the NPT ensemble, the pressure can be kept constant by simply scaling the volume. Again, this is very crude, and a better method is to weakly couple the pressure difference to the atoms using a force similar to the one above (Langevin piston), which maintains the pressure at the desired value of ~1 atm.
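
The two temperature-control schemes can be compared in a short sketch. The first function is the direct scaling of Eq. (12.16). The second is the discrete velocity-scaling form in which the Berendsen thermostat is usually applied; the coupling time `tau` is a parameter introduced here (it corresponds to γ = 1/(2τ) in Eq. (12.17) to first order in Δt), and the unit system and function names are assumptions.

```python
import math

K_B = 0.0083144626  # kJ mol^-1 K^-1; assumed units: masses in g/mol, velocities in nm/ps


def instantaneous_temperature(masses, velocities):
    """Temperature from the kinetic energy, Eq. (12.15)."""
    kinetic = sum(0.5 * m * sum(c * c for c in v) for m, v in zip(masses, velocities))
    return 2.0 * kinetic / (3.0 * len(masses) * K_B)


def simple_rescale_factor(t_current, t_target):
    """Scaling factor lambda of Eq. (12.16): jumps to the target in one step."""
    return math.sqrt(t_target / t_current)


def berendsen_factor(t_current, t_target, dt, tau):
    """Weak-coupling scaling factor; relaxes T towards t_target over a time ~tau."""
    return math.sqrt(1.0 + (dt / tau) * (t_target / t_current - 1.0))


def scale_velocities(velocities, lam):
    return [tuple(lam * c for c in v) for v in velocities]


# Relaxation of a system that starts too hot (illustrative numbers).
temperature = 400.0
for _ in range(1000):
    lam = berendsen_factor(temperature, 300.0, dt=0.002, tau=0.2)
    temperature *= lam ** 2  # kinetic energy scales with lambda squared
print(temperature)  # approaches 300
```

Setting `tau` equal to `dt` recovers the crude rescaling, while larger values of `tau` give the gentler relaxation described above.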

12.2.2 Additive Force Field

Computational treatment of molecular dynamics is based on interatomic forces, which can in principle be derived by solving the Schrödinger equation; approaches of this kind are categorized as quantum mechanical methods. However, their high computational cost limits their use to relatively simple systems. In 1930, Andrews [25] first proposed the basic concept of molecular force fields: a bead-spring model was applied to describe bond lengths and bond angles, and the interactions of non-bonded atoms were computed using van der Waals interaction expressions. In the 1960s, Lifson and Warshel described the consistent force field (CFF), an empirical function force field from which the modern molecular force field can be said to originate [26]. MD simulation, in contrast to quantum mechanical methods, builds an empirical function to model the potential energy of the system. The function can be constructed by estimating the intermolecular interaction energies from isolated monomer wave functions (the perturbative method) or from the energy differences between isolated monomers and the corresponding dimers (the super-molecular calculations) [27]. For different atoms, or atoms in different environments, parameter sets are introduced as the variables in the potential energy function. Because of the diversity of interatomic interactions in biological systems, as well as the complex electrostatic environments, it is challenging to build a uniform set of parameters that can model the motions of atoms in different situations.

The classical way to estimate the interatomic interactions is to treat atoms as rigid spheres with fixed charges located on the nucleus. The electron distribution among bonded atoms, which depends on electronegativity, can be represented empirically as a partial charge, either positive or negative, on each atom. Thus, each atom responds to the surrounding electrostatic environment in an average way (mean-field approximation). With this simple treatment, interatomic electrostatic energies can be estimated via Coulomb’s law, and the total electrostatic energy of the system is the sum of those pairwise energies. The complete potential energy function can then be estimated as a sum of bonded energy terms (including bond lengths, bond angles, and dihedral angles, and possibly improper dihedrals and other empirical correction terms), a van der Waals interaction term (usually described by the Lennard-Jones potential), and the electrostatic energy. Different force fields have been built for different target systems. To accurately model the properties of small organic or inorganic molecules and of materials such as metals, crystals, polymers, and nanoparticles, force fields such as CFF [28,29,30,31], MM3 [32], MMFF94 [33,34,35], UFF [36], and DREIDING [37] are used. For the dynamics of macromolecules, force fields such as AMBER [38, 39], CHARMM [40,41,42,43,44,45,46,47], GROMOS [48,49,50,51,52,53,54,55,56,57,58,59], and OPLS [60, 61] have been built.

Currently, the AMBER, OPLS, and CHARMM force fields are also used for modeling ionic liquids. In addition, the MM series force fields and CFF are suitable for systems of organic compounds. Since the 1980s, molecular force fields such as AMBER, CHARMM, OPLS, and GROMOS have had a positive impact on life science research and have promoted the development of molecular force fields targeting the life sciences.

12.2.2.1 Assisted Model Building with Energy Refinement (AMBER) Force Field

The AMBER force field is one of the earliest molecular force fields used for research on biological macromolecules, and it covers simulations of proteins, DNA, monosaccharides, and polysaccharides. In this force field, –CH2– and –CH3 groups are regarded as united atoms, and a dedicated term is used to treat hydrogen bonding interactions. Simulation results show that the AMBER force field can give reasonable molecular geometries, conformational energies, vibrational frequencies, and solvation free energies. The parameters of the AMBER force field are obtained as follows: (1) the equilibrium bond lengths and angles are taken from microwave and neutron scattering data and from molecular mechanics calculations; (2) the distortion force constants are derived from microwave, NMR, and molecular mechanics calculations; (3) the non-bonded parameters are obtained through unit cell calculations; and (4) the atomic charges are given by calculations with a local charge model and by ab initio quantum mechanics. For non-bonded interactions between the end atoms of four consecutively bonded atoms (1-4 pairs), the electrostatic interactions are scaled to 1/1.2 of their value for other atom pairs, while the van der Waals interactions are scaled to 1/2. The bond stretching and angle bending energies in the AMBER force field are calculated using the harmonic oscillator model, the dihedral angle torsion energy is described by a Fourier series, the Lennard-Jones potential is chosen to represent the van der Waals force, and the Coulomb formula is applied to estimate the electrostatic interactions. The functional form of the AMBER force field is as follows:

$$ \upvarepsilon_{ij} = \frac{{4\upvarepsilon_{ii}\upvarepsilon_{jj} }}{{\left( {\upvarepsilon_{ii}^{1/2} +\upvarepsilon_{jj}^{1/2} } \right)^{2} }} $$
(12.18)
$$ U = \mathop \sum \limits_{\text{bonds}} K_{r} \left( {r - r_{\text{eq}} } \right)^{2} + \mathop \sum \limits_{\text{angle}} K_{\uptheta} \left( {\uptheta -\uptheta_{\text{eq}} } \right)^{2} + \mathop \sum \limits_{\text{dihedral}} \frac{1}{2}U_{n} \left[ {1 + \cos \left( {n{\varphi } -\upgamma} \right)} \right] + \mathop \sum \limits_{i < j} \left[ {\frac{{A_{ij} }}{{R_{ij}^{12} }} - \frac{{B_{ij} }}{{R_{ij}^{6} }} + \frac{{q_{i} q_{j} }}{{\upvarepsilon{R}_{ij} }}} \right] + \mathop \sum \limits_{{{\text{H}} - {\text{bonds}}}} \left[ {\frac{{C_{ij} }}{{R_{ij}^{12} }} - \frac{{D_{ij} }}{{R_{ij}^{10} }}} \right] $$
(12.19)

where r, θ, φ are the bond length, angle, and dihedral angle, respectively. The fourth term represents the sum of the van der Waals and the electrostatic interactions, and the fifth term is the hydrogen bonding interactions.
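
The A/R^12 − B/R^6 form in the fourth term is equivalent to the Lennard-Jones expression of Eq. (12.6) with A = 4εσ^12 and B = 4εσ^6. The sketch below checks this numerically and evaluates the bonded terms for single degrees of freedom. Function names and parameter values are illustrative and are not taken from any AMBER parameter set.

```python
import math


def lj_to_ab(epsilon, sigma):
    """Convert (epsilon, sigma) of Eq. (12.6) to the A, B coefficients of Eq. (12.19)."""
    return 4.0 * epsilon * sigma ** 12, 4.0 * epsilon * sigma ** 6


def lj_12_6(r, a, b):
    """Van der Waals part of the fourth term of Eq. (12.19)."""
    return a / r ** 12 - b / r ** 6


def harmonic(k, x, x_eq):
    """Bond-stretching or angle-bending term, K (x - x_eq)^2 (no factor 1/2)."""
    return k * (x - x_eq) ** 2


def dihedral(u_n, n, phi, gamma):
    """One Fourier component of the torsion term, with phase gamma."""
    return 0.5 * u_n * (1.0 + math.cos(n * phi - gamma))


epsilon, sigma, r = 0.65, 0.3166, 0.35  # illustrative values
a, b = lj_to_ab(epsilon, sigma)
direct = 4.0 * epsilon * ((sigma / r) ** 12 - (sigma / r) ** 6)
print(lj_12_6(r, a, b), direct)  # the two forms agree
```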

12.2.2.2 Optimized Potentials for Liquid Simulations (OPLS) Force Field

The OPLS force field includes a united-atom model (OPLS-UA) and an all-atom model (OPLS-AA), and it is suitable for simulations of organic molecules and peptides [62]. The bond stretching and bending parameters of the OPLS force field were obtained by modifying those of the AMBER force field. This force field aims to reproduce the conformational energies of gas-phase organic molecules, the solvation free energies of pure organic liquids, and other thermodynamic properties. The OPLS force field is represented as follows:

$$ \begin{aligned}U\left( R \right) & = \mathop \sum \limits_{\text{bonds}} K_{b} \left( {b - b_{0} } \right)^{2} + \mathop \sum \limits_{\text{angle}} K_{\uptheta} \left( {\uptheta -\uptheta_{0} } \right)^{2} + \mathop \sum \limits_{\text{dihedral}} \frac{{k_{{\varphi }} }}{2}\left[ {1 + \cos \left( {n{\varphi } - {\varphi }_{0} } \right)} \right]\\ & \quad + \mathop \sum \limits_{\text{nonbond}} \left\{ {4\upvarepsilon_{ij} \left[ {\left( {\frac{{\upsigma_{ij} }}{{r_{ij} }}} \right)^{12} - \left( {\frac{{\upsigma_{ij} }}{{r_{ij} }}} \right)^{6} } \right] + \frac{{q_{i} q_{j} }}{{r_{ij} }}} \right\} \end{aligned}$$
(12.20)

12.2.2.3 Chemistry at Harvard Molecular Mechanics (CHARMM) Force Field

The CHARMM force field was developed at Harvard University, and its parameters come not only from experimental results but also from many quantum chemical calculations. This force field is mostly used to study multi-molecular systems, including small organic molecules, solutions, polymers, and biochemical molecules [63]. It can also be used to perform energy minimization, molecular dynamics (MD), and Monte Carlo (MC) simulations. The form of the CHARMM force field is as follows:

$$ U = \sum k_{b} \left( {r - r_{0} } \right)^{2} + \sum k_{\uptheta} \left( {\uptheta -\uptheta_{0} } \right)^{2} + \sum \left[ {\left| {k_{{\varphi }} } \right| - k_{{\varphi }} \cos \left( {n{\varphi }} \right)} \right] + \sum k_{\aleph } \left( {\aleph - \aleph_{0} } \right)^{2} + \mathop \sum \limits_{i,j} \frac{{q_{i} q_{j} }}{{4\uppi \upvarepsilon _{0} r_{ij} }} + \mathop \sum \limits_{i,j} \left( {\frac{{A_{ij} }}{{r_{ij}^{12} }} - \frac{{B_{ij} }}{{r_{ij}^{6} }}} \right){\text{sw}}\left( {r_{ij}^{2} ,r_{\text{on}}^{2} ,r_{\text{off}}^{2} } \right) $$
(12.21)

In the CHARMM force field, hydrogen bonding interaction energies are computed with the following expression:

$$ E = \left( {\frac{A}{{r_{\text{AD}}^{6} }} - \frac{A}{{r_{\text{AD}}^{9} }}} \right)\cos^{m} \left( {{\varphi }_{{{\text{A}} - {\text{H}} - {\text{D}}}} } \right)\cos^{n} \left( {{\varphi }_{{{\text{AA}} - {\text{H}} - {\text{D}}}} } \right){\text{sw}}\left( {r_{\text{AD}}^{2} ,r_{\text{on}}^{2} ,r_{\text{off}}^{2} } \right) \times {\text{sw}}\left[ {\cos^{2} \left( {{\varphi }_{{{\text{A}} - {\text{H}} - {\text{D}}}} } \right),\cos^{2} \left( {{\varphi }_{\text{on}} } \right),\cos^{2} \left( {{\varphi }_{\text{off}} } \right)} \right] $$
(12.22)

where sw is a switching function used to control the range of the hydrogen bonding interaction. The subscripts on and off indicate the start and end points of the switching region for the distances and angles relating to hydrogen bonds in this function.
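
The text does not give the explicit form of sw. One commonly quoted form of the CHARMM switching function, written in terms of squared quantities, is sketched below; treat the formula as an assumption of this example rather than as the definition used in Eqs. (12.21) and (12.22).

```python
def switching(x2, on2, off2):
    """Smoothly switch from 1 (x2 <= on2) to 0 (x2 >= off2).

    Arguments are squared quantities, matching sw(r^2, r_on^2, r_off^2).
    Assumes on2 < off2.
    """
    if x2 <= on2:
        return 1.0
    if x2 >= off2:
        return 0.0
    return (off2 - x2) ** 2 * (off2 + 2.0 * x2 - 3.0 * on2) / (off2 - on2) ** 3


# Illustrative switching region between r_on = 1.0 and r_off = 1.2.
r_on, r_off = 1.0, 1.2
for r in (0.8, 1.0, 1.1, 1.2, 1.5):
    print(r, switching(r * r, r_on * r_on, r_off * r_off))
```

The function equals 1 at r_on and 0 at r_off and decreases monotonically in between, so the interaction is turned off gradually instead of being truncated abruptly.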

No force field has an intrinsically correct functional form; one force field is preferred over another when it performs better for the properties of interest. According to the simulation unit selected, force fields can be divided into all-atom models, such as OPLS-AA, and united-atom models, such as OPLS-UA.

12.2.2.4 Polarizable Force Field

Although it is a well-established technique, the additive force field has intrinsic limitations. For systems involving frequent and large changes of electrostatic environment, such as the passage of small molecules or ions through lipid bilayers, or the binding of a substrate to the hydrophobic interior of an enzyme in aqueous solution, the change in the electron distribution of those molecules can hardly be reflected by the fixed charge model. Thus, extended descriptions of the electrostatic interactions have been proposed to add polarization effects to the force field.

In general, several theoretical models have been developed to treat the polarization explicitly during the MD simulations: (1) The fluctuating charge/charge equilibration model; (2) Drude oscillator model, which is used in the CHARMM Drude FF [64]; (3) Induced dipole model, which has been implemented in the development of AMOEBA force field. Basic concepts of these models as well as their strengths and weaknesses will be described below.

Fluctuating Charge Model. The fluctuating charge (FQ) model [65] treats the charges on the atoms as dynamical variables, and the topology can vary during the MD simulations. The FQ model is based on the principle of electronegativity equalization: charges can redistribute among atoms until the instantaneous electronegativities are equalized, though the overall charge on the whole molecule is maintained [66]. The charge distribution can be derived from a Taylor series expansion, to second order, of the energy required to create a charge (q_a) on an atom (a):

$$ U_{\text{ele}} = E_{a0} + \chi_{a} q_{a} + \frac{1}{2}J_{aa} q_{a}^{2} $$
(12.23)

Here E_a0 is the electrostatic energy with zero charge being created (q_a = 0), χ_a is the “Mulliken electronegativity” [67], and J_aa is the “absolute hardness” [68].

For a system of multiple atoms, the charges are again placed on the centres of the atoms, and the electrostatic interactions between atoms must also be counted and expressed by Coulomb’s law. Thus, the total electrostatic energy of a system containing N atoms can be expressed as:

$$ U_{\text{ele}} = \sum\limits_{a = 1}^{N} \left( {E_{a0} + \chi_{a} q_{a} + \frac{1}{2}J_{aa} q_{a}^{2} } \right) + \sum\limits_{a = 1}^{N} \sum\limits_{b > a} J_{ab} \left( {r_{ab} } \right)q_{a} q_{b} $$
(12.24)

Here, the Coulomb potential J_ab(r_ab) [69] between unit charges on atoms a and b separated by a distance r_ab can be written as:

$$ J_{ab} (r_{ab} ) = \frac{{\frac{1}{2}\left( {J_{aa} + J_{bb} } \right)}}{{\sqrt {1 + \frac{1}{4}\left( {J_{aa} + J_{bb} } \right)^{2} r_{ab}^{2} } }} $$
(12.25)

J_ab(r_ab) approaches r_ab^{-1} at large distances (r_ab > 2.5 Å), making this component equal to that of a traditional non-polarizable force field.
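
For a neutral diatomic molecule, minimizing Eq. (12.24) under the constraint q_a + q_b = 0 gives a closed-form charge, which makes the equalization principle easy to verify. The sketch below assumes units in which the bare Coulomb interaction between unit charges is 1/r (for example atomic units); the parameter values are arbitrary illustrations, not fitted electronegativities or hardnesses.

```python
import math


def screened_coulomb(j_aa, j_bb, r_ab):
    """Screened Coulomb interaction between unit charges, Eq. (12.25)."""
    j_avg = 0.5 * (j_aa + j_bb)
    return j_avg / math.sqrt(1.0 + j_avg ** 2 * r_ab ** 2)


def diatomic_charge(chi_a, chi_b, j_aa, j_bb, r_ab):
    """Charge q_a (with q_b = -q_a) that minimizes Eq. (12.24) for two atoms."""
    j_ab = screened_coulomb(j_aa, j_bb, r_ab)
    return (chi_b - chi_a) / (j_aa + j_bb - 2.0 * j_ab)


def electronegativities(chi_a, chi_b, j_aa, j_bb, r_ab, q_a):
    """Instantaneous electronegativities dU/dq_a and dU/dq_b at charges (q_a, -q_a)."""
    j_ab = screened_coulomb(j_aa, j_bb, r_ab)
    return chi_a + j_aa * q_a - j_ab * q_a, chi_b - j_bb * q_a + j_ab * q_a


params = dict(chi_a=0.3, chi_b=0.2, j_aa=0.5, j_bb=0.6, r_ab=2.0)  # illustrative
q_a = diatomic_charge(**params)
print(q_a)                                    # negative: atom a is more electronegative
print(electronegativities(q_a=q_a, **params)) # the two values coincide
print(screened_coulomb(0.5, 0.6, 1000.0))     # close to 1/1000 at large separation
```

At the optimal charge the two instantaneous electronegativities are equal, and at short range J_ab stays finite (it tends to the average hardness) instead of diverging like 1/r.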

In practice, the extended Lagrangian method [70] can be applied: the charge on each atom is treated as a dynamic particle with an assigned “fictitious” mass, while the positions of the atoms are propagated based on Newton’s equations of motion. The force on each charge is equal to the deviation of its own electronegativity from the average one. The Lagrangian strategy is also used in other polarizable models because of the cost of the self-consistent field calculations that would otherwise be required.

Compared with other models, one significant advantage of the FQ model is that the number of interactions to be calculated is not increased. However, there are also some drawbacks. The FQ model can produce a non-physical charge distribution between atoms with large separations, and in the limit even between infinitely separated atoms. This error also appears in large polymers, where it makes the polarization increase too fast along the polymer chain [71]. To solve this problem, some variations of the FQ model, including atom-atom charge transfer (AACT) [72] and bond-charge increment (BCI) [73, 74], have been developed, which restrict charge redistribution to directly bonded atoms. However, these approaches cannot reproduce out-of-plane polarization in planar systems, such as benzene [75]. In addition, over-polarization may result from neglecting the intrinsic reduction of polarizability in the condensed phase. For example, extreme charges on polar atoms were obtained in condensed-phase simulations using electrostatic parameters derived from gas-phase experimental results [71]. To compensate for the over-polarization effect, the hardness and electronegativity can be scaled or treated as functions of the atomic charge [76]. On the basis of the electronegativity equalization method, a modified electronegativity equalization method (MEEM) was developed by Yang et al. [77, 78], and it was further developed into the atom-bond electronegativity equalization method (ABEEM), which allows more accurate estimation of the electron distribution and electrostatic energy of large molecules.

The CHEQ model has been successfully applied to the investigation of proteins, ion solvation, etc. [76]. Patel and co-workers have developed a polarizable force field for dimyristoylphosphatidylcholine (DMPC) and dipalmitoylphosphatidylcholine (DPPC) based on the charge equilibration (CHEQ) force field approach [8, 79]. The CHEQ force field has been applied to studies of lipid bilayers and monolayers, as well as membrane-bound protein channels, such as gramicidin A [80]. Taking water permeation as an example [79], simulations using the polarizable force field showed higher permeation than those with non-polarizable models. It was suggested that a fixed-charge force field could not reproduce the expected dielectric properties of the nonpolar hydrocarbon region, and that, with fixed charges, water molecules in the membrane interior kept large dipole moments similar to those of bulk water [81].

Drude Oscillator Model. In the Drude oscillator model [82], which is a direct extension of the additive force field, a polarizable point dipole is introduced on each atom by attaching a Drude particle to it with a harmonic spring. A “core” charge and an opposite “shell” charge are assigned to the parent atom and the Drude particle, respectively [83], so that the net atomic charge is preserved, and the induced polarization is simulated by the displacement of the Drude particle under the influence of an electric field.

Thus, the dipole moment of this two-particle system in the presence of an electric field (E) can be expressed as:

$$ \mu_{i} = q_{i} d_{i} = \frac{{q_{i}^{2} E}}{k} $$
(12.26)

Here, the induced dipole moment μ_i depends on the charge q_i of the Drude particle and the spring displacement d_i, which is controlled by the spring force constant k. Both q_i and k are adjustable parameters in this model. In an MD simulation, the initial positions of the Drude particles can be obtained by energy minimization with the positions of the atoms held fixed. The Drude particles then take part in the simulation, so that the corresponding dipole moments respond dynamically. The contributions of these induced dipoles to the total electrostatic energy can be separated into three parts: the interaction with static fields (charges, dipoles, etc.), the induced dipole-induced dipole interaction, and the polarization energy:

$$ U_{\text{ind}} = U_{\text{stat}} + U_{\mu \mu } + U_{\text{pol}} $$
(12.27)

Here U_stat and U_μμ can be calculated by Coulomb’s law, and the polarization energy is equal to the spring potential, which is:

$$ U_{\text{pol}} = \frac{1}{2}\sum\limits_{i = 1}^{N} {k_{i} } d_{i}^{2} $$
(12.28)
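The relations in Eqs. 12.26 and 12.28 can be checked numerically for a single Drude particle. The following is a minimal sketch, not the implementation of any particular MD package: the function name `drude_response`, the charge, the force constant, and the field are hypothetical values in arbitrary but consistent units, and the field is assumed to be uniform.

```python
import numpy as np

def drude_response(q_d, k, e_field):
    """Displacement, induced dipole and spring energy of one Drude particle
    in a uniform field (Eqs. 12.26 and 12.28, single-particle case)."""
    e_field = np.asarray(e_field, dtype=float)
    d = q_d * e_field / k            # force balance: k d = q E
    mu = q_d * d                     # Eq. 12.26: mu = q d = q^2 E / k
    u_pol = 0.5 * k * np.dot(d, d)   # Eq. 12.28 with N = 1
    return d, mu, u_pol

# Hypothetical parameters (arbitrary consistent units)
q_d, k = -1.5, 1000.0
E = np.array([0.0, 0.0, 2.0])

d, mu, u_pol = drude_response(q_d, k, E)
alpha = q_d**2 / k                   # implied isotropic polarizability
```

In this picture the ratio q²/k plays the role of an isotropic atomic polarizability, which is why q_i and k can be tuned together to reproduce a target polarizability.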

Unlike in additive FFs, electrostatic interactions between bonded atoms are included to obtain the correct molecular polarization response. In the Drude particle model, these electrostatic interactions can be treated in the same way as ordinary charge-charge interactions. However, adding Drude particles greatly increases the computational cost, so in practice Drude particles are attached only to heavy atoms.

A polarizable force field based on the Drude oscillator has also been developed [84, 85]. It covers a broad class of molecules, such as proteins [86], carbohydrates [87,88,89], and DNA [90, 91], and the parameters for RNA are close to completion [92]. Furthermore, Drude force fields for DPPC [17], cholesterol [93], and sphingomyelin [93] have been established recently. In all the simulations using the Drude force field, the description of the membrane dipole potential has been improved as a result of the inclusion of atomic polarizabilities.

Induced Dipole Model. The idea of treating first-order induction explicitly was introduced as a modification of the additive force field that accounts approximately for higher-order contributions [94]. The induced dipole model is implemented in FFs such as AMBER ff02 [95] and AMOEBA (Atomic Multipole Optimized Energetics for Biomolecular Applications) [96], the latter developed by Ren and Ponder. The electrostatic energy in AMOEBA includes contributions from both permanent and induced multipoles. Permanent electrostatic interactions are computed with higher-order moments, where

$$ M_{i} = \left[ {q_{i} ,d_{ix} ,d_{iy} ,d_{iz} ,Q_{ixx} ,Q_{ixy} ,Q_{ixz} ,Q_{iyx} ,Q_{iyy} ,Q_{iyz} ,Q_{izx} ,Q_{izy} ,Q_{izz} } \right]^{T} $$
(12.29)

is a multipole composed of the charge q_i, the dipole components d_iα, and the quadrupole components Q_iαβ. The interaction energy between two multipole sites is

$$ U_{\text{emp}}^{\text{perm}} = \left[ {\begin{array}{*{20}c} {q_{i} } \\ {d_{ix} } \\ {d_{iy} } \\ {d_{iz} } \\ {Q_{ixx} } \\ \vdots \\ \end{array} } \right]^{T} \left[ {\begin{array}{*{20}c} 1 & {\frac{\partial }{{\partial x_{j} }}} & {\frac{\partial }{{\partial y_{j} }}} & {\frac{\partial }{{\partial z_{j} }}} & \cdots \\ {\frac{\partial }{{\partial x_{i} }}} & {\frac{{\partial^{2} }}{{\partial x_{i} \partial x_{j} }}} & {\frac{{\partial^{2} }}{{\partial x_{i} \partial y_{j} }}} & {\frac{{\partial^{2} }}{{\partial x_{i} \partial z_{j} }}} & \cdots \\ {\frac{\partial }{{\partial y_{i} }}} & {\frac{{\partial^{2} }}{{\partial y_{i} \partial x_{j} }}} & {\frac{{\partial^{2} }}{{\partial y_{i} \partial y_{j} }}} & {\frac{{\partial^{2} }}{{\partial y_{i} \partial z_{j} }}} & \cdots \\ {\frac{\partial }{{\partial z_{i} }}} & {\frac{{\partial^{2} }}{{\partial z_{i} \partial x_{j} }}} & {\frac{{\partial^{2} }}{{\partial z_{i} \partial y_{j} }}} & {\frac{{\partial^{2} }}{{\partial z_{i} \partial z_{j} }}} & \cdots \\ \vdots & \vdots & \vdots & \vdots & \ddots \\ \end{array} } \right]\frac{1}{{R_{ij} }}\left[ {\begin{array}{*{20}c} {q_{j} } \\ {d_{jx} } \\ {d_{jy} } \\ {d_{jz} } \\ {Q_{jxx} } \\ \vdots \\ \end{array} } \right] $$
(12.30)
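To make the operator matrix in Eq. 12.30 concrete, the sketch below evaluates only its charge and dipole block by applying the derivatives of 1/R_ij analytically. It is an illustration under stated assumptions rather than the AMOEBA implementation: quadrupoles are omitted, the Coulomb constant is set to 1, and the function name `multipole_energy` is hypothetical.

```python
import numpy as np

def multipole_energy(q_i, d_i, q_j, d_j, r_i, r_j):
    """Permanent electrostatic energy between two sites carrying a charge and
    a dipole (quadrupoles omitted), in units where Coulomb's constant is 1."""
    r = np.asarray(r_j, float) - np.asarray(r_i, float)
    R = np.linalg.norm(r)
    d_i = np.asarray(d_i, float)
    d_j = np.asarray(d_j, float)

    grad_j = -r / R**3                                   # d(1/R)/d r_j
    grad_i = r / R**3                                    # d(1/R)/d r_i
    hess_ij = np.eye(3) / R**3 - 3.0 * np.outer(r, r) / R**5  # d2(1/R)/(d r_i d r_j)

    u_qq = q_i * q_j / R              # matrix element "1"
    u_qd = q_i * (grad_j @ d_j)       # first row: charge i with dipole j
    u_dq = (d_i @ grad_i) * q_j       # first column: dipole i with charge j
    u_dd = d_i @ hess_ij @ d_j        # second-derivative block: dipole-dipole
    return u_qq + u_qd + u_dq + u_dd

# Two unit dipoles aligned head-to-tail along z, 2 length units apart
u_head_to_tail = multipole_energy(0.0, [0, 0, 1], 0.0, [0, 0, 1],
                                  [0, 0, 0], [0, 0, 2])
```

The head-to-tail arrangement gives a negative (attractive) energy, as expected for two aligned dipoles on a common axis.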

Since induced dipoles are introduced to represent polarization, the charge on each atom can be derived directly from gas-phase experimental values or high-level QM calculations. This is more straightforward than the approach in additive force fields, in which the partial charges assigned to atoms must also represent polarization effects. However, this approach suffers from an important issue, the polarization catastrophe: as outlined by Thole [97], if two dipoles are close to each other, the induced dipoles calculated from the point-dipole equations are unphysically amplified. Thole [97] developed a series of approaches to solve this problem by mimicking smeared charge distributions between atoms at short distances using a set of fitting functions. In this way, the dipole field tensor, T_ij, is modified so that it no longer scales as r_ij^−3 at small atomic separations. Thus, damping methods are important when dealing with dipole-dipole interactions at short distances.
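The polarization catastrophe can be illustrated with two identical point-polarizable sites and a field applied along the line joining them. The sketch below solves the mutual induction equations in closed form, first with the bare dipole tensor and then with an exponential damping of the Thole type. The specific damping functions and the width parameter a = 0.39 are assumptions chosen for illustration (one commonly used form); the text above does not specify which fitting function is used, and the function name `axial_polarizability` is hypothetical.

```python
import numpy as np

def axial_polarizability(r, alpha, a=None):
    """Polarizability along the axis of two identical induced-dipole sites.

    With a=None the bare dipole field tensor (2 / r^3 along the axis) is used;
    otherwise it is scaled by exponential Thole-style damping factors
    (assumed form, see lead-in).
    """
    t_par = 2.0 / r**3
    if a is not None:
        x = a * r**3 / alpha                  # a * u^3 with u = r / alpha^(1/3)
        lam3 = 1.0 - np.exp(-x)
        lam5 = 1.0 - (1.0 + x) * np.exp(-x)
        t_par = (3.0 * lam5 - lam3) / r**3
    # mu_1 = mu_2 = alpha * (E + t_par * mu)  ->  total = 2 alpha / (1 - alpha t_par)
    return 2.0 * alpha / (1.0 - alpha * t_par)

alpha = 1.0                                   # hypothetical polarizability (A^3)
r_crit = (2.0 * alpha) ** (1.0 / 3.0)         # undamped response diverges here
near_singularity = axial_polarizability(1.0001 * r_crit, alpha)

r = np.linspace(0.3, 6.0, 200)
damped = axial_polarizability(r, alpha, a=0.39)
```

Without damping the response diverges at r = (2α)^(1/3) and becomes negative at shorter distances, whereas the damped response stays finite and positive over the whole range scanned here.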

At very close distances, when the electron clouds overlap, the multipole approximation becomes inadequate. In 2015, penetration effects were introduced into the AMOEBA force field [98] by revisiting the method proposed by Piquemal et al. [99]. The charge of an atom is divided into a core and an electron cloud, so the total electrostatic energy between two atoms can be calculated as the sum of three components: core-core, core-electron, and electron-electron interactions. The electrostatic energy can be written as

$$ E_{qq} \left( r \right) = \frac{1}{r}\left[ Z_{1} Z_{2} - Z_{1} \left( Z_{2} - q_{2} \right)\left( 1 - \exp \left( - \alpha_{2} r \right) \right) - Z_{2} \left( Z_{1} - q_{1} \right)\left( 1 - \exp \left( - \alpha_{1} r \right) \right) + \left( Z_{1} - q_{1} \right)\left( Z_{2} - q_{2} \right)\left( 1 - \exp \left( - \beta_{1} r \right) \right)\left( 1 - \exp \left( - \beta_{2} r \right) \right) \right] $$
(12.31)

where r is the distance between two atoms, Z is the positive core charge, q is the net charge of the atom, (Z − q) can be considered as the electron cloud, and α and β are two parameters controlling the magnitude of the damping of the electron cloud when the atom interacts with the core and with the electrons of other atoms, respectively. Z is intuitively set to the number of valence electrons. As the distance between two atoms increases, the exponential terms vanish and Eq. 12.31 reduces to the additive Coulomb law, q1q2/r. Thus, at medium and long distances, the electrostatic energy is still calculated accurately via the multipole expansion, and the penetration effect diminishes rapidly with distance. It is worth noting that penetration is only significant when the distance is shorter than the sum of the atomic van der Waals radii. The results of this method describe the polarization response using perturbation theory rather than a variational approach to reach the SCF condition, which improves computational efficiency.
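The long-range limit of Eq. 12.31 can be verified numerically. The sketch below evaluates the three components for a pair of atoms and compares the result with the point-charge Coulomb energy. All parameter values are hypothetical and serve only as an illustration, the Coulomb constant is set to 1, and the function name `penetration_energy` is an assumption.

```python
import math

def penetration_energy(r, z1, q1, a1, b1, z2, q2, a2, b2):
    """Charge-charge energy of Eq. 12.31 (Coulomb constant set to 1)."""
    n1, n2 = z1 - q1, z2 - q2                  # electron-cloud populations
    core_core = z1 * z2
    core_elec = (-z1 * n2 * (1.0 - math.exp(-a2 * r))
                 - z2 * n1 * (1.0 - math.exp(-a1 * r)))
    elec_elec = (n1 * n2 * (1.0 - math.exp(-b1 * r))
                 * (1.0 - math.exp(-b2 * r)))
    return (core_core + core_elec + elec_elec) / r

# Hypothetical parameters for two atoms (arbitrary consistent units)
atom1 = dict(z1=6.0, q1=-0.2, a1=3.0, b1=3.0)
atom2 = dict(z2=1.0, q2=0.1, a2=3.0, b2=3.0)

e_far = penetration_energy(50.0, **atom1, **atom2)   # long range
e_near = penetration_energy(1.0, **atom1, **atom2)   # overlapping clouds
coulomb_far = atom1["q1"] * atom2["q2"] / 50.0
coulomb_near = atom1["q1"] * atom2["q2"] / 1.0
```

At large separation the two energies coincide, while at short range the damped expression departs from the point-charge value, which is the penetration correction.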

12.2.2.5 MARTINI Coarse-Grained (CG) Model

Although various coarse-grained (CG) approaches are available, the MARTINI model, developed by the groups of Marrink and Tieleman [100], is one of the most successful and broadly used CG force fields. The MARTINI model was initially developed in 2004 to study the self-assembly and fusogenicity of small lipid vesicles and was later extended to investigate the interactions between membrane proteins and their lipid environments. Currently, the MARTINI force field provides parameters for a variety of biomolecules and materials, including the majority of lipid molecules, cholesterol, all native amino acids, carbohydrates, nucleotides, fullerene, polymers, and surfactants.

As a CG model, MARTINI adopts a four-to-one mapping scheme to reduce the resolution of the representation of a system. Four heavy atoms of an all-atom model are represented by a single CG bead, which discards degrees of freedom (DoF) on the assumption that the dynamic behavior of the system is only weakly associated with them. Ring-like molecules (e.g. benzene, cholesterol, and several of the aromatic amino acids) are mapped with higher resolution (up to two-to-one). The MARTINI model averages atomic properties over chemical entities and neglects individual atoms. Four main types of sites, polar (P), non-polar (N), apolar (C), and charged (Q), are defined to account for the interactions of a system. The non-bonded interactions of the chemical building blocks are described by a Lennard-Jones (LJ) 12-6 potential whose parameters are extensively calibrated against thermodynamic data such as oil/water partitioning coefficients. In addition to the LJ interaction, charged groups (type Q) bear a charge ±e and interact via a Coulombic energy function. Coulombic interactions are screened implicitly with a relative dielectric constant εrel = 15 to account for the reduced set of partial charges and resulting dipoles that occur in an atomistic force field.
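The two non-bonded terms described above can be written down directly. The following is a minimal sketch using the plain (unshifted) functional forms, which is a simplifying assumption because production MD engines typically apply shift or cut-off modifiers. The bead parameters `epsilon` and `sigma` in the example are illustrative values, not parameters quoted in this chapter; only εrel = 15 is taken from the text.

```python
import math

COULOMB_CONST = 138.935458  # 1/(4 pi eps0) in kJ mol^-1 nm e^-2

def lj_12_6(r, epsilon, sigma):
    """Lennard-Jones 12-6 energy (kJ/mol) for bead separation r (nm)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)

def screened_coulomb(r, q1, q2, eps_rel=15.0):
    """Coulomb energy (kJ/mol) screened by a uniform relative dielectric constant."""
    return COULOMB_CONST * q1 * q2 / (eps_rel * r)

# Illustrative bead parameters (assumed, not from the text)
epsilon, sigma = 5.0, 0.47                   # kJ/mol, nm
r_min = 2.0 ** (1.0 / 6.0) * sigma           # position of the LJ minimum

u_lj_min = lj_12_6(r_min, epsilon, sigma)
u_coul = screened_coulomb(0.47, +1.0, -1.0)  # a +e / -e pair of Q beads
```

With εrel = 15, the Coulomb energy of an oppositely charged bead pair is fifteen times weaker than it would be in vacuum at the same distance.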

Bonded interactions are described by a standard set of potential energy functions that are common in classical force fields, including harmonic bond and angle potentials and multimodal dihedral potentials. Proper dihedrals are primarily used to impose secondary structure on the peptide backbone. Improper dihedrals are mainly used to prevent out-of-plane distortions of planar groups. LJ interactions between nearest neighbors are excluded. The detailed parameterization process of the MARTINI force field can be found in Ref. [100].

A membrane protein system of interest can be built with the MARTINI force field using CHARMM-GUI Martini Maker [101]. CHARMM-GUI [102] is a web-based graphical user interface that generates various molecular simulation systems and input files for major MD engines (e.g. CHARMM, NAMD, GROMACS, AMBER, and OpenMM) to facilitate and standardize the use of common and advanced simulation techniques. By taking advantage of the frameworks of the all-atom CHARMM-GUI modules, Wonpil Im and co-workers [103] have recently provided a convenient interface for building complex bilayers, micelles, vesicles, and more, with proteins embedded. It supports the martini, martini with polarizable water, dry martini, and ElNeDyn (an elastic network model for proteins) force fields.

12.3 Applications of Computer Simulations

12.3.1 Coarse-Grained Molecular Dynamic Simulation Case Study

12.3.1.1 Binding Sites of Cholesterol in β2-Adrenergic Receptor

In the human genome, integral membrane proteins represent a large portion. Among them are the G-protein-coupled receptors (GPCRs), which have seven transmembrane domains and more than one thousand members, making them the largest membrane protein family [104]. GPCRs primarily transduce signals across the plasma membrane in response to diverse extracellular stimuli, such as light, peptides, small molecules, and protons. Therefore, GPCRs are major targets for the development of novel drug candidates in all clinical areas [105].

According to sequence alignment, the GPCRs have been divided into five classes [104, 106]. β2-adrenergic receptors (β2AR) belong to the class A receptors, which can be further divided into groups associated with ligand specificities, such as the opsin, amine, peptide, cannabinoid, and olfactory receptors [107]. GPCRs take part in many physiological processes, including neurotransmission, cellular metabolism, secretion, cellular differentiation, growth, and inflammatory and immune responses [108]. Modifications of the adrenergic receptors are associated with various diseases, such as asthma, hypertension, and heart failure [109]. β2AR is one of the best-characterized GPCRs and is expressed in pulmonary and cardiac myocyte tissue [110, 111]. β2AR can also modulate signaling in erythrocytes during malarial infection [108, 112].

The cell membrane can modulate the function of many membrane proteins [113,114,115], and this functional modulation is associated with physical or chemical interactions with phospholipids, sphingolipids, cholesterol, etc. [116]. Cholesterol is an essential component of eukaryotic membranes and plays a critical role in membrane organization, dynamics, and function. The equilibrium state of membrane proteins is sensitive to the presence and amount of cholesterol [116]. Increasing the amount of cholesterol in the membrane shifts the equilibrium toward the inactive conformation of the proteins. Some studies have found that cholesterol can modulate the physiological function of GPCRs and that it is strongly associated with the kinetic, energetic, and mechanical stability of β2AR [108]. Moreover, one study has shown that cholesterol seems to be helpful in crystallizing β2AR [107]. In 2007, Cherezov et al. published an X-ray crystallography model of human β2AR [107], which showed cholesterol bound to the surface formed by α-helices H1, H2, H3, H4 and H8. Compared to rhodopsin, the ligand-binding pocket was formed by both structurally conserved and divergent helices, a feature that was also found in most class A GPCRs. The observation of this complex formed by β2AR and cholesterol suggested a possible interaction between them. In addition, Zocher et al. [116] found that cholesterol considerably increased the strength of the interactions stabilizing structural segments of β2AR, and that these interactions increased the stability of all structural segments of β2AR except the structural core segment. From these results, they speculated that the structural properties of GPCRs in the presence of cholesterol might cause GPCRs to respond differently to environmental changes. Therefore, the amount of cholesterol that binds to GPCRs is an important subject of research.

To show the dynamics, functions and interaction energies of cholesterol binding to β2AR, a series of microsecond (μs) level coarse-grained (CG) molecular dynamics (MD) simulations, including 8 μs CG MD simulations of β2AR embedded in DOPC and in 1:1, 3:1, and 6:1 DOPC/cholesterol mixed membranes, were conducted. β2AR was modeled with the Martini model combined with an elastic network [117], which conserves the tertiary and quaternary structures more faithfully without sacrificing the realistic dynamics of a protein. The simulations validated the cholesterol binding site against the crystal structure of β2AR and provided the interaction energies of β2AR in mixed membranes with different DOPC/cholesterol ratios.

12.3.1.2 Methods

Simulation Systems. Four systems of the β2AR monomer, embedded in membranes with four different DOPC/cholesterol ratios, were constructed for the CG MD simulations. The model of the β2AR monomer was designed to reproduce the shape, surface polarity and dynamics of the β2AR monomer as reported in the 3D4S crystal structure, without the bound cholesterol [118]. In the crystal structure of β2AR, the intracellular loop that connects Helix 5 and Helix 6 is missing, and this loop deletion was retained [118]. The ElNeDyn model was used throughout the simulation process, with the elastic network acting as a structural scaffold to describe and maintain the overall shape of β2AR. ElNeDyn models are comparable to atomistic protein models and reproduce well the structural and dynamic properties of proteins, including the collective motions [117]. The topology options set the elastic bond force constant to 500 kJ mol−1 nm−2 (-ef 500) or 200 kJ mol−1 nm−2 (-ef 200), with an upper bond length cut-off of 0.9 nm.
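The role of the force constant and the cut-off in an elastic network can be sketched as follows. This is a simplified illustration, not the ElNeDyn topology generator: it connects every pair of beads closer than the upper cut-off with a harmonic bond whose reference length is the initial distance, and it ignores any additional selection rules the actual tool may apply. The function names and the toy coordinates are hypothetical.

```python
import numpy as np

def elastic_network(coords_nm, force_const=500.0, cutoff_nm=0.9):
    """List harmonic bonds (i, j, r0, k) between beads closer than the cut-off."""
    coords = np.asarray(coords_nm, float)
    bonds = []
    for i in range(len(coords) - 1):
        dists = np.linalg.norm(coords[i + 1:] - coords[i], axis=1)
        for offset in np.nonzero(dists < cutoff_nm)[0]:
            j = i + 1 + int(offset)
            bonds.append((i, j, float(dists[offset]), force_const))
    return bonds

def elastic_energy(coords_nm, bonds):
    """Total harmonic energy (kJ/mol): sum of 0.5 * k * (r - r0)^2 over all bonds."""
    coords = np.asarray(coords_nm, float)
    return sum(0.5 * k * (np.linalg.norm(coords[j] - coords[i]) - r0) ** 2
               for i, j, r0, k in bonds)

# Toy backbone: four beads on a line, 0.38 nm apart (hypothetical geometry)
reference = np.array([[0.00, 0, 0], [0.38, 0, 0], [0.76, 0, 0], [1.14, 0, 0]])
bonds = elastic_network(reference, force_const=500.0, cutoff_nm=0.9)

displaced = reference.copy()
displaced[3, 0] += 0.1                       # stretch the last bead by 0.1 nm
strain_energy = elastic_energy(displaced, bonds)
```

A larger force constant (500 instead of 200 kJ mol−1 nm−2) penalizes the same deformation more strongly, which is why the two settings can give different cholesterol distributions around the protein.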

CG MD simulations were performed on complexes of β2AR with membranes of different DOPC/cholesterol ratios to validate the cholesterol binding site against the crystal structure of β2AR and to obtain the interaction energies of β2AR in the different DOPC/cholesterol mixed membranes [119]. Each system contained one β2AR monomer embedded in a lipid bilayer with a DOPC/cholesterol ratio of 1:0 (pure DOPC), 1:1, 3:1, or 6:1. The mixed membranes were built with CHARMM-GUI, and the total number of DOPC and cholesterol molecules was 512. The cholesterol molecules were randomly dispersed in the DOPC lipid bilayer. β2AR was then inserted into the mixed membrane using GROMACS. Details of the simulation systems are listed in Table 12.1.

Table 12.1 Details of the simulation systems

CG MD Simulations. CG MD simulations were carried out with GROMACS 4.5.3. The martini_2.1 force field was employed for all CG MD simulations, and the force field parameters of martini_2.2_lipids and martini_2.0_cholesterol were applied to the DOPC/cholesterol mixed bilayer [120]. During the simulations, all bonds were constrained using the LINCS algorithm, and the integration time step was set to 20 fs. The particle mesh Ewald (PME) method was employed to treat long-range electrostatic interactions, and a cut-off value of 12 Å was used for non-bonded interactions.

Prior to the MD runs, all systems were energy-minimized to remove bad contacts. The systems were then heated to 300 K within 1 ns. Each system was equilibrated for a further 1 μs with constraints imposed only on the protein. The simulations were performed in the NPT ensemble with periodic boundary conditions.

Spatial Distribution of Cholesterol around β2AR. The spatial distribution function (SDF) was used to reveal potential cholesterol-binding sites on the β2AR surface. The SDF of the cholesterol molecules around β2AR was calculated as the 3D spatial distribution function of the cholesterol model [121], using the last 0.35 μs of the MD trajectory of β2AR in each DOPC/cholesterol mixed membrane and the g_spatial module of the GROMACS package. In general, the SDF reflects the average 3D density distribution of the cholesterol CG beads. Therefore, the peaks of the SDF indicate the locations where cholesterol molecules reside with higher probability, as described previously.
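The idea behind the SDF can be sketched as a 3D histogram of bead positions accumulated over trajectory frames. The code below is an independent illustration with NumPy, not the g_spatial algorithm itself. It assumes the frames have already been fitted to the protein reference frame, and the function names, grid limits, and bin width are hypothetical.

```python
import numpy as np

def spatial_density(bead_xyz_frames, box_min, box_max, bin_width=0.1):
    """Average number density (beads per nm^3) on a regular 3D grid.

    bead_xyz_frames: array of shape (n_frames, n_beads, 3), in nm, already
    fitted to the protein reference frame.
    """
    frames = np.asarray(bead_xyz_frames, float)
    box_min = np.asarray(box_min, float)
    box_max = np.asarray(box_max, float)
    n_bins = np.ceil((box_max - box_min) / bin_width).astype(int)
    edges = [box_min[k] + bin_width * np.arange(n_bins[k] + 1) for k in range(3)]
    counts, _ = np.histogramdd(frames.reshape(-1, 3), bins=edges)
    density = counts / (len(frames) * bin_width ** 3)
    return density, edges

def project_xy(density, bin_width=0.1):
    """Integrate the density along z to obtain a 2D map (beads per nm^2)."""
    return density.sum(axis=2) * bin_width

# Synthetic example: one bead that stays at the same position in 10 frames
frames = np.tile([[1.1, 0.3, 1.9]], (10, 1, 1))          # shape (10, 1, 3)
density, edges = spatial_density(frames, [0, 0, 0], [2, 2, 2], bin_width=0.25)
peak = np.unravel_index(density.argmax(), density.shape)
xy_map = project_xy(density, bin_width=0.25)
```

Grid cells where the averaged density is high correspond to the SDF peaks discussed in the text, and the z-integrated map corresponds to the two-dimensional projections onto the membrane plane.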

12.3.1.3 Spatial Distribution of Cholesterol Molecules Around β2AR in Different Scales of DOPC/Cholesterol Mixture Membrane

The binding sites of cholesterol around β2AR (PDB code 2RH1) [107] are shown in Fig. 12.1. The SDF of cholesterol is displayed as isosurfaces around the surface of 2RH1. Three cholesterol molecules bind to 2RH1: one binds to the surface of helices H1 and H8, and the other two bind to the surface formed by helices H1, H2, H3, and H4.

Fig. 12.1 Spatial distribution of cholesterol around the crystal structure of β2AR (PDB code 2RH1). a The crystal structure of β2AR. β2AR is shown in cartoon and colored in teal, and the cholesterol molecules are shown in sticks and colored in green. b, c Two-dimensional projections of SDFs on the upper and lower membrane planes. The purple sites are the cholesterol binding sites of β2AR (PDB code 2RH1)

The SDF is also projected onto the upper and lower membrane planes to show the average spatial distribution of cholesterol around β2AR in the DOPC/cholesterol membrane. The membrane is divided into an extracellular part and an intracellular part. The analysis clearly shows that the sites of high cholesterol density are on the surface of the helices of β2AR.

Figure 12.2 shows several regions of high cholesterol density on the surface of β2AR in the 1:1 DOPC/cholesterol mixed membrane. In this simulation, the elastic bond force constant of β2AR is set to 200 kJ mol−1 nm−2 (β2AR_DOPC_cholestrol_11_200). All cholesterol molecules are located near the helices of β2AR. In the intracellular part, the first peak is located on the surface of H1 and H8, which is consistent with the crystal structure of β2AR (Fig. 12.1c). Another peak lies near the surface formed by H3, H4 and H5. In the extracellular part, almost all the peaks are distributed on the surface of H4, H5 and H6.

Fig. 12.2 The spatial distribution of cholesterol around β2AR in the 1:1 DOPC/cholesterol membrane, with the elastic bond force constant of β2AR set to 200 kJ mol−1 nm−2. a The SDF of cholesterol is shown as an isosurface, and the average structure through the simulation is fitted to the isosurface. b The surface plot of the average SDF of cholesterol beads. c The 2D projections of the SDF around the helices of β2AR

Figure 12.3 shows more cholesterol distribution peaks on the surface of β2AR. In this simulation, the elastic bond force constant of β2AR is set to 500 kJ mol−1 nm−2 (β2AR_DOPC_cholestrol_11_500). In the intracellular part, the first peak is located on the surface of H1 and H8, which is consistent with the crystal structure of β2AR and with the results of the simulation above, in which the elastic bond force constant is set to 200 kJ mol−1 nm−2 (Fig. 12.2c). The second peak lies near the surface formed by H2, H3 and H4, very close to H4. In the crystal structure 2RH1, one cholesterol molecule is held in a similar position. Also in the intracellular part, there are three small peaks near H3, H6 and H7. In the extracellular part, most of the peaks are distributed on the surface of H1, H5, H6 and H7.

Fig. 12.3 The spatial distribution of cholesterol around β2AR in the 1:1 DOPC/cholesterol membrane, with the elastic bond force constant of β2AR set to 500 kJ mol−1 nm−2. a The SDF of cholesterol is shown as an isosurface, and the average structure through the simulation is fitted to the isosurface. b The surface plot of the average SDF of cholesterol beads. c The 2D projections of the SDF around the helices of β2AR

For β2AR in the 3:1 and 6:1 DOPC/cholesterol mixed membranes, the results are shown in Table 12.2.

Table 12.2 Interaction energies of the eight helices in different conditions (kJ/mol)

A potential cholesterol-binding site detected by the CG MD simulations is located at the interface of H1 and H8. It is detected in all the simulations and is consistent with the crystal structure 2RH1. The results demonstrate that MD simulations can be employed to reproduce the binding mode of cholesterol to β2AR. Cholesterol can mediate the dimeric structure, as reported in 2007. However, skepticism remains as to whether this is a physiologically relevant form or just a crystal packing artifact, because the two cholesterol molecules that mediate the dimer are located at the crystal packing interface. Another potential cholesterol-binding site, located at the interface of H2, H3, and H4 on the intracellular surface, is also detected in most simulations, except for the systems β2AR_DOPC_cholestrol_11_200 and β2AR_DOPC_cholestrol_61_500.

The interaction energies of the eight helices, with the elastic bond force constant of β2AR set to 200 kJ mol−1 nm−2, are shown in Table 12.2. As the cholesterol concentration increases, the interaction energies between H3 and H4 and the other helices become weaker. The results suggest that increasing cholesterol clearly affects the conformation of H3 and H4 of β2AR.

12.3.2 All-Atom Molecular Dynamic Simulation Case Study

12.3.2.1 Calcium Facilitated Chloride Permeation in Bestrophin

Calcium-activated chloride channels (CaCCs) perform a variety of physiological roles in regulating photo-transduction, olfactory transduction, vascular tone, epithelial electrolyte secretion, and neuronal and cardiac excitability [122]. Despite their broad distribution and important functions [123], the molecular identity of CaCCs remains unclear. Significant progress has been made in recent years in identifying the family members of CaCCs. Three groups of proteins (TMEM16, LRRC8 and bestrophins) have been regarded as CaCCs so far [124]. However, only bestrophin has been demonstrated to have a chloride-conducting pore, while the formation of anion channels by TMEM16 and LRRC8 has only been evidenced indirectly.

Human Bestrophin 1 (hBest1) is highly expressed at the basolateral surface of retinal pigment epithelial (RPE) cells to regulate retinal homeostasis [125]. Mutations in hBest1 cause multiple retinal degeneration disorders, typically the autosomal dominant vitelliform macular dystrophy (Best disease) [126]. The chloride channel activity of hBest1 is stimulated by intracellular calcium with a Kd of 150 nM [127]. Although there is evidence indicating that the activation is directly regulated by the binding of Ca2+ at the cytosolic region of the protein, it is still unclear how Ca2+ participates in gating the channel.

Recently available X-ray structures of chicken BEST1 (Best1cryst) [128] and the bacterial homolog KpBest1 [129] open up a new avenue for understanding the mechanisms of calcium-facilitated chloride permeation and selectivity in the bestrophin family. Chicken BEST1 shares 74% sequence identity with hBest1, and the protein assembles as a symmetrical homo-pentamer around a central axis. A single, continuous, ~95 Å long ion pore located along the central axis of the protein forms the anion permeation pathway, with a narrow neck defined by the conserved hydrophobic residues Ile76, Phe80 and Phe84 of each subunit. Mutations in the neck region significantly influence the channel properties. In particular, the I76E mutation in hBest1 flips the ion selectivity to Na+, and the F80E and F84E mutations impair the Cl− permeability [129]. Below the neck, the pore opens into a large inner cavity in the cytosolic region, ~45 Å long with a maximum radius of 10 Å, in which Ca2+ might be accommodated. At the bottom of the channel’s cytosolic region, there is an aperture surrounded by Val205 (Ile205 in hBest1). Replacing Ile205 with threonine in hBest1 significantly decreased the chloride conductance [130], suggesting that the aperture plays an important role in anion selectivity.

Another prominent feature of the X-ray structure of Best1cryst is that each subunit has a strong Ca2+ binding cavity formed by the acidic cluster (Glu300, Asp301, Asp302, Asp303 and Asp304). The coordination of Ca2+ in Best1cryst is similar to that observed in the EF-hand domains [131] and the ‘Ca2+ bowl’ of the BK potassium channel [132]. The Ca2+ clasps formed by the acidic cluster resemble a pentagonal geometry and are located at the midsection of the channel, near the membrane–cytosol interface. Mutations around the Ca2+ clasp in hBest1 impair the interactions between the transmembrane domains and the cytosolic domains [133], resulting in a dysfunctional channel.

Although approximately 200 distinct mutations in bestrophins have been identified as causes of retinal degenerative diseases [128], and most of these mutations lead to dysfunction of the chloride channel, the molecular mechanism of the Ca2+-dependent chloride channel activity of bestrophin is still not fully understood. Here, to gain molecular insight into calcium-facilitated chloride permeation along the channel of Best1cryst, all-atom MD simulations are used to compare the chloride permeation properties of Best1cryst in the presence of Ca2+ and Na+, respectively. The main purpose of this section is to illustrate how MD simulations can be employed to investigate the ion transport process at the atomic level.

12.3.2.2 System Setup

The MD simulation systems were prepared using the recently available X-ray structure of chicken BEST1 (PDB ID: 4RDQ) [128]. Best1cryst was assembled into the bilayer using the CHARMM-GUI web server [134]. The co-crystallized Fab fragments were deleted, and Best1cryst was embedded in a heterogeneous bilayer composed of 400 POPE/POPG lipids at a ratio of 3:1 to mimic the experimental liposome condition [128]. The five Ca2+ ions coordinated by the acidic cluster in the Ca2+ clasps were retained during system preparation. The systems were then solvated with 41,959 TIP3P water molecules, and their charges were neutralized using 0.1 M CaCl2 and 0.2 M NaCl, respectively. The systems, containing ~207,032 atoms, were placed in an orthogonal box of 115 × 115 × 150 Å³. All MD simulations were performed using the Gromacs [135] 5.0.4 package with the CHARMM 36 force field [136] under NPT conditions. The leap-frog integrator [137] was used with an integration time step of 2 fs. Electrostatic interactions were calculated using the Particle-Mesh Ewald algorithm [138] with a cut-off of 1.2 nm. The same cut-off value was chosen for the van der Waals interactions. During the production runs, semi-isotropic pressure coupling was applied using the Parrinello-Rahman barostat [139] to control the pressure at 1 bar with a coupling constant of 5 ps. The Nose-Hoover thermostat [140] was employed to maintain the temperature of the systems at around 303.15 K with a time constant of 1 ps.

After 50 ns of MD simulation in the NPT ensemble, the PMFs along the reaction coordinate of chloride permeation through Best1cryst were calculated using the umbrella sampling technique. The initial conformations for the umbrella sampling simulations were taken from the last frames of two independent 50 ns standard MD simulations with ion concentrations of 0.1 M CaCl2 and 0.2 M NaCl, respectively. The z distance between Cl− and the center of mass (COM) of Best1cryst was divided into 180 uniformly spaced bins with a length of 0.5 Å, covering a distance of 90 Å. In each window, the chloride anion was subjected to a harmonic potential with a spring constant of 6000 kJ mol−1 nm−2, implemented using the PLUMED free energy calculation library [141]. A cylindrical restraint was also applied if Cl− moved more than 8 Å away from the COM of Best1cryst in the xy plane. A 2 ns umbrella sampling MD simulation was conducted for each bin, and the last 1.8 ns of each trajectory was used for the weighted histogram analysis. The 1D potential of mean force (PMF) for chloride permeation was then estimated using the WHAM package [142] with a convergence tolerance of 10⁻⁶.
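The window layout described above can be checked with a few lines of arithmetic. The sketch below generates the 180 window centers, evaluates the harmonic bias, and estimates the thermal width of the sampled distribution in one window. Two points are assumptions rather than statements from the text: the starting position of the first window (`z_min_nm`) is hypothetical, and the bias is taken in the form ½·k·(z − z0)².

```python
import numpy as np

K_B = 0.0083144626  # Boltzmann constant in kJ mol^-1 K^-1

def umbrella_windows(z_min_nm, n_windows=180, spacing_nm=0.05):
    """Centers of uniformly spaced umbrella windows along z (nm)."""
    return z_min_nm + spacing_nm * np.arange(n_windows)

def bias_energy(z_nm, center_nm, k=6000.0):
    """Harmonic bias 0.5 * k * (z - z0)^2 in kJ/mol (assumed convention)."""
    return 0.5 * k * (z_nm - center_nm) ** 2

def thermal_width(k=6000.0, temperature=303.15):
    """Standard deviation (nm) of z in one window if only the bias acted."""
    return np.sqrt(K_B * temperature / k)

centers = umbrella_windows(z_min_nm=-4.5)       # hypothetical starting point
sigma = thermal_width()                         # about 0.02 nm
neighbor_penalty = bias_energy(centers[1], centers[0])   # 7.5 kJ/mol
```

Under these assumptions the window spacing (0.05 nm) is roughly 2.4 times the thermal width, so histograms of neighboring windows are expected to overlap, which is what the weighted histogram analysis requires.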

12.3.2.3 Free Energy Profiles of Chloride Permeation in the Presence of Ca2+ and Na+

The potential of mean force (PMF) profiles of chloride permeation in the presence of Ca2+ and Na+ are compared to understand how Ca2+ facilitates chloride permeation in bestrophins. As shown in Fig. 12.4a, both PMF profiles have two distinct free energy maxima, corresponding to the Cl− anion permeating through the neck and the aperture. However, when Ca2+ is present, the free energy barriers are considerably lower than those in the presence of Na+, especially for Cl− passing through the hydrophobic gate (z from 1.0 to 3.0 nm). Two peaks have been identified on the PMF profiles in the neck region, which correspond to the Cl− anion located at the pores defined by I76 (peak1) and F80 (peak2). The free energy barrier is lowered by 3 kcal/mol at peak1 (z = 2.5 nm) when Ca2+ is present, whereas it is dramatically reduced, by about 12 kcal/mol, when Cl− crosses peak2 (z = 2.0 nm).

Fig. 12.4
a Potential of mean force (PMF) profiles for Cl permeating along the channel in the presence of Ca2+ (red) and Na+ (green), respectively. b Ion pore along the Best1cryst

As Cl permeates further into the inner cavity from peak2, the PMF profile shows a strong downhill character, with a free energy difference of about 20 kcal/mol. This process corresponds to the Cl anion passing from a partially dehydrated configuration to the fully hydrated state once it enters the inner cavity. While permeating through the inner cavity, the anion faces almost no free energy barrier until it approaches the aperture defined by V205. The free energy barrier is about 5 kcal/mol when Cl passes through the aperture in the presence of Ca2+, whereas it rises to ~13 kcal/mol when Na+ is present.

12.3.2.4 The Energetic Barriers Raised by the Dehydration of Chloride

In Fig. 12.5, the average water coordination numbers along the anion permeation pathway are plotted to understand the origin of the large energetic barriers to Cl permeation. Comparing the water coordination profiles with the PMFs (Fig. 12.4) shows that the free energy maxima arise directly from the partially dehydrated state of Cl. The lower the coordination number, the higher the energetic penalty for Cl permeation, indicating that the hydration state of chloride is strongly correlated with the permeation barriers. In bulk water, the average water coordination number of chloride is 8 when the CHARMM force field [143] is used, and the absolute free energy of hydration of a chloride ion is −77.2 kcal/mol [144]. As shown in Fig. 12.5, Cl exhibits its lowest coordination number, around 3.2, when passing through the hydrophobic gate (z = 1.5–3 nm), which explains the high free energy barrier at the neck region on the PMF profiles. In addition, the water coordination profile of Cl at the neck shows smaller coordination numbers in the presence of Na+ than in the presence of Ca2+, which also explains the higher free energy barrier when Na+ is present. Likewise, at the aperture, Cl exhibits a coordination number of 4 in the presence of Na+, whereas its average coordination number is larger by one when Ca2+ is present, leading to a free energy height of ~7 kcal/mol for Cl permeating the aperture.
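
A water coordination profile of this kind can be computed by counting water oxygens in the first hydration shell of the ion in every frame and averaging the counts in bins along z. The sketch below is a minimal NumPy version under stated assumptions: the cutoff CUTOFF_NM, the orthorhombic box, and the function names are illustrative and are not taken from the text, so the absolute numbers it produces need not match those quoted above.

```python
import numpy as np

CUTOFF_NM = 0.38  # ASSUMPTION: first-shell cutoff; the text does not state one


def coordination_number(ion_xyz, water_o_xyz, box, cutoff=CUTOFF_NM):
    """Count water oxygens within `cutoff` of the ion (orthorhombic PBC)."""
    delta = np.asarray(water_o_xyz, float) - np.asarray(ion_xyz, float)
    box = np.asarray(box, float)
    delta -= box * np.round(delta / box)  # minimum-image convention
    dist = np.linalg.norm(delta, axis=1)
    return int(np.count_nonzero(dist < cutoff))


def coordination_profile(z_values, coord_numbers, bin_edges):
    """Average coordination number in each z bin (NaN for empty bins)."""
    z_values = np.asarray(z_values, float)
    coord_numbers = np.asarray(coord_numbers, float)
    idx = np.digitize(z_values, bin_edges) - 1  # bin index of every frame
    profile = np.full(len(bin_edges) - 1, np.nan)
    for b in range(len(profile)):
        mask = idx == b
        if mask.any():
            profile[b] = coord_numbers[mask].mean()
    return profile
```

In practice, ion_xyz and water_o_xyz would be read frame by frame from the trajectory, and z_values would hold the ion position relative to the protein COM in the same frames.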

Fig. 12.5
Average water coordination numbers of Cl along the permeation pathway in the presence of Ca2+ (red) and Na+ (green)

When Cl enters the inner cavity, it returns to the fully hydrated state with an average coordination number of around 7.5, similar to that found in the extracellular region (z = 3.5–5 nm). The sudden jumps in the water coordination profiles from z = –3 to 0 nm indicate interactions between Cl and the cations in the inner cavity, which reduce the number of water molecules coordinating Cl.

12.3.2.5 Ca2+ Binding Sites Along the Permeation Pathway

Further analysis of the MD trajectories shows that, in addition to binding at the Ca2+ clasp sites, Ca2+ can also bind tightly to the conserved acidic residues (E74, E98, E213 and D203) along the anion permeation channel. Stable binding of Na+ at those sites is not observed, because of the weaker Coulomb interactions between the monovalent cation and the carboxyl groups of glutamate and aspartate. The snapshot in Fig. 12.6 shows the interactions between Ca2+ and the carboxyl groups of these acidic residues along the channel. It is worth noting that the interactions between Ca2+ and E74 of Best1cryst (Q74 in hBEST1), located just above the hydrophobic filter, may play an essential role in gating the channel. Because the region just above the neck is narrow, five Ca2+ ions cannot be accommodated simultaneously at this site to bind the five E74 carboxyl groups of the pentamer. Instead, the binding of Ca2+ at this site adopts a triangular pattern. As shown in Fig. 12.6, only three Ca2+ ions are tightly trapped here: two Ca2+ ions are each grasped by two carboxyl groups, and the remaining Ca2+ is coordinated by the fifth carboxyl group. This binding pattern resolves the steric clash in the narrow region and prevents additional Ca2+ ions from binding. Moreover, having two carboxyl groups grasp one Ca2+ ion not only tightens the binding but also enhances the local concentration of Cl above the neck. Therefore, the permeation of chloride through the neck is facilitated when Ca2+ is present. In contrast, no stable binding of Na+ to the carboxyl group of E74 is identified, which explains the higher free energy barrier for Cl passing through peak1 when Na+ is present.

Fig. 12.6
Snapshot of Ca2+ binding to the conserved acidic residues (E74, E98, E213 and D203) along the channel (left: side view, right: top view). Ca2+ ions are shown as wheat spheres, the acidic residues are shown as van der Waals spheres and the gate residues (I76 and V205) are depicted in CPK. The figure was rendered using VMD [145]

There are three more Ca2+ binding sites in the cytosolic region along the permeation channel. Two of them are located in the inner cavity (E98 and E213) and one is located at the bottom of the protein (D203). Each E98 and E213 of the pentamer traps one Ca2+ ion, leading to a high Ca2+ concentration in the cavity. This may be why the inner cavity of BEST could act as a Ca2+ reservoir that helps accumulate and release Ca2+ from ER stores.

In contrast, the binding of Ca2+ to D203 at the bottom of the protein is less tight. According to the MD trajectories, the Ca2+ ion bound to D203 is frequently exchanged with free Ca2+ in solution, indicating that Ca2+ can be easily released at this site. Moreover, the umbrella sampling simulations show that, as Cl permeates from the inner cavity through the aperture to the bulk solution, a free Ca2+ ion acts as a carrier. The residues E213 and D203 on the two sides of the aperture reduce the transport barrier of the ions, which explains the difference in free energy for Cl passing through the aperture between Ca2+ and Na+. This result is also in line with the hypothesis that bestrophin might conduct chloride as a counter ion for Ca2+ uptake into cytosolic Ca2+ stores.

12.3.2.6 The Binding of Ca2+ Altering the Electrostatic Environment Along the Channel

To understand why the free energy barrier for Cl passing through peak2 decreases so dramatically (by about 12 kcal/mol) when Ca2+ is present, the Adaptive Poisson-Boltzmann Solver (APBS) package [146] in VMD [145] was employed to perform electrostatics calculations in the presence of Ca2+ and Na+, respectively. Figure 12.7 compares the resulting 3D charge densities. The result clearly shows that the presence of Ca2+ radically changes the electrostatic properties along the channel. In the presence of Na+, the extracellular region provides a favorable environment for positively charged ions, thus raising the free energy barrier for Cl passing through the neck. In contrast, the binding of Ca2+ to E74 flips the electrostatic environment around the outer entryway so that it favors negatively charged ions and enhances the local anion concentration, thereby facilitating anion permeation. In the inner cavity, the electrostatic environment has the same character in the presence of Ca2+ and Na+. However, the binding of Ca2+ to E98 and E213 in the inner cavity expands the charge density that favors anions in the cavity (Fig. 12.7b), especially in the region just below the neck. When Na+ is present, this charge density is not seen below the neck. This is key evidence for why the free energy barrier for Cl passing through peak2 is dramatically reduced in the presence of Ca2+.

Fig. 12.7
Three-dimensional charge densities along the anion permeation channel in the presence of Na+ (a) and Ca2+ (b). The positive charge densities are depicted in red using a charge density isovalue of +0.5 and the negative charge densities are depicted in blue using a charge density isovalue of −1.2

At the bottom of the protein below the aperture, a small volume of positive charge density can still be identified when Na+ is present, indicating repulsion of anions there. This result is consistent with the PMF profiles, in which Cl experiences a higher energetic barrier when passing through the aperture in the presence of Na+.

12.4 Protocol

In this section, we briefly introduce the procedure for using the CHARMM-GUI interface to build membrane protein systems for MD simulations.

CHARMM-GUI (http://www.charmm-gui.org) [102] is a web-based graphical user interface for preparing complex biomolecular systems for molecular dynamics simulations. Since it was first announced in 2006, its capabilities have been steadily extended, and it now contains a number of modules designed to set up a broad range of simulations [147].

One of the most prominent features of CHARMM-GUI is that the interface provides input files for most major MD simulation engines, such as CHARMM, NAMD, GROMACS, AMBER, LAMMPS, Desmond and OpenMM, and helps users to build a sophisticated membrane/protein system easily and interactively. Here, as shown in Fig. 12.8, we illustrate the use of the Membrane Builder module to generate a protein/membrane system in six consecutive steps.

Fig. 12.8
A schematic workflow of the six subsequent steps

12.4.1 PDB Reading

Reading a PDB file is generally considered the first hurdle in starting a simulation project. Many PDB files, especially those of membrane proteins, lack residues in loop regions or contain mutations introduced to facilitate crystallization. Therefore, before uploading the PDB file in step 1, users are strongly advised to convert the PDB file into sequence format (for example, using pdb2fasta) and then run BLAST [148] on the sequence at NCBI to check the sequence completeness of the structure. Missing loop regions can be completed and mutations reverted using a homology modeling package such as MODELLER [149].
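
The completeness check described above can be approximated locally before running a sequence search. The following Python sketch extracts the one-letter sequence of a chain from the ATOM records and reports gaps in the residue numbering. It is a simplified stand-in written for illustration, not the pdb2fasta tool itself: the function names are hypothetical, only the first model and the 20 standard amino acids are handled, and insertion codes are ignored. A numbering gap only hints at a missing segment, and mutations can be detected only by comparing the sequence with the reference entry.

```python
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def read_chain_residues(pdb_lines, chain_id):
    """Return ordered (residue number, residue name) pairs for one chain.

    Only C-alpha atoms in ATOM records of the first model are used.
    """
    residues, seen = [], set()
    for line in pdb_lines:
        if line.startswith("ENDMDL"):
            break  # stop after the first model
        if not line.startswith("ATOM"):
            continue
        if line[12:16].strip() != "CA" or line[21] != chain_id:
            continue
        resseq = int(line[22:26])
        if resseq not in seen:  # skip alternate locations of the same residue
            seen.add(resseq)
            residues.append((resseq, line[17:20].strip()))
    return residues


def to_sequence(residues):
    """One-letter sequence; non-standard residues are written as 'X'."""
    return "".join(THREE_TO_ONE.get(name, "X") for _, name in residues)


def numbering_gaps(residues):
    """Return (first_missing, last_missing) for every numbering gap."""
    gaps = []
    for (prev, _), (curr, _) in zip(residues, residues[1:]):
        if curr - prev > 1:
            gaps.append((prev + 1, curr - 1))
    return gaps
```

A typical use would be to call read_chain_residues on the lines of the structure file for each chain, submit to_sequence(...) to the sequence search, and inspect numbering_gaps(...) to locate loops that need to be rebuilt.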

In addition, disulfide bonds, different protonation states of titratable residues, and other post-translational modifications (such as phosphorylation, glycosylation, and lipid-tail linkers) can be handled in this step using PDB Reader and Manipulator.

12.4.2 Orient Protein

Generally, the PDB file of a membrane protein does not contain proper information on its position relative to a membrane bilayer. In Membrane Builder, users can place the protein appropriately in a lipid bilayer by aligning its principal axis, or a vector between two specific C-alpha atoms, with the membrane normal. It is assumed that the membrane normal is parallel to the Z-axis and that the membrane center is located at Z = 0 Å. Users can either upload their own pre-oriented structure, prepared with an external tool such as the orient package in VMD [145], or specify the entry ID of a database (PDB database [150] or OPM database [151]). Protein structures from the OPM database are pre-oriented, so users do not need to modify the protein orientation.
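
The alignment of a vector between two atoms with the membrane normal can be sketched with Rodrigues' rotation formula. The code below is an illustrative NumPy version and does not reproduce the Membrane Builder implementation; the function names are hypothetical, and shifting the unweighted geometric center of all atoms to Z = 0 is a simplifying assumption, since the appropriate offset depends on where the hydrophobic belt of the protein lies.

```python
import numpy as np


def rotation_to_z(vector):
    """Rotation matrix R such that R @ unit(vector) = (0, 0, 1)."""
    v = np.asarray(vector, float)
    v = v / np.linalg.norm(v)
    z = np.array([0.0, 0.0, 1.0])
    axis = np.cross(v, z)
    s = np.linalg.norm(axis)  # sine of the rotation angle
    c = float(np.dot(v, z))   # cosine of the rotation angle
    if s < 1e-12:
        # Already parallel (identity) or antiparallel (180 degree turn about X)
        return np.eye(3) if c > 0 else np.diag([1.0, -1.0, -1.0])
    k = axis / s
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + s * K + (1.0 - c) * (K @ K)  # Rodrigues' formula


def orient(coords, atom_a, atom_b):
    """Align the vector atom_a -> atom_b with +Z and center the system at Z = 0.

    ASSUMPTION: the unweighted geometric center of all atoms is moved to Z = 0.
    """
    coords = np.asarray(coords, float)
    R = rotation_to_z(coords[atom_b] - coords[atom_a])
    rotated = coords @ R.T
    rotated[:, 2] -= rotated[:, 2].mean()
    return rotated
```

Here coords is an (N, 3) array of atomic positions and atom_a and atom_b are the array indices of the two selected C-alpha atoms.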

12.4.3 Determine System Size

As of 2016, Membrane Builder supported 295 lipid types in the context of the CHARMM additive force field, including phosphoinositides, cardiolipin, sphingolipids, bacterial lipids, sterols, and fatty acids [102].

After alignment in the previous step, the cross-sectional area of the protein along the Z-axis is calculated, and the protein areas in the top and bottom lipid leaflets are used to determine the system size. Users can specify the type and the number of lipid molecules to build a homogeneous or heterogeneous system. If a user specifies the number of lipid molecules in a bilayer, the system size in XY is determined from the specified ratio of the X and Y dimensions. It is recommended to use the same X and Y lengths unless users have specific reasons to do otherwise. Because a membrane is allowed to have different types and amounts of lipid molecules in the lower and upper leaflets, the resulting lipid bilayer may have a different XY size for each leaflet. To avoid such situations, users cannot proceed to the next step until the difference in area between the leaflets is smaller than the smallest surface area among the lipid molecules used in the bilayer. The size along the Z-axis is then determined by specifying the thickness of bulk water beyond the protein extent along Z. For membrane proteins or peptides that do not span the bilayer, the size along Z is determined by the specified water thickness measured from Z = ±20 Å, which is approximately the position of the lipid headgroups.
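
The sizing rules in this step reduce to simple arithmetic, sketched below in Python. The function names and the dictionary layout are hypothetical, and any per-lipid surface areas passed in are user-supplied inputs rather than values taken from Membrane Builder. The sketch assumes that a leaflet area is the protein cross-section plus the sum of the lipid surface areas, and it applies the leaflet mismatch rule and the ±20 Å convention as stated in the text.

```python
import math


def leaflet_area(lipid_counts, area_per_lipid, protein_area):
    """Leaflet area (A^2): protein cross-section plus the lipid surface areas."""
    return protein_area + sum(
        count * area_per_lipid[name] for name, count in lipid_counts.items()
    )


def leaflets_compatible(upper_area, lower_area, lipid_areas_used):
    """Rule from the text: the leaflet area difference must be smaller than
    the smallest surface area among the lipid types used."""
    return abs(upper_area - lower_area) < min(lipid_areas_used)


def box_xy(area, xy_ratio=1.0):
    """X and Y box lengths (A) for a given area and X/Y ratio (1.0 = square)."""
    y = math.sqrt(area / xy_ratio)
    return xy_ratio * y, y


def box_z(protein_zmin, protein_zmax, water_thickness, spans_bilayer=True):
    """Box length along Z (A): reference extent plus water on both sides.

    For molecules that do not span the bilayer, the reference extent is
    Z = -20 to +20 A, as stated in the text.
    """
    if not spans_bilayer:
        protein_zmin, protein_zmax = -20.0, 20.0
    return (protein_zmax - protein_zmin) + 2.0 * water_thickness
```

For example, a leaflet area of 10000 Å^2 with the recommended ratio of 1.0 gives a 100 Å × 100 Å box in XY, and a protein spanning Z = −30 to 40 Å with a water thickness of 22.5 Å gives a box length of 115 Å along Z.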

12.4.4 Build Components

In this step, Membrane Builder generates the individual components needed to fully solvate the protein, including the lipid bilayer, bulk water, and counter ions. Any complex (homogeneous or heterogeneous) bilayer system can be generated by the so-called “replacement method”, which first packs lipid-like pseudo atoms and then replaces them with lipid molecules one at a time, each randomly selected from a lipid structural library. The replacement method generates well-packed lipid molecules around a protein. Membrane Builder also provides an insertion method, which is limited to building homogeneous bilayer systems.

If the ion concentration is specified, the numbers of ions are determined from the ion-accessible volume, and the total charge of the system is neutralized. The initial configuration of the ions is then determined through Monte Carlo simulations using a primitive model, i.e., scaled Coulombic and van der Waals interactions.
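
The conversion from a target concentration to a number of ions follows from N = c × N_A × V. The Python sketch below illustrates it for a monovalent 1:1 salt; the function names are hypothetical, and the way neutralizing counter ions are added on top of the salt pairs is an assumed convention that may differ from the procedure used by Membrane Builder.

```python
AVOGADRO = 6.02214076e23            # particles per mole
LITRES_PER_CUBIC_ANGSTROM = 1.0e-27  # 1 A^3 = 1e-27 L


def salt_pairs(concentration_molar, volume_A3):
    """Number of cation/anion pairs for a given concentration (mol/L)
    and ion-accessible volume (A^3), rounded to the nearest integer."""
    moles = concentration_molar * volume_A3 * LITRES_PER_CUBIC_ANGSTROM
    return round(moles * AVOGADRO)


def ion_counts(concentration_molar, volume_A3, system_charge):
    """Return (n_cations, n_anions) for a monovalent 1:1 salt.

    ASSUMPTION: the system charge is neutralized by adding extra
    counter ions on top of the salt pairs.
    """
    pairs = salt_pairs(concentration_molar, volume_A3)
    n_cations, n_anions = pairs, pairs
    if system_charge < 0:
        n_cations += -system_charge  # extra cations balance a negative solute
    else:
        n_anions += system_charge    # extra anions balance a positive solute
    return n_cations, n_anions
```

With an ion-accessible volume of 1.0 × 10^6 Å^3 and a concentration of 0.15 M, salt_pairs returns 90, so a solute carrying a net charge of −5 would receive 95 cations and 90 anions under this convention.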

12.4.5 Assembly

The components generated in the previous step are assembled here; this procedure takes minutes to hours depending on the system size. One of the most significant advantages of the web environment is that, if a problem is found, users can go back and regenerate the whole system before closing the browser. Visualizing the initially assembled structure is therefore important for verifying that the system is reasonable.

12.4.6 Equilibration

After assembly is complete, equilibration must be performed to relax the uncorrelated initial system before production MD simulations. Membrane Builder provides six consecutive input files for widely used MD simulation engines such as CHARMM, NAMD, GROMACS, AMBER and OpenMM. To ensure gradual equilibration of the initially assembled system, various restraints are applied to the protein, water, ions, and lipid molecules during the equilibration: (1) harmonic restraints on the ions and the heavy atoms of the protein, (2) repulsive planar restraints to prevent water from entering the hydrophobic region of the membrane, and (3) planar restraints to hold the positions of the lipid head groups along the Z-axis. These restraint forces are slowly reduced as the equilibration progresses. To ensure successful equilibration, i.e., to avoid instability of the dynamics integration, NVT dynamics (constant volume and temperature) is used for the first and second steps with an integration time step of 1 fs, and NPAT dynamics (constant pressure, area, and temperature) is used for the remaining steps.