Introduction

Molecular dynamics (MD) simulations in their different flavors are widely used in a large variety of research areas of computational physics and chemistry. They represent a powerful tool to study the motion of atoms in molecules, liquids, and solids. The term MD typically refers to the propagation of point particles – atomic nuclei or effective particles combining several nuclei – according to the laws of classical mechanics. In particular, the forces acting on the particles are calculated “on the fly” only at discrete points along the trajectory. Following this definition, we discuss in this chapter ab initio MD (AIMD), i.e., MD in which the atomic forces are calculated from first principles; classical atomistic MD using analytical empirical interaction potentials (force-fields), sometimes referred to as force-field molecular dynamics; and coarse grain MD using analytical empirical potentials between effective particles representing groups of atoms. We exclude methods which go beyond classical nuclei, such as path integral MD (Tuckerman 2002; Tuckerman and Hughes 1998; Tuckerman et al. 1993) and wavepacket dynamics (Balint-Kurti 2008; Worth et al. 2008), or beyond the Born–Oppenheimer approximation (Doltsinis and Marx 2002a,b). This overview, furthermore, leaves out the vast area of semi-empirical methods (see for instance Bredow and Jug (2005) for a recent review), including self-consistent charge density functional tight-binding (SCC-DFTB) (Elstner et al. 1998) and empirical valence-bond (EVB) theory (Aqvist and Warshel 1993; Shurki and Warshel 2003; Warshel 1991, 2003).

The aim of this chapter is to offer practical guidance on how to choose the appropriate technique for a particular physical problem, how to set up a simulation, and how to analyze and visualize the output. In addition it should provide the theoretical background required to become a competent user of the available simulation software packages.

Choosing the Right Method

When choosing which type of molecular dynamics simulation to perform, it is important to understand the capabilities of each technique. The differences among the various methods come down to the level of detail with which each one models a physical system.

The most detailed technique is ab initio (quantum) molecular dynamics, which explicitly models the electrons of the particles within the system. Force-field molecular dynamics simulations, in contrast, model the interactions between the nuclei of the particles within the system and therefore do not explicitly treat the electrons. The method that incorporates the least amount of detail is coarse grain molecular dynamics, in which multiple particles are grouped together and represented by a single interaction “bead.”

Therefore, quantum molecular dynamics simulations generate the most detailed description of interatomic interactions, as electrons are the basis of all such interactions. Quantum simulations allow certain phenomena, such as electron transport within a system, to be modeled, which cannot be captured in force-field or coarse grain molecular dynamics simulations because these do not explicitly model electrons. Also, in order to model chemical reactions, quantum simulations are the most accurate approach (note: force-field and coarse-grain molecular dynamics simulations have modeled the formation and breaking of bonds, but some a priori knowledge must then be included in the model to allow the reaction to take place). The major limitation of quantum simulations is that they are very computationally intensive, which restricts them to small system sizes (∼10² particles) and short times (∼10⁻¹² s). Thus the systems that can be modeled are limited to small molecules or portions of larger molecules (e.g., specific amino acids within a protein).

Force-field molecular dynamics simulations offer the ability to model molecules at the particle level. Often, information from quantum simulations is used to develop the empirical equations (the force-field) that govern the interactions between particles. Because force-field molecular dynamics simulations use less detail than quantum simulations, they are able to model systems that are significantly larger in size (∼10⁶ particles) for longer periods of time (<10⁻⁶ s). Therefore, measuring the structural, mechanical, and/or transport properties of medium to large sized systems (e.g., proteins or functionalized nanoparticles) is possible.

Finally, coarse grain molecular dynamics simulations reduce the number of degrees of freedom within the simulated system even further by grouping several atoms into one interaction bead. Therefore, even larger system sizes and times (on the order of seconds) are accessible via these simulations. Several of the same properties measured via force-field molecular dynamics simulations can be measured with coarse grain molecular dynamics simulations (i.e., structural, mechanical, and transport properties). However, due to the reduced detail in the models of the molecules, it is not possible to investigate specific chemical interactions within a system, such as hydrogen bonding.

Once you have chosen the appropriate method for the particular system and property to be investigated, the next choice is what simulation package to use. For classical MD simulations, there are several free molecular dynamics packages that can be found on the web including DL_POLY (Smith et al. 2002; Todorov and Smith 2009), GROMACS (van der Spoel et al. 2005a,b), HOOMD (Anderson et al. 2008; HOOMD 2009), LAMMPS (LAMMPS 2010; Plimpton 1995), MOLDY (Refson 2000, 2001), and NAMD (Bhandarkar et al. 2009; Phillips et al. 2005b), and there are also commercial packages including AMBER (Case et al. 2005, 2008), CHARMM (Brooks et al. 2009; CHARMM 2009), and GROMOS (GROMOS 2007; Scott et al. 1999). Generally, these codes can be divided into those that are mostly used for simulations of biological systems (AMBER, CHARMM, GROMACS, GROMOS, NAMD) and those that are more general simulation packages (HOOMD, LAMMPS, MOLDY). When choosing between these options, an important criterion is to choose a code that you feel comfortable using. Beyond comfort, another aspect to take into consideration is that packages differ in the features they offer and in the additional tools they provide for analysis (lists of analysis tools can usually be found in the packages’ documentation).

For AIMD simulations, the user may choose from a large number of codes, for instance, ABINIT (2010; Aulbur et al. 2000), CASTEP (2009; Clark et al. 2005; Segall et al. 2002), CONQUEST (2009; Bowler et al. 2006), CP2K (Hutter et al. 2009; VandeVondele et al. 2005, 2006), CPMD (Marx and Hutter 2000, 2009; Parrinello et al. 2008), CP-PAW (2006; Blochl 1994; Blochl et al. 2003), DACAPO (2006), FHI98md (2002; Bockstedte et al. 1997), NWChem (2008; Kendall et al. 2000), ONETEP (2005; Skylaris et al. 2005), PINY (2005), PWscf (2009; Giannozzi et al. 2009), QuantumEspresso (2009; Giannozzi et al. 2009), SIESTA (2010; Artacho et al. 2008; Soler et al. 2002), S/PHI/nX (2009; Boeck 2009), or VASP (2009; Kresse and Furthmüller 1996).

Theoretical Background

Born–Oppenheimer Approximation

Let us begin by introducing our nomenclature and by reviewing some well-known basic relations within the Schrödinger formulation of quantum mechanics. A complete, nonrelativistic, description of a dynamic system of N atoms having the positions \(\mathbf{R} =\{{ \mathbf{R}}_{1},{\mathbf{R}}_{2},\ldots,{\mathbf{R}}_{I},\ldots,{\mathbf{R}}_{N}\}\) with n electrons located at \(\mathbf{r} =\{{ \mathbf{r}}_{1},{\mathbf{r}}_{2},\ldots,{\mathbf{r}}_{i},\ldots,{\mathbf{r}}_{n}\}\) would involve solving the time-dependent Schrödinger equation

$$\mathcal{H}\Phi (\mathbf{r},\mathbf{R};t) = i \hbar \frac{\partial } {\partial t}\Phi (\mathbf{r},\mathbf{R};t),$$
(7.1)

with the total Hamiltonian

$$\mathcal{H}(\mathbf{r,R}) = \mathcal{T} (\mathbf{R}) + \mathcal{T} (\mathbf{r}) + {\mathcal{V}}_{\mathrm{nn}}(\mathbf{R}) + {\mathcal{V}}_{\mathrm{ne}}(\mathbf{r,R}) + {\mathcal{V}}_{\mathrm{ee}}(\mathbf{r}),$$
(7.2)

being the sum of kinetic energy of the atomic nuclei,

$$\mathcal{T} (\mathbf{R}) = -\frac{ \hbar^{2}} {2} {\sum \limits _{I=1}^{N}\frac{{\nabla }_{I}^{2}} {{M}_{I}}},$$
(7.3)

kinetic energy of the electrons,

$$\mathcal{T} (\mathbf{r}) = - \frac{ \hbar^{2}} {2{m}_{e}}{ \sum \limits _{i=1}^{n}}{\nabla }_{ i}^{2},$$
(7.4)

internuclear repulsion,

$${\mathcal{V}}_{\mathrm{nn}}(\mathbf{R}) = \frac{{e}^{2}} {4\pi {\epsilon }_{0}}{ \sum \limits _{I=1}^{N-1}}{ \sum \limits _{J>I}^{N}} \frac{{Z}_{I}{Z}_{J}} {\vert {\mathbf{R}}_{I} -{\mathbf{R}}_{J}\vert },$$
(7.5)

electronic–nuclear attraction,

$${\mathcal{V}}_{\mathrm{ne}}(\mathbf{r,R}) = - \frac{{e}^{2}} {4\pi {\epsilon }_{0}}{ \sum \limits _{I=1}^{N}}{ \sum \limits _{i=1}^{n}} \frac{{Z}_{I}} {\vert {\mathbf{r}}_{i} -{\mathbf{R}}_{I}\vert },$$
(7.6)

and interelectronic repulsion,

$${\mathcal{V}}_{\mathrm{ee}}(\mathbf{r}) = \frac{{e}^{2}} {4\pi {\epsilon }_{0}}{ \sum \limits _{i=1}^{n-1}}{ \sum \limits _{j>i}^{n}} \frac{1} {\vert {\mathbf{r}}_{i} -{\mathbf{r}}_{j}\vert }.$$
(7.7)

Here, M I and Z I denote the mass and atomic number of nucleus I; m e and e are the electronic mass and elementary charge, and ε0 is the permittivity of vacuum. The nabla operators ∇ I and ∇ i act on the coordinates of nucleus I and electron i, respectively. The total wavefunction \(\Phi (\mathbf{r},\mathbf{R};t)\) simultaneously describes the motion of both electrons and nuclei.

The Born–Oppenheimer approximation (Doltsinis and Marx 2002b; Kołos 1970; Kutzelnigg 1997) separates nuclear and electronic motion based on the assumption that the much faster electrons adjust their positions instantaneously to the comparatively slow changes in nuclear positions. The electronic problem is then reduced to the time-independent (electronic) Schrödinger equation for clamped nuclei,

$${\mathcal{H}}_{\mathrm{el}}(\mathbf{r};\mathbf{R}){\Psi }_{k}(\mathbf{r};\mathbf{R}) = {E}_{k}(\mathbf{R}){\Psi }_{k}(\mathbf{r};\mathbf{R}),$$
(7.8)

where \({\mathcal{H}}_{\mathrm{el}}(\mathbf{r};\mathbf{R})\) is the electronic Hamiltonian,

$${\mathcal{H}}_{\mathrm{el}}(\mathbf{r,R}) = \mathcal{T} (\mathbf{r}) + {\mathcal{V}}_{\mathrm{nn}}(\mathbf{R}) + {\mathcal{V}}_{\mathrm{ne}}(\mathbf{r,R}) + {\mathcal{V}}_{\mathrm{ee}}(\mathbf{r}),$$
(7.9)

and \({\Psi }_{k}(\mathbf{r};\mathbf{R})\) is the electronic wavefunction of state k. Meanwhile, nuclear motion is described by

$$\left [\mathcal{T} (\mathbf{R}) + {E}_{k}(\mathbf{R})\right ]{\chi }_{k} = i \hbar\frac{\partial } {\partial t}{\chi }_{k}$$
(7.10)

with the nuclear wavefunction \({\chi }_{k}(\mathbf{R},t)\) evolving on the potential energy surface \({E}_{k}(\mathbf{R})\) of the electronic state k. The total wavefunction is then the direct product of the electronic and the nuclear wavefunction,

$$\Phi (\mathbf{r},\mathbf{R};t) = {\Psi }_{k}(\mathbf{r},\mathbf{R}){\chi }_{k}(\mathbf{R},t)$$
(7.11)

In the classical limit (Doltsinis and Marx 2002b), the nuclear wave equation (7.10) is replaced by Newton’s equation of motion

$${M}_{I}{\ddot{\mathbf{R}}}_{I} = -{\nabla }_{I}{E}_{k}$$
(7.12)

For a great number of physical situations, the Born–Oppenheimer approximation can be safely applied. On the other hand, there are many important chemical phenomena such as charge transfer and photoisomerization reactions, whose very existence is due to the inseparability of electronic and nuclear motion. Inclusion of nonadiabatic effects is beyond the scope of this chapter and the reader is referred to the literature (e.g., Doltsinis 2006; Doltsinis and Marx 2002b) for more details.

The above approximations form the basis of conventional molecular dynamics, with Eqs. 7.12 and 7.8 being the working equations. Thus, in principle, a classical trajectory calculation merely amounts to integrating Newton’s equations of motion (7.12). In practice, however, this deceptively simple task is complicated by the fact that the stationary Schrödinger equation (7.8) cannot be solved exactly for any many-electron system. The potential energy surface therefore has to be approximated using ab initio electronic structure methods or empirical interaction potentials (so-called force-field molecular dynamics; Sutmann 2002; Allen and Tildesley 1987). The former approach, usually referred to as ab initio molecular dynamics (AIMD), will be the subject of section “Ab Initio Molecular Dynamics,” while the latter – force-field molecular dynamics – will be discussed in section “Classical Molecular Dynamics.”

Ab Initio Molecular Dynamics

In the following, we shall focus on first principles molecular dynamics methods. Due to the high computational cost associated with ab initio electronic structure calculations of large molecules, computation of the entire potential energy surface prior to the molecular dynamics simulation is best avoided. A more efficient alternative is the evaluation of electronic energy and nuclear forces “on the fly” at each step along the trajectory.

Born–Oppenheimer Molecular Dynamics

In the so-called Born–Oppenheimer implementation of such a scheme (Marx and Hutter 2000), the nuclei are propagated by integration of Eq. 7.12, where the exact energy E k is replaced with the eigenvalue, \(\tilde{{E}}_{k}\), of some approximate electronic Hamiltonian, \(\tilde{{\mathcal{H}}}_{\mathrm{el}}\), which is calculated at each time step. For the electronic ground state, i.e., k = 0, the use of Kohn–Sham (KS) density functional theory (Dreizler and Gross 1990; Parr and Yang 1989) has become increasingly popular.

Car–Parrinello Molecular Dynamics

In order to further increase computational efficiency, Car and Parrinello have introduced a technique to bypass the need for wavefunction optimization at each molecular dynamics step (Car and Parrinello 1985; Marx and Hutter 2000). Instead, the molecular wavefunction is dynamically propagated along with the atomic nuclei according to the equations of motion

$${M}_{I}{\ddot{\mathbf{R}}}_{I} = -{\nabla }_{I}\langle {\Psi }_{k}\vert \tilde{{\mathcal{H}}}_{\mathrm{el}}\vert {\Psi }_{k}\rangle$$
(7.13)
$${\mu }_{i}\ddot{{\psi }}_{i} = - \frac{\delta } {\delta {\psi }_{i}^{\star }}\langle {\Psi }_{k}\vert \tilde{{\mathcal{H}}}_{\mathrm{el}}\vert {\Psi }_{k}\rangle +{ \sum \limits _{j}}{\lambda }_{ij}{\psi }_{j},$$
(7.14)

where the KS one-electron orbitals ψ i are kept orthonormal by the Lagrange multipliers λ ij . These are the Euler–Lagrange equations

$$\frac{\mathrm{d}} {\mathrm{d}t} \frac{\partial \mathcal{L}} {\partial \dot{q}} = \frac{\partial \mathcal{L}} {\partial q},\quad (q ={ \mathbf{R}}_{I},\;{\psi }_{i}^{\star })$$
(7.15)

for the Car–Parrinello Lagrangian (Car and Parrinello 1985)

$$\mathcal{L} ={ \sum \limits _{I}}\frac{1} {2}{M}_{I}\dot{{\mathbf{R}}}_{I}^{2} +{ \sum \limits _{i}}\frac{1} {2}{\mu }_{i}\langle \dot{{\psi }}_{i}\vert \dot{{\psi }}_{i}\rangle -\langle {\Psi }_{k}\vert \tilde{{\mathcal{H}}}_{\mathrm{el}}\vert {\Psi }_{k}\rangle +{ \sum \limits _{ij}}{\lambda }_{ij}(\langle {\psi }_{i}\vert {\psi }_{j}\rangle - {\delta }_{ij})$$
(7.16)

that is formulated here for an arbitrary electronic state Ψ k , an arbitrary electronic Hamiltonian \(\tilde{{\mathcal{H}}}_{\mathrm{el}}\), and an arbitrary basis (i.e., without invoking the Hellmann–Feynman theorem).

Classical Molecular Dynamics

First-principles molecular dynamics simulations treat the electrons in a system explicitly, which greatly increases the number of particles that must be considered and makes the calculations significantly more time-consuming. Classical molecular dynamics ignores electronic motion and calculates the energy of a system as a function of the nuclear positions only; it is therefore used to simulate larger, less detailed systems over longer timescales. The successive configurations of the system are generated by solving the differential equations that constitute Newton’s second law (Eq. 7.12):

$$\frac{{d}^{2}{X}_{I}} {d{t}^{2}} = \frac{{F}_{{X}_{I}}} {{M}_{I}}$$
(7.17)

This equation describes the motion of a particle of mass \(M_I\) along one dimension (\(X_I\)), where \(F_{X_I}\) is the force on the particle in that dimension. The solution of these differential equations results in a trajectory that specifies how the positions and velocities of the particles in the system vary with time.

In realistic models of intermolecular interactions, the force on particle I changes whenever particle I changes its position or whenever another particle with which it interacts changes its position. The motions of all the particles are therefore coupled together, resulting in a many-body problem that cannot be solved analytically. Instead, finite difference methods are used to integrate the equations of motion.

The integration of Eq. 7.17 is broken into consecutive steps carried out at times t separated by increments δt, generally referred to as the time step. First, the total force on each particle in the system at time t is calculated as the vector sum of its interactions with the other particles.

Then, assuming the force is constant over the course of the time step, the accelerations of the particles are calculated, which are then combined with positions and velocities of the particles at time t to determine the positions and velocities at time t + δt. Finally, the forces on the particles in their new positions are determined, and then new accelerations, positions, and velocities are determined at t + 2δt and so on.

The various finite difference methods used to integrate the equations of motion in classical molecular dynamics simulations share a common assumption: the positions, velocities, and accelerations (as well as all other dynamic properties) can be approximated by Taylor series expansions:

$$\mathbf{R}(t + \delta t) = \mathbf{R}(t) + \delta t\mathbf{V}(t) + \frac{1} {2}\delta {t}^{2}\mathbf{A}(t) + \frac{1} {6}\delta {t}^{3}\mathbf{B}(t) + \frac{1} {24}\delta {t}^{4}\mathbf{C}(t) + \ldots $$
(7.18)
$$\mathbf{V}(t + \delta t) = \mathbf{V}(t) + \delta t\mathbf{A}(t) + \frac{1} {2}\delta {t}^{2}\mathbf{B}(t) + \frac{1} {6}\delta {t}^{3}\mathbf{C}(t) + \ldots $$
(7.19)
$$\mathbf{A}(t + \delta t) = \mathbf{A}(t) + \delta t\mathbf{B}(t) + \frac{1} {2}\delta {t}^{2}\mathbf{C}(t) + \ldots $$
(7.20)

where R is the position, V is the velocity, A is the acceleration, and B and C are the third and fourth derivatives of the positions with respect to time, respectively.

Verlet Algorithm

One of the most widely used finite difference methods in classical molecular dynamics simulations is the Verlet algorithm (Verlet 1967). In the Verlet algorithm, the positions and accelerations at time t and the positions from the previous time step \(\mathbf{R}(t - \delta t)\) are used to calculate the updated positions \(\mathbf{R}(t + \delta t)\) using the equation:

$$\mathbf{R}(t + \delta t) = 2\mathbf{R}(t) -\mathbf{R}(t - \delta t) + \delta {t}^{2}\mathbf{A}(t).$$
(7.21)

While the velocities do not explicitly appear in Eq. 7.21, they can be calculated from the difference in position over the entire time step:

$$\mathbf{V}(t) = \frac{\mathbf{R}(t + \delta t) -\mathbf{R}(t - \delta t)} {2\delta t}$$
(7.22)

or the difference in position over a half time step (\(t + \frac{1} {2}\delta t\)):

$$\mathbf{V}(t + \frac{1} {2}\delta t) = \frac{\mathbf{R}(t + \delta t) -\mathbf{R}(t)} {\delta t}$$
(7.23)

The fact that the velocities do not appear explicitly in the Verlet algorithm is one of its drawbacks: no velocities are available until the positions at the next time step have been determined. Also, in order to calculate the positions of the particles at t = δt, it is necessary to provide the positions at \(t = -\delta t\), since the algorithm requires the position at time t − δt to calculate the position at time t + δt. This is usually handled by using the Taylor series to calculate \(\mathbf{R}(-\delta t) = \mathbf{R}(0) - \delta t\mathbf{V}(0) + \frac{1} {2}\delta {t}^{2}\mathbf{A}(0) + \ldots \). A final drawback of the Verlet algorithm is a possible loss of precision in the resulting trajectories, which arises because the positions are calculated by adding a small term (\(\delta {t}^{2}\mathbf{A}(t)\)) to the difference of two much larger terms (2R(t) and \(\mathbf{R}(t - \delta t)\)) in Eq. 7.21.
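
As a minimal illustration of the scheme above, the following Python sketch (the function names and the use of NumPy arrays are our own choices, not part of any particular MD package) advances the positions with Eq. 7.21 and recovers the velocities with Eq. 7.22:

import numpy as np

def verlet_step(r, r_prev, a, dt):
    # Eq. 7.21: R(t + dt) = 2 R(t) - R(t - dt) + dt^2 A(t)
    return 2.0 * r - r_prev + dt**2 * a

def verlet_velocity(r_next, r_prev, dt):
    # Eq. 7.22: central-difference velocity at time t
    return (r_next - r_prev) / (2.0 * dt)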

“Leap-Frog” Algorithm

In an attempt to improve upon the original Verlet algorithm, several variations have been developed. The leap-frog algorithm (Hockney 1970) is one of the variations that uses the following equations to update the positions:

$$\mathbf{R}(t + \delta t) = \mathbf{R}(t) + \delta t\mathbf{V}(t + \frac{1} {2}\delta t),$$
(7.24)

and the velocities:

$$\mathbf{V}(t + \frac{1} {2}\delta t) = \mathbf{V}(t -\frac{1} {2}\delta t) + \delta t\mathbf{A}(t).$$
(7.25)

In the leap-frog algorithm, the velocities \(\mathbf{V}(t + \frac{1} {2}\delta t)\) are first calculated from the velocities at time \(t -\frac{1} {2}\delta t\) and the accelerations at time t using Eq. 7.25. Then the positions \(\mathbf{R}(t + \delta t)\) are calculated from the velocities \(\mathbf{V}(t + \frac{1} {2}\delta t)\) and the positions R(t) using Eq. 7.24. The algorithm gets its name from the fact that the velocities are calculated in a manner such that they “leap-frog” over the positions to give their values at \(t + \frac{1} {2}\delta t\). Then the positions are calculated such that they “leap-frog” over the velocities, and the algorithm continues in this way.

The “leap-frog” algorithm improves upon the standard Verlet algorithm in that the velocity is explicitly included in the calculations, and it does not require taking the difference of large numbers, so the precision of the calculation should be improved. However, because the calculated velocities and positions are not synchronized in time, the kinetic energy contribution to the total energy cannot be evaluated at the time for which the positions are defined. In response to this shortcoming, the velocities at time t can be estimated from

$$\mathbf{V}(t) = \frac{[\mathbf{V}(t + \frac{\delta t} {2} ) + \mathbf{V}(t -\frac{\delta t} {2} )]} {2}$$
(7.26)
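
A compact Python sketch of one leap-frog step, under the same assumptions as the Verlet example above (our own function names, NumPy arrays for positions and velocities), reads:

import numpy as np

def leapfrog_step(r, v_half, a, dt):
    # Eq. 7.25: advance the half-step velocity from t - dt/2 to t + dt/2
    v_half_new = v_half + dt * a
    # Eq. 7.24: advance the positions from t to t + dt
    r_new = r + dt * v_half_new
    return r_new, v_half_new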

Velocity Verlet Algorithm

The velocity Verlet method (Swope et al. 1982), which is a variation of the standard Verlet method, calculates the positions, velocities, and accelerations at the same time by using the following equations:

$$\mathbf{R}(t + \delta t) = \mathbf{R}(t) + \delta t\mathbf{V}(t) + \frac{1} {2}\delta {t}^{2}\mathbf{A}(t)$$
(7.27)
$$\mathbf{V}(t + \delta t) = \mathbf{V}(t) + \frac{1} {2}\delta t[\mathbf{A}(t) + \mathbf{A}(t + \delta t)].$$
(7.28)

The velocity Verlet method is a three-stage algorithm because the calculation of the new velocities (Eq. 7.28) requires both the acceleration at time t and at time t + δt. Therefore, first, the positions at t + δt are calculated using Eq. 7.27 and the velocities and accelerations at time t. The velocities at time \(t + \frac{1} {2}\delta t\) are then calculated using

$$\mathbf{V}(t + \frac{1} {2}\delta t) = \mathbf{V}(t) + \frac{1} {2}\delta t\mathbf{A}(t).$$
(7.29)

The forces are then computed at the new positions, from which A(t + δt) is obtained. The final step is to calculate the velocities at time t + δt using

$$\mathbf{V}(t + \delta t) = \mathbf{V}(t + \frac{1} {2}\delta t) + \frac{1} {2}\delta t\mathbf{A}(t + \delta t).$$
(7.30)

The velocity Verlet algorithm therefore yields velocities and positions that are synchronized in time, so the kinetic energy contribution to the total energy can be evaluated at the same time as the positions. In addition, the precision of the results is improved relative to the standard Verlet algorithm, as no differences of large numbers appear in the formalism.
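
A minimal velocity Verlet step in Python, with the same hedged assumptions as the earlier sketches (force_func and mass are caller-supplied placeholders, not part of any specific package), might look as follows:

import numpy as np

def velocity_verlet_step(r, v, a, dt, force_func, mass):
    # Eq. 7.29: half-step velocity
    v_half = v + 0.5 * dt * a
    # Equivalent to Eq. 7.27: new positions from the half-step velocity
    r_new = r + dt * v_half
    # Forces at the new positions give A(t + dt)
    a_new = force_func(r_new) / mass
    # Eq. 7.30: complete the velocity update
    v_new = v_half + 0.5 * dt * a_new
    return r_new, v_new, a_new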

The selection of the best time integration method for a given problem and the size of the time step to use will be discussed in section “Setting the Time Step.”

Hybrid Quantum/Classical (QM/MM) Molecular Dynamics

The ab initio and classical simulation techniques discussed in the previous sections can be viewed as complementary. While AIMD is capable of dealing with electronic processes such as chemical reactions, charge transfer, and electronic excitations, its applicability is limited to systems of modest size, precluding its use in complex, large-scale biochemical simulations. Classical MD, on the other hand, can describe much larger systems on longer timescales, but misses the above-mentioned electronic effects, e.g., bond breaking and formation. The basic idea of the QM/MM approach is to combine the strengths of the two methods by treating a chemically active region at the quantum level and the environment using molecular mechanics (i.e., a force-field). There are several excellent review articles on the QM/MM method in the literature (Senn and Thiel 2009; Thiel 2009).

Partitioning Schemes

The entire system, S, is partitioned into a chemically active inner region, I, and a chemically inert outer region, O. If the border between these regions cuts through chemical bonds, so-called link atoms, L, are usually introduced to cap the inner region (see section “Bonds Across the QM/MM Boundary”).

Subtractive Scheme

In a subtractive scheme, the total energy, \({E}_{\mathrm{QM/MM}}^{\mathbf{S}}\), of the entire system,

$${E}_{\mathrm{QM/MM}}^{\mathbf{S}} = {E}_{\mathrm{ MM}}^{\mathbf{S}} + {E}_{\mathrm{ QM}}^{\mathbf{I,L}} - {E}_{\mathrm{ MM}}^{\mathbf{I,L}}$$
(7.31)

is calculated from three separate energy contributions: (1) the MM energy of the entire system, \({E}_{\mathrm{MM}}^{\mathbf{S}}\), (2) the QM energy of the active region (including any link atoms), \({E}_{\mathrm{QM}}^{\mathbf{I,L}}\), (3) the MM energy of the active region \({E}_{\mathrm{MM}}^{\mathbf{I,L}}\).

The role of the third term in Eq. 7.31 is to avoid double counting and to correct for any artifacts caused by the link atoms. For the latter to be effective, the force-field has to reproduce the quantum mechanical forces reasonably well in the link region.
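
The bookkeeping of Eq. 7.31 is trivial to express in code; the sketch below simply combines three externally computed energies (the function and argument names are illustrative, not taken from any specific QM/MM package):

def qmmm_subtractive_energy(e_mm_total, e_qm_inner_link, e_mm_inner_link):
    # Eq. 7.31: E_QM/MM(S) = E_MM(S) + E_QM(I,L) - E_MM(I,L)
    return e_mm_total + e_qm_inner_link - e_mm_inner_link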

Additive Scheme

In an additive scheme, the total energy of the system is given by

$${E}_{\mathrm{QM/MM}}^{\mathbf{S}} = {E}_{\mathrm{ MM}}^{\mathbf{O}} + {E}_{\mathrm{ QM}}^{\mathbf{I,L}} + {E}_{\mathrm{ QM-MM}}^{\mathbf{I,O}}$$
(7.32)

The difference from the subtractive scheme is that here a pure MM calculation is performed only for the outer region, and the interaction between the QM and MM regions is described by an explicit coupling term,

$${E}_{\mathrm{QM-MM}}^{\mathbf{I,O}} = {E}_{\mathrm{ QM-MM}}^{\mathrm{bond}} + {E}_{\mathrm{ QM-MM}}^{\mathrm{vdW}} + {E}_{\mathrm{ QM-MM}}^{\mathrm{el}}$$
(7.33)

where \({E}_{\mathrm{QM-MM}}^{\mathrm{bond}}\), \({E}_{\mathrm{QM-MM}}^{\mathrm{vdW}}\), \({E}_{\mathrm{QM-MM}}^{\mathrm{el}}\), are bonded, van der Waals, and electrostatic interaction energies, respectively.

The simplest way to treat electrostatic interactions between the I and O subsystems is to assign fixed electric charges to all I atoms (mechanical embedding). In this case the QM problem is solved for the isolated subsystem I without taking into account the effects of the surrounding atomic charges in O. The majority of implementations use an electrostatic embedding scheme in which the MM point charges of region O are incorporated in the QM Hamiltonian through a QM-MM coupling term,

$$\hat{{H}}_{\mathrm{QM-MM}}^{\mathrm{el}} = -{\sum \limits _{i}^{n}}{ \sum \limits _{\alpha \in \mathbf{O}}} \frac{{q}_{\alpha }} {\vert {\mathbf{r}}_{i} -{\mathbf{R}}_{\alpha }\vert } +{ \sum \limits _{I\in \mathbf{I}+\mathbf{L}}}{ \sum \limits _{\alpha \in \mathbf{O}}} \frac{{q}_{\alpha }{Z}_{I}} {\vert {\mathbf{R}}_{I} -{\mathbf{R}}_{\alpha }\vert }$$
(7.34)

where q α are the MM point charges at positions \({\mathbf{R}}_{\alpha }\) (all other symbols as defined in section “Born–Oppenheimer Approximation”). In this way, the electronic structure of the QM region adjusts to the moving MM charge distribution. A problem that arises when an MM point charge is in close proximity to the QM electron cloud is overpolarization of the latter, sometimes referred to as “spill-out” effect. This can be avoided by modifying the Coulomb potential in the first term of Eq. 7.34 at short range (see for instance Laio et al. 2002).

At present, in all commonly used partitioning schemes, the partitions remain fixed over time, i.e., an MM atom cannot turn into a QM atom and vice versa. This can present a serious limitation, for instance, in the case of solvent diffusion through the chemically active region. A number of adaptive partitioning methods have been proposed to remedy this problem (Bulo et al. 2009; Heyden et al. 2007; Hofer et al. 2005; Kerdcharoen et al. 1996; Kerdcharoen and Morokuma 2002); however the computational overhead is enormous.

Bonds Across the QM/MM Boundary

Partitioning the total system into QM and MM regions in a way that cuts chemical bonds is best avoided. In many cases, however, it is inevitable. One then has to make sure that any atoms participating in chemical reactions are at least three bonds away from the boundary. Furthermore, it is preferable to cut a bond that is nonpolar and not part of a conjugated chain.

Link Atoms

Cutting a single covalent bond will create a dangling bond which must be capped by a so-called link atom; in most applications a hydrogen atom is chosen. In the QM calculation, the atoms of region I together with the link atoms L are treated as an isolated molecule in the presence of the point charges of the environment O. The original QM–MM bond, cut by the partitioning, is only treated at the MM level.

Boundary Atoms

Boundary atom schemes have been developed to avoid the artifacts introduced by a link atom. The boundary atom appears as a normal MM atom in the MM calculation, while carrying QM features to saturate the QM–MM bond and to mimic the electronic properties of the MM side. The QM interactions are achieved by placing a pseudopotential at the position of the boundary atom, parameterized to reproduce the electronic properties of a particular chemical end group, e.g., a methyl group in the case of a cut C–C bond. Among the various flavors that have been proposed, the pseudobond method for first principles QM calculations (Zhang 2005, 2006; Zhang et al. 1999) and the pseudopotential approach for plane-wave DFT (Laio et al. 2002) are the most relevant in the present context.

Frozen Localized Orbitals

The basic idea behind the various frozen orbital methods (Amara et al. 2000; Assfeld and Rivail 1996; Assfeld et al. 1998; Day et al. 1996; Ferré et al. 2002; Fornili et al. 2003, 2006a,b; Gao et al. 1998; Garcia-Viloca and Gao 2004; Gordon et al. 2001; Grigorenko et al. 2002; Jensen et al. 1994; Jung et al. 2007; Kairys and Jensen 2000; Loos and Assfeld 2007; Monard et al. 1996; Murphy et al. 2000; Nemukhin et al. 2002, 2003; Philipp and Friesner 1999; Pu et al. 2004a,b, 2005; Sironi et al. 2007; Théry et al. 1994; Warshel and Levitt 1976) is to saturate the cut QM–MM bond by placing on either the MM or the QM atom at the boundary localized orbitals that have been determined in a prior quantum-mechanical SCF calculation on a model molecule containing the bond under consideration. To preserve the properties of the bond, the localized orbitals are then kept fixed in the subsequent QM/MM calculation. Different flavors are the Local SCF (LSCF) method (Assfeld and Rivail 1996; Assfeld et al. 1998; Ferré et al. 2002; Monard et al. 1996; Théry et al. 1994), extremely localized molecular orbitals (ELMOs) (Fornili et al. 2003, 2006b; Sironi et al. 2007), frozen core orbitals (Fornili et al. 2006a), optimized LSCF (Loos and Assfeld 2007), frozen orbitals (Murphy et al. 2000; Philipp and Friesner 1999), generalized hybrid orbitals (Amara et al. 2000; Gao et al. 1998; Garcia-Viloca and Gao 2004; Jung et al. 2007; Pu et al. 2004a,b, 2005), and effective fragment potentials (EFP) (Day et al. 1996; Gordon et al. 2001; Grigorenko et al. 2002; Jensen et al. 1994; Kairys and Jensen 2000; Nemukhin et al. 2002, 2003).

Of the three types of boundary treatment, the link atom method is the simplest both conceptually and in practice, and is hence the most widely used. The boundary atom and in particular the frozen orbital methods can potentially achieve higher accuracy but require careful a priori parametrization and bear limitations on transferability (Senn and Thiel 2009).

Coarse Grain Molecular Dynamics

A large number of important problems in fields that are often studied using molecular dynamics simulations (e.g., soft condensed matter physics, structural biology, chemistry, and materials science) take place over time spans of microseconds to seconds and distances of a few hundred nanometers to a few microns. These time and length scales remain unattainable with quantum or force-field molecular dynamics methods despite significant advances in computational hardware (Mervis 2001; Reed 2003; Shirts and Pande 2000) and the development of increasingly powerful software (Lindahl et al. 2001; MacKerell et al. 1998; Phillips et al. 2005a; Wang et al. 2004). One approach that has been used to study such complex problems is to reduce the computational demand of the simulation by reducing the number of particles represented and therefore the degrees of freedom of the simulated system. This is done by grouping atoms together and representing each group as a single interaction site, a procedure generally referred to as “coarse graining.” Figure 7-1 compares the atomistic, united-atom, and coarse grain representations.

Fig. 7-1

Atomistic, united-atom, and coarse grain representations of organic molecules

The “bead-spring” coarse grain model of polymer chains created by Kremer and Grest in 1990 has served as the foundation for many of the coarse grain models developed for a wide range of phenomena (to date this paper has been cited over 860 times), including various studies of polymers and biomolecules such as DNA solutions. Many of the more recent coarse grain models have been developed for biological macromolecules, since many interesting biophysical phenomena occur at large length and timescales. The most widely used coarse grain models for biological systems include the generic model of Lipowsky et al. (Goetz et al. 1999; Shillcock and Lipowsky 2002), the solvent-free model of Deserno et al. (Cooke et al. 2005), and the specific models of the Klein group (Shelley et al. 2001), the Voth group (the Multi-Scale Coarse Grain model) (Izvekov and Voth 2005, 2006), and the Marrink group (the MARTINI force-field) (Marrink et al. 2007). These coarse grain models have generally been developed for lipid membranes; however, there are also coarse grain force-fields for proteins (as reviewed in Tozzini (2005), with more recent examples in Betancourt and Omovie (2009) and Bereau and Deserno (2009)) and for DNA (Khalid et al. 2008; Tepper and Voth 2005).

When developing a coarse grain model for a system, there are two important decisions to be made: (1) how many atoms to combine (coarse grain) into a single interaction site and (2) how to parameterize the coarse grain force-field. In deciding the number of atoms to combine into a single interaction site, one must weigh the obvious trade-off: how much detail can be sacrificed in order to reach larger length and/or timescales while still accurately modeling the phenomena of interest. The smallest amount of coarse graining is the so-called “united-atom” representation of a molecule, in which all “heavy” atoms (generally all non-hydrogen elements in a molecule) are represented and the “light” (i.e., hydrogen) atoms are grouped into one interaction site with the heavy atom to which they are bonded. United-atom versions of many of the popular all-atom force-fields listed in section “Classical Force Fields” exist and have been used successfully in several studies. In addition to united-atom models, there are several coarse graining methods that combine different numbers of atoms into one interaction site.

In general, coarse grain systems are governed by potential terms similar to those found in atomistic models, such as nonbond terms (both pair-wise and electrostatic interactions), bond stretching terms, and, in more sophisticated models, angle and dihedral terms as well. Generally, specific models are parameterized by comparison to atomistic simulations and/or detailed experimental data. Effective coarse grain potentials have been extracted from atomistic simulations using inverse Monte Carlo schemes (Elezgaray and Laguerre 2006; Lyubartsev 2005) or force matching approaches (Izvekov and Voth 2005, 2006). Another approach is to develop standard potential functions that are calibrated using thermodynamic data (Marrink et al. 2004). The advantage of using either the inverse Monte Carlo or force matching schemes is that the resulting force-field achieves a higher level of accuracy and a closer resemblance to atomistic simulations. However, these schemes produce force-fields that are valid for a given state point and are therefore not transferable. The advantages of the thermodynamic approach are that it produces a potential with a broader range of applicability and that it does not require atomistic simulations to be performed in the first place.

Interaction Potentials/Force Fields

Classical Force Fields

Classical, or empirical, force-fields are generally used to calculate the energy of a system as a function of the nuclear positions of the particles within the system, while ignoring the behavior of the individual electrons. As stated in the section “Born–Oppenheimer Approximation,” the Born–Oppenheimer approximation makes it possible to write the energy as a function of the nuclear coordinates. Another approximation that is key to the implementation of classical force-fields is that the relatively complex motion of particles within the system can be modeled with fairly simple analytical descriptions of inter- and intramolecular interactions. Generally, an empirical force-field consists of terms that model the nonbonded interactions (E nonbond), which include both the van der Waals and Coulombic interactions, the bonded interactions (E bond), the angle bending interactions (E angle), and the dihedral (bond rotation) interactions (E dihedral):

$$E(\mathbf{R}) = {E}_{\mathrm{nonbond}} + {E}_{\mathrm{bond}} + {E}_{\mathrm{angle}} + {E}_{\mathrm{dihedral}}.$$
(7.35)

Figure 7-2 presents representative cartoons of the bond, angle, and dihedral interactions from a molecular perspective. The form that each of these individual terms takes depends on the force-field being used. There are several different force-field options available for various systems. The best way to find the most suitable force-field for your specific problem is to conduct a literature and/or internet search to determine which force-field has the capability to model the molecules you are interested in studying. However, if you are interested in modeling organic/biological molecules, there are several large force-fields that may be a good place to start, including Charmm (MacKerell et al. 1998), OPLS (Jørgensen et al. 1984), Amber (Cornell et al. 1995), and COMPASS (Sun et al. 1998). Likewise, there are several well-known force-fields for solids, such as the BKS potential (van Beest et al. 1990) for oxides and the Embedded Atom Method (EAM) (Daw and Baskes 1983, 1984; Finnis and Sinclair 1984) and Modified Embedded Atom Method (MEAM) (Baskes 1992) force-fields, which are primarily used to model metals. In addition to defining the functional forms used for the various terms in the general potential formulation, a force-field also defines the parameters used in the potential, which are derived from a combination of quantum simulation results and experimental observations.

Fig. 7-2

Intramolecular terms of classical force-fields: bond, angle, and dihedral interactions

In the following sections, each of the terms in Eq. 7.35 will be discussed further and typical functional forms that are used in the previously mentioned force-fields and others to represent each term will be shown.

We limit the discussion to simple non-polarizable force-fields in which the individual atoms carry fixed charges. These capture many-body effects such as electronic polarization only in an effective way. More sophisticated polarizable force-fields have been developed over the past two decades (see for instance Ponder et al. (2010) and references therein); however, they are computationally substantially more demanding.

Nonbonded Interactions

There are two general forms of nonbonded interactions that need to be accounted for by a classical force-field: (1) the van der Waals (vdw) interactions and (2) the electrostatic interactions.

van der Waals Interactions

In order to model the van der Waals interactions, we need a simple empirical expression that is not computationally intensive and that models both the dispersion and repulsive interactions that are known to act upon atoms and molecules. The most commonly used functional form of van der Waals energy (E vdW) in classical force-fields is the Lennard-Jones 12-6 function that has the form:

$${E}_{\mathrm{vdW}}(\mathbf{R}) ={ \sum \limits _{I>J}}4{\epsilon }_{IJ}\left [{\left ( \frac{{\sigma }_{IJ}} {{R}_{IJ}}\right )}^{12} -{\left ( \frac{{\sigma }_{IJ}} {{R}_{IJ}}\right )}^{6}\right ],$$
(7.36)

where σ IJ is the collision diameter and ε IJ is the well depth of the interaction between atoms I and J. Both σ IJ and ε IJ are adjustable parameters that take different values for different pairs of particles (e.g., the values of σ and ε used to describe the interaction between two carbon atoms are different from the values of σ and ε used to describe the interaction between a carbon and an oxygen).

Equation 7.36 models both the attractive part (the \(R^{-6}\) term) and the repulsive part (the \(R^{-12}\) term) of the nonbonded interaction. Other formulations of the Lennard-Jones nonbond potential commonly use the same power law for the attractive part of the potential but a different power law for the repulsive part of the interaction, such as the Lennard-Jones 9-6 function:

$${E}_{\mathrm{vdW}}(\mathbf{R}) ={ \sum \limits _{I>J}}4{\epsilon }_{IJ}\left [{\left ( \frac{{\sigma }_{IJ}} {{R}_{IJ}}\right )}^{9} -{\left ( \frac{{\sigma }_{IJ}} {{R}_{IJ}}\right )}^{6}\right ].$$
(7.37)
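
As a minimal sketch of how such a pair energy is evaluated in practice (a single σ/ε pair and a plain double loop over all pairs are assumed purely for illustration; production codes use neighbor lists and cutoffs), one might write:

import numpy as np

def lj_energy(r, sigma, epsilon, rep_exp=12):
    # Pairwise Lennard-Jones energy: Eq. 7.36 for rep_exp=12, Eq. 7.37 for rep_exp=9.
    # r is an (N, 3) array of particle positions.
    energy = 0.0
    n = len(r)
    for i in range(n - 1):
        for j in range(i + 1, n):
            rij = np.linalg.norm(r[i] - r[j])
            sr = sigma / rij
            energy += 4.0 * epsilon * (sr**rep_exp - sr**6)
    return energy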

When the nonbond interactions of a system that contains multiple particle types and multiple molecules are modeled using a Lennard-Jones type nonbond potential, it is necessary to be able to define the values of σ and ε that apply to the interaction between particles of type I and J. The parameters for these cross interactions are generally found using one of the two following mixing rules. One common mixing rule is the Lorentz-Berthelot rule where the value of σ IJ is found from the arithmetic mean of the two pure values and the value of ε IJ is the geometric mean of the two pure values:

$${\sigma }_{IJ} = \frac{({\sigma }_{I} + {\sigma }_{J})} {2}$$
(7.38)
$${\epsilon }_{IJ} = \sqrt{{\epsilon }_{I } {\epsilon }_{J}}$$
(7.39)

The other commonly used mixing rule defines both σ IJ and ε IJ as the geometric mean of the values for the pure species:

$${\sigma }_{IJ} = \sqrt{{\sigma }_{I } {\sigma }_{J}}$$
(7.40)
$${\epsilon }_{IJ} = \sqrt{{\epsilon }_{I } {\epsilon }_{J}}$$
(7.41)

Most force-fields use the Lorentz-Berthelot mixing rule; however, the OPLS force-field is one that utilizes the geometric mixing rule.
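
Both rules are one-liners; the following sketch (our own function names) makes the distinction explicit:

import math

def lorentz_berthelot(sigma_i, sigma_j, eps_i, eps_j):
    # Eqs. 7.38-7.39: arithmetic mean for sigma, geometric mean for epsilon
    return 0.5 * (sigma_i + sigma_j), math.sqrt(eps_i * eps_j)

def geometric_mixing(sigma_i, sigma_j, eps_i, eps_j):
    # Eqs. 7.40-7.41: geometric mean for both parameters (used, e.g., by OPLS)
    return math.sqrt(sigma_i * sigma_j), math.sqrt(eps_i * eps_j)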

In other nonbond pairwise potentials, the repulsive portion of the interaction is modeled with an exponential term, which is in better agreement with the functional form of the repulsive term determined from quantum mechanics. One example of such a potential is the Buckingham potential (Buckingham 1938):

$${E}_{\mathrm{vdW}}(\mathbf{R}) ={ \sum \limits _{I<J}}\left [{A}_{IJ}\exp (-{B}_{IJ}{R}_{IJ}) - \frac{{C}_{IJ}} {{R}_{IJ}^{6}}\right ],$$
(7.42)

where A IJ , B IJ , and C IJ are adjustable parameters that will have unique values for different types of particles. Another form of the nonbond interaction is the Born–Mayer–Huggins potential (Fumi and Tosi 1964; Tosi and Fumi 1964):

$${E}_{\mathrm{vdW}}(\mathbf{R}) ={ \sum \limits _{I<J}}{A}_{IJ}\exp ({B}_{IJ}({\sigma }_{IJ} - {R}_{IJ})) - \frac{{C}_{IJ}} {{R}_{IJ}^{6}} + \frac{{D}_{IJ}} {{R}_{IJ}^{8}},$$
(7.43)

where A IJ , B IJ , C IJ , D IJ and σ IJ are adjustable parameters that will have unique values for different types of particles. The Born–Mayer–Huggins potential (Eq. 7.43) is identical to the Buckingham potential (Eq. 7.42) when σ = D = 0.

All of the nonbond potential functional forms that have been presented to this point take into account the effect that one particle has on another particle based solely on the distance between the two particles. However, in some systems like metals and alloys as well as some covalently bonded materials like silicon and carbon, the nonbonded potential is a function of more than just the distance between two particles. In order to model these systems, the embedded-atom method (EAM) (Daw and Baskes 1983, 1984; Finnis and Sinclair 1984) and modified embedded-atom method (MEAM) (Baskes 1992) utilize an embedding energy, F I , which is a function of the atomic electronic density ρ I of the embedded atom I and a pair potential interaction ϕ IJ such that

$${E}_{I}(\mathbf{R}) = {F}_{I}\left ({\sum \limits _{J\neq I}}{\rho }_{I}({R}_{IJ})\right ) + \frac{1} {2}{\sum \limits _{J\neq I}}{\phi }_{IJ}({R}_{IJ}).$$
(7.44)

The multi-body nature of the EAM potential is a result of the embedding energy term.
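
The structure of Eq. 7.44 is easy to see in code. The sketch below assumes the user supplies three callables for a given parameterization (pair_phi, density_rho, and embed_F are hypothetical placeholders, not functions from any real EAM library) and sums the energy over all atoms:

import numpy as np

def eam_energy(r, pair_phi, density_rho, embed_F):
    # Eq. 7.44 summed over all atoms I:
    # E_I = F(sum_J rho(R_IJ)) + 1/2 sum_J phi(R_IJ)
    n = len(r)
    energy = 0.0
    for i in range(n):
        host_density = 0.0
        for j in range(n):
            if j == i:
                continue
            rij = np.linalg.norm(r[i] - r[j])
            host_density += density_rho(rij)
            energy += 0.5 * pair_phi(rij)
        energy += embed_F(host_density)
    return energy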

So while the EAM and MEAM potentials contain a term that accounts for multi-body interactions, they are still essentially pair-wise potentials, as are all the other nonbond potentials presented to this point. There are, however, multi-body potentials that explicitly account for how the presence of a third, fourth, … atom affects the nonbond energy felt by any given atom. One example of a three-body potential is the Stillinger-Weber potential (Stillinger and Weber 1985):

$$E(\mathbf{R}) ={ \sum \limits _{I}}{ \sum \limits _{J>I}}{\phi }_{2}({R}_{IJ}) +{ \sum \limits _{I}}{ \sum \limits _{J\neq I}}{ \sum \limits _{K>J}}{\phi }_{3}({R}_{IJ},{R}_{IK},{\theta }_{IJK}),$$
(7.45)

where there is a two-body term ϕ2:

$${\phi }_{2}({R}_{IJ}) = {A}_{IJ}{\epsilon}_{IJ} \left[ {B}_{IJ} \left( \frac{{\sigma }_{IJ}} {{R}_{IJ}}\right)^{{p}_{IJ} } - \left( \frac{{\sigma }_{IJ}} {{R}_{IJ}}\right)^{{q}_{IJ} } \right] \exp \left( \frac{{\sigma }_{IJ}}{{R}_{IJ} - {a}_{IJ}{\sigma }_{IJ}} \right) $$
(7.46)

and a three-body term ϕ3:

$$\begin{array}{rcl}{ \phi }_{3}({R}_{IJ},{R}_{IK},{\theta }_{IJK})&=& {\lambda }_{IJK}{\epsilon }_{IJK} \left[ \cos {\theta }_{IJK}-\cos {\theta }_{0,IJK} \right] ^{2} \\ & & \times \exp \left(\frac{{\gamma}_{IJ}{\sigma }_{IJ}} {{R}_{IJ} - {a}_{IJ}{\sigma }_{IJ}} \right) \\& & \times \exp \left( \frac{{\gamma }_{IK}{\sigma }_{IK}}{{R}_{IK} - {a}_{IK}{\sigma }_{IK}} \right).\end{array}$$
(7.47)

The Stillinger-Weber potential has generally been used for modeling crystalline silicon; however, more recently it has also been applied to organic molecules. Another example of a three-body interatomic potential is the Tersoff potential (Tersoff 1988, 1989), which was also initially created in an attempt to accurately model silicon solids.

Electrostatic Interactions

Because not all atoms in a molecule have the same electronegativity, some attract electrons more strongly than others. Since classical force-fields do not model the flow of electrons, however, each particle within a molecule is assigned a partial charge that remains constant during the course of a simulation. Generally these partial charges q i are placed at the nuclear centers of the particles. The electrostatic interaction between particles in different molecules, or between particles in the same molecule that are separated by at least two other atoms, is calculated as the sum of the contributions between pairs of these partial charges using Coulomb’s law:

$${E}_{\mathrm{coul}} ={ \sum \limits _{I}}{ \sum \limits _{J}} \frac{{q}_{I}{q}_{J}} {4\pi {\epsilon }_{0}{R}_{IJ}}$$
(7.48)

where q I and q J are the charges of the two particles and ε0 is the vacuum permittivity.

In practice, an Ewald sum (Ewald 1921) is generally used to evaluate the electrostatic interactions within a classical MD simulation. However, this is a computationally expensive algorithm whose cost scales as \(N^{3/2}\), where N is the number of particles in the system. To obtain better scaling, fast Fourier transforms (FFTs) are used to calculate the reciprocal space summation required within the Ewald sum, which reduces the cost of the electrostatic algorithm to \(N\log N\). The most popular FFT-based algorithm adopted in classical MD simulations is the particle-particle particle-mesh (PPPM) approach (Hockney and Eastwood 1981; Luty et al. 1994, 1995).
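
For a small, nonperiodic system, the bare double sum of Eq. 7.48 can be written directly as below (illustrative only; periodic simulations should use Ewald or PPPM as discussed above, and the prefactor assumes SI units):

import numpy as np

K_E = 1.0 / (4.0 * np.pi * 8.8541878128e-12)  # 1/(4*pi*eps0) in SI units

def coulomb_energy(r, q):
    # Direct pairwise Coulomb sum over all unique pairs (Eq. 7.48)
    energy = 0.0
    n = len(r)
    for i in range(n - 1):
        for j in range(i + 1, n):
            energy += K_E * q[i] * q[j] / np.linalg.norm(r[i] - r[j])
    return energy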

Bonded Interactions

The bonded interactions model the energetic penalty that results when two covalently bonded atoms move too close to or too far from one another. The most common functional form used to model the bond stretching interactions is a harmonic term:

$${E}_{\mathrm{bond}} ={ \sum \limits _{\mathrm{bonds}}}{k}_{b}{({\mathcal{l}}_{b} - {\mathcal{l}}_{b}^{(0)})}^{2}$$
(7.49)

where k b is commonly referred to as the bond constant and is a measure of the bond stiffness, and \({\mathcal{l}}_{b}^{(0)}\) is the reference length, often referred to as the equilibrium bond length. Each of these parameters varies depending on the types of particles that the bond joins.

Angle Bending Interactions

The angle bending interactions model the energetic penalty incurred when an angle formed by three particles compresses or overextends, distorting the geometry of a portion of the molecule away from its reference structure.

Again, the most common functional form to model the angle interactions is a harmonic expression:

$${E}_{\mathrm{angle}} ={ \sum \limits_{\mathrm{angles}}}{k}_{a}{({\theta }_{a} - {\theta }_{a}^{(0)})}^{2}$$
(7.50)

where k a is the angle constant and is a measure of the rigidity of the angle, and \({\theta }_{a}^{(0)}\) is the equilibrium or reference angle.
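
A hedged sketch of how the harmonic bond and angle terms (Eqs. 7.49 and 7.50) are evaluated for a single bond and a single angle follows (the function names are ours; angles are in radians):

import numpy as np

def harmonic_bond_energy(r1, r2, k_b, l0):
    # Eq. 7.49 for one bond: k_b * (l - l0)^2
    l = np.linalg.norm(r1 - r2)
    return k_b * (l - l0) ** 2

def harmonic_angle_energy(r1, r2, r3, k_a, theta0):
    # Eq. 7.50 for one angle; r2 is the central (vertex) atom
    u = r1 - r2
    v = r3 - r2
    cos_theta = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    theta = np.arccos(np.clip(cos_theta, -1.0, 1.0))
    return k_a * (theta - theta0) ** 2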

Torsional Interactions

The torsional interactions are generally modeled using some form of a cosine series. The OPLS force-field uses the following expression for its torsional term:

$$\begin{array}{rcl}{ E}_{\mathrm{dihed}} ={ \sum \limits _{\mathrm{dihedrals}}}\frac{1} {2}{K}_{d}^{(1)}[1 +\cos (\phi )]& +& \frac{1} {2}{K}_{d}^{(2)}[1 -\cos (2\phi )] + \frac{1} {2}{K}_{d}^{(3)}[1 +\cos (3\phi )] \\ & +& \frac{1} {2}{K}_{d}^{(4)}[1 -\cos (4\phi )] \end{array}$$
(7.51)

where \({K}_{d}^{(i)}\) are the force constants for each cosine term and ϕ is the measured dihedral angle. The Charmm force-field uses the following expression:

$${E}_{\mathrm{dihed}} ={ \sum \limits _{\mathrm{dihedrals}}}{K}_{d}[1 +\cos (n\phi - {d}_{d})],$$
(7.52)

where K d is the force constant, n is the multiplicity of the dihedral angle ϕ, and d d is the phase shift, which moves the minimum of the dihedral energy.
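
The two torsional forms above reduce to simple cosine evaluations once the dihedral angle ϕ is known; a minimal sketch (ϕ in radians, function names ours) is:

import math

def opls_dihedral_energy(phi, k1, k2, k3, k4):
    # Eq. 7.51: four-term OPLS cosine series for one dihedral angle
    return 0.5 * (k1 * (1.0 + math.cos(phi))
                  + k2 * (1.0 - math.cos(2.0 * phi))
                  + k3 * (1.0 + math.cos(3.0 * phi))
                  + k4 * (1.0 - math.cos(4.0 * phi)))

def charmm_dihedral_energy(phi, k_d, n, d_d):
    # Eq. 7.52: CHARMM-style torsion term for one dihedral angle
    return k_d * (1.0 + math.cos(n * phi - d_d))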

First Principles Electronic Structure Methods

For the electronic ground state, i.e., k = 0, Kohn–Sham (KS) density functional theory is commonly used. In this case, the energy is given by

$${E}_{0} \approx {E}^{\mathrm{KS}}[\rho ] = {T}_{\mathrm{ s}}[\rho ] + \int \nolimits \nolimits d\mathbf{r}{v}_{\mathrm{ext}}(\mathbf{r})\rho (\mathbf{r}) + \frac{1} {2}\int \nolimits \nolimits d\mathbf{r}{v}_{\mathrm{H}}(\mathbf{r})\rho (\mathbf{r}) + {E}_{\mathrm{xc}}$$
(7.53)

with the kinetic energy of noninteracting electrons, i.e., using a Slater determinant as a wavefunction ansatz,

$${ \Psi }^{\mathrm{KS}} = \frac{1} {\sqrt{n!}}\left \vert \begin{array}{cccc} {\psi }_{1}({\mathbf{x}}_{1}) & {\psi }_{2}({\mathbf{x}}_{1}) &\cdots & {\psi }_{n}({\mathbf{x}}_{1}) \\ {\psi }_{1}({\mathbf{x}}_{2}) & {\psi }_{2}({\mathbf{x}}_{2}) &\cdots & {\psi }_{n}({\mathbf{x}}_{2})\\ \vdots & \vdots & \vdots & \\ {\psi }_{1}({\mathbf{x}}_{n})&{\psi }_{2}({\mathbf{x}}_{n})&\cdots &{\psi }_{n}({\mathbf{x}}_{n}) \end{array} \right \vert $$
(7.54)
$${T}_{\mathrm{s}}[\rho ] = -\frac{1} {2}{\sum \limits _{i}}^{n}{f}_{ i} \int \nolimits \nolimits d\mathbf{r}\,{\psi }_{i}(\mathbf{r}){\nabla }^{2}{\psi }_{ i}(\mathbf{r})$$
(7.55)

where f i is the number of electrons occupying orbital ψ i , the external potential including nucleus–nucleus repulsion and electron–nucleus attraction,

$${v}_{\mathrm{ext}}(\mathbf{r}) ={ \sum \limits _{I=1}^{N-1}}{ \sum \limits _{J>I}^{N}} \frac{{Z}_{I}{Z}_{J}} {\vert {\mathbf{R}}_{I} -{\mathbf{R}}_{J}\vert }-{\sum \limits _{I=1}^{N}} \frac{{Z}_{I}} {\vert \mathbf{r} -{\mathbf{R}}_{I}\vert }$$
(7.56)

the Hartree potential (electron–electron interaction)

$${v}_{\mathrm{H}}(\mathbf{r}) = \int \nolimits \nolimits d\mathbf{r}^ \prime\, \frac{\rho (\mathbf{r}^ \prime)} {\vert \mathbf{r} -\mathbf{r}^ \prime\vert }$$
(7.57)

the exchange-correlation energy, E xc, and the electron density

$$\rho (\mathbf{r}) ={ \sum \limits _{i}^{n}}{f}_{ i}\vert {\psi }_{i}(\mathbf{r}){\vert }^{2}$$
(7.58)

The orbitals which minimize the total, many-electron energy (Eq. 7.53) are obtained by solving self-consistently the one-electron Kohn–Sham equations,

$$\left [-\frac{1} {2}{\nabla }^{2} + {v}_{\mathrm{ ext}}(\mathbf{r}) + {v}_{\mathrm{H}}(\mathbf{r}) + \frac{\delta {E}_{\mathrm{xc}}[\rho ]} {\delta \rho (\mathbf{r})} \right ]{\psi }_{i}(\mathbf{r}) = {\epsilon }_{i}{\psi }_{i}(\mathbf{r})$$
(7.59)

DFT is exact in principle, provided that \({E}_{xc}[\rho ]\) is known, in which case E KS (see Eq. 7.53) is an exact representation of the ground state energy E 0 (see Eq. 7.8). In practice, however, \({E}_{xc}[\rho ]\) is not – and presumably never will be – known exactly; therefore (semiempirical) approximations are used.

The starting point for most density functionals is the local density approximation (LDA), which is based on the assumption that one deals with a homogeneous electron gas. E xc is split into an exchange term E x and a correlation term E c . Within the LDA, the exchange functional is given exactly by Dirac (1930):

$${E}_{x}^{\mathrm{LDA}}[\rho ] = \int \nolimits \nolimits \rho (\mathbf{r}){\epsilon }_{x}^{\mathrm{LDA}}(\rho (\mathbf{r}))d\mathbf{r}$$
(7.60)

where

$${\epsilon }_{x}^{\mathrm{LDA}}(\rho ) = -\frac{3} {4}{\left ( \frac{3} {\pi }\right )}^{\frac{1} {3} }\rho {(\mathbf{r})}^{\frac{1} {3} }$$
(7.61)
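
In code, the LDA exchange energy density of Eq. 7.61 is a one-line function of the local density (atomic units assumed; this is only the integrand of Eq. 7.60, not the full functional):

import math

def lda_exchange_energy_density(rho):
    # Eq. 7.61: Dirac exchange energy per electron for local density rho (a.u.)
    return -0.75 * (3.0 / math.pi) ** (1.0 / 3.0) * rho ** (1.0 / 3.0)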

The LDA correlation functional, on the other hand, can only be approximated. We give here the most commonly used expression by Vosko et al. (1980), derived from Quantum Monte Carlo calculations:

$${E}_{c}^{\mathrm{LDA}}[\rho ] = \int \nolimits \nolimits \rho (\mathbf{r}){\epsilon }_{c}^{\mathrm{LDA}}(\rho (\mathbf{r}))d\mathbf{r}$$
(7.62)

where

$$\begin{array}{rcl}{ \epsilon }_{c}^{\mathrm{LDA}}(\rho ) = A\left \{\ln \left (\frac{{x}^{2}} {X}\right )\right.& +&{ \frac{2b} {Q}\tan }^{-1}\left ( \frac{Q} {2x + b}\right ) - \frac{b{x}_{0}} {X({x}_{0})}\left [\ln \left (\frac{{(x - {x}_{0})}^{2}} {X} \right )\right. \\ & +&{ \frac{2(b + 2{x}_{0})} {Q} \tan }^{-1}\left.\left.\left ( \frac{Q} {2x + b}\right )\right ]\right \} \end{array}$$
(7.63)

with \(X = {x}^{2} + bx + c\), \(x = \sqrt{{r}_{s}}\), \({r}_{s} = \root{3}\of{ \frac{3} {4\pi \rho (\mathbf{r})}}\), \(Q = \sqrt{4c - {b}^{2}}\), \({x}_{0} = -0.104098\), \(A = 0.0310907\), \(b = 3.72744\), \(c = 12.9352\).

This simplest approximation, LDA, is often too inaccurate for chemically relevant problems. A notable improvement is usually offered by so-called semilocal or gradient corrected functionals (generalized gradient approximation (GGA)), in which E x and E c are expressed as functionals of ρ and the first variation of the density, \(\nabla \rho \):

$${E}_{x}^{\mathrm{GGA}}[\rho,\nabla \rho ] = \int \nolimits \nolimits \rho (\mathbf{r}){\epsilon }_{x}^{\mathrm{GGA}}(\rho (\mathbf{r}),\nabla \rho )d\mathbf{r}$$
(7.64)
$${E}_{c}^{\mathrm{GGA}}[\rho,\nabla \rho ] = \int \nolimits \nolimits \rho (\mathbf{r}){\epsilon }_{c}^{\mathrm{GGA}}(\rho (\mathbf{r}),\nabla \rho )d\mathbf{r}$$
(7.65)

Popular examples are the BLYP (Becke 1988; Lee et al. 1988), BP (Becke 1988; Polák 1986), and BPW91 (Becke 1988; Perdew et al. 1992) functionals. The expressions for \({\epsilon }_{x,c}^{\mathrm{GGA}}(\rho (\mathbf{r}),\nabla \rho )\) are complex and shall not be discussed here.

In many cases, accuracy can be further increased by using so-called hybrid functionals, which contain an admixture of Hartree–Fock exchange to KS exchange. Probably the most widely used hybrid functional is the three-parameter B3LYP functional (Becke 1993),

$${E}_{xc}^{\mathrm{B3LYP}} = a{E}_{ x}^{\mathrm{LDA}} + (1 - a){E}_{ x}^{\mathrm{HF}} + b\Delta {E}_{ x}^{\mathrm{B}} + (1 - c){E}_{ c}^{\mathrm{LDA}} + c{E}_{ c}^{\mathrm{LYP}}$$
(7.66)

where a = 0. 80, b = 0. 72, c = 0. 81, and E x HF is the Hartree-Fock exchange energy evaluated using KS orbitals.

New functionals are constantly proposed in search of better approximations to the exact E xc . Often functionals are designed to remedy a particular shortcoming of previous functionals, for instance, for dispersion interactions.

Building the System/Collecting the Ingredients

Setting Up an AIMD Simulation

Building a Molecule

In many cases, the coordinates of a molecular structure are available for download on the web, from crystallographic databases (CCDC 2010; ICSD 2009; PDB 2010; Reciprocal Net 2004; Toth 2009) or journal supplements. For relatively small molecules, an initial guess structure can be built using molecular graphics software packages such as molden (2010).

Plane Waves and Pseudopotentials

The most common form of AIMD simulation employs DFT (see section “First Principles Electronic Structure Methods”) to calculate atomic forces, in conjunction with periodic boundary conditions and a plane wave basis set. Using a plane wave basis has two major advantages over atom-centered basis functions: (1) there is no basis set superposition error (Boys and Bernardi 1970; Marx and Hutter 2000) and (2) the Pulay correction (Pulay 19691987) to the Hellmann–Feynman force, due to basis set incompleteness, vanishes (Marx and Hutter 20002009).

Plane Wave Basis Set

As a consequence of Bloch’s theorem, in a periodic lattice, the Kohn–Sham orbitals (see Eq. 7.59) can be expanded in a set of plane waves (Ashcroft and Mermin 1976; Meyer 2006),

$${\psi }_{\mathbf{k},j}(\mathbf{r}) ={ \sum \limits _{\mathbf{G}}}{c}_{\mathbf{G}}^{\mathbf{k},j}{e}^{i(\mathbf{k}+\mathbf{G})\mathbf{r}}$$
(7.67)

where k is a wavevector within the Brillouin zone, satisfying Bloch’s theorem,

$$\psi (\mathbf{r} + \mathbf{T}) = {e}^{i\mathbf{k}\mathbf{T}}\psi (\mathbf{r})$$
(7.68)

for any lattice vector T,

$$\mathbf{T} = {N}_{1}{\mathbf{a}}_{1} + {N}_{2}{\mathbf{a}}_{2} + {N}_{3}{\mathbf{a}}_{3}$$
(7.69)

\({N}_{1},{N}_{2},{N}_{3}\) being integer numbers, and \({\mathbf{a}}_{1},{\mathbf{a}}_{2},{\mathbf{a}}_{3}\) the vectors defining the periodically repeated simulation box.

In Eq. 7.67, the summation is over all reciprocal lattice vectors G which fulfill the condition \(\mathbf{G \cdot T} = 2\pi M\), M being an integer number. In practice, this plane-wave expansion of the Kohn-Sham orbitals is truncated such that the individual terms all yield kinetic energies lower than a specified cutoff value, E cut,

$$\frac{ \hbar^{2}} {2m}\vert \mathbf{k} + \mathbf{G}{\vert }^{2} \leq {E}_{\mathrm{ cut}}$$
(7.70)

The plane-wave basis set thus has the advantage over other basis sets that convergence can be controlled by a single parameter, namely E cut.
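To make the role of E cut concrete, the sketch below enumerates the Γ-point plane-wave basis of a simple cubic box according to Eq. 7.70 (a hedged example assuming atomic units, ℏ = m = 1, and a cubic cell; the function name is illustrative):

```python
import numpy as np

# Hedged sketch: list the reciprocal lattice vectors G of a simple cubic box
# of side L (in bohr) that satisfy |G|^2/2 <= E_cut at the Gamma point (k = 0).

def gamma_point_pw_basis(L, e_cut):
    """Return the G vectors fulfilling the cutoff condition of Eq. 7.70."""
    b = 2.0 * np.pi / L                               # reciprocal lattice constant
    n_max = int(np.ceil(np.sqrt(2.0 * e_cut) / b))
    g_vectors = []
    for i in range(-n_max, n_max + 1):
        for j in range(-n_max, n_max + 1):
            for k in range(-n_max, n_max + 1):
                g = b * np.array([i, j, k], dtype=float)
                if 0.5 * np.dot(g, g) <= e_cut:       # kinetic energy test
                    g_vectors.append(g)
    return np.array(g_vectors)

# Example: a 10 bohr box with a 25 Ha (50 Ry) cutoff
basis = gamma_point_pw_basis(10.0, 25.0)
print("number of plane waves:", len(basis))
```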

In this periodic setup, the electron density (see Eq. 7.58) can be approximated by a sum over a mesh of N kpt k-points in the Brillouin zone (Chadi and Cohen 1973; Monkhorst and Pack 1976; Moreno and Soler 1992),

$$\rho (\mathbf{r}) \approx \frac{1} {{N}_{\mathrm{kpt}}}{ \sum \limits _{\mathbf{k},j}}{f}_{\mathbf{k},j}\vert {\psi }_{\mathbf{k},j}(\mathbf{r}){\vert }^{2}$$
(7.71)

Since the volume of the Brillouin zone, \({V }_{\mathrm{BZ}} = {(2\pi )}^{3}/{V }_{\mathrm{box}}\), decreases with increasing volume of the simulation supercell, V box, only a small number of k-points need to be sampled for large supercells. For insulating materials (i.e., large bandgap), a single k-point is often sufficient, typically taken to be k = 0 (Γ-point approximation).

Pseudopotentials

While plane waves are a good representation of delocalized Kohn–Sham orbitals in metals, a huge number of them would be required in the expansion (Eq. 7.67) to obtain a good approximation of atomic orbitals, in particular near the nucleus where they oscillate rapidly. Therefore, in order to reduce the size of the basis set, only the valence electrons are treated explicitly, while the core electrons (i.e., the inner shells) are taken into account implicitly through pseudopotentials combining their effect on the valence electrons with the nuclear Coulomb potential. This frozen core approximation is justified as typically only the valence electrons participate in chemical interactions. To minimize the number of basis functions the pseudopotentials are constructed in such a way as to produce nodeless atomic valence wavefunctions. Beyond a specified cutoff distance from the nucleus, R cut, the nodeless pseudo-wavefunctions are required to be identical to the reference all-electron wavefunctions.

Normconserving Pseudopotentials

Normconserving pseudopotentials are generated subject to the condition that the pseudo-wavefunction has the same norm as the all-electron wavefunction and thus gives rise to the same electron density. Although normconserving pseudopotentials have to fulfill a (small) number of mathematical conditions, there remains considerable freedom in how to create them. Hence several different recipes exist (Bachelet et al. 1982; Goedecker et al. 1996; Hamann et al. 1979; Hartwigsen et al. 1998; Kerker 1980; Troullier and Martins 19901991; Vanderbilt 1985).

Since pseudopotentials are generated using atomic orbitals as a reference, it is not guaranteed that they are transferable to any chemical environment. Generally, the smaller the cutoff radius R cut is chosen, the better the transferability. However, the reduction in the number of plane waves required to represent a particular pseudo-wavefunction – i.e., the softness of the corresponding pseudopotential – increases as R cut gets larger. R cut therefore has to be chosen carefully, and there is always a trade-off between transferability and softness. An upper limit for R cut is given by the shortest interatomic distances in the molecule or crystal the pseudopotential will be used for: one needs to make sure that the sum of the cutoff radii of any two neighboring atoms is smaller than their actual spatial separation.

For each angular momentum l, a separate pseudopotential \({V }_{l}^{\mathrm{PS}}(r)\) is constructed. The total pseudopotential operator is written as

$$\hat{{V }}^{\mathrm{PS}}={V }_{\mathrm{loc}}^{\mathrm{PS}}(r)+{\sum\limits_{l}{V }}_{\mathrm{nl},l}^{\mathrm{PS}}(r)\hat{{P}}_{l}$$
(7.72)

where the nonlocal part is defined as

$${V }_{\mathrm{nl},l}^{\mathrm{PS}}(r) = {V }_{ l}^{\mathrm{PS}}(r) - {V }_{\mathrm{ loc}}^{\mathrm{PS}}(r)$$
(7.73)

and the local part \({V }_{\mathrm{loc}}^{\mathrm{PS}}(r)\) is taken to be the pseudopotential \({V }_{l}^{\mathrm{PS}}(r)\) for one specific value of l, typically the highest one for which a pseudopotential was created. The pseudopotential (Eq. 7.72) is called semi-local, since the projector \(\hat{P}_l\) only acts on the l-th angular momentum component of the wavefunction, but not on the radius r. (Note: a pseudopotential is called nonlocal if it is l-dependent.)

To achieve higher numerical efficiency, it is common practice to transform the semi-local pseudopotential (Eq. 7.72) to a fully nonlocal form,

$$\hat{{V}}^{\mathrm{PS}}={V}_{\mathrm{loc}}^{\mathrm{PS}}(r) +{\sum \limits_{ij}}\vert{\beta}_{i}>{B}_{ij}<{\beta}_{j}\vert$$
(7.74)

using the Kleinman-Bylander prescription (Kleinman and Bylander 1982).

Vanderbilt Ultrasoft Pseudopotentials

An ultrasoft type of pseudopotential was introduced by Vanderbilt (1990) and Laasonen et al. (1993) to deal with nodeless valence states which are strongly localized in the core region. In this scheme the normconserving condition is lifted and only a small portion of the electron density inside the cutoff radius is recovered by the pseudo-wavefunction; the remainder is added in the form of so-called augmentation charges. Complications arising from this scheme are the nonorthogonality of the Kohn–Sham orbitals, the density dependence of the nonlocal pseudopotential, and the need to evaluate additional terms in atomic force calculations.

How to Obtain Pseudopotentials?

There are extensive pseudopotential libraries available for download with the simulation packages CPMD (Parrinello et al. 2008), CP2K (Hutter et al. 2009) or online (Vanderbilt Ultra-Soft Pseudopotential Site  2006). However, before applying any pseudopotentials, they should always be tested against all-electron calculations. Pseudopotentials used in conjunction with a particular density functional should have been generated using the same functional.

In many cases, the required pseudopotential will not be available in any accessible library; in this case it may be generated using freely downloadable programs (Vanderbilt Ultra-Soft Pseudopotential Site  2006).

Setting Up a Classical MD Simulation

There are two general stages that make up the preparation to conduct force-field molecular dynamics simulations: (1) gathering preliminary information and (2) building the actual system.

Gathering Preliminary Information

Gathering the preliminary information before conducting the simulation is mostly a matter of making sure that the simulation is possible. First, it is important to identify the type and number of molecules that you wish to model. Then, it is necessary to find the force-field that will allow you to most accurately model the molecules and physical system that you want to simulate. A brief synopsis of some of the larger classical force-field parameter sets is given in section “Classical Force Fields”. These force-fields and references may be good starting points in searching for the correct classical force-field to use for a given system, but the best way to find a specific force-field is simply to search for research articles that have studied the same system. If no force-field parameters exist for the system of interest, then configurations and energies from quantum simulations can be used to parameterize a given force-field for your system. The methodology used to parameterize a force-field is usually described in the original publication; however, this is a complicated exercise and is probably best left to the experts.

Building the System

After identifying that a force-field exists for the system you wish to model, the next step is to build the initial configuration of the molecules within the system. The initial configuration consists of the initial spatial coordinates of each atom in each molecule. When building a large system consisting of several molecules of various types, it is easiest to write a small computer code that contains the molecular structure and coordinates of each molecule present in the system, and then have the code replicate each molecule as many times as necessary in order to build the entire system. Alternatively, most of the molecular dynamics simulation packages previously mentioned have capabilities to build systems from a pdb file; however, these tools are often useful for only certain systems and force-fields. There is unfortunately no single tool which can be used to build any system with any force-field.

These initial configurations can represent a minimum energy structure taken from another simulation (e.g., the final structure from an energy minimization in a quantum or classical Monte Carlo simulation can be used as the starting state for classical simulations) or from experimental observation (e.g., the pdb database of crystallographic structures of proteins), or they can be built from the equilibrium bond distances and bond angles of the force-field.

The placement of the molecules within the simulation box can also be done in a number of different ways. The molecules can be placed on the vertices of a regular lattice, or in any other regularly defined geometry that may be useful for the simulation (e.g., in simulating the structural properties of micelles, the surfactant molecules are often initially placed on the vertices of a buckyball so that they start in a spherical configuration). Alternatively, molecules can be placed at random positions within the simulation box. The advantage of placing molecules at regularly spaced positions is that it is easy to ensure that no molecules overlap, whereas with randomly placed molecules it can be quite difficult to ensure that a newly placed molecule does not overlap with another molecule already in the box (particularly for large or highly branched molecules). A minimal lattice-placement sketch is given below.
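The following is illustrative only: the molecule geometry, spacing, and function name are assumptions, and a real setup also requires the force-field topology information described next.

```python
import numpy as np

# Hedged sketch: replicate a single molecule (an array of atomic coordinates)
# on the sites of a simple cubic lattice to obtain an overlap-free start.

def build_lattice_configuration(molecule_xyz, n_per_side, spacing):
    """Place copies of molecule_xyz on an n x n x n cubic lattice."""
    molecule_xyz = np.asarray(molecule_xyz)
    system = []
    for i in range(n_per_side):
        for j in range(n_per_side):
            for k in range(n_per_side):
                offset = spacing * np.array([i, j, k], dtype=float)
                system.append(molecule_xyz + offset)
    return np.vstack(system)

# Example: 27 copies of a rigid water geometry (angstrom), 4 A apart
water = [[0.000, 0.000, 0.000],   # O
         [0.757, 0.586, 0.000],   # H
         [-0.757, 0.586, 0.000]]  # H
coords = build_lattice_configuration(water, 3, 4.0)
print(coords.shape)   # (81, 3)
```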

In addition to containing the initial spatial coordinates of all of the molecules in the system, the initial configuration must also contain some additional information about the atoms and molecules in the system. Each atom in the configuration must carry a label indicating which atomic species (i.e., carbon, nitrogen, …) it represents. This label will differ between simulation codes, but all of them will have some type of label, as it informs the simulation code which force-field values to use to represent the interactions of that atom. A list of all of the covalent bonds, the bond angles, and the dihedrals in the system will also need to be included in the initial configuration. The lists of the bonds, angles, and dihedrals contain an identifier for each atom that makes up the bond, angle, or dihedral, and then an identifier for the type, which informs the simulation package which parameters to use in calculating the energy of that bond, angle, or dihedral. The final component of the initial configuration of a classical simulation is a list of all of the various types of atoms, bonds, angles, and dihedrals in the system along with their corresponding force-field parameters (i.e., ε and σ for atom types to describe their nonbonded interactions, and force constants and equilibrium values for bond, angle, and dihedral types).

Finally, after building the initial configuration, the simulation is about ready to be performed. The last step is to choose the simulation variables and set up the input to the simulation package in order to convey these selections.

These options and the decision process behind choosing from the various options will be presented in the following sections.

Preparing an Input File

Optimization Algorithms

Optimization algorithms are often used to find stationary points on a potential energy surface, i.e., local and global minima and saddle points. The only place where they directly enter MD is in the case of Born–Oppenheimer AIMD, in order to converge the SCF wavefunction for each MD step. It is immediately obvious that the choice of optimization algorithm crucially affects the speed of the simulation.

Steepest Descent

The Steepest Descent method is the simplest optimization algorithm. The initial energy \(E[{\Psi }_{0}] = E({\mathbf{c}}_{0})\), which depends on the plane wave expansion coefficients c (see Eq. 7.67), is lowered by altering c in the direction of the negative gradient,

$${ \mathbf{d}}_{n} = -\frac{\partial E({\mathbf{c}}_{n})} {\partial \mathbf{c}} \equiv -{\mathbf{g}}_{n}$$
(7.75)
$${ \mathbf{c}}_{n+1} ={ \mathbf{c}}_{n} + {\Delta }_{n}{\mathbf{d}}_{n}$$
(7.76)

where Δ n > 0 is a variable step size chosen such that the energy always decreases, and n is the optimization step index. The steepest descent method is very robust; it is guaranteed to approach the minimum. However, the rate of convergence steadily decreases as the energy gets closer to the minimum, making this algorithm rather slow.
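A minimal sketch of the steepest descent update of Eqs. 7.75–7.76, for a generic energy function and gradient (illustrative only; a simple step-halving rule stands in for the step-size control used in production codes):

```python
import numpy as np

# Hedged sketch of steepest descent (Eqs. 7.75-7.76) for a generic energy E(c)
# with gradient grad_E(c).  The step is halved whenever a trial move would
# raise the energy, so the energy never increases.

def steepest_descent(E, grad_E, c0, step=0.1, tol=1e-8, max_iter=10000):
    c = np.asarray(c0, dtype=float)
    for _ in range(max_iter):
        g = grad_E(c)
        if np.linalg.norm(g) < tol:
            break
        d = -g                               # search direction, Eq. 7.75
        while E(c + step * d) > E(c):        # shrink step until energy drops
            step *= 0.5
        c = c + step * d                     # update, Eq. 7.76
    return c

# Example on a simple quadratic surface
E = lambda c: 0.5 * c @ np.diag([1.0, 10.0]) @ c
grad_E = lambda c: np.diag([1.0, 10.0]) @ c
print(steepest_descent(E, grad_E, [1.0, 1.0]))
```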

Conjugate Gradient Methods

The Conjugate Gradient method generally converges faster than the steepest descent method due to the fact that it avoids moving in a previous search direction. This is achieved by linearly combining the gradient vector and the last search vector,

$${ \mathbf{c}}_{n+1} ={ \mathbf{c}}_{n} + {\Delta }_{n}{\mathbf{d}}_{n}$$
(7.77)

where

$${ \mathbf{d}}_{n} = -{\mathbf{g}}_{n} + {\beta }_{n}{\mathbf{d}}_{n-1}$$
(7.78)

Different recipes exist to determine the coefficient β n (Jensen 2007) among which the Polak–Ribière formula usually performs best for non-quadratic functions,

$${\beta }_{n} = \frac{{\mathbf{g}}_{n}({\mathbf{g}}_{n} -{\mathbf{g}}_{n-1})} {{\mathbf{g}}_{n-1}{\mathbf{g}}_{n-1}}$$
(7.79)

In the case of a general non-quadratic function, such as the DFT energy, conjugacy is not strictly fulfilled and the optimizer may search in completely inefficient directions after a few steps. It is then recommended to restart the optimizer (setting β = 0). Convergence can be improved by multiplying g n with a preconditioner matrix, e.g., an approximate inverse of the second derivatives matrix (the Hessian in the case of geometry optimization), \(\tilde{\mathbf{H}}\). The method is then called Preconditioned Conjugate Gradient (PCG). In the CPMD code, the matrix \(\tilde{\mathbf{H}}\) is approximated by

$$\tilde{\mathbf{H}} = \left \{\begin{array}{ll} {H}_{GG^{ \prime}}^{\mathrm{KS}}\, & \mathrm{for}\,\;G \geq {G}_{\mathrm{cut}} \\ {H}_{{G}_{\mathrm{cut}}{G}_{\mathrm{cut}}}^{\mathrm{KS}}\, & \mathrm{for}\,\;G < {G}_{\mathrm{ cut}}\\ \end{array} \right \}$$
(7.80)

where \({H}_{GG^{ \prime}}^{\mathrm{KS}}\) is the Kohn–Sham matrix in the plane-wave basis and G cut is a cutoff value for the reciprocal lattice vector G (set to a default value of 0.5 a.u.).
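For illustration, a minimal (unpreconditioned) sketch of the conjugate gradient scheme of Eqs. 7.77–7.79, including the β = 0 restart mentioned above (the backtracking line search and the function names are simplifications, not the CPMD implementation):

```python
import numpy as np

# Hedged sketch of nonlinear conjugate gradient with the Polak-Ribiere
# coefficient of Eq. 7.79.  The direction is reset (beta = 0) whenever beta
# turns negative, i.e., when conjugacy has been lost.

def conjugate_gradient(E, grad_E, c0, tol=1e-8, max_iter=1000):
    c = np.asarray(c0, dtype=float)
    g_old = grad_E(c)
    d = -g_old
    for _ in range(max_iter):
        if np.linalg.norm(g_old) < tol:
            break
        step = 1.0
        while E(c + step * d) > E(c):        # crude backtracking line search
            step *= 0.5
        c = c + step * d
        g_new = grad_E(c)
        beta = g_new @ (g_new - g_old) / (g_old @ g_old)   # Eq. 7.79
        beta = max(beta, 0.0)                # restart if conjugacy is lost
        d = -g_new + beta * d                # Eq. 7.78
        g_old = g_new
    return c

E = lambda c: 0.5 * c @ np.diag([1.0, 10.0]) @ c
grad_E = lambda c: np.diag([1.0, 10.0]) @ c
print(conjugate_gradient(E, grad_E, [1.0, 1.0]))
```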

Direct Inversion of the Iterative Subspace

Having generated a sequence of optimization steps c i , the Direct Inversion of the Iterative Subspace (DIIS) method (Császár and Pulay 1984; Hutter et al. 1994; Pulay 19801982) is designed to accelerate convergence by finding the best linear combination of stored c i vectors,

$${ \mathbf{c}}_{n+1} ={ \sum \limits _{i=1}^{n}}{a}_{ i}{\mathbf{c}}_{i}$$
(7.81)

Ideally, of course, c n + 1 is equal to the optimum vector c opt. Defining the error vector e i for each iteration as

$${ \mathbf{e}}_{i} ={ \mathbf{c}}_{i} -{\mathbf{c}}_{\mathrm{opt}}$$
(7.82)

Eq. 7.81 becomes

$${\sum \limits _{i=1}^{n}}{a}_{ i}{\mathbf{c}}_{\mathrm{opt}} +{ \sum \limits _{i=1}^{n}}{a}_{ i}{\mathbf{e}}_{i} ={ \mathbf{c}}_{\mathrm{opt}}$$
(7.83)

Equation 7.83 is satisfied if

$${\sum \limits _{i=1}^{n}}{a}_{ i} = 1$$
(7.84)

and

$${\sum \limits _{i=1}^{n}}{a}_{ i}{\mathbf{e}}_{i} = 0$$
(7.85)

Instead of the ideal case Eq. 7.85, in practice one minimizes the quantity

$$\langle {\sum \limits _{i=1}^{n}}{a}_{ i}{\mathbf{e}}_{i}\vert {\sum \limits _{j=1}^{n}}{a}_{ j}{\mathbf{e}}_{j}\rangle$$
(7.86)

subject to the constraint (Eq. 7.84), which is equivalent to solving the system of linear equations

$$\left (\begin{array}{ccccc} {b}_{11} & {b}_{12} & \cdots & {b}_{1n} & - 1 \\ {b}_{21} & {b}_{22} & \cdots & {b}_{2n} & - 1\\ \vdots & \vdots & \vdots & \vdots & \vdots \\ {b}_{n1} & {b}_{n2} & \cdots & {b}_{nn} & - 1 \\ - 1& - 1&\cdots & - 1& 0\\ \end{array} \right )\left (\begin{array}{c} {a}_{1} \\ {a}_{2}\\ \vdots \\ {a}_{n} \\ \lambda \\ \end{array} \right ) = \left (\begin{array}{c} 0\\ 0\\ \vdots \\ 0\\ -1\\ \end{array} \right )$$
(7.87)

where

$${b}_{ij} =\langle { \mathbf{e}}_{i}\vert {\mathbf{e}}_{j}\rangle$$
(7.88)

and the error vectors are approximated by

$${ \mathbf{e}}_{i} = -\tilde{{\mathbf{H}}}^{-1}({\mathbf{c}}_{\mathrm{ opt}}){\mathbf{g}}_{i}$$
(7.89)

using an approximate Hessian matrix \(\tilde{\mathbf{H}}\), e.g., Eq. 7.80.
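A minimal sketch of the DIIS extrapolation, solving the bordered linear system of Eq. 7.87 for the coefficients a i and assembling the new vector according to Eq. 7.81 (illustrative; in a real calculation the error vectors come from the approximation of Eq. 7.89 rather than from the exact errors used in this toy example):

```python
import numpy as np

# Hedged sketch of DIIS (Eqs. 7.81-7.88): given a history of trial vectors c_i
# and their error vectors e_i, solve Eq. 7.87 for the mixing coefficients and
# return the extrapolated vector of Eq. 7.81.

def diis_extrapolate(c_history, e_history):
    n = len(c_history)
    B = np.empty((n + 1, n + 1))
    for i in range(n):
        for j in range(n):
            B[i, j] = np.dot(e_history[i], e_history[j])   # b_ij = <e_i|e_j>
    B[:n, n] = -1.0
    B[n, :n] = -1.0
    B[n, n] = 0.0
    rhs = np.zeros(n + 1)
    rhs[n] = -1.0
    coeffs = np.linalg.solve(B, rhs)[:n]      # a_1 ... a_n (multiplier dropped)
    return sum(a * c for a, c in zip(coeffs, c_history))

# Toy example with three stored vectors converging toward (1, 1)
cs = [np.array([0.0, 0.0]), np.array([0.6, 0.5]), np.array([0.9, 0.8])]
es = [c - np.array([1.0, 1.0]) for c in cs]   # exact errors, for illustration
print(diis_extrapolate(cs, es))               # close to (1, 1)
```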

Controlling Temperature: Thermostats

If understanding the behavior of the system as a function of temperature is the aim of your study, then it is important to be able to control the temperature of your system. The temperature of the system is related to the time average of the kinetic energy, which generally can be calculated by

$$ {\langle {E}_{\mathrm{kin}}\rangle }_{NV T} = \frac{3}{2}N{k}_{B}T.$$
(7.90)

Below we introduce specific thermostatting techniques for MD simulations at thermodynamic equilibrium, e.g., for calculating equilibrium spatial distribution and time-correlation functions. However, when MD simulations are performed on a system undergoing some non-equilibrium process involving exchange of energy between different parts of the system, e.g., when an energetic particle, such as an atom or a molecule, hits a crystal surface, or there is a temperature gradient across the system, one has to resort to specially developed techniques, see for example Kantorovich (2008), Kantorovich and Rompotis (2008), and Toton et al. (2010). In these methods, based on the so-called Generalized Langevin Equation, the actual system on which MD simulations are performed is considered in contact with one (or more) heat bath(s) kept at constant temperature(s), and the dynamics of the system of interest reflects the fact that there is an interaction and energy transfer between the system and the surrounding heat bath(s).

Rescale Thermostat

One obvious way to control the temperature of a system is to rescale the velocities of the atoms within the system (Woodcock 1971). The rescaling factor λ is determined from \(\lambda = \sqrt{{T}_{\mathrm{target } } /{T}_{0}}\), where T target and T 0 are the target and initial temperatures, respectively. Then, the velocity of each atom is rescaled such that \({V }_{f} = \lambda {V }_{i}\). In practice, the inputs generally required to use a rescale thermostat include:

  • T 0 – Initial temperature

  • T target – Target temperature

  • τ – Damping constant (i.e., frequency with which to apply the thermostat)

  • δT – Maximum allowable temperature difference from T target before thermostat is applied

  • f rescale – Fraction of the temperature difference between the current temperature and T target that is corrected during each application of the thermostat

If it is desired to have a strict thermostat (i.e., when first starting a simulation that might have particles very near one another), then δT and τ should have values of ∼ 0. 01T target and  1 time step, respectively, and f rescale should be near 1.0. However, if you wish to allow a more lenient thermostat, then the value of δT should be of the same order of magnitude as T target, τ should be ∼ 102–103 time steps, and \({f}_{\mathrm{rescale}} \sim 0.01\)–0. 1.
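A minimal sketch of such a rescale step (illustrative; units with k B = 1 are assumed and the function name is not from any particular package):

```python
import numpy as np

# Hedged sketch of a velocity-rescale thermostat: the kinetic temperature is
# computed from the velocities, and if it deviates from T_target by more than
# delta_T_max the velocities are scaled toward the target by the fraction
# f_rescale (k_B = 1 assumed).

def rescale_thermostat(vel, masses, t_target, delta_t_max, f_rescale, k_b=1.0):
    n_dof = 3 * len(masses)
    e_kin = 0.5 * np.sum(masses[:, None] * vel ** 2)
    t_now = 2.0 * e_kin / (n_dof * k_b)
    if abs(t_now - t_target) > delta_t_max:
        # move only a fraction f_rescale of the way toward T_target
        t_new = t_now + f_rescale * (t_target - t_now)
        vel *= np.sqrt(t_new / t_now)
    return vel

# Example: 100 particles of unit mass, initially far too hot
rng = np.random.default_rng(0)
v = rng.normal(scale=2.0, size=(100, 3))
m = np.ones(100)
v = rescale_thermostat(v, m, t_target=1.0, delta_t_max=0.1, f_rescale=0.5)
```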

Berendsen Thermostat

Another way to control the temperature is to couple the system to an external heat bath, which is fixed at a desired temperature. This is referred to as a Berendsen thermostat (Berendsen et al. 1984). In this thermostat, the heat bath acts as a reservoir that supplies or removes heat as necessary. The velocities are rescaled each time step, where the rate of change of the temperature is proportional to the difference between the system temperature T(t) and the temperature of the external bath T bath:

$$\frac{dT(t)} {dt} = \frac{1} {\tau }({T}_{\mathrm{bath}} - T(t))$$
(7.91)

which when integrated results in the change in temperature each time step:

$$\Delta T = \frac{\delta t} {\tau } ({T}_{\mathrm{bath}} - T(t)).$$
(7.92)

In Eqs. 7.91 and 7.92, τ is the damping constant for the thermostat. In practice, the necessary inputs when using the Berendsen thermostat include:

  • T bath – temperature of the external heat bath

  • τ – damping constant for the thermostat

Obviously the amount of control that the thermostat imposes on the simulation is governed by the value of τ. If τ is large, then the coupling will be weak and the temperature will fluctuate significantly during the course of the simulation. If τ is small, then the coupling will be strong and the thermal fluctuations will be small. If τ = δt, then the result is in general the same as for the rescale thermostat.
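A minimal sketch of the Berendsen velocity scaling, assuming the per-step scaling factor \(\lambda = \sqrt{1 + (\delta t/\tau )({T}_{\mathrm{bath}}/T - 1)}\) that follows from Eq. 7.92 (illustrative only; k B = 1):

```python
import numpy as np

# Hedged sketch of the Berendsen weak-coupling thermostat: velocities are
# scaled each step so that the kinetic temperature changes by the Delta T of
# Eq. 7.92, i.e. by a factor lambda = sqrt(1 + (dt/tau)*(T_bath/T - 1)).

def berendsen_scale(vel, masses, t_bath, dt, tau, k_b=1.0):
    n_dof = 3 * len(masses)
    t_now = np.sum(masses[:, None] * vel ** 2) / (n_dof * k_b)
    lam = np.sqrt(1.0 + (dt / tau) * (t_bath / t_now - 1.0))
    return lam * vel

rng = np.random.default_rng(1)
v = rng.normal(size=(64, 3))
m = np.ones(64)
v = berendsen_scale(v, m, t_bath=0.5, dt=1.0, tau=100.0)
```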

Nosé–Hoover Thermostat

While the Berendsen thermostat is efficient for bringing your system to a target temperature, it does not sample a true canonical ensemble; once the system has reached thermal equilibrium, a thermostat that does generate the canonical ensemble is preferable. The extended system method, which was originally introduced by Nosé (1984a,b) and then further developed by Hoover (1985), introduces additional degrees of freedom into the Hamiltonian that describes the system, from which equations of motion can be determined.

The extended system method considers the external heat bath as an integral part of the system by including an additional degree of freedom in the Hamiltonian of the system that is represented by the variable s. As a result, the potential energy of the reservoir is

$${E}_{\mathrm{pot}} = (f + 1){k}_{B}T\ln s,$$
(7.93)

where f is the number of degrees of freedom in the physical system and T is the target temperature. The kinetic energy of the reservoir is calculated by

$${E}_{\mathrm{kin}} = \frac{Q} {2}\left( \frac{ds} {dt} \right)^{2},$$
(7.94)

where Q is a parameter with dimensions of energy ×(time)2 and is generally referred to as the “virtual” mass of the extra degree of freedom s. The magnitude of Q determines the coupling between the heat bath and the real system, thus influencing the temperature fluctuations.

Utilizing Eqs. 7.93 and 7.94, and substituting the real variables for the corresponding Nosé variables, the equations of motion are found to be as follows:

$${\mathbf{\ddot{R}}}_{I} = \frac{{\mathbf{F}}_{I}}{{M}_{I}} - \gamma \dot{{\mathbf{R}}}_{I},$$
(7.95)
$$\dot{\gamma } = - \frac{1}{{\tau }_{\mathrm{NH}}}\left( \frac{f + 1} {f} \frac{{T}_{\mathrm{target}}} {T} - 1 \right),$$
(7.96)

where \(\gamma = \frac{\dot{s}}{s}\) and \({\tau }_{\mathrm{NH}} = \frac{Q}{f{k}_{B}{T}_{\mathrm{target}}}\). The variable τ NH is an effective relaxation time, or damping constant.

In practice, the inputs that are necessary when utilizing the Nosé–Hoover thermostat during a molecular dynamics simulation include

  • T target – Target temperature

  • τNH – Damping constant

  • Q – Fictitious mass of the additional degree of freedom s

The most significant variable in the above list is Q. Large values of Q may cause poor temperature control, with the infinite limit resulting in no energy exchange between the temperature bath and the real system, which is the case of conventional molecular dynamics simulations resulting in the microcanonical ensemble. However, if Q is too small then the energy oscillates and the system will take longer in order to reach a thermal equilibrium.
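As an illustration only, the following Euler-type update implements Eqs. 7.95–7.96 exactly as written above (a rough sketch; production codes use more careful integrators, the force routine is a user-supplied assumption, and k B = 1 is assumed):

```python
import numpy as np

# Hedged sketch integrating the Nose-Hoover equations of motion as written in
# Eqs. 7.95-7.96, with a plain Euler step for both the friction coefficient
# gamma and the particle velocities.  forces(pos) must return an (N, 3) array.

def nose_hoover_step(pos, vel, masses, gamma, forces, dt, t_target, tau_nh):
    f = 3 * len(masses)                       # number of degrees of freedom
    e_kin = 0.5 * np.sum(masses[:, None] * vel ** 2)
    t_now = 2.0 * e_kin / f                   # instantaneous temperature (k_B = 1)
    # Eq. 7.96: evolution of the friction coefficient
    gamma += -dt / tau_nh * ((f + 1.0) / f * t_target / t_now - 1.0)
    # Eq. 7.95: acceleration includes the friction term -gamma * velocity
    acc = forces(pos) / masses[:, None] - gamma * vel
    vel = vel + dt * acc
    pos = pos + dt * vel
    return pos, vel, gamma
```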

Controlling Pressure: Barostats

It may be desired to study the behavior of the simulated system while the pressure is held constant (i.e., pressure-induced phase transitions). Many experimental measurements are made in conditions where the pressure and temperature are held constant and so it is of utmost importance to be able to accurately replicate these conditions in simulations.

One thing of note is that the pressure often fluctuates more than other quantities, such as the temperature in an NVT molecular dynamics simulation or the energy in an NVE molecular dynamics simulation. This is due to the fact that the pressure is related to the virial, which is the product of the positions of the particles in the system and the derivative of the potential energy function. These fluctuations will be observed in the instantaneous values of the system pressure during the course of the simulation, but the average pressure should approach the desired pressure. Since the temperature and number of atoms are generally also held constant during constant pressure simulations, while the volume of the system is allowed to change in order to arrive at the desired pressure, less compressible systems will show larger fluctuations in the pressure than systems that are more easily compressed.

Berendsen Barostat

Many of the approaches used for controlling the pressure are similar to those that are used for controlling the temperature. One approach is to maintain constant pressure by coupling the system to a constant pressure reservoir as is done in the Berendsen barostat (Berendsen et al. 1984), which is analogous to the way temperature is controlled in the Berendsen thermostat. The pressure change in the system is determined by

$$\frac{dP(t)} {dt} = \frac{1} {{\tau }_{P}}({P}_{0} - P(t)),$$
(7.97)

where τ P is the time constant of the barostat, P 0 is the desired pressure and P(t) is the system pressure at time t. In order to accommodate this change in pressure, the volume of the box is scaled by a factor of \({\mu }^{3}\) each time step, and therefore the coordinates of each particle in the system are scaled by the factor μ (i.e., \({\mathbf{R}}_{I}(t + \delta t) = \mu {\mathbf{R}}_{I}(t)\)), where

$$\mu = \left[ 1 - \frac{\delta t} {{\tau }_{P}}(P - {P}_{0}) \right]^{\frac{1} {3} }.$$
(7.98)

In practice, the inputs for the Berendsen barostat will include:

  • P 0 – Desired pressure

  • τ P – Time constant of the barostat

One other input that may be included in the use of the Berendsen barostat is to define which dimensions are coupled during the pressure relaxation. For example, you could define that the pressure is relaxed in a way that the changes in all three dimensions are coupled and therefore all of the dimensions change at the same rate. On the other hand, the pressure relaxation can be handled in an anisotropic manner, such that none of the dimensions are coupled and each dimension will have its own scaling factor that results from the individual pressure components.
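A minimal sketch of the isotropic Berendsen pressure scaling of Eq. 7.98 (illustrative; the function name and example numbers are assumptions):

```python
import numpy as np

# Hedged sketch of one Berendsen barostat step: the coordinate scaling factor
# mu of Eq. 7.98 is applied isotropically to the particle positions and to the
# box length, so the volume changes by a factor mu**3.

def berendsen_barostat_step(pos, box_length, p_now, p_target, dt, tau_p):
    mu = (1.0 - dt / tau_p * (p_now - p_target)) ** (1.0 / 3.0)
    return mu * pos, mu * box_length

# Example: a system slightly above the target pressure shrinks a little
pos = np.random.default_rng(2).uniform(0.0, 20.0, size=(10, 3))
new_pos, new_box = berendsen_barostat_step(pos, 20.0, p_now=1.2,
                                           p_target=1.0, dt=1.0, tau_p=500.0)
```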

Nosé–Hoover Barostat

Similar to the Nosé–Hoover thermostat, the extended system method has been applied to create a barostat (Hoover 1986) that is coupled with a Nosé–Hoover thermostat. In this case, the extra degree of freedom η corresponds to a “piston,” and it is added to the Hamiltonian of the system, which results in the following equations of motion:

$$\frac{d\mathbf{R}(t)} {dt} = \mathbf{V}(t) + \eta (t)(\mathbf{R}(t) -{\mathbf{R}}_{\mathrm{COM}}),$$
(7.99)
$$\frac{d\mathbf{V}(t)} {dt} = \frac{\mathbf{F}(t)} {M} - [\chi (t) + \eta (t)]\mathbf{V}(t),$$
(7.100)
$$ \frac{d\chi (t)} {dt} = \frac{1} {{\tau }_{T}^{2}}\left( \frac{T} {{T}_{0}} - 1 \right)$$
(7.101)
$$\frac{d\eta (t)} {dt} = \frac{1} {N{k}_{B}{T}_{0}{\tau }_{P}^{2}}V (t)(P - {P}_{0}),$$
(7.102)
$$\frac{dV (t)} {dt} = 3\eta (t)V (t)$$
(7.103)

where R COM are the coordinates of the center of mass of the system, χ is the thermostat extra degree of freedom and can be thought of as a friction coefficient, τ T is the thermostat time constant, η is the barostat extra degree of freedom and acts as a volume scaling rate, and τ P is the barostat time constant. Equations 7.102 and 7.103 explicitly contain the volume of the simulation box, V (t). Generally, this barostat is implemented using the approach described in Melchionna et al. (1993).

In addition to the variables that are a part of the equations of motion, there is a variable Q that represents the “mass” of the “piston.” This is analogous to the “mass” variable in the Nosé–Hoover thermostat. In practice, the required input for the Nosé–Hoover barostat will include:

  • P 0 – Desired pressure

  • T 0 – Desired temperature

  • τ P – Time constant of the barostat

  • τ T – Time constant of the thermostat

  • Q – The “mass” of the piston

Like in the case of the Nosé–Hoover thermostat, care must be taken when selecting the value of the variable Q. A small value of Q is representative of a piston with small mass, and thus will have rapid oscillations of the box size and pressure, whereas a large value of Q will have the opposite effect. The infinite limit of Q results in normal molecular dynamics behavior.

Setting the Time Step

Born–Oppenheimer MD

Since BO-MD is classical MD in the sense that the nuclei are classical particles, the same rules concerning the choice of time step apply to both BO-MD and atomistic force-field MD. The largest possible time step, δt, is determined by the fastest oscillation in the system – in many molecules this would be a bond stretching vibration involving hydrogen, e.g., CH, NH, or OH. It is immediately plausible that δt must be smaller than the shortest vibrational period in order to resolve that motion and for the numerical integrator (see section “Classical Molecular Dynamics”) to be stable. Let us assume a particular molecule has an OH vibration at 3,500 cm− 1, corresponding to a period of about 10 fs. Then the time step has to be chosen smaller than 10 fs. Using a harmonic approximation it can be shown that the Verlet algorithm is stable for \({\omega }^{2}\delta {t}^{2} < 2\) (Sutmann 2006). In the present example this would dictate a maximum time step of 2 fs. However, although such a choice guarantees numerical stability, it results in deviations from the exact answer. Therefore, in practice smaller time steps – typically around 1 fs – are often used.
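As a quick worked check of these numbers, the short sketch below converts a vibrational wavenumber into a period and applies the stability bound ω²δt² < 2 (illustrative; the function name is an assumption, the speed of light is a standard constant):

```python
import numpy as np

# Hedged sketch of the time-step estimate discussed above: convert a vibrational
# wavenumber to a period and apply the Verlet stability bound omega^2 dt^2 < 2.

C_CM_PER_S = 2.99792458e10          # speed of light in cm/s

def max_verlet_timestep(wavenumber_cm):
    nu = wavenumber_cm * C_CM_PER_S          # frequency in Hz
    period_fs = 1.0e15 / nu                  # vibrational period in fs
    omega = 2.0 * np.pi * nu                 # angular frequency in rad/s
    dt_max_fs = np.sqrt(2.0) / omega * 1.0e15
    return period_fs, dt_max_fs

print(max_verlet_timestep(3500.0))   # roughly (9.5 fs, 2.1 fs)
```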

Car–Parrinello MD

Although in CP-MD the nuclei are still treated as classical particles, the choice of time step can no longer be based solely on the highest nuclear frequency \({\omega }_{\mathrm{n}}^{\mathrm{max}}\). We also need to consider the fictitious dynamics of the electronic degrees of freedom. In fact, the optimum simulation time step is closely linked to the value of the fictitious electron mass μ as we will see in the following.

The fictitious mass μ has to be chosen small enough to guarantee adiabatic separation of electronic and nuclear motion. This means that the frequency spectrum of the electronic degrees of freedom (Marx and Hutter 2009; Pastore et al. 1991)

$${\omega }_{ph} = \sqrt{\frac{2({\epsilon }_{p } - {\epsilon }_{h } )} {\mu }}$$
(7.104)

must not overlap with the vibrational spectrum of the nuclear system. The lowest electronic frequency according to Eq. 7.104 is

$${\omega }^{\mathrm{min}} = \sqrt{\frac{2({\epsilon }_{\mathrm{LUMO } } - {\epsilon }_{\mathrm{HOMO } } )} {\mu }}$$
(7.105)

The highest electronic frequency is determined by the plane-wave cutoff energy E cut,

$${\omega }^{\mathrm{max}} \approx \sqrt{\frac{2{E}_{\mathrm{cut } } } {\mu }}$$
(7.106)

The maximum simulation time step, which is inversely proportional to ωmax, thus obeys the relation

$$\Delta {t}^{\mathrm{max}} \propto \sqrt{ \frac{\mu } {{E}_{\mathrm{cut}}}}$$
(7.107)

According to Eq. 7.107 the maximum time step can be increased by simply increasing μ. However, this would also result in a lowering of \({\omega }_{\mathrm{e}}^{\mathrm{min}}\) (see Eq. 7.105) and therefore in a smaller separation \({\omega }_{\mathrm{e}}^{\mathrm{min}} - {\omega }_{\mathrm{n}}^{\mathrm{max}}\) between the nuclear and electronic spectra.

Let us discuss the above using some realistic numbers. In the case of the H2O molecule, for example, the HOMO-LUMO gap with the BLYP functional is about 5.7 eV. Assuming a typical value of 400 a.u. for μ, the minimum electronic frequency (Eq. 7.105) is ca. 6,900 cm− 1. The highest energy molecular vibrational mode in a CP-MD simulation using these parameter values is the asymmetric stretch at about 3,500 cm− 1. This means that the electronic and nuclear spectra are well separated. A basis set cutoff of E cut = 70 Ry ( = 35 a.u.) leads to a maximum electronic frequency (Eq. 7.106) of ≈ 92,000 cm− 1, corresponding to a vibrational period of 15 a.u. Hence the CP-MD time step has to be smaller than this number. For water, a time step/fictitious mass combination of 4 a.u./400 a.u. has been shown to be a good compromise between efficiency and accuracy (Kuo et al. 2004).
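The numbers quoted above can be reproduced with a few lines (a hedged sketch; the conversion factors are standard, an angular frequency of 1 a.u. corresponds to a wavenumber of about 219,474.6 cm⁻¹, and the function name is illustrative):

```python
import numpy as np

# Hedged sketch reproducing the frequency estimates quoted above for liquid
# water: the lowest and highest electronic frequencies of Eqs. 7.105-7.106 are
# evaluated in atomic units and converted to wavenumbers.

HARTREE_TO_CM = 219474.6      # wavenumber equivalent of 1 a.u. angular frequency
EV_TO_HARTREE = 1.0 / 27.2114

def cp_frequencies(gap_ev, e_cut_hartree, mu_au):
    gap_hartree = gap_ev * EV_TO_HARTREE
    w_min = np.sqrt(2.0 * gap_hartree / mu_au)      # Eq. 7.105
    w_max = np.sqrt(2.0 * e_cut_hartree / mu_au)    # Eq. 7.106
    return w_min * HARTREE_TO_CM, w_max * HARTREE_TO_CM, 2.0 * np.pi / w_max

w_min, w_max, period_au = cp_frequencies(gap_ev=5.7, e_cut_hartree=35.0, mu_au=400.0)
print(w_min, w_max, period_au)   # roughly 7e3 cm^-1, 9e4 cm^-1, 15 a.u.
```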

If we were to increase μ to 1,000 a.u., we could afford a larger time step of about 6 a.u. (according to Eq. 7.107). However, \({\omega }_{\mathrm{e}}^{\mathrm{min}}\) (Eq. 7.105) would become ca. 4,500 cm− 1, dangerously close to \({\omega }_{\mathrm{n}}^{\mathrm{max}}\). A simple trick that is often used to be able to afford larger time steps is to replace all hydrogen atoms by deuterium atoms, thus downshifting \({\omega }_{\mathrm{n}}^{\mathrm{max}}\). For systems with a small or even vanishing (e.g., metals) bandgap it is increasingly difficult or impossible to achieve adiabatic separation of electronic and nuclear degrees of freedom following the above considerations. A solution to this problem is the use of separate thermostats for the two subsystems (Marx and Hutter 2009; Sprik 1991).

Postprocessing

Data Analysis

Spatial Distribution Functions

For a system of N particles in a volume V at temperature T, the probability of molecule 1 being in the volume element d R 1 around the position R 1, molecule 2 being in d R 2, …, molecule N being in d R N is given by McQuarrie (1992)

$${P}^{(N)}(\mathbf{R})d\mathbf{R} = {P}^{(N)}({\mathbf{R}}_{ 1},\ldots,{\mathbf{R}}_{N})\,d{\mathbf{R}}_{1}\cdots d{\mathbf{R}}_{N} = \frac{{e}^{-E(\mathbf{R})/kT}} {{Z}_{N}}\,d{\mathbf{R}}_{1}\cdots d{\mathbf{R}}_{N}$$
(7.108)

with the configuration integral

$${Z}_{N} ={ \int \nolimits \nolimits }_{V }{e}^{-E(\mathbf{R})/kT}d\mathbf{R}$$
(7.109)

where E(R) is the potential energy of the system at configuration R (cf. Eqs. 7.8 and 7.10).

For a subset of n molecules, the probability of molecule 1 being in d R 1, …, molecule n being in d R n is

$${P}^{(n)}({\mathbf{R}}_{ 1},\ldots, {\mathbf{R}}_{n}) = \frac{\int \nolimits \nolimits \cdots \int \nolimits \nolimits {e}^{-E(\mathbf{R})/kT}d{\mathbf{R}}_{n+1}\ldots d{\mathbf{R}}_{N}} {{Z}_{N}}$$
(7.110)

The probability of any molecule being in d R 1, …, any molecule n being in d R n is

$${\rho }^{(n)}({\mathbf{R}}_{ 1},\ldots,{\mathbf{R}}_{n}) = \frac{N!} {(N - n)!}{P}^{(n)}({\mathbf{R}}_{ 1},\ldots,{\mathbf{R}}_{n})$$
(7.111)

In a liquid the probability of finding any one molecule in d R 1, \({\rho }^{(1)}({\mathbf{R}}_{1})d{\mathbf{R}}_{1}\), is independent of R 1. Therefore

$$\frac{1} {V }\int \nolimits \nolimits {\rho }^{(1)}({\mathbf{R}}_{ 1})d{\mathbf{R}}_{1} = {\rho }^{(1)} = \frac{N} {V } = \rho $$
(7.112)

The dependence of the molecules of a liquid on all the other molecules, in other words, their correlation, is captured by the correlation function \({g}^{(n)}({\mathbf{R}}_{1},\ldots,{\mathbf{R}}_{n})\), which is defined by

$${\rho }^{(n)}({\mathbf{R}}_{ 1},\ldots,{\mathbf{R}}_{n}) = {\rho }^{n}{g}^{(n)}({\mathbf{R}}_{ 1},\ldots,{\mathbf{R}}_{n})$$
(7.113)

Using Eqs. 7.110–7.113 we can thus write

$${g}^{(n)}({\mathbf{R}}_{ 1},\ldots,{\mathbf{R}}_{n}) = \frac{{V }^{n}N!} {{N}^{n}(N - n)!} \frac{\int \nolimits \nolimits \cdots \int \nolimits \nolimits {e}^{-E(\mathbf{R})/kT}d{\mathbf{R}}_{n+1}\ldots d{\mathbf{R}}_{N}} {{Z}_{N}}$$
(7.114)

The two-body correlation function \({g}^{(2)}({\mathbf{R}}_{1},{\mathbf{R}}_{2})\) is of particular interest as it can be determined in X-ray diffraction experiments. In the following we shall only consider the dependence of g (2) on the interparticle distance \(R = {R}_{12} = \vert {\mathbf{R}}_{1} -{\mathbf{R}}_{2}\vert \), i.e., we have averaged over any angular dependence, and call \({g}^{(2)}({R}_{12}) = g(R)\) the radial distribution function. The quantity \(\rho g(R)d{\mathbf{R}}_{I}\) is proportional to the probability of finding another particle, I, in d R I if the reference particle is at the origin. Spherical integration yields

$$\int \nolimits \nolimits \rho g(R)4\pi {R}^{2}dR = N - 1 \approx N$$
(7.115)

showing that \(\rho g(R)4\pi {R}^{2}\,dR\) is the number of particles in the spherical volume element between R and R + dR about the central particle. The radial distribution function g(R) is proportional to the local density \(\rho (R) = \rho g(R)\) about a certain molecule. In a fluid, \(g(R) \rightarrow 1\) as \(R \rightarrow \infty \), i.e., there is no long-range order and we “see” only the average particle density. At very short range, i.e., \(R \rightarrow 0\), \(g(R) \rightarrow 0\), due to the strong repulsion between the molecules. Examples from a CP-MD simulation of liquid water are shown in Fig. 7-3. The radial distribution function g(R) provides a useful measure of the quality of a simulation as it can be compared to experimental – X-ray or neutron diffraction – data obtained by Fourier transform of the structure factor

$$ h(k) = \rho \int \nolimits \nolimits [g(R) - 1]{e}^{i\mathbf{k}\mathbf{R}}d\mathbf{R} $$
(7.116)

where k is the wave vector.
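For a stored configuration, g(R) can be estimated with a simple histogram over pair distances; the sketch below assumes a cubic box with periodic boundary conditions and identical particles (illustrative, not taken from any analysis package):

```python
import numpy as np

# Hedged sketch: histogram estimate of the radial distribution function g(R)
# for a single configuration of N identical particles in a cubic box of side
# `box`, using the minimum image convention.

def radial_distribution(pos, box, n_bins=100, r_max=None):
    n = len(pos)
    if r_max is None:
        r_max = box / 2.0
    hist = np.zeros(n_bins)
    for i in range(n - 1):
        d = pos[i + 1:] - pos[i]
        d -= box * np.round(d / box)                 # minimum image
        r = np.linalg.norm(d, axis=1)
        hist += np.histogram(r, bins=n_bins, range=(0.0, r_max))[0]
    edges = np.linspace(0.0, r_max, n_bins + 1)
    shell_vol = 4.0 / 3.0 * np.pi * (edges[1:] ** 3 - edges[:-1] ** 3)
    rho = n / box ** 3
    ideal_pairs = rho * shell_vol * n / 2.0          # pairs expected for an ideal gas
    g = hist / ideal_pairs
    r_centers = 0.5 * (edges[1:] + edges[:-1])
    return r_centers, g

# Example with randomly placed particles (g(R) should scatter around 1)
rng = np.random.default_rng(3)
r, g = radial_distribution(rng.uniform(0.0, 10.0, size=(500, 3)), box=10.0)
```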

Fig. 7-3 Radial distribution function of liquid water from CP-MD simulations at 900 and 1,200 K, respectively

In addition to characterizing the structure of a liquid, the radial distribution function may also be used to calculate thermodynamic properties such as the total energy,

$$E = \frac{3} {2}NkT + 2\pi N\rho {\int \nolimits \nolimits }_{0}^{\infty }u(R)g(R){R}^{2}dR$$
(7.117)

the pressure,

$$p = \rho kT -\frac{2} {3}\pi {\rho }^{2}{ \int \nolimits \nolimits }_{0}^{\infty }\frac{du(R)} {dR} g(R){R}^{3}dR$$
(7.118)

and the chemical potential,

$$\mu = kT\ln (\rho {\Lambda }^{3}) + 4\pi \rho {\int \nolimits \nolimits }_{0}^{1}d\xi {\int \nolimits \nolimits }_{0}^{\infty }u(R)g(R,\xi ){R}^{2}dR$$
(7.119)

where

$$\Lambda = \sqrt{ \frac{{h}^{2 } } {2\pi mkT}}$$
(7.120)

is the thermal de Broglie wavelength. By varying the coupling parameter ξ between 0 and 1, one can effectively take a molecule in and out of the system. It should be stressed that Eqs. 7.117–7.119 have been derived assuming a pairwise additive intermolecular potential u(R).

We now define the potential of mean force, i.e., the interaction between n fixed molecules averaged over the configurations of the remaining molecules \(n + 1,\ldots,N\), as

$${w}^{(n)}({\mathbf{R}}_{ 1},\ldots,{\mathbf{R}}_{n}) = -kT\ln {g}^{(n)}({\mathbf{R}}_{ 1},\ldots,{\mathbf{R}}_{n})$$
(7.121)

The mean force acting on molecule J is then obtained from

$${f}_{J}^{(n)} = -{\nabla }_{ J}{w}^{(n)}$$
(7.122)

Time Correlation Functions

The classical time autocorrelation function of some vectorial function

$$\mathbf{A}(t) = \mathbf{A}(P(t),Q(t)) = \mathbf{A}(P,Q;t)$$
(7.123)

where Q(t) and P(t) are the generalized coordinate and momentum, respectively, is defined as

$$C(t) =<\mathbf{A}(0)\mathbf{A}(t)>=\int \nolimits \nolimits \cdots \int \nolimits \nolimits dP\,dQ\mathbf{A}(P,Q;0)\mathbf{A}(P,Q;t)f(P,Q)$$
(7.124)

where f(P, Q) is the equilibrium phase space distribution function.

From the velocity autocorrelation function, for example, one can calculate the diffusion coefficient as

$$D=\frac{1}{3}{\int\nolimits\nolimits}_{0}^{\infty}<{\mathbf{V}}_{I}(0){\mathbf{V}}_{I}(t)>dt$$
(7.125)

where V I is the velocity of particle I. Alternatively, one can obtain the diffusion coefficient for long times from the associated Einstein relation,

$$6tD=<\vert{\mathbf{R}}_{I}(t)-{\mathbf{R}}_{I}(0){\vert}^{2}>$$
(7.126)

In practice, D is then determined from a linear fit to the mean square displacement (rhs of Eq. 7.126) as one sixth of the slope. An example is shown in Fig. 7-4.
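A minimal sketch of this procedure for a stored trajectory of shape (frames, atoms, 3) (illustrative; the fit range and the function name are assumptions, and in practice the MSD is also averaged over multiple time origins):

```python
import numpy as np

# Hedged sketch: mean square displacement relative to the first frame and the
# diffusion coefficient from a linear fit, D = slope / 6 (Eq. 7.126).
# traj has shape (n_frames, n_atoms, 3); dt is the time between frames.

def diffusion_coefficient(traj, dt, fit_start=0):
    disp = traj - traj[0]                              # displacement from t = 0
    msd = np.mean(np.sum(disp ** 2, axis=2), axis=1)   # average over atoms
    t = dt * np.arange(len(traj))
    slope, _ = np.polyfit(t[fit_start:], msd[fit_start:], 1)
    return slope / 6.0, t, msd

# Example: synthetic random walkers; D should come out near sigma^2/(2*dt) = 0.005
rng = np.random.default_rng(4)
steps = rng.normal(scale=0.1, size=(1000, 50, 3))
traj = np.cumsum(steps, axis=0)
D, t, msd = diffusion_coefficient(traj, dt=1.0)
print(D)
```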

Another common application of correlation functions is the calculation of IR absorption spectra. The lineshape function, I(ω), is given by the Fourier transform of the autocorrelation function of the electric dipole moment M,

$$I(\omega)= \frac{1}{2\pi}{\int \nolimits \nolimits}_{-\infty}^{\infty} <\mathbf{M}(0)\mathbf{M}(t)>{e}^{-i\omega t}dt$$
(7.127)
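A minimal sketch of this procedure, computing the dipole autocorrelation function from a stored dipole time series and Fourier transforming it (illustrative only; prefactors and quantum corrections to the lineshape are omitted):

```python
import numpy as np

# Hedged sketch: IR lineshape from the dipole autocorrelation function of
# Eq. 7.127 via a discrete Fourier transform.  dipole has shape (n_frames, 3)
# and dt is the time between stored frames.

def ir_lineshape(dipole, dt):
    dipole = dipole - dipole.mean(axis=0)              # remove the static dipole
    n = len(dipole)
    # autocorrelation <M(0).M(t)>, averaged over time origins
    acf = np.array([np.mean(np.sum(dipole[: n - s] * dipole[s:], axis=1))
                    for s in range(n // 2)])
    spectrum = np.abs(np.fft.rfft(acf))
    freq = np.fft.rfftfreq(len(acf), d=dt)             # ordinary frequency, 1/time
    return freq, spectrum
```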
Fig. 7-4 Mean square displacement of liquid water from CP-MD simulations at 900 K and linear fit to determine the diffusion constant D using Eq. 7.126

Visualization

Due to the nature of MD simulations, one of the most productive forms of analysis is to visualize the trajectory of the molecules of interest. This is particularly useful since experimental techniques cannot produce direct visual pictures of atomistic interactions; at this point, only simulations can provide them. Several powerful computer packages are commonly used to visualize a simulation trajectory. These software packages include VMD (2009), PyMol (2010), RasMol (2008), and several others (Free Molecular Visualization Software 2008). Figure 7-5 shows an example of the type of pictures that can be made using the visualization software.

Fig. 7-5 A snapshot of a micelle formed from DDAO molecules and oil molecules, rendered using the VMD software package (VMD 2009)

Each of these codes will generally accept the trajectory in any number of standard formats (e.g., pdb, xyz, …) and will then generate snapshots which can be rendered individually or as a movie. In addition to providing the visualization, these codes have become progressively more powerful analysis tools in their own right. They now have the ability to measure bond lengths, angles, and dihedrals as a function of time, and to determine the solvent accessible surface area, the hydrogen bond network, and many other useful structure-related properties of the system.