1 Introduction

The development of advanced materials for energy storage has become a topic of intense research due to their importance in powering portable devices, electric vehicles, and electrical grids that collect energy from renewable sources. During the last decade, Li-ion rechargeable batteries have become the gold standard for storing electrical energy [1,2,3,4]. However, amid the ever-growing demand for better batteries, the low cost and natural abundance of precursor materials are quickly emerging as the basis for beyond-Li-ion technology. In this context, Na, the fifth most abundant element in the Earth's crust and the second lightest and smallest alkali metal after Li, is currently considered a natural candidate for the next generation of low-cost batteries [5, 6].

Therefore, research on active materials for Na-based technologies is gathering momentum [7]. Atomic-scale computational approaches are becoming increasingly useful for these exploratory studies, as they avoid time-consuming trial-and-error experimentation [8, 9]. Based on ab initio methods and modern information technology tools, these techniques enable the assessment of critical properties of interest such as phase stability, electronic structure, and ionic conductivity [10,11,12,13,14,15,16,17]. However, the computation of some of these properties can be very time consuming and, therefore, impractical to tackle from a purely ab initio viewpoint. This is particularly true for solid-state ionics and ion intercalation processes in electrolyte and electrode materials for batteries, where the mobility of alkali ions must be modeled.

Typically, ion diffusion occurs over long timescales, and its statistically meaningful study usually requires modeling systems containing at least thousands of atoms. Such simulations are not currently feasible with ab initio methods. Classical interatomic potentials (force fields) are a practical solution to this problem in many cases, since they remove the electronic degrees of freedom and thus allow for longer timescales and larger system sizes. Many studies of electrolyte and electrode materials based on interatomic potentials have been performed in the last two decades, mainly dealing with static energy evaluations to determine ion diffusion paths and activation energies, defect chemistry, and the stability of surfaces and nanostructures [10].

In spite of such success, classical interatomic potentials are often inefficient at capturing rare events, especially when they are applied in molecular dynamics (MD) simulations. Ion diffusion processes are indeed rare events that involve ion hopping between adjacent sites and, sometimes, even collective ion transport. In order to properly simulate such phenomena in technologically relevant materials, very long simulations are normally required (see, e.g., Ref. [18]). To overcome this issue, different proposals for enhancing sampling efficiency in MD simulations of ion diffusion have been reported. For example, one can modify the particle momenta on the fly to stimulate the events of interest (ion jumps), but these methods do not preserve the desired distribution [19]. A similar approach is to rely on unphysically high temperatures to force the observation of rare diffusion events [20, 21]. The so-called Generalized Shadow Hybrid Monte Carlo (GSHMC) method is another promising technique, which has proven successful in the study of rare events in complex biological processes [22,23,24], but it has not yet been used to compute properties of solid crystalline systems.

In this work, we investigate the effectiveness of enhanced sampling approaches in the simulation of various properties of olivine \(\hbox {NaFePO}_4\) using GSHMC-based techniques. We focused on \(\hbox {NaFePO}_4\) because this system is a promising candidate as a cathode material for Na-ion batteries [25]. It is the Na counterpart of \(\hbox {LiFePO}_4\), which is used in many commercial Li-ion batteries nowadays [26]. In contrast to the Li case, \(\hbox {NaFePO}_4\) forms a stable partially sodiated structure \(\hbox {Na}_{2/3}\hbox {FePO}_4\) upon charge [25, 27] or chemical Na intercalation [13, 28]. The \(\hbox {NaFePO}_4\) and \(\hbox {Na}_{2/3}\hbox {FePO}_4\) systems offer us the opportunity to test the GSHMC sampling approach in a technologically relevant material, and they are complex enough to analyze the performance of different sampling techniques. In addition, a force field specifically developed for olivine \(\hbox {NaFePO}_4\) already exists [29].

The paper is organized as follows. In Sect. 2, we describe the force field used to model the bulk \(\hbox {NaFePO}_4\). Then, in Sect. 3 we summarize the basics of the GSHMC method and explain the additional modifications that we have introduced to the original method. Section 4 compares the efficiency in terms of accuracy and performance of two variants of GSHMC and the standard MD method to account for structural and dynamical properties of the bulk \(\hbox {NaFePO}_4\) and \(\hbox {Na}_{2/3}\hbox {FePO}_4\). Finally, conclusions are presented in Sect. 5.

2 Computational model

The force field proposed for olivine \(\hbox {NaFePO}_4\) by Whiteside et al. [29] follows the Born model, with the addition of shells to some ions. The shell model is introduced to describe ionic polarization, as suggested by Dick and Overhauser [30]. In this model, an ion is described using a central core with a charge X and a shell with a charge Y. These two charges are balanced so that the sum \((X + Y)\) equals the valence state of the ion. A core and a shell are coupled together in a core–shell unit via a harmonic potential, which allows the shell to move with respect to the core, thus simulating dielectric polarization.

The total potential energy is given by

$$\begin{aligned} U = V_\text {C} + V_\text {BH} + V_\text {CS}, \end{aligned}$$
(1)

where \(V_\text {C}\) stands for the long-range Coulomb interactions, \(V_\text {BH}\) is a Buckingham potential that models short-range repulsions and van der Waals forces between atoms, and \(V_\text {CS}\) is the interaction within each core–shell unit. In Eq. (1), the Coulomb interactions are computed between every pair of charged particles in the system but not within a core–shell unit. The short-range potential is considered solely between shells when core–shell units are involved and \(V_\text {CS}\) is computed for each core–shell unit.

The terms in Eq. (1) are explicitly given by

$$\begin{aligned} V_\text {C}(r_{ij}) = \frac{1}{4\pi \epsilon _0}\sum _{i,j=1}^N\frac{q_i q_j}{r_{ij}}, \end{aligned}$$

where \(\epsilon _0\) is the vacuum permittivity, \(r_{ij}\) is the distance between particles i and j, \(q_i\) and \(q_j\) are their respective charges, and N is the number of particles,

$$\begin{aligned} V_\text {BH}(r_{ij}) = \sum _{i,j=1}^N A_{ij} \exp \left( -\frac{r_{ij}}{\rho _{ij}} \right) - \frac{C_{ij}}{r_{ij}^6}, \end{aligned}$$

where \(A_{ij}\), \(\rho _{ij}\), and \(C_{ij}\) are positive constants defining the shapes of the repulsive and the attractive terms of the potential, and

$$\begin{aligned} V_\text {CS}(r_{l}) = \sum _{l=1}^L \frac{1}{2} \ k_l \ r_{l}^2, \end{aligned}$$

where \(k_l\) is the spring constant for the l-th core–shell unit, \(r_{l}\) is the displacement between the shell center and its core, and L is the total number of shells.

In the work by Whiteside et al. [29], an extra three-body bonding term for the O–P–O angles in the PO\(_4\) tetrahedral units was also included. It takes the form of a harmonic angle-bending potential given by

$$\begin{aligned} V_\text {Ang}(\theta _k) = \sum _{k=1}^K \frac{1}{2} k_\text {ang} (\theta _k - \theta _0)^2, \end{aligned}$$

where \(k_\text {ang}\) is the spring constant, \(\theta _0\) is the equilibrium bond angle, \(\theta _k\) is the current value of angle k, and K is the total number of angle interactions.
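To make the energy model concrete, the minimal sketch below (Python/NumPy) evaluates the Coulomb, Buckingham, and core–shell contributions of Eq. (1) for a toy three-particle configuration. The charges, Buckingham parameters, and spring constant are illustrative placeholders, not the fitted values of Table 1, and a production code would evaluate the Coulomb term with Ewald summation rather than a direct pair sum.

```python
import numpy as np

# Illustrative parameters only, not the fitted NaFePO4 values of Table 1.
KE = 14.399645  # Coulomb constant e^2/(4*pi*eps0) in eV*Angstrom

def coulomb(pos, q, excluded):
    """Direct-sum Coulomb energy, skipping excluded (core, shell) pairs."""
    e = 0.0
    n = len(q)
    for i in range(n):
        for j in range(i + 1, n):
            if (i, j) in excluded or (j, i) in excluded:
                continue
            e += KE * q[i] * q[j] / np.linalg.norm(pos[i] - pos[j])
    return e

def buckingham(pos, pairs):
    """Short-range V_BH over listed (i, j, A, rho, C) interactions."""
    e = 0.0
    for i, j, A, rho, C in pairs:
        r = np.linalg.norm(pos[i] - pos[j])
        e += A * np.exp(-r / rho) - C / r**6
    return e

def core_shell(pos, units):
    """Harmonic core-shell coupling V_CS over (core, shell, k) units."""
    return sum(0.5 * k * np.linalg.norm(pos[s] - pos[c])**2
               for c, s, k in units)

# Three particles: one core-shell unit (0 = core, 1 = its shell) and an ion (2).
pos = np.array([[0.0, 0.0, 0.0], [0.05, 0.0, 0.0], [2.0, 0.0, 0.0]])
q = np.array([0.9, -2.9, 1.0])                       # X + Y = formal charge
U = (coulomb(pos, q, excluded={(0, 1)})
     + buckingham(pos, [(1, 2, 1000.0, 0.3, 10.0)])  # shell-ion pair only
     + core_shell(pos, [(0, 1, 50.0)]))
```

Note how the (core, shell) pair is excluded from the Coulomb sum and the short-range interaction acts on the shell, mirroring the conventions described above.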

For this study, we took from Ref. [29] the full set of parameters defining the force field for olivine \(\hbox {NaFePO}_4\) (Table 1).

Table 1 Force field parameters for olivine \(\hbox {NaFePO}_4\) taken from Ref. [29]

At this point, an important issue must be mentioned regarding molecular dynamics simulations based on a core–shell potential model. In the original core–shell model, shell particles are massless, and the model requires them to be always at their optimal positions with zero forces [30]. When atomic motions are considered during dynamical simulations, the shells should respond instantaneously to the motions of the cores. Two main approaches are found in the literature to deal with the integration of the equations of motion in this case: the so-called shell relaxation (CS-min) scheme [31] and the adiabatic shells (CS-adi) method [32].

The CS-min approach consists of three steps: (1) calculate the forces on all cores with the shells fully relaxed; (2) update the core positions using the forces; (3) relax the shells for the new core positions [31]. The last step involves an energy minimization in the multidimensional space of shell configurations, which turns out to be a very computationally demanding task. The CS-adi scheme was proposed as a faster alternative to the CS-min method. In the CS-adi approach, a small fraction x of the ion mass is put on the shell, whereas the remaining \((1-x)\) fraction belongs to the core. Then, all particle positions are propagated following the conventional MD technique [32]. Having sufficiently small masses, the shells adiabatically follow the motion of the cores during the simulation. A proper choice of the mass distribution for the core–shell units is crucial for the accuracy of the method. Care has to be taken to ensure that the extra thermal energy introduced by the relative motion between a core and its shell has a negligible effect on the kinetic energy of the simulated system. However, to date there is no systematic way to assign mass values to shells.

In this study, we choose to apply the adiabatic shell scheme due to its computational efficiency and propose a novel approach for introducing a shell mass in a way that reduces its negative effect on the kinetic energy of the system.

3 Sampling

Our choice of the simulation technique for modeling olivine \(\hbox {NaFePO}_4\) has been based on two requirements: an enhanced sampling method that can efficiently sample multidimensional spaces and detect rare events, and one that can be easily extended to simulations on meso-scales. Such properties are critical for the effective study of ion transport in bulk and nanostructured materials.

The Generalized Shadow Hybrid Monte Carlo method, or GSHMC, by Akhmatskaya and Reich was originally developed for efficient atomistic simulation of complex systems [33] and was later adapted to simulations on meso-scales without losing its capacity for exact sampling at the target temperature [23]. The method, however, has never been applied to solid-state chemistry. In this study, we investigate the performance of GSHMC in the simulation of olivine \(\hbox {NaFePO}_4\) and propose some modifications to the original algorithm aiming to improve its accuracy and sampling efficiency, specifically in the simulation of battery materials.

3.1 GSHMC: Generalized Shadow Hybrid Monte Carlo

The GSHMC method is a type of Markov chain Monte Carlo, with better sampling performance than Monte Carlo or MD in molecular simulations and with negligible computing overhead. GSHMC is especially appropriate for exploring configurational spaces of high dimensionality, finding global energy minima, and simulating rare events such as phase transitions. Its theoretical foundation has been published elsewhere [23, 24, 33,34,35]. It has recently been implemented in an open-source MD package [36, 37] and applied to the study of proteins [22, 38]. In the following, we present a brief summary of the method.

Essentially, GSHMC is a Hybrid Monte Carlo (HMC) method [39] that aims to achieve high efficiency by sampling with respect to modified energies (modified or shadow Hamiltonians). At the same time, it preserves most of the dynamical information by applying a partial momentum update instead of fully resampling the momenta between molecular dynamics trajectories, as is the case of HMC.

Shadow Hamiltonians are asymptotic expansions of the true Hamiltonian in powers of the time step \(\Delta t\). They are conserved better than true Hamiltonians by symplectic integrators such as the leapfrog/Verlet algorithm commonly used in molecular simulations [40]. Thus, replacing Hamiltonians with shadow Hamiltonians in Metropolis tests leads to higher acceptance rates than those obtained in the HMC method. The computational cost required for the evaluation of shadow Hamiltonians is negligible compared to the force evaluation in an MD simulation. Efficient algorithms for computing modified energies can be found, for example, in Refs. [33, 41, 42]. The GSHMC method employs the Lagrangian formulation of shadow Hamiltonians of an arbitrary order for the leapfrog integrator [33]. In the case of the fourth order of approximation, it leads to the following shadow Hamiltonian:

$$\begin{aligned} {\mathcal {\tilde{H}}} = U + \frac{1}{2}\dot{\mathbf {x}}^{{\mathrm{T}}} M \dot{\mathbf {x}} + \frac{\Delta t^2}{12}\dot{\mathbf {x}}^{{\mathrm{T}}} M \dddot{\mathbf {x}} - \frac{\Delta t^2}{24}\ddot{\mathbf {x}}^{{\mathrm{T}}} M \ddot{\mathbf {x}}, \end{aligned}$$
(2)

where U is the potential energy, \({\mathbf {x}}\) is the positions vector, and M is the atomic mass matrix. The derivatives of the positions can be obtained using the finite difference approximation. The order of approximation of modified Hamiltonians used in the simulation also affects the acceptance rates. Higher approximation orders provide better acceptance rates, but they also require more time to compute.
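As a sanity check on this idea, the sketch below evaluates a fourth-order shadow Hamiltonian along a velocity Verlet trajectory of a 1-D harmonic oscillator and verifies that it drifts far less than the true energy. For simplicity it uses the standard gradient form of the fourth-order correction for Verlet, \(H + \Delta t^2 (\,{\mathbf {p}}^{\mathrm{T}} M^{-1}\nabla \dot{U}/12 - \nabla U^{\mathrm{T}} M^{-1}\nabla U/24\,)\), with \(\nabla \dot{U}\) obtained by a centered finite difference along the stored trajectory; the harmonic potential is only a stand-in for the real force field.

```python
import numpy as np

dt, nsteps = 0.1, 200
U = lambda x: 0.5 * x * x          # toy potential, m = 1
F = lambda x: -x                   # force = -dU/dx

# Generate a velocity Verlet trajectory, storing positions and momenta.
xs, ps = [1.0], [0.0]
x, p = 1.0, 0.0
for _ in range(nsteps):
    p += 0.5 * dt * F(x)
    x += dt * p
    p += 0.5 * dt * F(x)
    xs.append(x)
    ps.append(p)

# Evaluate true and shadow energies at the interior trajectory points.
H_true, H_shadow = [], []
for n in range(1, nsteps):
    grad_dot = (-F(xs[n + 1]) + F(xs[n - 1])) / (2 * dt)  # d(dU/dx)/dt by FD
    H = 0.5 * ps[n] ** 2 + U(xs[n])
    H_true.append(H)
    H_shadow.append(H + dt**2 * (ps[n] * grad_dot / 12 - F(xs[n]) ** 2 / 24))

drift = lambda e: max(e) - min(e)
# drift(H_shadow) is orders of magnitude smaller than drift(H_true)
```

The improved conservation of the shadow energy is exactly what makes the Metropolis test in GSHMC accept so many proposals.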

The GSHMC method consists of two alternating steps: first, one integrates a short MD trajectory at constant energy; then, one performs a partial momentum update. Each of these steps is accepted or rejected following a Metropolis test in which the acceptance probabilities are calculated using shadow Hamiltonians \({\mathcal {{\tilde{H}}}}\) instead of true Hamiltonians. The full algorithm can be summarized as follows:

  • Molecular dynamics (MD) step

    • Given vectors for positions \({\mathbf {x}}\) and momenta \({\mathbf {p}}\), temperature T and mass matrix M, integrate the Hamiltonian equations of the system:

      $$\begin{aligned} \begin{array}{c} \dot{{\mathbf {p}}}= -\displaystyle \frac{\partial {\mathcal {H}}}{\partial {\mathbf {x}}}, \ \ \dot{{\mathbf {x}}}= \displaystyle \frac{\partial {\mathcal {H}}}{\partial {\mathbf {p}}} \end{array} \end{aligned}$$
      (3)

      with

      $$\begin{aligned} \begin{array}{ll} {\mathcal {H}} =\displaystyle \frac{1}{2}{\mathbf {p}}^{{\mathrm{T}}} M^{-1} {\mathbf {p}} + U({\mathbf {x}}), \end{array} \end{aligned}$$

      using a symplectic method \(\varPsi _{\Delta t}\) over L steps with time step \(\Delta t\). This generates a new configuration \(\varPsi _{{\mathcal {T}}}(\mathbf {x},{\mathbf {p}})=({\mathbf {x}}', {\mathbf {p}}')\), with \({\mathcal {T}}=L\Delta t\).

    • Accept or reject the new configuration \(({\mathbf {x}}', {\mathbf {p}}')\) by performing a Metropolis test with the probability

      $$\begin{aligned} \min \left\{ 1,\frac{\exp \left( -\beta {\mathcal {\tilde{H}}}({\mathbf {x}}',{\mathbf {p}}') \right) }{\exp \left( -\beta {\mathcal {\tilde{H}}}({\mathbf {x}},{\mathbf {p}}) \right) } \right\} , \end{aligned}$$

      where \(\beta =1/k_{\mathrm{{B}}}T\) with \(k_{\mathrm{{B}}}\) being the Boltzmann constant and \({\mathcal {\tilde{H}}}({\mathbf {x}},{\mathbf {p}})\) the shadow Hamiltonian.

      • If accepted: save \({\mathbf {x'}}\) and \({\mathbf {p'}}\) as the current positions and momenta \(({\mathbf {x}},{\mathbf {p}})\).

      • If rejected: restore the initial \({\mathbf {x}}\) and \({\mathbf {p}}\) and negate the momenta to ensure the stationarity of the canonical distribution.

  • Partial momentum update (PMU) step

    • Generate a noise vector \({\mathbf {u}}\) from the Gaussian distribution \({\mathcal {N}} (0,\beta ^{-1} M)\) as

      $$\begin{aligned} {\mathbf {u}} = \beta ^{-1/2} M^{1/2} \xi , \end{aligned}$$

      where \(\xi =(\xi _1,\ldots ,\xi _{3N})^{{\mathrm{T}}}\), \(\xi _i\sim {\mathcal {N}}(0,1)\), \(i=1,\ldots ,3N\) and N is the system size.

    • For the current positions \({\mathbf {x}}\) update the momenta \({\mathbf {p}}\) using the partial momentum update procedure:

      $$\begin{aligned} \left( \begin{array}{c} {\mathbf {u'}} \\ {\mathbf {p'}} \end{array}\right) = \left( \begin{array}{cc} \cos (\phi ) & -\sin (\phi )\\ \sin (\phi ) & \cos (\phi ) \end{array}\right) \left( \begin{array}{c} {\mathbf {u}} \\ {\mathbf {p}} \end{array}\right) , \end{aligned}$$
      (4)

      where \(\phi \) is a parameter taking values from \((0,\pi /2]\).

    • Accept or reject the new momenta \({\mathbf {p'}}\) by performing a Metropolis test with the probability

      $$\begin{aligned} \min \left\{ 1,\frac{\exp \left( -\beta [{\mathcal {{\tilde{H}}}}({\mathbf {x}},{\mathbf {p'}})+\frac{1}{2}({\mathbf {u'}})^{{\mathrm{T}}} M^{-1} {\mathbf {u'}}] \right) }{ \exp \left( -\beta [{\mathcal {{\tilde{H}}}}({\mathbf {x}},{\mathbf {p}})+\frac{1}{2}{{\mathbf {u}}}^{{\mathrm{T}}} M^{-1} {\mathbf {u}}] \right) } \right\} . \end{aligned}$$
      • If accepted: save \({\mathbf {p'}}\) as the current momentum \({\mathbf {p}}\).

      • If rejected: restore the initial \({\mathbf {p}}\).

Repeat MD and PMU step for a desired number of iterations.
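The two alternating steps above can be sketched for a toy 1-D harmonic system as follows. For brevity, the true Hamiltonian stands in for the shadow Hamiltonian \({\mathcal {\tilde{H}}}\) (so no reweighting is needed and the scheme reduces to a generalized HMC iteration), and all parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
beta, m, dt, L, phi = 1.0, 1.0, 0.1, 10, 0.5   # illustrative parameters

U = lambda x: 0.5 * x * x                      # toy harmonic potential
F = lambda x: -x                               # force
H = lambda x, p: 0.5 * p * p / m + U(x)        # stand-in for the shadow Hamiltonian

def leapfrog(x, p):
    """L leapfrog steps of size dt."""
    p += 0.5 * dt * F(x)
    for i in range(L):
        x += dt * p / m
        p += dt * F(x) if i < L - 1 else 0.5 * dt * F(x)
    return x, p

def gshmc_iteration(x, p):
    # MD step: integrate, then Metropolis test on the (shadow) Hamiltonian.
    xn, pn = leapfrog(x, p)
    if rng.random() < min(1.0, np.exp(-beta * (H(xn, pn) - H(x, p)))):
        x, p = xn, pn
    else:
        p = -p                                 # momentum flip on rejection
    # PMU step: rotate (u, p) by phi as in Eq. (4), then a Metropolis test
    # on the extended energy.
    u = np.sqrt(m / beta) * rng.standard_normal()
    un = np.cos(phi) * u - np.sin(phi) * p
    pn = np.sin(phi) * u + np.cos(phi) * p
    dE = (H(x, pn) + 0.5 * un * un / m) - (H(x, p) + 0.5 * u * u / m)
    if rng.random() < min(1.0, np.exp(-beta * dE)):
        p = pn
    return x, p

x, p = 1.0, 0.0
xs = []
for _ in range(5000):
    x, p = gshmc_iteration(x, p)
    xs.append(x)
```

Because the rotation (4) preserves \(u^2 + p^2\), the PMU test is always accepted in this simplified setting; with genuine shadow Hamiltonians both tests become nontrivial.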

As the simulation is performed in a modified ensemble with respect to shadow Hamiltonians, reweighting has to be applied to calculations of statistical averages [33]. More specifically, given an observable \(\varOmega ({\mathbf {x}},{\mathbf {p}})\) and its values \(\varOmega _i\), \(i=1,\ldots ,K\), along a sequence of states \(({\mathbf {x}}_i,{\mathbf {p}}_i)\), \(i=1,\ldots ,K\), the averages \(\langle \varOmega \rangle \) are calculated as

$$\begin{aligned} \langle \varOmega \rangle _K =\frac{\sum _{i=1}^{K} w_i \varOmega _i}{\sum _{i=1}^{K} w_i} \end{aligned}$$

with weight factors

$$\begin{aligned} w_{i} =\exp \left[ -\beta \left( {\mathcal {H}}({\mathbf {x}}_i,{\mathbf {p}}_i) - {\mathcal {{\tilde{H}}}} ({\mathbf {x}}_i,{\mathbf {p}}_i) \right) \right] . \end{aligned}$$
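As a minimal numerical illustration (with made-up energies and observable values), the reweighted average is simply a weighted mean:

```python
import numpy as np

beta = 1.0
omega = np.array([1.2, 0.8, 1.0, 1.1])            # observable values Omega_i
h_true = np.array([10.00, 10.02, 9.98, 10.01])    # H(x_i, p_i), illustrative
h_shadow = np.array([10.01, 10.02, 9.99, 10.00])  # shadow Hamiltonian values

w = np.exp(-beta * (h_true - h_shadow))           # weight factors w_i
avg = np.sum(w * omega) / np.sum(w)               # reweighted <Omega>
```

Since the weights deviate from unity only through the small gap between true and shadow energies, they stay close to 1 and the reweighting introduces little statistical noise.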

3.2 New features introduced in GSHMC for this study

3.2.1 Modified Adaptive Integration Approach (MAIA)

As pointed out above, the original GSHMC method uses the leapfrog integrator and the corresponding modified Hamiltonians of arbitrary accuracy [33]. The leapfrog integrator is a popular choice for molecular dynamics due to its favorable combination of properties: second-order accuracy, a reasonably long stability interval, simplicity, and computational efficiency. Recently, however, Akhmatskaya et al. demonstrated that replacing the leapfrog integrator with a one-parameter 2-stage adaptive splitting integrator, specially designed for shadow Hamiltonian Monte Carlo methods, may significantly improve the accuracy and sampling performance of GSHMC [42]. The authors termed this scheme the Modified Adaptive Integration Approach, or MAIA. The adaptive integrator is uniquely determined for a given simulated system and simulation time step, in such a way that the expected error in modified Hamiltonians, \(\Delta {\mathcal {{\tilde{H}}}}\), is minimal. This immediately implies the best acceptance rates possible within a chosen setup, since GSHMC samples with respect to modified Hamiltonians and, therefore, the error \(\Delta {\mathcal {{\tilde{H}}}}\) enters the Metropolis test.

In this study, we investigate the efficiency of the MAIA method in simulations of olivine \(\hbox {NaFePO}_4\). We briefly summarize MAIA below.

A 2-stage one-parameter splitting integrator \(\psi _{\Delta t}\) of a Hamiltonian system (3) with a Hamiltonian

$$\begin{aligned} {\mathcal {H}}({\mathbf {x}},{\mathbf {p}})= \frac{1}{2} {\mathbf {p}}^{{\mathrm{T}}} M^{-1} {\mathbf {p}}+U({\mathbf {x}}) \equiv A+B \end{aligned}$$
(5)

is defined as a composition of solution flows \(\varPhi _g^X\) of the partial systems \(X \in \{A,B\}\), with \(g \in \{b \Delta t , \Delta t/2, (1/2-b) \Delta t \}\), where \(\Delta t\) is the time step and \(0 < b \le 1/4\) is the parameter of the family:

$$\begin{aligned} \psi _{\Delta t} = \left( \phi ^B_{b\Delta t} \circ \phi ^A_{\Delta t/2} \circ \phi ^B_{(1/2-b)\Delta t}\right) \circ \left( \phi ^B_{(1/2-b)\Delta t} \circ \phi ^A_{\Delta t/2} \circ \phi ^B_{b\Delta t}\right) \equiv \varPhi ^1_{\Delta t/2} \circ \varPhi ^2_{\Delta t/2}. \end{aligned}$$
(6)

The maps \(\varPhi ^1_{\Delta t/2}\) and \(\varPhi ^2_{\Delta t/2}\) advance the solution over the first and the second half-steps of length \(\Delta t/2\), respectively; hence the name 2-stage for this family of integrators.

Such an integrator is symplectic, as a composition of symplectic flows, and reversible, due to the palindromic structure of (6). The free parameter b fully characterizes a 2-stage integrator and can be chosen according to specific requirements on the properties of the integrator. Several 2-stage splitting integrators with the parameter b fixed to specific values are commonly used in molecular dynamics and/or Hybrid Monte Carlo methods [43, 44]. The most celebrated one is the Verlet/leapfrog integrator. Indeed, with \(b=1/4\) both maps in (6), \(\varPhi ^1_{\Delta t/2}\) and \(\varPhi ^2_{\Delta t/2}\), become a velocity Verlet (VV) algorithm with a time step of \(\Delta t/2\):

$$\begin{aligned} \psi _{\Delta t}= & \left( \phi ^B_{\Delta t/4} \circ \phi ^A_{\Delta t/2} \circ \phi ^B_{\Delta t/4}\right) \circ \left( \phi ^B_{\Delta t/4} \circ \phi ^A_{\Delta t/2} \circ \phi ^B_{\Delta t/4}\right) \\= & \varPhi ^1_{\Delta t/2} \circ \varPhi ^2_{\Delta t/2} \equiv \psi ^{\mathrm{{VV}}}_{\Delta t/2} \circ \psi ^{\mathrm{{VV}}}_{\Delta t/2}. \end{aligned}$$

This suggests that, in order to make a fair comparison in terms of computational efficiency between an arbitrary 2-stage scheme with parameter \(b \ne 1/4\) and the Verlet integrator in its usual formulation, a 2-stage integrator (6) should be run with twice the time step of Verlet but half the number of integration steps L, i.e., \(\Delta t_\text {2-stage}=2\Delta t_\text {Verlet}\) and \(L_\text {2-stage}=L_\text {Verlet}/2\). Some specific choices of b in (6) lead to 2-stage integrators capable of outperforming Verlet in accuracy and efficiency with appropriately selected time steps, as demonstrated in Refs. [42, 44]. However, with increasing time steps the Verlet integrator shows better performance due to its longer stability interval (see, for example, Ref. [45]).
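A direct transcription of (6) for a 1-D system makes this equivalence easy to check numerically. The sketch below implements a generic 2-stage step (kick, drift, kick, drift, kick) and, with \(b = 1/4\), reproduces two velocity Verlet steps of length \(\Delta t/2\) to round-off; the harmonic force is only a test case.

```python
# One step of the 2-stage splitting integrator (6) for a 1-D system:
# phi^A is a drift (kinetic flow), phi^B a kick (potential flow).
def two_stage_step(x, p, dt, b, force, m=1.0):
    p = p + b * dt * force(x)                # B: b*dt kick
    x = x + 0.5 * dt * p / m                 # A: dt/2 drift
    p = p + (1.0 - 2.0 * b) * dt * force(x)  # B: merged (1/2-b)+(1/2-b) kicks
    x = x + 0.5 * dt * p / m                 # A: dt/2 drift
    p = p + b * dt * force(x)                # B: b*dt kick
    return x, p

def velocity_verlet_step(x, p, dt, force, m=1.0):
    p = p + 0.5 * dt * force(x)
    x = x + dt * p / m
    p = p + 0.5 * dt * force(x)
    return x, p

force = lambda x: -x                         # harmonic test potential
x2, p2 = two_stage_step(1.0, 0.0, 0.2, 0.25, force)
xv, pv = 1.0, 0.0
for _ in range(2):                           # two VV steps of dt/2 = 0.1
    xv, pv = velocity_verlet_step(xv, pv, 0.1, force)
# with b = 1/4, (x2, p2) equals (xv, pv) up to round-off
```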

The MAIA approach provides a rational choice of an integration parameter and identifies a unique value \(b^*\) of the parameter b (and thus a unique integrator) for a given simulated system and a chosen time step \(\Delta t\) as

$$\begin{aligned} b^*=\arg \min \limits _{0<b<\frac{1}{4}} \max \limits _{0<h<\bar{h}} \rho (h,b), \end{aligned}$$
(7)

where \(\rho (h,b)\) is the upper bound for the expected value of the modified energy error \(\varDelta = {\mathcal {{\tilde{H}}}}(\psi _{L h}({\mathbf {x}},{\mathbf {p}})) - {\mathcal {\tilde{H}}}({\mathbf {x}},{\mathbf {p}})\) with respect to the modified density \(\pi ({\mathbf {x}},{\mathbf {p}})\propto {\hbox {e}}^{-\beta {\mathcal {\tilde{H}}}({\mathbf {x}},{\mathbf {p}})}\), i.e., \({\mathbb {E}}_{\pi }(\varDelta ) \le \rho (h,b)\). Here, as before, \({\mathbf {x}}\) and \({\mathbf {p}}\) are position and momentum, respectively, \(\psi _{L h}\) is a 2-stage integrator advancing the numerical solution over L steps, h is a dimensionless time step, and \(\bar{h}=\sqrt{2}\omega \Delta t\) with \(\omega \) being the highest frequency of the simulated system. Such a choice of \(b^*\) guarantees the best conservation of the modified Hamiltonians and thus the best acceptance of proposals in the GSHMC method. Depending on the values of the highest frequency of a simulated system \(\omega \), and a choice of a time step \(\Delta t\), the adaptive integrator can either coincide with already known integrators with a fixed parameter, e.g., Verlet, the minimum-error integrator, ME [43], or BCSS [44] or be a new integrator, whose efficiency is the best under the chosen conditions.

The derivation of \(\rho (h,b)\) and the formulae for modified Hamiltonians \({\mathcal {{\tilde{H}}}}({\mathbf {x}},{\mathbf {p}})\) of various orders of approximation corresponding to the multi-stage splitting integrators were obtained in Ref. [42]. Here we only present the expressions used in this study.

The fourth-order modified Hamiltonian for 2-stage splitting integrators derived in terms of quantities available during a simulation reads as:

$$\begin{aligned} {\mathcal {\tilde{H}}}({\mathbf {x}},{\mathbf {p}})= & \frac{1}{2} {\mathbf {p}}^{{\mathrm{T}}} M^{-1} {\mathbf {p}} + U({\mathbf {x}}) \\&\quad +\Delta t^2 \left( \alpha \, {\mathbf {p}}^{{\mathrm{T}}}M^{-1} \nabla \dot{U}({\mathbf {x}}) + \gamma \, \nabla U({\mathbf {x}})^{{\mathrm{T}}} M^{-1} \nabla U({\mathbf {x}}) \right) , \end{aligned}$$

with

$$\begin{aligned} \alpha= & \frac{6b^*-1}{24},\\ \gamma= & \frac{6b^{*2}-6b^*+1}{12}, \end{aligned}$$

where \(\nabla \dot{U}({\mathbf {x}})\) is the numerical time derivative of the gradient of the potential \(\nabla U({\mathbf {x}})\) and \(b^*\) is a parameter of a system specific 2-stage integrator. The upper bound function \(\rho (h,b)\) is calculated as [42]:

$$\begin{aligned} \rho (h,b)= & \frac{(SB_h+C_h)^2}{2S(1-A_h^2)},\\ S= & \frac{1+2h^2\gamma }{1+2h^2\alpha },\\ A_h= & \frac{h^4b(1-2b)}{4}-\frac{h^2}{2}+1,\\ B_h= & -\frac{h^3(1-2b)}{4}+h,\\ C_h= & -\frac{h^5b^2(1-2b)}{4}+h^3b(1-b)-h. \end{aligned}$$
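Given the closed-form expressions above, finding \(b^*\) in (7) reduces to a small minimax problem that can be solved offline. The sketch below does this with a plain grid search (a stand-in for a proper optimizer); the value of \(\bar{h}\) is illustrative, whereas for a real system it would be computed as \(\sqrt{2}\omega \Delta t\) from the highest frequency \(\omega \).

```python
# Upper bound rho(h, b) for the expected modified-energy error, transcribed
# from the closed-form expressions above.
def rho(h, b):
    alpha = (6 * b - 1) / 24
    gamma = (6 * b * b - 6 * b + 1) / 12
    S = (1 + 2 * h * h * gamma) / (1 + 2 * h * h * alpha)
    A = h**4 * b * (1 - 2 * b) / 4 - h * h / 2 + 1
    B = -h**3 * (1 - 2 * b) / 4 + h
    C = -h**5 * b * b * (1 - 2 * b) / 4 + h**3 * b * (1 - b) - h
    return (S * B + C) ** 2 / (2 * S * (1 - A * A))

def maia_b(h_bar, nb=200, nh=200):
    """Grid-search approximation to Eq. (7): minimize the worst-case bound."""
    bs = [0.01 + (0.24 - 0.01) * i / (nb - 1) for i in range(nb)]
    hs = [h_bar * (j + 1) / nh for j in range(nh)]
    return min(bs, key=lambda b: max(rho(h, b) for h in hs))

b_star = maia_b(h_bar=1.0)   # h_bar = sqrt(2)*omega*dt for a real system
```

Because this search runs once, before the simulation starts, it adds no overhead to the GSHMC production run itself.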

Importantly, finding the appropriate parameter \(b^*\) in (7) can be done at the pre-processing stage of the simulation. Therefore, the procedure does not introduce any computational overhead. Additionally, the method is available for constrained and unconstrained dynamics and it is thus applicable to a broad range of problems.

In Sect. 4, we compare performance of GSHMC achieved using two different integration schemes, the velocity Verlet and MAIA, for a range of time steps and lengths of MD trajectories. We find that using the MAIA integrators may improve performance of the original GSHMC method by a factor as high as 2.

3.2.2 Randomized Shell Mass Generalized Shadow Hybrid Monte Carlo (RSM-GSHMC)

One important drawback of the adiabatic dynamics core–shell approach is its potential negative effect on the simulated kinetic properties due to the introduction of a shell mass. Previous studies demonstrated that with the careful choice of a shell mass and a time step such an effect becomes negligible [31, 32]. There is not, however, a clear criterion for choosing these parameters, and finding the appropriate parameter values is a matter of trial and error.

In this paper, we propose to take advantage of the flexibility of the GSHMC method in order to smooth the undesired effect of the shell mass on the kinetics of a simulated system. The flexibility we refer to in this context is the possibility of varying the simulation parameters in GSHMC on the fly in a rigorous manner [42]. This can be done before starting each new molecular dynamics trajectory, i.e., at each Monte Carlo step, by randomizing the simulation parameters around pre-assigned fixed values. These parameters can be selected independently from a chosen distribution. The randomization helps to avoid bad combinations of fixed values that might lead to accuracy or performance degradation, such as slow convergence and non-ergodicity. Based on this idea, we introduced randomization of the shell mass in the GSHMC method.

We implemented the mass randomization as a part of the momentum update step. Before updating the momenta, we redistribute a fraction of the atomic mass between core and shell, keeping the total mass constant:

$$\begin{aligned} \begin{array}{c} m_{\mathrm{c},i} = m_{\mathrm{c},0} - \lambda _i r\\ m_{\mathrm{s},i} = m_{\mathrm{s},0} + \lambda _i r \end{array} \end{aligned}$$
(8)

where \(m_{\mathrm{{c},i}}\) and \(m_{\mathrm{{s},i}}\) are the core and shell masses at Monte Carlo step i, \(m_{\mathrm{{c},0}}\) and \(m_{\mathrm{{s},0}}\) are their respective initial values, r is the amount of mass that we want to randomize, and \(\lambda _i\) is a random number generated from a uniform distribution \({\mathcal {U}}(0,1)\) at step i.

It is important to note that \(m_{\mathrm{{s},0}}\) has to be large enough to ensure the stability of the numerical integrator (its minimum value depends on the time step used in the simulation). For a discussion of how to choose \(m_{\mathrm{{s},0}}\), see Ref. [32]. On the other hand, r should not be larger than \(m_{\mathrm{{c},0}}/2\), as that could lead to a situation in which the shell is actually heavier than the core. These constraints are enough to implement the algorithm rigorously; the optimal choice of r, however, remains empirical. Below we summarize the modified momentum update step in RSM-GSHMC.

  • Given the mass matrix M, generate a randomized mass matrix \(M'\) by applying the randomization described in (8) to the core and shell particles.

  • Generate a noise vector \({\mathbf {u'}}\) from the Gaussian distribution as in the original GSHMC, but using the randomized mass matrix:

    $$\begin{aligned} {\mathbf {u'}}=\beta ^{-1/2} M'^{1/2}\xi . \end{aligned}$$
  • Adjust the current momenta \({\mathbf {p}}\) to the new masses:

    $$\begin{aligned} {\mathbf {p'}}=M'^{1/2} M^{-1/2}{\mathbf {p}}. \end{aligned}$$
  • Update the candidate momenta \({\mathbf {p'}}\) using the partial momentum update procedure:

    $$\begin{aligned} \left( \begin{array}{c} {\mathbf {u''}} \\ {\mathbf {p''}} \end{array}\right) = \left( \begin{array}{cc} \cos (\phi ) & -\sin (\phi )\\ \sin (\phi ) & \cos (\phi ) \end{array}\right) \left( \begin{array}{c} {\mathbf {u'}} \\ {\mathbf {p'}} \end{array}\right) . \end{aligned}$$
  • Accept or reject the new momenta \({\mathbf {p''}}\) by performing a Metropolis test with the probability

    $$\begin{aligned} \min \left\{ 1,\frac{\exp \left( -\beta [{\mathcal {{\tilde{H}}}}({\mathbf {x}},{\mathbf {p''}})+\frac{1}{2}({\mathbf {u''}})^{{\mathrm{T}}} M'^{-1} {\mathbf {u''}}] \right) }{\exp \left( -\beta [{\mathcal {{\tilde{H}}}}({\mathbf {x}},{\mathbf {p'}})+\frac{1}{2}({\mathbf {u'}})^{{\mathrm{T}}} M'^{-1} {\mathbf {u'}}] \right) } \right\} . \end{aligned}$$
    • If accepted: save \({\mathbf {p''}}\) as the current momenta \({\mathbf {p}}\).

    • If rejected: restore the initial \({\mathbf {p}}\).
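A minimal sketch of this modified update for a single core–shell unit is given below (Python/NumPy). The masses, \(\phi \), and the randomized amount r are illustrative, and the final Metropolis test on the shadow Hamiltonian is omitted for brevity.

```python
import numpy as np

rng = np.random.default_rng(7)
beta, phi = 1.0, 0.4
m_c0, m_s0, r = 22.0, 1.0, 0.5   # illustrative core/shell masses and r

def rsm_momentum_update(p, M):
    """One randomized-shell-mass partial momentum update for [core, shell]."""
    lam = rng.random()                                  # lambda_i ~ U(0, 1)
    M_new = np.array([m_c0 - lam * r, m_s0 + lam * r])  # Eq. (8): total mass fixed
    p_adj = np.sqrt(M_new / M) * p                      # p' = M'^(1/2) M^(-1/2) p
    u = np.sqrt(M_new / beta) * rng.standard_normal(2)  # noise ~ N(0, M'/beta)
    p_cand = np.sin(phi) * u + np.cos(phi) * p_adj      # rotation, as in Eq. (4)
    # Metropolis test on the shadow Hamiltonian omitted in this sketch.
    return p_cand, M_new

M = np.array([m_c0, m_s0])
p = np.array([0.3, 0.02])
p_new, M_new = rsm_momentum_update(p, M)
```

The key invariant is that the total core-plus-shell mass is unchanged by (8), so the physical inertia of the ion is preserved while the core–shell mass split is resampled at every Monte Carlo step.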

3.2.3 Implementation: MultiHMC

The GSHMC method was implemented in the open-source molecular dynamics software GROMACS [46], version 4.5.4. Details of this implementation can be found in Refs. [36, 37]. GROMACS was chosen for its popularity, computational efficiency, and effective parallelization. The implementation of the GSHMC method was done in a self-contained manner, respecting the parallel scalability and introducing almost no computational overhead. The same software package was used for the implementation of 2-stage integrators [45] and, in particular, for the MAIA method. We call the resulting package MultiHMC-GROMACS, and it is available for public use under the GNU Lesser General Public License. All the classical atomistic simulations in this work were performed with the MultiHMC-GROMACS software.

Additionally, the randomized mass algorithm for the adiabatic core–shell model was implemented in the same software package as a single function call inside the partial momentum update step.

4 Numerical experiments

In this section, we present a series of numerical experiments performed to validate our computational model and to evaluate the performance of the proposed sampling approach. To this end, we used four different atomistic simulation methods: MD (CS-min), MD (CS-adi), GSHMC, and RSM-GSHMC. In addition, we compared the results with available experimental data and assessed the accuracy of the underlying force field by performing ground-state DFT calculations.

4.1 DFT calculations

The total energies were computed using the projector augmented-wave (PAW) method [47, 48] within the PBE generalized gradient approximation (GGA) [49] as implemented in the VASP package version 5.3.3 [50]. The GGA+U approach, in which an effective Hubbard U-like term is added to the exchange–correlation functional, was required to correctly account for the electronic correlation of iron 3d electrons [15]. We used a U value of 4.3 eV as suggested for \(\hbox {NaFePO}_4\) in other works [28, 51, 52]. An energy cutoff of 600 eV and a proper k-point mesh were used to ensure that the total energies had converged within 5 meV per formula unit (f.u.). The geometry optimization was considered converged when every force component on each atom fell below 0.02 eV/Å.

The surface energies of different possible terminations for \(\hbox {NaFePO}_4\) were computed following the approach outlined by Wang et al. [17] for its lithium counterpart. In Table 2, the DFT results obtained for surface energies are shown in comparison with the ones reported using the interatomic potential chosen for this study [29]. The tested methods provide similar values of the surface energies and similar trends in the surface stability order. The agreement is very good considering that the classical model was obtained by fitting to bulk structural properties only. These results support our choice of interatomic potential for the atomistic simulations carried out in this study.

Table 2 \(\hbox {NaFePO}_4\) surface energies (\(\gamma \)) for different terminations considered after relaxation, as determined by DFT calculations in the present work (\(\gamma _\text {DFT}\)) and with classical interatomic potentials (\(\gamma _\text {FF}\)) [29]

4.2 MD simulations

We considered two different systems: a fully sodiated \(\hbox {NaFePO}_4\) and a partially sodiated Na\(_{2/3}\)FePO\(_4\). The latter was chosen as the most stable compound reported by Saracibar et al. [28] for that composition, which corresponds to a stable ordered superstructure in the Na\(_x\)FePO\(_4\) (\(0\le x\le 1\)) phase diagram [51, 53, 54]. The unit cell of \(\hbox {Na}_{2/3}\hbox {FePO}_4\) contains 12 f.u., i.e., 80 atoms.

For bulk \(\hbox {NaFePO}_4\) we built a model system based on a (\(6 \times 6 \times 6\)) supercell containing 864 \(\hbox {NaFePO}_4\) f.u. (10368 particles in total including the Fe and O shells). For \(\hbox {Na}_{2/3}\hbox {FePO}_4\) we used a (\(6 \times 3 \times 2\)) supercell with 432 \(\hbox {Na}_{2/3}\hbox {FePO}_4\) f.u. (5040 particles). In both cases, the force field parameters are those presented in Sect. 2, with a cutoff of 12 Å for electrostatics and periodic boundary conditions applied in all three dimensions. For the partially sodiated case, the extra charge in the system due to removing 1/3 of the Na atoms was compensated by averaging the net charge over the Fe atoms, as previously suggested in a similar study of \(\hbox {LiFePO}_4\) [20].

All of the simulations were initially equilibrated with a 50 ps run using an NPT ensemble with a specified target temperature (T) and pressure \(P=1\) bar. We employed the Berendsen thermostat [55] and the Andersen barostat [56].

Na-ion diffusion events require the presence of vacant Na sites; thus, only the partially sodiated system was used for the computation of diffusion coefficients. The production runs in these cases were performed in an NVT ensemble at temperatures between 10 and 700 K using the Berendsen thermostat.

The velocity Verlet integrator was used for all MD simulations. The optimal choice of the time step in MD is discussed in Sect. 4.5.

4.3 GSHMC simulations

We tested two versions of the GSHMC method: the original approach and RSM-GSHMC. For the MD runs within the GSHMC methods we used the same setup as described above, with two exceptions: the MD trajectories were run in an NVE ensemble, so no thermostats were involved, and the MAIA integrator was chosen instead of velocity Verlet for most of the tests.

The velocity Verlet integrator was coupled with the original GSHMC method during the parameter refinement procedure in order to compare its performance with that of the MAIA integrator. As in the case of the MD simulations, the choice of the time step is discussed in the following sections. The parameter \(\phi \) in Eq. (4) was fixed at 0.2. The number of integration steps was 500 for MAIA and 1000 for velocity Verlet. The fourth-order modified Hamiltonian was used in all tests.

4.4 Validation

First, we verified that the underlying force field used in this study provides reliable results when employed for dynamical simulations. In addition, we wanted to check that the proposed simulation techniques with the chosen simulation settings are capable of accurately reproducing the properties of olivine \(\hbox {NaFePO}_4\).

To this end, we calculated the lattice constants of the fully sodiated \(\hbox {NaFePO}_4\) at \(T=300\) K and \(P=1\) bar, based on production runs of 0.5 ns using GSHMC, RSM-GSHMC, MD (CS-min), and MD (CS-adi). The results are shown in comparison with the experimental data [14] and the DFT-based calculations in Table 3. We found that all the methods yield very similar lattice constants, with relative differences less than 2%.

Table 3 Computed lattice constants of olivine \(\hbox {NaFePO}_4\) at 300 K using different approaches

We also considered the thermal expansion of bulk \(\hbox {NaFePO}_4\) to evaluate the suitability of the force field. We computed the volume expansion of the unit cell as a function of temperature by performing simulations under an NPT ensemble. The pressure was maintained using the Andersen barostat. The target temperature was controlled by using the Berendsen thermostat in MD, while GSHMC keeps T constant by design. We considered temperatures between 10 and 700 K, making sure that the box size and the potential energy were completely stabilized before measuring the volume. The resulting thermal expansion for \(\hbox {NaFePO}_4\) is shown in Fig. 1 along with the experimental results of Moreau et al. [14]. The three tested methods yield similar slopes, and the small difference observed with respect to the experimental values is negligible (relative variations are less than 1%). Therefore, we can conclude that the model combined with the simulation methods under study properly accounts for the thermal expansion of olivine \(\hbox {NaFePO}_4\).
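The volumetric expansion coefficient underlying such a comparison can be extracted from the simulated (temperature, volume) pairs with a simple linear fit. The sketch below uses purely illustrative numbers, not the data behind Fig. 1:

```python
import numpy as np

# Hypothetical (T, V) pairs from NPT runs: temperatures in K, average cell
# volumes in Å^3 (illustrative values only).
T = np.array([100.0, 200.0, 300.0, 400.0, 500.0])
V = np.array([1520.0, 1526.0, 1532.0, 1538.0, 1544.0])

# Linear fit V(T) = V0 + slope * T over the approximately linear regime
slope, V0 = np.polyfit(T, V, 1)

# Volumetric thermal expansion coefficient alpha_V = (1/V) dV/dT at T_ref
T_ref = 300.0
alpha_V = slope / (V0 + slope * T_ref)   # units: 1/K
```

In practice one would first verify that the box size and potential energy are fully equilibrated at each temperature, as done in the simulations above, before averaging the volume.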

Fig. 1
figure 1

Thermal expansion for olivine \(\hbox {NaFePO}_4\) calculated using MD, GSHMC, RSM-GSHMC in an NPT ensemble with the Andersen barostat. Experimental values are taken from Ref. [14]

As a final validation test, we present in Table 4 the average values of the potential (U) and kinetic (K) energies, the temperatures, and two structural parameters: the angles between bonded O–P–O species (\(\theta _{{\mathrm {O{-}P{-}O}}}\)) and the corresponding P–O distances (\(d_{{\mathrm {P{-}O}}}\)). As can be seen from Table 4, the randomized-mass algorithm, RSM-GSHMC, provides the best agreement with the experimental data. This is not surprising: as further demonstrated in Sect. 4.5, RSM-GSHMC is an enhanced accuracy and sampling method, and thus produces more uncorrelated samples than the other tested samplers when all methods are run for the same simulation time, as is the case in Tables 3 and 4. This should guarantee more accurate ensemble averages for RSM-GSHMC.

Table 4 Average values for temperature (T), potential energy (U), kinetic energy (K), O–P–O angles (\(\theta _{\mathrm {O{-}P{-}O}}\)) and P–O internuclear distances (\(d_{\mathrm {P-O}}\))

In Fig. 2, we show the computational performance measured in nanoseconds per day for the four methods considered. All simulations were run in parallel on 8 cores of the same computational server. In terms of performance, the adiabatic core–shell approach offers a great advantage that outweighs any marginal loss of accuracy. The significantly lower performance observed with the MD (CS-min) scheme is due to the large overhead introduced by the search for optimal shell positions at each time step. The loss of performance registered at temperatures above 500 K is a consequence of using a smaller time step, which was found necessary to keep the simulations stable at such high temperatures. The GSHMC approaches always achieved higher performance because their increased numerical stability allowed the use of longer time steps. For temperatures below 500 K, the time steps were set to 1.15 fs for MD (CS-adi) and 2.3 fs for the GSHMC methods, whereas for higher temperatures, they had to be reduced to 0.5 and 2 fs, respectively.

Fig. 2
figure 2

Computational performance for all the considered methods at different temperatures

4.5 Accuracy and sampling performance

In order to include in our tests the calculation of Na self-diffusion coefficients, we chose the partially sodiated Na\(_{2/3}\)FePO\(_4\) as a benchmark system. The diffusion coefficients are notoriously difficult to determine from dynamical simulations because they require considerably long runs to reach convergence. In this work, they were derived from the mean square displacement of Na-ions using the Einstein relation

$$\begin{aligned} \langle |{\mathbf {x}}_\text {Na}(t+\tau )-{\mathbf {x}}_\text {Na}(t)|^2\rangle = 6D \tau , \end{aligned}$$
(9)

where the term on the left-hand side, the mean square displacement of a Na-ion over a time interval \(\tau \), is proportional to the Na self-diffusion coefficient (D) and to \(\tau \).
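A minimal sketch of this estimate, assuming unwrapped Na-ion trajectories are available as a numpy array (the function name and its parameters are illustrative, not part of the MultiHMC-GROMACS tooling):

```python
import numpy as np

def diffusion_from_msd(positions, dt, max_lag, fit_start=1):
    """Estimate D via the Einstein relation, Eq. (9): MSD(tau) = 6 D tau.

    positions : unwrapped trajectories, shape (n_frames, n_ions, 3)
    dt        : time between stored frames
    max_lag   : largest frame lag used in the MSD and the linear fit
    """
    lags = np.arange(1, max_lag + 1)
    # Average the squared displacement over all ions and all time origins
    msd = np.array([np.mean(np.sum((positions[l:] - positions[:-l]) ** 2,
                                   axis=-1))
                    for l in lags])
    # Linear fit MSD(tau) = 6 D tau over the chosen lag window
    slope, _ = np.polyfit(lags[fit_start:] * dt, msd[fit_start:], 1)
    return slope / 6.0
```

For a synthetic 3D random walk with per-step displacement variance \(\sigma^2\) per dimension, this recovers the expected \(D = \sigma^2 / (2\,\Delta t)\), which is a convenient sanity check before applying the estimator to production trajectories.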

Fig. 3
figure 3

Acceptance rates for positions (left) and momenta (right) for GSHMC simulations using MAIA and velocity Verlet integrators at \(T=300\) K

In what follows, all reported properties are results of averaging over five different production runs of 2 ns each, unless stated otherwise.

The first step in optimizing the accuracy and performance of the novel approaches for predicting properties of the system of interest is to find the best combination of numerical integrators and simulation parameters. We begin by measuring the sampling efficiency of GSHMC in two different scenarios: combined with the new MAIA integrator, or with the standard velocity Verlet.

For these experiments, we chose a time step for the MAIA integrator twice as long as the one used for velocity Verlet in its usual 1-stage formulation. However, since MAIA performs two force evaluations per integration step, in contrast to only one for velocity Verlet, the number of integration steps for MAIA was set to half that for velocity Verlet. This setup equalizes the computational effort required by each integrator and makes the comparison fair (see Sect. 3.2.1 for a detailed explanation). For simplicity, from now on we use an effective time step, defined as \( \Delta t/n_{\mathrm {stages}}\), where \(n_{\mathrm {stages}}\) equals 1 for velocity Verlet and 2 for MAIA.
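This cost-parity argument can be stated as a small sanity check, here with the time steps and step counts used in this work:

```python
# Cost parity for the integrator comparison: MAIA performs two force
# evaluations per step, velocity Verlet one, so doubling the time step while
# halving the step count keeps both the simulated time and the total force
# evaluations equal. The common metric is the effective time step.

def effective_dt(dt, n_stages):
    return dt / n_stages

dt_vv, steps_vv = 1.15, 1000      # velocity Verlet: 1 stage per step
dt_maia, steps_maia = 2.30, 500   # MAIA: 2 stages per step

# Same simulated trajectory length ...
assert dt_vv * steps_vv == dt_maia * steps_maia
# ... the same total number of force evaluations ...
assert 1 * steps_vv == 2 * steps_maia
# ... and hence equal effective time steps.
assert effective_dt(dt_vv, 1) == effective_dt(dt_maia, 2)
```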

One can evaluate the influence of the tested integrators on the sampling performance by looking at the acceptance rates (AR) of the position and momentum Metropolis tests. Ideally, the AR for positions in GSHMC should be close to 100%, to minimize the time spent on computing trajectories that are finally rejected. In Fig. 3, we show the AR for positions and momenta when using the two integrators for a range of integration time steps (\(\Delta t\)). As follows from Fig. 3, MAIA always leads to better acceptance rates, for both positions and momenta, than can be achieved with velocity Verlet.

Fig. 4
figure 4

Integrated autocorrelation functions for the diffusion coefficient and structural parameters at \(T=300\) K

Another way to compare the sampling efficiency is to calculate integrated autocorrelation functions (IACF), defined as:

$$\begin{aligned} {\mathrm {IACF}}=\sum \limits _{l=0}^{K'} {\hbox {ACF}}(\tau _l), \end{aligned}$$
(10)

where ACF\((\tau _l)\), \(l=0,\ldots ,K'<K\), is the standard autocorrelation function of the time series \(\varOmega _k\), \(k=1,\ldots ,K\), of K samples, with the normalization

$$\begin{aligned} {\hbox {ACF}}(\tau _0)={\hbox {ACF}}(0)=1. \end{aligned}$$

The IACF gives a quantitative measure of the average time required to generate an uncorrelated sample; thus, lower IACF values imply more efficient sampling.
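Equation (10) with the normalization above can be sketched directly (a minimal illustration; the function name and signature are our own, and the ACF is estimated with the usual biased sample estimator):

```python
import numpy as np

def iacf(series, K_prime):
    """Integrated autocorrelation function, Eq. (10): the normalized ACF
    summed over lags 0..K_prime, with ACF(0) = 1 and K_prime < len(series)."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    var = np.mean(x * x)
    # Lag-0 term is 1 by normalization; higher lags use the sample ACF
    acf = [1.0] + [np.mean(x[:-l] * x[l:]) / var
                   for l in range(1, K_prime + 1)]
    return float(np.sum(acf))
```

For uncorrelated (white-noise) samples the IACF is close to 1, while a strongly correlated series yields a much larger value, consistent with the interpretation that lower IACF means more efficient sampling.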

In Fig. 4, we present the IACF values for several properties of the system obtained with MD and with GSHMC combined with either velocity Verlet (GSHMC-VV) or MAIA (GSHMC-MAIA), for different effective time steps. Clearly, the combination of GSHMC with MAIA always produces the lowest IACF values, which translates into more efficient sampling. On the other hand, plotting the IACF as a function of the effective time step helps to reveal the influence of the time step on the overall performance and suggests a way to choose the optimal one. More specifically, we found that the best performance for all the simulation methods was observed at an effective time step of 1.15 fs, and thus the rest of the tests were performed with this value. Also, since the GSHMC-MAIA combination provided the best sampling efficiency, we adopted this setup for the remaining studies.

Once the proper settings for the MD and GSHMC methods had been chosen, longer 4 ns simulations at constant volume and temperature (NVT) were performed. In Fig. 5, we plot the IACF for the structural parameters and self-diffusion observed with MD (CS-adi), GSHMC-VV, and GSHMC-MAIA at 300 K, relative to the corresponding IACF values obtained with RSM-GSHMC-MAIA. Clearly, the best performance is obtained with the RSM-GSHMC-MAIA simulations for all simulated properties (lowest IACF values, up to 3.3 times better than MD). This is a very promising result, especially for computing self-diffusion coefficients in solid bulk materials.

Fig. 5
figure 5

Relative IACF with respect to RSM-GSHMC-MAIA for structural properties and diffusion coefficients obtained with the optimal simulation parameters at \(T=300\) K

The enhanced sampling of the Na-ion self-diffusion observed with RSM-GSHMC in Fig. 5 implies shorter integration times for obtaining a converged self-diffusion value. Figure 6 monitors the average self-diffusion obtained with MD, GSHMC, and RSM-GSHMC at \(T=300\) K with increasing simulation time up to 4 ns. Though convergence is not fully achieved with any of the methods, the GSHMC-based ones, and especially RSM-GSHMC, demonstrate higher rates of convergence and, as a result, clear signs of convergence after 3 ns of simulation. With longer simulation times, all methods should converge to almost the same values. However, those values would still differ due to the different levels of accuracy and sampling efficiency of the tested samplers. Achieving a comparable statistical error for each method is possible, but it would require significantly longer simulations in the MD case. In general, obtaining full convergence of the diffusion coefficient using molecular dynamics is notoriously difficult and can require extremely long simulations. Since the purpose of this paper is to identify the most efficient and promising sampling method for studying electrochemically active materials, we leave a proper investigation of ion transport using RSM-GSHMC as a subject for future research.

Next, we investigated the performance of MD, GSHMC, and RSM-GSHMC over a range of temperatures by running a series of NVT simulations at temperatures between 10 and 700 K. As before (see Sect. 4.4), the integration time steps had to be reduced for temperatures greater than 500 K for all methods.

Fig. 6
figure 6

Diffusion coefficient convergence at \(T=300\) K for MD, GSHMC and RSM-GSHMC methods

We introduced a variable X that measures the sampling performance by taking into account both the effective time step \(\Delta t/n_{\mathrm {stages}}\) and the IACF as:

$$\begin{aligned} X=\frac{\Delta t/n_{\mathrm {stages}}}{{\mathrm {IACF}}}. \end{aligned}$$
(11)

In Figs. 7 and 8, we present the performance of the methods for a range of temperatures and different quantities of interest in terms of X values relative to the ones obtained with MD. We can see that the GSHMC methods with and without mass randomization offer a significant improvement over MD (up to 2.5 and 4.7 times for GSHMC and RSM-GSHMC, respectively).

Fig. 7
figure 7

Sampling performance (X) relative to MD at different temperatures achieved for several structural parameters when using the GSHMC (top) and RSM-GSHMC (bottom) methods

As we noticed in Fig. 5, the RSM-GSHMC method is particularly beneficial for calculating diffusion coefficients. This is apparent at all temperatures (see Fig. 8). For the other calculated properties, RSM-GSHMC also demonstrates superiority over MD and GSHMC at all temperatures, though its performance differs less dramatically from that of GSHMC. Yet another advantage of RSM-GSHMC over the other tested methods is that it can be further tuned by modifying the amount of randomized mass in Eq. (8) for each specific temperature.

Fig. 8
figure 8

Sampling performance (X) relative to MD at different temperatures achieved when computing the diffusion coefficients with GSHMC and RSM-GSHMC methods

5 Conclusions

We presented a new methodology for atomistic simulation of solid-state battery materials, which offers better accuracy and sampling efficiency than popular molecular dynamics (MD) approaches. The sampling in this method is performed with Generalized Shadow Hybrid Monte Carlo (GSHMC), which combines molecular dynamics trajectories with Monte Carlo steps in a rigorous and effective manner. The accuracy of the method is ensured by the new system-specific adaptive MAIA integrators used in the MD step, as well as by the modifications introduced in the adiabatic core–shell model to retain the dynamics of the simulated system. Utilizing the adiabatic core–shell model (CS-adi) in the new method instead of the core–shell relaxation scheme (CS-min) yields an important performance gain, saving up to 80% of the computational time. We applied the method to the study of olivine \(\hbox {NaFePO}_4\) systems and analyzed its accuracy and performance in comparison with available experimental data, DFT-computed properties, and the results obtained with other atomistic simulation methods (MD and conventional GSHMC). The accuracy of the method in the calculation of lattice constants and thermal expansion was compared against DFT-based calculations and experimental data, yielding reliable results for all properties. Moreover, the method demonstrates better agreement with the experimental data than the other tested atomistic methods, namely MD (CS-min), MD (CS-adi), and the original GSHMC.

Introducing the novel MAIA integrator in our new methodology has also allowed for more efficient sampling when characterizing structural properties, such as average angles between atoms and bond lengths, as well as improving stability at higher temperatures.

Applying a randomization term to the shell mass improved not only the accuracy but also the sampling efficiency, especially when measuring diffusion coefficients. This modification of the GSHMC algorithm does not introduce significant overhead and is fully compatible with parallel implementations.

In summary, the proposed methodology can be viewed as an alternative to molecular dynamics for atomistic studies of solid-state battery materials whenever high accuracy and efficient sampling are critical. Indeed, RSM-GSHMC is an importance sampling generalized hybrid Monte Carlo method that introduces stochasticity into a simulation while retaining the dynamical properties of the system, provides fast convergence due to sampling in a modified ensemble, and allows for longer simulation time steps in Hamiltonian dynamics through method-specific adaptive numerical integrators combined with holonomic constraints. The method offers better accuracy than conventional MD through rigorous temperature control, more accurate numerical integration of the Hamiltonian equations, and the reduced negative effect of the shell mass on the system's kinetics. Moreover, the RSM-GSHMC method is implemented in the popular, highly efficient GROMACS package with a low computational overhead (<2%) compared with MD, and it can be easily combined with the ensemble simulation methods available in GROMACS.

Finally, the presented GSHMC methodology can be successfully applied not only to simulations of solid-state materials but also to the study of fluids, as demonstrated in previous works [22, 33], as long as reliable force fields for the model system are provided. This means that, in principle, each of the three electroactive materials forming a battery device, i.e., the cathode, anode, and electrolyte, can be treated with the GSHMC approach to simulate, for instance, the self-diffusion of the active ions.