1 Introduction

Particle tracking (PT) methods have been used to simulate solute transport in environmental flows, including flows through porous media, for nearly half a century (Gingold and Monaghan 1977; Tompson and Dougherty 1988; Fabriol et al. 1993; Thomson 1987; Bosler et al. 2017; Ding et al. 2013; Hassan and Mohamed 2003; Ramirez et al. 2008; Roubinet et al. 2010; Semra et al. 1993). PT methods discretize solute mass onto an ensemble of Lagrangian particles that move through a flow field according to a Fokker-Planck equation (Tompson and Dougherty 1988; Kinzelbach 1988). The approach is fast and easily parallelized (e.g., Rizzo et al. 2019) because every particle is independent, and it is free of the numerical dispersion that can lead to error in Eulerian methods (LaBolle et al. 1996; Salamon et al. 2006). As well, a random walk component can be added to simulate stochastic variability in highly heterogeneous systems (LaBolle et al. 2000; Herrera et al. 2017; Dimou and Adams 1993; Srinivasan et al. 2010). Classical PT methods were designed for passive solutes or elementary, first-order growth/decay reactions because neither of these requires explicit modeling of mixing process or particle interactions; in other words, all particles remain independent at all times. However, generalized Lagrangian simulations of arbitrarily complex mixing and/or reactive transport require particles to interact, and the question of how to efficiently incorporate such capability in PT schemes has been the subject of a growing body of recent research.

One method for simulating particle interactions follows the work of Benson and Meerschaert (2008), which relies on the definition of a “particle co-location probability” based on a co-location density that reflect the underlying dispersive transport mechanisms, which may be Fickian or anomalous (e.g. Bolster et al. 2012). This probability represents the overlap of the position density functions of individual particle pairs, calculated via convolution, and the resulting co-location probability has formed the core of a series of permutations of the method. Early versions were limited to simple reactions, such as mixing-driven bimolecular irreversible reactions of the form \(A+B\rightarrow C\) (Benson et al. 2017; Paster et al. 2014, 2013), but further developments mean that now arbitrarily complex reaction networks and mixing can be simulated. In these newer schemes, each particle is a “reactor” that can carry multiple chemical species or solid/aqueous phases (Benson and Bolster 2016; Engdahl et al. 2017; Bolster et al. 2020; Schmidt et al. 2020a). Current incarnations of these methods for Lagrangian simulations of complex mixing and reaction processes fall into two basic categories: mass transfer particle tracking (MTPT) (e.g., Bolster et al. 2016; Schmidt et al. 2020b), or kernel density estimation (KDE) methods (Sole-Mari and Fernàndez-Garcia 2018; Sole-Mari et al. 2017). Both of these still discretize the mass of solutes onto particles, numerically represented as kernels, that move according to a Fokker-Planck equation. Both have been shown to be highly accurate, and both allow mixing (diffusion) and spreading (dispersion) processes to be decoupled (Benson et al. 2019), but the KDE and MTPT differ in how they conceptually represent and simulate particle interactions.

The KDE and MTPT approaches are operator splitting schemes that initially move particles following classical PT methods with, if desired, a combination of deterministic and stochastic motion, after which interactions are evaluated. For the KDE, a kernel density function is applied that essentially interpolates concentrations based on the distances between particles (Sole-Mari et al. 2019a) and the optimal density function may also be impacted by the global or local shape of the solute plume (Sole-Mari and Fernàndez-Garcia 2018). With this method, some degree of smoothing of the concentration field occurs, which can be optimized with a kernel that adapts dynamically over time, meaning that a given particle’s mass can be spread out over a bigger or smaller volume. MTPT instead allows particles to exchange mass among themselves when their co-location probabilities are nonzero (Bolster et al. 2016). This allows sharp chemical gradients to be simulated and prevents unintended long-range interactions. The MTPT co-location density can vary spatially (for example, due to spatial changes in dispersion coefficients, see Schmidt et al. (2020b)) and will also change if the time step is changed, but the MTPT particle density is always an entirely local quantity, controlled by the magnitude of the spreading processes.

The fact that the size of the co-location density function for the MTPT methods depends on the user-defined time step length implies an unusual property for a numerical scheme–namely, that the time step of the simulation, \(\varDelta t\), cannot be made arbitrarily small. Most numerical methods for continuum mechanical systems rely on a finite time approximation of, for example, temporal derivatives in a system of differential equations that are being numerically integrated. Since such methods are based on the limit definition of a derivative, the numerical solution converges to the actual solution as \(\varDelta t\) is decreased at a rate proportional to the accuracy of the numerical scheme (i.e., \(\mathcal {O}(\varDelta t^{n})\)). In general, the smaller the time step, the more accurate the solution. MTPT is different because as \(\varDelta t\) decreases, the size, or range, of the particle position density function decreases, so the magnitude of the co-location density must also decrease. This means that the accuracy of an MTPT scheme can increase up to a point as \(\varDelta t\) is reduced, but then the accuracy will begin to degrade once \(\varDelta t\) is made too small because particle interactions that drive the model will artificially cease. A lack of interactions is not necessarily detrimental because particles beyond a certain distance limit would not be able to interact in real settings. However, it is important to consider the impact of how the choice of \(\varDelta t\) limits the numerical accuracy of the family of MTPT methods, which is currently a deficiency in the MTPT literature.

Note that this behavior with temporal discretization, in terms of decreasing \(\varDelta t\), is different from that in response to spatial discretization. For particle methods, increasing particle number N is analogous to decreasing \(\varDelta x\) in an Eulerian simulation. That is to say, increasing N decreases error to an arbitrary level, and this is discussed in more depth in Schmidt et al. (2018, 2019); Sole-Mari et al. (2019b). However, the choice of N for a given simulation has a direct impact on the proper choice for \(\varDelta t\), which is further elaborated on in Sect. 2.2.

The purpose of this manuscript is to establish some rigorous guidance for choosing the optimal time step, \({\widehat{\varDelta t}}\), for MTPT methods when applied to simulation of solute transport and mixing. Here, the “optimal” step is the \(\varDelta t\) that maximizes accuracy before it begins to degrade, or the minimum permissible time step. Our main objectives are (i) to quantify the accuracy of the MTPT method as a function of \(\varDelta t\) and (ii) to determine reliable criteria for ensuring maximum accuracy in transport problems with heterogeneous velocity fields. This begins with a discussion of the basic properties of the MTPT method (Sect. 2) and analytical derivations of expressions for \({\widehat{\varDelta t}}\) across a variety of problem types, each of which displays important qualities for practical applications (Sect. 2.2). The predictors of optimal \(\varDelta t\) are tested against a suite of numerical test cases in Sect. 3, and we also provide a discussion of the choice of optimal \(\varDelta t\) on simulation run time. The results show that safe thresholds for \({\widehat{\varDelta t}}\) can be determined based on the average particle spacing in translation invariant systems, and by the maximum particle spacing in spatially variable settings, where maximum spacing may be determined analytically or by completing an inexpensive calibration run of the model (e.g., a simulation that only includes conservative random-walks but not mass transfers). The manuscript closes with a brief discussion of the implications of these findings and some notes on their practical applications.

2 Analytic Model and Methods

We consider a chemically conservative, advective-diffusive system for a single species that may be described by the equation

$$\begin{aligned} \frac{\partial C}{\partial t} + \nabla \cdot (\varvec{v} C) = \nabla \cdot (\varvec{D} \nabla C),\qquad \varvec{x} \in \varOmega \subseteq \mathbb {R}^d,\qquad t > 0, \end{aligned}$$
(1)

where \(C(t, \varvec{x})\) \([\text {mol}\ \text {L}^{-d}]\) is the concentration of the single species, \(\varvec{D}\) \([\text {L}^2\ \text {T}^{-1}]\) is the diffusion tensor (assumed here to be constant in space), and \(\varvec{v}(\varvec{x})\) \([\text {L}\ \text {T}^{-1}]\) is the spatially dependent velocity.

2.1 Mass-Transfer Particle Tracking Methods

The MTPT formalism discretizes a mass of solute (or multiple solutes) onto an ensemble of particles that are moved according to a Fokker-Planck representation of (1). Like most random walk PT (RWPT) approaches (see Salamon et al. 2006), the MTPT method represents advective motion deterministically, but MTPT can represent diffusive/dispersive transport deterministically, stochastically, or as a combination of the two; for example, deterministic diffusion with random walk dispersion, which allows one to separate mixing and spreading processes (Benson et al. 2019). Stochastic components are simulated following well-established methods for RWPT (see LaBolle et al. 1996) and deterministic mixing is simulated by inter-particle mass transfers that occur following

$$\begin{aligned} m_i(t + \varDelta t) = m_i(t) + \sum _{j = 1}^{N} \beta \left[ m_j(t) - m_i(t)\right] \mathcal {W}_{ij}, \end{aligned}$$
(2)

in which \(m_i(t)\) is the mass carried by particle i, N is the total number of particles (a discussion of this numerical formalism is provided in Sole-Mari et al. (2019b)). For simplicity, we present the case for a single chemical species but the same approach can be repeated independently for each species in multi-species systems. For Fickian diffusion, \(W_{ij} :=W\left( \varvec{X}_i, \varvec{X}_j; h\right)\) is the mass-transfer (MT) kernel with form

$$\begin{aligned} W(\varvec{X}_i, \varvec{X}_j;\; \beta , \varvec{D}, \varDelta t) = \frac{1}{\sqrt{\left( \beta ^{-1} 4 \pi \varDelta t\right) ^d \det (\varvec{D})}} \exp \left[ -\frac{\left( \varvec{X}_i - \varvec{X}_j\right) ^T \varvec{D}^{-1} \left( \varvec{X}_i - \varvec{X}_j\right) }{\beta ^{-1} 4 \varDelta t}\right] , \end{aligned}$$
(3)

in which \(d = 1, 2, 3\) is the number of spatial dimensions, \(\beta\) is a parameter that encodes the kernel’s bandwidth (described further below), and particle positions are denoted \(\varvec{X}_i,\ i = 1, \dots , N\). We note that kernel-based particle methods, while increasing in popularity, must be carefully employed, as improper choices of kernel or kernel parameters can result in inaccurate or over-smoothed concentration fields. See Sole-Mari et al. (2019b) for an analysis of the kernel and bandwidth choices we employ herein. Here, we consider the case of isotropic diffusion (\(\varvec{D} = D \varvec{I}\)) for the rest of this section, though all of the presented notions readily generalize. In this case, the weight function is

$$\begin{aligned} W(\varvec{X}_i, \varvec{X}_j;\; h) = (2 \pi h^2)^{-d/2} \exp \left[ -\frac{\Vert \varvec{X}_i - \varvec{X}_j \Vert ^2}{2 h^2}\right] , \end{aligned}$$
(4)

where the kernel is parameterized by bandwidth h. The MT kernel is normalized in order to ensure mass conservation (Sole-Mari et al. 2019b; Herrera et al. 2009; Schmidt et al. 2020b) such that

$$\begin{aligned} \mathcal {W}_{ij} :=\mathcal {W}(\varvec{X}_i, \varvec{X}_j;\; h) = \frac{W_{ij}}{\frac{1}{2}\left[ \sum _{i = 1}^{N} W_{ij} + \sum _{j = 1}^{N} W_{ij}\right] }. \end{aligned}$$
(5)

The parameter \(\beta\) that appears in (2) and (3) is defined to be

$$\begin{aligned} \beta :=\frac{2 D \varDelta t}{h^2} \implies h = \sqrt{\frac{1}{\beta } 2D \varDelta t}. \end{aligned}$$
(6)

Commonly chosen values for \(\beta\) are \(\beta = 0.5\) and \(\beta = 1\) (for an in-depth analysis of the effect of choosing different values for \(\beta\), see Sole-Mari et al. (2019b); Benson et al. (2020)), though we only consider \(\beta = 1\) for this work, as it has been shown to generate the lowest error for the locally perfectly mixed case represented by (1) (Sole-Mari et al. 2019b). Note that for \(\beta = 1\) the mass-transfer kernel is the fundamental solution to the diffusion equation

$$\begin{aligned} W(\varvec{X}_i, \varvec{X}_j; 2 D \varDelta t) = (4 \pi D \varDelta t)^{-d/2} \exp \left[ -\frac{\Vert \varvec{X}_i - \varvec{X}_j \Vert ^2}{4 D \varDelta t}\right] . \end{aligned}$$
(7)

Advection is simulated separately in this operator splitting scheme, proceeding according to the equation

$$\begin{aligned} \frac{d \varvec{X}_i}{dt} = \varvec{v}(\varvec{X}_i), \quad i = 1, \dots , N. \end{aligned}$$
(8)

The simplest method to solve (8) would be employing forward Euler; i.e.,

$$\begin{aligned} \varvec{X}_i(t + \varDelta t) = \varvec{X}_i(t) + \varvec{v} \varDelta t. \end{aligned}$$
(9)

which has first-order (\(\mathcal {O}(\varDelta t)\)) accuracy in time. When the velocity field is heterogeneous, higher-order methods tend to be more efficient than excessive time-step refinement, and we use a third-order Runge–Kutta method for some of the results in Sect. 3 that include advection.

2.2 Optimal \(\varDelta t\) for MTPT Simulations

In this section, we derive expressions for the optimal time step length, \({\widehat{\varDelta t}}\), for a given number of particles and physical parameters within a certain model. Note that the term “optimal” would seem to imply a choice of \(\varDelta t\) that minimizes error; however, for some types of simulations, error may instead remain small and relatively flat before spiking due to a too-small \(\varDelta t\). In this case, \({\widehat{\varDelta t}}\) will correspond to a minimum acceptable \(\varDelta t\) for the diffusive mass-transfers, but will still be referred to as an optimal choice because the smallest admissible \(\varDelta t\) will minimize errors in other parts of a simulation, without the need for substepping. For our purposes, this is advection, but chemical reactions could be included in other cases.

The cases for which we derive \({\widehat{\varDelta t}}\), all of which include diffusion, are as follows:

  1. 1.

    1D, constant velocity;

  2. 2.

    1D, variable velocity advection;

  3. 3.

    2D, constant velocity;

    • isotropic diffusion;

    • anisotropic diffusion.

Each case requires a slightly different approach in order to derive \({\widehat{\varDelta t}}\), but they all start from the same basic assumption. Namely, the particles in a simulation must be close enough to one another so that the MT kernel (parameterized by the magnitude of diffusion and time step length) “reaches” a particle’s nearest neighbors and allows the proper amount of mass-transfer to occur without introducing excess error. We capture this condition, in each case, by defining a ratio of inter-particle spacing to the characteristic diffusion length over a time step of length \(\varDelta t\).

For all cases, we assume the initial distribution of particles to be a regularly spaced grid. This leads to clean derivations for \({\widehat{\varDelta t}}\), but is not a necessary condition. Previous work that considered the other two conditions that reasonably arise (randomly scattered, stationary particles and random-walking particles), has shown that the randomly scattered, stationary case tends to introduce a systemic error to the simulation (due to the likelihood of initial large gaps between particles that persist through the simulation), and the random-walking case tends to exhibit error closer to the stationary, regularly spaced case (Sole-Mari et al. 2019b).

2.2.1 Optimal \(\varDelta t\) for a 1D Simulation with Constant Velocity

We first consider the simplest case possible for establishing \({\widehat{\varDelta t}}\)–that is, a constant velocity in a single spatial dimension. Because this condition leads to particle spacings that do not change with time, it is numerically equivalent to a zero velocity case, which is, in turn, equivalent to the case of pure diffusion. As discussed in Sect. 2.2, we derive \({\widehat{\varDelta t}}\) under the assumption that a particle’s mass-transfer kernel must be wide enough to “see” neighboring particles. This can be enforced by requiring that the MT kernel bandwidth should be no smaller than the inter-particle spacing (\(\varDelta s\)); i.e., for the evenly spaced particles we consider

$$\begin{aligned} h = \sqrt{\frac{1}{\beta } 2 D \varDelta t} \ge \varDelta s:=\frac{L}{N}, \end{aligned}$$
(10)

where L is the length of the 1D modeling domain. Solving for \(\varDelta t\) in (10) gives

$$\begin{aligned} \varDelta t\ge \frac{\left( L / N\right) ^2}{\frac{1}{\beta } 2 D}, \end{aligned}$$
(11)

which is to say

$$\begin{aligned} {\widehat{\varDelta t}}:=\frac{\left( L / N\right) ^2}{\frac{1}{\beta } 2 D}. \end{aligned}$$
(12)

2.2.2 Optimal \(\varDelta t\) for 1D Variable Velocity

The next case we consider is a 1D case where velocity is a function of space that leads to the compression and expansion of particle separations, as would happen in natural, 3D heterogeneous flows. We note that this 1D problem is an idealization and not reflective of realistic incompressible flows that are traditionally studied; however, the problem is valuable as a numerical exercise and serves to incrementally build the complexity of the analysis. The velocity field we consider defines slow/fast pairs \(\varvec{V} :=\{v_{\text {min}}, v_{\text {max}}\}\) that are imposed in space at alternating points of a velocity grid with spacing \(\varDelta x\), and in between these points we linearly interpolate the velocity. Additionally, we impose a constant velocity of \(v_{\text {min}}\) at the beginning (left-hand side) of the domain to allow for some “setting time” before entering the variable velocity field. Figure 1 shows an example of this “sawtooth” velocity field, with the midpoint of the initially evenly spaced particle plume indicated by \(X_0\) (vertical red line).

Fig. 1
figure 1

Example of the velocity field for the condition discussed in Sects. 2.2.2 and 3.2. The vertical line indicates the midpoint of the initial particle plume that is symmetric about the midpoint and evenly spaced

The problem of estimating \({\widehat{\varDelta t}}\) becomes more challenging for this condition because inter-particle spacings are no longer constant (or equal for all particles), and thus, the numerator of (12) must be calculated in a different manner. Rather than deriving an analytical expression for \(\varDelta s\) that would be valid only for this exact type of velocity field, we take a more general approach to estimating \({\widehat{\varDelta t}}\). We define

$$\begin{aligned} {\widehat{\varDelta t}}:=\frac{{\overline{\varDelta s}}_{\text {max}}^2}{\frac{1}{\beta } 2 D}, \end{aligned}$$
(13)

where an estimate of the mean inter-particle spacing at time t may be calculated as

$$\begin{aligned} {\overline{\varDelta s}}(t) :=\frac{X_{\text {max}}(t) - X_{\text {min}}(t)}{N}. \end{aligned}$$
(14)

We note that for some problems, such as this one, this quantity may be calculated analytically; however, in more complicated cases, it may always be determined empirically by running a purely advective simulation. We then calculate \({\overline{\varDelta s}}_{\text {max}}\) to be the maximum \({\overline{\varDelta s}}(t)\) over all k time steps and m values of \(\varDelta t\)

$$\begin{aligned} {\overline{\varDelta s}}_{\text {max}} :=\max _{\varDelta t\in \{\varDelta t_1, \dots , \varDelta t_m\}}\left( \max _{k} {\overline{\varDelta s}}_k\right) . \end{aligned}$$
(15)

2.2.3 Optimal \(\varDelta t\) for 2D Constant Velocity

Just as in the 1D case considered in Sect. 2.2.1, particle spacings do not change when velocity is constant. As a result, for a square domain \(\varOmega :=[0, L] \times [0, L]\), and assuming an equal spacing of N grid-aligned particles in the x- and y-directions, we arrive at the following expression for the optimal time step in the isotropic diffusion case

$$\begin{aligned} {\widehat{\varDelta t}}:=\frac{\left( L / \sqrt{N}\right) ^2}{\frac{1}{\beta } 2 D}. \end{aligned}$$
(16)

The explanation for the change in the numerator between (12) and (16) is that, in this case of grid-aligned particles arranged in a square, a particle and its nearest neighbor are separated by a distance of \(\varDelta s= L / \sqrt{N}\).

For the case of anisotropic diffusion with the diagonal diffusion tensor (without loss of generality, as any non-diagonal case may be diagonalized for this analysis)

$$\begin{aligned} \varvec{D} :=\begin{bmatrix} D_x &{} 0 \\ 0 &{} D_y \end{bmatrix}, \end{aligned}$$

we instead have the quantities

$$\begin{aligned} {\widehat{\varDelta t}}_i :=\frac{\left( L / \sqrt{N}\right) ^2}{\frac{1}{\beta } 2 D_i},\quad i = x, y, \end{aligned}$$
(17)

and

$$\begin{aligned} {\widehat{\varDelta t}}:=\max \left\{ {\widehat{\varDelta t}}_x, {\widehat{\varDelta t}}_y\right\} . \end{aligned}$$
(18)

Note that both of the above formulations assume the square domain that we impose; however, both may be reformulated in a way that captures the relevant length-scale of interest. For example, with a domain \(\varOmega :=[0, L_x] \times [0, L_y]\), a quantity such as \(\left( \left[ L_x L_y\right] / \sqrt{N}\right) ^2\) may be used.

3 Results

In this section, we conduct numerical experiments to investigate the efficacy of the guidelines derived in Sect. 2.2 for choosing the optimal time step length, \({\widehat{\varDelta t}}\), for a given model with fixed physical parameters and a chosen particle number (N). We test each condition described in Sect. 2.2, including: 1D, constant velocity (Sect. 3.1); 1D, variable, “sawtooth” velocity field (Sect. 3.2); and 2D, constant velocity with both isotropic and anisotropic diffusion (Sect. 3.3). Finally, we present results for a 2D problem with linear shear flow (Bolster et al. 2011), which is one of the most extreme cases of rapid particle separations, inducing “hyper-mixing” caused by high transverse variability in the velocity magnitude (Sect. 3.4). Given the complexity of the shear flow, we could not rigorously derive a value for \({\widehat{\varDelta t}}\) using the same methods as before, but we do present a heuristically motivated guideline that seems to predict \({\widehat{\varDelta t}}\) with sufficient accuracy for modeling.

Additionally, in Sect. 3.5, we investigate the relationship between \({\widehat{\varDelta t}}\) and the computational cost, or run time, of a simulation. This is because run time for a given simulation is a serious concern for modelers who may use these methods. The increased accuracy gained by choosing a smaller value for \(\varDelta t\) may not be worth it if this choice results in unacceptable run times.

In all but one case, we employ an initial condition that has an analytical solution (the 1D sawtooth velocity field has no analytical solution, so we adopt a classical RWPT simulation as our reference solution), and for all cases, we calculate the root-mean-squared error (RMSE) in comparison to the analytical/reference solution. Results are plotted as RMSE vs. \(\varDelta t\) (which decreases from left to right on all plots); the expected behavior is that error either decreases steadily until \({\widehat{\varDelta t}}\) or decreases to a minimal level and maintains that level until \({\widehat{\varDelta t}}\). Then, after \(\varDelta t\) becomes smaller than \({\widehat{\varDelta t}}\), particles stop fully communicating via mass-transfers, and error increases steadily until reaching a maximal level.

The code used to generate the results in this section is available at https://doi.org/10.5281/zenodo.5542405 (Schmidt et al. 2021).

3.1 1D Constant Velocity

A constant velocity field is translation invariant so, numerically, it is identical to the zero-velocity case for the purposes of simulating mass-transfer and determining \({\widehat{\varDelta t}}\). As such, we simulate a purely diffusive system on static particles that are evenly spaced on \(x = [0, 50]\). Additionally, we test our predictor for \({\widehat{\varDelta t}}\) using the two most commonly used kernel bandwidths \(\beta \in \{0.5, 1\}\). The \(\beta = 0.5\) kernel may be interpreted as the co-location density for two particles moving via diffusion and operates on the principle that the probability of co-location between particle pairs determines how much mass they share and thus transfer (Benson and Bolster 2016; Schmidt et al. 2018, 2019). In contrast, the \(\beta = 1\) kernel is the fundamental solution to the diffusion equation, and, as such, the solutions generated by it are the analytical solution to the diffusion equation discretized at nearby particle locations (Sole-Mari et al. 2019b; Schmidt et al. 2020b). For these two chosen bandwidths, we consider both Gaussian and Heaviside initial conditions that admit Gaussian and error function analytical solutions, respectively. Simulations are conducted for differing particle numbers, \(N \in \{250, 500, 1000, 2500, 5000\}\), \(\varDelta t\in [10^{-5}, 10^{1}]\), \(D = 1\ \left[ L^2 T^{-1}\right]\), and total simulation time \(T_{\text {max}} = 10\ [T]\).

We see in Figs. 2 and 3 that our predictor for \({\widehat{\varDelta t}}\) (depicted by dotted lines with color matching the corresponding curve for a fixed N) very accurately determines the optimal \(\varDelta t\) as we have defined it, for all considered cases. One behavior we must note is that for the \(\beta = 0.5\) case, \({\widehat{\varDelta t}}\) is in fact the optimal \(\varDelta t\) in the classical sense of the word, while, for the \(\beta = 1\) case, we see error maintain a very low level before it jumps sharply upward after the lower bound of \({\widehat{\varDelta t}}\) is violated, making it an “optimal” \(\varDelta t\) in the sense discussed in Sect. 2.2. We note here that we also tested for a range of other \(\beta\) values and observe qualitatively similar behavior. Thus, for the remainder of Sect. 3, we will only depict results for \(\beta = 1\), though other values were tested in all cases.

To obtain a more tangible feel for the impact of choosing good, bad, or optimal \(\varDelta t\) on the error in a simulation, consider Fig. 4. The experiment we consider is the 1D purely diffusive with Gaussian initial condition and \(\beta = 1\), and the error plots for this experiment are depicted in Fig. 2. In Fig. 4, we plot the final results for the MTPT simulation versus the analytic solution for the \(N = 2500\) case, corresponding to the purple curve in the error plots shown in Fig. 2. In panel (a), we see the results when employing \(\varDelta t= 10\) that is larger than \({\widehat{\varDelta t}}\), and we notice a small but visually distinguishable discrepancy between the computed and exact solution (see inset). In panel (b), we see the results for \(\varDelta t= {\widehat{\varDelta t}}= 2 \times 10^{-4}\) and observe near-perfect coincidence between computed and exact solutions. Finally, in panel (c), we see the results of choosing \(\varDelta t= 1 \times 10^{-5}\) that is less than the lower bound of \({\widehat{\varDelta t}}\), and we see unacceptably large error because the time step choice is too small to allow meaningful amounts of mass transfer to occur.

Fig. 2
figure 2

Results for a 1D purely diffusive MTPT simulation with Gaussian initial condition (a) and Heaviside initial condition (b) for the \(\beta = 1\) case. We see that \({\widehat{\varDelta t}}\) (indicated by the dashed vertical lines with color corresponding to a given particle number) accurately predicts the optimal (minimum acceptable) time step length, in terms of resultant error

Fig. 3
figure 3

Results for a 1D purely diffusive MTPT simulation with Gaussian initial condition (a) and Heaviside initial condition (b) for the \(\beta = 0.5\) case. We see that \({\widehat{\varDelta t}}\) (indicated by the dashed vertical lines with color corresponding to a given particle number) accurately predicts the optimal time step length, in terms of resultant error

Fig. 4
figure 4

Concentration plots for simulated results versus the analytic solution for a 1D purely diffusive MTPT simulation with Gaussian initial condition and \(\beta = 1\). a For a large value of \(\varDelta t= 10 > {\widehat{\varDelta t}}\), we see imperfect fit between the computed and exact solution, primarily at the center of the domain. b For \(\varDelta t= 2 \times 10^{-4} = {\widehat{\varDelta t}}\), we see near-perfect match between the computed and exact solution. c For \(\varDelta t= 1 \times 10^{-5} < {\widehat{\varDelta t}}\), we see unacceptably bad fit between the computed and exact solution due to violating the lower bound of \({\widehat{\varDelta t}}\)

3.2 1D Spatially Variable Velocity

For this case, we once again consider 1D diffusion simulated via MTPT, but we add in a spatially variable velocity field that alternates between slow and fast velocities, with a linear interpolation between, as depicted in Fig. 1. We consider three slow/fast pairs \(\varvec{V}_i = \left\{ v_{\text {min}}, v_{\text {max}}\right\} \ \left[ L T^{-1}\right] ,\ i = 1, 2, 3\), and the numerical values are given in Table 1. We set the diffusion coefficient \(D = 10^{-3}\ \left[ L^2 T^{-1}\right]\), and we calculate \(T_{\text {max}}\) for each \(\varvec{V}_i\) such that a particle starting at the midpoint of the particle plume (\(X_0\) in Fig. 1) will (on average) experience two full periods of the velocity field. For all three velocity fields, we run MTPT simulations with \(10^3\) particles. Because there is no analytical solution to this system, we compute error by comparing the MTPT results to a 500-run ensemble average of classical random-walk particle tracking (RWPT) simulations (e.g., Benson and Meerschaert 2008; Bolster et al. 2016; Schmidt et al. 2017), each with \(10^6\) particles.

Table 1 Slow/fast velocity pairs for the 1D spatially variable flow example (Sects. 2.2.2 and 3.2 ). See Fig. 1 for a depiction of the “sawtooth” velocity field

The results of these experiments are depicted in Fig. 5. In this case, the error minima are not as pronounced; however, the important result is that \({\widehat{\varDelta t}}\) (indicated by a dotted vertical line with a color corresponding to the velocity field \(\varvec{V}_i\)) appears to predict the minimal-error \(\varDelta t\) quite accurately. We do see that choosing a time step length near \({\widehat{\varDelta t}}\) (with some flexibility in going larger or smaller) will minimize error for a simulation of this type and will also be optimal in the sense of being the minimum acceptable \(\varDelta t\).

Fig. 5
figure 5

Results for a 1D advective-diffusive MTPT simulation for three different velocity fields with the “sawtooth” pattern for the \(\beta = 1\) case with \(10^3\) particles. See Fig. 1 and Table 1 for a depiction of the field and values for the slow/fast velocity pairs We see that \({\widehat{\varDelta t}}\) (indicated by the dashed vertical lines with color corresponding to a given particle number) accurately predicts the minimum acceptable time step length, in terms of resultant error. Error is calculated by comparing MTPT results to RWPT simulations employing \(10^6\) particles

3.3 2D Constant Velocity

In this section, we consider the 2D problem with a constant velocity field and simulate a purely diffusive system on stationary particles that are evenly spaced on \(\varOmega = [0, 50] \times [0, 50]\). We run the simulation for differing particle numbers, \(N \in \{2601, 5625, 10201, 15625\}\), and \(\varDelta t\in [10^{-5}, 10^{1}]\), with \(D = 1\) for the case of isotropic diffusion and \(\left[ D_x, D_y\right] = \left[ 1.0, 0.2\right] \ \left[ L^2 T^{-1}\right]\) for the anisotropic case, and \(T_{\text {max}} = 10\ [T]\). For both cases, we choose a domain-centered Gaussian as the initial condition, and we calculate error by comparing to the Gaussian analytical solution.

3.3.1 Isotropic Diffusion

The results for the 2D, constant velocity, isotropic diffusion case are depicted in Fig. 6. There, we see results quite similar to the 1D, constant velocity case (for \(\beta = 1\), as in this case) in that error is relatively flat and low, until the lower bound of \({\widehat{\varDelta t}}\) is violated. More importantly, the calculated \({\widehat{\varDelta t}}\), indicated by the dashed vertical lines with a color corresponding to a given N, is an accurate predictor of the optimal \(\varDelta t\).

Fig. 6
figure 6

Results for a 2D purely diffusive (isotropic) MTPT simulation with Gaussian initial condition for the \(\beta = 1\) case. We see that \({\widehat{\varDelta t}}\) (indicated by the dashed vertical lines with color corresponding to a given particle number) accurately predicts the minimum acceptable time step length, in terms of resultant error

3.3.2 Anisotropic Diffusion

The results for the 2D, constant velocity, anisotropic diffusion case are depicted in Fig. 7. In these results, we see behavior that is similar to that of the isotropic case, with \({\widehat{\varDelta t}}\) accurately predicting the optimal \(\varDelta t\). However, the key difference that may be observed between Figs. 6 and 7 is that, in the anisotropic case, \({\widehat{\varDelta t}}\) is reliably a larger value, and the reason for this can be seen in the form of Eqs. (17) and (18). In words, the smaller magnitude of diffusion in the y-direction (\(D_y = D_x / 5\)) leads to a narrower mass-transfer kernel in that direction and results in a larger minimum \(\varDelta t\) to prevent error in the mass-transfer calculations.

Fig. 7
figure 7

Results for a 2D purely diffusive (anisotropic with \(\left[ D_x, D_y\right] = \left[ 1.0, 0.2\right]\)) MTPT simulation with Gaussian initial condition for the \(\beta = 1\) case. We see that \({\widehat{\varDelta t}}\) (indicated by the dashed vertical lines with color corresponding to a given particle number) accurately predicts the minimum acceptable time step length, in terms of resultant error

3.4 2D Linear Shear Flow

For our final experiment, we consider the case of 2D linear shear flow which is known to induce hyper-mixing and has an analytical solution, as presented by Bolster et al. (2011). This case is intentionally chosen to be an extreme one because particle spacings change drastically and a particle’s nearest neighbors are changing constantly. For this problem, the velocity field by which the particles move is defined to be \(\varvec{v}(\varvec{x}) = \left[ \alpha y, 0\right]\), leading to the Green’s function

$$\begin{aligned} c(\varvec{x}, t) = \frac{1}{2 \pi \sqrt{\det (\varvec{\kappa })}} \exp \left[ -\frac{1}{2} \varvec{x}^T \varvec{\kappa }^{-1} \varvec{x}\right] , \end{aligned}$$
(19)

and the covariance matrix, \(\varvec{\kappa }\), is defined to be

$$\begin{aligned} \varvec{\kappa }:=\begin{bmatrix} 2 D_x t + \frac{2}{3} D_y \alpha ^2 t^3 &{} D_y \alpha t^2 \\ D_y \alpha t^2 &{} 2 D_y t \end{bmatrix}. \end{aligned}$$
(20)

For our simulations, we choose a point-source (Dirac delta) initial condition centered in the domain \(\varOmega = [-8, 8] \times [-8, 8]\), leading to an analytical solution that is the Green’s function given in (19). For the simulations discussed in this section, we set the shear rate \(\alpha = 0.72\ \left[ T^{-1}\right]\), the (isotropic) diffusion coefficient \(D = 0.4\ \left[ L^2 T^{-1}\right]\), and total simulation time \(T_{\text {max}} = 4.2\ [T]\).

Applying the basic principles used to develop our previous \({\widehat{\varDelta t}}\) approximations, we propose the following expression for predicting the optimal time step

$$\begin{aligned}&{\widehat{\varDelta t}}_i :=\frac{\left( L / N_i\right) ^2}{\frac{1}{\beta } 4 D},\quad i = x, y, \end{aligned}$$
(21)
$$\begin{aligned}&{\widehat{\varDelta t}}= \max \left\{ {\widehat{\varDelta t}}_x, {\widehat{\varDelta t}}_y \right\} . \end{aligned}$$
(22)

This equation bears similarity to that employed in the 2D, constant-velocity, anisotropic case, Equation (18), with two salient differences. First, the two quantities in (2122) differ not in the diffusion coefficient but in the number of particles used to discretize in the x- and y-directions, and second, there is an extra factor of 2 in the numerator of (2122). The reason for this factor of 2 is that the shearing actually causes a given particle to be closer to nearby neighbors for much of the simulation, as the stationary, grid-aligned case, is actually the most separated particles on adjacent rows ever are from one another. As a result, this simulation can admit a smaller time step when it is sufficiently discretized in the y-direction, and we account for that by the (heuristically included) extra factor of 2.

The results for this case are depicted in Fig. 8, and we see that the behavior is qualitatively similar to all previous cases. For example, \({\widehat{\varDelta t}}\) (indicated by dashed vertical lines corresponding to a given value of N) accurately predicts the optimal \(\varDelta t\) for the cases with equal particle discretization in the x- and y-directions, that is \(\left[ N_x, N_y \right] = \left\{ \left[ 51, 51\right] , \left[ 101, 101\right] \right\}\). However, the other two cases, \(\left[ N_x, N_y \right] = \left\{ \left[ 101, 51\right] , \left[ 51, 101\right] \right\}\), display a more complicated behavior. Specifically, the y discretization appears to be more important for this problem, and this may be observed in the fact that increasing \(N_x\) from 51 to 101 and holding \(N_y\) at 51 does not appear to alter the optimal \(\varDelta t\), while a coarser x discretization (\(N_x = 51\)) with a finer y discretization (\(N_y = 101\)) does admit a smaller \(\varDelta t\) to attain optimal error. This behavior is easily explained because, for a given simulation, the particle spacings in the x-direction are static (velocity is fixed for a given y-position since all the streamlines are parallel). Thus, the more interesting behavior is happening among particles residing on different rows in the y-direction, meaning discretization in that direction has more of an impact on the optimal \(\varDelta t\). One final note about this problem is that the calculated values of \({\widehat{\varDelta t}}\) may not capture the actual \(\varDelta t\) that minimizes error but are reasonable, if conservative, estimates for optimal \(\varDelta t\), and will result in a stable simulation that exhibits error near the minimum level. Additionally, because these estimates may be obtained a priori, the calculated values for \({\widehat{\varDelta t}}\) can serve as starting values for a parameter search that will find the true optimal value for \(\varDelta t\).

Fig. 8
figure 8

Results for a 2D advective-diffusive MTPT simulation for linear shear flow with point-mass initial condition for the \(\beta = 1\) case. We see that \({\widehat{\varDelta t}}\) (indicated by the dashed vertical lines with color corresponding to a given particle number) accurately predicts the optimal \(\varDelta t\) for the cases where \(N_x = N_y\) and \(\left[ N_x, \;N_y\right] = \left[ 101, 51\right]\), and acts as a reasonable conservative estimate for optimal \(\varDelta t\) in the case of \(\left[ N_x, \;N_y\right] = \left[ 51, 101\right]\)

3.5 Relationship Between Optimal \(\varDelta t\) and Run Time

Our final numerical experiments are run with the goal of investigating the relationship between \({\widehat{\varDelta t}}\) and simulation run time. To achieve this, we run the 1D constant velocity problem of Sect. 3.1 with a Gaussian initial condition and \(\beta = 1\) (see Fig. 2a for the error results), and we run the 2D constant velocity, isotropic diffusion problem of Sect. 3.3.1 (see Fig. 6a for the error results). We time the code using MATLAB’s integrated timing function, and we plot the results of these runs in Figs. 9 and 10 .

The first thing we note in Figs. 9 and 10 is that run times initially decrease with decreasing \(\varDelta t\). This may seem odd because a decrease in \(\varDelta t\) leads to a directly corresponding increase in the number of steps taken in a simulation (presuming total time is fixed, as it is here). However, this is the result of a numerical convention that is applied to these methods. A fixed interaction radius is imposed for mass-transfer calculations to avoid undue computational burden. This is a common modeling decision because mass-transfer interactions between particles decay exponentially with distance due to the Gaussian kernel. As such, far away particles have negligible interactions with one another that quickly approach machine precision, and can thus be neglected.

For the 1D simulations of Fig. 9, we see that in the lowest particle-number case, \(N = 250\), the run time for the \({\widehat{\varDelta t}}\) simulation is quite near the minimum run time, and even smaller than for the largest time step considered, \(\varDelta t= 10\). However, as particle number grows, the run time for the \({\widehat{\varDelta t}}\) simulation increases further away from the minimum, becoming marginally larger than the run time for \(\varDelta t= 10\) for the \(N = 2500\) case (1.02 vs. 0.92 s), and approximately 2.5 times larger for the \(N = 5000\) case. For the 2D results depicted in Fig. 10, the results are much more clear cut. For all particle numbers, we see run times that decrease with \(\varDelta t\) to the point of \({\widehat{\varDelta t}}\). We do see that \({\widehat{\varDelta t}}\) does not capture the minimum run time for any of these cases, but we see in Fig. 6, that choosing \(\varDelta t\) based on minimum run time would result in unacceptable levels of error.

Fig. 9
figure 9

Run time profiling results for a 1D purely diffusive MTPT simulation with Gaussian initial condition and \(\beta = 1\) (see Fig. 2a for the error results). We see that as particle number increases, run times for \(\varDelta t= {\widehat{\varDelta t}}\) become more costly, though the computational burden does not appear to be excessive for the considered cases

Fig. 10
figure 10

Run time profiling results for Results for a 2D purely diffusive, isotropic MTPT simulation with Gaussian initial condition and \(\beta = 1\) (see Fig. 6a for the error results). We see that \({\widehat{\varDelta t}}\) is the optimal choice as it minimizes error and displays the lowest run time before entering the regime of spiking error, as see in Fig. 6a

4 Summary and Conclusions

The purpose of this manuscript is to assess and address the fact that the particle interactions in the MTPT model are controlled by a mass transfer kernel whose bandwidth is dependent on the magnitude of diffusion in the problem (whether isotropic or anisotropic). Since this is affected by the chosen length of a simulation’s time step, \(\varDelta t\), the width of this kernel must be large enough to reach a particle’s nearest neighbors, thus an arbitrarily small time step is not permissible for a fixed number of particles. MTPT accuracy diminishes below a certain threshold because particle interactions cease and our objective has been to identify ways of predicting this limiting threshold.

Our main result is the development of reliable predictors of this optimal time step, \({\widehat{\varDelta t}}\), that are obtained by balancing the width of the mass transfer kernel with some measure of inter-particle spacing, a quantity that varies with the problem setup. While an exhaustive or comprehensive list of \({\widehat{\varDelta t}}\)’s for every possible problem appears to be impossible, the results within this work are portable to other problems, if careful consideration is made. At the very least, conservative estimates for \({\widehat{\varDelta t}}\) may be obtained using the predictors in this manuscript, but the ideas presented herein may also be custom-tailored to whatever problem a modeler faces. The trick remains to determine what measure of inter-particle spacing is germane to the problem of interest and whether there are other factors that alter the amount of mass-transfer occurring within the problem (as in the shear flow problem of Sect. 3.4). Thus, to apply these results to a problem of the modeler’s choosing, the modeler must: (1) determine some measure for average inter-particle spacing (the numerator terms in each \({\widehat{\varDelta t}}\) calculation), that are perhaps distinct in multiple dimensions; (2) consider the ratio of this average spacing to the mass-transfer kernel bandwidth, \(2 D_i \beta ^{-1},\ i = x, y, z\); (3) choose the \(\varDelta t\) that is either equal to this single ratio, or the maximum of the ratio over different spatial dimensions.

We also see that, because of the nature of MTPT methods, choosing \({\widehat{\varDelta t}}\) over a larger \(\varDelta t\) may not result in longer model run times. In 1D, this depends on particle number, with \({\widehat{\varDelta t}}\) sometimes less costly than the largest considered \(\varDelta t\) and sometimes more costly. However, for the largest particle number considered, we only saw a tripling of the run time for a reduction in \(\varDelta t\) of more than 5 orders of magnitude. In 2D, \({\widehat{\varDelta t}}\) was always the lowest run time in the range of acceptable values for \(\varDelta t\), as a narrow range of smaller \(\varDelta t\)’s could result in lower run times, but this would enter the regime of unacceptable error for \(\varDelta t< {\widehat{\varDelta t}}\). These behaviors are in stark contrast to Eulerian methods because while errors can be made arbitrarily small by decreasing \(\varDelta t\) in grid-based methods (presuming instability is not an issue), decreasing \(\varDelta t\) is guaranteed to have a direct relation to increasing run time.

Looking ahead, a logical continuation of this work could be to develop adaptive time stepping schemes where, for example, the inter-particle distances at the beginning of the time step are used to determine \({\widehat{\varDelta t}}\), to maximize accuracy. It may also be possible to use different time steps in different portions of a particle ensemble, depending on the “density” of particles in a region, to minimize errors. There are some nontrivial technical challenges that would need to be addressed first, but conceptually this would merely be a temporal equivalent of a multi-grid scheme, so it should not be an insurmountable barrier. Combined with recent parallelization techniques for interacting particle systems (e.g., Engdahl et al. 2019; Cherfils et al. 2012), these advances could make large-scale simulations of mixing and reactive transport with MTPT computationally practical and numerically efficient.

Overall, the foundations laid out in this work provide practitioners with accuracy guidelines for MTPT methods, which have been lacking in the literature. These results are important because they help to build out our collective understanding of the properties of the MTPT method and offer practical guidelines for its implementation. Such scrutiny is essential for any new or emerging method to make it as reliable and effective as possible.