1 Introduction

Diffusion is one of the most omnipresent natural phenomena. Its ubiquity is matched by that of porous media; given a sufficient spatial resolution, virtually every solid or soft material (condensed matter) is porous. It therefore comes as no surprise that the earliest reported study of diffusion dealt with diffusion in a porous medium: selective movement of liquids across an animal bladder observed by Jean Antoine Nollet in 1752 (Narasimhan 1999, Sect. 5.2). Much of the early theoretical understanding of diffusive processes has been derived from studies of solutes in both free (bulk) fluids and fluids in confining geometries, e.g., capillary tubes.

The theory of diffusion follows two distinct but interconnected paths, that of collective (effective) motion of (infinitely many) particles and that of an individual particle. The former is based on Fick’s law of diffusion, which was formulated in 1855 by Adolf Fick in direct analogy with Fourier’s law of heat conduction in solids: “concentration is analogous to temperature, heat flux is analogous to solute flux and thermal diffusivity is analogous to chemical diffusivity” (Narasimhan 1999). In a companion paper published the same year, Fick studied solute diffusion in a semipermeable membrane by conceptualizing the latter as a collection of one-dimensional capillary tubes (Narasimhan 1999).

Irregular motion of minute charcoal particles was observed in 1784 by Jan Ingenhousz, who reported his observations in a short note on the use of a microscope (van der Pas 1971). A similar irregular motion was observed, 43 years later, by Robert Brown for pollen grains (van der Pas 1971). Both authors pointed out that inanimate corpuscles exhibit an irregular, continuous motion as if they were alive. This motion was later termed Brownian motion (van der Pas 1971). The origins of this motion were not understood until 1905–1906, when the studies of Albert Einstein (1905), Marian von Smoluchowski (1906), and William Sutherland (1905) related it to the impact of molecular forces exerted by the liquid. A couple of years later, Paul Langevin (1908) formalized their work by expressing a particle’s motion in terms of Newton’s second law and introducing a random force to represent the action of the fluid molecules surrounding the suspended particle.

We provide a brief overview of these two approaches to modeling diffusion in fluids. Section 2 contains a description of Langevin’s approach to a mechanistic description of the Brownian motion in free fluid of individual point-size inert particles and its relation to Fick’s diffusion equation (Sect. 2.1), as well as its generalizations that account for a finite number of finite-size particles (Sect. 2.2), particle’s electric charge (Sect. 2.3), and chemical interactions between diffusing particles (Sect. 2.4). Models of molecular diffusion in the presence of geometric constraints (e.g., the Knudsen and Fick–Jacobs diffusion) are discussed in Sect. 3; when these constraints are imposed by the solid matrix of a porous medium, the resulting equations provide a pore-scale representation of (advection–)diffusion transport. Section 3 also includes a brief introduction to phenomenological Darcy-scale descriptors of these processes; a comparative analysis of systematic upscaling techniques for deriving Darcy-scale models from their pore-scale counterparts can be found elsewhere in this special issue. Section 4 provides examples of other phenomena whose Darcy-scale models employ diffusion-like equations, including single- and multiphase flow and hydrodynamic dispersion. We conclude our review by discussing Darcy-scale models of non-Fickian diffusion in Sect. 5.

2 Molecular Diffusion in Free Fluid

2.1 Fundamentals of Molecular Diffusion

We consider the motion of inert particles in a host medium, which may be, e.g., a fluid or the surface between a liquid and a gas. In Sect. 2.1.1, we follow Langevin’s explanation of Brownian motion; in Sect. 2.1.2, we show its equivalence to Fick’s diffusion equation, following Einstein and Smoluchowski.

2.1.1 Brownian Motion

Irregular motion of a particle in a fluid can be described by Newton’s second law. The ambient fluid affects this motion in two opposite ways: the drag due to fluid viscosity slows the particle down, while the collisions with fluid molecules accelerate the particle (assuming the latter is sufficiently small). The drag force F exerted by a fluid with dynamic viscosity \(\mu \) on a (spherical) particle of radius \(r_p\) moving with velocity V is quantified by Stokes’ law, \(F = -\, 6 \pi \mu r_p V\). Figure 1 illustrates a typical trajectory of a Brownian particle.

Fig. 1 A representative trajectory of a Brownian particle obtained by random walk particle tracking

The equations of motion of a particle moving along the one-dimensional (\(d=1\)) trajectory X(t) are

$$\begin{aligned} \frac{{\text {d}} X(t)}{{\text {d}} t} = V(t), \qquad \qquad \frac{{\text {d}} V(t)}{{\text {d}} t} = -\, \gamma V(t) + \sqrt{2 \kappa } \xi (t), \end{aligned}$$
(1)

where \(\gamma = 6 \pi \mu r_p/m\) is the drag coefficient, with m denoting the particle’s mass. The second term in the velocity equation represents a random force that stands for the action of the surrounding liquid molecules. In the absence of the random force, the particle velocity tends to 0 on the relaxation timescale \(\tau _\gamma = \gamma ^{-1}\). The random force \(\xi (t)\) has zero mean and unit variance, and is correlated on a timescale much smaller than the relaxation time. Thus, \(\xi (t)\) can be approximated as Gaussian white noise with the correlation function

$$\begin{aligned} \langle \xi (t) \xi (t') \rangle = \delta (t - t'), \end{aligned}$$
(2)

where \(\delta (t)\) is the Dirac delta. The angular brackets denote the average over all realizations of the random process \(\xi (t)\).

The velocity process in (1) is an Ornstein–Uhlenbeck process (Risken 1996; Gardiner 2010). The parameter \(\kappa \) is a priori unknown. However, it can be determined from the theorem of the equipartition of the kinetic energy between the degrees of freedom of a system in thermal equilibrium. The equipartition theorem implies that the velocity fluctuations in equilibrium, i.e., at times \(t \gg \tau _\gamma \), are given by \(\lim _{t \rightarrow \infty } \langle V(t)^2 \rangle = {k T}/{m}\), where k is the Boltzmann constant and T is the absolute temperature. Here, the angular brackets can equally be read as the average over all particles. At the same time, solving (1) for V(t) gives the equilibrium velocity fluctuations \(\lim _{t \rightarrow \infty } \langle V(t)^2 \rangle = {\kappa }/{\gamma }\). Thus, the strength of the random force is directly related to the dissipation of kinetic energy due to viscosity as

$$\begin{aligned} \kappa = \frac{k T \gamma }{m}. \end{aligned}$$
(3)

This relation is an expression of the fluctuation–dissipation theorem (Kubo 1966). At times \(t \gg \tau _\gamma \), the velocity V(t) enters a dynamical steady state in which the fluctuations due to the random force are balanced by drag such that \(V(t) = \sqrt{{2 k T}/(m \gamma )}\, \xi (t)\), where we used the expression (3) for \(\kappa \). Using the definition of the drag coefficient \(\gamma \) gives the velocity \(V(t) = \sqrt{kT / (3 \pi \mu r_p)}\, \xi (t)\), whose amplitude is independent of the particle mass. The equation of motion of a particle for times \(t \gg \tau _\gamma \) thus simplifies to

$$\begin{aligned} \frac{{\text {d}} X(t)}{{\text {d}} t} = \sqrt{2 D_0} \xi (t), \qquad D_0 = \frac{k T}{m \gamma }, \end{aligned}$$
(4)

where the coefficient \(D_0\) is given by the Einstein–Smoluchowski relation. By using the above expression for \(\gamma \), one obtains the Stokes–Einstein relation \(D_0 = kT / (6 \pi \mu r_p)\), which again is independent of the particle mass. The only particle property that enters into the diffusion coefficient is its size in terms of the radius \(r_p\). The stochastic differential equation (4) can be seen as the starting point of the analyses of Einstein (1905) and von Smoluchowski (1906).
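As a rough numerical illustration of (3) and of the Stokes–Einstein relation, the following Python sketch evaluates \(D_0\) for a hypothetical micron-size sphere in water and checks that the velocity variance of the Ornstein–Uhlenbeck process in (1) relaxes to \(\kappa /\gamma \). The parameter values, the function names, and the use of the exact one-step update of the velocity are illustrative assumptions rather than part of the original analysis.

```python
import numpy as np

K_BOLTZMANN = 1.380649e-23  # Boltzmann constant, J/K


def stokes_einstein(temperature, viscosity, radius):
    """Diffusion coefficient D_0 = k T / (6 pi mu r_p) in m^2/s."""
    return K_BOLTZMANN * temperature / (6.0 * np.pi * viscosity * radius)


def stationary_velocity_variance(gamma, kappa, n_particles=100_000,
                                 n_steps=50, dt=None, seed=0):
    """Sample variance of V after n_steps * dt, starting from V = 0.

    Uses the exact one-step update of the Ornstein-Uhlenbeck process
    dV = -gamma V dt + sqrt(2 kappa) dW (an assumed discretization).
    """
    rng = np.random.default_rng(seed)
    if dt is None:
        dt = 0.2 / gamma  # assumed step: one fifth of the relaxation time
    decay = np.exp(-gamma * dt)
    noise_std = np.sqrt(kappa / gamma * (1.0 - decay**2))
    v = np.zeros(n_particles)
    for _ in range(n_steps):
        v = decay * v + noise_std * rng.standard_normal(n_particles)
    return v.var()


if __name__ == "__main__":
    # Assumed example: sphere of radius 1 micrometre in water at 20 C.
    d0 = stokes_einstein(temperature=293.15, viscosity=1.0e-3, radius=1.0e-6)
    print(f"D_0 = {d0:.3e} m^2/s")
    # Dimensionless check that <V^2> approaches kappa / gamma = 2.
    print(stationary_velocity_variance(gamma=1.0, kappa=2.0))
```

The total simulated time is ten relaxation times, so the sample variance should be close to \(\kappa /\gamma \) up to sampling error.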

2.1.2 The Diffusion Equation

Let \(f(x,t)\) denote the probability density function (PDF) of finding a Brownian particle X at point x at time t. Our derivation of the evolution equation for \(f(x,t)\) follows closely that in Einstein (1905). The stochastic process (4) constitutes a Markov process because a particle’s position at a time \(t + \Delta t\) depends solely on its position at the previous time t,

$$\begin{aligned} X(t + \Delta t) = X(t) + w_\Delta , \qquad \qquad w_\Delta = \sqrt{2D_0} \int \limits _{t}^{t+\Delta t} \xi (t') {\text {d}} t'. \end{aligned}$$
(5)

The variable \(w_\Delta \) has a Gaussian PDF \(f_w(x,t)\) with mean 0 and variance \(2 D_0 \Delta t\). The PDFs \(f(x,t+\Delta t)\) of \(X(t+\Delta t)\) and \(f(x,t)\) of X(t) are related by

$$\begin{aligned} f(x,t+\Delta t) = \int \limits _{-\infty }^\infty f(x - x',t) f_w(x',t) {\text {d}} x'. \end{aligned}$$
(6)

For small \(\Delta t \ll t\), the variance of \(f_w\), \(2 D_0 \Delta t\), is small so that only small values of \(x'\) contribute to the integral. Hence, expanding \(f(x,t+\Delta t)\) and \(f(x-x',t)\) into respective Taylor series around the point (x, t), we obtain an equation for \(f(x,t)\):

$$\begin{aligned} f + \frac{\partial f}{\partial t} \Delta t&= f \int \limits _{-\infty }^\infty f_w(x',t) {\text {d}} x' - \frac{\partial f}{\partial x} \int \limits _{-\infty }^\infty x' f_w(x',t) {\text {d}} x' \nonumber \\&\quad + \frac{\partial ^2 f}{\partial x^2} \int \limits _{-\infty }^\infty \frac{x'^2}{2} f_w(x',t) {\text {d}} x' + \cdots . \end{aligned}$$
(7)

Since the Gaussian PDF \(f_w(x,\cdot )\) is symmetric, the second, fourth, etc., terms on the right side vanish, while the third, fifth, etc., are of order \(\Delta t^i\) with \(i = 1, 2\), etc. From the basic properties of a PDF, the first integral in (7) equals 1, while the third integral is half of the variance, \((2 D_0 \Delta t)/2\). Dividing by \(\Delta t\) and taking the limit \(\Delta t \rightarrow 0\), (7) therefore reduces to a diffusion equation

$$\begin{aligned} \frac{\partial f(x,t)}{\partial t} = D_0 \frac{\partial ^2 f(x,t)}{\partial x^2}. \end{aligned}$$
(8)

Thus, the coefficient \(D_0\) defined in (4) is the diffusion coefficient.

As the particle motion is statistically isotropic and the motions along the coordinate axes are statistically independent, the above derivations can be readily generalized to d spatial dimensions. If particles are injected at point \({\mathbf {x}} = {\mathbf {0}}\) of the d-dimensional infinite domain at time \(t=0\), the initial PDF is \(f({\mathbf {x}},t=0) = \delta ({\mathbf {x}})\) and the solution of the d-dimensional version of (8) is a Gaussian PDF

$$\begin{aligned} f({\mathbf {x}},t) = \frac{1}{(4 \pi D_0 t)^{d/2}} \exp \left( -\frac{{\mathbf {x}}^2}{4 D_0 t} \right) . \end{aligned}$$
(9)
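A quick numerical sanity check can make this solution more tangible. The Python sketch below evaluates the one-dimensional (\(d = 1\)) Gaussian PDF and verifies, with central finite differences, that it satisfies the diffusion equation (8) and integrates to 1. The grid spacing, time increment, and function names are arbitrary illustrative choices.

```python
import numpy as np


def gaussian_pdf(x, t, d0):
    """One-dimensional (d = 1) Gaussian solution of the diffusion equation."""
    return np.exp(-x**2 / (4.0 * d0 * t)) / np.sqrt(4.0 * np.pi * d0 * t)


def diffusion_residual(d0=1.0, t=1.0, dx=1e-2, dt=1e-4, half_width=20.0):
    """Return (max |df/dt - D_0 d2f/dx2|, total probability mass)."""
    x = np.arange(-half_width, half_width + dx, dx)
    f = gaussian_pdf(x, t, d0)
    # Central difference in time.
    df_dt = (gaussian_pdf(x, t + dt, d0) - gaussian_pdf(x, t - dt, d0)) / (2.0 * dt)
    # Central second difference in space (interior points only).
    d2f_dx2 = (f[2:] - 2.0 * f[1:-1] + f[:-2]) / dx**2
    residual = np.max(np.abs(df_dt[1:-1] - d0 * d2f_dx2))
    # Trapezoid rule for the normalization integral.
    mass = np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(x))
    return residual, mass


if __name__ == "__main__":
    residual, mass = diffusion_residual()
    print(f"max residual = {residual:.2e}, mass = {mass:.6f}")
```

The residual is limited by the truncation error of the finite differences and shrinks as the grid is refined.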

Thus, the mean squared particle displacement along the x coordinate axis until time t is given by

$$\begin{aligned} \langle X(t)^2 \rangle \equiv \int \limits _{-\infty }^\infty x^2 f( x,t) {\text {d}} x = 2 D_0 t. \end{aligned}$$
(10)

Using the expression for \(D_0\) in (4) gives the following relation between the particle displacement and the particle size,

$$\begin{aligned} r_p = \frac{k T t}{3 \pi \mu \langle X(t)^2 \rangle }. \end{aligned}$$
(11)

Hence, the particle size can be obtained from the observation of the mean square displacement.
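The procedure can be sketched in Python as follows: walkers are advanced with Gaussian increments of variance \(2 D_0 \Delta t\), as in (5), their mean squared displacement is compared with (10), and (11) is then inverted for the radius. The fluid properties, the particle radius, and all function names are hypothetical values chosen only for illustration.

```python
import numpy as np

K_BOLTZMANN = 1.380649e-23  # Boltzmann constant, J/K


def simulate_msd(d0, t_final, n_steps=20, n_particles=100_000, seed=1):
    """Mean squared displacement of 1D walkers advanced as in (5)."""
    rng = np.random.default_rng(seed)
    dt = t_final / n_steps
    x = np.zeros(n_particles)
    for _ in range(n_steps):
        # Each increment is Gaussian with zero mean and variance 2 D_0 dt.
        x += np.sqrt(2.0 * d0 * dt) * rng.standard_normal(n_particles)
    return np.mean(x**2)


def radius_from_msd(msd, t, temperature, viscosity):
    """Invert (11): r_p = k T t / (3 pi mu <X^2>)."""
    return K_BOLTZMANN * temperature * t / (3.0 * np.pi * viscosity * msd)


if __name__ == "__main__":
    # Assumed example: 0.5 micrometre sphere in water at 20 C, observed for 10 s.
    temperature, viscosity, r_true, t_final = 293.15, 1.0e-3, 0.5e-6, 10.0
    d0 = K_BOLTZMANN * temperature / (6.0 * np.pi * viscosity * r_true)
    msd = simulate_msd(d0, t_final)
    print(f"MSD / (2 D_0 t) = {msd / (2.0 * d0 * t_final):.3f}")
    print(f"estimated radius = {radius_from_msd(msd, t_final, temperature, viscosity):.3e} m")
```

The accuracy of the recovered radius is set by the sampling error of the mean squared displacement, which decreases with the number of tracked particles.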

Another quantity of practical interest is the probability density of first particle arrival times at a given distance \(x'\) (Redner 2001). For simplicity, we consider the case of one-dimensional diffusion. For an initial position \(X(t = 0) = x'\), we consider the distribution of first arrival times at the origin \(x = 0\). The arrival time \(T(x')\) is defined as

$$\begin{aligned} T(x') = \min (t|X(t) \le 0). \end{aligned}$$
(12)

The first arrival time PDF \(g(t,x') = \langle \delta [t - T(x')] \rangle \) can be determined in different ways. We choose here the following. The PDF \(f(x',t)\) of finding a particle at \(x = 0\) at time t that has started at \(x'\) at \(t = 0\) is

$$\begin{aligned} f(x',t) = \int \limits _0^t g(t',x') f(0,t-t') {\text {d}} t'. \end{aligned}$$
(13)

The integrand is the probability density \(g(t',x')\) that the particle arrives for the first time at the origin at time \(t'\), multiplied by the probability density \(f(0,t-t')\) that the particle is again at the origin after the time \(t - t'\) has passed. Because (13) is a convolution in time, its solution in Laplace space is

$$\begin{aligned} {\hat{g}}(\lambda ,x') = \frac{{\hat{f}}(x',\lambda )}{{\hat{f}}(0,\lambda )}. \end{aligned}$$
(14)

The Laplace transform of the one-dimensional (\(d = 1\)) version of f in (9), with \(D \equiv D_0\), is

$$\begin{aligned} {\hat{f}}(x,\lambda ) = \frac{1}{2 \sqrt{D \lambda }} \exp \left( - |x|\sqrt{\frac{\lambda }{D}} \right) , \end{aligned}$$
(15)

so that the Laplace transform of the first arrival time PDF becomes

$$\begin{aligned} {\hat{g}}(\lambda ,x') = \exp \left( - |x'| \sqrt{\frac{\lambda }{D}} \right) , \end{aligned}$$
(16)

which is the Laplace transform of a Lévy–Smirnov density or inverse Gaussian density. Computing the inverse Laplace transform, we arrive at

$$\begin{aligned} g(t,x') = \frac{x' }{\sqrt{4 \pi D t^3}} \exp \left( - \frac{x'^2}{4 D t} \right) . \end{aligned}$$
(17)
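One way to check (17) numerically is to integrate it in time and compare the result with the known probability that a Brownian particle has reached the origin by time t, \(\text {erfc}(x'/\sqrt{4 D t})\). The Python sketch below does this with a simple trapezoid rule; the dimensionless parameter values and the function names are illustrative assumptions.

```python
import math

import numpy as np


def first_arrival_pdf(t, x0, d):
    """First arrival time density g(t, x') of (17)."""
    t = np.asarray(t, dtype=float)
    return x0 / np.sqrt(4.0 * np.pi * d * t**3) * np.exp(-x0**2 / (4.0 * d * t))


def arrival_probability(t_max, x0, d, n=200_001):
    """Probability of arrival before t_max, by trapezoid quadrature of g."""
    # Start slightly above t = 0, where g vanishes faster than any power of t.
    t = np.linspace(t_max * 1e-9, t_max, n)
    g = first_arrival_pdf(t, x0, d)
    return float(np.sum(0.5 * (g[1:] + g[:-1]) * np.diff(t)))


def arrival_probability_exact(t_max, x0, d):
    """Closed form erfc(x' / sqrt(4 D t)) of the same integral."""
    return math.erfc(x0 / math.sqrt(4.0 * d * t_max))


if __name__ == "__main__":
    print(arrival_probability(2.0, 1.0, 1.0))
    print(arrival_probability_exact(2.0, 1.0, 1.0))
```

Because g decays as \(t^{-3/2}\) at late times, the arrival probability approaches 1 only slowly and the mean arrival time is infinite.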

The Gaussian particle distribution and the inverse Gaussian first arrival time distribution are illustrated in Fig. 2.

Fig. 2 Illustration of (a) the probability density function \(f(x,t)\) of particle positions X(t) at times \(t = 1, 2, 4, 8 \cdot 10^{7}\) s (black, green, orange, blue), and (b) the PDF \(g(t,x')\) of particle arrival times \(T(x')\) at \(x' = 10^{-4}, 10^{-3}, 10^{-2}, 10^{-1}\) m (black, green, orange, blue) for \(D = 10^{-9}\) m\(^2\)/s

2.2 Diffusion of Finite-Size Particles

The finite size of diffusing particles introduces two additional features: (i) such particles experience drag and hydrodynamic interactions while moving through a fluid, and (ii) the smallest distance between any two particles is limited by their radii, giving rise to exclusion volumes that the particles cannot enter. A macroscopic manifestation of these phenomena is the dependence of an effective diffusion coefficient D on the particle concentration \(c({\mathbf {x}},t)\) (e.g., Batchelor 1976; Bruna and Chapman 2012, and the references therein).

Consider a system of \(N_\text {par}\) non-deformable spherical particles, whose radius R is significantly smaller than a characteristic length L of a simulation domain \({\mathcal {D}}\). At any given time t, the dynamics of the centroids of these particles, \({\mathbf {X}}_i(t)\) with \(i=1,\ldots ,N_\text {par}\), satisfies a stochastic differential equation

$$\begin{aligned} {\text {d}} {\mathbf {X}}_i = \sqrt{2 D_0} {\text {d}} {\mathbf {B}}_i + {\mathbf {U}} {\text {d}} t, \qquad i=1,\ldots ,N_\text {par}. \end{aligned}$$
(18)

Here, \(D_0\) is the coefficient of molecular diffusion defined by the kinetic theory of gases (Kennard 1938) in terms of a molecule’s mean free path \(\lambda \) and mean velocity v as \(D_0 = \lambda v / 3\), \({\mathbf {B}}_i\) denotes the d-dimensional (\(d = 2\) or 3) standard Brownian motion of the ith particle, and \({\mathbf {U}}({\mathbf {X}}_i)\) represents the advection velocity or appropriately scaled external forces acting identically on all particles. The presence of the exclusion volumes implies that the random processes \({\mathbf {X}}_i(t)\) (\(i=1,\ldots ,N_\text {par}\)) are no longer independent. The PDF, \(f_{{\mathbf {X}}_i}({\mathbf {x}},t)\), of the random centroid \({\mathbf {X}}_i\) occupying the point \({\mathbf {x}}\) at time t satisfies, up to first order in \(\varepsilon = R/L\), a nonlinear partial differential equation (Bruna and Chapman 2012)

$$\begin{aligned} \frac{\partial f_{{\mathbf {X}}_i} }{\partial t} = D_0 \nabla _{{\mathbf {x}}}^2 [f_{{\mathbf {X}}_i} + \alpha _d (N_\text {par}-1) \varepsilon ^d f_{{\mathbf {X}}_i}^2] - \nabla _{{\mathbf {x}}} \cdot [{\mathbf {U}}({\mathbf {x}}) f_{{\mathbf {X}}_i}], \quad {\mathbf {x}} \in {\mathcal {D}}. \end{aligned}$$
(19)

Here, \(\alpha _d = \pi /2\) or \(2\pi / 3\) in \(d = 2\) or 3 spatial dimensions, respectively.

If the number of particles is large, such that \(N_\text {par}-1 \approx N_\text {par}\), then it follows from (19) that their volumetric concentration, \(c({\mathbf {x}},t) \equiv \varphi f_{{\mathbf {X}}_i}\) with \(\varphi =\pi N_\text {par} \varepsilon ^d / (2 d)\) denoting the volume fraction of particles relative to the volume of \({\mathcal {D}}\), satisfies a nonlinear advection–diffusion equation

$$\begin{aligned} \frac{\partial c}{\partial t} = \nabla _{{\mathbf {x}}} \cdot [D(c) \nabla _{{\mathbf {x}}} c] - \nabla _{{\mathbf {x}}} \cdot [{\mathbf {U}}({\mathbf {x}}) c], \quad {\mathbf {x}} \in {\mathcal {D}}, \end{aligned}$$
(20)

with the concentration-dependent effective diffusion coefficient

$$\begin{aligned} D(c) = D_0 [1 + 4(d-1)c]. \end{aligned}$$
(21)

The finite particle size gives rise to a somewhat nontrivial collective behavior. First, the value of the effective diffusion coefficient D depends on the dimensionality d. Second, D differs from the “self-diffusion” coefficient of an individual particle, \(D_0\); this is in contrast to point particles undergoing Brownian motion, for which these two diffusion coefficients are identical (Sect. 2.1). Third, (21) suggests that particles of finite size diffuse faster (have a higher effective diffusion coefficient, D) than their zero-volume counterparts (whose effective diffusion coefficient is \(D_0\)). That is because collisions of large particles introduce a bias to their random (Brownian) motion, thus accelerating their net spreading (Bruna and Chapman 2012).
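To give a sense of the magnitude of this effect, the short Python sketch below evaluates the volume fraction \(\varphi \) and the effective diffusion coefficient (21) for a few hypothetical inputs. The function names and the numerical values are illustrative assumptions; since (19) holds only to first order in \(\varepsilon \), the estimate is meaningful only for small concentrations.

```python
import math


def volume_fraction(n_par, epsilon, d):
    """phi = pi N_par eps^d / (2 d), as defined in the text (d = 2 or 3)."""
    return math.pi * n_par * epsilon**d / (2.0 * d)


def effective_diffusion(c, d0, d):
    """Effective coefficient D(c) = D_0 [1 + 4 (d - 1) c] of (21)."""
    if d not in (2, 3):
        raise ValueError("(21) is stated for d = 2 or 3")
    return d0 * (1.0 + 4.0 * (d - 1) * c)


if __name__ == "__main__":
    # Assumed example: 400 particles with eps = 0.01 in two dimensions.
    print(f"phi = {volume_fraction(400, 0.01, 2):.4f}")
    for dim in (2, 3):
        ratio = effective_diffusion(c=0.05, d0=1.0, d=dim)
        print(f"d = {dim}: D / D_0 = {ratio:.2f}")
```

At a volumetric concentration of 0.05, the enhancement relative to \(D_0\) is 20% in two dimensions and 40% in three.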

2.3 Diffusion of Electrically Charged Particles

Consider Brownian motion (diffusion) of ions, which are treated as point charges. Diffusion coefficients of individual cations (the positively charged ions, denoted by the subscript “\(+\)”) and anions (the negatively charged ions, denoted by the subscript “−”) are denoted by \(D_+\) and \(D_-\), respectively. This Brownian motion of cations and anions in a solution results in spatial variability of their respective volumetric concentrations \(c_\pm ({\mathbf {x}},t)\). The latter are related to the electrochemical potentials of cations and anions,

$$\begin{aligned} \mu _\pm ({\mathbf {x}},t) = RT\ln c_\pm ({\mathbf {x}},t) + z_\pm F \varphi ({\mathbf {x}},t), \end{aligned}$$
(22)

where \(z_\pm \) are the cation and anion charges (valencies); R and F are the gas and Faraday constants, respectively; T is the temperature; and \(\varphi ({\mathbf {x}},t)\) is the electric potential. Spatial variability of \(\mu _\pm \) induces ionic (Nernst–Planck) fluxes

$$\begin{aligned} {\mathbf {J}}^\text {NP}_\pm = - M_\pm c_\pm \nabla \mu _\pm , \end{aligned}$$
(23)

where the ion mobility \(M_\pm \) is related to the molecular diffusion coefficient of ions in the fluid, \( D_\pm \), by the Einstein relation \(M_\pm = D_\pm /RT\).

In the absence of homogeneous chemical reactions, mass conservation of anions and cations, \(\partial _t c_\pm = - \nabla \cdot {\mathbf {J}}^\text {NP}_\pm \), gives rise to the Nernst–Planck equations

$$\begin{aligned} \frac{\partial c_{\pm }}{\partial t}=\nabla \cdot (D_{\pm } \nabla c_{\pm }) - \nabla \cdot ( {\mathbf {U}}_\pm c_{\pm } ), \quad {\mathbf {U}}_\pm \equiv - \frac{z_\pm D_{\pm } F}{RT} \nabla \varphi , \qquad {\mathbf {x}} \in {\mathcal {D}}. \end{aligned}$$
(24)
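For completeness, the step from (22) and (23) to (24) can be written out explicitly. The following expansion is routine and assumes only that the temperature T is spatially uniform:

$$\begin{aligned} {\mathbf {J}}^\text {NP}_\pm = - \frac{D_\pm }{RT} c_\pm \nabla \left( RT \ln c_\pm + z_\pm F \varphi \right) = - D_\pm \nabla c_\pm - \frac{z_\pm D_\pm F}{RT} c_\pm \nabla \varphi = - D_\pm \nabla c_\pm + {\mathbf {U}}_\pm c_\pm . \end{aligned}$$

Substituting this flux into the mass conservation statement \(\partial _t c_\pm = - \nabla \cdot {\mathbf {J}}^\text {NP}_\pm \) recovers (24), with the electromigration term playing the role of advection with the velocity \({\mathbf {U}}_\pm \).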

The total (net) ionic charge density \(q \equiv F(z_+c_+ + z_- c_-)\) is related to the electric potential \(\varphi ({\mathbf {x}},t)\) through a Poisson equation,

$$\begin{aligned} - {\mathcal {E}} \nabla ^2 \varphi = F(z_+c_+ + z_- c_-), \qquad {\mathbf {x}} \in {\mathcal {P}}, \end{aligned}$$
(25)

where \({\mathcal {E}}\) is the dielectric constant of the solvent.

2.4 Diffusion-Limited Chemical Reactions

Chemical reactions are contact processes. They depend on the availability of reacting species and on the processes that bring them into contact, which here is diffusion. In this section, we briefly report three examples that illustrate the impact of diffusion on chemical reactions.

2.4.1 Smoluchowski Theory

To illustrate the main ideas behind the Smoluchowski reaction rate theory (von Smoluchowski 1917), we consider a fast irreversible bimolecular chemical reaction

$$\begin{aligned} A + B \longrightarrow C. \end{aligned}$$
(26)

The species A acts as a stationary sink, which is surrounded by B particles at constant concentration \(c_B\). The A and B particles annihilate when they are brought into contact; this means the intrinsic reaction kinetics are very fast. Diffusion is the rate-limiting process. The relative motion of the A and B particles is governed by the Langevin model discussed above. As A and B particles are idealized as point particles in the Langevin model for diffusion, we need to define a reaction or capture radius \(r_0\). This means two particles are in contact and annihilate if their distance is smaller than \(r_0\).

The relative distance \({\mathbf {r}}(t) = {\mathbf {X}}_A(t) - {\mathbf {X}}_B(t)\) between a particle A and a particle B satisfies the Langevin equation

$$\begin{aligned} \frac{{\text {d}} {\mathbf {r}}(t)}{ {\text {d}} t} = \sqrt{4 D_0} \varvec{\xi }(t). \end{aligned}$$
(27)

Thus, the PDF \(f({\mathbf {r}},t)\) of \({\mathbf {r}}(t)\) satisfies the diffusion equation (8) for the diffusion coefficient \(2 D_0\). Under radial symmetry in \(d = 3\) dimensions, \(f({\mathbf {r}},t) \rightarrow f(r,t)\) and

$$\begin{aligned} \frac{\partial f}{\partial t} = \frac{2 D_0}{r^2} \frac{\partial }{\partial r} \left( r^2\frac{\partial f}{\partial r} \right) . \end{aligned}$$
(28)

In order to determine the reaction rate for a single A particle, we consider the following boundary value problem. The concentration at \(r = \infty \) is equal to \(f = c_B\), and at \(r = r_0\) we have an absorbing boundary such that \(f(r_0,t) = 0\). The steady-state solution \(f_{\infty }(r)\) to (28) is

$$\begin{aligned} f_\infty (r) = c_B \left( 1 - \frac{r_0}{r}\right) . \end{aligned}$$
(29)

Thus, writing D for \(D_0\), we obtain for the flux at \(r = r_0\)

$$\begin{aligned} j_B = 8 \pi D r_0^2 \frac{\partial f_\infty }{\partial r}(r_0) = 8 \pi D r_0 c_B. \end{aligned}$$
(30)

For the concentration \(c_A\) of A particles, this means that the reaction rate is

$$\begin{aligned} R = 8 \pi D r_0 c_A c_B. \end{aligned}$$
(31)

From this, it follows that the reaction rate constant is

$$\begin{aligned} k = 8 \pi D r_0. \end{aligned}$$
(32)

Thus, the reaction rate is determined by the diffusion rate.
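The orders of magnitude involved can be illustrated with a small Python sketch. It uses the steady radial profile that vanishes at \(r_0\) and tends to \(c_B\) far away, \(c_B (1 - r_0/r)\), checks by finite differences that the total flux through any sphere of radius \(r \ge r_0\) equals \(8 \pi D r_0 c_B\), and converts the rate constant (32) to molar units. The values of D, \(r_0\), and \(c_B\), as well as the function names, are hypothetical.

```python
import math

AVOGADRO = 6.02214076e23  # 1/mol


def steady_profile(r, r0, c_b):
    """Steady radial profile that vanishes at r0 and tends to c_b far away."""
    return c_b * (1.0 - r0 / r)


def total_flux(r, r0, c_b, d, h=1e-6):
    """Flux 4 pi r^2 (2 D) df/dr through a sphere of radius r (central difference)."""
    dfdr = (steady_profile(r * (1.0 + h), r0, c_b)
            - steady_profile(r * (1.0 - h), r0, c_b)) / (2.0 * r * h)
    return 4.0 * math.pi * r**2 * 2.0 * d * dfdr


def smoluchowski_rate_constant(d, r0):
    """Rate constant k = 8 pi D r0 of (32), in m^3/s per particle pair."""
    return 8.0 * math.pi * d * r0


if __name__ == "__main__":
    # Assumed example: D = 1e-9 m^2/s and a capture radius of 1 nm.
    d, r0, c_b = 1.0e-9, 1.0e-9, 1.0e20
    k = smoluchowski_rate_constant(d, r0)
    print(f"k = {k:.3e} m^3/s = {k * AVOGADRO * 1.0e3:.3e} L/(mol s)")
    print(total_flux(r0, r0, c_b, d) / (k * c_b))
    print(total_flux(5.0 * r0, r0, c_b, d) / (k * c_b))
```

For these assumed values, the rate constant is of the order of \(10^{10}\) L/(mol s).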

2.4.2 Ovchinnikov–Zeldovich Segregation

Here, we briefly report on the impact of segregation due to heterogeneous initial reactant distributions in the presence of diffusive mass transfer (Ovchinnikov and Zeldovich 1978). For spatially uniform initial concentrations in a well-mixed reaction, the evolution of the concentrations \(c_i\) of the \(i = A, B, C\) species in the reaction (26) follows the kinetic rate law

$$\begin{aligned} \frac{{\text {d}} c_i(t)}{{\text {d}} t}&= - k c_A(t) c_B(t),&\frac{{\text {d}} c_C(t)}{{\text {d}} t} = k c_A(t) c_B(t) \end{aligned}$$
(33)

with \(i = A, B\). For equal initial concentrations \(c_A(t = 0) = c_B(t = 0) = c_{0}\), the solution for \(c_A(t) = c_B(t) = c(t)\) is

$$\begin{aligned} c(t) = \frac{c_{0}}{1 + k c_{0} t}. \end{aligned}$$
(34)

Thus, for times larger than \(1/(k c_0)\), the reactant concentrations decay as \(c(t) \sim 1/t\).
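The well-mixed solution (34) is easy to confirm numerically. The Python sketch below integrates the rate law for equal concentrations, \({\text {d}} c/{\text {d}} t = - k c^2\), with a classical fourth-order Runge–Kutta scheme and compares the result with (34). The choice of integrator, the step count, and the dimensionless parameter values are illustrative assumptions.

```python
def integrate_well_mixed(c0, k, t_final, n_steps=10_000):
    """Integrate dc/dt = -k c^2 (equal concentrations) with classical RK4."""

    def rhs(c):
        return -k * c * c

    dt = t_final / n_steps
    c = c0
    for _ in range(n_steps):
        k1 = rhs(c)
        k2 = rhs(c + 0.5 * dt * k1)
        k3 = rhs(c + 0.5 * dt * k2)
        k4 = rhs(c + dt * k3)
        c += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return c


def analytic_well_mixed(c0, k, t):
    """Closed-form solution (34)."""
    return c0 / (1.0 + k * c0 * t)


if __name__ == "__main__":
    c0, k, t = 1.0, 1.0, 100.0
    print(integrate_well_mixed(c0, k, t), analytic_well_mixed(c0, k, t))
    # At late times, k t c(t) approaches 1, i.e., c ~ 1/(k t).
    print(k * t * analytic_well_mixed(c0, k, t))
```

At late times the product \(k t\, c(t)\) approaches 1, independently of the initial concentration.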

Fig. 3 Island formation in an instantaneous irreversible bimolecular reaction under heterogeneous initial conditions in one spatial dimension

Next, we consider the situation of heterogeneous initial distributions of the reacting species A and B such that the spatial averages \({\overline{c}}_{A0} = {\overline{c}}_{B0}\). The chemical reaction leads to a local depletion of the minority species and the formation of A and B islands, i.e., to segregation of the reactants, as illustrated in Fig. 3. Reactions between A and B particles are limited by diffusion to the island boundaries, where reactions can occur. For simplicity, we consider an instantaneous reaction such that \(c_A c_B = 0\). Furthermore, we assume that the fluctuations of initial particle numbers scale as \(\delta N_0 \propto \sqrt{N_0}\). This implies that \(\delta c_{A0} = c_{A0} - {\overline{c}}_{A0} \) and \(\delta c_{B0}\) satisfy \(\overline{V^2 \delta {c_{A0}}^2} = \overline{ V^2 \delta {c_{B0}}^2}= c_{A0} V\), where V is the volume of an island. The difference \(u = c_A - c_B\) is conserved by the reaction and subject to diffusion only. The mean and variance of its initial distribution \(u_0\) are \({\overline{u}}_0 = 0\) and \(\overline{V^2 u_0^2} = 2 c_{A0} V\). The A islands, characterized by \(u > 0\), and the B islands, characterized by \(u < 0\), grow diffusively, i.e., their typical size grows as \(\ell \sim \sqrt{D t}\). Thus, in an island of volume \(V \sim (Dt)^{d/2}\), the typical value of u is given in terms of the initial excess of the majority species per volume, \(|u| \sim |\delta N_0|/V \sim \sqrt{c_{A0} V}/V \sim V^{-1/2}\). As locally \(|u| = c_A\) or \(c_B\), we obtain

$$\begin{aligned} {\overline{c}}_A(t) \sim \frac{1}{(Dt)^{d/4}}. \end{aligned}$$
(35)

This behavior is valid for \(d < 4\), for which the decay (35) is slower than the well-mixed decay \(c(t) \sim 1/t\) in (34). The segregation of the reacting species due to the heterogeneous initial distribution significantly slows down the reaction kinetics relative to the well-mixed case. The detailed analysis of Ovchinnikov–Zeldovich segregation using spectral analysis and numerical simulations can be found in Toussaint and Wilczek (1983) and Kang and Redner (1985), among many others.

2.4.3 Diffusion in the Presence of Randomly Distributed Traps

An illustrative example of the impact of chemical heterogeneity on reactivity is diffusion in a medium with a random distribution of traps, which are characterized by a constant number density \(n_0\). The trap positions are distributed uniformly, and the traps are allowed to overlap (Kayser and Hubbard 1983). The random distribution of traps may be identified with a distribution of specific reactive surface area in porous media. The concentration \(c_A({\mathbf {x}},t)\) of a species A evolves according to the diffusion equation \(\partial _t c_A = D \nabla ^2 c_A\). At the surface of each trap, absorbing boundary conditions are specified, \(c_A({\mathbf {x}},t) = 0\).

For \(d = 1\) spatial dimension, the distance \(\ell \) between traps is distributed exponentially as \(\sim \exp (- n_0 \ell )\). The average concentration of species A between two traps behaves at asymptotically long times \(t \gg \ell ^2/D\) as (Havlin and Ben-Avraham 2002)

$$\begin{aligned} {\overline{c}}_{A}(t,\ell ) = \frac{1}{\ell } \int \limits _0^\ell c_A(x,t) {\text {d}} x \propto \exp (-\pi ^2 D t/\ell ^2). \end{aligned}$$
(36)

The overall average concentration of species A is obtained by spatially averaging over all trap-free intervals such that (Kayser and Hubbard 1983)

$$\begin{aligned} {\overline{c}}_A(t) = \int \limits _0^\infty n_0^2 \ell \exp (-n_0 \ell ) {\overline{c}}_A(t,\ell ) {\text {d}} \ell \sim \exp \left[ - \gamma \left( t/\tau _D \right) ^{1/3} \right] , \end{aligned}$$
(37)

where \(\gamma \) is a dimensionless constant and \(\tau _D = 1/(n_0^2 D)\) is the characteristic diffusion time between traps. For higher dimensions, a similar reasoning gives

$$\begin{aligned} {\overline{c}}_A(t) \gtrsim \exp \left[ - \gamma \left( t/\tau _D\right) ^{d/(d+2)} \right] . \end{aligned}$$
(38)

The overall reaction kinetics is slower than what would be predicted by a first-order decay, \({\overline{c}}_A(t) \sim \exp (-t/\tau _D)\), based on a mean decay time equal to \(\tau _D\), the diffusion time between traps, which would be a reasonable first guess for the reactivity of the system. This behavior is due to the fact that there is a small but finite probability of finding intervals or holes that are arbitrarily large with an associated very large diffusion time and thus survival time. Figure 4 illustrates the evolution of the species concentrations according to the exponential and stretched exponential laws.
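The origin of the stretched exponential in \(d = 1\) can be explored numerically. In the Python sketch below, the interval average in (37) is evaluated by quadrature in the scaled variables \(u = n_0 \ell \) and \(s = n_0^2 D t\), and compared with a saddle-point (Laplace) estimate whose exponent is \(-(3/2)(2 \pi ^2 s)^{1/3}\). Setting the proportionality constant in (36) to 1, the quadrature grid, and the function names are assumptions made for illustration.

```python
import numpy as np


def mean_concentration_1d(s, u_max=400.0, n=400_001):
    """Average of exp(-pi^2 s / u^2) over trap-free intervals, cf. (37).

    s = n_0^2 D t is a dimensionless time and u = n_0 * l a scaled interval
    length; the prefactor in (36) is set to 1 (an assumption).
    """
    u = np.linspace(1e-3, u_max, n)
    integrand = u * np.exp(-u - np.pi**2 * s / u**2)
    # Trapezoid rule.
    return float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(u)))


def laplace_estimate(s):
    """Saddle-point estimate of the same integral for large s."""
    u_star = (2.0 * np.pi**2 * s) ** (1.0 / 3.0)  # dominant interval length
    return u_star * np.sqrt(2.0 * np.pi * u_star / 3.0) * np.exp(-1.5 * u_star)


if __name__ == "__main__":
    for s in (10.0, 100.0, 1000.0):
        print(s, mean_concentration_1d(s), laplace_estimate(s), np.exp(-s))
```

The dominant interval length grows as \(s^{1/3}\), which is why rare large trap-free intervals control the long-time survival and make the decay much slower than \(\exp (-s)\).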

Fig. 4 Evolution of the species concentration under randomly distributed traps in \(d = 1\) (blue), \(d = 2\) (orange), and \(d = 3\) (green) spatial dimensions for \(\gamma = 1\). The black line denotes the exponential decay law

3 Molecular Diffusion in Crowded Environments

3.1 Geometrically Constrained Diffusion

The diffusion models described above assume that the individual and/or collective motion of particles is unaffected by the geometry of a simulation domain (e.g., pore space) \({\mathcal {D}}\). This assumption can be violated when pores become small, in extreme cases reaching the size of diffusing particles (e.g., red blood cells moving through capillaries). If particles move in a d-dimensional (\(d = 2\) or 3) space \({\mathcal {D}}\) whose characteristic size w in k (\(k < d\)) dimensions is exceedingly small, it is common to introduce diffusion models in an “effective dimension” \(d_\text {eff} = d - k\). For example, the effective dimension of a transport model in a narrow long capillary (\(d = 2\)) is \(d_\text {eff} = 1\), while that in a narrow fracture (\(d = 3\)) is \(d_\text {eff} = 2\).

Under such conditions, (19) is replaced with a PDF equation (Bruna and Chapman 2014),

$$\begin{aligned} \frac{\partial f_{{\mathbf {X}}_i} }{\partial t} = D_0 \nabla _{{\mathbf {x}}}^2 [f_{{\mathbf {X}}_i} + \alpha _h (N_\text {par}-1) \varepsilon ^{d_\text {eff}} f_{{\mathbf {X}}_i}^2] - \nabla _{{\mathbf {x}}} \cdot [\bar{{\mathbf {U}}}({\mathbf {x}}) f_{{\mathbf {X}}_i}], \quad {\mathbf {x}} \in {\mathcal {D}}_\text {eff}, \end{aligned}$$
(39)

which is defined in a domain \({\mathcal {D}}_\text {eff}\) with the effective dimension \(d_\text {eff}\). In special cases, the coefficient \(\alpha _h\) can be related analytically to the “confinement parameter” h (Bruna and Chapman 2014); otherwise, it serves as a fitting parameter. Equation (39) governs the PDF of finding the ith finite-size particle, \( f_{{\mathbf {X}}_i}({\mathbf {x}},t)\), at the space–time point \(({\mathbf {x}},t)\).

Collective diffusion of a large number of finite-size particles, \(N_\text {par}-1 \approx N_\text {par}\), is described by the nonlinear advection–diffusion equation (20) with the following caveats. First, the volume fraction \(\varphi _h\) in the definition of the volumetric concentration \(c \equiv \varphi _h f_{{\mathbf {X}}_i}\) can now be explicitly derived only in a few special cases for which \(\alpha _h\) is computable (Bruna and Chapman 2014). Second, (20) is defined in the domain \({\mathcal {D}}_\text {eff}\) of the effective (reduced) dimension \(d_\text {eff}\); the drift velocity vector \(\bar{{\mathbf {U}}}\) has the corresponding number of components. Third, the effective diffusion coefficient D(c) in (21) is replaced with

$$\begin{aligned} D(c) = D_0 [1 + g_h c], \end{aligned}$$
(40)

wherein the coefficient \(g_h > 0\) is explicitly given for the same special cases.

For point particles, both (39) and the corresponding nonlinear equation for the particle concentration reduce to a linear advection–diffusion equation (ADE)

$$\begin{aligned} \frac{\partial {\mathcal {A}}}{\partial t} = D_0 \nabla _{{\mathbf {x}}}^2 {\mathcal {A}}- \nabla _{{\mathbf {x}}} \cdot [\bar{{\mathbf {U}}}({\mathbf {x}}) {\mathcal {A}}], \qquad {\mathbf {x}} \in {\mathcal {D}}_\text {eff}. \end{aligned}$$
(41)

Here, \({\mathcal {A}}\) stands for both the PDF \(f_{{\mathbf {X}}_i}({\mathbf {x}},t)\) and the concentration \(c({\mathbf {x}},t)\). Since the particles have zero diameter, the “smallness” of \({\mathcal {D}}\) is expressed in terms of the Knudsen number \(\text {Kn} = \lambda / w\), where \(\lambda \) is the mean free path of a particle diffusing in the space with the characteristic length w. Two special cases of the general, yet approximate, Eq. (41) are described below.

3.1.1 Knudsen Diffusion

If advection–diffusion transport in a pore (\({\mathcal {D}}\)) of diameter w takes place in the regime with \(\text {Kn} \ll 1\), then the pore is sufficiently large for the confinement effects to be negligible and the diffusion coefficient is \(D = D_0 \equiv \lambda v / 3\) where v is the mean molecular velocity. If \(\text {Kn} \gg 1\), then the (Knudsen) diffusion coefficient is \(D = D_\text {Kn} \equiv w v / 3\) (Jacobs 1967). In the intermediate regime, the Bosanquet relation estimates the effective diffusion coefficient D as the harmonic mean between \(D_0\) and \(D_\text {Kn}\) (e.g., Zalc et al. 2004),

$$\begin{aligned} D = \left( \frac{1}{D_0} + \frac{1}{D_\text {Kn} } \right) ^{-1}. \end{aligned}$$
(42)

This empirical treatment of diffusion replaces (41) with an ADE

$$\begin{aligned} \frac{\partial c}{\partial t} = D \nabla _{{\mathbf {x}}}^2 c- \nabla _{{\mathbf {x}}} \cdot [{\mathbf {U}}({\mathbf {x}}) c], \qquad {\mathbf {x}} \in {\mathcal {D}}, \end{aligned}$$
(43)

i.e., allows one to obtain the concentration \(c({\mathbf {x}},t)\) by solving the standard ADE, albeit with the modified diffusion coefficient.
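The interpolation between the bulk and Knudsen regimes in (42) can be illustrated with a minimal sketch. The function names and the numerical values of \(\lambda \), v and w below are hypothetical and serve only to show the limiting behavior.

```python
def knudsen_number(mean_free_path, pore_width):
    """Kn = lambda / w."""
    return mean_free_path / pore_width


def bosanquet_diffusivity(mean_free_path, pore_width, mean_velocity):
    """Effective diffusion coefficient from the Bosanquet relation (42)."""
    d_bulk = mean_free_path * mean_velocity / 3.0      # D_0 = lambda v / 3
    d_knudsen = pore_width * mean_velocity / 3.0       # D_Kn = w v / 3
    return 1.0 / (1.0 / d_bulk + 1.0 / d_knudsen)


# Illustrative (assumed) values: lambda = 70 nm, v = 450 m/s
lam, v = 70e-9, 450.0
for w in (10e-9, 70e-9, 10e-6):
    print(knudsen_number(lam, w), bosanquet_diffusivity(lam, w, v))
```

For \(\text {Kn} \ll 1\) the result approaches \(D_0\), for \(\text {Kn} \gg 1\) it approaches \(D_\text {Kn}\), and for \(\text {Kn} = 1\) it equals \(D_0/2\).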

Knudsen diffusion plays an important role in a large number of natural and engineered (nano)porous media. Examples include transport of various species in pharmaceutical tablets (Klinzing and Zavaliangos 2016), in tubular sublimators utilized for purification of large organic molecules (Qian et al. 2016), in graphites designed for high-temperature gas-cooled nuclear reactors (Kane et al. 2018), in membranes used for distillation desalination (Deshmukh and Elimelech 2017), and in hierarchical metamaterials tailored for energy storage devices (Zhang and Tartakovsky 2017).

3.1.2 Fick–Jacobs Diffusion

Consider advection–diffusion transport in a two-dimensional (\(d=2\)) channel of a small width \(w = H\), with the transverse coordinate \(-H/2< y < H/2\). The corresponding effective dimension is \(d_\text {eff} = 1\); the resulting effective model is defined on the domain \({\mathcal {D}}_\text {eff} = \{x: -L< x < L\}\), with the confinement parameter \(h = H/ \varepsilon \). This model takes the form of a one-dimensional version of (41),

$$\begin{aligned} \frac{\partial {\mathcal {A}}}{\partial t} = D_0 \frac{\partial ^2 {\mathcal {A}}}{\partial x^2} - U \frac{ \partial {\mathcal {A}}}{ \partial x}, \qquad -L< x < L. \end{aligned}$$
(44)

Here, \({\mathcal {A}}(x,t)\) stands for both the PDF \(f_{X_i}(x,t)\) and the concentration \(c(x,t)\). A generalization of (44), which accounts for spatial variability of the channel width, \(h = h(x)\), is referred to as the Fick–Jacobs equation (Jacobs 1967; Bruna and Chapman 2014),

$$\begin{aligned} \frac{\partial {\mathcal {A}}}{\partial t} = D_0 \frac{\partial }{\partial x} \left[ h \, \frac{\partial }{\partial x} \left( \frac{{\mathcal {A}} }{ h }\right) \right] - U \frac{ \partial {\mathcal {A}}}{ \partial x}, \qquad -L< x < L. \end{aligned}$$
(45)

This equation and its multiple variants (see, e.g., Dorfman and Yariv (2014) and Sect. 4 in Burada et al. (2009)) have been used to describe pore-scale diffusion processes in biological (e.g., ion channels), geological (e.g., tight formations), and manufactured (e.g., nanotubes) porous media.
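To make the structure of (45) concrete, the following minimal sketch integrates it with \(U = 0\) by an explicit finite-volume scheme with no-flux ends. The width profile h(x), the grid, the time step and the initial pulse are all assumptions chosen for illustration; the scheme is one simple possibility rather than a method taken from the cited references.

```python
import math


def fick_jacobs_step(A, h, D0, dx, dt):
    """One explicit finite-volume step of (45) with U = 0 and no-flux ends."""
    n = len(A)
    flux = [0.0] * (n + 1)                    # flux[i] sits between cells i-1 and i
    for i in range(1, n):
        h_face = 0.5 * (h[i - 1] + h[i])      # channel width at the interface
        flux[i] = -D0 * h_face * (A[i] / h[i] - A[i - 1] / h[i - 1]) / dx
    return [A[i] - dt * (flux[i + 1] - flux[i]) / dx for i in range(n)]


L, n, D0 = 1.0, 40, 1.0
dx = 2.0 * L / n
x = [-L + (i + 0.5) * dx for i in range(n)]
h = [1.0 + 0.5 * math.cos(math.pi * xi / L) for xi in x]   # hypothetical width profile
A = [0.0] * n
A[n // 2] = 1.0 / dx                          # unit mass released near x = 0
dt = 0.25 * dx * dx / D0                      # small enough for the explicit scheme
for _ in range(int(10.0 / dt)):
    A = fick_jacobs_step(A, h, D0, dx, dt)

ratio = [a / w for a, w in zip(A, h)]         # A/h flattens as steady state is approached
mass = sum(A) * dx                            # total mass is conserved by construction
```

At late times the ratio \({\mathcal {A}}/h\) becomes spatially uniform, i.e., the one-dimensional density \({\mathcal {A}}\) is proportional to the local channel width, which is the behavior that the term \(\partial _x [h \, \partial _x ({\mathcal {A}}/h)]\) encodes.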

3.2 Continuum (Darcy-Scale) Representations of Diffusion in Porous Media

The diffusion models described in the previous sections are valid for the fluid phase (pore space) of a porous medium. Consequently, they can be deployed in pore-scale simulations of transport processes. Their use at the continuum (Darcy) scale requires either systematic upscaling (see, e.g., Battiato et al. (2019) in this issue for a comparative review of various upscaling techniques) or empirical modifications.

For example, a commonly used (and often criticized) expression,

$$\begin{aligned} D_\text {eff} = \frac{\omega D}{\tau }, \end{aligned}$$
(46)

relies on geometric characteristics of a porous medium—its porosity \(\omega \) and tortuosity \(\tau \)—to relate the Darcy-scale diffusion coefficient \(D_\text {eff}\) to its pore-scale counterpart D. Combining (46) with the (empirical) Bruggeman relation \(\tau = 1/ \sqrt{\omega }\) yields another popular model, \(D_\text {eff} = \omega ^{3/2} D\). The Darcy-scale transport models of this kind assume that the pore- and Darcy-scale equations, such as those described above, are identical except for the values of their diffusion coefficients.
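As a small worked illustration of (46) and of its combination with the Bruggeman relation, consider the following sketch. The function names are hypothetical; the second function simply inverts (46) to obtain an apparent tortuosity from a given pair D and \(D_\text {eff}\).

```python
def effective_diffusivity(D, porosity, tortuosity=None):
    """Darcy-scale D_eff from (46); Bruggeman tortuosity if none is supplied."""
    if tortuosity is None:
        tortuosity = porosity ** -0.5          # Bruggeman: tau = 1 / sqrt(omega)
    return porosity * D / tortuosity


def apparent_tortuosity(D, D_eff, porosity):
    """Invert (46): tau = omega * D / D_eff."""
    return porosity * D / D_eff


# Example: omega = 0.25 gives tau = 2 and D_eff = omega**1.5 * D = 0.125 * D
print(effective_diffusivity(2.0, 0.25))
```

With \(\omega = 0.25\), the Bruggeman relation gives \(\tau = 2\), so that \(D_\text {eff} = 0.125 D\).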

Even for linear diffusion phenomena, empirical relations such as \(D_\text {eff} = \omega ^{3/2} D\) or (46) fail to capture many salient features of the Darcy-scale diffusivity, including its tensorial nature due to pore geometry (Battiato et al. 2019). For nonlinear phenomena, e.g., diffusion of charged particles (Sect. 2.3), these relations do not account for the observed dependence of \(D_\text {eff}\) on solute/electrolyte properties and transport conditions (e.g., the electrical double layer that forms on the solid surfaces, thus reducing the pore space available for diffusion) (Zhang and Tartakovsky 2017). Even if empirical models like (46) were sufficiently accurate, they would be of limited use in materials design that aims to identify an optimal pore geometry (Zhang et al. 2015).

In a somewhat tautological way, (46) can be used as a definition of a porous medium’s tortuosity \(\tau \). It allows one to estimate a value of the otherwise unobservable \(\tau \) from measurements of D and \(D_\text {eff}\). While the thermodynamic considerations yield a value of the pore-scale (free fluid) diffusivity D for many of the diffusion phenomena described above, the value of \(D_\text {eff}\) has to be inferred by fitting a solution of, e.g., the ADE (43) to Darcy-scale concentration measurements. The latter step assumes both the validity of ADE-like representations of Darcy-scale diffusion phenomena and the dependence of \(D_\text {eff}\) solely on geometric properties of the pore space. Since both assumptions are problematic (see the discussion above and Sect. 5), this procedure for estimating the tortuosity can lead to unphysical results. For example, “tortuosity factors are often much larger for Knudsen diffusion than for bulk diffusion, in spite of their intended and purely geometric nature” (Zalc et al. 2004).

3.3 Trap and Symmetric Barrier Models

For diffusion in a fluid at rest, the density of particles which move according to the Langevin equation (4) is described by the diffusion equation (8). The latter is a combination of Fick’s law of diffusion, \(J = - D \partial _x f\), and mass conservation, \(\partial _t f = - \partial _x J\). For diffusion in heterogeneous media, one distinguishes between the symmetric barrier and trap models (Bouchaud and Georges 1990). These alternative models and their implications are discussed below.

3.3.1 Trap Model

The Langevin equation (4) with a spatially variable diffusion coefficient,

$$\begin{aligned} \frac{{\text {d}} X(t)}{{\text {d}} t} = \sqrt{2 D[X(t)]} \xi (t), \end{aligned}$$
(47)

gives rise to a Fokker–Planck equation for the PDF \(f(x,t)\) (Risken 1996; Gardiner 2010),

$$\begin{aligned} \frac{\partial f}{\partial t} = \frac{\partial ^2 D(x) f}{\partial x^2} . \end{aligned}$$
(48)

Its steady-state solution, the PDF \(f^\text {eq}(x) \propto 1 / D(x)\), implies that particles accumulate in regions of small diffusion coefficient D(x), i.e., the particles get trapped. Figure 5 shows this steady-state particle distribution for a linearly varying \(D(x) = x+1\) in a domain \(0< x < 1\) with reflecting boundary conditions.

Fig. 5 Steady-state particle distributions for the trap (green) and symmetric barrier (black) models. The blue line denotes the diffusion coefficient D(x)
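The trapping behavior implied by (48) can be checked numerically. The sketch below is an assumed explicit finite-volume discretization (the grid size, time step and uniform initial condition are arbitrary choices); it evolves f for \(D(x) = x+1\) on \(0< x < 1\) with reflecting boundaries and compares the result with the normalized steady state \(1/[(x+1) \ln 2]\).

```python
import math


def trap_model_step(f, D, dx, dt):
    """One explicit step of (48) with reflecting (zero-flux) boundaries."""
    n = len(f)
    flux = [0.0] * (n + 1)                       # flux[i]: between cells i-1 and i
    for i in range(1, n):
        flux[i] = -(D[i] * f[i] - D[i - 1] * f[i - 1]) / dx   # J = -d(D f)/dx
    return [f[i] - dt * (flux[i + 1] - flux[i]) / dx for i in range(n)]


n = 20
dx = 1.0 / n
x = [(i + 0.5) * dx for i in range(n)]
D = [xi + 1.0 for xi in x]                       # D(x) = x + 1
f = [1.0] * n                                    # start from a uniform PDF
dt = 0.2 * dx * dx / max(D)                      # small enough for the explicit scheme
for _ in range(int(2.0 / dt)):
    f = trap_model_step(f, D, dx, dt)

f_exact = [1.0 / ((xi + 1.0) * math.log(2.0)) for xi in x]   # normalized 1/D(x)
```

Starting from a uniform distribution, probability mass migrates toward \(x = 0\), where D(x) is smallest, and the discrete solution settles on the profile proportional to 1/D(x).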

3.3.2 Symmetric Barrier Model

An alternative way to account for heterogeneity is to generalize Fick’s law, \(J = - D(x) \partial _x f\). Combined with mass conservation, this yields a diffusion equation for \(f(x,t)\),

$$\begin{aligned} \frac{\partial f}{\partial t} = \frac{\partial }{\partial x} \left[ D(x) \frac{\partial f}{\partial x} \right] . \end{aligned}$$
(49)

Its steady-state solution, the PDF \(f^\text {eq} = \text {constant}\), implies that particles are uniformly distributed, i.e., there is no accumulation of mass in regions of high or low diffusivity. The Langevin equation corresponding to (49) is (Risken 1996; Gardiner 2010)

$$\begin{aligned} \frac{{\text {d}} X(t)}{{\text {d}} t} = \frac{{\text {d}} D[X(t)]}{{\text {d}} x} + \sqrt{2 D[X(t)]} \xi (t). \end{aligned}$$
(50)

The drift term drives the particles away from regions of low diffusivity toward those of high diffusivity. This drift counteracts the trapping effect of Sect. 3.3.1 and thus leads to a uniform particle distribution.

The term “symmetric barrier model” is illustrated by a finite-volume discretization of (49),

$$\begin{aligned} \frac{{\text {d}} f_n(t)}{{\text {d}} t} = \sum _m w_{nm} \left[ f_m(t) - f_n(t)\right] , \qquad f_n(t) \equiv f(x_n,t). \end{aligned}$$
(51)

Equation (51) is a master equation with the transition rates

$$\begin{aligned} w_{nm} = \frac{{\hat{D}}_{nm}}{\Delta x^2}, \end{aligned}$$
(52)

where \({\hat{D}}_{nm}\) is the bond diffusivity between cells n and m, which is typically determined as the harmonic mean of the diffusion coefficients in the two neighboring cells. The symmetry of the transition rates, \(w_{mn} = w_{nm}\), gives rise to the term “symmetric barrier model.”
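The master equation (51) with the rates (52) can be sketched in a few lines. The example below assumes \(D(x) = x+1\) on \(0< x < 1\), nearest-neighbor exchange only, reflecting ends, and an explicit Euler time step; all of these are illustrative choices.

```python
def harmonic_mean(a, b):
    return 2.0 * a * b / (a + b)


def master_equation_step(f, w, dt):
    """Explicit Euler step of (51); w[n] is the rate between cells n and n+1."""
    new_f = list(f)
    for n in range(len(f) - 1):
        exchange = dt * w[n] * (f[n + 1] - f[n])   # same rate in both directions
        new_f[n] += exchange
        new_f[n + 1] -= exchange
    return new_f


n_cells = 20
dx = 1.0 / n_cells
D = [(i + 0.5) * dx + 1.0 for i in range(n_cells)]          # D(x) = x + 1 at cell centers
w = [harmonic_mean(D[i], D[i + 1]) / dx ** 2 for i in range(n_cells - 1)]   # rates (52)
f = [0.0] * n_cells
f[0] = 1.0 / dx                                             # all mass starts in the first cell
dt = 0.2 * dx ** 2 / max(D)
for _ in range(int(2.0 / dt)):
    f = master_equation_step(f, w, dt)
```

Because a single rate is stored per bond and used for jumps in both directions, the scheme conserves mass exactly and relaxes to a uniform distribution regardless of the spatial variability of D(x).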

4 Diffusion-Like Phenomena in Porous Media

Linear and nonlinear diffusion equations provide Darcy-scale representations of various flow and transport phenomena in porous media that are related to the phenomenon of diffusion in a wider sense. These phenomena are not necessarily related to the motion of Brownian particles in a free fluid, but describe instead diffusion of pressure and diffusive propagation of a phase field such as fluid saturation. We also discuss diffusion-like phenomena of solute dispersion that are caused by hydrodynamic velocity fluctuations.

4.1 Flow in Porous Media

Linear diffusion: single-phase flows. Consider a porous medium that is completely saturated with a fluid of density \(\rho \) and dynamic viscosity \(\mu \). Within the pore space, fluid flow can be described by Stokes or Navier–Stokes equations, depending on the pore size and flow velocity. When averaged over a sufficiently large volume of the medium, i.e., at the Darcy scale, the same flow is described by Darcy’s law \({\mathbf {q}} = - K \nabla h\) which, in analogy with Fick’s law of diffusion, postulates a linear relation between the fluid’s volumetric (Darcy) flux \({\mathbf {q}}\) and the gradient of hydraulic head \(h = \psi - x_3\). Here, \(\psi = p / (\rho g)\) is the pressure head, with g denoting the gravitational acceleration constant; and the vertical coordinate \(x_3\) (positive downward) represents the elevation head. The constant of proportionality \(K = k \rho g / \mu \) in Darcy’s law is called hydraulic conductivity; it is a property of both the porous medium (via its dependence on k, the medium’s permeability) and the fluid (its density \(\rho \) and viscosity \(\mu \)). Darcy’s law is essentially phenomenological, even though it can be derived (after many approximations) from the pore-scale Stokes equations by means of homogenization (Battiato et al. 2019).

When combined with mass conservation for a fluid-saturated volume of the porous medium, \(\partial _t (\omega \rho ) = - \nabla \cdot (\rho {\mathbf {q}})\), Darcy’s law yields a diffusion equation for the hydraulic head \(h({\mathbf {x}},t)\),

$$\begin{aligned} \frac{\partial h}{\partial t} = D \nabla ^2 h, \end{aligned}$$
(53)

where \(D \equiv K / S_s\) is the water diffusivity and the specific storage \(S_s\) accounts for slight compressibility of both the fluid and the porous matrix. This equation is written for homogeneous and isotropic porous media and implies the absence of fluid sources and sinks. Its generalizations, which account for these features, are straightforward.

Nonlinear diffusion: multiphase flows. Consider next a porous medium whose pores are filled with air and water. The fraction of the pores occupied by water, in a (representative elementary) volume of the medium, is referred to as water saturation, \(S_w \le 1\). A Darcy-scale description of water flow through partially saturated porous media relies on a phenomenological generalization of the phenomenological Darcy’s law, \({\mathbf {q}} = - K(S_w) \nabla h\), in which the hydraulic conductivity is an increasing function of saturation. Its relationship to, and derivation from, the pore-scale Stokes equations for multiple fluids separated by immiscible interfaces is even more tenuous than that for single-phase flows (Battiato et al. 2019). That is, in part, because the hydraulic conductivity K is affected not only by the saturation \(S_w\) but also by the fluid topology (Picchi and Battiato 2018; Picchi et al. 2018).

Assuming the air pressure throughout the porous medium to equilibrate (nearly) instantaneously with the atmosphere, and combining Darcy’s law with mass conservation, \(\omega \partial _t S_w = - \nabla \cdot {\mathbf {q}}\), leads to the Richards equation written in terms of the water content \(\theta ({\mathbf {x}},t) \equiv \omega S_w({\mathbf {x}},t)\),

$$\begin{aligned} \frac{\partial \theta }{\partial t} = \nabla \cdot [ D(\theta ) \nabla \theta ] - U(\theta ) \frac{\partial \theta }{ \partial x_3}. \end{aligned}$$
(54)

The moisture diffusivity \(D(\theta ) \equiv K(\theta ) ({\text {d}} \psi / {\text {d}} \theta )\) and the “velocity” \(U (\theta ) \equiv {\text {d}} K / {\text {d}} \theta \) are calculated from the experimentally determined constitutive relations \(\theta = \theta (\psi )\) and \(K = K(\theta )\).

Unlike the \(\psi \)-based Richards equation, the nonlinear ADE (54) cannot be used in fully saturated regions of a porous medium and its solutions for composite porous materials exhibit jump discontinuities. Generalizations of the Richards equation include systems of coupled nonlinear parabolic equations of multiphase flow, which track spatiotemporal evolution of the saturation (and pressure) of each fluid phase; and phase-field dynamic models of wetting and drying of porous media (Mitkov et al. 1998), which capture the observed hysteresis in the relative conductivity \(K = K(\psi )\).

4.2 Dispersion

4.2.1 Hydrodynamic Dispersion

Upscaling of transport equations from the pore to the Darcy scale is the subject matter of Battiato et al. (2019) in this issue. Under a number of assumptions, this procedure leads to what is otherwise a phenomenological advection–dispersion equation for the Darcy-scale solute concentration \(c({\mathbf {x}},t)\),

$$\begin{aligned} \omega \frac{\partial c}{\partial t} = \nabla \cdot ({\mathbf {D}} \nabla c) - \nabla \cdot ({\mathbf {q}} c). \end{aligned}$$
(55)

The hydrodynamic dispersion tensor \({\mathbf {D}}\) accounts for both pore-scale molecular diffusion D and spatial variability of pore-scale fluid velocity. For pore geometry with a characteristic length (e.g., average grain size) \(\ell _g\), one defines a Péclet number as \(\text {Pe} = {\overline{u}} \ell _g/D\), where \({\overline{u}} = |\bar{{\mathbf {q}}} |/ \omega \) is the magnitude of the average pore velocity. For high Péclet numbers, the correlation time in the mean flow direction is \(\tau _u \approx \ell _g/{\overline{u}}\) and the velocity variance is \(\sigma _u^2 \sim {\overline{u}}^2\). Since the dispersion coefficient scales as the product of the velocity variance and the correlation time, \(D_L \sim \sigma _u^2 \tau _u\), the effect of velocity fluctuations on the longitudinal dispersion (the \(D_{11}\) component of the dispersion tensor \({\mathbf {D}}\) when the \(x_1\) coordinate is aligned with the flow direction) is

$$\begin{aligned} D_L \sim {\overline{u}} \ell _g. \end{aligned}$$
(56)

This implies that \(D_L /D \sim \text {Pe}\), which was observed in experiments and numerical simulations at high Péclet numbers (Pfannkuch 1963; Bear 1972; Bijeljic and Blunt 2006). For \(\text {Pe} < \text {Pe}_\text {cr}\) with the critical Péclet number \(\text {Pe}_\text {cr} = 400\), the longitudinal dispersion coefficient scales as \(D_L/D \sim \text {Pe}^{1.2}\), and for \(\text {Pe} \ll 1\) as \(D_L/D \sim 1\) (Bijeljic and Blunt 2006). These behaviors are captured by the relation (Bear 1972)

$$\begin{aligned} D_L = D \gamma + \alpha _L {\overline{u}} \frac{\text {Pe}}{\text {Pe} + 2 + 4 \delta ^2}, \end{aligned}$$
(57)

where \(\alpha _L\) is the longitudinal dispersion length, which corresponds to the grain size \(\ell _g\); \(\gamma \) represents the effect of the tortuous pore geometry on molecular diffusion in the bulk, e.g., \(\gamma = \omega / \tau \) in (46); and the parameter \(\delta \) characterizes the shape of the pore channels. The second term on the right side of (57) is mechanical dispersion.
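The two limits of (57) are easy to verify numerically. In the sketch below, the function name and all parameter values are illustrative assumptions; the Péclet number is computed from the grain size as defined above.

```python
def longitudinal_dispersion(D, u, alpha_L, gamma, delta, grain_size):
    """Longitudinal dispersion coefficient from relation (57)."""
    pe = u * grain_size / D                       # Pe = u * l_g / D
    mechanical = alpha_L * u * pe / (pe + 2.0 + 4.0 * delta ** 2)
    return D * gamma + mechanical


# Assumed values: D = 1e-9 m^2/s, l_g = alpha_L = 1e-3 m, gamma = 0.5, delta = 1
args = dict(D=1e-9, alpha_L=1e-3, gamma=0.5, delta=1.0, grain_size=1e-3)
slow = longitudinal_dispersion(u=1e-12, **args)   # Pe = 1e-6: diffusion dominates
fast = longitudinal_dispersion(u=1.0, **args)     # Pe = 1e6: mechanical dispersion dominates
```

For \(\text {Pe} \ll 1\) the result reduces to \(D \gamma \), whereas for \(\text {Pe} \gg 1\) it approaches \(\alpha _L {\overline{u}}\), in line with the scaling (56).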

4.2.2 Macrodispersion

Spatial variability of hydraulic conductivity \(K({\mathbf {x}})\) at the field scale is modeled by representing its logarithm, \(Y = \ln K\), as a multivariate Gaussian random field, whose statistics (the mean \({\bar{Y}}\), variance \(\sigma _Y^2\), and correlation length \(\ell _Y\)) are inferred from measurements. Upscaling of transport equations from the Darcy to the field scale by means of stochastic perturbation theory (Cushman 1997; Dagan 2012; Dagan and Neuman 1997) gives, under several approximations, an advection–dispersion equation (55) with the dispersion coefficient \({\mathbf {D}}\) and Darcy velocity \({\mathbf {q}}\) replaced by their effective counterparts \({\mathbf {D}}^*\) and \(\bar{{\mathbf {q}}}\). The Péclet number is now defined as \(\text {Pe} = {\overline{u}} \ell _Y /D_L\), where \(D_L\) in this context represents the (constant) local scale dispersion and \({\overline{u}} = |\bar{{\mathbf {q}}} |/ \omega \) is the average macroscopic flow velocity magnitude. Again, at high Péclet numbers the correlation time is estimated as \(\tau _u = \ell _Y/{\overline{u}}\). The first-order (in \(\sigma _Y^2\)) perturbation theory estimates the velocity variance to be \(\sigma _u^2 \sim \sigma _Y^2 {\overline{u}}^2\) (e.g., Dagan 1984). Thus, with \(D_L^* \sim \sigma _u^2 \tau _u\), the longitudinal macrodispersion coefficient is (e.g., Dagan 1984)

$$\begin{aligned} D_L^* = \sigma _Y^2 {\overline{u}} \ell _Y. \end{aligned}$$
(58)

This result relates the field-scale dispersion coefficient to the medium’s hydraulic properties (\(\sigma _Y^2\) and \(\ell _Y\)) and the mean flow velocity (\({\overline{u}}\)).

5 Non-Fickian Diffusion in Porous Media

Fickian diffusion (and dispersion) models predict the first moment and the second centered moment of a solute plume (i.e., its center of mass and spread) to increase linearly with time, \(\langle X(t)^2 \rangle \propto t\); the spatial distribution of a solute plume to have a Gaussian shape (9); and the solute breakthrough curves to follow an inverse Gaussian distribution. A number of factors, e.g., the finite size of particles diffusing in confined environments (Sects. 2.2 and 3.1) or the electrical charge carried by particles (Sect. 2.3), can cause a deviation from this behavior; in this sense, one could argue that the Fickian behavior is rarer (“anomalous”) than the non-Fickian one (Cushman and O’Malley 2015).

Heterogeneity of porous media is another factor that gives rise to non-Fickian dynamics. Breakthrough curves are characterized by strong tailing, plumes exhibit non-Gaussian shapes with pronounced forward and/or backward tails, and the mean square displacement of both a plume (Bouchaud and Georges 1990; Cushman et al. 2009) and a single particle (Regner et al. 2013) grows nonlinearly in time,

$$\begin{aligned} \langle X(t)^2 \rangle \propto t^\alpha , \qquad 0 < \alpha \le 2, \end{aligned}$$
(59)

with the values of the exponent \(0< \alpha < 1\) and \(1 < \alpha \le 2\) indicating the sub- and super-diffusive behavior, respectively. Inferring the exponent’s value from (typically) noisy data is not straightforward (Cushman and Moroni 2001; Moroni and Cushman 2001; Regner et al. 2013, 2014); renormalization group classification methods (O’Malley and Cushman 2012; O’Malley et al. 2014) proved to be robust even for data with high noise-to-signal ratios (Regner et al. 2014).
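For orientation only, the simplest way to estimate the exponent in (59) is a least-squares fit of \(\log \langle X(t)^2 \rangle \) against \(\log t\). The sketch below implements this naive estimator; it is not the renormalization group classification method cited above, and the function names and the tolerance used to label a result as Fickian are assumptions.

```python
import math


def fit_msd_exponent(times, msd):
    """Least-squares slope of log(msd) versus log(t), i.e. alpha in (59)."""
    xs = [math.log(t) for t in times]
    ys = [math.log(m) for m in msd]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / sxx


def classify_exponent(alpha, tol=0.05):
    """Label the regime; the tolerance around alpha = 1 is an arbitrary choice."""
    if abs(alpha - 1.0) < tol:
        return "Fickian"
    return "sub-diffusive" if alpha < 1.0 else "super-diffusive"


times = [1.0, 2.0, 4.0, 8.0, 16.0]
alpha = fit_msd_exponent(times, [3.0 * t ** 0.6 for t in times])   # noise-free synthetic data
print(alpha, classify_exponent(alpha))
```

On noise-free synthetic data the fit recovers the exponent; with noisy data the estimate can be biased, which is the difficulty that motivates the more robust methods referenced above.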

While differing in their foundational assumptions and data requirements, models of non-Fickian transport in heterogeneous media are interrelated in the sense that all are nonlocal in space and/or time, i.e., the (advection–)diffusion equations are replaced with their integro-differential counterparts (Neuman and Tartakovsky 2009). The latter reference provides a plethora of comparative studies that focus on the commonalities, differences, and relative strengths of the competing models of non-Fickian transport. A brief description of a few representative models is provided below.

5.1 Ensemble-Averaged Nonlocal Diffusion Equations

Solving a d-dimensional (\(d \ge 1\)) version of the diffusion equation (49) with spatially varying \(D({\mathbf {x}})\) would yield \(f({\mathbf {x}},t)\) that differs from the Gaussian distribution (9), which is obtained for the constant \(D = D_0\). Hence, even though that equation follows from Fick’s law, \({\mathbf {J}} = - D({\mathbf {x}}) \nabla f\), it is common to refer to such diffusion as non-Fickian. That is because, while written in the differential (infinitesimal) form, every term in Fick’s law has a support scale \({\mathcal {S}}_\text {sup}\) on which it is measured and the spatial variability of, for example, \(D({\mathbf {x}})\), on scales smaller than \({\mathcal {S}}_\text {sup}\) is unresolved.

Within the probabilistic framework (Cushman 1997; Dagan 2012; Dagan and Neuman 1997), this lack of information about the sub-scale variability of D is handled by treating it as a random field so that the corresponding diffusion equation, e.g., the d-dimensional version of (49) defined on the computational domain \({\mathcal {D}} \subset {\mathbb {R}}^d\), becomes stochastic. Ensemble-averaging of this equation yields (under some approximations) a space–time nonlocal (integro-differential) equation for the mean state variable, \({\bar{f}}({\mathbf {x}},t)\) (Neuman et al. 1996):

$$\begin{aligned} \frac{\partial {\bar{f}}}{\partial t} = \nabla \cdot [{\bar{D}} \nabla {\bar{f}}] - \nabla \cdot \int \limits _0^t \int \limits _{{\mathcal {D}}} \varvec{\kappa }({\mathbf {x}}, {\mathbf {x}}', t-t') \nabla {\bar{f}}({\mathbf {x}}',t') {\text {d}} {\mathbf {x}}' {\text {d}} t'. \end{aligned}$$
(60)

The kernel \(\varvec{\kappa }({\mathbf {x}}, {\mathbf {x}}', t-t') \approx C_D({\mathbf {x}},{\mathbf {x}}') \nabla _{{\mathbf {x}}} \nabla _{{\mathbf {x}}'}^\top G({\mathbf {x}}, {\mathbf {x}}', t-t')\) is a symmetric positive-semidefinite second-rank tensor (dyadic) that is related to the two-point covariance function of the random diffusion coefficient, \(C_D({\mathbf {x}},{\mathbf {x}}')\), and the mean-field Green’s function G. If \({\mathcal {D}} \equiv {\mathbb {R}}^d\) and \(D({\mathbf {x}})\) is a second-order stationary field, i.e., if its mean (\({\bar{D}} = D_0\)) and variance (\(\sigma _D^2\)) are constant, then \(G = G({\mathbf {x}} - {\mathbf {x}}', t-t')\) is given by the Gaussian function (9). An alternative to this approximation of the kernel \(\varvec{\kappa }({\mathbf {x}}, {\mathbf {x}}', t-t')\) is to treat it as a phenomenological transfer function in the spirit of the approaches described in the subsequent sections.

Nonlocal analogues of (60) for non-Fickian advection–dispersion transport can be found in Fiori et al. (2007), Koch and Brady (1988), Morales-Casique et al. (2006), Neuman (1993), among others.

5.2 Continuous Time Random Walks

The continuous time random walk (CTRW) (Montroll and Weiss 1965; Scher and Lax 1973) relaxes the condition that a particle’s motion is a Markovian process. Particle motion is characterized by a stochastic recursion relation for both the particle position \(X_n\) and particle time \(T_n\) after n random walk steps as \(X_{n+1} = X_n + \xi _n\) and \(T_{n+1} = T_n + \tau _n\). The space–time random displacements \((\xi _n,\tau _n)\) at subsequent steps are independent and distributed according to a joint PDF \(\varPsi (x,t)\). This recursive relation is non-Markovian in time but describes a Markov process in terms of its evolution in step number n. Thus, it is also called a semi-Markov process.
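The recursion for \((X_n, T_n)\) translates directly into a simulation. The sketch below is a minimal, uncoupled example under stated assumptions: Gaussian jumps, Pareto-distributed waiting times with tail exponent \(\beta \) (so that the waiting-time PDF decays as \(t^{-1-\beta }\)), and hypothetical function names.

```python
import bisect
import random


def ctrw_path(n_steps, beta, jump_std=1.0, seed=0):
    """Generate (X_n, T_n) from X_{n+1} = X_n + xi_n, T_{n+1} = T_n + tau_n."""
    rng = random.Random(seed)
    x, t = 0.0, 0.0
    positions, times = [x], [t]
    for _ in range(n_steps):
        xi = rng.gauss(0.0, jump_std)        # spatial increment xi_n
        tau = rng.paretovariate(beta)        # waiting time tau_n >= 1, PDF ~ t^(-1-beta)
        x += xi
        t += tau
        positions.append(x)
        times.append(t)
    return positions, times


def position_at(positions, times, t):
    """Position at clock time t: the particle rests at X_n until time T_{n+1}."""
    return positions[bisect.bisect_right(times, t) - 1]


positions, times = ctrw_path(n_steps=1000, beta=0.7)
print(position_at(positions, times, 50.0))
```

The walk is Markovian in the step number n, while `position_at` exposes the process in clock time, where long waiting times introduce memory.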

For an uncoupled CTRW, \(\varPsi (x,t) = \varLambda (x) \varPsi (t)\), with a sharply peaked transition density \(\varLambda (x)\), an expansion similar to that used to derive (8) leads to a time nonlocal equation for the PDF, \(f(x,t)\), of finding a particle at the space–time point \((x,t)\) or, equivalently, for \(c(x,t)\), the concentration of particles (Kenkre et al. 1973; Metzler and Klafter 2000),

$$\begin{aligned} \frac{\partial f}{\partial t} = \int \limits _{0}^t \kappa (t-t') \frac{\partial ^2 f}{\partial x^2}(x,t') {\text {d}} t'. \end{aligned}$$
(61)

The diffusion kernel \(\kappa \) is defined in terms of the transition times PDF \(\varPsi (t)\) by

$$\begin{aligned} \kappa (t) = \frac{1}{2} \langle \xi ^2 \rangle {\mathcal {K}}(t), \qquad \hat{{\mathcal {K}}}(\lambda ) = \frac{\lambda {\hat{\varPsi }}(\lambda )}{1 - {\hat{\varPsi }}(\lambda )}, \end{aligned}$$
(62)

where, for any appropriate function g(t), \({\hat{g}}(\lambda )\) indicates its Laplace transform; and \(\langle \xi ^2 \rangle \) is the variance of \(\xi _n\) for all \(n \ge 1\). The transition time PDF \(\varPsi (t)\) has to be specified by the modeler from either prior knowledge or measurements of the state variable f. For a power-law distribution, \(\varPsi (t) \sim t^{-1-\beta }\) with \(0< \beta < 1\), the displacement variance scales as

$$\begin{aligned} \langle X(t)^2 \rangle \propto t^\beta , \qquad 0< \beta < 1, \end{aligned}$$
(63)

which indicates sub-linear scaling or sub-diffusive behavior.
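A useful sanity check on (62) is the exponential waiting-time PDF with mean \(\tau \), whose Laplace transform is \(1/(1 + \lambda \tau )\). In that case \(\hat{{\mathcal {K}}}(\lambda ) = 1/\tau \) for all \(\lambda \), so the kernel is a delta function in time and (61) reduces to the classical diffusion equation with \(D = \langle \xi ^2 \rangle / (2 \tau )\). The sketch below evaluates this numerically; the function names and the value of \(\tau \) are illustrative assumptions.

```python
def kernel_laplace(psi_hat, lam):
    """K_hat(lambda) of (62) for a given Laplace-transformed waiting-time PDF."""
    p = psi_hat(lam)
    return lam * p / (1.0 - p)


def exponential_psi_hat(tau_mean):
    """Laplace transform of the exponential PDF exp(-t / tau) / tau."""
    return lambda lam: 1.0 / (1.0 + lam * tau_mean)


tau_mean = 2.0
psi_hat = exponential_psi_hat(tau_mean)
values = [kernel_laplace(psi_hat, lam) for lam in (0.01, 1.0, 100.0)]
print(values)   # each entry is close to 1 / tau_mean
```

A waiting-time PDF with a power-law tail does not yield a constant \(\hat{{\mathcal {K}}}(\lambda )\), which is the origin of the memory in (61).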

CTRW is related to both Lévy walk models (Metzler and Klafter 2000; Cushman et al. 2009) and fractional advection–dispersion equations (Benson et al. 2000; Cushman and Ginn 2000) that are characterized by spatiotemporal kernel functions with an asymptotic power-law scaling. Likewise, CTRW is connected to time-domain random walk models (Cvetkovic et al. 1996; Delay and Bodin 2001; Painter and Cvetkovic 2005) and multirate mass transfer models (Dentz and Berkowitz 2003).

5.3 Matrix Diffusion and Multirate Mass Transfer

The multirate mass transfer (MRMT) approach (Haggerty and Gorelick 1995; Carrera et al. 1998) separates the support scale into a mobile continuum and a suite of immobile continua, which communicate through a linear mass transfer. At each point, the immobile continua PDF \(f_\text {im}(x,t)\) is related to the mobile continuum PDF \(f(x,t)\) through a linear relation (Carrera et al. 1998) \(f_\text {im}(x,t) = \int _0^t \phi (t - t') f(x,t') {\text {d}} t'\), where the kernel function \(\phi (\cdot )\) is user-supplied. The evolution of the mobile continuum PDF \(f(x,t)\) is given by an integro-differential equation (Haggerty and Gorelick 1995)

$$\begin{aligned} \frac{\partial f}{\partial t} = D \frac{\partial ^2 f}{\partial x^2} - \epsilon \frac{\partial }{\partial t} \int \limits _0^t \phi (t - t') f(x,t') {\text {d}} t', \end{aligned}$$
(64)

where \(\epsilon \) is the ratio of the immobile and mobile volume fractions. The memory function \(\phi (t)\) reflects the mass transfer mechanisms between the mobile and immobile continua.

Fig. 6 Illustration of the memory functions for the slab-shaped (green) and spherical (blue) inclusions. The dash-dotted lines denote the preasymptotic \(\sim t^{-1/2}\) behavior

For diffusive mass transfer into slab-shaped immobile regions, relevant for fracture–matrix exchange, the memory function is given in terms of its Laplace transform as (Carrera et al. 1998)

$$\begin{aligned} {\hat{\phi }}(\lambda ) = \frac{\tanh (\sqrt{\lambda \tau _D})}{\sqrt{\lambda \tau _D}}, \end{aligned}$$
(65)

where \(\tau _D\) is the characteristic diffusion time in the immobile domain. For diffusion in a medium characterized by spherical inclusions, it is given by

$$\begin{aligned} {\hat{\phi }}(\lambda ) = \frac{3}{\sqrt{\lambda \tau _D}}\left[ \coth (\sqrt{\lambda \tau _D}) - \frac{1}{\sqrt{\lambda \tau _D}}\right] . \end{aligned}$$
(66)

These memory functions behave as \(\phi (t) \propto t^{-1/2}\) for \(t \ll \tau _D\) and decay exponentially fast for \(t \gg \tau _D\). The displacement variance for times \(t \ll \tau _D\) behaves sub-diffusively as (Bouchaud and Georges 1990)

$$\begin{aligned} \langle X(t)^2 \rangle \propto t^{1/2} \end{aligned}$$
(67)

and evolves diffusively for times \(t \gg \tau _D\), as shown in Fig. 6. For diffusive mass transfer into heterogeneous immobile regions, the memory function can be determined by solving a heterogeneous diffusion problem (Gouze et al. 2008). For a heterogeneous medium composed of immobile regions with different diffusion properties, the memory function may behave as \(\phi (t) \propto t^{-\gamma }\) with \(0< \gamma < 1\) before a certain cutoff time.
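The limiting behavior of the memory functions (65) and (66) can be checked directly in the Laplace domain. In the sketch below (function names are hypothetical), both transforms tend to 1 for \(\lambda \tau _D \ll 1\) and decay as \((\lambda \tau _D)^{-1/2}\) for \(\lambda \tau _D \gg 1\), the latter being the Laplace-domain counterpart of the early-time \(t^{-1/2}\) behavior.

```python
import math


def memory_slab(lam, tau_d):
    """Laplace transform (65) of the memory function for slab-shaped regions."""
    s = math.sqrt(lam * tau_d)
    return math.tanh(s) / s


def memory_sphere(lam, tau_d):
    """Laplace transform (66) of the memory function for spherical inclusions."""
    s = math.sqrt(lam * tau_d)
    return 3.0 / s * (1.0 / math.tanh(s) - 1.0 / s)


tau_d = 1.0
for lam in (1e-4, 1.0, 1e6):
    print(lam, memory_slab(lam, tau_d), memory_sphere(lam, tau_d))
```

For large \(\lambda \tau _D\) the slab transform approaches \(1/\sqrt{\lambda \tau _D}\) and the spherical one approaches \(3/\sqrt{\lambda \tau _D}\), so the two geometries differ only by a prefactor in the preasymptotic regime.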

Both the CTRW and MRMT frameworks have similar phenomenology as both account for memory effects due to a distribution of characteristic mass transfer timescales (Dentz and Berkowitz 2003; Dentz et al. 2012). In fact, it can be shown that the exponents in the power-law scalings of the memory function and transition time distribution are the same, \(\beta = \gamma \).

6 Summary

This paper provides a brief review of diffusion phenomena in porous media. It starts with the description of Brownian motion in a free fluid by the Langevin equation and its equivalence to the diffusion equation. We discuss the fundamental mechanisms of diffusion and derive the Einstein–Smoluchowski relations, which connect microscopic velocity fluctuations and energy dissipation.

We review the generalization of the diffusion concept from point-size inert particles to ensembles of a finite number of particles of finite size, which leads to concentration-dependent diffusion coefficients and in this sense to non-Fickian diffusion. The Brownian motion of ions and, more generally, of charged particles is described by the Nernst–Planck equation.

The impact of diffusion on the rates of fast bimolecular chemical reactions can be quantified by the Smoluchowski theory, which relates the reaction rate directly to the diffusion rates of the reacting particles. For heterogeneous initial reactant concentration, the Ovchinnikov–Zeldovich mechanism explains the slowing down of bimolecular chemical reactions due to segregation.

We then consider diffusion processes in porous media. Knudsen and Fick–Jacobs diffusion describe molecular diffusion under confinement. For diffusion in heterogeneous porous media, we distinguish between trap and barrier models, which have distinctly different transport properties. Trap models are related to diffusion problems under heterogeneous linear adsorption, while barrier models describe Fickian diffusion under spatial heterogeneity.

We furthermore discuss diffusion-like phenomena in porous media. These include Darcy-scale flow models, which take the form of linear (single phase) and nonlinear (multiphase) diffusion equations, and dispersive transport of solutes. The concept of dispersion shares the same phenomenology as diffusion in that it models particle motion due to small-scale velocity fluctuations as a random walk and in this sense as a large-scale Brownian motion.

We conclude this brief review with an overview of phenomena and models for anomalous diffusion in porous media.