1 Introduction

Understanding and applying the theory of anomalous transport opens up rich fields of study in science and engineering, transforming our perspective and facilitating extraordinary discoveries that would not be possible otherwise. This class of phenomena refers to fascinating and widespread processes (Footnote 1) that, viewed at appropriate scales, exhibit non-Markovian long-term memory effects, non-Fickian long-range interactions, nonergodic statistics, and non-equilibrium dynamics [1]. Anomalous transport is observed in a wide variety of complex, multi-scale, and multi-physics systems such as subdiffusion and superdiffusion in porous media, kinetic plasma turbulence, aging of polymers, glassy materials, amorphous semiconductors, biological cells, heterogeneous tissues, and disordered media [2,3,4,5]. The crucial point that prompts this work is that conventional mathematical models cannot describe such processes in a succinct, compact way that directly expresses their anomalous and nonlocal character.

This work is founded on the use of fractional-order partial differential equations (FPDEs), which seamlessly generalize standard PDEs of integer order to real-valued order. In practice, FPDEs appear within tractable mathematical models for anomalous transport, ranging from complex fluids to non-Newtonian rheology and the design of aging materials [1, 6,7,8,9], but also in modeling transport phenomena when rates of change in the quantity of interest depend on space or time. In this context, FPDEs with “variable orders” can be exploited in diverse physical and biological applications [10,11,12] to capture transitions between different transport regimes. Moreover, even classical long-standing issues such as monotonicity, anisotropy, and multi-fractal scaling laws in turbulence can be reformulated and reinterpreted in the context of fractional calculus and probability theory. FPDEs therefore emerge as an expressive approach to modeling such physics, transforming the current practice in mathematical modeling and giving rise to a new generation of flexible, high-fidelity, and direct approaches [4, 13,14,15].

In this review article, we focus on three important applications of FPDEs, reporting the scientific evidence of how and why fractional modeling naturally emerges in each case, along with a review of selected nonlocal mathematical models that have been proposed. For brevity, throughout this article we use the term “fractional” to mean “fractional-order”. Despite conflicting with the most common usage of the adjective “fractional” in the English language, this is standard in the literature; thus, fractional-order derivatives are referred to as “fractional derivatives” and fractional-order models as “fractional models”.

Anomalous Subsurface Transport (Sect. 3)

The accurate prediction at large scales of contaminant transport in both surface and subsurface water is fundamental for the management of water resources and critical for environmental safety. However, the explicit description of the systems where transport takes place is extremely challenging, especially at large scales, due to the complexity of the medium. Such media feature heterogeneities that are either difficult or impossible to observe, and hence cannot be described with certainty at all relevant scales and locations. Moreover, even when the environment’s microstructure can be captured, numerical simulations of appropriate PDE models such as systems of advection-diffusion equations may be infeasibly expensive if conducted at fully resolved small scales [16]. In fact, the same types of equations that are accurate at small scales do not extrapolate and predict solutes’ behavior at larger scales, due to the appearance of “anomalous”, or “non-Fickian” behavior [17,18,19,20]. At large scales, FPDEs are called for.

Turbulent Flows (Sect. 4)

Turbulence “remembers” and is fundamentally nonlocal. Coherent motions and “turbulence spots” structures inherently give rise to intermittent signals with sharp peaks, heavy-skirts, and skewed distributions of velocity increments [21, 22], manifesting the non-Markovian, non-Fickian nature of turbulence. This suggests that nonlocal interactions cannot be ruled out in the physics of turbulence [23]. In addition to such an inherent nonlocality, filtering the Navier-Stokes and energy equations in the corresponding large eddy simulation (LES) of turbulent flows and scalar turbulence, in which large-scale motions are “resolved” and only the small scales are “modeled”, would make the existing nonlocality in the corresponding subgrid stochastic processes (i.e., turbulent fluctuations) even more pronounced [24,25,26]. This requires the development of new modeling paradigms in addition to new statistical measures that can meticulously highlight the nonlocal character of turbulence and their absence in the common turbulence modeling practice.

Anomalous Materials/Rheology (Sect. 5)

Accurate modeling of the evolution of material response and failure across multiple time and length scales is essential for life cycle prediction and design of new materials. While the mechanical behavior of a number of standard engineering materials (e.g., metals, polymers, rubbers) is quite well understood, a significant modeling effort still needs to be conducted for complex materials, where microstructure heterogeneities, randomness and small-scale physical mechanisms [5, 27] (e.g., trapping effects and collective behavior) lead to non-standard and, at times, counter-intuitive responses. Two examples are bio-tissues and natural materials, e.g., biopolymers, which are multi-functional products of millions of years of evolution, locally optimized for their hosts and environment, and constrained by a limited set of building blocks and available resources [28, 29]. These materials possess unprecedented properties at low densities, especially due to their hierarchical and multi-scale structure, leading to a wide spectrum of behaviors, such as power-law viscoelasticity, visco-plastic strains under hysteresis loading, damage and failure, fractal avalanche ruptures and self-healing mechanisms [29,30,31,32,33,34].

1.1 Outline of the Article

Before describing each of the aforementioned applications, we review the foundations of fractional calculus: we classify fractional models via their connection with the underlying stochastic processes that serve as the statistical backbone of fractional modeling. The organization of the rest of the review article is as follows: Sects. 3, 4, and 5 are dedicated to subsurface transport, turbulent flows, and anomalous materials, respectively. Each section has the same structure: first, we motivate the need for fractional modeling and provide results or tools necessary for a full understanding of the section. Next, we provide evidence of fractional behavior, reporting state-of-the-art results that highlight the improved accuracy of FPDEs as opposed to classical PDEs. Then, the core of each section is a description of past and current models, with some insights into discretization techniques currently in use. At the end of each section, a paragraph on future directions gives our perspective on fruitful research directions in each area.

2 An Overview of Fractional Derivatives

2.1 Classification of Fractional Derivatives and Models

We introduce and classify the most commonly used fractional-order differential operators in the context of diffusion models based on random walks. For simplicity, we restrict our discussion to one spatial dimension except for a few remarks in which the extension to higher dimensions is touched upon.

To avoid mathematical difficulties, we discuss stochastic processes in terms of their discretizations, thinking of them as sequences of random variables \(X_{n \Delta t}\) for time step \(\Delta t > 0\) and integer n defined as cumulative sums of increments. Strictly speaking, FPDEs govern the statistical properties of continuous-time random walks, which are appropriate scaling limits or long-time limits of the discrete random walks, limits in which n becomes large relative to \(\Delta t\) [35]. However, the rigorous definition of such stochastic processes requires significant excursions into probability theory; this is true even for the classical case of Brownian motion [36, 37]. Thus, while not entirely precise, in introducing fractional operators we characterize the related processes in their discretized form, providing references where rigorous definitions of the processes, as well as proofs of convergence of the discretizations to the continuous-time processes in appropriate limits, are given.

For the sake of clarity, we summarize the notation used in this section and throughout the article in Table 1; there, we record each important symbol, the description of the symbol, and its definition or first appearance in the article.

Table 1 Summary of the most relevant symbols used in the paper, their corresponding description, and referral to their definition or first appearance

2.1.1 Normal, or Fickian, Diffusion

The connection between Brownian motion \(B_t\) and the classical diffusion equation was studied in seminal works by Bachelier [38], Einstein [39], and Von Smoluchowski [40]. The diffusion equation is posed as an initial value problem,

$$\left\{ \begin{array}{lr} \frac{\partial u}{\partial t}(x,t) = k^2 \Delta u(x,t), & \quad x \in \mathbb {R}, t > 0, \\ u(x, t = 0) = u_0 (x), & x \in \mathbb {R}, \end{array} \right.$$
(1)

in which \(k^2 > 0\) is the diffusion coefficient and \(\Delta u = \partial ^2 u / \partial x^2\) denotes the Laplacian. Brownian motion \(B_t\) is a continuous-time stochastic process defined for \(t \ge 0\), which when discretized in time steps of size \(\Delta t\) has the property that \(B_{t = 0} = 0\) and

$$\begin{aligned} B_{t+\Delta t} = B_{t} + \Delta B; \quad \Delta B \sim \mathcal {N}(\mu = 0, \sigma = k\sqrt{2\Delta t}). \end{aligned}$$
(2)

The above notation indicates that the increment \(\Delta B\) at each time step of \(\Delta t\) is drawn from a normal distribution \(\mathcal {N}(0,k\sqrt{2\Delta t})\) with mean \(\mu = 0\) and standard deviation \(\sigma = k\sqrt{2\Delta t}\), so that the variance \(2k^2\Delta t\) accumulated per step matches the diffusion coefficient \(k^2\) in Eq. (1). This has probability density function

$$\begin{aligned} p_{\mathcal {N}}(x; \mu ,\sigma ) = \frac{1}{\sigma \sqrt{2 \pi }} e^{-\frac{1}{2} \left( \frac{x-\mu }{\sigma } \right) ^2} \end{aligned}$$
(3)

The rule (2) for sampling a path of \(B_t\) at times \(m \Delta t\), \(m = 0, 1, 2, ...\), is an example of a discrete stochastic differential equation (SDE), and is referred to as the Euler-Maruyama discretization (Footnote 2) of Brownian motion [41].
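The time-stepping rule can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `brownian_paths` and the parameter values are hypothetical, and the increment standard deviation is assumed to be \(k\sqrt{2\Delta t}\), the choice for which the second moment of the walk grows as \(2k^2 t\) in agreement with Eq. (4).

```python
import math
import random


def brownian_paths(k, dt, n_steps, n_paths, seed=0):
    """Sample discretized Brownian paths by summing i.i.d. normal increments.

    Assumption of this sketch: the increment standard deviation is
    k*sqrt(2*dt), so that the variance after time t equals 2*k**2*t.
    """
    rng = random.Random(seed)
    sigma = k * math.sqrt(2.0 * dt)
    paths = []
    for _ in range(n_paths):
        b = 0.0                          # B_0 = 0
        path = [b]
        for _ in range(n_steps):
            b += rng.gauss(0.0, sigma)   # B_{t+dt} = B_t + dB
            path.append(b)
        paths.append(path)
    return paths


k, dt, n_steps = 0.5, 0.01, 100
paths = brownian_paths(k, dt, n_steps, n_paths=4000)
t_final = n_steps * dt
# Empirical second moment of the final positions across all paths.
msd = sum(p[-1] ** 2 for p in paths) / len(paths)
print(f"empirical MSD = {msd:.4f}, 2 k^2 t = {2 * k**2 * t_final:.4f}")
```

Because sums of independent normal increments are again normal, the agreement between the two printed numbers is limited only by sampling error, not by the size of the time step.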

Fig. 1

Comparison of normal diffusion, superdiffusion, and subdiffusion via mean-square displacement (MSD) of the particle models and the scaling-in-time of the fundamental solution of the diffusion equations governing the density functions. Brownian motion exhibits both an MSD and scaling factor that are linear in time. Superdiffusive Lévy flight exhibits infinite MSD, and a fundamental solution scaling factor \(t^{\alpha }\) for \(\alpha > 1\), while the superdiffusive Lévy walk exhibits the same scaling of the fundamental solution as well as a finite MSD that scales as \(t^{\alpha }\). The subdiffusive Brownian motion with waiting times exhibits sublinear MSD and fundamental solution scaling, proportional to \(t^{\alpha }\) for \(\alpha < 1\)

The discrete process \(B_{ m \Delta t}\) should be thought of as tracing a path in \(\mathbb {R}\) of a particle undergoing “jumps” in a random direction at time intervals of size \(\Delta t\). At each time t, the position \(B_{t}\) of the particle is a random variable. It can be shown that the paths of the continuous-time process \(B_t\) are almost surely continuous in time [42]. From Eq. (2) and the central limit theorem, it follows that Brownian motion satisfies the scaling property

$$\begin{aligned} \langle B_t^2 \rangle = 2 k^2t, \end{aligned}$$
(4)

where the left-hand side denotes the variance or second moment of the random variable \(B_t\) (Fig. 1). Given an initial distribution of particles \(u_0(x)\) in \(\mathbb {R}\) which then undergo Brownian motion, the distribution \(u(x,t)\) of particles in \(\mathbb {R}\) is governed by Eq. (1). In other words, diffusing particles described at a microscopic scale by Brownian motion, i.e., by Eq. (2) in discrete time, have their distribution in space (a macroscopic property) governed by the heat equation [35, Sect. 1.1]. This is illustrated in Fig. 2.

Fig. 2

(Left) Eight independent sample paths of Brownian motion representing the path of a particle starting at the origin and stepping according to the rule (2). (Right) For \(t= 1,2,3\), the probability density of the location of the particle, i.e., the fundamental solution to the classical heat Eq. (1)

The consistency between this macroscopic description and the microscopic model is illustrated by scaling properties. A necessary property of such a Brownian motion model is the second-moment condition (4), which states that the root-mean-square distance of particles from their initial position after time t is \(k\sqrt{2t}\). This is reflected in the fact that the solution of Eq. (1) with initial condition \(u_0(x) = \delta _0(x)\) is

$$\begin{aligned} u(x,t) = \frac{1}{\sqrt{4\pi k^2 t}} e^{-{x^2}/{4k^2t}}, \end{aligned}$$
(5)

which is the normal density (3) with standard deviation \(k\sqrt{2t}\). Note that this solution has the property that

$$\begin{aligned} u(x,t_2) = \left( \frac{t_2}{t_1}\right) ^{-{1}/{2}} \, u\left( \frac{x}{\left( {t_2}/{t_1}\right) ^{{1}/{2}}},t_1\right) , \quad t_2> t_1 > 0. \end{aligned}$$

Thus, the plume of particles in this diffusion model spreads out by a factor of \((t_2/t_1)^{1/2}\) as time elapses from \(t_1\) to \(t_2\), consistent with Eq. (4).
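The self-similarity of the fundamental solution can be checked numerically. The sketch below is only an illustration under stated assumptions: the helper `heat_kernel` (a hypothetical name) evaluates the normal density with variance \(2k^2 t\) that solves Eq. (1) for a point source, and the test compares the profile at \(t_2\) with the profile at \(t_1\) stretched in x by \((t_2/t_1)^{1/2}\) and reduced in height by the same factor.

```python
import math


def heat_kernel(x, t, k):
    """Fundamental solution of u_t = k^2 u_xx: normal density with variance 2 k^2 t."""
    return math.exp(-x * x / (4.0 * k * k * t)) / math.sqrt(4.0 * math.pi * k * k * t)


k, t1, t2 = 0.7, 1.0, 3.0
s = (t2 / t1) ** 0.5                      # spreading factor between t1 and t2
grid = [i * 0.25 for i in range(-20, 21)]
# Compare u(x, t2) with the rescaled earlier profile u(x / s, t1) / s.
max_err = max(abs(heat_kernel(x, t2, k) - heat_kernel(x / s, t1, k) / s) for x in grid)
print(f"largest mismatch on the grid: {max_err:.2e}")
```

The mismatch is at the level of floating-point round-off, since the identity is exact for the Gaussian kernel.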

The model for normal diffusion reviewed here is also referred to as Fickian diffusion. The heat Eq. (1) can be derived from the mass conservation with flux term J,

$$\begin{aligned} \frac{\partial u}{\partial t} + \frac{\partial J}{\partial x} =0 \end{aligned}$$

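under Fick’s law \(J = -k^2 \, \partial u / \partial x\), which states that the flux is proportional to the negative gradient of the density. As discussed by Schumer et al. [43], the fractional diffusion equations we introduce below follow from mass conservation with non-Fickian fluxes.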

2.1.2 \(\alpha\)-Stable Lévy Flights and the Fractional Laplacian

Many important systems exhibit diffusive behavior, but do not satisfy the scaling property (4) [2]. This type of diffusion is referred to as anomalous diffusion, as it cannot be described by Eq. (2) with normally distributed increments. We desire a microscopic model that generalizes Brownian motion \(B_t\), and a corresponding macroscopic model that generalizes the diffusion equation (1). The first model we propose remains in the framework of a discrete SDE with independent identically distributed (i.i.d.) increments,

$$\begin{aligned} X_{t+\Delta t} = X_{t} + \Delta X, \quad X_0 = 0, \end{aligned}$$
(6)

but the increments \(\Delta X\) are no longer drawn from a normal distribution. It follows from the central limit theorem that the only way to obtain a microscopic model in this framework that is statistically distinct from \(B_t\), i.e., not equivalent in distribution, is to draw step sizes from a probability density function with infinite variance [35, 44].

We introduce the isotropic \(\alpha\)-stable random variable \(S_\alpha (\gamma ,\sigma ,\mu )\). This family of random variables is defined (Footnote 3) most simply by their characteristic function. For a general random variable X, the characteristic function \(\varphi _X\) is related to the probability density function \(p_X\) by

$$\begin{aligned} \varphi _X(\xi ) = \int e^{i\xi x} p_X(x) dx. \end{aligned}$$

Thus, the characteristic function of the normal random variable is \(e^{i\xi \mu - \sigma ^2 \xi ^2/2}\). The \(\alpha\)-stable random variable has characteristic function

$$\begin{aligned} \varphi _\alpha (\xi ;\gamma ,\sigma ,\mu )=e^{i\xi \mu -|\sigma \xi |^\alpha (1 - i\gamma \text {sgn}(\xi ) \Phi ) }, \end{aligned}$$
(7)

where

$$\begin{aligned} \Phi = \left\{ \begin{array}{l} \tan \left( \frac{\pi \alpha }{2}\right)\,\text{ if}\;\alpha \ne 1, \\ -\frac{2}{\pi } \log (|\sigma \xi |)\,\text{ if}\;\alpha = 1. \end{array} \right. \end{aligned}$$

The parameter \(\alpha \in (0,2]\) is referred to as the stability parameter of the distribution, \(\mu \in \mathbb {R}\) as the center, \(\gamma \in [-1,1]\) as the skewness, and \(\sigma \in (0,\infty )\) as the scale. The isotropic or symmetric \(\alpha\)-stable distribution \(S_\alpha (\gamma = 0, \sigma , \mu = 0)\) therefore has characteristic function

$$\begin{aligned} \varphi _\alpha (\xi ;\gamma = 0,\sigma ,\mu = 0)=e^{-|\sigma \xi |^\alpha }, \end{aligned}$$

generalizing the characteristic function of the normal distribution with mean \(\mu = 0\) and standard deviation \(\sqrt{2}\sigma\) and reducing to it when \(\alpha = 2\). By the Fourier inversion theorem, the probability density function of \(S_\alpha (\gamma ,\sigma ,\mu )\) can be written

$$\begin{aligned} f_\alpha (x;\gamma ,\sigma ,\mu ) = {\frac{1}{2\pi }} \int e^{{-i \xi x}} \varphi _\alpha (\xi ; \gamma ,\sigma ,\mu ) d\xi . \end{aligned}$$
(8)

In general, the \(\alpha\)-stable density does not admit a closed-form expression (Footnote 4), but in the symmetric case where \(\gamma = \mu = 0\), it has the property that

$$\begin{aligned} f_\alpha (x;\gamma = 0,\sigma ,\mu = 0) \sim \frac{1}{|\sigma x|^{1+\alpha }} \text { for large } x, \quad 0< \alpha < 2, \end{aligned}$$
(9)

as discussed in, e.g., Nolan [45] or Cont and Tankov [46]. In other words, the density exhibits Paretian or power-law tails (Fig. 3). This is in contrast to the rapidly decaying square-exponential tails of the normal distribution. In many settings, such tails are informally referred to as being examples of heavy or fat tails [47, 48].
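The power-law tail can be observed directly from samples. The sketch below is a hedged illustration: it draws symmetric \(\alpha\)-stable variates with the Chambers-Mallows-Stuck transformation, which is a sampler chosen for this example and not one prescribed in the text, and the function name and parameter values are hypothetical. Since the density decays like \(|x|^{-(1+\alpha )}\), the exceedance probability \(P(|X| > x)\) decays like \(x^{-\alpha }\), so doubling x should reduce it by roughly \(2^{-\alpha }\) once x is large.

```python
import math
import random


def sample_symmetric_stable(alpha, sigma, rng):
    """Draw one sample with characteristic function exp(-|sigma*xi|**alpha).

    Uses the Chambers-Mallows-Stuck transformation for the symmetric case
    (an assumption of this sketch), valid for 0 < alpha <= 2.
    """
    u = rng.uniform(-math.pi / 2.0, math.pi / 2.0)
    w = rng.expovariate(1.0)
    x = (math.sin(alpha * u) / math.cos(u) ** (1.0 / alpha)
         * (math.cos((1.0 - alpha) * u) / w) ** ((1.0 - alpha) / alpha))
    return sigma * x


alpha, sigma, n = 1.5, 1.0, 200_000
rng = random.Random(1)
samples = [abs(sample_symmetric_stable(alpha, sigma, rng)) for _ in range(n)]

x = 10.0
p_x = sum(s > x for s in samples) / n          # estimate of P(|X| > x)
p_2x = sum(s > 2.0 * x for s in samples) / n   # estimate of P(|X| > 2x)
ratio = p_2x / p_x
print(f"P(|X|>2x)/P(|X|>x) = {ratio:.3f}; power-law value 2**-alpha = {2.0 ** -alpha:.3f}")
```

At finite x the measured ratio differs slightly from \(2^{-\alpha }\) because Eq. (9) is only the leading term of the tail; the agreement improves as x grows, at the price of fewer samples in the tail.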

Fig. 3

(Left) Eight independent sample paths of symmetric \(\alpha\)-stable Lévy flight with \(\alpha = \sqrt{3}\) representing the path of a particle starting at the origin and stepping according to the rule (10). (Right) For \(t= 1,2,3\), the probability density of the location of the particle given by Eq. (8), i.e., the fundamental solution to the fractional diffusion equation (11). Compared to Fig. 2, note that despite some qualitative similarity between the shapes of the density functions, the presence of long jumps signifies a striking difference between the paths of a particle undergoing Lévy flight versus Brownian motion

Using the isotropic distribution introduced above, we introduce the isotropic \(\alpha\)-stable Lévy flight \(X^\alpha _t\) by providing the corresponding discrete stochastic process. This is given for \(t = m\Delta t\) with integer m by \(X^\alpha _0 = 0\) and the rule [35]

$$\begin{aligned} X_{t+\Delta t}^{\alpha } = X^{\alpha }_{t} + \Delta X^\alpha ; \quad \Delta X^\alpha \sim S_\alpha (\gamma = 0,\sigma = k(\Delta t)^{1/\alpha }, \mu = 0). \end{aligned}$$
(10)

The continuous-time stochastic process \(X^\alpha _t\) for \(t \ge 0\) can be thought of as a scaling limit as \(\Delta t \rightarrow 0\) of the above random walk, and enjoys several theoretical properties such as stability and an extended central limit theorem [35, 49]. However, it has the property that for \(\alpha < 2\), the paths of \(X^\alpha _t\) are almost surely discontinuous, in contrast to Brownian motion, hence the name Lévy “flight”. Given an initial distribution \(u_0(x)\) of particles in \(\mathbb {R}\) which undergo \(\alpha\)-stable Lévy flight, the evolution of the distribution \(u(x,t)\) for \(t > 0\) is governed by the space-fractional diffusion equation [35, Sect. 1.2]

$$\begin{aligned} \begin{array}{l} \frac{\partial u}{\partial t}(x,t) = -k^\alpha (-\Delta )^{\alpha /2} u(x,t) \\ u(x, t = 0) = u_0 (x), \end{array} \end{aligned}$$
(11)

as illustrated in Fig. 4. The fractional negative Laplacian \((-\Delta )^{\alpha /2}\) is defined for \(0< \alpha < 2\) and for any dimension d as

$$\begin{aligned} (-\Delta )^{\alpha /2} u(x) = C_{d,\alpha } \text { p.v.}\int _{\mathbb {R}^d} \frac{u(x) - u(y)}{|x-y|^{d+\alpha }} dy, \quad x \in \mathbb {R}^d, \end{aligned}$$
(12)

with

$$\begin{aligned} C_{d,\alpha }= \frac{4^{{\alpha /2}} \Gamma \left( {\alpha /2}+\frac{d}{2}\right) }{\pi ^{d/2}|\Gamma (-{\alpha /2})|}; \end{aligned}$$
(13)

see Lischke et al. [50]. Above, “p.v.” denotes the principal value of a singular integral. We have defined this operator in any dimension for future reference, although our present discussion only requires the case \(d = 1\). Perhaps the simplest characterization of the fractional Laplacian is the Fourier representation,

$$\begin{aligned} \mathcal {F} \left[ (-\Delta )^{\alpha /2} u \right] (\xi ) = |\xi |^{\alpha } \mathcal {F}[u](\xi ), \end{aligned}$$

where the Fourier transform is

$$\begin{aligned} \mathcal {F}[u](\xi ) = \int e^{-i \xi x} u(x) dx. \end{aligned}$$
(14)
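The Fourier representation suggests a simple way to apply the operator to periodic data. The sketch below is an illustration under explicit assumptions: the function is sampled on a uniform grid over one period, a naive \(O(N^2)\) discrete Fourier transform stands in for the continuous transform so that the example needs no external libraries, and the function name is hypothetical. It uses the fact, which follows from the multiplier \(|\xi |^\alpha\), that \((-\Delta )^{\alpha /2} \cos (\kappa x) = |\kappa |^\alpha \cos (\kappa x)\).

```python
import cmath
import math


def fractional_laplacian_periodic(values, length, alpha):
    """Apply the Fourier multiplier |xi|**alpha to samples of a periodic function.

    Assumptions of this sketch: `values` are samples on a uniform grid over
    one period of size `length`; a naive discrete Fourier transform is used.
    """
    n = len(values)
    # Forward transform with the e^{-i xi x} sign convention.
    coeffs = [
        sum(values[j] * cmath.exp(-2j * math.pi * m * j / n) for j in range(n)) / n
        for m in range(n)
    ]
    result = []
    for j in range(n):
        total = 0.0
        for m in range(n):
            freq = m if m <= n // 2 else m - n      # signed integer frequency
            xi = 2.0 * math.pi * freq / length      # angular wavenumber
            total += abs(xi) ** alpha * coeffs[m] * cmath.exp(2j * math.pi * m * j / n)
        result.append(total.real)
    return result


n, length, alpha, kappa = 32, 2.0 * math.pi, 1.2, 3
xs = [length * j / n for j in range(n)]
u = [math.cos(kappa * x) for x in xs]
lap_u = fractional_laplacian_periodic(u, length, alpha)
# Compare with the eigenfunction relation |kappa|**alpha * cos(kappa x).
max_err = max(abs(a - kappa ** alpha * b) for a, b in zip(lap_u, u))
print(f"largest deviation from |kappa|^alpha cos(kappa x): {max_err:.2e}")
```

In practice a fast Fourier transform would replace the double loop; the multiplier step is unchanged.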

The simplest case of Eq. (11) is the initial condition \(u_0(x) = \delta _0(x)\), in which case the solution is

$$\begin{aligned} u(x,t) = f_\alpha (x;\gamma = 0,\sigma = k t^{1/\alpha },\mu = 0). \end{aligned}$$
(15)

This is known as the fundamental solution. Although this solution cannot be written in closed form, it satisfies

$$\begin{aligned} u(x,t_2) = \left( \frac{t_2}{t_1}\right) ^{-{1}/{\alpha }} \, u\left( \frac{x}{\left( {t_2}/{t_1}\right) ^{{1}/{\alpha }}}, t_1\right) , \quad t_2> t_1 > 0, \end{aligned}$$

as shown in [35, Sect. 1.2]. This illustrates that a plume of particles undergoing isotropic \(\alpha\)-stable Lévy flight spreads by a factor of \((t_2/t_1)^{1/\alpha }\) as time elapses from \(t_1\) to \(t_2\), a faster rate when \(\alpha < 2\) than the normal rate \(t^{1/2}\). Thus, \(\alpha\)-stable Lévy flight is an example of superdiffusion. The dependence of the above solution as well as sample paths on \(\alpha\) is shown in Fig. 5.
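The spreading rate can be measured from simulated particles. The following sketch makes several assumptions that are not part of the text: increments are drawn with the Chambers-Mallows-Stuck transformation, the width of the plume is measured by the interquartile range (chosen because it stays finite for heavy-tailed samples), and all names and parameter values are hypothetical.

```python
import math
import random


def stable_increment(alpha, sigma, rng):
    """Symmetric alpha-stable draw with characteristic function exp(-|sigma*xi|**alpha).

    Sampler: Chambers-Mallows-Stuck transformation (an assumption of this sketch).
    """
    u = rng.uniform(-math.pi / 2.0, math.pi / 2.0)
    w = rng.expovariate(1.0)
    x = (math.sin(alpha * u) / math.cos(u) ** (1.0 / alpha)
         * (math.cos((1.0 - alpha) * u) / w) ** ((1.0 - alpha) / alpha))
    return sigma * x


def levy_flight_snapshots(alpha, k, dt, record_steps, n_paths, seed=0):
    """Run X_{t+dt} = X_t + dX and record the positions at the requested steps."""
    rng = random.Random(seed)
    sigma = k * dt ** (1.0 / alpha)          # step scale k * dt**(1/alpha)
    snapshots = {step: [] for step in record_steps}
    for _ in range(n_paths):
        x = 0.0
        for step in range(1, max(record_steps) + 1):
            x += stable_increment(alpha, sigma, rng)
            if step in snapshots:
                snapshots[step].append(x)
    return snapshots


def interquartile_range(values):
    """Width of the central 50% of the sample."""
    ordered = sorted(values)
    n = len(ordered)
    return ordered[(3 * n) // 4] - ordered[n // 4]


alpha, k, dt = 1.5, 1.0, 0.05
step1, step2 = 20, 80                        # t1 = 1.0 and t2 = 4.0
snaps = levy_flight_snapshots(alpha, k, dt, (step1, step2), n_paths=4000)
spread_ratio = interquartile_range(snaps[step2]) / interquartile_range(snaps[step1])
predicted = (step2 / step1) ** (1.0 / alpha)
print(f"measured spread ratio {spread_ratio:.2f}, (t2/t1)**(1/alpha) = {predicted:.2f}")
```

For comparison, Brownian motion over the same interval would widen by \((t_2/t_1)^{1/2} = 2\), so a measured ratio near \(4^{2/3} \approx 2.52\) reflects the superdiffusive rate.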

Fig. 4

The seemingly innocuous heavy tails of the \(\alpha\)-stable density, signifying non-vanishing probability of long jumps, are responsible for the striking properties of \(\alpha\)-stable Lévy flights. As \(\alpha\) decreases from 2, more mass in the middle region of the density is lost and is transferred towards the tails and the center, so that the relative probability of very small movements and very long movements increases (right). This is evident in the sample paths of the process (left)

Fig. 5

A plot of the \(\alpha\)-stable densities in a log-log scale that illustrates the tail behavior asserted in Eq. (9). While \(\alpha\)-stable densities do not have a closed-form expression for all x, their simple, asymptotic inverse power-law behavior is an important heuristic

However, since \(\alpha > 0\), the tail behavior of the isotropic \(\alpha\)-stable density implies that the second moment of \(X^\alpha _t\) diverges for \(\alpha < 2\),

$$\begin{aligned} \langle (X^\alpha _t)^2 \rangle = \infty ,\quad 0< \alpha < 2, \end{aligned}$$

with the first moment (the mean) diverging also when \(\alpha \le 1\) [44, 45]. This implies that the variance of \(\alpha\)-stable motion is not a useful statistic for parameterizing \(\alpha\)-stable Lévy flight; it bears no useful relationship to \(\alpha\). This aspect can be tackled in several ways, motivating the introduction of further fractional-order operators, such as tempered operators and fractional material derivatives.

We point out several important properties of the fractional Laplacian. From the definition (12), it is clear that \((-\Delta )^{\alpha /2} c = 0\), c being a constant. The fractional Laplacian also satisfies the semigroup property \((-\Delta )^{\alpha /2} (-\Delta )^{\beta /2} = (-\Delta )^{(\alpha +\beta )/2}\) [51]. However, one property that is apparent from the definition is that, unlike integer-order derivatives, the fractional Laplacian is a nonlocal operator, i.e., the value of \((-\Delta )^{\alpha /2} u (x)\) depends on the values of u in all of \(\mathbb {R}\) (or \(\mathbb {R}^d\), for \(d>1\)). In contrast, the value of any integer-order derivative of u at x depends only on the values of u in an infinitesimal neighborhood of x.

2.1.3 The Riemann-Liouville Fractional Derivatives and Asymmetric \(\alpha\)-Stable Lévy Flight

The fractional Laplacian (12) was introduced in the previous section as a symmetric or rotation invariant operator for describing the symmetric or isotropic \(\alpha\)-stable Lévy flight. This model introduced a stability parameter \(0 < \alpha \le 2\) allowing it to generalize normal diffusion, with the scale \(\sigma\) and center \(\mu\) playing roles similar to those of the standard deviation and mean of the normal distribution. However, the stable distribution also allows for a skewness parameter \(\gamma \in [-1,1]\), with \(\gamma = 0\) in the symmetric case, which has no analogue in the normal distribution or for Brownian motion. This is due to the central limit theorem, which states that the use of any finite-variance distribution for the i.i.d. increments \(\Delta X\) in Eq. (6), no matter how asymmetric, leads to \(X_t\) being normally distributed in the scaling limit, so that the density is necessarily symmetric about the mean. In this section, we introduce the one-sided Riemann-Liouville fractional derivatives as appropriate operators for modeling asymmetric \(\alpha\)-stable Lévy flights, which are defined by Eq. (10) with \(\Delta X^\alpha \sim S_\alpha (\gamma ,\sigma = k(\Delta t)^{1/\alpha }, \mu = 0)\) for nonzero \(\gamma\).

The left-sided and right-sided Riemann-Liouville derivatives in \(\mathbb {R}\) are defined, for \(n = \lceil \alpha \rceil\), as

$$\begin{aligned} ^{\text {RL}}_{\;\;a}{\mathbb {D}}^{\alpha }_x u(x) = \frac{1}{\Gamma (n-\alpha )}\left[ \frac{\partial ^n}{\partial z^n}\int _{a}^z \frac{u(y)}{|z-y|^{\alpha -n+1}} dy\right] _{z = x}, \end{aligned}$$
(16)
$$\begin{aligned} ^{\text {RL}}_{\;\;x}{\mathbb {D}}^{\alpha }_{b} u(x) =\frac{(-1)^n}{\Gamma (n-\alpha )} \left[ \frac{\partial ^n}{\partial z^n}\int _z^{b}\frac{u(y)}{|z-y|^{\alpha -n+1}} dy\right] _{z=x}. \end{aligned}$$
(17)

The texts of Oldham and Spanier [52], Podlubny [53], and Meerschaert and Sikorskii [35] discuss these operators in detail. These derivatives are frequently used in models with \(a = -\infty\) and \(b = \infty\). In connection with initial value problems, the left-sided Riemann-Liouville derivative in time, \(^{\text {RL}}_{\;\;0}{\mathbb {D}}^{\alpha }_t u(t)\), is sometimes used with \(a = 0\). We have written the definitions (16) and (17) to avoid ambiguities in notation, and clearly show that substitution of the variable x occurs after integration and differentiation. An alternative approach is to define Riemann-Liouville fractional integrals separately, as in the right-hand sides of Eqs. (16) and (17); see [51].
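As a concrete check of definition (16), the left-sided derivative with \(a = 0\) and \(0< \alpha < 1\) (so \(n = 1\)) can be evaluated numerically. The sketch below is an illustration with hypothetical function names and a deliberately simple quadrature: a substitution removes the endpoint singularity of the inner integral, a midpoint rule approximates it, and a central difference differentiates with respect to the upper limit before setting \(z = x\). The comparison value uses the standard power rule \(^{\text {RL}}_{\;\;0}{\mathbb {D}}^{\alpha }_x x^p = \frac{\Gamma (p+1)}{\Gamma (p+1-\alpha )} x^{p-\alpha }\) for \(p > -1\), which is not stated in the text above.

```python
import math


def inner_integral(u, z, alpha, a=0.0, n_nodes=2000):
    """Approximate int_a^z u(y) (z - y)**(-alpha) dy for 0 < alpha < 1.

    The substitution s = (z - y)**(1 - alpha) removes the singularity at
    y = z, after which a midpoint rule is applied.
    """
    beta = 1.0 - alpha
    upper = (z - a) ** beta
    h = upper / n_nodes
    total = sum(u(z - ((i + 0.5) * h) ** (1.0 / beta)) for i in range(n_nodes))
    return total * h / beta


def rl_derivative(u, x, alpha, a=0.0, step=1e-4):
    """Left-sided Riemann-Liouville derivative of order 0 < alpha < 1 (n = 1)."""
    # Differentiate the integral with respect to its upper limit z, then set z = x.
    d_dz = (inner_integral(u, x + step, alpha, a)
            - inner_integral(u, x - step, alpha, a)) / (2.0 * step)
    return d_dz / math.gamma(1.0 - alpha)


alpha, x, p = 0.5, 1.5, 2
numeric = rl_derivative(lambda y: y ** p, x, alpha)
exact = math.gamma(p + 1) / math.gamma(p + 1 - alpha) * x ** (p - alpha)
print(f"numerical: {numeric:.6f}, power rule: {exact:.6f}")
```

Applying the same routine to the constant function \(u \equiv 1\) returns approximately \(x^{-\alpha }/\Gamma (1-\alpha )\) rather than zero, which is the power rule with \(p = 0\).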

One quirk of the notation for Riemann-Liouville derivatives in Eqs. (16) and (17) is the writing of the endpoints of the intervals of integration, \([a,x]\) and \([x,b]\), respectively, as subscripts. While this is suggestive, the result is that the variable of evaluation x occurs twice in the notation for each operator. If these derivatives are evaluated at any numerical value of x, this value should be substituted in both locations; thus, \(^{\text {RL}}_{\;\;{a}}{\mathbb {D}}^{\alpha }_5 u(5)\) represents a valid evaluation of the derivatives, but \(^{\text {RL}}_{\;\;{a}}{\mathbb {D}}^{\alpha }_x u(5)\) and \(^{\text {RL}}_{\;\;{a}}{\mathbb {D}}^{\alpha }_5 u(x)\) do not.

With \(a = -\infty\) and \(b = \infty\), and with the Fourier transform convention of Eq. (14), the Riemann-Liouville derivatives can be represented in frequency space by

$$\begin{aligned} \begin{array}{c} \mathcal {F}\left[ ^{\;\text {RL}}_{{-\infty }}{\mathbb {D}}^{\alpha }_x u \right] (\xi ) = (i\xi )^\alpha \mathcal {F}[u](\xi ), \\ \mathcal {F}\left[ ^{\text {RL}}_{\;\;x}{\mathbb {D}}^{\alpha }_\infty u \right] (\xi ) = (-i\xi )^\alpha \mathcal {F}[u](\xi ). \end{array} \end{aligned}$$
(18)

In one dimension, these can be used in the asymmetric diffusion model

$$\begin{aligned} \begin{aligned} \frac{\partial u}{\partial t}(x,t)&= \frac{-k^\alpha }{\cos (\pi \alpha /2)} \left[ p \, \left( ^{\;\text {RL}}_{{-\infty }}{\mathbb {D}}^{\alpha }_x u(x,t) \right) + (1-p) \, \left( ^{\text {RL}}_{\;\;x} \mathbb {D}^{\alpha }_{\infty } u(x,t) \right) \right] \\ u(x, t = 0)&= u_0 (x), \end{aligned} \end{aligned}$$
(19)

which describes anomalous diffusion of independent particles. Here, the positions of each particle at times \(m \Delta t\) for integer m are governed by Eq. (6) with increments \(\Delta X\) being drawn from the asymmetric \(\alpha\)-stable distribution

$$\begin{aligned} {\Delta X} \sim S_\alpha (\gamma = 2p-1,\sigma = k (\Delta t)^{1/\alpha },\mu = 0). \end{aligned}$$
(20)

The resulting random variable given by the sum of m increments is denoted \(X^{\alpha ,p}_{t}\), for \(t = m \Delta t\). Thus, the skewness ranges from \(\gamma = -1\) when \(p = 0\) to \(\gamma = 1\) when \(p=1\). The fundamental solution of Eq. (19) is

$$\begin{aligned} u(x,t) = f_\alpha (x;\gamma = 2p-1,\sigma = k t^{1/\alpha },\mu = 0); \end{aligned}$$
(21)

cf. Eq. (15).

Sample paths of the process \(X^{\alpha ,p}_t\) are illustrated in Fig. 6. Note that when \(p = 1/2\), the distribution reverts to the symmetric \(\alpha\)-stable distribution, and it can be shown in this case that Eq. (19) reduces to Eq. (11); more specifically,

$$\begin{aligned} \frac{1}{\cos (\pi \alpha /2)} \left[ \frac{1}{2} \, \left( ^{\;\text {RL}}_{{-\infty }}{\mathbb {D}}^{\alpha }_x u(x) \right) + \frac{1}{2} \, \left( ^{\text {RL}}_{\;\;x} \mathbb {D}^{\alpha }_\infty u(x) \right) \right] = (-\Delta )^{\alpha /2} u(x). \end{aligned}$$
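In frequency space this identity reduces to a statement about the symbols of the three operators, which is easy to verify numerically. The sketch below is an illustration with a hypothetical function name; it averages the two one-sided symbols \((i\xi )^\alpha\) and \((-i\xi )^\alpha\), taken on the principal branch, divides by \(\cos (\pi \alpha /2)\), and compares the result with \(|\xi |^\alpha\). Which one-sided symbol belongs to which derivative depends on the sign convention of the Fourier transform, but the equally weighted average does not.

```python
import math


def riesz_symbol_from_one_sided(xi, alpha):
    """Average the symbols (i xi)^alpha and (-i xi)^alpha, divided by cos(pi alpha / 2).

    The value alpha = 1 is excluded because cos(pi/2) = 0.
    """
    left = complex(0.0, xi) ** alpha      # principal branch of (i xi)^alpha
    right = complex(0.0, -xi) ** alpha    # principal branch of (-i xi)^alpha
    return (0.5 * left + 0.5 * right) / math.cos(math.pi * alpha / 2.0)


for alpha in (0.5, 1.5, 1.8):
    for xi in (-3.0, -0.4, 0.7, 2.0):
        symbol = riesz_symbol_from_one_sided(xi, alpha)
        print(f"alpha={alpha}, xi={xi}: {symbol.real:.6f} vs |xi|^alpha = {abs(xi) ** alpha:.6f}")
```

The imaginary parts of the two one-sided symbols cancel exactly when the weights are equal; for \(p \ne 1/2\) they do not, which is the source of the skewness.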

The Fourier representation (Eq. 18) suggests that the left-sided Riemann-Liouville derivative \({^{\;\text {RL}}_{{-\infty }}{\mathbb {D}}^{\alpha }_x u}\) should be thought of as a fractional power of the operator \(\partial / \partial x\). However, the correspondence between Eqs. (19) and (20) makes it clear that a complete description of \(\alpha\)-stable Lévy flights in one dimension requires two operators, a left-sided and a right-sided operator, which agree with one another when \(\alpha = 2\). Our interest in these models lies in the fact that an extended central limit theorem holds for processes with i.i.d. increments drawn from distributions with infinite variance, but for which the tails of the density function satisfy Pareto-type conditions as in Eq. (9). For such processes, \(\alpha\)-stable distributions play an analogous role to the normal distribution in the classical central limit theorem; unlike the classical theorem, for full generality, skewed \(\alpha\)-stable distributions must be included in such a result. See Meerschaert and Scheffler [49] or Meerschaert and Sikorskii [35] for a treatment of these results.

Fig. 6

\(\alpha\)-stable Lévy flights allow for asymmetric diffusion, which has no analogue within the classical diffusion framework. The \(\alpha\)-stable density (Eq. 20) admits a skewness parameter \(\gamma\), ranging from \(-1\) to 1, which can adjust the relative probability of long jumps in a given direction (right), a statistical property that is evident in the sample paths (left). Such models are governed by the fractional-order diffusion equation involving Riemann-Liouville derivatives, as in Eq. (19)

We mention how the Riemann-Liouville derivative can be utilized in dimensions \(d > 1\). An anisotropic diffusion operator was introduced by Meerschaert et al. [54] and Benson et al. [17] as

$$\begin{aligned} -(-\Delta )_M^{\alpha /2}u(x)= C_{\alpha ,d} \int _{|\theta | = 1} D^{\alpha }_\theta u(x) M(d\theta ), \quad C_{\alpha ,d} = \frac{\Gamma (\frac{1-\alpha }{2}) \Gamma (\frac{d+\alpha }{2})}{2 \pi ^{\frac{1+d}{2}}}. \end{aligned}$$
(22)

Here, \(M(d\theta )\) denotes a nonnegative measure on the angle \(\theta\) in the unit sphere \(\{ | \theta | = 1\}\) in \(\mathbb {R}^d\), and the Riemann-Liouville directional derivative is given by

$$\begin{aligned} D^{\alpha }_\theta u(x) = {}^{\;\text{RL}}_{-\infty}{\mathbb{D}}^{\alpha }_t v (t) \big |_{t = 0}, \quad \text {where } v(t) = u(x + t\theta ). \end{aligned}$$

Benson et al. [17] showed that when the measure M is uniform, the operator (Eq. 22) reduces to the fractional Laplacian (Eq. 12). In higher dimensions and for general measures M, the operator (Eq. 22) plays a role analogous to that of the operator on the right-hand side of Eq. (19), which is in fact a special case of it for \(d = 1\). As such, it is used in models of anisotropic multivariate \(\alpha\)-stable Lévy diffusion.

2.1.4 Subdiffusion and the Caputo Fractional Derivative

The superdiffusive model introduced above, in which a plume of particles spreads out in space with rate \(t^{1/\alpha }\) for \(0< \alpha < 2\), raises the question of whether a process can be constructed which results in diffusion slower than the Brownian rate \(t^{1/2}\). In this section, we introduce such a model, constructed as Brownian motion with random waiting times drawn from a skewed stable distribution, supported over positive real numbers with a power-law tail. Here, we step away from the framework of the SDE given by Eq. (6). Rather than being defined by a simple time-stepping scheme with i.i.d. increments, the paths of the process are defined by a transformation, or “postprocessing”, of Brownian paths \(B_t\).

We introduce Brownian motion with waiting times, denoted by \(B_{\tau (t)}\). The intuition is that the particle paths traced out in space by a discretization of \(B_{\tau (t)}\) are paths of discretized Brownian motion \(B_t\), but the particles wait at each point of the path for a random time drawn from the totally skewed stable distribution. The operational time \(\tau (t)\), which introduces waiting and replaces linear time t, is an inverse stable subordinator. This is a stochastic process in the variable t, although we write \(\tau (t)\) rather than using a subscript for typographical reasons. This process is constructed by first defining the stable subordinator D(t), and then defining \(\tau (t)\) to be the inverse process of \(D(\tau )\) (see Footnote 5). Both D(t) and \(\tau (t)\) are nondecreasing processes with units of time. In terms of paths, \(\tau (t)\) arises from D(t) as

$$\begin{aligned} \tau (t) = \inf \{ \tau \text { such that } D(\tau ) > t \}. \end{aligned}$$
(23)

Intuitively, D(t) represents a cumulative waiting time process, keeping track of the total time waited by a particle throughout a path, while the inverse \(\tau (t)\) represents an operational time, i.e., the time spent traveling. The increments of D(t) represent the time waited at each location of a particle before the jump to the next location. More specifically, D(t) is a totally skewed \(\beta\)-stable Lévy process (Eq. 20) with stability index \(\beta \in (0,1)\), \(\gamma = 1\), scale \(\sigma = \cos (\pi \beta /2)\), and center \(\mu = 0\); see Meerschaert and Sikorskii [35], Example 5.14. The construction of sample paths of \(B_{\tau (t)}\) is demonstrated in Fig. 7.
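The path construction in Eq. (23) can be sketched numerically. The following is a minimal illustration rather than the procedure used to produce Fig. 7: it assumes that subordinator increments over a step \(\Delta \tau\) are sampled with Kanter's representation of a positive \(\beta\)-stable variable (unit scale, so that the Laplace transform of the increment density is \(e^{-s^\beta }\) for \(\Delta \tau = 1\)), and the names `positive_stable` and `inverse_subordinator`, the grid sizes, and the seed are hypothetical choices.

```python
import numpy as np

def positive_stable(beta, size, rng):
    """Kanter's representation of a positive beta-stable variable S (0 < beta < 1),
    normalized so that E[exp(-s*S)] = exp(-s**beta)."""
    u = rng.uniform(0.0, np.pi, size)   # uniform angle on (0, pi)
    e = rng.exponential(1.0, size)      # unit exponential
    return (np.sin(beta * u) / np.sin(u) ** (1.0 / beta)
            * (np.sin((1.0 - beta) * u) / e) ** ((1.0 - beta) / beta))

def inverse_subordinator(t_grid, d_tau, beta, rng):
    """Discretize D(tau) with spacing d_tau and evaluate tau(t) = inf{tau : D(tau) > t}
    on the (ascending) grid t_grid. Returns tau(t), the grid indices, and the path D."""
    D = np.zeros(1)                     # D(0) = 0
    while D[-1] <= t_grid[-1]:          # extend the path until it passes the last time
        inc = d_tau ** (1.0 / beta) * positive_stable(beta, 1024, rng)
        D = np.concatenate([D, D[-1] + np.cumsum(inc)])
    idx = np.searchsorted(D, t_grid, side="right")   # first index with D[idx] > t
    return d_tau * idx, idx, D

rng = np.random.default_rng(0)
beta, d_tau = 0.7, 1e-3
t_grid = np.linspace(0.0, 1.0, 201)
tau, idx, D = inverse_subordinator(t_grid, d_tau, beta, rng)

# Brownian path in operational time (variance 2*tau, i.e. k = 1), then time-changed.
steps = np.sqrt(2.0 * d_tau) * rng.standard_normal(D.size - 1)
B = np.concatenate([[0.0], np.cumsum(steps)])
X = B[idx]                              # samples of B_{tau(t)} on t_grid
```

Wherever consecutive entries of `tau` coincide, the particle is waiting: physical time advances while the operational time, and hence the position `X`, does not.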

Fig. 7

(Top left) Example of a \(\beta\)-stable subordinator density function, representing the density for random waiting times for the processes corresponding to the time-fractional diffusion equation (27). (Top right) Sample path of the subordinator (cumulative waiting time) D(t), the parent path, and the inverse subordinator (operational time) \(\tau (t)\) given by Eq. (23). Note that as t increases, \(\tau (t)\) need not advance. (Bottom left) Three sample paths of Brownian motion. (Bottom right) Three sample paths of Brownian motion with waiting times, constructed from the Brownian paths in the bottom left panel. The particles trace out the same Brownian paths in space, but now wait for potentially several time steps at each location, as specified by the operational time \(\tau (t)\)

The resulting probability density function of D(t),

$$\begin{aligned} \psi _\beta (t) = f_\beta (t;\gamma =1,\sigma = \cos (\pi \beta /2),\mu =0) \end{aligned}$$
(24)

for waiting times is supported on the nonnegative real numbers. Due to the nonnegative support of the waiting time density, the characteristic function (Eq. 7) yields the Laplace transform of the waiting time density as

$$\begin{aligned} \mathcal {L}\left[ \psi _\beta \right] (s) = e^{-s^\beta }, \end{aligned}$$
(25)

where the Laplace transform is defined as

$$\begin{aligned} \mathcal {L}\left[ u\right] (s) = \int _0^\infty e^{-st} u(t) dt . \end{aligned}$$
(26)

See Meerschaert and Sikorskii [35], p. 108 and p. 156 for a discussion. The variance of the process \(B_{\tau (t)}\) is given by

$$\begin{aligned} \left\langle B_{\tau (t)}^2 \right\rangle = \frac{2}{\Gamma (\beta +1)} t^{\beta }, \quad 0< \beta < 1, \end{aligned}$$

which is the desired subdiffusive property. Note that the finiteness of the variance does not imply that the normal central limit theorem applies to \(B_{\tau (t)}\), which is not equal in distribution to Brownian motion nor to any Lévy process. In fact, \(B_{\tau (t)}\) is not a Markov process.
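The subdiffusive growth of the variance can be checked by Monte Carlo sampling. The sketch below relies on two assumptions that are not spelled out in the text: the self-similarity of the subordinator, which implies that \(\tau (t)\) has the same law as \((t/S)^\beta\) for a unit-scale positive \(\beta\)-stable variable S, and the normalization \(k = 1\), for which Brownian motion has variance 2s at operational time s. The sampler, parameter values, and variable names are illustrative.

```python
import math
import numpy as np

def positive_stable(beta, size, rng):
    """Kanter's representation; E[exp(-s*S)] = exp(-s**beta) for 0 < beta < 1."""
    u = rng.uniform(0.0, np.pi, size)
    e = rng.exponential(1.0, size)
    return (np.sin(beta * u) / np.sin(u) ** (1.0 / beta)
            * (np.sin((1.0 - beta) * u) / e) ** ((1.0 - beta) / beta))

rng = np.random.default_rng(1)
beta, t, n = 0.6, 2.0, 200_000

S = positive_stable(beta, n, rng)
tau_t = (t / S) ** beta                              # samples of the operational time tau(t)
X = np.sqrt(2.0 * tau_t) * rng.standard_normal(n)    # B_{tau(t)} with Var[B_s] = 2 s

msd_mc = np.mean(X ** 2)                             # Monte Carlo estimate
msd_exact = 2.0 * t ** beta / math.gamma(beta + 1.0) # formula quoted in the text
print(msd_mc, msd_exact)
```

The two numbers should agree up to sampling error; repeating the experiment for several values of t gives the \(t^\beta\) growth.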

The probability density of Brownian motion with waiting times \(B_{\tau (t)}\) is governed by the time-fractional diffusion equation,

$$\begin{aligned} \begin{array}{l} ^{\text {C}}_{{\,0}}{\mathbb {D}}^{\beta }_t u(x,t)= k^2 \Delta u(x,t) \\ u(x, t = 0) = u_0 (x) . \end{array} \end{aligned}$$
(27)

Here, the Caputo derivative is defined for \(0< \beta < 1\) by

$$\begin{aligned} ^{\text {C}}_{\,a}{\mathbb {D}}^{\beta }_t u(t) = \frac{1}{\Gamma (1-\beta )} \int _{a}^{t} \frac{d u}{dt}(s) \frac{1}{|s-t|^{\beta }} ds . \end{aligned}$$
(28)

For \(a = 0\), this operator is characterized by the simple Laplace transform representation (see Meerschaert and Sikorskii [35], page 111)

$$\begin{aligned} \mathcal {L}\left[ ^{\text {C}}_{{\,0}}{\mathbb {D}}^{\beta }_t u \right] (s) = s^\beta \mathcal {L}[u](s) - s^{\beta - 1} u(0) . \end{aligned}$$
(29)

Higher-order Caputo derivatives can be defined, although the Laplace transforms of the resulting operators involve initial conditions for derivatives of u; see Sect. 2.3 of Meerschaert and Sikorskii [35]. The Caputo derivative is most frequently utilized as a derivative in time for initial value problems, with fractional order \(0< \beta < 1\).
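As a quick consistency check between the definition (Eq. 28) and the Laplace representation (Eq. 29), consider the illustrative choice \(u(t) = t\) with \(a = 0\) (our example, not taken from [35]; the integration variable is written r to avoid a clash with the Laplace variable s):

$$\begin{aligned} ^{\text {C}}_{\,0}{\mathbb {D}}^{\beta }_t \, t&= \frac{1}{\Gamma (1-\beta )} \int _0^t (t-r)^{-\beta } dr = \frac{t^{1-\beta }}{(1-\beta )\Gamma (1-\beta )} = \frac{t^{1-\beta }}{\Gamma (2-\beta )}, \\ \mathcal {L}\left[ ^{\text {C}}_{\,0}{\mathbb {D}}^{\beta }_t \, t \right] (s)&= s^\beta \cdot \frac{1}{s^2} - s^{\beta -1} \cdot 0 = s^{\beta -2}. \end{aligned}$$

Since \(\mathcal {L}[t^{1-\beta }](s) = \Gamma (2-\beta ) s^{\beta -2}\), the two routes agree.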

Before introducing the fundamental solution to the time-fractional diffusion equation, we introduce the Mittag-Leffler function [55, 56]

$$\begin{aligned} E_\theta (z) = \sum _{\ell =0}^{\infty } \frac{z^\ell }{\Gamma (\theta \ell + 1)}, \quad \theta > 0. \end{aligned}$$
(30)

The Mittag-Leffler function \(E_\theta (z)\) reduces to the exponential function \(e^z\) when \(\theta = 1\), and has the Laplace transform property

$$\begin{aligned} \mathcal {L}\left[ E_\theta (-k^2 t^\theta ) \right] (s) = \frac{s^{\theta -1}}{s^\theta + k^2}, \end{aligned}$$

which, together with Eq. (29), immediately implies that \(E_\beta (-k^2 t^\beta )\) solves the fractional ordinary differential equation

$$\begin{aligned} ^{\text {C}}_{\,0}{\mathbb {D}}^{\beta }_t u = -k^2 u, \quad u(0) = 1. \end{aligned}$$
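For moderate real arguments, the series (Eq. 30) can be summed directly. The sketch below is a naive truncated series, not a production algorithm (it loses accuracy for large \(|z|\)); the function name `mittag_leffler` and the number of terms are our choices. It is checked against two known closed forms, \(E_1(z) = e^z\) and \(E_{1/2}(-x) = e^{x^2} \, \text {erfc}(x)\).

```python
import math

def mittag_leffler(z, theta, n_terms=200):
    """Truncated series for E_theta(z) (Eq. 30); real z of moderate size only."""
    if z == 0.0:
        return 1.0
    log_abs_z = math.log(abs(z))
    total = 0.0
    for ell in range(n_terms):
        # log of |z|^ell / Gamma(theta*ell + 1), evaluated in log space to avoid overflow
        log_term = ell * log_abs_z - math.lgamma(theta * ell + 1.0)
        sign = -1.0 if (z < 0.0 and ell % 2 == 1) else 1.0
        total += sign * math.exp(log_term)
    return total

# Relaxation function E_beta(-k^2 t^beta) for illustrative values k = 1, beta = 0.5
beta, k = 0.5, 1.0
relaxation = [mittag_leffler(-k**2 * t**beta, beta) for t in (0.0, 0.5, 1.0, 2.0)]
print(relaxation)
```

For \(0< \beta < 1\) the relaxation starts at 1 and decays monotonically, but with an algebraic rather than exponential tail.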

Returning to the diffusion Eq. (27) with initial condition \(u(x,t=0) = \delta (x)\), applying the Fourier transform in space implies that

$$\begin{aligned} \mathcal {F}\left[ u(\cdot ,t) \right] (\xi ) = E_\beta (-k^2 \xi ^2 t^\beta ), \end{aligned}$$

which, as shown by Mainardi et al. [56], yields a solution that can be written

$$\begin{aligned} u(x,t) = t^{-\beta /2} U(|x|/t^{\beta /2}); \quad \end{aligned}$$

with

$$\begin{aligned} U(x) = \frac{1}{2} \sum _{k=0}^\infty \frac{(-x)^k}{k!\Gamma [-(\beta /2) k + 1 - (\beta /2)]} \end{aligned}$$

being a special case of the Fox-Wright function. Note that \(U(x) = u(x,t=1)\). While the fundamental solution above is transcendental, it has the following properties: for \(\beta = 1\), it reduces to the solution (Eq. 5) of the classical diffusion equation; for \(0< \beta < 1\), the solution decays faster than exponential and slower than Gaussian; and the second moment of the solution is

$$\begin{aligned} \sigma ^2(t) = 2 \frac{t^\beta }{\Gamma (\beta +1)} . \end{aligned}$$

Note that the \(t^\beta\) scaling of this second moment is consistent with the scaling of the fundamental solution above.
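The second moment can also be read off from the Fourier representation by a short calculation. This is a sketch that assumes the series (Eq. 30) may be differentiated termwise and keeps the coefficient \(k^2\) explicit:

$$\begin{aligned} E_\beta (-k^2 \xi ^2 t^\beta )&= 1 - \frac{k^2 \xi ^2 t^\beta }{\Gamma (\beta +1)} + \mathcal {O}(\xi ^4), \\ \sigma ^2(t)&= - \frac{\partial ^2}{\partial \xi ^2} \mathcal {F}\left[ u(\cdot ,t) \right] (\xi ) \Big |_{\xi = 0} = \frac{2 k^2 t^\beta }{\Gamma (\beta +1)} . \end{aligned}$$

For \(k = 1\) this is the subdiffusive growth \(2 t^\beta / \Gamma (\beta +1)\).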

2.1.5 Continuous-Time Random Walks and Space-Time-Fractional Diffusion

Both the \(\alpha\)-stable Lévy flight \(X^{\alpha ,p}_t\), which led to the space-fractional diffusion equation discussed in Sect. 2.1.3, and Brownian motion with \(\beta\)-stable subordinator operational time \(B_{T^{\beta }(t)}\), which led to the time-fractional diffusion equation discussed in Sect. 2.1.4, are examples of continuous-time random walks [35]. A continuous-time random walk (CTRW) allows for a general family of processes in space to be time-changed by a general family of waiting time processes. To illustrate this concept, we consider the process \(X^{\alpha ,p}_{T^\beta (t)}\), which is the \(\alpha\)-stable Lévy flight \(X^{\alpha ,p}_t\), defined at the discrete level by Eq. (20), time-changed by the \(\beta\)-stable subordinator process \(t \mapsto T^\beta (t)\) introduced in Sect. 2.1.4. This models a particle that performs independent jumps drawn from the \(\alpha\)-stable process, waiting at each point for a random time drawn independently from the \(\beta\)-stable subordinator process. As shown by, e.g., Meerschaert and Sikorskii [35] (Sect. 4.5), the probability density of this particle position is then governed by a differential equation that is fractional in both time and space,

$$\begin{aligned} \begin{array}{l} ^{\text {C}}_{\,0}{\mathbb {D}}^{\beta }_t u(x,t) = \frac{-k^\alpha }{\cos (\pi \alpha /2)} \left[ p \, \left( ^{\;\text {RL}}_{{-\infty }}{\mathbb {D}}^{\alpha }_x u(x,t) \right) + (1-p) \, \left( ^{\text {RL}}_{\;\;x}{\mathbb {D}}^{\alpha }_{\infty } u(x,t) \right) \right] \\ u(x,t = 0) = u_0 (x). \end{array} \end{aligned}$$
(31)

While intuitive, this result deserves a more detailed outline within the general theory of CTRWs. In the standard CTRW model, particles wait at a location for a time drawn from a density function \(\psi\), and jump to a new location by an increment drawn from a density function \(\phi\). The waiting time and jump samples are assumed to be i.i.d., and uncoupled from each other [44, 57,58,59]. Thus, the densities \(\psi\) and \(\phi\) completely determine the CTRW. From the waiting time density \(\psi\), the probability that a particle will remain at any given position for time t is

$$\begin{aligned} \Psi (t) = 1 - \int _0^t \psi (t') dt'; \end{aligned}$$
(32)

this is referred to as the survival probability of a CTRW particle. Then, given an initial probability density of a particle \(u_0(x) = u(x,t=0)\), which can also be thought of as an initial distribution of an ensemble of independent particles, the following equation was derived by Montroll and Weiss [60] for the density at later times:

$$\begin{aligned} u(x,t) = \Psi (t) u_0(x) + \int _0^t \psi (t-\tau ) \int _{-\infty }^{\infty } \phi (y) u(x-y,\tau ) dy d\tau . \end{aligned}$$
(33)

This equation is central to the CTRW theory (see Footnote 6). Taking the Laplace transform in time, the Fourier transform in space, and solving for \(\mathcal {F} \left[ \mathcal {L} [u]\right] (\xi ,s)\) yields the Montroll-Weiss equation [60],

$$\begin{aligned} \mathcal {F}\left[ \mathcal {L} [u] \right] (\xi ,s) = \frac{1-\mathcal {L}[\psi ](s)}{s} \frac{\mathcal {F}[u_0](\xi )}{1-\mathcal {L}[\psi ](s) \mathcal {F}[\phi ](\xi )}. \end{aligned}$$
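For completeness, the algebra behind this step can be sketched as follows. It uses the convolution theorems for both transforms and the identity \(\mathcal {L}[\Psi ](s) = (1-\mathcal {L}[\psi ](s))/s\), which follows from Eq. (32):

$$\begin{aligned} \mathcal {F}\left[ \mathcal {L} [u] \right] (\xi ,s)&= \mathcal {L}[\Psi ](s) \, \mathcal {F}[u_0](\xi ) + \mathcal {L}[\psi ](s) \, \mathcal {F}[\phi ](\xi ) \, \mathcal {F}\left[ \mathcal {L} [u] \right] (\xi ,s), \\ \left( 1 - \mathcal {L}[\psi ](s) \, \mathcal {F}[\phi ](\xi ) \right) \mathcal {F}\left[ \mathcal {L} [u] \right] (\xi ,s)&= \frac{1-\mathcal {L}[\psi ](s)}{s} \, \mathcal {F}[u_0](\xi ). \end{aligned}$$

The first term accounts for particles that have not yet left their initial position, and the second for particles that arrive after at least one jump.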

In the case that \(\phi\) is the \(\alpha\)-stable density (Eq. 20) and \(\psi\) is the \(\beta\)-stable subordinator density (Eq. 24), \(\mathcal {F}[\phi ]\) is given by the analytical formula (7) and \(\mathcal {L}[\psi ]\) by Eq. (25), so that the Montroll-Weiss equation represents a closed-form solution for u in \((\xi ,s)\)-space. In general, the inverse transforms cannot be performed analytically to obtain u itself, but u can be shown to satisfy Eq. (31) using the representations in Eqs. (29) and (18) [35].
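The link between the Montroll-Weiss solution and the space-time-fractional equation can be probed numerically in the regime of small \(\xi\) and s. The sketch below makes simplifying assumptions that are ours: a point source (\(\mathcal {F}[u_0] = 1\)), \(\mathcal {L}[\psi ](s) = e^{-s^\beta }\) as in Eq. (25), and a symmetric unit-scale jump law with \(\mathcal {F}[\phi ](\xi ) = e^{-|\xi |^\alpha }\). Under these assumptions the solution should approach \(s^{\beta -1}/(s^\beta + |\xi |^\alpha )\), which is what a Caputo derivative in time (Eq. 29) combined with a spatial operator of Fourier symbol \(-|\xi |^\alpha\) produces.

```python
import math

def montroll_weiss(xi, s, alpha, beta):
    """Montroll-Weiss solution in Fourier-Laplace space for a point source,
    assuming L[psi](s) = exp(-s**beta) and F[phi](xi) = exp(-|xi|**alpha)."""
    a = s ** beta
    b = abs(xi) ** alpha
    one_minus_psi = -math.expm1(-a)             # 1 - exp(-a), accurate for small a
    one_minus_psi_phi = -math.expm1(-(a + b))   # 1 - exp(-a) * exp(-b)
    return one_minus_psi / (s * one_minus_psi_phi)

def fractional_limit(xi, s, alpha, beta):
    """Fourier-Laplace solution of the symmetric space-time-fractional equation."""
    return s ** (beta - 1.0) / (s ** beta + abs(xi) ** alpha)

alpha, beta = 1.5, 0.8
for xi, s in [(1e-1, 1e-2), (1e-2, 1e-3), (1e-3, 1e-4)]:
    mw = montroll_weiss(xi, s, alpha, beta)
    fl = fractional_limit(xi, s, alpha, beta)
    print(xi, s, abs(mw / fl - 1.0))            # relative gap shrinks with (xi, s)
```

The shrinking relative gap reflects the fact that the fractional equation describes the long-time, large-scale behavior of the walk.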

2.1.6 Lévy Walks and Fractional Material Derivatives

Superdiffusive \(\alpha\)-stable Lévy flights exhibit infinite MSD, which is a drawback for certain applications. Related to this is the infinite speed of propagation intrinsic to Lévy flights, i.e., the fact that particles have a nonzero probability of traveling an arbitrarily large distance in a unit of time. Brownian motion also suffers from this feature, although the probability of large excursions is so low that the MSD remains finite. A prototypical model of superdiffusion that cannot be described by a Lévy flight is ballistic motion, in which particles simply move from an initial configuration in fixed random directions with speed v, for all time t. A ballistic particle travels a distance vt in time t from an initial position \(x_0\). If reorientations are allowed, then the positions of these so-called sub-ballistic particles in space-time are confined to a ballistic cone

$$\begin{aligned} \left\{ (x,t) \text { such that } x \in [x_0 - vt, x_0+vt], t \ge 0 \right\} . \end{aligned}$$

Because the density function of the particle positions is compactly supported, all moments of the position are finite. Such a process cannot be described by Lévy flights.

To capture such behavior, we introduce the Lévy walk model, following Zaburdaev et al. [44]. Such models are based on continuous-in-time motion of particles, rather than instantaneous jumps. A speed v of particles in a medium is specified; each particle moves with speed v in a chosen direction until a reorientation event occurs, at which the direction changes instantaneously and the particle continues to move with speed v until the next reorientation. Assuming the direction at reorientation is sampled uniformly on the unit sphere, such a walk is determined by a probability density function \(\psi (\tau )\) for the duration of movement. This leads to a survival probability \(\Psi\) given by Eq. (32), with \(\psi\) now representing the duration density. Thus, \(\Psi (\tau )\) returns the probability that a particle has persisted in a given direction for time \(\tau\), i.e., has not experienced reorientation for time \(\tau\). Similar to the CTRW case, a master equation can be derived for the probability density \(u(x,t)\) of the location of the particle in Laplace-Fourier space:

$$\begin{aligned} \mathcal {F}\left[ \mathcal {L} [u] \right] (\xi ,s) = \frac{\mathcal {L}[\Psi ](s + i v \xi ) + \mathcal {L}[\Psi ](s - i v \xi ) }{2 - \left[ \mathcal {L}[\psi ](s + i v \xi ) + \mathcal {L}[\psi ](s - i v \xi ) \right] } \mathcal {F}[u_0](\xi ). \end{aligned}$$
(34)

Unlike the master equation for CTRWs, this equation exhibits coupling in Fourier and Laplace variables, representing coupling in space-time (see Footnote 7). This results in governing equations that are considerably more complex than those of a standard CTRW. For a Lévy walk, \(\psi\) is taken to be a Pareto-type distribution,

$$\begin{aligned} \psi (\tau ) = \frac{1}{\tau _0} \frac{\gamma }{(1 + \tau /\tau _0)^{1+\gamma }}, \quad \tau _0> 0, \gamma > 0. \end{aligned}$$

An asymptotic expansion of \(\mathcal {L}[\psi ]\) and \(\mathcal {L}[\Psi ]\) substituted in Eq. (34) yields the following approximation for the evolution of the density function of a Lévy walk in Fourier-Laplace space:

$$\begin{aligned} \mathcal {F}\left[ \mathcal {L} [u] \right] (\xi ,s) \approx \frac{(s + i v \xi )^{\gamma -1} + (s - i v \xi )^{\gamma -1} }{(s + i v \xi )^\gamma + (s - i v \xi )^\gamma } \mathcal {F}[u_0](\xi ). \end{aligned}$$

Given v and \(u_0\), this equation can be inverted to compute \(u(x,t)\), but obtaining a governing equation in \((x,t)\) is less straightforward from this point on, due to space-time coupling. Sokolov and Metzler [61] suggest defining a fractional material or substantial derivative

$$\begin{aligned} (v^{-1} \partial _t \pm \partial _x)^{1/\gamma } u := \mathcal {F}^{-1} \mathcal {L}^{-1} \left[ (s + i v \xi )^\gamma + (s - i v \xi )^\gamma \mathcal {F}[u_0](\xi ) \right] , \end{aligned}$$

in order to obtain a governing equation for \(u(x,t)\). Recent works, such as [62], have explored numerical discretizations for these operators.

Fig. 8

The evolution of the probability density function (denoted \(P_{\text {LW}}\) in the figure) of a Lévy walk, reproduced from [44]. Here, \(\gamma = 3/2\) and the density is plotted for \(t = 100\) (black), \(t = 200\) (blue), and \(t = 300\) (red). The density mimics the density of a \(\gamma\)-stable Lévy flight in an interior region of the ballistic cone, scaling outwards as \(t^{1/\gamma }\), supported inside the ballistic front (consisting of two points in one dimension) that scales outwards as t

Despite the greater mathematical difficulties related to governing equations, as compared to other fractional models, Lévy walks have been widely used due to the physical nature of finite speed of propagation and finite MSD; see [44] for a survey. When \(1< \gamma < 2\), numerical approximations show that \(u(x,t)\) evolves from a \(\delta\)-distribution with “a central part of the profile approximated by the Lévy distribution sandwiched between two ballistic peaks” that propagate at speed v (Fig. 8), with an MSD and self-similarity property for large t that features a superdiffusive scale factor of \(t^{1/\gamma }\) [44].
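A one-dimensional Lévy walk is straightforward to simulate, which makes the confinement to the ballistic cone easy to verify. The following sketch is illustrative only: it assumes a normalized Pareto-type duration law with survival probability \(\Psi (\tau ) = (1 + \tau /\tau _0)^{-\gamma }\), sampled by inverse transform, equally likely left and right directions, and a walker that starts a new flight at \(t = 0\). The function name and parameter values are hypothetical.

```python
import numpy as np

def levy_walk_positions(t_final, v, gamma, tau0, n_particles, rng):
    """Positions at time t_final of 1D Levy walkers that start at x = 0."""
    x = np.zeros(n_particles)
    t = np.zeros(n_particles)
    active = np.ones(n_particles, dtype=bool)
    while active.any():
        n = int(active.sum())
        u = 1.0 - rng.uniform(size=n)                  # uniform on (0, 1]
        # Inverse transform of Psi(tau) = (1 + tau/tau0)^(-gamma)
        dur = tau0 * (u ** (-1.0 / gamma) - 1.0)
        direction = rng.choice([-1.0, 1.0], size=n)
        # Truncate the final flight so every walker stops at t_final
        dur = np.minimum(dur, t_final - t[active])
        x[active] += v * direction * dur
        t[active] += dur
        active = t < t_final
    return x

rng = np.random.default_rng(0)
v, gamma, tau0, t_final = 1.0, 1.5, 1.0, 100.0
x = levy_walk_positions(t_final, v, gamma, tau0, 5000, rng)
print(np.abs(x).max(), v * t_final, np.mean(x ** 2))
```

A histogram of `x` for increasing `t_final` gives a qualitative picture of a spreading central profile bounded by the fronts at \(\pm vt\).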

2.1.7 Variable-Order Fractional Derivatives

Given the physical meaning within stochastic models of the fractional order \(\alpha\) in derivatives such as Eqs. (12), (16), and (28), it is reasonable to expect that these parameters may vary in space and time. Variable-order fractional models are convenient to describe anomalous diffusion in the case of heterogeneous materials or media, or, more generally, when the nature of the diffusion process (subdiffusive, superdiffusive, and classical) changes with space and time. While models with constant fractional order are the simplest and most widely used, some of the model descriptions we discuss in the following sections are improved by the use of a variable fractional order. In recent years, with the purpose of increasing the descriptive power of fractional operators, new models characterized by a variable fractional order have been introduced for both space- and time-fractional differential operators [63,64,65,66,67] and several discretization methods have been designed [68,69,70,71,72]. The improved descriptive power of variable-order fractional operators has been demonstrated in some recent works on parameter estimation [73,74,75].

Given a function

$$\begin{aligned} \alpha : \mathbb {R}^d \times \mathbb {R} \rightarrow \mathbb {R}, \end{aligned}$$

i.e., a function \(\alpha (\mathbf {x},t)\) of space and time, we define variable-order operators as follows. For a function \(u(\mathbf {x},t)\) with \(\mathbf {x} \in \mathbb {R}^d\) and \(t \in \mathbb {R}\), we define the variable-order fractional Laplacian (see Footnote 8) as

$$\begin{aligned} \mathfrak {L}^{\alpha (\cdot ,\cdot )} u(\mathbf {x},t) = C_{d,\alpha (\mathbf {x},t)} \text { p.v.} \int _{\mathbb {R}^d} \frac{u(\mathbf {x},t) - u(\mathbf {y},t)}{|\mathbf {x}-\mathbf {y}|^{d+\alpha (\mathbf {x},t)}} d\mathbf {y}. \end{aligned}$$
(35)

Here, \(\alpha (\mathbf {x},t)\) is restricted to take values in (0, 2). Note that for constant \(\alpha\), \(\mathfrak {L}^{\alpha } = (-\Delta )^{\alpha /2}\). For \(d = 1\) and \(\alpha (x,t)\) restricted to (0, 1), we define the variable-order left-sided Riemann-Liouville fractional derivative as

$$\begin{aligned} ^{\;\text {RL}}_{{-\infty }}{\mathbb {D}}^{\alpha (\cdot ,\cdot )}_x u(x,t) = \frac{1}{\Gamma (1-\alpha (x,t))} \frac{\partial }{\partial x} \int _{-\infty }^x \frac{u(y,t)}{|x-y|^{\alpha (x,t)}} dy. \end{aligned}$$

The right-sided Riemann-Liouville derivative may be defined for variable order in an analogous way. We define the variable-order Caputo fractional derivative, again for \(\alpha (x,t)\) taking values in (0, 1), as

$$\begin{aligned} ^{\text {C}}_{\,0}{\mathbb {D}}^{\alpha (\cdot ,\cdot )}_t u(x,t) = \frac{1}{\Gamma (1-\alpha (x,t))} \int _{0}^t \frac{\partial u}{\partial t}(x,s) \frac{1}{|s-t|^{\alpha (x,t)}} {ds}. \end{aligned}$$
(36)

2.1.8 Relationships Between Processes, Fractional Models, and Applications

To summarize and offer a quick look-up of anomalous diffusion processes, their corresponding fractional models, and applications of each process/model, we have included these relationships in Table 2. This table includes references to the previous sections where each process and model is described, as well as pointers to the applications in the following sections where the models are utilized. We have limited references to applications to only those three areas that we focus on in this article.

Table 2 Relationships between diffusion models, fractional models, and applications discussed in this article

2.2 Connection to Nonlocal Calculus

Fractional-order differential operators can be viewed as a special case of nonlocal models [76,77,78]. The intrinsic nonlocality of fractional operators has been illustrated in the previous section; this property describes the fact that fractional-order derivatives of a function at a point \(\mathbf {x} \in \mathbb {R}^d\) typically depend on values of the same function at all points \(\mathbf {y}\in \mathbb {R}^d\), no matter how large the distance between \(\mathbf {x}\) and \(\mathbf {y}\) may be. An example of this is the formula (12) for the fractional Laplacian.

General nonlocal diffusion (or Laplace) models include integral operators of the form [79, 80]

$$\begin{aligned} \mathfrak {L}[u](\mathbf {x}) = \int _{\mathbb {R}^d} \gamma (\mathbf {x},\mathbf {y})[u(\mathbf {x}) - u(\mathbf {y})] d\mathbf {y}\end{aligned}$$
(37)

with kernels \(\gamma\) having support in \(\{|\mathbf {x}-\mathbf {y}| \le \delta \}\), where the so-called interaction radius \(\delta\) is such that \(\delta \in (0,\infty ]\). A quick comparison with the integral formula (12) shows that when the kernel \(\gamma\) is properly selected and \(\delta =\infty\), then the fractional Laplacian is formally equivalent to (37) (see [78] for a rigorous derivation and a discussion).

Nonlocal Laplace operators featuring kernels with bounded support may be preferred to fractional operators for physical reasons when modeling short-range interactions [81, 82], as well as for mathematical convenience when posing volume conditions, the nonlocal counterpart of classical boundary conditions [79, 83]. The latter reason gives rise to truncated fractional-order derivatives [77, 84].
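To make the role of the interaction radius concrete, the operator (37) can be evaluated by quadrature in one dimension. The sketch below uses a deliberately simple kernel of our own choosing, the constant \(\gamma = 3/\delta ^3\) on \(|x-y| \le \delta\), which is not a fractional kernel; with this scaling the operator acts as \(-d^2/dx^2\) on quadratic functions and approximates it for smooth functions when \(\delta\) is small. The function name and quadrature resolution are hypothetical.

```python
import numpy as np

def nonlocal_laplacian(u, x, delta, n_quad=2000):
    """Midpoint-rule approximation of Eq. (37) in 1D at the points x, for the
    constant kernel gamma = 3/delta^3 supported on |x - y| <= delta."""
    h = 2.0 * delta / n_quad
    z = -delta + h * (np.arange(n_quad) + 0.5)    # offsets y - x in (-delta, delta)
    kernel = 3.0 / delta ** 3
    x = np.asarray(x, dtype=float)
    diff = u(x[:, None]) - u(x[:, None] + z[None, :])
    return kernel * diff.sum(axis=1) * h

x = np.linspace(-1.0, 1.0, 5)
quad_vals = nonlocal_laplacian(lambda s: s ** 2, x, delta=0.1)   # expect -u'' = -2
sine_vals = nonlocal_laplacian(np.sin, x, delta=0.1)             # close to -u'' = sin(x)
print(quad_vals, sine_vals - np.sin(x))
```

Only values of u within distance \(\delta\) of each evaluation point enter the result, in contrast to the fractional Laplacian, for which \(\delta = \infty\).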

General nonlocal models also allow for more flexibility with regard to regularity. Considering diffusion or Poisson problems, fractional-order problems exhibit regularity explicitly parametrized by the fractional order [51]; in contrast, nonlocal models involving nonsingular kernel operators lead to problems that impose no regularity on the solution [79] and can be naturally utilized to model fracture dynamics [82, 85]. Finally, we remark that the relationship between fractional and nonlocal models extends to more general operators than those of diffusion/Laplace type. There is indeed a well-established nonlocal vector calculus [80, 86], of which fractional-order vector calculus is a special case (see [77] for rigorous results where the convergence of truncated fractional gradient and divergence is proven in norm and pointwise).

2.3 A Remark About Numerical Methods for Fractional-Order Models

Over the past two decades, a significant amount of progress has been made in developing numerical methods, ranging from finite difference/volume schemes to finite-element methods, in addition to a variety of new spectral theories for single and multi-domain spectral methods, obtaining efficient and easy-to-construct smooth/non-smooth basis and test functions. Performing a thorough and inclusive review of all the contributions made in this direction is nearly impossible and out of the scope of the present work. Interested readers can find a wide spectrum of research carried out in the context of numerical analysis of fractional models in [87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105], and references therein.

We restrict ourselves to discussing one aspect of numerical methods, namely the computational feasibility of solving fractional models. In the time-fractional case, efficient long-time numerical integration is of interest to capture inherent long-time far-from-equilibrium dynamics and to enable the full convolution computations for large-scale systems. To this end, a number of fast time-stepping schemes have been developed during the last 20 years, which greatly reduce the cost of solving fractional models, making them comparable to classical models. These include the fast convolution method by Lubich and Schädle [106], which reduced the computational complexity of direct finite difference discretizations of time-fractional models from \(\mathcal {O}(N^2)\) to \(\mathcal {O}(N \log N)\), and memory requirements from \(\mathcal {O}(N)\) to \(\mathcal {O}(\log N)\), where N denotes the number of time steps. High-order extensions of the method were developed [107, 108] and applied to three-dimensional simulations of fluid-structure interactions in cerebral arteries and aneurysms [107]. Among a vast number of works in the literature, we also mention matrix-based schemes, such as fast-inversion approaches [109] and kernel compression methods [110] for time-fractional problems. For space-fractional FPDEs, adaptive methods and hierarchical matrix approaches have accomplished similar, dramatic reductions in computational complexity and memory costs [111,112,113]. Efficient solvers and preconditioners for the fractional Laplacian were also developed in [114]. The point we make is that, after two decades of development, state-of-the-art numerical methods for fractional models have computational costs comparable to those of integer-order cases, and are therefore ready to be employed in large-scale systems modeled by FPDEs.
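The \(\mathcal {O}(N^2)\) cost of a direct discretization is visible in the simplest scheme for the Caputo derivative (Eq. 28 with \(a = 0\)). The sketch below implements the classical L1 finite difference formula, which replaces u by its piecewise-linear interpolant; it is a baseline for illustration and not one of the fast methods cited above. The function name `caputo_l1` and the test functions are our choices.

```python
import math
import numpy as np

def caputo_l1(u_vals, dt, beta):
    """L1 approximation of the Caputo derivative of order beta in (0, 1) at
    t_n = n*dt, n = 1..N, given samples u_vals[n] = u(t_n), n = 0..N."""
    N = len(u_vals) - 1
    j = np.arange(N)
    b = (j + 1.0) ** (1.0 - beta) - j ** (1.0 - beta)   # convolution weights b_j
    du = np.diff(u_vals)                                # du[m] = u_{m+1} - u_m
    out = np.empty(N)
    for n in range(1, N + 1):
        # Full history sum over all previous steps: O(n) work, O(N^2) in total
        out[n - 1] = np.dot(b[:n], du[n - 1::-1])
    return out * dt ** (-beta) / math.gamma(2.0 - beta)

beta, dt, N = 0.5, 1e-2, 200
t = dt * np.arange(N + 1)

lin = caputo_l1(t, dt, beta)                            # exact for linear u
lin_exact = t[1:] ** (1.0 - beta) / math.gamma(2.0 - beta)

quad = caputo_l1(t ** 2, dt, beta)                      # approximate for u = t^2
quad_exact = 2.0 * t[1:] ** (2.0 - beta) / math.gamma(3.0 - beta)
print(np.abs(lin - lin_exact).max(), np.abs(quad - quad_exact).max())
```

Each step revisits the entire solution history, and this is the cost and storage that fast convolution and kernel compression techniques are designed to reduce.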

3 Anomalous Subsurface Transport

The accurate prediction at large scales of contaminant transport in both surface and subsurface water is fundamental for efficient management of water resources and hence critical for environmental safety. However, the explicit description of the systems where transport takes place is extremely challenging, especially at large scales, due to the complexity of surface and subsurface environments. In fact, the latter feature heterogeneities that are either hard or impossible to measure and, hence, cannot be described with certainty at all scales and locations of relevance. On the other hand, even when the environment’s microstructure can be captured, numerical simulations of PDE models such as the advection-diffusion equation (ADE) may be prohibitively expensive if conducted at small scales. Furthermore, the same equations that are accurate at small scales fail to predict solutes’ behavior at larger scales, due to the appearance of “anomalous”, or “non-Fickian” behavior [19].

Still, in the past, the classical ADE has been broadly utilized as a model for solute transport [115,116,117]. As thoroughly explained in [20], in the presence of heterogeneous media, ADEs fail to be accurate at large scales due to the fact that they are treated as deterministic models. A coarse-grained model can be considered deterministic only when media properties do not vary rapidly in the neighborhood of a point; however, even with mild heterogeneities [18] quantities defined at large scales vary rapidly enough to justify treating them as random functions of space and/or time over a fictitious macroscopic continuum. In this case, the ADE becomes an SDE. Interestingly, when treating the ADE’s parameters as stochastic, the ensemble mean concentration through randomly heterogeneous media is generally non-Fickian, i.e., non-classical. This can be observed in a simple manner by performing Monte Carlo numerical simulations. After generating several random realizations of the underlying velocity field, the ADE is numerically solved for each field and the concentration is averaged over all realizations, revealing a non-classical behavior [20].
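A toy version of this Monte Carlo experiment can be written in a few lines. The sketch below is a strong simplification of the setting in [20] and is meant only to show the mechanism: each realization has a spatially uniform velocity drawn at random, so that particle positions of the ADE can be sampled exactly, and all parameter values are hypothetical. Within one realization the plume variance is the Fickian \(2Dt\); across the ensemble it acquires an additional term proportional to \(t^2\).

```python
import numpy as np

rng = np.random.default_rng(0)
V_mean, V_std, D = 1.0, 0.3, 0.01     # hypothetical velocity statistics and diffusivity
n_real, n_part = 2000, 200            # velocity realizations, particles per realization
times = np.array([1.0, 2.0, 4.0, 8.0])

def ensemble_variance(t):
    """Variance of particle positions pooled over all velocity realizations."""
    V = rng.normal(V_mean, V_std, size=(n_real, 1))   # one velocity per realization
    Z = rng.standard_normal((n_real, n_part))
    X = V * t + np.sqrt(2.0 * D * t) * Z              # exact ADE particle positions
    return X.var()

variances = np.array([ensemble_variance(t) for t in times])
fickian = 2.0 * D * times                             # single-realization variance
predicted = fickian + (V_std * times) ** 2            # ensemble variance in this toy model
print(variances, predicted)
```

The ensemble variance departs from linear growth in time even though every individual realization is Fickian, which is the qualitative effect described above.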

In view of the following section where fractional behavior is discussed in the context of turbulence, we point out that the above stochastic theories are closely related to those governing turbulent diffusion. However, while transport in porous media takes place at small Reynolds numbers, turbulent diffusion takes place at large ones. Furthermore, porous velocities depend on hydraulic properties in a known manner, whereas turbulent velocities fluctuate randomly in space-time, making the first uncertainty epistemic (e.g., incomplete knowledge of medium properties) and the second aleatory (i.e., controlled by chance). This makes it easier to reduce the uncertainty in solute transport models by tuning them using hydrogeologic data (see, e.g., [118]).

In this section we show that fractional ADEs (FADEs) are appropriate models to describe non-Fickian transport of solutes without the prohibitive burden of explicitly resolving the heterogeneities at small scales, thanks to their integral nature, which allows length scales to be embedded in the definition of the operator. Before reporting on early works featuring a plain fractional Laplacian model and later works where variable fractional orders are introduced, we dedicate a few words to another nonlocal model that is also popular in the literature: the continuous-time random walk (CTRW) approach. As we point out later on, these models have similarities and share advantages, perhaps the most important being the strong connection to stochastic processes, which makes them easier to analyze and interpret.

Fractional Subsurface Models Based on Continuous-Time Random Walks

In Sect. 2.1.5, we discussed the basic concepts of CTRWs, introduced by Montroll and Weiss [60]. We now explain how these models arise in subsurface transport and lead to fractional equations, following Berkowitz et al. [119]; further relevant works in the literature include [120,121,122,123,124].

To analyze subsurface particles, we begin by examining the solute concentration \(C(\mathbf {x},t)\) for a given configuration of particles; \(C(\mathbf {x},t)\) refers to the number of particles at a site \(\mathbf {x}\), normalized by the total number of particles in the system. In the absence of sinks and sources, the solute concentration \(C(\mathbf {x},t)\) varies with time t at the site \(\mathbf {x}\) by following a stochastic mass balance expression, i.e.,

$$\begin{aligned} \dfrac{\partial C(\mathbf {x},t)}{\partial t}= -\sum \limits _\mathbf {y}\big (w(\mathbf {y},\mathbf {x})C(\mathbf {x},t) - w(\mathbf {x},\mathbf {y})C(\mathbf {y},t)\big ). \end{aligned}$$

The expression above is known in the literature as the (discrete) master equation [125]. Here, \(w(\mathbf {x},\mathbf {y})\) is the transition rate at which a particle moves from \(\mathbf {y}\) to \(\mathbf {x}\); the first term in the sum represents the normalized rate of solute outflow from site \(\mathbf {x}\) to all sites \(\mathbf {y}\), whereas the second term represents the normalized rate of solute inflow from all sites \(\mathbf {y}\) to \(\mathbf {x}\). We further assume that the transition rates corresponding to different sites or displacements are statistically independent, i.e., hydraulic and transport properties of porous media and system states (e.g., hydraulic fluxes) lack spatial correlations. This is referred to as statistical incoherence; under this assumption, the ensemble mean concentration \(c(\mathbf {x},t)=\langle C(\mathbf {x},t) \rangle\), where \(\langle \cdot \rangle\) refers to an average over all possible configurations of the particle system, satisfies the so-called generalized master equation, i.e.,

$$\begin{aligned} \dfrac{\partial c(\mathbf {x},t)}{\partial t}= -\sum \limits _\mathbf {y}\int \limits _0^t \Big (\theta (\mathbf {y}-\mathbf {x},t-\tau )c(\mathbf {x},\tau ) - \theta (\mathbf {x}-\mathbf {y},t-\tau )c(\mathbf {y},\tau )\Big )\,d\tau . \end{aligned}$$

As discussed in Berkowitz et al. [119], this equation is equivalent to a spacetime coupled CTRW equation

$$\begin{aligned} c(x,t) = \sum _y \int _0^t \chi (x-y,t-t')c(y,t') dt' + \delta (x) \delta (t - 0^+), \end{aligned}$$

with an explicit correspondence between the function \(\theta\) and the space-time density function \(\chi (s,t)\); see also Klafter and Silbey [126]. If the CTRW is uncoupled, i.e.,

$$\begin{aligned} \chi (s,t) = \psi (s) \phi (t), \end{aligned}$$

then this equation is equivalent to the CTRW equation (33) discussed in Sect. 2.1.5. As a result, c(x, t), in the absence of advection and with \(\psi\) and \(\phi\) given by the stable distributions specified in Sect. 2.1.5, is governed by the FPDE (Eq. 31). This underscores the importance of FPDEs in subsurface transport, although in some cases the incoherence assumption required to derive the generalized master equation may not be valid.
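The uncoupled assumption \(\chi (s,t) = \psi (s) \phi (t)\) can be illustrated with a small particle simulation. The following is a minimal sketch, not the construction used in [119]: for simplicity it assumes Pareto-distributed waiting times with tail exponent `beta` (infinite mean for \(0<\beta <1\)) and Gaussian jumps instead of the stable laws of Sect. 2.1.5, and the function name and parameters are hypothetical.

```python
import numpy as np


def simulate_uncoupled_ctrw(n_particles, t_final, beta, jump_scale=1.0, rng=None):
    """Positions at time t_final of an uncoupled CTRW started at x = 0.

    Assumptions of this sketch (not taken from the text):
    - waiting times follow a Pareto law, P(T > t) = t**(-beta) for t >= 1,
      which has infinite mean when 0 < beta < 1;
    - jumps are Gaussian with standard deviation jump_scale and are drawn
      independently of the waiting times (chi = psi * phi).
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.zeros(n_particles)
    clock = np.zeros(n_particles)
    n_jumps = np.zeros(n_particles, dtype=int)
    active = np.ones(n_particles, dtype=bool)
    while active.any():
        idx = np.flatnonzero(active)
        # Inverse-CDF sampling: 1 - U lies in (0, 1], so every wait is >= 1.
        wait = (1.0 - rng.random(idx.size)) ** (-1.0 / beta)
        clock[idx] += wait
        # A particle jumps only if its waiting period ends before t_final.
        jumped = clock[idx] <= t_final
        movers = idx[jumped]
        x[movers] += jump_scale * rng.standard_normal(movers.size)
        n_jumps[movers] += 1
        active[idx[~jumped]] = False
    return x, n_jumps


positions, jumps = simulate_uncoupled_ctrw(
    2000, t_final=100.0, beta=0.7, rng=np.random.default_rng(0)
)
```

A histogram of `positions` gives an empirical estimate of the ensemble concentration at `t_final`; particles with `jumps == 0` are still sitting at the origin, which is the trapping effect produced by heavy-tailed waiting times.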

A simple and fairly general FADE for subsurface transport under the influence of both advection and anomalous diffusion is the one-dimensional advection and space-time-fractional diffusion equation with constant coefficients (see, e.g., [16]):

$$\begin{aligned} \begin{aligned} ^{\text {C}}_{{\,0}}{\mathbb {D}}^{\beta }_t c(x,t)&= -V\dfrac{\partial c}{\partial x} -D \left[ p \, \left( ^{\;\text {RL}}_{{-\infty }}{\mathbb {D}}^{\alpha }_x c(x,t) \right) + (1-p) \, \left( ^{\text {RL}}_{\;\;x}{\mathbb {D}}^{\alpha }_{\infty } c(x,t) \right) \right] \\ c(x,t = 0)&= c_0 (x), \end{aligned} \end{aligned}$$
(38)

where c is the solute concentration, V a constant velocity, D a constant diffusion coefficient, \(\alpha\) the space-fractional order, and \(\beta\) the time-fractional order. In Sect. 2.1.5, we presented equation (31), which is identical to the above equation except for the advection term \(-V \partial c / \partial x\), as the governing equation for the probability density of a continuous-time random walk. As discussed in [35], the inclusion of the advection term corresponds to a stochastic model in which the particle drifts with constant velocity and jumps to the left or the right with density specified by the diffusion term. Thus, as in Sect. 2.1.5, when \(\beta = 1\) the FADE (Eq. 38) governs the evolution of the probability density

$$\begin{aligned} f_\alpha (x;\gamma = 2p-1,\sigma = k (\Delta t)^{1/\alpha },\mu = Vt) \end{aligned}$$

of the skewed \(\alpha\)-stable process \(S_\alpha (\gamma = 2p-1,\sigma = k (\Delta t)^{1/\alpha },\mu = Vt)\). Compared to the asymmetric diffusion model in Sect. 2.1.3, with fundamental solution (Eq. 21), this density differs only in that the center drifts with velocity V, i.e., it is located at Vt. This describes a particle that drifts with velocity V and makes jumps to the left or right drawn from the stable distribution (20). More specifically, when particle paths are discretized in steps of \(\Delta t\), the position of the particle changes by \(V \Delta t + \Delta X\) at each time step, where \(\Delta X\) is an increment of the process \(X^{\alpha ,p}_t\) given by Eq. (20). When \(0< \beta < 1\), similar to the CTRW model described in Sect. 2.1.5, this equation governs the probability density of a particle undergoing the process just described, time-changed by the inverse \(\beta\)-stable subordinator, again introducing waiting times to the process.
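The discretized particle picture for \(\beta = 1\), in which each step adds \(V \Delta t + \Delta X\), can be sketched as follows. This is an illustration under stated assumptions rather than the method of [35]: the increments are drawn with the Chambers-Mallows-Stuck sampler for \(\alpha \ne 1\), the skewness is passed as \(\gamma = 2p-1\), the scale is \(\sigma = k (\Delta t)^{1/\alpha }\) with `k` supplied by the user, and the function names are hypothetical.

```python
import numpy as np


def stable_rvs(alpha, skew, size, rng):
    """Chambers-Mallows-Stuck sampler for a standard alpha-stable law.

    Valid for alpha != 1; unit scale, zero location, skewness in [-1, 1].
    """
    u = rng.uniform(-np.pi / 2, np.pi / 2, size)
    w = rng.exponential(1.0, size)
    zeta = -skew * np.tan(np.pi * alpha / 2)
    xi = np.arctan(-zeta) / alpha
    return (
        (1 + zeta**2) ** (1 / (2 * alpha))
        * np.sin(alpha * (u + xi))
        / np.cos(u) ** (1 / alpha)
        * (np.cos(u - alpha * (u + xi)) / w) ** ((1 - alpha) / alpha)
    )


def drift_plus_stable_paths(n_particles, n_steps, dt, V, k, alpha, p, rng=None):
    """Particle paths whose increments are V*dt plus a skewed stable jump."""
    rng = np.random.default_rng() if rng is None else rng
    sigma = k * dt ** (1 / alpha)  # scale of one increment, as in the text
    jumps = sigma * stable_rvs(alpha, 2 * p - 1, (n_steps, n_particles), rng)
    # Row i holds all particle positions after i + 1 time steps.
    return np.cumsum(V * dt + jumps, axis=0)


# Gaussian limit (alpha = 2): the plume center should sit near V * t = 1.
paths_gauss = drift_plus_stable_paths(
    20000, 10, dt=0.1, V=1.0, k=1.0, alpha=2.0, p=0.5,
    rng=np.random.default_rng(0),
)
# Heavy-tailed, skewed case.
paths_heavy = drift_plus_stable_paths(
    1000, 50, dt=0.1, V=1.0, k=1.0, alpha=1.5, p=0.8,
    rng=np.random.default_rng(1),
)
```

For \(\alpha < 2\) the sample variance of the final positions does not settle as more particles are added, so quantiles or histograms are the safer diagnostics of plume spreading.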

As pointed out by Neuman and Tartakovsky [20], when \(\beta = 1\), Eq. (38) corresponds to Markovian random walk processes of statistically independent and identically distributed non-Gaussian displacements, and, as such, they can only occur in an uncorrelated velocity field; in hydrology, this can be viewed as a limitation of both CTRW and plain FADEs. Instead, it is possible that variable-coefficient or variable-order models may be able to describe processes associated with statistically non-homogeneous velocity fields. However, we are not aware of a specific theoretical framework that relates variable-coefficient and variable-order FADEs and CTRWs. Nor are we aware of a framework that relates such variable parameters to physical properties of the medium. At present, the only way of estimating such parameters is by fitting the models to observed concentrations and/or mass fluxes, and not from hydraulic data such as hydraulic conductivity and advective porosity, or from flow parameters such as hydraulic gradients and fluxes [20].

As discussed in Sect. 2.1.5, limits of a CTRW with infinite-mean and statistically independent waiting times lead to time-fractional FPDEs. A physical mechanism that would result in time-fractional derivatives in a FADE is particle trapping due to media heterogeneities [127, 128]. Such models are discussed in Sect. 3.2.2.

We conclude this section with some advantages of using FADEs as opposed to more general CTRW models. First, it is well known [129] that FADEs can account for source and boundary terms, and that velocity dynamics can be easily included through an additional velocity equation, which leads to a velocity-concentration coupled system. Furthermore, even though this has not been thoroughly explored, model fitting for FADEs is computationally less challenging, owing to the limited number of parameters to fit.

3.1 Evidence of Fractional Behavior in the Presence of Heterogeneity

In this section we provide two examples of fractional behavior of solute concentration. We start by considering a highly heterogeneous environment and then we show that even in circumstances where a classical behavior is expected, i.e., in the absence of heterogeneities, the macroscopic solute concentration behaves nonlocally and, hence, can be described by a FADE.

Fractional behavior is most readily seen in transport through heterogeneous media. The first experiment we discuss studied subsurface transport of tritium in a highly heterogeneous environment such as the MADE site, located on the Columbus Air Force Base in northeastern Mississippi. This unconfined, alluvial aquifer consists of generally unconsolidated sands and gravels with smaller clay and silt components. Irregular lenses and horizontal layers were observed in an aquifer exposure near the site [130]. Detailed studies characterizing the spatial variability of the aquifer and the spreading of the conservative tracer plume for the experiment conducted at the beginning of the 1990s can be found in [131]. Benson et al. [18] used Eq. (38) to model particle concentration; here, model parameters are determined a priori by tuning them on the basis of measurements (we refer to [18], Sects. 4.2 and 4.3, for a detailed description of the calibration process). In Fig. 9 we report four snapshots of the normalized longitudinal tritium mass distribution. These plots are obtained by numerical integration of the analytic solutions of both the classical ADE and the FADE. These distributions clearly indicate that the fractional model outperforms the classical one.

Fig. 9

A comparison of classical (Gaussian) and fractional (\(\alpha\)-stable) predictions of the normalized mass as a function of space at specific time instants for the MADE data set. The data points represent the maximum concentration measured in vertical slices perpendicular to the direction of the plume. These maxima were then integrated versus the travel distance. Source: [18]

Strong heterogeneity, however, is not necessary to observe fractional behavior. Increasing experimental evidence suggests that in laboratory experiments where the medium is “constructed” as nearly homogeneous, the observations are consistent with anomalous transport, see, e.g., [19, 132]. In fact, some authors even claim that strictly classical transport may not exist [133]. Benson et al. [17] analyzed a test case where the tracer’s concentration was intuitively expected to follow a classical ADE. They considered a one-dimensional tracer test in a laboratory-scale, 1 m sandbox, constructed with very uniform sand in an effort to minimize heterogeneity, see Fig. 10, left. In other words, the sandbox was designed and built using as homogeneous a porous medium as possible by following the setup in [134]. Here, simple tracer tests, conducted to estimate the transport characteristics of the sand, indicated the appearance of non-classical breakthrough curves (BTCs, i.e., plots of the concentration as a function of time) with heavy tails, similar to \(\alpha\)-stable solutions. This behavior was likely due to channeling within smaller and smaller grains that resulted from sand emplacement through standing water and from cracked and intact surface clays on the sand particles [17]. In Fig. 10, right, a comparison, conducted in [17], between BTCs obtained with the classical ADE and the FADE shows the agreement of the latter with measured BTCs at a specific location. While in this figure the differences between classical and fractional behavior are not as striking as in Fig. 9, they are still noticeable.

Fig. 10

Tracer transport in homogeneous sand shows evidence of anomalous behavior which can be reproduced by a fractional diffusion-advection equation. On the left, the setup of the homogeneous sand tracer experiment (as described in [134]). On the right, a comparison of the corresponding classical (Fickian) and fractional (\(\alpha\)-stable) predictions. Source: [17]

We also mention that evidence of anomalous behavior and its successful description by FADEs has been observed in unsaturated soils [135], saturated porous media [136], streams and rivers [137, 138], and overland solute transport due to rainfall [139].

3.2 State of the Art: a Progression of Fractional Models for Subsurface Transport

As described at the beginning of this section, classical diffusion does not take into account long-distance spatial and time correlations. The anomalous movement of particles in the subsurface, however, depends on both far upstream/downstream concentrations (resulting in space-fractional equations [43, 132, 140, 141]) and past conditions (resulting in time-fractional equations [141,142,143,144]). Considering only the movement of solute particles in an infinitesimal neighborhood, like in the classical diffusion model for Brownian motion, is too restrictive for the complexities of groundwater pore spaces or trapping zones in natural streams. More specifically, the presence of preferential paths in hydrologic domains results in high-velocity zones (superdiffusion), whereas the presence of trapping regions results in low-velocity zones where the particles “wait” before they return to the higher velocity zone (this concept is also known in the literature as the distinction between immobile and mobile zones) [16].

In this section we review fractional models of increasing complexity for anomalous subsurface transport. While the simpler models are viable choices in the presence of a low degree of heterogeneity, as this degree increases, more sophisticated models are required to obtain reliable predictions. We first present early works featuring a one-dimensional space FADE with constant coefficients and constant fractional order. Next, we extend this model to the case of variable coefficients and generalize it to the multidimensional setting. We then present two types of one-dimensional time-FADEs and conclude the section with a very general model featuring both space- and time-fractional derivatives of variable order. For all these models, we refer to Sect. 2.1 for their mathematical details and interpretation in the context of stochastic processes.

3.2.1 Spatial Fractional Derivatives

We present the constant-coefficient, constant-order spatial FADE in one dimension introduced in [17] and provide details regarding its parameters in relation to solute transport. The solute concentration at point x and time t, c(x, t), satisfies the equation

$$\begin{aligned} \frac{\partial c}{\partial t}(x,t) = -V\dfrac{\partial c}{\partial x} -D \left[ p \, \left( ^{\;\text {RL}}_{{-\infty }}{\mathbb {D}}^{\alpha }_x c(x,t) \right) + (1-p) \, \left( ^{\text {RL}}_{\;\;x}{\mathbb {D}}^{\alpha }_{\infty } c(x,t) \right) \right] \end{aligned}$$

where V is the average plume velocity, D is a fractional diffusion coefficient (Footnote 9) that controls the rate of spreading, \(1\le \alpha \le 2\) (dimensionless) is the fractional order, and \(0\le p \le 1\) determines the skewness \(\gamma = 2p-1\). Solutions can be positively (\(p=0\)) or negatively (\(p=1\)) skewed, whereas they are symmetric when \(p=0.5\), for which the sum of the Riemann-Liouville derivatives results in the fractional Laplacian. The fractional order \(\alpha\) codes for the heterogeneity of the velocity field, with a higher probability of large velocities as it decreases towards one [145]. We recall that for \(\alpha =2\) the FADE reduces to the traditional advection-diffusion equation (ADE) for groundwater flow and transport. The FADE above was introduced for the first time by Benson et al. [146] to model scale-dependent dispersivity in fitted groundwater plumes. In this paper the authors observed that, given a data set of solute concentration, the fitted parameter D grows with time when the classical ADE is used; such evidence of superdiffusion is an indicator that a space-fractional model is preferable. Indeed, in subsequent works, see, e.g., [18], the same authors show that the FADE allows the same data set to be fit with a constant-coefficient model such as Eq. (38), where D does not vary over time. From a particle perspective, the combination of left-sided and right-sided RL derivatives allows a solute particle to jump to any point in the domain; this simple concept was used by Schumer et al. [43] to provide a derivation of Eq. (38) using an Eulerian interpretation of the particles’ behavior.

The Grünwald-Letnikov Discretization Technique

A standard discretization technique used in the FADE community for the approximation of the left-sided and right-sided RL derivatives Eqs. (16) and (17) in Eq. (38) is the shifted Grünwald-Letnikov (GL) finite difference formula introduced by Meerschaert and Tadjeran [147]. The GL scheme is based on the following identities:

$$\begin{aligned} \begin{array}{l} ^{\;\text {RL}}_{-\infty } \mathbb {D}^{\alpha }_x u(x,t) = \lim _{h\rightarrow 0} h^{-\alpha } \sum\limits_{j=0}^\infty g_j^{\alpha } u(x-(j-1)h,t) \\ ^{\text {RL}}_{\;\;x}{\mathbb {D}}^{\alpha }_\infty u(x,t) = \lim _{h\rightarrow 0} h^{-\alpha } \sum\limits_{j=0}^\infty g_j^{\alpha } u(x+(j-1)h,t), \end{array} \end{aligned}$$
(39)

where the GL weights are given by

$$\begin{aligned} g_j^\alpha = (-1)^j \dfrac{\Gamma (\alpha +1)}{\Gamma (j+1)\Gamma (\alpha -j+1)}. \end{aligned}$$

The GL approximation of the one-dimensional FADE is obtained by fixing a finite step size h and truncating the summation in Eq. (39). The temporal derivative and the classical first-order spatial derivative can be approximated by standard discretization schemes for PDEs. Formulas (39) clearly highlight the nonlocal nature of fractional derivatives and the associated high computational cost compared to PDEs: every grid value of the derivative depends on all grid values of u on one side of x.
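The truncated, shifted GL sums can be sketched in a few lines. The code below is a minimal illustration, not the implementation of [147]: it assumes that u is available as a callable on the whole real line, that the left-sided sum samples u at \(x-(j-1)h\) (points to the left of x plus one shifted node) and the right-sided sum at \(x+(j-1)h\), and it evaluates the weights with the recurrence \(g_0^\alpha = 1\), \(g_j^\alpha = g_{j-1}^\alpha (j-1-\alpha )/j\), which is equivalent to the Gamma-function formula above; all function names are hypothetical.

```python
import numpy as np


def gl_weights(alpha, n_terms):
    """GL weights g_0, ..., g_{n_terms}, i.e. (-1)^j * binom(alpha, j)."""
    g = np.empty(n_terms + 1)
    g[0] = 1.0
    for j in range(1, n_terms + 1):
        g[j] = g[j - 1] * (j - 1 - alpha) / j
    return g


def shifted_gl_left(u, x, alpha, h, n_terms):
    """Truncated shifted GL sum for the left-sided RL derivative at x."""
    j = np.arange(n_terms + 1)
    return h ** (-alpha) * np.sum(gl_weights(alpha, n_terms) * u(x - (j - 1) * h))


def shifted_gl_right(u, x, alpha, h, n_terms):
    """Truncated shifted GL sum for the right-sided RL derivative at x."""
    j = np.arange(n_terms + 1)
    return h ** (-alpha) * np.sum(gl_weights(alpha, n_terms) * u(x + (j - 1) * h))


# Check on exponentials: with lower limit -infinity, the left-sided
# derivative of exp(x) is exp(x); with upper limit +infinity, the
# right-sided derivative of exp(-x) is exp(-x). Both equal 1 at x = 0.
alpha, h, n_terms = 1.5, 0.01, 4000
left_value = shifted_gl_left(np.exp, 0.0, alpha, h, n_terms)
right_value = shifted_gl_right(lambda s: np.exp(-s), 0.0, alpha, h, n_terms)
```

The sums use thousands of terms for a single point value, which is the nonlocal cost referred to above; on a grid with N nodes the same weights fill a dense N-by-N matrix.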

FADEs with Variable Coefficients on Bounded Domains

In a heterogeneous porous medium, at a scale where the geological character of the medium changes with location, the constant-coefficient model (38) is insufficient for accurate and reliable predictions. A first step towards a more accurate model is introducing space dependence in the material parameters V and D. Furthermore, in practical settings, simulations of solute transport must be confined to bounded domains, so that it becomes mandatory to establish ways to prescribe nonlocal boundary conditions that guarantee existence and uniqueness of solutions. In the literature there are at least three variants of the FADE with space-dependent coefficients [148]: the fractional-flux ADE (FF-ADE), the fractional-divergence ADE (FD-ADE), and the fully fractional-divergence ADE (FFD-ADE). In this review we focus on the first because of its resemblance to classical advection-diffusion equations, and we formulate the associated equation on bounded domains (Footnote 10).

The FF-ADE model in the one-dimensional domain \((-L,L)\) is derived from the classical conservation of mass equation

$$\begin{aligned} \dfrac{\partial }{\partial t}c(x,t) + \dfrac{\partial }{\partial x}q(x,t) = 0, \;\;\mathrm{for}\;x\in (-L,L), \end{aligned}$$
(40)

where the flux q is given by the following constitutive equation [43]

$$\begin{aligned} q(x,t)= V(x) c(x,t) + D(x) \Big [ p \big (^{\text {RL}}_{-L}{\mathbb {D}}^{\alpha -1}_x c(x,t)\big ) -(1-p)\big ( ^{\text {RL}}_{\;\;x}{\mathbb {D}}^{\alpha -1}_L c(x,t)\big ) \Big ]. \end{aligned}$$
(41)

Here, the first term is the advective flux that models the average drift of contaminant particles, whereas the second and third terms are the dispersive fluxes, which model large particle jumps in the left and right directions, respectively. Note that, because we consider the bounded domain \((-L,L)\), the integrals in the left- and right-sided derivatives are “truncated” at \(-L\) and L, respectively. Furthermore, since \(\partial (^{\text {RL}}_{-L}{\mathbb {D}}^{\alpha -1}_x c(x,t))/\partial x = ^{\text {RL}}_{-L}{\mathbb {D}}^\alpha _x c(x,t)\), the RL derivatives in the definition of the flux q have exponent \(\alpha -1\). The resulting FF-ADE corresponds to the models proposed in, e.g., [149]. We point out that, as described in detail in [150], Caputo derivatives such as the ones introduced in Sect. 2.1 can also be used in place of RL derivatives in the definition of the flux (leading to what is referred to as Caputo flux).

The restriction of the FADE to a bounded domain requires the prescription of appropriate boundary conditions to guarantee that Eq. (40) is well-posed. We consider two types of boundary conditions: reflecting and absorbing. Using the flux function defined in Eq. (41), we can identify a reflecting (or no-flux) condition by setting the diffusive part of the flux q equal to zero at the boundary, i.e., \(x=\pm L\). As an example, the reflecting boundary condition on the right boundary corresponds to

$$\begin{aligned} \Big [ p \big (^{\text {RL}}_{-L}{\mathbb {D}}^{\alpha -1}_x c(x,t)\big ) -(1-p)\big ( ^{\text {RL}}_{\;\;x}{\mathbb {D}}^{\alpha -1}_L c(x,t)\big ) \Big ]_{x=L}=0. \end{aligned}$$

Instead, absorbing boundary conditions correspond to prescribing a zero “Dirichlet” condition at the boundary, i.e.,

$$\begin{aligned} c(\pm L,t)=0. \end{aligned}$$

Clearly, these conditions can be mixed resulting in absorbing/reflecting boundary conditions on either the left or right boundary of the domain. It is important to note that, in the absence of advection, the no-flux (reflecting) condition implies that the total mass is conserved, see Proposition 2.3 in [150].

We also mention that a new space-fractional model with variable advection and diffusion coefficients for anomalous, anisotropic transport has been proposed in [151].

Multidimensional FADEs

The multidimensional version of Eq. (38) was proposed by Meerschaert et al. [54] and further analyzed in [17]. For \((-\Delta )_M^{\alpha /2}\) defined as in Eq. (22), we have that for \(\mathbf {x}\in {\mathbb {R}^d}\) the concentration of a solute is described by the following law:

$$\begin{aligned} \dfrac{\partial }{\partial t}c(\mathbf {x},t) +\mathbf{V}\cdot \nabla c(\mathbf {x},t) -D(-\Delta )_M^{\alpha /2}c(\mathbf {x},t)= 0, \end{aligned}$$
(42)

where \(\mathbf{V}\) is the average solute velocity and D is the fractional diffusion coefficient. In [54], the operator \((-\Delta )_M^{\alpha /2}\), corresponding to Eq. (22), is introduced via the inverse Fourier transform, i.e.,

$$\begin{aligned} (-\Delta )_M^{\alpha /2}c(\mathbf {x},t) = \mathcal F^{-1} \left\{ \int _{|{\varvec{\theta }}|=1} (i \mathbf{k}\cdot {\varvec{\theta }})^\alpha \widehat{c}(\mathbf{k},t)M(d{\varvec{\theta }})\right\} . \end{aligned}$$

Here, \({\varvec{\theta }}\) is a d-dimensional unit vector, \(\mathbf{k}\) is the wave vector and \(\widehat{c}\) is the spatial Fourier transform of c. Note that the coefficient D can be embedded in the measure M (even when it depends on the space variable). As for the one-dimensional constant-coefficient equation (38), the multidimensional FADE can also be extended to the variable-coefficient case. Furthermore, in the special case of jumps occurring only along the standard coordinate vectors \(\mathbf{e}_j\), it is possible to derive fundamental solutions to Eq. (42). Finally, the special case of a uniform measure M over the \((d-1)\)-dimensional unit sphere corresponds to an advection-diffusion equation where the diffusion term is given by the standard fractional Laplacian operator \((-\Delta )^{\alpha /2}\).
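The Fourier-multiplier definition is easy to experiment with in one dimension, where the unit sphere reduces to the two points \(\theta = \pm 1\). The sketch below assumes a measure M that places mass p on \(\theta = +1\) and \(1-p\) on \(\theta = -1\), so that the symbol becomes \(p\,(ik)^\alpha + (1-p)\,(-ik)^\alpha\); it also assumes a periodic grid so that the FFT can be used, whereas the text works on all of \(\mathbb {R}^d\). The function name is hypothetical.

```python
import numpy as np


def apply_skewed_symbol(u, length, alpha, p):
    """Apply the multiplier p*(ik)^alpha + (1-p)*(-ik)^alpha to periodic samples u.

    u      : real samples on a uniform periodic grid of total length `length`
    alpha  : fractional order
    p      : weight of the theta = +1 direction (0 <= p <= 1)
    """
    n = u.size
    k = 2 * np.pi * np.fft.fftfreq(n, d=length / n)  # angular wavenumbers
    # Principal branch: (ik)^alpha = |k|^alpha * exp(i * sign(k) * alpha * pi / 2).
    symbol_plus = np.abs(k) ** alpha * np.exp(1j * np.sign(k) * alpha * np.pi / 2)
    symbol_minus = np.conj(symbol_plus)  # (-ik)^alpha
    symbol = p * symbol_plus + (1 - p) * symbol_minus
    return np.fft.ifft(symbol * np.fft.fft(u)).real


x = np.linspace(0.0, 2 * np.pi, 64, endpoint=False)
u = np.sin(3 * x)
# alpha = 2 recovers the second derivative for any p: -9 * sin(3x).
second_derivative = apply_skewed_symbol(u, 2 * np.pi, 2.0, 0.3)
# Symmetric case p = 1/2: the symbol is |k|^alpha * cos(alpha * pi / 2).
symmetric = apply_skewed_symbol(u, 2 * np.pi, 1.5, 0.5)
```

In the symmetric case the result is a negative multiple of \(|k|^\alpha \widehat{c}\) for \(1<\alpha <2\), because \(\cos (\alpha \pi /2)<0\) there; this constant is the kind of factor that can be absorbed into D or into the measure M.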

3.2.2 Temporal Fractional Derivatives

The time-FADE, used to model particle trapping in heterogeneous porous media, is characterized, from a jump-process perspective, by long waiting times between jumps. This FADE replaces the first-order time derivative in an ADE with a time-fractional derivative of either RL or Caputo type. In this section, we review two popular time-fractional models, the time-FADE and the fractional mobile-immobile equation, also known as FMIM; both are written below with Caputo derivatives.

Time-Fractional Advection-Diffusion Equation

The time-fractional advection-diffusion equation (time-FADE) was introduced in the works by Zaslavsky [152] and, independently, by Liu et al. [153]. In one dimension, it is given by

$$\begin{aligned} ^{\text {C}}_{{0}}{\mathbb {D}}^{\alpha }_t c(x,t) = -v\dfrac{\partial }{\partial x}c(x,t)+D \dfrac{\partial ^2}{\partial x^2}c(x,t), \end{aligned}$$
(43)

where the first term is the Caputo derivative defined in Eq. (28) on the half-axis. The units of the velocity parameter v are \(\mathrm{L}/\mathrm{T}^\alpha\) and those of the diffusion coefficient D are \(\mathrm{L}^2/\mathrm{T}^\alpha\), where L denotes units of space and T units of time. Note that, in the literature, \(^{\text {C}}_{0}{\mathbb {D}}^{\alpha }_t f(t)\) is often denoted by \(\frac{\partial ^\gamma }{\partial t^\gamma }f(t)\), where \(\gamma\) plays the same role as \(\alpha\). Furthermore, as pointed out at the beginning of this section, the time-FADE can be seen as the scaling limit of a CTRW. It is possible to obtain representations of solutions to Eq. (43) by subordination, i.e., via randomization of the time variable by the inverse stable subordinator [154].
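To make the memory carried by the Caputo derivative in Eq. (43) concrete, the sketch below evaluates it on a uniform time grid with the so-called L1 scheme. This scheme is a common choice that is not discussed in the text and is used here only as an assumption; it requires \(0<\alpha <1\) and replaces the function by its piecewise-linear interpolant. The function name is hypothetical.

```python
import numpy as np
from math import gamma


def caputo_l1(f_values, tau, alpha):
    """L1 approximation of the Caputo derivative of order 0 < alpha < 1.

    f_values : samples f(t_0), ..., f(t_N) on the uniform grid t_n = n * tau
    returns  : approximations at t_0, ..., t_N (the entry at t_0 is set to 0)
    """
    f_values = np.asarray(f_values, dtype=float)
    n_max = f_values.size - 1
    k = np.arange(n_max)
    b = (k + 1) ** (1 - alpha) - k ** (1 - alpha)  # history weights
    df = np.diff(f_values)  # df[m] = f(t_{m+1}) - f(t_m)
    out = np.zeros(n_max + 1)
    for n in range(1, n_max + 1):
        # sum over k = 0..n-1 of b_k * (f(t_{n-k}) - f(t_{n-k-1}))
        out[n] = np.dot(b[:n], df[n - 1::-1])
    return out * tau ** (-alpha) / gamma(2 - alpha)


alpha = 0.6
t = np.linspace(0.0, 1.0, 51)
tau = t[1] - t[0]
# f(t) = t has Caputo derivative t^(1-alpha) / Gamma(2-alpha); the L1
# scheme reproduces it up to rounding because f is piecewise linear.
d_linear = caputo_l1(t, tau, alpha)
exact_linear = t ** (1 - alpha) / gamma(2 - alpha)
# f(t) = t^2 has Caputo derivative 2 t^(2-alpha) / Gamma(3-alpha).
d_quadratic = caputo_l1(t**2, tau, alpha)
```

Each new time level reuses every earlier increment of f, so a time-stepping solver for Eq. (43) built on this formula must store the full solution history.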

Fractional Mobile-Immobile Equation

The fractional mobile-immobile (FMIM) model proposed by Schumer et al. [155] is a generalization of the classical mobile-immobile (MIM) model [156]. The latter, in its classical definition, partitions the solute concentration into a mobile phase, \(c_m\), and an immobile phase, \(c_{im}\), and equates the divergence of the total flux of the mobile concentration to a weighted sum of the time rate of change of each phase, i.e.,

$$\begin{aligned} \dfrac{\partial }{\partial t}c_m(x,t) + \beta \dfrac{\partial }{\partial t}c_{im}(x,t) = -v\dfrac{\partial }{\partial x}c_m(x,t)+D \dfrac{\partial ^2}{\partial x^2}c_m(x,t), \end{aligned}$$
(44)

where \(\beta =\eta _{im}/\eta _m\), with \(\eta _{im}\) and \(\eta _m\) denoting the porosities of the immobile and mobile phases. The relationship between \(c_m\) and \(c_{im}\) is then given by one or more coupled mass transfer equations, resulting in the following relationship

$$\begin{aligned} \dfrac{\partial }{\partial t}c_{im}(x,t)= f(t) {*} \dfrac{\partial }{\partial t}c_{m}(x,t) + f(t) (c_m(x,0)-c_{im}(x,0)), \end{aligned}$$
(45)

where \({*}\) indicates the convolution operation and f(t) is a memory function. The FMIM model in [155] defines f(t) as the power function \(f(t)= t^{-\alpha }/\Gamma (1-\alpha )\) with \(0<\alpha <1\). By noting that

$$\begin{aligned} f(t){*} \dfrac{\partial }{\partial t}c_{m}(x,t) = ^{\text {C}}_{{0}}{\mathbb {D}}^{\alpha }_t c_m(x,t), \end{aligned}$$

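substituting into Eq. (45) gives

$$\begin{aligned} \dfrac{\partial }{\partial t}c_{im}(x,t)= ^{\text {C}}_{{0}}{\mathbb {D}}^{\alpha }_t c_m(x,t) + f(t) (c_m(x,0)-c_{im}(x,0)), \end{aligned}$$

and inserting this expression into Eq. (44) results in the time-fractional FMIM equation

$$\begin{aligned} \dfrac{\partial }{\partial t}c_m(x,t) + \beta \, ^{\text {C}}_{{0}}{\mathbb {D}}^{\alpha }_t c_m(x,t) = -v\dfrac{\partial }{\partial x}c_m(x,t)+D \dfrac{\partial ^2}{\partial x^2}c_m(x,t) - \beta f(t) (c_m(x,0)-c_{im}(x,0)). \end{aligned}$$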

A CTRW model for the FMIM model was developed by Benson and Meerschaert [157]; here, waiting times experienced by solute particles in the immobile phase are modeled by a power law (as for the time-FADE). Power-law waiting times have also been observed in river transport studies by Haggerty et al. [158] and Schmadel et al. [159].

3.2.3 Variable-Order FADEs

Constant-coefficient and constant-order models are invaluable basic tools for the analysis of complex engineering systems such as the flow through the subsurface; however they are unable to evolve between different physical behaviors, i.e., they cannot capture transitions between diffusive regimes. These transitions are caused by the fact that solutes in the subsurface diffuse through porous, fractured, layered and heterogeneous aquifers, whose structure changes with space as well as time. This leads to anomalous diffusion characterized by a variable-order scaling of the MSD. A first step towards more descriptive models was the introduction of variable-coefficient models, as described in the previous section. Yet, modeling such transitions using constant-order fractional operators would require a continuous update of the underlying governing equations. For this reason, several recent works (in the context of subsurface modeling and beyond) have explored the use of variable-order operators. The use of these operators becomes particularly important in the presence of complex media that feature a hybrid anomalous mechanism [11]. As an example, we can exploit variable-order fractional operators, like the ones introduced in Sect. 2.1.7, when the nature of the transport processes transitions across very different underlying physical phenomena such as transitions from subdiffusive flow to diffusive flow, and from diffusive flow to superdiffusive flow [160,161,162,163,164,165]. Note that these complex transport processes have been observed experimentally in various fields; for fluid flow through porous media we mention, e.g., [166, 167].

A complete variable-order fractional model was proposed in [165] and further explored in [168] for the description of the same MADE data set introduced at the beginning of this section. The one-dimensional variable-order time-space FADE is given by

$$\begin{aligned} ^{\text {C}}_{{0}}{\mathbb {D}}^{\beta (x,t)}_t c(x,t) = -V\dfrac{\partial c}{\partial x} -D^- \, \left( ^{\;\text {RL}}_{-\infty }{\mathbb {D}}^{\alpha (x,t)}_x c(x,t) \right) - D^+ \, \left( ^{\text {RL}}_{\;\;x}{\mathbb {D}}^{\alpha (x,t)}_{\infty } c(x,t) \right) , \end{aligned}$$
(46)

where the variable-order derivatives are defined as in Sect. 2.1.7.

To confirm the improved accuracy of models such as the one in (46) we report in Fig. 11 a comparison, conducted in [165], of a classical model, a constant-order fractional model and a variable-order fractional model. Here, the authors consider concentration data from the field experiment conducted at the Grimsel test site [169] where uranine, a fluorescent dye, was injected into a shear zone as a tracer and its concentration was measured at an extraction well away from the injection site. The BTC of uranine measured at the extraction well corresponds to the blue crosses in the figure. The authors compare the following models: the classical advection-diffusion equation, corresponding to \(\beta =1\) and \(\alpha =2\) in (46), the constant-order time-FADE with \(\beta =0.9\) and \(\alpha =2\), and the variable-order time-FADE with \(\beta (t)=0.9 + t/150\), \(t\in (0,15]\), and \(\alpha =2\). BTCs in the figure show that the classical ADE model is unable to capture the tailing/subdiffusive behavior, whereas the constant-order time-FADE underestimates the late-time decay, which features classical behavior. The choice of \(\alpha\) and \(\beta\) in the variable-order time-FADE is based on the following considerations: first, the measured BTC has a fast-increasing early time tail, implying a Gaussian-type of particle jump that corresponds to \(\alpha =2\). Second, the heavy late-time tail suggests a time-dependent \(\beta\) that should be less than 1 at early times (subdiffusive) and should slowly converge to 1 at late times (classical diffusion). The corresponding solid black BTC clearly captures the variable diffusion behavior of the normalized concentration.

Fig. 11

A comparison in semi-log scale of the normalized concentration at the extraction point for the Grimsel test site [169] obtained using the classical advection-diffusion equation, the constant-order time-FADE, and variable-order time-FADE (ADE, Constant-index FDM, and Variable-index time FDM, respectively in the legend) together with the normalized experimental data. Source: Sun et al. [165]. Note that their use of \(\alpha\) and \(\beta\) is switched from our use of the same symbols in the text; thus, in the above legend, \(\alpha\) denotes the order of the time-fractional derivative

3.3 Future Directions in Anomalous Subsurface Modeling

In the previous sections we provided evidence of the occurrence of anomalous behavior in subsurface transport even for a low degree of heterogeneity and we have shown that FADEs can be accurate models when properly tuned. However, the identification of an optimal fractional model for a specific setting (e.g., for specific hydraulic properties) is not trivial and has not been thoroughly explored in the literature. One of the main challenges in this context is the fact that model parameters cannot be directly related to media properties, as carefully explained in [20]. Furthermore, oftentimes, it is hard or nearly impossible to collect solute measurements, so that only a very small set of data that are sparse in time and space and potentially affected by noise is available. Yet, in this context, FADEs have the advantage, compared to other models for subsurface transport, of having only a handful of parameters to tune, i.e., the identification problem consists in discovering a small set of parameters such as the diffusivity and the fractional order.

Only a few works in the literature have addressed this problem. In the context of highly heterogeneous settings, we mention the work by Pang et al. [168] where the authors propose to use multi-fidelity Bayesian optimization to discover variable-order fractional operators for the advection-diffusion equation (46) from field data in the MADE data set mentioned at the beginning of this section. Other recent works addressing a similar learning problem for fractional operators include optimization-based approaches such as the one used in [84], fractional/nonlocal physics-informed neural network approaches such as [74, 170] and operator-regression techniques such as the one developed in [171]. It is important to keep in mind that in all these works the computational cost may become prohibitive, due to the integral nature of the operators involved and to the strong singularities that require sophisticated (and expensive) quadrature rules. Thus, together with the development of new learning techniques or the extension of the current ones to more complex settings, it is mandatory to design more efficient discretization schemes and numerical solvers.

4 Turbulence

Richard P. Feynman described turbulence as the most important unsolved problem in classical physics [172], a problem that still stands today. By “turbulence”, we refer to the three-dimensional and highly vortical fluid motions characterized by stochastic perturbations in pressure and flow velocity, and caused by excessive kinetic energy in areas of fluid flow that overcomes the “damping effects” of the fluid’s viscosity. The onset of turbulence can be predicted by the dimensionless Reynolds number Re, a ratio of kinetic energy to viscous damping in the fluid flow. Yet, the question remains of what mathematically governs the evolution of a turbulent flow and whether it is feasible to fully simulate turbulent flows by means of numerical methods.

In 1970, Emmons [173] reviewed the possibilities for computational fluid dynamics, concluding: “... the problem of turbulent flows is still the big holdout. This straightforward calculation of turbulent flows — necessarily three-dimensional and unsteady — requires a number of numerical operations too great for the foreseeable future.” After almost a decade, however, the field of direct numerical simulation (DNS) of turbulence was established with successful numerical simulations of wind-tunnel flows at moderate Re by Hussaini and Voigt [174], Karniadakis et al. [175], Kim et al. [176], and Orszag and Patterson [177]. These early computational developments were based on employing a Newtonian fluid assumption and applying the principles of conservation of mass, momentum, and energy to an infinitesimally small fluid element or parcel; see, e.g., [178,179,180]. This led to the derivation of the Navier-Stokes and energy equations, emerging as a set of convective nonlinear PDEs that govern the evolution of fluid velocity/temperature fields in turbulence. In this context, assuming some proper (random) initial/boundary conditions, one can discretize the governing equations and solve for the “entire degrees of freedom of turbulence” in the physical and parametric (stochastic) space.

The great challenge is that, in practice, DNS becomes prohibitively expensive, especially at high Re, more so in complex geometries. Hence, one of the main goals in turbulence modeling has been to systematically lower the total number of degrees of freedom to a manageable level, at the cost of reducing the accuracy of turbulence predictions. This approach has been mainly centered around the overarching theme of ensemble averaging the set of PDEs representing the various scalar and vector turbulence fields; see, e.g., [181, 182]. This gives rise to new mathematical terms in the averaged or filtered governing equations, known as turbulence closure terms that can only be modeled as they are essentially unknown. When the entire time and length scales of turbulence are averaged, an operation denoted by \(\omega \mapsto \overline{\omega }\), the Reynolds Averaged Navier-Stokes (RANS) equations are obtained, solely describing the mean-flow dynamics of turbulence. Alternatively, if one applies a mathematically well-defined low-pass filter to the Navier-Stokes equations, an operation denoted by \(\phi \mapsto \tilde{\phi }\), the resulting filtered governing equations describe the large eddy dynamics of turbulence, where only small-scale subgrid dynamics need be modeled; this is referred to as large eddy simulation (LES). The common approach in the literature for modeling closure terms of any kind has been based on the use of classical local differential operators. More specifically, the majority of turbulence models have been constructed based on Boussinesq’s turbulent viscosity concept [183], in which one assumes that the turbulent stress tensors are proportional to the local gradient of mean velocity at any point. The proportionality coefficient, referred to as turbulent viscosity, is to be inferred from data.

The impetus for the fractional models we describe in this section is that the small-scale dynamics of turbulence are statistically anomalous, i.e., non-Markovian and non-Fickian, so that nonlocal closure models emerge as appropriate tools. Employed at the continuum level, fractional models therefore capture anomalous features in the small-scale stochastic subgrid dynamics of turbulence. The mathematical modeling of turbulence must address the fact that nonlinear interactions between the turbulence structures and motions create statistically complex phenomena that lead to a variety of anomalous features, including multi-power-law scalings in space-time, rare events, short-to-long-range coherent motions, and enhanced turbulent mixing. These features urge better and novel understanding of the underlying nonlocal closure terms that appear as a result of the ensemble averaging or filtering of the governing equations. The nonlocal mode of thinking has the potential to shift the turbulence modeling paradigm and achieve a new level of physical and statistical consistency compared to classical approaches. This is especially true at high Re, for which a proper and efficient framework that unites computational, mathematical, and statistical aspects was not available until recently.

4.1 Evidence of Fractional Behavior in Turbulence

An intuitive concept of nonlocality and memory effects was established by Eringen and Wegner [184], where a point within a fluid field (medium) is influenced by all points of the body at all past times. Coherent random motions and the spatial structures of turbulent spots inherently give rise to intermittent signals with self-similarities, sharp peaks, heavy skirts, and skewed distributions of velocity increments. Such statistical features have been well observed experimentally even in the most canonical problems, e.g., grid turbulence, in which the skewness factor is negative and the kurtosis factor strongly exceeds three, emphasizing the non-Gaussian character of the statistics (see, e.g., [23]). Moreover, as demonstrated by Egolf and Hutter [185] (page 92), nonlocal effects appear even in turbulent fields obtained by numerically solving the Navier-Stokes equations. Such widespread statistical measures indicate the non-Markovian and non-Fickian nature of turbulence, and they are the consequence of nonlinear and coherent vortical effects that occur over a wide spectrum of length and time scales. Therefore, nonlocal interactions cannot be ruled out of modern turbulence physics. These considerations are particularly timely; in fact, we can now benefit from the spectrum of modern nonlocal and fractional modeling tools reviewed in Sect. 2.1, equipped with well-established mathematical/statistical theories, that enable us to take such nonlocal/history effects into account with physical consistency and mathematical rigor.

Furthermore, averaging entire spatial scales as in RANS models, or applying a spatial filter to the Navier-Stokes and energy equations as in LES models, makes the underlying physical nonlocality even more pronounced, both in the closure terms of RANS models and in the subgrid turbulent fluctuations of LES models. This sheds light on why turbulence modeling is a nonlocal task and further motivates the development of “nonlocal closure models” that can properly address and incorporate the underlying memory and long-range effects. Specifically, in what follows, we present a DNS study by Akhavan-Safaei et al. [24] that introduces new statistical measures and highlights the nonlocal character of subgrid-scale dynamics in the context of scalar turbulence.

4.1.1 The Case of Scalar Turbulence Subgrid Dynamics

An ideal LES is such that the true, filtered turbulent intensity is captured accurately through a robust subgrid scale (SGS) modeling that is physically and mathematically expressive. In fact, the LES equations include closure terms that directly link the correct evolution in time of turbulent intensity to the nature of the SGS closure and its modeling. Here, as a canonical problem, we consider the advection-diffusion (AD) equation

$$\begin{aligned} \frac{\partial \phi }{\partial t}+\frac{\partial }{\partial x_i}\left( \phi \, V_i \right) = -\theta \, V_2+\mathcal {D} \, \frac{\partial ^2 \phi }{\partial x_i \partial x_i}, \quad i=1,2,3, \end{aligned}$$
(47)

in which \(\mathcal {D}\) denotes the molecular diffusion coefficient of the passive scalar, and the imposed mean scalar gradient is taken to be uniform as \(\nabla \langle \Phi \rangle = \left( 0,\theta ,0\right)\), where \(\theta\) is a real-valued constant. In the LES representation of the scalar turbulence, multiplying both sides of the filtered AD equation by \(\widetilde{\phi }\), the filtered counterpart of the scalar field \({\phi }\), yields the time-evolution of the filtered turbulent intensity as

$$\begin{aligned} \frac{1}{2} \frac{\partial }{\partial t}\left( \widetilde{\phi } \, \widetilde{\phi } \right) + \widetilde{\phi } \, \frac{\partial }{\partial x_i} \left( \widetilde{\phi } \, \widetilde{V}_i \right) = -\theta \, \widetilde{\phi } \, \widetilde{V}_2 + \mathcal {D} \, \widetilde{\phi } \, \frac{\partial ^2 \, \widetilde{\phi }}{\partial x_i \partial x_i} - \widetilde{\phi } \, \frac{\partial \, q^R_i}{\partial x_i}. \end{aligned}$$

Here, \(q_i^R\) denotes the i-th component of the residual, SGS scalar flux defined as \(q_i^R = \widetilde{\phi V_i} - \widetilde{\phi } \widetilde{V_i}\). Employing the filtered continuity equation \(\nabla \cdot \varvec{\widetilde{V}}=0\) and the chain rule for differentiation, we obtain

$$\begin{aligned} \begin{aligned} \frac{1}{2} \frac{\partial }{\partial t}\left( \widetilde{\phi } \, \widetilde{\phi } \right) + \widetilde{\phi } \, \widetilde{V}_i \, \frac{\partial \widetilde{\phi }}{\partial x_i} =&-\theta \, \widetilde{\phi } \, \widetilde{V}_2 + \mathcal {D} \, \frac{\partial }{\partial x_i}\left( \widetilde{\phi } \, \frac{\partial \widetilde{\phi }}{\partial x_i} \right) - \mathcal {D} \, \frac{\partial \, \widetilde{\phi }}{\partial x_i} \, \frac{\partial \, \widetilde{\phi }}{\partial x_i}\\ {}&- \frac{\partial }{\partial x_i}\left( \widetilde{\phi } \, q^R_i \right) + q^R_i \, \frac{\partial \widetilde{\phi }}{\partial x_i}. \end{aligned} \end{aligned}$$
(48)

Applying the ensemble-averaging operator, \(\langle \cdot \rangle\), on Eq. (48) returns a transport equation for the filtered scalar variance, \(\left\langle \widetilde{\phi } \, \widetilde{\phi } \right\rangle\). Akhavan-Safaei et al. [24] consider the case of homogeneous turbulent velocity and scalar fields, in which \(\left\langle \frac{\partial }{\partial x_i}\left( \cdot \right) \right\rangle = \frac{\partial }{\partial x_i}\langle (\cdot ) \rangle = 0\). Defining the filtered scalar gradient as \(\widetilde{\varvec{G}}(\mathbf {x}) = \nabla \widetilde{\phi }(\mathbf {x})\), the time-evolution of the filtered scalar variance takes the following form

$$\begin{aligned} \begin{array}{c} \frac{1}{2} \frac{d}{d t}\left\langle \widetilde{\phi } \, \widetilde{\phi } \right\rangle = -\widetilde{\mathcal {T}} + \widetilde{\mathcal {P}} - \widetilde{\chi } + \Pi , \\ \widetilde{\mathcal {T}} = \left\langle \widetilde{\phi } \, \widetilde{V}_i \, \widetilde{G}_i \right\rangle , \quad \widetilde{\mathcal {P}} = -\theta \left\langle \widetilde{\phi } \, \widetilde{V}_2 \right\rangle , \quad \widetilde{\chi } = \mathcal {D} \, \left\langle \widetilde{G}_i \, \widetilde{G}_i \right\rangle , \quad \Pi = \left\langle q^R_i \, \widetilde{G}_i \right\rangle . \end{array} \end{aligned}$$
(49)

In Eq. (49), \(\widetilde{\mathcal {T}}\) denotes the turbulent transport of filtered scalar variance while \(\widetilde{\mathcal {P}}\) represents the production of resolved scalar variance by the uniform mean scalar gradient, and \(\widetilde{\chi }\) is the resolved scalar variance dissipation due to the molecular diffusion. Unlike these three terms, \(\Pi\) (representing the SGS production of resolved scalar variance) is the only contributing term in Eq. (49) that contains the effects of the SGS scalar flux. Therefore, as pointed out earlier, understanding the true statistical nature of \(\varvec{q}^R \cdot \widetilde{\varvec{G}}\) is essential for the SGS modeling and precise evaluation of the resolved scalar variance in the LES. This examination of \(\varvec{q}^R \cdot \widetilde{\varvec{G}}\) might be viewed both from single-point and two-point statistics as discussed by Meneveau [186] in the context of the LES for homogeneous isotropic turbulent flows. We focus on the two-point statistics of the SGS production of resolved scalar variance. This quantity is well represented in terms of the following normalized two-point correlation function

$$\begin{aligned} \mathcal {C}(q^R_i \, , \, \widetilde{G}_i) = \frac{\left\langle q^R_i(\mathbf {x}) \, \widetilde{G}_i(\mathbf {x}+\varvec{r}) \right\rangle }{\left\langle q^R_i(\mathbf {x}) \, \widetilde{G}_i(\mathbf {x}) \right\rangle }, \end{aligned}$$
(50)

where \(\varvec{r}=(r_1,r_2,r_3)\) denotes the spatial shift from the location \(\mathbf {x}\). Moreover, the probability density function (PDF) of the SGS production of scalar variance normalized by its \(L_2\)-norm, i.e., \(\varvec{q}^R \cdot \widetilde{\varvec{G}}/\Vert \varvec{q}^R \cdot \widetilde{\varvec{G}} \Vert\), is a novel statistical measure for studying the statistical behavior of \(\Pi\) and yielding a more comprehensive insight into the SGS modeling.
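A minimal sketch of how Eq. (50) could be evaluated numerically is given next. It is not the procedure used in [24]: it assumes one-dimensional, periodic samples and a top-hat filter, and the function names `box_filter`, `sgs_flux`, and `two_point_correlation` are illustrative choices made here.

```python
import numpy as np

def box_filter(f, width):
    """Periodic top-hat filter spanning `width` grid points (odd width assumed)."""
    half = width // 2
    return sum(np.roll(f, s) for s in range(-half, half + 1)) / (2 * half + 1)

def sgs_flux(phi, v, width):
    """Residual flux q^R = filter(phi * v) - filter(phi) * filter(v)."""
    return box_filter(phi * v, width) - box_filter(phi, width) * box_filter(v, width)

def two_point_correlation(q, g):
    """C(r) = <q(x) g(x+r)> / <q(x) g(x)> for periodic 1D samples q and g.

    The spatial average stands in for the ensemble average (homogeneity assumed).
    """
    n = q.size
    # Cross-correlation theorem: sum_x q[x] g[x+r] = ifft(conj(fft(q)) * fft(g))[r]
    cross = np.real(np.fft.ifft(np.conj(np.fft.fft(q)) * np.fft.fft(g))) / n
    return cross / cross[0]
```

By construction the returned array equals one at zero shift, so slow decay of its entries with increasing shift is what signals long-range correlation between the flux and the filtered gradient.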

Let \(T_{\text {LE}}\) be the eddy turnover time. By taking a large sample space over \(10 \, T_{\text {LE}}\) of this stationary process (after resolving the passive scalar field for \(15 \, T_{LE}\)), the PDF of the normalized SGS production of filtered scalar variance is computed for four different filter widths, \(\Delta /\eta =8, \, 20, \, 41, \, 53\). These computations, shown in Fig. 12a, demonstrate that as \(\Delta\) becomes larger, the PDF exhibits broader tails. Emergence of this tail behavior implies that as the filter width increases, long-range spatial interactions become stronger and more pronounced [187]. Motivated by this observation, a two-point diagnosis of the SGS scalar production of the filtered variance as defined in Eq. (50) would be another statistical measure shedding light on the long-range interactions in addition to the filter width effects. Considering \(\parallel\) as the direction along the imposed mean scalar gradient and \(\perp\) representing the directions perpendicular to the imposed mean gradient, we focus on the evaluation of \(\mathcal {C}(q^R_\parallel \, , \, \widetilde{G}_\parallel )\).

Fig. 12

Statistics of true subgrid-scale contribution to the filtered scalar variance rate. (a) PDF of normalized SGS dissipation of filtered scalar variance, \(-\varvec{q}^R \cdot \widetilde{\varvec{G}}\), computed over a sample space of \(10 \, T_{LE}\) of statistically stationary turbulence. (b) Time-averaged two-point correlation function (50) between \(q^R_\parallel\) and \(\widetilde{G}_\parallel\) with \(r=\vert \varvec{r}_\perp \vert\). Source: [24]

Here, one takes \(\varvec{r}=(r_1,0,0)\) and \(\varvec{r}=(0,0,r_3)\) and averages the resulting two-point correlation functions. Because the turbulence is statistically stationary, this procedure is performed for 20 data snapshots that are uniformly spaced over \(10 \, T_{LE}\) (the same spatio-temporal data used to compute the PDFs); hence, the time-averaged value of \(\mathcal {C}(q^R_\parallel \, , \, \widetilde{G}_\parallel )\) is obtained. Figure 12b illustrates this two-point correlation function over a wide range of spatial shift, \(r=\vert \varvec{r} \vert\), evaluated at the same four filter widths used in Fig. 12a. This plot quantitatively and qualitatively reveals that as \(\Delta\) increases, greater correlation values between the SGS scalar flux, \(q^R_\parallel (\mathbf {x})\), and the filtered scalar gradient, \(\widetilde{G}_\parallel (\mathbf {x}+\varvec{r})\), are observed at a fixed r. These spatial correlations are significant in both the dissipation and inertial subranges.

This confirms substantial nonlocal effects in the true SGS dynamics, which need to be carefully addressed in the SGS modeling for LES. A popular and fairly simple approach for modeling the SGS scalar flux is eddy diffusivity modeling (EDM). In EDM, the main assumption is that the SGS scalar flux is proportional to the resolved scale scalar gradient (i.e., the conventional locality assumption) as

$$\begin{aligned} \varvec{q}^R(\mathbf {x}) \approx -\mathcal {D}_{\text {ED}} \, \widetilde{\varvec{G}}(\mathbf {x}), \end{aligned}$$
(51)

and \(\mathcal {D}_{\text {ED}}\) is the proportionality coefficient. Obviously, EDM is a local modeling approach by construction. Computing \(\mathcal {C}(q^R_\parallel \, , \, \widetilde{G}_\parallel )\) while \(q^R_\parallel\) is approximated with EDM, one can compare it with its true value as shown in Fig. 12b. Figure 13 illustrates such a comparison for two filter widths, \(\Delta /\eta =8, \, 53\), and it reveals that in both cases local EDM substantially fails to predict the conspicuous long-range spatial correlations observed in the true two-point correlation values. This evidence strongly suggests the adoption of more sophisticated, nonlocal mathematical modeling tools that go beyond conventional SGS modeling.
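As a brief worked step, assume that \(\mathcal {D}_{\text {ED}}\) is spatially uniform (an assumption made here for illustration; it need not hold for dynamically computed coefficients). Substituting Eq. (51) into Eq. (50) then gives, with the label \(\mathcal {C}_{\text {ED}}\) introduced only for this remark,

$$\begin{aligned} \mathcal {C}_{\text {ED}}(q^R_\parallel \, , \, \widetilde{G}_\parallel ) = \frac{\left\langle -\mathcal {D}_{\text {ED}} \, \widetilde{G}_\parallel (\mathbf {x}) \, \widetilde{G}_\parallel (\mathbf {x}+\varvec{r}) \right\rangle }{\left\langle -\mathcal {D}_{\text {ED}} \, \widetilde{G}_\parallel (\mathbf {x}) \, \widetilde{G}_\parallel (\mathbf {x}) \right\rangle } = \frac{\left\langle \widetilde{G}_\parallel (\mathbf {x}) \, \widetilde{G}_\parallel (\mathbf {x}+\varvec{r}) \right\rangle }{\left\langle \widetilde{G}_\parallel (\mathbf {x}) \, \widetilde{G}_\parallel (\mathbf {x}) \right\rangle }. \end{aligned}$$

Under this assumption the EDM correlation is simply the normalized autocorrelation of the filtered scalar gradient, independent of the value of \(\mathcal {D}_{\text {ED}}\), so tuning the coefficient alone cannot change the predicted two-point correlation.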

Fig. 13

Comparison between the true values of two-point correlation function given in Eq. (50) and the ones obtained from the local eddy diffusivity modeling of the SGS scalar flux given in Eq. (51). The evaluations are performed at two filter widths of \(\Delta /\eta = 8, \, 53\). Source: [24]

4.2 State of the Art: Fractional Turbulence Modeling

In what follows, we present the history and state of the art in fractional turbulence modeling, including nonlocal RANS, fractional eddy viscosity modeling, fractional scalar turbulence LES modeling, in addition to tempered fractional LES SGS modeling for turbulence.

4.2.1 Nonlocal RANS Models: a Narrative Survey

Recall that most turbulence models are built on Boussinesq’s turbulent viscosity concept. Thus, one conventionally assumes that the turbulent stress tensors \(\tau _{ij} ^{R}\) are proportional to the symmetric part of the local mean velocity gradient at any point (i.e., the strain rate tensor). Hence, the corresponding proportionality coefficient, known as the turbulent viscosity, emerges as the unknown turbulence model parameter \(\mu _{T}\) in

$$\begin{aligned} \tau _{ij} ^{R} = \mu _{T} \left( \frac{\partial \bar{u}_i}{\partial x_j} + \frac{\partial \bar{u}_j}{\partial x_i} \right) , \end{aligned}$$

where \(\bar{u}_i\) represents the local mean velocity components. Prandtl [188] in 1942 aimed to move beyond this local constraint by introducing the extended mixing length concept for the first time. The corresponding new model was a significant step from locality toward nonlocality, but it was not notably successful because it did not significantly improve accuracy. Afterward, he parametrized the primitive model so that the mixing length was taken to be greater than the (differential) length scale of the problem, including a higher (second-order) Taylor expansion term. This strategy was analogous to adding a “weak sense of (short-range) nonlocality” to the model. It was regarded as a weak nonlocal model in the sense that the stress term was still of Boussinesq form and remained collinear with the strain rate tensor at the same point. However, von Kármán insisted on the common local mixing length, which is generated by the local flow conditions, and suggested expressing the mixing length in terms of two successive derivatives [185]. Bradshaw [189] in 1973 showed that Boussinesq’s hypothesis fails over curved surfaces and noted that the form of the stress-strain relations was the main cause of this failure. There were also important developments beyond Boussinesq-type modeling, mostly based on polynomial series, including the works of Lumley [190], Spencer and Rivlin [191, 192], and Pope [193]; however, a noticeable lack of accuracy, both physical and mathematical, emerged as additional second- (and higher) order tensor series developments were demanded, and the trade-off between predictability and practicality remained an open question.

As indicated in Sect. 2.1, Brownian motion can serve as a statistical model for the spread of a cloud of particles, the continuum limit of which is a parabolic integer-order diffusion equation. Generalizing this approach to heavy-tailed processes such as Lévy processes can model the intermittency in turbulent flow signals and, through a heavy-tailed central limit theorem, converges to an anomalous diffusion equation with fractional derivatives in space and/or time [4]. This suggests that fractional-based Reynolds stresses would be a proper model for the turbulent diffusion term. In a pioneering work in 1974, Hinze et al. [194] described the memory effect in a turbulent boundary layer flow. They utilized experimental data produced downstream of a hemispherical cap attached to the lower wall of a channel geometry. They demonstrated that when one computes the eddy viscosity from Boussinesq’s theory using the lateral gradient of the mean flow and the turbulent shear stresses, a strongly non-uniform distribution exists in the outer region of the boundary layer. Interestingly, a nonlocal expression for the gradient of the transported field appears in a novel approach by Kraichnan [195] in the same year (1974) for scalar quantity transport. Afterward, fractional-order models based on the RANS approach were offered in [196,197,198,199,200]. Most of these works use Green’s functions based on the residual velocity to provide the expression for the Reynolds stress or scalar fluxes.

One of the main contributions to the development of nonlocal models was made by Egolf and Hutter [185, 201]. They started from Lévy flight statistics and generalized the zero-equation local Reynolds shear stress expression to a nonlocal and fractional type. The method is based on the Kraichnanian convolution-integral approach and utilizes different weighting functions. Using these weighting functions, one can bridge the first-order gradient of the common eddy diffusivity models and the mean velocity difference term. The proposed model is based on four distinct steps that can be followed conveniently to replace a local operator with a nonlocal one. In effect, the final proposed model is a more general and extended version of Prandtl’s zero-equation mixing length and shear-layer turbulence models. It is called the Difference-Quotient Turbulence Model (DQTM), given by

$$\begin{aligned} \overline{u'_2 u'_1} = - \sigma \chi _2[\overline{u}_1 (x_1,x_2) - \overline{u}_{1,\text {min}} (x_1)] \frac{\overline{u}_{1,\text {max}} (x_1)- \overline{u}_1 (x_1,x_2)}{x_{2,\text {max}} - x_2} . \end{aligned}$$

Although well motivated and presented as a generalization of classical models, such models have not been thoroughly tested against established integer-order models, and their practical efficiency in addressing nonlocalities has not been adequately examined. Recently, a series of remarkable developments in fractional turbulence modeling gave a new and practical perspective on employing fractional calculus in turbulence and introduced new statistical measures that directly reveal where classical approaches have room for modernization and enhancement. In what follows we review such cutting-edge approaches.

4.2.2 Fractional Eddy Viscosity Models

Recently, Di Leoni et al. [202] developed a new nonlocal eddy viscosity-based model (see Eq. 52 below) that can be applied in both isotropic and anisotropic turbulent flows. They obtained a proper two-point stress-strain rate correlation structure for a priori testing the developed model and performed tests based on the high-resolution DNS data set for the homogeneous isotropic turbulence (HIT) and the channel flow canonical test cases.

The investigation of the model performance is set based on the necessary conditions for any LES approach in providing the accurate two-point statistics of the filtered quantities in the terms of correlations and spectra. The proposed model is given by:

$$\begin{aligned} \tau _{ij} ^{\alpha } = - \nu _{T} \big ( D_i^{\alpha } \bar{u}_j + D_j^{\alpha } \bar{u}_i \big ). \end{aligned}$$
(52)

where the derivative operators \(D_i^{\alpha }\) and \(D_j^{\alpha }\) are of order \(0<\alpha <1\) in the i and j directions, respectively. They are employed as truncated Caputo derivatives: each remains a convolution of the first derivative of the velocity with an inverse power-law kernel of index \(\alpha\), but over a truncated (compact) integral support. This forms a finite nonlocality horizon and lowers the computational cost.
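To make the idea of a finite nonlocality horizon concrete, the following is a minimal sketch of a left-sided Caputo derivative of order \(0<\alpha <1\) with truncated memory, discretized by the standard L1 scheme on a uniform grid. It is an illustration under these assumptions, not the discretization or the exact operator definition used in [202]; the function name `truncated_caputo_l1` and the parameter `horizon` (memory length in grid intervals) are hypothetical.

```python
import numpy as np
from math import gamma

def truncated_caputo_l1(f, h, alpha, horizon):
    """Left-sided Caputo derivative of order 0 < alpha < 1 via the L1 scheme.

    Only the last `horizon` grid intervals behind each point contribute,
    i.e. the convolution support is cut off at a distance horizon * h.
    """
    f = np.asarray(f, dtype=float)
    n = f.size
    j = np.arange(horizon)
    # L1 weights b_j = (j+1)^(1-alpha) - j^(1-alpha): discrete power-law kernel
    weights = (j + 1.0) ** (1.0 - alpha) - j ** (1.0 - alpha)
    df = np.diff(f)                      # df[m] = f[m+1] - f[m]
    out = np.zeros(n)
    for i in range(1, n):
        m = min(i, horizon)              # intervals available behind x_i
        # sum_{j=0}^{m-1} b_j * (f[i-j] - f[i-j-1])
        out[i] = np.dot(weights[:m], df[i - 1::-1][:m])
    return out * h ** (-alpha) / gamma(2.0 - alpha)
```

For a linear profile the scheme is exact, so once a point lies farther than the horizon from the start of the data the result no longer depends on position, which is a simple way to see the effect of the truncation.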

Several numerical tests conducted in [202] indicated that the new model provides a better correlation between the filtered rate of strain and the subgrid-scale stress tensor. Specifically, this model predicts the long tails in the ground-truth subfilter stress-strain rate correlation functions. By contrast, conventional local eddy viscosity-based models such as the classical Smagorinsky model, which corresponds to \(\alpha = 1\), miss this important feature because their correlations decay faster (see Fig. 14).

Fig. 14

Two-point correlations between SGS stress and filtered rate of strain in different scenarios (\(\alpha =1\) corresponds to the local model) at the filter size \(\Delta =31 \eta\) (a), and \(\Delta =53 \eta\) (b). Source: [202]

In addition to its significantly better prediction of long-tail interactions, the new model yields probability density functions of the dissipation quantities for the HIT flow, obtained with box filtering, that match the ground truth much better than those of the local model. The local model predictions are purely dissipative and show no tail behavior, in contrast with the ground-truth DNS data sets. Moreover, the effects of different parameters in the LES procedure have been analyzed, including filter size, filter type, wall distance for the channel flow case, and integration radius.

Alternatively, in studying turbulent transport and mixing, kinetic Boltzmann theory has proven to be a rich and promising ground based upon the principles of statistical mechanics, which by construction is well suited to the stochastic description of turbulence at the microscopic level [203]. In the following, the fundamental sources of nonlocal closure and the SGS modeling for the residual passive scalar flux are studied within the kinetic Boltzmann transport framework. Our objective is to derive a nonlocal eddy diffusivity SGS model at the continuum level. In what follows we present three recent developments in LES SGS modeling in which the small SGS motions are modeled by BGK kinetic transport.

4.2.3 Fractional LES SGS Modeling for Scalar Turbulence

The statistical description of LES is well represented by incorporating a filtering procedure into the kinetic Boltzmann transport. For the purpose of passive scalar transport, applying a spatially and temporally invariant filtering kernel, \(\varvec{\mathcal {G}} = \varvec{\mathcal {G}}(\varvec{r})\), to the distribution function \(g(t,\mathbf {x},\varvec{u})\) linearly decomposes it into the filtered, \(\widetilde{g}=\varvec{\mathcal {G}} {*} g\), and the residual, \(g^{\prime }=g-\widetilde{g}\), components. Therefore, filtering the BGK equation results in the following filtered BTE (FBTE) for the passive scalar:

$$\begin{aligned} \frac{\partial \widetilde{g}}{\partial t} + \varvec{u}\cdot \nabla \, \widetilde{g} = -\frac{\widetilde{g}-\widetilde{g^{\text {eq}}(\mathcal {B})}}{\tau _g}. \end{aligned}$$
(53)

where \(\mathcal {B}\) represents the generic Boltzmann filter size. As elaborated by Girimaji [204], the nonlinear nature of the collision operator, \(C_{\mathrm {BGK}}(g)\), prevents the filtering kernel from commuting with it; this introduces a source of closure at the kinetic level in the FBTE (Eq. 53). Defining \(\widetilde{\mathcal {B}}:=(\varvec{u}-\widetilde{\varvec{V}})^2/c_T^2\), this closure problem is manifested in the following inequality,

$$\begin{aligned} \widetilde{g^{\text {eq}}(\mathcal {B})} = \frac{\widetilde{\Phi \, \exp (-\mathcal {B}/2)}}{(2\pi )^{3/2} \, c_T^3} \ne \frac{\widetilde{\Phi } \, \exp (-\widetilde{\mathcal {B}}/2)}{ (2\pi )^{3/2} \, c_T^3} = g^{\text {eq}}(\widetilde{\mathcal {B}}). \end{aligned}$$
(54)
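The inequality in Eq. (54) can be checked numerically with a small sketch. The setup below is an assumption made for illustration only: fields vary along one periodic direction, the velocities are treated as scalar components, a top-hat filter is used, and the names `box_filter`, `g_eq`, and `closure_residual` are hypothetical.

```python
import numpy as np

def box_filter(f, width):
    """Periodic top-hat filter spanning `width` grid points (odd width assumed)."""
    half = width // 2
    return sum(np.roll(f, s) for s in range(-half, half + 1)) / (2 * half + 1)

def g_eq(phi, b, c_t=1.0):
    """Maxwellian-type equilibrium, phi * exp(-b/2) / ((2 pi)^(3/2) c_T^3)."""
    return phi * np.exp(-b / 2.0) / ((2.0 * np.pi) ** 1.5 * c_t ** 3)

def closure_residual(phi, v, u, width, c_t=1.0):
    """filter(g_eq(B)) - g_eq(B_tilde) at a fixed particle velocity u."""
    b = (u - v) ** 2 / c_t ** 2
    filtered_geq = box_filter(g_eq(phi, b, c_t), width)
    # B_tilde is built from the filtered velocity, as in the text
    b_tilde = (u - box_filter(v, width)) ** 2 / c_t ** 2
    geq_of_filtered = g_eq(box_filter(phi, width), b_tilde, c_t)
    return filtered_geq - geq_of_filtered
```

For spatially uniform fields the residual vanishes, whereas any variation of the velocity or scalar field within the filter width produces a nonzero residual; this residual is the unclosed contribution that the subsequent modeling step targets.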

The identified closure requires proper means of modeling so that one can numerically solve the FBTE (Eq. 53). A common practice is to approximate this closure problem with a modified relaxation time approach that is described in detail in [205]. Despite the success of this approach in some applications, it is not physically consistent with the filtered turbulent transport dynamics [204]. Here, we address this inconsistency by examining the nonlocal effects that arise from filtering the Maxwell distribution function, \(g^{\text {eq}}(\mathcal {B})\), and modeling them with proper mathematical tools. Considering the spatial filtering kernel \(\varvec{\mathcal {G}}(\varvec{r})\) with the filter width \(\Delta\) and applying it to the Maxwell equilibrium distribution gives

$$\begin{aligned} \widetilde{g^{\text {eq}}(\mathcal {B})} = \varvec{\mathcal {G}} {*} g^{\text {eq}}\big (\mathcal {B}(t,\varvec{u},\mathbf {x})\big ) = \int _{R_f}^{} \varvec{\mathcal {G}}(\varvec{r}) \, g^{\text {eq}}\big (\mathcal {B}(t,\varvec{u},\mathbf {x}-\varvec{r})\big ) \, d\varvec{r}, \end{aligned}$$

where \(R_f=[-\Delta /2 \, , \Delta /2]^3\). Subsequently, by rewriting the right-hand side of the passive scalar FBTE (Eq. 53) into the following form

$$\begin{aligned} -\frac{1}{\tau _g} \left( \widetilde{g} - \widetilde{g^{\text {eq}}(\mathcal {B})} \right) = \underbrace{-\frac{1}{\tau _g} \left( \widetilde{g} - g^{\text {eq}}(\widetilde{\mathcal {B}}) \right) }_{\text {closed}} + \underbrace{\frac{1}{\tau _g} \left( \widetilde{g^{\text {eq}}(\mathcal {B})} - g^{\text {eq}}(\widetilde{\mathcal {B}}) \right) }_{\text {unclosed}}, \end{aligned}$$

the unclosed part is structurally multi-exponentially distributed and may be approximated by a power-law distribution model, as we propose:

$$\begin{aligned} \widetilde{g^{\text {eq}}(\mathcal {B})} - g^{\text {eq}}(\widetilde{\mathcal {B}}) \approx g^\alpha (\widetilde{\mathcal {B}}) = \frac{\widetilde{\Phi }}{c_T^3} \, F^{\alpha }(\widetilde{\mathcal {B}}), \end{aligned}$$

where \(F^{\alpha }(\widetilde{\mathcal {B}})\) denotes an \(\alpha\)-stable Lévy distribution that is mathematically designed based on heavy-tailed stochastic processes and replicates the power-law behavior [8, 206]. The corresponding macroscopic continuum variables associated with the filtered Eq. (47) are obtained in terms of the filtered distribution functions, \(\widetilde{f}\) and \(\widetilde{g}\), as

$$\begin{aligned} \begin{array}{l} \widetilde{\Phi } = \int _{\mathbb {R}^d} \widetilde{g}(t,\mathbf {x},\varvec{u}) \, d\varvec{u}, \\ \widetilde{V}_i = \frac{1}{\rho }\int _{\mathbb {R}^d} u_i \, \widetilde{f}(t,\mathbf {x},\varvec{u}) \, d\varvec{u}, \quad i=1,2,3. \end{array} \end{aligned}$$
(55)

According to the microscopic reversibility of the particles, which assumes that collisions occur elastically, the right-hand side of Eq. (53) vanishes when integrated over the velocity space [207]. Therefore, integrating Eq. (53) over \(\varvec{u}\) gives

$$\begin{aligned} \frac{\partial \widetilde{\Phi }}{\partial t} + \nabla \cdot \int _{\mathbb {R}^d} \varvec{u} \, \widetilde{g} \, d\varvec{u} = 0. \end{aligned}$$
(56)

Since we are working with spatial filtering kernels, \(\varvec{\mathcal {G}}=\varvec{\mathcal {G}}(\varvec{r})\),

$$\begin{aligned} \int _{\mathbb {R}^d} \varvec{u} \, \widetilde{g} \, d\varvec{u} = \int _{\mathbb {R}^d} (\varvec{u}-\widetilde{\varvec{V}}) \, \widetilde{g} \, d\varvec{u}+ \int _{\mathbb {R}^d} \widetilde{\varvec{V}} \, \widetilde{g}\, d\varvec{u}. \end{aligned}$$
(57)

By plugging Eq. (57) into Eq. (56), we obtain that

$$\begin{aligned} \frac{\partial \widetilde{\Phi }}{\partial t} + \nabla \cdot \left( \widetilde{\Phi } \, \widetilde{\varvec{V}}\right) = -\nabla \cdot \varvec{q}, \end{aligned}$$

where

$$\begin{aligned} q_i=\int _{\mathbb {R}^d} \left( u_i-\widetilde{V}_i\right) \, \widetilde{g} \, d\varvec{u}. \end{aligned}$$

The corresponding filtered passive scalar flux is obtained through a sequence of step-by-step derivations as

$$\begin{aligned} \widetilde{\varvec{q}} = -\mathcal {D} \, \nabla \widetilde{\Phi }, \end{aligned}$$

and the divergence of residual scalar flux is derived as the fractional Laplacian of the filtered total scalar concentration,

$$\begin{aligned} \nabla \cdot \varvec{q}^R = -\mathcal {D}_\alpha \, (-\Delta )^{\alpha } \, \widetilde{\Phi }, \quad \alpha \in (0,1], \end{aligned}$$

where \(\mathcal {D}_\alpha := \frac{C_\alpha (c_T \, \tau _g)^\alpha }{\tau _g} \, (\alpha +2) \, \Gamma (\alpha )\) is a model coefficient with the unit [\(L^\alpha /T\)]. The filtered AD equation for the total passive scalar concentration, developed from the filtered kinetic BTE with an \(\alpha\)-stable Lévy distribution model, yields a fractional-order SGS scalar flux model at the continuum level. The aforementioned filtered AD equation reads as

$$\begin{aligned} \frac{\partial \widetilde{\Phi }}{\partial t}+\frac{\partial }{\partial x_i}\left( \widetilde{\Phi } \, \widetilde{V}_i\right) = \mathcal {D} \, \Delta \widetilde{\Phi } +\mathcal {D}_{\alpha } (-\Delta )^{\alpha } \, \widetilde{\Phi }. \end{aligned}$$
(58)

Through a proper choice of the fractional Laplacian order \(\alpha\), the developed model works optimally in an LES setting. Applying the Reynolds decomposition and considering the passive scalar with an imposed uniform mean gradient, Eq. (58) fully recovers the transport equation for the filtered scalar fluctuations, \(\widetilde{\phi }\).
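For readers who wish to experiment with the operator \((-\Delta )^{\alpha }\) appearing in Eq. (58), the following minimal sketch evaluates it on a one-dimensional periodic domain through its Fourier multiplier \(\vert k\vert ^{2\alpha }\). The periodic, spectral setting and the function name `fractional_laplacian_periodic` are assumptions made here for illustration; the source does not prescribe a discretization.

```python
import numpy as np

def fractional_laplacian_periodic(u, length, alpha):
    """Apply (-Laplacian)^alpha to periodic samples u on [0, length).

    Uses the Fourier-multiplier definition: each mode exp(i k x) is
    scaled by |k|^(2 alpha); alpha = 1 gives the usual -d^2/dx^2.
    """
    n = u.size
    # Angular wavenumbers matching NumPy's FFT mode ordering
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=length / n)
    symbol = np.abs(k) ** (2.0 * alpha)
    return np.real(np.fft.ifft(symbol * np.fft.fft(u)))
```

As a usage example, applying the function to \(\sin (3x)\) on \([0, 2\pi )\) returns \(3^{2\alpha }\sin (3x)\), which shows how the order \(\alpha\) controls the weight given to each wavenumber.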

4.2.4 Nonlocal Spectral Transfer Model and Scaling Law for Scalar Turbulence

Recently, Akhavan-Safaei and Zayernouri [208] revisited the spectral transfer model for the turbulent intensity in passive scalar transport, and proposed a physically meaningful modification to the scaling of scalar variance cascade, given by

$$\begin{aligned} E_\phi (k) \sim \chi \, \varepsilon ^{-1/3} \, \mathbf{k} ^{-2/3} \, (\mathbf{k} ^2+\mathcal {C}_\alpha \mathbf{k} ^{2\alpha })^{-1/2}, \end{aligned}$$

in which \(\chi\) represents the rate of spectral flux function, \(\varepsilon\) denotes the dissipation rate of turbulent kinetic energy (TKE) and \(\mathbf{k}\) is the Fourier wave number. This generalizes the \(-5/3\) law, which is recovered for \(\alpha =1\) (or for \(\mathcal {C}_\alpha = 0\)), since the bracketed factor then scales as \(\mathbf{k} ^{-1}\). The comparison between the classic scaling law and this generalized model is depicted in Fig. 15. This work begins with redefining the corresponding length scale of the scalar transport, which is traditionally approximated only as \(1/\mathbf{k}\). While this way of thinking is quite consistent with the Brownian motion model at small scales (considering small jumps of finite variance), the authors argued, using several experimental studies, that scalar turbulence (i.e., both the scalar increments and the scalar fluctuations) does not obey the K41 local-isotropy hypotheses, and is instead anomalous and nonlocal. Hence, they modified the corresponding scalar variance length scale so that it additionally includes a new scale-free term that directly takes the corresponding self-similar large jumps into account. This new inclusive length scale, combined with Kolmogorov’s velocity scale obtained from TKE, defines a new scalar time scale that leads to a new nonlocal power-law scaling for the cascade of scalar variance. From the generalized spectral transfer model, the authors recovered a new fractional-order scalar transport model, which can be viewed as a re-derivation of their earlier fractional LES work, originally derived from the filtered Boltzmann transport equation in [24].

Fig. 15
figure 15

(Left) Turbulent scalar intensity \(E_\phi (k) \sim \chi \, \varepsilon ^{-1/3} \, \mathbf{k} ^{-5/3}\), versus (Right) the modified (generalized) scaling law \(E_\phi (k) \sim \chi \, \varepsilon ^{-1/3} \, \mathbf{k} ^{-2/3} \, (\mathbf{k} ^2+\mathcal {C}_\alpha \mathbf{k} ^{2\alpha })^{-1/2}\) in [208], obtained from the DNS data
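The asymptotic behavior of the generalized scaling law can be checked numerically. The Python sketch below estimates the local logarithmic slope \(d\ln E_\phi / d\ln \mathbf{k}\) of the expression exactly as written above. The function names and the values chosen for \(\chi\), \(\varepsilon\) and \(\mathcal {C}_\alpha\) are hypothetical, and the sketch is not the procedure used in [208].

```python
import math


def scalar_spectrum(k, alpha, c_alpha, chi=1.0, eps=1.0):
    """Generalized scalar-variance spectrum, up to an O(1) constant."""
    bracket = k ** 2 + c_alpha * k ** (2.0 * alpha)
    return chi * eps ** (-1.0 / 3.0) * k ** (-2.0 / 3.0) * bracket ** (-0.5)


def local_slope(k, alpha, c_alpha, rel_step=1e-4):
    """Finite-difference estimate of d ln E / d ln k at wave number k."""
    k_hi = k * (1.0 + rel_step)
    k_lo = k * (1.0 - rel_step)
    d_log_e = math.log(scalar_spectrum(k_hi, alpha, c_alpha)) - math.log(
        scalar_spectrum(k_lo, alpha, c_alpha)
    )
    return d_log_e / (math.log(k_hi) - math.log(k_lo))


# Slopes in the two asymptotic regimes for an illustrative alpha = 0.5.
slope_small_k = local_slope(1e-6, alpha=0.5, c_alpha=1.0)
slope_large_k = local_slope(1e6, alpha=0.5, c_alpha=1.0)
```

For \(\alpha = 1\) the slope equals \(-5/3\) at every wave number. For \(\alpha < 1\) and \(\mathcal {C}_\alpha > 0\), the formula as written gives a slope of \(-(2/3+\alpha )\) where the \(\mathbf{k} ^{2\alpha }\) term dominates (small \(\mathbf{k}\)) and returns to \(-5/3\) where the \(\mathbf{k} ^{2}\) term dominates (large \(\mathbf{k}\)).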

4.2.5 Fractional/Tempered Fractional LES Models for Fluid Turbulence

For pedagogical purposes, we first presented the case of fractional LES SGS modeling for scalar turbulence. However, this new paradigm in LES modeling actually began prior to [24]. Samiee et al. [25] developed the first fractional LES model for homogeneous isotropic turbulent flows as

$$\begin{aligned} (\nabla \cdot \tau ^{R}) = \mu _{\alpha } (- \Delta )^{(\alpha )} \bar{V}, \end{aligned}$$

which is based on deriving the fractional Laplacian closure term in the spatially filtered Navier-Stokes equations when a Lévy stable distribution is employed as the equilibrium model in the filtered BGK kinetic transport equation. In [209], they later developed a generalized version of this earlier model (suitable for incorporating data with tailored/truncated tails). Employing instead a tempered Lévy stable distribution at the kinetic level gave rise to the formulation of the tempered fractional LES closure term as

$$\begin{aligned} (\nabla \cdot \tau ^{R}) = \nu _{\alpha } \sum _{k=0}^{\kappa } \phi _k ^{\kappa } ( \Delta + \lambda _k)^{(\alpha )} \bar{V}, \end{aligned}$$

forming a novel, data-friendly and expressive tempered fractional Laplacian SGS model for turbulence. They also showed that the newly developed nonlocal models can better recover the non-Gaussian statistics of subgrid-scale stress motions while they are being employed at the continuum level.

4.2.6 Dynamic Nonlocal LES Modeling

The recent developments in [24, 25, 209] offer a novel LES modeling paradigm for modeling the stochastic SGS motions. By employing fractional and tempered fractional Laplacian operators as additional linear terms to the filtered Navier-Stokes equations, this paradigm directly takes the superdiffusive nature of turbulence into account. However, all of the aforementioned models feature a static order of the fractional derivative that does not vary in time. To dynamically calculate the corresponding optimal fractional indices and tempering parameters throughout the flow simulation, dynamic nonlocal LES models have recently been formulated in [210, 211] along with the corresponding a priori and a posteriori studies. Such automated dynamic nonlocal LES modeling has been performed in the context of both flow and scalar turbulence, and has significantly enhanced the capability of the SGS LES models in prediction of turbulence TKE back-scattering and in properly addressing the notion of intermittency in the SGS dynamics. These features subsequently lead to sharp peaks and heavy tails of distributions in the small-scale motion statistics, e.g., in the dissipation of TKE and turbulent scalar variance.

4.3 Future Directions in Fractional Turbulence Modeling

Laval et al. [212] analyzed the effects of the local and nonlocal interactions on the intermittency corrections in the scaling properties of three-dimensional turbulence. They observed that nonlocal interactions are responsible for the creation of intense vortices, whereas local interactions tend to dissipate them. Inspired by these observations, they proposed a new turbulence model that accounts for both the local and nonlocal interactions for the study of intermittency. In their proposed model, the large and small scales are coupled by nonlocal interactions using a multiplicative process and additive noise, along with a turbulent viscosity model for the local interactions. The results of the new model qualitatively cover the previously observed anomaly and intermittency aspects.

In the context of nonlocal turbulence modeling, Song and Karniadakis [213] proposed a variable-order fractional model for wall-bounded turbulent (mean) flows. They represented the Reynolds stresses with a nonlocal fractional derivative of variable-order that decays with the distance from the wall. Interestingly, they found that this variable fractional order has a universal form for all Re and for three different flow types, i.e., channel flow, Couette flow, and pipe flow. In addition to the aforementioned fully developed flows, they modeled turbulent boundary layers and discussed how the streamwise variation affects the universal curve (see also [214] for a follow-up work). Later, Pang et al. [215] proposed a nonlocal truncated operator with spatially variable order, which is suitable for modeling wall-bounded turbulence, e.g., turbulent Couette flow. They showed that nonlocal physics-informed neural networks (nPINNs) can jointly infer the variable order, exhibiting a universal behavior with respect to Re, a finding that can contribute to better understanding of nonlocal interactions in wall-bounded turbulence. In terms of memory effects (i.e., nonlocality in time), Parish and Duraisamy in [216] developed a dynamic SGS model for LES, based on the Mori–Zwanzig (MZ) formalism. This closure model was constructed by exploiting similarities between two levels of coarse-graining via the Germano identity of fluid mechanics and by assuming that memory effects have a finite-temporal support. This work suggests future studies on using time-fractional derivatives in turbulence models.

The aforementioned developments are practically interesting, mathematically exciting, and algorithmically robust. They encourage the turbulence research community to gradually open its arms to a wealth of recent mathematical developments in both the theory and practice of fractional modeling. Inevitably, further systematic studies and developments of nonlocal turbulence models are needed (both numerically and experimentally) in order to achieve the desired blend of enhanced accuracy and lowered cost in realistic applications. On this note, we end this section by emphasizing that the non-Markovian/non-Fickian nature does not relax in compressible flows (i.e., variable density problems). Therefore, the idea of developing generalized fractional turbulence models for transonic-to-hypersonic flows is a new and promising avenue for research, in which the existing sense of classical thermodynamics can become fundamentally non-equilibrium and nonlocal.

5 Fractional Constitutive Laws in Material Science

Accurate modeling of evolving material response and failure across multiple time and length scales is essential for life cycle prediction and design of new materials. While the mechanical behavior of a number of standard engineering materials (e.g., metals, polymers, rubbers) is quite well understood, a significant modeling effort still needs to be conducted for complex materials, where microstructure heterogeneities, randomness and small-scale physical mechanisms (such as collective behavior) lead to non-standard and, at times, counter-intuitive responses. Two examples are bio-tissues and natural materials (e.g., biopolymers), which are multi-functional products of millions of years of evolution, locally optimized for their hosts and environment, and constrained by a limited set of building blocks and available resources [28, 29]. These materials possess unprecedented properties at low densities, especially due to their hierarchical and multi-scale structure, leading to a wide spectrum of behaviors, such as power-law viscoelasticity, visco-plastic strains under hysteresis loading, damage, failure, fatigue, fractal avalanche ruptures and self-healing mechanisms.

The main motivation for fractional materials modeling is the power-law fingerprint arising in microstructures undergoing anomalous diffusion, observed in a range of complex materials. Such microstructures often display a fractal nature with subdiffusive dynamics, e.g., of entangled polymer chains, and defect interactions such as dislocation avalanches, cracks and voids. Such non-exponential behavior cannot be accurately modeled by integer-order, linear viscoelastic models, which require arbitrary arrangements of Hookean/Newtonian elements and introduce a limited number of exponential (Debye) relaxation modes that, at most, represent a truncated power-law approximation [217]. While these approximations may be satisfactory for short times and engineering precision, they often result in high-dimensional parameter spaces and still lack predictability outside the experimental time/length scales, often requiring recalibration. In this context, fractional operators become appropriate and natural modeling choices, since their integro-differential operators naturally utilize power-law convolution kernels, coding self-similar microstructural features in a reduced-order mathematical language with smaller parameter spaces (similarly to the case of anomalous transport, see Sect. 3). This fact allows accurate and predictive modeling, in an efficient manner, of bio-tissues [218,219,220,221,222,223,224] and polymers [3, 225,226,227] for multiple time scales.

In this section we review fractional models for materials undergoing power-law behaviors, termed anomalous materials, in a range of non-equilibrium and path-dependent responses. We start with linear viscoelasticity, introducing the basic modeling building block, known as the Scott-Blair element, which models a single power-law response and can be combined to incorporate more complex behaviors. In harmony with the previous sections, we will emphasize potential multi-scale connections, stochastic processes, and thermodynamic consistency. After providing evidence of cases where fractional behavior/power-laws appear as intrinsic qualities in a number of systems, we report on the state-of-the-art models incorporating multiple physical mechanisms.

Fractional Viscoelasticity: Rheological Building Blocks

We start with the Boltzmann superposition integral for linear viscoelasticity, obtained from the linear superposition of infinitesimal step strains \(\delta \varepsilon (t)\) applied to a viscoelastic material [228]:

$$\begin{aligned} \sigma (t) = \int ^t_{-\infty } G(t-\tau ) \dot{\varepsilon }(\tau )\,d\tau , \end{aligned}$$
(59)

where \(\dot{\varepsilon }\) and \(\sigma (t)\) denote, respectively, the strain rate and stress. The convolution kernel \(G(t)\) is a relaxation function, directly related to stress relaxation experiments under step strains. It is traditionally modeled through combinations of Hookean springs and Newtonian dashpots, yielding a multi-exponential relaxation in the form \(G(t) = \sum ^N_{i=1} C_i \exp (-t/\tau _i)\). For this particular choice of kernel, Eq. (59) is equivalent to a multi-term ordinary differential equation (ODE).

Relaxation experiments across multiple time- and frequency-scales indicate that anomalous materials exhibit memory effects in time for stress/strain responses, which translates into a single power-law scaling in the form \(G(t) \propto t^{-\alpha }\), with \(\alpha \in (0,1)\). This indicates that, contrary to exponential relaxation forms, there is a spectrum of relaxation times arising from the material microstructure [6], for which standard ODE models (e.g., generalized Maxwell model in creep/relaxation representations) would require a large number of parameters.

The fundamental fractional rheological building block element, termed Scott-Blair (SB) model, is obtained by substituting the power-law kernel \(G(t) = Et^{-\alpha }/\Gamma (1-\alpha )\) into Eq. (59), leading to the following form:

$$\begin{aligned} \sigma (t) = E\,{^{\;\text {C}}_{-\infty }{\mathbb {D}}^\alpha _t} \varepsilon (t) = \frac{E}{\Gamma (1-\alpha )}\int ^t_{-\infty } (t-\tau )^{-\alpha } \dot{\varepsilon }(\tau )\,d\tau , \end{aligned}$$
(60)

which is equivalent to the Riemann-Liouville fractional derivative \(^{\;\text {RL}}_{-\infty }{\mathbb {D}}^\alpha _t \varepsilon (t)\) if the function \(\varepsilon (t)\) is sufficiently well behaved at \(t \rightarrow -\infty\) [53]. While this equivalence is satisfied for semi-infinite domains, the choice between the Riemann-Liouville and Caputo definitions matters when we introduce a causal strain history and switch the lower limit of Eq. (60) from \(-\infty\) to 0, which leads to two different fractional Cauchy problems. For the Caputo definition, we have [228]:

$$\begin{aligned} \sigma (t) = E\,{^{\text {C}}_{0}{\mathbb {D}}^\alpha _t} \varepsilon (t),\quad t > 0, \quad 0< \alpha < 1,\quad \varepsilon (0) = \varepsilon _0. \end{aligned}$$
(61)

On the other hand, when employing Riemann-Liouville derivatives, we obtain:

$$\begin{aligned} \sigma (t) = E\,{^{\text {RL}}_{\,\,\,\,0}{\mathbb {D}}^\alpha _t} \varepsilon (t), \quad t > 0, \quad 0< \alpha < 1,\quad ^{\text {RL}}_{\,\,\,\,0}{\mathbb {D}}^{\alpha -1}_t \varepsilon (t)\big |_{t=0} = \varepsilon _0, \end{aligned}$$
(62)

where we remark that problem Eq. (61) is more commonly adopted due to the appearance of integer-order ICs, while both aforementioned problems are equivalent in the presence of homogeneous ICs. The SB element provides a constitutive interpolation between a Hookean spring (\(\alpha \rightarrow 0\)) and a Newtonian dashpot (\(\alpha \rightarrow 1\)). The unique parameter pair \((E\,[Pa.s^\alpha ],\alpha )\) codes snapshots of a dynamic process instead of an equilibrium state of the system [6]. Consequently, these properties are only associated with equilibrium states in the limit cases for the fractional order \(\alpha\). We remark that although the FDE (Eq. 61) utilizing the Caputo definition is widely employed to represent the SB element in the literature, the pioneering works on anomalous rheology modeling are attributed to Gerasimov [229], who in 1948 introduced a power-law convolution operator similar to Eq. (60), which may be referred to in the literature as the Gerasimov-Caputo derivative [230]. We refer the reader to [230, 231] for more details on the historical context of fractional derivatives in viscoelasticity.
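As an illustration of how Eq. (61) can be evaluated numerically, the following Python sketch discretizes the Caputo derivative with the standard L1 scheme, which treats the strain as piecewise linear on a uniform time grid. The function name `scott_blair_stress_l1` and the uniform-grid setting are assumptions made for this example, not a scheme prescribed in the cited works.

```python
import math


def scott_blair_stress_l1(strain, dt, E, alpha):
    """Stress history sigma = E * (Caputo D^alpha strain) via the L1 scheme.

    strain[n] is the strain at t_n = n * dt, with 0 < alpha < 1. The strain
    is interpolated linearly between grid points, so each increment is
    weighted by the exact integral of the power-law kernel over its interval.
    """
    scale = E * dt ** (-alpha) / math.gamma(2.0 - alpha)
    stress = [0.0]  # the Caputo integral over an empty history vanishes
    for n in range(1, len(strain)):
        acc = 0.0
        for j in range(n):
            # Kernel weight of the increment on [t_j, t_{j+1}] seen from t_n.
            weight = (n - j) ** (1.0 - alpha) - (n - j - 1) ** (1.0 - alpha)
            acc += weight * (strain[j + 1] - strain[j])
        stress.append(scale * acc)
    return stress


# Example: constant strain-rate ramp with illustrative parameter values.
time_step = 0.01
ramp = [0.5 * n * time_step for n in range(201)]
ramp_stress = scott_blair_stress_l1(ramp, time_step, E=2.0, alpha=0.3)
```

For a strain ramp \(\varepsilon (t) = \dot{\varepsilon }_0 t\), Eq. (61) gives \(\sigma (t) = E \dot{\varepsilon }_0 t^{1-\alpha }/\Gamma (2-\alpha )\), which the L1 scheme reproduces up to round-off because the ramp is itself piecewise linear. The quadratic cost in the number of time steps reflects the full memory of the power-law kernel.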

Mechanistic and Thermodynamic Interpretations

Apart from the Boltzmann integral representation (Eq. 59), characterized by an integro-differential nature, the SB element can also be obtained through a continuous arrangement of canonical Hookean and Newtonian elements, at both the constitutive and free-energy levels [232, 233], which makes it more evident that SB elements intrinsically incorporate an infinite number of relaxation times. In [233], a hierarchical ladder-like structure of standard Maxwell viscoelastic elements was employed. This structure led to a coupled system of ODEs, which had an infinite continued fraction (a recursion of fractions) representation in terms of the Maxwell model constants in the Laplace domain. Then, applying an inverse Laplace transform, a fractional stress-strain relationship was recovered for homogeneous initial conditions, therefore equivalent to both forms Eqs. (61) and (62). In [232], an isothermal Helmholtz free-energy density was derived for the SB element from the elastic energies of a discrete-to-continuum arrangement of standard Maxwell branches, obtaining the following form for the free-energy \(\psi\) as a function of the strain:

$$\begin{aligned} \psi (\varepsilon ) = \frac{1}{2} \int ^\infty _0 h(z)\left[ \int ^t_0 \exp (-\frac{t-s}{z})\dot{\varepsilon }(s) ds\right] ^2dz, \,\, h(z) = \frac{E z^{-1-\alpha }}{\Gamma (\alpha )\Gamma (1-\alpha )}, \end{aligned}$$
(63)

where \(h(z)\) denotes the relaxation spectrum. Therefore, Eq. (63) represents the amount of available elastic energy to perform work from the SB element in the time domain, which cannot be directly inferred from Eqs. (61) and (62). Naturally, the two limit cases for \(\alpha\) are \(\psi (\varepsilon ) \rightarrow E\varepsilon ^2/2\) when \(\alpha \rightarrow 0\), and \(\psi (\varepsilon ) \rightarrow 0\) when \(\alpha \rightarrow 1\). Furthermore, under suitable thermodynamic constraints, it is shown that the SB element is thermodynamically admissible and that the Caputo representation of Eq. (61) can be derived from Eq. (63) under continuum mechanics arguments.
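As a consistency check between Eq. (63) and the power-law kernel used in Eq. (60), one can verify that the relaxation spectrum \(h(z)\) reproduces \(G(t)\) when the latter is viewed as a continuous superposition of exponential modes. This short verification is added here for illustration and assumes the representation \(G(t) = \int ^\infty _0 h(z)\, e^{-t/z}\, dz\). With the change of variables \(u = 1/z\),

$$\begin{aligned} \int ^\infty _0 h(z)\, e^{-t/z}\, dz = \frac{E}{\Gamma (\alpha )\Gamma (1-\alpha )} \int ^\infty _0 u^{\alpha -1} e^{-t u}\, du = \frac{E\, t^{-\alpha }}{\Gamma (1-\alpha )} = G(t), \end{aligned}$$

where the last integral equals \(\Gamma (\alpha )\, t^{-\alpha }\) for \(t > 0\) and \(0< \alpha < 1\).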

Energy Decoupling in the Frequency Domain

Similar to the aforementioned representations, power-law structures also appear in viscoelastic dynamic properties and rheological experiments in the frequency domain [6], such as the complex shear modulus, defined as the ratio between the Fourier transform of stresses and strains:

$$\begin{aligned} G^*(\omega ) := \frac{\mathcal {F}[\sigma ](\omega )}{\mathcal {F}[\varepsilon ](\omega )} = G^\prime (\omega ) + i G^{\prime \prime }(\omega ), \end{aligned}$$
(64)

where \(\omega \,[s^{-1}]\) denotes the frequency. The term \(G^\prime\) is the storage modulus, and \(G^{\prime \prime }\) denotes the loss modulus, i.e., the stored and dissipated energy per cycle, respectively. Employing definition (64) into Eq. (62), the dynamic modulus of the Scott-Blair element is obtained [234]:

$$\begin{aligned} G^\prime (\omega ) = Re(G^*) = E\omega ^\alpha \cos \left( \frac{\alpha \pi }{2}\right) , \quad G^{\prime \prime }(\omega ) = Im(G^*) = E\omega ^\alpha \sin \left( \frac{\alpha \pi }{2}\right) , \end{aligned}$$

which provides a clear storage/loss decomposition, with the value of \(\alpha\) determining whether the material of interest is predominantly dissipative for certain frequency ranges.
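The storage/loss decomposition above can be reproduced in a few lines. The Python sketch below assumes the complex modulus of a single SB element in the form \(G^*(\omega ) = E\,(i\omega )^\alpha\), which is consistent with the expressions for \(G^\prime\) and \(G^{\prime \prime }\) given above. The function names are hypothetical.

```python
import math


def scott_blair_moduli(omega, E, alpha):
    """Return (storage, loss) moduli from G* = E * (i*omega)**alpha."""
    g_star = E * (1j * omega) ** alpha  # principal branch of the complex power
    return g_star.real, g_star.imag


def loss_tangent(omega, E, alpha):
    """Ratio G''/G' computed from the complex modulus."""
    storage, loss = scott_blair_moduli(omega, E, alpha)
    return loss / storage


def analytic_loss_tangent(alpha):
    """Closed form tan(alpha*pi/2) implied by the cosine/sine expressions."""
    return math.tan(0.5 * alpha * math.pi)


# Loss tangent over four decades of frequency for an illustrative alpha.
tangents = [loss_tangent(w, E=1.0, alpha=0.3) for w in (0.01, 0.1, 1.0, 10.0, 100.0)]
```

Under this form the loss tangent of a single SB element equals \(\tan (\alpha \pi /2)\) at every frequency, so the element is predominantly dissipative (\(G^{\prime \prime } > G^\prime\)) for \(\alpha > 1/2\) and predominantly elastic for \(\alpha < 1/2\). Frequency-dependent transitions require combinations of elements.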

Relationships to Material Microstructure and Stochastic Processes

Macroscopic power-law behaviors in complex materials originate from spatio-temporal anomalous subdiffusive processes in fractal microstructures [57]. We focus on the temporal case, in which the MSD of microstructural constituents follows a nonlinear scaling in the form \(\langle \Delta x^2 \rangle \propto t^\alpha\). Bagley and Torvik [235] provided a relationship between fractional-order operators and the complex shear modulus obtained from the Rouse theory of polymer dynamics. They started with the result of Rouse’s theory for the shear modulus, i.e.,

$$\begin{aligned} G^\prime (\omega ) = n k T \sum ^N_{p=1} \frac{\omega ^2 \tau ^2_p}{1+\omega ^2\tau ^2_p}, \quad G^{\prime \prime }(\omega ) = \omega \mu _s + n k T \sum ^N_{p=1} \frac{\omega \tau _p}{1+\omega ^2\tau ^2_p}, \end{aligned}$$

where n denotes the number of molecules per unit volume, N is the number of monomers in the polymer chain, T represents the absolute temperature, and k is Boltzmann’s constant. The term \(\tau _p\) denotes the relaxation times of the solution, which were approximated as \(\tau _p \approx \tau _1/p^2 = 6(\mu _0 - \mu _s)/(p^2\pi ^2nkT)\), which is valid when the number of submolecules N is large. The terms \(\mu _0\) and \(\mu _s\) denote, respectively, the steady-flow viscosities of the solution and solvent. They further worked on Rouse’s results, and by assuming the polymer chains and \(\omega \tau _1\) to be sufficiently large, obtained the following power-law form for the dynamic shear modulus:

$$\begin{aligned} G^*(\omega ) = i \omega \mu _s + \left[ \frac{3}{2} (\mu _0 - \mu _s) nkT\right] ^{1/2}(i\omega )^{1/2}. \end{aligned}$$

After applying the inverse Fourier transform, the above relationship leads to a Riemann-Liouville stress-strain representation with \(\alpha = 1/2\). Similar observations were also reported for \(\sigma (t)\) utilizing a Zimm model, where the inclusion of hydrodynamic interactions leads to a fractional order \(\alpha = 2/3\). Glöckle and Nonnenmacher [236] showed that fractional relaxation can be modeled by a special type of CTRW describing a trapping problem due to entanglements of polymer chains, which slow down the relaxation process. In their work, the random walkers, i.e., the particles, are considered as packages of free volume that allow conformational reorientations of chain segments, thus leading to relaxation. They obtained a waiting time distribution of such particles through a Fox-Wright representation in the form:

$$\begin{aligned} \chi (t) \sim \frac{{A}}{{\bar{\tau }}} \sum ^\infty _{k=0} \frac{(-1)^k}{\Gamma (-\beta k - \beta )} \left( \frac{\bar{\tau }}{t}\right) ^{\beta k + \beta + 1}, \end{aligned}$$

for which the leading term indicates that the CTRW waiting time corresponding to fractional relaxation exhibits a Lévy-type decay in the form \(\chi (t) \sim t^{-\beta -1}\).
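Returning to the Rouse-based result of Bagley and Torvik quoted earlier in this subsection, the emergence of the half-order power law can be checked numerically. The Python sketch below sums the polymer contribution of the Rouse moduli with \(\tau _p = \tau _1/p^2\) and compares it with the \((i\omega )^{1/2}\) term. Moduli are expressed in units of \(nkT\), the solvent term \(\omega \mu _s\) is omitted, and the prefactor \(\pi /(2\sqrt{2})\) follows from inserting \(\tau _1 = 6(\mu _0-\mu _s)/(\pi ^2 nkT)\) into the quoted coefficient. The function names and the nondimensionalization are choices made for this illustration.

```python
import math


def rouse_polymer_moduli(omega_tau1, n_modes):
    """Polymer part of the Rouse moduli in units of n*k*T.

    Uses tau_p = tau_1 / p**2, so that omega * tau_p = omega_tau1 / p**2.
    Returns (storage, loss) without the solvent contribution.
    """
    storage = 0.0
    loss = 0.0
    for p in range(1, n_modes + 1):
        x = omega_tau1 / p ** 2
        storage += x * x / (1.0 + x * x)
        loss += x / (1.0 + x * x)
    return storage, loss


def half_order_asymptote(omega_tau1):
    """Real (and imaginary) part of the (i*omega)**(1/2) term, units of n*k*T."""
    return math.pi / (2.0 * math.sqrt(2.0)) * math.sqrt(omega_tau1)


# Illustrative values: long chain and a frequency with omega*tau_1 >> 1.
rouse_storage, rouse_loss = rouse_polymer_moduli(1.0e4, 100000)
reference = half_order_asymptote(1.0e4)
```

For these illustrative values the summed storage and loss contributions both lie within about one percent of the half-order asymptote, and they are nearly equal to each other, as expected for \(\alpha = 1/2\).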

Connecting Dynamic Viscoelasticity Across Scales

A connection between power-laws propagating from micro- to macro-rheology was proposed in [237], with the use of a Generalized Stokes-Einstein Relation (GSER) for spheres undergoing generalized Langevin dynamics in a viscoelastic medium:

$$\begin{aligned} |G^*(\omega )|\approx \frac{k T}{\pi a \langle \Delta r^2(1/\omega )\rangle \Gamma [1+\alpha (\omega )]}, \quad \alpha (\omega ) \equiv \frac{d\, \mathrm {ln} \langle \Delta r^2(t)\rangle }{d\, \mathrm {ln}\,t}\bigg |_{t=1/\omega }, \end{aligned}$$
(65)

which is valid for spheres of radius a comparable to the length scale of the embedding medium. Here, the dynamic shear modulus \(G^*(\omega )\) is related to a velocity memory function from Langevin dynamics. Among a variety of representations for the GSER, Eq. (65) assumes a power-law structure of the MSD with exponent \(\alpha\), which approaches zero when the sphere is confined by elastic structures present in the complex fluid. Such power-law representation also reduces errors near the frequency extremes when employing Laplace and Fourier transforms.
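A minimal Python sketch of Eq. (65) is given below, in which the local logarithmic slope \(\alpha (\omega )\) is estimated by a finite difference of a user-supplied MSD function. The function name `gser_modulus`, the finite-difference step, and the power-law MSD used in the example are assumptions made for illustration only.

```python
import math

BOLTZMANN = 1.380649e-23  # Boltzmann constant in J/K


def gser_modulus(msd, omega, radius, temperature, rel_step=1e-4):
    """Evaluate |G*(omega)| and alpha(omega) from Eq. (65).

    msd is a callable returning the mean-squared displacement at lag time t.
    The slope d ln(MSD) / d ln(t) is estimated at t = 1/omega.
    """
    t = 1.0 / omega
    t_hi = t * (1.0 + rel_step)
    t_lo = t * (1.0 - rel_step)
    slope = (math.log(msd(t_hi)) - math.log(msd(t_lo))) / (
        math.log(t_hi) - math.log(t_lo)
    )
    denominator = math.pi * radius * msd(t) * math.gamma(1.0 + slope)
    return BOLTZMANN * temperature / denominator, slope


def example_msd(t):
    """Hypothetical subdiffusive MSD in m^2 with exponent 0.5."""
    return 1.0e-14 * t ** 0.5


modulus_10, slope_10 = gser_modulus(example_msd, 10.0, radius=0.5e-6, temperature=300.0)
modulus_40, slope_40 = gser_modulus(example_msd, 40.0, radius=0.5e-6, temperature=300.0)
```

For a pure power-law MSD with exponent \(\alpha\), the recovered modulus scales as \(\omega ^\alpha\), which matches the frequency scaling of the SB element. In the example, quadrupling the frequency doubles \(|G^*|\).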

Physical Interpretation of Fractional Orders

Despite existing connections between micro- and macro-rheological properties, the physical interpretation of the emerging fractional orders has been elusive. More recently, a connection between the fractional order and the fractal dimension of the material microstructure was made by Mashayekhi et al. [238], where the authors extended the Zimm theory of polymer dynamics to fractal media as a bridge between the meso- and macro-scales. They showed that the fractional order is a rate-dependent material property that is strongly correlated with the fractal and spectral dimensions in fractal media.

5.1 Evidence of Fractional Behavior

We provide a few examples of fractional/power-law behaviors in viscoelasticity and micro/macro-scale plasticity. We start with two examples in viscoelasticity of solid-like and fluid-like natures in which fractional modeling is more appropriate, both with better fits and a reduced number of model parameters.

Viscoelastic Rheology

Jaishankar and McKinley [6] calibrated classical and fractional Maxwell models to relaxation data spanning four orders of magnitude for highly anomalous butyl rubber from Blair et al. [239] (Fig. 16a), and observed that the three-parameter fractional Maxwell model provided an excellent fit to the experimental data, while a multi-exponential, integer-order Maxwell model required six parameters to provide a satisfactory fit. Moreover, using the calibrated fractional relaxation parameters they obtained an accurate prediction of the creep compliance for the same material, especially for long-time behavior. The second experiment from [6] concerns the dynamic properties of acacia gum, a commonly used food preservative. In this case, they compared a four-parameter fractional Maxwell model with a single mode (three-parameter) standard Maxwell model (Fig. 16b) and demonstrated that while the fractional Maxwell model captures a complex Cole-Cole behavior, its integer-order counterpart is unable to even estimate the qualitative response. We note that other factors, such as material heterogeneity, can introduce multiple power-law relaxation regimes.

Fig. 16
figure 16

Comparison between standard and fractional-order models. (a) Relaxation behavior of Butyl rubber using experimental data from Scott-Blair. (b) Cole-Cole plot (\(G^\prime\) versus \(G^{\prime \prime }\)) for the dynamic properties of acacia gum. Source: [6]

In [34] Stamenović measured the complex shear modulus \(G^*(\omega )\) of cultured human airway smooth muscle and observed two distinct power-law regimes separated by an intermediate plateau. Kapnistos et al. [240] found an unexpected tempered power-law relaxation response of entangled polystyrene ring polymers, compared to the usual relaxation plateau of linear chain polymers. Such behavior was interpreted through self-similar conformations of double-folded loops of ring polymers, instead of the reptation observed in linear chains.

Power-Law Plasticity

The creep behavior of human embryonic stem cells (ESCs) under differentiation was studied by Pajerowski et al. [241] through micro-aspiration experiments at different pressures. The cell nucleus demonstrated distinct visco-elasto-plastic power-law scalings, with \(\alpha = 0.2\) for the plastic regime, independent of the applied pressure. It is discussed that such a low power-law exponent arises due to the fractal arrangement of chromatin inside the cell nucleus (Fig. 17).

Fig. 17
figure 17

(a) Scale-free creep of ESCs nuclei under aspiration. Low applied stresses \(\Delta P\) yield a single power-law creep scaling. For large stresses, a plastic transition is observed at \(\tau _{plastic} \approx 8-10\, [s]\), with a creep exponent \(\alpha \approx 0.2\), independent of stress values. (b) Different stages of nucleus aspiration, showing a viscoelastic recovery (ii)–(iii), followed by irreversible plastic deformation (iv). Source: [241]

Studies on force-induced mechanical plasticity of mouse embryonic fibroblasts were performed by Bonakdar et al. [30]. It was found that the viscoelastic relaxation and the permanent deformations followed a stochastic, normally distributed, power-law scaling \(\beta (\omega )\), with values ranging from \(\beta \approx 0\) to \(\beta \approx 0.6\). The microstructural mechanism of plastic deformation in the cytoskeleton is due to the combination of permanent stretching and buckling of actin fibers.

As for evidence of power-laws in failure of crystalline materials, Richeton et al. [33] investigated the emergence of intermittency and dislocation avalanches in polycrystalline plasticity through acoustic emission experiments on ice under creep compression. Their findings demonstrate that, unlike the scale-free, close-to-critical dislocation dynamics of single crystals [242], the introduction of average grain sizes \(\langle d \rangle\) from the polycrystal microstructure led to a tempered power-law distribution of avalanche sizes. While the exponential tempering cutoff changes with \(\langle d \rangle\), the authors observed a constant power-law scaling for all samples.

Connections to Stochastic Processes

Although the subdiffusive MSD coefficient \(0< \alpha < 1\) is observed in a variety of studies for complex materials and fluids, there exist different interpretations of the underlying stochastic processes linked to the subdiffusive physics, e.g., crowding or caging effects in cells and polymers. Szymanski and Weiss [243] utilized fluorescence correlation spectroscopy (FCS) of proteins immersed in crowded dextran solutions and reported a distribution of MSD coefficients with average \(\langle \alpha \rangle \approx 0.82\), and compared this experimental finding with recovered distributions of MSD coefficients for simulated fractional Brownian motion (fBm), obstructed diffusion (OD), and CTRWs. Their findings indicated that the recovered distributions for fBm and OD matched the experiments, while the recovered distributions for CTRW-induced diffusion, with average \(\langle \alpha \rangle \approx 0.59\), did not agree well with the data due to ergodicity breaking. Weber et al. [244] studied the subdiffusion of bacterial chromosomal loci in viscoelastic cytoplasm and further concluded that fractional Langevin motion is more likely than CTRW and OD, due to the presence of ergodicity and a negative velocity autocorrelation function. Regarding polymers, Wong et al. [5] studied the thermal motion of colloidal tracer particles in entangled actin filament (F-actin) networks, under different concentrations and network mesh sizes. They observed a subdiffusive behavior when the tracer particle radii were comparable to the network mesh size, and suggested that such anomalous behavior happens due to intermittent caging, followed by sudden infrequent jumps, with a power-law distribution of caging times \(\tau _c\) in the form \(P(\tau _c) \sim \tau ^{-1.33}_c\).

5.2 State of the Art: Anomalous Materials Modeling

As observed in Sect. 5.1, experimental evidence suggests that complex material behavior may possess more than a single power-law scaling in the viscoelastic regime, particularly in multi-fractal structures, which are characteristic of cells [34] and biological tissues [245], due to their complex, hierarchical and heterogeneous microstructure. For such cases, a single SB element is not sufficient to capture the observed behavior, even if linear viscoelasticity holds. Furthermore, material nonlinearity due to large strains and additional physics such as plasticity, damage and failure require more advanced rheological models, which could have full or partial fractional nature. In this section we refer to a class of fractional models in the literature, classified by rheology type and nature of the corresponding FDEs. We acknowledge that rheology is a vast field with a large number of different types of material behavior, and here we limit our review to visco-elasto-plasticity, damage mechanics and failure.

5.2.1 Viscoelasticity

Linear Viscoelasticity

We start by introducing two natural extensions of the SB viscoelastic model through serial and parallel combinations. The first one is the fractional Kelvin-Voigt (FKV) model, which is given by a parallel combination of SB elements, and relates the stresses \(\sigma (t)\) and strains \(\varepsilon (t)\) in the following additive form [233]:

$$\begin{aligned} \sigma (t) = E_1\,^{\text {C}}_{0}{\mathbb {D}^{\alpha _1}_t} \varepsilon (t) + E_2\,^{\text {C}}_{0}{\mathbb {D}^{\alpha _2}_t} \varepsilon (t) ,\quad t > 0, \quad \varepsilon (0) = 0, \end{aligned}$$
(66)

where the fractional orders are such that \(0< \alpha _1, \alpha _2 < 1\), and \(E_1\,[Pa.s^{\alpha _1}]\), \(E_2\,[Pa.s^{\alpha _2}]\) are the associated pseudo-constants. The corresponding relaxation function also assumes additive form of two SB elements:

$$\begin{aligned} G^{\text {FKV}}(t) := \frac{E_1}{\Gamma (1-\alpha _1)} t^{-\alpha _1} + \frac{E_2}{\Gamma (1-\alpha _2)} t^{-\alpha _2}, \end{aligned}$$

where contrary to the scale-free relaxation behavior of a single SB element, since we assume \(\alpha _2 > \alpha _1\), the FKV model possesses two time scale-dependent power-law regimes, given by \(G^{\text {FKV}} \sim t^{-\alpha _2}\) as \(t\rightarrow 0\) and \(G^{\text {FKV}} \sim t^{-\alpha _1}\) as \(t\rightarrow \infty\), which characterizes a transition from faster to slower relaxation regimes. We note that this quality allows the FKV model to describe materials that reach an equilibrium behavior for large times when \(\alpha _1 \rightarrow 0\), which is intuitive from the mechanistic standpoint as one of the SB elements becomes a Hookean spring.
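The two power-law regimes of \(G^{\text {FKV}}\) can be checked numerically. The following is a minimal sketch, not taken from the cited works: the function names and the parameter values are hypothetical and chosen only for illustration. It estimates the local slope \(d \log G / d \log t\), which should approach \(-\alpha _2\) for small times and \(-\alpha _1\) for large times.

```python
import math

def g_fkv(t, E1, alpha1, E2, alpha2):
    """Relaxation function of the FKV model: sum of two SB power laws."""
    return (E1 / math.gamma(1.0 - alpha1) * t ** (-alpha1)
            + E2 / math.gamma(1.0 - alpha2) * t ** (-alpha2))

def local_slope(t, E1, alpha1, E2, alpha2, h=1e-3):
    """Central-difference estimate of d(log G)/d(log t) at time t."""
    g_plus = g_fkv(t * math.exp(h), E1, alpha1, E2, alpha2)
    g_minus = g_fkv(t * math.exp(-h), E1, alpha1, E2, alpha2)
    return (math.log(g_plus) - math.log(g_minus)) / (2.0 * h)

# Hypothetical pseudo-constants and orders, with alpha2 > alpha1.
params = dict(E1=1.0, alpha1=0.2, E2=1.0, alpha2=0.7)
for t in (1e-8, 1.0, 1e8):
    print(f"t = {t:.0e}, log-log slope = {local_slope(t, **params):+.3f}")
```

The crossover time between the two regimes depends on the ratio of the pseudo-constants, so the time range needed to observe both slopes will vary with the material.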

Through a serial combination of SB elements, we obtain the fractional Maxwell (FM) model [6], given by:

$$\begin{aligned} \sigma (t) + \frac{E_2}{E_1}\,^{\text {C}}_{0}{\mathbb {D}}^{\alpha _2 - \alpha _1}_t \sigma (t) = E_2 \,^{\text {C}}_{0}{\mathbb {D}}^{\alpha _2}_t \varepsilon (t) , \quad t>0, \end{aligned}$$
(67)

with \(0< \alpha _1< \alpha _2 < 1\), and two sets of homogeneous initial conditions, \(\varepsilon (0) = 0\) for strains and \(\sigma (0) = 0\) for stresses. We note that in the case of non-homogeneous initial conditions, there need to be compatibility conditions [228] between stresses and strains at \(t=0\). The corresponding relaxation function for this building block model assumes the more complex Miller-Ross form [6]:

$$\begin{aligned} G^{\text {FM}}(t) := E_1 t^{-\alpha _1} E_{\alpha _2-\alpha _1, 1-\alpha _1}\left( -\frac{E_1}{E_2} t^{\alpha _2 - \alpha _1}\right) , \end{aligned}$$

where \(E_{a,b}(z)\) denotes the two-parameter Mittag-Leffler function, defined as [228]:

$$\begin{aligned} E_{a,b}(z) = \sum ^\infty _{k=0} \frac{z^k}{\Gamma (a k + b)},\quad \text {Re}(a) > 0,\quad b \in \mathbb {C},\quad z \in \mathbb {C}. \end{aligned}$$
(68)

Interestingly, the presence of a Mittag-Leffler function in \(G^{\text {FM}}(t)\) produces a stretched exponential relaxation for smaller time scales and a power-law behavior for larger time scales. The asymptotic behaviors are given by \(G^{\text {FM}} \sim t^{-\alpha _1}\) as \(t\rightarrow 0\) and \(G^{\text {FM}} \sim t^{-\alpha _2}\) as \(t \rightarrow \infty\), indicating that, contrary to the FKV model, the FM model has a constitutive transition from slower-to-faster relaxation. We refer the reader to [32, 246] for a number of applications of the aforementioned models. Additionally, we note that both FKV and FM models are able to recover the SB element with a convenient set of pseudo-constants, or naturally reveal the necessity of standard rheological elements according to available data. Furthermore, we also mention more complex building block models that produce more flexible responses, including three to four fractional orders, such as the fractional Kelvin-Zener (FKZ), fractional Poynting-Thomson (FPT), and fractional Burgers (FB) models. We refer the reader to [32, 233] for more details on such models.
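For illustration, the series in Eq. (68) can be evaluated directly when \(|z|\) is moderate. The sketch below is an assumption-laden example rather than a production algorithm: it is restricted to real \(z\) and real \(a, b > 0\), truncates the series at a fixed number of terms, and loses accuracy through cancellation for large negative arguments, where specialized Mittag-Leffler algorithms should be preferred. The function names `mittag_leffler` and `g_fm` and all parameter values are hypothetical.

```python
import math

def mittag_leffler(a, b, z, n_terms=200):
    """Truncated series for E_{a,b}(z), real z, a > 0, b > 0.

    Terms are formed in log space to avoid overflow of z**k and Gamma.
    Only suitable for moderate |z| (roughly |z| of order one).
    """
    if z == 0.0:
        return 1.0 / math.gamma(b)
    total = 0.0
    for k in range(n_terms):
        log_mag = k * math.log(abs(z)) - math.lgamma(a * k + b)
        sign = -1.0 if (z < 0.0 and k % 2 == 1) else 1.0
        total += sign * math.exp(log_mag)
    return total

def g_fm(t, E1, alpha1, E2, alpha2):
    """Fractional Maxwell relaxation function in the Miller-Ross form."""
    z = -(E1 / E2) * t ** (alpha2 - alpha1)
    return E1 * t ** (-alpha1) * mittag_leffler(alpha2 - alpha1, 1.0 - alpha1, z)

# Hypothetical parameters; small t probes the t^(-alpha1) regime.
t = 1e-6
ratio = g_fm(t, 1.0, 0.2, 1.0, 0.7) / (t ** (-0.2) / math.gamma(0.8))
print(f"G_FM / SB short-time limit at t = {t:.0e}: {ratio:.4f}")
```

Since \(E_{a,b}(0) = 1/\Gamma (b)\), the printed ratio approaches one as \(t \rightarrow 0\), which is consistent with the short-time asymptotics stated above.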

Numerical Discretization

A well-known numerical scheme to discretize the time-fractional Caputo derivatives of order \(0< \alpha < 1\) in Eqs. (66) and (67) is the implicit L1-difference scheme by Lin and Xu [247]. Let points on a uniform time-grid be defined as \(t_n = n \Delta t\) with \(n=0,\,1,\,\dots ,\,N\) time steps of size \(\Delta t\). The discrete time-fractional Caputo derivative of a function u(t) evaluated at \(t = t_{n+1}\) is given by:

$$\begin{aligned} ^{\text {C}}_0{\mathbb {D}}_t^\alpha u(t) \Big |_{t=t_{n+1}} = \frac{1}{\Gamma (2-\alpha )} \sum _{j=0}^{n} d_j \frac{u_{n+1-j}-u_{n-j} }{\Delta t^\alpha } + r^{n+1}_{\Delta t}, \end{aligned}$$

where \(r^{n+1}_{\Delta t} \le C_u \Delta t^{2-\alpha }\) with the constant \(C_u\) only depending on u(t), and the convolution weights \(d_j := (j+1)^{1-\alpha } - j^{1-\alpha }, j=0,1,\dots ,n\). The above expression can be rewritten and approximated as:

$$\begin{aligned} ^{\text {C}}_0{\mathbb {D}}_t^\alpha u(t)\Big |_{t=t_{n+1}} \approx \frac{1}{\Delta t^\alpha \Gamma (2-\alpha )} \left[ u_{n+1} - u_n + \mathcal {H}^{\alpha }u \right] , \end{aligned}$$

where the so-called history term \(\mathcal {H}^{\alpha }u\) is given by:

$$\begin{aligned} \mathcal {H}^{\alpha }u = \sum _{j=1}^{n} d_j \left[ u_{n+1-j}-u_{n-j} \right] . \end{aligned}$$

We note that although the above discretization is simple and practical to implement, there exist many sophisticated numerical methods for fractional Cauchy equations that employ faster schemes, and also address non-smooth, nonlinear and stiff problems. We also emphasize that employing the kernel G(t) in the Boltzmann representation for the aforementioned models may be impractical, since one would need other specialized numerical methods that are model-dependent, and would require evaluations of Mittag-Leffler functions.
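The L1 formula above can be sketched in a few lines. This is a direct, unoptimized transcription with \(O(n)\) cost per evaluation, not the implementation of [247]; the function name `l1_caputo` and the test function are our own choices for illustration.

```python
import math

def l1_caputo(u, dt, alpha):
    """L1 approximation of the Caputo derivative of order 0 < alpha < 1.

    u holds samples u_0, ..., u_{n+1} on a uniform grid with step dt;
    the derivative is evaluated at the last grid point t_{n+1}.
    """
    if len(u) < 2:
        raise ValueError("need at least two samples")
    n = len(u) - 2
    # Convolution weights d_j = (j+1)^(1-alpha) - j^(1-alpha), with d_0 = 1.
    d = [(j + 1) ** (1.0 - alpha) - j ** (1.0 - alpha) for j in range(n + 1)]
    # History term: contributions of all increments before the current step.
    history = sum(d[j] * (u[n + 1 - j] - u[n - j]) for j in range(1, n + 1))
    return (u[n + 1] - u[n] + history) / (dt ** alpha * math.gamma(2.0 - alpha))

# Check against u(t) = t, whose Caputo derivative is t^(1-alpha)/Gamma(2-alpha).
alpha, dt, steps = 0.5, 0.01, 100
u = [k * dt for k in range(steps + 1)]
approx = l1_caputo(u, dt, alpha)
exact = (steps * dt) ** (1.0 - alpha) / math.gamma(2.0 - alpha)
print(f"L1: {approx:.6f}, exact: {exact:.6f}")
```

Because the L1 scheme is built on piecewise-linear interpolation of u(t), it reproduces the derivative of a linear function up to round-off; for smooth nonlinear functions the error follows the \(\Delta t^{2-\alpha }\) bound stated above.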

Nonlinear Viscoelasticity

Fractional linear viscoelastic models are suitable candidates to describe the anomalous dynamics of a number of materials undergoing small strains. However, under large strains, material nonlinearities induce stress/strain dependencies on the relaxation behavior. One alternative to incorporate such nonlinearity is through quasi-linear viscoelasticity (QLV) [248], which replaces G(t) by a multiplicative decomposition between a reduced relaxation function g(t) and an instantaneous, nonlinear elastic tangent response:

$$\begin{aligned} \sigma (t,\varepsilon ) = \int ^t_0 g(t-s)\frac{\partial \sigma ^e(\varepsilon )}{\partial \varepsilon } \dot{\varepsilon }\,ds, \end{aligned}$$

where \(\sigma ^e(\varepsilon )\) denotes the instantaneous elastic stress response and the reduced relaxation function is normalized such that \(g(0^+) = 1\). Fractional approaches to QLV were developed by Doehring et al. [249] for arterial valve cusp and by Craiem et al. [218] for arterial wall viscoelasticity. In the latter, a reduced, fractional Kelvin-Voigt-type relaxation function \(g(t) = C + Dt^{-\alpha }\) was employed, with pseudo-constant \(D\,[s^\alpha ]\), and nonlinear exponential form \(\sigma ^{e}(\varepsilon ) = A \left( e^{B \varepsilon } - 1\right)\), with constant \(A\,[Pa]\). Therefore, the fractional QLV formulation is able to capture not only the linear/nonlinear instantaneous stress response, due to the rearrangement and alignment of fibers with the load direction, but also the anomalous power-law relaxation of the fractal microstructure. We also mention nonlinear models that take into account the Mittag-Leffler-type relaxation dynamics, such as the fractional QLV model in [249] and the fractional K-BKZ model introduced by Jaishankar and McKinley [7].
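As a short worked step, the QLV integral with the power-law kernel \(g(t) = C + Dt^{-\alpha }\) can be rewritten in terms of a Caputo derivative. We assume here \(\varepsilon (0) = 0\), so that \(\sigma ^e(\varepsilon (0)) = 0\) for the exponential law above, and the standard Caputo definition \({}^{\text {C}}_{0}{\mathbb {D}}^{\alpha }_t u(t) = \frac{1}{\Gamma (1-\alpha )}\int ^t_0 (t-s)^{-\alpha }\dot{u}(s)\,ds\) for \(0< \alpha < 1\). Using \(\frac{\partial \sigma ^e}{\partial \varepsilon }\dot{\varepsilon } = \frac{d}{ds}\sigma ^e(\varepsilon (s))\), we obtain:

$$\begin{aligned} \sigma (t,\varepsilon ) = \int ^t_0 \left[ C + D (t-s)^{-\alpha }\right] \frac{d}{ds}\sigma ^e(\varepsilon (s))\,ds = C\,\sigma ^e(\varepsilon (t)) + D\,\Gamma (1-\alpha )\,^{\text {C}}_{0}{\mathbb {D}}^{\alpha }_t \sigma ^e(\varepsilon (t)). \end{aligned}$$

Under these assumptions, the fractional QLV stress is the sum of a nonlinear elastic term and an SB-type term acting on the elastic stress \(\sigma ^e\) rather than on the strain, which is how the nonlinearity and the power-law memory are combined.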

5.2.2 Visco-elasto-plasticity

Several works employed fractional calculus to account for the visco-plastic regimes of several classes of materials. We outline three of them: time-fractional, space-fractional and stress-fractional.

Time-fractional approaches focus on introducing memory effects into internal variables [250, 251], and consequently modeling power-laws in both viscoelastic and visco-plastic regimes. This is of interest for polymers, cells, and tissues. In this context, fractional visco-elasto-plastic models provide a constitutive interpolation between rate-independent plasticity and Perzyna’s visco-plasticity by introducing an SB model acting in the plastic regime [250], and utilize a rate-dependent yield function of the form

$$\begin{aligned} f(\sigma , q) := |\sigma | - \left[ \sigma ^Y + K \,^{\text {C}}_{0}{\mathbb {D}}_t^{\alpha _K} q(t) + H q(t)\right] , \quad 0< \alpha _K < 1, \end{aligned}$$

where \(\sigma ^Y\) and q denote, respectively, the yield stress and the accumulated plastic strain, with pseudo-constant \(K\,[Pa.s^{\alpha _K}]\) and Hookean constant H. The above form for the yield function was later proved to be thermodynamically consistent in a further extension of the model to account for continuum damage mechanics [252].

A three-dimensional space-fractional approach to elastoplasticity was also developed by Sumelka [253] to account for spatial nonlocalities. The model is based on rate-independent elastoplasticity, and nonlocal effects are accounted for through a fractional continuum mechanics approach, where the strains are defined by a space-fractional Riesz-Caputo derivative of displacements u(x) in the form

$$\begin{aligned} ^{\text {RC}}_{\;\;a}{\mathbb {D}}^\alpha _b u(x)= \frac{\Gamma (2-\alpha )}{2} \left( ^{\text {C}}_{a}{\mathbb {D}}^\alpha _x u(x) + (-1)^n \,^{\text {C}}_{x}{\mathbb {D}}^\alpha _b u(x)\right) , \end{aligned}$$
(69)

for left- and right-sided fractional Caputo derivatives [253] with \(n = \lceil \alpha \rceil\).

Finally, stress-fractional models for plasticity have found applicability in soil mechanics and geomaterials that follow non-associated plastic flow [254, 255], i.e., the yield surface expansion in the stress space does not follow the usual normality rule, and may be non-convex. The work by Sumelka [254] proposed a three-dimensional fractional visco-plastic model, where a fractional flow rule with order \(0< \alpha < 1\) in the stress domain naturally models non-associative plasticity. Interestingly, this model recovers the classical Perzyna visco-plasticity as \(\alpha \rightarrow 1\), and the effect of the fractional flow rule can be a compact descriptor of microstructure anisotropy. Recently, a similar stress-fractional model was developed [255], and successfully applied to soils under compression. We refer the reader to the work by Sun et al. [256] for a detailed review of the uses of fractional calculus in plasticity.

5.2.3 Damage Mechanics, Aging and Failure

There have also been recent efforts to include damage, aging and failure effects into fractional calculus frameworks. Existing formulations focus either on adding classical failure frameworks to existing fractional constitutive laws or on developing fractional failure mechanisms. Here, we mostly focus on the latter and start with the work by Caputo and Fabrizio [257], who developed a variable-order viscoelastic model in the form:

$$\begin{aligned} \sigma (x,t) = g(\alpha (x,t)) A(x) {}^C_{t_0}{\mathbb {D}}_t^{\alpha (x,t)} \varepsilon (x,t), \end{aligned}$$

where \(g(\alpha (x,t)) := (\alpha _C - \alpha (x,t))^2/4\) denotes a material degradation function with critical damage \(\alpha _C\), A(x) represents a space-dependent pseudo-property, and \(0< \alpha (x,t) < \alpha _{C}\) is the variable fractional order, also interpreted here as damage. The variable-order Caputo derivative is defined in Eq. (36). Interestingly, this mixed interpretation for \(\alpha (x,t)\) makes it a multi-physics descriptor for anomalous damage, viscosity, and material aging. The evolution of \(\alpha (x,t)\) is described by an integer-order phase-field equation, and the resulting model is proved to be thermodynamically admissible.

A key aspect of developing failure models is obtaining consistent forms of damage energy release rates, i.e., the compatible operator for the loss of elastic energy, which is a nontrivial task even for the simplest fractional constitutive law (Eq. 61). This has been achieved by employing the concept of fractional free-energy densities [232, 258, 259]. Alfano and Musto [258] developed a cohesive zone, damaged fractional viscoelastic Kelvin-Zener model, and studied the influence of integer and fractional damage energy release rates on damage evolution. In this case, integer-order energy loss considers Hookean-type rheology to compute the damage energy release rates, which may be justified when Hookean elements are present in the viscoelastic constitutive law, but is incompatible with fully fractional cases (an arrangement of Scott-Blair elements). The corresponding free-energy for the SB element is given by:

$$\begin{aligned} \psi ^{{\text {SB}}}(t) = \frac{E}{2\Gamma (1-\alpha )}\int ^t_0 \int ^t_0 \left( 2t-\tau _1 - \tau _2\right) ^{-\alpha }\dot{\varepsilon }(\tau _1)\dot{\varepsilon }(\tau _2)\,d\tau _1 d\tau _2, \end{aligned}$$
(70)

with \(0< \alpha < 1\), which clearly carries a power-law behavior over time. Among their findings, the authors obtained a rate-dependence of the fracture energy in terms of the fractional-order \(\alpha\), opening interesting directions towards failure of anomalous viscoelastic media such as polymers. In [252] this idea was extended to plasticity, and a fractional visco-elasto-plastic model with memory-dependent damage was developed, with isotropic damage evolution \(0 \le D(t) < 1\) given by Lemaitre’s approach [260]:

$$\begin{aligned} \dot{D}(t) = \frac{\dot{\gamma }(t)}{1-D(t)}\left( -\frac{Y^{ve}(t)}{S}\right) ^s, \end{aligned}$$
(71)

with material damage parameters \(s,\,S \in \mathbb {R}^+\), plastic slip rate \(\dot{\gamma }\) and damage energy release rate \(Y^{ve}(t) = -\psi ^{{\text {SB}}}(t)\). We note that although Eq. (71) is a nonlinear ODE, the memory is introduced through the power-law form of \(Y^{ve}\) in Eq. (70). In this formulation, the viscoelastic and visco-plastic fractional orders introduce a competition between rate-dependent hardening and damage-induced softening, which could open interesting directions for modeling localized hardening in failing anomalous media. Sumelka et al. [261] also developed the idea of memory-dependent damage for soft materials through a stress-driven time-fractional hyperelastic damage model, with evolution equation in the following fractional nonlinear Cauchy form:

$$\begin{aligned} {}^C_{t-l_t}{\mathbb {D}}_t^{\alpha } D(x,t) = \frac{1}{T^\alpha }\Phi {\left\langle \frac{I_D}{\tau _D} - 1 \right\rangle _M}, \end{aligned}$$
(72)

where \(\Phi\) represents an overstress function in terms of a stress intensity \(I_D\), threshold stress \(\tau _D\) for damage evolution, and a ramp function in Macaulay notation \(\langle \cdot \rangle _M\). The memory length is driven by a time scale \(l_t\), which was taken as a fraction of the total time T. This model was applied with an Ogden hyperelastic law to patient-specific three-dimensional abdominal aortic aneurysm (AA) for critical zone identification, with obtained fractional order \(\alpha = 0.75\).
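Returning to the SB free-energy in Eq. (70), its power-law behavior can be made explicit with a simple worked case. As an illustrative assumption, not taken from the cited works, consider a constant strain rate \(\dot{\varepsilon }(t) = c\) with \(\varepsilon (0) = 0\). With the substitutions \(a = t - \tau _1\) and \(b = t - \tau _2\), the double integral evaluates in closed form:

$$\begin{aligned} \int ^t_0 \int ^t_0 (a+b)^{-\alpha }\,da\,db = \frac{\left( 2^{2-\alpha } - 2\right) t^{2-\alpha }}{(1-\alpha )(2-\alpha )}, \qquad \psi ^{{\text {SB}}}(t) = \frac{E c^2 \left( 2^{1-\alpha } - 1\right) }{\Gamma (3-\alpha )}\, t^{2-\alpha }, \end{aligned}$$

where we used \(\Gamma (3-\alpha ) = (2-\alpha )(1-\alpha )\Gamma (1-\alpha )\). Formally, the limit \(\alpha \rightarrow 0\) gives the Hookean stored energy \(E\varepsilon ^2/2\) with \(\varepsilon = ct\), while \(\alpha \rightarrow 1\) gives \(\psi ^{{\text {SB}}} \rightarrow 0\), as expected for a purely dissipative Newtonian element. For this loading, the damage energy release rate entering Eq. (71) therefore grows as \(t^{2-\alpha }\).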

Additional work on variable-order models in the context of fractional damage, aging and failure includes the following contributions. In Beltempo et al. [262] a variable-order viscoelastic creep model was developed, where the evolution of the fractional order \(\alpha (t)\) dictates the process of concrete aging. The variable-order viscoelastic model developed by Meng et al. [263], which employed a piecewise constant order followed by two linearly decreasing functions for \(\alpha (t)\), successfully described the initial viscoelasticity, softening and hardening of amorphous glassy polymers under compression. Finally, variable-order operators also proved to be useful mathematical tools to determine the onset of fracture. Patnaik and Semperlotti [10] employed a variable fractional-order activation function for damage, where the sharp power-law activation threshold induced by the fractional operator was successfully employed to determine crack propagation and branching of brittle materials. We refer the reader to the recent review works on the use of variable-order [11] and distributed-order [264] fractional models in viscoelasticity and structural mechanics. In the distributed-order case, fractional derivatives are integrated with respect to a distribution of fractional orders within a certain range of values.

5.3 Future Directions in Modeling Anomalous Materials

Although there exists a large spectrum of fractional models in the context of materials science, solid mechanics and rheology, these models are mostly characterized by constant-order fractional operators, for which a significant number of fast time-integration schemes are available. Yet, there is still a need for efficient numerical methods for variable- and distributed-order operators. In fact, although fractional models lead to a compact physical description with a reduced number of material parameters, the computational cost is still high when calibrating the models with large experimental data sets. Furthermore, although there exists an increasing number of distributed-order operators in the context of viscoelasticity, structural mechanics, and anomalous diffusion [264], further validation against experimental data is needed.

We point out interesting research directions that could involve the use of variable- and distributed-order differential equations in the multi-scale modeling of materials. Recently, nano-scale simulation studies on trapping of nano-particles in hydrogel networks indicated a time-temperature dependency of the MSD in the evolution of anomalous diffusion regimes, where a subdiffusion regime has been found to be of transitional nature at intermediate time scales, with ballistic/normal diffusion dynamics for short/long-time scales [265]. This motivates the study of variable-order models in time to compactly describe the macroscopic rheological evolution of such polymer networks. Furthermore, the observation of distributions of power-law scaling parameters in micro-rheology creep experiments on cells [30, 266] indicates the presence of microstructure-induced randomness in the rheological response. In this sense, distributed-order models may arise as interesting approaches to naturally incorporate the stochastic parametric data into the differential operator [267].

6 Conclusion

In this work we reviewed fundamental concepts of anomalous transport processes and provided the mathematical and statistical background for understanding them. We then selected three applications where the use of fractional models has experienced dramatic growth and improvement. This set of applications was chosen at our discretion and is, by no means, complete. In fact, several other scientific and engineering fields are currently benefiting from fractional modeling (see, e.g., image processing, finance, machine learning algorithms and many others). However, based on the amount of literature, significance of the applications, and variety of fractional models for their descriptions, we believe that subsurface transport, turbulence, and anomalous materials allowed us to provide insights into the several uses and benefits of fractional modeling. Furthermore, these applications are still the subject of very active fractional research. Finally, given the recent advances in high-performance computing and machine learning, we believe it is now the best time to promote and increase the usability of fractional and nonlocal models for those applications that cannot be adequately described by the classical PDE models.