1 Introduction

The purpose of this paper is to present an overview of random differential equations (RDEs). The theoretical treatment of RDEs is not new; since the middle of the last century, there have been several contributions on the theory of RDEs [5, 6, 41, 87, 93,94,95,96, 99]. However, these equations seem to have remained in the shadow of stochastic differential equations (SDEs) of Itô type for many years [54, page 4]. Perhaps this is due to the fact that Itô calculus has given rise to a very fruitful area of research in pure mathematics [36, 89], together with high-impact applications in fields that attract substantial research resources, such as Physics and Finance [1, 77]. At the beginning of this century, though, the treatment of RDEs seems to have returned to the research agenda [54, 75, 84, 104]. This may be motivated by the explosion of investigations on stochastic numerical methods for Fluid Mechanics, especially those based on spectral expansions, which created the field of uncertainty quantification (UQ) [71, 92, 110]. In this survey, we present theory and methods for RDEs using classical and recent literature on the subject, placing special emphasis on topics for which review works are still lacking.

1.1 UQ

Given a physical system formulated as a mathematical model consisting of differential equations, simulations are needed to predict the response behavior. To achieve this, apart from having efficient integrators for the model, one should keep in mind the uncertainties associated with collected data, due to limitations of experiments, erratic calibrations, intrinsic variability of the physical phenomenon, incomplete knowledge of it, etc. Uncertainties in data give rise to uncertainties in the input parameters of the model, since these prescribe the constitutive laws of the system. Thus, it is more realistic to treat the input parameters as random variables (rather than averaged values), which makes the model response a stochastic process. Of course, a single simulation of the model for certain specific values of the parameters is not sufficient. Rather, the statistical content of the response (mean, variance, probability density function) is the main interest. The field of UQ, as its name suggests, is devoted to the investigation of models for which the uncertainty in data is non-negligible, and to quantifying the impact of such uncertainty on the parameters (inverse problem) and on the response (forward problem, or uncertainty propagation problem). In modeling applications, UQ is essential for validation (output vs. data), variability analysis (variance, probabilistic intervals, robustness of the prediction, controllability of the system), risk analysis (prediction of critical values, thresholds), and uncertainty management (impact of each parameter on the output, to set priorities). Standard texts about computational methods for UQ are [71, 92, 110]. The report [97] is also highly recommended as an entry point into the field of UQ.

1.2 Formulation

The general form of an RDE initial value problem is the following:

$$\begin{aligned} {\left\{ \begin{array}{ll} x'(t,\omega )=f(t,x(t,\omega ),\omega ),\;&{} t\in I,\,\omega \in \Omega , \\ x(t_0,\omega )=x_0(\omega ),\;&{}\omega \in \Omega . \end{array}\right. } \end{aligned}$$
(1.1)

Here, \(I\subseteq {\mathbb {R}}\) is an interval containing \(t_0\), and \(\Omega \) is the sample space of an underlying complete probability space \((\Omega ,{\mathcal {F}},{\mathbb {P}})\), where \({\mathcal {F}}\subseteq 2^\Omega \) is the \(\sigma \)-algebra of events and \({\mathbb {P}}\) is the probability measure. The outcomes (i.e. the elements of \(\Omega \)) are generically denoted by \(\omega \); these represent any random possibility, situation or circumstance. The term \(x(t,\omega )\) is a stochastic process from \(I\times \Omega \) to \({\mathbb {R}}^q\), \(q\ge 1\), where I does not depend on \(\omega \), and \(x'(t,\omega )\) stands for its derivative in some probabilistic sense. Indeed, there are different notions of limit in probability theory (almost sure, mean square), and each notion gives rise to a derivative and a calculus. For example, in the sample-path sense, the trajectories \(x(\cdot ,\omega )\) solve the deterministic problem, while in the mean-square sense, or more generally in the p-th sense, the limits are taken in the Lebesgue space \((\mathrm {L}^p(\Omega ),\Vert \cdot \Vert _p)\). Analogously, a problem of random partial differential equations (RPDEs) could be the following:

$$\begin{aligned} {\left\{ \begin{array}{ll} u_t(x,t,\omega )={\mathcal {L}}(u(x,t,\omega ),x,t,\omega ),\;&{} D\times (0,T]\times \Omega , \\ {\mathcal {B}}(u(x,t,\omega ),x,t,\omega )=0,\;&{} \partial D\times [0,T]\times \Omega , \\ u(x,0,\omega )=u_0(x,\omega ),\;&{} D\times \{t=0\}\times \Omega . \end{array}\right. } \end{aligned}$$

Here, \({\mathcal {L}}\) is a differential operator, \({\mathcal {B}}\) is a boundary operator, and \(u(x,t,\omega )\) is a random field, whose partial derivatives are interpreted in some probabilistic sense. In general, the sample-path sense represents the weakest mode of solution; assuming the existence of a probabilistic solution, it coincides with the deterministic solution along trajectories, and it may be simulated by drawing realizations of \(\omega \in \Omega \). For standard theoretical details on RDEs and RPDEs, the reader is referred to [84, 93, 96]. Any type of input uncertainty is allowed. The stochastic solution is differentiable by trajectories; if an irregular phenomenon is being modeled, a discrete noisy random error (for example, \({\mathcal {E}}\sim \text {Normal}(0,\sigma ^2 I)\), where \(\sigma >0\) is the standard deviation and I is the identity matrix) may be incorporated into the stochastic model response, which matches the Bayesian formulation of statistical models [81, 92, 110].

1.3 Motivation of RDEs vs. SDEs

These types of RDEs are of special interest in applications, because they allow for modeling uncertainty in a more flexible manner than assuming certain fluctuating patterns, as is the case for conventional SDEs of Itô type. Within the context of SDEs, uncertainty is restricted to a white noise representation, which may partially limit their applications. In Biology, for example, the formulation of RDEs may be more realistic, since they permit the use of non-Gaussian patterns and bounded random quantities [54]. Apart from the biological sciences, since the development of the generalized polynomial chaos (gPC) technique for solving RDEs around twenty years ago [110, 112], the role of these equations has been very important in Fluid Dynamics and Physics in general. In the setting of these experimental sciences, one should keep in mind the experimental errors incurred when measuring, which are often bounded and not necessarily Gaussian.

Some examples of random systems in Biology or Physics are the SIR model for the spread of epidemics or Burgers’ equation for viscous fluids:

$$\begin{aligned}&{\left\{ \begin{array}{ll} S'(t,\omega )=-\beta (\omega ) S(t,\omega )I(t,\omega ), \\ I'(t,\omega )=\beta (\omega ) S(t,\omega )I(t,\omega )-\nu (\omega ) I(t,\omega ), \\ R'(t,\omega )=\nu (\omega ) I(t,\omega ), \end{array}\right. } \\&{\left\{ \begin{array}{ll} u_t(x,t,\omega )+u(x,t,\omega )u_x(x,t,\omega )=\nu (\omega ) u_{xx}(x,t,\omega ),\; x\in (-1,1),\;t\in (0,T), \\ u(-1,t,\omega )=1+\delta (\omega ),\; u(1,t,\omega )=-1, \end{array}\right. } \end{aligned}$$

respectively. In the first case, the force of infection and the recovery rate are assigned probability distributions. For example, if doctors state that, for the disease under study, the mean time of recovery is between 3 and 10 days, then a prior distribution for \(1/\nu \) could be a uniform law on [3, 10]. In the second case, detailed studies are available in the literature [109, 110, 114]; if a small amount of uncertainty exists in the value of the boundary condition (possibly due to measurement bias or estimation errors), then the location of the transition layer at steady state may change significantly.
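For illustration, forward UQ for the random SIR model may be carried out by plain Monte Carlo sampling of the sample-path solutions. The following minimal sketch (in Python) assumes, for concreteness, \(\beta \sim \text {Uniform}(0.1,0.3)\) and \(1/\nu \sim \text {Uniform}(3,10)\); these distributions and the initial proportions are illustrative choices, not taken from the references.

```python
# Monte Carlo (sample-path) simulation of the random SIR model.
import numpy as np
from scipy.integrate import solve_ivp

rng = np.random.default_rng(0)
M = 1000                                    # number of realizations
t_eval = np.linspace(0.0, 60.0, 121)

def sir(t, y, beta, nu):
    S, I, R = y
    return [-beta*S*I, beta*S*I - nu*I, nu*I]

paths = np.empty((M, t_eval.size))
for m in range(M):
    beta = rng.uniform(0.1, 0.3)            # assumed law for the contact rate
    nu = 1.0/rng.uniform(3.0, 10.0)         # mean recovery time in [3, 10] days
    sol = solve_ivp(sir, (0.0, 60.0), [0.99, 0.01, 0.0],
                    t_eval=t_eval, args=(beta, nu), rtol=1e-8)
    paths[m] = sol.y[1]                     # infected proportion I(t, w)

mean_I = paths.mean(axis=0)                             # pointwise mean
q05, q95 = np.quantile(paths, [0.05, 0.95], axis=0)     # probabilistic band
```

Each row of `paths` is one trajectory \(I(\cdot ,\omega )\); the pointwise mean and quantiles summarize the statistical content of the response.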

1.4 Outline

The remainder of the paper is organized as follows. In Sect. 2, which is divided into several subsections, we deal with solutions to RDEs. After starting with sample-path solutions in Sect. 2.1, along the section we mostly focus on strong solutions. Some classical facts on strong solutions are reviewed in Sect. 2.2. Then, we report current results on the Frobenius method for linear ordinary and fractional RDEs in Sect. 2.3, on RPDEs in Sect. 2.4, and on delay RDEs in Sect. 2.5. A comparison between the notions of solution is made in Sect. 2.6. In Sect. 3, some simulation methods for RDEs pertaining to the field of UQ are exposed. In Sect. 3.1, we revisit well-known stochastic expansions for UQ (Taylor, perturbation and polynomial chaos series), as an alternative to Monte Carlo simulation, as well as inverse parameter estimation techniques (experts' judgment, maximum entropy principle and Bayesian inference). Later, in Sect. 3.2, we turn our attention to recent works on density estimation: transformation of random variables, Liouville's equation, and hybrid methods based on stochastic expansions. These topics on densities have not previously received such emphasis in review works. Finally, in Sect. 4, possible research directions on RDEs are suggested.

2 Theory on RDEs

As in the theory of deterministic differential equations, part of the development of RDEs deals with the existence and uniqueness of the solution process in some stochastic sense. The theory of ordinary RDEs started to develop in the middle of the last century, and the reader is referred to [5, 6, 41, 87, 93,94,95,96, 99] for an exposition of it. In the last twenty years, there has been an increase in attention to RDEs. More complex ordinary RDEs have been solved thanks to a greater development of the random calculus, as well as other types of equations such as RPDEs, fractional RDEs, and delay RDEs. The purpose of this section is to present the theoretical framework of RDEs and to explore these more recent and elaborate types of problems. We will rely on [104] for recent results on the random calculus, on [14, 16, 20, 21, 27, 28, 57] for the study of second-order linear RDEs by means of the Frobenius method, on [8] for the investigation of a linear fractional RDE by means of the Frobenius method, on [9] for the study of the advection RPDE by employing a known chain rule theorem, and on [29] for the study of a linear delay RDE by using the method of steps and mean-square Riemann integration.

2.1 Sample-path solutions

Given a stochastic process \(x:I\times \Omega \rightarrow {\mathbb {R}}\) (we focus on the one-dimensional situation \(q=1\) for simplicity), a trajectory or sample path is the real function \(x(\cdot ,\omega ):I\rightarrow {\mathbb {R}}\), for any given outcome \(\omega \in \Omega \). We say that x solves (1.1) in the sample-path sense or pathwise if the trajectories \(x(\cdot ,\omega )\) solve the corresponding deterministic problem, almost surely (a.s.). As detailed in [96], trajectories may be differentiable at every point in I, or almost everywhere on I by allowing absolutely continuous functions [53]. It is important to verify that x is indeed a stochastic process: \(x(t,\cdot )\) must be measurable for each \(t\in I\), and the interval I must not depend on \(\omega \). For example, as pointed out in [84, page 102], if a is an unbounded positive random variable, then \(x(t,\omega )=1/(1-a(\omega )t)\) is not a solution to the Riccati equation \(x'(t,\omega )=a(\omega )x(t,\omega )^2\), since there is no common real interval on which all trajectories are defined. Also, for example, \(x(t,\omega )=\frac{1}{4} t^2\) for \(\omega \in \Omega \backslash \Omega ^{\star }\) and \(x(t,\omega )=0\) for \(\omega \in \Omega ^{\star }\), where \(\Omega ^{\star }\notin {\mathcal {F}}\), does not solve \(x'(t,\omega )=\sqrt{|x(t,\omega )|}\), because it is not measurable.
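The absence of a common interval of existence in the Riccati example can be checked numerically by sampling the pathwise blow-up times \(1/a(\omega )\); the lognormal law below is an illustrative choice of unbounded positive random variable.

```python
# Pathwise blow-up times for x'(t) = a x(t)^2, x(0) = 1: each trajectory is
# x(t, w) = 1/(1 - a(w) t) and explodes at t = 1/a(w).
import numpy as np

rng = np.random.default_rng(1)
a = rng.lognormal(mean=0.0, sigma=1.0, size=100_000)   # unbounded, positive
blow_up = 1.0/a
print(blow_up.min())   # approaches 0 as the sample grows: no common interval
```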

2.2 Strong solutions

Recall that, given a complete probability space \((\Omega ,{\mathcal {F}},{\mathbb {P}})\), the Lebesgue space \(\mathrm {L}^p(\Omega )\), \(1\le p<\infty \), is the set of random variables \(U:\Omega \rightarrow {\mathbb {R}}\) such that \(\Vert U\Vert _p=({\mathbb {E}}[|U|^p])^{1/p}<\infty \) (finite absolute moments up to order p), where \({\mathbb {E}}\) denotes the expectation operator. When \(p=\infty \), \(\Vert U\Vert _\infty =\inf \{C>0:|U|\le C\text { a.s.}\}\). These spaces are Banach spaces and, for \(p=2\), Hilbert. The case \(p=2\) is particularly important, since it defines the set of random variables with well-defined mean and finite variance; these two statistics are essential for any statistical analysis. A stochastic process \(x:I\times \Omega \rightarrow {\mathbb {R}}\) is said to be of order p if \(\Vert x(t)\Vert _p<\infty \) for all \(t\in I\). Here the symbol \(\omega \) has been dropped, as we are integrating over \(\Omega \). This stochastic process may be viewed as a map \(x:I\rightarrow \mathrm {L}^p(\Omega )\). Strictly speaking, each x(t) is not an individual function; rather, it is an equivalence class for which a representative is chosen. Common concepts of Mathematical Analysis, such as continuity, differentiability, Riemann integrability, etc., may be defined in the sense of \((\mathrm {L}^p(\Omega ),\Vert \cdot \Vert _p)\), by considering the corresponding limits. This approach leads to a new random calculus. In the cases \(p=1\), \(p=2\) and \(p=4\), we have the mean, mean-square and mean-fourth calculus, respectively. Good expositions on random calculus are available in [84, 93, 96, 99, 104]. (Remark: The \(\mathrm {L}^p\)-Riemann integral is a particular case of the Bochner integral [96], but this more abstract notion will not be required for our purposes.)

Given (1.1), the solution x may be considered in this Lebesgue sense, also called a strong solution. The connection between sample-path and strong solutions is not obvious. For example, as shown in [84, page 141], the stochastic process \(x(t,\omega )=\mathrm {e}^{a(\omega )t}\), for a exponentially distributed with rate 1, is not a mean solution to \(x'(t,\omega )=a(\omega )x(t,\omega )\), because \({\mathbb {E}}[x(t)]=\infty \) for \(t\ge 1\). In general, as proved in [96, page 541], if a is unbounded (normal, Poisson, gamma, etc. distributions), then for all p there exists an initial condition \(x_0=x(0)\in \mathrm {L}^p(\Omega )\) such that \(x(t,\omega )=x_0(\omega )\mathrm {e}^{a(\omega )t}\notin \mathrm {L}^p(\Omega )\) for any \(t\ne 0\). In fact, in the context of strong solutions, Picard's theorem for existence and uniqueness is very stringent, and boundedness of the random coefficients is usually needed [93, chapter 5], [96]. According to [99], if x is continuously differentiable in the p-th sense, with derivative \(x'\), then there exists an equivalent stochastic process \(\varphi (t,\omega )\) on \(I\times \Omega \), product measurable, such that its sample paths are absolutely continuous, \(\varphi '(t,\omega )\) exists almost everywhere on \(I\times \Omega \), and \(\varphi '(t,\cdot )=x'(t)\) a.s. for almost every \(t\in I\). Also, if x is continuous in the p-th sense on I, then there exists an equivalent stochastic process \(\varphi (t,\omega )\) on \(I\times \Omega \), product measurable, such that \([\int _I x(s)\,\mathrm {d}s](\omega )=\int _I x(s,\omega )\,\mathrm {d}s\), where the integral on the left is the abstract Riemann integral and the integral on the right is the ordinary Lebesgue integral for real-valued functions. Finally, any \(\mathrm {L}^p\)-solution to (1.1) has a product measurable representative which is an absolutely continuous solution in the sample-path sense. Thus, in practice, one picks the deterministic solution and checks whether it can be differentiated in the p-th sense, by using extensions of the ordinary calculus to the random setting.
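The failure of the mean solution for exponentially distributed a can be made explicit with a short symbolic computation: the moment-generating function of \(a\sim \text {Exponential}(1)\) is \(1/(1-t)\), finite only for \(t<1\). A minimal check:

```python
# E[exp(a t)] for a ~ Exponential(rate 1): the integral converges only for
# t < 1, where it equals 1/(1 - t); for t >= 1 the mean does not exist.
import sympy as sp

a, t = sp.symbols('a t', positive=True)
mgf = sp.integrate(sp.exp(a*t)*sp.exp(-a), (a, 0, sp.oo), conds='none')
print(sp.simplify(mgf))        # -1/(t - 1), i.e. 1/(1 - t), valid for t < 1
```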

We do not spend more time on this classical theory and, in the following parts, we move towards current results on strong solutions.

2.3 Second-order linear RDEs: Frobenius method and strong solutions

The Frobenius method consists of finding a strongly convergent power series solution to an RDE, in analogy with the deterministic theory of ordinary differential equations. The general RDE problem is given by

$$\begin{aligned} {\left\{ \begin{array}{ll} x''(t,\omega )+a(t,\omega )x'(t,\omega )+b(t,\omega )x(t,\omega )=0, \; t\in {\mathbb {R}}, \\ x(t_0,\omega )=y_0(\omega ), \; x'(t_0,\omega )=y_1(\omega ). \end{array}\right. } \end{aligned}$$
(2.1)

Here, \(a(t,\omega )\) and \(b(t,\omega )\) are stochastic processes and \(y_0(\omega )\) and \(y_1(\omega )\) are random variables on \((\Omega ,{\mathcal {F}},{\mathbb {P}})\). The stochastic process \(x(t,\omega )\) is the \(\mathrm {L}^{p}(\Omega )\)-solution. It is assumed that \(a(t,\omega )\) and \(b(t,\omega )\) are analytic stochastic processes on a neighborhood \((t_0-r,t_0+r)\), for \(r>0\) fixed, in the strong sense [93, p. 99]: \(a(t,\omega )=\sum _{n=0}^\infty a_n(\omega ) (t-t_0)^n\) and \(b(t,\omega )=\sum _{n=0}^\infty b_n(\omega ) (t-t_0)^n\) are two random power series in \(\mathrm {L}^{p}(\Omega )\), where \(a_0,a_1,\ldots \), \(b_0,b_1,\ldots \) are of order p. The expansions coincide with the strong Taylor series of \(a(t,\omega )\) and \(b(t,\omega )\): \(a_n(\omega )=a^{(n)}(t_0,\omega )/n!\), where \(a^{(n)}\) is the n-th \(\mathrm {L}^p(\Omega )\)-derivative. Random power series converge absolutely and uniformly on any closed interval strictly contained in the maximum domain of convergence. For each fixed t, the series converges exponentially fast (though not uniformly in t). One searches for an analytic solution process \(x(t,\omega )\) of the form \(x(t,\omega )=\sum _{n=0}^\infty x_n(\omega ) (t-t_0)^n\), for \(t\in (t_0-r,t_0+r)\), where the sum is considered in \(\mathrm {L}^p(\Omega )\). From here on, we may drop \(\omega \) for simplicity. From the point of view of UQ, by truncating the random power series, applying the linearity of the expectation and using precomputed moments of the inputs, the moments of the solution up to order p may be approximated exponentially fast at each t, at low cost, thus improving on the Monte Carlo method (at least for low or moderately large t).

The study of these types of problems began in 2010, with particular linear equations of Mathematical Physics; for example, Airy's, Hermite's, Legendre's and Laguerre's RDEs, just to name a few [20, 21, 27, 28]. The authors of those papers assumed exponential growth of the absolute moments of the equation coefficient, which is equivalent to its boundedness. Since then, three research lines have been developed: generalization to arbitrary linear equations [16]; weakening of hypotheses, as much as possible, for particular linear equations [14, 57]; and extension to linear fractional RDEs [8]. Essentially, we will focus on the present-day works [8, 14, 16, 57].

As [16] shows, random power series can be differentiated termwise in the Lebesgue sense, and they can be multiplied in Cauchy form in the Lebesgue sense. As a consequence, by mimicking the proof from the deterministic setting, problem (2.1) may be solved by employing the Frobenius method. The basic idea consists in imposing a formal power series solution and obtaining a recursive relation for the expansion coefficients; afterward, the convergence of the series in the Lebesgue sense needs to be proved, by using norms and probabilistic inequalities.

Theorem 2.1

[16] Let \(a(t)=\sum _{n=0}^\infty a_n (t-t_0)^n\) and \(b(t)=\sum _{n=0}^\infty b_n (t-t_0)^n\) be two random series in the \(\mathrm {L}^\infty (\Omega )\) setting, for \(t\in (t_0-r,t_0+r)\), with \(r>0\) fixed. Assume that the initial conditions \(y_0\) and \(y_1\) belong to \(\mathrm {L}^p(\Omega )\). Then the stochastic process \(x(t)=\sum _{n=0}^\infty x_n(t-t_0)^n\), \(t\in (t_0-r,t_0+r)\), with coefficients defined by

$$\begin{aligned}&x_0=y_0,\quad x_1=y_1, \\&x_{n+2}=\frac{-1}{(n+2)(n+1)}\sum _{m=0}^n \left[ (m+1)a_{n-m}x_{m+1}+b_{n-m}x_m\right] ,\;n\ge 0, \end{aligned}$$

is the unique analytic solution to the random initial value problem (2.1) in the \(\mathrm {L}^{p}(\Omega )\) sense.

Given a random variable z, the boundedness \(\Vert z\Vert _\infty <\infty \) is equivalent to \({\mathbb {E}}[|z|^n]\le HR^n\) for all \(n\ge 1\), for certain \(H>0\) and \(R>0\). Growth hypotheses of the form \({\mathbb {E}}[|z|^n]\le HR^n\) were common in the literature to find random analytic solutions to particular cases of (2.1). See for example Airy's RDE in [27] and Hermite's RDE in [20]. Theorem 2.1 generalizes the results obtained in those papers.
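The recursion of Theorem 2.1 is straightforward to implement symbolically. A minimal sketch follows; the truncation order and the instantiation with Airy's equation (treated next) are illustrative choices.

```python
# Frobenius recursion of Theorem 2.1, computed symbolically with sympy.
import sympy as sp

def frobenius_coeffs(a_coeffs, b_coeffs, y0, y1, N):
    """x_0, ..., x_N for x'' + a(t) x' + b(t) x = 0 around t0
    (a_n, b_n are taken as 0 beyond the lists supplied)."""
    a = lambda n: a_coeffs[n] if n < len(a_coeffs) else 0
    b = lambda n: b_coeffs[n] if n < len(b_coeffs) else 0
    x = [y0, y1]
    for n in range(N - 1):
        s = sum((m + 1)*a(n - m)*x[m + 1] + b(n - m)*x[m]
                for m in range(n + 1))
        x.append(-s/sp.Integer((n + 2)*(n + 1)))
    return x

A, y0, y1, t = sp.symbols('A y0 y1 t')
xs = frobenius_coeffs([0], [0, A], y0, y1, 7)      # Airy: a = 0, b(t) = A t
print(sp.expand(sum(c*t**k for k, c in enumerate(xs))))
# y0*(1 - A t^3/6 + ...) + y1*(t - A t^4/12 + ...)
```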

Example 2.2

Airy’s RDE is defined as follows:

$$\begin{aligned} {\left\{ \begin{array}{ll} x''(t)+Atx(t)=0, \; t\in {\mathbb {R}}, \\ x(0)=y_0, \\ x'(0)=y_1, \end{array}\right. } \end{aligned}$$
(2.2)

where A, \(y_0\) and \(y_1\) are random variables. It is supposed that A is bounded and that the initial conditions are of second order. In the general notation, \(a(t)=0\) and \(b(t)=At\). Since A is bounded, the \(\mathrm {L}^\infty (\Omega )\)-convergence of the series that define a(t) and b(t) holds, so Theorem 2.1 is applicable: there is an analytic solution stochastic process x(t) to (2.2) on \({\mathbb {R}}\). The explicit solution is given by [27]

$$\begin{aligned} x(t)= & {} y_0x_1(t)+y_1 x_2(t), \\ x_1(t)= & {} 1+\sum _{n=1}^\infty \frac{(-1)^n A^n (3n-2)!!!}{(3n)!} t^{3n},\quad x_2(t)=t+\sum _{n=1}^\infty \frac{(-1)^n A^n (3n-1)!!!}{(3n+1)!}t^{3n+1}. \end{aligned}$$

Example 2.3

Hermite’s RDE is given as follows:

$$\begin{aligned} {\left\{ \begin{array}{ll} x''(t)-2t x'(t)+Ax(t)=0, \; t\in {\mathbb {R}}, \\ x(0)=y_0, \\ x'(0)=y_1, \end{array}\right. } \end{aligned}$$
(2.3)

where A, \(y_0\) and \(y_1\) are random variables. We suppose that \(y_0,y_1\in \mathrm {L}^p(\Omega )\). When A is bounded, the input stochastic processes \(a(t)=-2t\) and \(b(t)=A\) are expressible as \(\mathrm {L}^\infty (\Omega )\)-convergent random power series. Hence Theorem 2.1 is applicable and guarantees the existence of a strong solution process x(t) on \({\mathbb {R}}\). By [20], this solution possesses the closed form

$$\begin{aligned} x(t)= & {} y_0x_1(t)+y_1 x_2(t), \\ x_1(t)= & {} 1+\sum _{n=0}^\infty \frac{t^{2n+2}}{(2n+2)!}\prod _{j=0}^n (4j-A),\quad x_2(t)=t+\sum _{n=0}^\infty \frac{t^{2n+3}}{(2n+3)!}\prod _{j=0}^n (4j+2-A). \end{aligned}$$

Example 2.4

Legendre’s RDE is

$$\begin{aligned} {\left\{ \begin{array}{ll} (1-t^2)x''(t)-2t x'(t)+A(A+1)x(t)=0,\;|t|<1, \\ x(0)=y_0, \\ x'(0)=y_1. \end{array}\right. } \end{aligned}$$
(2.4)

In [21], the authors constructed a mean-square convergent power series solution x(t) to (2.4) on \((-1/\mathrm {e},1/\mathrm {e})\) under certain assumptions on the random inputs A, \(y_0\) and \(y_1\): A bounded, independent of \(y_0\) and \(y_1\), and \(y_0,y_1\in \mathrm {L}^4(\Omega )\). The hypotheses were weakened in Theorem 2.1: to have an \(\mathrm {L}^p(\Omega )\)-solution on the whole interval \((-1,1)\), one needs A bounded and \(y_0,y_1\in \mathrm {L}^p(\Omega )\), with no independence condition. According to [21], the strong solution has the following explicit form:

$$\begin{aligned} x(t)=y_0{\tilde{x}}_1(t)+y_1{\tilde{x}}_2(t), \end{aligned}$$

where

$$\begin{aligned} {\tilde{x}}_1(t)= & {} \sum _{m=0}^\infty \frac{(-1)^m}{(2m)!}P_1(m)t^{2m},\quad {\tilde{x}}_2(t)=\sum _{m=0}^\infty \frac{(-1)^m}{(2m+1)!}P_2(m)t^{2m+1}, \\ P_1(m)= & {} \prod _{k=1}^m (A-2k+2)(A+2k-1),\quad P_2(m)=\prod _{k=1}^m (A-2k+1)(A+2k). \end{aligned}$$

Let us see in this example how one may use the series for forward UQ, focusing on the first two moments of the output. Mean-square convergence preserves the convergence of the expectation and the variance. Approximations for \({\mathbb {E}}[x(t)]\) and \({\mathbb {V}}[x(t)]\) are thus obtained from \({\mathbb {E}}[x^N(t)]\) and \({\mathbb {V}}[x^N(t)]\), respectively, where \(x^N(t)\) denotes the truncated series up to \(m=N\). Both \({\mathbb {E}}[x^N(t)]\) and \({\mathbb {V}}[x^N(t)]\) are determined just by using the linearity of the expectation and the precomputed moments \({\mathbb {E}}[y_0^{j_1}y_1^{j_2}A^{j_3}]\) of \((y_0,y_1,A)\). More specifically, from the programming standpoint, both \(x^N(t)\) and \((x^N(t))^2\) are expanded as symbolic linear combinations of monomials \(y_0^{j_1}y_1^{j_2}A^{j_3}t^{j_4}\), so \({\mathbb {E}}[x^N(t)]\) and \({\mathbb {E}}[(x^N(t))^2]\) are determined by substituting each symbolic pattern \(y_0^{j_1}y_1^{j_2}A^{j_3}\) by the corresponding moment \({\mathbb {E}}[y_0^{j_1}y_1^{j_2}A^{j_3}]\). For each numeric value of N, \({\mathbb {E}}[x^N(t)]\) and \({\mathbb {V}}[x^N(t)]\) become polynomials in t of degree \(2N+1\) and \(4N+2\), respectively, which are rapid to evaluate in t. These comments apply to any other linear RDE. The reader will find abundant numerical simulations in any of the cited references.
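A minimal symbolic sketch of this procedure follows, for the truncated Legendre series. The function `input_moment` stands for a hypothetical user-supplied routine returning \({\mathbb {E}}[y_0^{j_1}y_1^{j_2}A^{j_3}]\); here it is instantiated under the illustrative assumption that \(y_0=y_1=1\) are deterministic and \(A\sim \text {Uniform}(2,3)\).

```python
# Moment extraction from a truncated random power series: expand the series,
# then replace each monomial y0^j1 y1^j2 A^j3 by its precomputed moment.
import sympy as sp

y0, y1, A, t = sp.symbols('y0 y1 A t')

def legendre_truncation(N):
    x1 = sum((-1)**m/sp.factorial(2*m)
             * sp.Mul(*[(A - 2*k + 2)*(A + 2*k - 1) for k in range(1, m + 1)])
             * t**(2*m) for m in range(N + 1))
    x2 = sum((-1)**m/sp.factorial(2*m + 1)
             * sp.Mul(*[(A - 2*k + 1)*(A + 2*k) for k in range(1, m + 1)])
             * t**(2*m + 1) for m in range(N + 1))
    return y0*x1 + y1*x2

def input_moment(j1, j2, j3):            # E[y0^j1 y1^j2 A^j3] (independence)
    return sp.integrate(A**j3, (A, 2, 3))    # y0 = y1 = 1, A ~ Uniform(2, 3)

def expectation(expr):
    total = 0
    for term in sp.Add.make_args(sp.expand(expr)):
        j = [sp.degree(term, v) for v in (y0, y1, A)]
        total += term/(y0**j[0]*y1**j[1]*A**j[2])*input_moment(*j)
    return sp.expand(total)

xN = legendre_truncation(4)
mean = expectation(xN)                       # polynomial in t of degree 9
variance = sp.expand(expectation(xN**2) - mean**2)
```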

With similar ideas, problem (2.1) may be easily generalized by adding a stochastic forcing term c(t):

$$\begin{aligned} {\left\{ \begin{array}{ll} x''(t)+a(t)x'(t)+b(t)x(t)=c(t), \; t\in {\mathbb {R}}, \\ x(t_0)=y_0, \; x'(t_0)=y_1. \end{array}\right. } \end{aligned}$$
(2.5)

Theorem 2.5

Let \(a(t)=\sum _{n=0}^\infty a_n (t-t_0)^n\) and \(b(t)=\sum _{n=0}^\infty b_n (t-t_0)^n\) be two random series in the \(\mathrm {L}^\infty (\Omega )\) setting, for \(t\in (t_0-r,t_0+r)\), with \(r>0\) fixed. Let \(c(t)=\sum _{n=0}^\infty c_n (t-t_0)^n\) be a random series in the \(\mathrm {L}^{p}(\Omega )\) sense on \((t_0-r,t_0+r)\). Suppose that \(y_0\) and \(y_1\) are in \(\mathrm {L}^p(\Omega )\). Then the stochastic process \(x(t)=\sum _{n=0}^\infty x_n(t-t_0)^n\), \(t\in (t_0-r,t_0+r)\), with coefficients

$$\begin{aligned} x_0= & {} y_0,\quad x_1=y_1, \\ x_{n+2}= & {} \frac{1}{(n+2)(n+1)}\left\{ -\sum _{m=0}^n \left[ (m+1)a_{n-m}x_{m+1}+b_{n-m}x_m\right] +c_n\right\} ,\;n\ge 0, \end{aligned}$$

is the unique analytic solution to the problem (2.5) in the \(\mathrm {L}^{p}(\Omega )\) sense.

The boundedness conditions from Theorem 2.1 may be relaxed for certain particular forms of (2.5) by increasing the integrability of the initial conditions and employing the explicit series developments. In [14], the boundedness condition was weakened for Airy's, Hermite's and Laguerre's RDEs. (In the case of Laguerre's equation, there is only one initial condition because 0 is a regular-singular point.) On the other hand, in [57], boundedness was weakened for Legendre's RDE.

Theorem 2.6

[14] Suppose that \(y_0\) and \(y_1\) belong to \(\mathrm {L}^{2p}(\Omega )\) and that

$$\begin{aligned} \Vert A^n\Vert _{2p}\le \eta {\mathcal {H}}^{n-1} (n-1)!^s \end{aligned}$$
(2.6)

for \(n\ge n_0\), for constants \(n_0,\eta ,{\mathcal {H}},s>0\). Consider the RDE problems (2.2) (Airy’s equation), (2.3) (Hermite’s equation), and

$$\begin{aligned} tx''(t)+(1-t)x'(t)+Ax(t)=0,\;t\in {\mathbb {R}},\quad x(0)=y_0 \end{aligned}$$

(Laguerre’s equation). Then, for \(0\le s<2\), there is a random power series solution that converges in \(\mathrm {L}^p(\Omega )\) for all \(t\in {\mathbb {R}}\). For \(s=2\), the random power series solution converges in \(\mathrm {L}^p(\Omega )\) on a neighborhood of \(t=0\): for Airy’s equation on \(|t|<\root 3 \of {9/{\mathcal {H}}}\), for Hermite’s equation on \(|t|<2/\sqrt{{\mathcal {H}}}\), and for Laguerre’s equation on \(|t|<1/{\mathcal {H}}\).

Theorem 2.7

[57] Suppose that \(y_0\) and \(y_1\) belong to \(\mathrm {L}^{2p}(\Omega )\) and that

$$\begin{aligned} \Vert A^n\Vert _{4p}\le \eta {\mathcal {H}}^{n-1} (n-1)!^s \end{aligned}$$
(2.7)

for \(n\ge n_0\), for constants \(n_0,\eta ,{\mathcal {H}},s>0\). Consider the RDE problem (2.4). If \(0\le s<1\), then the random power series x(t) is the \(\mathrm {L}^p(\Omega )\)-solution to (2.4) on \((-1,1)\). If \(s=1\), then it is the \(\mathrm {L}^p(\Omega )\)-solution to (2.4) on a neighborhood of zero contained in \((-1,1)\).

As explained in the references, assumptions (2.6) and (2.7) are fulfilled by the bounded, Normal, Gamma and Poisson distributions, with \(s=0\), \(s=1/2\), \(s=1\) and \(s=1\), respectively. Conditions (2.6) and (2.7) for \(s\le 1\) are equivalent to \(\phi _A(t)<\infty \) on a neighborhood of 0, where \(\phi _A(t)={\mathbb {E}}[\mathrm {e}^{tA}]\) denotes the moment-generating function of A [73, theorem A, page 5]. The conditions for \(s\le 2\) and \(A\ge 0\) are equivalent to \(\phi _{\sqrt{A}}(t)<\infty \) on a neighborhood of 0, by [73, theorem B, page 6]. If \(s<1\), then \(\phi _A(t)<\infty \) for every \(t\in {\mathbb {R}}\). The converse is not true in general: if \(A\sim \text {Poisson}(1)\), then the moments of A are the Bell numbers \(b_n\), which satisfy \(\log \Vert A\Vert _n=\log (b_n)/n\sim \log n\) as \(n\rightarrow \infty \) [35, pages 102–109]; therefore the minimum s for which (2.7) holds is \(s=1\). But when A is Poisson distributed, we can still solve Legendre's RDE.

Theorem 2.8

[57] Consider the Legendre’s RDE (2.4). Assume that the initial conditions \(y_0\) and \(y_1\) belong to \(\mathrm {L}^{2p}(\Omega )\) and that the equation coefficient A follows a \(\text {Poisson}(\lambda )\) distribution. Then the random power series x(t) is the \(\mathrm {L}^p(\Omega )\) solution to (2.4) on \((-1,1)\).

In the remaining part of this subsection, an application of the Frobenius method to linear fractional RDEs is presented. In recent decades, equations whose formulation incorporates so-called memory effects (delays, or the non-local operators of fractional derivatives) have had a great impact. This interest is heightened when such equations also incorporate uncertainty in their formulation. We take as an example the paper [8]. An extension of the classical derivative \(x'(t)\) is given by the Caputo fractional derivative

$$\begin{aligned} ({}^{C}\! D_{0^+}^{\alpha }x)(t)=\frac{1}{\Gamma (1-\alpha )}\int _0^t (t-s)^{-\alpha }x'(s)\,\mathrm {d}s, \end{aligned}$$

where \(0<\alpha \le 1\). In the random scenario, with \(x(t,\omega )\) being a strongly differentiable stochastic process, this improper integral may be considered as an \(\mathrm {L}^p\)-Riemann integral. Let us consider the following non-autonomous fractional RDE initial value problem:

$$\begin{aligned} ({}^{C}\! D_{0^+}^{\alpha }x)(t,\omega )-B(\omega )t^\beta x(t,\omega )=0,\quad t>0,\quad x(0,\omega )=A(\omega ), \end{aligned}$$

where \(\beta >0\), and A and B are random variables. The Frobenius method yields a formal random power series solution of the form \(x(t,\omega )=\sum _{n=0}^\infty x_n(\omega )t^{(\alpha +\beta )n}\). By using the random calculus, the goal is to prove that this series converges (and hence is differentiable) in a strong sense. The main result proved there is the following.

Theorem 2.9

[8, theorem 3] If the random variables A and B are independent and \(\Vert B^n\Vert _2\le \eta H^{n-1} ((n-1)!)^r\) for \(n\ge n_0\), for certain \(\eta ,H,r,n_0>0\), then

$$\begin{aligned} x(t,\omega )=A(\omega )+\sum _{n=1}^\infty B(\omega )^n A(\omega ) \prod _{k=1}^n \frac{\Gamma ((k-1)\alpha +\beta k+1)}{\Gamma (k(\alpha +\beta )+1)}t^{(\alpha +\beta )n} \end{aligned}$$

is the mean-square solution on

$$\begin{aligned} D={\left\{ \begin{array}{ll} [0,\infty ),&{} r<\alpha , \\ \left[ 0,\frac{(\alpha +\beta )^{\frac{\alpha }{\alpha +\beta }}}{H^{\frac{1}{\alpha +\beta }}}\right) ,&{} r=\alpha . \end{array}\right. } \end{aligned}$$
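A direct truncation of this series gives the statistics of the solution. The sketch below evaluates the approximate mean for independent \(A,B\sim \text {Uniform}(0,1)\), an illustrative bounded choice for which \(D=[0,\infty )\); the values of \(\alpha \), \(\beta \) and the truncation order are also illustrative.

```python
# Truncated mean of the series of Theorem 2.9: with A, B independent,
# E[x(t)] = E[A]*(1 + sum_n E[B^n] prod_k Gamma(.)/Gamma(.) t^((a+b)n)).
import numpy as np
from scipy.special import gammaln

alpha, beta, N = 0.5, 1.0, 30
EA = 0.5                                    # E[A] for Uniform(0, 1)
EBn = lambda n: 1.0/(n + 1)                 # E[B^n] for Uniform(0, 1)

def mean_x(t):
    total, log_prod = EA, 0.0
    for n in range(1, N + 1):
        log_prod += (gammaln((n - 1)*alpha + beta*n + 1)
                     - gammaln(n*(alpha + beta) + 1))   # k = n factor
        total += EA*EBn(n)*np.exp(log_prod)*t**((alpha + beta)*n)
    return total

print(mean_x(1.0))
```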

The work on memory RDEs and their stochastic solutions is relatively new (the first work that I am aware of is [75], from 2012, which treats both types of probabilistic solutions). It would be of interest to extend some of the non-autonomous classical linear equations of Mathematical Physics, such as Hermite's or Legendre's equations, to their fractional versions with random data.

2.4 Advection RPDE: random chain rule theorem and strong solutions

An RPDE problem incorporates uncertainty into the deterministic problem by randomizing the input parameters (equation coefficients, initial conditions, boundary values, forcing terms, etc.), with any type of probability distribution. The solution becomes a differentiable random field, which solves the problem in the sample-path or the strong sense. Unlike the case of ordinary RDEs, there is no general theory that guarantees the existence of such types of solutions. Here we focus on the one-dimensional linear advection equation with uncertainties, based on the recent work [9].

An important model to describe the concentration of a chemical substance transported by a one-dimensional fluid that flows with a known velocity is

$$\begin{aligned} {\left\{ \begin{array}{ll} \frac{\partial }{\partial t}Q(x,t,\omega )+V(t,\omega )\frac{\partial }{\partial x}Q(x,t,\omega )=0,\quad t>0,\quad x\in {\mathbb {R}}, \\ Q(x,0,\omega )=Q_0(x,\omega ),\quad x\in {\mathbb {R}}, \end{array}\right. } \end{aligned}$$
(2.8)

where \(V(t,\omega )\) is the stochastic velocity and \(Q_0(x,\omega )\) is the stochastic initial concentration of the substance at each spatial point x. By the deterministic theory of PDEs, the sample-path solution to (2.8) is given by \(Q(x,t,\omega )=Q_0(x-A(t,\omega ))\), where \(A(t,\omega )=\int _0^t V(\tau ,\omega )\,\mathrm {d}\tau \) is a sample-path integral. Our purpose is to show that \(Q(x,t)=Q_0(x-A(t))\) is the mean-square solution to (2.8) under certain conditions, where \(A(t)=\int _0^t V(\tau )\,\mathrm {d}\tau \) is understood now as a mean-square Riemann integral.

Since \(Q_0(x-A(t))\) is a composition of two stochastic processes, we need appropriate versions of the chain rule for mean-square differentiation. The hypotheses are stricter than in the sample-path sense.

Theorem 2.10

[104, theorem 3.19] Let \(1\le p<\infty \). Let f be a deterministic \(C^1\) function. Let \(\{x(t):\,t\in [a,b]\}\) be a stochastic process and \(t\in [a,b]\) such that:

(i) x is \(\mathrm {L}^{2p}(\Omega )\)-differentiable at t.

(ii) x is path continuous on [a, b].

(iii) There exist \(r>2p\) and \(\delta >0\) such that \(\sup _{s\in [-\delta ,\delta ]} {\mathbb {E}}[|f'(x(t+s))|^r]<\infty \).

Then f(x(t)) is \(\mathrm {L}^{p}(\Omega )\)-differentiable at t and \(\frac{\mathrm {d}}{\mathrm {d} t} f(x(t))=f'(x(t))x'(t)\).

Example 2.11

Consider the RDE problem \(x'(t)=ax(t)\), \(t\in {\mathbb {R}}\), \(x(0)=1\), where a is a random variable. The sample-path solution is \(x(t)=\mathrm {e}^{at}\). By using the chain rule theorem for \(\mathrm {L}^{p}(\Omega )\)-differentiation, we have that \(x(t)=\mathrm {e}^{at}\) is the \(\mathrm {L}^{p}(\Omega )\)-solution on \({\mathbb {R}}\) if and only if the moment-generating function of a, \(\phi _a(t)={\mathbb {E}}[\mathrm {e}^{at}]\), is finite on \({\mathbb {R}}\).

Theorem 2.12

[33, theorem 2.1] Let g be a deterministic differentiable function. Let \(\{y(t):\,t\in [a,b]\}\) be a stochastic process such that [a, b] contains the range of g. Suppose that y is \(\mathrm {L}^{p}(\Omega )\)-\(C^1\). Then y(g(t)) is \(\mathrm {L}^{p}(\Omega )\)-differentiable on the whole domain of g and \(\frac{\mathrm {d}}{\mathrm {d} t} y(g(t))=y'(g(t))g'(t)\).

Theorem 2.13

[9, theorem 2.3] Let \(\{y(t):\,t\in [c,d]\}\) be an \(\mathrm {L}^{p}(\Omega )\)-differentiable stochastic process on [c, d] with \(C^1\) sample paths. Denote by \(y'\) the \(\mathrm {L}^{p}(\Omega )\)-derivative of y, and by \({\dot{y}}\) the classical derivative of y. Suppose that \(y'\) and \({\dot{y}}\) are indistinguishable stochastic processes (i.e., \({\mathbb {P}}[y'(t)={\dot{y}}(t),\;\forall t]=1\)). Let \(\{x(t):\,t\in [a,b]\}\) be a stochastic process with range in [c, d], and \(t\in [a,b]\) such that:

(i) x is \(\mathrm {L}^{2p}(\Omega )\)-differentiable at t.

(ii) x is path continuous on [a, b].

(iii) There exist \(r>2p\) and \(\delta >0\) such that \(\sup _{s\in [-\delta ,\delta ]} {\mathbb {E}}[|y'(x(t+s))|^r]<\infty \).

Then y(x(t)) is \(\mathrm {L}^{p}(\Omega )\)-differentiable at t and \(\frac{\mathrm {d}}{\mathrm {d} t} y(x(t))=y'(x(t))x'(t)\).

By applying Theorem 2.12, one obtains the mean-square solution to (2.8) when the velocity is deterministic.

Theorem 2.14

[9, theorem 2.6] Let V(t) be a deterministic velocity function and \(Q_0(x)\) be a stochastic initial condition. Assume that:

(i) V(t) is continuous on \([0,\infty )\).

(ii) \(Q_0(x)\) is mean-square \(C^1({\mathbb {R}})\).

Then \(Q(x,t)=Q_0(x-A(t))\), with \(A(t)=\int _0^t V(\tau ) \mathrm {d}\tau \), is the mean-square solution to (2.8).

When the velocity is random, further technical hypotheses need to be added to apply Theorem 2.13.

Theorem 2.15

[9, theorem 2.7] Let V(t) be a stochastic velocity and \(Q_0(x)\) be a stochastic initial condition. Suppose that:

(i) \(Q_0(x)\) is mean-square differentiable on \({\mathbb {R}}\), with sample paths in \(C^1({\mathbb {R}})\), and with mean-square derivative and classical derivative being indistinguishable stochastic processes.

(ii) V(t) is mean-fourth continuous on \([0,\infty )\).

(iii) We have

$$\begin{aligned} \sup _{s\in [-\delta ,\delta ]} {\mathbb {E}}[|Q_0'(x-A(t+s))|^r]<\infty ,\text { for some } r>4 \text { and } \delta >0, \end{aligned}$$

and

$$\begin{aligned} \sup _{h\in [-\eta ,\eta ]} {\mathbb {E}}[|Q_0'(x+h-A(t))|^{q}]<\infty ,\text { for some } q>4 \text { and } \eta >0, \end{aligned}$$

for each \(x\in {\mathbb {R}}\) and \(t>0\).

Then \(Q(x,t)=Q_0(x-A(t))\), with \(A(t)=\int _0^t V(\tau ) \mathrm {d}\tau \), is the mean-square solution to (2.8).
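A Monte Carlo check of this solution form is immediate once V and \(Q_0\) are specified. The choices below (a time-constant Gaussian velocity, so that \(A(t)=V_0t\), and a Gaussian pulse of random width) are illustrative assumptions, not taken from [9].

```python
# Sampling Q(x, t) = Q0(x - A(t)) for the advection RPDE (2.8).
import numpy as np

rng = np.random.default_rng(3)
x = np.linspace(-2.0, 8.0, 201)
t, M = 3.0, 20_000

V0 = rng.normal(1.0, 0.1, size=M)           # velocity: V(t, w) = V0(w)
w = rng.uniform(0.8, 1.2, size=M)           # random width of the pulse Q0
Q = np.exp(-((x[None, :] - V0[:, None]*t)/w[:, None])**2)

mean_Q = Q.mean(axis=0)                     # estimates E[Q(x, t)]
var_Q = Q.var(axis=0)                       # estimates V[Q(x, t)]
```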

2.5 Linear RDE with discrete delay: method of steps and strong solutions

The Lebesgue approach for ordinary RDEs has been widely used; however, for RDEs with discrete delay, there is still a lack of theoretical analysis. Some recent works are [10, 11, 23]. In [10], general RDEs with discrete delay were investigated, with the goal of generalizing results from [93]. In [11], the basic autonomous-homogeneous linear RDE was investigated in the p-th sense. In [23], the same equation was analyzed, but focusing on probability densities by means of the random variable transformation technique. In the present part, I will detail the main results from [29], which generalizes [11] to the non-autonomous case.

The problem considered is the following:

$$\begin{aligned} \left\{ \begin{array}{ccl} x'(t,\omega ) &{} = &{} a(\omega )x(t,\omega )+b(\omega )x(t-\tau ,\omega )+f(t,\omega ),\;t\ge 0,\;\omega \in \Omega , \\ x(t,\omega ) &{} = &{} g(t,\omega ),\; -\tau \le t\le 0,\;\omega \in \Omega . \end{array} \right. \end{aligned}$$
(2.9)

The delay \(\tau >0\) is constant, while the inputs are random. Formally, according to the deterministic theory [67] (method of steps and variation of constants),

$$\begin{aligned} x(t,\omega )&= \mathrm {e}^{a(\omega )(t+\tau )}\mathrm {e}_\tau ^{b_1(\omega ),t}g(-\tau ,\omega ) \nonumber \\&\quad + \int _{-\tau }^0 \mathrm {e}^{a(\omega )(t-s)}\mathrm {e}_\tau ^{b_1(\omega ),t-\tau -s}(g'(s,\omega )-a(\omega )g(s,\omega ))\,\mathrm {d} s \nonumber \\&\quad + \int _0^t \mathrm {e}^{a(\omega )(t-s)}\mathrm {e}_\tau ^{b_1(\omega ),t-\tau -s}f(s,\omega )\,\mathrm {d} s, \end{aligned}$$
(2.10)

where \(b_1=\mathrm {e}^{-a\tau }b\) and

$$\begin{aligned} \mathrm {e}_\tau ^{c,t}={\left\{ \begin{array}{ll} 0, &{} -\infty<t<-\tau , \\ 1, &{} -\tau \le t<0, \\ \displaystyle 1+c\frac{t}{1!}, &{} 0\le t<\tau , \\ \displaystyle 1+c\frac{t}{1!}+c^2\frac{(t-\tau )^2}{2!}, &{} \tau \le t<2\tau , \\ \;\;\; \vdots &{} \;\;\; \vdots \\ \displaystyle \sum _{k=0}^n c^k\frac{(t-(k-1)\tau )^k}{k!}, &{} (n-1)\tau \le t<n\tau , \end{array}\right. } \end{aligned}$$

is the delayed exponential function [67, definition 1], \(c,t\in {\mathbb {R}}\), and \(n=\lfloor t/\tau \rfloor +1\).
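For computational purposes, the delayed exponential is a finite sum at each t and can be transcribed directly from the definition above:

```python
# Delayed exponential e_tau^{c, t}, transcribed from [67, definition 1].
import math

def delayed_exp(c, tau, t):
    if t < -tau:
        return 0.0
    n = math.floor(t/tau) + 1               # t lies in [(n-1) tau, n tau)
    return sum(c**k*(t - (k - 1)*tau)**k/math.factorial(k)
               for k in range(n + 1))
```

With this function, the candidate solution (2.10) may be evaluated per realization \(\omega \) by sampling the inputs and computing the two integrals with any quadrature rule.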

Apart from the chain rule theorem from the preceding subsection, other results on the random calculus are required. We state the random Leibniz rule for differentiating parametric \(\mathrm {L}^p\)-Riemann integrals. In general, when tackling RDE problems of any type, the idea is similar: extend the theory of ordinary calculus so that the candidate solution can be differentiated in the random Lebesgue sense.

Theorem 2.16

[29, proposition 3] Let F(t, s) be a stochastic process on \([a,b]\times [c,d]\). Let \(u,v:[a,b]\rightarrow [c,d]\) be two differentiable deterministic functions. Suppose that \(F(t,\cdot )\) is \(\mathrm {L}^p\)-continuous on [c, d], for each \(t\in [a,b]\), and that \(\frac{\partial F}{\partial t}(t,s)\) exists in the \(\mathrm {L}^p\)-sense and is \(\mathrm {L}^p\)-continuous on \([a,b]\times [c,d]\). Then \(H(t)=\int _{u(t)}^{v(t)} F(t,s)\,\mathrm {d} s\) is \(\mathrm {L}^p\)-differentiable and

$$\begin{aligned} H'(t)=v'(t)F(t,v(t))-u'(t)F(t,u(t))+\int _{u(t)}^{v(t)} \frac{\partial F}{\partial t}(t,s)\,\mathrm {d} s \end{aligned}$$

(the integral is considered as an \(\mathrm {L}^p\)-Riemann integral).

This theorem allows one to differentiate the candidate (2.10) rigorously. The main results on the existence and uniqueness of a solution to (2.9) are the following. The first theorem assumes boundedness of the coefficients a and b, which matches the theory on ordinary RDEs [93, 96]. The second theorem increases the index of the Lebesgue spaces corresponding to f and g, so that the conditions on a and b can be weakened to allow typical unbounded distributions.

Theorem 2.17

[29, theorem 7] Fix \(1\le p<\infty \). Suppose that a and b are bounded random variables, g belongs to \(C^1([-\tau ,0])\) in the \(\mathrm {L}^{p}\)-sense, and f is continuous on \([0,\infty )\) in the \(\mathrm {L}^{p}\)-sense. Then the stochastic process x(t) defined by (2.10) is the unique \(\mathrm {L}^p\)-solution to (2.9).

Theorem 2.18

[29, theorem 6] Fix \(1\le p<\infty \). Suppose that \(\phi _a(\zeta )<\infty \) for all \(\zeta \in {\mathbb {R}}\), b has absolute moments of any order, g belongs to \(C^1([-\tau ,0])\) in the \(\mathrm {L}^{p+\eta }\)-sense, and f is continuous on \([0,\infty )\) in the \(\mathrm {L}^{p+\eta }\)-sense, for certain \(\eta >0\). Then the stochastic process x(t) defined by (2.10) is the unique \(\mathrm {L}^p\)-solution to (2.9).

2.6 Sample-path solutions vs. strong solutions

In applications, particularly computational UQ, sample-path solutions are nearly always considered. These represent the most straightforward generalization of deterministic solutions to the random setting. Why, then, should one deal with strong solutions? First, there is a purely mathematical concern: strong solutions represent a different notion of solution, and it is of mathematical interest to investigate existence theorems. (Recall that strong solutions, if they exist, coincide with the sample-path solution for ordinary RDEs.) Second, when working with sample-path solutions, there is no guarantee that they have finite absolute moments, in particular finite variance. Lebesgue spaces are the natural places to work with statistical moments, so strong solutions ensure the existence of certain statistics of interest. Nonetheless, it is true that a strong solution is not only concerned with finite absolute moments, but also with the existence of a derivative in the metric coming from the Lebesgue norm; admittedly, this strong differentiation may be more than is needed for a task of computational UQ. However, in some cases, strong solutions may be useful in applications, especially when they are related to strong convergence. Convergence of a random sequence in \(\mathrm {L}^p(\Omega )\) implies the convergence of moments up to order p; in particular, mean-square convergence preserves the convergence of the mean and the variance. For example, under the reviewed Frobenius method, the moments of the solution may be approximated, at an exponential convergence rate, by the moments of a truncated sum, which are easy to compute by applying the linearity of the expectation and by using precomputed moments of the inputs. In other cases, though, the study of strong solutions may not offer an advantage for stochastic computations; for instance, the advection RPDE treated before.

3 Simulation of RDEs

The main interest when simulating RDE models lies in the extraction of the statistical content of the response. This is UQ. One may wish to estimate moments or, if possible, the probability density function. The classical approach for moment estimation is based on Monte Carlo sampling [7, 46, 110]. Rooted in the law of large numbers, it is a method that always converges as the number of realizations M increases, it is easy to implement if there exists a solver for the deterministic model, and it is robust (neither the implementation nor the convergence speed depends on t or on the dimension of the random space). However, the estimates are random quantities in nature, and the rate of convergence is \({\mathcal {O}}(M^{-1/2})\), with the overall cost depending on how long it takes to integrate each simulated deterministic model. For example, for three decimals of accuracy, \(10^6\) resolutions may be required. (In this regard, there is no guarantee that this or any specific pre-assigned number will suffice; a procedure that selects M adaptively may be used, as in the sketch below.) Sampling is based on realizations, which only provide local information; this penalizes the efficiency when determining the global variability. Thus, alternative strategies are needed to accelerate Monte Carlo simulation and its variance-reduction variants, at least for random dimensionality and independent variables of low or moderate size. In this section, we first revisit non-statistical methods of stochastic expansions for moment estimation (Taylor, perturbation and polynomial chaos expansions) and inverse parameter estimation techniques (experts' judgment, maximum entropy principle and Bayesian inference), for which there is an extensive list of reviews [51, 64, 71, 76, 83, 92, 97, 98, 109, 110]. Then, as alternatives to kernel methods for density estimation in certain situations, we review the random variable transformation technique, Liouville's equation, and hybrid methods based on stochastic representations [12, 15, 19, 56, 58]. These techniques for density estimation have not previously received such emphasis in the literature.
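A minimal sketch of such an adaptive selection, based on the central limit theorem, is the following: sampling continues until the approximate 95% confidence half-width of the running mean falls below a tolerance. Here `simulate_once` is a hypothetical user-supplied solver returning one realization of the quantity of interest.

```python
# CLT-based adaptive Monte Carlo: stop when the approximate 95% confidence
# half-width of the running mean is below `tol` (no hard guarantee).
import numpy as np

def adaptive_mc(simulate_once, tol=1e-3, batch=1000, max_m=10**7):
    samples = []
    half_width = np.inf
    while len(samples) < max_m and half_width >= tol:
        samples.extend(simulate_once() for _ in range(batch))
        arr = np.asarray(samples)
        half_width = 1.96*arr.std(ddof=1)/np.sqrt(arr.size)
    return arr.mean(), half_width, arr.size
```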

3.1 Stochastic expansions: approximation of statistics

3.1.1 Power series

The most straightforward methods for UQ are the randomized Taylor series and perturbation expansions. The former method was analyzed in the preceding section when discussing the Frobenius method for linear ordinary and fractional RDEs. For nonlinear equations, the viscous Burgers' RPDE has been tackled by employing power series and difference equations [59]. By using as reference the exact random field solution that arises from the Cole-Hopf transformation, the mean-square convergence of the series was tested numerically. Contrary to the well-studied case of linear RDEs, convergence can only be expected in a small space-time neighborhood, when the input random parameters have small dispersion. Nonetheless, in the region of convergence, rapid approximations of the main statistics and of the density function may be determined at virtually no computational cost. Another view of Taylor series is by means of the Lie transformation; among other results, the application of the Lie transformation to the random SIR epidemic model may be consulted in [63]. On the other hand, the perturbation method, introduced in the sixties [34] to study nonlinear vibrations, expands the stochastic solution as a mean-square power series in terms of the centered perturbation [93]. The coefficients are obtained recursively through differential equations by matching terms according to the perturbation powers. These expansions may exhibit a (non-uniform) exponential convergence rate in the mean-square sense, thus giving exponentially fast approximations to the expectation and the variance of the response. However, for complex models with strong nonlinearities, these expansions may be difficult to apply and are typically truncated at low-order terms. Some recent applications of the perturbation method in Mechanical Engineering are [31, 65, 66].

Let us see two examples of application. The first one revisits the formulae used in [59] for the viscous Burgers' RPDE. The second one reviews the application of the perturbation method from [60] to Burgers' equation.

Example 3.1

Consider

$$\begin{aligned} {\left\{ \begin{array}{ll} \frac{\partial u}{\partial t}+u\frac{\partial u}{\partial x}=\nu \frac{\partial ^2 u}{\partial x^2},&{} x\in {\mathbb {R}},\,t>0, \\ u(x,0)=f(x),&{} x\in {\mathbb {R}}. \end{array}\right. } \end{aligned}$$

For forward UQ, the following formulas may be used:

$$\begin{aligned} U(k,h)= & {} \frac{1}{k!h!}\left. \frac{\partial ^{k+h} u(x,t)}{\partial x^k \partial t^h}\right| _{\begin{array}{c} x=0 \\ t=0 \end{array}}; \\ u(x,t)= & {} \sum _{k=0}^\infty \sum _{h=0}^\infty U(k,h)x^kt^h;\\ U(k,h+1)= & {} \frac{1}{h+1}\left\{ \nu (k+2)(k+1)U(k+2,h)\right. \\&\left. -\sum _{r=0}^k\sum _{s=0}^h (k-r+1)U(r,h-s)U(k-r+1,s)\right\} ; \\ u^{K,H}(x,t)= & {} \sum _{k=0}^K\sum _{h=0}^H U(k,h)x^kt^h. \end{aligned}$$

For a moment \({\mathbb {E}}\left[ (u^{K,H}(x,t))^q\right] \), the series is expanded and the expectation is computed by linearity. The series only converges in a limited neighborhood, when the input random parameters have low variability. A similar behavior is expected for other models of Fluid Dynamics. Nonetheless, within the region of convergence, the convergence is exponentially fast and the computational cost is very low. Further details and numerical experiments appear in [59].
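The recursion is straightforward to implement per realization of the random inputs. The sketch below assumes, purely for illustration, the linear initial datum \(f(x)=cx\) (so \(U(1,0)=c\) and \(U(k,0)=0\) otherwise), with c and \(\nu \) sampled values; for this datum the exact solution is \(u=cx/(1+ct)\), which provides a quick correctness check of the truncation.

```python
# Differential-transform recursion of Example 3.1 for Burgers' equation,
# evaluated for one realization of the random inputs.
def burgers_taylor(U0, nu, K, H):
    """U0: Taylor coefficients U(k, 0) of f at x = 0, k = 0..K+2H."""
    U = {(k, 0): U0[k] for k in range(len(U0))}
    for h in range(H):
        for k in range(K + 2*(H - h - 1) + 1):
            conv = sum((k - r + 1)*U[(r, h - s)]*U[(k - r + 1, s)]
                       for r in range(k + 1) for s in range(h + 1))
            U[(k, h + 1)] = (nu*(k + 2)*(k + 1)*U[(k + 2, h)] - conv)/(h + 1)
    return U

c, nu, K, H = 0.7, 0.1, 4, 4                # one sampled realization
U0 = [0.0]*(K + 2*H + 1); U0[1] = c         # f(x) = c x
U = burgers_taylor(U0, nu, K, H)
u_KH = lambda x, t: sum(U[(k, h)]*x**k*t**h
                        for k in range(K + 1) for h in range(H + 1))
print(u_KH(0.5, 0.2))           # compare with the exact c x/(1 + c t) here
```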

Example 3.2

Let us review an example for the viscous Burgers' equation in steady state: the ordinary RDE \(uu'=\nu u''\), \(x\in (-1,1)\), with boundary conditions \(u(-1)=1+\delta \) and \(u(1)=-1\). There is a small random perturbation to the left boundary, and \(\nu \) is considered constant. In practice, this randomness usually arises from measurement errors or from incomplete knowledge of the true physics. The transition layer z, which is supersensitive to \(\delta \), is the zero of the expectation: \({\mathbb {E}}[u(z)]=0\). Its standard deviation is \(\sigma (z)=\sqrt{{\mathbb {V}}[u(z)]}\). The perturbation method may be used to estimate z and \(\sigma (z)\). Consider an expansion of the form \(u(x)=\sum _{n=0}^\infty u_n(x)\xi ^n\), where \(u_n(x)\) are unknown deterministic functions and \(\xi =\delta -{\mathbb {E}}[\delta ]\). Matching terms according to the powers of \(\xi \) yields:

$$\begin{aligned}&{\mathcal {O}}(\xi ^n), n \text { even:}\quad {\left\{ \begin{array}{ll} \sum _{j=0}^{n/2-1} (u_j u_{n-j})'+u_{n/2}u_{n/2}'=\nu u_n'',\; x\in (-1,1), \\ u_n(-1)=0,\; u_n(1)=0; \end{array}\right. } \\&{\mathcal {O}}(\xi ^n), n \text { odd:}\quad {\left\{ \begin{array}{ll} \sum _{j=0}^{(n-1)/2} (u_j u_{n-j})'=\nu u_n'',\; x\in (-1,1), \\ u_n(-1)=0,\; u_n(1)=0. \end{array}\right. } \end{aligned}$$

Once the coefficients are known (note that the lowest orders carry the boundary data at \(x=-1\): expanding \(u(-1)=1+{\mathbb {E}}[\delta ]+\xi \) gives \(u_0(-1)=1+{\mathbb {E}}[\delta ]\) and \(u_1(-1)=1\), while the homogeneous boundary conditions displayed apply to the remaining orders), define the truncations

$$\begin{aligned}&u(x)\approx u^N(x)=\sum _{n=0}^N u_n(x)\xi ^n; \\&{\mathbb {E}}[u(x)] \approx {\mathbb {E}}[ u^N(x)]=\sum _{n=0}^N u_n(x)M_n; \\&{\mathbb {E}}[ u(x)^2] \approx {\mathbb {E}}[ (u^N(x))^2]= \sum _{n_1,n_2=0}^N u_{n_1}(x)u_{n_2}(x)M_{n_1+n_2}; \\&{\mathbb {E}}[ u^N(z^N)]=0, \quad \sigma _N(z)=\sqrt{{\mathbb {E}}[ (u^N(z^N))^2]}; \end{aligned}$$

where \(M_n={\mathbb {E}}[(\delta -{\mathbb {E}}[\delta ])^n]\) is the n-th centered moment of \(\delta \). Numerical experiments in [60] show exponential convergence of these approximations as N increases. Thus, the statistical information of the solution may be obtained rapidly, improving on Monte Carlo simulation.
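As an illustration, the leading-order term \(u_0\) can be computed with a standard two-point boundary value solver. The sketch below assumes the averaged boundary values for \(u_0\) discussed above; the values of \(\nu \) and \({\mathbb {E}}[\delta ]\) are illustrative.

```python
# Leading-order perturbation term for steady Burgers: u0 u0' = nu u0''.
import numpy as np
from scipy.integrate import solve_bvp

nu, Edelta = 0.05, 0.01

def rhs(x, y):                              # y[0] = u0, y[1] = u0'
    return np.vstack([y[1], y[0]*y[1]/nu])

def bc(ya, yb):
    return np.array([ya[0] - (1.0 + Edelta), yb[0] + 1.0])

x = np.linspace(-1.0, 1.0, 401)
y_init = np.vstack([-x, -np.ones_like(x)])  # rough initial guess
sol = solve_bvp(rhs, bc, x, y_init, max_nodes=20_000)
z0 = sol.x[np.argmin(np.abs(sol.y[0]))]     # leading transition-layer estimate
```

The higher-order terms \(u_n\) solve the linear boundary value problems displayed above and can be obtained with the same solver, reusing the computed lower-order terms.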

3.1.2 Orthogonal expansions

There is an extensive list of reviews on this kind of expansions [51, 64, 71, 76, 83, 92, 97, 98, 109, 110], so we only give here a brief exposition on them.

The polynomial chaos (PC) method expands the random solution in terms of Hermite polynomials when there is a functional dependence on independent Gaussian parameters [48]. Mean-square convergence of the expansion holds by the Cameron-Martin theorem [22]. This approach was extended in 2002, via the Askey scheme, to deal with types of input uncertainty beyond the Gaussian one, giving rise to generalized polynomial chaos (gPC) expansions [112]. Mean-square convergence is ensured when the moment problem for the inputs is uniquely solvable (this includes bounded random variables and, more generally, random variables with finite moment-generating function; the lognormal distribution, for instance, fails to be determined by its moments) [45].

A real function may be expanded, in an optimal way, in terms of orthogonal polynomials in the space of square-integrable functions with a weight. It was observed that the weights may be taken as probability density functions, so the solution to a random system can be expanded in the mean-square sense in terms of Hermite, Legendre, Laguerre, etc., orthogonal polynomials, depending on the distribution of the input parameters (normal, uniform, gamma, etc., respectively). Spectral convergence holds, that is, the speed of convergence depends on the smoothness of the functional dependence on the random inputs. For example, for \(C^\infty \) dependence, there is exponential convergence; for \(C^r\) dependence, \(1\le r<\infty \), the convergence is algebraic; and for discontinuous functions, the convergence becomes slower and, as for Fourier expansions, the Gibbs phenomenon appears; see [110]. The convergence rate is not uniform in t [47, 50]. The method may give a huge improvement over Monte Carlo simulation when the random dimensionality and t are of low or moderate size; otherwise, Monte Carlo simulation is unbeatable. If \(Y=g(\xi )\), where Y and \(\xi \) are random vectors and the components of \(\xi \) are independent, then

$$\begin{aligned} Y(\omega )=\sum _{i=1}^\infty {\hat{Y}}_i\phi _i(\xi (\omega )) \end{aligned}$$

in mean square, where \({\hat{Y}}_i={\mathbb {E}}[Y\phi _i(\xi )]/{\mathbb {E}}[\phi _i(\xi )^2]\) are constants and \(\phi _i\) are orthogonal polynomials with respect to the density of \(\xi \). Polynomial chaos (PC) expansions arise when \(\xi \) is Gaussian and \(\phi _i\) are tensor Hermite polynomials. Generalized polynomial chaos (gPC) expansions appear when \(\xi \) belongs to the Askey scheme, and \(\phi _i\) is constructed through tensor products. In general, one may perform a Gram-Schmidt orthogonalization procedure starting from the canonical basis [107]. In the context of RDE problems \(x'(t,\xi )=f(t,x(t,\xi ),\xi )\), \(x(t_0,\xi )=x_0(\xi )\) [we make the inputs explicit], the solution is expanded in terms of \(\xi \):

$$\begin{aligned} x(t,\xi )=\sum _{i=0}^\infty {\hat{x}}_i(t)\phi _i(\xi ). \end{aligned}$$

To obtain the deterministic coefficients, the Galerkin method is usually applied [112]:

$$\begin{aligned}&x(t,\xi )\approx x^P(t,\xi )=\sum _{i=0}^P {{\tilde{x}}}_{i,P}(t)\phi _i(\xi ), \\&{\left\{ \begin{array}{ll} ({{\tilde{x}}}_{k,P})'(t){\mathbb {E}}[\phi _k(\xi )^2]={\mathbb {E}}[(x^P)'(t,\xi )\phi _k(\xi )]={\mathbb {E}}[f(t,x^P(t,\xi ),\xi )\phi _k(\xi )], \\ {{\tilde{x}}}_{k,P}(t_0){\mathbb {E}}[\phi _k(\xi )^2]={\mathbb {E}}[x^P(t_0,\xi )\phi _k(\xi )]={\mathbb {E}}[x_0(\xi )\phi _k(\xi )],\;\; k=0,\ldots ,P. \end{array}\right. } \end{aligned}$$

Other gPC approaches, of non-intrusive nature [2, 108, 111], use quadratures, interpolation or least-squares fitting, etc., for the estimation of the coefficients. The choice depends on the system, its nonlinearities, and whether the Galerkin system for the coefficients is easy to set up and integrate [109]. If an efficient integrator is available for the Galerkin system, then, with a single resolution of it, functional representations of the model response are obtained from the Galerkin projections. The expectation, the variance, and any statistical information of the response may be extracted from the polynomial approximation by simple post-processing. Some applications of gPC expansions in epidemiology and engineering are [25, 61, 85, 88, 105, 113]. When the inputs \(\xi \) are not independent, other techniques must be used, which increases the complexity; in [62, section 4.1], which studies random Hamiltonian systems and polynomial representations of their stochastic solutions, there is a brief review of this situation. Finally, for problems involving discontinuous dependence of the solution on the random inputs, there are alternatives to global polynomial representations: expansions using piecewise polynomials, multielement gPC, and wavelets [71, 110].
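As a concrete sketch of the intrusive Galerkin projection above, consider the scalar linear RDE \(x'=-kx\), \(x(0)=1\), with \(k=k_0+\sigma \xi \), \(\xi \sim \text {Uniform}(-1,1)\) (illustrative inputs), expanded in Legendre polynomials. The projected system is linear and is integrated with a standard ODE solver.

```python
# Intrusive gPC-Galerkin for x' = -(k0 + sigma*xi) x, xi ~ Uniform(-1, 1).
import numpy as np
from numpy.polynomial.legendre import leggauss, legval
from scipy.integrate import solve_ivp

k0, sigma, P = 1.0, 0.3, 6
nodes, weights = leggauss(50)
weights = weights/2.0                       # density of Uniform(-1, 1)

def leg(i, x):                              # Legendre polynomial P_i(x)
    c = np.zeros(i + 1); c[i] = 1.0
    return legval(x, c)

norms = np.array([1.0/(2*j + 1) for j in range(P + 1)])     # E[P_j^2]
E_xi = np.array([[np.sum(weights*nodes*leg(i, nodes)*leg(j, nodes))
                  for i in range(P + 1)] for j in range(P + 1)])

def galerkin_rhs(t, xhat):                  # projected (Galerkin) equations
    return -(k0*xhat + sigma*(E_xi @ xhat)/norms)

sol = solve_ivp(galerkin_rhs, (0.0, 5.0), np.eye(P + 1)[0], rtol=1e-10)
xhat_T = sol.y[:, -1]
mean = xhat_T[0]                            # E[x(5)]
var = np.sum(xhat_T[1:]**2*norms[1:])       # V[x(5)]
```

The exact reference here is \({\mathbb {E}}[x(t)]=\frac{1}{2}\int _{-1}^1 \mathrm {e}^{-(k_0+\sigma s)t}\,\mathrm {d}s\), against which the spectral convergence in P can be observed.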

Remark 3.3

These methods for UQ assume that there is an integrator for the governing model, or for a more complex version of it, in the deterministic context. When the differential equation map is smooth, classical methods are applicable: Runge-Kutta, finite differences, finite elements, etc. However, in an RDE, a merely Hölder continuous process may be present (for example, \(x'(t,\omega )=-x(t,\omega )+B(t,\omega )\) is well-defined, where B is a Brownian motion). In such a case, classical integrators are not applicable or rarely attain their traditional order, and alternative methods need to be sought. In the book [54], the authors present Taylor- and SDE-based schemes. These schemes make it possible to obtain realizations of the solution and thus to apply Monte Carlo simulation or any collocation method. Another approach consists in truncating series representations of the input stochastic processes (for example, the Karhunen-Loève expansion of B) to reduce the random dimensionality of the problem to a finite degree [109, 110]; this may give rise to a smooth RDE problem, for which classical integrators are of use.
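
As a brief illustration of the last approach, the following sketch (Python; the truncation level and all data are assumed) replaces B by a truncated Karhunen-Loève expansion on [0, 1], so that one sample path of \(x'(t,\omega )=-x(t,\omega )+B(t,\omega )\) can be integrated with a classical Runge-Kutta scheme.

```python
# Hypothetical sketch: truncated Karhunen-Loeve expansion of Brownian motion,
# B_N(t) = sum_{k=1}^N sqrt(2) sin((k-1/2) pi t)/((k-1/2) pi) * xi_k on [0,1],
# turning x' = -x + B(t) into a smooth RDE with N random inputs.
import numpy as np

rng = np.random.default_rng(0)
N = 20                                   # truncation level (random dimensionality)
xi = rng.standard_normal(N)              # xi_k ~ iid Normal(0, 1)
freq = (np.arange(1, N + 1) - 0.5) * np.pi

def B_N(t):
    # truncated Karhunen-Loeve expansion evaluated at time t
    return np.sum(np.sqrt(2.0) * np.sin(freq * t) / freq * xi)

# One sample path of x' = -x + B_N(t), x(0) = 0, by classical RK4: the
# truncated input is smooth in t, so the integrator keeps its usual order.
f = lambda t, y: -y + B_N(t)
h, T = 1e-2, 1.0
t_grid = np.arange(0.0, T + h, h)
x = np.zeros_like(t_grid)
for i in range(t_grid.size - 1):
    t0, y = t_grid[i], x[i]
    k1 = f(t0, y)
    k2 = f(t0 + h / 2, y + h / 2 * k1)
    k3 = f(t0 + h / 2, y + h / 2 * k2)
    k4 = f(t0 + h, y + h * k3)
    x[i + 1] = y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
```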

Remark 3.4

Methods based on gPC expansions work for different types of RDE problems: delayed, fractional, etc. [49]. They are also usable for random difference equations, with intrusive (discrete Galerkin projection technique) and non-intrusive variants [13, 17]; although deterministic difference equations are easily solvable by recursion, the use of polynomial expansions may render a significant improvement over Monte Carlo simulation for UQ.

3.1.3 Input uncertainties

All of these techniques are valuable tools in the context of forward UQ, that is, when the probability distributions of the input random parameters are previously fixed and the statistical content of the response is sought with efficient numerical methods. But, in general, the uncertainties associated to the inputs are unknown. Given a physical system described by a set of identifiable parameters, assume a collection of observations of some measurable features of the system. An inverse problem consists in inferring characteristics of the parameters. Despite the simplicity of the definition, inverse modeling is challenging. In real systems, there is uncertainty associated with the physics and measurement errors. This implies that unknowns are better treated as random quantities. Therefore, inferring the parameters means estimating their probability distributions. This is one of the main themes in UQ [92].

In order to build the probabilistic model of the input parameters, one mostly takes into account the nature and the amount of available data for the response [97]. The approaches are described hereunder.

When no data or practically no data are recorded for the response, experts’ judgment may help to prescribe the input distributions [86]. The principle of maximum entropy may be a useful tool in such a case [7, 40, 80, 102]. It is based on maximizing the Shannon entropy functional (the ignorance) for the density, subject to the available information (consistency), for example, support, mean, dispersion, etc. For instance, if only a bounded support is known, then the uniform distribution maximizes the entropy; if only the mean and the variance are specified, then the Gaussian law is selected; and if the support is positive and the mean value is known, then the least biased distribution is the exponential. Alternatively, if several parameter measurements are at one's disposal, then kernel density estimation or the empirical distribution assignment may be valid options.
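
As a small illustration of the latter option, the following sketch (Python; the measurements are assumed, purely for illustration) assigns an input distribution via Gaussian kernel density estimation.

```python
# Hypothetical sketch: input distribution from a few parameter measurements
# via kernel density estimation (the data below are assumed).
import numpy as np
from scipy.stats import gaussian_kde

measurements = np.array([0.82, 0.91, 0.78, 0.95, 0.88, 0.84])  # assumed data
kde = gaussian_kde(measurements)        # bandwidth by Scott's rule (default)
grid = np.linspace(0.6, 1.1, 200)
density = kde(grid)                     # smooth density estimate for the input
samples = kde.resample(10_000)[0]       # draws usable for Monte Carlo UQ
```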

When a minimum amount of data is available for the response and the model complexity allows it, tools from statistical inference may be applied. Bayesian inference is a rigorous approach to estimate input probability distributions and also model discrepancy [72]. It is a natural mechanism to incorporate the distributions of the observational errors and the prior beliefs on the parameters. These give rise to posterior densities of the parameters, which are used to extract their statistical content (mean, variance, mode, higher-order moments, or any other quantity of interest). This inference may be carried out through quadrature integration of the Bayes’ formula, or through Markov chain Monte Carlo algorithms for elaborate models. Mathematically, from the (possibly scarce) data d of size m, and from a prior density \(\pi (\zeta )\) of the parameters \(\zeta \) of the RDE problem (noninformative, or underpinned by experience, intuition or other studies and measurements, with tools such as the principle of maximum entropy, but independent of the current data d), simulations from the posterior distribution of the parameters are obtained. The Bayesian model is:

$$\begin{aligned} {\left\{ \begin{array}{ll} \text {Statistical model: } x_i|\zeta ,\sigma ^2\sim \text {Normal}\left( G(t_i,\zeta ),\sigma ^2\right) \\ \text {Prior: }\zeta \sim \pi (\zeta ),\;\sigma ^2\sim \pi (\sigma ^2). \end{array}\right. } \end{aligned}$$

Here \(x(t_i,\omega )=x_i(\omega )\) models the datum \(d_i\), \(i=1,\ldots ,m\), and \(G(t,\zeta )\) is the deterministic solution to the differential system. Conditionally on the parameters, the observations are independent. Notice that a model error is taken into account here; for forward problems, a perfect model \(x(t)=G(t,\zeta )\) is assumed, since the interest is in the propagation of uncertainty and the development of fast numerical methods, but for Bayesian inverse problems, the assumption of no model discrepancy is too restrictive. The goal of the Bayesian inference is the posterior distribution of \((\zeta ,\sigma ^2)\):

$$\begin{aligned} \pi (\zeta ,\sigma ^2|d)=\frac{\pi (d|\zeta ,\sigma ^2)\pi (\zeta ,\sigma ^2)}{\int \pi (d|\zeta ',\sigma '^2)\pi (\zeta ',\sigma '^2)\,\mathrm {d}\zeta '\,\mathrm {d}\sigma '^2}. \end{aligned}$$

From it, the posterior predictive distribution (the density for new observations) is

$$\begin{aligned} \pi ({\tilde{x}}|d)=\int \pi ({\tilde{x}}|\zeta ,\sigma ^2)\pi (\zeta ,\sigma ^2|d)\,\mathrm {d} \zeta \,\mathrm {d}\sigma ^2. \end{aligned}$$

A detailed example of application may be consulted in the recent paper [18]. It investigates a compartmental social model, where several inverse parameter estimation and sensitivity analysis techniques are applied to derive public strategies.

Markov chain Monte Carlo procedures (for example, the Metropolis algorithm) usually rely on the repeated resolution of the forward deterministic problem. Thus, functional representations of the model response may become valuable tools to speed up the procedure, especially those that exhibit fast mean-square convergence (e.g. exponential). In recent years, Bayesian inference has been combined with gPC expansions, by employing the stochastic Galerkin projection technique, the stochastic collocation method, and least-squares minimization [78, 79, 82], although the method presents limitations when the expansions are not sufficiently accurate at reasonable cost or the data are very informative [74]. Mathematically, if \(G(t,\zeta )\) is not known in closed form, then consider a mean-square approximation \(G^P(t,\zeta )\). The new approximate statistical model is

$$\begin{aligned} \pi ^P(x_i|\zeta ,\sigma ^2)=\pi _{\text {Normal}(G^P(t_i,\zeta ),\sigma ^2)}(x_i). \end{aligned}$$

The Bayes’ formula is

$$\begin{aligned} \pi ^P(\zeta ,\sigma ^2|d)=\frac{\pi ^P(d|\zeta ,\sigma ^2)\pi (\zeta ,\sigma ^2)}{\int \pi ^P(d|\zeta ',\sigma '^2)\pi (\zeta ',\sigma '^2)\,\mathrm {d}\zeta '\,\mathrm {d}\sigma '^2}. \end{aligned}$$

One has convergence

$$\begin{aligned} \lim _{P\rightarrow \infty }\pi ^P(\zeta ,\sigma ^2|d)=\pi (\zeta ,\sigma ^2|d),\quad \lim _{P\rightarrow \infty }\pi ^P({\tilde{x}}|d)=\pi ({\tilde{x}}|d), \end{aligned}$$

in the sense of the Kullback-Leibler divergence, which measures the distance between probability distributions by relative entropy.
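
To make the procedure concrete, the following minimal sketch (Python; the forward surrogate, the prior, and the synthetic data are all assumed for illustration) samples the posterior of a scalar parameter \(\zeta \) with a random-walk Metropolis algorithm, evaluating only the surrogate \(G^P\) at each step.

```python
# Minimal random-walk Metropolis sketch for pi(zeta | d); the surrogate G_P,
# the flat prior on zeta > 0 and the synthetic data are assumed (illustrative).
import numpy as np

rng = np.random.default_rng(1)
t_obs = np.linspace(0.0, 5.0, 20)
G_P = lambda t, zeta: np.exp(-zeta * t)        # assumed surrogate forward map
sigma2 = 0.01                                  # known noise variance
d = G_P(t_obs, 0.7) + rng.normal(0.0, np.sqrt(sigma2), t_obs.size)

def log_post(zeta):
    if zeta <= 0.0:
        return -np.inf                         # flat prior on zeta > 0
    resid = d - G_P(t_obs, zeta)
    return -0.5 * np.sum(resid**2) / sigma2    # Gaussian log-likelihood

chain, zeta, lp = [], 1.0, log_post(1.0)
for _ in range(20_000):
    prop = zeta + 0.05 * rng.standard_normal() # random-walk proposal
    lp_prop = log_post(prop)
    if np.log(rng.uniform()) < lp_prop - lp:   # Metropolis accept/reject
        zeta, lp = prop, lp_prop
    chain.append(zeta)
posterior = np.array(chain[5_000:])            # discard burn-in
print(posterior.mean(), posterior.std())       # posterior summaries of zeta
```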

In the following example, we study the reverse situation to Example 3.2, namely inverse UQ from data. Perturbation expansions are taken to approximate the solution to Burgers’ RDE.

Example 3.5

Given noisy observations of the transition layer position (z such that \(u(z)=0\)), \(d=(z_1,\ldots ,z_{n_d})\), with known variance \(\sigma ^2\), what is the probability distribution of the random perturbation \(\delta \)? Given the solution \(u_\delta (x)\) with transition layer \(z_\delta \), define a Bayesian model:

$$\begin{aligned} d|\delta \sim \pi (d|\delta )=\displaystyle \prod _{i=1}^{n_d} \pi _{\text {Normal}(z_\delta ,\sigma ^2)}(z_i), \;\; \delta \sim \pi (\delta ). \end{aligned}$$

The posterior density of \(\delta \) is

$$\begin{aligned} \pi (\delta |d)=\frac{\pi (d|\delta )\pi (\delta )}{\int _{-\infty }^\infty \pi (d|\delta ')\pi (\delta ')\,\mathrm {d}\delta '}. \end{aligned}$$

Let \(u_\delta ^N(x)=\sum _{n=0}^N u_{n,\delta }(x)\xi ^n\) be the perturbation expansion of \(u_\delta (x)\). Let \(z_\delta ^N\) be the zero of \(u_\delta ^N(x)\). Define a surrogate Bayesian model:

$$\begin{aligned} d|\delta \sim \pi ^N(d|\delta )=\displaystyle \prod _{i=1}^{n_d} \pi _{\text {Normal}(z_\delta ^N,\sigma ^2)}(z_i), \;\;\delta \sim \pi (\delta ). \end{aligned}$$

The posterior density of \(\delta \) is

$$\begin{aligned} \pi ^N(\delta |d)=\frac{\pi ^N(d|\delta )\pi (\delta )}{\int _{-\infty }^\infty \pi ^N(d|\delta ')\pi (\delta ')\,\mathrm {d}\delta '}. \end{aligned}$$

In the sense of the Kullback-Leibler divergence, \(\pi ^N\) converges to the target \(\pi \).

Remark 3.6

Other modeling frameworks and uncertainty treatments may be possible. For example, in frequentist inference, parameters are regarded as unknown constants, which are estimated (pointwise values, confidence intervals, hypothesis testing) through a sampling distribution. In contrast to Bayesian inference, the parameter is not treated as a random variable and the sampling distribution does not provide a distribution for it. The model error is characterized by a random quantity, which gives rise to a statistical model. Nonlinear regression, as an extension of linear regression, is an adequate tool for parameter calibration [92, chapter 7], [18].

3.2 Density estimation: exact and approximate methods

3.2.1 Exact methods

If the input-output relation of the model is given in closed form, one may use the so-called random variable transformation (RVT) technique, or probability transformation method (PTM):

$$\begin{aligned} Z=g(Y),\; \dim (Z)=\dim (Y),\; h=g^{-1} \; \Rightarrow \;\pi _Z(z)=\pi _Y(h(z))|Jh(z)|, \end{aligned}$$

where Y and Z are random vectors, \(\pi \) denotes the corresponding probability density function, g is a non-random injective transformation with inverse h, and \(Jh\) is the Jacobian determinant of h. The formula is based on the preservation of probabilities [99]. After marginalizing (integration through quadrature rules), the densities of the components of Z can be obtained. Moments are calculated by integration of the density. This method, though classical and simple, has been relatively little explored in mathematical modeling. In the last decade, it has been applied to study ordinary RDEs, RPDEs and random recursive equations [24, 32, 37, 55, 69, 70, 90]. It may even be used for inverse parameter estimation [24].
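
As a simple check of the formula, the following sketch (Python; the map g and the law of Y are assumed toy choices) computes the RVT density of \(Z=g(Y)=\mathrm {e}^{-Y}\) for \(Y\sim \mathrm {Normal}(0,1)\) and compares it against a Monte Carlo histogram.

```python
# RVT sketch for Z = g(Y) = exp(-Y), Y ~ Normal(0,1): h(z) = -log(z),
# |h'(z)| = 1/z, hence pi_Z(z) = pi_Y(-log z)/z (the lognormal density).
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)
z = np.linspace(0.05, 5.0, 300)
pi_Z = norm.pdf(-np.log(z)) / z                # RVT density

samples = np.exp(-rng.standard_normal(100_000))
counts, edges = np.histogram(samples, bins=100, range=(0.05, 5.0))
hist = counts / (samples.size * np.diff(edges))  # empirical density
centers = 0.5 * (edges[:-1] + edges[1:])
# the histogram matches the RVT density up to Monte Carlo error:
print(np.max(np.abs(hist - norm.pdf(-np.log(centers)) / centers)))
```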

There is an alternative version of the RVT technique when the transformation mapping g consists of sums and products. Let A be a random variable with a density \(\pi _A\), independent of the random vector \((Z_1,Z_2)\) (which need not have a density), where \(Z_1\ne 0\) a.s. Then the random variable \(Z_1A+Z_2\) has the density

$$\begin{aligned} \pi _{Z_1A+Z_2}(z)={\mathbb {E}}\left[ \pi _A\left( \frac{z-Z_2}{Z_1}\right) \frac{1}{|Z_1|}\right] ,\quad z\in {\mathbb {R}}. \end{aligned}$$
(3.1)

This is proved by using the law of total probability. To draw the analogy with the customary RVT formula, there is a transformation mapping \(g(A,Z_1,Z_2)=(Z_1A+Z_2,Z_1,Z_2)\) on \({\mathbb {R}}^3\), whose inverse is \(h(Z,Z_1,Z_2)=((Z-Z_2)/Z_1,Z_1,Z_2)\). The Jacobian determinant of h is \(1/Z_1\). If possible, the expectation \({\mathbb {E}}\) in (3.1) should be computed by integration (quadratures). Otherwise, if the explicit probability law of \((Z_1,Z_2)\) is too complicated (because it depends on a significant number of random variables), then Monte Carlo simulation may be applied for \({\mathbb {E}}\), with its variance-reduction variants. The estimation becomes a parametric problem, in contrast to kernel methods: it is faster, it does not depend on kernel or bandwidth choices, and, since it acts pointwise, the support, discontinuities and non-differentiability points of the target density are correctly captured, without smoothing them out [30].
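
The following sketch (Python; the laws of A, \(Z_1\) and \(Z_2\) are assumed, purely for illustration) estimates (3.1) by parametric Monte Carlo simulation, i.e. by the sample mean of the random quantity inside the expectation.

```python
# Parametric Monte Carlo for formula (3.1): density of Z1*A + Z2, with
# A ~ Normal(0,1) independent of (Z1, Z2); the laws of Z1, Z2 are assumed.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)
M = 20_000
Z1 = rng.uniform(1.0, 2.0, M)           # assumed law of Z1 (nonzero a.s.)
Z2 = rng.exponential(1.0, M)            # assumed law of Z2

z = np.linspace(-4.0, 10.0, 200)
# sample-mean estimate of E[ pi_A((z - Z2)/Z1) / |Z1| ] at each point z:
pi_vals = (norm.pdf((z[:, None] - Z2) / Z1) / np.abs(Z1)).mean(axis=1)
```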

Let us see an example of application, from [19, 56].

Example 3.7

Bateman equations for the radioactive decay chain model are

$$\begin{aligned} N_1'(t)&=-\lambda _1 N_1(t), \\ N_j'(t)&=-\lambda _jN_j(t)+\lambda _{j-1}N_{j-1}(t),\quad j=2,\ldots ,n-1, \\ N_n'(t)&=\lambda _{n-1} N_{n-1}(t). \end{aligned}$$

There is a serial decay chain of n nuclear species \(X_1\rightarrow X_2 \rightarrow \cdots \rightarrow X_n\), where \(\lambda _j\) is the decay rate of the radionuclides of species \(X_j\) into species \(X_{j+1}\). Transitions from \(j+1\) to j are not allowed. The parameter \(\lambda _n\) is 0 because the nuclides of species \(X_n\) are stable. The number of radionuclides of species \(X_j\) at time t is \(N_j(t)\), \(j=1,\ldots ,n\). The closed-form solution is

$$\begin{aligned} N_m(t)=\sum _{i=1}^m \left[ N_i(0)\left( \prod _{j=i}^{m-1} \lambda _j \right) \sum _{j=i}^m \frac{\mathrm {e}^{-\lambda _j t}}{\prod _{p=i,\,p\ne j}^m (\lambda _p-\lambda _j)}\right] . \end{aligned}$$

To pass to a random equation, let us consider random inputs \(\lambda _j\), \(N_i(0)\) (it is still assumed that \(\lambda _n=0\)). If one expresses \(N_m(t)=N_1(0)U_m(t)+V_m(t)\), where

$$\begin{aligned} U_m(t)&=\left( \prod _{j=1}^{m-1} \lambda _j \right) \sum _{j=1}^m \frac{\mathrm {e}^{-\lambda _j t}}{\prod _{p=1,\,p\ne j}^m (\lambda _p-\lambda _j)}, \\ V_m(t)&=\sum _{i=2}^m \left[ N_i(0)\left( \prod _{j=i}^{m-1} \lambda _j \right) \sum _{j=i}^m \frac{\mathrm {e}^{-\lambda _j t}}{\prod _{p=i,\,p\ne j}^m (\lambda _p-\lambda _j)}\right] , \end{aligned}$$

then, by (3.1), the probability density function of \(N_m(t)\) is

$$\begin{aligned} \pi _{N_m}(x;t)&= {\mathbb {E}}\left[ \pi _{N_1(0)} \left( \frac{x-V_m(t)}{U_m(t)} \right) \frac{1}{|U_m(t)|} \right] \\&= \int _{{\mathbb {R}}^{m-1}} \int _{{\mathbb {R}}^{m-1}} \pi _{N_1(0)} \left( \frac{x-V_m(t)}{U_m(t)} \right) \frac{1}{|U_m(t)|} \\&\quad \times \pi _{(N_2(0),\ldots ,N_m(0),\lambda _1,\ldots ,\lambda _{m-1})}(N_2(0),\ldots ,N_m(0),\lambda _1,\ldots ,\lambda _{m-1})\\&\quad \times \mathrm {d}\lambda _1\cdots \mathrm {d}\lambda _{m-1}\,\mathrm {d} N_2(0)\cdots \mathrm {d} N_m(0). \end{aligned}$$

For small n (for example, \(n=3\)), the RVT technique is applied with quadrature-based integration. For larger n, Monte Carlo simulation for \({\mathbb {E}}\) is conducted. More details and numerical examples are given in [19].

More generally, the law of total probability may be employed to go even further than the RVT formula. In order to keep the paper to a reasonable length, let me just refer the reader to an investigation of the advection RPDE [9]. In that contribution, there is a comprehensive study of the estimation of the probability density function, and even of the cumulative distribution function when some input is deterministic. The probabilistic functions of interest are expressed in terms of an expectation, and Monte Carlo simulation is readily applied, as discussed earlier. Several situations are considered and numerical examples are performed.

Another way to view the RVT technique is through Liouville’s equation. It is a PDE that dictates the evolution of the probability density function associated with the solution to an RDE. In the framework of Itô SDEs, which might be more familiar to the reader, Liouville’s equations are usually termed Fokker-Planck equations or forward Kolmogorov equations [3]. For ordinary equations (1.1) whose only source of randomness is the initial state \(x_0\), so that the motion follows a deterministic law once started, different proofs of the Liouville’s equation

$$\begin{aligned} \frac{\partial }{\partial t}\pi (x;t)+\nabla _{x} \cdot \left( f(t,x)\pi (x;t)\right) =0 \end{aligned}$$

[\(\pi (x;t)\) is the density of x(t)] are available in the classical literature, using characteristic functions [68, 93, theorem 6.2.2], dynamical systems theory [99, theorem 8.4], and the principle of preservation of probability [26]. Some applications are [4, 42,43,44, 52]. For RPDEs, there is more work to be done; only some studies on the random transport equation are found in the literature: [91, 39, section 3.1], [38, proposition 3.1], [103, 9, theorem 4.1]. In this part of the survey, we present theoretical results on first-order linear RPDEs that extend those for ordinary RDEs. We rely on the new contribution [58]. In the proof of the theorem, a key fact is the closedness of solutions to first-order linear PDEs under composition. The solution is composed with a test function, the composition is put into the equation, expectation is applied, the derivatives are expanded, and conditional expectations in terms of integrals are employed. Finally, the fundamental lemma of the calculus of variations is applied.
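
For a quick sanity check of the ordinary case, the following sketch (Python with sympy; the equation and the initial law are assumed toy choices) verifies Liouville’s equation symbolically for \(x'=-x\) with \(x_0\sim \mathrm {Normal}(0,1)\), whose solution \(x(t)=x_0\mathrm {e}^{-t}\) has density \(\mathrm {Normal}(0,\mathrm {e}^{-2t})\).

```python
# Symbolic check of Liouville's equation d_t pi + d_x(f*pi) = 0 for the toy
# RDE x' = -x with x0 ~ Normal(0,1), so x(t) ~ Normal(0, exp(-2t)).
import sympy as sp

x, t = sp.symbols('x t', real=True)
s2 = sp.exp(-2 * t)                                  # variance of x(t)
pi = sp.exp(-x**2 / (2 * s2)) / sp.sqrt(2 * sp.pi * s2)
f = -x                                               # drift of the equation
residual = sp.diff(pi, t) + sp.diff(f * pi, x)
print(sp.simplify(residual))                         # prints 0
```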

Theorem 3.8

[58, theorem 2.9] Let

$$\begin{aligned} \sum _{i=1}^n g_i(x)u_{x_i}(x)=0, \end{aligned}$$
(3.2)

where \(x=(x_1,\ldots ,x_n)\in {\mathbb {R}}^n\) is the independent variable, and \(g_i:{\mathbb {R}}^n\rightarrow {\mathbb {R}}\) and \(u:{\mathbb {R}}^n\rightarrow {\mathbb {R}}^m\) are random fields. Let \(g=(g_1,\ldots ,g_n)\). Then, it holds

$$\begin{aligned}&\sum _{I\subseteq [s]} (-1)^{s-|I|}\! \left( \displaystyle \mathrel {\mathop {\otimes }_{i\in I}}\!\nabla _{x_i}\!\right) \\&\quad \cdot \left( \!{\mathbb {E}}\!\left[ \!\left. \left( \displaystyle \prod _{i\in I^c} \left( \nabla _{x_i} \cdot g(x_i)\right) \!\right) \! \left( \displaystyle \mathrel {\mathop {\otimes }_{i\in I}} g(x_i)\!\right) \right| u(x_k)=q_k,\,\,\forall k=1,\ldots ,s\right] \! \pi \!\right) \! = \! 0, \end{aligned}$$

where \(\pi \equiv \pi (q_1,\ldots ,q_s;x_1,\ldots ,x_s)\) is the joint probability density function of \((u(x_1),\ldots ,u(x_s))\) evaluated at \((q_1,\ldots ,q_s)\in {\mathbb {R}}^m\times \cdots \times {\mathbb {R}}^m\), \([s]=\{1,\ldots ,s\}\), |I| is the cardinality of I, \(I^c\) is the complement of I, and \(\otimes \) denotes the Kronecker product. In particular, when g is deterministic,

$$\begin{aligned} \left[ \displaystyle \mathrel {\mathop {\otimes }_{i=1}^s} g(x_i)\right] \cdot \left[ \left( \displaystyle \mathrel {\mathop {\otimes }_{i=1}^s}\nabla _{x_i}\right) \pi \right] =0. \end{aligned}$$

Corollary 3.9

[58, theorem 2.1] Given (3.2), it holds

$$\begin{aligned} \nabla _{x}\cdot \left( {\mathbb {E}}\left[ g(x)|u(x)=q\right] \pi (q;x)\right) ={\mathbb {E}}\left[ \nabla _{x}\cdot g(x)|u(x)=q\right] \pi (q;x), \end{aligned}$$

where \(\pi (q;x)\) is the probability density function of u(x) evaluated at \(q\in {\mathbb {R}}^m\). In particular, when g is deterministic,

$$\begin{aligned} g(x)\cdot \nabla _{x} \pi (q;x)=0. \end{aligned}$$

Several consequences of these results, together with examples, can be read in [58]: densities for pathwise stochastic integrals (in particular, primitives of important Gaussian processes), ordinary RDEs, linear RPDEs, transformations of random variables, and the modeling of shrimp growth. Let us mention that, with alternative arguments based on the conservation of probability in a fractional volume element, a Liouville’s equation for the probability density function associated to fractional RDEs may be obtained [58, 100].

3.2.2 Approximate methods

If the closed-form solution is not simple, then one may combine stochastic representations of x(t) with the RVT method. Essentially, one constructs a sequence \(\{x^N(t,\omega )\}_{N=1}^\infty \) of processes that tends to \(x(t,\omega )\) as \(N\rightarrow \infty \) for each t, in the mean-square or the a.s. sense, and proves that, for every t,

$$\begin{aligned} \lim _{N\rightarrow \infty } \pi _{x^N(t)}(x)=\pi _{x(t)}(x)\text { almost everywhere}, \end{aligned}$$

where \(\pi \) denotes the corresponding probability density function. This is a strong mode of convergence for densities, since it implies convergence in \(\mathrm {L}^1({\mathbb {R}},\mathrm {d}x)\) (convergence in total variation [101, page 41]) due to Scheffé’s lemma [106, page 55]. The sequence of processes is built by truncating stochastic representations, for example, gPC, power series, Fourier or Karhunen-Loève expansions. Each \(\pi _{x^N(t)}\) is computed by means of the RVT technique. These types of constructions may avoid some of the deficiencies of kernel density estimation: slow rates of convergence, the selection of the kernel and the bandwidth, and the poor capture of density features (discontinuities, non-differentiability points, tails and support). Examples of application are [12, 15], which combine the RVT method with Karhunen-Loève and gPC expansions, respectively. These are briefly reported here for illustration.

In [12], the following non-autonomous logistic RDE problem was studied:

$$\begin{aligned} {\left\{ \begin{array}{ll} x'(t,\omega )=a(t,\omega )(1-x(t,\omega ))x(t,\omega ),\quad t\in [t_0,T], \\ x(t_0,\omega )=x_0(\omega ), \end{array}\right. } \end{aligned}$$

where \(x_0(\omega )\) is a random variable and \(a(t,\omega )\) is a stochastic process. This initial value problem corresponds to Verhulst’s model, which extends the Malthusian growth model by means of a carrying capacity, accounting for lack of nutrients and competition between species as time passes. In the autonomous case, the computation was addressed in [37] as an application of the RVT technique. Here, we consider a sequence of processes \(\{a_N(t,\omega )\}_{N=1}^\infty \) that approximates \(a(t,\omega )\) in \(\mathrm {L}^2([t_0,T]\times \Omega )\): for instance, the Karhunen-Loève expansion \(a(t,\omega )=\mu _a(t)+\sum _{j=1}^\infty \sqrt{\nu _j} \phi _j(t)\xi _j(\omega )\), where \(\mu _a(t)={\mathbb {E}}[a(t)]\), \(\{\nu _j\}_{j=1}^\infty \) and \(\{\phi _j(t)\}_{j=1}^\infty \) are the eigenvalues and eigenfunctions, respectively, of the covariance integral operator of a(t), and \(\{\xi _j\}_{j=1}^\infty \) is a sequence of uncorrelated random variables with zero expectation and unit variance. This expansion allows for reducing the random dimensionality of the problem to a finite degree, provided the uncorrelated random variables are independent; details are available in [109, section 2.3]. The sample-path solution to the logistic problem is given by

$$\begin{aligned} x(t,\omega )=\frac{1}{1+\mathrm {e}^{\int _{t_0}^t -a(s,\omega )\,\mathrm {d} s}\left( -1+\frac{1}{x_0(\omega )}\right) }. \end{aligned}$$

If \(K(t,\omega )=\int _{t_0}^t a(s,\omega )\,\mathrm {d} s\) and \(Z(t,\omega )=\mathrm {e}^{-K(t,\omega )}>0\), then the probability density function of x(t) is

$$\begin{aligned} \pi (x,t)={\mathbb {E}}\left[ \pi _{x_0}\left( \frac{Z(t)x}{1+x(-1+Z(t))}\right) \frac{Z(t)}{(1+x(-1+Z(t)))^2}\right] , \end{aligned}$$

assuming that \(\pi _{x_0}\) is the density of \(x_0\) and that \(x_0\) and a are independent. Consider the truncations \(x_N(t,\omega )\), \(K_N(t,\omega )\) and \(Z_N(t,\omega )\) obtained by replacing a with \(a_N\). The density \(\pi _N(x,t)\) of the truncation \(x_N(t,\omega )\) has the same expression as \(\pi (x,t)\), but with Z(t) replaced by \(Z_N(t)\). Due to the finite random dimensionality, \(\pi _N(x,t)\) (its expectation-based expression) may be computed by quadrature-based integration. In the following results, the pointwise convergence of \(\pi _N(x,t)\) towards \(\pi (x,t)\) is studied.

Theorem 3.10

[12, theorem 2.3] Suppose that \(x_0\) has a density function \(\pi _{x_0}\), which is continuous and bounded on (0, 1). Let \(\{a_N(t)\}_{N=1}^\infty \) be any sequence of stochastic processes, independent of \(x_0\), that tends to a(t) in \(\mathrm {L}^2([t_0,T]\times \Omega )\). Then \(\lim _{N\rightarrow \infty } \pi _N(x,t)=\pi (x,t)\) for each \(x\in (0,1)\) and \(t\in [t_0,T]\).

Theorem 3.11

[12, theorem 2.4] Suppose that \(x_0\) has a density function \(\pi _{x_0}\), which is almost everywhere continuous and bounded on (0, 1). Let \(\{a_N(t)\}_{N=1}^\infty \) be any sequence of stochastic processes, independent of \(x_0\), that tends to a(t) in \(\mathrm {L}^2([t_0,T]\times \Omega )\). Suppose that \(\int _{t_0}^t a(s)\,\mathrm {d} s\) is absolutely continuous for each \(t\in [t_0,T]\). Then \(\lim _{N\rightarrow \infty } \pi _N(x,t)=\pi (x,t)\) for each \(x\in (0,1)\) and \(t\in [t_0,T]\).

Numerical experiments may be consulted in [12].
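
For orientation, a minimal sketch is given next (Python; the truncated input, the initial law and all data are assumed toy choices, and parametric Monte Carlo replaces the quadratures of [12] for brevity). It evaluates \(\pi _N(x,t)\) from the expectation formula above, with \(x_0\sim \mathrm {Uniform}(0,1)\), for which the inner density factor equals 1 on (0, 1).

```python
# Density of the logistic RDE solution via the RVT-based expectation, with
# x0 ~ Uniform(0,1) and an assumed truncated input
# a_N(s) = 1 + sum_j xi_j sqrt(2) cos(j pi s)/(j pi), xi_j ~ iid Normal(0,1);
# its exact integral is K_N(t) = t + sum_j xi_j sqrt(2) sin(j pi t)/(j pi)^2.
import numpy as np

rng = np.random.default_rng(4)
N, M, t = 4, 20_000, 0.5
xi = rng.standard_normal((M, N))        # assumed independent KL-type modes
j = np.arange(1, N + 1)
K_N = t + (xi * np.sqrt(2.0) * np.sin(j * np.pi * t) / (j * np.pi)**2).sum(axis=1)
Z = np.exp(-K_N)                        # Z_N(t) > 0

x = np.linspace(0.01, 0.99, 99)
# for x0 ~ Uniform(0,1), pi_{x0}(Zx/(1+x(Z-1))) = 1 for x in (0,1), so
# pi_N(x, t) = E[ Z/(1 + x(Z-1))^2 ], estimated here by the sample mean:
pi_N = (Z / (1.0 + np.outer(x, Z - 1.0))**2).mean(axis=1)
print(pi_N[:5])                         # density values on the grid
```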

Finally, in [15], a hybrid method was proposed, based on stochastic polynomial expansions, the RVT technique, and multidimensional integration schemes, to obtain accurate approximations to the density function of the solution to RDEs. Let \(\xi =(\xi _1,\ldots ,\xi _s)\in {\mathbb {R}}^s\) be the random vector whose components are the random input parameters in (1.1). The term \(x(t)=(x_1(t),\ldots ,x_q(t))\) is a stochastic process \(x:I\times \Omega \rightarrow {\mathbb {R}}^q\) that solves (1.1), having joint density function \(\pi (x,t)\). One aims at computing the marginal density functions \(\pi _i(x,t)\) of \(x_i(t)\), for \(i=1,\ldots ,q\). The stochastic process x(t) has a mean-square polynomial expansion \(x(t)=\sum _{k=1}^\infty {\hat{x}}_k(t)\phi _k(\xi )\), where \({\hat{x}}_k(t)=({\hat{x}}_{k,1}(t),\ldots ,{\hat{x}}_{k,q}(t))\in {\mathbb {R}}^q\) are the deterministic coefficients of the expansion, estimated via the Galerkin projection technique for each truncation order P. Let \(x^P(t)\) be the P-th partial sum of the expansion, with i-th component \(x_i^P(t)\). Given a fixed \(1\le i\le q\), the density \(\pi _i(x,t)\), \(x\in {\mathbb {R}}\), is estimated by using the density function \(\pi _i^P(x,t)\) of \(x_i^P(t)\). A computational method based on the RVT technique is proposed, since the random variable \(x_i^P(t)\) is an explicit transformation of \(\xi \). The proposed methodology can be applied, for example, to the SIR epidemiological model with random coefficients.

3.2.3 Moment estimation from the density

It is well-known that moments of the stochastic response, \({\mathbb {E}}[x(t)^m]\), may be computed by integration with respect to the probability density function \(\pi (x,t)\) of x(t): \({\mathbb {E}}[x(t)^m]=\int _{-\infty }^{\infty } x^m \pi (x,t)\,\mathrm {d}x\). If the target density \(\pi (x,t)\), or an approximating density \(\pi _N(x,t)\) for it (as \(N\rightarrow \infty \)), is computed by means of the RVT technique and quadrature-based marginalization with high accuracy, then \(\int _{-\infty }^{\infty } x^m \pi (x,t)\,\mathrm {d}x\) or \(\int _{-\infty }^{\infty } x^m \pi _N(x,t)\,\mathrm {d}x\) should be integrated by quadratures. Otherwise, if there is an expectation expression for \(\pi (x,t)\) or \(\pi _N(x,t)\), see (3.1), and parametric Monte Carlo simulation is employed for its estimation (rather than quadratures), then \({\mathbb {E}}[x(t)^m]\) should be approximated by Monte Carlo simulation or polynomial expansion methods from the beginning, as in Sect. 3.1, instead of using the formulae above. This issue also occurs when \(\pi (x,t)\) is represented by a kernel density estimate.
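
In the quadrature-friendly case, the computation is immediate; a minimal sketch follows (Python; the tabulated density is an assumed toy example).

```python
# Moments by quadrature from a tabulated density pi(x, t) (assumed here to be
# the standard normal density on a grid, for illustration).
import numpy as np
from scipy.integrate import trapezoid

x = np.linspace(-10.0, 10.0, 2001)
pi = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)   # assumed tabulated density of x(t)
mean = trapezoid(x * pi, x)                   # E[x(t)]   ~ 0
second = trapezoid(x**2 * pi, x)              # E[x(t)^2] ~ 1
```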

4 Possible future works on RDEs

In this last section, some possible future works about RDEs are listed. These are concerned with random calculus and strong solutions, computational issues in UQ, and modeling.

  • As discussed in Sect. 2.3, the mean-square resolution of linear RDEs with the Fröbenius method has been a fruitful area of research in the last years. For general second-order linear equations whose random initial conditions have finite variance, boundedness assumptions for the random input coefficients are required. This is not strange, as shown in the second paragraph of Sect. 2.2. A natural question is whether the boundedness requirement for the equation coefficients can be relaxed if higher integrability is assumed for the initial conditions, namely finiteness of a \((2+\epsilon )\)-moment, \(\epsilon >0\). For example, in the case of \(x'(t,\omega )=a(\omega )x(t,\omega )\), \(x(0,\omega )=x_0(\omega )\), with \(\Vert x_0\Vert _{2+\epsilon }<\infty \), the existence of a mean-square solution is equivalent to the finiteness of the moment-generating function of a on the real line. This is a consequence of the random chain rule theorem, detailed in Theorem 2.10. It would be of interest to investigate the role of the moment-generating function when solving random linear RDEs of higher order, at least the classical linear equations from Mathematical Physics. Theorems 2.6-2.8 contain assumptions related to this open problem. I am obtaining positive results in this direction.

  • There are still many RDEs to be investigated under the mean-square treatment. For example, RPDEs of wave and Poisson type, or the extension of some of the non-autonomous classical linear equations of Mathematical Physics, such as Hermite’s or Legendre’s equations, to their fractional version with random data.

  • Theorem 3.8 shows the Liouville’s equation for a general first-order linear RPDE problem. Though it seems complex, the cited paper contains examples where the PDE for the probability density function of the stochastic solution can be obtained. It is an open problem whether the theorem is generalizable to higher-order linear RPDEs or nonlinear RPDEs. I am unsure, since a key fact of the proof is the closedness of solutions to first-order linear PDEs under composition. On the other hand, the applicability of the Liouville’s equation for UQ needs further investigation as well, in the setting of RDEs with no closed-form solution or with boundary conditions, RPDEs and fractional RDEs. There are no works in this regard yet. Finally, to get a deeper understanding of fractional RDEs, it would be of interest to derive the Liouville’s equation in an alternative manner to that described in the last paragraph of Sect. 3.2.1 (i.e. the one based on the conservation of probability in a fractional volume element); for example, test functions / characteristic functions, expectations, and the fundamental lemma of calculus of variations / Fourier transforms are the fundamental tools to tackle non-fractional RDEs.

  • In the last decade, several papers have dealt with the RVT technique (with quadratures classically, or with parametric Monte Carlo more recently) to derive the probability density function of the stochastic solution, as well as with the combination of stochastic representations for the solution and the RVT method. The obtained probability density function corresponds to single instants of time; it is sometimes called the first probability density function. In this regard, there are many RDE problems that have not been studied in the density sense, for instance, models in Ecology such as Richards, Lundqvist-Korf, Hossfeld, etc., or models with imaginary random inputs. On the other hand, it would be of interest to investigate joint probability densities, at pairs or triples of times. Correlations of a process, for example, are related to joint densities at pairs of times. Finally, for complex models, the input-output relation may not be explicit, and computational alternatives to the RVT technique should be investigated in order to improve on kernel density estimation. In the last paragraph of Sect. 3.2.2, I referred to a hybrid approach that combines polynomial expansions (which are optimal in a mean-square sense) and the RVT method. It works well, although it lacks theoretical analysis and suffers from the curse of dimensionality.

  • When the probability density function of the stochastic solution is expressible as an expectation, then parametric Monte Carlo simulation is applicable. This avoids the difficulties of quadrature integration when the random dimensionality is large. As commented in Sect. 3.2.1, compared to kernel density estimation, the method is faster, it does not depend on kernel or bandwidth choices, and, since it acts pointwise, the support, discontinuities and non-differentiability points of the target density are correctly captured, without smoothing them out. However, in some situations, it may present slow convergence: since the random quantity inside the expectation (3.1) involves random denominators, its variance may be high at some points, which produces noise that plagues the density estimate (see Example 3.7 with its associated references). This issue does not occur with kernel density estimation. For the moment, I am thinking of a selection of denominators: for each realization, the decomposition of the stochastic solution is chosen so that the resulting denominator takes the highest value.

  • The application of polynomial representations in the context of random Hamiltonian systems has been recently investigated in [62]. When the random input parameters are independent, the Galerkin system for the gPC coefficients is Hamiltonian too. When the inputs are non-independent, expansions in terms of the canonical polynomial basis are employed; the system for the expansion coefficients is then volume preserving. Thus, geometric integrators are of use. However, in the context of Hamiltonian dynamics, nonlinearities other than multiplicative are common, so the construction and numerical resolution of the Galerkin system need to be further investigated. It may require quadratures or Taylor expansions.

  • This item is quite broad, but it is a well-known open issue within the UQ community: improvement of the efficiency of the UQ methodology for a large number of uncertainties in an expensive model. Essentially, how to handle the curse of dimensionality when developing and using stochastic representations, both intrusive and non-intrusive.

  • We know that mean-square convergence of gPC expansions is ensured when the moment problem for the random inputs is uniquely solvable. However, many theoretical aspects about the convergence of gPC expansions and stochastic Galerkin projections are still open. For example, the mean-square convergence of Galerkin projections for general RDE problems, conditions for the convergence of the densities of the polynomials, rates of convergence, etc.

  • Alternatives to Bayesian inference for inverse parameter estimation in RDEs shall be investigated. For RDEs with no model discrepancy involved, methods that ensure fast uncertainty propagation have been developed (gPC expansions, etc.). However, input-uncertainty representation is still an open problem in such a case. In many situations, experts’ judgment is not enough to set appropriate input distributions, especially the correlations between inputs. The maximum entropy principle, albeit usually recommended, depends on the parameterization of the problem and on the prior information available about the inputs. And when Karhunen-Loève expansions are used to represent infinite-dimensional input uncertainties, the laws and dependencies of the random Fourier coefficients are unknown. Even though these topics tend to be ignored in favor of forward numerical procedures, they are essential for a proper modeling of the true physics.

  • Any existing or future modeling study may greatly benefit from the use of UQ techniques. Given data, when a deterministic model is set, one checks that the parameters are identifiable and can be estimated from optimization routines. This gives rise to a fit of the data and to physically interpretable parameter values, from an averaged viewpoint. Nonetheless, the errors in measurements and the natural random variability of the phenomenon under study, which are assumed to be irreducible, are better modeled by conducting a probabilistic analysis. The inference of input uncertainties and the computation of quantities of interest for the stochastic response shall complete any prior deterministic-type study.