1 Introduction

Polynomial chaos expansion is a deterministic approach to solving stochastic partial differential equations (SPDEs). The main idea is to construct a complete set of \(L^2\) orthogonal basis functions of the underlying probability space. The expansion coefficients then satisfy a deterministic system of PDEs, called the propagator in [22]. It is usually recognized as the stochastic Galerkin method [33] due to its evident resemblance to the classic spectral Galerkin method. Traditionally, the polynomial chaos expansion approach is well suited for SPDEs driven by Gaussian white noise or Lévy noise [12, 19]. One can check [22, 23] for theoretical analysis and [20, 36, 37] for numerical studies. Recently, a distribution-free stochastic Malliavin calculus framework was developed in [25], giving rise to a new class of SPDEs driven by an arbitrary type of noise. The numerical aspects of these distribution-free SPDEs were then investigated in [10]. For linear SPDEs, the propagator system has a lower-triangular, sparse structure and is independent of the type of noise involved. However, for nonlinear problems, the propagator system is fully coupled and varies from one kind of noise to another. The Wick–Malliavin approximation was proposed in [24] as a decoupling technique. The authors in [10] also derived an error estimate for the truncation of the propagator system, showing that the mean square truncation error converges at an exponential rate with respect to the polynomial order, and at a cubic rate with respect to the number of random variables included.

In many test cases, solving the deterministic propagator is much faster than Monte Carlo simulation for a desired accuracy level, since it is free of repeated sampling and of the curse of half-order convergence. However, the size of the propagator system grows exponentially with respect to both truncation parameters, and the (cubic) convergence rate is not fast enough to compensate for it. We often need to evolve thousands of expansion coefficients sequentially to obtain a high-fidelity approximation. This can be prohibitively expensive, especially for problems with long time integration and/or high space dimensions. It is therefore imperative to design fast algorithms with guaranteed accuracy and practically observable speedup.

Fast algorithms such as those based on proper orthogonal decomposition (POD) [3, 32] have been an active research area for parametrized partial differential equations (PPDEs), which are ubiquitous in science and engineering. In the “many-query” context (when simulations of the PPDE are necessary for many different realizations of the parameter) or the “real-time” context (when the simulation result is sought in negligible time), the reduced basis method (RBM) [5, 9, 26, 29] is by now a well-known approach. In the so-called offline phase, it queries the PPDE emulator, that is, the “truth” solver, only a rather small number of times at strategically determined locations in the parameter domain to build up a reduced solution space. The online phase seeks a surrogate solution for any parameter value as the energy projection, onto the reduced solution space, of the solution corresponding to that parameter. What differentiates RBM from other model reduction approaches is perhaps its explicit error certification, which also guides the building of the low-dimensional solution space.

This paper considers the adaptation of RBM to our distribution-free SODEs and SPDEs. Classic RBM usually deals with problems with an explicit parameter dependence [18, 27]. Our setting is based on differential equations that are notably free of parameters. It is therefore not immediately clear how to apply RBM directly since, first of all, there is no parameter-induced solution manifold, owing to the lack of a parameter. The second challenge originates from the fact that the propagator system must be solved sequentially. This sequential nature means that, even if we find a way to parametrize the propagator system, the parametric problems are interconnected, a phenomenon that is notably missing from any PPDE.

To address these challenges, we first construct a non-conventional “parameterization” of the propagator system which exploits the multi-index of the polynomial chaos expansion. We then design an RBM that is free of any dedicated offline procedure, to accommodate the sequential nature of the problem: the usual offline procedure would have demanded queries of the “truth” solver at parameter values independent of one another. The resulting reduced solver is still online efficient, meaning that its complexity is independent of the size of the “truth” solver. The resulting solver for the stochastic differential equations is a hybrid of “truth” and reduced solvers which is orders of magnitude faster than using the “truth” solver alone. To facilitate the handling of time, we propose a new space-time-like treatment of time in the numerical schemes for ODEs and PDEs. Unlike existing approaches, it is based on time-stepping, and is thus free of the hassle of forming a space-time variational formulation, but it obtains approximations of the solution in a space-time fashion. When the spatial domain has high dimensions, it is infeasible to handle the full space-time solutions and construct the resulting reduced solver; we therefore design an accurate yet efficient compression technique for the spatial component of the space-time snapshots that RBM adopts as bases. Obviously this compression procedure is not necessary for ODEs. Therefore, we have two versions of the certified offline-free reduced basis (COFRB) methods, COFRB_ODE and COFRB_PDE.

The remainder of the paper is organized as follows. In Sect. 2, we review the stochastic analysis to derive the propagator system for SPDE model problems and the classical reduced basis method. The COFRB methods and the details of their implementation are presented in Sect. 3. Complexity discussions are in Sect. 4. Section 5 is devoted to numerical results. Finally, some concluding remarks and discussions on future work are given in Sect. 6.

Table 1 Selected notation used throughout this article

2 Background

In this section, we present the necessary background material for our algorithm, namely the distribution-free stochastic analysis and the classical reduced basis method. To ease readability, Table 1 lists the notation that is used most often in this section and beyond.

2.1 Distribution-Free Stochastic Analysis

Let \((\Omega , {\mathcal {F}},\mathbb {P})\) be a probability space, and let \(\Xi =\{\xi _{k}\}_{k=1}^{\infty }\) be a sequence of independent identically distributed random variables with \({\mathbb {E}}[\xi _{k}]=0\) and \({\mathbb {E}}[\xi _k^2]=1\) for each \(k\ge 1\). Suppose that there exists an orthogonal set of polynomials \(\{\varphi _{n}(\xi )\}_{n=0}^{\infty }\) such that \(\mathbb {E}[\varphi _n(\xi )\varphi _{m}(\xi )]=\delta _{mn}n!\). For instance, if \(\{\xi _{k}\}_{k=1}^{\infty }\) are standard Gaussian variables, then \(\{\varphi _{n}\}_{n=0}^{\infty }\) is the set of probabilists’ Hermite polynomials. Orthogonal polynomials for a wide class of random distributions are listed in [33]. In particular, it is clear that \(\varphi _0(\xi )=1\) and \(\varphi _1(\xi )=\xi \).

We then construct the polynomial chaos basis functions using multi-index notation. Denote by \({\mathcal {J}}\) the set of multi-indices \(\alpha =(\alpha _k)_{k=1}^{\infty }\) of finite length \(|\alpha |=\sum _{k=1}^{\infty } \alpha _k\):

$$\begin{aligned} {\mathcal {J}}=\left\{ \alpha =(\alpha _k)_{k=1}^{\infty }: \alpha _k\ge 0, \ |\alpha | < \infty \right\} . \end{aligned}$$

Let \(\varepsilon _0\) be the multi-index whose entries are all zero, and \(\varepsilon _k\) be the multi-index such that its k-th entry is 1 and all other entries are zero. We also define polynomials and factorials of multi-indices:

$$\begin{aligned} \xi ^{\alpha }:=\prod _{k=1}^{\infty }\xi _{k}^{\alpha _k}, \quad \alpha !:=\prod _{k=1}^{\infty } \alpha _k !. \end{aligned}$$

The generalized polynomial chaos (gPC) basis functions [34] are taken to be the tensor products of \(\{\varphi _{n}\}_{n=0}^{\infty }\):

$$\begin{aligned} \Phi _{\alpha }:=\prod _{k=1}^{\infty }\varphi _{\alpha _k}(\xi _k),\text { for each }\alpha \in {\mathcal {J}}. \end{aligned}$$
(2.1)

Then \(\Phi _{\varepsilon _0}=1\), \(\Phi _{\varepsilon _k}=\xi _k\), and \(\mathbb {E}[\Phi _{\alpha }\Phi _{\beta }]=\alpha !\delta _{\alpha \beta }\). It is demonstrated in [25] that \(\{\Phi _\alpha ,\alpha \in {\mathcal {J}}\}\) is indeed a complete Cameron-Martin type basis [7].
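To make (2.1) concrete, here is a minimal Python sketch (ours, for illustration only) that evaluates \(\Phi _{\alpha }\) in the Gaussian case, where \(\varphi _n\) are the probabilists’ Hermite polynomials satisfying the standard recurrence \(He_{n+1}(x)=x\,He_n(x)-n\,He_{n-1}(x)\); the Monte Carlo check at the end illustrates \(\mathbb {E}[\Phi _{\alpha }^2]=\alpha !\).

```python
import numpy as np

def hermite_prob(n, x):
    """Probabilists' Hermite polynomial He_n(x) via the three-term
    recurrence He_{n+1} = x*He_n - n*He_{n-1}."""
    if n == 0:
        return np.ones_like(x)
    h_prev, h = np.ones_like(x), x
    for k in range(1, n):
        h_prev, h = h, x * h - k * h_prev
    return h

def gpc_basis(alpha, xi):
    """Evaluate Phi_alpha = prod_k varphi_{alpha_k}(xi_k) for a finitely
    supported multi-index alpha (a tuple) at samples xi (one row per k)."""
    val = 1.0
    for a_k, xi_k in zip(alpha, xi):
        if a_k > 0:
            val = val * hermite_prob(a_k, np.asarray(xi_k))
    return val

# Sanity check: E[Phi_alpha^2] = alpha!, estimated by Monte Carlo
# for alpha = (2, 1), where 2! * 1! = 2.
rng = np.random.default_rng(0)
xi = rng.standard_normal((2, 200_000))
phi = gpc_basis((2, 1), xi)
print(np.mean(phi * phi))   # ~ 2
```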

Proposition 2.1

We assume that the moment generating function \(\mathbb {E}[\exp (\theta \xi _k)]\) exists for all \(\theta \) in some neighborhood of 0. Then \(\{\Phi _{\alpha },\alpha \in {\mathcal {J}}\}\) forms a complete orthogonal basis of \(L^2(\Omega ,\sigma (\Xi ),\mathbb {P})\), and for any \(\eta \in L^2(\Omega ,\sigma (\Xi ),\mathbb {P})\), we have its stochastic polynomial chaos expansion

$$\begin{aligned} \eta =\sum _{\alpha \in {\mathcal {J}}}\eta _{\alpha }\Phi _{\alpha }, \quad \eta _{\alpha }=\frac{\mathbb {E}[\eta \Phi _{\alpha }]}{\alpha !}. \end{aligned}$$
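A practical consequence of Proposition 2.1 and the orthogonality relation is that moments can be read off directly from the coefficients: \({\mathbb {E}}[\eta ]=\eta _{\varepsilon _0}\) and \(\mathrm {Var}(\eta )=\sum _{\alpha \ne \varepsilon _0}\alpha !\,\eta _{\alpha }^2\). This is essentially how the variances and mean-square errors in Sect. 5 are evaluated from expansion coefficients. A small sketch of ours (multi-indices stored as tuples, an implementation choice):

```python
import numpy as np
from math import factorial

def moments(coeffs):
    """Mean and variance of eta = sum_alpha eta_alpha Phi_alpha, using
    E[Phi_alpha Phi_beta] = alpha! delta_{alpha beta}.
    coeffs: dict mapping a multi-index (tuple) to its coefficient
    (a scalar, or an array of spatial values)."""
    zero = tuple(0 for _ in next(iter(coeffs)))
    mean = coeffs[zero]
    var = sum(np.prod([factorial(a) for a in alpha]) * c ** 2
              for alpha, c in coeffs.items() if alpha != zero)
    return mean, var

# Example: eta = 2 + 3*xi_1 + xi_1*xi_2 has mean 2, variance 9 + 1 = 10.
print(moments({(0, 0): 2.0, (1, 0): 3.0, (1, 1): 1.0}))
```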

The Wick product [31] serves as a convolution type binary operator on polynomial chaos basis functions:

$$\begin{aligned} \Phi _{\alpha }\diamond \Phi _{\beta }=\Phi _{\alpha +\beta }, \end{aligned}$$

so that for two random variables u and v,

$$\begin{aligned} u\diamond v=\sum _{\alpha \in {\mathcal {J}}}\sum _{\beta \in {\mathcal {J}}}u_\alpha v_\beta \Phi _{\alpha +\beta }\text { where }u=\sum _{\alpha \in {\mathcal {J}}}u_{\alpha }\Phi _{\alpha }\text { and }v=\sum _{\beta \in {\mathcal {J}}}v_{\beta }\Phi _{\beta }. \end{aligned}$$
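Computationally, the Wick product of two truncated chaos expansions is thus a convolution over multi-indices. A minimal sketch (ours, with coefficients stored in Python dictionaries keyed by zero-padded tuples, an implementation choice):

```python
from collections import defaultdict

def wick(u, v):
    """Wick product of two truncated chaos expansions.
    u, v: dicts mapping multi-indices (tuples, zero-padded to a common
    length) to coefficients. Uses Phi_alpha <> Phi_beta = Phi_{alpha+beta}."""
    w = defaultdict(float)
    for alpha, ua in u.items():
        for beta, vb in v.items():
            gamma = tuple(a + b for a, b in zip(alpha, beta))
            w[gamma] += ua * vb
    return dict(w)

# Example: u = 1 + 2*xi_1 and v = 3*xi_2 (indices padded to length 2)
u = {(0, 0): 1.0, (1, 0): 2.0}
v = {(0, 1): 3.0}
print(wick(u, v))  # {(0, 1): 3.0, (1, 1): 6.0}, i.e. 3*xi_2 + 6*xi_1*xi_2
```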

Now we take the time variable into account. Let [0, T] be some time interval and \(\{m_k(t)\}_{k=1}^{\infty }\) be a complete orthonormal basis of \(L^2([0,T])\). We define the following driving noise \(\dot{{\mathfrak {N}}}(t)\):

$$\begin{aligned} \dot{{\mathfrak {N}}}(t)=\sum _{k=1}^{\infty }m_k(t)\xi _k= \sum _{k=1}^{\infty }m_k(t)\Phi _{\varepsilon _k}, \end{aligned}$$
(2.2)

and the stochastic process

$$\begin{aligned} {\mathfrak {N}}(t)=\int _{0}^{t} \dot{{\mathfrak {N}}}(s) \ ds = \sum _{k=1}^{\infty } \left( \int _{0}^{t} m_{k}(s) \ ds \right) \xi _{k}. \end{aligned}$$
(2.3)
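For illustration, choosing the orthonormal cosine basis \(m_1(t)=1/\sqrt{T}\) and \(m_k(t)=\sqrt{2/T}\cos ((k-1)\pi t/T)\) for \(k\ge 2\) (one possible choice of basis; the paper does not fix a particular one), a truncated sample path of \({\mathfrak {N}}(t)\) can be generated as sketched below. With Gaussian \(\xi _k\), this approximately reproduces a Wiener path, in line with Remark 2.2 below.

```python
import numpy as np

def m(k, t, T):
    """Orthonormal basis of L^2([0,T]) (illustrative cosine choice)."""
    if k == 1:
        return np.ones_like(t) / np.sqrt(T)
    return np.sqrt(2.0 / T) * np.cos((k - 1) * np.pi * t / T)

def m_int(k, t, T):
    """int_0^t m_k(s) ds in closed form, as needed in (2.3)."""
    if k == 1:
        return t / np.sqrt(T)
    c = (k - 1) * np.pi / T
    return np.sqrt(2.0 / T) * np.sin(c * t) / c

T, K = 1.0, 50
t = np.linspace(0.0, T, 201)
rng = np.random.default_rng(1)
xi = rng.standard_normal(K)   # Gaussian xi_k: N(t) approximates W(t)
N_t = sum(m_int(k, t, T) * xi[k - 1] for k in range(1, K + 1))
```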

We take the linear parabolic SPDE as our model problem. The general form is

$$\begin{aligned}&\partial _t u(t,x)={\mathcal {L}}u(t,x)+{\mathcal {M}}u(t,x)\diamond \dot{{\mathfrak {N}}}(t), \quad (t,x)\in (0,T] \times D, \nonumber \\&u(0,x)=u_0(x),\quad x \in D . \end{aligned}$$
(2.4)

Here \(D\subset \mathbb {R}^d\) is some spatial domain, and

$$\begin{aligned} \mathcal {L}u(t,x)&=\sum _{i=1}^{d}\sum _{j=1}^{d} a_{ij}(x) \partial _{i}\partial _{j} u(t,x) + \sum _{i=1}^{d}b_i(x)\partial _i u(t,x) + c(x) u(t,x), \end{aligned}$$
(2.5)
$$\begin{aligned} \mathcal {M}u(t,x)&=\sum _{i=1}^d\alpha _i(x)\partial _i u(t,x) + \beta (x) u(t,x), \end{aligned}$$
(2.6)

where \(\partial _i\) is the i-th spatial partial derivative. We expand \(u(t,x)\) into \(\{\Phi _{\alpha },\alpha \in {\mathcal {J}}\}\):

$$\begin{aligned} u(t,x)=\sum _{\alpha \in {\mathcal {J}}}u_{\alpha }(t,x)\Phi _{\alpha }. \end{aligned}$$

By the definition of Wick product,

$$\begin{aligned} {\mathcal {M}}u(t,x)\diamond \dot{{\mathfrak {N}}}(t)= \sum _{\alpha \in {\mathcal {J}}}\sum _{k=1}^{\infty } {\mathcal {M}}u_{\alpha }(t,x)m_k(t)\Phi _{\alpha +\varepsilon _k}. \end{aligned}$$

We come up with the deterministic propagator system by comparing the expansion coefficients on both sides of (2.4):

$$\begin{aligned}&\partial _t u_\alpha (t,x) = {\mathcal {L}}u_\alpha (t,x) + \sum _{\varepsilon _k \le \alpha } {\mathcal {M}} u_{\alpha - \varepsilon _k}(t,x) m_k(t), \quad (t,x) \in (0,T] \times D \nonumber , \\&u_\alpha (0,x)=u_0(x)\mathbb {1}_{\{\alpha = \varepsilon _0\}}, \quad x \in D. \end{aligned}$$
(2.7)

It is a system of linear parabolic deterministic PDEs with a lower-triangular, sparse structure: a multi-index of order n only talks to itself and to multi-indices of order \(n-1\). As a result, we can solve the system sequentially, and coefficients of the same order can be updated in parallel. Additionally, the system does not depend on the type of randomness involved, which implies that the computational overhead from changing the noise is almost negligible. That is, the propagator is solved once for all types of randomness.
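The lower-triangular structure can be exploited directly: ordering \({\mathcal {J}}\) (or a finite truncation of it) by \(|\alpha |\) yields an order-by-order sweep in which each coefficient needs only coefficients of one order lower, all of which are already available. A schematic sketch of ours, where solve_single is a placeholder for any deterministic solver of (2.7) for a single \(\alpha \):

```python
def propagator_sweep(indices, solve_single):
    """Solve the propagator (2.7) order by order. `indices` is a list of
    multi-indices (tuples of a fixed finite length); `solve_single(alpha,
    lower)` is a placeholder returning u_alpha given the dict `lower` of
    the already-computed coefficients u_{alpha - eps_k}."""
    solutions = {}
    for alpha in sorted(indices, key=sum):        # increasing |alpha|
        lower = {}
        for k, a_k in enumerate(alpha):
            if a_k > 0:                           # i.e. eps_k <= alpha
                beta = alpha[:k] + (a_k - 1,) + alpha[k + 1:]
                lower[k] = solutions[beta]        # order |alpha|-1: done
        solutions[alpha] = solve_single(alpha, lower)
    return solutions
```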

Remark 2.1

The i.i.d. assumption of the sequence of random variables can be relaxed. In [25] the authors only assumed that \(\{\xi _k\}_{k=1}^{\infty }\) contains uncorrelated random variables with zero mean and unit variance. Here we restrict ourselves to i.i.d. sequences for technical simplicity.

Remark 2.2

In the special case of i.i.d. standard Gaussian variables, \(\dot{{\mathfrak {N}}}(t)\) is the Gaussian white noise \({\dot{W}}(t)\), and \({\mathfrak {N}}(t)\) is the Wiener process W(t). Moreover, (2.4) is equivalent to the Itô type SPDE

$$\begin{aligned} du(t,x)={\mathcal {L}}u(t,x)dt+{\mathcal {M}}u(t,x)dW(t). \end{aligned}$$
(2.8)

More details can be found in [12].

2.2 Reduced Basis Methods: A Brief Overview

In this section, we present a brief overview of the main ingredients of the reduced basis method, referring to the recent surveys and monographs [16, 18, 27] for more details. Let \(\varvec{\mu }\) be a p-dimensional parameter, and let X be a Hilbert space of functions on D. Given any parameter value \(\varvec{\mu }\) in a prescribed parameter domain, the goal is to compute \(u(\varvec{\mu }) \in X\) satisfying a PPDE written in weak form

$$\begin{aligned} a(u(\varvec{\mu }),v; \varvec{\mu }) = f(v; \varvec{\mu }), \quad v \in X. \end{aligned}$$
(2.9)

We denote by \((\cdot , \cdot )_{X}\) the inner product associated with the space X, whose induced norm is \(|| \cdot ||_{X} = \sqrt{(\cdot , \cdot )_{X}}\). As is common in the RBM literature [29], we assume that \(a(\cdot , \cdot ; \varvec{\mu })\) and \(f(\cdot ; \varvec{\mu })\) are “affine” with respect to the parameter \(\varvec{\mu }\). There are strategies for situations when the affine assumption is not satisfied, e.g., empirical interpolation [1]. We assume there exists a spatial discretization having \({N_h} \gg 1\) degrees of freedom to compute an approximate solution of (2.9), \(u^{N_h}(\varvec{\mu }) \in X^{{N_h}}\) with \(\dim (X^{{N_h}}) = {N_h}\), called the truth solution, within an acceptable accuracy for every \(\varvec{\mu }\). RBM attempts, after a once-and-for-all preparation stage, to provide a surrogate for \(u^{{N_h}}\) with comparable accuracy, but with orders-of-magnitude less computational cost than the truth solver. The essential idea is to compress the collection of discrete solutions \(u^{{N_h}}(\varvec{\mu })\) for sufficiently many \(\varvec{\mu }\) into a low-dimensional space, and then to efficiently compute a projected approximation for any \(\varvec{\mu }\).

Indeed, an RBM algorithm approximates the solution space by an N-dimensional subspace of \(X^{{{N_h}}}\), denoted by \(X^{{{N_h}}}_{N}\), with \(N \ll {{N_h}}\). \(X^{{{N_h}}}_{N}\) is formed through a greedy algorithm as the span of so-called “snapshots”: a sample set \(S^N = \{\varvec{\mu }^1, \dots , \varvec{\mu }^N\}\) is constructed hierarchically from a training set discretizing the parameter domain, and the truth solutions are obtained for \(S^N\). The surrogate RB solution \(u_N^{{N_h}} (\varvec{\mu })\) is computed as an energy projection onto \(X^{{{N_h}}}_{N}\). In addition to the size of the system decreasing from \({{N_h}}\) to N, further savings are realized through pre-computation and storage of the parameter-independent components of the RB “stiffness” matrix, which is decomposed via the affine assumption. Moreover, these components can be computed by adding one more row and one more column each time a new sample location \(\varvec{\mu }^i\) is selected and the new snapshot resolved, thanks to the nested structure of \(X^{{{N_h}}}_{N}\).

Selecting snapshots through the a posteriori error estimate: The key question left is how to select the representative parameters \(\varvec{\mu }^1, \ldots , \varvec{\mu }^N\) for the critically important sample set \(S^N\). RBM adopts a greedy scheme to iteratively construct \(S^N\) leaning on an efficiently-computable error estimate that quantifies the discrepancy between the dimension-i RBM surrogate solution \(u^{{N_h}}_{{i}}(\varvec{\mu })\) and the truth solution \(u^{{N_h}}(\varvec{\mu })\), \(\Delta _{{i}}(\varvec{\mu }) \ge \left\| u^{{N_h}}_{{i}}(\varvec{\mu }) - u^{{N_h}}(\varvec{\mu })\right\| _{X^{{{N_h}}}}\). Assuming the existence of this error estimate, the greedy procedure for constructing \(S^N\) is summarized in Algorithm 1.

Algorithm 1: Greedy construction of the sample set \(S^N\)

Dealing with time: There are two standard approaches to deal with time within reduced basis methods [14]. The first (and more standard) one is based on a time-stepping scheme in the offline phase. The reduced basis is built by the so-called POD-Greedy method [4, 15, 17]. It combines the standard greedy algorithm for exploring the parameter dependence, as in Algorithm 1, with a proper orthogonal decomposition (POD) in time that extracts, from the temporal trajectory of the chosen parameter, the information that cannot be well approximated by the current RB space. During the online phase, the algorithm amounts to a time-stepping system that is reduced in size but with the same time step as the “truth” solver.

The second is based on the space-time variational formulation of the original problem [30, 35] which considers time as an additional variable. This results in a Petrov-Galerkin problem in \(d+1\) dimensions with d being the spatial dimension of the problem. The reduced basis is then formed following the standard greedy approach as outlined in this section.

3 COFRB Algorithms

Numerical discretization of (2.7), which provides the basis for the solution of the SPDE (2.4) via the spectral expansion, usually follows the method-of-lines technique: we start with standard spatial discretization schemes, transforming the propagator system into a larger system of ODEs. Suppose that the spatial discretization has \(M_f\) degrees of freedom, and that \({\tilde{A}}\) and \({\tilde{B}}\) are the \(M_f \times M_f\) discretized versions of \({\mathcal {L}}\) and \({\mathcal {M}}\). The ODE system is

$$\begin{aligned}&{\mathbf {u}}_\alpha '(t)={\tilde{A}}{\mathbf {u}}_\alpha (t) + \sum _{\varepsilon _k \le \alpha }m_k(t) {\tilde{B}}{\mathbf {u}}_{\alpha -\varepsilon _k}(t), \quad t \in (0,T] \end{aligned}$$
(3.1a)
$$\begin{aligned}&{\mathbf {u}}_\alpha (0)={\mathbf {u}}_0\mathbb {1}_{\{\alpha =\varepsilon _0\}} \end{aligned}$$
(3.1b)

where \({\mathbf {u}}_{\alpha }(t)\in \mathbb {R}^{M_f}\) is the vector of \(u_{\alpha }(t,x)\) evaluated at those degrees of freedom. Suitable ODE solvers, such as Runge–Kutta methods, can then be directly adopted.

Since there are infinitely many i.i.d. random variables in \(\Xi \), it is impossible to handle the infinite system (2.7) or (3.1) as is. A finite truncation is always necessary. Toward that end, for positive integers M and K, we define the truncated multi-index set

$$\begin{aligned} {\mathcal {J}}_{M,K} :=\{\alpha \in {\mathcal {J}}: |\alpha | \le M,\ d(\alpha ) \le K\}, \quad {\mathcal {N}}=\#({\mathcal {J}}_{M,K})=\left( \begin{array}{c} M+K\\ M \end{array}\right) . \end{aligned}$$
(3.2)

That is, \({\mathcal {J}}_{M,K}\) contains the multi-indices whose corresponding polynomial order is no more than M and whose number of active random variables is no more than K, where \(d(\alpha ):=\max \{k:\alpha _k>0\}\). The size of \({\mathcal {J}}_{M,K}\) grows rapidly with respect to both M and K. Given this truncation, the truth solution \({\mathbf {u}}_{M,K}^{N_h}\) approximately solving the SPDE (2.4) is

$$\begin{aligned} {\mathbf {u}}^{N_h}_{M,K}(t):=\sum _{\alpha \in {\mathcal {J}}_{M,K}} {\mathbf {u}}_{\alpha }(t) \Phi _\alpha , \end{aligned}$$
(3.3)

with \({\mathbf {u}}_{\alpha }(t)\) solving (3.1) for all \(\alpha \in {\mathcal {J}}_{M,K}\). This procedure gives highly accurate approximations. However, it is prohibitively expensive when \({\mathcal {N}}\) is large, as the ODE system (3.1) has to be solved \({\mathcal {N}}\) times. This challenge can be mitigated by the COFRB method, described below for the ODE and PDE cases separately.
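As a sanity check on (3.2), the set \({\mathcal {J}}_{M,K}\) and its cardinality can be enumerated by brute force for small M and K (an illustrative sketch of ours):

```python
from itertools import product
from math import comb

def truncated_indices(M, K):
    """All multi-indices alpha in N_0^K with |alpha| <= M, i.e. the set
    J_{M,K} of (3.2). Brute force; intended only for small M and K."""
    return [a for a in product(range(M + 1), repeat=K) if sum(a) <= M]

M, K = 3, 4
J = truncated_indices(M, K)
assert len(J) == comb(M + K, M)   # here: C(7, 3) = 35 multi-indices
```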

3.1 SODE Problem

In this subsection, we consider the following simple linear SODE problem.

$$\begin{aligned}&u'(t)=u(t)+1+u(t)\diamond \dot{{\mathfrak {N}}}(t),\quad t\in [0,T] \end{aligned}$$
(3.4a)
$$\begin{aligned}&u(0)=1. \end{aligned}$$
(3.4b)

The corresponding propagator system amounts to

$$\begin{aligned} \frac{d}{dt}u_\alpha (t)=u_\alpha (t)+\mathbb {1}_{\{\alpha =\varepsilon _0\}}+\sum _{\varepsilon _k \le \alpha }u_{\alpha -\varepsilon _k}(t)m_k(t). \end{aligned}$$
(3.5)

For simplicity, we use the forward Euler and Crank–Nicolson time discretization schemes to describe our space-time treatment which is the foundation of the resulting COFRB method. Toward that end, we partition the time domain (0, T] uniformly by a grid \(0=t_0<t_1<\ldots <t_n=T\) with time step size \(\Delta t=t_j-t_{j-1}\). The forward Euler and Crank–Nicolson schemes read

$$\begin{aligned} u_{\varepsilon _0}^{j}&=u_{{\varepsilon _0}}^{j-1}+\Delta t (u_{\varepsilon _0}^{j-1}+1), \quad j=1,\ldots , n \end{aligned}$$
(3.6a)
$$\begin{aligned} u_\alpha ^{j}&=u_{\alpha }^{j-1}+\Delta t \left( u_\alpha ^{j-1}+\sum _{\varepsilon _k \le \alpha }u_{\alpha -\varepsilon _k}^{j-1} m_k^{j-1}\right) \text { for } \alpha \ne \varepsilon _0, \quad j=1,\ldots , n \end{aligned}$$
(3.6b)
$$\begin{aligned}&u_{\varepsilon _0}^{j}=u_{\varepsilon _0}^{j-1}+\frac{1}{2}\Delta t\left( u_{\varepsilon _0}^{j-1}+1\right) +\frac{1}{2}\Delta t \left( u_{\varepsilon _0}^{j}+1\right) ,\quad j=1,\ldots ,n \end{aligned}$$
(3.7a)
$$\begin{aligned}&u_{\alpha }^{j}=u_{\alpha }^{j-1}+\frac{1}{2}\Delta t\left( u_{\alpha }^{j-1}+\sum _{\varepsilon _k \le \alpha }u_{\alpha -\varepsilon _k}^{j-1} m_{k}^{j-1}\right) +\frac{1}{2}\Delta t \left( u_{\alpha }^{j}+\sum _{\varepsilon _k \le \alpha }u_{\alpha -\varepsilon _k}^{j} m_{k}^{j}\right) ,\nonumber \\&\quad \quad \text { for }\alpha \ne \varepsilon _0,\ j=1,\ldots ,n. \end{aligned}$$
(3.7b)

Here \(u_{\alpha }^j\) is an approximation of \(u_{\alpha }(t)\) at time \(t=t_j\). Our space-time treatment is to rewrite the scheme as the following system with \(\vec {U}_\alpha \) encoding the full time evolution except the initial condition.

$$\begin{aligned} A \vec {U}_\alpha = \vec {f}(u_\alpha ^0, u_{\alpha - \varepsilon _k}) \quad {\vec {U}_\alpha }=\left( u_\alpha ^1, u_\alpha ^2, \dots , u_\alpha ^n \right) ^T. \end{aligned}$$
(3.8)

We therefore have a parametric problem in (3.8), with \(\alpha \) playing the role of the parameter. The challenge is that \(\vec {U}_\alpha \) is coupled with \(\vec {U}_{\alpha - \varepsilon _k}\) through the right-hand side. This distinction from the usual RBM setting (2.9) means that we cannot query (3.8) for a single \(\alpha \). Instead, it must be done sequentially, starting from the lowest index \(\varepsilon _0\). As a consequence, the usual greedy Algorithm 1, and thus the very existence of the offline procedure, unfortunately does not apply. This motivates our design of an “offline-free” algorithm. To present it, we slightly abuse notation and denote the right-hand side simply by \(\vec {f}_\alpha \), hiding its dependence on \(u_{\alpha - \varepsilon _k}\), in the remainder of the paper. The following definitions of A and \(\vec {f}_\alpha \) for the forward Euler and Crank–Nicolson methods finish our description of the “truth” approximation.

$$\begin{aligned} A=\left( \begin{array}{cccc} 1 &{} &{} &{} \\ -(1+\Delta t) &{} 1 &{} &{} \\ &{} \ddots &{} \ddots &{} \\ &{} &{} -(1+\Delta t) &{} 1 \\ \end{array} \right) , \, \vec {f}_{\alpha }=\left( \begin{array}{c} u_\alpha ^0+\Delta t u_\alpha ^0 +\Delta t \sum _{\varepsilon _k \le \alpha } u_{\alpha -\varepsilon _k}^0 m_k^0\\ \Delta t \sum _{\varepsilon _k \le \alpha } u_{\alpha -\varepsilon _k}^1 m_k^1\\ \vdots \\ \Delta t \sum _{\varepsilon _k \le \alpha } u_{\alpha -\varepsilon _k}^{n-1} m_k^{n-1}\\ \end{array}\right) \end{aligned}$$
(3.9)
$$\begin{aligned}&A=\left( \begin{array}{cccc} 1-\frac{1}{2}\Delta t &{} &{} &{} \\ -(1+\frac{1}{2}\Delta t) &{} 1-\frac{1}{2}\Delta t &{} &{} \\ &{} \ddots &{} \ddots &{} \\ &{} &{} -(1+\frac{1}{2}\Delta t) &{} 1-\frac{1}{2}\Delta t \\ \end{array} \right) ; \quad \end{aligned}$$
(3.10a)
$$\begin{aligned}&\vec {f}_{\alpha }=\left( \begin{array}{c} u_\alpha ^0+\frac{1}{2}\Delta t u_\alpha ^0 +\frac{1}{2}\Delta t \sum _{\varepsilon _k \le \alpha } u_{\alpha -\varepsilon _k}^0 m_k^0+\frac{1}{2}\Delta t \sum _{\varepsilon _k \le \alpha } u_{\alpha -\varepsilon _k}^1 m_k^1\\ \frac{1}{2}\Delta t \sum _{\varepsilon _k \le \alpha } u_{\alpha -\varepsilon _k}^1 m_k^1+\frac{1}{2}\Delta t \sum _{\varepsilon _k \le \alpha } u_{\alpha -\varepsilon _k}^2 m_k^2\\ \vdots \\ \frac{1}{2}\Delta t \sum _{\varepsilon _k \le \alpha } u_{\alpha -\varepsilon _k}^{n-1} m_k^{n-1}+\frac{1}{2}\Delta t \sum _{\varepsilon _k \le \alpha } u_{\alpha -\varepsilon _k}^{n} m_k^n\\ \end{array}\right) . \end{aligned}$$
(3.10b)
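For the forward Euler variant, the space-time system (3.8)–(3.9) can be assembled in a few lines. The sketch below is ours; the array m_vals of evaluated \(m_k^j\) and the flag marking \(\alpha =\varepsilon _0\) are implementation choices of this illustration. It also exploits the fact that A is lower bidiagonal, so one “truth” solve is a single forward substitution.

```python
import numpy as np
from scipy.linalg import solve_triangular

def fe_matrix(n, dt):
    """Space-time matrix A of (3.9): ones on the diagonal,
    -(1 + dt) on the subdiagonal (lower bidiagonal)."""
    A = np.eye(n)
    i = np.arange(n - 1)
    A[i + 1, i] = -(1.0 + dt)
    return A

def fe_rhs(n, dt, u0, lower, m_vals, is_eps0):
    """Right-hand side f_alpha of (3.9). `lower` maps k to the trajectory
    (u_{alpha-eps_k}^0, ..., u_{alpha-eps_k}^{n-1}); m_vals[k] holds
    (m_k^0, ..., m_k^{n-1}) on the time grid."""
    f = np.zeros(n)
    f[0] = u0 + dt * u0               # contribution of the initial value
    if is_eps0:
        f += dt                       # the "+1" forcing in (3.6a)
    for k, traj in lower.items():
        f += dt * m_vals[k] * traj    # dt * sum_k u^{j-1} m_k^{j-1}
    return f

# One "truth" solve: A is lower triangular, so this costs O(n).
# U_alpha = solve_triangular(fe_matrix(n, dt), f_alpha, lower=True)
```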

Proceeding similarly to the classical RBM, given a pre-selected multi-index set \(S^i = \{\alpha _k\}_{k=1}^{i}\) and the corresponding “truth approximations” written in matrix form \(\vec {U}=\left( \vec {U}_{\alpha _1}, \ldots , \vec {U}_{\alpha _i}\right) \), the goal is to find, for each \(\alpha \), the RB coefficients \(\vec {c}_\alpha =(c_1(\alpha ),c_2(\alpha ),\ldots ,c_i(\alpha ))^T\) such that the RB solution \(\vec {U}_{\alpha }^{{i}}=\vec {U}\vec {c}_\alpha \) satisfies the equation \(A \vec {U}_{\alpha }^{{i}} = \vec {f}_\alpha \) in a certain sense. In this paper, we solve the least squares problem as in [8]

$$\begin{aligned} \vec {U}^TA^TA\vec {U}\vec {c}_\alpha =\vec {U}^TA^T\vec {f}_\alpha \end{aligned}$$
(3.11)

minimizing \(\Vert A\vec {U}^{{i}}_{\alpha }-\vec {f}_\alpha \Vert _{l^2}\), which is a Petrov-Galerkin formulation. We note that we can also formulate the reduced problem as a Galerkin projection,

$$\begin{aligned} \vec {U}^TA\vec {U}\vec {c}_\alpha =\vec {U}^T\vec {f}_\alpha , \end{aligned}$$
(3.12)

which is less stable according to our numerical results (not reported in this paper). The remaining key question of how to determine the reduced multi-index set \(S^i\) is answered by a “keep-or-toss” approach guided by a quantity \(\Delta _i(\alpha )\) bounding from above the error between the reduced solution \(\vec {U}_{\alpha }^{i}\) and the truth approximation \(\vec {U}_\alpha \) for any multi-index \(\alpha \). This upper bound, resulting from a residual-based a posteriori error estimate,

$$\begin{aligned} \Delta _i(\alpha )=\frac{\Vert A\vec {U}^{{i}}_{\alpha }-\vec {f}_\alpha \Vert _{l^2}}{\sqrt{\beta _{LB}}} \end{aligned}$$
(3.13)

is standard for RBM [8, 29]. Here \(\beta _{LB}\) is a lower bound for the smallest eigenvalue of \(A^TA\). The detailed algorithm is provided in Algorithm 2. We note that this “keep-or-toss” approach was initially explored in a different setting in [13], and then extended to a more general multi-layer enhanced greedy algorithm for the classical RBM setting in [21].

Algorithm 2: COFRB_ODE
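To convey the structure of Algorithm 2 in code, the following simplified sketch implements the keep-or-toss loop with the reduced solve (3.11) and the bound (3.13). It is ours and deliberately omits the precomputation tricks of “Appendix A” (so it is not online-efficient); the arguments rhs and truth_solve are placeholders for problem-specific routines.

```python
import numpy as np

def reduced_solve(AU, f_alpha):
    """Least-squares reduced problem (3.11), with AU = A @ U precomputed;
    returns the RB coefficients c_alpha (a small i-by-i solve)."""
    return np.linalg.solve(AU.T @ AU, AU.T @ f_alpha)

def error_estimate(AU, f_alpha, c_alpha, beta_lb):
    """Residual-based a posteriori bound (3.13)."""
    return np.linalg.norm(AU @ c_alpha - f_alpha) / np.sqrt(beta_lb)

def keep_or_toss(A, indices, rhs, truth_solve, tol, beta_lb):
    """COFRB sweep over `indices` (sorted by |alpha|): try the reduced
    solver first; fall back to the "truth" solver and enlarge the basis
    only when the error bound exceeds tol. `rhs(alpha, solutions)` and
    `truth_solve(A, f)` are placeholders for problem-specific routines."""
    U_cols, solutions = [], {}
    for alpha in indices:
        f = rhs(alpha, solutions)
        if U_cols:
            U = np.column_stack(U_cols)
            AU = A @ U
            c = reduced_solve(AU, f)
            if error_estimate(AU, f, c, beta_lb) < tol:
                solutions[alpha] = U @ c                # toss: surrogate
                continue
        u = truth_solve(A, f)                           # keep: full solve
        solutions[alpha] = u
        U_cols.append(u / np.linalg.norm(u))  # normalization (MGS in full)
    return solutions, U_cols
```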

Remark 3.1

(Generality of time discretizations) This paper presents the method for two kinds of time discretization schemes. In fact, any ODE solver that can be written in the form of (3.8), such as Runge–Kutta or linear (Adams-type) multistep methods, can be handled.

Remark 3.2

(Computational details) To ensure that the reduced system is well-conditioned, we apply the modified Gram–Schmidt (MGS) transformation to the basis matrix \(\vec {U}\). Moreover, special care must be taken to ensure efficiency and accuracy of the COFRB method when it comes to calculating the RB matrices and vectors (\(\vec {U}^TA^TA\vec {U}\) and \(\vec {U}^TA^T\vec {f}_\alpha \)) and the residual norm (as an example, a straightforward implementation may result in a negative norm). For these reasons, we provide the details of implementing steps 5, 6 and 10 of Algorithm 2 in “Appendix A”.
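For reference, the MGS transformation mentioned above is the textbook procedure below (a generic sketch, not the paper’s optimized implementation):

```python
import numpy as np

def mgs(U):
    """Modified Gram-Schmidt: orthonormalize the columns of U (on a copy),
    returning Q with Q^T Q = I, assuming U has full column rank."""
    Q = np.array(U, dtype=float)
    for j in range(Q.shape[1]):
        Q[:, j] /= np.linalg.norm(Q[:, j])
        for k in range(j + 1, Q.shape[1]):
            Q[:, k] -= (Q[:, j] @ Q[:, k]) * Q[:, j]
    return Q
```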

3.2 SPDE Problem

We devote this section to the stochastic PDE case, using linear parabolic PDEs as an example. For the truth approximation, we adopt a Fourier collocation approach with \({\mathcal {N}}_c\) collocation points for the spatial discretization [6]. These \({\mathcal {N}}_c\) collocation points \(C^{{\mathcal {N}}_c}=\{x_j\}_{j=1}^{{\mathcal {N}}_c}\) are taken as a tensor product of \({\mathcal {N}}_x\) collocation points in each dimension. Obviously, for \(D \subset \mathbb {R}^d\) we have \({\mathcal {N}}_c={\mathcal {N}}_x^d\). We denote the number of spatial degrees of freedom by \(M_f\) and present two ways to apply COFRB to this problem.

3.2.1 Method with No Spatial Compression Embedded

The first, a direct application of the Algorithm COFRB_ODE presented in the last section, becomes obvious once we write the “truth” solver in the form of (3.8). It achieves the RB reduction of the parameter-induced manifold by simply replacing the numbers \(u_\alpha ^j\) of the ODE case by vectors \({\mathbf {u}}_\alpha ^j\) which contain the spatial variation of the solution at the \(j^{\mathrm{th}}\) time step. Toward that end, we again consider the forward Euler and Crank–Nicolson time discretization methods for (3.1), which read

$$\begin{aligned} \mathrm{FE:} \, {\mathbf {u}}_\alpha ^{j}&={\mathbf {u}}_\alpha ^{j-1}+\Delta t\left( {\tilde{A}}{\mathbf {u}}_\alpha ^{j-1}+\sum _{\varepsilon _k\le \alpha }m_k^{j-1} {\tilde{B}} {\mathbf {u}}_{\alpha -\varepsilon _k}^{j-1}\right) , \quad j=1,\ldots ,n \end{aligned}$$
(3.14)
$$\begin{aligned} \mathrm{CN:}\, {\mathbf {u}}_\alpha ^{j}&={\mathbf {u}}_\alpha ^{j-1}+\frac{\Delta t}{2}\left( {\tilde{A}}{\mathbf {u}}_\alpha ^{j-1}+\sum _{\varepsilon _k\le \alpha }m_k^{j-1} {\tilde{B}} {\mathbf {u}}_{\alpha -\varepsilon _k}^{j-1}\right) +\frac{\Delta t}{2}\left( {\tilde{A}}{\mathbf {u}}_\alpha ^{j}+\sum _{\varepsilon _k\le \alpha }m_k^{j} {\tilde{B}} {\mathbf {u}}_{\alpha -\varepsilon _k}^{j}\right) ,\nonumber \\&\quad \quad \quad \quad j=1,\ldots ,n. \end{aligned}$$
(3.15)

Here \({\mathbf {u}}_\alpha ^{j}\) is now a vector containing the spatial variations, \({\mathbf {u}}_\alpha ^{j}=[u^{(1)}_\alpha (t_j),\ldots ,u^{({\mathcal {N}}_c)}_\alpha (t_j)]^T=[u_\alpha (t_j,x_1),\ldots ,u_\alpha (t_j,x_{{\mathcal {N}}_c})]^T\). Recall that \({\tilde{A}}\) is the spatial differential operator discretized by the spectral collocation method. We can now rewrite them in a matrix form:

$$\begin{aligned} A \vec {U}_\alpha = \vec {f}_\alpha , \quad {\vec {U}_\alpha }=\left( ({\mathbf {u}}_\alpha ^1)^T, ({\mathbf {u}}_\alpha ^2)^T, \dots , ({\mathbf {u}}_\alpha ^n)^T \right) ^T , \end{aligned}$$
(3.16)

with A written block-wise for the FE and CN schemes as follows

$$\begin{aligned}&A=\left( \begin{array}{cccc} I &{} &{} &{} \\ -(I+\Delta t{\tilde{A}}) &{} I &{} &{} \\ &{} \ddots &{} \ddots &{} \\ &{} &{} -(I+\Delta t{\tilde{A}}) &{} I \\ \end{array} \right) ; \, A=\left( \begin{array}{cccc} I-\frac{\Delta t}{2}{\tilde{A}} &{} &{} &{} \\ -(I+\frac{\Delta t}{2}{\tilde{A}}) &{} I-\frac{\Delta t}{2}{\tilde{A}} &{} &{} \\ &{} \ddots &{} \ddots &{} \\ &{} &{} -(I+\frac{\Delta t}{2}{\tilde{A}}) &{} I-\frac{\Delta t}{2}{\tilde{A}} \\ \end{array} \right) . \end{aligned}$$
(3.17)

Here \(I\in {\mathbb {R}}^{{\mathcal {N}}_c\times {\mathcal {N}}_c}\) is the identity matrix. To facilitate presentation of the detailed efficient implementation of the algorithm (see “Appendix A”), we write \(\vec {f}_\alpha \) for the FE solver in the following form

$$\begin{aligned} \vec {f}_{\alpha }&=\left( \begin{array}{c} {\mathbf {u}}_\alpha ^0+\Delta t {\tilde{A}}{\mathbf {u}}_\alpha ^0 +\Delta t \sum _{\varepsilon _k \le \alpha } m_k^0{\tilde{B}}{\mathbf {u}}_{\alpha -\varepsilon _k}^0\\ \Delta t \sum _{\varepsilon _k \le \alpha } m_k^1{\tilde{B}}{\mathbf {u}}_{\alpha -\varepsilon _k}^1\\ \vdots \\ \Delta t \sum _{\varepsilon _k \le \alpha } m_k^{n-1}{\tilde{B}}{\mathbf {u}}_{\alpha -\varepsilon _k}^{n-1}\\ \end{array}\right) \end{aligned}$$
(3.18)
$$\begin{aligned}&=\left( \begin{array}{c} {\mathbf {u}}_\alpha ^0+\Delta t {\tilde{A}}{\mathbf {u}}_\alpha ^0+\Delta t \sum _{\varepsilon _k \le \alpha } m_k^0{\tilde{B}}{\mathbf {u}}_{\alpha -\varepsilon _k}^0\\ 0\\ \vdots \\ 0\\ \end{array}\right) +\Delta t \sum _{\varepsilon _k \le \alpha }\left( \begin{array}{cccc} 0 &{} &{} &{} \\ &{}m_k^1{\tilde{B}} &{} &{}\\ &{} &{} \ddots &{}\\ &{} &{} &{} m_k^{n-1}{\tilde{B}} \end{array}\right) \left( \begin{array}{c} 0\\ {\mathbf {u}}_{\alpha -\varepsilon _k}^1\\ \vdots \\ {\mathbf {u}}_{\alpha -\varepsilon _k}^{n-1} \end{array}\right) \nonumber \\&=\vec {f}_{\alpha }^0+\Delta t \sum _{\varepsilon _k \le \alpha } M_k^0 \vec {U}^0 \vec {c}_{\alpha -\varepsilon _k} \end{aligned}$$
(3.19)

where \(M_k^0=\text {diag}\{0 ,m_k^1{\tilde{B}},\ldots ,m_k^{n-1}{\tilde{B}}\}\) and \(\vec {U}^0\) is the full spatial-temporal representation of the RB basis matrix with the final time step dropped and the initial one replaced by a zero block.

$$\begin{aligned} \vec {U}^0=\left( \begin{array}{cccc} 0&{} 0 &{}\ldots &{}0 \\ {\mathbf {u}}_{\alpha _1}^1 &{}{\mathbf {u}}_{\alpha _2}^1 &{}\ldots &{}{\mathbf {u}}_{\alpha _i}^1\\ \vdots &{}\vdots &{} \vdots &{}\vdots \\ {\mathbf {u}}_{\alpha _1}^{n-1} &{}{\mathbf {u}}_{\alpha _2}^{n-1} &{} \ldots &{} {\mathbf {u}}_{\alpha _i}^{n-1} \end{array}\right) . \end{aligned}$$
(3.20)

This compact form can be achieved for the CN scheme as well:

$$\begin{aligned} \vec {f}_{\alpha }&=\left( \begin{array}{c} {\mathbf {u}}_\alpha ^0+\frac{\Delta t}{2} {\tilde{A}}{\mathbf {u}}_\alpha ^0 +\frac{\Delta t}{2} \sum _{\varepsilon _k \le \alpha } m_k^0{\tilde{B}}{\mathbf {u}}_{\alpha -\varepsilon _k}^0+\frac{\Delta t}{2} \sum _{\varepsilon _k \le \alpha } m_k^1{\tilde{B}}{\mathbf {u}}_{\alpha -\varepsilon _k}^1\\ \frac{\Delta t}{2} \sum _{\varepsilon _k \le \alpha } \left( m_k^1{\tilde{B}}{\mathbf {u}}_{\alpha -\varepsilon _k}^1+m_k^2{\tilde{B}}{\mathbf {u}}_{\alpha -\varepsilon _k}^2\right) \\ \vdots \\ \frac{\Delta t}{2} \sum _{\varepsilon _k \le \alpha } \left( m_k^{n-1}{\tilde{B}}{\mathbf {u}}_{\alpha -\varepsilon _k}^{n-1}+m_k^{n}{\tilde{B}}{\mathbf {u}}_{\alpha -\varepsilon _k}^{n}\right) \\ \end{array}\right) \end{aligned}$$
(3.21)
$$\begin{aligned}&=\left( \begin{array}{c} {\mathbf {u}}_\alpha ^0+\frac{\Delta t}{2} {\tilde{A}}{\mathbf {u}}_\alpha ^0+\frac{\Delta t}{2} \sum _{\varepsilon _k \le \alpha }m_k^0 {\tilde{B}}{\mathbf {u}}_{\alpha -\varepsilon _k}^0\\ 0\\ \vdots \\ 0\\ \end{array}\right) +\frac{\Delta t}{2} \sum _{\varepsilon _k \le \alpha }\left( \begin{array}{cccc} 0 &{} &{} &{} \\ &{}m_k^1{\tilde{B}} &{} &{}\\ &{} &{} \ddots &{}\\ &{} &{} &{} m_k^{n-1}{\tilde{B}} \end{array}\right) \left( \begin{array}{c} 0\\ {\mathbf {u}}_{\alpha -\varepsilon _k}^1\\ \vdots \\ {\mathbf {u}}_{\alpha -\varepsilon _k}^{n-1} \end{array}\right) \nonumber \\&\quad +\frac{\Delta t}{2} \sum _{\varepsilon _k \le \alpha }\left( \begin{array}{cccc} m_k^1{\tilde{B}} &{} &{} &{} \\ &{}m_k^2{\tilde{B}} &{} &{}\\ &{} &{} \ddots &{}\\ &{} &{} &{} m_k^{n}{\tilde{B}} \end{array}\right) \left( \begin{array}{c} {\mathbf {u}}_{\alpha -\varepsilon _k}^1\\ {\mathbf {u}}_{\alpha -\varepsilon _k}^2\\ \vdots \\ {\mathbf {u}}_{\alpha -\varepsilon _k}^{n} \end{array}\right) \nonumber \\&=\vec {f}_{\alpha }^0+\frac{\Delta t}{2} \sum _{\varepsilon _k \le \alpha } \left( M_k^0 \vec {U}^0+M_k \vec {U}\right) \vec {c}_{\alpha -\varepsilon _k} \end{aligned}$$
(3.22)

where \(M_k=\text {diag}\{m_k^1{\tilde{B}},m_k^2{\tilde{B}},\ldots ,m_k^n{\tilde{B}}\}\) and \(\vec {U}\) is the full spatial-temporal representation of the RB basis matrix with the initial time step truncated.

$$\begin{aligned} \vec {U}=\left( \begin{array}{cccc} {\mathbf {u}}_{\alpha _1}^1 &{}{\mathbf {u}}_{\alpha _2}^1 &{}\ldots &{}{\mathbf {u}}_{\alpha _i}^1\\ \vdots &{}\vdots &{} \vdots &{}\vdots \\ {\mathbf {u}}_{\alpha _1}^{n-1} &{}{\mathbf {u}}_{\alpha _2}^{n-1} &{} \ldots &{} {\mathbf {u}}_{\alpha _i}^{n-1}\\ {\mathbf {u}}_{\alpha _1}^{n} &{}{\mathbf {u}}_{\alpha _2}^{n} &{} \ldots &{} {\mathbf {u}}_{\alpha _i}^{n} \end{array}\right) . \end{aligned}$$
(3.23)

3.2.2 Method with Spatial Compression Embedded

The consequences of ignoring the spatial compressibility are two-fold. The first is the large amount of memory the algorithm consumes. For example, the “truth approximation” from (3.16), \(\vec {U}_{\alpha }\), is in \({\mathbb {R}}^{n{\mathcal {N}}_x^d}\), and so is the number of rows of \(\vec {U}^0\), \(\vec {U}\), and the resulting matrices \(A\vec {U},A^TA\vec {U}\), etc. This proves challenging especially in the high spatial dimension (\(d \ge 2\)) case. The second negative consequence is the high operation count caused by the multiplication of matrices of this size.

To mitigate these difficulties, we design the following algorithm to “compress” the basis every time one is added to the reduced space while maintaining its accuracy. To achieve that, we rewrite the vector \(\vec {U}_\alpha \) as a matrix \(\widehat{\vec {U}_\alpha }\) to which we apply singular value decomposition (SVD).

$$\begin{aligned} \widehat{\vec {U}_\alpha }=\left( {\mathbf {u}}_\alpha ^1,{\mathbf {u}}_\alpha ^2, \ldots , {\mathbf {u}}_\alpha ^n \right) =\eta \Sigma {\zeta }^T=\sum _{j=1}^{\min ({\mathcal {N}}_c,n)} \lambda _j\eta _j\zeta _j^T \end{aligned}$$
(3.24)

where \(\eta \) is an \({\mathcal {N}}_c\)-by-\({\mathcal {N}}_c\) orthonormal matrix, \(\zeta \) is an n-by-n orthonormal matrix, and \(\Sigma \) is an \({\mathcal {N}}_c\)-by-n matrix whose entries are zero except for its \(\min ({\mathcal {N}}_c,n)\) diagonal elements, denoted by \(\lambda _j\) and arranged in decreasing order. \(\eta _j\) and \(\zeta _j\) are the columns of the matrices \(\eta \) and \(\zeta \). Our algorithm truncates this SVD, storing only the first \(m(\ll \min ({\mathcal {N}}_c,n))\) vectors \(\eta _j\), \(\zeta _j\) and singular values \(\lambda _j\) to approximate \(\widehat{\vec {U}_\alpha }\) as follows.

$$\begin{aligned} \widehat{\vec {U}_\alpha }&\approx \sum _{j=1}^{m}\lambda _j\eta _j\zeta _j^T= \left( \begin{array}{cccc} \eta _1^1 &{} \eta _2^1 &{} \ldots &{} \eta _m^1\\ \eta _1^2 &{} \eta _2^2 &{} \ldots &{} \eta _m^2\\ \vdots &{}\vdots &{}\vdots &{}\vdots \\ \eta _1^{{\mathcal {N}}_c} &{} \eta _2^{{\mathcal {N}}_c} &{} \ldots &{} \eta _m^{{\mathcal {N}}_c} \end{array}\right) \left( \begin{array}{cccc} \lambda _1\zeta _1^1 &{} \lambda _1\zeta _1^2 &{}\ldots &{} \lambda _1\zeta _1^n\\ \lambda _2\zeta _2^1 &{} \lambda _2\zeta _2^2 &{}\ldots &{} \lambda _2\zeta _2^n\\ \vdots &{}\vdots &{}\vdots &{}\vdots \\ \lambda _m\zeta _m^1 &{} \lambda _m\zeta _m^2 &{}\ldots &{} \lambda _m\zeta _m^n \end{array}\right) \end{aligned}$$
(3.25)
$$\begin{aligned}&=V_{\alpha }^m\left( \begin{array}{cccc} w_\alpha ^1&w_\alpha ^2&\ldots&w_\alpha ^n \end{array}\right) \end{aligned}$$
(3.26)

where \(V_\alpha ^m \in \mathbb {R}^{{\mathcal {N}}_c\times m}\), \(w_{\alpha }^j \in \mathbb {R}^{m},j=1,\ldots ,n\). We can then rewrite the truth approximation back as a vector form

$$\begin{aligned} \vec {U}_\alpha&\approx \left( \begin{array}{ccc} V_\alpha ^m&{} &{} \\ &{} \ddots &{} \\ &{} &{} V_\alpha ^m \end{array}\right) \left( \begin{array}{c} w_\alpha ^1\\ \vdots \\ w_\alpha ^n \end{array}\right) =\vec {V}_{\alpha }^m\vec {w}_\alpha . \end{aligned}$$
(3.27)
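In code, the per-snapshot compression (3.24)–(3.26) is a single truncated SVD. A minimal NumPy sketch (ours), applied to a smooth, hence compressible, synthetic space-time field:

```python
import numpy as np

def compress_snapshot(U_hat, m):
    """Truncated SVD (3.24)-(3.26) of a snapshot reshaped to Nc x n.
    Returns V (Nc x m) and W (m x n) with U_hat ~= V @ W, storing
    m*(Nc + n) numbers instead of Nc*n."""
    eta, lam, zetaT = np.linalg.svd(U_hat, full_matrices=False)
    V = eta[:, :m]                        # spatial modes V_alpha^m
    W = lam[:m, None] * zetaT[:m, :]      # weights (w_alpha^1 ... w_alpha^n)
    return V, W

# Example: a rank-2 space-time field is recovered to machine precision.
Nc, n = 64, 500
x = np.linspace(0, 2 * np.pi, Nc)[:, None]
t = np.linspace(0, 1, n)[None, :]
U_hat = np.exp(-t) * np.cos(x) + 0.3 * np.exp(-2 * t) * np.cos(2 * x)
V, W = compress_snapshot(U_hat, m=4)
print(np.linalg.norm(U_hat - V @ W) / np.linalg.norm(U_hat))
```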

Now we only store \(m{\mathcal {N}}_c+mn\) elements instead of \(n{\mathcal {N}}_c\) for each truth approximation, i.e. a fraction \(\frac{m}{n}+\frac{m}{{\mathcal {N}}_c}\) of the original storage. The basis matrix \(\vec {U}\) can in turn be approximated as

$$\begin{aligned} \vec {U}&\approx \left( \begin{array}{ccccccccc} V_{\alpha _1}^m&{} &{} &{} V_{\alpha _2}^m&{} &{} &{} V_{\alpha _i}^m&{} &{}\\ &{} \ddots &{} &{} &{} \ddots &{} &{} &{} \ddots &{}\\ &{} &{} V_{\alpha _1}^m &{} &{} &{} V_{\alpha _2}^m&{} &{} &{} V_{\alpha _i}^m \end{array}\right) \left( \begin{array}{cccc} w_{\alpha _1}^1&{}0&{}\ldots &{}0\\ \vdots &{}\vdots &{} &{}\vdots \\ w_{\alpha _1}^n&{}0&{}\ldots &{}0\\ 0&{}w_{\alpha _2}^1&{}\ldots &{}0\\ \vdots &{}\vdots &{} &{}\vdots \\ 0&{}w_{\alpha _2}^n&{}\ldots &{}0\\ \vdots &{}\vdots &{} \vdots &{}\vdots \\ 0&{}0&{}\ldots &{}w_{\alpha _i}^1\\ \vdots &{}\vdots &{} &{}\vdots \\ 0&{}0&{}\ldots &{}w_{\alpha _i}^n \end{array}\right) \nonumber \\&=\left( \vec {V}_{\alpha _1}^m\ \vec {V}_{\alpha _2}^m\ \ldots \ \vec {V}_{\alpha _i}^m\right) \left( \vec {W}_{\alpha _1}\ \vec {W}_{\alpha _2}\ \ldots \ \vec {W}_{\alpha _i}\right) =\vec {V}\vec {W}. \end{aligned}$$
(3.28)

The other basis matrix \(\vec {U}^0\) is approximated in the same fashion, simply with \(\vec {W}\) replaced by \(\vec {W}^0\) that contains \(w_{\alpha _i}^j\) for \( j = 0, \dots , n-1\) instead of \(j = 1, \dots , n\) with \(w_{\alpha _i}^0=0\),

$$\begin{aligned} \vec {U}^0 =\vec {V}\vec {W}^0. \end{aligned}$$

Then the reduced solver (3.11) becomes

$$\begin{aligned} \vec {W}^T&\vec {V}^{T}A^{T}A\vec {V}\vec {W}\vec {c}_{\alpha }= \vec {W}^T\vec {V}^{T}A^{T}\vec {f}_{\alpha } \end{aligned}$$
(3.29)
$$\begin{aligned}&=\left\{ \begin{array}{ll}\vec {W}^T\vec {V}^{T}A^{T}\left( \vec {f}_{\alpha }^0+\Delta t \sum _{\varepsilon _k \le \alpha }M_k^0 \vec {V}\vec {W}^0\vec {c}_{\alpha -\varepsilon _k}\right) &{}\text { for FE}\\ \vec {W}^T\vec {V}^{T}A^{T}\left( \vec {f}_{\alpha }^0+\frac{\Delta t}{2} \sum _{\varepsilon _k \le \alpha }(M_k^0 \vec {V}\vec {W}^0+M_k\vec {V}\vec {W})\vec {c}_{\alpha -\varepsilon _k}\right) &{}\text { for CN}. \end{array}\right. \end{aligned}$$
(3.30)

The detailed algorithm, called COFRB_PDE, is provided in Algorithm 3 with implementation details postponed to “Appendix B”.

Algorithm 3: COFRB_PDE

Remark 3.3

Note that the columns of the matrices \(\vec {W}\) and \(\vec {W}^0\) are orthogonal, so we only need to normalize them. The absence of MGS for the basis matrix \(\vec {U}\) improves the efficiency of the algorithm. Currently, we use the same m across the RB bases. This may lead to some redundancy in \(\vec {U}\), resulting in a possibly non-invertible RB matrix \(A_{RB}=\vec {W}^T\vec {V}^{T}A^{T}A\vec {V}\vec {W}\). In this case, we adopt the Moore–Penrose pseudo-inverse [2], \((\vec {W}^T\vec {V}^{T}A^{T}A\vec {V}\vec {W})^{\dag }\), to solve the least squares problem, which amounts to identifying the RB coefficients \(\vec {c}_{\alpha }\) for (3.29) with minimum norm. This redundancy can be removed, and the Moore–Penrose pseudo-inverse thus avoided, by using an adaptively chosen m which gets smaller as we build up the RB space.

4 Complexity Analysis of the COFRB Algorithms

As is well-known [28], the tremendous speedup of the reduced basis method originates from the decomposition of the computation into an offline stage and an online stage. The essence is that, whenever a new basis is added to the reduced space, the components of the reduced solver and its error estimation that depend on this basis are precomputed and stored. As a consequence, the reduced solver and its error estimation are independent of the number of degrees of freedom of the truth solver. Keeping this in mind, our COFRB algorithm can be implemented in a similar fashion. The lack of a clear offline-online decomposition is overcome by the fact that, in the sequential procedure of the problem resolution, there are overwhelmingly more multi-indices whose solutions are provided by the reduced, as opposed to the full, solver. The savings from these reduced solves, being independent of the size of the full problem, more than compensate for the additional computation, after each full solve, to prepare for the reduced solves. Thus, the total computational complexity is less than that of the original scheme. In this section, we provide a detailed computational complexity count, taking the forward Euler time discretization as an example. It makes evident the theoretical basis of the practical numerical speedup we observe in the numerical results section below.

4.1 Computational Complexity for COFRB_ODE

We first consider Algorithm 2 for the SODE problem. The complexity of the original scheme, that is, obtaining the truth approximations \(\vec {U}_\alpha \) from (3.8) for all \(\alpha \in {\mathcal {J}}_{M,K}\), is of the order

$$\begin{aligned} {\mathcal {C}}^{\mathrm{O}}_{\mathrm{Full}} = n{\mathcal {N}}K. \end{aligned}$$

Recall that \({{\mathcal {N}}}\) is the cardinality of \({\mathcal {J}}_{M,K}\). To compute the complexity of Algorithm 2, we note that for each \(\alpha \in {\mathcal {J}}_{M,K}\), we need to form \(\vec {f}_{RB}=\vec {U}^TA^T\vec {f}_\alpha \), solve for \(\vec {c}_{\alpha }\), and calculate \(\Delta _i(\alpha )\) from (6.11). The operation count of this procedure is, at most, of the order \(N(N^2 + NK + K^2)\).

This cost is, as expected, independent of the number of time steps n (in this case the size of the truth approximation). During this procedure, if the solution corresponding to the current multi-index is determined to be a worthy addition to the RB space, we incur an additional cost which, aggregated over all the N chosen multi-indices, is of the order \(nN^2K^2\).

Therefore, the total complexity of Algorithm 2 for the SODE problem is of the order

$$\begin{aligned} {\mathcal {C}}^{\mathrm{O}}_{\mathrm{RB}} = nN^2K^2+N(N^2 + NK + K^2){\mathcal {N}}. \end{aligned}$$

Remark 4.1

The amount of speedup, assuming \(N \ge K\), is in the order of

$$\begin{aligned} \frac{{\mathcal {C}}^{\mathrm{O}}_{\mathrm{RB}}}{{\mathcal {C}}^{\mathrm{O}}_{\mathrm{Full}}} = \frac{N^2 K}{{\mathcal {N}}} + \frac{N^3}{n K}. \end{aligned}$$

4.2 Computational Complexity for COFRB_PDE

One-dimensional case: The cost of the original scheme is of the order \(n{\mathcal {N}}_x^2{\mathcal {N}}K.\) For Algorithm 2, a similar analysis as above shows that the total cost is of the order

$$\begin{aligned} n{\mathcal {N}}_xNK(NK+{\mathcal {N}}_x) + N(N^2 + NK +K^2) {\mathcal {N}}. \end{aligned}$$

For Algorithm 3, due to the spatial reduction, the size of the space-time snapshots decreases from \(n{\mathcal {N}}_x\) to \(m{\mathcal {N}}_x+mn\). For each multi-index, we need to form \(\vec {W}^T\vec {V}^TA^TA\vec {V}\vec {W}\) and \(\vec {W}^T\vec {V}^TA^TM_k^0\vec {V}\vec {W}^0\) (of size at most N, the number of reduced basis vectors), calculate \((\vec {W}^T\vec {V}^TA^TA\vec {V}\vec {W})^{\dag }\), and evaluate \(\Delta _i(\alpha )\) through forming and decomposing \({\mathcal {B}}^T{\mathcal {B}}\) (see “Appendix B”). All these operations add up to, at most, the order of \(N{\mathcal {N}}_xm({\mathcal {N}}_x+m)K+nNK^2m^2 + N^3K^3\) per multi-index.

For the N chosen multi-indices, we incur an additional aggregated cost of the order \(n{\mathcal {N}}_x^2NK+N\min (n,{\mathcal {N}}_x)^3\).

Thus, the total cost is of the order

$$\begin{aligned} {\mathcal {N}}\left( N{\mathcal {N}}_xm({\mathcal {N}}_x+m)K+nNK^2m^2 + N^3K^3\right) +n{\mathcal {N}}_x^2NK+N\min (n,{\mathcal {N}}_x)^3. \end{aligned}$$

Two-dimensional case: In the 2D case, the differentiation matrices \({\tilde{A}}\) and \({\tilde{B}}\) are sparse, thus the complexity of matrix vector multiplication is of the order \({\mathcal {N}}_x^3\). So the complexity of obtaining \(\vec {U}_\alpha \) from the original scheme is of the order \(n{\mathcal {N}}_x^3{\mathcal {N}}K\).

For Algorithm 2, the complexity is of order

$$\begin{aligned} n{\mathcal {N}}_x^2NK(NK+{\mathcal {N}}_x) + N(N^2 + NK +K^2) {\mathcal {N}}. \end{aligned}$$

For Algorithm 3,

$$\begin{aligned} {\mathcal {N}}\left( N{\mathcal {N}}_x^2m({\mathcal {N}}_x+m)K+nNK^2m^2 + N^3K^3\right) +n{\mathcal {N}}_x^3NK+N\min (n,{\mathcal {N}}_x^2)^3. \end{aligned}$$

Summary: We list here the complexities of the original scheme, direct extension of COFRB_ODE, and then COFRB_PDE for the d-dimensional SPDE problems.

$$\begin{aligned} {\mathcal {C}}^{\mathrm{P}}_{\mathrm{Full}}&= n{\mathcal {N}}_x^{d+1}{\mathcal {N}}K,\\ {\mathcal {C}}^{\mathrm{P,1}}_{\mathrm{RB}}&= n{\mathcal {N}}_x^dNK(NK+{\mathcal {N}}_x) + N(N^2 + NK +K^2) {\mathcal {N}},\\ {\mathcal {C}}^{\mathrm{P,2}}_{\mathrm{RB}}&= {\mathcal {N}}\left( N{\mathcal {N}}_x^dm({\mathcal {N}}_x+m)K+nNK^2m^2 + N^3K^3\right) +n{\mathcal {N}}_x^{d+1}NK+N\min (n,{\mathcal {N}}_x^d)^3. \end{aligned}$$

After simple algebraic simplifications, the amount of speedup of the COFRB_PDE algorithm is

$$\begin{aligned} \frac{{\mathcal {C}}^{\mathrm{P,2}}_{\mathrm{RB}}}{{\mathcal {C}}^{\mathrm{P}}_{\mathrm{Full}}} = \frac{N}{{\mathcal {N}}} + \frac{Nm}{n} + \frac{NK}{{\mathcal {N}}_x^{d+1}}\left( m^2 + \frac{N^2K}{n}\right) + \frac{N}{{\mathcal {N}}K}\min \left( \frac{n^2}{{\mathcal {N}}_x^{d+1}}, \frac{{\mathcal {N}}_x^{2d-1}}{n}\right) . \end{aligned}$$
Fig. 1 SODE problem for \((M,K)=(8,8)\) (top) and \((M,K)=(9,9)\) (bottom). From left to right: histories of convergence of \(e_{2}^{RBM}\) and \(e_{2}^{ORI}\), CPU time of Algorithm 2 as a function of \(\varepsilon _{\text {tol}}\), and number of chosen multi-indices (out of 12,870 or 48,620) as a function of \(\varepsilon _{\text {tol}}\)

5 Numerical Results

In this section, we apply our algorithms to one SODE problem and two SPDE problems, and compare the costs of the original scheme and the COFRB methods. In all the experiments, we set the maximum number of reduced multi-indices \(N_{\max }\) to 200. For the COFRB_PDE algorithm, we set the number of columns m of \(V_\alpha ^m\) to 6 in all experiments. Exact solutions are obtained by solving the corresponding moment equations of (3.1).

Table 2 SODE problem: error and computing costs for Algorithm COFRB_ODE

Example 5.1

(Linear SODE). Suppose that \(u=u(t)\) satisfies the following linear SODE

$$\begin{aligned}&u'(t)=u(t)+1+u(t)\diamond \dot{{\mathfrak {N}}}(t),\quad t\in [0,T]\nonumber \\&u(0)=1. \end{aligned}$$
(5.1)

We evolve the propagator ODE system to the final time \(T=1\) with time step size \(\Delta t=10^{-4}\), i.e. \(n=10{,}000\) time steps. In order to examine the error of the reduced basis solution in comparison to the truth approximation, we define the following squared error

$$\begin{aligned} e_{2}^{RBM}={\mathbb {E}}[|u_{M,K}^{N}(1)-u_{M,K}^{{\mathcal {N}}}(1)|^2]. \end{aligned}$$
(5.2)

As a reference, we also compute the error of the truth approximation in comparison to the exact solution

$$\begin{aligned} e_{2}^{ORI}={\mathbb {E}}[|u_{M,K}^{{\mathcal {N}}}(1)-u(1)|^2]. \end{aligned}$$
(5.3)

We test both the forward Euler and Crank–Nicolson time discretization schemes for \((M,K)=(8,8)\) and (9, 9) and plot the histories of convergence, the number of chosen multi-indices, and the computation time, all with respect to \(\varepsilon _{\text {tol}}\), in Fig. 1. It is clear that the method is highly effective in capturing the “important” multi-indices. With 35 of them (out of a total of more than 12,000), the COFRB solution reaches an accuracy of more than 8 digits. In Table 2, we observe that when \(e_{2}^{RBM}\) and \(e_{2}^{ORI}\) reach the same level, our algorithm achieves savings of two orders of magnitude.

Fig. 2 1D SPDE problem for \((M,K, {\mathcal {N}}_c)=(9, 9, 32)\) (Row 1), (9, 9, 64) (Row 2), (10, 10, 32) (Row 3), and (10, 10, 64) (Row 4). From left to right: histories of convergence of \(e_{2}^{RBM}\) and \(e_{2}^{ORI}\), CPU time of Algorithm 3 as a function of \(\varepsilon _{\text {tol}}\), and number of chosen multi-indices (out of 48,620) as a function of \(\varepsilon _{\text {tol}}\)

Example 5.2

(Linear 1D parabolic PDE). We solve the linear parabolic PDE of Sect. 2.1. Consider the one-dimensional spatial domain \(D=[0,2\pi ]\) with periodic boundary conditions. The initial data is \(u_0(x)=\cos x\). The differential operators \({\mathcal {L}}\) and \({\mathcal {M}}\) are set to be

$$\begin{aligned} {\mathcal {L}}u=0.145\partial _x^2u + 0.1 \sin x \partial _x u, \quad {\mathcal {M}}u = 0.5\partial _x u. \end{aligned}$$
(5.4)
Table 3 1D SPDE problem: error and computing costs of our Algorithms COFRB_ODE and COFRB_PDE
Table 4 1D SPDE problem: error and computing costs of our Algorithm COFRB_PDE for varying polynomial degree M
Fig. 3 CPU times (for COFRB_PDE and the original algorithm) versus errors for the 1D SPDE problem with \((M,K,{\mathcal {N}}_c)=(6,9,32),(7,9,32),\ldots ,(11,9,32)\). Left: forward Euler time discretization; right: Crank–Nicolson time discretization

Fig. 4 COFRB for the 2D SPDE problem based on the forward Euler method with \((M,K, {\mathcal {N}}_c)=(9, 9, 32 \times 32)\) (Row 1), \((10, 10, 32 \times 32)\) (Row 2), \((8, 8, 64 \times 64)\) (Row 3), and \((9, 9, 64 \times 64)\) (Row 4). From left to right: histories of convergence of \(e_{2}^{RBM}\) and \(e_{2}^{ORI}\), CPU time of Algorithm 3 as a function of \(\varepsilon _{\text {tol}}\), and number of chosen multi-indices as a function of \(\varepsilon _{\text {tol}}\)

Fig. 5 COFRB for the 2D SPDE problem based on the Crank–Nicolson method with \((M,K, {\mathcal {N}}_c)=(6, 6, 32 \times 32)\) (Row 1), \((7, 7, 32 \times 32)\) (Row 2), \((6, 6, 64 \times 64)\) (Row 3), and \((7, 7, 64 \times 64)\) (Row 4). From left to right: histories of convergence of \(e_{2}^{RBM}\) and \(e_{2}^{ORI}\), CPU time of Algorithm 3 as a function of \(\varepsilon _{\text {tol}}\), and number of chosen multi-indices as a function of \(\varepsilon _{\text {tol}}\)

Fig. 6 Variance of the 2D SPDE problem with \({\mathcal {N}}_c=32\times 32\): from left to right, the exact solution, and the RBM solutions with Algorithm 3 based on the forward Euler method with \((M,K)=(9,9)\) and the Crank–Nicolson method with \((M,K)=(6,6)\)

Fig. 7 Variance of the 2D SPDE problem with \({\mathcal {N}}_c=64\times 64\): from left to right, the exact solution, and the RBM solutions with Algorithm 3 based on the forward Euler method with \((M,K)=(8,8)\) and the Crank–Nicolson method with \((M,K)=(6,6)\)

The Fourier collocation method with \({\mathcal {N}}_c\) collocation points is used for the spatial discretization. Let \(\{x_j\}_{j=1}^{{\mathcal {N}}_c}\) be the set of equidistant collocation points \(x_j=\frac{2\pi (j-1)}{{\mathcal {N}}_c}\). We test our algorithms for \({\mathcal {N}}_c=32, 64\) collocation points with (M, K) pairs (9, 9) and (10, 10), respectively. Both forward Euler and Crank–Nicolson methods are implemented up to \(T=5\) with time step size \(\Delta t=10^{-3}\), i.e. \(n=5000\) time steps. The results, shown in Fig. 2, indicate that the COFRB_PDE method is effective for PDEs. We only need to resolve about \(0.2\%\) of all the multi-indices through the high-fidelity simulation. All the others can be supplied by the reduced solver without degrading the accuracy. We compute \(e_2^{RBM}={\mathbb {E}}[\Vert u_{M,K}^{N}(5,\cdot )-u_{M,K}^{{\mathcal {N}}}(5,\cdot )\Vert _{l^2}^2]\) as a proxy for the error, where the discrete \(L^2\) norm of v is defined by

$$\begin{aligned} \Vert v\Vert _{l^2}^2:=\frac{2\pi }{{\mathcal {N}}_c}\sum _{j=1}^{{\mathcal {N}}_c}(v(x_j))^2 \end{aligned}$$

and, for comparison, we also compute the error \(e_{2}^{ORI}={\mathbb {E}}[\Vert u_{M,K}^{{\mathcal {N}}}(5,\cdot )-u(5,\cdot )\Vert _{l^2}^2]\) where u is the exact solution of the SPDE. In Table 3, we find that when the errors \(e_2^{RBM}\) and \(e_{2}^{ORI}\) reach the same order of magnitude, our algorithm is much faster than the original algorithm. This result also holds for the Crank–Nicolson method. We normalize the time with respect to that of one truth solve. We see that our algorithm achieves savings of two orders of magnitude. We also compare Algorithms 2 and 3, and verify that Algorithm 3 not only saves more storage but is also more efficient than Algorithm 2. In addition, we demonstrate the scalability of our Algorithm COFRB_PDE. Toward that end, we fix the number of random variables \(K=9\) while increasing the polynomial degree M. The results are listed in Table 4. We observe that, to reach the same level of accuracy as the original algorithm, the number of snapshots N does increase, as expected, since the original algorithm is getting more accurate. However, this increase in N is quite gradual. As a consequence, our algorithm actually becomes slightly more efficient overall in comparison to the original algorithm, whose cost grows faster as M increases. We visualize these trends in Fig. 3.

Example 5.3

(Linear 2D parabolic PDE) We consider a two-dimensional linear transport-type SPDE. Consider the following distribution-free equation, equipped with periodic boundary conditions.

$$\begin{aligned}&\partial _tu(t,x,y)=\left( \frac{1}{2}\partial _x^2+\cos (x)\partial _y\right) u + \partial _xu\diamond \dot{{\mathfrak {N}}}(t),\quad (t,x,y)\in [0,T]\times [0,2\pi ]^2 \end{aligned}$$
(5.5)
$$\begin{aligned}&u(0,x,y)=\sin (2x)\sin (y), \quad (x,y)\in [0,2\pi ]^2. \end{aligned}$$
(5.6)

The Fourier collocation method is used for the spatial discretization with \({\mathcal {N}}_x=32,64\) collocation points in each dimension. The equidistant collocation points are \(x_j=y_j=\frac{2\pi (j-1)}{{\mathcal {N}}_x}\). The propagator system is then evolved to \(T=0.2\). We test Algorithm 3 for both the forward Euler and Crank–Nicolson methods. For forward Euler, we set the number of time steps \(n=5000\), with \((M,K)=(9,9),(10,10)\) for \({\mathcal {N}}_x=32\) and \((M,K)=(8,8),(9,9)\) for \({\mathcal {N}}_x=64\). For Crank–Nicolson, n is set to 400, and (M, K) to (6, 6) and (7, 7), for both \({\mathcal {N}}_x=32\) and \({\mathcal {N}}_x=64\). We also compute the mean square errors \(e_2^{RBM}={\mathbb {E}}[\Vert u_{M,K}^{N}(0.2,\cdot ,\cdot )-u_{M,K}^{{\mathcal {N}}}(0.2,\cdot ,\cdot )\Vert _{l^2}^2]\) and \(e_2^{ORI}={\mathbb {E}}[\Vert u_{M,K}^{{\mathcal {N}}}(0.2,\cdot ,\cdot )-u(0.2,\cdot ,\cdot )\Vert _{l^2}^2]\), where the discrete \(L^2\) norm is \(\Vert v\Vert _{l^2}^2:=\frac{4\pi ^2}{{\mathcal {N}}_x^2}\sum _{i=1}^{{\mathcal {N}}_x}\sum _{j=1}^{{\mathcal {N}}_x}(v(x_i,y_j))^2\). We plot the histories of convergence, the computation time, and the number of chosen multi-indices as functions of \(\varepsilon _\mathrm{tol}\) in Figs. 4 (forward Euler) and 5 (Crank–Nicolson). We also plot the variance of the exact solution and of the RBM solutions in Figs. 6 and 7. It is clear that the COFRB method captures the variances well.

Table 5 2D SPDE problem: error and computing costs of Algorithm COFRB_PDE

Table 5, where we list the computational time, shows that the COFRB_PDE algorithm achieves savings of two orders of magnitude when the RBM solution reaches the same accuracy level as the truth approximation. We note that Algorithm 2 becomes infeasible for the two-dimensional problem due to the memory constraint. For the Crank–Nicolson method with \({\mathcal {N}}_c=64\times 64\), the full problem is out of reach. Therefore, we compute the error between the RBM solutions and the exact solutions, \(e_2^{\text {exact}}\), and show it in Table 5.

$$\begin{aligned} e_2^{\text {exact}}={\mathbb {E}}\left[ \Vert u_{M,K}^N(0.2,\cdot ,\cdot ) -u(0.2,\cdot ,\cdot )\Vert _{l^2}^2\right] . \end{aligned}$$
(5.7)

The fact that the reduced solver COFRB_PDE reaches five digits of accuracy when the full solver or the COFRB_ODE is out of reach underscores the power of the COFRB_PDE method.

6 Concluding Remarks

In this paper, we develop new reduced basis methods for SODEs and SPDEs, called COFRB_ODE and COFRB_PDE respectively. The main features include a new space-time-like treatment of time in the numerical schemes for ODEs and PDEs, an accurate yet efficient compression technique for the spatial component of the space-time RBM bases, a non-conventional “parameterization” of a non-parametric problem, and finally an RBM that is free of any dedicated offline procedure yet still efficient. The numerical experiments corroborate the effectiveness and robustness of our algorithms. Future work includes the extension of the current approaches to nonlinear and purely transport problems, and a convergence analysis of the error with respect to the number of chosen multi-indices.