Abstract
We analyze the recent Multi-index Stochastic Collocation (MISC) method for computing statistics of the solution of a partial differential equation (PDE) with random data, where the random coefficient is parametrized by means of a countable sequence of terms in a suitable expansion. MISC is a combination technique based on mixed differences of spatial approximations and quadratures over the space of random data, and naturally, the error analysis uses the joint regularity of the solution with respect to both the variables in the physical domain and parametric variables. In MISC, the number of problem solutions performed at each discretization level is not determined by balancing the spatial and stochastic components of the error, but rather by suitably extending the knapsack-problem approach employed in the construction of the quasi-optimal sparse-grids and Multi-index Monte Carlo methods, i.e., we use a greedy optimization procedure to select the most effective mixed differences to include in the MISC estimator. We apply our theoretical estimates to a linear elliptic PDE in which the log-diffusion coefficient is modeled as a random field, with a covariance similar to a Matérn model, whose realizations have spatial regularity determined by a scalar parameter. We conduct a complexity analysis based on a summability argument showing algebraic rates of convergence with respect to the overall computational work. The rate of convergence depends on the smoothness parameter, the physical dimensionality and the efficiency of the linear solver. Numerical experiments show the effectiveness of MISC in this infinite dimensional setting compared with the Multi-index Monte Carlo method and compare the convergence rate against the rates predicted in our theoretical analysis.
1 Introduction
In this work, we analyze and apply the recent MISC method [22] to the approximation of quantities of interest (outputs) from the solutions of linear elliptic partial differential equations (PDEs) with random coefficients. Such equations arise in many applications in which the coefficients of the PDE are described in terms of random variables/fields due either to a lack of knowledge of the system or to its inherent non-predictability. We focus on the weak approximation of the solution of the following linear elliptic \({{\varvec{y}}}\)-parametric problem:
Here, \({\mathscr {B}} \subset {\mathbb {R}}^d\) with \(d \in {\mathbb {N}}\) denotes the “physical domain,” and the operators \({\text {div}}\) and \(\nabla \) act with respect to the physical variable, \({{\varvec{x}}}\in {\mathscr {B}}\), only. We assume that \({\mathscr {B}}\) has a tensor structure, i.e., \({\mathscr {B}} = {\mathscr {B}}_1 \times {\mathscr {B}}_2 \times \cdots \times {\mathscr {B}}_D\), with \(D \in {\mathbb {N}}\), \({\mathscr {B}}_i \subset {\mathbb {R}}^{d_i}\) and \(\sum _{i=1}^D d_i = d\), see, e.g., Fig. 1a, b. This assumption simplifies the analysis detailed in the following, although MISC can be applied to more general domains, such as
-
domains obtained by mapping from a reference tensor domain as in Fig. 1c, by suitably extending the approaches in [17, 28];
-
non-tensor domains that can be immersed in a tensor bounding box, \({\mathscr {B}} \subset \hat{{\mathscr {B}}} = \hat{{\mathscr {B}}}_1 \times \hat{{\mathscr {B}}}_2 \times \cdots \times \hat{{\mathscr {B}}}_D\), as in Fig. 1d, whose mesh is obtained as a tensor product of meshes on each component, \(\hat{{\mathscr {B}}}_i\), of the bounding box;
-
domains that admit a structured mesh, i.e., with a regular connectivity, whose level of refinement in each “direction” can be set independently, as in Fig. 1e;
-
domains that can be decomposed into patches satisfying any of the conditions above (observe that the meshes on the different patches need not be conforming).
The parameter \({{\varvec{y}}}=\{y_j\}_{j\ge 1}\) in (1) is a random sequence whose components are independent and uniformly distributed random variables. More precisely, each \(y_j\) has support in \([-1,1]\) with measure \(\frac{{\text {d}}\lambda }{2}\), where \({\text {d}}\lambda \) is the standard Lebesgue measure. We further define \(\Gamma = \times _{ j\ge 1} [-1,1]\) (hereafter referred to as the “stochastic domain” or the “parameter space”), equipped with the cylindrical probability measure \({\text {d}}\mu = \mathop {\times }_{j \ge 1} \frac{{\text {d}}\lambda }{2}\) (see, e.g., [5, Chapter 3, Section 5]).
The right-hand side of (1), namely the deterministic function \(\varsigma \), does not play a central role in this work, and it is assumed to be a smooth function of class \(C_0^{\infty }(\overline{{\mathscr {B}}})\), where \(\overline{{\mathscr {B}}}\) denotes the closure of \({\mathscr {B}}\). This regularity requirement can be relaxed, but we keep it to ease the presentation, since our main goal in this work is to track the effect of the regularity of the coefficient a in (1) on the MISC convergence rate. Here, we focus on the following family of diffusion coefficients:
where \(\{\psi _j\}_{j \ge 1}\) is a sequence of functions \(\psi _j \in C^t(\overline{{\mathscr {B}}})\) for \(t\ge 0\) such that \(\left\| \psi _j \right\| _{L^\infty ({\mathscr {B}})} \rightarrow 0\) as \(j \rightarrow \infty \). Hereafter, without loss of generality, we assume that the sequence \( \{ \left\| \psi _j \right\| _{L^\infty ({\mathscr {B}})} \}_{j \ge 1}\) is nonincreasing. Thanks to a straightforward application of the Lax–Milgram lemma, the well-posedness of (1) in the classical Sobolev space, \(V=H^1_0({\mathscr {B}})\), is guaranteed almost surely (a.s.) in \(\Gamma \) if two functions, \(a_{\min }, a_{\max }:\Gamma \rightarrow {\mathbb {R}}\), exist such that
Moreover, the equation is well posed in the Bochner space, \(L^q(\Gamma ;V)\), for some \(q \ge 1\) (see [1, 8] and the following discussion), provided that sufficiently high moments of the functions \(1/a_{\min }\) and \(a_{\max }\) are bounded. The goal of our computation is the approximation of an expected value,
where \(\Theta \) is a deterministic bounded and linear functional, and \(F({{\varvec{y}}}) = \Theta (u(\cdot ,{{\varvec{y}}}))\) is a real-valued random variable, \(F:\Gamma \rightarrow {\mathbb {R}}\). To this end, we utilize the Multi-index Stochastic Collocation (MISC) method, which we have introduced in a general setting in a previous work [22].
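To fix ideas, the setting above can be sketched numerically. The snippet below is a minimal illustration of ours, not taken from the paper: it assumes the log-uniform form \(a = \exp (\sum _j y_j \psi _j)\) suggested by the abstract, with a hypothetical basis \(\psi _j(x) = j^{-2}\sin (j\pi x)\) on \({\mathscr {B}}=[0,1]\), and checks that a truncated realization of the coefficient stays strictly positive, consistent with condition (3).

```python
import math
import random

def log_diffusion(x, y, psi):
    """Evaluate a(x, y) = exp(sum_j y_j * psi_j(x)) for a truncated
    parameter vector y and a list of basis functions psi_j."""
    return math.exp(sum(yj * p(x) for yj, p in zip(y, psi)))

# Hypothetical basis: psi_j(x) = j^(-2) sin(j*pi*x); any sequence with
# ||psi_j||_inf -> 0 fits the setting of eq. (2).
N = 8
psi = [lambda x, j=j: j ** -2 * math.sin(j * math.pi * x) for j in range(1, N + 1)]

random.seed(0)
y = [random.uniform(-1.0, 1.0) for _ in range(N)]  # y_j ~ U[-1,1], independent

# A realization of the coefficient on a grid; it is strictly positive,
# so the Lax-Milgram condition (3) can hold for this truncation.
a_vals = [log_diffusion(i / 50, y, psi) for i in range(51)]
assert min(a_vals) > 0.0
```

At \({{\varvec{y}}}=\mathbf{0}\) the coefficient reduces to \(a \equiv 1\), the deterministic Laplacian case.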
In MISC, we consider a decomposition in terms of tensorized univariate details (i.e., a tensorized hierarchical decomposition), both for the discrete space in which (1) is solved for a fixed value of \({{\varvec{y}}}\in \Gamma \) and for the quadrature operator used to approximate the expected value of \(F\), relying on the well-established theory of sparse-grid approximation of PDEs on the one hand [6, 7, 21, 26, 41] and of sparse-grid quadrature on the other hand [1, 6, 15, 34, 35, 40]. We use tensor products of such univariate details, obtaining combined deterministic-stochastic, first-order mixed differences to build the MISC estimator of \({\mathbb {E}}\left[ F\right] \) by selecting the most effective mixed differences with an optimization approach inspired by the literature on the knapsack problem (see, e.g., [30]). The knapsack approach was also used in [33] to obtain the so-called quasi-optimal sparse grids for PDEs with stochastic coefficients and in [6, 20] in the context of sparse-grid resolution of high-dimensional PDEs.
The resulting method can be seen as an extension of the sparse-grid combination technique for PDEs with stochastic coefficients, as well as a fully sparse, non-randomized version of the Multi-level Monte Carlo method [2, 9, 16, 27]. In particular, MISC differs from other works in the literature that attempt to optimally combine spatial and stochastic resolution levels [4, 25, 29, 37, 38] in two aspects. First, MISC uses combined deterministic-stochastic, first-order differences, which allows us to exploit not only the regularity of the solution with respect to the spatial variables and the stochastic parameters, but also the mixed deterministic-stochastic regularity whenever available. Second, the MISC estimator is built upon an optimization procedure, whereas the above-mentioned works try to balance the error contributions arising from the deterministic and stochastic components of the method without taking into account the corresponding costs. Finally, MISC can also be seen as a sparse-grid quadrature version of the Multi-index Monte Carlo method that was proposed and analyzed in [23].
In [22], MISC was introduced in a general setting and we restricted the analysis to the case of problems of type (1) depending on a finite number of random variables, \({{\varvec{y}}}\in \Gamma \subset {\mathbb {R}}^N\), with \(N < \infty \). Here, we provide a complexity analysis of MISC in the more challenging case in which the diffusion coefficient a depends on a countable sequence of random variables, \(\{y_j\}_{j\ge 1}\). Furthermore, we aim at tracking the dependence of the MISC convergence rate on the smoothness of the realizations of a. This new framework requires that the tools used to prove the complexity of the method be changed: while in [22] we used a “direct counting” argument, i.e., we derived a complexity estimate by explicitly summing the work and the error contributions associated with each mixed difference included in the MISC estimator, here instead we base our proof on a summability argument and on suitable interpolation estimates in mixed regularity spaces. We mention that in [14] an infinite-dimensional analysis based on a direct counting argument was recently carried out in the case of hyperbolic cross-type index sets that might arise when quasi-optimizing the work contribution of sparse grid stochastic collocation without spatial discretization.
The rest of this work is organized as follows. Section 2 introduces suitable assumptions and a class of random diffusion coefficients that we consider throughout the work; functional analysis results that are needed for the subsequent analysis of the MISC method are also provided. The MISC method is reviewed in Sect. 3. A complexity analysis of MISC with an infinite number of random variables is carried out in Sect. 4, where we provide a general convergence theorem. In Sect. 5, we discuss the application of MISC to the specific class of diffusion coefficients that we consider here and track the dependence of the convergence rate on the regularity of the diffusion coefficient. Section 6 presents some numerical tests to verify the convergence analysis conducted in the previous section. Finally, Sect. 7 provides some conclusions and final remarks. In the Appendix, we include some technical results on the summability and regularity properties of certain random fields written in terms of their series expansion.
In the following, \({\mathbb {N}}\) denotes the set of nonnegative integers (i.e., including zero), while \({\mathbb {N}}_+\) denotes the set of positive integers. We refer to sequences in \({\mathbb {N}}^{{\mathbb {N}}_+}\) and \({\mathbb {N}}_+^{{\mathbb {N}}_+}\) as “multi-indices.” Moreover, we often use a vector notation for sequences, i.e., we formally treat sequences as vectors in \({\mathbb {N}}^{{\mathbb {N}}_+}\) (or \({\mathbb {R}}^{{\mathbb {N}}_+}\)) and mark them with bold type. We employ the following notation, with the understanding that \(N < \infty \) for actual vectors and \(N=\infty \) for sequences:
-
\(\mathbf{1}\) denotes a vector in \({\mathbb {N}}^N\) whose components are all equal to one;
-
\({{\varvec{e}}}^N_\ell \) denotes the \(\ell \)-th canonical vector in \({\mathbb {R}}^N\), i.e., \(\left( {{\varvec{e}}}^N_\ell \right) _i = 1\) if \(\ell =i\) and zero otherwise; however, for the sake of readability, we often omit the superscript N whenever it is obvious from context. For instance, if \({{\varvec{v}}}\in {\mathbb {R}}^N\), we write \({{\varvec{v}}}- {{\varvec{e}}}_1\) instead of \({{\varvec{v}}}- {{\varvec{e}}}_1^N\);
-
given \({{\varvec{v}}}\in {\mathbb {R}}^N\), \(|{{\varvec{v}}}| = \sum _{i=1}^N v_i\), \(|{{\varvec{v}}}|_0\) denotes the number of nonzero components of \({{\varvec{v}}}\), \(\max ({{\varvec{v}}}) = \max _{i=1,\ldots N}v_i\) and \(\min ({{\varvec{v}}}) = \min _{i=1,\ldots N}v_i\);
-
\({\mathfrak {L}}_+\) denotes the set of sequences with positive components with only finitely many elements larger than 1, i.e., \({\mathfrak {L}}_+ = \{ {{\varvec{p}}}\in {\mathbb {N}}_+^{{\mathbb {N}}_+} : |{{\varvec{p}}}-\mathbf{1}|_0 <\infty \}\);
-
given \({{\varvec{v}}}\in {\mathbb {R}}^N\) and \(f:{\mathbb {R}}\rightarrow {\mathbb {R}}\), \(f({{\varvec{v}}})\) denotes the vector obtained by applying f to each component of \({{\varvec{v}}}\), \(f({{\varvec{v}}}) = [f(v_1),f(v_2),\ldots ,f(v_N)] \in {\mathbb {R}}^N\);
-
given \({{\varvec{v}}}, {{\varvec{w}}}\in {\mathbb {R}}^N\), the inequality \({{\varvec{v}}}> {{\varvec{w}}}\) holds true if and only if \(v_i > w_i\) \(\forall i=1,\ldots ,N\);
-
given \({{\varvec{v}}}\in {\mathbb {R}}^D\) and \({{\varvec{w}}}\in {\mathbb {R}}^N\), we denote their concatenation by \([{{\varvec{v}}}, {{\varvec{w}}}] = (v_1,\ldots ,v_D,w_1,\ldots , w_N) \in {\mathbb {R}}^{D+N}\);
-
given a set with finite cardinality, \({\mathcal {G}}\subset {\mathbb {N}}_+\), we define the set \({\mathbb {N}}^{\mathcal {G}}= \{{{\varvec{z}}}\in \mathbb {{\mathbb {N}}}^{{\mathbb {N}}_+}: z_j=0,\;\forall j\notin {\mathcal {G}}\}\). We similarly define \({\mathbb {R}}^{\mathcal {G}}\) and \({\mathbb {C}}^{\mathcal {G}}\).
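The vector/sequence notation above can be condensed into a small sketch (finite \(N\), plain tuples; the helper names are ours, introduced only for illustration):

```python
def abs_sum(v):        # |v| = sum of the components
    return sum(v)

def abs_zero(v):       # |v|_0 = number of nonzero components
    return sum(1 for vi in v if vi != 0)

def apply_comp(f, v):  # f(v), f applied componentwise
    return tuple(f(vi) for vi in v)

def vec_gt(v, w):      # v > w iff v_i > w_i for every i
    return all(vi > wi for vi, wi in zip(v, w))

def concat(v, w):      # concatenation [v, w]
    return tuple(v) + tuple(w)

def e(ell, N):         # canonical vector e_ell^N
    return tuple(1 if i == ell else 0 for i in range(1, N + 1))

v = (2, 0, 1)
assert abs_sum(v) == 3 and abs_zero(v) == 2
assert apply_comp(lambda t: t * t, v) == (4, 0, 1)
assert e(2, 3) == (0, 1, 0)
assert concat((1, 2), (3,)) == (1, 2, 3)
```

Note that \({{\varvec{v}}}> {{\varvec{w}}}\) is a strict componentwise inequality: `vec_gt((2, 2), (2, 1))` is false because the first components are equal.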
2 Functional Setting
Even though condition (3) ensures the a.s. well-posedness of (1) in V, we need to make sure that realizations of u a.s. belong to more regular spaces in order to prove a convergence rate result for MISC. More specifically, in view of classical spatial sparse-grid approximation theory, we need certain conditions on the mixed derivatives of u with respect to the physical coordinates. To this end, we introduce suitable function spaces (tensor products of fractional Sobolev spaces, see also [19]) and then a “shift regularity” assumption (see Assumption A1), i.e., we assume that the regularity of the realizations of u is induced “in a natural way” by the regularity of a and of the forcing, \(\varsigma \), and by the smoothness of the physical domain, \({\mathscr {B}}\). In other words, we rule out “pathological”/“ad hoc” examples in which u is very regular even though the data are not, e.g., when the forcing is chosen such that \(u \in C^q\) for \(q>2\) even in the presence of a domain with corners. First, recall the definition of a fractional Sobolev space for \(l_i \in {\mathbb {R}}_+ \setminus {\mathbb {N}}_+\) and \({\mathscr {B}}_i \subset {\mathbb {R}}^{d_i}\):
extending the definition of a standard Sobolev space \(H^{l_i}({\mathscr {B}}_i)\) for \(l_i\) integer. The tensorized fractional Sobolev space can then be defined as
for \({{\varvec{l}}}= \left( l_i \right) _{i=1}^D \in {\mathbb {R}}^{D}_+\). Finally, the mixed fractional Sobolev spaces that we will need for our analysis can be defined for each \({{\varvec{q}}}\in {\mathbb {R}}_+^D\) as
Observe that, while mixed fractional spaces, \({\mathcal {H}}^{\mathbf{1}+{{\varvec{q}}}}({\mathscr {B}})\), are the proper setting for the forthcoming analysis, we will, for ease of presentation, not look for the most general mixed space in which the solution lives. Instead, we will be content with deducing mixed regularity from inclusions in standard Sobolev spaces, to the point that Assumption A1 will be written in terms of standard Sobolev spaces. For this, we observe that \({\mathcal {H}}^{\mathbf{1}}({\mathscr {B}}) = H^1({\mathscr {B}})\) holds and in general we have the following inclusion result between standard and mixed fractional Sobolev spaces:
Before stating precisely the shift-regularity assumption on u, we need some more notation and setup. First, observe that this assumption needs to be stated in the complex domain, for reasons that will be made clear later. We therefore extend the diffusion coefficient from \(a(\cdot ,{{\varvec{y}}})\) with \({{\varvec{y}}}\in \Gamma \) to \(a(\cdot ,{{\varvec{z}}})\) with \({{\varvec{z}}}\in {\mathbb {C}}^{{\mathbb {N}}_+}\), so that the corresponding solution of (1), \(u(\cdot ,{{\varvec{z}}})\), becomes an \(H^1_0({\mathscr {B}})\) function taking values in \({\mathbb {C}}\), i.e., \(u(\cdot ,{{\varvec{z}}})\in H^1_0({\mathscr {B}},{\mathbb {C}})\). Since the complex-valued version of problem (1) is well posed as long as there exists \(\delta \) such that \(\mathfrak {Re}\left[ a({{\varvec{x}}},{{\varvec{z}}})\right]> \delta > 0\) for almost every (a.e.) \({{\varvec{x}}}\in {\mathscr {B}}\), and since our approximation method will cover the countable set of parameters \({{\varvec{z}}}\in {\mathbb {C}}^{{\mathbb {N}}_+}\) by multiple subsets of finite cardinality, we define the following region in \({\mathbb {C}}^{\mathcal {G}}\) for a set of finite cardinality, \({\mathcal {G}}\subset {\mathbb {N}}_+\):
We are now ready to state the assumption on the link between the regularity of the coefficient, a, and the regularity of solution, u.
Assumption A1
(Shift assumption) For a given \({\mathscr {B}}\), let \(\psi _j \in C^t(\overline{{\mathscr {B}}})\) (cf. eq. (2)) and \(\varsigma \in C^\infty _0(\overline{{\mathscr {B}}})\). We assume that there exists r such that \(1< r < t\) and that, for any finite set \({\mathcal {G}}\subset {\mathbb {N}}_+\) and any \({{\varvec{z}}}\in \Sigma _{{\mathcal {G}},\delta }\), the three following conditions hold:
-
1.
\(u(\cdot ,{{\varvec{z}}}) \in H^{1+r}({\mathscr {B}},{\mathbb {C}}) \cap H^1_0({\mathscr {B}},{\mathbb {C}})\);
-
2.
\(\frac{D u(\cdot ,z_j)}{ D z_j} \in H^{1 + r}({\mathscr {B}},{\mathbb {C}}),\quad \forall j\in {\mathcal {G}}\), where \(\frac{D u(\cdot ,z_j)}{ D z_j}\) denotes the partial complex derivative of u;
-
3.
\(\left\| u(\cdot ,{{\varvec{z}}}) \right\| _{H^{1+s}({\mathscr {B}},{\mathbb {C}})} \le C(\delta ,s,\varsigma ,{\mathscr {B}})\left\| a(\cdot ,{{\varvec{z}}}) \right\| _{C^s(\overline{{\mathscr {B}}},{\mathbb {C}})} \), with \(C(\delta ,s,\varsigma ,{\mathscr {B}})\rightarrow \infty \) for \(\delta \rightarrow 0\), for every \(s=1,\ldots ,\lfloor r \rfloor \).
In the following, we will need to ensure that \(\left\| u(\cdot , {{\varvec{z}}}) \right\| _{{H^{1+s}(\mathscr {B,{\mathbb {C}}})}} \), for \(s > 0\), is uniformly bounded for all \({{\varvec{z}}}\) in certain subregions of the complex plane. Note that this is a stronger condition than what is stated in the previous assumption, where we only assumed pointwise control on the norms of u (i.e., we gave a bound that depends on \({{\varvec{z}}}\)). In particular, we show at the end of this section that the possibility of having such a uniform bound depends on certain summability properties of the diffusion coefficient. Toward this end, we state the following assumption, which also guarantees the well-posedness of Problem (1).
Assumption A2
(Summability of the diffusion coefficient) For every \(s = 0,1,\ldots ,s_{\max } \le r\), define the sequences \({{\varvec{b}}}_{s}=\{b_{s,j}\}_{j \ge 1}\) where
We assume that an increasing sequence \(\{p_s\}_{s=0}^{s_{\max }}\) exists such that \(0< p_0 \le \cdots \le p_{s_{\max }}<\frac{1}{2}\) and \({{\varvec{b}}}_{s} \in \ell ^{p_s}\), i.e.,
We observe that with the above assumption, \(b_{s,j} \rightarrow 0^+\) as \(j \rightarrow \infty \) and \(0\le b_{0,j}\le b_{s,j}\) for every \(s=0,1,\ldots ,s_{\max }\). Moreover, given Assumption A2, we have that \({{\varvec{b}}}_0 \in \ell ^1\), which, together with the fact that \(y_j \in [-1,1]\) for \(j \ge 1\), guarantees that condition (3) holds and therefore that (1) is well posed in V a.s. in \(\Gamma \). Incidentally, we observe that the conditions in Assumption A2 are sufficient but not necessary for condition (3) to hold: Indeed, one would only need \({{\varvec{b}}}_{s} \in \ell ^2\) for some integer \(s \ge 1\), see Lemma 15 (and Corollary 16 for a specific example) in the Appendix.
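To make the last observation concrete, under the log-uniform form of the coefficient (an assumption here, suggested by the abstract), \({{\varvec{b}}}_0 \in \ell ^1\) together with \(|y_j|\le 1\) gives \(|\kappa ({{\varvec{x}}},{{\varvec{y}}})| \le \left\| {{\varvec{b}}}_0 \right\| _{\ell ^1}\), so one may take \(a_{\min } = e^{-\left\| {{\varvec{b}}}_0 \right\| _{\ell ^1}}\) and \(a_{\max } = e^{\left\| {{\varvec{b}}}_0 \right\| _{\ell ^1}}\). A minimal sketch with a hypothetical summable sequence:

```python
import math
import random

# Hypothetical b_{0,j} = j^(-3): an l^1-summable sequence, truncated
b0 = [j ** -3.0 for j in range(1, 41)]
B = sum(b0)                               # ||b_0||_{l^1} (truncated)
a_min, a_max = math.exp(-B), math.exp(B)  # uniform coercivity/continuity bounds

random.seed(1)
for _ in range(1000):
    # worst case |kappa| <= sum_j |y_j| b_{0,j} <= B for y in Gamma
    kappa = sum(random.uniform(-1.0, 1.0) * bj for bj in b0)
    assert a_min <= math.exp(kappa) <= a_max
```

The bounds degrade exponentially in \(\left\| {{\varvec{b}}}_0 \right\| _{\ell ^1}\), which is why the summability of \({{\varvec{b}}}_0\) (and not merely \(b_{0,j}\rightarrow 0\)) matters for well-posedness in the Bochner sense.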
As suggested above, the fact that, for a fixed s, the sequence \({{\varvec{b}}}_s\) is \(p_s\)-summable plays a central role in this work. Indeed, if \({{\varvec{b}}}_s\) is \(p_s\)-summable, we show that \(\left\| u(\cdot ,{{\varvec{z}}}) \right\| _{H^{1+s}({\mathscr {B}},{\mathbb {C}})} \) is uniformly bounded with respect to \({{\varvec{z}}}\) in a region of the complex plane whose size is proportional to \(\left\| {{\varvec{b}}}_s \right\| _{\ell ^{p_s}} ^{p_s}\). We use this fact to show convergence of the MISC method, with the convergence rate dictated by both \(p_0\) and \(p_s\). In Theorem 10, we detail how to optimally choose the value of s in the range \(0,1,\ldots ,s_{\max }\), which is the main result of this work. Restricting the range of values of s by \(p_s < \frac{1}{2}\) is not crucial; we could relax this to \(p_{s} < 1\). However, we follow this more stringent assumption because it considerably simplifies some technical steps in the following discussion without affecting the main part of the proof, as we make clear below (see Remark 4). What is important is that \(s_{\max }\) might be strictly smaller than \(\lfloor r \rfloor \) (i.e., it could happen that \({{\varvec{b}}}_r\) is \(p_r\)-summable but with \(p_r > \frac{1}{2}\), or not summable at all); in this case, the line of proof we propose does not fully exploit the regularity of the solution, u.
Example 1
In the numerical section of this work, we consider either \({\mathscr {B}} = [0,1]\), i.e., \(d=D=d_1=1\), or \({\mathscr {B}} = [0,1]^3\), i.e., \(d=D=3\), \(d_i=1\) and \({\mathscr {B}}_i=[0,1]\) for \(i=1,2,3\). In both cases, we consider the following form for \(\kappa ({{\varvec{x}}},{{\varvec{y}}})\):
Observe that it is possible to write \(\kappa \) in the form of (2) using a bijective mapping from \(\{y_{{{\varvec{k}}},{{\varvec{\ell }}}}\}_{{{\varvec{k}}} \in {\mathbb {N}}^d, {{\varvec{\ell }}} \in \{0,1\}^d}\) to \(\{y_j\}_{j \ge 1}\). We also choose the following values for the \(A_{{{\varvec{k}}}}\) coefficients:
for some \(\nu >0\). We observe that \(\nu \) is a parameter dictating the \({{\varvec{x}}}\)-regularity of the realizations of \(\kappa \), hence of a. Moreover, the parameters \(\nu \) and d govern the \(p_s\)-summability of the sequence \({{\varvec{b}}}_s\) for any s and, as a consequence, the overall convergence of the MISC method, as discussed earlier. Section 5 analyzes the summability properties of the series (8).
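The \(\ell ^{p_s}\)-summability required by Assumption A2 is easy to probe numerically. As a toy illustration of ours (assuming, for the sake of example only, an algebraic decay \(b_{s,j} = j^{-\theta }\); the actual decay induced by (8) is derived in Sect. 5), \({{\varvec{b}}}_s \in \ell ^p\) reduces to \(\theta p > 1\):

```python
def lp_partial(theta, p, n):
    """Partial sum of ||b||_{l^p}^p for the model sequence b_j = j^(-theta),
    i.e., sum_{j<=n} j^(-theta*p)."""
    return sum(j ** (-theta * p) for j in range(1, n + 1))

# theta = 3, p = 1/2: theta*p = 1.5 > 1, so the series converges and the
# tail of the partial sums becomes negligible.
tail = lp_partial(3.0, 0.5, 200000) - lp_partial(3.0, 0.5, 100000)
assert tail < 1e-2

# theta = 2, p = 1/2: theta*p = 1 gives the harmonic series, which
# diverges, so this b fails the l^{1/2} requirement of Assumption A2.
growth = lp_partial(2.0, 0.5, 100000) - lp_partial(2.0, 0.5, 10000)
assert growth > 1.0
```

In the divergent case the increment between partial sums is \(\approx \log 10 \approx 2.3\) per decade of terms, making the failure of summability visible at a glance.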
We conclude this preliminary section by making the shape of the above-mentioned regions in the complex plane more precise and showing how their sizes depend on the summability properties of a. In particular, we will exploit the fact that for any finite set \({\mathcal {G}}\subset {\mathbb {N}}_+\), for every \(s=0,1,\ldots ,s_{\max }\) and for any \({{\varvec{z}}}\in {\mathbb {C}}^{\mathcal {G}}\) we have \(\kappa (\cdot ,{{\varvec{z}}})\in C^{s}(\overline{{\mathscr {B}}},{\mathbb {C}})\), \(\left\| \kappa (\cdot ,{{\varvec{z}}}) \right\| _{C^{s}(\overline{{\mathscr {B}}},{\mathbb {C}})} \le \sum _{j\in {\mathcal {G}}}|z_j| b_{s,j}\) and infer, from the multivariate Faà di Bruno formula (see “Appendix 1” and [13]), that \(a(\cdot ,{{\varvec{z}}})\in C^{s}(\overline{{\mathscr {B}}},{\mathbb {C}})\) as well, with the estimate
Next, for a given \(\zeta >1\), let \({\mathcal {E}}_{\zeta }\) denote the polyellipse in the complex plane
For any sequence \({{\varvec{\zeta }}} = \{\zeta _{j}\}_{j \ge 1}\) with \(\zeta _{j}>1\) for every \(j \ge 1\) and for any finite set, \({\mathcal {G}}\subset {\mathbb {N}}_+\), we introduce the Bernstein polyellipse:
Lemma 1
(Holomorphic complex continuation of u in \(H^1_0({\mathscr {B}};{\mathbb {C}})\) in a Bernstein polyellipse) Consider the sequence \({{\varvec{b}}}_0\) defined in (6). For any \(\delta >0\), let \(E_\delta >2\) be such that
and consider the sequence \({{\varvec{\zeta }}}_0 = \{\zeta _{0,j}\}_{j \ge 1}\), with
with \(p_0\) as in (7). Then, for any finite set \({\mathcal {G}}\subset {\mathbb {N}}_+\), the solution, u, admits a holomorphic complex continuation, \(u:{\mathbb {C}}^{\mathcal {G}}\rightarrow H^{1}_0({\mathscr {B}},{\mathbb {C}})\), in the Bernstein polyellipse, \({\mathcal {E}}_{{{\varvec{\zeta }}}_0}^{\mathcal {G}}\subset \Sigma _{{\mathcal {G}},\delta }\), with
with \(C_{0,u}\) independent of \({\mathcal {G}}\).
Proof
It is well known in the literature that \(u:{\mathbb {C}}^{\mathcal {G}}\rightarrow H^{1}_0({\mathscr {B}},{\mathbb {C}})\) is holomorphic in the region \(\Sigma _{{\mathcal {G}},\delta }\) defined in (5) (see, e.g., [1]). To compute the parameters \(\{ \zeta _j\}_{j \in {\mathcal {G}}}\) of a Bernstein polyellipse contained in \(\Sigma _{{\mathcal {G}},\delta }\), we rewrite \(a({{\varvec{x}}},{{\varvec{z}}})\) as
so that \(\Sigma _{{\mathcal {G}},\delta }\) can be rewritten as
Now, for some \(E>2\) that we choose in the following, the two following conditions on \({{\varvec{z}}}\) imply that \({{\varvec{z}}}\in \Sigma _{{\mathcal {G}},\delta }\):
equivalently, we write
For a fixed value of E, the equations above define a second region, \(\Omega _{{\mathcal {G}},\delta }\), included in \(\Sigma _{{\mathcal {G}},\delta }\). In turn, the previous conditions are verified if the following conditions, which define a hyper-rectangular region, \({\mathcal {R}}_\delta \subset \Omega _{{\mathcal {G}},\delta }\), are verified:
provided that \(\delta \) and E are such that the quantity \(-\left\| {{\varvec{b}}}_0 \right\| _{\ell ^1} -\log \delta + \log \cos \left( \frac{\pi }{E} \right) \) remains positive. Observe that for sufficiently small \(\delta >0\) such E exists, since \(f(E) = \log \cos \left( \frac{\pi }{E} \right) \) is a monotonically increasing function, with \(f(E) \rightarrow -\infty \) for \(E \rightarrow 2\) and \(f(E) \rightarrow 0\) for \(E \rightarrow \infty \), and \(-\log \delta \) is positive for sufficiently small \(\delta \). In particular, for any \(\delta >0\), we choose \(E=E_\delta \) such that \(w_{0,j} = \tau _{0,j}\), which leads to
We observe that with this choice, \(\tau _{0,j}\) (and hence \(w_{0,j}\)) actually does not depend on \({\mathcal {G}}\), therefore we can define the sequence \({{\varvec{\tau }}}_0 = \{\tau _{0,j}\}_{j \ge 1}\).
We are now in the position to compute the Bernstein polyellipses that touch the boundary of \({\mathcal {R}}_\delta \) on the real and imaginary axes. For the real axis, we have to enforce
while for the imaginary axis we have to enforce
The proof is concluded by observing that \(\zeta _{j,{\text {imag}}} \le \zeta _{j,{\text {real}}}\), so that the only polyellipse entirely contained in \({\mathcal {R}}_\delta \), and hence in \(\Sigma _{{\mathcal {G}},\delta }\), is the one touching \({\mathcal {R}}_\delta \) on the imaginary axis; this also implies that the bound \(\sup _{{{\varvec{z}}}\in {\mathcal {E}}_{{{\varvec{\zeta }}}}^{\mathcal {G}}} \left\| u(\cdot ,{{\varvec{z}}}) \right\| _{H^{1}({\mathscr {B}})} \le C_{0,u} = \frac{\left\| \varsigma \right\| _{H^{-1}({\mathscr {B}})} }{\delta } <\infty \) holds independently of \({\mathcal {G}}\). \(\square \)
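The last step of the proof can be mimicked numerically. Recall that a Bernstein ellipse with parameter \(\zeta \) has foci \(\pm 1\) and semi-axes \((\zeta \pm \zeta ^{-1})/2\); inverting these relations gives the parameters of the ellipses touching a complex rectangle on its two sides. The snippet below is our own sketch, with hypothetical half-width \(\beta \) and half-height \(\tau \):

```python
import math

def zeta_real(beta):
    """Parameter of the Bernstein ellipse touching the real axis at +-beta
    (beta > 1): solve (zeta + 1/zeta)/2 = beta for zeta > 1."""
    return beta + math.sqrt(beta * beta - 1.0)

def zeta_imag(tau):
    """Parameter of the Bernstein ellipse touching the imaginary axis at
    +-i*tau (tau > 0): solve (zeta - 1/zeta)/2 = tau for zeta > 1."""
    return tau + math.sqrt(tau * tau + 1.0)

# Hypothetical rectangle R = [-beta, beta] x i[-tau, tau]:
beta, tau = 1.4, 0.3
zr, zi = zeta_real(beta), zeta_imag(tau)

# Inverting the defining semi-axis relations recovers beta and tau:
assert abs((zr + 1 / zr) / 2 - beta) < 1e-12
assert abs((zi - 1 / zi) / 2 - tau) < 1e-12

# Here zeta_imag <= zeta_real, so the ellipse with parameter zeta_imag is
# the one entirely contained in the rectangle, as in the proof above.
assert zi <= zr
```

For these values, the contained ellipse has semi-major axis \((\zeta _{\text {imag}} + \zeta _{\text {imag}}^{-1})/2 \approx 1.04 < \beta \), confirming the containment.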
Lemma 2
(Holomorphic complex continuation of u in \(H^{1+s}({\mathscr {B}};{\mathbb {C}})\) in a Bernstein polyellipse) For a given \(s=1,2,\ldots ,s_{\max }\), let \({{\varvec{\zeta }}}_s = \{\zeta _{s,j}\}_{j \ge 1}\), with
with \({{\varvec{b}}}_s\) as in (6), \(p_s\) as in (7), and \(E_\delta \) as in Lemma 1. For any finite set \({\mathcal {G}}\subset {\mathbb {N}}_+\), \(u:{\mathbb {C}}^{\mathcal {G}}\rightarrow H^{1+s}({\mathscr {B}},{\mathbb {C}})\) is holomorphic in the Bernstein polyellipse \({\mathcal {E}}_{{{\varvec{\zeta }}}_s}^{\mathcal {G}}\subset \Sigma _{{\mathcal {G}},\tilde{\delta }}\), with
where \(M = \frac{s!}{(\log 2)^s} e^{K \frac{\pi }{E_\delta }} \left( 1 + K \frac{\pi }{E_\delta } \right) ^s\), \(K= \left( 2 + \frac{1}{\min _{j \ge 1} \tau _{s,j}^2}\right) ^{1/2}\), \(\tilde{\delta }= e^{- K \frac{\pi }{E_\delta }}\), \(C(\tilde{\delta },s, \varsigma , {\mathscr {B}})\) as in Assumption A1, and \(C_{s,u}\) independent of \({\mathcal {G}}\).
Proof
From Assumption A1, \(u : {\mathbb {C}}^{\mathcal {G}}\rightarrow H^{1+s}({\mathscr {B}},{\mathbb {C}})\) is complex differentiable for every \({{\varvec{z}}}\) in \(\Sigma _{{\mathcal {G}},\varepsilon }\) for any \(\varepsilon > 0\); it is therefore holomorphic in \(\Sigma _{{\mathcal {G}},\varepsilon }\). Similarly to the previous lemma, we look for a region in which we have an a priori bound on the \(H^{1+s}({\mathscr {B}},{\mathbb {C}})\) norm of u, uniformly in \({{\varvec{z}}}\). Again from Assumption A1, we have that this is true in the region
However, contrary to the previous lemma, in this proof, we do not derive the expression of a polyellipse contained in \(\Xi _{{\mathcal {G}},\varepsilon }(M)\), but content ourselves with verifying that the polyellipses, \({\mathcal {E}}_{{\mathcal {G}},{{\varvec{\zeta }}}_s}\), proposed in the statement of the lemma (that we have obtained simply by replacing \(b_{0,j}\) with \(b_{s,j}\) in (13)) satisfy the requirement, i.e., \({\mathcal {E}}_{{\mathcal {G}},{{\varvec{\zeta }}}_s} \subset \Xi _{{\mathcal {G}},\tilde{\delta }}(M)\), for every finite set, \({\mathcal {G}}\subset {\mathbb {N}}_+\), and for a certain \(\tilde{\delta }\) that we specify later to control the coercivity of the problem. To this end, let us consider the univariate polyellipse \({\mathcal {E}}_{\zeta _{s,j}}\). We first prove that this polyellipse is contained in the following complex rectangle:
The bound on the imaginary part of z is a consequence of the choice of the polyellipse in (14), similarly to what was done in Lemma 1. For the real part, we compute the point \(z_0\) where the polyellipse intersects the real axis by equating
Furthermore, we observe that \(|z| \le \sqrt{1 + 2 \tau _{s,j}^2} \le K \tau _{s,j}\) for every \(z \in {\mathcal {R}}_j\) and some \(K>0\); for instance, we could look for the smallest \(\tau _{s,j}\), say \(\tau _{s,j^*}\), choose K accordingly, i.e., such that \((K^2-2)\tau _{s,j^*}^2 \ge 1\), and obtain the value in the statement of the lemma. Next, according to (10) and Assumption A2,
holds. We finish the proof by observing that for every \({{\varvec{z}}}\in {\mathcal {E}}_{{\mathcal {G}},{{\varvec{\zeta }}}_s}\), we have
which gives uniform control of the norm of \(\left\| a(\cdot ,{{\varvec{z}}}) \right\| _{C^s(\overline{{\mathscr {B}}},{\mathbb {C}})} \) within \({\mathcal {E}}_{{{\varvec{\zeta }}}_s}^{\mathcal {G}}\) as required. More precisely, we have
which together with Assumption A1 gives the desired bound on \(\left\| u(\cdot ,{{\varvec{z}}}) \right\| _{H^{1+s}({\mathscr {B}})} \) in (16) and
\(\square \)
The following result from [22, 33] is also needed. Since this result is concerned with the finite-dimensional case, i.e., \({\mathcal {G}}=\{1,2,\ldots ,N\}\) and \({{\varvec{\zeta }}} \in {\mathbb {R}}^N\), we write, for ease of notation, \({\mathcal {E}}_{{{\varvec{\zeta }}}}\) instead of \({\mathcal {E}}_{{{\varvec{\zeta }}}}^{\mathcal {G}}\), i.e., \({\mathcal {E}}_{{{\varvec{\zeta }}}} = \{ {{\varvec{z}}}\in {\mathbb {C}}^N : z_j \in {\mathcal {E}}_{\zeta _j} \text { for } j=1,2,\ldots ,N\}\).
Lemma 3
(Chebyshev expansion of a holomorphic function) Given \(q_j \in {\mathbb {N}}\), let \(\phi _{q_j}\) be the family of Chebyshev polynomials of the first kind on \([-1,1]\), i.e., \(|\phi _{q_j}(y)|\le 1\) for all \(y \in [-1,1]\), and, for \(N \in {\mathbb {N}}_+\) and any \({{\varvec{p}}}\in {\mathbb {N}}^N\), let \(\Phi _{{\varvec{p}}}({{\varvec{y}}}) =\prod _{j=1}^N \phi _{p_j}(y_j)\). If \(f:[-1,1]^N \rightarrow {\mathbb {R}}\) admits a holomorphic complex extension in a Bernstein polyellipse, \({\mathcal {E}}_{{{\varvec{\zeta }}}}\), for some \({{\varvec{\zeta }}} \in (1, \infty )^N\) and if there exists \(0< C_f < \infty \) such that \(\sup _{{{\varvec{z}}}\in {\mathcal {E}}_{{{\varvec{\zeta }}}}} |f({{\varvec{z}}})| \le C_f\), then f admits the following Chebyshev expansion:
which converges uniformly in \({\mathcal {E}}_{{{\varvec{\zeta }}}}\). Moreover, the following bound on the coefficients \(f_{{\varvec{p}}}\) holds:
where \(|{{\varvec{p}}}|_0\) denotes the number of nonzero elements of \({{\varvec{p}}}\).
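As a purely illustrative numerical check of the coefficient decay stated in Lemma 3 (not part of the original argument), the sketch below estimates the Chebyshev coefficients of the univariate function \(f(y)=1/(3-y)\), which is holomorphic inside the Bernstein ellipse with parameter \(\rho = 3+\sqrt{8}\) determined by its pole at \(y=3\); consecutive coefficients should then shrink by a factor of roughly \(\rho \):

```python
import math

def cheb_coeff(f, p, n=4000):
    # f_p = (2 - [p == 0]) / pi * integral_0^pi f(cos t) cos(p t) dt  (trapezoid rule)
    h = math.pi / n
    total = sum((0.5 if i in (0, n) else 1.0) * f(math.cos(i * h)) * math.cos(p * i * h)
                for i in range(n + 1))
    return (1.0 if p == 0 else 2.0) / math.pi * total * h

f = lambda y: 1.0 / (3.0 - y)       # pole at y = 3, outside [-1, 1]
rho = 3.0 + math.sqrt(8.0)          # Bernstein-ellipse parameter determined by the pole
ratios = [cheb_coeff(f, p) / cheb_coeff(f, p + 1) for p in range(2, 8)]
# each ratio is close to rho, consistent with |f_p| decaying like rho^(-p)
```

For this particular \(f\) the coefficients are known in closed form (\(f_0 = 1/\sqrt{8}\), \(f_p = 2\rho ^{-p}/\sqrt{8}\) for \(p\ge 1\)), so the observed ratios match \(\rho \approx 5.83\) essentially to quadrature accuracy.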
3 The Multi-index Stochastic Collocation Method
In this section, we introduce approximations of \({\mathbb {E}}\left[ F\right] \) along the deterministic and stochastic dimensions and their decomposition in terms of tensorizations of univariate difference operators. We then recall the so-called mixed difference operators and the construction of the MISC estimator that was first introduced in [22] in a general setting.
3.1 Approximation Along the Deterministic and Stochastic Variables
Tensorized deterministic solver. Let \(\{{\mathbb {T}}_i\}_{i=1}^D\) be the triangulations/meshes of each of the subdomains \(\{{\mathscr {B}}_i\}_{i=1}^D\) composing the domain \({\mathscr {B}}\); denote by \(\{h_i\}_{i=1}^D\) the mesh size on each mesh \({\mathbb {T}}_i\); and let \(\bigotimes _{i=1}^D {\mathbb {T}}_i\) be the mesh for \({\mathscr {B}}\). Then, consider a numerical method for the approximation of the solution of (1) for a fixed value of the random variables, \({{\varvec{y}}}\), based on such a mesh, e.g., finite differences, finite volumes, tensorized finite elements, or h-refined splines, such as those used in the isogeometric context. The values of \(h_i\) are actually given as functions of a positive integer value, \(\alpha \ge 1\), referred to as a “deterministic discretization level”, i.e., \(h_i=h_i(\alpha )\). Observe that, in a more general setting, the mesh size need not be constant over the subdomain \({\mathscr {B}}_i\) and could, for instance, be the result of a grading function intended to refine subregions of \({\mathscr {B}}_i\) as in Fig. 1d (see also [24, Remark 2.2] for further comments on locally refined meshes in the context of Multi-Level Monte Carlo methods). In this work, we restrict ourselves to constant h for ease of presentation. Given a multi-index, \({{{\varvec{\alpha }}}}\in {\mathbb {N}}^D_+\), we denote by \(u^{{{\varvec{\alpha }}}}({{\varvec{x}}},{{\varvec{y}}})\) the approximation of u obtained by setting \(h_i = h_i(\alpha _i)\) and use the notation \(F^{{{\varvec{\alpha }}}}({{\varvec{y}}}) = \Theta [u^{{{\varvec{\alpha }}}}(\cdot ,{{\varvec{y}}})]\). More specifically, in the following we will consider
and a method obtained by tensorizing piecewise multi-linear finite element spaces on each mesh, \(\{{\mathbb {T}}_i\}_{i=1}^D\), discretizing each \(\{{\mathscr {B}}_i\}_{i=1}^D\).
As already mentioned in the previous section, MISC could also be applied to more general domains, such as those discussed in Fig. 1, as long as some kind of “tensor structure” can be induced from the shape of the domain to the solver of the deterministic problem and the vector \({{{\varvec{\alpha }}}}\) determines the refinement level of each component of such a tensor structure. The reason why we need such a tensor structure will become clear when we introduce the classic sparse-grids approach to solve the problem. For non-tensorial domains, we can always set \(D=1\) and consider an unstructured mesh for the whole domain, \({\mathscr {B}}\), having only one discretization level \(\alpha \in {\mathbb {N}}_+\). In this way, we give up the sparse-grid approach on the deterministic part of the problem and obtain a variant of the Multi-Level Stochastic Collocation method discussed in [37, 38], yet with a different algorithm for combining spatial and stochastic discretizations. See Remark 1 stated next and [22] for additional discussion on this aspect.
It would be straightforward to extend this setting to discretization methods based on degree elevation rather than on mesh refinement, such as spectral methods, p-refined finite elements or p- and k-refined splines. However, here we limit ourselves to the setting defined above. It would also be possible to include time-dependent problems in this framework, but in this case we might need to take care of possible constraints on discretization parameters, such as CFL conditions; a broader generalization could also include “non-physical” parameters such as tolerances for numerical solvers. Finally, more general problems, e.g., those depending on random variables with probability distributions other than uniform distributions or with uncertain boundary conditions and/or forcing terms could also be addressed with suitable modifications of the MISC methodology.
Tensorized quadrature formulae for expected value approximation. Similarly to what was presented for the deterministic problem, we base our approximation of the expected value of \(F^{{{\varvec{\alpha }}}}({{\varvec{y}}})\) on a tensorization of quadrature formulae over the stochastic domain, \(\Gamma \).
Let \(C^0([-1,1])\) be the set of real-valued continuous functions over \([-1,1]\), \(\beta \in {\mathbb {N}}_+\) be referred to as a “stochastic discretization level”, and \(m: {\mathbb {N}}\rightarrow {\mathbb {N}}\) be a strictly increasing function with \(m(0)=0\) and \(m(1)=1\), that we call a “level-to-nodes function”. At level \(\beta \), we consider a set of \(m(\beta )\) distinct quadrature points in \([-1,1]\), \({\mathscr {P}}^{m(\beta )} = \{y_\beta ^1, y_\beta ^2, \ldots , y_\beta ^{m(\beta )}\} \subset [-1,1]\), and a set of quadrature weights, \({\mathcal {W}}^{m(\beta )}=\{\varpi _\beta ^1, \varpi _\beta ^2, \ldots , \varpi _\beta ^{m(\beta )}\}\). We then define the quadrature operator as
The quadrature weights are selected such that \(Q^{m(\beta )}[y^k] = \int _{-1}^1 \frac{y^k}{2} dy, \quad \forall \, k=0,1,\ldots , m(\beta )-1\). The quadrature points are chosen to optimize the convergence properties of the quadrature error (the specific choice of quadrature points is discussed later in this section). In particular, for symmetry reasons, we define the trivial operator \(Q^{1}[f] = f(0),\; \forall f \in C^0([-1,1])\).
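To make the moment-matching condition above concrete, here is a small self-contained sketch (the three-point node set is illustrative, not a prescribed choice) that recovers the weights by solving the Vandermonde moment system \(\sum _i w_i y_i^k = \int _{-1}^1 \frac{y^k}{2}\,dy\) exactly over the rationals:

```python
from fractions import Fraction

def moment(k):
    # k-th moment of the measure dy/2 on [-1, 1]: 0 for odd k, 1/(k+1) for even k
    return Fraction(0) if k % 2 else Fraction(1, k + 1)

def quadrature_weights(points):
    """Solve the moment system  sum_i w_i * y_i^k = moment(k),  k = 0..n-1,
    by Gaussian elimination over exact rationals."""
    n = len(points)
    A = [[Fraction(p) ** k for p in points] + [moment(k)] for k in range(n)]
    for c in range(n):                                   # forward elimination
        piv = max(range(c, n), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            A[r] = [x - f * y for x, y in zip(A[r], A[c])]
    w = [Fraction(0)] * n
    for r in range(n - 1, -1, -1):                       # back substitution
        w[r] = (A[r][n] - sum(A[r][c] * w[c] for c in range(r + 1, n))) / A[r][r]
    return w
```

For the points \(-1, 0, 1\) this yields the Simpson-type weights \(1/6, 2/3, 1/6\) (for the normalized measure), and for the single point \(0\) it recovers the trivial operator \(Q^{1}[f] = f(0)\) with weight 1.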
Defining a quadrature operator over \(\Gamma \) is more delicate, since \(\Gamma \) is defined as a countable tensor product of intervals. To this end, we follow [36] and define, for any finitely supported multi-index \({{{\varvec{\beta }}}}\in {\mathfrak {L}}_+\),
where the j-th quadrature operator is understood to act only on the j-th variable, and the tensor product is well defined since it is composed of finitely many non-trivial factors (see [36] again). In practice, the value of \({\mathscr {Q}}^{m({{{\varvec{\beta }}}})}[f]\) can be obtained by considering the tensor grid \({\mathscr {T}}^{m({{{\varvec{\beta }}}})} = \times _{j\ge 1} {\mathscr {P}}^{m(\beta _j)}\) with cardinality \( \#{\mathscr {T}}^{m({{{\varvec{\beta }}}})} = \prod _{j \ge 1} m(\beta _j)\) and computing
where \(\widehat{{{\varvec{y}}}}_j \in {\mathscr {T}}^{m({{{\varvec{\beta }}}})}\) and \(w_j\) are (infinite) products of weights of the univariate quadrature rules. Notice that having \(m(1)=1\) is essential in this construction so that the cardinality of \({\mathscr {T}}^{m({{{\varvec{\beta }}}})}\) is finite for any \({{{\varvec{\beta }}}}\in {\mathfrak {L}}_+\) and \(\varpi _{\beta _j}^1=1\) whenever \(\beta _j=1\). All weights, \(w_j\), are thus bounded.
Coming back to the choice of the univariate quadrature points, it is recommended, for optimal performance, that they be chosen according to the underlying measure, \({\text {d}}\lambda /2\). Moreover, since we aim at a hierarchical decomposition of the operator, \({\mathscr {Q}}^{m({{{\varvec{\beta }}}})}\), it is useful (although not necessary, see e.g., [33]) that the nodes be nested collocation points, i.e., \({\mathscr {P}}^{m(\beta )} \subset {\mathscr {P}}^{m(\beta +1)}\) for any \(\beta \ge 1\). Thus, we consider Clenshaw–Curtis points, which are defined as
Clenshaw–Curtis points are nested provided that the level-to-nodes function is defined as
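Since the explicit formulas in (19)–(20) are not reproduced here, the sketch below assumes the standard choices, namely the level-to-nodes function \(m(1)=1\), \(m(\beta )=2^{\beta -1}+1\) for \(\beta \ge 2\), and the Clenshaw–Curtis abscissae \(-\cos (\pi (k-1)/(m(\beta )-1))\), and verifies the nesting \({\mathscr {P}}^{m(\beta )} \subset {\mathscr {P}}^{m(\beta +1)}\) numerically:

```python
import math

def m(beta):
    # assumed standard level-to-nodes function: m(0)=0, m(1)=1, m(beta)=2^(beta-1)+1
    return 0 if beta == 0 else 1 if beta == 1 else 2 ** (beta - 1) + 1

def cc_points(beta):
    # Clenshaw-Curtis abscissae: extrema of Chebyshev polynomials on [-1, 1]
    n = m(beta)
    if n == 1:
        return [0.0]
    return [-math.cos(math.pi * k / (n - 1)) for k in range(n)]

def as_set(points):
    return {round(p, 12) for p in points}   # rounding makes float comparison robust
```

With the doubling rule, the abscissae of level \(\beta \) are a subset of those of level \(\beta +1\), so refining a quadrature level reuses all previously computed PDE solves.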
We close this section by mentioning that another family of nested points for uniform measures available in the literature is given by the Leja points, whose performance is equivalent to that of Clenshaw–Curtis points for quadrature purposes. See, e.g., [10, 31, 32, 36] and references therein for definitions and comparison.
3.2 Construction of the MISC Estimator
It is straightforward to see that a direct approximation, \({\mathbb {E}}\left[ F\right] \approx {\mathscr {Q}}^{m({{{\varvec{\beta }}}})} [F^{{{\varvec{\alpha }}}}]\), is not a viable option in practical cases, due to the well-known “curse of dimensionality” effect. In [22], we proposed to use MISC as a computational strategy to combine spatial and stochastic discretizations in such a way as to obtain an effective approximation scheme for \({\mathbb {E}}\left[ F\right] \).
MISC is based on a classic sparsification approach in which approximations like \({\mathscr {Q}}^{m({{{\varvec{\beta }}}})}[F^{{{\varvec{\alpha }}}}]\) are decomposed into a hierarchy of operators. Only the most important of these operators are retained in the approximation. In more detail, let us denote for brevity \({\mathscr {Q}}^{m({{{\varvec{\beta }}}})} [F^{{{\varvec{\alpha }}}}] = F_{{{{\varvec{\alpha }}}},{{{\varvec{\beta }}}}}\) and introduce the first-order difference operators for the deterministic and stochastic discretization operators, denoted, respectively, by \(\Delta _{i}^{\mathrm{det}}\) with \(1\le i\le D\) and \(\Delta _{j}^{\mathrm{stoc}}\) with \(j \ge 1\):
As a second step, we define the so-called mixed difference operators,
with the convention that \(F_{{{\varvec{v}}},{{\varvec{w}}}}=0\) whenever a component of \({{\varvec{v}}}\) or \({{\varvec{w}}}\) is zero. Notice that, since \({{{\varvec{\beta }}}}\) has finitely many components larger than 1, the sum on the right-hand side of (22) contains only a finite number of terms. Finally, letting
we define the Multi-index Stochastic Collocation (MISC) estimator of \({\mathbb {E}}\left[ F\right] \) as
where \({\mathcal {I}} \subset {\mathbb {N}}_+^{D} \times {\mathfrak {L}}_+\). The second form of the MISC estimator is known as the “combination technique”, since it expresses the MISC approximation as a linear combination of a number of tensor approximations, \(F_{{{{\varvec{\alpha }}}},{{{\varvec{\beta }}}}}\), and might be useful for the practical implementation of the method; we observe in particular that many of its coefficients, \(c_{{{{\varvec{\alpha }}}},{{{\varvec{\beta }}}}}\), are zero.
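As an illustration of the combination-technique form (and of the vanishing coefficients just mentioned), the following sketch computes the coefficients for a small two-dimensional toy set via the standard inclusion–exclusion formula \(c_{{\varvec{i}}} = \sum _{{{\varvec{j}}}\in \{0,1\}^n,\, {{\varvec{i}}}+{{\varvec{j}}}\in {\mathcal {I}}} (-1)^{|{{\varvec{j}}}|}\); the toy index set is an assumption for illustration only:

```python
from itertools import product

def combination_coefficients(index_set):
    """Inclusion-exclusion coefficients of the combination-technique form:
    c_i = sum of (-1)^{|j|} over binary vectors j with i + j in the set."""
    I = set(index_set)
    n = len(next(iter(I)))
    coeff = {}
    for i in I:
        c = 0
        for j in product((0, 1), repeat=n):
            if tuple(a + b for a, b in zip(i, j)) in I:
                c += (-1) ** sum(j)
        coeff[i] = c
    return coeff

# toy downward-closed set in N_+^2: all (a, b) with a + b <= 4
I = [(a, b) for a in range(1, 4) for b in range(1, 4) if a + b <= 4]
c = combination_coefficients(I)
```

For this set only the indices on the two outermost diagonals carry nonzero coefficients (\(+1\) and \(-1\), respectively); all interior coefficients vanish, and the coefficients sum to 1, so constants are reproduced exactly.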
The effectiveness of MISC crucially depends on the choice of the index set, \({\mathcal {I}}\). Given the hierarchical structure of MISC, a natural requirement is that \({\mathcal {I}}\) should be downward-closed, i.e.,
(see also [6, 33, 39]). In addition to this general constraint, in [22] we have detailed a procedure to derive an efficient set, \({\mathcal {I}}\), based on an optimization technique inspired by the Dantzig algorithm for the approximate solution of the knapsack problem (see [30]). In the following, we briefly summarize this procedure and refer to [22] as well as to [3, 6, 33] for a thorough discussion on the similarities between this procedure and the Dantzig algorithm.
The first step of our optimized construction consists of introducing the “work contribution,” \(\Delta W_{{{{\varvec{\alpha }}}}, {{{\varvec{\beta }}}}}\), and “error contribution”, \(\Delta E_{{{{\varvec{\alpha }}}}, {{{\varvec{\beta }}}}}\), for each operator, \({{\varvec{\Delta }}}[F_{{{{\varvec{\alpha }}}},{{{\varvec{\beta }}}}}]\). The work contribution measures the computational cost (measured, e.g., as a function of the total number of degrees of freedom, or in terms of computational time) required to add \({{\varvec{\Delta }}}[F_{{{{\varvec{\alpha }}}},{{{\varvec{\beta }}}}}]\) to \({\mathscr {M}}_{\mathcal {I}}[F]\), i.e.,
Similarly, the error contribution measures how much the error, \(|{\mathbb {E}}\left[ F\right] - {\mathscr {M}}_{\mathcal {I}}[F]| \), would decrease if the operator \({{\varvec{\Delta }}}[F_{{{{\varvec{\alpha }}}},{{{\varvec{\beta }}}}}]\) were added to \({\mathscr {M}}_{\mathcal {I}}[F]\),
We observe that the following decompositions of the total work and error of the MISC estimator hold:
Although it would be tempting to define \({\mathcal {I}}\) as the set of couples \([{{{\varvec{\alpha }}}},{{{\varvec{\beta }}}}]\) with the largest error contribution, this choice could be far from optimal in terms of computational cost. As suggested in the literature on the knapsack problem (see [30]), the benefit-to-cost ratio should be taken into account in the decision (see also [3, 6, 20, 22, 33]). More precisely, we propose to build the MISC estimator by first assessing the so-called profit of each operator \({{\varvec{\Delta }}}[F_{{{{\varvec{\alpha }}}},{{{\varvec{\beta }}}}}]\), i.e., the quantity
Then, we build an index set for the MISC estimator:
for a suitable \(\epsilon >0\). We observe that the obtained set is not necessarily downward-closed; we have to enforce this condition a posteriori. Obviously, \(\Delta E_{{{{\varvec{\alpha }}}}, {{{\varvec{\beta }}}}}\) and \(\Delta W_{{{{\varvec{\alpha }}}}, {{{\varvec{\beta }}}}}\) are not, in general, at our disposal. In practice, we base the construction of the MISC estimator on a-priori bounds for such quantities. More precisely, we derive a-priori ansatzes for these bounds from theoretical considerations and then fit the constants appearing in the ansatzes with some auxiliary computations. We refer to the entire strategy as a priori/a posteriori.
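A minimal sketch of this profit-thresholding construction follows; the profit model below is purely hypothetical (a stand-in for the fitted a-priori ansatzes), the search is restricted to a 2-D index box for brevity, and downward closedness is enforced a posteriori as described above:

```python
from itertools import product

def downward_closure(I):
    """Complete I to a downward-closed set: add every k with 1 <= k <= i
    componentwise, for each i in I."""
    closed = set()
    for i in I:
        closed.update(product(*(range(1, c + 1) for c in i)))
    return closed

def profit_set(profit, eps, bound=10):
    """Select all multi-indices (within a finite search box) whose estimated
    profit exceeds eps, then enforce downward closedness a posteriori."""
    candidates = {i for i in product(range(1, bound + 1), repeat=2)
                  if profit(i) >= eps}
    return downward_closure(candidates)

# hypothetical profit ansatz: profits decay twice as fast in the second direction
P = lambda i: 2.0 ** -(i[0] + 2 * i[1])
I = profit_set(P, eps=2.0 ** -8)
```

With this anisotropic ansatz the selected set is (as expected) deeper in the cheap first direction than in the expensive second one.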
Remark 1
We remark that the general form of the MISC estimator (24) is quite broad and includes other related methods (i.e., methods that combine different spatial and stochastic discretization levels to optimize the computational effort) available in the literature, such as the “Multi-Level Stochastic Collocation” method [37, 38] and the “Sparse Composite Collocation” method [4]; see also [25]. The main novelty of the MISC estimator (24)-(29) with respect to such methods is the profit-oriented selection of difference operators. Another difference from [37, 38] is the fact that difference operators in our approach are introduced in both the spatial and stochastic domains. See also [4] for a similar construction, in which no optimization is performed. More details on the comparison between the above-mentioned methods and MISC can be found in [22].
4 Error Analysis of the MISC Method
In this section, we state and prove a convergence theorem for the profit-based MISC estimator based on the multi-index set (29). The theorem is based on a result from the previous work [33], which was proved in the context of sparse-grid approximation of Hilbert-space-valued functions. Since the sparse grid and the MISC constructions are identical, this theorem can be used verbatim here. In particular, it links the summability of the profits to the convergence rate of methods such as MISC and Sparse Grids Stochastic Collocation, i.e., based on a sum of difference operators. To use this result, we have to assess the summability properties of the profits. We thus introduce suitable estimates of the error and work contributions, \(\Delta E_{{{{\varvec{\alpha }}}},{{{\varvec{\beta }}}}}\) and \(\Delta W_{{{{\varvec{\alpha }}}},{{{\varvec{\beta }}}}}\), respectively. In particular, the estimate of \(\Delta E_{{{{\varvec{\alpha }}}},{{{\varvec{\beta }}}}}\) depends on the spatial regularity of the solution, on the convergence rate of the method used to solve the deterministic problems, and on the summability property of the Chebyshev expansion of the solution over the parameter space.
Theorem 4
(Convergence of the profit-based MISC estimator, see [33]) If the profits, \(P_{{{{\varvec{\alpha }}}},{{{\varvec{\beta }}}}}\), satisfy the weighted summability condition
for some \(0<p\le 1\), then
where \({\mathrm {Work}}\left[ {\mathscr {M}}_{{\mathcal {I}}}\right] \) is given by (27).
We begin by introducing an estimate for the size of the work contribution, \(\Delta W_{{{{\varvec{\alpha }}}},{{{\varvec{\beta }}}}}\). To this end, let \(\Delta W_{{{{\varvec{\alpha }}}}}^{{\mathrm{det}}}={\mathrm {Work}}\left[ {{\varvec{\Delta }}}^{{\mathrm{det}}}[F^{{{{\varvec{\alpha }}}}}]\right] \), i.e., let it be the cost of computing \({{\varvec{\Delta }}}^{{\mathrm{det}}}[F^{{{{\varvec{\alpha }}}}}]\) according to (21).
Assumption A3
(Spatial work contribution) There exist \(\gamma _i \in [1, \infty )\) for \(i=1,\ldots ,D\) and \(C_W > 0\) such that
where \(2^{\sum _{i=1}^D d_i \alpha _i}\) is proportional to the number of degrees of freedom in the mesh on level \({{{\varvec{\alpha }}}}\), cf. equation (17), and \(\gamma _i\) are related to the deterministic solver used and to the sparsity structure of the linear system, which might be different on each \({\mathscr {B}}_i\) depending on the chosen discretization.
Lemma 5
(Total work contribution) When using Clenshaw–Curtis points for the discretization over the parameter space, the work contribution, \(\Delta W_{{{{\varvec{\alpha }}}},{{{\varvec{\beta }}}}}\), of each difference operator, \({{\varvec{\Delta }}}[F_{{{{\varvec{\alpha }}}},{{{\varvec{\beta }}}}}]\), can be decomposed as
with \(\gamma _i\) and \(C_W\) as in Assumption A3.
Proof
Combining (25) and (23), we have
Since the Clenshaw–Curtis points are nested, computing \(\Delta W_{{{{\varvec{\alpha }}}},{{{\varvec{\beta }}}}}\) (i.e., adding \([{{{\varvec{\alpha }}}},{{{\varvec{\beta }}}}]\) to the set \({\mathcal {I}}\) that defines the current MISC estimator) amounts to evaluating \(F^{{{\varvec{\alpha }}}}({{\varvec{y}}})\) in the set of “new” points added to the estimator by \({{\varvec{\Delta }}}^{{\mathrm{stoc}}}[\cdot ]\), i.e., \(\times _{j : \beta _j > 1} \left\{ {\mathscr {P}}^{m(\beta _j)} \setminus {\mathscr {P}}^{m(\beta _j-1)} \right\} \), whose cardinality is \(\prod _{j \ge 1} ( m(\beta _j) - m(\beta _j-1))\). The proof is then concluded by observing that the definition of \(m(\beta )\) in (20) immediately gives \(m(\beta _j) - m(\beta _j-1) \le 2^{\beta _j-1}\) and recalling Assumption A3. \(\square \)
Remark 2
We observe that the exponent \({{{\varvec{\beta }}}}-\mathbf{1}\) guarantees that the directions along which no quadrature is actually performed (i.e., \(\beta _j=1\) for any \(j\ge 1\)) do not contribute to the total work.
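Assuming the standard doubling rule for \(m\) (cf. (20)), both the bound \(m(\beta _j)-m(\beta _j-1) \le 2^{\beta _j-1}\) used in the proof of Lemma 5 and the observation of Remark 2 can be checked in a few lines:

```python
def m(beta):
    # assumed standard level-to-nodes function (cf. (20))
    return 0 if beta == 0 else 1 if beta == 1 else 2 ** (beta - 1) + 1

def new_points(beta):
    # quadrature points added when refining a direction from level beta-1 to beta
    return m(beta) - m(beta - 1)

def stochastic_work_factor(beta_vec):
    # number of new tensor-grid evaluations contributed by the stochastic part
    out = 1
    for b in beta_vec:
        out *= new_points(b)
    return out
```

In particular, every direction with \(\beta _j=1\) contributes a factor of 1, so it adds no work, exactly as stated in Remark 2.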
Next, we prove a sequence of lemmas that allow us to conclude that an analogous estimate holds for the error contribution as well, i.e., that \(\Delta E_{{{{\varvec{\alpha }}}},{{{\varvec{\beta }}}}}\) can be bounded as a product of a term related to the spatial discretization and a term related to the approximation over the parameter space. To this end, we need to introduce the quantity
where \(Q^{m(\beta )}[\cdot ]\) are the univariate quadrature operators introduced in (18), and observe that \({\text {Leb}}_{1}=1\). Next, let \({\mathbb {L}}= \max _{\beta \ge 1} {\text {Leb}}_{m(\beta )}\), and note that \( {\mathbb {L}}\le 2\) since \(Q^{m(\beta )}\) has positive weights. Moreover, a much smaller bound on \({\mathbb {L}}\) can be obtained for Clenshaw–Curtis points. Indeed, since Clenshaw–Curtis points are nested, we can also bound \({\text {Leb}}_{m(\beta )} \le \widetilde{{\text {Leb}}}_{m(\beta )}\) with
and it can be verified numerically that \(\widetilde{{\text {Leb}}}_{m(\beta )}\) is bounded, attains its maximum for \(\beta =3\) and converges to 1 for \(\beta \rightarrow \infty \), see Fig. 2. Therefore, we have \({\mathbb {L}}\le \widetilde{{\text {Leb}}}_{m(3)} \approx 1.067\).
Lemma 6
(Stochastic error contribution) Let \(f:\Gamma \rightarrow {\mathbb {R}}\) and \({{{\varvec{\beta }}}}\in {\mathfrak {L}}_+\), and assume that the quadrature operator, \({\mathscr {Q}}^{m({{{\varvec{\beta }}}})}\), is built with Clenshaw–Curtis abscissae. If there exists a sequence, \({{\varvec{\rho }}}=\{\rho _j\}_{j \ge 1}\), \(\rho _j > 1\) for all j, such that
-
1.
\(\sum _{j \ge 1} \frac{1}{\rho _j} < \infty \),
-
2.
there exists \(0< C_f < \infty \) such that for any finite set, \({\mathcal {G}}' \subset {\mathbb {N}}_+\) with \(\# {\mathcal {G}}' < \infty \), the restriction of f on \(([-1,1])^{{\mathcal {G}}'}\) admits a holomorphic complex extension in a Bernstein polyellipse, \({\mathcal {E}}_{{{\varvec{\rho }}}}^{{\mathcal {G}}'}\) with \(\sup _{{{\varvec{z}}}\in {\mathcal {E}}_{{{\varvec{\rho }}}}^{{\mathcal {G}}'}} |f({{\varvec{z}}})| \le C_f\).
Then, the set
has finite cardinality, i.e., \(\#{\mathcal {J}} < \infty \) and
holds, where \({\mathcal {G}}\) is the support of \({{{\varvec{\beta }}}}-\mathbf{1}\),
and \(C_{{\text {SE}}}({{{\varvec{\rho }}}}) < \infty \) is independent of \({{{\varvec{\beta }}}}\).
Proof
Let \({\mathcal {G}}\) be the support of \({{{\varvec{\beta }}}}-\mathbf{1}\) with cardinality \(\#{\mathcal {G}}< \infty \), \({{\varvec{k}}}\in {\mathbb {N}}_+^{\mathcal {G}}\), and let \(\Phi _{{\mathcal {G}},{{\varvec{k}}}}\) denote the Chebyshev polynomials of the first kind with degree \(k_j\) along \(y_{j}\) for \(j \ge 1\). We observe that \(\Phi _{{\mathcal {G}},{{\varvec{k}}}}\) are equivalent to the \(\#{\mathcal {G}}\)-variate Chebyshev polynomials over \([-1,1]^{\#{\mathcal {G}}}\) thanks to the product structure of the multivariate Chebyshev polynomials and to the fact that \(\phi _0(y)=1\). Next, consider the holomorphic extension of \(f:{\mathbb {C}}^{\mathcal {G}}\rightarrow {\mathbb {C}}\), and its Chebyshev expansion over \(\Phi _{{\mathcal {G}},{{\varvec{k}}}}\) introduced in Lemma 3. Then
holds. By construction of the hierarchical surplus, we have \({{\varvec{\Delta }}}^{\mathrm{stoc}}[ {\mathscr {Q}}^{m({{{\varvec{\beta }}}})}[\Phi _{{\mathcal {G}},{{\varvec{k}}}}]] = 0\) for all Chebyshev polynomials, \(\Phi _{{\mathcal {G}},{{\varvec{k}}}}\), such that \(\exists \,j \in {\mathcal {G}}: k_j < m(\beta _j -1)\) (i.e., for polynomials that are integrated exactly at least in one direction by both quadrature operators along that direction). Therefore, the previous sum reduces to the multi-index set \({{\varvec{k}}}\ge m({{{\varvec{\beta }}}}- \mathbf{1})\). Furthermore, by the triangle inequality, we have
Next, using the definitions of \({{\varvec{\Delta }}}^{{\mathrm{stoc}}}\) and \({\text {Leb}}_{m(\beta )}\) and recalling that Chebyshev polynomials of the first kind on \([-1,1]\) are bounded by 1, that \({\text {Leb}}_{m(\beta )} \le \widetilde{{\text {Leb}}}_{m(\beta )} \le 1\) for \(\beta =1,2\) and \({\text {Leb}}_{m(\beta )}\le {\mathbb {L}}\) for \(\beta \ge 3\), we have
We then bound \(| f_{{\varvec{k}}}|\) by Lemma 3 to obtain
Next, we observe that \(|k_j|_0 \le \min \{1, k_j\}\) for \(k_j \ge 0\) and \(1 \le \frac{1}{3}m(\beta _j - 1)\) for all \(\beta _j \ge 3\). Then,
where the last inequality is due to the fact that \(m(\beta _j - 1) = 0\) whenever \(j \notin {\mathcal {G}}\) or equivalently \(\beta _j=1\) and
since \(g(\rho _j) > 0\) for all \(j\ge 1\). Note that \(C_{{\text {SE}}}({{{\varvec{\rho }}}})\) is independent of \({{{\varvec{\beta }}}}\) and is bounded since
which was assumed. Moreover, to show that \(\# {\mathcal {J}} < \infty \), note that if \({\mathcal {J}}\) were infinite, then \(\sum _{j \in {\mathcal {J}}} \rho _j^{-1}\) would be unbounded, contradicting the first assumption of the lemma, namely \(\sum _{j \ge 1} \rho _j^{-1} < \infty \). \(\square \)
Remark 3
Sharper estimates could be obtained by exploiting the structure of the Chebyshev polynomials when bounding \(\left| {\Delta ^{{\mathrm {stoc}}}[Q^{m(\beta _j)}[\phi _{k_j}]]}\right| \) in (31) (for instance, the fact that \(Q^{m(\beta _j)}[\phi _{k_j}]=0\) whenever \(k_j\) is odd and larger than 1) rather than using the general bound \(\Delta ^{{\mathrm {stoc}}}[ Q^{m(\beta _j)}[\phi _{k_j}]] \le \widetilde{{\text {Leb}}}_{m(\beta _j)} \left\| \phi _{k_j} \right\| _{L^\infty ([-1,1])} \).
We are now almost in the position to prove the estimate on the error contribution (see Lemma 8); before doing this, we need another auxiliary lemma that gives conditions for the summability of certain sequences that will be considered in the proof of Lemma 8 as well as in the proof of the main theorem on the convergence of MISC.
Lemma 7
(Summability of stochastic rates) Recall the definitions of \(\{\zeta _{0,j}\}_{j \ge 1}\) in Lemma 1, of \(\{\zeta _{s,j}\}_{j \ge 1}\) in Lemma 2 and of \(g(\cdot )\) in Lemma 6. Under Assumption A2, for all \(s=0, 1, \ldots , s_{\max }\), the sequences \(\{e^{-g(\zeta _{s,j})}\}_{j \ge 1}\) and \(\left\{ \frac{1}{\zeta _{s,j}}\right\} _{j \ge 1}\) are \(\ell ^{p}\)-summable for \(p \ge \tilde{p}_s = \frac{p_s}{1-p_s}\), with \(\tilde{p}_s < 1\).
Proof
First note that, by definition of \(g(\cdot )\), we have
Then, from (14)–(15), or (12)–(13) for \(s=0\), we can bound \( 2\tau _{s,j} \le \zeta _{s,j}\) and obtain
From Assumption A2, we know that \({{\varvec{b}}}_s \in \ell ^{p_s}\) for \(p_s \le \frac{1}{2}\), and therefore we have the condition
\(\square \)
Lemma 8
(Total error contribution) Assume that the deterministic problem is solved with a method obtained by tensorizing piecewise multi-linear finite element spaces on each mesh, \(\{{\mathbb {T}}_i\}_{i=1}^D\), discretizing each \(\{{\mathscr {B}}_i\}_{i=1}^D\), and let \(h_i\), as in (17), be the mesh size of each \({\mathbb {T}}_i\). Then, under Assumptions A1 and A2, the error contribution, \(\Delta E_{{{{\varvec{\alpha }}}},{{{\varvec{\beta }}}}}\), of each difference operator, \({{\varvec{\Delta }}}[F_{{{{\varvec{\alpha }}}},{{{\varvec{\beta }}}}}],\) can be decomposed as
for a constant \(C_s < \infty \) independent of \({{{\varvec{\alpha }}}}\) and \({{{\varvec{\beta }}}}\) and
with \(g_{s,j} = g(\zeta _{s,j})\) as in Lemma 6 and \(r_{{\mathrm {FEM}}}(s{{\varvec{q}}})_i=\min \left\{ 1,q_is\right\} \) for \(i=1,\ldots ,D\) with \({{\varvec{q}}}\in {\mathbb {R}}_+^D\) s.t. \(|{{\varvec{q}}}|=1\).
Proof
Combining the definition of \({{\varvec{\Delta }}}[F_{{{{\varvec{\alpha }}}},{{{\varvec{\beta }}}}}]\), cf. (23), and the definition of \(\Delta ^{{\mathrm{det}}}[F_{{{{\varvec{\alpha }}}},{{{\varvec{\beta }}}}}]\), cf. (21), we have
We observe that \(f({{\varvec{y}}}) = \Theta [ {{\varvec{\Delta }}}^{\mathrm{det}}[u^{{{{\varvec{\alpha }}}}}(\cdot ,{{\varvec{y}}})]]\) is a linear combination of some \(u^{{{\varvec{\alpha }}}}\) and that each of these \(u^{{{\varvec{\alpha }}}}\) is an \(H^1_0({\mathscr {B}},{\mathbb {C}})\)-holomorphic function, since the finite element approximations of u are holomorphic in the same complex region as u itself; hence, \(f({{\varvec{y}}})\) is also holomorphic in the same region. Then, thanks to the summability properties in Lemma 7, we can apply Lemma 6 to \(f(\cdot )\) in the polyellipses defined in Lemmas 1 and 2 by \({{\varvec{\zeta }}}_s\) in (14) (or (12) for \(s=0\)) and obtain
where \({\mathcal {G}}\subset {\mathbb {N}}_+\) denotes the support of \({{{\varvec{\beta }}}}-\mathbf{1}\). Next, assuming that the spatial discretization consists of piecewise linear finite elements with spatial mesh sizes (17) and combining the a-priori bounds on the decay of the difference operators coming from the Combination Technique theory (see, e.g., [19, proof of Theorem 2]) with (4) and the fact that \(u \in H^{1+s}({\mathscr {B}})\) for any \(s=0, 1,\ldots ,s_{\max }\), we have the following bound for every \({{\varvec{z}}}\) in the Bernstein polyellipse, \({\mathcal {E}}_{{{\varvec{\zeta }}}_s}^{\mathcal {G}}\):
for \({{\varvec{q}}}\in {\mathbb {R}}_+^D\) s.t. \(|{{\varvec{q}}}|=1\), some \(C_{CT}>0\) independent of u, and where \(C_{s,{{\varvec{q}}}}\) is the embedding constant between \({\mathcal {H}}^{\mathbf{1}+s{{\varvec{q}}}}({\mathscr {B}},{\mathbb {C}})\) and \(H^{1+s}({\mathscr {B}},{\mathbb {C}})\). We then have the following bound:
where the constant \(C_{{\mathrm {SE}}}({{\varvec{\zeta }}}_s)\) is bounded independently of \({{{\varvec{\beta }}}}\), thanks again to Lemma 6. The proof is then concluded by recalling that \(\sup _{{{\varvec{z}}}\in {\mathcal {E}}_{{{\varvec{\zeta }}}_s}^{\mathcal {G}}} \left\| u(\cdot ,{{\varvec{z}}}) \right\| _{H^{1+s}({\mathscr {B}},{\mathbb {C}})} \le C_{s,u}\) independently of \({{{\varvec{\beta }}}}\) and \({\mathcal {G}}\) due to Lemmas 1 and 2. \(\square \)
Observe that the result in Lemma 8 gives a bound on \(\Delta E_{{{{\varvec{\alpha }}}},{{{\varvec{\beta }}}}}\) parametric on the vector \({{\varvec{q}}}\). The optimal choice for such \({{\varvec{q}}}\) will be discussed later on, in the proof of the main theorem, namely Theorem 10.
Remark 4
(Relaxing the simplifying assumption) We now clarify why the assumption \(p_s < \frac{1}{2}\), for \(s=0,1,\ldots ,s_{\max }\), is not essential. Due to a suboptimal choice of \(\zeta _{s,j}\) in Lemma 2 (and Lemma 1 for \(s=0\)), the sequence \(\{e^{-g_{s,j} }\}_{j \ge 1}\) in Lemma 7, which is related to the solution u and which appears in the proof of the MISC convergence theorem, has worse summability \(\tilde{p}_s=\frac{p_s}{1-p_s}\) than the sequence \(\{b_{s,j} \}_{j \ge 1}\), whose summability coefficient is \(p_s\), and which is related to the diffusion coefficient, a. As we see in the main theorem stated below, we need \(\tilde{p}_s<1\) to guarantee convergence of MISC, which implies \(p_s < \frac{1}{2}\). By choosing the polyellipses in Lemmas 1 and 2 by the more elaborate strategy presented in [11], it would be possible to obtain the better summability \(\tilde{p}_s=p_s\) for the sequence \(\{e^{-g_{s,j} }\}_{j \ge 1}\), which would only imply the less stringent condition \(p_s<1\) and a better estimate for the MISC convergence rate. However, for ease of exposition, we maintain the sub-optimal choice, which is enough for the purpose of presenting the argument that proves convergence of MISC. The restriction \(p_s<\frac{1}{2}\) formally prevents us from applying the MISC convergence analysis to diffusion coefficients with low spatial regularity. In practice, we see in Sect. 6 that the convergence estimates are numerically verified beyond this restriction.
Before proving the main theorem of this section, we finally need the following technical lemma.
Lemma 9
(Bounding a sum of double exponentials) For \(a>0\), \(b \ge 2\) and \(0\le c < a b\),
holds, where for each fixed c and b, \(\varepsilon (\cdot ,b, c)\) is a monotonically decreasing, strictly positive function with \(\varepsilon (a,b,c) \rightarrow c \text { as } a \rightarrow +\infty \).
Proof
We observe that, for \(b\ge 2\), we have \(b^k-1 \ge k\) for every integer \(k \ge 1\). Therefore, \(e^{-a b (b^{k}-1)} \le e^{-abk}\) and we have
Then,
and we finish by verifying that the function, \(\varepsilon \), has the required properties. \(\square \)
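The termwise inequality used in the proof can be checked numerically; the sketch below compares (truncated) sums \(\sum _{k\ge 1} e^{-ab(b^k-1)}\) against the geometric-series bound \(\sum _{k\ge 1} e^{-abk} = e^{-ab}/(1-e^{-ab})\) obtained from \(b^k-1\ge k\) (the truncation level is for illustration only; the terms decay double-exponentially fast):

```python
import math

def double_exp_sum(a, b, kmax=30):
    # partial sum of sum_{k>=1} exp(-a*b*(b^k - 1)); later terms underflow to 0
    return sum(math.exp(-a * b * (b ** k - 1)) for k in range(1, kmax + 1))

def geometric_bound(a, b):
    # from b^k - 1 >= k (b >= 2, integer k >= 1): each term is at most exp(-a*b*k)
    q = math.exp(-a * b)
    return q / (1.0 - q)
```

For example, with \(a=0.5\) and \(b=2\) the sum is about 0.42 while the bound is about 0.58.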
We are now ready to state and prove our main result.
Theorem 10
(MISC convergence theorem) Under Assumptions A1–A3, the profit-based MISC estimator, \({\mathscr {M}}_{\mathcal {I}}\), built using the set \({\mathcal {I}}\) defined in (29), Stochastic Collocation with Clenshaw–Curtis points as in (19)–(20), and spatial discretization obtained by tensorizing multi-linear piecewise finite element spaces on each mesh, \(\{{\mathbb {T}}_i\}_{i=1}^D\), with mesh sizes \(h_i\) as in (17) for solving the deterministic problems, satisfies, for any \(\delta > 0\),
for some constant \(C_\delta >0\) that is independent of \({\mathrm {Work}}\left[ {\mathscr {M}}_{{\mathcal {I}}}\right] \) and tends to infinity as \(\delta \rightarrow 0\). Moreover, \({\mathrm {Work}}\left[ {\mathscr {M}}_{{\mathcal {I}}}\right] \) is given by (27), and
where
Proof
In this proof, we use Theorem 4. Therefore, we need to estimate the p-summability of the weighted profits for some \(p<1\). To this end, we use Lemma 8. Observe that Lemma 8 provides a family of bounds for each \(\Delta E_{{{{\varvec{\alpha }}}},{{{\varvec{\beta }}}}}\), depending on s; ideally, we would then choose the best s for each \(\Delta E_{{{{\varvec{\alpha }}}},{{{\varvec{\beta }}}}}\). However, this optimization problem is too complex, and we simplify it by assuming that
-
only two values of s will be considered, \(s=0\) and \(s=s^*\) (which will not necessarily coincide with \(s_{\max }\));
-
the optimization between \(s=0\) and \(s=s^*\) will not be carried out individually on each \(\Delta E_{{{{\varvec{\alpha }}}},{{{\varvec{\beta }}}}}\), but we will rather take a “convex combination” of the two corresponding estimates and choose the best outcome only at the end of the proof.
To this end, we first need to rewrite the result of Lemma 8 in a more suitable form for any fixed \(s^* \in \{1,2,\ldots ,s_{\max }\}\); note that from here on, with a slight abuse of notation, we drop the superscript \(^*\) and simply use s to denote the fixed value. Thus, for such fixed s, consider the statement of Lemma 8, let \(C_E = \max \{C_{0}, C_{s}\}\), \(\chi _{s,j} = g_{s,j} \log _2 e\), and \(\theta _{s,j} = (g_{0,j} - g_{s,j}) \log _2 e\) and combine (32)–(34), obtaining
for an arbitrary \({{\varvec{q}}}\in {\mathbb {R}}_+^D\) with \(|{{\varvec{q}}}| = 1\) that we will choose later. We can now bound the weighted sum of the profits as follows:
We then investigate under what conditions each of the two factors is finite (the constants \(C_E,C_W\) are bounded, cf. Lemmas 5 and 8). Before proceeding, we comment on the equations above. As mentioned at the beginning of the proof, we work in a suboptimal setting: instead of choosing a different s for each \(\Delta E_{{{{\varvec{\alpha }}}},{{{\varvec{\beta }}}}}\), we restrict ourselves to choosing between only two values, \(s=0\) or a certain \(s>0\) (second line). Observe that the second and third lines are equal since \(2^x\) is a monotone function of x; hence, the minimum is always attained at either \(\lambda =0\) or \(\lambda =1\). However, when switching the order of the sum and the minimum in the fourth line, i.e., bounding the sum by choosing the same \(\lambda \) for every term, allowing a fractional \(\lambda \) gives a tighter bound on the overall sum than restricting to \(\lambda \in \{0,1\}\). Roughly speaking, an intermediate overall \(\lambda \) mimics the optimal bound, which would use a different value of \(\lambda \) for every term.
To investigate the conditions under which each of the two factors in (37) is bounded, we immediately have, for the first factor and for all \(i=1,\ldots , D\), that
Our goal is to optimize the above constraint for the summability exponent p. To this end, we will minimize the right-hand side of (38), observing that it decreases with respect to \(r_{{\mathrm {FEM}}}(s{{\varvec{q}}})_i\). Hence, recalling the dependence of \(r_{{\mathrm {FEM}}}(s{{\varvec{q}}})_i\) (cf. Lemma 8) on the vector of weights \({{\varvec{q}}}\), we consider
and observe that the second argument of the min is maximized with respect to \({{\varvec{q}}}\) by making \(\frac{s q_i}{\gamma _i d_i}\) constant over i, i.e.,
With this optimal choice, then (38) becomes simply
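The constant-ratio condition determines \({{\varvec{q}}}\) up to normalization. A minimal sketch (Python; the values of \(\gamma_i\), \(d_i\) and s are hypothetical) verifies that taking \(q_i\) proportional to \(\gamma_i d_i\) satisfies both \(|{{\varvec{q}}}|=1\) and the constancy of \(s q_i/(\gamma_i d_i)\) over i:

```python
# Hypothetical per-direction parameters: solver-work exponents gamma_i and
# subdomain dimensions d_i (these sample values are not from the paper).
gamma = [1.0, 1.5, 2.0]
dims  = [1.0, 1.0, 1.0]
s = 2.0

# Weights proportional to gamma_i * d_i, normalized so that |q| = 1.
total = sum(g * d for g, d in zip(gamma, dims))
q = [g * d / total for g, d in zip(gamma, dims)]

assert abs(sum(q) - 1.0) < 1e-14
ratios = [s * qi / (g * d) for qi, g, d in zip(q, gamma, dims)]
assert max(ratios) - min(ratios) < 1e-14   # s*q_i/(gamma_i*d_i) constant over i
```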
For the second factor, denoting the generic term of the inner sum as \(a_{j,k}\) for brevity and observing that \(a_{j,1} = 1\) for every j, we have
We thus only have to discuss the convergence of the sum
where the last step is a consequence of the fact that, for Clenshaw–Curtis points, \(m(k) \ge 2^{k-1}\) for \(k \ge 1\), cf. (20). Moreover, \((1-\lambda ) \theta _{s,j} + \chi _{s,j} \ge 0\). To study the summability of (40), we want to use Lemma 9 to bound the inner sum in (40). First, we rewrite
where we have used the notation in Lemma 9. Note that this lemma holds true under the assumptions that \(a>0\) and \( 0 \le c < 2a\), where the latter has to be verified as follows
which is true whenever
due to (39), \(\theta _{s,j} \ge 0\) and \(\lambda \le 1\). Define \(\overline{{\mathcal {J}}} = \{ j \ge 1 \,:\, \chi _{s,j} \le r_{{\mathrm{det}}}(s)\}\) which has a finite cardinality since \(\chi _{s,j} \rightarrow \infty \) as \(j \rightarrow \infty \). Resuming from (40), we have, due to Lemma 9,
where \(C(\overline{{\mathcal {J}}})\) is bounded, since \(\# \overline{{\mathcal {J}}} < \infty \), and \(\varepsilon (a,2,c)\) is a monotonically decreasing function with limit \(c = (1-p) \log 2\) independent of j. Therefore, the previous series converges if and only if
converges. Inserting the expression of \(\theta _{s,j}\) and \(\chi _{s,j}\), we get
After applying the Hölder inequality in the previous summation with exponents \(\eta _1^{-1}+\eta _2^{-1}=1\), we need to simultaneously ensure the boundedness of the following sums:
Recalling the summability result in Lemma 7, we understand that the following two conditions must hold:
which closes the discussion of the summability of the second factor of (37) for a fixed s. Recalling the constraint (39) coming from the first factor of (37), we finally have to solve the following optimization problem:
i.e., we have to choose \(\eta _2\) and \(\lambda \) to minimize the lower bound on p. We first optimally select \(\eta _2\) given \(\lambda \), i.e., we take \(\eta _2 = \eta _2^*\) such that
Substituting back, we obtain
so that the minimization problem reads
Now, we recall that \(p_0 \le p_s\). Hence, \(f_2(\lambda , s)\) is increasing in \(\lambda \). Conversely, \(f_{1}(\lambda ,s)\) is decreasing in \(\lambda \), since \(r_{{\mathrm {det}}}(s)\) is a positive number. Furthermore, notice that we cannot have \(f_{1}(\lambda ,s)<f_2(\lambda ,s)\) for all \(\lambda \in [0,1]\). Indeed, since \(f_1\) is decreasing and \(f_2\) is increasing, this condition is equivalent to \(f_{1}(0, s) \le f_2(0, s)\), i.e., \(1 \le \frac{p_0}{1-p_0}\), which is equivalent to \(p_0 \ge \frac{1}{2}\) and thus violates Assumption A2. Note that, in this case, the lower bound for p in (41) would be minimized for \(\lambda =0\), implying \(p> \frac{p_0}{1-p_0} > 1\), i.e., the method would not converge (cf. the statement of Theorem 4). Thus, we have only two cases (see also Fig. 3):
- Case 1 :
-
\(f_{1}(\lambda ,s)>f_2(\lambda , s)\) for all \(\lambda \in [0,1]\), which means that the convergence of the method is dictated by the spatial discretization. Given that \(f_{1}\) is decreasing and \(f_2\) is increasing, the previous condition is equivalent to \(f_1(1, s) \ge f_2(1, s)\), i.e., \( r_{{\mathrm {det}}}(s) \le \frac{1}{p_s} - 2\). In this case, the lower bound (41) is minimized for \(\lambda =1\), and we have \(p > \left( r_{{\mathrm {det}}}(s) + 1\right) ^{-1} \).
- Case 2 :
-
There exists \(\lambda ^* \in (0,1)\) such that \(f_{1}(\lambda ,s)=f_2(\lambda , s)\). This condition is equivalent to the two conditions
$$\begin{aligned} {\left\{ \begin{array}{ll} f_{1}(0,s) \ge f_2(0, s) \\ f_{1}(1,s) \le f_2(1, s) \end{array}\right. } \Rightarrow {\left\{ \begin{array}{ll} 1 \le \frac{1}{p_0} - 1 \\ r_{{\mathrm {det}}}(s) \ge \frac{1}{p_s} - 2. \end{array}\right. } \end{aligned}$$
Solving for \(\lambda ^*\) yields
$$\begin{aligned} \lambda ^*&= \frac{\frac{1}{p_0}-2}{r_{{\mathrm {det}}}(s) + \frac{1}{p_0} - \frac{1}{p_s}} > 0 \end{aligned}$$
which yields \(p > \overline{p}\), where
$$\begin{aligned} \begin{aligned} \overline{p}&= \left( \left( \frac{\frac{1}{p_0}-2}{r_{{\mathrm {det}}}(s) + \frac{1}{p_0} - \frac{1}{p_s}}\right) r_{{\mathrm {det}}}(s) + 1\right) ^{-1}\\&= \left( 1 + \left( \frac{1}{p_0}-2 \right) \left( 1 + \frac{1}{r_{{\mathrm {det}}}(s)} \left( \frac{1}{p_0} - \frac{1}{p_s} \right) \right) ^{-1}\right) ^{-1}. \end{aligned} \end{aligned}$$
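The algebraic equivalence of the two expressions for \(\overline{p}\), and the fact that \(\lambda^* \in (0,1)\) for admissible parameters, can be checked numerically; the sample values of \(p_0\), \(p_s\) and \(r_{\mathrm{det}}(s)\) below are hypothetical but satisfy the conditions of Case 2:

```python
def lam_star(p0, ps, r_det):
    # lambda* = (1/p0 - 2) / (r_det + 1/p0 - 1/ps)
    return (1 / p0 - 2) / (r_det + 1 / p0 - 1 / ps)

def pbar_first(p0, ps, r_det):
    # first form: (lambda* r_det + 1)^(-1)
    return 1.0 / (lam_star(p0, ps, r_det) * r_det + 1.0)

def pbar_second(p0, ps, r_det):
    # second form: (1 + (1/p0 - 2)(1 + (1/p0 - 1/ps)/r_det)^(-1))^(-1)
    return 1.0 / (1.0 + (1 / p0 - 2) / (1.0 + (1 / p0 - 1 / ps) / r_det))

# p0 < 1/2 and r_det >= 1/ps - 2, i.e., the Case 2 conditions hold
p0, ps, r_det = 0.3, 0.4, 2.0
assert 0.0 < lam_star(p0, ps, r_det) < 1.0
assert abs(pbar_first(p0, ps, r_det) - pbar_second(p0, ps, r_det)) < 1e-12
```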
Since the previous computations were carried out for a fixed s, we can take the minimum over all possible values of s. Then, we can apply Theorem 4 and derive the convergence estimate,
where \(1-1/p = 1 - \max _{s=0,\ldots , s_{\max }}(1/{\overline{p}}) + \delta \) for any \(\delta > 0\), which we reformulate as
with \(r_{{\mathrm {MISC}}} = \max _{s=0,\ldots , s_{\max }}({1}/{\overline{p}}) -1\). \(\square \)
5 Analysis of Example 1
In this section, we determine the value of \(s_{\max }\) and the sequence \(\{p_s\}_{s=0}^{s_{\max }}\) for Example 1. Since we work with localized quantities of interest far from the boundary, cf. (45) below, we believe that the effect of the boundary is negligible and that the regularity \(s_{\max }\) is mainly limited by the summability properties of \(\kappa \). See “Appendix 2” for a slightly modified problem on the same domain for which we can prove that the regularity is limited only by the summability properties of \(\kappa \). Let us define the following family of auxiliary functions,
so that \(\kappa \) from (8) can be written as
Based on this expression, for \(s \ge 0\), we analyze the summability of \(\left\{ A_{{{\varvec{k}}}} \Vert D^{{{\varvec{s}}}}\Upsilon _{{{\varvec{k}}}, {{\varvec{\ell }}}}\Vert _{L^\infty ({\mathscr {B}})} \right\} \) for \(|{{\varvec{s}}}| = s\) to determine the permissible values of \(p_s\). First, for \(|{{\varvec{s}}}| = s\), observe that for a constant c independent of \({{\varvec{k}}}\) we have
Then, for all \(s \ge 0\), we have
From here, we obtain the bound
for all \(s \ge 0\). Moreover, imposing \(p_0 < \frac{1}{2}\) and \(p_{s_{\max }} < \frac{1}{2}\) gives the bounds
respectively. Since \(\Upsilon _{{{\varvec{k}}}, {{\varvec{\ell }}}} \in C^{\infty }({\mathscr {B}})\), the bounds in (43) are the only bounds on the value of \(s_{\max }\). To determine an upper bound on the value of \(r_{{\mathrm{MISC}}}\), up to a small \(\delta \), we set \(\gamma _1=\cdots =\gamma _D=\gamma \) (motivated by the fact that all subdomains \({\mathscr {B}}_i\) are equal), we substitute \(d_i=1\) for \(i=1,\ldots ,d\) and the lower bound of \( p_s\) in Theorem 10 and obtain after simplifying
Before continuing, we discuss the four branches of the previous expression. If \(s_{\max } \le d\), then only the first two branches are valid. Since the rates in these two branches increase with s, the maximum is achieved for \(s=s_{\max }\). If \(s_{\max } \ge d\), then, since the rates in the third and fourth branches decrease with s, the maximum is achieved for \(s=d\). Hence
In Fig. 4, we plot the upper bound of the rate of MISC work complexity, \(r_{{\mathrm{MISC}}}\), based on Theorem 10 and the following analysis variants:
-
Theory This is based on the summability properties of \(\left\{ A_{{{\varvec{k}}}} \Vert D^{{{\varvec{s}}}}\Upsilon _{{{\varvec{k}}}, {{\varvec{\ell }}}}\Vert _{L^\infty ({\mathscr {B}})} \right\} \). We also use the value \(r_{{\mathrm{FEM}},i}(s) = 2 \min \left( 1, \frac{s}{d}\right) \) for all \(i=1,\ldots ,d\). This is motivated by the fact that we expect to double the convergence rate of the underlying FEM, since we are considering the convergence of a smooth linear functional of the solution.
-
Square summability Motivated by the arguments in Lemma 15 in the appendix, we believe that our results may be improved by instead considering the summability properties of \(\left\{ A_{{{\varvec{k}}}}^2 \Vert D^{{{\varvec{s}}}}\Upsilon _{{{\varvec{k}}}, {{\varvec{\ell }}}}\Vert _{L^\infty ({\mathscr {B}})}^2 \right\} \) for \(|{{\varvec{s}}}| \le s\). Similar calculations yield the bounds
$$\begin{aligned} p_s > \left( \frac{2\nu }{d}+1 - \frac{2s}{d}\right) ^{-1}, \end{aligned}$$(44)
and the corresponding conditions, \(\nu > \frac{d}{2}\) and \(s_{\max } < \nu - \frac{d}{2}\).
-
Improved As mentioned in Remark 4, we could in principle make our results sharper by taking \(\tilde{p}_s=p_s\) instead of \(\tilde{p}_s=\frac{p_s}{1-p_s}\). The modifications of Theorem 10 to account for these rates are straightforward. Moreover, when considering square summability, the conditions become \(\nu > 0\) and \(s_{\max } < \nu \).
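For the square-summability variant, the conditions \(\nu > \frac{d}{2}\) and \(s_{\max } < \nu - \frac{d}{2}\) follow from (44) by one line of algebra, requiring that the lower bound on \(p_s\) stays below \(\frac{1}{2}\):

```latex
\left(\frac{2\nu}{d}+1-\frac{2s}{d}\right)^{-1} < \frac{1}{2}
\;\Longleftrightarrow\;
\frac{2\nu}{d}+1-\frac{2s}{d} > 2
\;\Longleftrightarrow\;
\frac{2(\nu-s)}{d} > 1
\;\Longleftrightarrow\;
s < \nu - \frac{d}{2},
```

so that \(s=0\) requires \(\nu > \frac{d}{2}\), and admissibility of all \(s \le s_{\max}\) requires \(s_{\max} < \nu - \frac{d}{2}\).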
We also include in Fig. 4 the observed convergence rates of MISC when applied to the example with different values of \(\nu \), as discussed below in Sect. 6, and the observed convergence rate of MISC when applied to the same problem with no random variables and a constant diffusion coefficient, a. In the latter case, MISC reduces to a deterministic combination technique [7]. Note that the rate of MISC with no random variables is an upper bound for the convergence rate of MISC with any \(\nu >0\).
From this figure, we clearly see that the rates predicted by our theory are pessimistic when compared to the observed rates, and that the suggested analysis using the square summability or the improved rates, \(\tilde{p}_s\), might yield sharper bounds for the predicted work rates. On the other hand, we know from our previous work [22, Theorem 1] that the work rate degrades by a logarithmic factor as \(d\) increases; in fact, the expected work rate for maximum regularity when the number of random variables is finite is \({{\mathcal {O}}\left( W_{\max }^{-2} \log (W_{\max })^{d-1} \right) }\). This can be seen in Fig. 4 for \(d=3\), since in this case the observed work rate seems to converge to a value less than 2.
6 Numerical Experiments
We now verify the effectiveness of the MISC approximation on some instances of the general elliptic equation (1), as well as the validity of the convergence analysis detailed in the previous sections. In particular, we consider the domain \({\mathscr {B}} = [0,1]^d\) and the family of random diffusion coefficients specified in Example 1. More specifically, we consider a problem with one physical dimension (\(d=1\)) and another with three dimensions (\(d=3\)); in both cases, we set \(\varsigma ({{\varvec{x}}})=1\) and model the diffusion coefficient by the expansion (8) with different values of \(\nu \). Finally, the quantity of interest is a local average defined as
with \(\sigma =0.2\) and location \({{\varvec{x}}}_0=0.3\) for \(d=1\) and \({{\varvec{x}}}_0= [0.3, 0.2,0.6]\) for \(d=3\). The deterministic problems are discretized with a second-order centered finite difference scheme, for which we expect to recover in the numerical experiments the same rate that we would obtain with piecewise multi-linear finite elements on a structured mesh. We choose the mesh sizes as in (17) with \(h_{0,i}=1/3\) for all \(i=1,\ldots ,d\), and the resulting linear system is solved with GMRES to sufficient accuracy. With these values and using the coarsest discretization, \(h_0=1/3\), in all dimensions, the coefficient of variation of the quantity of interest is approximately between \(90\%\) and \(110\%\), depending on the number of dimensions, \(d\), and the particular value of the parameter, \(\nu \), that we consider below. Finally, the quadrature points on the stochastic domain are the already-mentioned Clenshaw–Curtis points (see (19) and (20)).
In the plots below, the computational work is compared in terms of the total number of degrees of freedom to avoid discrepancies in running time due to implementation details, i.e., using (27) and Assumption A3. Moreover, we set \(\gamma =1\) in (30), which is motivated by the fact that, for the tolerances we are interested in, we estimate that the cost of solving a linear system with GMRES is linear with respect to the number of degrees of freedom.
In order to evaluate the MISC estimator, we need to build the index set (29). To do that, we must be able to evaluate two quantities for every \({{{\varvec{\alpha }}}}\) and \({{{\varvec{\beta }}}}\): the work contribution, \(\Delta W_{{{{\varvec{\alpha }}}}, {{{\varvec{\beta }}}}}\), and the error contribution, \(\Delta E_{{{{\varvec{\alpha }}}}, {{{\varvec{\beta }}}}}\). Evaluating the work contribution is straightforward thanks to Assumption A3 and using \(\gamma =1\). On the other hand, evaluating the error contribution is more involved. We look at two options:
-
“brute-force” evaluation: We compute \({{\varvec{\Delta }}}[F_{{{{\varvec{\alpha }}}},{{{\varvec{\beta }}}}}]\) for all \(({{{\varvec{\alpha }}}}, {{{\varvec{\beta }}}})\) within some “universe” index set and set \(\Delta E_{{{{\varvec{\alpha }}}}, {{{\varvec{\beta }}}}} = \left| {{\varvec{\Delta }}}[F_{{{{\varvec{\alpha }}}},{{{\varvec{\beta }}}}}] \right| \). Notice that this method is not practical since the cost of constructing the set, \({\mathcal {I}}\), would far dominate the cost of the MISC estimator. However, within some “universe” index set, this method would produce the best possible convergence and serve as a benchmark for other MISC sets within that universe.
-
“a-priori” evaluation: We use Lemma 8 to bound \(\Delta E_{{{{\varvec{\alpha }}}}, {{{\varvec{\beta }}}}}\). Using these bounds instead of exact values produces quasi-optimal index sets (cf. [3, 33]). This method in turn requires the estimation of the parameters \(r_{{\mathrm{FEM}}}, \left\{ g_{s, j}\right\} _{j \ge 1}\) for all \(s=0,\ldots ,s_{\max }\). Since we use a second-order centered finite difference scheme and consider the convergence of a quantity of interest, we expect \(r_{{\mathrm{FEM}}}=2 \min \left( 1, \frac{\nu }{d}\right) \) as motivated by the “improved” analysis in the previous section and considering the summability properties of \(\left\{ A_{{{\varvec{k}}}}^2 \Vert D^{{{\varvec{s}}}}\Upsilon _{{{\varvec{k}}}, {{\varvec{\ell }}}}\Vert _{L^\infty ({\mathscr {B}})}^2 \right\} \). This can also be validated numerically in the usual way by fixing all random variables to their expected value and checking the decay of \(\Delta E_{{{{\varvec{\alpha }}}}, \mathbf{1}}\) with respect to \({{{\varvec{\alpha }}}}\).
On the other hand, estimating \(\left\{ g_{s, j}\right\} _{j \ge 1}\) for \(s=0,\ldots ,s_{\max }\) is more difficult since, in principle, we do not know a priori, for a given \({{{\varvec{\alpha }}}}\) and \({{{\varvec{\beta }}}}\), which value of \(s \in \{0,1,\ldots ,s_{\max }\}\) yields the smallest estimate of \(\Delta E_{{{{\varvec{\alpha }}}}, {{{\varvec{\beta }}}}}\). Instead, we use a “simplified” model that was used in [22]:
$$\begin{aligned} \Delta E_{{{{\varvec{\alpha }}}},{{{\varvec{\beta }}}}} \le C e^{-\sum _{j\ge 1} m(\beta _j-1)\tilde{g}_{j}} 2^{-|{{{\varvec{\alpha }}}}|r_{{\mathrm {FEM}}}}, \end{aligned}$$(46)
where \(\tilde{g}_j\) is some unknown function of \(g_{s, j}\) for all \(s = 0,1,\ldots , s_{\max }\). \(\tilde{g}_j\) can be estimated, given \(r_{{\mathrm{FEM}}}\) and a set of evaluations of \(\left| {{\varvec{\Delta }}}[F_{{{{\varvec{\alpha }}}},{{{\varvec{\beta }}}}}] \right| \) for some \(({{{\varvec{\alpha }}}}, {{{\varvec{\beta }}}}) \in {\mathcal {I}}^*\), by solving a least-squares problem to fit the linear model
$$\begin{aligned} \sum _{j\ge 1} \tilde{g}_j m(\beta _j-1) = -\log \left( \left| {{\varvec{\Delta }}}[F_{{{{\varvec{\alpha }}}},{{{\varvec{\beta }}}}}] \right| \right) - |{{{\varvec{\alpha }}}}| r_{{\mathrm{FEM}}},\qquad {\text {for all }} ({{{\varvec{\alpha }}}}, {{{\varvec{\beta }}}}) \in {\mathcal {I}}^*.\end{aligned}$$
For our example, these rates are plotted in Figs. 5a and 6a for \(d=1\) and \(d=3\), respectively. In our current implementation, the construction of the optimal MISC set, \({\mathcal {I}}\), is separate from the set \({\mathcal {I}}^*\). However, it is possible in principle to construct an algorithm in which the optimal MISC set, \({\mathcal {I}}\), is constructed iteratively by alternating between estimating rates given a set of indices and evaluating the MISC estimator.
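As an illustration of this fitting procedure, the sketch below (Python) solves the least-squares problem on synthetic data; the doubling rule m and the index set \({\mathcal {I}}^*\) are assumptions made for the example, not the paper's exact choices:

```python
import numpy as np

# Doubling rule for Clenshaw-Curtis levels; this particular m is a common
# choice and only an assumption here (the paper's m is defined in its eq. (20)).
def m(k):
    return 0 if k == 0 else 1 if k == 1 else 2**(k - 1) + 1

r_fem = 2.0
g_true = np.array([1.0, 0.5, 0.25])  # hypothetical "true" stochastic rates

# Hypothetical index set I*; the observations of log|Delta F| are generated
# exactly from the linear model above, so the fit should recover g_true.
index_set = [((1,), (2, 1, 1)), ((2,), (1, 2, 1)), ((1,), (1, 1, 2)),
             ((2,), (3, 1, 1)), ((1,), (2, 2, 1)), ((3,), (1, 1, 3))]
A, rhs = [], []
for alpha, beta in index_set:
    row = [m(bj - 1) for bj in beta]
    log_absF = -sum(g * mv for g, mv in zip(g_true, row)) - sum(alpha) * r_fem
    A.append(row)
    rhs.append(-log_absF - sum(alpha) * r_fem)  # = -log|Delta F| - |alpha| r_FEM

g_fit, *_ = np.linalg.lstsq(np.array(A, dtype=float), np.array(rhs), rcond=None)
assert np.allclose(g_fit, g_true)
```

In practice the observed \(\left| {{\varvec{\Delta }}}[F_{{{{\varvec{\alpha }}}},{{{\varvec{\beta }}}}}] \right| \) are noisy, so the least-squares residual is nonzero and the fit is only approximate.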
Note that, in the current work, there are certain operations whose costs we do not track or compare. The first is the estimation of the stochastic rates, \(\left\{ \tilde{g}_j\right\} _{j\ge 1}\). The second is the construction of the optimal set given estimates of the error and work contributions. We believe that the cost of these two operations can be reduced by using the previously mentioned iterative algorithm, so that it is dominated by the cost of evaluating the MISC estimator. The third is the assembly of the stiffness matrix, whose cost scales linearly with the number of random variables. While this cost is relevant to our discussion, it is usually dominated by the cost of the linear solver, at least for fine-enough discretizations.
Finally, we also compare MISC to the Multi-index Monte Carlo (MIMC) method as detailed in [23], for which \({{\mathcal {O}}\left( W_{\max }^{-0.5} \right) }\) convergence can be proved for Example 1 with \(\gamma =1, d\le 3\) and sufficiently large \(\nu \) (see “Appendix 1”). Moreover, when computing errors, we use the result obtained using a well-resolved MISC solution as a reference value.
Figures 5b–d and 6b–d compare some computed values of \(\left| {{\varvec{\Delta }}}[F_{{{{\varvec{\alpha }}}},{{{\varvec{\beta }}}}}] \right| \) with the model (46), using the estimated rates \(r_{\mathrm{FEM}}=2 \min \left( 1, \frac{\nu }{d}\right) \) and \(\left\{ \tilde{g}_j\right\} _{j\ge 1}\). These plots show that the model (46) is a good fit for the cases \(d=1, \nu =2.5\) and \(d=3, \nu =4.5\). Similar plots, not reported here, were produced for other values of \(d\) and \(\nu \) and also show a good fit. Figures 7 and 8 show
-
the maximum spatial discretization level, \(\max _{({{{\varvec{\alpha }}}}, {{{\varvec{\beta }}}}) \in {\mathcal {I}}} \max ({{{\varvec{\alpha }}}})\),
-
the maximum quadrature level, \(\max _{({{{\varvec{\alpha }}}}, {{{\varvec{\beta }}}}) \in {\mathcal {I}}} \max ({{{\varvec{\beta }}}})\),
-
the index of the last activated random variable, \(\max _{({{{\varvec{\alpha }}}}, {{{\varvec{\beta }}}}) \in {\mathcal {I}}} \max _{\beta _j > 1} j\),
-
and the maximum number of jointly activated variables, \(\max _{({{{\varvec{\alpha }}}}, {{{\varvec{\beta }}}}) \in {\mathcal {I}}} |{{{\varvec{\beta }}}}-\mathbf{1}|_0\).
These values convey the size of the index set, \({\mathcal {I}}\), used for different values of \(W_{\max }\).
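These four diagnostics are straightforward to extract from an index set; a minimal sketch (Python) on a hypothetical set of \(({{{\varvec{\alpha }}}}, {{{\varvec{\beta }}}})\) pairs:

```python
# Hypothetical MISC index set: pairs (alpha, beta) of multi-indices.
I = [((1, 1, 1), (1, 1, 1, 1)),
     ((3, 2, 1), (2, 1, 1, 1)),
     ((2, 2, 2), (3, 2, 1, 1)),
     ((1, 1, 2), (2, 2, 1, 2))]

# Maximum spatial discretization level: max over all alpha entries.
max_spatial = max(max(alpha) for alpha, _ in I)
# Maximum quadrature level: max over all beta entries.
max_quad = max(max(beta) for _, beta in I)
# Index (1-based) of the last activated random variable, i.e., last j with beta_j > 1.
last_active = max((j + 1 for _, beta in I for j, bj in enumerate(beta) if bj > 1),
                  default=0)
# Maximum number of jointly activated variables: max of |beta - 1|_0.
max_joint = max(sum(1 for bj in beta if bj > 1) for _, beta in I)

assert (max_spatial, max_quad, last_active, max_joint) == (3, 3, 4, 3)
```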
Figure 4 shows the observed convergence rates of MISC for the cases \(d=1\) and \(d=3\) and different values of \(\nu \). This figure shows that the observed rates are better than those predicted by the theory developed in this work, which suggests that further improvement of the theory is possible (see Remark 4). Figures 9 and 10 show in greater detail the observed convergence curves for \(d=1, \nu =2.5\) and \(d=3, \nu =4.5\) and their respective linear fits in log-log scale.
We recall that, as shown in [22, Theorem 1], the convergence rate of MISC with a finite number of random variables is \({{\mathcal {O}}\left( W_{\max }^{-2} \log (W_{\max })^{d-1} \right) }\). Compare this to the theory presented here that predicts, as \(\nu \rightarrow \infty \), a convergence of \({{\mathcal {O}}\left( W_{\max }^{-2+\epsilon } \right) }\) for any \(\epsilon >0\). However, Fig. 9 shows that even for a problem with \(d=3\) and no random variables, MISC (which, in this case, becomes equivalent to a deterministic combination technique [7]) has an observed convergence rate that is closer to \(-1.38\). This is due to the effect of the logarithmic term that is nonzero for \(d>1\). Based on this, we should not expect a better convergence rate for \(d=3\) and any finite \(\nu > 0\). This is also numerically validated in Fig. 10, which shows the full convergence curves for \(d=1, \nu =2.5\) and \(d=3, \nu =4.5\).
7 Conclusions
In this work, we analyzed the performance of the MISC method when applied to linear elliptic PDEs depending on a countable sequence of random variables. For ease of presentation, we worked on tensor product domains, but the results can be extended to more general domains and non-uniform meshes, as briefly mentioned in Sect. 3. We proved a convergence result using a summability argument showing that, in certain cases, the convergence of the method is essentially dictated by the convergence properties of the deterministic solver. We then applied the convergence theorem to derive convergence rates for the approximation of the expected value of a functional of the solution of an elliptic PDE with a diffusion coefficient described by a random field, tracking the dependence of the convergence rate on the spatial regularity of the realizations of the random field. The theoretical findings are backed up by numerical experiments that show the dependence of the convergence rate on the regularity parameter. Future work includes extending the convergence analysis to higher-order finite element solvers and improving the estimates of the error contribution of each difference operator by taking into account the factorial terms appearing in the estimates for the size of the Chebyshev coefficients, cf. [3, 11]. Moreover, the ideas in [12] can be extended to design an algorithm that iteratively estimates the optimal MISC set by alternating between optimizing the set and evaluating the estimator, ensuring that the work to optimize the set is dominated by the work to evaluate the MISC estimator.
Notes
Recall that, given \(q \ge 1\), \(L^q(\Gamma ;V) = \left\{ v : \Gamma \rightarrow V \text { strongly measurable, such that } \int _\Gamma \left\| v \right\| _{V} ^q~{\text {d}}\mu < \infty \right\} \).
We recall that \(H^{{{\varvec{l}}}}({\mathscr {B}})\) is the completion of formal sums \(v=\sum _{k=1}^{K} v_{1,k}v_{2,k}\cdots v_{D,k}\) with \(v_{i,k} \in H^{l_i}({\mathscr {B}}_i)\) with respect to the norm induced by the inner product
$$\begin{aligned} (v,w)_{H^{{{\varvec{l}}}}({\mathscr {B}})} = \sum _{k,i} (v_{1,k},w_{1,i})_{H^{l_1}({\mathscr {B}}_1)}(v_{2,k},w_{2,i})_{H^{l_2}({\mathscr {B}}_2)}\cdots (v_{D,k},w_{D,i})_{H^{l_D}({\mathscr {B}}_D)}. \end{aligned}$$
References
I. Babuška, F. Nobile, and R. Tempone, A stochastic collocation method for elliptic partial differential equations with random input data, SIAM Review, 52 (2010), 317–355.
A. Barth, C. Schwab, and N. Zollinger, Multi-level Monte Carlo finite element method for elliptic PDEs with stochastic coefficients, Numerische Mathematik, 119 (2011), 123–161.
J. Beck, F. Nobile, L. Tamellini, and R. Tempone, On the optimal polynomial approximation of stochastic PDEs by Galerkin and collocation methods, Mathematical Models and Methods in Applied Sciences, 22 (2012), 1250023.
M. Bieri, A sparse composite collocation finite element method for elliptic SPDEs., SIAM Journal on Numerical Analysis, 49 (2011), 2277–2301.
V. I. Bogachev, Measure Theory, Vol. 1, Springer Berlin Heidelberg, 2007.
H. J. Bungartz and M. Griebel, Sparse grids, Acta Numerica, 13 (2004), 147–269.
H. J. Bungartz, M. Griebel, D. Röschke, and C. Zenger, Pointwise convergence of the combination technique for the Laplace equation, East-West Journal of Numerical Mathematics, 2 (1994), 21–45.
J. Charrier, Strong and weak error estimates for elliptic partial differential equations with random coefficients, SIAM Journal on Numerical Analysis, 50 (2012), 216–246.
J. Charrier, R. Scheichl, and A. Teckentrup, Finite element error analysis of elliptic pdes with random coefficients and its application to multilevel Monte Carlo methods, SIAM Journal on Numerical Analysis, 51 (2013), 322–352.
A. Chkifa, On the Lebesgue constant of Leja sequences for the complex unit disk and of their real projection, Journal of Approximation Theory, 166 (2013), 176–200.
A. Cohen, R. Devore, and C. Schwab, Analytic regularity and polynomial approximation of parametric and stochastic elliptic PDE’S, Analysis and Applications, 9 (2011), 11–47.
N. Collier, A.-L. Haji-Ali, F. Nobile, E. von Schwerin, and R. Tempone, A continuation multilevel Monte Carlo algorithm, BIT Numerical Mathematics, 55 (2015), 399–432.
G. M. Constantine and T. H. Savits, A multivariate Faà di Bruno formula with applications, Transactions of the American Mathematical Society, 348 (1996), 503–520.
D. Dũng and M. Griebel, Hyperbolic cross approximation in infinite dimensions, Journal of Complexity, 33 (2016), 55–88.
B. Ganapathysubramanian and N. Zabaras, Sparse grid collocation schemes for stochastic natural convection problems, Journal of Computational Physics, 225 (2007), 652–685.
M. B. Giles, Multilevel Monte Carlo path simulation, Operations Research, 56 (2008), 607–617.
W. J. Gordon and C. A. Hall, Construction of curvilinear co-ordinate systems and applications to mesh generation, International Journal for Numerical Methods in Engineering, 7 (1973), 461–477.
I. G. Graham, R. Scheichl, and E. Ullmann, Mixed finite element analysis of lognormal diffusion and multilevel Monte Carlo methods, Stochastic Partial Differential Equations: Analysis and Computations, (2015), 1–35.
M. Griebel and H. Harbrecht, On the convergence of the combination technique, in Sparse Grids and Applications - Munich 2012, J. Garcke and D. Pflüger, eds., vol. 97 of Lecture Notes in Computational Science and Engineering, Springer International Publishing, 2014, 55–74.
M. Griebel and S. Knapek, Optimized general sparse grid approximation spaces for operator equations, Mathematics of Computation, 78 (2009), 2223–2257.
M. Griebel, M. Schneider, and C. Zenger, A combination technique for the solution of sparse grid problems, in Iterative Methods in Linear Algebra, P. de Groen and R. Beauwens, eds., IMACS, Elsevier, North Holland, 1992, pp. 263–281.
A.-L. Haji-Ali, F. Nobile, L. Tamellini, and R. Tempone, Multi-index stochastic collocation for random PDEs, Computer Methods in Applied Mechanics and Engineering, 306 (2016), 95–122.
A.-L. Haji-Ali, F. Nobile, and R. Tempone, Multi-index Monte Carlo: when sparsity meets sampling, Numerische Mathematik, 132 (2015), 767–806.
A.-L. Haji-Ali, F. Nobile, E. von Schwerin, and R. Tempone, Optimization of mesh hierarchies in multilevel Monte Carlo samplers, Stochastic Partial Differential Equations: Analysis and Computations, 4 (2015), 76–112.
H. Harbrecht, M. Peters, and M. Siebenmorgen, On multilevel quadrature for elliptic stochastic partial differential equations, in Sparse Grids and Applications, vol. 88 of Lecture Notes in Computational Science and Engineering, Springer, 2013, 161–179.
M. Hegland, J. Garcke, and V. Challis, The combination technique and some generalisations, Linear Algebra and its Applications, 420 (2007), 249–275.
S. Heinrich, Multilevel Monte Carlo methods, in Large-Scale Scientific Computing, vol. 2179 of Lecture Notes in Computer Science, Springer Berlin Heidelberg, 2001, 58–67.
T. J. R. Hughes, J. A. Cottrell, and Y. Bazilevs, Isogeometric analysis: CAD, finite elements, nurbs, exact geometry and mesh refinement, Computer Methods in Applied Mechanics and Engineering, 194 (2005), 4135–4195.
F. Y. Kuo, C. Schwab, and I. Sloan, Multi-level Quasi-Monte Carlo Finite Element Methods for a Class of Elliptic PDEs with Random Coefficients, Foundations of Computational Mathematics, 15 (2015), 411–449.
S. Martello and P. Toth, Knapsack problems: algorithms and computer implementations, Wiley-Interscience series in discrete mathematics and optimization, J. Wiley & Sons, 1990.
A. Narayan and J. D. Jakeman, Adaptive Leja Sparse Grid Constructions for Stochastic Collocation and High-Dimensional Approximation, SIAM Journal on Scientific Computing, 36 (2014), A2952–A2983.
F. Nobile, L. Tamellini, and R. Tempone, Comparison of Clenshaw-Curtis and Leja Quasi-Optimal Sparse Grids for the Approximation of Random PDEs, in Spectral and High Order Methods for Partial Differential Equations - ICOSAHOM ’14, R. M. Kirby, M. Berzins, and J. S. Hesthaven, eds., vol. 106 of Lecture Notes in Computational Science and Engineering, Springer International Publishing, 2015, 475–482.
F. Nobile, L. Tamellini, and R. Tempone, Convergence of quasi-optimal sparse-grid approximation of Hilbert-space-valued functions: application to random elliptic PDEs, Numerische Mathematik, 134(2) (2016), 343–388.
F. Nobile, R. Tempone, and C. Webster, An anisotropic sparse grid stochastic collocation method for partial differential equations with random input data, SIAM Journal on Numerical Analysis, 46 (2008), 2411–2442.
F. Nobile, R. Tempone, and C. Webster, A sparse grid stochastic collocation method for partial differential equations with random input data, SIAM Journal on Numerical Analysis, 46 (2008), 2309–2345.
C. Schillings and C. Schwab, Sparse, adaptive Smolyak quadratures for Bayesian inverse problems, Inverse Problems, 29 (2013), 065011.
A. Teckentrup, P. Jantsch, C. G. Webster, and M. Gunzburger, A Multilevel Stochastic Collocation Method for Partial Differential Equations with Random Input Data, SIAM/ASA Journal on Uncertainty Quantification, 3 (2015), 1046–1074.
H. W. van Wyk, Multilevel sparse grid methods for elliptic partial differential equations with random coefficients, arXiv preprint arXiv:1404.0963, 2014.
G. W. Wasilkowski and H. Wozniakowski, Explicit cost bounds of algorithms for multivariate tensor product problems, Journal of Complexity, 11 (1995), 1–56.
D. Xiu and J. Hesthaven, High-order collocation methods for differential equations with random inputs, SIAM Journal on Scientific Computing, 27 (2005), 1118–1139.
C. Zenger, Sparse grids, in Parallel Algorithms for Partial Differential Equations, W. Hackbusch, ed., vol. 31 of Notes on Numerical Fluid Mechanics, Vieweg, 1991, pp. 241–251.
Acknowledgments
F. Nobile and L. Tamellini received support from the Center for ADvanced MOdeling Science (CADMOS) and partial support by the Swiss National Science Foundation under the Project No. 140574 “Efficient numerical methods for flow and transport phenomena in heterogeneous random porous media”. L. Tamellini also received support from the Gruppo Nazionale Calcolo Scientifico - Istituto Nazionale di Alta Matematica “Francesco Severi” (GNCS-INDAM). R. Tempone is a member of the KAUST Strategic Research Initiative, Center for Uncertainty Quantification in Computational Sciences and Engineering. R. Tempone received support from the KAUST CRG3 Award Ref: 2281.
Communicated by Albert Cohen.
Appendices
Appendix 1: Summability of Series Expansion
We start by recalling a useful multivariate Faà di Bruno formula taken from [13, Theorem 2.1].
Lemma 11
Let \({\mathscr {B}}\subset {\mathbb {R}}^d\) be an open domain, let \(g:{\mathscr {B}}\rightarrow {\mathbb {R}}\) and \(f:{\mathbb {R}}\rightarrow {\mathbb {R}}\) be of class \(C^s\) on their respective domains, and denote \(h=f\circ g : {\mathscr {B}}\rightarrow {\mathbb {R}}\). For any multi-index \({{\varvec{i}}}\in {\mathbb {N}}^d\), \(|{{\varvec{i}}}| \le s\), and any \({{\varvec{x}}}\in {\mathscr {B}}\),
holds, where
and \(\prec \) denotes the lexicographic ordering of multi-indices. The set \(p_r({{\varvec{i}}},\lambda )\) collects the possible decompositions of \({{\varvec{i}}}\) as a sum of \(\lambda \) multi-indices featuring \(r\le \lambda \) distinct multi-indices, \({\varvec{\ell }}_j\), each taken with multiplicity \(k_j\), such that \(\sum _{j=1}^r k_j=\lambda \).
Also from [13, Corollary 2.9], we have that, for any \({{\varvec{i}}}\in {\mathbb {N}}^d\),
where \(S_{n,k}\) is the Stirling number of the second kind, which counts the number of ways to partition a set of n objects into k non-empty subsets. Similarly, the Bell number, \(B_n=\sum _{k=0}^{n} S_{n,k}\), counts the number of partitions of a set of n objects, whereas the ordered Bell numbers are defined by \(\tilde{B}_n = \sum _{k=0}^{n} k! S_{n,k}\) and satisfy the recursive relation \(\tilde{B}_n = \sum _{k=0}^{n-1} \left( {\begin{array}{c}n\\ k\end{array}}\right) \tilde{B}_k\). Clearly, \(B_n\le \tilde{B}_n\). Moreover, the bound
was given in [3, Lemma A.3]. We now use these results to prove the following lemma.
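These combinatorial quantities are easy to tabulate. The following Python sketch (purely illustrative, not part of the paper) computes Stirling numbers of the second kind via the standard recurrence and checks both the recursive characterization of the ordered Bell numbers and the inequality \(B_n\le \tilde{B}_n\):

```python
from math import comb, factorial

def stirling2(n, k):
    """Stirling number of the second kind S(n, k),
    via S(n, k) = k*S(n-1, k) + S(n-1, k-1)."""
    if n == k:
        return 1
    if n == 0 or k == 0:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)

def bell(n):
    # Bell number: number of partitions of a set of n objects
    return sum(stirling2(n, k) for k in range(n + 1))

def ordered_bell(n):
    # Ordered Bell number via its Stirling-number definition
    return sum(factorial(k) * stirling2(n, k) for k in range(n + 1))

def ordered_bell_rec(n):
    # Same quantity via the recursion with binomial coefficients
    return 1 if n == 0 else sum(comb(n, k) * ordered_bell_rec(k) for k in range(n))
```

For instance, \(B_4=15\) and \(\tilde{B}_4=75\), and the two expressions for \(\tilde{B}_n\) agree for all small n.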
Lemma 12
Let \({\mathscr {B}} \subset {\mathbb {R}}^d\) be an open, bounded domain and let \(\kappa \in C^s(\overline{{\mathscr {B}}})\) (real or complex valued) for \(s \ge 0\). Then, \(a = e^\kappa \in C^s(\overline{{\mathscr {B}}})\), and we have the estimate
Proof
Using formula (47), we have for any \({{\varvec{i}}}\in {\mathbb {N}}^d\), \(|{{\varvec{i}}}|\le s\) and any \({{\varvec{x}}}\in {\mathscr {B}}\)
The result then follows from the bound on the Bell numbers in (48). \(\square \)
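In one dimension, the Faà di Bruno formula for \(h=e^g\) reduces to \(h^{(n)} = e^{g}\,B_n(g',\ldots ,g^{(n)})\), where \(B_n\) denotes the complete Bell polynomial; evaluating all arguments at 1 recovers the Bell number \(B_n\), the combinatorial constant behind the bound above. The following Python sketch (an illustration of this classical identity, not code from the paper) tabulates these polynomials via the standard recurrence:

```python
import math
from math import comb

def complete_bell(d):
    """Complete Bell polynomials B_0..B_n evaluated at d = [g', g'', ..., g^{(n)}],
    via the recurrence B_{m+1} = sum_{k=0}^{m} C(m, k) * B_{m-k} * d[k]."""
    B = [1.0]
    for m in range(len(d)):
        B.append(sum(comb(m, k) * B[m - k] * d[k] for k in range(m + 1)))
    return B

def exp_derivatives(g_derivs, g_value):
    # n-th derivative of exp(g) at a point: exp(g) * B_n(g', ..., g^{(n)})
    return [math.exp(g_value) * b for b in complete_bell(g_derivs)]
```

For instance, with \(g(x)=x^2\) at \(x=1/2\) (so \(g=1/4\), \(g'=1\), \(g''=2\), higher derivatives zero), one can check by hand that \(\frac{d^3}{dx^3}e^{x^2} = (12x+8x^3)e^{x^2} = 7e^{1/4}\) there, in agreement with the recurrence.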
1.1 \(L^p(\Gamma )\) Summability, Pointwise in Space
We now consider a diffusion coefficient as in Assumption A2:
with \(y_j\), \(j \ge 1\), independent random variables, all uniformly distributed in \([-1,1]\), and recall the definition in (6) of the sequence \({{\varvec{b}}}_{s}=\{b_{s,j}\}_{j\ge 1}\) for all \(s\in {\mathbb {N}}\).
Lemma 13
If \({{\varvec{b}}}_0\in \ell ^2\), then \({\mathbb {E}}\left[ a({{\varvec{x}}})^p\right] <\infty \) for all \(0<p<\infty \) and all \({{\varvec{x}}}\in {\mathscr {B}}\).
Proof
For any \({{\varvec{x}}}\in {\mathscr {B}}\), we estimate the p-th moment of \(a({{\varvec{x}}},{{\varvec{y}}})\), exploiting the independence of the random variables, \(y_j\):
where in the last two equalities we have implicitly assumed that \(\sinh (z)/z=1\) for \(z=0\). Setting \(\theta _0(p;{{\varvec{x}}}) = \prod _{j=1}^\infty \frac{\sinh (p\psi _j({{\varvec{x}}}))}{p\psi _j({{\varvec{x}}})}\) and observing that \(\log (\sinh (z)/z) \le z^2/6\) for all \(z\ge 0\), we have
Since \(\sum _{j=1}^\infty b_{0,j}^2 <\infty \) implies \(\sum _{j=1}^\infty \psi _j({{\varvec{x}}})^2 <\infty \) for any \({{\varvec{x}}}\in {\mathscr {B}}\), this concludes the proof. \(\square \)
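The one-dimensional identity underlying this computation, \({\mathbb {E}}[e^{cy}]=\sinh (c)/c\) for \(y\sim U[-1,1]\), and the bound \(\log (\sinh (c)/c)\le c^2/6\) are easy to verify numerically. The sketch below (an illustration under the uniform-variable assumption, not the paper's code) compares a midpoint-rule quadrature against the closed form, with \(c=p\,\psi _j({{\varvec{x}}})\) playing the role of each factor in the product:

```python
import math

def moment_quadrature(c, n=100001):
    # E[exp(c*y)] for y ~ Uniform[-1,1], midpoint rule with n cells
    h = 2.0 / n
    return 0.5 * h * sum(math.exp(c * (-1.0 + (i + 0.5) * h)) for i in range(n))

def moment_closed_form(c):
    # E[exp(c*y)] = sinh(c)/c, with the convention sinh(0)/0 = 1
    return math.sinh(c) / c if c != 0 else 1.0
```

The quadrature agrees with \(\sinh (c)/c\) to high accuracy, and \(\log (\sinh (c)/c)\) indeed stays below \(c^2/6\) for the tested values.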
A similar result holds for higher-order derivatives of a.
Lemma 14
Let \(s\in {\mathbb {N}}_+\). If \({{\varvec{b}}}_s\in \ell ^2\), then for any \({{\varvec{i}}}\in {\mathbb {N}}^d\) with \(|{{\varvec{i}}}|=s\), we have \({\mathbb {E}}\left[ (D^{{\varvec{i}}}a({{\varvec{x}}}))^{2p}\right] <\infty \) for all \(0<p<\infty \) and all \({{\varvec{x}}}\in {\mathscr {B}}\).
Proof
Since the calculations for general s are tedious, we prove the result here only for \(s=1\). Using the chain rule, we have
Hence,
We now distinguish between even or odd \(q_j\). For even \(q_j\), we have
while for \(q_j\) odd we have
Hence, defining the function
we have
from which we see that \({\mathbb {E}}\left[ (\partial _{x_i}a({{\varvec{x}}},{{\varvec{y}}}))^{2p}\right] \) is bounded for any \(0< p<\infty \) and any \({{\varvec{x}}}\in {\mathscr {B}}\) if \({{\varvec{b}}}_1\in \ell ^2\). \(\square \)
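To illustrate the mechanism on a hypothetical single-term field (a simplified example, not from the paper): if \(a(x,y)=e^{\psi (x)y}\), then \(\partial _x a = \psi '(x)\,y\,a\), and the even moments of \(y\,e^{cy}\) admit closed forms obtained by differentiating \(\sinh (c)/c\) with respect to c; for the second moment, \({\mathbb {E}}[y^2 e^{cy}] = \frac{(c^2+2)\sinh c - 2c\cosh c}{c^3}\). The sketch below checks this by quadrature:

```python
import math

def second_moment_quadrature(c, n=100001):
    # E[y^2 * exp(c*y)] for y ~ Uniform[-1,1], midpoint rule with n cells
    h = 2.0 / n
    total = 0.0
    for i in range(n):
        y = -1.0 + (i + 0.5) * h
        total += y * y * math.exp(c * y)
    return 0.5 * h * total

def second_moment_closed_form(c):
    # d^2/dc^2 [sinh(c)/c] = ((c^2 + 2)*sinh(c) - 2*c*cosh(c)) / c^3
    return ((c * c + 2.0) * math.sinh(c) - 2.0 * c * math.cosh(c)) / c ** 3
```

Since \(c=\psi (x)\) enters only through \(\sinh\) and \(\cosh\), these moments stay finite for every fixed c, consistent with the lemma.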
1.2 \(L^p(\Gamma )\) Summability, Uniform in Space
Assuming now that \({{\varvec{b}}}_s\in \ell ^2\) so that the random field, a, is s-times differentiable in an \(L^p(\Gamma )\) sense according to Lemma 14, we show that this implies some uniform \(L^p(\Gamma )\) summability as detailed in the next lemma.
Lemma 15
Let \(s\in {\mathbb {N}}_+\). If \({{\varvec{b}}}_s\in \ell ^2\), then \({\mathbb {E}}\left[ \Vert a\Vert ^p_{W^{\upsilon ,\infty }({\mathscr {B}})}\right] <\infty \) for all \(1\le p<\infty \) and all \(0\le \upsilon <s\).
Proof
We exploit the Sobolev embedding, \(W^{\upsilon +\frac{d}{2q},2q}({\mathscr {B}}) \subseteq W^{\upsilon ,\infty }({\mathscr {B}})\), for all \(\upsilon \ge 0\) and \(q\ge 1\). For \(q\ge \max \left\{ \frac{d}{2(s-\upsilon )},\frac{p}{2}\right\} \), we have
where the last term is bounded from Lemma 14. \(\square \)
Now, taking \(\upsilon =0\) in the previous result, we directly observe that \(a_{\max } = \Vert a\Vert _{L^\infty ({\mathscr {B}})}\) has bounded moments,
for all \(1\le p<\infty \) and \(s>0\). Finally, due to (2), we have that \(1/a_{\min } = \Vert a^{-1}\Vert _{L^\infty ({\mathscr {B}})}\) has the same distribution as \(a_{\max }\); as a consequence, \(1/a_{\min }\) has bounded moments as well. This implies in turn that (3) holds and thus problem (1) is well posed in the Bochner space, \(L^p\left( \Gamma ;H^1_0({\mathscr {B}})\right) \). That is,
Corollary 16
(Well-posedness with log-uniform coefficient) For \(\nu >0\), the problem in Example 1 is well posed in the Bochner space \(L^p\left( \Gamma ;H^1_0({\mathscr {B}})\right) \). The corresponding solution, u, satisfies
We observe that higher regularity of the solution, u, can be obtained by using larger values of s in Lemma 15. This in turn yields control on moments of the \(W^{\upsilon ,\infty }({\mathscr {B}})\) norms of the coefficient, a, and, following, for instance, estimates similar to (2.10) in [18, Theorem 2.4], we can estimate moments of the \(H^{1+s}({\mathscr {B}})\) norm of the solution, u. These regularity estimates, once combined with pathwise error estimates for the combination technique, can further be used to derive the corresponding \(\nu \)-dependent convergence rates of MIMC [23] for Example 1, similar to those presented in Sect. 5 for MISC in the current work.
Appendix 2: Shift Theorem for Problem (1)
Here, we seek to establish a shift theorem for the problem
under suitable assumptions on a and \(\varsigma \).
With respect to problem (1), for convenience, we drop the dependence on the parameter vector, \({{\varvec{y}}}\). We consider an odd periodic extension of \(\varsigma \) and an even periodic extension of the coefficient a, both on \([-1,1]^D\), denoted, respectively, by \(\tilde{\varsigma }\) and \(\tilde{a}\). More precisely, for \({{\varvec{j}}}\in \{0,1\}^D\), we denote \({{\varvec{x}}}_{{\varvec{j}}}= ((-1)^{j_1}x_1,\ldots ,(-1)^{j_D}x_D)\) and
The following shift theorem holds for problem (49).
Lemma 17
If the coefficient a is such that its periodic extension satisfies \(\tilde{a}\in W^{s,\infty }({\mathbb {R}}^D)\), \(s\ge 0\), and \(\varsigma \in C^\infty _0({\mathscr {B}})\), then \(u\in H^{s+1}({\mathscr {B}})\).
Proof
We define the extended problem
Since by assumption \(\tilde{a}\in L^\infty ({\mathbb {R}}^D)\) and \(\tilde{\varsigma }\in L^2(\tilde{{\mathscr {B}}})\), this problem has a unique solution, \(\tilde{u}\in H^1_{per}(\tilde{{\mathscr {B}}})/{\mathbb {R}}\), where we denote by \(H^s_{per}(\tilde{{\mathscr {B}}})\) the space of periodic functions with (periodic) square-integrable derivatives up to order s. It is easy to check that the solution \(\tilde{u}\) is odd, that is, \(\tilde{u}({{\varvec{x}}}_{{\varvec{j}}})=(-1)^{|{{\varvec{j}}}|}\tilde{u}({{\varvec{x}}})\) for all \({{\varvec{x}}}\in [0,1]^D\); hence, \(\tilde{u}=0\) (in the sense of traces) on \(\partial {\mathscr {B}}\), and \(\tilde{u}\) coincides with the (unique) solution of (49) on \({\mathscr {B}}\). Moreover, standard elliptic regularity arguments show that \(\tilde{u}\in H^{s+1}_{per}(\tilde{{\mathscr {B}}})\), hence \(u\in H^{s+1}({\mathscr {B}})\). \(\square \)
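The symmetry mechanism behind the proof can be sanity-checked numerically in one dimension (an illustrative sketch with a hypothetical coefficient and load, not the paper's setup): for \(-(au')'=\varsigma \) on \((-1,1)\) with homogeneous Dirichlet conditions, an even coefficient and an odd right-hand side produce an odd finite-difference solution, which therefore vanishes at 0 and restricts to the solution on \((0,1)\).

```python
import math

def solve_dirichlet(a, f, N=200):
    # -(a u')' = f on (-1,1), u(-1) = u(1) = 0, centered finite differences
    h = 2.0 / N
    x = [-1.0 + i * h for i in range(N + 1)]
    # Tridiagonal system for the interior unknowns u_1, ..., u_{N-1}
    lo, di, up, rhs = [], [], [], []
    for i in range(1, N):
        am = a(x[i] - h / 2)  # coefficient at the left half-point
        ap = a(x[i] + h / 2)  # coefficient at the right half-point
        lo.append(-am); di.append(am + ap); up.append(-ap)
        rhs.append(h * h * f(x[i]))
    # Thomas algorithm (forward elimination, back substitution)
    n = N - 1
    for i in range(1, n):
        w = lo[i] / di[i - 1]
        di[i] -= w * up[i - 1]
        rhs[i] -= w * rhs[i - 1]
    u = [0.0] * n
    u[-1] = rhs[-1] / di[-1]
    for i in range(n - 2, -1, -1):
        u[i] = (rhs[i] - up[i] * u[i + 1]) / di[i]
    return [0.0] + u + [0.0]
```

With the made-up even coefficient \(a(x)=e^{\cos (\pi x)}\) and odd load \(\varsigma (x)=\sin (\pi x)+x^3\), the computed solution satisfies \(u(-x)=-u(x)\) to rounding accuracy, and in particular \(u(0)=0\).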
Cite this article
Haji-Ali, AL., Nobile, F., Tamellini, L. et al. Multi-index Stochastic Collocation Convergence Rates for Random PDEs with Parametric Regularity. Found Comput Math 16, 1555–1605 (2016). https://doi.org/10.1007/s10208-016-9327-7
Keywords
- Multi-level
- Multi-index Stochastic Collocation
- Infinite dimensional integration
- Elliptic partial differential equations with random coefficients
- Finite element method
- Uncertainty quantification
- Random partial differential equations
- Multivariate approximation
- Sparse grids
- Stochastic Collocation methods
- Multi-level methods
- Combination technique