Abstract
In this work we study convergence properties of sparse polynomial approximations for a class of affine parametric saddle point problems, including the Stokes equations for viscous incompressible flow, the mixed formulation of diffusion equations for groundwater flow, and the time-harmonic Maxwell equations for electromagnetics. Due to lack of knowledge or intrinsic randomness, the coefficients of such problems (viscosity, diffusivity, permeability, permittivity, etc.) are uncertain and can often be represented or approximated by high- or countably infinite-dimensional random parameters equipped with suitable probability distributions, where the coefficients depend affinely on a series of either globally or locally supported basis functions, e.g., a Karhunen–Loève expansion, piecewise polynomials, or adaptive wavelet approximations. We consider sparse polynomial approximations of the parametric solutions, in particular sparse Taylor approximations, and study their convergence properties for these parametric problems. Under suitable sparsity assumptions on the parametrization of the random coefficients, we show algebraic convergence rates \(O(N^{-r})\) for the sparse polynomial approximations of the parametric solutions, based on the results for affine parametric elliptic PDEs (Cohen, A. et al.: Anal. Appl. 9, 11–47, 2011), (Bachmayr, M., et al.: ESAIM Math. Model. Numer. Anal. 51, 321–339, 2017), (Cohen, A., DeVore, R.: Acta Numer. 24, 1–159, 2015), (Chkifa, A., et al.: J. Math. Pures Appl. 103, 400–428, 2015), (Chkifa, A., et al.: ESAIM Math. Model. Numer. Anal. 47, 253–280, 2013), (Cohen, A., Migliorati, G.: Contemp. Comput. Math., 233–282, 2018), with the rate r depending only on a sparsity parameter in the parametrization, not on the number of active parameter dimensions or the number of polynomial terms N. We note that parametric saddle point problems were considered in (Cohen, A., DeVore, R.: Acta Numer. 24, 1–159, 2015, Section 2.2) with the anticipation that the results on the approximation of the solution map obtained for elliptic PDEs can be extended to more general saddle point problems. In this paper, we consider a general formulation of saddle point problems, different from that presented in (Cohen, A., DeVore, R.: Acta Numer. 24, 1–159, 2015, Section 2.2), and obtain convergence rates for the two variables, e.g., velocity and pressure in the Stokes equations, which differ in the case of locally supported basis functions.
1 Introduction
Computational simulations based on mathematical models are increasingly used for decision making (design, control, allocation of resources, determination of policy, etc.). For such cases, it is critical to account for uncertainties in the inputs, and thus in the output predictions, of these models. One fundamental approach to characterizing these uncertainties is probabilistic modeling, where the uncertain input is represented by a finite number of random variables or by random fields that can be represented by a large or even infinite number of random variables. We refer to these random variables as parameters and equip them with suitable probability measures. With these parameters as uncertain inputs, we often need to conduct statistical analyses of the model outputs, such as sensitivity analysis with respect to the parameters, computation of statistical moments via integration of outputs over the parameter space, and risk analysis that predicts the failure probability of the system under uncertainty. To perform these statistical analyses, various numerical approximation methods have been developed, largely in the last few decades, such as Monte Carlo and quasi-Monte Carlo methods, generalized polynomial chaos, stochastic collocation and Galerkin methods, and model and parameter reduction methods.
The Monte Carlo method has been widely employed in practice because of several advantages, such as its very simple and embarrassingly parallel implementation and its dimension-independent convergence. However, it has a slow convergence rate of \(O(N^{-1/2})\), where N is the number of samples, requiring a large number of simulations to achieve sufficient accuracy. New methods such as (high-order) quasi-Monte Carlo [29, 36] and multi-level/multi-index Monte Carlo [22, 32] have been proposed to achieve faster convergence and reduced computational cost. Sparse polynomial approximations such as stochastic Galerkin and collocation methods based on (generalized) polynomial chaos and sparse grids have been developed that greatly improve the convergence for problems depending smoothly on the parameters; see, e.g., [1, 2, 31, 40, 49, 50]. Practical algorithms to construct such sparse polynomial approximations, such as adaptive [13, 30], least-squares [20, 39], and compressive sensing [27, 43] constructions, have also been actively developed. Another class of methods, known as model reduction and including reduced basis methods, achieves quasi-optimal convergence (in terms of Kolmogorov widths [7]) and considerable computational reduction for many-query simulations [5, 7, 9, 10, 15] by exploiting the intrinsic structure of the solution manifolds. More recently, deep neural networks have been applied to solve high-dimensional parametric problems [6, 37, 38, 41, 46].
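As a quick numerical illustration of the \(O(N^{-1/2})\) rate, the following sketch estimates the root-mean-square error of a Monte Carlo mean estimator for two sample sizes; the integrand is a hypothetical toy example, not one of the applications above.

```python
import numpy as np

# Toy illustration of the O(N^{-1/2}) Monte Carlo rate: estimate the mean of
# f(y) = exp(y) for y ~ U[-1, 1] (a hypothetical integrand, not taken from
# the paper); the exact mean is (e - 1/e)/2.
rng = np.random.default_rng(0)
exact = (np.e - 1.0 / np.e) / 2.0

def mc_rmse(n_samples, n_repeats=200):
    """Root-mean-square error of the Monte Carlo estimator over repetitions."""
    errors = []
    for _ in range(n_repeats):
        y = rng.uniform(-1.0, 1.0, n_samples)
        errors.append(np.mean(np.exp(y)) - exact)
    return float(np.sqrt(np.mean(np.square(errors))))

rmse_100 = mc_rmse(100)
rmse_10000 = mc_rmse(10000)
print(rmse_100 / rmse_10000)   # roughly 10: 100x more samples buys one digit
```

A 100-fold increase in the number of samples reduces the error only by a factor of about 10, which is why the faster methods discussed next are attractive.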
One critical challenge faced by polynomial-based approximation methods for high-dimensional parametric problems is the so-called curse of dimensionality, i.e., convergence rates that severely deteriorate with the parameter dimension. In recent work [3, 4, 11, 17, 18, 23,24,25, 28, 46, 48, 52], it has been demonstrated that the curse of dimensionality can be effectively broken, with dimension-independent convergence rates achieved under certain sparsity assumptions on the countably infinite-dimensional parametrization of the uncertain input. For instance, in [25] analytic regularity of the parametric solution with respect to the parameters was obtained for elliptic partial differential equations. This leads to upper bounds for the coefficients of the Taylor expansion of the parametric solution. Under an ℓs-summability assumption on the coefficients of the expansion that represents the random input, the Taylor coefficients were shown to also satisfy the ℓs-summability. Then a dimension-independent convergence rate of a sparse Taylor approximation (truncation of the Taylor expansion of the parametric solution to a suitable sparse index set) was achieved by Stechkin's lemma. This analysis has been extended to sparse Legendre polynomial approximation [25], sparse polynomial interpolation [21], and sparse polynomial integration [44] for elliptic problems, as well as for certain parabolic and nonlinear problems [18, 23].
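Stechkin's lemma, which reappears in Section 3, can be checked numerically: for a toy coefficient sequence \(c_n = n^{-2}\) (a hypothetical choice, not the Taylor coefficients of any PDE), the ℓ2 tail after retaining the N largest entries is bounded by \(\|c\|_{\ell^s} N^{-r}\) with \(r = 1/s - 1/2\).

```python
import numpy as np

# Numerical check of Stechkin's lemma: if the decreasing rearrangement (c_n)
# is l^s-summable, then the l^2 tail after keeping the N largest terms obeys
#   (sum_{n>N} c_n^2)^{1/2} <= ||c||_{l^s} * N^{-r},   r = 1/s - 1/2.
# Toy coefficients c_n = n^{-2}, a hypothetical sequence, not from the paper.
M = 200000                         # truncation of the infinite sequence
c = np.arange(1, M + 1, dtype=float) ** -2.0
s = 0.6                            # c is l^s-summable for any s > 1/2
r = 1.0 / s - 0.5
norm_s = np.sum(c ** s) ** (1.0 / s)

for N in [10, 100, 1000]:
    tail = np.sqrt(np.sum(c[N:] ** 2))   # l^2 error of best N-term truncation
    bound = norm_s * N ** (-r)
    assert tail <= bound                 # Stechkin bound holds
    print(N, tail, bound)
```

The observed tail in this example decays even faster than the guaranteed rate \(N^{-r}\), which is consistent with the lemma giving only an upper bound.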
In this work, we consider affine parametric saddle point problems that cover a wide range of applications, such as the Stokes equations for viscous incompressible flow, the mixed formulation of the Poisson equation for groundwater flow, and the time-harmonic Maxwell equations for electromagnetic wave propagation; see [8, 42] and the references therein. These applications require a better understanding of the approximability and convergence of parametric saddle point problems in a high- or infinite-dimensional parametric setting, which is the aim and main contribution of this work, building on the results for affine parametric elliptic PDEs [4, 18, 19, 23, 25, 26]. In particular, our contributions are presented in several sections structured as follows: In Section 2, we formulate an abstract saddle point problem with affine parametrization, and demonstrate the well-posedness of the parametric saddle point problem through several specific examples. Moreover, we consider both globally and locally supported basis functions for the affine parametrization, with suitable sparsity assumptions for each of them. In Section 3, we consider a Taylor expansion of the solution of the parametric saddle point problem with respect to the parameters and its sparse Taylor approximation. In the case of globally supported basis functions, we prove the analytic regularity of the parametric solution with respect to the parameters, and prove the ℓs-summability of the Taylor coefficients. In the case of locally supported basis functions, we prove a weighted ℓ2-summability of the Taylor coefficients, from which we obtain the ℓs-summability of the Taylor coefficients. Based on the ℓs-summability, we prove dimension-independent convergence rates of the sparse Taylor approximations, for both an arbitrary sparse index set and a downward closed sparse index set.
In particular, our formulation of the saddle point problems is different from that presented in [23, Section 2.2], and leads to convergence results for the two variables in our saddle point formulation of the three examples, e.g., velocity and pressure in the Stokes equations, which differ in the case of locally supported basis functions. This was not considered in [23, Section 2.2]. The last section provides conclusions and several ongoing and future research directions.
2 Affine Parametric Saddle Point Problems
2.1 An Abstract Saddle Point Formulation
Let \(\mathcal {V}\) and \(\mathcal {Q}\) denote two Hilbert spaces equipped with inner products \((\cdot , \cdot )_{\mathcal V}\), \((\cdot , \cdot )_{\mathcal Q}\) and induced norms \(\|v\|_{\mathcal V} = (v, v)^{1/2}_{\mathcal V}\) \(\forall v \in \mathcal {V}\) and \(\|q\|_{\mathcal Q}=(q, q)^{1/2}_{\mathcal Q}\) \(\forall q \in \mathcal {Q}\). Let \(\mathcal {V}^{\prime }\) and \(\mathcal {Q}^{\prime }\) denote the duals of \(\mathcal {V}\) and \(\mathcal {Q}\), respectively. Let \(\mathcal {K}\) denote a separable Banach space. We present an abstract formulation of the parametric saddle point problem as: given parameter \(\kappa \in \mathcal {K}\), and data \(f \in \mathcal {V}^{\prime }\) and \(g\in \mathcal {Q}^{\prime }\), find \((u, p) \in \mathcal {V} \times \mathcal {Q}\) such that
$$ \begin{array}{ll} a(u, v; \kappa) + b(v, p) = f(v) & \forall v \in \mathcal{V}, \\ b(u, q) = g(q) & \forall q \in \mathcal{Q}, \end{array} $$(1)
where the linear forms f(v) and g(q) represent the duality pairings \(\langle f, v \rangle _{\mathcal {V}^{\prime }\times \mathcal {V}}\) and \(\langle g, q\rangle _{\mathcal {Q}^{\prime }\times \mathcal {Q}}\) for simplicity, \(a(\cdot , \cdot ; \kappa ): \mathcal {V}\times \mathcal {V} \to \mathbb {R}\) is a parametric bilinear form, and \(b(\cdot , \cdot ): \mathcal {V} \times \mathcal {Q} \to \mathbb {R}\) is a bilinear form. Moreover, we make the following assumptions on the bilinear forms. First, let \(\mathcal {V}^{0}\) denote the kernel of the bilinear form b in \(\mathcal {V}\), i.e.,
$$ \mathcal{V}^{0} = \{v \in \mathcal{V}: b(v, q) = 0 \ \forall q \in \mathcal{Q}\}. $$
Assumption 2.1
Suppose the bilinear forms a(⋅,⋅;κ) and b(⋅,⋅) are uniformly continuous, i.e., there exist constants γ > 0 independent of κ and δ > 0 such that
$$ |a(w, v; \kappa)| \leq \gamma \|w\|_{\mathcal V} \|v\|_{\mathcal V} \quad \forall w, v \in \mathcal{V}, \qquad |b(v, q)| \leq \delta \|v\|_{\mathcal V} \|q\|_{\mathcal Q} \quad \forall v \in \mathcal{V}, \forall q \in \mathcal{Q}. $$(2)
Moreover, we assume that a(⋅,⋅;κ) is uniformly coercive in \(\mathcal {V}^{0}\), i.e., there exists a constant α > 0 independent of κ such that
$$ a(v, v; \kappa) \geq \alpha \|v\|^{2}_{\mathcal V} \quad \forall v \in \mathcal{V}^{0}. $$(3)
Furthermore, we assume that b(⋅,⋅) satisfies the inf-sup (compatibility) condition, i.e., there exists a constant β > 0 such that
$$ \inf_{0 \neq q \in \mathcal{Q}} \sup_{0 \neq v \in \mathcal{V}} \frac{b(v, q)}{\|v\|_{\mathcal V} \|q\|_{\mathcal Q}} \geq \beta. $$
The classical results of existence, uniqueness, and a-priori estimates for the parametric saddle point problem (1) are stated in the following theorem.
Theorem 2.1
[42, Theorem 16.4] Under Assumption 2.1, for every \(\kappa \in \mathcal {K}\), there exists a unique solution \((u,p) \in \mathcal {V} \times \mathcal {Q}\) of the parametric saddle point problem (1), such that the following a-priori estimates hold
where for notational convenience, the constants Cu and Cp are short for
Remark 2.1
We remark that the saddle point problem considered in [23, Section 2.2] has the form: given parameter \(\kappa \in \mathcal K\), find \(u \in \mathcal V\) such that
where the bilinear form B satisfies inf-sup condition, which is different from what we consider in (1) to cover the examples in the next section. Moreover, we can obtain different convergence rates for sparse polynomial approximations of u and p in (1), as shown in Section 3.
2.2 Examples
Let \(D\subset \mathbb {R}^{d}\) (d = 2,3) be an open and bounded physical domain with Lipschitz continuous boundary ∂D = Γ, which can be partitioned into a Dirichlet boundary Γ0 and a Neumann boundary Γ1 such that Γ = Γ0 ∪Γ1 and Γ0 ∩Γ1 = ∅. Let \(L^{\infty }(D)\) denote the space of essentially bounded measurable functions, i.e.,
$$ L^{\infty}(D) = \left\{v: D \to \mathbb{R}: \underset{x \in D}{\text{ess sup}} |v(x)| < \infty\right\}. $$
Let L2(D) denote the space of square integrable functions on D, i.e.,
$$ L^{2}(D) = \left\{v: D \to \mathbb{R}: {\int}_{D} |v(x)|^{2} dx < \infty\right\}. $$
Let ∇, ∇⋅, ∇× denote the gradient, divergence, and curl operators. We use the definition of the following Hilbert spaces by convention [8]
with corresponding norms
Moreover, for functions with vanishing values on Γ0, we define
where n is the unit normal vector along the boundary. In what follows, we present several classical problems in (mixed) variational formulations. These formulations are preferred for several reasons [8]: the presence of a physical constraint, the physical importance of the variables appearing in the formulations, and better accommodation of finite-dimensional approximations and/or available data. For simplicity of presentation, we assume homogeneous Dirichlet and/or Neumann boundary conditions for all the examples.
2.2.1 Stokes Flow
We consider the flow of a viscous incompressible fluid at low velocity in a domain D, which can be described by the Stokes equations in variational form as: given parameter \(\kappa \in L^{\infty }(D)\) and data f ∈ (L2(D))d, find \((\boldsymbol {u}, p) \in ({H^{1}_{0}}(D))^{d} \times L^{2}(D)\) such that
where u is the velocity, p is the pressure, κ > 0 is the shear viscosity, \(\boldsymbol {f} \in \mathbb {R}^{d}\) is the body force, and \(\boldsymbol {\varepsilon }(\boldsymbol {u}) \in \mathbb {R}^{d\times d}\) is the strain rate tensor defined as
$$ \boldsymbol{\varepsilon}(\boldsymbol{u}) = \frac{1}{2}\left(\nabla \boldsymbol{u} + (\nabla \boldsymbol{u})^{\top}\right). $$
Note that for the weaker condition f ∈ (H− 1(D))d, the formal expression \({\int \limits }_{D} \boldsymbol {f}\cdot \boldsymbol {v}\) represents the duality pairing \(\langle \boldsymbol {f}, \boldsymbol {v} \rangle _{\mathcal {V}^{\prime }\times \mathcal {V}}\) with \(\mathcal {V} = ({H^{1}_{0}}(D))^{d}\).
Problem (6) can be identified in the abstract saddle point formulation (1) in the spaces \(\mathcal {K} = L^{\infty }(D)\), \(\mathcal {V} = ({H^{1}_{0}}(D))^{d}\) and \(\mathcal {Q} = L^{2}(D)\) with the bilinear forms
Then Assumption 2.1 is satisfied with the constants
where the constants γ1, γ2 are determined by Korn's inequality [34], i.e.,
and Cp is determined by Poincaré's inequality [42], i.e.,
Thus the inf-sup constant β is obtained as follows: for any \(q \in \mathcal {Q}\), by taking v such that ∇⋅v = q,
Therefore, Theorem 2.1 holds for the Stokes problem (6) with these constants.
2.2.2 Diffusion
Diffusion equations are widely used in modelling various physical phenomena. In many applications, it is the flux rather than the state that is of interest. For instance, in thermo-diffusion problems the heat flux may be more important than the temperature field. For this reason, we present the diffusion problem in the variational formulation: given parameter \(\kappa \in L^{\infty }(D)\) and data f ∈ L2(D), find (u,p) ∈ H0(div;D) × L2(D) such that
where p is the state, e.g., the temperature field, the auxiliary variable \(\boldsymbol{u} = \kappa^{-1} \nabla p\) represents the flux, κ > 0 is the (inverse) diffusion coefficient, and f is a source term.
By defining the bilinear forms
in the Hilbert spaces \(\mathcal {V} = H_{0}(\text {div};D)\) and \(\mathcal {Q} = L^{2}(D)\), we can identify the diffusion problem (8) in the abstract saddle point formulation (1) with \(\mathcal {K} = L^{\infty }(D)\). Assumption 2.1 is satisfied with the following constants
where β is obtained in the same way as in the Stokes problem. Note that the bilinear form a(⋅,⋅;κ) is coercive in \(\mathcal {V}^{0}\), in which ∇⋅v vanishes, even though it is not coercive in \(\mathcal {V}\).
2.2.3 Time-Harmonic Maxwell System
Maxwell's equations form the foundation of classical electromagnetism, optics, and electric circuits. The time-harmonic Maxwell system is considered when the propagation of electromagnetic waves at a given frequency is studied, or when the Fourier transform in time is used. In the mixed variational formulation, the Maxwell system can be stated as: given parameter \(\kappa \in L^{\infty }(D)\) and data f ∈ (L2(D))d, find \((\boldsymbol {u}, p) \in H_{0}(\text {curl};D)\times {H^{1}_{0}}(D)\) such that
where u is the electric field vector, p is an auxiliary variable, ω is the frequency, f = iωj with current source field vector j, κ > 0 denotes the (inverse) magnetic permeability, and ε > 0 denotes the electric permittivity. Here we only consider κ as a varying parameter and fix ε for simplicity.
By defining the bilinear forms
in the Hilbert spaces \(\mathcal {V} = H_{0}(\text {curl}; D)\) and \(\mathcal {Q} = {H^{1}_{0}}(D)\), we can express the time-harmonic Maxwell system as in the abstract saddle point formulation (1) with \(\mathcal {K} = L^{\infty }(D)\). Moreover, we can verify Assumption 2.1 with the following constants
and
We consider the case that α > 0 in this work. It is straightforward to verify γ and δ. To verify β, for any \(q \in \mathcal {Q}\), by taking v = ∇q, we have
noting that ∇×∇q = 0, \(\forall q \in \mathcal {Q}\), in the first inequality. To verify α, by Friedrichs’s inequality [8], there exists a constant Cf such that
Therefore, we have
2.3 Affine Parametrization
In this section, we present an affine parametrization of the parameter κ. We first present a common structure of the bilinear form a(⋅,⋅;κ) in (1), appearing in many saddle point problems such as the Stokes equations, the mixed formulation of the Poisson equation, and the time-harmonic Maxwell equations, that is affine with respect to the parameter \(\kappa \in \mathcal {K}\), i.e., it can be written as
$$ a(w, v; \kappa) = a_{0}(w, v) + a_{1}(w, v; \kappa), $$(9)
where a1(w,v;κ) depends linearly on κ such that for any \(\kappa \in \mathcal {K}\) there hold
for constants c1,C1 > 0 independent of κ, e.g., related to the Poincaré’s or Friedrichs’s constant in Stokes equations or time-harmonic Maxwell’s equations. We shall consider this affine structure (9) with the properties (10) in what follows.
To parametrize κ, we consider a countably infinite-dimensional parameter space
$$ U = [-1, 1]^{\mathbb{N}}. $$
We denote an element of the parameter space as y = (yj)j≥ 1 ∈ U and equip the parameter space with the probability measure
$$ d\mu(\boldsymbol{y}) = \underset{j \geq 1}{\bigotimes} \frac{d\lambda(y_{j})}{2}, $$
where dλ is the Lebesgue measure in [− 1,1]. To this end, we consider an affine parametrization for the representation and approximation of the parameter κ that is widely used in the literature [1, 2, 19, 22,23,24,25, 32, 47].
Assumption 2.2
The variation of the parameter κ in \(\mathcal {K}\) can be represented by the parameter y ∈ U through the affine expansion
$$ \kappa(x, \boldsymbol{y}) = \kappa_{0}(x) + {\sum}_{j \geq 1} y_{j} \kappa_{j}(x). $$(11)
Moreover, we assume there exist constants \(0 < \theta < {\varTheta } < \infty \) such that
$$ \theta \leq \kappa(x, \boldsymbol{y}) \leq {\varTheta} \quad \forall x \in D, \ \forall \boldsymbol{y} \in U, $$
and such that the coercivity and continuity conditions (3) and (2) are satisfied for the bilinear form a(⋅,⋅;κ) at any κ ∈ [𝜃,Θ].
The sequence (κj)j≥ 0 may either be prescribed directly from knowledge of the physical system or given by an affine representation or approximation of the random field κ. We present two specific examples, distinguishing two classes of parametrizations with globally and locally supported basis functions (κj)j≥ 1, respectively.
1.
Globally supported basis. One classical example comes from Karhunen–Loève expansion of a random field with finite second order moment, given by [45]
$$ \kappa(x, \boldsymbol{y}) = \kappa_{0}(x) + {\sum}_{j = 1}^{\infty} y_{j} \sqrt{\lambda_{j}} \psi_{j}(x), $$(12)
where κ0 is the mean of the random field and (λj,ψj)j≥ 1 are the eigenpairs of the covariance operator of the random field. Here, we can identify \(\kappa _{j} = \sqrt {\lambda _{j}} \psi _{j}\), j ≥ 1, in the affine assumption (11).
2.
Locally supported basis. Piecewise polynomials or wavelets can be employed to model or approximate the parameter field κ. A particular case is the weighted piecewise constant basis representation
$$ \kappa(x, \boldsymbol{y}) = \kappa_{0} + {\sum}_{j = 1}^{J} y_{j} w_{j} \chi_{j}(x), $$(13)
where wj is a weight and χj is the characteristic function of the subdomain/element Dj, \(j = 1, \dots , J\), with \(D = \cup _{j=1}^{J} D_{j}\) and Di ∩ Dj = ∅ for i≠j. In this example, we can identify κj = wjχj, \(j = 1, \dots , J\).
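A minimal sketch of the two parametrizations on the interval D = (0,1), with hypothetical choices of κ0, the KL-type amplitudes, and the weights wj (none taken from the paper), illustrating that both satisfy uniform bounds of the type required by Assumption 2.2:

```python
import numpy as np

# Hypothetical instances of the affine parametrizations (12) and (13) on
# D = (0, 1); the coefficients are illustrative choices, not from the paper.
x = np.linspace(0.0, 1.0, 1001)
rng = np.random.default_rng(1)
J = 50

# (12) globally supported basis: KL-type expansion with decaying amplitudes
amps = 0.5 * np.arange(1, J + 1, dtype=float) ** -2.0       # ~ sqrt(lambda_j)
psis = [np.sqrt(2.0) * np.cos(j * np.pi * x) for j in range(1, J + 1)]

def kappa_global(y, kappa0=2.0):
    return kappa0 + sum(y[j] * amps[j] * psis[j] for j in range(J))

# (13) locally supported basis: weighted piecewise constants on J subdomains
def kappa_local(y, kappa0=2.0, weight=0.9):
    field = np.full_like(x, kappa0)
    edges = np.linspace(0.0, 1.0, J + 1)
    for j in range(J):
        field[(x >= edges[j]) & (x < edges[j + 1])] += y[j] * weight  # w_j chi_j
    return field

# uniform ellipticity theta <= kappa <= Theta over random draws y in U
for _ in range(100):
    y = rng.uniform(-1.0, 1.0, J)
    for field in (kappa_global(y), kappa_local(y)):
        assert 0.0 < field.min() and field.max() < 4.0
```

In the global case the amplitudes decay like \(j^{-2}\), so the sum of the perturbations stays below κ0; in the local case a single uniform weight below κ0 suffices, since the supports are disjoint.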
Assumption 2.2 guarantees the well-posedness of the parametric saddle point problem (1). To study the convergence properties of approximations of its solution or of related quantities of interest, we make the following assumptions, which cover the globally and locally supported basis representations as considered in [25] and [4], respectively.
Assumption 2.3
For the parametrization (11) under Assumption 2.2, assume for some s ∈ (0,1) there holds \((\|\kappa _{j}\|_{\mathcal K})_{j\geq 1} \in \ell ^{s}(\mathbb {N})\), i.e.,
$$ {\sum}_{j \geq 1} \|\kappa_{j}\|^{s}_{\mathcal K} < \infty. $$(14)
Remark 2.2
As discussed in [4], for the Karhunen–Loève expansion (12), the ℓs-summability condition (14) is satisfied when \(\sup _{j\geq 1}\|\psi _{j}\|_{\mathcal K} \leq C\) for some \(C < \infty \), and \((\sqrt {\lambda _{j}})_{j\geq 1} \in \ell ^{s}(\mathbb {N})\). However, it is not satisfied for any s ∈ (0,1) in the case of the locally supported representation (13) when \(|w_{j}| \lnsim j^{-1}\), i.e., \((|w_{j}|)_{j\geq 1} \not \in \ell ^{1}(\mathbb {N})\), as \(J \to \infty \). To accommodate such a case, we make the following assumption.
Assumption 2.4
For the parametrization (11) under Assumption 2.2, assume there exists a sequence ρ = (ρj)j≥ 1 with ρj > 1, such that
for some \(\theta < \epsilon < \kappa _{\min \limits }\), and such that \((\rho _{j}^{-1})_{j\geq 1} \in \ell ^{t}(\mathbb {N})\) for some \(t \in (0, \infty )\).
Remark 2.3
We can see that Assumption 2.4 is satisfied for the locally supported representation (13) for \(J \to \infty \), as in [4]. For instance, we can take \(\rho _{j}^{-1} \sim |w_{j}| \) and \(\rho _{j} |w_{j}| \leq \kappa _{\min \limits } - \epsilon \) as |wj|→ 0, such that ρj > 1 and (15) holds, then \((\rho _{j}^{-1})_{j\geq 1} \in \ell ^{t}(\mathbb {N})\) whenever \((|w_{j}|)_{j\geq 1} \in \ell ^{t}(\mathbb {N})\) for any \(t\in (0,\infty )\).
3 Sparse Polynomial Approximations
Let \(\mathcal {F}\) denote a multi-index set with finitely supported multi-index ν = (νj)j≥ 1, i.e., \(\boldsymbol {\nu } \in \mathcal {F}\) if and only if \(|\boldsymbol {\nu }| = {\sum }_{j\geq 1} \nu _{j} < \infty \). For any \(\boldsymbol {\nu } \in \mathcal {F}\), we define the multi-factorial ν!, the multi-monomial yν for y ∈ U, and the partial derivative ∂νψ(y) of a differentiable parametric map ψ(y) as
$$ \boldsymbol{\nu}! = \prod\limits_{j \geq 1} \nu_{j}!, \quad \boldsymbol{y}^{\boldsymbol{\nu}} = \prod\limits_{j \geq 1} y_{j}^{\nu_{j}}, \quad \partial^{\boldsymbol{\nu}} \psi(\boldsymbol{y}) = \left(\prod\limits_{j \geq 1} \frac{\partial^{\nu_{j}}}{\partial^{\nu_{j}} y_{j}}\right) \psi(\boldsymbol{y}), $$
where we use the convention 0! := 1, 00 := 1, and ∂0ψ(y)/∂0yj = ψ(y). For such a differentiable map ψ, we consider the Taylor series
$$ \psi(\boldsymbol{y}) = {\sum}_{\boldsymbol{\nu} \in \mathcal{F}} t_{\boldsymbol{\nu}}^{\psi} \boldsymbol{y}^{\boldsymbol{\nu}}, $$(16)
with Taylor coefficients \(t_{\boldsymbol {\nu }}^{\psi }\) defined as
$$ t_{\boldsymbol{\nu}}^{\psi} = \frac{1}{\boldsymbol{\nu}!} \partial^{\boldsymbol{\nu}} \psi(\boldsymbol{0}). $$
Let \(({\Lambda }_{N})_{N\geq 1} \subset \mathcal {F}\) denote a sequence of index sets with N indices that exhaust \(\mathcal {F}\), i.e., any finite set \({\Lambda } \subset \mathcal {F}\) is contained in all ΛN for N ≥ N0 with N0 sufficiently large. We define the truncation of the series (16) to ΛN as
$$ {\sum}_{\boldsymbol{\nu} \in {\Lambda}_{N}} t_{\boldsymbol{\nu}}^{\psi} \boldsymbol{y}^{\boldsymbol{\nu}}, $$
which we call the sparse Taylor approximation. We are interested in two questions: (1) whether the sparse Taylor approximation of the solution of the parametric saddle point problem (1) converges; (2) if so, how fast it converges with respect to N. To answer these questions, we carry out two types of analysis corresponding to Assumptions 2.3 and 2.4, respectively. The first is to obtain the analytic regularity of the parametric solution in a complex domain covering the parameter space. This analyticity leads to upper bounds for the Taylor coefficients \((t^{u}_{\boldsymbol {\nu }}, t_{\boldsymbol {\nu }}^{p})\) at each \(\boldsymbol {\nu } \in \mathcal {F}\) by Cauchy's integral formula, which implies an \(\ell ^{s}(\mathcal {F})\)-summability of the coefficients. The second is to derive a weighted \(\ell ^{2}(\mathcal {F})\)-summability of the Taylor coefficients based on the affine structure of the parametrization; the \(\ell ^{s}(\mathcal {F})\)-summability of the Taylor coefficients is then obtained by Hölder's inequality. Due to the \(\ell ^{s}(\mathcal {F})\)-summability, a dimension-independent best N-term convergence rate of a suitable Taylor approximation is achieved using Stechkin's lemma. These analyses are based on the results in [25] and [4] for parametric elliptic PDEs, which we extend to deal with the parametric saddle point problem (1) under Assumptions 2.3 and 2.4, respectively.
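The truncation can be made concrete on a scalar analogue u(y) = 1/κ(y) with a two-dimensional affine κ (a hypothetical toy model, not the saddle point solution itself), for which the Taylor coefficients are available in closed form; the truncation error over the downward closed sets {ν : |ν| ≤ k} then decays geometrically.

```python
import numpy as np
from itertools import product
from math import factorial

# Sparse Taylor approximation for a scalar toy model u(y) = 1/kappa(y) with
# kappa(y) = k0 + y1*k1 + y2*k2. For this model the Taylor coefficients are
#   t_nu = (-1)^|nu| * |nu|!/nu! * k1^nu1 * k2^nu2 / k0^(|nu|+1).
# The coefficients are hypothetical, chosen so that kappa > 0 on U.
k0, k1, k2 = 2.0, 0.3, 0.2

def t(nu1, nu2):
    n = nu1 + nu2
    return ((-1) ** n * factorial(n) / (factorial(nu1) * factorial(nu2))
            * k1 ** nu1 * k2 ** nu2 / k0 ** (n + 1))

def taylor(y, max_order):
    """Truncate the Taylor series to the downward closed set {|nu| <= max_order}."""
    return sum(t(n1, n2) * y[0] ** n1 * y[1] ** n2
               for n1, n2 in product(range(max_order + 1), repeat=2)
               if n1 + n2 <= max_order)

y = (1.0, 1.0)                       # a corner of U = [-1, 1]^2
exact = 1.0 / (k0 + y[0] * k1 + y[1] * k2)
errors = [abs(taylor(y, k) - exact) for k in range(8)]
print(errors)                        # geometric decay of the truncation error
assert all(e2 < e1 for e1, e2 in zip(errors, errors[1:]))
```

Here the decay factor is \((|k_1| + |k_2|)/k_0 = 1/4\), mirroring how the ratio of the perturbation to κ0 governs the convergence region in the analysis below.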
3.1 ℓs-Summability by Analytic Regularity
Let z = (zj)j≥ 1 denote a sequence of complex numbers with \(z_{j} \in \mathbb {C}\), j ≥ 1, i.e., \(\boldsymbol {z} \in \mathbb {C}^{\mathbb {N}}\). Let \(\mathcal {U}\) denote the polydisc defined as
$$ \mathcal{U} = \{\boldsymbol{z} \in \mathbb{C}^{\mathbb{N}}: |z_{j}| \leq 1, \ j \geq 1\}. $$
Then we can extend the parametrization of κ in (11) from \(U = [-1,1]^{\mathbb {N}}\) to \(\mathcal {U}\), i.e.,
for which, under Assumption 2.2, we have
For two constants r and R such that
where 𝜃 and Θ are given in Assumption 2.2, we define the complex set
By the equivalence of the Babuška theorem and the Brezzi theorem for saddle point problems [51], and the extension of the Babuška theorem to complex function spaces [23, Theorem 2.2], Theorem 2.1 holds for \(\boldsymbol {z} \in \mathcal {A}_{r}^{R}\) under Assumptions 2.1 and 2.2 in complex function spaces \(\mathcal {V}\) and \(\mathcal {Q}\), i.e., there exists a unique solution \((u(\boldsymbol {z}), p(\boldsymbol {z})) \in \mathcal {V} \times \mathcal {Q}\) \(\forall \boldsymbol {z} \in \mathcal {A}_{r}^{R}\), which satisfies the a-priori estimates in (4). In fact, Theorem 2.1 holds for \(\boldsymbol {z} \in \mathcal {A}_{\tilde {r}}^{R}\) for any \(\tilde {r} \geq \theta \) due to Assumption 2.2 on the coercivity condition of the sesquilinear form a(⋅,⋅;κ). Moreover, we observe that \(\mathcal {U} \subset \mathcal {A}_{r}^{R}\) by definition, so that Theorem 2.1 also holds for \(\boldsymbol {z} \in \mathcal {U}\).
Lemma 3.1
Let (u,p) and \((\tilde {u}, \tilde {p})\) denote the solutions of the parametric saddle point problem (1) at \(\kappa \in \mathcal {A}_{r}^{R}\) and \(\tilde {\kappa } \in \mathcal {A}_{r}^{R}\), respectively, then we have
where the constants α, β and γ are given in Theorem 2.1, C1 and Cu are given in (10) and (5).
Proof
By subtracting (1) at κ from (1) at \(\tilde {\kappa }\), we have
By Theorem 2.1, the following a-priori estimates hold
where we denote \(\mathrm {a}(v) = -a(\tilde {u}, v;\kappa -\tilde {\kappa })\) \(\forall v \in \mathcal {V}\). By the affine dependence of a(⋅,⋅;κ) on κ as in (9) and the bounds (10) and (4), we have
Thus, we conclude by inserting this bound in (18). □
Lemma 3.2
For every \(\boldsymbol {z} \in \mathcal {A}_{r}^{R}\), the complex derivative \((\partial _{z_{j}}u(\boldsymbol {z}), \partial _{z_{j}} p(\boldsymbol {z}))\) with respect to zj for each j ≥ 1 is well-defined for the solution (u(z),p(z)) of the parametric saddle point problem (1), which is given by: find \((\partial _{z_{j}}u(\boldsymbol {z}), \partial _{z_{j}} p(\boldsymbol {z})) \in \mathcal {V} \times \mathcal {Q}\) such that
Note that, by a slight abuse of notation, we use \(a(u, v; \kappa _{j}) = {\int \limits }_{D} \kappa _{j} (\nabla \times u) \cdot (\nabla \times v) dx\) for the time-harmonic Maxwell system, which is bounded.
Proof
For any \(\boldsymbol {z} \in \mathcal {A}_{r}^{R}\) and j ≥ 1, for \(h \in \mathbb {C} \setminus \{0\}\) sufficiently small such that \(|h| \|\kappa _{j}\|_{\mathcal K} \leq \epsilon < r\), we have
where ej is the Kronecker sequence with 1 at index j and 0 at other indices, so that \((u(\boldsymbol {z} + h\boldsymbol {e}_{j}), p(\boldsymbol {z} + h\boldsymbol {e}_{j})) \in \mathcal {V} \times \mathcal {Q}\) is a well-defined solution of (1) at κ(z + hej). Therefore, we have that the following difference quotients satisfy
Subtracting problem (1) at κ(z + hej) from its evaluation at κ(z) and dividing by h, we obtain that (uh(z),ph(z)) is the unique solution of the following problem:
Let ah(v) = −a(u(z + hej),v;κj). By Assumption 2.1, we have
By the stability estimates (17) in Lemma 3.1, we have
which converges to zero as |h|→ 0, so that ah →a0 in \(\mathcal {V}^{\prime }\) as |h|→ 0. Consequently, (uh,ph) converges to (u0,p0) in \(\mathcal {V}\times \mathcal {Q}\) by Theorem 2.1, which is the unique solution of (19) for h = 0. Therefore, \((\partial _{z_{j}}u, \partial _{z_{j}}p) = (u_{0}, p_{0})\) by the uniqueness. □
To study the convergence rate of the Taylor approximation, we need to bound the Taylor coefficients under Assumption 2.3, for which we employ the Cauchy integral formula in a suitable complex domain. We call a sequence ρ = (ρj)j≥ 1 r-admissible if
By this definition, if ρ is r-admissible, Theorem 2.1 holds in a larger polydisc
This is because \(\mathcal {U}_{\boldsymbol {\rho }} \subset \mathcal {A}_{r}^{R}\), as it can be readily shown that
and
Lemma 3.3
Under Assumptions 2.1 and 2.2, for a sequence ρ satisfying (20), for the Taylor coefficients \(t_{\boldsymbol {\nu }}^{u}\) and \(t_{\boldsymbol {\nu }}^{p}\) defined in (16) we have the following bounds
where Cu and Cp are given in (5), and \(\rho^{-0} = 1\) by convention for any ρ > 0.
We follow the proof in [25, Lemma 2.4] for elliptic problems and adjust it for the saddle point problem (1) here.
Proof
For any \(\boldsymbol {\nu } \in \mathcal {F}\), let \(J = \max \limits \{j \in \mathbb {N}: \nu _{j} \neq 0\}\). For such J, let \(\boldsymbol {z}_{J}^{0}\) denote a truncated complex sequence for any \(\boldsymbol {z} \in \mathcal {U}\) defined as
Then for the solution (u,p) of (1) at \(\boldsymbol {z}_{J}^{0}\), we have the a-priori estimates (4) by Theorem 2.1 under Assumptions 2.1 and 2.2. Given the sequence ρ, we define a new sequence \(\tilde {\boldsymbol {\rho }}\) as
which implies \(\mathcal {U}_{\tilde {\boldsymbol {\rho }}} \subset \mathcal {A}_{\tilde {r}}^{R}\) with \(\tilde {r} = (r+\theta )/2 > \theta \). As the coercivity condition (3) is satisfied for any \(\boldsymbol {z} \in \mathcal {A}_{\tilde {r}}^{R}\) under Assumption 2.2, Theorem 2.1 and Lemma 3.2 hold. Therefore, \(u(\boldsymbol {z}_{J}^{0})\) is analytic with respect to each zj, 1 ≤ j ≤ J on the polydisc \(\mathcal {U}_{\tilde {\boldsymbol {\rho }},J}\), which is an open neighborhood of \(\mathcal {U}_{\boldsymbol {\rho },J}\) defined as
Therefore, by the Cauchy integral formula [33, Theorem 2.1.2], we have for u
By taking the derivative ∂ν on both sides and evaluating it at 0, we have
so that
which is (21) for u. The same argument is applied to derive the bound for p. □
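The Cauchy-integral argument above can be checked in one complex variable: for a scalar analytic toy function (hypothetical, not the parametric solution itself), the Taylor coefficients are bounded by the supremum of the function on the circle of radius ρ times \(\rho^{-n}\).

```python
import numpy as np

# One-dimensional sanity check of the Cauchy-integral bound behind (21):
# for f analytic on |z| <= rho (here f(z) = 1/(k0 + k1*z), a hypothetical
# scalar analogue with k0 = 2, k1 = 0.5), the Taylor coefficients satisfy
#   |t_n| <= max_{|z| = rho} |f(z)| * rho^{-n}.
k0, k1 = 2.0, 0.5
rho = 3.0                        # admissible: the pole sits at z = -4, outside
t = [(-k1) ** n / k0 ** (n + 1) for n in range(20)]   # exact Taylor coefficients

theta_grid = np.linspace(0.0, 2.0 * np.pi, 1000)
sup_f = np.max(np.abs(1.0 / (k0 + k1 * rho * np.exp(1j * theta_grid))))

for n, tn in enumerate(t):
    assert abs(tn) <= sup_f * rho ** (-n)    # Cauchy bound holds for every n
```

The multi-parameter version used in Lemma 3.3 applies this estimate coordinate-wise on the polydisc, which is where the product \(\boldsymbol{\rho}^{-\boldsymbol{\nu}}\) in (21) comes from.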
Lemma 3.4
Under Assumption 2.3, there exists a \(\frac {r+\theta }{2}\)-admissible sequence ρ, i.e., a sequence satisfying (20) with r replaced by \(\frac {r+\theta }{2}\), such that
This result for the saddle point problem can be proved by following the corresponding proof in [25, Sec. 3] for elliptic problems.
Proof
By Lemma 3.3, we only need to prove there exists a \(\frac {r+\theta }{2}\)-admissible sequence ρ such that
This is done in a constructive way by specification of ρ. By Assumption 2.3, we have \((\|\kappa _{j}\|_{\mathcal K})_{j\geq 1} \in \ell ^{s}(\mathbb {N}) \subset \ell ^{1}(\mathbb {N})\), so that there exists a sufficiently large J such that
Then we choose τ > 1 such that
For any \(\boldsymbol {\nu } \in \mathcal {F}\), we specify the sequence ρ as
with the convention that \(\nu _{j}/({\sum }_{i > J} \nu _{i}) = 0\) if \({\sum }_{i > J} \nu _{i} = 0\). Then we have
where in the second inequality we have used Assumption 2.2, i.e., for any x ∈ D,
Therefore, ρ is \(\frac {r+\theta }{2}\)-admissible. By results in [25, Sec. 3], (23) holds for the choice (24). □
3.2 ℓs-Summability by Weighted ℓ2-Summability
The ℓs-summability of the Taylor coefficients is guaranteed by the ℓs-summability of \((\|\kappa _{j}\|_{\mathcal K})_{j\geq 1}\) in Assumption 2.3, as shown in the previous section. However, as indicated in Remark 2.3, \((\|\kappa _{j}\|_{\mathcal K})_{j\geq 1}\) may not be ℓs-summable for any s ∈ (0,1), as considered in [4] for coercive elliptic PDEs. In this case, Assumption 2.4 may still hold, in particular for locally supported (κj)j≥ 1, for which we prove the ℓs-summability of the Taylor coefficients \((\|t_{\boldsymbol {\nu }}^{u}\|_{\mathcal V})_{\boldsymbol {\nu } \in \mathcal {F}}\) and the ℓt-summability of the Taylor coefficients \((\|t_{\boldsymbol {\nu }}^{p}\|_{\mathcal Q})_{\boldsymbol {\nu } \in \mathcal {F}}\), where \(s = \frac {2t}{2+t}\) for \(t \in (0, \infty )\) given in Assumption 2.4.
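The exponent relation \(s = \frac{2t}{2+t}\) comes from Hölder's inequality with the conjugate pair \((2/s, t/s)\); the following sketch checks the resulting bound on hypothetical sequences (not taken from the paper).

```python
import numpy as np

# Numerical check of the Hölder step: if (rho_j * c_j) is l^2-summable and
# (1/rho_j) is l^t-summable, then (c_j) is l^s-summable with s = 2t/(2+t),
# since sum c^s = sum (rho*c)^s * rho^{-s} and Hölder with the conjugate
# pair (2/s, t/s) gives
#   sum_j c_j^s <= (sum_j (rho_j c_j)^2)^{s/2} * (sum_j rho_j^{-t})^{s/t}.
# The sequences below are hypothetical examples, not from the paper.
j = np.arange(1, 100001, dtype=float)
rho = j ** 0.8                 # (1/rho_j) = j^{-0.8} is l^t-summable for t > 1.25
c = j ** -1.4                  # so (rho_j c_j) = j^{-0.6} is l^2-summable
t = 3.0
s = 2.0 * t / (2.0 + t)        # s = 1.2 here
assert abs(s / 2.0 + s / t - 1.0) < 1e-12   # (2/s, t/s) is a conjugate pair

lhs = np.sum(c ** s)
rhs = np.sum((rho * c) ** 2) ** (s / 2.0) * np.sum(rho ** (-t)) ** (s / t)
assert lhs <= rhs              # the Hölder bound holds
print(lhs, rhs)
```

Note that larger t (faster decay of \(\rho_j^{-1}\)) pushes s toward 2, while small t forces s toward 0, matching how the summability of the weights limits the achievable rate.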
Lemma 3.5
Under Assumption 2.4, we have
where \(s = \frac {2t}{2+t} \in (0, 2)\) for \(t \in (0, \infty )\) given in Assumption 2.4.
The different summability results for u and p can be proved by following the argument in [4] for elliptic problems, with the necessary adjustments for the saddle point problems considered here.
Proof
For a sequence ρ satisfying (15) in Assumption 2.4, we define the scaling function Rρ(y) := (ρjyj)j≥ 1. By (15) we have for any x ∈ D
so that a(⋅,⋅;κ) is coercive by Assumption 2.2. Under Assumption 2.1, there exists a unique \((u(R_{\boldsymbol {\rho }}(\boldsymbol {y})), p(R_{\boldsymbol {\rho }}(\boldsymbol {y}))) \in \mathcal {V} \times \mathcal {Q}\) for every y ∈ U such that
By the definition of the Taylor coefficients in (16), we have at ν = 0 that \((t_{\boldsymbol {0}}^{u}, t_{\boldsymbol {0}}^{p}) = (u(\boldsymbol {0}), p(\boldsymbol {0}))\), which satisfy the a-priori estimates (4) by Theorem 2.1, i.e.,
For any other \(\boldsymbol {\nu } \in \mathcal {F}\), taking the partial derivative ∂ν of (25), we obtain
where \(\text {supp} \boldsymbol {\nu } = \{j \in \mathbb {N}: \nu _{j} \neq 0\}\). Dividing both sides by ν! and setting y = 0, we obtain the saddle point problem for the Taylor coefficients \((t_{\boldsymbol {\nu }}^{u}, t_{\boldsymbol {\nu }}^{p}) \in \mathcal {V}\times \mathcal {Q}\)
Therefore, \(t_{\boldsymbol {\nu }}^{u} \in \mathcal {V}^{0}\) by the second equation. We shall show that \((\boldsymbol {\rho }^{\boldsymbol {\nu }} t_{\boldsymbol {\nu }}^{u}, \boldsymbol {\rho }^{\boldsymbol {\nu }} t_{\boldsymbol {\nu }}^{p}) \in \mathcal {V}\times \mathcal {Q}\) is a bounded solution of (26) for any \(\boldsymbol {\nu } \in \mathcal {F}\). First, this holds for ν = 0. By induction, assume that \((\boldsymbol {\rho }^{\boldsymbol {\mu }} t_{\boldsymbol {\mu }}^{u}, \boldsymbol {\rho }^{\boldsymbol {\mu }} t_{\boldsymbol {\mu }}^{p}) \in \mathcal {V} \times \mathcal {Q}\) are bounded solutions of (26) (with ν replaced by μ) for every μ ≼ν, i.e., μj ≤ νj for all j ≥ 1, with μ≠ν. Then by Theorem 2.1, \((\boldsymbol {\rho }^{\boldsymbol {\nu }} t_{\boldsymbol {\nu }}^{u}, \boldsymbol {\rho }^{\boldsymbol {\nu }} t_{\boldsymbol {\nu }}^{p}) \in \mathcal {V}\times \mathcal {Q}\) is the unique solution of (26), such that
where by (10) and \(|\boldsymbol {\nu }|_{0} = \#\{j \in \mathbb {N}: \nu _{j} > 0\}< \infty \) for any \(\boldsymbol {\nu } \in \mathcal {F}\) we have
Therefore, by taking the test functions as \((v, q) = (\boldsymbol {\rho }^{\boldsymbol {\nu }} t_{\boldsymbol {\nu }}^{u}, \boldsymbol {\rho }^{\boldsymbol {\nu }} t_{\boldsymbol {\nu }}^{p})\), we obtain
where for the inequality we used the assumption (10). Therefore, by (15), we have
which, together with (29) leads to
By Assumption 2.2, we have
so that by the affine structure (9) there holds
Hence, from (30) and (31) we obtain
Summing both sides over |ν| = k for each k ≥ 1, we obtain
where we used Assumption 2.4 in the first inequality. By denoting
we obtain
Summing over k ≥ 1, we have
By the coercivity condition (10) in \( \mathcal {V}^{0}\), for any ν≠0, as \(t_{\boldsymbol {\nu }}^{u} \in \mathcal {V}^{0}\) we have
where \(\inf _{x \in D} \kappa _{0}(x) > \epsilon > \theta \) by Assumption 2.4. Therefore, we obtain
By Hölder’s inequality, we have
where the first term is finite by (33). For the second term, with \(t = \frac {2s}{2-s}\), i.e., \(s = \frac {2t}{2+t}\), we have
As \((\rho _{j}^{-1})_{j\geq 1} \in \ell ^{t}(\mathbb {N})\), there exists \(J \in \mathbb {N}\) such that \(\rho _{j}^{-t} < \frac {1}{2}\) for all j > J. Note that \(g(x) := -\log (1-x) - 2x < 0\) as g(0) = 0 and \(g^{\prime }(x) = \frac {1}{1-x} - 2 < 0\) for \(0 < x < \frac {1}{2}\), which implies \((1-\rho _{j}^{-t})^{-1} < \exp (2\rho _{j}^{-t})\) for j > J, so that
which is finite as \((\rho _{j}^{-1})_{j\geq 1} \in \ell ^{t}(\mathbb {N})\). Therefore, \((\|t_{\boldsymbol {\nu }}^{u}\|_{\mathcal V})_{\boldsymbol {\nu } \in \mathcal {F}} \in \ell ^{s}(\mathcal {F})\).
By (33), there exists a constant C2 > 0 such that
Therefore, by (27) and (28), we have
where
and we used the fact \(|\boldsymbol {\nu }|_{0} \leq {\prod }_{j\geq 1}(1+\nu _{j})\) for any \(\boldsymbol {\nu } \in \mathcal {F}\) in the second inequality. Hence, we have
where for each j ≥ 0 we have
As \((\rho _{j}^{-1})_{j\geq 1} \in \ell ^{t}(\mathbb {N})\), there exists J > 0 such that \(\rho _{j}^{-1} < \frac {1}{4}\) for any j > J. Moreover, for any t > 0, there exist c1 > 0 and 1 < c2 < 2 such that \((2+k)^{t} \leq c_{1} {c_{2}^{k}}\) for k ≥ 0, so that
As ρj > 1, there exists \(C_{j} < \infty \) for each j ≥ 1 such that
Therefore, we have
which is finite when \((\rho _{j}^{-1})_{j\geq 1} \in \ell ^{t}(\mathbb {N})\). Note that in the second inequality we used \(1 + x \leq e^{x}\) for x ≥ 0. Hence \((\|t_{\boldsymbol {\nu }}^{p}\|_{\mathcal Q})_{\boldsymbol {\nu } \in \mathcal {F}} \in \ell ^{t}(\mathcal {F})\) from (36). □
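The two elementary inequalities invoked in this proof, \((1-x)^{-1} \leq e^{2x}\) for \(0 < x < \frac{1}{2}\) and \((2+k)^{t} \leq c_{1}{c_{2}^{k}}\) for suitable constants, can be checked numerically. The following sketch (the values of t and c2 are illustrative choices of ours, not taken from the paper) verifies both on a grid:

```python
import math

# Inequality 1: (1 - x)^{-1} <= exp(2x) for 0 < x < 1/2,
# equivalent to g(x) = -log(1 - x) - 2x <= 0 on that interval.
for i in range(1, 500):
    x = 0.5 * i / 500  # grid over (0, 1/2)
    assert 1.0 / (1.0 - x) <= math.exp(2.0 * x)

# Inequality 2: for any t > 0 there exist c1 > 0 and 1 < c2 < 2
# with (2 + k)^t <= c1 * c2^k for all k >= 0. For a fixed c2,
# c1 = max_k (2 + k)^t / c2^k works, since the ratio tends to 0.
t, c2 = 3.0, 1.5  # illustrative choices
c1 = max((2 + k) ** t / c2 ** k for k in range(200))
for k in range(200):
    assert (2 + k) ** t <= c1 * c2 ** k
```

Both checks pass on the grid, consistent with the analytic argument in the proof.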
Remark 3.1
We remark that the weighted ℓ2-summability of \((\|t_{\boldsymbol {\nu }}^{u}\|_{\mathcal V})_{\boldsymbol {\nu } \in \mathcal {F}}\) in Lemma 3.5 is a consequence of the coercivity property (32) (where the ℓ2-norm shows up) of \(a_{1}(\cdot , \cdot ;\kappa ): \mathcal {V} \times \mathcal {V} \to \mathbb {R}\). However, the weighted ℓ2-summability cannot be shown for \((\|t_{\boldsymbol {\nu }}^{p}\|_{\mathcal Q})_{\boldsymbol {\nu } \in \mathcal {F}}\), since \(t_{\boldsymbol {\nu }}^{p}\) only appears in \(b(\cdot , \cdot ): \mathcal {V} \times \mathcal {Q} \to \mathbb {R}\), which satisfies the inf-sup condition. Instead, by this condition, we can bound the Taylor coefficient \(t_{\boldsymbol {\nu }}^{p}\) as in (35) by (28).
3.3 Dimension-independent Convergence
As a consequence of the summability results obtained in Sections 3.1 and 3.2, we obtain the following convergence results.
Theorem 3.1
Under Assumptions 2.1 and 2.2, there exist two sequences of index sets \(({{\Lambda }^{u}_{N}})_{N \geq 1}\) and \(({{\Lambda }^{p}_{N}})_{N \geq 1}\) with indices \(\boldsymbol {\nu } \in \mathcal {F}\) corresponding to the N largest Taylor coefficients \(\|t_{\boldsymbol {\nu }}^{u}\|_{\mathcal V}\) and \(\|t_{\boldsymbol {\nu }}^{p}\|_{\mathcal Q}\), respectively, such that
under Assumption 2.3, and
under Assumption 2.4, where the dimension-independent convergence rate r is given by
The convergence results follow from an application of Stechkin’s Lemma [24, Lemma 5.5], as also used in [25] for elliptic problems; we briefly present the argument below for the saddle point problems.
Proof
First, by Lemma 3.4 under Assumption 2.3 and Lemma 3.5 under Assumption 2.4, respectively, for any s < 1, we have
which implies that the Taylor power series \(T_{\mathcal F} u\) defined in (16) is uniformly convergent. Secondly, for any y ∈ U and ε > 0, by Lemma 3.1, there exists J1 > 0 such that for any J ≥ J1
under Assumption 2.3 or 2.4, where \(\boldsymbol {y}_{J}^{0}\) is defined in the same way as in (22). Moreover, for any J ≥ J1, by the analytic regularity of \(u(\boldsymbol {y}_{J}^{0})\) in the complex domain \(\mathcal {U}_{\boldsymbol {\rho }}\) as indicated in Lemma 3.2, there exists K > 0 such that for the index set \({\Lambda } = \{\boldsymbol {\nu } \in \mathcal {F}: \nu _{j} \leq K \text { for } j \leq J \text { and } \nu _{j} = 0 \text { for } j > J\}\) there holds
By the definition of Λ we have \(T_{\Lambda }u(\boldsymbol {y}_{J}^{0}) = T_{\Lambda } u(\boldsymbol {y})\). Hence, we have
which implies that the Taylor power series \(T_{\mathcal F} u(\boldsymbol {y})\) converges to u(y) for every y ∈ U. Consequently,
which yields the error bound for the Taylor approximation of u by Stechkin’s Lemma [24, Lemma 5.5]: for a non-increasing rearrangement of \((\|t_{\boldsymbol {\nu }}^{u}\|_{\mathcal V})_{\boldsymbol {\nu } \in \mathcal {F}}\), there holds
with r(s) defined in (40). The same result holds for the error of the Taylor approximation of p by using the same argument. □
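Stechkin’s Lemma can be illustrated numerically. The sketch below (a toy example of our own, with θn = n−2 standing in for the Taylor coefficients) checks the best N-term tail bound with rate r = 1/s − 1, which involves only the ℓs-norm and N, not the length of the sequence:

```python
# Stechkin's Lemma: for a non-increasing sequence (theta_n) in ell^s, s < 1,
# the best N-term tail satisfies
#     sum_{n > N} theta_n <= ||theta||_{ell^s} * N^{-r},  r = 1/s - 1.
s = 0.6
r = 1.0 / s - 1.0                              # here r = 2/3
theta = [n ** -2.0 for n in range(1, 100001)]  # already non-increasing
norm_s = sum(x ** s for x in theta) ** (1.0 / s)
for N in [1, 10, 100, 1000, 10000]:
    tail = sum(theta[N:])  # error of the best N-term approximation
    assert tail <= norm_s * N ** (-r)
```

The same bound applied to the non-increasing rearrangement of the Taylor coefficients gives the dimension-independent rate in the theorem.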
Remark 3.2
We remark that the convergence results (38) and (39) are obtained under different assumptions, and neither can be implied by the other. In fact, it is clear that (39) cannot be implied by (38), as explained in Remark 2.2. On the other hand, (38) cannot be implied by (39), as shown in the following simple example: let κ0 = 1 and κj = j− 2 for j ≥ 1; then by (38) we have the convergence rate N−r for any r < 1 arbitrarily close to 1. However, by (39), for which there exists \((\rho _{j}^{-1})_{j\geq 1} \in \ell ^{t}(\mathbb {N})\) with t > 1 satisfying (15), we can only obtain a convergence rate of N−r with \(r = \frac {1}{s} - 1 = \frac {1}{t} - \frac {1}{2}< \frac {1}{2}\) for \(\sup _{\boldsymbol {y} \in U}\|u(\boldsymbol {y}) - T_{{{\Lambda }^{u}_{N}}} u(\boldsymbol {y})\|_{\mathcal V}\), and \(r = \frac {1}{t} - 1<0\), i.e., no convergence, for \(\sup _{\boldsymbol {y} \in U}\|p(\boldsymbol {y}) - T_{{{\Lambda }^{p}_{N}}} p(\boldsymbol {y})\|_{\mathcal Q}\).
Theorem 3.1 states the existence of index sets \({{\Lambda }^{u}_{N}} \subset \mathcal {F}\) and \({{\Lambda }^{p}_{N}} \subset \mathcal {F}\) that lead to the dimension-independent convergence rates. However, these index sets carry no particular structure. To guide more practical algorithm development, we consider a particular structure for these index sets, namely, a downward closed set \({\Lambda } \subset \mathcal {F}\), also known as an admissible or monotone set [19, 21, 26, 30], which satisfies
where we recall that μ ≼ν means μj ≤ νj for all j ≥ 1.
We say that a sequence \((\theta _{\boldsymbol {\nu }})_{\boldsymbol {\nu } \in \mathcal {F}}\) is monotonically decreasing if \(\theta _{\boldsymbol {\nu }} \leq \theta _{\boldsymbol {\mu }}\) whenever \(\boldsymbol {\mu } \preceq \boldsymbol {\nu }\).
Lemma 3.6
Let \((\theta _{\boldsymbol {\nu }})_{\boldsymbol {\nu } \in \mathcal {F}}\) be a monotonically decreasing sequence of positive real numbers in \(\ell ^{s}(\mathcal {F})\) with s < 1. Then there exists a sequence of downward closed and nested index sets \(({\Lambda }_{N})_{N\geq 1} \subset \mathcal {F}\) such that
Proof
By Stechkin’s Lemma, as in the proof of Theorem 3.1, there exists a sequence of index sets \(({\Lambda }_{N})_{N\geq 1} \subset \mathcal {F}\) such that (41) holds. It remains to show that (ΛN)N≥ 1 can be taken downward closed and nested. This is achieved by induction. First, for N = 1, we take Λ1 = {ν(1)} with ν(1) = 0, so that (41) holds. Suppose (41) holds for some N > 1 with a downward closed and nested index set ΛN; then we look for the next index \(\boldsymbol {\nu }(N+1) \in \mathcal {F}\) such that ΛN+ 1 := ΛN ∪{ν(N + 1)} is downward closed and (41) holds for ΛN+ 1. Let \(\mathcal {N}({\Lambda }_{N})\) denote the admissible forward neighbor set defined as
where we recall the Kronecker sequence ej = (δij)i≥ 1. Then we take
By the definition of the admissible forward neighbor set \(\mathcal {N}({\Lambda }_{N})\), the set ΛN+ 1 := ΛN ∪{ν} is downward closed for any \(\boldsymbol {\nu } \in \mathcal {N}({\Lambda }_{N})\). Moreover, the sequence \((\theta _{\boldsymbol {\nu }})_{\boldsymbol {\nu } \in \mathcal {F}}\) is monotonically decreasing, which implies 𝜃ν(N+ 1) ≤ 𝜃ν(N) since ν(N) ≼ν(N + 1) for every N ≥ 1, so that Stechkin’s Lemma for decreasing sequences applies and (41) holds for ΛN+ 1. This concludes the proof. □
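The induction step of this proof suggests a simple greedy construction of downward closed and nested sets, sketched below for finitely many dimensions (the weight θν = ρ−ν and the values of ρ are illustrative choices of ours, not from the paper):

```python
def forward_neighbors(Lam, dim):
    """Admissible forward neighbors N(Lam): multi-indices nu not in Lam
    such that nu - e_j lies in Lam for every j with nu_j > 0, so that
    Lam union {nu} remains downward closed."""
    nbrs = set()
    for mu in Lam:
        for j in range(dim):
            cand = mu[:j] + (mu[j] + 1,) + mu[j + 1:]
            if cand in Lam or cand in nbrs:
                continue
            if all(cand[:i] + (cand[i] - 1,) + cand[i + 1:] in Lam
                   for i in range(dim) if cand[i] > 0):
                nbrs.add(cand)
    return nbrs

def greedy_downward_closed(theta, N, dim):
    """Grow nested downward closed sets Lam_1 subset ... subset Lam_N by
    repeatedly adding the admissible forward neighbor maximizing theta."""
    Lam = {(0,) * dim}
    while len(Lam) < N:
        Lam.add(max(forward_neighbors(Lam, dim), key=theta))
    return Lam

# illustrative monotonically decreasing weight theta_nu = prod_j rho_j^{-nu_j}
rho = (2.0, 3.0, 5.0)
def theta(nu):
    w = 1.0
    for r, n in zip(rho, nu):
        w /= r ** n
    return w

Lam = greedy_downward_closed(theta, 10, 3)
# check downward closedness: decreasing any positive entry stays in Lam
assert all(nu[:j] + (nu[j] - 1,) + nu[j + 1:] in Lam
           for nu in Lam for j in range(3) if nu[j] > 0)
```

By construction the returned sets are nested and downward closed, mirroring the proof.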
Let \((\theta _{\boldsymbol {\nu }})_{\boldsymbol {\nu }\in \mathcal {F}}\) be a real sequence. Then the sequence \((\theta _{\boldsymbol {\nu }}^{\ast })_{\boldsymbol {\nu } \in \mathcal {F}}\) with
is monotonically decreasing. If the sequence \((\theta _{\boldsymbol {\nu }}^{\ast })_{\boldsymbol {\nu } \in \mathcal {F}}\) is \(\ell ^{s}(\mathcal {F})\)-summable, then we define the \({\ell ^{s}_{m}}(\mathcal {F})\)-norm of \((\theta _{\boldsymbol {\nu }})_{\boldsymbol {\nu } \in \mathcal {F}}\) as
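On a finite truncation of \(\mathcal{F}\), the monotone envelope in (42) can be computed by a backward sweep. The following sketch (our own illustration, assuming the standard definition \(\theta^{\ast}_{\boldsymbol{\nu}} = \sup_{\boldsymbol{\mu} \succeq \boldsymbol{\nu}} |\theta_{\boldsymbol{\mu}}|\)) does so on a small two-dimensional grid:

```python
from itertools import product

def monotone_envelope(theta, shape):
    """Compute theta*_nu = max over mu >= nu (componentwise) of |theta_mu|
    on a finite grid, by a backward dynamic-programming sweep:
    theta*[nu] = max(|theta[nu]|, max_j theta*[nu + e_j])."""
    env = {}
    # reverse lexicographic order guarantees nu + e_j is visited before nu
    for nu in sorted(product(*(range(n) for n in shape)), reverse=True):
        best = abs(theta[nu])
        for j in range(len(shape)):
            succ = nu[:j] + (nu[j] + 1,) + nu[j + 1:]
            if succ in env:
                best = max(best, env[succ])
        env[nu] = best
    return env

# tiny two-dimensional example with a non-monotone input sequence
theta = {(0, 0): 1.0, (1, 0): 0.2, (0, 1): 0.5, (1, 1): 0.7}
env = monotone_envelope(theta, (2, 2))
assert env[(1, 0)] == 0.7  # lifted by theta_{(1,1)} = 0.7
assert env == {(0, 0): 1.0, (1, 0): 0.7, (0, 1): 0.7, (1, 1): 0.7}
```

The output is monotonically decreasing by construction, as required for Lemma 3.6.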
We now provide the dimension-independent convergence rates for the case of downward closed and nested index sets for saddle point problems, following the approach in [26] for elliptic problems.
Theorem 3.2
Under Assumptions 2.1 and 2.2, there exist two sequences of downward closed and nested index sets \(({{\Lambda }^{u}_{N}})_{N \geq 1}\) and \(({{\Lambda }^{p}_{N}})_{N \geq 1}\) with indices \(\boldsymbol {\nu } \in \mathcal {F}\) corresponding to the N largest Taylor coefficients \(\|t_{\boldsymbol {\nu }}^{u}\|_{\mathcal V}\) and \(\|t_{\boldsymbol {\nu }}^{p}\|_{\mathcal Q}\), respectively, such that
under Assumption 2.3, and
under Assumption 2.4, where the dimension-independent convergence rate r is given by
Proof
By Theorem 3.1 and Lemma 3.6, we only need to show that \((\|t_{\boldsymbol {\nu }}^{u}\|_{\mathcal V}^{\ast })_{\boldsymbol {\nu } \in \mathcal {F}}\) and \((\|t_{\boldsymbol {\nu }}^{p}\|_{\mathcal Q}^{\ast })_{\boldsymbol {\nu } \in \mathcal {F}}\), the associated monotone envelopes defined in (42) for \((\|t_{\boldsymbol {\nu }}^{u}\|_{\mathcal V})_{\boldsymbol {\nu } \in \mathcal {F}}\) and \((\|t_{\boldsymbol {\nu }}^{p}\|_{\mathcal Q})_{\boldsymbol {\nu } \in \mathcal {F}}\), respectively, are \(\ell ^{s}(\mathcal {F})\)-summable under Assumption 2.3, and \(\ell ^{t}(\mathcal {F})\)-summable under Assumption 2.4. Under Assumption 2.3, by Lemma 3.3 we have
since \((\boldsymbol {\rho }^{-\boldsymbol {\nu }})_{\boldsymbol {\nu } \in \mathcal {F}}\) is monotonically decreasing by (20). Moreover, as shown in Lemma 3.4, \((\boldsymbol {\rho }^{-\boldsymbol {\nu }})_{\boldsymbol {\nu } \in \mathcal {F}}\) is \(\ell ^{s}(\mathcal {F})\)-summable, which concludes. Under Assumption 2.4, we have by (34) and (35) that
since both \((\boldsymbol {\rho }^{-\boldsymbol {\nu }})_{\boldsymbol {\nu } \in \mathcal {F}}\) and \((\theta _{\boldsymbol {\nu }}^{\ast })_{\boldsymbol {\nu } \in \mathcal {F}}\) are monotonically decreasing, where we denote
The \(\ell ^{t}(\mathcal {F})\)-summability of \((\boldsymbol {\rho }^{-\boldsymbol {\nu }})_{\boldsymbol {\nu } \in \mathcal {F}}\) can be shown as in (36). For the \(\ell ^{t}(\mathcal {F})\)-summability of \((\theta _{\boldsymbol {\nu }}^{\ast })_{\boldsymbol {\nu } \in \mathcal {F}}\), we proceed as follows. As \((\rho _{j}^{-1})_{j\geq 1} \in \ell ^{t}(\mathbb {N})\), there exists a J such that \(\rho _{j}^{-1} < 1/4\) for all j > J, which implies
Moreover, as ρj > 1, there exists \(K \in \mathbb {N}\) such that (2 + k)/(1 + k) < ρj for all j ≤ J when k > K, so that
By defining a sequence of functions \((\theta ^{(J, K)}_{j})_{j\geq 1}\) as
and defining a new sequence \(({\varTheta }_{\boldsymbol {\nu }})_{\boldsymbol {\nu } \in \mathcal {F}}\) as
we have that \(({\varTheta }_{\boldsymbol {\nu }})_{\boldsymbol {\nu } \in \mathcal {F}}\) is monotonically decreasing by (43) and (44). Moreover, the monotone envelope of \((\theta _{\boldsymbol {\nu }})_{\boldsymbol {\nu } \in \mathcal {F}}\) satisfies \(\theta ^{\ast }_{\boldsymbol {\nu }} \leq {\varTheta }_{\boldsymbol {\nu }}\) for all \(\boldsymbol {\nu } \in \mathcal {F}\). Therefore, we only need to show \(({\varTheta }_{\boldsymbol {\nu }})_{\boldsymbol {\nu } \in \mathcal {F}} \in \ell ^{t}(\mathcal {F})\). By definition we have
Since ρj > 1, there exists a constant \(C^{(K, J)}_{j} < \infty \) for each j ≥ 1 such that
Therefore, the first term of (45) can be bounded as
The second term of (45) can be bounded as in (37), i.e.,
which is finite when \((\rho _{j}^{-1})_{j\geq 1} \in \ell ^{t}(\mathbb {N})\). Hence, \(({\varTheta }_{\boldsymbol {\nu }})_{\boldsymbol {\nu } \in \mathcal {F}} \in \ell ^{t}(\mathcal {F})\), which concludes. □
Remark 3.3
Note that under Assumption 2.3, the same convergence rate is obtained in Theorem 3.2 for downward closed and nested index sets as in Theorem 3.1 for more general index sets. Under Assumption 2.4, however, the convergence rate for the Taylor approximation of u becomes different: it deteriorates from N−r(s) to N−r(t) with r(s) > r(t), since \(s = \frac {2t}{2+t} < t \), for downward closed and nested index sets. This deterioration is due to the bound (34), which may be crude, so the convergence rate may not be optimal.
4 Conclusions
We studied sparse polynomial approximations for parametric saddle point problems, which cover problems such as the Stokes equations, the mixed formulation of the Poisson equation, and the time-harmonic Maxwell equations. We considered the setting of a random input parameter parametrized by a countably infinite number of independent parameters serving as the coefficients of an affine expansion in a series of basis functions. Both globally and locally supported basis functions were considered, which led to different assumptions on the sparsity of the parametrization. Based on these two sparsity assumptions and the results in [4, 25] for affine parametric elliptic PDEs, we proved the ℓs-summability of the coefficients of the Taylor expansion of the parametric solutions by two different approaches, namely analytic regularity and weighted ℓ2-summability, respectively, for the saddle point problems. From the ℓs-summability we obtained dimension-independent algebraic convergence rates of the sparse polynomial approximations, thus breaking the curse of dimensionality for high- or infinite-dimensional parametric saddle point problems. Moreover, we considered sparse polynomial approximations of the parametric solutions on downward closed and nested multi-index sets, which also achieve dimension-independent convergence rates.
The analysis in this work can serve as a guideline for error estimates of model reduction techniques such as reduced basis methods constructed by greedy algorithms [16]. Note that we only considered uniformly distributed parameters in this work. We are interested in studying more general distributions, such as Gaussian or log-normal random fields, for saddle point problems, motivated by their recent analysis for elliptic PDEs [3, 11, 28]. Finally, we mention a particular type of parametric saddle point problem: optimality systems arising from stochastic PDE-constrained optimal control [12, 14, 35]. Application of the analysis to such problems would be interesting.
References
Babuška, I., Nobile, F., Tempone, R.: A stochastic collocation method for elliptic partial differential equations with random input data. SIAM J. Numer. Anal. 45, 1005–1034 (2007)
Babuška, I., Tempone, R., Zouraris, G.E.: Galerkin finite element approximations of stochastic elliptic partial differential equations. SIAM J. Numer. Anal. 42, 800–825 (2004)
Bachmayr, M., Cohen, A., DeVore, R., Migliorati, G.: Sparse polynomial approximation of parametric elliptic PDEs. Part II: lognormal coefficients. ESAIM Math. Model. Numer. Anal. 51, 341–363 (2017)
Bachmayr, M., Cohen, A., Migliorati, G.: Sparse polynomial approximation of parametric elliptic PDEs. Part I: affine coefficients. ESAIM Math. Model. Numer. Anal. 51, 321–339 (2017)
Benner, P., Ohlberger, M., Cohen, A., Willcox, K.: Model Reduction and Approximation: Theory and Algorithms. SIAM, Philadelphia (2017)
Berner, J., Grohs, P., Jentzen, A.: Analysis of the generalization error: Empirical risk minimization over deep artificial neural networks overcomes the curse of dimensionality in the numerical approximation of Black–Scholes partial differential equations. SIAM J. Math. Data Sci. 2, 631–657 (2020)
Binev, P., Cohen, A., Dahmen, W., DeVore, R., Petrova, G., Wojtaszczyk, P.: Convergence rates for greedy algorithms in reduced basis methods. SIAM J. Math. Anal. 43, 1457–1472 (2011)
Boffi, D., Brezzi, F., Fortin, M.: Mixed Finite Element Methods and Applications. Springer Series in Computational Mathematics, vol. 44. Springer, Berlin (2013)
Boyaval, S., Le Bris, C., Lelièvre, T., Maday, Y., Nguyen, N.C., Patera, A.T.: Reduced basis techniques for stochastic problems. Arch. Comput. Methods Eng. 17, 435–454 (2010)
Buffa, A., Maday, Y., Patera, A.T., Prud’homme, C., Turinici, G.: A priori convergence of the greedy algorithm for the parametrized reduced basis method. ESAIM Math. Model. Numer. Anal. 46, 595–603 (2012)
Chen, P.: Sparse quadrature for high-dimensional integration with Gaussian measure. ESAIM Math. Model. Numer. Anal. 52, 631–657 (2018)
Chen, P., Quarteroni, A.: Weighted reduced basis method for stochastic optimal control problems with elliptic PDE constraints. SIAM/ASA J. Uncertain. Quantif. 2, 364–396 (2014)
Chen, P., Quarteroni, A.: A new algorithm for high-dimensional uncertainty quantification based on dimension-adaptive sparse grid approximation and reduced basis methods. J. Comput. Phys. 298, 176–193 (2015)
Chen, P., Quarteroni, A., Rozza, G.: Multilevel and weighted reduced basis method for stochastic optimal control problems constrained by Stokes equations. Numer. Math. 133, 67–102 (2016)
Chen, P., Quarteroni, A., Rozza, G.: Reduced basis methods for uncertainty quantification. SIAM/ASA J. Uncertain. Quantif. 5, 813–869 (2017)
Chen, P., Schwab, Ch.: Sparse-grid, reduced-basis Bayesian inversion. Comput. Methods Appl. Mech. Eng. 297, 84–115 (2015)
Chen, P., Villa, U., Ghattas, O.: Hessian-based adaptive sparse quadrature for infinite-dimensional Bayesian inverse problems. Comput. Methods Appl. Mech. Eng. 327, 147–172 (2017)
Chkifa, A., Cohen, A., Schwab, Ch.: Breaking the curse of dimensionality in sparse polynomial approximation of parametric PDEs. J. Math. Pures Appl. 103, 400–428 (2015)
Chkifa, A., Cohen, A., DeVore, R., Schwab, Ch.: Sparse adaptive Taylor approximation algorithms for parametric and stochastic elliptic PDEs. ESAIM Math. Model. Numer. Anal. 47, 253–280 (2013)
Chkifa, A., Cohen, A., Migliorati, G., Nobile, F., Tempone, R.: Discrete least squares polynomial approximation with random evaluations – application to parametric and stochastic elliptic PDEs. ESAIM Math. Model. Numer. Anal. 49, 815–837 (2015)
Chkifa, A., Cohen, A., Schwab, Ch.: High-dimensional adaptive sparse polynomial interpolation and applications to parametric PDEs. Found. Comput. Math. 14, 601–633 (2014)
Cliffe, K.A., Giles, M.B., Scheichl, R., Teckentrup, A.L.: Multilevel Monte Carlo methods and applications to elliptic PDEs with random coefficients. Comput. Visual. Sci. 14, 3 (2011)
Cohen, A., DeVore, R.: Approximation of high-dimensional parametric PDEs. Acta Numer. 24, 1–159 (2015)
Cohen, A., DeVore, R., Schwab, Ch.: Convergence rates of best N-term Galerkin approximations for a class of elliptic sPDEs. Found. Comput. Math. 10, 615–646 (2010)
Cohen, A., DeVore, R., Schwab, Ch.: Analytic regularity and polynomial approximation of parametric and stochastic elliptic PDE’s. Anal. Appl. 9, 11–47 (2011)
Cohen, A., Migliorati, G.: Multivariate approximation in downward closed polynomial spaces. In: Dick, J., Kuo, F.Y., Woźniakowski, H (eds.) Contemporary Computational Mathematics - a Celebration of the 80th Birthday of Ian Sloan, pp 233–282. Springer, Cham (2018)
Doostan, A., Owhadi, H.: A non-adapted sparse approximation of PDEs with stochastic inputs. J. Comput. Phys. 230, 3015–3034 (2011)
Ernst, O.G., Sprungk, B., Tamellini, L.: Convergence of sparse collocation for functions of countably many Gaussian random variables (with application to elliptic PDEs). SIAM J. Numer. Anal. 56, 877–905 (2018)
Gantner, R.N., Schwab, Ch.: Computational higher order quasi-Monte Carlo integration. In: Cools, R., Nuyens, D. (eds.) Monte Carlo and Quasi-Monte Carlo Methods, pp 271–288. Springer, Cham (2016)
Gerstner, T., Griebel, M.: Dimension–adaptive tensor–product quadrature. Computing 71, 65–87 (2003)
Ghanem, R.G., Spanos, P.D.: Stochastic Finite Elements: A Spectral Approach. Courier Corporation (2003)
Haji-Ali, A. -L., Nobile, F., Tempone, R.: Multi-index Monte Carlo: when sparsity meets sampling. Numer. Math. 132, 767–806 (2016)
Hervé, M.: Analyticity in Infinite Dimensional Spaces. De Gruyter Studies in Mathematics, vol. 10. Walter de Gruyter, Berlin (1989)
Horgan, C.O.: Korn’s inequalities and their applications in continuum mechanics. SIAM Rev. 37, 491–511 (1995)
Kunoth, A., Schwab, Ch.: Analytic regularity and GPC approximation for control problems constrained by linear parametric elliptic and parabolic PDEs. SIAM J. Control Optim. 51, 2442–2471 (2013)
Kuo, F.Y., Schwab, Ch., Sloan, I.H.: Quasi-Monte Carlo finite element methods for a class of elliptic partial differential equations with random coefficients. SIAM J. Numer. Anal. 50, 3351–3374 (2012)
Kutyniok, G., Petersen, P., Raslan, M., Schneider, R.: A theoretical analysis of deep neural networks and parametric PDEs. Constr. Approx. 55, 73–125 (2022)
Li, Z., Kovachki, N., Azizzadenesheli, K., Liu, B., Bhattacharya, K., Stuart, A., Anandkumar, A.: Fourier neural operator for parametric partial differential equations. arXiv:2010.08895 (2020)
Narayan, A., Jakeman, J.D., Zhou, T.: A Christoffel function weighted least squares algorithm for collocation approximations. Math. Comput. 86, 1913–1947 (2017)
Nobile, F., Tempone, R., Webster, C.G.: A sparse grid stochastic collocation method for partial differential equations with random input data. SIAM J. Numer. Anal. 46, 2309–2345 (2008)
O’Leary-Roseberry, T., Villa, U., Chen, P., Ghattas, O.: Derivative-informed projected neural networks for high-dimensional parametric maps governed by PDEs. Comput. Methods Appl. Mech. Eng. 388, 114199 (2022)
Quarteroni, A.: Numerical Models for Differential Problems, 2nd edn. Springer, Milano (2013)
Rauhut, H., Schwab, Ch.: Compressive sensing Petrov-Galerkin approximation of high-dimensional parametric operator equations. Math. Comput. 86, 661–700 (2017)
Schillings, C., Schwab, Ch.: Sparse, adaptive Smolyak quadratures for Bayesian inverse problems. Inverse Probl. 29, 065011 (2013)
Schwab, Ch., Todor, R.A.: Karhunen–Loève approximation of random fields by generalized fast multipole methods. J. Comput. Phys. 217, 100–122 (2006)
Schwab, Ch., Zech, J.: Deep learning in high dimension: Neural network expression rates for generalized polynomial chaos expansions in UQ. Anal. Appl. 17, 19–55 (2019)
Soize, C.: Random vectors and random fields in high dimension: Parametric model-based representation, identification from data, and inverse problems. In: Ghanem, R., Higdon, D., Owhadi, H (eds.) Handbook of Uncertainty Quantification, pp 883–935. Springer, Cham (2017)
Tran, H., Webster, C.G., Zhang, G.: Analysis of quasi-optimal polynomial approximations for parameterized PDEs with deterministic and stochastic coefficients. Numer. Math. 137, 451–493 (2017)
Xiu, D., Hesthaven, J.S.: High-order collocation methods for differential equations with random inputs. SIAM J. Sci. Comput. 27, 1118–1139 (2005)
Xiu, D., Karniadakis, G.E.: The Wiener–Askey polynomial chaos for stochastic differential equations. SIAM J. Sci. Comput. 24, 619–644 (2002)
Xu, J., Zikatanov, L.: Some observations on Babuška and Brezzi theories. Numer. Math. 94, 195–202 (2003)
Zech, J., Schwab, Ch.: Convergence rates of high dimensional Smolyak quadrature. ESAIM Math. Model. Numer. Anal. 54, 1259–1307 (2020)
Acknowledgements
This work is partially supported by NSF under grant DMS-2012453, DOE under grants DE-SC0019303 and DE-SC0021239, and the Simons Foundation under grant 56065.
This paper is dedicated to Prof. Alfio Quarteroni on the occasion of his 70th birthday.
Chen, P., Ghattas, O. Sparse Polynomial Approximations for Affine Parametric Saddle Point Problems. Vietnam J. Math. 51, 151–175 (2023). https://doi.org/10.1007/s10013-022-00584-1
Keywords
- Sparse polynomial approximation
- Saddle point problems
- Parametric PDEs
- Dimension-independent convergence