Abstract
We study multivariate integration of functions that are invariant under the permutation (of a subset) of their arguments. Recently, in Nuyens et al. (Adv Comput Math 42(1):55–84, 2016), the authors derived an upper estimate for the nth minimal worst case error for such problems and showed that under certain conditions this upper bound only weakly depends on the dimension. We extend these results by proposing two (semi-) explicit construction schemes. We develop a component-by-component algorithm to find the generating vector for a shifted rank-1 lattice rule that obtains a rate of convergence arbitrarily close to \(\mathcal {O}(n^{-\alpha })\), where \(\alpha >1/2\) denotes the smoothness of our function space and n is the number of cubature nodes. Further, we develop a semi-constructive algorithm that builds on point sets that can be used to approximate the integrands of interest with a small error; the cubature error is then bounded by the error of approximation. Here the same rate of convergence is achieved while the dependence of the error bounds on the dimension d is significantly improved.
Similar content being viewed by others
Avoid common mistakes on your manuscript.
1 Introduction
In recent times, the efficient calculation of multidimensional integrals has become more and more important, especially when working with a very high number of dimensions. It is well known that such problems are numerically feasible only if certain additional assumptions on the integrands of interest are imposed. In this paper, we seek to construct cubature rules to approximate integrals of d-variate functions that are invariant under the permutation of (subsets of) their arguments. Such a setting was studied earlier in [16] in the context of approximations to general linear operators and also recently in [10, 17] for the special case of integration. These investigations were motivated by recent research by Yserentant [19], who proved that the rate of convergence of the approximations to the electronic Schrödinger equation of N-electron systems does not depend on the number of electrons N, thus showing that it is independent of the number of variables associated with the numerical problem. This is due to the fact that there is some inherent symmetry in the exact solutions (called electronic wave functions) to these equations. In fact, such functions are antisymmetric w.r.t. the exchange of electrons having the same spin, as described by Pauli’s exclusion principle, which is well known in theoretical physics. In [16], general linear problems defined on spaces equipped with this type of (anti-) symmetry constraints were shown to be (strongly) polynomially tractable (under certain conditions). That is, the complexity of solving such problems depends at most weakly on the dimension. Inspired by these results, it was shown in [10] that under moderate assumptions, the nth minimal worst case error for integration in the permutation-invariant (hence symmetric) setting of certain periodic function classes can also be bounded nicely. In particular, the existence of quasi-Monte Carlo (QMC) rules that can attain the rate of convergence \(\mathcal {O}(n^{-1/2})\) (known from Monte Carlo methods) was established. If there are enough variables that are equivalent with respect to permutations, then the implied constant in this bound grows only polynomially with the dimension d. Hence, the integration problem was shown to be polynomially tractable. However, the proofs given in [10] are based on nonconstructive averaging arguments.
In this paper, we extend the results of [10] and provide two (semi-) explicit construction schemes. First, we present a procedure that is inspired by the well-known component-by-component (CBC) search algorithm for lattice rules, introduced by Sloan and Reztsov [14]. More precisely, we prove that the generating vector of a randomly shifted rank-1 lattice rule that achieves an order of convergence arbitrarily close to \(\mathcal {O}(n^{-\alpha })\) can be found component-by-component. Here \(\alpha >1/2\) denotes the smoothness of our permutation-invariant integrands, and n is the number of cubature nodes from a d-dimensional integration lattice. Originally, the CBC algorithm was developed with the aim of reducing the search space for the generating vector, i.e., to make the search quicker. Later on it was shown by Kuo [4] that this algorithm can also be used to find lattice rules that achieve optimal rates of convergence in weighted function spaces of Korobov and Sobolev type, and this improved on the Monte Carlo rate \(\mathcal {O}(n^{-1/2})\) shown previously in [5, 13] and [12]. For an overview of these schemes for different problem settings, we refer to [2, 9].
Next, we present a completely different approach towards cubature rules that does not rely on the concept of lattice points. Based on the knowledge about sampling points that are suitable for approximation, we iteratively derive a cubature rule that attains the desired order of convergence for solving the integration problem in more general reproducing kernel Hilbert spaces (RKHSs). The basic idea of this type of algorithms can be traced back to Wasilkowski [15]. The proof relies on the fact that in RKHSs, the worst case error for integration can be controlled by an average case error for the function approximation problem. This way we arrive at a cubature rule that achieves a rate of convergence arbitrarily close to the optimal rate of \(\mathcal {O}(n^{-\alpha })\), and prove that under certain conditions, we can achieve (strong) polynomial tractability.
Our material is organized as follows. In Sect. 2, we briefly recall the problem setting discussed in [10]. In particular, we define the permutation-invariant subspaces of the periodic RKHSs under consideration. Section 3 introduces the worst case error of the problem, and Proposition 3.1 outlines the main tractability results derived in [10]. Sections 4 and 5 contain our main results, namely the new CBC construction and the application of the second cubature rule explained to permutation invariant subspaces, respectively. In Sect. 4, we start with a brief description of (shifted) rank-1 lattice rules and recall some related results obtained in [10]. Then, Theorem 4.5 in Sect. 4.2 contains our new CBC construction. Section 5 is devoted to the alternate approach. We begin with a short overview of basic facts related to approximation problems w.r.t. the average case setting in Sect. 5.1. Afterwards, in Sect. 5.2, our quadrature rule for general RKHSs is derived. The corresponding error analysis can be found in Theorem 5.8. Finally, in Sect. 5.3, the obtained results are applied to the permutation-invariant setting described in Sect. 2. The paper is concluded with an Appendix that contains the proofs of some technical lemmas needed in our derivation.
We briefly introduce some notation that will be used throughout the paper. Vectors are denoted by boldface letters, e.g., \(\varvec{x},\varvec{k},\varvec{h}\). In contrast, scalars are denoted by, e.g., x, k, h. To denote multidimensional spectral indices, we mainly use \(\varvec{h}\) or \(\varvec{k}\). The elements of a d-dimensional vector are denoted as \(\varvec{k}=(k_1,\ldots ,k_d)\). We use \(\varvec{x}\cdot \varvec{y}\) to denote the inner product or the dot product of two vectors \(\varvec{x}\) and \(\varvec{y}\). The prefix \(\#\) is used for the cardinality of a set. Finally, the norm of a function f in a function space \(H_d\) is denoted by .
2 Setting
We study multivariate integration
for functions from subsets of the Hilbert space of periodic functions
Here \(\widehat{f}(\varvec{k})\) are the Fourier coefficients of f, given by
and \(r_{\alpha ,\varvec{\beta }} :\mathbb {Z}^d \rightarrow (0,\infty )\) is a d-fold tensor product involving a generating function \(R:[1,\infty ) \rightarrow (0,\infty )\) and a tuple \(\varvec{\beta } = (\beta _0,\beta _1)\) of positive parameters such that
Note that the problem is said to be well scaled if \(\beta _0=1\). The parameter \(\alpha \ge 0\) describes the smoothness. Throughout this paper, we assume that
Moreover, we assume that \((R(m)^{-1})_{m\in \mathbb {N}} \in \ell _{2\alpha }\); i.e.,
The functions \(f\in F_{d}(r_{\alpha ,\varvec{\beta }})\) possess an absolutely convergent Fourier expansion. Also, the above assumptions imply \(R(m)\sim m\) and \(\alpha >1/2\), and for these conditions, it is known that \(F_d\) is a d-fold tensor product of a univariate reproducing kernel Hilbert space (RKHS) \(F_1=H(K_1)\) with kernel \(K_1\), see, e.g., [6, Appendix A]. The inner product on the RKHS \(F_d = H(K_d)\) is then given by
We now recall the definition of \(I_d\) -permutation-invariant functions \(f\in F_d(r_{\alpha ,\varvec{\beta }})\), where \(I_d \subseteq \{1,\ldots ,d\}\) is a fixed subset of coordinates. As discussed in [16, 17] and [10], these functions satisfy the constraint that they are invariant under all permutations of the variables with indices in \(I_d\); i.e.,
where
(Note that this set always contains at least the identity permutation.) For example, if \(I_d= \{1,2\}\), then \(\mathcal {S}_d = \left\{ (1,2,\ldots ,d)\mapsto (1,2,\ldots ,d),\, (1,2,\ldots ,d)\mapsto (2,1,\ldots ,d) \right\} \) and (2) reduces to the condition \(f(x_1,x_2,\ldots , x_d)= f(x_2,x_1,\ldots , x_d)\). With slight abuse of notation, we shall use P also in the functional notation; i.e., we let
The subspaces of all \(I_d\)-permutation-invariant functions in \(F_d\) will be denoted by \(\mathfrak {S}_{I_d}(F_d(r_{\alpha ,\varvec{\beta }}))\). For fully permutation-invariant functions, we will simply write \(\mathfrak {S}(F_d(r_{\alpha ,\varvec{\beta }}))\). It is known that if \(I_d = \{i_1, i_2, \ldots , i_{\# I_d}\}\), the set of all
with \(\varvec{k}\) from the set
constitutes an orthonormal basis of \(\mathfrak {S}_{I_d}(F_d(r_{\alpha ,\varvec{\beta }}))\); see [16] for more details. Here,
accounts for the repetitions in the multiindex \(\varvec{k}\). Since the subspace \(\mathfrak {S}_{I_d}(F_d(r_{\alpha ,\varvec{\beta }}))\) is equipped with the same norm as the entire space \(F_d(r_{\alpha ,\varvec{\beta }})\), it is again a RKHS. Moreover, from [10] we know that its reproducing kernel is given by
Finally, we mention that (using a suitable rearrangement of coordinates) the space \(\mathfrak {S}_{I_d}(F_d(r_{\alpha ,\varvec{\beta }}))\) can be seen as the tensor product of the fully permutation-invariant subset of the \(\# I_d\)-variate space with the entire \((d-\# I_d)\)-variate space; i.e.,
Hence, the reproducing kernel also factorizes like
For more details about the setting, we refer the reader to [10].
3 Worst Case Error and Tractability
We approximate the integral (1) by the cubature rule
that samples f at the given points \(\varvec{t}^{(j)}\in [0,1]^d\), \(j=0,\ldots ,n-1\), where the weights \(w_j\) are well-chosen real numbers. If \(w_0=\cdots =w_{n-1}=1\), then \(Q_{d,n}\) is a quasi-Monte Carlo (QMC) rule, which we will denote by \(\mathrm {QMC}_{d,n} :=\, \mathrm {QMC}_{d,n}(\,\cdot \,; \varvec{t}^{(0)},\ldots ,\varvec{t}^{(n-1)})\).
If K is the 2d-variate, real-valued reproducing kernel of a (separable) RKHS \(H_d\) of functions on \([0,1]^d\), the worst case error of \(Q_{d,n}\) is given by
see, e.g., Hickernell and Woźniakowski [3]. The nth minimal worst case error for integration on \(H_d\) is then given by
Here the infimum is taken with respect to the class of cubature rules \(Q_{d,n}\) that use at most n samples of the input function.
We briefly recall the concepts of tractability that will be used in this paper, as described in Novak and Woźniakowski [6]. Let \(n = n(\varepsilon , d)\) denote the information complexity w.r.t. the normalized error criterion, i.e., the minimal number of function values necessary to reduce the initial error \(e(0,d; H_d)\) by a factor of \(\varepsilon > 0\), in the d-variate case. Then a problem is said to be polynomially tractable if \(n(\varepsilon ,d)\) is upper bounded by some polynomial in \(\varepsilon ^{-1}\) and d, i.e., if there exist constants \(C,p>0\) and \(q \ge 0\) such that for all \(d\in \mathbb {N}\) and every \(\varepsilon \in (0,1),\)
If this bound is independent of d, i.e., if we can take \(q=0\), then the problem is said to be strongly polynomially tractable. Problems are called polynomially intractable if (6) does not hold for any such choice of C, p, and q. For the sake of completeness, we mention that a problem is said to be weakly tractable if its information complexity does not grow exponentially with \(\varepsilon ^{-1}\) and d, i.e., if
In [10, Theorem 3.6], the following tractability result has been shown for the spaces \(H_d = \mathfrak {S}_{I_d}(F_d(r_{\alpha ,\varvec{\beta }}))\).
Proposition 3.1
For \(d\ge 2\), let \(I_d\subseteq \{1,\ldots ,d\}\) with \(\# I_d \ge 2\), and assume
Consider the integration problem on the \(I_d\)-permutation-invariant subspace \(\mathfrak {S}_{I_d}(F_d(r_{\alpha ,\varvec{\beta }}))\) of \(F_d(r_{\alpha ,\varvec{\beta }})\). Then:
-
for all n and \(d\in \mathbb {N}\), the nth minimal worst case error is bounded by
$$\begin{aligned} e(n,d; \mathfrak {S}_{I_d}(F_d(r_{\alpha ,\varvec{\beta }})))&\le e(0,d; \mathfrak {S}_{I_d}(F_d(r_{\alpha ,\varvec{\beta }}))) \, \sqrt{V^* + \frac{1}{1-\eta ^*}} \nonumber \\&\qquad \times \left( 1 + \frac{2 \beta _1 \mu _R(\alpha )}{\beta _0} \right) ^{(d-\#I_d)/2} (\#I_d)^{V^*\!/2} \frac{1}{\sqrt{n}}, \end{aligned}$$(8)where the constants \(V^*\) and \(\eta ^*\) are chosen such that
$$\begin{aligned} \eta ^* := \eta ^*(V^*):= \sum _{m=V^*+1}^\infty \frac{2 \, \beta _1}{\beta _0 \, R(m)^{2\alpha }} < 1. \end{aligned}$$(9) -
there exists a QMC rule that achieves this bound.
-
if \(d-\#I_d \in \mathcal {O}(\ln d)\), then the integration problem is polynomially tractable (with respect to the worst case setting and the normalized error criterion).
-
if \(d-\#I_d \in \mathcal {O}(1)\) and (9) holds for \(V^*=0\), then we obtain strong polynomial tractability.
It is observed in [10] that for the periodic unanchored Sobolev space, i.e., \(\beta _0=\beta _1=1\) and \(R(m)=2\pi m\), the assumption (7) is fulfilled if \(\alpha > 1/2\). In addition, for sufficiently many permutation-invariance conditions and sufficiently large \(\alpha \), we even have strong polynomial tractability. Note that for \(\alpha \) near 1/2, the factor \((1-\eta ^*)^{-1/2}\) is extremely large, whereas for \(\alpha =1\) we already have \((1-\eta ^*)^{-1/2} \le 1.05\). We stress the importance of the above tractability result by mentioning that the integration problem on the full space (\(I_d = \emptyset \)) is not even weakly tractable. That is, in this case, the information complexity \(n(\varepsilon ,d)\) grows at least exponentially in \(\varepsilon ^{-1}+d\). We refer to [10] for a more detailed discussion about this result.
4 Component-By-Component Construction of Rank-1 Lattice Rules
4.1 Definitions and Known Results
We briefly introduce unshifted and shifted rank-1 lattice rules.
For \(n\in \mathbb {N}\), an n-point d-dimensional rank-1 lattice rule \(Q_{d,n}(\varvec{z})\) is a QMC rule (i.e., an operator as given in (4) with \(w_0=\cdots =w_{n-1}=1\)) that is fully determined by its generating vector \(\varvec{z}\in \mathbb {Z}_n^d:=\{0,1,\ldots ,n-1\}^d\). It uses points \(\varvec{t}^{(j)}\) from an integration lattice \(L(\varvec{z},n)\) induced by \(\varvec{z}\):
We will restrict ourselves to prime numbers n for simplicity. The following character property over \(\mathbb {Z}_n^d\) w.r.t. the trigonometric basis is useful:
The collection of \(\varvec{h}\in \mathbb {Z}^d\) for which this sum is one is called the dual lattice, and we denote it by \(L(\varvec{z},n)^\bot \). It has been shown in [10, Corollary 4.7] that irrespective of the set \(I_d\), for standard choices of \(r_{\alpha ,\varvec{\beta }}\) (such as in the periodic Sobolev space or Korobov space), the class of unshifted lattice rules \(Q_{d,n}(\varvec{z})\) is too small to obtain strong polynomial tractability. We thus consider randomly shifted rank-1 lattice rules in what follows.
For \(n\in \mathbb {N}\), an n-point d-dimensional shifted rank-1 lattice rule consists of an unshifted rule \(Q_{d,n}(\varvec{z})\) shifted by some shift \(\varvec{\Delta }\in [0,1)^d\); i.e., its points are given by
We will denote such a cubature rule by \(Q_{d,n}(\varvec{z})+\varvec{\Delta }\).
The root mean squared worst case error helps in establishing the existence of good shifts. It is given by
The above error can be calculated using the shift-invariant kernel
associated with \(K_{d,I_d}\) given in (3). From [10, Proposition 4.8], we know that for \(d\in \mathbb {N}\) and \(I_d\subseteq \{1,\ldots ,d\}\), the shift-invariant kernel is given by
and, for every unshifted rank-1 lattice rule \(Q_{d,n}(\varvec{z})\), the mean squared worst case error satisfies
where \(c=2\beta _1\mu _R(\alpha )/\beta _0\) does not depend on d and n, see [10, Theorem 4.9]. Here \(H_{d,I_d}^{\mathrm {shinv}}\) denotes the RKHS with kernel \(K_{d,I_d}^{\mathrm {shinv}}\) and \(L(\varvec{z},n)^{\perp }\) is the dual lattice induced by \(\varvec{z}\in \mathbb {Z}_n^d\). Moreover, the following existence result has been derived; cf. [10, Theorems 4.9 and 4.11].
Proposition 4.1
Let \(d\in \mathbb {N}\), \(I_d\subseteq \{1,\ldots ,d\}\), and \(n\in \mathbb {N}\) with \(n\ge c_R\) be prime. Then there exists a shifted rank-1 lattice rule \(Q_{d,n}(\varvec{z^*})+\varvec{\Delta ^*}\) for integration of \(I_d\)-permutation-invariant functions in \(F_d(r_{\alpha ,\varvec{\beta }})\) such that
with \(c_R\) as defined in Sect. 2 and
Remark 4.2
Some remarks are in order:
-
1.
In particular, for \(\lambda =1\), the above proposition implies that there exists a shifted rank-1 lattice rule such that
$$\begin{aligned} E(Q_{d,n}(\varvec{z^*})) \le \sqrt{2c_R} \, C_{d,1}(r_{\alpha ,\varvec{\beta }})^{1/2} \, n^{-1/2}, \end{aligned}$$(14)and it can be seen (cf. [10, Corollary 4.14]) that this bound differs from (8) only by a small factor that does not depend on d, provided that (7) is fulfilled.
-
2.
If \(1<\lambda <2\alpha \), then (independent of \(I_d\)) there is an exponential dependence of \(C_{d,\lambda }(r_{\alpha ,\varvec{\beta }})\) on the dimension d that gets stronger as \(\lambda \) increases; this growth in the associated constant is typical and can be observed also when dealing with lattice rule constructions for classical spaces (without permutation-invariance). For \(\lambda \rightarrow 2\alpha \), we even have \(C_{d,\lambda }(r_{\alpha ,\varvec{\beta }})\rightarrow \infty \). Nevertheless, without going into details, we mention that the dependence of this constant on the dimension can be controlled using additional constraints on the parameters \(\varvec{\beta }=(\beta _0,\beta _1)\). We stress that these additional constraints are rather mild and would not achieve a similar tractability behavior for the full space. For an extensive discussion of lower and upper bounds for \(C_{d,\lambda }(r_{\alpha ,\varvec{\beta }})\), we refer to [10, Proposition 4.12].\(\square \)
4.2 The Component-By-Component Construction
Here we derive a component-by-component (CBC) construction to search for a generating vector \(\varvec{z}^*\in \mathbb {Z}_n^d\) such that (for a well-chosen shift \(\varvec{\Delta }^*\in [0,1)^d\)) the corresponding shifted rank-1 lattice rule \(Q_{d,n}(\varvec{z}^*)+\varvec{\Delta }^*\) satisfies an error bound similar to the one given in Proposition 4.1. Our approach is motivated by similar constructions that exist for standard spaces defined via decay conditions of Fourier coefficients; see, e.g., [4, 13, 14].
We will further make use of the following notation. Let \(d\in \mathbb {N}\) and \(I_d\subseteq \{1,\ldots ,d\}\). For subsets \(\emptyset \ne \mathfrak {u}\subseteq \{1,\ldots ,d\}\) and vectors \(\varvec{h}=(h_1,\ldots ,h_d)\in \mathbb {Z}^d\), we denote by \(\varvec{h}_\mathfrak {u}:=(h_j)_{j\in \mathfrak {u}}\) the restriction of \(\varvec{h}\) to \(\mathfrak {u}\). Hence, \(\varvec{h}_\mathfrak {u}\in \mathbb {Z}^\mathfrak {u}\) is a \(\#\mathfrak {u}\)-dimensional integer vector indexed by coordinates in \(\mathfrak {u}\). Given \(\varvec{h}_\mathfrak {u}\in \mathbb {Z}^\mathfrak {u}\), its trivial extension to the index set \(\{1,\ldots ,d\}\) is denoted by \((\varvec{h}_\mathfrak {u};\varvec{0})\in \mathbb {Z}^d\); i.e.,
Finally, the restriction of the set of admissible permutations and their associated multiplicities (see Sect. 2 for the original definitions) to the subset \(\mathfrak {u}\) will be abbreviated by
and
respectively.
To derive the component-by-component construction, we will need a little preparation. First, let us recall a technical assertion that can be found in [10, Lemma 4.10].
Lemma 4.3
Let \(d\in \mathbb {N}\), \(\varvec{h}\in \mathbb {Z}^d\), and \(n\in \mathbb {N}\) be prime. Then
where \(L(\varvec{z},n)^\bot \) denotes the dual lattice induced by \(\varvec{z}\) and \(\varvec{h}\equiv \varvec{0}~({\text {mod}}{n})\) is a shorthand for \(h_\ell \equiv 0 \pmod {n}\) for all \(\ell =1,\ldots ,d\).
Proof
The proof is based on the character property given in (10). We refer to [10] for the complete proof. \(\square \)
Furthermore, we will use a dimension-wise decomposition of the mean squared worst case error (11), as given below.
Proposition 4.4
Let \(d\in \mathbb {N}\), \(I_d\subseteq \{1,\ldots ,d\}\), and \(n\in \mathbb {N}\) be prime. Then for all generating vectors \(\varvec{z}=(z_1,\ldots ,z_d)\in \mathbb {Z}_n^d\), the mean squared worst case error of \(Q_{d,n}(\varvec{z})+\varvec{\Delta }\) w.r.t. all shifts \(\varvec{\Delta }\in [0,1)^d\) is given by
where we set
for \(\ell =1,\ldots ,d\) and \(c_{\mathfrak {u},I_d} = \beta _0^{\#\mathfrak {u}} \left( {\begin{array}{c}\# I_d\\ \#(I_d\cap \mathfrak {u})\end{array}}\right) \).
Proof
Let \(\emptyset \ne \mathfrak {u}\subseteq \{1,\ldots ,d\}\), and assume that \(\varvec{h}=(h_1,\ldots ,h_d)\in \mathbb {Z}^d\setminus \{\varvec{0}\}\) with \(h_j\ne 0\) if and only if \(j\in \mathfrak {u}\) and \(h_j=0\) otherwise. Then, clearly,
i.e., \(\varvec{h}\in L(\varvec{z},n)^\bot \) if and only if \(\varvec{h}_\mathfrak {u}\in L(\varvec{z}_\mathfrak {u},n)^\bot \). Moreover, for all such \(\varvec{h}=(\varvec{h}_\mathfrak {u};\varvec{0})\), we have
since \(\#\mathcal {S}_{\mathfrak {u},I_d}=\# (I_d\cap \mathfrak {u})!\). Further, note that the collection of all nonempty \(\mathfrak {u}\subseteq \{1,\ldots ,d\}\) can be written as the disjoint union . This gives rise to the disjoint decomposition
Next, observe that for every function \(G:\mathbb {Z}^d\rightarrow \mathbb {C}\), the following holds:
Using this and the permutation-invariance of \(\mathbb {M}_d(\cdot )!\) and \(r_{\alpha ,\varvec{\beta }}(\cdot )\), we can express the mean squared worst case error (11) as
From the above considerations, we thus infer that
which proves the result. \(\square \)
Now we are well prepared to state and prove the main theorem of this section. It presents a component-by-component construction to search for a generating vector \(\varvec{z}^*\in \mathbb {Z}_n^d\) such that (for some well-chosen shift \(\varvec{\Delta }^*\in [0,1)^d\)) the error of the shifted rank-1 lattice rule \(Q_{d,n}(\varvec{z}^*)+\varvec{\Delta }^*\) achieves a rate of convergence that is arbitrarily close to \(\mathcal {O}(n^{-\alpha })\).
Theorem 4.5
(CBC construction) Let \(d\in \mathbb {N}\), \(I_d\subseteq \{1,\ldots ,d\}\), and assume \(n\in \mathbb {N}\) with \(n\ge c_R\) to be prime. Moreover, let \(z_1^*\in \{1,\ldots ,n-1\}\) be arbitrarily fixed, and select \(z_2^*,\ldots ,z_d^*\in \mathbb {Z}_n\) component-by-component via minimizing the quantities \(B_{I_d,\ell }(z_1^*,\ldots ,z_{\ell -1}^*,z_\ell )\), defined in (16), subject to \(z_\ell \in \mathbb {Z}_n\), \(\ell =2,\ldots ,d\). Then there exists \(\varvec{\Delta }^*\in [0,1)^d\) such that the shifted rank-1 lattice rule \(Q_{d,n}=Q_{d,n}(z_1^*,\ldots ,z_d^*)+\varvec{\Delta }^*\) for integration of \(I_d\)-permutation-invariant functions in \(F_d(r_{\alpha ,\varvec{\beta }})\) satisfies
for all \(1 \le \lambda < 2\alpha \), where \(C_{d,\lambda }(r_{\alpha ,\varvec{\beta }})\) is given by (13).
Before presenting the proof of Theorem 4.5, we stress that our bound (17) for the CBC construction is only slightly worse than the error bound found in the general existence result of Proposition 4.1. It always depends on the number of permutation invariant variables and thus cannot be used to deduce strong polynomial tractability. This seems to be unavoidable, see also Remark 4.6 below. However, note that this additional linear dependence on d is a noticeable overhead only for the case of \(\lambda =1\); for \(\lambda >1\), the exponential growth in \(C_{d,\lambda }(r_{\alpha ,\varvec{\beta }})\) overshadows this dependence on d.
Proof of Theorem 4.5
Our proof is based on the dimension-wise decomposition of the mean squared worst case error given in Proposition 4.4. Once we have found \(\varvec{z}^*=(z_1^*,\ldots ,z_d^*)\in \mathbb {Z}_n^d\) such that the corresponding quantity \(E(Q_{d,n}(\varvec{z}^*))^2\) is upper bounded by the right-hand side of (17), the result follows (as usual) by the mean value theorem, which ensures the existence of a shift \(\varvec{\Delta }^*\) with
Step 1. Let \(\lambda \in [1,2\alpha )\), \(d\in \mathbb {N}\), \(I_d\subseteq \{1,\ldots ,d\}\), and n be a fixed prime. We apply Jensen’s inequality (see Lemma A.1) with \(p=1\ge 1/\lambda =q\) to the expression in (15) and obtain
with \(B_{I_d,\ell }\) as defined in (16). In Step 2 and 3 below, we will show that if we select \(\varvec{z}=\varvec{z}^*\) component-by-component, then for all \(\ell =1,\ldots ,d\), the summands in the estimate are bounded by
The recombination of these building blocks as in Proposition 4.4 then yields
such that the claim (17) is implied by (18).
Step 2. To prove (19), we apply Jensen’s inequality (again for \(p=1\ge 1/\lambda =q\)) to (16) and obtain
for all \(\varvec{z}\in \mathbb {Z}_n^d\) and \(\ell =1,\ldots ,d\).
Now consider \(\ell =1\), and assume \(z_1=z_1^*\in \{1,2,\ldots ,n-1\}\) to be fixed arbitrarily. Then the sum over all subsets in the latter bound reduces to \(\mathfrak {u}=\{1\}\). Accordingly, we have \(\mathcal {S}_{\mathfrak {u},I_d}=\{\mathrm {id}\}\) and \(\mathbb {M}_{\mathfrak {u},I_d}(\varvec{h}_\mathfrak {u})!=1\) does not depend on \(\varvec{h}_\mathfrak {u}\). Moreover, for \(\ell =1\) only those nontrivial indices \(\varvec{h}_\mathfrak {u}\) belong to \(L(\varvec{z}_\mathfrak {u},n)^\bot \), i.e., satisfy \(\varvec{h}_\mathfrak {u}\cdot \varvec{z}_\mathfrak {u}\equiv 0 \pmod {n}\), which can be written as \(\varvec{h}_\mathfrak {u}=h_1=n\,k\) for some \(k\in \mathbb {Z}\setminus \{0\}\) because \(\varvec{z}_\mathfrak {u}=z_1^*\in \mathbb {Z}_n\) and n is assumed to be prime. Hence, in this case we obtain that \(r_{\alpha ,\varvec{\beta }}^{-1/\lambda }(\varvec{h}_\mathfrak {u})\) equals
with \(c_R/n \le (1+c_R)\, \max \{1,\#I_d\}^{1/\lambda } / n\) (recall that \(\lambda < 2\alpha \) and \(n\ge c_R\)). Consequently, we have
where k was relabeled as \(\varvec{h}_\mathfrak {u}\). In other words, (19) holds true for \(\ell =1\).
Step 3. Now, assume that we have already determined \(z_1^*,\ldots ,z_{\ell -1}^*\) for some \(\ell \in \{2,\ldots ,d\}\). Then the best choice \(z_\ell ^* \in \mathbb {Z}_n\), i.e., the minimizer of \(B_{I_d,\ell }(z_1^*,\ldots ,z_{\ell -1}^*, \cdot \,)\), satisfies
where we used (20) and employed the notation \(\mathfrak {v}:=\mathfrak {u}\setminus \{\ell \}\) and \(\varvec{z}^*_\mathfrak {v}:= (z_j^*)_{j\in \mathfrak {v}}\), as well as the shorthands \(\varvec{h}_\mathfrak {v}:=(\varvec{h}_\mathfrak {u}; \varvec{0})_\mathfrak {v}\) and \(h_\ell :=(\varvec{h}_\mathfrak {u})_\ell \).
Next, we estimate the term in the brackets for every \(\varvec{h}_{\mathfrak {u}}\in (\mathbb {Z}\setminus \{0\})^\mathfrak {u}\) with \(\mathfrak {u}\subseteq \{1,\ldots ,\ell \}\) that contains \(\ell \). The character property (10) yields
where
due to Lemma 4.3 for \(d=1\).
To exploit this estimate, given \(\mathfrak {u}=\mathfrak {v}\cup \{\ell \}\) as above, we split up the sum over all \(\varvec{h}_\mathfrak {u}=(\varvec{h}_\mathfrak {v}, h_\ell )\) from \((\mathbb {Z}\setminus \{0\})^\mathfrak {u}=(\mathbb {Z}\setminus \{0\})^\mathfrak {v}\times (\mathbb {Z}\setminus \{0\})\) and obtain
In the first term, every \(h_\ell \) equals \(n\,k\) for some \(0\ne k \in \mathbb {Z}\). Thus, we can perform a change of variables similar to what was done in Step 2 and replace the inner summation by a sum over all \(k\in \mathbb {Z}\setminus \{0\}\). Then from the product structure of \(r_{\alpha ,\varvec{\beta }}(\cdot )\) and the bound (21), it follows that
But now we also need to estimate \(\mathbb {M}_{\mathfrak {u},I_d}((\varvec{h}_\mathfrak {v}, h_\ell ))!\) w.r.t. this transformation of variables. If \(\ell \notin I_d\), then replacing \(h_\ell =n\, k\) by k does not effect \(\mathbb {M}_{\mathfrak {u},I_d}\). On the other hand, if \(\ell \in I_d\), then at least we know that
Hence, dropping the condition \(h_\ell \not \equiv 0 \pmod {n}\) in the second term finally yields
for every subset \(\mathfrak {u}\subseteq \{1,\ldots ,\ell \}\) that contains \(\ell \). This immediately implies the desired bound (19) on \(B_{I_d,\ell }(z_1^*,\ldots , z_\ell ^*)^{1/\lambda }\), and the proof is complete. \(\square \)
Remark 4.6
Let us add some final remarks on the CBC construction:
-
1.
Note that this theorem asserts the existence of a \(\varvec{z}^*\) that achieves the desired bound and gives a constructive way of finding it (by successively minimizing the quantities \(B_{I_d,\ell }(z_1,\ldots ,z_\ell )\) subject to \(z_\ell \)).
-
2.
As usual, all nontrivial choices for the first component \(z_1^*\) of the generating vector \(\varvec{z}^*\in \mathbb {Z}_n^d\) are equivalent.
-
3.
For the remaining components, we actually assumed more than we needed: Step 3 in the above proof shows that instead of minimizing \(B_{I_d,\ell }(z_1^*,\ldots ,z_{\ell -1}^*,z_\ell )\) for \(z_\ell \in \mathbb {Z}_n\), \(\ell \in \{2,\ldots ,d\}\), it would be sufficient to select \(z_\ell ^*\) that performs better than the average (w.r.t. \(\lambda \)), i.e., that fulfills
$$\begin{aligned} B_{I_d,\ell }(z_1^*,\ldots ,z_{\ell -1}^*, z_\ell ^*)^{1/\lambda } \le \frac{1}{\# \mathbb {Z}_n} \sum _{z_\ell \in \mathbb {Z}_n} B_{I_d,\ell }(z_1^*,\ldots ,z_{\ell -1}^*, z_\ell )^{1/\lambda }. \end{aligned}$$(22)Moreover, note that from Hölder’s inequality, it easily follows that the latter estimate yields a corresponding bound for all \(\widetilde{\lambda } \le \lambda \). That is, selecting \(z^*_\ell \), \(\ell =2,\ldots ,d\), according to (22) implies (17) with \(\lambda \) replaced by any \(\widetilde{\lambda }\in [1,\lambda ]\), whereas minimizing \(B_{I_d,\ell }\) produces a lattice rule that satisfies the bound (17) for the whole range \([1,2\alpha )\) of \(\lambda \).
-
4.
Observe that (in general) a lattice rule based on a generating vector \(\varvec{z}^*\) constructed as above is unfortunately not extensible in the dimension. The reason is that with increasing d, the factors \(c_{\mathfrak {u},I_d}\) in the definition of \(B_{I_d,\ell }\) (see (16)) will change such that the contributions of the subsets \(\mathfrak {u}\subseteq \{1,\ldots ,\ell \}\) with \(\ell \in \mathfrak {u}\) are different for every dimension. Hence, for fixed \(\ell \), the optimal choice \(z_{\ell }^*\) may change with varying d unless the set of permutation-invariant coordinates \(I_d\) is independent of the dimension (which is the uninteresting case). Thus, to successfully apply the CBC construction, we need to know the target dimension in advance. Nevertheless, compared to a full search in \(\mathbb {Z}_n^d\), the complexity of finding a suitable generating vector is significantly reduced if we do it component-by-component, and parts of the calculations made for dimension d may be reused while processing the \((d+1)\)’th dimension.
-
5.
As mentioned earlier, our conjecture is that the dependence of (17) on the dimension d is inevitable and cannot be improved by alternate steps in our proof. The CBC construction tries to distinguish between otherwise identical dimensions; so no CBC construction of lattice rules for the integration of permutation-invariant functions can satisfy bounds like (14), in our opinion. Indeed, (16) in Proposition 4.4 above already indicates the complicated influence of already selected components \(z_1^*,\ldots ,z_{\ell -1}^*\) on the error contribution induced by the choice of the current coordinate \(z_\ell \) (and forthcoming indicee). The fact that the problem is caused by the permutation-invariance assumption (\(\mathbb {M}_{\mathfrak {u},I_d}\) is not of product structure!) is nicely reflected by the factor \(\max \{1,\#I_d\}\) in (17) which disappears for standard constructions, i.e., for spaces without symmetry constraints.
-
6.
Finally, let us mention that slightly stronger assumptions allow us to get rid of the maximum term in (17). Indeed, a careful inspection of the proof shows that actually (17) can be improved to
$$\begin{aligned} e^{\mathrm {wor}}(Q_{d,n}; \mathfrak {S}_{I_d}(F_d(r_{\alpha ,\varvec{\beta }})))^2&\le \left[ \max \{1,\#I_d\}^{1/\lambda } \left( \frac{c_R}{n}\right) ^{2\alpha /\lambda } + \frac{1}{n} \right] ^{\lambda } \, C_{d,\lambda }(r_{\alpha ,\varvec{\beta }}) \\&= \left[ \frac{\max \{1,\#I_d\}^{1/\lambda } \, c_R^{2\alpha /\lambda }}{n^{2\alpha /\lambda -1}} + 1 \right] ^{\lambda } \, C_{d,\lambda }(r_{\alpha ,\varvec{\beta }}) \, \frac{1}{n^\lambda }. \end{aligned}$$Thus, if we assume that \(n \ge c_R \,\max \{1,\#I_d\}^{1/(2\alpha -\lambda )}\), then we exactly recover the error bound (12) from Proposition 4.1. \(\square \)
5 An Alternative Construction
In this section, we consider a completely different approach towards efficient cubature rules for the integration problem defined in Sect. 2. For this purpose, in Sect. 5.1 we collect some basic facts (taken from the monographs of Novak and Woźniakowski [6,7,8]) about \(L_2\)-approximation problems on more general reproducing kernel Hilbert spaces (RKHSs) \(H_d=H(K)\) of d-variate functions and on somewhat larger Banach spaces \(B_d \supset H_d\). Then, in Sect. 5.2, we show that some knowledge about the quantities related to these approximation problems allows us to (semi-explicitly) construct a sequence of cubature rules \(Q_{d,N}\) for integration on \(H_d\) with worst case errors that decay with the desired rate. Finally, this construction is applied to our original integration problem of permutation-invariant functions in Sect. 5.3.
5.1 The \(L_2\)-Approximation Problem for RKHSs
Let K denote the reproducing kernel of the Hilbert space \(H_d=H(K)\) of real (or complex) valued functions f on \([0,1]^d\). The inner product on this space will be denoted by \(\left\langle \cdot , \cdot \right\rangle \), and is the induced norm. Then from the reproducing kernel property, it follows that
see, e.g., Aronszajn [1], where a comprehensive discussion of RKHSs can be found. Hence, if
is finite, then the space \(H_d\) can be continuously embedded into the space \(L_2([0,1]^d)\), and it holds that for every \(f\in H_d\). In other words, \(M_{2,d}(K)^{1/2}\) serves as an upper bound for the operator norm
If, in addition, the embedding \(H_d\hookrightarrow L_2([0,1]^d)\) is compact, then we can try to approximate \(\mathrm {App}_d\) by nonadaptive algorithms
which are linear combinationsFootnote 1 of at most n information operations \(L_1,\ldots ,L_n\) from some class \(\Lambda \) and arbitrary functions \(a_1,\ldots ,a_n\in L_2([0,1]^d)\). In the worst case setting, the error of such an algorithm is defined by
As long as we restrict ourselves to the class \(\Lambda =\Lambda ^{\mathrm {all}}=H_d^*\) of continuous linear functionals, the complexity of this problem is completely determined by the spectrum of the self-adjoint and positive semi-definite operator \(W_d := \mathrm {App}_d^{\dagger }\mathrm {App}_d\). Due to the compactness of \(\mathrm {App}_d\), \(W_d\) is also compact such that it possesses a countable set of eigenpairs ,
The eigenvalues \(\lambda _{d,m}\) are nonnegative real numbers that form a null-sequence that is (w.l.o.g.) ordered in a nonincreasing way. For the ease of presentation, we restrict ourselves to the case where \(H_d\) is separable, \(\dim H_d = \infty \), and \(\lambda _{d,m}>0\) for all \(m\in \mathbb {N}\). Then the set of eigenfunctions an orthonormal basis of \(H_d\). Furthermore, they are also orthogonal w.r.t. the inner product of \(L_2([0,1]^d)\), and we have for any \(m\in \mathbb {N}\). The optimal algorithm \(A_{d,n}^*\) in the mentioned (worst case \(L_2\)-approximation) setting is given by the choice
Finally, its worst case error equals the nth minimal worst case error which can be calculated exactly:
We refer to [6] for a more detailed discussion.
All the statements made so far are true for arbitrary Hilbert spaces \(H_d\) which are compactly embedded into \(L_2([0,1]^d)\). Taking into account the reproducing kernel property and using the definition of the adjoint operator \(\mathrm {App}_d^{\dagger }\), it can be checked easily that \(W_d\) takes the form of an integral operator,
since for any \(f \in H_d\) and all \(\varvec{x}\in [0,1]^d\),
Among other useful properties, we also have that if \(M_{2,d}\) as defined in (23) is finite, then
see, e.g., [7, Section 10.8]. In particular, this finite trace property immediately implies that \(\lambda _{d,m} \in \mathcal {O}(m^{-1})\) or \(e_{\mathrm {wor}}(n;\mathrm {App}_d, \Lambda ^{\mathrm {all}}) \in \mathcal {O}(n^{-1/2})\), respectively.
In addition, \(M_{2,d}\) is also related to the average case approximation setting. To this end, assume that \(B_d\) is a separable Banach space of real-valued functions on \([0,1]^d\) equipped with a zero-mean Gaussian measure \(\mu _d\) such that its correlation operator \(C_{\mu _d}:B_d^*\rightarrow B_d\) applied to point evaluation functionals \(\mathrm {L}_{\varvec{x}}\) can be expressed in terms of the reproducing kernel K of \(H_d\). That is, we choose \(B_d\) and \(\mu _d\) such that
where \(\mathrm {L}_{\varvec{x}} :B_d \rightarrow \mathbb {R}\) with \(f\mapsto \mathrm {L}_{\varvec{x}}(f):=f(\varvec{x})\) for \(\varvec{x}\in [0,1]^d\). We stress that these assumptions imply a continuous embedding of \(H_d\) into \(B_d\). For more details, the reader is referred to [6, Appendix B] and [7, Section 13.2].
Again, we look for good approximations \(A_{d,n} f\) to f in the norm of \(L_2([0,1]^d)\). Thus, we formally approximate the operator
by algorithms of the form (24). Clearly, we need to make sure that \(B_d\) is continuously embedded into \(L_2([0,1]^d)\). The difference to the worst case setting discussed above is that this time the error will be measured by some expectation with respect to \(\mu _d\). For this purpose, we define
which immediately implies that the initial error is given by
Since \(A_{d,0}\equiv 0\) does not use any information on f, this holds for information from both classes \(\Lambda \) in \(\{\Lambda ^{\mathrm {all}},\Lambda ^{\mathrm {std}}\}\), where \(\Lambda ^{\mathrm {std}}\) denotes the set of all function evaluation functionals and (as before) \(\Lambda ^{\mathrm {all}}\) is the set of all continuous linear functionals.
In general, it can be shown that for \(\Lambda =\Lambda ^{\mathrm {all}}\), the nth optimal algorithm \(A_{d,n}^*\) is again given by (24) and (25). To see this, we define the Gaussian measure \(v_d := \mu _d \circ (\widetilde{\mathrm {App}}_d)^{-1}\) on \(L_2([0,1]^d)\). Then the corresponding covariance operator \(C_{v_d}:L_2([0,1]^d)\rightarrow L_2([0,1]^d)\) is given by
which formally equals the definition of \(W_d\) above, see [3, Formula (16)]. Consequently, its eigenpairs are known to be the same as in the worst case setting, and it can be shown that the nth minimal error satisfies
5.2 Quadrature Rules Based on Average Case Approximation Algorithms
We are ready to return to the integration problems, the main focus of this article. Given the space \(B_d\) as above, let us define
and let \(\mathrm {Int}_d:=\widetilde{\mathrm {Int}}_d\big |_{H_d}\) denote its restriction to the RKHS \(H_d\subset B_d\) we are actually interested in. As before, we approximate this integral by a cubature rule \(Q_{d,n}\) given by (4). Then the average case error of such an integration scheme \(Q_{d,n}\) on \(B_d\) is defined by
Now [7, Corollary 13.1] shows that this quantity is exactly equal to the worst case (integration) error for \(Q_{d,n}\) on \(H_d\), which is given by (5); i.e.,
Keeping the latter relation in mind, our final goal in this section is to construct a suitable quadrature rule using the following procedure: Given any algorithm \(A_{d,n}\) that uses at most \(n\in \mathbb {N}_0\) function values (information from the class \(\Lambda ^{\mathrm {std}}\)) to approximate \(\widetilde{\mathrm {App}}_d\), and a set \(\mathcal {P}_{d,r}^\mathrm {int}=\{\varvec{t}^{(1)},\ldots ,\varvec{t}^{(r)}\}\) of \(r\in \mathbb {N}\) points in \([0,1]^d\), we let
Note that this algorithm \(Q_{d,n+r}\) denotes a cubature ruleFootnote 2 for \(\widetilde{\mathrm {Int}}_d\) (and \(\mathrm {Int}_d\), respectively) that uses no more than \(n+r\) function evaluations. The basic idea behind this construction is a form of variance reduction and can already be found in [15]; for a clever choice of \(A_{d,n}\), the maximum “energy” of f is captured by the approximation \(A_{d,n}f\) (which is assumed to be integrated exactly) such that the remaining part \(f-A_{d,n}f\) can be treated efficiently by a simple QMC method using r additional nodes. Therefore, it is obvious that the corresponding integration error caused by \(Q_{d,n+r}\) will highly depend on the approximation properties of the underlying algorithm \(A_{d,n}\) and the quality of the chosen point set \(\mathcal {P}_{d,r}^\mathrm {int}\).
A first step towards the desired error bound for \(Q_{d,n+r}\) is given by the following estimate, which resembles a bound given in [3, Formula (21)]. It states that, given \(A_{d,n}\), a suitable point set \(\mathcal {P}_{d,r}^\mathrm {int}\) can always be found. For the sake of completeness, the (nonconstructive) proof based on the proof of [15, Theorem 3] is included in the Appendix.
Proposition 5.1
Let \(d,r\in \mathbb {N}\). Then for any algorithm \(A_{d,n}\), there exists a point set \(\mathcal {P}_{d,r}^\mathrm {int}=\mathcal {P}_{d,r}^{\mathrm {int},*}\) such that \(Q_{d,n+r}\) as defined in (28) satisfies
Remark 5.2
Of course, the sets \(\mathcal {P}_{d,r}^{\mathrm {int},*}\) are not uniquely defined, and it may be hard to find these sets in practice. On the other hand, we can argue that although these bounds are nonconstructive, it is known that slightly larger bounds can be achieved with high probability by any random set of points, see, e.g., Plaskota et al. [11, Remark 2], and hence we claim that a suitable set can be found semi-constructively, provided \(A_{d,n}\) is given. \(\square \)
In view of Proposition 5.1 and our construction (28), we are left with finding suitable algorithms \(A_{d,n}\) (based on at most n function evaluations) that yield a small average case \(L_2\)-approximation error. This can be done inductively. For this purpose, observe that for each \(m\in \mathbb {N}\),
defines a probability density on \(L_2([0,1]^2)\). Here the \(\xi _{d,j}\)’s denote the \(L_2\)-normalized eigenfunctions \(\eta _{d,j}\) of \(W_d\) and \(C_{v_d}\), respectively, as described in Sect. 5.1. Given \(m\in \mathbb {N}\), an algorithm \(A_{d,s}\) that uses \(s\in \mathbb {N}_0\) function values to approximate \(\widetilde{\mathrm {App}}_d\), and a set \(\mathcal {P}_{d,q}^{\mathrm {app}}=\{\varvec{t}^{(1)},\ldots ,\varvec{t}^{(q)}\}\) of \(q\in \mathbb {N}\) points in \([0,1]^d\), let
define another \(L_2\)-approximation algorithm on \(B_d\) that uses \(s+q\) evaluations of f. Here we adopt the convention that \(0/0:=0\). Without going into details, we want to mention that \(A_{d,s+q}\) basically approximates the mth optimal algorithm \(A_{d,m}^*\) (w.r.t. \(\Lambda ^{\mathrm {all}}\)) for \(\widetilde{\mathrm {App}}_d\); see [8, Section 24.3] for details. For \(A_{d,s+q}\) defined this way, we have the following error bound, which can be found in [8, Theorem 24.3].
Proposition 5.3
Let \(d,m,q\in \mathbb {N}\) be fixed. Then for any algorithm \(A_{d,s}\), there exists a point set \(\mathcal {P}_{d,q}^{\mathrm {app}}=\mathcal {P}_{d,q}^{\mathrm {app},*}\) such that \(A_{d,s+q}\) as defined in (30) fulfills
Remark 5.4
Again the proof of Proposition 5.3 is nonconstructive since it involves an averaging argument to find a suitable point set \(\mathcal {P}_{d,q}^{\mathrm {app},*}\). However, once more, such a point set can be found semi-constructively at the expense of a small additional constant in the error bound above.\(\square \)
Hence, to construct an approximation algorithm \(A_{d,n}\) (as required for our cubature rule (28)) inductively, in every step we need to choose positive integers m and q such that the right-hand side of (31) is minimized. Of course, this requires some knowledge about the mth minimal average case approximation error \(e_{\mathrm {avg}}(m; \widetilde{\mathrm {App}}_d, \Lambda ^{\mathrm {all}})\), which will be provided by the subsequent assumption.
Assumption 5.5
For \(d\in \mathbb {N}\), there exist constants \(C_d>0\) and \(p_d>0\) such that for the ordered sequence of eigenvalues of \(W_d = \mathrm {App}_d^{\dagger }\mathrm {App}_d\) (see Sect. 5.1), the following holds:
We note in passing that the bound for \(m=0\) shows that \(C_d\) needs to be larger than \(M_{2,d}\). On the other hand, in concrete examples \(C_d\) will not be too large, as we will see in the remarks after Theorem 5.8 and in Sect. 5.3 below.
Based on Proposition 5.3 and Assumption 5.5, we now construct a sequence \((A_d^{(k)})_{k\in \mathbb {N}_0}\) of algorithms that perform well for average case approximation \(\widetilde{\mathrm {App}}_d\) on \(B_d\). Afterwards Proposition 5.1 will be employed to derive the existence of suitable error bounds for the numerical integration on the RKHS \(H_d\).
For \(p>0\), let \(\omega (y):=y+1/y^{p}\) denote a function on the real halfline \(y>0\). Then
shows that \(\omega (y)>1\) and thus \(2^{p+1} \omega (y)^{1+1/p} > 2\) for all \(y,p>0\). In particular, this holds for the minimizer \(y_p:=p^{1/(p+1)}\) of \(\omega \). Now let
and let \(d\in \mathbb {N}\) be fixed. Then our sequence of algorithms \((A_d^{(k)})_{k\in \mathbb {N}_0}\) on \(B_d\) is defined by
where we set
with \(p_d\) and \(C_d\) taken from Assumption 5.5 and \(A_{d,2^k}\) as given by Proposition 5.3 for \(s=q=2^{k-1}\). The proof of the following lemma can be found in the Appendix.
Lemma 5.6
Let Assumption 5.5 be satisfied. Then the sequence \((A_d^{(k)})_{k\in \mathbb {N}_0}\) given by (35) is well defined. Moreover, every \(A_d^{(k)}\), \(k\in \mathbb {N}_0\), uses at most \(2^k\) points and satisfies the estimate
where \(c(p_d):=2^{p_d(p_d+1)} (1+p_d) \left( 1+1/p_d\right) ^{p_d}>1\).
Remark 5.7
Although the factor \(c(p_d)\) in Lemma 5.6 is super exponential in \(p_d\), it might not be too large in concrete applications. E.g., in case of the Monte Carlo rate of convergence (\(p_d=1\)), it is easily shown that \(c(1)=16\). Moreover, in this case, the first nontrivial algorithm from our sequence is given by \(A_d^{(\mathcal {K}_{1}+1)}=A_d^{(5)}\) and uses no more than 16 sample points. For \(p_d=2\), we would have \(c(2)= 432\) and again \(\mathcal {K}_2=\left\lfloor \log _2(12 \sqrt{3}) \right\rfloor =4\). However, remember that we need to impose stronger decay conditions in Assumption 5.5 in order to conclude a higher rate \(p_d\) in Lemma 5.6. \(\square \)
We are ready to define and analyze our final quadrature rule \(Q_{d,N}\) for integration on \(H_d\). For this purpose, let \(d\in \mathbb {N}\), as well as \(N \ge 2\), and suppose that Assumption 5.5 is satisfied. Setting \(\kappa := \left\lfloor \log _2 N \right\rfloor - 1 \in \mathbb {N}_0\), the rule \(Q_{d,N}\) is given by
where the algorithm \(A^{(\kappa )}_d=A_{d,n}\) is taken out of the sequence \((A_d^{(k)})_{k\in \mathbb {N}_0}\) considered in Lemma 5.6 and \(Q_{d,2^{\kappa +1}}\) is the quadrature rule from Proposition 5.1 (with \(n=r=2^{\kappa }\)). From the previous considerations, it is clear that \(Q_{d,N}\) takes the form (4) with integration nodes \(\varvec{t}^{(j)}\in [0,1]^d\) from the set
which can be found semi-constructively. Moreover, the subsequent error bound can be deduced by a straightforward computation.
Theorem 5.8
Let \(d\in \mathbb {N}\), as well as \(N \ge 2\), and suppose that the RKHS \(H_d\) satisfies Assumption 5.5. Then the cubature rule \(Q_{d,N}\) defined by (38) uses at most N integration nodes and satisfies the bound
Proof
Obviously, the set \(\mathcal {P}_{d,N}\) defined in (39) satisfies \(\#\mathcal {P}_{d,N} \le {2^\kappa } +\sum _{k=0}^{\kappa -1} 2^{k} \le 2^{\kappa +1}\). From \(2^{\kappa +1} \le N \le 2^{\kappa +2}\), it thus follows that \(\#\mathcal {P}_{d,N} \le N\). In addition, (27) together with (29) and (37) yields
with \(c(p_d)\) as defined in Lemma 5.6. \(\square \)
Finally, let us recall a standard estimate that ensures the validity of Assumption 5.5. For this purpose, suppose that \(\sum _{j=1}^\infty \lambda _{d,j}^{1/\tau }\) is finite for some \(\tau > 1\). Then for \(k\in \mathbb {N}\), the nonincreasing ordering of \((\lambda _{d,j})_{j\in \mathbb {N}}\) yields
Consequently, for all \(m\in \mathbb {N}\), we obtain
The case \(m=0\) can be handled using Jensen’s inequality (Lemma A.1) and the fact that \(2^y/y\ge 1\) for \(y>0\):
Thus, (32) in Assumption 5.5 is satisfied with
whenever \(\sum _{j=1}^\infty \lambda _{d,j}^{1/\tau }\) converges.
Corollary 5.9
For \(d\in \mathbb {N}\), let \(H_d\hookrightarrow L_2([0,1]^d)\) denote a RKHS, and assume that there exists \(\tau >1\) such that the ordered sequence of eigenvalues of \(W_d = \mathrm {App}_d^{\dagger }\mathrm {App}_d\) satisfies \(\sum _{j=1}^\infty \lambda _{d,j}^{1/\tau }<\infty \). Then for the cubature rule \(Q_{d,N}\) considered above, it holds that
where \(c'(\tau ):=2^{\tau (\tau ^2-1)} \left( \frac{\tau }{\tau -1}\right) ^{\tau }\).
Remark 5.10
Note that the factor \(c'(\tau )\) in the latter bound deteriorates as \(\tau \) tends to one or to infinity. However, it can be seen numerically that there exists a range for \(\tau \) such that \(c'(\tau )\) is reasonably small. E.g., for \(\tau \in [1.003, 2.04]\), we have \(c'(\tau )\le 350\). Moreover, observe that in general \(\tau \) might depend on d, whereas for the special case of d-fold tensor product spaces \(H_d=\bigotimes _{\ell =1}^d H_1\), it can be chosen independent of d. On the other hand, in this case,
implies \((\sum _{j=1}^\infty \lambda _{d,j}^{1/\tau })^\tau = ( \sum _{k=1}^\infty \lambda _k^{1/\tau } )^{\tau d} \ge (\sum _{k=1}^\infty \lambda _k)^d\), which grows exponentially with the dimension provided that the underlying space \(H_1\) is chosen such that \(L_2\)-approximation is nontrivial and well-scaled (i.e., if \(e_{\mathrm {wor}}(0;\mathrm {App}_1, \Lambda ^{\mathrm {all}})^2=1=\lambda _1 \ge \lambda _2>0\)). \(\square \)
5.3 Application to Spaces of Permutation-Invariant Functions
We illustrate the assertions obtained in the preceding subsection by applying them to the reproducing kernel Hilbert spaces \(\mathfrak {S}_{I_d}(F_d(r_{\alpha ,\varvec{\beta }}))\) defined in Sect. 2. For this purpose, we first need to determine the sequence of eigenvalues \((\lambda _{d,j})_{j\in \mathbb {N}}\) of the d-variate operator \(W_d=W_d(K_{d,I_d})\) given by
where \(\mathrm {App}_d :\mathfrak {S}_{I_d}(F_d(r_{\alpha ,\varvec{\beta }})) \rightarrow L_2([0,1]^d)\) is the canonical embedding and \(K_{d,I_d}\) denotes the reproducing kernel of the \(I_d\)-permutation-invariant space \(\mathfrak {S}_{I_d}(F_d(r_{\alpha ,\varvec{\beta }}))\) given in (3). Afterwards, we can make use of Corollary 5.9 for all \(\tau >1\) for which \(\sum _{j=1}^\infty \lambda _{d,j}^{1/\tau }\) is finite.
If \(\#I_d<2\), then \(\mathfrak {S}_{I_d}(F_d(r_{\alpha ,\varvec{\beta }}))=F_d(r_{\alpha ,\varvec{\beta }})\) is the d-fold tensor product of \(F_1(r_{\alpha ,\varvec{\beta }})\), and thus it suffices to find the univariate eigenvalues \(\lambda _{k}=\lambda _{1,k}\), \(k\in \mathbb {N}\), see Remark 5.10. From [6, p. 184], it follows that these eigenvalues are given by
i.e., we have one eigenvalue \(\beta _0\) of multiplicity one and a sequence of distinct eigenvalues \(\beta _1/R(m)^{2\alpha }\), \(m\in \mathbb {N}\), of multiplicity two. If we assume that \(\beta _0\ge \beta _1/R(m)^{2\alpha }\) for all \(m\in \mathbb {N}\), then the latter list is ordered properly according to our needs. Consequently,
is finite if and only if the latter sum converges. Due to our assumptions on R, it is easily seen that
where \(\zeta \) denotes Riemann’s zeta function. Therefore, \(\sum _{k=1}^\infty \lambda _k^{1/\tau }<\infty \) if and only if \(1<\tau <2\alpha \). In addition, from Remark 5.10, and (13), we infer that
(note that \(\mathbb {M}_d(\varvec{h})! = 1 = \# \mathcal {S}_d\) for all \(\varvec{h}\in \mathbb {Z}^d\) since \(\#I_d<2\)), which shows that in this case the new error bound from Corollary 5.9 is worse than the estimate known from Proposition 4.1.
If \(\# I_d \ge 2\), then \(\mathfrak {S}_{I_d}(F_d(r_{\alpha ,\varvec{\beta }}))\) is a strict subspace of the tensor product space \(F_d(r_{\alpha ,\varvec{\beta }})\). However, in [16] it has been shown that still there is a relation of the multivariate eigenvalues \(\lambda _{d,j}\) with the univariate sequence given in (40). In fact, for all \(\emptyset \ne I_d=\{i_1,\ldots ,i_{\# I_d}\} \subseteq \{1,\ldots ,d\}\), it holds that
and hence the quantity of interest can be decomposed as
Moreover, it can be shown that this expression is finite if and only if \(\sum _{k=1}^\infty \lambda _{k}^{1/\tau }<\infty \), which holds (as before) if and only if \(1<\tau < 2\alpha \).
In order to exploit the decomposition (42) for Corollary 5.9, we proceed similarly to the derivation of [10, Proposition 3.1]. That is, we bound the second factor with the help of a technical lemma (see Lemma A.2 in the Appendix below) and obtain that for all \(U\in \mathbb {N}_0\), it holds that
Given \(1<\tau <2\alpha \), let us define
Then the finiteness of (41) implies that there necessarily exists some \(U_\tau ^*:=U_\tau ^*(R,\alpha ,\varvec{\beta },\tau )\in \mathbb {N}_0\) with
In conjunction with (42) and (43), this finally yields that \(\sum _{j=1}^\infty \lambda _{d,j}^{1/\tau }\) can be upper bounded by
since \(\lambda _1^{d}=\beta _0^d=e(0,d; \mathfrak {S}_{I_d}(F_d(r_{\alpha ,\varvec{\beta }})))^2\). Thus, Corollary 5.9 implies the subsequent result.
Theorem 5.11
Consider the integration problem on the \(I_d\)-permutation-invariant subspaces \(\mathfrak {S}_{I_d}(F_d(r_{\alpha ,\varvec{\beta }}))\), where \(d\ge 2\) and \(I_d\subseteq \{1,\ldots ,d\}\) with \(\# I_d \ge 2\). Moreover, assume that
Then for every \(N\ge 2\), the worst case error of the cubature rule \(Q_{d,N}\) defined in (38) satisfies
for all \(1<\tau <2 \alpha \) and \(U_\tau ^*\in \mathbb {N}_0\) such that
Remark 5.12
We conclude this section by some final remarks on Theorem 5.11.
-
1.
First of all, note that using a sufficiently small (but constant) value of \(\beta _1\), the condition (46) can always be fulfilled with \(U_\tau ^*=0\).
-
2.
Observe that the bound (45) combines the advantages of the general existence result for QMC algorithms (Proposition 3.1) with the higher rates of convergence in the estimates for lattice rules (as well as their component-by-component construction) stated in Proposition 4.1 (and Theorem 4.5, respectively). In fact, (45) structurally resembles the bound (8) from Proposition 3.1; besides the initial error and some absolute constants, it contains a term that grows exponentially in \(d-\#I_d\) (but not in d itself!), as well as a polynomial in \(\#I_d\). On the other hand, similar to the lattice rule approach, the Monte Carlo rate of convergence is enhanced by a factor of \(\tau \). Hence, the construction given in Sect. 5.2 provides a cubature rule that achieves a worst case error of \(\mathcal {O}(n^{-\alpha })\) while the implied constants grow at most polynomially with the dimension d, provided that we assume a sufficiently large amount of permutation-invariance (i.e., if \(d-\#I_d \in \mathcal {O}(\ln d)\)). In other words, it can be used to deduce (strong) polynomial tractability.
-
3.
We stress that (in contrast to the rank-1 lattice rules studied in Sect. 4) \(Q_{d,N}\) is a general weighted cubature rule, and its integration nodes do not necessarily belong to some regular structure such as an integration lattice. However, as exposed in Sect. 5.2, they can be found semi-constructively.
-
4.
Finally, we mention that the condition (44) improves on (7) by a factor of two. \(\square \)
References
Aronszajn, N.: Theory of reproducing kernels. Trans. Am. Math. Soc. 68(3), 337–404 (1950)
Dick, J., Kuo, F.Y., Sloan, I.H.: High dimensional integration: the quasi-Monte Carlo way. Acta Numer. 22, 133–288 (2013)
Hickernell, F.J., Woźniakowski, H.: Integration and approximation in arbitrary dimensions. Adv. Comput. Math. 12, 25–58 (2000)
Kuo, F.Y.: Component-by-component constructions achieve the optimal rate of convergence for multivariate integration in weighted Korobov and Sobolev spaces. J. Complex. 19, 301–320 (2003)
Kuo, F.Y., Joe, S.: Component-by-component construction of good lattice rules with a composite number of points. J. Complex. 18, 943–976 (2002)
Novak, E., Woźniakowski, H.: Tractability of Multivariate Problems. Vol. I: Linear Information, Volume 6 of EMS Tracts in Mathematics. European Mathematical Society (EMS), Zürich (2008)
Novak, E., Woźniakowski, H.: Tractability of Multivariate Problems. Vol. II: Standard Information for Functionals, Volume 12 of EMS Tracts in Mathematics. European Mathematical Society (EMS), Zürich (2010)
Novak, E., Woźniakowski, H.: Tractability of Multivariate Problems. Vol. III: Standard Information for Linear Operators, Volume 18 of EMS Tracts in Mathematics. European Mathematical Society (EMS), Zürich (2012)
Nuyens, D.: The construction of good lattice rules and polynomial lattice rules. In: Kritzer, P., Niederreiter, H., Pillichshammer, F., Winterhof, A. (eds.) Uniform Distribution and Quasi-Monte Carlo Methods: Discrepancy, Integration and Applications, Volume 15 of Radon Series on Computational and Applied Mathematics, pp. 223–256. De Gruyter, Berlin (2014)
Nuyens, D., Suryanarayana, G., Weimar, M.: Rank-1 lattice rules for multivariate integration in spaces of permutation-invariant functions: error bounds and tractability. Adv. Comput. Math. 42(1), 55–84 (2016)
Plaskota, L., Wasilkowski, G., Zhao, Y.: New averaging technique for approximating weighted integrals. J. Complex. 25, 268–291 (2009)
Sloan, I.H., Kuo, F.Y., Joe, S.: Constructing randomly shifted lattice rules in weighted Sobolev spaces. SIAM J. Numer. Anal. 40, 1650–1665 (2002)
Sloan, I.H., Kuo, F.Y., Joe, S.: On the step-by-step construction of quasi-Monte Carlo integration rules that achieve strong tractability error bounds in weighted Sobolev spaces. Math. Comp. 71, 1609–1640 (2002)
Sloan, I.H., Reztsov, A.V.: Component-by-component construction of good lattice rules. Math. Comp. 71, 263–273 (2002)
Wasilkowski, G.: Integration and approximation of multivariate functions: average case complexity with isotropic wiener measure. J. Approx. Theory 77, 212–227 (1994)
Weimar, M.: The complexity of linear tensor product problems in (anti)symmetric Hilbert spaces. J. Approx. Theory 164(10), 1345–1368 (2012)
Weimar, M.: On lower bounds for integration of multivariate permutation-invariant functions. J. Complex. 30(1), 87–97 (2014)
Weimar, M.: Breaking the curse of dimensionality. Dissertationes Math. 505, 1–112 (2015)
Yserentant, H.: Regularity and Approximability of Electronic Wave Functions. Lecture Notes in Mathematics. Springer, Berlin (2010)
Acknowledgements
The authors are grateful to the anonymous reviewers who helped in improving the paper.
Author information
Authors and Affiliations
Corresponding author
Additional information
Communicated by Edward B. Saff.
This work has been supported by Deutsche Forschungsgemeinschaft DFG (DA 360/19-1).
Appendix
Appendix
In this final section, we collect auxiliary results, as well as some technical proofs we skipped in the presentation above.
1.1 Auxiliary Estimates
For the reader’s convenience, let us recall a standard estimate that is sometimes referred to as Jensen’s inequality.
Lemma A.1
Let \((a_j)_{j\in \mathbb {N}}\) be an arbitrary sequence of nonnegative real numbers. Then, for every \(0<q\le p< \infty \),
whenever the right-hand side is finite.
In addition, in Sect. 5.3 we make use of the following technical lemma (with \(\sigma _m:=\lambda _{m}^{1/\tau }\) for \(m\in \mathbb {N}\) and \(s:=\#I_d\)), which was employed already in [10] and [16]. For a detailed proof (of an insignificantly modified version), we refer to [18, Lemma 5.8].
Lemma A.2
Let \((\sigma _m)_{m\in \mathbb {N}}\) be a sequence of nonnegative real numbers with \(\sigma _1 = \sup _{m\in \mathbb {N}} \sigma _m > 0\), and set \(\sigma _{L,\varvec{k}}:=\prod _{\ell =1}^L \sigma _{k_\ell }\) for \(\varvec{k}\in \mathbb {N}^L\) and \(L\in \mathbb {N}\). Then for all \(U\in \mathbb {N}_0\) and every \(s\in \mathbb {N}\), it holds that
with equality at least for \(U=0\).
1.2 Postponed Proofs
We start with showing Proposition 5.1, which relates the average case integration error of \(Q_{d,n+r}\) with the error of the approximation algorithm \(A_{d,n}\) used to construct \(Q_{d,n+r}\), see Sect. 5.1.
Proof of Proposition 5.1
For any fixed collection of points \(\varvec{t}^{(1)},\ldots ,\varvec{t}^{(r)}\in [0,1]^d\), it is easy to see that
Squaring this expression and taking the expectation with respect to \(\varvec{t}^{(\ell )}\), \(\ell =1,\ldots ,r\), gives
which implies (by integrating over \(B_d\), interchanging the integrals, and estimating the negative term) that
Now the integral on the right-hand side is nothing but \(e^{\mathrm {avg}}(A_{d,n}; \widetilde{\mathrm {App}}_d)^2\), and by the usual argument (mean value theorem), there exists a point set \(\mathcal {P}_{d,r}^{\mathrm {int},*}=\{\varvec{t}^{(1)},\ldots ,\varvec{t}^{(r)}\} \subset [0,1]^d\) such that \(e^{\mathrm {avg}}(Q_{d,n+r}(\,\cdot \,;\mathcal {P}_{d,r}^{\mathrm {int},*},A_{d,n}); \widetilde{\mathrm {Int}}_d)^2\) is smaller than the average on the left. \(\square \)
It remains to prove Lemma 5.6, which justifies the iterative construction of the sequence of approximation algorithms \((A_d^{(k)})_{k\in \mathbb {N}_0}\).
Proof of Lemma 5.6
For this proof, let \(p:=p_d\) and \(C:=C_d\) denote the constants from Assumption 5.5. We are going to prove that \(A^{(k)}_d\) as defined in (35) satisfies
Then (33) together with the definition of \(y_p:=p^{1/(p+1)}\) implies the claim.
Step 1. Let us consider \(k\in \{0,1,\ldots ,\mathcal {K}_p\}\) first. Note that due to the considerations after formula (33), this set is not empty. Since for \(k\le \mathcal {K}_p\) we have \(2^{p(p+1)} \, \omega (y_p)^{p+1} \, (2^k)^{-p} \ge 1\), it suffices to show that \(e^{\mathrm {avg}}(A_d^{(k)}; \widetilde{\mathrm {App}}_d)^2 \le C\) in this case. Since \(A^{(k)}_d\equiv 0\), this is true due to (26) and (32) applied for \(m=0\). The remaining assertions are trivial for \(k\le \mathcal {K}_p\).
Step 2. Observe that \(m_k\) as defined in (36) is at least one if and only if
Moreover, the right-hand side of (47) is less than or equal to \(C \, 2^k \, y_p^{p+1}\) if and only if
Therefore, every \(k\in \mathbb {N}_0\) that satisfies (47) and (48) fulfills \(m_k \ge 1\).
We stress that the conditions (47) and (48) hold true at least for \(k=\mathcal {K}_p\). In fact, the validity of (47) has been shown already in Step 1, and (by the definition of \(\mathcal {K}_p\)) we have that
which implies \(2^{\mathcal {K}_p}> 2^{p}\, \omega (y_p)^{1+1/p} > 2^{p} (1+ 1/y_p^{p+1})\) using (34) for the last estimate. Furthermore, note that with \(k=\mathcal {K}_p\), the condition (48) holds true for every \(k\ge \mathcal {K}_p\).
Step 3. Now we prove (47) for \(A^{(\mathcal {K}_p+1)}_d\). For this purpose, we let \(k=\mathcal {K}_p\) and make use of Proposition 5.3 with \(m=m_k\) and \(q=2^{k}\) (here we need that \(m_k \ge 1\)). Employing the definition of \(m_k\) given in (36) together with (32) from Assumption 5.5, we obtain
Finally, we plug in the upper estimate (47) for the error of \(A_d^{(k)}\) and derive the same bound for \(A_d^{(k+1)}=A_d^{(\mathcal {K}_p+1)}\). But now Step 2 yields that also \(m_{\mathcal {K}_p+1} \ge 1\), such that the same reasoning as before implies (47) for \(k=\mathcal {K}_p+2\) as well, and (by induction) for any further k.
In conclusion, these arguments show that the tail sequence \((A^{(k)}_d)_{k > \mathcal {K}_p}\) is well defined and that it satisfies the claimed error bound. Since in the construction we add at most \(2^{k-1}\) points when turning from \(k-1\) to k, each operator \(A^{(k)}_d\) obviously uses no more than \(2^k\) sample points. Hence, the proof is complete. \(\square \)
Rights and permissions
About this article
Cite this article
Nuyens, D., Suryanarayana, G. & Weimar, M. Construction of Quasi-Monte Carlo Rules for Multivariate Integration in Spaces of Permutation-Invariant Functions. Constr Approx 45, 311–344 (2017). https://doi.org/10.1007/s00365-016-9362-2
Received:
Revised:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00365-016-9362-2
Keywords
- Numerical integration
- Quadrature
- Cubature
- Quasi-Monte Carlo methods
- Rank-1 lattice rules
- Component-by-component construction