1 Introduction

In this article, we study the problem of multivariate numerical integration for a subclass of square-integrable functions \(f \in L^2([0,1]^d)\). We consider special instances of so-called quasi-Monte Carlo (QMC) rules, which are methods to approximate integrals

$$\begin{aligned} I_d(f) =\int _{[0,1]^d } f({\varvec{x}}) \,\mathrm{d}{\varvec{x}}\end{aligned}$$

by equal-weight quadrature rules,

$$\begin{aligned} Q_{N,d}(f) = \frac{1}{N} \sum _{n=0}^{N-1} f({\varvec{x}}_n) , \end{aligned}$$

where the integration nodes \({\varvec{x}}_0,{\varvec{x}}_1,\ldots ,{\varvec{x}}_{N-1}\) are deterministically chosen in \([0,1]^d\). This is in contrast to Monte Carlo rules, where the integration nodes are chosen randomly; with QMC rules, we try to make a deliberate and sophisticated choice of the points \({\varvec{x}}_n\) with the aim of obtaining better error bounds than for Monte Carlo. The crucial challenge is to find integration nodes yielding a low approximation error simultaneously for a large class of functions that may depend on many variables. This means that, usually, one needs to be able to find millions of good integration nodes in very high dimensions, which is a considerable computational challenge.
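To make the contrast concrete, the following minimal Python sketch compares an equal-weight rule on a deterministic Hammersley-type point set (built from the classical van der Corput radical inverse) with plain Monte Carlo. The test integrand and all helper names are our own illustrative choices, not part of the constructions studied in this paper.

```python
import numpy as np

def equal_weight_rule(f, nodes):
    """Q_{N,d}(f) = (1/N) * sum_n f(x_n) for nodes x_n in [0,1)^d."""
    return sum(f(x) for x in nodes) / len(nodes)

def van_der_corput(n, b=2):
    """Radical inverse of n in base b, a classic building block for QMC nodes."""
    v, denom = 0.0, 1.0
    while n > 0:
        n, digit = divmod(n, b)
        denom *= b
        v += digit / denom
    return v

# Hypothetical smooth test integrand with exact integral 1 over [0,1]^2.
f = lambda x: 4.0 * x[0] * x[1]

N = 1024
qmc_nodes = [(n / N, van_der_corput(n)) for n in range(N)]  # deterministic nodes
mc_nodes = np.random.default_rng(0).random((N, 2))          # random nodes

qmc_err = abs(equal_weight_rule(f, qmc_nodes) - 1.0)
mc_err = abs(equal_weight_rule(f, mc_nodes) - 1.0)
print(qmc_err, mc_err)
```

For this smooth integrand, the deterministic node set typically yields a noticeably smaller error than the random one at the same N, which is exactly the effect QMC rules aim for.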

In the literature on QMC methods, there are two main concepts that are commonly made use of when trying to find sets of integration nodes with good properties. These are, on the one hand, lattice point sets, as introduced independently by Korobov (see [10]) and Hlawka (see [9]). For more recent introductions to lattice rules, we refer to [16, 21]. The other class of commonly used QMC integration nodes is that of (digital) (t,m,d)-nets and (t,d)-sequences, as introduced by Niederreiter, building on ideas by Sobol’ and Faure (see [14, 16]). A special case of (t,m,d)-nets, namely so-called polynomial lattice point sets, is the focus of the present paper. These point sets were introduced in [15] and owe their name to the fact that their structure can be viewed as analogous to that of (ordinary) lattice point sets.

While the construction principle of lattice point sets is based on integer arithmetic, polynomial lattice point sets are based on polynomial arithmetic over finite fields. To be more precise, we will fix a prime b and consider the finite field \({\mathbb {F}}_b\) with b elements. A polynomial lattice point set with \(b^m\) points in \([0,1]^d\) is constructed by means of a modulus \(p\in {\mathbb {F}}_b [x]\) with \(\deg (p)=m\), and a generating vector \({\varvec{g}}\in ({\mathbb {F}}_b [x])^d\) (we refer to Sect. 2.2 for the precise definition). The QMC rule using the polynomial lattice point set as integration nodes is then called a polynomial lattice rule. It will be convenient in this paper to assume that the modulus has the form \(p_m (x)=x^m\). However, it is crucial to note that not every choice of the generating vector \({\varvec{g}}\) yields a polynomial lattice point set with good properties, in the sense that the integration error of the corresponding polynomial lattice rule is sufficiently low. Quite the contrary: it is usually highly non-trivial to find good generating vectors of polynomial lattice rules, and, except for special cases, no explicit constructions of such good generating vectors are known. Hence, one has to resort to computer search algorithms to find generating vectors of polynomial lattice point sets of high quality. Regarding the error measure, we consider in this paper the worst-case setting, i.e., we consider a particular normed function space and take the supremum of the integration error over the unit ball of the space.

It is known that (ordinary) lattice rules are well suited for the numerical integration of functions with pointwise convergent Fourier series (see again, e.g., [16] or [21]). On the other hand, polynomial lattice rules are usually applied for the numerical integration of functions that can be represented by Walsh series (cf. [2, 4, 5]). We will therefore define a reproducing kernel Hilbert space based on Walsh functions in Sect. 2.1, which will be considered throughout the paper. The function space under consideration will be characterized by a smoothness parameter \(\alpha \) (in some publications, this parameter is also referred to as “digital smoothness parameter” in the context of Walsh series). Indeed, the parameter \(\alpha \) is linked to the speed of decay of the Walsh coefficients of the functions in our space, but there is also a connection to the number of derivatives that exist for the elements of the space (we refer to [5] and the references therein for details).

The function space considered here is closely related to other function spaces considered in the literature, such as in [2, 4, 5]; indeed, results that we show for the space considered in the present paper immediately imply corresponding results for some of the Walsh spaces considered in these references. Furthermore, our Hilbert space will be a “weighted” function space in the sense of Sloan and Woźniakowski (cf. [23]). This means that we assign non-negative real numbers (weights) to the coordinates, or groups of coordinates, of the integration problem, in order to model the different influence of the coordinates on the problem. As pointed out in [23] and numerous other papers, this approach is justified by practical high-dimensional problems in which different coordinates may indeed have a very different degree of influence on the value of an integral. The weights will be incorporated in the inner product and norm of the function space in a suitable way. In this setting, a nominally very high-dimensional problem may have a rather low “effective dimension,” i.e., only a certain, possibly small, part of the components has a significant influence on the integration problem and on the error made by approximation algorithms. This may then yield situations in which the curse of dimensionality can be avoided.

In the present paper, we will restrict ourselves, for technical reasons, to considering the most common choice of weights, the so-called product weights, but we suspect that the construction of QMC rules presented here could also work for other choices of weights. We refer to Remark 5 in Sect. 3.3 for further comments on this question.

The first efficient construction of good generating vectors of polynomial lattice point sets was given in [2]. In that paper, the authors considered the so-called component-by-component (CBC) approach, which is a greedy algorithm that constructs one component of the generating vector at a time. CBC algorithms were first considered for ordinary lattice point sets, with the first examples in the literature going back to Korobov (cf. [11]); they were later rediscovered by Sloan and Reztsov (cf. [22]). The fast CBC construction, which is due to Cools and Nuyens (see, e.g., [18,19,20]), makes the CBC construction computationally competitive and is currently the standard method to construct high-dimensional lattice point sets of good quality. It is well known (see, e.g., [2] and again [18]) that CBC constructions also work for the efficient search for generating vectors of polynomial lattice point sets, and a fast algorithm is available in this case as well.

1.1 Overview and Main Results

In the present paper, we present another, different algorithm to construct generating vectors of polynomial lattice point sets in an efficient way. This construction is also based on a component-by-component approach. However, as opposed to the CBC algorithms for polynomial lattice point sets currently available in the literature, our new approach constructs the single components of the generating vector \({\varvec{g}}\) “digit-by-digit,” and the search criterion used is independent of the smoothness parameter \(\alpha \). The term “digit-by-digit” derives from a similar approach that exists for ordinary lattice point sets (see [12, 13], and, for similar results in a more up-to-date setting, [7]). In the context of polynomial lattice point sets, the generating vector \({\varvec{g}}\) consists of polynomials, so it would be more appropriate to speak of a “coefficient-by-coefficient” instead of a “digit-by-digit” construction. However, to stay consistent regarding the name of the method, and to avoid confusion with the “component-by-component” approach, we keep the name “digit-by-digit” construction also for polynomial lattice rules. The algorithm which we will present in Sect. 3.2 contains two loops: an outer loop in which the different components are constructed, and an inner loop in which the coefficients (digits) of each component of the generating vector are constructed. Both loops can be regarded as greedy, i.e., choices that have been made in previous steps are kept fixed.

We will show that the polynomial lattice rules obtained by our new construction method (Algorithm 1) satisfy upper error bounds that are arbitrarily close to the optimal convergence rate. In particular, we will prove the following main result regarding the behavior of the worst-case error \(e_{b^m,d,\alpha ,{\varvec{\gamma }}}\) of integration in the weighted function space under consideration.

Theorem 1

Let b be prime, let \(m,d \in {\mathbb {N}}\) with \(m\ge 4\), let \(N=b^m\), and let \((\gamma _j)_{j\ge 1}\) be positive product weights satisfying

$$\begin{aligned} \sum _{j \ge 1} \gamma _j < \infty . \end{aligned}$$

Then, for any \(\delta >0\) and each \(\alpha >1\), the worst-case error of the polynomial lattice rule with generating vector \({\varvec{g}}\) constructed by Algorithm 1 (with weight sequence \({\varvec{\gamma }}= (\gamma _j)_{j \ge 1}\)) satisfies

$$\begin{aligned} e_{b^m,d,\alpha ,{\varvec{\gamma }}^\alpha }({\varvec{g}}) \le \frac{1}{N^\alpha } \left( C({\varvec{\gamma }}^\alpha ) + \bar{C}\left( {\varvec{\gamma }},\delta \right) \, N^{\alpha \delta } \right) , \end{aligned}$$

with positive constants \(C({\varvec{\gamma }}^\alpha )\) and \(\bar{C}\left( {\varvec{\gamma }},\delta \right) \), that are independent of d and N.

Theorem 1 shows that under suitable conditions on the coordinate weights, we can overcome the curse of dimensionality, i.e., avoid exponential dependence of the error on the dimension d of the integration problem, or even obtain error bounds that are independent of the dimension. Furthermore, the devised algorithm (when run with weights \({\varvec{\gamma }}\)) constructs good polynomial lattice rules for which the proven bounds on the worst-case error hold simultaneously for all \(\alpha >1\). This can be an advantage over standard construction algorithms, which are often tailored to a particular function space, so that the corresponding error bounds only hold for a particular \(\alpha \). We will discuss this issue in more detail in the remainder of the article.

We note that Theorem 1 is a simplified version of Theorem 8, which is proven in Sect. 3. As the proof of Theorem 8, which is essential to the derivation of our algorithm, is fairly technical, we briefly outline the proof strategy used in Sect. 3.

The main idea is to relate our error measure, which is related to the dual net of the polynomial lattice rule, to an auxiliary quantity \(T_{\alpha ,{\varvec{\gamma }}}\) that only depends on a subset of the dual net. The difference of these two quantities will be shown to be of an order of magnitude that is at least as good as what we may expect for the error itself, which justifies concentrating on \(T_{\alpha ,{\varvec{\gamma }}}\) only (see Proposition 1). As a next step, we show in Theorem 4 that \(T_{\varvec{\gamma }}:= T_{1,{\varvec{\gamma }}}\) can be bounded by the quality function \(H_{d,m,{\varvec{\gamma }}}\) (modulo some additive terms that can be controlled), and thus that it suffices to minimize this quality function in order to devise good polynomial lattice rules. Analyzing the average of \(H_{d,m,{\varvec{\gamma }}}\) with respect to the choices of extending the degree of the components of the generating vector (Lemma 7) then leads to the formulation of the component-by-component digit-by-digit construction in Algorithm 1. The remaining step is to show that \(H_{d,m,{\varvec{\gamma }}}\) for the constructed polynomial lattice rules is of order \({\mathcal {O}}(b^{-m})\), which is done via an inductive argument over the dimension in Theorems 5 and 6.

Due to its structure, the obtained component-by-component digit-by-digit (CBC-DBD) algorithm can be implemented in a fast manner that only requires \({\mathcal {O}}(d \, m \, b^m)\) operations. This is competitive with state-of-the-art CBC constructions; moreover, our algorithm requires neither fast Fourier transforms nor the underlying machinery of circulant matrices. We see this as a benefit of our construction.

This paper is an extension of the research on the CBC-DBD construction of lattice point sets (cf. [7]) to a special construction method for polynomial lattice point sets. As the names of the point sets already suggest, there are certain similarities between the two approaches, but instead of working with integer arithmetic, as is the case for lattice point sets, we now need to work with polynomial arithmetic over a finite field, which makes the result technically more demanding. In particular, we would like to stress that, while in the earlier paper [7] on lattice rules we restricted ourselves to N of the form \(2^m\) for technical reasons, here we were able to generalize this assumption to arbitrary prime bases b instead of \(b=2\), i.e., we allow N to be any prime power. This implies that we can cover a more general class of point sets as compared to [7]. The main work underlying this more general result is to be found in Lemma 1, in which we analyze a Walsh–Dirichlet kernel.

Our new algorithm thus allows for the efficient construction of good polynomial lattice point sets, with a running time competitive with other common construction methods, but without the need for the fast Fourier transform; this is an advantage of the presented method. This is all the more important since it is known from the theory developed by Dick (see, e.g., the monograph [5]) that certain variants of polynomial lattice rules (so-called higher-order polynomial lattice rules) allow for improved convergence rates for smoother functions, and it would be of great interest to make the method presented here applicable to that case as well; we hope to lay the basis for this in the present paper. We stress that analogous results for (ordinary) lattice rules regarding the integration of smooth functions are not known, which makes it all the more important to have the CBC-DBD construction available for polynomial lattice rules.

The rest of the paper is structured as follows. In Sect. 2, we introduce the function space setting as well as polynomial lattice rules and analyze the corresponding worst-case error expression. In Sect. 3, we derive the component-by-component digit-by-digit (or, for short, CBC-DBD) construction algorithm for polynomial lattice rules following the proof strategy outlined above. In particular, we analyze the worst-case error behavior of the resulting integration rules. In Sect. 4, we show that the introduced construction method can be implemented in a fast manner, competitive with state-of-the-art construction algorithms. Finally, the article is concluded in Sect. 5, where we illustrate our main results by extensive numerical experiments.

To conclude this introductory section, we fix some notation. In what follows, we denote the set of positive integers by \({\mathbb {N}}\) and the set of non-negative integers by \({\mathbb {N}}_0\). To denote subsets of components, we use fraktur font, e.g., \({\mathfrak {u}}\subset {\mathbb {N}}\) and additionally write shorthand \(\{1 {{:}}d\} := \{1,\ldots ,d\}\). For the projection of a vector \({\varvec{x}}\in [0,1]^d\) or \({\varvec{k}}\in {\mathbb {N}}^d\) onto the components in a set \({\mathfrak {u}}\subseteq \{1 {{:}}d\}\), we write \({\varvec{x}}_{\mathfrak {u}}= (x_j)_{j \in {\mathfrak {u}}}\) or \({\varvec{k}}_{\mathfrak {u}}= (k_j)_{j \in {\mathfrak {u}}}\), respectively. With a slight abuse of notation, we will frequently identify elements of the finite field \({\mathbb {F}}_b\) of prime cardinality b with elements of the group of integers modulo b denoted by \({\mathbb {Z}}_b\).

2 Polynomial Lattice Rules in Weighted Walsh Spaces

In this article, we consider numerical integration of a subclass of the square-integrable functions \(f \in L^2([0,1]^d)\), namely those which can be represented in terms of their Walsh series. This particular series representation of a function is based on the so-called Walsh functions, which are defined as follows.

Definition 1

Let \(b \ge 2\) be an integer. For a non-negative integer k, we define the k-th Walsh function \(_b\mathrm{wal}_k :[0,1) \rightarrow {\mathbb {C}}\) by

$$\begin{aligned} _b\mathrm{wal}_k (x) := \mathrm{e}^{2\pi \mathrm{i}(\kappa _0 \xi _1 + \kappa _1 \xi _2 + \cdots + \kappa _{a-1} \xi _{a})/b} \end{aligned}$$

with \(x\in [0,1)\) and base b representations \(k=\kappa _0 + \kappa _1 b + \cdots + \kappa _{a-1} b^{a-1}\) and \(x=\xi _1 b^{-1} + \xi _2 b^{-2} + \cdots \), with coefficients \(\kappa _i, \xi _i \in \{0,1,\ldots ,b-1\}\). Here, we impose the restriction that infinitely many of the \(\xi _i\) are different from \(b-1\), which ensures that the representation of x is unique and that no ambiguous representations arise.

For \(d \in {\mathbb {N}}\), an integer vector \({\varvec{k}}=(k_1,\ldots ,k_d) \in {\mathbb {N}}_0^d\) and \({\varvec{x}}=(x_1,\ldots ,x_d)\in [0,1)^d\), we define the \({\varvec{k}}\)-th (d-variate) Walsh function \(_b\mathrm{wal}_{{\varvec{k}}} :[0,1)^d \rightarrow {\mathbb {C}}\) by

$$\begin{aligned} _b\mathrm{wal}_{{\varvec{k}}} ({\varvec{x}}) := \prod _{j=1}^d \ _b\mathrm{wal}_{k_j} (x_j) . \end{aligned}$$

In the following, we will consider the base \(b \ge 2\) as fixed (for the sake of simplicity, we will assume that b is prime), and then simply write \(\mathrm{wal}_k\) or \(\mathrm{wal}_{{\varvec{k}}}\) instead of \(_b\mathrm{wal}_k\) or \(_b\mathrm{wal}_{{\varvec{k}}}\), respectively. It is known (see, e.g., [5]) that the Walsh functions in any fixed base b form an orthonormal basis of \(L^2 ([0,1]^d)\).
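As an illustration, the following Python sketch evaluates the Walsh functions of Definition 1 directly from the digit expansions. The helper names are ours; the digit extraction of x is exact for points with a finite base-b expansion (as is the case for the net points considered later), and only as many digits of x are used as k has digits, which suffices by the definition.

```python
import cmath

def b_digits(n, b):
    """Base-b digits of n, least significant first (kappa_0, kappa_1, ...)."""
    ds = []
    while n > 0:
        n, r = divmod(n, b)
        ds.append(r)
    return ds

def wal(k, x, b=2):
    """k-th Walsh function in base b at x, via the digit formula of Definition 1."""
    kappa = b_digits(k, b)
    e = 0
    for kap in kappa:
        x *= b
        xi = int(x)          # next base-b digit xi_i of x (exact for b-adic x)
        x -= xi
        e += kap * xi
    return cmath.exp(2j * cmath.pi * e / b)

def wal_multi(kvec, xvec, b=2):
    """d-variate Walsh function as a product of univariate ones."""
    out = 1 + 0j
    for k, x in zip(kvec, xvec):
        out *= wal(k, x, b)
    return out

# Sanity check related to orthonormality: over the grid {0, 1/8, ..., 7/8}
# in base 2, the mean of wal_k is 1 for k = 0 and 0 for 0 < k < 8.
for k in range(8):
    s = sum(wal(k, n / 8) for n in range(8)) / 8
    print(k, round(abs(s - (1 if k == 0 else 0)), 12))
```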

As indicated, we consider a class of square-integrable functions that can be represented in terms of their Walsh series, that is,

$$\begin{aligned} f({\varvec{x}}) = \sum _{{\varvec{k}}\in {\mathbb {N}}_0^d} \hat{f}({\varvec{k}}) \, \mathrm{wal}_{\varvec{k}}({\varvec{x}}) \quad \text {with} \quad \hat{f}({\varvec{k}}) := \int _{[0,1]^d} f({\varvec{x}}) \, \overline{\mathrm{wal}_{\varvec{k}}({\varvec{x}})} \,\mathrm{d}{\varvec{x}}, \end{aligned}$$
(1)

where we call \(\hat{f}({\varvec{k}})\) the \({\varvec{k}}\)-th Walsh coefficient of f.

It is known from the literature on QMC methods in the past decades that it is advantageous to choose the integration nodes of a QMC rule such that there exists an efficient way of expressing the integration error for elements in the function class under consideration. In the case where the integrand f can be represented in terms of a Walsh series as in (1), it is common to consider quasi-Monte Carlo rules which are based on so-called digital nets and sequences. Digital (t,m,d)-nets are point sets consisting of \(b^m\) elements in \([0,1]^d\) that satisfy certain regular distribution properties and were introduced in their most general form in [15] (see also [16]). These point sets are generated by using d generating matrices \(C_1,\ldots ,C_d\) over a finite field or ring. In particular, for a digital (t,m,d)-net \(P=\{{\varvec{x}}_0,\ldots ,{\varvec{x}}_{b^m-1}\} \subset [0,1]^d\) constructed over \({\mathbb {Z}}_b=\{0,1,\ldots ,b-1\}\) with generating matrices \(C_1,\ldots ,C_d \in {\mathbb {Z}}_b^{m \times m}\), the integration error of a QMC rule based on P takes a special form. It is well known, see, e.g., [3, Theorem 6.4], that approximating the integral \(I_d(f)\) of a d-variate function f using a QMC rule \(Q_{b^m,d}(f;P)\), that is,

$$\begin{aligned} Q_{b^m,d}(f)=Q_{b^m,d}(f;P) := \frac{1}{b^m} \sum _{n=0}^{b^m-1} f({\varvec{x}}_n) \approx \int _{[0,1]^d} f({\varvec{x}}) \,\mathrm{d}{\varvec{x}}=: I_d(f) , \end{aligned}$$

leads to an integration error of the form

$$\begin{aligned} Q_{b^m,d}(f;P) - I_d(f) = \sum _{{\varvec{0}}\ne {\varvec{k}}\in {\mathcal {D}}} \hat{f}({\varvec{k}}) \end{aligned}$$
(2)

with the dual net \({\mathcal {D}}= {\mathcal {D}}(C_1,\ldots ,C_d) := \{{\varvec{k}}\in {\mathbb {N}}_0^d \mid C_1^\top \mathbf {k}^m_1 + \dots + C_d^\top \mathbf {k}^m_d =\overline{{\varvec{0}}} \}\), where for \(k \in {\mathbb {N}}_0\) with base b expansion \(k = \kappa _0 + \kappa _1 b + \dots + \kappa _a b^a\) we define the vector \(\mathbf {k}^m=(\kappa _0,\kappa _1, \ldots , \kappa _{m-1}) \in {\mathbb {Z}}_b^m\) (so that \(\mathbf {k}^m_j\) denotes the digit vector of the j-th component \(k_j\) of \({\varvec{k}}\)), and where we denote by \(\overline{{\varvec{0}}}\) the zero vector in \({\mathbb {Z}}_b^m\). Equation (2) is a consequence of the following character property of Walsh functions,

$$\begin{aligned} \frac{1}{b^m}\sum \limits _{n=0}^{b^m-1} \mathrm {wal}_{{\varvec{k}}} ({\varvec{x}}_n) =\left\{ \begin{array}{ll} 1, &{} \quad {\text {if }} C_1^\top \mathbf {k}^m_1 + \dots + C_d^\top \mathbf {k}^m_d = \overline{{\varvec{0}}} , \\ 0, &{} \quad {\text {otherwise.}} \end{array}\right. \end{aligned}$$

We will also use this property in the subsequent analysis.
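The character property can be checked exhaustively on a small example. The sketch below uses hypothetical \(2\times 2\) generating matrices over \({\mathbb {Z}}_2\) (our own illustrative choice), builds the digital net from them, and compares the average of \(\mathrm{wal}_{{\varvec{k}}}\) over the net points with the dual-net membership test \(C_1^\top \mathbf {k}^m_1 + C_2^\top \mathbf {k}^m_2 = \overline{{\varvec{0}}}\).

```python
import cmath
from itertools import product

b, m, d = 2, 2, 2
# Hypothetical m x m generating matrices over Z_b, rows as tuples.
C = [((1, 0), (0, 1)),     # C_1 = identity
     ((1, 1), (0, 1))]     # C_2

def digs(k):
    """First m base-b digits (kappa_0, ..., kappa_{m-1})."""
    return tuple((k // b**i) % b for i in range(m))

def matvec(M, v):
    return tuple(sum(row[i] * v[i] for i in range(m)) % b for row in M)

def transpose(M):
    return tuple(tuple(M[r][c] for r in range(m)) for c in range(m))

def wal_at_digits(k, y):
    """wal_k at the point whose base-b digits are y = (y_1, ..., y_m)."""
    kappa = digs(k)
    return cmath.exp(2j * cmath.pi * sum(kappa[i] * y[i] for i in range(m)) / b)

def char_sum(kvec):
    """(1/b^m) * sum_n wal_kvec(x_n) over the digital net with matrices C."""
    total = 0j
    for n in range(b**m):
        nvec = digs(n)
        w = 1 + 0j
        for kj, Cj in zip(kvec, C):
            w *= wal_at_digits(kj, matvec(Cj, nvec))  # digits of x_n^(j)
        total += w
    return total / b**m

def in_dual(kvec):
    """Membership test C_1^T k_1^m + ... + C_d^T k_d^m = 0 in Z_b^m."""
    s = [0] * m
    for kj, Cj in zip(kvec, C):
        s = [(a + x) % b for a, x in zip(s, matvec(transpose(Cj), digs(kj)))]
    return all(x == 0 for x in s)

# The character property, checked exhaustively for all k_j < b^m.
for kvec in product(range(b**m), repeat=d):
    assert abs(char_sum(kvec) - (1 if in_dual(kvec) else 0)) < 1e-12
print("character property verified")
```

For \(k_j < b^m\), the Walsh function only sees the first m digits of each coordinate, so the check is exact in this range.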

2.1 The Weighted Walsh Space

Based on the decay of the Walsh coefficients \(\hat{f}({\varvec{k}})\) in (1), we will define a function space for the integrands considered in this paper. As mentioned in the introduction, this space will be equipped with weights to model the varying influence of the coordinates. To this end, let \({\varvec{\gamma }}=(\gamma _j)_{j\ge 1}\) be a non-increasing sequence of positive real numbers. The weights \(\gamma _j\) will appear in the definition of the inner product and norm of the function space defined below. Intuitively, we can think of the weight \(\gamma _j\) as describing the degree of influence of the j-th variable on the integration problem. Hence, we assume (w.l.o.g.) that the coordinates are ordered according to their influence. It will also be convenient to define

$$\begin{aligned} \gamma _{\mathfrak {u}}:= \prod _{j\in {\mathfrak {u}}} \gamma _j \end{aligned}$$

for a subset \({\mathfrak {u}}\subseteq \{1 {{:}}d\}\), and to additionally set \(\gamma _{\emptyset } := 1\). The weights \(\gamma _{\mathfrak {u}}\) are (for obvious reasons) called product weights. In the recent literature on QMC rules, other types of weights have also been considered, but we will restrict ourselves to product weights here. We refer to [3] for further information on this subject.

For prime base \(b \ge 2\) and given smoothness parameter \(\alpha > 1\), we set \(\psi _b(k):=\left\lfloor {\log _b(k)}\right\rfloor \) for \(k \in {\mathbb {N}}\) and define the decay function \(r_\alpha : {\mathbb {N}}_0 \rightarrow {\mathbb {R}}\) by

$$\begin{aligned} r_\alpha (k)=r_\alpha (b,k) := \left\{ \begin{array}{ll} 1, &{} \quad {\text {if }} k=0 , \\ b^{\alpha \psi _b (k)}, &{}\quad {\text {if }} k\ne 0 , \end{array}\right. \end{aligned}$$

with \(k\in {\mathbb {N}}_0\). It is also convenient to define the quantity

$$\begin{aligned} \mu _b (\alpha ) := \sum _{k=1}^\infty (r_{\alpha } (k))^{-1} =\sum _{a=0}^\infty \frac{1}{b^{a\alpha }} \sum _{k=b^a}^{b^{a+1}-1} 1=\sum _{a=0}^\infty \frac{(b-1)b^a}{b^{a\alpha }} =\frac{b^{\alpha }(b-1)}{b^\alpha - b} . \end{aligned}$$
(3)
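The closed form in (3) can be sanity-checked numerically against a partial sum of the defining series; the function names below are our own.

```python
def r_alpha(k, b, alpha):
    """Decay function r_alpha(b, k): 1 if k = 0, else b^(alpha * floor(log_b k))."""
    if k == 0:
        return 1.0
    a = 0
    while b ** (a + 1) <= k:          # a = floor(log_b(k)), integer arithmetic
        a += 1
    return float(b) ** (alpha * a)

def mu_closed(b, alpha):
    """Closed form (3): mu_b(alpha) = b^alpha (b - 1) / (b^alpha - b)."""
    return b**alpha * (b - 1) / (b**alpha - b)

# Partial sum of sum_{k >= 1} 1/r_alpha(k) up to k < b^9 versus the closed form.
b, alpha = 3, 2.0
partial = sum(1.0 / r_alpha(k, b, alpha) for k in range(1, b**9))
print(partial, mu_closed(b, alpha))
```

For \(b=3\) and \(\alpha =2\), the closed form gives exactly 3, and the partial sum agrees up to the geometric tail of the series.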

For the multivariate case with dimension \(d\in {\mathbb {N}}\), integer vector \({\varvec{k}}=(k_1,\ldots ,k_d) \in {\mathbb {N}}_0^d\), and a sequence of weights \({\varvec{\gamma }}=(\gamma _j)_{j\ge 1}\), we define the weighted decay functions

$$\begin{aligned} r_\alpha ({\varvec{k}}) := \prod _{j=1}^d r_\alpha (k_j) \quad \text {and} \quad r_{\alpha ,{\varvec{\gamma }}}({\varvec{k}}) := \gamma _{\mathrm{supp}({\varvec{k}})}^{-1} \, r_\alpha ({\varvec{k}}) = \gamma _{\mathrm{supp}({\varvec{k}})}^{-1} \prod _{j \in \mathrm{supp}({\varvec{k}})} b^{\alpha \psi _b(k_j)} \end{aligned}$$

with \(\mathrm{supp}({\varvec{k}}) := \{ j \in \{1 {{:}}d\} \mid k_j \ne 0 \}\).
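A direct transcription of the weighted decay function for product weights might look as follows; the particular weight sequence chosen is purely illustrative.

```python
def psi_b(k, b):
    """psi_b(k) = floor(log_b(k)) for k >= 1, computed with integer arithmetic."""
    a = 0
    while b ** (a + 1) <= k:
        a += 1
    return a

def r_alpha_gamma(kvec, gamma, b, alpha):
    """Weighted decay r_{alpha,gamma}(k) = gamma_supp(k)^{-1} * prod_{j in supp(k)}
    b^(alpha * psi_b(k_j)) for product weights gamma = (gamma_1, gamma_2, ...)."""
    out = 1.0
    for kj, gj in zip(kvec, gamma):
        if kj != 0:                    # only j in supp(k) contribute
            out *= float(b) ** (alpha * psi_b(kj, b)) / gj
    return out

# Example: b = 2, alpha = 2, hypothetical product weights gamma_j = 1/j^2.
gamma = [1.0 / j**2 for j in range(1, 4)]
print(r_alpha_gamma((0, 0, 0), gamma, 2, 2.0))   # empty support gives 1
print(r_alpha_gamma((3, 0, 4), gamma, 2, 2.0))
```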

Using this decay function, we can estimate the integration error obtained in (2) by

$$\begin{aligned} \left| {Q_{b^m,d}(f;P) - I_d(f)}\right|&=\left| {\sum _{{\varvec{0}}\ne {\varvec{k}}\in {\mathcal {D}}} \hat{f}({\varvec{k}})}\right| =\left| {\sum _{{\varvec{0}}\ne {\varvec{k}}\in {\mathbb {N}}_0^d} \hat{f}({\varvec{k}}) \, r_{\alpha ,{\varvec{\gamma }}}({\varvec{k}}) \, (r_{\alpha ,{\varvec{\gamma }}}({\varvec{k}}))^{-1} \, {\varvec{1}}_{{\mathcal {D}}}({\varvec{k}})}\right| \nonumber \\&\le \left( \sup _{{\varvec{k}}\in {\mathbb {N}}_0^d} |\hat{f}({\varvec{k}})| \, r_{\alpha ,{\varvec{\gamma }}}({\varvec{k}}) \right) \left( \sum _{{\varvec{0}}\ne {\varvec{k}}\in {\mathcal {D}}} (r_{\alpha ,{\varvec{\gamma }}}({\varvec{k}}))^{-1} \right) \end{aligned}$$
(4)

with \({\varvec{1}}_{{\mathcal {D}}}\) denoting the indicator function of the dual net \({\mathcal {D}}\). Based on this estimate, we define, for real \(\alpha >1\) and a sequence of strictly positive weights \({\varvec{\gamma }}= (\gamma _j)_{j\ge 1}\), the weighted Walsh space as

$$\begin{aligned} W_{d,{\varvec{\gamma }}}^{\alpha } := \{f \in L^2([0,1]^d) \mid \left\| {f}\right\| _{W_{d,{\varvec{\gamma }}}^{\alpha }} < \infty \} \end{aligned}$$

with corresponding norm \(\left\| {\cdot }\right\| _{W_{d,{\varvec{\gamma }}}^{\alpha }}\) given by

$$\begin{aligned} \left\| {f}\right\| _{W_{d,{\varvec{\gamma }}}^{\alpha }} :=\sup _{{\varvec{k}}\in {\mathbb {N}}_0^d} |\hat{f}({\varvec{k}})| \, r_{\alpha ,{\varvec{\gamma }}}({\varvec{k}}). \end{aligned}$$
(5)

Remark 1

We remark that the definition of the norm implies that functions in \(W_{d,{\varvec{\gamma }}}^{\alpha }\) have an absolutely convergent Walsh series which converges pointwise (see, e.g., [5]).

Remark 2

We would like to note here that in many recent papers (e.g., [2, 4]), a slightly different function space \(\widetilde{W}_{d,{\varvec{\gamma }}}^{\alpha }\) based on Walsh functions has been studied. In \(\widetilde{W}_{d,{\varvec{\gamma }}}^{\alpha }\), the norm is not given as an \(\infty \)-norm as in (5), but in the \(L^2\)-sense, i.e.,

$$\begin{aligned} \left\| {f}\right\| _{\widetilde{W}_{d,{\varvec{\gamma }}}^{\alpha }} := \Bigg ( \sum _{{\varvec{k}}\in {\mathbb {N}}_0^d} |\hat{f}({\varvec{k}})|^2 \, r_{\alpha ,{\varvec{\gamma }}}({\varvec{k}}) \Bigg )^\frac{1}{2} . \end{aligned}$$

This definition of the norm corresponds to alternatively applying Hölder’s inequality with \(p=q=2\) in the bound on the integration error that led to (4). As we will see below, the worst-case error expressions for \(W_{d,{\varvec{\gamma }}}^{\alpha }\) and \(\widetilde{W}_{d,{\varvec{\gamma }}}^{\alpha }\) are closely related to each other.

In order to assess the quality of the QMC methods constructed later on, we will use the worst-case error in the weighted Walsh space as the error criterion. Indeed, the worst-case error for the QMC rule \(Q_{b^m,d}(\cdot ;P)\) in the space \(W_{d,{\varvec{\gamma }}}^{\alpha }\) is defined as

$$\begin{aligned} e_{b^m,d,\alpha ,{\varvec{\gamma }}}(P) := \sup _{\begin{array}{c} f \in W_{d,{\varvec{\gamma }}}^{\alpha } \\ \Vert f\Vert _{W_{d,{\varvec{\gamma }}}^{\alpha }} \le 1 \end{array}} | I_d(f) - Q_{b^m,d}(f;P)| . \end{aligned}$$

A useful formula for the worst-case error of QMC rules based on (t,m,d)-nets in the function space \(W_{d,{\varvec{\gamma }}}^{\alpha }\) is given in the following theorem.

Theorem 2

Let \(m,d \in {\mathbb {N}}\), \(\alpha > 1\), a prime b, and a sequence of positive weights \({\varvec{\gamma }}= (\gamma _j)_{j\ge 1}\) be given. Then, the worst-case error \(e_{b^m,d,\alpha ,{\varvec{\gamma }}}(P)\) of the QMC rule \(Q_{b^m,d}(\cdot ;P)\) based on the digital (t,m,d)-net \(P=\{{\varvec{x}}_0,\ldots ,{\varvec{x}}_{b^m-1}\}\) with generating matrices \(C_1,\ldots ,C_d\) in the space \(W_{d,{\varvec{\gamma }}}^{\alpha }\) satisfies

$$\begin{aligned} e_{b^m,d,\alpha ,{\varvec{\gamma }}}(P) = \sum _{{\varvec{0}}\ne {\varvec{k}}\in {\mathcal {D}}} (r_{\alpha ,{\varvec{\gamma }}}({\varvec{k}}))^{-1} . \end{aligned}$$
(6)

Proof

Recalling the definition of the worst-case error of the QMC rule \(Q_{b^m,d}(\cdot ;P)\), the combination of (4) and the definition of \(\Vert \cdot \Vert _{W_{d,{\varvec{\gamma }}}^{\alpha }}\) leads to the estimate

$$\begin{aligned} e_{b^m,d,\alpha ,{\varvec{\gamma }}}(P) \le \sup _{\begin{array}{c} f \in W_{d,{\varvec{\gamma }}}^{\alpha } \\ \Vert f\Vert _{W_{d,{\varvec{\gamma }}}^{\alpha }} \le 1 \end{array}} \Vert f\Vert _{W_{d,{\varvec{\gamma }}}^{\alpha }} \sum _{{\varvec{0}}\ne {\varvec{k}}\in {\mathcal {D}}} (r_{\alpha ,{\varvec{\gamma }}}({\varvec{k}}))^{-1} \le \sum _{{\varvec{0}}\ne {\varvec{k}}\in {\mathcal {D}}} (r_{\alpha ,{\varvec{\gamma }}}({\varvec{k}}))^{-1} . \end{aligned}$$

Observing that the function \(f_0\) with Walsh coefficients \(\hat{f_0}({\varvec{k}}) =(r_{\alpha ,{\varvec{\gamma }}}({\varvec{k}}))^{-1}\) has norm \(\Vert f_0\Vert _{W_{d,{\varvec{\gamma }}}^{\alpha }} = 1\) and that its integration error equals

$$\begin{aligned} Q_{b^m,d}(f_0;P) - I_d(f_0) = \sum _{{\varvec{0}}\ne {\varvec{k}}\in {\mathcal {D}}} (r_{\alpha ,{\varvec{\gamma }}}({\varvec{k}}))^{-1} , \end{aligned}$$

we see that the upper bound is attained, and the claimed identity follows. \(\square \)

Remark 3

Returning to the alternative Walsh space \(\widetilde{W}_{d,{\varvec{\gamma }}}^{\alpha }\) once again, it is known from [4] that the worst-case error in this space equals

$$\begin{aligned} \left( \sum _{{\varvec{0}}\ne {\varvec{k}}\in {\mathcal {D}}} (r_{\alpha ,{\varvec{\gamma }}}({\varvec{k}}))^{-1}\right) ^{1/2}, \end{aligned}$$

which is just the square root of the worst-case error in \(W_{d,{\varvec{\gamma }}}^{\alpha }\), as outlined in Theorem 2. Therefore, we see that the worst-case errors in these Walsh spaces are intimately related to each other, and all results shown here for \(W_{d,{\varvec{\gamma }}}^{\alpha }\) immediately yield corresponding results for \(\widetilde{W}_{d,{\varvec{\gamma }}}^{\alpha }\).

2.2 Polynomial Lattice Rules

While Theorem 2 is a very useful result, the question of how to find and construct (t,m,d)-nets with a low integration error for practical purposes remains. One of the most powerful ways of obtaining nets is to consider a special case, namely so-called polynomial lattice point sets, as introduced by Niederreiter in [15]. The name “polynomial lattice point sets” is due to the fact that the structure of these point sets is similar to that of ordinary lattice point sets as introduced by Korobov [10] and Hlawka [9]. However, while lattice point sets are based on integer arithmetic, polynomial lattice point sets are obtained by using polynomial arithmetic over finite fields. We also point out that there are nowadays variants of polynomial lattice point sets which are especially suited for integrating functions with higher smoothness (see, e.g., [5]). However, we will not consider higher-order polynomial lattice rules here, but restrict ourselves to the more classical construction scheme. We point out that polynomial lattice point sets are actually a special case of so-called digital (t,m,d)-nets, which can be constructed using generating matrices \(C_1,\ldots ,C_d\) over a finite field. For our purposes, though, it is more convenient to define these point sets in an alternative way. Before we give the precise definition, we need to introduce some notation.

As before and for the remainder of this article, we will assume that the base b is prime, even if not specifically mentioned. Let \({\mathbb {F}}_b ((x^{-1}))\) be the field of formal Laurent series over \({\mathbb {F}}_b\) with elements of the form

$$\begin{aligned} L = \sum _{\ell =w}^\infty t_{\ell } x^{-\ell } , \end{aligned}$$

where w is an arbitrary integer and all \(t_{\ell }\in {\mathbb {F}}_b\). We further denote by \({\mathbb {F}}_b [x]\) the set of all polynomials over \({\mathbb {F}}_b\) and define the map \(v_m: {\mathbb {F}}_b ((x^{-1}))\rightarrow [0,1)\) by

$$\begin{aligned} v_m \left( \sum _{\ell =w}^\infty t_\ell \, x^{-\ell }\right) =\sum _{\ell =\max (1,w)}^m t_{\ell } \, b^{-\ell } . \end{aligned}$$

There is a close connection between the base b expansions of natural numbers and the polynomial ring \({\mathbb {F}}_b [x]\). For \(n\in {\mathbb {N}}_0\) with base b expansion \(n=n_0 + n_1 b + \cdots + n_a b^{a}\), we associate n with the polynomial

$$\begin{aligned} n(x) := \sum _{k=0}^a n_k \, x^k \in {\mathbb {F}}_b [x]. \end{aligned}$$

Furthermore, we define the greatest common divisor of two polynomials \(f,g \in {\mathbb {F}}_b[x]\), denoted by \(\gcd (f,g)\), as the monic polynomial of highest degree that divides both f and g.

The definition of a polynomial lattice point set is then given as follows. We note that here and in the following we consider the zero polynomial to have degree \(-\infty \), hence, the case \(n=0\) is included in the following definition. Furthermore, we remark that the occurring fractions of the form q(x) /p(x) with \(p,q \in {\mathbb {F}}_b[x]\) are elements of the field \({\mathbb {F}}_b ((x^{-1}))\).

Definition 2

(Polynomial lattice) Let b be prime and let \(m,d \in {\mathbb {N}}\) be given. Furthermore, choose \(p\in {\mathbb {F}}_b [x]\) with \(\deg (p)=m\), and let \(g_1,\ldots ,g_d \in {\mathbb {F}}_b [x]\). Then, the point set \(P({\varvec{g}},p)\), defined as the collection of the \(b^m\) points

$$\begin{aligned} {\varvec{x}}_n := \left( v_m \left( \frac{n(x)\, g_1(x)}{p(x)} \right) ,\ldots , v_m \left( \frac{n(x)\, g_d(x)}{p(x)} \right) \right) \in [0,1)^d \end{aligned}$$

for \(n \in {\mathbb {F}}_b[x]\) with \(\deg (n)<m\), is called a polynomial lattice point set (we sometimes also refer to the point set as polynomial lattice for short), where the vector \({\varvec{g}}=(g_1,\ldots ,g_d) \in ({\mathbb {F}}_b[x])^d\) is called the generating vector.
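To make Definition 2 concrete, note that the Laurent digits \(t_\ell\) of q(x)/p(x) can be generated by repeatedly multiplying the current remainder by x and dividing by the monic modulus p. The following Python sketch implements this for a small example with b = 2 and m = 3; all helper names and the choice of generating vector are ours for illustration and not part of the paper's formal development.

```python
b = 2                     # prime base
m = 3                     # deg(p) = m, so the point set has b**m = 8 points
p = [0, 0, 0, 1]          # p(x) = x^3 (monic); coefficient lists, index = power of x
g = ([1], [1, 1])         # g_1(x) = 1, g_2(x) = 1 + x (illustrative generating vector)

def poly_mul(f, h):
    """Product of two polynomials over F_b."""
    out = [0] * (len(f) + len(h) - 1)
    for i, fi in enumerate(f):
        for j, hj in enumerate(h):
            out[i + j] = (out[i + j] + fi * hj) % b
    return out

def poly_mod(q, p):
    """Remainder of q modulo the monic polynomial p over F_b."""
    q = q[:]
    while len(q) >= len(p):
        if q[-1] != 0:
            c, s = q[-1], len(q) - len(p)
            for i, pc in enumerate(p):
                q[s + i] = (q[s + i] - c * pc) % b
        q.pop()               # leading coefficient is now zero
    return q

def v_m(q, p):
    """v_m(q(x)/p(x)): map the first m Laurent digits t_1, ..., t_m to [0, 1)."""
    r = poly_mod(q, p)
    r += [0] * (m - len(r))   # pad the remainder to m coefficients
    val = 0.0
    for ell in range(1, m + 1):
        r = [0] + r           # multiply the current remainder by x
        t = r[m]              # next Laurent digit t_ell (p is monic)
        r = [(ri - t * pi) % b for ri, pi in zip(r, p)][:m]
        val += t * float(b) ** (-ell)
    return val

def int_to_poly(n):
    """Associate n = n_0 + n_1 b + ... with n(x) = n_0 + n_1 x + ... in F_b[x]."""
    digs = []
    while n:
        digs.append(n % b)
        n //= b
    return digs or [0]

points = [tuple(v_m(poly_mul(int_to_poly(n), gj), p) for gj in g)
          for n in range(b ** m)]
```

For this choice gcd(g_j, p) = 1, and indeed every one-dimensional projection of the computed point set is the full grid {0, 1/8, ..., 7/8}.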

As pointed out above, due to the construction principle and the similarities to the construction of (rank-1) lattices, \(P({\varvec{g}},p)\) is often called a (rank-1) polynomial lattice and a QMC rule using the point set \(P({\varvec{g}},p)\) is referred to as a polynomial lattice rule (modulo p). Furthermore, note that one can restrict the choice of the components \(g_j\) of \({\varvec{g}}\) to the sets

$$\begin{aligned} G_{b,m} := \{ g\in {\mathbb {F}}_b [x] \mid \deg (g)< m \} \quad \text {or} \quad G^*_{b,m} := \{ g\in {\mathbb {F}}_b [x] \setminus \{0\} \mid \deg (g) < m \} . \end{aligned}$$

We also add that it is known from the literature on polynomial lattice point sets that it is desirable to have \(\gcd (g_j,p)=1\) for the components \(g_j\) of \({\varvec{g}}\), as this guarantees certain regularity properties.

For prime b, the generating matrices \(C_1,\ldots ,C_d \in {\mathbb {F}}_b^{m \times m}\) of a polynomial lattice point set \(P({\varvec{g}},p)\) can be obtained from the generating vector \({\varvec{g}}\) and p, cf. [5, Theorem 10.5]. To this end, define for \(k \in {\mathbb {N}}_0\) with b-adic expansion \(k=\kappa _0 + \kappa _1 b + \cdots + \kappa _{a-1} b^{a-1}\) the truncation map \(\mathrm{tr}_m: {\mathbb {N}}_0 \rightarrow G_{b,m}\) via

$$\begin{aligned} \mathrm{tr}_m(k) := \kappa _0 + \kappa _1 x + \cdots +\kappa _{m-1} x^{m-1} , \end{aligned}$$

where we consider \(\kappa _j\) as 0 if \(j\ge a\). For a d-dimensional vector \({\varvec{k}}\), we define the d-variate generalization \(\mathrm{tr}_m({\varvec{k}})\) by applying \(\mathrm{tr}_m\) componentwise. It then follows that the dual net \({\mathcal {D}}({\varvec{g}},p)\) of a polynomial lattice with generating vector \({\varvec{g}}\), modulus p with \(\deg (p)=m\), and generating matrices \(C_1,\ldots ,C_d\) equals (see, e.g., [16, Lemma 4.40])

$$\begin{aligned} {\mathcal {D}}({\varvec{g}},p) = \{{\varvec{k}}\in {\mathbb {N}}_0^d \mid C_1^\top \mathbf {k}^m_1 + \dots + C_d^\top \mathbf {k}^m_d = \overline{{\varvec{0}}} \} = \{{\varvec{k}}\in {\mathbb {N}}_0^d \mid \mathrm{tr}_m({\varvec{k}}) \cdot {\varvec{g}}\equiv 0 {\;(\mathrm{mod}\;p)} \} , \end{aligned}$$

where for two vectors \({\varvec{u}},{\varvec{v}}\in ({\mathbb {F}}_b[x])^d\) we define the vector dot product \({\varvec{u}}\cdot {\varvec{v}}= \sum _{j=1}^{d} u_j v_j\). Furthermore, for a subset \({\mathfrak {u}}\subseteq \{1 {{:}}d\}\) we introduce the notation

$$\begin{aligned} {\mathcal {D}}_{\mathfrak {u}}= {\mathcal {D}}_{\mathfrak {u}}({\varvec{g}}, p) = {\mathcal {D}}_{\mathfrak {u}}({\varvec{g}}_{\mathfrak {u}}) := \{ {\varvec{k}}_{\mathfrak {u}}\in {\mathbb {N}}^{|{\mathfrak {u}}|} \mid \mathrm{tr}_m({\varvec{k}}_{\mathfrak {u}}) \cdot {\varvec{g}}_{\mathfrak {u}}\equiv 0 {\;(\mathrm{mod}\;p)} \} . \end{aligned}$$

Due to the obtained equivalence for the dual net of a polynomial lattice, the result in Theorem 2 also applies to polynomial lattice rules with \({\mathcal {D}}(C_1,\ldots ,C_d)\) replaced by \({\mathcal {D}}({\varvec{g}},p)\). Furthermore, we will henceforth denote the worst-case error of a QMC rule based on the polynomial lattice point set \(P({\varvec{g}},p)\) in the space \(W_{d,{\varvec{\gamma }}}^\alpha \) by \(e_{b^m,d,\alpha ,{\varvec{\gamma }}}({\varvec{g}})\).

2.3 The Quality Measure

In this section, we introduce an alternative quality measure which, in contrast to the worst-case error expression \(e_{b^m,d,\alpha ,{\varvec{\gamma }}}\) in (6), is independent of the parameter \(\alpha \).

For \(\alpha \ge 1\), given weight sequence \({\varvec{\gamma }}=(\gamma _j)_{j \ge 1}\), \(m \in {\mathbb {N}}\), modulus \(p \in {\mathbb {F}}_b[x]\) with \(\deg (p)=m\), and \({\varvec{g}}\in ({\mathbb {F}}_b[x])^d\), we define the quantities

$$\begin{aligned} T_{{\varvec{\gamma }}}({\varvec{g}},p) := \sum _{{\varvec{0}}\ne {\varvec{k}}\in A_p({\varvec{g}})} (r_{1,{\varvec{\gamma }}}({\varvec{k}}))^{-1} , \qquad T_{\alpha ,{\varvec{\gamma }}}({\varvec{g}},p) := \sum _{{\varvec{0}}\ne {\varvec{k}}\in A_p({\varvec{g}})}(r_{\alpha ,{\varvec{\gamma }}}({\varvec{k}}))^{-1} \end{aligned}$$
(7)

with index set given by

$$\begin{aligned} A_p({\varvec{g}}) := \{ {\varvec{k}}\in \{0,1,\ldots ,b^m-1\}^d \mid {\varvec{k}}\in {\mathcal {D}}({\varvec{g}}, p) \} . \end{aligned}$$

Furthermore, for a subset \(\emptyset \ne {\mathfrak {u}}\subseteq \{1 {{:}}d\}\), we introduce the sets

$$\begin{aligned} A_{\mathfrak {u}}&= A_{p,{\mathfrak {u}}}({\varvec{g}}_{\mathfrak {u}}) = A_{p,{\mathfrak {u}}}({\varvec{g}}) :=\{{\varvec{k}}_{\mathfrak {u}}\in \{0,1,\ldots ,b^m-1\}^{\left| {{\mathfrak {u}}}\right| } \mid {\varvec{k}}_{\mathfrak {u}}\in {\mathcal {D}}_{{\mathfrak {u}}}({\varvec{g}},p) \}, \\ A^*_{\mathfrak {u}}&= A^*_{p,{\mathfrak {u}}}({\varvec{g}}_{\mathfrak {u}}) =A^*_{p,{\mathfrak {u}}}({\varvec{g}}) :=\{ {\varvec{k}}_{\mathfrak {u}}\in \{1,\ldots ,b^m-1\}^{\left| {{\mathfrak {u}}}\right| } \mid {\varvec{k}}_{\mathfrak {u}}\in {\mathcal {D}}_{{\mathfrak {u}}}({\varvec{g}},p) \}, \end{aligned}$$

and for a polynomial \(p \in {\mathbb {F}}_b[x]\) define the indicator function \(\delta _p: {\mathbb {F}}_b[x] \rightarrow \{0,1\}\) by

$$\begin{aligned} \delta _p(q):={\left\{ \begin{array}{ll} 1, &{} \text {if } q \equiv 0 {\;(\mathrm{mod}\;p)}, \\ 0, &{} \text {if } q \not \equiv 0 {\;(\mathrm{mod}\;p)} . \end{array}\right. } \end{aligned}$$
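For small parameters, the quantities in (7) can be evaluated by brute force. The sketch below does this for b = 2, where addition in F_2[x] is bitwise XOR and polynomial multiplication is carry-less integer multiplication, with polynomials encoded bitwise as integers. The weights, the generating vector, and the explicit form (r_{1,gamma_j}(k))^{-1} = gamma_j b^{-floor(log_b k)} for k >= 1 (and 1 for k = 0), read off from the computations in the proof of Proposition 1, are our illustrative assumptions.

```python
from itertools import product
from math import floor, log2

b, m, d = 2, 3, 2
N = b ** m
mask = N - 1
gamma = [0.9, 0.5]        # illustrative product weights gamma_1, gamma_2
g = [1, 3]                # g_1(x) = 1, g_2(x) = 1 + x, encoded bitwise

def clmul(a, c):
    """Carry-less multiplication = polynomial product over F_2."""
    r = 0
    while c:
        if c & 1:
            r ^= a
        a <<= 1
        c >>= 1
    return r

def r1_inv(k, gam):
    """(r_{1,gamma_j}(k))^{-1}: 1 for k = 0, gamma_j * b^{-floor(log_b k)} otherwise."""
    return 1.0 if k == 0 else gam * 2.0 ** (-floor(log2(k)))

# enumerate A_p(g) for p(x) = x^m and accumulate T_gamma(g, p)
count, T = 0, 0.0
for k in product(range(N), repeat=d):
    dot = 0
    for kj, gj in zip(k, g):
        dot ^= clmul(kj, gj)          # tr_m(k) . g in F_2[x]
    if dot & mask == 0:               # tr_m(k) . g = 0 (mod x^m), i.e. k in A_p(g)
        count += 1
        if any(k):
            w = 1.0
            for kj, gam in zip(k, gamma):
                w *= r1_inv(kj, gam)
            T += w
```

Since g_1 = 1 is a unit modulo x^m, each choice of k_2 determines exactly one k_1 with tr_m(k) . g = 0 (mod x^m), so the enumeration finds |A_p(g)| = b^{m(d-1)} = 8 elements here.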

In the following proposition, we estimate the difference between the worst-case error \(e_{b^m,d,\alpha ,{\varvec{\gamma }}}({\varvec{g}})\) and the truncated quality measure \(T_{\alpha ,{\varvec{\gamma }}}({\varvec{g}},p)\) of a polynomial lattice rule with generator \({\varvec{g}}\) and modulus \(p \in {\mathbb {F}}_b[x]\) with \(\deg (p)=m\).

Proposition 1

Let \({\varvec{\gamma }}= (\gamma _j)_{j\ge 1}\) be a sequence of positive weights, let \(p \in {\mathbb {F}}_b[x]\) with \(\deg (p)=m\), and let \({\varvec{g}}= (g_1,\ldots ,g_d) \in G_{b,m}^d\) such that \(\gcd (g_j,p)=1\) for all \(j=1,\ldots ,d\). Then, for any \(\alpha >1\) and \(N=b^m\), we have

$$\begin{aligned} e_{b^m,d,\alpha ,{\varvec{\gamma }}}({\varvec{g}}) - T_{\alpha ,{\varvec{\gamma }}}({\varvec{g}},p) \le \frac{1}{N^{\alpha }} \sum _{\emptyset \ne {\mathfrak {u}}\subseteq \{1:d\}} \gamma _{{\mathfrak {u}}} (2 \mu _b(\alpha ))^{\left| {{\mathfrak {u}}}\right| } . \end{aligned}$$

Proof

For a non-empty subset \(\emptyset \ne {\mathfrak {u}}\subseteq \{1{{:}}d\}\) and \(i \in \{1{{:}}d\}\), we write for short \({\varvec{k}}_{{\mathfrak {u}}\setminus \{i\}} \in {\mathbb {N}}_0^{\left| {{\mathfrak {u}}}\right| -1}\) and \({\varvec{g}}_{{\mathfrak {u}}\setminus \{i\}} \in G_{b,m}^{\left| {{\mathfrak {u}}}\right| -1}\) to denote the projections on the components in \({\mathfrak {u}}\setminus \{i\}\). The difference can then be rewritten as

$$\begin{aligned}&e_{b^m,d,\alpha ,{\varvec{\gamma }}}({\varvec{g}}) - T_{\alpha ,{\varvec{\gamma }}}({\varvec{g}},p)\\&\quad =\sum _{\emptyset \ne {\mathfrak {u}}\subseteq \{1:d\}} \left( \sum _{{\varvec{k}}_{\mathfrak {u}}\in {\mathcal {D}}_{{\mathfrak {u}}}({\varvec{g}}_{\mathfrak {u}})} (r_{\alpha ,{\varvec{\gamma }}}({\varvec{k}}_{\mathfrak {u}}))^{-1} - \sum _{{\varvec{k}}_{\mathfrak {u}}\in A^*_{p,{\mathfrak {u}}}({\varvec{g}}_{\mathfrak {u}})} (r_{\alpha ,{\varvec{\gamma }}}({\varvec{k}}_{\mathfrak {u}}))^{-1}\right) , \end{aligned}$$

motivating us to define the quantity

$$\begin{aligned} S_{\alpha ,{\varvec{\gamma }},{\mathfrak {u}}}:= \sum _{{\varvec{k}}_{\mathfrak {u}}\in {\mathcal {D}}_{{\mathfrak {u}}}({\varvec{g}}_{\mathfrak {u}})} (r_{\alpha ,{\varvec{\gamma }}}({\varvec{k}}_{\mathfrak {u}}))^{-1} - \sum _{{\varvec{k}}_{\mathfrak {u}}\in A^*_{p,{\mathfrak {u}}} ({\varvec{g}}_{\mathfrak {u}})} (r_{\alpha ,{\varvec{\gamma }}}({\varvec{k}}_{\mathfrak {u}}))^{-1} \end{aligned}$$

for \(\emptyset \ne {\mathfrak {u}}\subseteq \{1:d\}\). In the following, we distinguish two cases.

Case 1: Suppose that \(\left| {{\mathfrak {u}}}\right| =1\), i.e., \({\mathfrak {u}}= \{j\}\) for some \(j \in \{1{{:}}d\}\). Then, we have

$$\begin{aligned} S_{\alpha ,{\varvec{\gamma }},\{j\}}&= \sum _{\begin{array}{c} k \in {\mathbb {N}}\\ \mathrm{tr}_m(k) \, g_j \equiv 0 {\;(\mathrm{mod}\;p)} \end{array}} (r_{\alpha ,\gamma _j}(k))^{-1} - \sum _{\begin{array}{c} k \in \{1,\ldots ,b^m-1\} \\ \mathrm{tr}_m(k) \, g_j \equiv 0 {\;(\mathrm{mod}\;p)} \end{array}} (r_{\alpha ,\gamma _j}(k))^{-1}\\&= \sum _{\begin{array}{c} k \ge b^m \\ \mathrm{tr}_m(k) \, g_j \equiv 0 {\;(\mathrm{mod}\;p)} \end{array}} (r_{\alpha ,\gamma _j}(k))^{-1} . \end{aligned}$$

Note that \(\mathrm{tr}_m(k) \, g_j \equiv 0 {\;(\mathrm{mod}\;p)}\) if and only if there is a \(c \in {\mathbb {F}}_b[x]\) such that \(\mathrm{tr}_m(k) \, g_j = c p\) and thus, since \(\gcd (g_j,p)=1\), we have that \(\mathrm{tr}_m(k) = a p\) for some \(a \in {\mathbb {F}}_b[x]\). But \(\deg (\mathrm{tr}_m(k)) < m\) while \(\deg (p)=m\), which implies that \(\mathrm{tr}_m(k) = 0\) and thus \(k = t \, b^m\) for some \(t \in {\mathbb {N}}\). This yields

$$\begin{aligned} S_{\alpha ,{\varvec{\gamma }},\{j\}}&= \sum _{t=1}^{\infty } (r_{\alpha ,\gamma _j}(t \, b^m))^{-1} =\gamma _j \sum _{t=1}^{\infty } b^{-\alpha \left\lfloor {\log _b{t \, b^m}}\right\rfloor } = \gamma _j \sum _{t=1}^{\infty } b^{-\alpha \left\lfloor {m + \log _b{t}}\right\rfloor } \\&= \gamma _j \sum _{t=1}^{\infty } b^{-\alpha m} \, b^{-\alpha \left\lfloor {\log _b{t}}\right\rfloor } = \frac{\gamma _j}{b^{\alpha m}} \sum _{t=1}^{\infty } b^{-\alpha \psi _b(t)} = \gamma _j \frac{\mu _b(\alpha )}{b^{\alpha m}} . \end{aligned}$$

Case 2: Suppose that \(\left| {{\mathfrak {u}}}\right| \ge 2\). In this case, we find that

$$\begin{aligned} S_{\alpha ,{\varvec{\gamma }},{\mathfrak {u}}} \le \sum _{i\in {\mathfrak {u}}} \sum _{{\varvec{k}}_{{\mathfrak {u}}\setminus \{i\}} \in {\mathbb {N}}^{\left| {{\mathfrak {u}}}\right| -1}} \sum _{k_i \ge b^m} \frac{\delta _p (\mathrm{tr}_m(k_i) g_i + \mathrm{tr}_m({\varvec{k}}_{{\mathfrak {u}}\setminus \{i\}})\cdot {\varvec{g}}_{{\mathfrak {u}}\setminus \{i\}})}{r_{\alpha ,{\varvec{\gamma }}}({\varvec{k}}_{\mathfrak {u}})} . \end{aligned}$$

Then, for \({\varvec{k}}_{{\mathfrak {u}}\setminus \{i\}} \in {\mathbb {N}}^{\left| {{\mathfrak {u}}}\right| -1}\), we write \(q = \mathrm{tr}_m({\varvec{k}}_{{\mathfrak {u}}\setminus \{i\}})\cdot {\varvec{g}}_{{\mathfrak {u}}\setminus \{i\}}\) and estimate the expression

$$\begin{aligned}&\sum _{k_i \ge b^m} \frac{\delta _p (\mathrm{tr}_m(k_i) g_i + q)}{r_{\alpha ,{\varvec{\gamma }}}({\varvec{k}}_{\mathfrak {u}})} \\&\quad = \gamma _{{\mathfrak {u}}} \sum _{k_i \ge b^m} \frac{\delta _p (\mathrm{tr}_m(k_i) g_i + q)}{\prod _{j \in {\mathfrak {u}}} b^{\alpha \left\lfloor {\log _b{k_j}}\right\rfloor }}\\&\quad = \gamma _{\mathfrak {u}}\prod _{\begin{array}{c} j\in {\mathfrak {u}}\\ j\ne i \end{array}} b^{-\alpha \left\lfloor {\log _b{k_j}}\right\rfloor } \sum _{k_i \ge b^m} \frac{\delta _p (\mathrm{tr}_m(k_i) g_i + q)}{b^{\alpha \left\lfloor {\log _b{k_i}}\right\rfloor }} \\&\quad =\gamma _{\mathfrak {u}}\prod _{\begin{array}{c} j\in {\mathfrak {u}}\\ j\ne i \end{array}} b^{-\alpha \left\lfloor {\log _b{k_j}}\right\rfloor } \sum _{t=1}^\infty \sum _{k_i=t b^m}^{(t+1)b^m-1} \frac{\delta _p (\mathrm{tr}_m(k_i) g_i + q)}{b^{\alpha \left\lfloor {\log _b{k_i}}\right\rfloor }} \\&\quad \le \gamma _{\mathfrak {u}}\prod _{\begin{array}{c} j\in {\mathfrak {u}}\\ j\ne i \end{array}} b^{-\alpha \left\lfloor {\log _b{k_j}}\right\rfloor } \sum _{t=1}^\infty b^{-\alpha \left\lfloor {\log _b{t b^m}}\right\rfloor } \underbrace{\sum _{k_i=t b^m}^{(t+1)b^m-1} \delta _p (\mathrm{tr}_m(k_i) g_i + q)}_{= 1} \\&\quad =\gamma _{\mathfrak {u}}\prod _{\begin{array}{c} j\in {\mathfrak {u}}\\ j\ne i \end{array}} b^{-\alpha \left\lfloor {\log _b{k_j}}\right\rfloor } \sum _{t=1}^\infty b^{-\alpha \left\lfloor {m + \log _b t}\right\rfloor } =\gamma _{\mathfrak {u}}\frac{\mu _b(\alpha )}{b^{\alpha m}} \prod _{\begin{array}{c} j\in {\mathfrak {u}}\\ j\ne i \end{array}} b^{-\alpha \left\lfloor {\log _b{k_j}}\right\rfloor } , \end{aligned}$$

where the penultimate equality follows since if \(\gcd (g_i,p)=1\) then for each t and each \(q \in {\mathbb {F}}_b[x]\) there exists exactly one \(k \in \{t b^m,\ldots , (t+1)b^m -1\}\) such that \(\mathrm{tr}_m(k) g_i + q \equiv 0 {\;(\mathrm{mod}\;p)}\).

Hence, we can estimate \(S_{\alpha ,{\varvec{\gamma }},{\mathfrak {u}}}\), for \(|{\mathfrak {u}}| \ge 2\), by

$$\begin{aligned} S_{\alpha ,{\varvec{\gamma }},{\mathfrak {u}}}&\le \sum _{i\in {\mathfrak {u}}} \sum _{{\varvec{k}}_{{\mathfrak {u}}\setminus \{i\}} \in {\mathbb {N}}^{\left| {{\mathfrak {u}}}\right| -1}} \gamma _{\mathfrak {u}}\frac{\mu _b(\alpha )}{b^{\alpha m}} \prod _{\begin{array}{c} j\in {\mathfrak {u}}\\ j\ne i \end{array}} b^{-\alpha \left\lfloor {\log _b{k_j}}\right\rfloor }\\&=\gamma _{\mathfrak {u}}\frac{\mu _b(\alpha )}{b^{\alpha m}} \sum _{i\in {\mathfrak {u}}} \left( \sum _{k=1}^{\infty } b^{-\alpha \left\lfloor {\log _b{k}}\right\rfloor } \right) ^{\left| {{\mathfrak {u}}}\right| -1} \\&= \gamma _{\mathfrak {u}}\frac{\mu _b(\alpha )}{b^{\alpha m}} \sum _{i\in {\mathfrak {u}}} \mu _b(\alpha )^{\left| {{\mathfrak {u}}}\right| -1} =\gamma _{\mathfrak {u}}\frac{\mu _b(\alpha )^{\left| {{\mathfrak {u}}}\right| }}{b^{\alpha m}} \left| {{\mathfrak {u}}}\right| \le \gamma _{\mathfrak {u}}\frac{1}{N^\alpha } (2 \mu _b(\alpha ))^{\left| {{\mathfrak {u}}}\right| } . \end{aligned}$$

In summary, we obtain, using the results for both cases from above,

$$\begin{aligned} \sum _{\emptyset \ne {\mathfrak {u}}\subseteq \{1:d\}} S_{\alpha ,{\varvec{\gamma }},{\mathfrak {u}}} \le \frac{1}{N^{\alpha }} \sum _{\emptyset \ne {\mathfrak {u}}\subseteq \{1:d\}} \gamma _{{\mathfrak {u}}} (2 \mu _b(\alpha ))^{\left| {{\mathfrak {u}}}\right| }, \end{aligned}$$

which is the claimed upper estimate. \(\square \)

Based on the previous result, it is straightforward to show the existence of good polynomial lattice rules with respect to the worst-case error in the weighted Walsh space, provided that the modulus p is irreducible. We omit the proof, which uses standard methods, and instead refer the reader to [7, Theorems 9 and 10], where an analogous result for lattice rules was proven.

Theorem 3

Let \(p \in {\mathbb {F}}_b[x]\) be an irreducible polynomial with \(\deg (p)=m\), let \(N=b^{m}\), and let \({\varvec{\gamma }}= (\gamma _j)_{j\ge 1}\) be positive weights. Then, there exists a \({\varvec{g}}\in G_{b,m}^d\) such that, for all \(\alpha >1\), the worst-case error \(e_{b^m,d,\alpha ,{\varvec{\gamma }}}({\varvec{g}})\) satisfies

$$\begin{aligned} e_{b^m,d,\alpha ,{\varvec{\gamma }}}({\varvec{g}}) \le \frac{1}{N^\alpha } \left( \sum _{\emptyset \ne {\mathfrak {u}}\subseteq \{1:d\}} \gamma _{{\mathfrak {u}}} (2\mu _b(\alpha ))^{|{\mathfrak {u}}|} + \left( \sum _{\emptyset \ne {\mathfrak {u}}\subseteq \{1:d\}} \gamma _{{\mathfrak {u}}}^{1/\alpha }(m(b-1))^{|{\mathfrak {u}}|}\right) ^{\alpha }\right) . \end{aligned}$$

Even though the result in Theorem 3 assures us that there always exist generating vectors of polynomial lattice point sets which are in a certain sense good, the result is not constructive. The road which we will take in the present paper is slightly different. Instead of assuming an irreducible modulus p, we will assume that p has the special form \(p(x)=p_m(x)=x^m\) and show a constructive approach to find generating vectors of good polynomial lattice rules. This will be the main result of our paper, which is stated in Theorem 8.

3 The CBC-DBD Construction for Polynomial Lattice Rules

In this section, we formulate and analyze a method for the construction of good polynomial lattice rules. In contrast to the existence result in Theorem 3, our construction method yields polynomial lattice rules with modulus \(p(x)=x^m\). At first, we prove some auxiliary statements which will be needed in the further analysis.

3.1 Preliminary Results

We consider the following Walsh series for \(x\in (0,1)\), based on the decay function \(r_1\),

$$\begin{aligned} \sum _{k=0}^\infty \frac{\mathrm{wal}_k(x)}{r_1(k)} = 1 + \sum _{k=1}^\infty \frac{e^{2 \pi \mathrm{i}(\kappa _{0} \xi _1 + \kappa _{1} \xi _2 + \cdots )/b}}{b^{\left\lfloor {\log _b(k)}\right\rfloor }} , \end{aligned}$$

which, as we will see, is closely related to our quality criterion \(T_{{\varvec{\gamma }}}\) introduced in (7). To this end, we define, for \(n \in {\mathbb {N}}\), the n-th Walsh-Dirichlet kernel by

$$\begin{aligned} D_n(x)=\sum _{k=0}^{n-1} { \mathrm{wal}_k(x)}. \end{aligned}$$

From [5, Lemma A.17], it then follows that, for \(x \in (0,1)\),

$$\begin{aligned} D_{b^t}(x)=\left\{ \begin{array}{cc} b^t, &{} \quad \text {if } x\in (0, \frac{1}{b^t} ), \\ 0 , &{} \quad \text {if } x\in [ \frac{1}{b^t}, 1 ). \end{array}\right. \end{aligned}$$
(8)
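The identity (8) is easily checked numerically. Below is a minimal sketch for b = 2, with our own implementation of wal_k, pairing the i-th digit of k with the (i+1)-st binary digit of x as in the Walsh series above.

```python
def wal(k, x):
    """Base-2 Walsh function wal_k(x) for x in [0, 1)."""
    s = 0
    while k:
        x *= 2
        d = int(x)        # next binary digit xi of x
        x -= d
        s += (k & 1) * d  # pair digit kappa of k with digit xi of x
        k >>= 1
    return -1 if s % 2 else 1

def dirichlet(n, x):
    """Walsh-Dirichlet kernel D_n(x) = sum_{k=0}^{n-1} wal_k(x)."""
    return sum(wal(k, x) for k in range(n))
```

For instance, D_4(1/8) = 4 while D_4(1/4) = D_4(3/8) = 0, in accordance with (8) for t = 2.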

We can then prove the following identity.

Lemma 1

For base \(b \ge 2\), the Walsh series of \(-(b-1) (\left\lfloor {\log _b(x)}\right\rfloor + 1)\) equals, pointwise for \(x\in (0,1)\),

$$\begin{aligned} -(b-1) (\left\lfloor {\log _b(x)}\right\rfloor + 1) = 1 + \sum _{k=1}^\infty \frac{e^{2 \pi \mathrm{i}(\kappa _{0} \xi _1 + \kappa _{1} \xi _2 + \cdots )/b}}{b^{\left\lfloor {\log _b(k)}\right\rfloor }} = \sum _{k=0}^\infty \frac{ \mathrm{wal}_k(x)}{r_{1}(k)} . \end{aligned}$$

Proof

Using the definition of the Walsh-Dirichlet kernel, we obtain

$$\begin{aligned} \sum _{k=1}^\infty \frac{\mathrm{wal}_{k}(x)}{b^{\lfloor \log _b(k)\rfloor }} =\sum _{t=1}^{\infty } \sum _{k=b^{t-1}}^{b^t-1} \frac{ \mathrm{wal}_{k}(x)}{b^{t-1}} = \sum _{t=1}^{\infty }\frac{D_{b^t}(x)- D_{b^{t-1}}(x)}{b^{t-1}}, \end{aligned}$$

and from (8) we find that for \(t \ge 1\) we have

$$\begin{aligned} D_{b^t}(x)- D_{b^{t-1}}(x)=\left\{ \begin{array}{ll} (b-1)b^{t-1}, &{}\quad \text {if } x\in \left( 0, \frac{1}{b^t} \right) , \\ -b^{t-1}, &{}\quad \text {if } x\in \left[ \frac{1}{b^t}, \frac{1}{b^{t-1}} \right) , \\ 0, &{}\quad \text {if } x\in \left[ \frac{1}{b^{t-1}}, 1 \right) . \end{array}\right. \end{aligned}$$
(9)

Applied inductively, the relation in (9) yields that for \(x\in [\frac{1}{b^t}, \frac{1}{b^{t-1}})\) we have

$$\begin{aligned} \sum _{\ell =1}^{\infty }\frac{D_{b^\ell }(x)- D_{b^{\ell -1}}(x)}{b^{\ell -1}}&=\sum _{\ell =1}^{t-1} \frac{(b-1)b^{\ell -1}}{b^{\ell -1}} - \frac{b^{t-1}}{b^{t-1}} =(t-1)(b-1) - 1 \end{aligned}$$

for all \(t \ge 1\), which is equivalent to

$$\begin{aligned} 1 + \sum _{k=1}^\infty \frac{\mathrm{wal}_{k}(x)}{b^{\left\lfloor {\log _b(k)}\right\rfloor }} =(b-1)(t-1) = -(b-1)(-t+1) = -(b-1) (\left\lfloor {\log _b(x)}\right\rfloor + 1) \end{aligned}$$

for \(x\in [\frac{1}{b^t}, \frac{1}{b^{t-1}})\) and for all \(t \in {\mathbb {N}}\). This proves the claimed identity. \(\square \)

Based on the previous result in Lemma 1, we show that the function \(-(b-1) (\left\lfloor {\log _b(x)}\right\rfloor + 1)\) can be written in terms of its truncated Walsh series with uniformly bounded remainder term.

Lemma 2

Let \(N=b^m\) with \(m\in {\mathbb {N}}\) and base \(b \ge 2\). Then, for any \(x \in (0,1)\) there exists a \(\tau = \tau (x)\in {\mathbb {R}}\) with \(|\tau (x)|< \frac{b}{b-1}\) such that

$$\begin{aligned} -(b-1) (\left\lfloor {\log _b(x)}\right\rfloor + 1) = \sum _{k=0}^{N-1} \frac{\mathrm{wal}_k(x)}{r_{1}(k)} + \frac{\tau (x)}{N x} . \end{aligned}$$
(10)

Proof

The expansion in Lemma 1 allows us to write

$$\begin{aligned} -(b-1) (\left\lfloor {\log _b(x)}\right\rfloor + 1) = \sum _{k=0}^{N-1} \frac{\mathrm{wal}_k(x)}{r_{1}(k)} + R_{N}(x) , \end{aligned}$$

where the remainder \(R_{N}(x)\) has the form

$$\begin{aligned} R_{N}(x)&= \sum _{k=b^m}^{\infty } \frac{\mathrm{wal}_k(x)}{r_{1}(k)} =\sum _{k=b^m}^{\infty } \frac{\mathrm{wal}_k(x)}{b^{\left\lfloor {\log _b(k)}\right\rfloor }} =\sum _{t=m}^{\infty }\frac{D_{b^{t+1}}(x)- D_{b^t}(x)}{b^t} . \end{aligned}$$

From (9), we then see that the following inequality holds,

$$\begin{aligned} \left| {D_{b^{t+1}}(x)- D_{b^t}(x)}\right| < \frac{1}{x}, \quad t\in {\mathbb {N}}, \ x \in (0,1), \end{aligned}$$

and thus, we obtain

$$\begin{aligned} \left| {R_{N}(x)}\right| =\left| {\sum _{t=m}^{\infty }\frac{D_{b^{t+1}}(x)- D_{b^t}(x)}{b^t}}\right| <\frac{1}{x}\sum _{t=m}^{\infty }\frac{1}{b^t} =\frac{b}{(b-1)b^m}\frac{1}{x}=\frac{b}{(b-1)N x}, \end{aligned}$$

which implies the existence of a \(\tau (x) \in {\mathbb {R}}\) with \(|\tau (x)|< \frac{b}{b-1}\) such that the identity (10) holds. \(\square \)
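For concrete values of x, the quantity tau(x) in (10) can be computed directly and the bound |tau(x)| < b/(b-1) verified. The sketch below does this for b = 2 and m = 3 (so that b/(b-1) = 2), with our own implementations of wal_k and r_1.

```python
from math import floor, log2

b, m = 2, 3
N = b ** m

def wal(k, x):
    """Base-2 Walsh function wal_k(x)."""
    s = 0
    while k:
        x *= 2
        d = int(x)
        x -= d
        s += (k & 1) * d
        k >>= 1
    return -1 if s % 2 else 1

def r1_inv(k):
    """(r_1(k))^{-1} = b^{-floor(log_b k)} for k >= 1, and 1 for k = 0."""
    return 1.0 if k == 0 else 2.0 ** (-floor(log2(k)))

def tau(x):
    """Solve (10) for tau(x), given the truncated Walsh series."""
    lhs = -(b - 1) * (floor(log2(x)) + 1)
    trunc = sum(wal(k, x) * r1_inv(k) for k in range(N))
    return (lhs - trunc) * N * x
```

Here tau(0.1) evaluates to -0.8, comfortably inside (-2, 2); one can also check that the remainder vanishes entirely whenever x >= b^{-m}, since then all kernel differences with t >= m in the proof above are zero.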

Remark 4

Using a more involved argument, the result in Lemma 2 can also be extended to general \(N \in {\mathbb {N}}\). In particular, we obtain that for any \(x \in (0,1)\) there exists a \(\tau =\tau (x) \in {\mathbb {R}}\) such that

$$\begin{aligned} -(b-1) (\left\lfloor {\log _b(x)}\right\rfloor + 1) = \sum _{k=0}^{N-1} \frac{\mathrm{wal}_k(x)}{r_{1}(k)} + \frac{\tau }{N x} \end{aligned}$$

with \(\left| {\tau }\right| < b \left( \frac{1}{b-1} + 2 \right) \) for \(b=2\) and with \(\left| {\tau }\right| < b \left( \frac{1}{b-1} + 2b \right) \) for \(b>2\).

We will also make use of the following lemma, which was proved in [7, Lemma 3].

Lemma 3

For \(j\in \{1{{:}}d\}\), let \(u_j, w_j\), and \(\rho _j\) be real numbers satisfying

$$\begin{aligned} (a) \quad u_j = w_j + \rho _j, \quad (b) \quad |u_j| \le \bar{u}_j, \quad (c) \quad \bar{u}_j \ge 1 , \end{aligned}$$

for all \(j\in \{1 {{:}}d\}\). Then, for any subset \(\emptyset \ne {\mathfrak {u}}\subseteq \{1 {{:}}d\}\) there exists a \(\theta _{{\mathfrak {u}}}\) with \(|\theta _{{\mathfrak {u}}}| \le 1\) such that

$$\begin{aligned} \prod _{j \in {\mathfrak {u}}} u_j&= \prod _{j \in {\mathfrak {u}}} w_j + \theta _{{\mathfrak {u}}} \left( \prod _{j \in {\mathfrak {u}}} (\bar{u}_j+|\rho _j|) \right) \sum _{j \in {\mathfrak {u}}} |\rho _j| . \end{aligned}$$

Furthermore, we recall the character property of Walsh functions for polynomial lattice rules with prime base b. Let \(P({\varvec{g}},p)=\{{\varvec{x}}_0,\ldots ,{\varvec{x}}_{b^m-1}\}\) be a polynomial lattice with generating vector \({\varvec{g}}\in ({\mathbb {F}}_b[x])^d\) and modulus \(p \in {\mathbb {F}}_b[x]\) with \(\deg (p)=m\). Then, for any integer vector \({\varvec{k}}\in {\mathbb {N}}_0^d\) the following identity holds,

$$\begin{aligned} \frac{1}{b^m} \sum _{n=0}^{b^m-1} \mathrm{wal}_{{\varvec{k}}} ({\varvec{x}}_n) = \delta _p(\mathrm{tr}_m({\varvec{k}}) \cdot {\varvec{g}}) = {\left\{ \begin{array}{ll} 1, &{} \text {if } \mathrm{tr}_m({\varvec{k}}) \cdot {\varvec{g}}\equiv 0 {\;(\mathrm{mod}\;p)} , \\ 0, &{} \text {otherwise} . \end{array}\right. } \end{aligned}$$
(11)

We remark that an analogous result to (11) also holds if we only consider projections of the polynomial lattice and the generating vector onto a non-empty subset of \(\{1 {{:}}d\}\), since the projection of a polynomial lattice is itself a polynomial lattice, generated by the corresponding projection of the generating vector.
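For small parameters, the character property (11) can be confirmed by direct computation. In the sketch below (b = 2, p(x) = x^m, generating vector chosen by us for illustration), we use that for the modulus x^m the map v_m simply reads off the lower m coefficients of n(x)g_j(x) in reversed order.

```python
b, m, d = 2, 3, 2
N = b ** m
g = [1, 3]                # g_1(x) = 1, g_2(x) = 1 + x, encoded bitwise

def clmul(a, c):
    """Carry-less multiplication = polynomial product over F_2."""
    r = 0
    while c:
        if c & 1:
            r ^= a
        a <<= 1
        c >>= 1
    return r

def coord(n, gj):
    """v_m(n(x) g_j(x) / x^m): lower m coefficients of the product, reversed."""
    q = clmul(n, gj) & (N - 1)
    return sum(((q >> (m - l)) & 1) / 2 ** l for l in range(1, m + 1))

def wal(k, x):
    """Base-2 Walsh function wal_k(x)."""
    s = 0
    while k:
        x *= 2
        dgt = int(x)
        x -= dgt
        s += (k & 1) * dgt
        k >>= 1
    return -1 if s % 2 else 1

def char_sum(k):
    """(1/b^m) sum_n prod_j wal_{k_j}(x_{n,j}) for the lattice P(g, x^m)."""
    total = 0
    for n in range(N):
        w = 1
        for kj, gj in zip(k, g):
            w *= wal(kj, coord(n, gj))
        total += w
    return total / N

def delta(k):
    """delta_p(tr_m(k) . g) for p(x) = x^m."""
    dot = 0
    for kj, gj in zip(k, g):
        dot ^= clmul(kj & (N - 1), gj)
    return 1 if dot & (N - 1) == 0 else 0
```

Running char_sum and delta over all k in {0, ..., 7}^2 gives identical values, as (11) asserts.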

We now state an auxiliary result that will be useful at several points in this paper.

Lemma 4

Let \(P({\varvec{g}},p)\) be a polynomial lattice with modulus \(p \in {\mathbb {F}}_b[x]\) with \(\deg (p)=m\) and generating vector \({\varvec{g}}=(g_1,\ldots ,g_d) \in ({\mathbb {F}}_b[x])^d\) such that \(\gcd (g_j,p)=1\) for \(1 \le j \le d\). Then, each one-dimensional projection of \(P({\varvec{g}},p)\) is the full grid

$$\begin{aligned} \left\{ 0,\frac{1}{b^m}, \ldots , \frac{b^m-1}{b^m}\right\} , \end{aligned}$$

and in particular the projection of the point with index 0 is always 0.

Proof

The result follows from Definition 2 and [5, Remark 10.3]. \(\square \)

Additionally, we will need the following result.

Lemma 5

Let \(P({\varvec{g}},p)=\{{\varvec{x}}_0,\ldots ,{\varvec{x}}_{b^m-1}\}\) be a polynomial lattice point set with modulus \(p \in {\mathbb {F}}_b[x]\) with \(\deg (p)=m\) and generating vector \({\varvec{g}}\in ({\mathbb {F}}_b[x])^d\) such that \(\gcd (g_j,p)=1\) for \(1 \le j \le d\). Furthermore, let \(m\ge 4\). For a point \({\varvec{x}}_n\) with \(n\in \{0,1,\ldots ,b^m-1\}\), we denote its coordinates via \({\varvec{x}}_n = (x_{n,1},\ldots ,x_{n,d})\). Then, for any \(j \in \{1,\ldots ,d\}\), it is true that

$$\begin{aligned} \frac{1}{b^m} \sum _{n=1}^{b^m-1} \frac{1}{x_{n,j}} < 1 + m \ln b \le m (b-1) . \end{aligned}$$

Proof

We recall that the point set \(P({\varvec{g}},p)\) is defined as the collection of the \(b^m\) points of the form

$$\begin{aligned} {\varvec{x}}_n = \left( v_m \left( \frac{n(x)\, g_1(x)}{p(x)} \right) ,\ldots , v_m\left( \frac{n(x)\, g_d(x)}{p(x)} \right) \right) \end{aligned}$$

for \(n \in {\mathbb {F}}_b[x]\) with \(\deg (n) < m\). Due to Lemma 4, we know that \(\{x_{1,j},\ldots ,x_{b^m-1,j}\}\) equals the set \(\left\{ \frac{1}{b^m}, \ldots , \frac{b^m-1}{b^m}\right\} \) for each \(j\in \{1,\ldots ,d\}\). Thus, we can estimate

$$\begin{aligned} \frac{1}{b^m} \sum _{n=1}^{b^m-1} \frac{1}{x_{n,j}}&= \frac{1}{b^m} \sum _{n=1}^{b^m-1} \frac{b^m}{n} = \sum _{n=1}^{b^m-1} \frac{1}{n} \le 1 + \int _{1}^{b^m-1} \frac{1}{x} \,\mathrm{d}x \\&= 1 + \ln (b^m-1) < 1 + \ln (b^m) = 1 + m \ln b \le m (b-1) , \end{aligned}$$

which yields the claimed result, where the last estimate follows from the assumption \(m\ge 4\). \(\square \)
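The chain of estimates in this proof is easy to reproduce numerically. By Lemma 4, the left-hand side depends only on the full grid and not on the particular generating vector; the values b = 2 and m = 4 below are our own illustrative choice.

```python
from math import log

b, m = 2, 4               # m >= 4, as required by Lemma 5
N = b ** m

# by Lemma 4, the nonzero coordinates x_{n,j} run exactly through n/N, n = 1, ..., N-1
lhs = sum(1.0 / (n / N) for n in range(1, N)) / N   # equals sum_{n=1}^{N-1} 1/n
```

Here lhs is approximately 3.318, while 1 + m ln b is approximately 3.773 and m(b-1) = 4, matching the stated inequalities.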

3.2 The CBC-DBD Construction Algorithm

We are now ready to study the component-by-component digit-by-digit (CBC-DBD) construction for polynomial lattice rules, see also [7], where such an algorithm was analyzed for ordinary lattice rules. In particular, we will assume throughout this section that our modulus polynomial is of the form \(p_m (x)=x^m\) for \(m\in {\mathbb {N}}\).

Concerning the weights, the algorithm can, as indicated in our main result (Theorem 8), be run with respect to the weights \({\varvec{\gamma }}^{1/\alpha } = (\gamma _j^{1/\alpha })_{j \ge 1}\) to obtain a polynomial lattice rule that yields a low worst-case error in the Walsh space \(W_{d,{\varvec{\gamma }}}^\alpha \), or, alternatively, with respect to the weights \({\varvec{\gamma }}\) to obtain good polynomial lattice rules in the space \(W_{d,{\varvec{\gamma }}^{\alpha }}^\alpha \). In the latter case, the construction algorithm is independent of the smoothness parameter \(\alpha \) and we obtain worst-case error bounds that hold for all \(\alpha >1\) simultaneously.

In order to avoid confusion, we will therefore denote the weights in this section by \({\varvec{\eta }}\) instead of \({\varvec{\gamma }}\) and outline the algorithm based on \({\varvec{\eta }}\). In Theorem 8, we will then choose \({\varvec{\eta }}\) equal to \({\varvec{\gamma }}^{1/\alpha }\) or \({\varvec{\gamma }}\), respectively. For technical reasons (see Remark 5), it will be necessary to assume that the positive weights \({\varvec{\eta }}\) are of product structure, that is,

$$\begin{aligned} \eta _{{\mathfrak {u}}} = \prod _{j \in {\mathfrak {u}}} \eta _j \end{aligned}$$

for \({\mathfrak {u}}\subseteq \{1 {{:}}d\}\), with a sequence of positive reals \((\eta _j)_{j\ge 1}\). However, we point out that the following theorem, which is crucial for the proposed construction method, also holds for general weights \({\varvec{\eta }}= (\eta _{{\mathfrak {u}}})_{{\mathfrak {u}}\subseteq \{1:d\}}\).

Theorem 4

Let b be prime, let \(m,d \in {\mathbb {N}}\) with \(m\ge 4\), let \(p_m(x)=x^m \in {\mathbb {F}}_b[x]\), and let \({\varvec{\eta }}= (\eta _{\mathfrak {u}})_{{\mathfrak {u}}\subseteq \{1:d\}}\) be positive weights with \(\eta _{\emptyset } = 1\). Furthermore, let \({\varvec{g}}=(g_1,\ldots , g_d) \in ({\mathbb {F}}_b[x])^d\) with \(\deg (g_j) < m\) and \(\gcd (g_j,p_m)=1\) for \(1\le j\le d\). Then,

$$\begin{aligned} T_{{\varvec{\eta }}} ({\varvec{g}},p_m)&\le \frac{1}{b^m} \, H_{d,m,{\varvec{\eta }}}({\varvec{g}}) - \sum _{\emptyset \ne {\mathfrak {u}}\subseteq \{1:d\}} \eta _{\mathfrak {u}}+ \sum _{\emptyset \ne {\mathfrak {u}}\subseteq \{1:d\}} \frac{\eta _{\mathfrak {u}}}{b^m} ((b-1)m+1)^{|{\mathfrak {u}}|}\nonumber \\&\quad + \sum _{\emptyset \ne {\mathfrak {u}}\subseteq \{1:d\}} \frac{\eta _{\mathfrak {u}}}{b^m} \, (b \,m \left| {{\mathfrak {u}}}\right| ) \left( (b-1)m + \frac{b}{b-1} \right) ^{\left| {{\mathfrak {u}}}\right| } , \end{aligned}$$
(12)

where we define the function \(H_{d,m,{\varvec{\eta }}}: ({\mathbb {F}}_b[x])^d \rightarrow {\mathbb {R}}\) as

$$\begin{aligned} H_{d,m,{\varvec{\eta }}}({\varvec{g}}) := \sum _{\emptyset \ne {\mathfrak {u}}\subseteq \{1:d\}}\eta _{\mathfrak {u}}(1-b)^{|{\mathfrak {u}}|} \sum _{n=1}^{b^m-1} \prod _{j\in {\mathfrak {u}}} \left( \left\lfloor {\log _b \left( v_m \left( \frac{n(x) \, g_j(x)}{x^m} \right) \right) }\right\rfloor + 1 \right) . \end{aligned}$$
(13)
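Before turning to the proof, we note that H_{d,m,eta} is directly computable from (13): for the modulus x^m, v_m(n(x)g_j(x)/x^m) is obtained by reversing the lower m coefficients of n(x)g_j(x). A Python sketch for b = 2 follows; the weights and the generating vector are our illustrative choices.

```python
from itertools import combinations
from math import floor, log2

b, m, d = 2, 3, 2
N = b ** m
eta = [0.9, 0.5]          # illustrative product weights eta_1, eta_2
g = [1, 3]                # g_1(x) = 1, g_2(x) = 1 + x, encoded bitwise; units mod x^m

def clmul(a, c):
    """Carry-less multiplication = polynomial product over F_2."""
    r = 0
    while c:
        if c & 1:
            r ^= a
        a <<= 1
        c >>= 1
    return r

def v_m(n, gj):
    """v_m(n(x) g_j(x) / x^m): lower m coefficients of the product, reversed."""
    q = clmul(n, gj) & (N - 1)
    return sum(((q >> (m - l)) & 1) / 2 ** l for l in range(1, m + 1))

# H_{d,m,eta}(g) as in (13), with product weights eta_u = prod_{j in u} eta_j
H = 0.0
for size in range(1, d + 1):
    for u in combinations(range(d), size):
        eta_u = 1.0
        for j in u:
            eta_u *= eta[j]
        inner = 0.0
        for n in range(1, N):
            prod = 1.0
            for j in u:
                prod *= floor(log2(v_m(n, g[j]))) + 1
            inner += prod
        H += eta_u * (1 - b) ** size * inner
```

Since every factor floor(log_b(v_m(...))) + 1 is nonpositive, the sign (1-b)^{|u|} makes each summand of H nonnegative, so H > 0 for positive weights.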

Proof

We use the character property of Walsh functions in (11) to rewrite \(T_{{\varvec{\eta }}}({\varvec{g}},p_m)\) with the help of the identity in Lemma 2. First, we recall that for \(k \in {\mathbb {N}}_0\) we have

$$\begin{aligned} r_1(k) =r_1(b,k) = \left\{ \begin{array}{ll} 1, &{} \quad {\text {for }} k = 0, \\ b^{\left\lfloor {\log _b(k)}\right\rfloor }, &{} \quad {\text {for }} k \ne 0 . \end{array}\right. \end{aligned}$$

Using this definition, we obtain that

$$\begin{aligned}&T_{{\varvec{\eta }}} ({\varvec{g}},p_m) \nonumber \\&\quad = \sum _{\emptyset \ne {\mathfrak {u}}\subseteq \{1:d\}} \eta _{\mathfrak {u}}\sum _{{\varvec{k}}_{\mathfrak {u}}\in \{1,\dots ,b^m-1 \}^{|{\mathfrak {u}}|}} \frac{\delta _{p_m}(\mathrm{tr}_m({\varvec{k}}_{\mathfrak {u}}) \cdot {\varvec{g}}_{\mathfrak {u}})}{\prod _{j\in {\mathfrak {u}}} r_1(k_j)} \nonumber \\&\quad \le \sum _{\emptyset \ne {\mathfrak {u}}\subseteq \{1:d\}} \eta _{\mathfrak {u}}\sum _{{\varvec{0}}\ne {\varvec{k}}_{\mathfrak {u}}\in \{0,\dots ,b^m-1 \}^{|{\mathfrak {u}}|}} \frac{\delta _{p_m}(\mathrm{tr}_m({\varvec{k}}_{\mathfrak {u}}) \cdot {\varvec{g}}_{\mathfrak {u}})}{\prod _{j\in {\mathfrak {u}}} r_1(k_j)} \nonumber \\&\quad = \sum _{\emptyset \ne {\mathfrak {u}}\subseteq \{1:d\}} \frac{\eta _{\mathfrak {u}}}{b^m} \sum _{n=0}^{b^m-1} \left[ \sum _{{\varvec{k}}_{\mathfrak {u}}\in \{0,1,\dots ,b^m-1\}^{|{\mathfrak {u}}|}} \frac{\mathrm{wal}_{{\varvec{k}}_{\mathfrak {u}}}({\varvec{x}}_{n,{\mathfrak {u}}})}{\prod _{j\in {\mathfrak {u}}} r_1(k_j)} - 1 \right] \nonumber \\&\quad = \sum _{\emptyset \ne {\mathfrak {u}}\subseteq \{1:d\}} \frac{\eta _{\mathfrak {u}}}{b^m} \left[ \sum _{{\varvec{k}}_{\mathfrak {u}}\in \{0,\dots ,b^m-1\}^{|{\mathfrak {u}}|}} \frac{1}{\prod _{j\in {\mathfrak {u}}} r_1(k_j)}\right. \nonumber \\&\qquad \left. +\,\sum _{n=1}^{b^m-1} \prod _{j \in {\mathfrak {u}}} \left( 1 + \sum _{k=1}^{b^m-1} \frac{\mathrm{wal}_k(x_{n,j})}{b^{\left\lfloor {\log _b(k)}\right\rfloor }} \right) \right] - \sum _{\emptyset \ne {\mathfrak {u}}\subseteq \{1:d\}} \eta _{\mathfrak {u}}\nonumber \\&\quad = \sum _{\emptyset \ne {\mathfrak {u}}\subseteq \{1:d\}} \frac{\eta _{\mathfrak {u}}}{b^m} \left[ \sum _{{\varvec{k}}_{\mathfrak {u}}\in \{0,\dots ,b^m-1\}^{|{\mathfrak {u}}|}} \frac{1}{\prod _{j\in {\mathfrak {u}}} r_1(k_j)} \right. \nonumber \\&\qquad \left. 
+ \sum _{n=1}^{b^m-1} \left[ \prod _{j\in {\mathfrak {u}}} w_j(n) - \prod _{j\in {\mathfrak {u}}} u_j(n) + \prod _{j\in {\mathfrak {u}}} u_j(n) \right] \right] - \sum _{\emptyset \ne {\mathfrak {u}}\subseteq \{1:d\}} \eta _{\mathfrak {u}}\nonumber \\&= \sum _{\emptyset \ne {\mathfrak {u}}\subseteq \{1:d\}} \frac{\eta _{\mathfrak {u}}}{b^m} \sum _{{\varvec{k}}_{\mathfrak {u}}\in \{0,1,\dots ,b^m-1 \}^{|{\mathfrak {u}}|}} \frac{1}{\prod _{j\in {\mathfrak {u}}} r_1(k_j) } + \sum _{\emptyset \ne {\mathfrak {u}}\subseteq \{1:d\}} \frac{\eta _{\mathfrak {u}}}{b^m} \sum _{n=1}^{b^m-1} \prod _{j\in {\mathfrak {u}}} u_j(n)\nonumber \\&\qquad - \sum _{\emptyset \ne {\mathfrak {u}}\subseteq \{1:d\}} \frac{\eta _{\mathfrak {u}}}{b^m} \sum _{n=1}^{b^m-1} \theta _{{\mathfrak {u}}}(n) \left( \prod _{j\in {\mathfrak {u}}} (\bar{u}_j + |\rho _j(n)|) \right) \sum _{j\in {\mathfrak {u}}} |\rho _j(n)| - \sum _{\emptyset \ne {\mathfrak {u}}\subseteq \{1:d\}} \eta _{\mathfrak {u}}, \end{aligned}$$
(14)

where we used Lemma 3 with

$$\begin{aligned} u_j = u_j(n)&:= -(b-1) (\left\lfloor {\log _b(x_{n,j})}\right\rfloor + 1) , \quad \bar{u}_j=\bar{u}_j(n) := (b-1)m , \\ w_j = w_j(n)&:= 1 + \sum _{k=1}^{b^m-1} \frac{\mathrm{wal}_k(x_{n,j})}{b^{\left\lfloor {\log _b(k)}\right\rfloor }} , \quad \rho _j = \rho _j(n) := \frac{\tau _j(n)}{x_{n,j} \, b^m} , \end{aligned}$$

and all \(|\theta _{\mathfrak {u}}(n)| \le 1\) and \(|\tau _j(n)| < \frac{b}{b-1}\). By Lemma 2, Condition (a) of Lemma 3 is fulfilled. Furthermore, Lemma 4 yields for each \(j \in \{1{{:}}d\}\) that \(x_{n,j} \ge b^{-m}\) for every \(1 \le n < b^m\), and hence

$$\begin{aligned} \left| {u_j(n)}\right| \le -(b-1) (\left\lfloor {\log _b(b^{-m})}\right\rfloor + 1) = -(b-1) (-m + 1) < (b-1) m = \bar{u}_j \end{aligned}$$

with \(\bar{u}_j \ge 1\) such that also Conditions (b) and (c) of Lemma 3 are satisfied.

By simple calculations, the first sum in (14) can be shown to equal

$$\begin{aligned}&\sum _{\emptyset \ne {\mathfrak {u}}\subseteq \{1:d\}} \frac{\eta _{\mathfrak {u}}}{b^m} \sum _{{\varvec{k}}_{\mathfrak {u}}\in \{0,1,\dots ,b^m-1 \}^{|{\mathfrak {u}}|}} \frac{1}{\prod _{j\in {\mathfrak {u}}} r_1 (k_j)}\\&\quad = \frac{1}{b^m} \sum _{\emptyset \ne {\mathfrak {u}}\subseteq \{1:d\}} \eta _{\mathfrak {u}}\left( 1+\sum \limits _{k=1}^{b^{m}-1}\frac{1}{ b^{\left\lfloor {\log _b(k)}\right\rfloor }} \right) ^{|{\mathfrak {u}}|}\\&\quad = \frac{1}{b^m} \sum _{\emptyset \ne {\mathfrak {u}}\subseteq \{1:d\}} \eta _{\mathfrak {u}}\left( 1+\sum \limits _{t=0}^{m-1}\sum \limits _{k=b^t}^{b^{t+1}-1} \frac{1}{b^{\left\lfloor {\log _b(k)}\right\rfloor }} \right) ^{|{\mathfrak {u}}|}\\&\quad = \frac{1}{b^m} \sum _{\emptyset \ne {\mathfrak {u}}\subseteq \{1:d\}} \eta _{\mathfrak {u}}((b-1)m+1)^{|{\mathfrak {u}}|} , \end{aligned}$$

while the third sum in (14) can be bounded by

$$\begin{aligned}&-\sum _{\emptyset \ne {\mathfrak {u}}\subseteq \{1:d\}} \frac{\eta _{\mathfrak {u}}}{b^m} \sum _{n=1}^{b^m-1} \theta _{{\mathfrak {u}}}(n) \left( \prod _{j\in {\mathfrak {u}}} (\bar{u}_j + |\rho _j(n)|) \right) \sum _{j\in {\mathfrak {u}}} |\rho _j(n)| \\&\quad = -\sum _{\emptyset \ne {\mathfrak {u}}\subseteq \{1:d\}} \frac{\eta _{\mathfrak {u}}}{b^m} \sum _{n=1}^{b^m-1} \theta _{{\mathfrak {u}}}(n) \left( \prod _{j\in {\mathfrak {u}}} \left( (b-1)m + \frac{|\tau _j(n)|}{x_{n,j} \, b^m}\right) \right) \sum _{j\in {\mathfrak {u}}} \frac{|\tau _j(n)|}{x_{n,j} \,b^m}\\&\quad \le \sum _{\emptyset \ne {\mathfrak {u}}\subseteq \{1:d\}} \frac{\eta _{\mathfrak {u}}}{b^m} \sum _{n=1}^{b^m-1} |\theta _{{\mathfrak {u}}}(n) | \left( \prod _{j\in {\mathfrak {u}}} \left( (b-1)m + \frac{b}{b-1} \right) \right) \sum _{j\in {\mathfrak {u}}} \frac{|\tau _j(n)|}{x_{n,j} \, b^m}\\&\quad \le \sum _{\emptyset \ne {\mathfrak {u}}\subseteq \{1:d\}} \frac{\eta _{\mathfrak {u}}}{b^m} \left( \prod _{j\in {\mathfrak {u}}} \left( (b-1)m + \frac{b}{b-1} \right) \right) \sum _{j\in {\mathfrak {u}}} \sum _{n=1}^{b^m-1} \frac{b}{(b-1) b^m} \frac{1}{x_{n,j}}\\&\quad \le \sum _{\emptyset \ne {\mathfrak {u}}\subseteq \{1:d\}} \frac{\eta _{\mathfrak {u}}}{b^m} \left( \prod _{j\in {\mathfrak {u}}} \left( (b-1)m + \frac{b}{b-1} \right) \right) \frac{b}{b-1} \sum _{j\in {\mathfrak {u}}} m (b-1)\\&\quad = \frac{1}{b^m} \sum _{\emptyset \ne {\mathfrak {u}}\subseteq \{1:d\}} \eta _{\mathfrak {u}}\, (b \,m \left| {{\mathfrak {u}}}\right| ) \left( (b-1)m + \frac{b}{b-1} \right) ^{\left| {{\mathfrak {u}}}\right| } , \end{aligned}$$

where we used Lemma 5 and the fact that \(x_{n,j} \ge b^{-m}\) for each j and all \(1 \le n < b^m\). Combining these results with (14) yields the claimed result. \(\square \)
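As an illustrative numerical check (not part of the original argument), the closed-form evaluation \(1+\sum_{k=1}^{b^m-1} b^{-\left\lfloor \log_b(k)\right\rfloor } = (b-1)m+1\) used for the first sum can be confirmed with exact rational arithmetic; the helper names below are ours:

```python
from fractions import Fraction

def floor_log(b, k):
    """floor(log_b(k)) for integers k >= 1, computed without floating point."""
    e = 0
    while k >= b:
        k //= b
        e += 1
    return e

def r1(b, k):
    """The weight r_1(b, k): 1 for k = 0 and b^floor(log_b(k)) otherwise."""
    return 1 if k == 0 else b ** floor_log(b, k)

# check that sum_{k=0}^{b^m-1} 1/r_1(b,k) = (b-1)m + 1 for several b and m
for b in (2, 3, 5):
    for m in range(1, 5):
        total = sum(Fraction(1, r1(b, k)) for k in range(b ** m))
        assert total == (b - 1) * m + 1
```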

We observe that the latter terms in (12) are of order \({\mathcal {O}}(b^{-m})\), and the implied constants can be bounded independently of the dimension provided that certain conditions on the weight sequence \({\varvec{\eta }}\) hold (see the proof of Theorem 8 for details). As this is the desired order of \(T_{{\varvec{\eta }}} ({\varvec{g}},p_m)\), Theorem 4 implies that it essentially suffices to find a generating vector \({\varvec{g}}\in ({\mathbb {F}}_b[x])^d\) for which \(H_{d,m,{\varvec{\eta }}}({\varvec{g}})\) is small, since a small value of \(H_{d,m,{\varvec{\eta }}}({\varvec{g}})\) entails a good bound on \(T_{{\varvec{\eta }}}({\varvec{g}},p_m)\). We will therefore use \(H_{d,m,{\varvec{\eta }}}\) as a search criterion for good generating vectors.

First, we prove the following result, which will be needed in the subsequent analysis; we remind the reader that by \(p_m\) we denote the polynomial \(p_m\in {\mathbb {F}}_b[x]\) with \(p_m(x)=x^m\) for \(m\in {\mathbb {N}}\).

Lemma 6

Let a prime b, an integer \(t \ge 2\), and polynomials \(\ell , q \in {\mathbb {F}}_b[x]\) with \(\gcd (\ell ,p_1)=\gcd (q,p_1)=1\) be given. Then, the following identity holds:

$$\begin{aligned} \sum _{g \in {\mathbb {F}}_b} \left( \left\lfloor {\log _b\left( v_t\left( \frac{\ell (x)\,(q(x) + x^{t-1} g)}{x^t} \right) \right) }\right\rfloor + 1 \right) = \left\lfloor {\log _b\left( v_{t-1}\left( \frac{\ell (x) \, q(x)}{x^{t-1}} \right) \right) }\right\rfloor . \end{aligned}$$

Proof

Assume that the product of the polynomials \(\ell \) and q is given by

$$\begin{aligned} \ell (x) \, q (x) = \sum _{i=0}^r a_i x^i \quad \text {with} \quad a_0,a_r \ne 0 . \end{aligned}$$

Let, furthermore,

$$\begin{aligned} \ell (x)=\sum _{k=0}^v \ell _k x^k , \end{aligned}$$

where we note that \(v\le r\). Hence, we obtain that for \(g \in {\mathbb {F}}_b\)

$$\begin{aligned} \frac{\ell (x) \,(q (x) + x^{t-1} g)}{x^t} = \sum _{i=t}^r a_i x^{i-t} + \sum _{k=1}^v \ell _k g x^{k-1} + (a_{t-1} + \ell _0 g) x^{-1} + \sum _{i=0}^{t-2} a_i x^{i-t} \end{aligned}$$

and thus, we have that if \(a_{t-1} + \ell _0 g \not \equiv 0 \pmod {b}\), then

$$\begin{aligned}&\left\lfloor {\log _b\left( v_t\left( \frac{\ell (x) \,(q (x) + x^{t-1} g)}{x^t} \right) \right) }\right\rfloor + 1 \\&\quad = \left\lfloor {\log _b\left( \frac{a_{t-1} + \ell _0 g}{b} + \sum _{i=0}^{t-2} a_i b^{i-t} \right) }\right\rfloor + 1 = - 1 + 1 = 0 . \end{aligned}$$

Otherwise, if \(a_{t-1} + \ell _0 g \equiv 0 \pmod {b}\), then

$$\begin{aligned} v_t \left( \frac{\ell (x) \,(q (x) + x^{t-1} g)}{x^t} \right)&= \sum _{i=0}^{t-2} a_i b^{i-t} = \sum _{i=2}^t a_{t-i} b^{-i} \\&= \frac{1}{b} \left( \sum _{i=1}^{t-1} a_{t-i-1} b^{-i} \right) = \frac{1}{b} v_{t-1}\left( \frac{\ell (x) \, q (x)}{x^{t-1}} \right) , \end{aligned}$$

and therefore

$$\begin{aligned} \left\lfloor {\log _b\left( v_t \left( \frac{\ell (x) \,(q (x) + x^{t-1} g)}{x^t} \right) \right) }\right\rfloor + 1&= \left\lfloor {\log _b\left( \frac{1}{b} v_{t-1} \left( \frac{\ell (x) \, q (x)}{x^{t-1}} \right) \right) }\right\rfloor + 1\\&= \left\lfloor {\log _b\left( v_{t-1}\left( \frac{\ell (x) \, q (x)}{x^{t-1}} \right) \right) }\right\rfloor . \end{aligned}$$

Observing that there exists exactly one \(g \in {\mathbb {F}}_b\) for which \(a_{t-1} + \ell _0 g \equiv 0 \pmod {b}\) and combining the two cases considered, we immediately obtain the claimed identity. \(\square \)
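Lemma 6 can likewise be checked numerically for small cases. In the sketch below (our own illustration; the representation and helper names are not from the paper), polynomials over \(\mathbb{F}_b\) are stored as coefficient lists with the constant term first, and \(v_t\) is evaluated as an exact rational number:

```python
from fractions import Fraction
from itertools import product

def polymul(p, q, b):
    """Product of two F_b[x] polynomials given as coefficient lists (constant first)."""
    r = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, c in enumerate(q):
            r[i + j] = (r[i + j] + a * c) % b
    return r

def v(t, p, b):
    """v_t(p(x)/x^t) = sum_{i<t} p_i b^{i-t}, as an exact Fraction."""
    return sum(Fraction(c, b ** (t - i)) for i, c in enumerate(p[:t]))

def floor_log(x, b):
    """floor(log_b(x)) for a rational 0 < x < 1."""
    e = 0
    while x < 1:
        x *= b
        e -= 1
    return e

b = 3  # a small prime base
for t in (2, 3, 4):
    for l0, q0 in product(range(1, b), repeat=2):   # nonzero constant terms,
        ell = [l0, 2, 1]                            # i.e. gcd(ell, x) = gcd(q, x) = 1
        q = [q0, 1, 0, 2]
        lhs = 0
        for g in range(b):
            qg = q[:]                               # q(x) + g x^{t-1}
            qg[t - 1] = (qg[t - 1] + g) % b
            lhs += floor_log(v(t, polymul(ell, qg, b), b), b) + 1
        rhs = floor_log(v(t - 1, polymul(ell, q, b), b), b)
        assert lhs == rhs                           # the identity of Lemma 6
```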

With the help of Lemma 6, we can prove the following result which motivates the choice of our quality function for Algorithm 1.

Lemma 7

For integers \(m \in {\mathbb {N}}\) and \(w \in \{1,\ldots ,m\}\), let b be prime, \(g \in {\mathbb {F}}_b\), and \({\varvec{g}}\in ({\mathbb {F}}_b[x])^d\) with \(\gcd (g_j,p_1)=1\) for all \(1 \le j \le d\), where \(g_d \in G_{b,w-1}\), and let \({\varvec{\eta }}= (\eta _{{\mathfrak {u}}})_{{\mathfrak {u}}\subseteq \{1:d\}}\), where \(\eta _{\mathfrak {u}}=\prod _{j \in {\mathfrak {u}}} \eta _j\) with positive reals \((\eta _j)_{j\ge 1}\), be product weights. Then, the average of \(H_{d,m,{\varvec{\eta }}}\) over all choices of extending \(g_d + g \,p_{w-1}\) to a polynomial of degree less than m equals

$$\begin{aligned}&\frac{1}{b^{m-w}} \sum _{\bar{g} \in G_{b,m-w}} H_{d,m,{\varvec{\eta }}}(g_1,\ldots ,g_{d-1},g_d + g \,p_{w-1} + \bar{g} \,p_w) \nonumber \\&\quad = \sum _{t=w}^{m} \sum _{\begin{array}{c} \ell =1 \\ \ell \not \equiv 0 {\;(\mathrm{mod}\;b)} \end{array}}^{b^t-1} \eta _d (1-b) \, b^{w-t} \left( \left\lfloor {\log _b \left( v_w \left( \frac{\ell (x) \, (g_d (x) + g \,x^{w-1})}{x^w} \right) \right) }\right\rfloor + 1 \right) \nonumber \\&\qquad \times \prod _{j=1}^{d-1} \left( 1 + \eta _j (1-b) \left( \left\lfloor {\log _b \left( v_t \left( \frac{\ell (x) \, g_j(x)}{x^t} \right) \right) }\right\rfloor + 1 \right) \right) \nonumber \\&\qquad + S_{m,w,{\varvec{\eta }}}({\varvec{g}}) - (b^m-1), \end{aligned}$$
(15)

where the term \(S_{m,w,{\varvec{\eta }}}({\varvec{g}})\), which does not depend on g and \(\bar{g}\), is given by

$$\begin{aligned} S_{m,w,{\varvec{\eta }}}({\varvec{g}})&= \sum _{t=1}^{w-1} \sum _{\begin{array}{c} \ell =1 \\ \ell \not \equiv 0 {\;(\mathrm{mod}\;b)} \end{array}}^{b^t-1} \prod _{j=1}^d \left( 1 + \eta _j (1-b) \left( \left\lfloor {\log _b \left( v_t \left( \frac{\ell (x) \, g_j(x)}{x^t} \right) \right) }\right\rfloor + 1 \right) \right) \\&\quad + \sum _{t=w}^m \sum _{\begin{array}{c} \ell =1 \\ \ell \not \equiv 0 {\;(\mathrm{mod}\;b)} \end{array}}^{b^t-1} \prod _{j=1}^{d-1} \left( 1 + \eta _j (1-b) \left( \left\lfloor {\log _b \left( v_t \left( \frac{\ell (x) \, g_j(x)}{x^t} \right) \right) }\right\rfloor + 1 \right) \right) \\&\quad \times \left( 1 + \eta _d (1-b^{w-t}) \right) . \end{aligned}$$

Proof

For product weights \(\eta _{\mathfrak {u}}= \prod _{j \in {\mathfrak {u}}} \eta _j\) and \(\widetilde{{\varvec{g}}}=(\widetilde{g}_1,\ldots ,\widetilde{g}_d)\in ({\mathbb {F}}_b [x])^d\) with \(\gcd (\widetilde{g}_j,p_1)=1\) for all \(1 \le j \le d\), the quantity \(H_{d,m,{\varvec{\eta }}}(\widetilde{{\varvec{g}}})\) defined in (13) equals

$$\begin{aligned} H_{d,m,{\varvec{\eta }}}(\widetilde{{\varvec{g}}})= & {} \sum _{n=1}^{b^m-1} \prod _{j=1}^d \left( 1 + \eta _j (1-b) \left( \left\lfloor {\log _b \left( v_m \left( \frac{n(x) \, \widetilde{g}_j(x)}{x^m} \right) \right) }\right\rfloor + 1 \right) \right) \\&- (b^m-1). \end{aligned}$$

We define \(\bar{H}_{d,m,{\varvec{\eta }}}(\widetilde{{\varvec{g}}}) := H_{d,m,{\varvec{\eta }}}(\widetilde{{\varvec{g}}}) + (b^{m}-1)\) which in turn can be rewritten as

$$\begin{aligned} \bar{H}_{d,m,{\varvec{\eta }}}(\widetilde{{\varvec{g}}}) = \sum _{t=1}^{m} \sum _{\begin{array}{c} \ell =1 \\ \ell \not \equiv 0 {\;(\mathrm{mod}\;b)} \end{array}}^{b^t-1} \prod _{j=1}^d \left( 1 + \eta _j (1-b) \left( \left\lfloor {\log _b \left( v_t \left( \frac{\ell (x) \, \widetilde{g}_j(x)}{x^t} \right) \right) }\right\rfloor +1 \right) \right) . \end{aligned}$$
(16)

In order to see this, note that for any function \(f: {\mathbb {R}}\rightarrow {\mathbb {R}}\) we have that

$$\begin{aligned}&\sum _{n=1}^{b^m-1} f\left( v_m\left( \frac{n(x) \,\widetilde{g}_j(x)}{x^m}\right) \right) \\&\quad = \sum _{n=1}^{b^m-1} f\left( v_m\left( \frac{n(x) \,\widetilde{g}_j(x) \bmod x^m}{x^m}\right) \right) = \sum _{n=1}^{b^m-1} f\left( v_m\left( \frac{n(x)}{x^m}\right) \right) \end{aligned}$$

for each \(j \in \{1,\ldots ,d\}\), since \(\gcd (\widetilde{g}_j,x)=1\), and that \(v_m\left( \frac{n(x)}{x^m}\right) = \frac{n}{b^m}\) for \(n \in \{1,\ldots ,b^m-1\}\). Then, using the general identity

$$\begin{aligned} \sum _{n=1}^{b^m-1} f(n/b^m) = \sum _{t=1}^{m} \sum _{\begin{array}{c} \ell =1 \\ \ell \not \equiv 0 {\;(\mathrm{mod}\;b)} \end{array}}^{b^t-1} f(\ell / b^t) \end{aligned}$$

yields the rewritten formula for \(\bar{H}_{d,m,{\varvec{\eta }}}(\widetilde{{\varvec{g}}})\) in (16).
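The regrouping identity above simply collects the fractions \(n/b^m\) according to their reduced denominator, writing \(n = \ell\, b^{m-t}\) with \(\gcd(\ell, b) = 1\); a direct numerical confirmation (illustrative only) reads:

```python
from fractions import Fraction

def lhs(f, b, m):
    # left-hand side: sum over all n/b^m with 1 <= n < b^m
    return sum(f(Fraction(n, b ** m)) for n in range(1, b ** m))

def rhs(f, b, m):
    # right-hand side: regrouped by reduced denominator b^t, gcd(l, b) = 1
    return sum(f(Fraction(l, b ** t))
               for t in range(1, m + 1)
               for l in range(1, b ** t) if l % b != 0)

for b in (2, 3):
    for m in (1, 2, 3, 4):
        for f in (lambda x: x, lambda x: x * x, lambda x: 1 / (1 + x)):
            assert lhs(f, b, m) == rhs(f, b, m)
```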

Setting \(\widetilde{g}_d=g_d + g \,p_{w-1} + \bar{g} \,p_w\) with \(\bar{g}\in G_{b,m-w}\) and \(\widetilde{g}_j=g_j\) for \(j \in \{1 {{:}}d-1\}\), we can write

$$\begin{aligned}&\frac{1}{b^{m-w}} \sum _{\bar{g} \in G_{b,m-w}} H_{d,m,{\varvec{\eta }}}(g_1,\ldots ,g_{d-1},g_d + g \,p_{w-1} + \bar{g} \,p_w)\\&\quad = \frac{1}{b^{m-w}} \sum _{\bar{g} \in G_{b,m-w}} \bar{H}_{d,m,{\varvec{\eta }}}(\widetilde{{\varvec{g}}})- (b^m-1) \\&\quad = \, \frac{1}{b^{m-w}} \sum _{\bar{g} \in G_{b,m-w}} \sum _{t=1}^{m} \sum _{\begin{array}{c} \ell =1 \\ \ell \not \equiv 0 {\;(\mathrm{mod}\;b)} \end{array}}^{b^t-1} \prod _{j=1}^d \\&\qquad \left( 1 + \eta _j (1-b) \left( \left\lfloor {\log _b \left( v_t \left( \frac{\ell (x) \, \widetilde{g}_j(x)}{x^t} \right) \right) }\right\rfloor + 1 \right) \right) - (b^m-1) \\&\quad = \, \frac{1}{b^{m-w}} \sum _{\bar{g} \in G_{b,m-w}} \sum _{t=1}^{w-1} \sum _{\begin{array}{c} \ell =1 \\ \ell \not \equiv 0 {\;(\mathrm{mod}\;b)} \end{array}}^{b^t-1} \prod _{j=1}^d\\&\qquad \left( 1 + \eta _j (1-b) \left( \left\lfloor {\log _b \left( v_t \left( \frac{\ell (x) \, \widetilde{g}_j(x)}{x^t} \right) \right) }\right\rfloor + 1 \right) \right) -(b^m-1)\\&\quad \qquad +\frac{1}{b^{m-w}} \sum _{\bar{g} \in G_{b,m-w}} \sum _{t=w}^{m} \sum _{\begin{array}{c} \ell =1 \\ \ell \not \equiv 0 {\;(\mathrm{mod}\;b)} \end{array}}^{b^t-1} \prod _{j=1}^d \\&\qquad \left( 1 + \eta _j (1-b) \left( \left\lfloor {\log _b \left( v_t \left( \frac{\ell (x) \,\widetilde{g}_j(x)}{x^t} \right) \right) }\right\rfloor + 1 \right) \right) . \end{aligned}$$

This accounts for the term \(- (b^m-1)\) in (15). Moreover, by the definition of \(v_t\), we have for any \(q \in {\mathbb {F}}_b[x]\) that

$$\begin{aligned} v_t \left( \frac{q(x)}{x^t} \right) = v_t \left( \frac{q(x) \bmod x^t}{x^t} \right) , \end{aligned}$$
(17)

and hence

$$\begin{aligned}&\frac{1}{b^{m-w}} \sum _{\bar{g} \in G_{b,m-w}} \sum _{t=1}^{w-1} \sum _{\begin{array}{c} \ell =1 \\ \ell \not \equiv 0 {\;(\mathrm{mod}\;b)} \end{array}}^{b^t-1} \prod _{j=1}^d \left( 1 + \eta _j (1-b) \left( \left\lfloor {\log _b \left( v_t \left( \frac{\ell (x) \, \widetilde{g}_j(x)}{x^t} \right) \right) }\right\rfloor + 1 \right) \right) \\&\quad = \sum _{t=1}^{w-1} \sum _{\begin{array}{c} \ell =1 \\ \ell \not \equiv 0 {\;(\mathrm{mod}\;b)} \end{array}}^{b^t-1} \prod _{j=1}^d \left( 1 + \eta _j (1-b) \left( \left\lfloor {\log _b \left( v_t \left( \frac{\ell (x) \,g_j(x)}{x^t} \right) \right) }\right\rfloor + 1 \right) \right) , \end{aligned}$$

which is the first sum in \(S_{m,w,{\varvec{\eta }}}\), and, in particular, is independent of g and all \(\bar{g} \in G_{b,m-w}\).

The second sum in \(S_{m,w,{\varvec{\eta }}}\) and all remaining terms in identity (15) are obtained by considering

$$\begin{aligned}&\frac{1}{b^{m-w}} \sum _{\bar{g} \in G_{b,m-w}} \sum _{t=w}^{m} \sum _{\begin{array}{c} \ell =1 \\ \ell \not \equiv 0 {\;(\mathrm{mod}\;b)} \end{array}}^{b^t-1} \prod _{j=1}^d \left( 1 + \eta _j (1-b) \left( \left\lfloor {\log _b \left( v_t \left( \frac{\ell (x) \, \widetilde{g}_j(x)}{x^t} \right) \right) }\right\rfloor + 1 \right) \right) \nonumber \\&\quad =\sum _{t=w}^{m} \sum _{\begin{array}{c} \ell =1 \\ \ell \not \equiv 0 {\;(\mathrm{mod}\;b)} \end{array}}^{b^t-1} \prod _{j=1}^{d-1} \left( 1 + \eta _j (1-b) \left( \left\lfloor {\log _b \left( v_t \left( \frac{\ell (x) \, g_j(x)}{x^t} \right) \right) }\right\rfloor + 1 \right) \right) \nonumber \\&\qquad \times \left( 1 + \eta _d (1-b) \frac{1}{b^{m-w}} \right. \nonumber \\&\left. \qquad \times \sum _{\bar{g} \in G_{b,m-w}}\left( \left\lfloor {\log _b \left( v_t \left( \frac{\ell (x) \, (g_d (x) + g \,x^{w-1} + \bar{g}(x) \,x^w)}{x^t} \right) \right) }\right\rfloor + 1 \right) \right) , \end{aligned}$$
(18)

such that, with the help of (17) and by repeated application of Lemma 6, we obtain for each \(t \in \{w+1,\ldots ,m\}\) that

$$\begin{aligned}&\sum _{\bar{g} \in G_{b,m-w}} \left( \left\lfloor {\log _b \left( v_t \left( \frac{\ell (x) \, (g_d (x) + g \,x^{w-1} + \bar{g} (x) \,x^w)}{x^t} \right) \right) }\right\rfloor + 1 \right) \\&\quad = b^{m-t} \sum _{\bar{g} \in G_{b,t-w}} \left( \left\lfloor {\log _b \left( v_t \left( \frac{\ell (x)\, (g_d (x) + g \,x^{w-1} + \bar{g} (x) \,x^w)}{x^t} \right) \right) }\right\rfloor + 1 \right) \\&\quad = b^{m-t} \sum _{\bar{g} \in G_{b,t-w-1}} \left( \left\lfloor {\log _b \left( v_{t-1} \left( \frac{\ell (x)\, (g_d (x) + g \,x^{w-1} + \bar{g} (x) \,x^w)}{x^{t-1}} \right) \right) }\right\rfloor + 1 - 1 \right) \\&\quad = b^{m-t} \left( \left\lfloor {\log _b \left( v_w \left( \frac{\ell (x) \, (g_d (x) + g \,x^{w-1})}{x^w} \right) \right) }\right\rfloor + 1 \right) - b^m \sum _{r=w+1}^t b^{-r}\\&\quad = b^{m-w} b^{w-t} \left( \left\lfloor {\log _b \left( v_w \left( \frac{\ell (x) \, (g_d (x) + g \,x^{w-1})}{x^w} \right) \right) }\right\rfloor + 1 \right) - b^{m-w} \left( \frac{1-b^{w-t}}{b-1} \right) . \end{aligned}$$

For \(t = w\), the corresponding formula follows directly from (17), since adding multiples of \(x^w\) does not change \(v_w\). Combining this with the identity in (18) yields the remaining term of \(S_{m,w,{\varvec{\eta }}}\) and the first term in (15), which proves the claimed result. \(\square \)
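The geometric summation in the last two steps, \(b^m \sum_{r=w+1}^{t} b^{-r} = b^{m-w}\,\frac{1-b^{w-t}}{b-1}\), can be verified exactly (an illustrative check, not part of the proof):

```python
from fractions import Fraction

for b in (2, 3, 5):
    for w in (1, 2, 3):
        for t in range(w + 1, 7):
            m = 8
            # b^m * sum_{r=w+1}^t b^{-r}
            lhs = b ** m * sum(Fraction(1, b ** r) for r in range(w + 1, t + 1))
            # b^{m-w} * (1 - b^{w-t}) / (b - 1), written with integer numerators
            rhs = b ** (m - w) * Fraction(b ** (t - w) - 1, b ** (t - w) * (b - 1))
            assert lhs == rhs
```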

We note that only the first term of (15) in Lemma 7 depends on the \((w-1)\)-th order term \(g x^{w-1}\) of \(g_d\). Therefore, we can introduce the quality function for our algorithm which is based on the first term of (15), yet slightly adjusted by an additional summand that is independent of g and \(\bar{g}\).

Definition 3

(Digit-wise quality function) Let \(q \in {\mathbb {F}}_b[x]\), with prime b, let \(m,d \in {\mathbb {N}}\), and let \({\varvec{\eta }}= (\eta _{{\mathfrak {u}}})_{{\mathfrak {u}}\subseteq \{1:d\}}\), where \(\eta _{\mathfrak {u}}=\prod _{j \in {\mathfrak {u}}} \eta _j\) with positive reals \((\eta _j)_{j\ge 1}\), be product weights. For integers \(w \in \{1 {{:}}m\}\), \(r \in \{1 {{:}}d\}\), and polynomials \(g_1,\ldots ,g_{r-1} \in {\mathbb {F}}_b[x]\) with \(\gcd (g_j,p_1)=1\) for \(j=1,\ldots ,r-1\), we define the quality function \(h_{r,w,m,{\varvec{\eta }}}: {\mathbb {F}}_b[x] \rightarrow {\mathbb {R}}\) as

$$\begin{aligned} h_{r,w,m,{\varvec{\eta }}}(q)&:= \sum _{t=w}^m \frac{1}{b^{t-w}} \sum _{\begin{array}{c} \ell =1 \\ \ell \not \equiv 0 {\;(\mathrm{mod}\;b)} \end{array}}^{b^t-1} \left( 1 + \eta _{r} (1-b) \left( \left\lfloor {\log _b \left( v_w \left( \frac{\ell (x) \, q(x) }{x^w} \right) \right) }\right\rfloor + 1 \right) \right) \\&\qquad \times \prod _{j=1}^{r-1} \left( 1 + \eta _j (1-b) \left( \left\lfloor {\log _b \left( v_t \left( \frac{\ell (x) \, g_j(x)}{x^t} \right) \right) }\right\rfloor + 1 \right) \right) . \end{aligned}$$

We remark that the function \(h_{r,w,m,{\varvec{\eta }}}\) directly depends on the polynomials \(g_1,\ldots ,g_{r-1}\) even though this is not visible in the notation. In the remainder of this section, however, these polynomials will always be the components of the generating vector which were selected in the previous steps of our algorithm. Based on the quality function \(h_{r,w,m,{\varvec{\eta }}}\), we formulate the component-by-component digit-by-digit algorithm.

Algorithm 1: Component-by-component digit-by-digit construction of a generating vector based on the quality function \(h_{r,w,m,{\varvec{\eta }}}\).
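To make the construction concrete, the following sketch shows how Algorithm 1 might be implemented, based on Definition 3 and the digit-selection steps described in the proof of Theorem 5 below. All function names are ours, polynomials over \(\mathbb{F}_b\) are stored as coefficient lists with the constant term first, and we use that \(\left\lfloor \log_b(v_t(p(x)/x^t))\right\rfloor = \deg(p \bmod x^t) - t\) for p with nonzero constant term. This is an illustration of our reading of the algorithm, not a reproduction of the original listing:

```python
def poly_from_int(n, b):
    """Interpret the base-b digits of n as a polynomial over F_b (constant term first)."""
    digits = []
    while n:
        digits.append(n % b)
        n //= b
    return digits or [0]

def polymul(p, q, b):
    """Product of two F_b[x] polynomials given as coefficient lists."""
    r = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, c in enumerate(q):
            r[i + j] = (r[i + j] + a * c) % b
    return r

def log_term(p, t):
    """floor(log_b(v_t(p(x)/x^t))) + 1 for p with nonzero constant term."""
    top = max(i for i in range(min(t, len(p))) if p[i] != 0)
    return top - t + 1

def h(b, m, w, eta, q, prev):
    """Quality function h_{r,w,m,eta}(q) of Definition 3; prev holds g_1,...,g_{r-1}."""
    total = 0.0
    for t in range(w, m + 1):
        for ell in range(1, b ** t):
            if ell % b == 0:          # only ell with gcd(ell, p_1) = 1
                continue
            lp = poly_from_int(ell, b)
            term = 1 + eta[len(prev)] * (1 - b) * log_term(polymul(lp, q, b), w)
            for j, gj in enumerate(prev):
                term *= 1 + eta[j] * (1 - b) * log_term(polymul(lp, gj, b), t)
            total += term / b ** (t - w)
    return total

def cbc_dbd(b, m, d, eta):
    """Digit-by-digit CBC construction: for each component, fix the constant term
    to 1 and greedily choose the coefficient of x^(w-1) for w = 2,...,m."""
    gs = []
    for _ in range(d):
        g = [1]                       # g_{r,1} = 1
        for w in range(2, m + 1):
            best = min(range(b), key=lambda c: h(b, m, w, eta, g + [c], gs))
            g = g + [best]
        gs.append(g)
    return gs
```

For example, `cbc_dbd(2, 4, 3, [0.9, 0.5, 0.25])` returns three coefficient lists of length 4, each with constant term 1, i.e., polynomials \(g_j\) with \(\gcd(g_j, p_1) = 1\) as required.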

In the next section, we study the worst-case error behavior of polynomial lattice rules with generating vectors obtained by Algorithm 1.

3.3 Error Bounds for the Constructed Polynomial Lattice Rules

The following theorem shows that for the constructed polynomial lattice rules the quantity \(H_{d,m,{\varvec{\eta }}}({\varvec{g}})\), which for product weights \(\eta _{{\mathfrak {u}}} = \prod _{j\in {\mathfrak {u}}} \eta _j\) equals

$$\begin{aligned} H_{d,m,{\varvec{\eta }}}({\varvec{g}})&= \sum _{n=1}^{b^m-1} \prod _{j=1}^d \left( 1 + \eta _j (1-b) \left( \left\lfloor {\log _b \left( v_m \left( \frac{n(x) \, g_j(x)}{x^m} \right) \right) }\right\rfloor + 1 \right) \right) \\&\quad - (b^m-1)\\&= \sum _{t=1}^{m} \sum _{\begin{array}{c} \ell =1 \\ \ell \not \equiv 0 {\;(\mathrm{mod}\;b)} \end{array}}^{b^t-1} \prod _{j=1}^d \left( 1 + \eta _j (1-b) \left( \left\lfloor {\log _b \left( v_t \left( \frac{\ell (x) \, g_j(x)}{x^t} \right) \right) }\right\rfloor + 1 \right) \right) \\&\quad - (b^m-1) , \end{aligned}$$

can be related to the quantity \(H_{d-1,m,{\varvec{\eta }}}({\varvec{g}}_{\{1 {{:}}d-1\}})\).

Theorem 5

Let b be prime, \(m,d \in {\mathbb {N}}\) be integers with \(d\ge 2\), and let \({\varvec{\eta }}= (\eta _j)_{j\ge 1}\) be positive product weights. Furthermore, denote by \({\varvec{g}}\) the corresponding generating vector constructed by Algorithm 1. Then, \({\varvec{g}}\) satisfies

$$\begin{aligned} H_{d,m,{\varvec{\eta }}}({\varvec{g}}) \le \left( 1 + \eta _d \right) H_{d-1,m,{\varvec{\eta }}}({\varvec{g}}_{\{1:d-1\}}) + \eta _d (b^m-1) . \end{aligned}$$
(19)

Proof

We will prove (19) by an inductive argument over the selection of the terms of order \(1 \le t \le m-1\) of the polynomial \(g_d \in {\mathbb {F}}_b[x]\). We start by considering the term of order \(m-1\). According to Algorithm 1, this term has been selected by minimizing \(h_{d,m,m,{\varvec{\eta }}}(g_{d,m-1} + g \,p_{m-1})\) over the choices \(g \in {\mathbb {F}}_b\), where \(g_{d,m-1} \in G_{b,m-1}\) has been determined in the previous steps of the algorithm. By Lemma 7 (with \(w=m\)) and Definition 3, this is equivalent to minimizing

$$\begin{aligned} H_{d,m,{\varvec{\eta }}}(g_1,\ldots ,g_{d-1},g_{d,m-1} + g \,p_{m-1}) \end{aligned}$$

with respect to \(g \in {\mathbb {F}}_b\). By the standard averaging argument, this yields

$$\begin{aligned} H_{d,m,{\varvec{\eta }}}({\varvec{g}})&= \min _{\bar{g} \in {\mathbb {F}}_b} H_{d,m,{\varvec{\eta }}}\left( {\varvec{g}}_{\{1 {{:}}d-1\}},g_{d,m-1} + \bar{g} \,p_{m-1} \right) \nonumber \\&\le \frac{1}{b} \sum _{\bar{g} \in {\mathbb {F}}_b} H_{d,m,{\varvec{\eta }}}\left( {\varvec{g}}_{\{1 {{:}}d-1\}},g_{d,m-1} + \bar{g} \,p_{m-1} \right) \nonumber \\&= \frac{1}{b} \sum _{\bar{g} \in G_{b,1}} H_{d,m,{\varvec{\eta }}}\left( {\varvec{g}}_{\{1 {{:}}d-1\}},g_{d,m-2} + g \,p_{m-2} + \bar{g} \,p_{m-1} \right) , \end{aligned}$$
(20)

where, in accordance with Algorithm 1, \(g_{d,m-1}\) has been split up as \(g_{d,m-1} = g_{d,m-2} + g \,p_{m-2}\) with g selected in the previous step of the algorithm, and we used that \(G_{b,1} \cong {\mathbb {F}}_b\).

Similarly, we observe that the term of order \(m-2\) has been selected by minimizing \(h_{d,m-1,m,{\varvec{\eta }}}(g_{d,m-2} + g \,p_{m-2})\) with respect to the choices \(g \in {\mathbb {F}}_b\). Again, by Lemma 7 (with \(w=m-1\)) and Definition 3 this is equivalent to minimizing

$$\begin{aligned} \frac{1}{b} \sum _{\bar{g} \in G_{b,1}} H_{d,m,{\varvec{\eta }}} ({\varvec{g}}_{\{1 {{:}}d-1\}},g_{d,m-2} + g \,p_{m-2} + \bar{g} \,p_{m-1}) \end{aligned}$$

with respect to \(g \in G_{b,1} \cong {\mathbb {F}}_b\). By the standard averaging argument, we obtain that

$$\begin{aligned}&\min _{g \in G_{b,1}} \frac{1}{b} \sum _{\bar{g} \in {\mathbb {F}}_b} H_{d,m,{\varvec{\eta }}}\left( {\varvec{g}}_{\{1 {{:}}d-1\}},g_{d,m-2} + g \,p_{m-2} + \bar{g} \,p_{m-1} \right) \\&\quad \le \frac{1}{b^2} \sum _{g \in {\mathbb {F}}_b} \sum _{\bar{g} \in G_{b,1}} H_{d,m,{\varvec{\eta }}}\left( {\varvec{g}}_{\{1 {{:}}d-1\}},g_{d,m-2} + g \, p_{m-2} + \bar{g} \,p_{m-1} \right) \\&\quad = \frac{1}{b^2} \sum _{\bar{g} \in G_{b,2}} H_{d,m,{\varvec{\eta }}}\left( {\varvec{g}}_{\{1 {{:}}d-1\}},g_{d,m-3} + g \, p_{m-3} + \bar{g} \,p_{m-2} \right) , \end{aligned}$$

where again we split up \(g_{d,m-2}= g_{d,m-3}+ g \, p_{m-3}\) according to Algorithm 1. Inductively repeating this argument and combining the result with the estimate in (20), we obtain the inequality

$$\begin{aligned} H_{d,m,{\varvec{\eta }}}({\varvec{g}}) \le \frac{1}{b^{m-1}} \sum _{\bar{g} \in G_{b,m-1}} H_{d,m,{\varvec{\eta }}} \left( {\varvec{g}}_{\{1 {{:}}d-1\}}, 1 + \bar{g} \,p_1 \right) , \end{aligned}$$

where we used that in Algorithm 1 we set \(g_{d,1}=1\). Then, using Lemma 7 with \(w=1\), \(g_d=1\), and \(g=0\) to rewrite the right-hand side of the previous estimate, we finally obtain

$$\begin{aligned}&H_{d,m,{\varvec{\eta }}}({\varvec{g}}) \\&\quad \le \sum _{t=1}^{m} \sum _{\begin{array}{c} \ell =1 \\ \ell \not \equiv 0 {\;(\mathrm{mod}\;b)} \end{array}}^{b^t-1} \eta _d (1-b) b^{1-t} \left( \left\lfloor {\log _b \left( v_1 \left( \frac{\ell (x)}{x} \right) \right) }\right\rfloor + 1 \right) \\&\qquad \times \prod _{j=1}^{d-1} \left( 1 + \eta _j (1-b) \left( \left\lfloor {\log _b \left( v_t \left( \frac{\ell (x) \, g_j(x)}{x^t} \right) \right) }\right\rfloor + 1 \right) \right) - (b^m-1)\\&\qquad + \sum _{t=1}^m \sum _{\begin{array}{c} \ell =1 \\ \ell \not \equiv 0 {\;(\mathrm{mod}\;b)} \end{array}}^{b^t-1} \prod _{j=1}^{d-1} \left( 1 + \eta _j (1-b) \left( \left\lfloor {\log _b \left( v_t \left( \frac{\ell (x) \, g_j(x)}{x^t} \right) \right) }\right\rfloor + 1 \right) \right) \\&\quad \qquad \left( 1 + \eta _d (1-b^{1-t}) \right) . \end{aligned}$$

For \(\ell \) with \(\ell \not \equiv 0 {\;(\mathrm{mod}\;b)}\), which is equivalent to \(\gcd (\ell ,p_1)=1\), we have for some \(a \in {\mathbb {F}}_b \setminus \{0\}\) that \(\left\lfloor {\log _b \left( v_1 \left( \ell (x) / x \right) \right) }\right\rfloor + 1 = \left\lfloor {\log _b \left( a / b \right) }\right\rfloor + 1 = - 1 + 1 = 0\). Hence, we get

$$\begin{aligned} H_{d,m,{\varvec{\eta }}}({\varvec{g}})&\le \sum _{t=1}^m \sum _{\begin{array}{c} \ell =1 \\ \ell \not \equiv 0 {\;(\mathrm{mod}\;b)} \end{array}}^{b^t-1} \prod _{j=1}^{d-1} \left( 1 + \eta _j (1-b) \left( \left\lfloor {\log _b \left( v_t \left( \frac{\ell (x) \, g_j(x)}{x^t} \right) \right) }\right\rfloor + 1 \right) \right) \\&\qquad \times \left( 1 + \eta _d (1-b^{1-t}) \right) - (b^m-1)\\&\le \left( 1 + \eta _d \right) (H_{d-1,m,{\varvec{\eta }}}({\varvec{g}}_{\{1:d-1\}}) + (b^m-1)) - (b^m-1) \\&= \left( 1 + \eta _d \right) H_{d-1,m,{\varvec{\eta }}}({\varvec{g}}_{\{1:d-1\}}) + \eta _d (b^m-1) , \end{aligned}$$

which is the claimed estimate. \(\square \)

Remark 5

The proof strategy in Theorem 5, in which we relate the quantities \(H_{d,m,{\varvec{\eta }}}({\varvec{g}})\) and \(H_{d-1,m,{\varvec{\eta }}}({\varvec{g}}_{\{1:d-1\}})\) to each other, is the main reason why we restricted our analysis to product weights. Without this assumption, the inductive argument over the dimension becomes so involved that it has so far not been possible to prove the induction step. The same issue occurs for the analogous result for lattice rules; see [7, Theorems 3 and 4].

Based on the result in Theorem 5, we can use an inductive argument to show that the quantity \(H_{d,m,{\varvec{\eta }}}({\varvec{g}})\) is sufficiently small if \({\varvec{g}}\) has been constructed by Algorithm 1.

Theorem 6

Let b be prime, let \(m,d \in {\mathbb {N}}\) be positive integers and let \({\varvec{\eta }}= (\eta _j)_{j\ge 1}\) be positive product weights. Then, the generating vector \({\varvec{g}}\) constructed by Algorithm 1 satisfies

$$\begin{aligned} H_{d,m,{\varvec{\eta }}}({\varvec{g}}) \le b^m \left[ -1 + \prod _{j=1}^d ( 1 + \eta _j )\right] . \end{aligned}$$

Proof

Due to the formulation of Algorithm 1, the estimate (19) obtained in Theorem 5 holds with d replaced by r for any \(r \in \{2,\ldots ,d\}\), which yields a corresponding bound on \(H_{r,m,{\varvec{\eta }}}({\varvec{g}}_{\{1:r\}})\). Hence, we can use this estimate inductively to obtain

$$\begin{aligned}&H_{d,m,{\varvec{\eta }}}({\varvec{g}}) \nonumber \\&\quad \le ( 1 + \eta _d ) H_{d-1,m,{\varvec{\eta }}}({\varvec{g}}_{\{1:d-1\}}) + \eta _d (b^m-1)\nonumber \\&\quad \le ( 1 + \eta _d ) ( 1 + \eta _{d-1} ) H_{d-2,m,{\varvec{\eta }}}({\varvec{g}}_{\{1:d-2\}}) + (1 + \eta _d) \eta _{d-1} (b^m-1) + \eta _d (b^m-1)\nonumber \\&\quad = H_{d-2,m,{\varvec{\eta }}}({\varvec{g}}_{\{1:d-2\}}) \prod _{j=d-1}^d ( 1 + \eta _j ) + (b^m - 1) \left[ -1 + \prod _{j=d-1}^d ( 1 + \eta _j ) \right] \nonumber \\&\quad \le H_{1,m,{\varvec{\eta }}}(g_1) \prod _{j=2}^d ( 1 + \eta _j ) + (b^m - 1) \left[ -1 + \prod _{j=2}^d ( 1 + \eta _j ) \right] . \end{aligned}$$
(21)

Next, we observe that

$$\begin{aligned} H_{1,m,{\varvec{\eta }}}(g_1)&= H_{1,m,{\varvec{\eta }}}(1) \\&= \sum _{n=1}^{b^m-1} \left( 1 + \eta _1 (1-b) \left( \left\lfloor {\log _b \left( v_m \left( \frac{n(x)}{x^m} \right) \right) }\right\rfloor + 1 \right) \right) - (b^m-1) \\&= - \eta _1 \sum _{n=1}^{b^m-1} (b-1) \left( \left\lfloor {\log _b \left( v_m \left( \frac{n(x)}{x^m} \right) \right) }\right\rfloor + 1 \right) \\&= - \eta _1 \sum _{t=1}^{m} \sum _{\begin{array}{c} n=1 \\ n \not \equiv 0 {\;(\mathrm{mod}\;b)} \end{array}}^{b^t-1} (b-1) \left\lfloor {\log _b \left( v_t \left( \frac{n(x)}{x^t} \right) \right) }\right\rfloor + \eta _1 (1-b) (b^m-1)\\&= - \eta _1 \sum _{t=1}^{m} \sum _{r=0}^{t-1} \sum _{\begin{array}{c} n=1 \\ n \not \equiv 0 {\;(\mathrm{mod}\;b)} \\ \deg (n(x))=r \end{array}}^{b^t-1} (b-1) \left\lfloor {\log _b \left( v_t \left( \frac{n(x)}{x^t} \right) \right) }\right\rfloor + \eta _1 (1-b) (b^m-1). \end{aligned}$$

For any polynomial \(n(x) \in {\mathbb {F}}_b[x]\) of degree \(0 \le r < t\) with \(\gcd (n,x)=1\), we have that

$$\begin{aligned} \left\lfloor {\log _b\left( v_t\left( \frac{n(x)}{x^t} \right) \right) }\right\rfloor = -(t - r) \end{aligned}$$

such that we can further deduce that

$$\begin{aligned} H_{1,m,{\varvec{\eta }}}(g_1)&= \eta _1 \sum _{t=1}^{m} (b-1) \sum _{r=0}^{t-1} \sum _{\begin{array}{c} n=1 \\ n \not \equiv 0 {\;(\mathrm{mod}\;b)} \\ \deg (n(x))=r \end{array}}^{b^t-1} (t-r) + \eta _1 (1-b) (b^m-1) \\&= \eta _1 \sum _{t=1}^{m} (b-1) \left( (b-1) t + \sum _{r=1}^{t-1} (b-1)^2 b^{r-1} (t-r) \right) + \eta _1 (1-b) (b^m-1) \\&= \eta _1 \sum _{t=1}^{m} (b-1) \left( (b-1) t + b^t - bt + t - 1 \right) + \eta _1 (1-b) (b^m-1)\\&= \eta _1 (b-1) \sum _{t=1}^{m} (b^t - 1) + \eta _1 (1-b) (b^m-1)\\&= \eta _1 (b^{m+1} - bm - b + m) + \eta _1 (1-b) (b^m-1) = \eta _1 (b^m - (b-1)m - 1) . \end{aligned}$$

Combining this with the estimate in (21), we finally obtain

$$\begin{aligned} H_{d,m,{\varvec{\eta }}}({\varvec{g}})&\le \eta _1 \,(b^m - 1) \prod _{j=2}^d (1 + \eta _j ) + (b^m - 1) \left[ -1 + \prod _{j=2}^d ( 1 + \eta _j ) \right] \\&= (b^m - 1) \left[ -1 + \prod _{j=1}^d ( 1 + \eta _j ) \right] , \end{aligned}$$

which yields the claimed estimate. \(\square \)
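Since \(v_m(n(x)/x^m) = n/b^m\), the closed form \(H_{1,m,{\varvec{\eta }}}(1) = \eta_1 (b^m - (b-1)m - 1)\) derived in the proof can be confirmed with integer arithmetic (illustrative sketch; the common factor \(\eta_1\) is divided out):

```python
def floor_log(b, k):
    """floor(log_b(k)) for integers k >= 1."""
    e = 0
    while k >= b:
        k //= b
        e += 1
    return e

# H_{1,m,eta}(1)/eta_1 = (1-b) * sum_{n=1}^{b^m-1} (floor(log_b(n/b^m)) + 1),
# where floor(log_b(n/b^m)) = floor(log_b(n)) - m
for b in (2, 3, 5):
    for m in range(1, 6):
        s = (1 - b) * sum(floor_log(b, n) - m + 1 for n in range(1, b ** m))
        assert s == b ** m - (b - 1) * m - 1
```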

Theorem 6 allows us to prove the following result regarding the construction in Algorithm 1.

Theorem 7

Let b be prime, let \(m,d \in {\mathbb {N}}\) with \(m\ge 4\), and let \((\eta _j)_{j\ge 1}\) be positive product weights. Then, the generating vector \({\varvec{g}}\) constructed by Algorithm 1 satisfies

$$\begin{aligned}&T_{{\varvec{\eta }}} ({\varvec{g}},p_m) \\&\quad \le \frac{1}{b^m} \left[ \prod _{j=1}^d (1 + \eta _j ((b-1)m+1)) + b \,m \prod _{j=1}^d \left( 1 + \eta _j \left( 2(b-1)m + \frac{2b}{b-1}\right) \right) \right] . \end{aligned}$$

Proof

We remark that for reals \(a_1,\ldots ,a_d \in {\mathbb {R}}\) the general identity

$$\begin{aligned} \sum _{\emptyset \ne {\mathfrak {u}}\subseteq \{1:d\}} \prod _{j \in {\mathfrak {u}}} a_j = -1 + \prod _{j=1}^d (1 + a_j) \end{aligned}$$

holds. Using the bound on \(T_{{\varvec{\eta }}}({\varvec{g}},p_m)\) in Theorem 4 and inserting for \({\varvec{g}}\) the generating vector obtained from Algorithm 1, for which the bound on \(H_{d,m,{\varvec{\eta }}}({\varvec{g}})\) from Theorem 6 holds, yields

$$\begin{aligned} T_{{\varvec{\eta }}}({\varvec{g}},p_m)&\le \left[ -1 + \prod _{j=1}^d ( 1 + \eta _j ) \right] - \left[ -1 + \prod _{j=1}^d ( 1 + \eta _j ) \right] \\&\quad +\sum _{\emptyset \ne {\mathfrak {u}}\subseteq \{1:d\}} \frac{\eta _{\mathfrak {u}}}{b^m} ((b-1)m +1)^{|{\mathfrak {u}}|} \\&\quad + \sum _{\emptyset \ne {\mathfrak {u}}\subseteq \{1:d\}} \frac{\eta _{\mathfrak {u}}}{b^m} \, (b \,m \left| {{\mathfrak {u}}}\right| ) \left( (b-1)m + \frac{b}{b-1} \right) ^{\left| {{\mathfrak {u}}}\right| } \\&\le \frac{1}{b^m} \left[ \prod _{j=1}^d (1 + \eta _j ((b-1)m+1)) \right. \\&\quad \left. + b \,m \prod _{j=1}^d \left( 1 + \eta _j \left( 2(b-1)m + \frac{2b}{b-1}\right) \right) \right] , \end{aligned}$$

where in the last step we used that \(\left| {{\mathfrak {u}}}\right| \le 2^{\left| {{\mathfrak {u}}}\right| }\). Note that by the formulation of Algorithm 1 we have \(\gcd (g_j,p_m)=1\) for \(1 \le j \le d\), so that the conditions of Theorem 4 are satisfied. \(\square \)
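For illustration, the identity used at the beginning of the proof, which rewrites a sum over nonempty subsets as a product, is easy to verify numerically. The following short Python sketch (not part of the paper's implementation; the test values \(a_j\) are arbitrary) checks it by brute force:

```python
from itertools import combinations
from math import prod, isclose

def subset_sum(a):
    """Sum of prod_{j in u} a_j over all nonempty subsets u of {1,...,d}."""
    d = len(a)
    return sum(
        prod(a[j] for j in u)
        for r in range(1, d + 1)
        for u in combinations(range(d), r)
    )

a = [0.5, -1.25, 2.0, 0.75]  # arbitrary real test values
lhs = subset_sum(a)
rhs = -1 + prod(1 + aj for aj in a)
assert isclose(lhs, rhs)
```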

The next theorem states the main result of this paper, implying that by the construction in Algorithm 1 we obtain an error convergence rate that is arbitrarily close to the optimal rate of \(N^{-\alpha }\). We know that this order is optimal due to the relation between the worst-case errors in \(W_{d,{\varvec{\gamma }}}^\alpha \) and \(\widetilde{W}_{d,{\varvec{\gamma }}}^\alpha \) (see Theorem 2 and Remark 3) and due to the fact that the rate \(N^{-\alpha /2}\) is optimal in \(\widetilde{W}_{d,{\varvec{\gamma }}}^\alpha \), cf., e.g., [6, Theorem 41]. Additionally, under a summability condition on the weights that is common in the related literature, the error can be bounded independently of the dimension, by which we obtain what is known as strong polynomial tractability in the context of information-based complexity. For an overview of different notions of tractability and basic concepts of information-based complexity, we refer the interested reader to [17].

Theorem 8

Let b be prime, let \(m,d \in {\mathbb {N}}\) with \(m\ge 4\), let \(N=b^m\), and let \((\gamma _j)_{j\ge 1}\) be positive product weights satisfying

$$\begin{aligned} \sum _{j \ge 1} \gamma _j < \infty . \end{aligned}$$

Furthermore, denote by \({\varvec{g}}\) the generating vector obtained by Algorithm 1, run for the weight sequence \({\varvec{\eta }}={\varvec{\gamma }}=(\gamma _j)_{j\ge 1}\). Then, for any \(\delta >0\) and each \(\alpha >1\), the generating vector \({\varvec{g}}\) satisfies

$$\begin{aligned} e_{b^m,d,\alpha ,{\varvec{\gamma }}^\alpha }({\varvec{g}}) \le \frac{1}{N^\alpha } \left( C({\varvec{\gamma }}^\alpha ) + \bar{C}\left( {\varvec{\gamma }},\delta \right) \, N^{\alpha \delta } \right) , \end{aligned}$$

with positive constants \(C({\varvec{\gamma }}^\alpha )\) and \(\bar{C}\left( {\varvec{\gamma }},\delta \right) \), which are independent of d and N.

Additionally, if Algorithm 1 is run for the weights \({\varvec{\eta }}={\varvec{\gamma }}^{1/\alpha }\) with \(\alpha > 1\), which satisfy

$$\begin{aligned} \sum _{j \ge 1} \gamma _j^{1/\alpha } < \infty , \end{aligned}$$

then, for any \(\delta >0\), the resulting generating vector \(\widetilde{{\varvec{g}}}\) satisfies the error bound

$$\begin{aligned} e_{b^m,d,\alpha ,{\varvec{\gamma }}}(\widetilde{{\varvec{g}}}) \le \frac{1}{N^\alpha } \left( K({\varvec{\gamma }}) + \bar{K}({\varvec{\gamma }}^{1/\alpha },\delta ) \, N^{\alpha \delta } \right) , \end{aligned}$$

with positive constants \(K({\varvec{\gamma }})\) and \(\bar{K}({\varvec{\gamma }}^{1/\alpha },\delta )\), which are independent of d and N.

Proof

We know from Proposition 1 that

$$\begin{aligned} e_{b^m,d,\alpha ,{\varvec{\eta }}^\alpha }({\varvec{g}}) \le \frac{1}{N^{\alpha }} \sum _{\emptyset \ne {\mathfrak {u}}\subseteq \{1:d\}} \eta ^\alpha _{{\mathfrak {u}}} (2 \mu _b(\alpha ))^{\left| {{\mathfrak {u}}}\right| } + T_{\alpha ,{\varvec{\eta }}^\alpha }({\varvec{g}},p_m) . \end{aligned}$$

For the special case of product weights \(\eta _{{\mathfrak {u}}}=\prod _{j\in {\mathfrak {u}}} \eta _j\), \({\mathfrak {u}}\subseteq \{1 {{:}}d\}\), this yields

$$\begin{aligned} e_{b^m,d,\alpha ,{\varvec{\eta }}^\alpha }({\varvec{g}}) \le \frac{1}{N^{\alpha }} \prod _{j=1}^d \left( 1 + 2\mu _b (\alpha ) \eta _j^\alpha \right) + T_{\alpha ,{\varvec{\eta }}^\alpha }({\varvec{g}},p_m) . \end{aligned}$$

Since \(\alpha >1\), we can use an inequality, sometimes referred to as Jensen’s inequality, which states that \(\sum _{i=1}^M y_i \le \left( \sum _{i=1}^M y^p_i \right) ^{1/p}\) for non-negative \(y_1,\ldots ,y_M\) and \(0 < p \le 1\). Applying this with \(p = 1/\alpha \) yields

$$\begin{aligned} T_{\alpha ,{\varvec{\eta }}^\alpha }({\varvec{g}},p_m)&= \sum _{{\varvec{0}}\ne {\varvec{k}}\in A_p({\varvec{g}})} (r_{\alpha ,{\varvec{\eta }}^\alpha }({\varvec{k}}))^{-1} = \sum _{{\varvec{0}}\ne {\varvec{k}}\in A_p({\varvec{g}})} (r_{1,{\varvec{\eta }}}({\varvec{k}}))^{-\alpha } \\&\le \left( \sum _{{\varvec{0}}\ne {\varvec{k}}\in A_p({\varvec{g}})} (r_{1,{\varvec{\eta }}}({\varvec{k}}))^{-1}\right) ^\alpha = \left( T_{{\varvec{\eta }}}({\varvec{g}},p_m)\right) ^\alpha , \end{aligned}$$

and by Theorem 7 we know that Algorithm 1 run for weights \({\varvec{\eta }}\) yields \({\varvec{g}}\) which satisfy

$$\begin{aligned} T_{{\varvec{\eta }}}({\varvec{g}},p_m)\le & {} \frac{1}{b^m} \left[ \prod _{j=1}^d (1 + \eta _j ((b-1)m+1)) \right. \\&\quad \left. +\, b \,m \prod _{j=1}^d \left( 1 + \eta _j \left( 2(b-1)m + \frac{2b}{b-1}\right) \right) \right] . \end{aligned}$$

From this, we deduce, using either the weights \({\varvec{\eta }}={\varvec{\gamma }}^{1/\alpha }\) or \({\varvec{\eta }}={\varvec{\gamma }}\) for Algorithm 1, and using that both \((b-1)m+1\) and \(2(b-1)m + \frac{2b}{b-1}\) are bounded by \(4bm\) for prime \(b\) and \(m \ge 1\), that

$$\begin{aligned} b^m \,T_{{\varvec{\eta }}}({\varvec{g}},p_m)&\le \prod _{j=1}^d (1 + \eta _j 4 b m) + b \,m \prod _{j=1}^d (1 + \eta _j 4 b m) = (1 + b \, m) \prod _{j=1}^d (1 + \eta _j 4 b m) \\&\le \widetilde{C}(\delta /2) \, b^{m \delta / 2} \prod _{j=1}^d (1 + \eta _j 4 b m) \le \widetilde{C}(\delta /2) \, b^{m \delta / 2} \prod _{j=1}^\infty (1 + \eta _j 4 b m) \end{aligned}$$

for arbitrary \(\delta >0\), where \(\widetilde{C}(\delta /2)\) is a constant depending only on \(\delta \). Due to the imposed condition on the weights, i.e., \(\sum _{j \ge 1} \gamma _j < \infty \) or \(\sum _{j \ge 1} \gamma _j^{1/\alpha } < \infty \), we can use the result in [8, Lemma 3] to see that the last product can be bounded by \(\widehat{C} ({\varvec{\gamma }},\delta ) \, b^{m \delta / 2}\) or \(\widehat{C} ({\varvec{\gamma }}^{1/\alpha },\delta ) \, b^{m \delta / 2}\), respectively, where \(\widehat{C} ({\varvec{\gamma }},\delta )\) and \(\widehat{C} ({\varvec{\gamma }}^{1/\alpha },\delta )\) may depend on \(\delta \) and the weights \({\varvec{\gamma }}\) or \({\varvec{\gamma }}^{1/\alpha }\), but are independent of the dimension. Choosing \({\varvec{\eta }}={\varvec{\gamma }}\), this yields that

$$\begin{aligned} \left( T_{{\varvec{\eta }}}({\varvec{g}},p_m)\right) ^\alpha = \left( T_{{\varvec{\gamma }}}({\varvec{g}},p_m)\right) ^\alpha \le \frac{1}{N^\alpha }\left( \widetilde{C}(\delta /2)\right) ^\alpha \left( \widehat{C} ({\varvec{\gamma }},\delta ) \right) ^{\alpha } N^{\alpha \delta } , \end{aligned}$$

and similarly, for \({\varvec{\eta }}={\varvec{\gamma }}^{1/\alpha }\),

$$\begin{aligned} \left( T_{{\varvec{\eta }}}({\varvec{g}},p_m)\right) ^\alpha = \left( T_{{\varvec{\gamma }}^{1/\alpha }}({\varvec{g}},p_m)\right) ^\alpha \le \frac{1}{N^{\alpha }}\left( \widetilde{C}(\delta /2)\right) ^\alpha \left( \widehat{C} ({\varvec{\gamma }}^{1/\alpha }, \delta ) \right) ^\alpha N^{\alpha \delta } . \end{aligned}$$

Setting then \(C({\varvec{\gamma }}^\alpha )=\prod _{j=1}^d (1 + 2\mu _b (\alpha ) \gamma _j^\alpha )\) and \(\bar{C}({\varvec{\gamma }},\delta ) = (\widetilde{C}(\delta /2))^\alpha \left( \widehat{C} ({\varvec{\gamma }},\delta ) \right) ^\alpha \), and, similarly, \(K({\varvec{\gamma }})=\prod _{j=1}^d (1 + 2\mu _b (\alpha ) \gamma _j)\) and \(\bar{K}({\varvec{\gamma }}^{1/\alpha },\delta ) = (\widetilde{C}(\delta /2))^\alpha \left( \widehat{C} ({\varvec{\gamma }}^{1/\alpha },\delta ) \right) ^\alpha \), we obtain the claimed error estimates, where the first stated bound holds simultaneously for all \(\alpha > 1\). \(\square \)
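The superadditivity inequality \(\sum _{i} y_i \le \left( \sum _{i} y^p_i \right)^{1/p}\) for \(0 < p \le 1\), used in the proof with \(p = 1/\alpha \), can likewise be checked numerically. A small sketch (random nonnegative test data, not taken from the paper):

```python
import random

def jensen_holds(y, p):
    """Check sum(y) <= (sum(y_i**p))**(1/p) for nonnegative y and 0 < p <= 1."""
    lhs = sum(y)
    rhs = sum(yi ** p for yi in y) ** (1.0 / p)
    return lhs <= rhs * (1 + 1e-12)  # small tolerance for floating point

random.seed(7)
for _ in range(100):
    y = [random.random() for _ in range(10)]
    p = random.uniform(0.05, 1.0)
    assert jensen_holds(y, p)
```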

The result in Theorem 8 consists of two statements regarding the worst-case error behavior of generating vectors constructed by Algorithm 1. On the one hand, when run with weights \({\varvec{\gamma }}^{1/\alpha }\), and hence depending on the parameter \(\alpha \), the algorithm yields typical error bounds for the worst-case error in the space \(W_{d,{\varvec{\gamma }}}^\alpha \). We emphasize that this type of result could also be obtained by formulating and using an analogous CBC-DBD algorithm which is instead directly based on the search criterion \(e_{b^m,d,\alpha ,{\varvec{\gamma }}}\). On the other hand, when run with weights \({\varvec{\gamma }}\), thus independently of \(\alpha \), the algorithm produces generating vectors for which bounds on the worst-case errors in the spaces \(W_{d,{\varvec{\gamma }}^\alpha }^\alpha \) hold simultaneously for all \(\alpha > 1\).

4 Fast Implementation of the Construction Scheme

In this section, we discuss the efficient implementation of the introduced CBC-DBD algorithm and analyze its complexity. Throughout this section, we will consider the implementation for the special case of \(b=2\) and product weights \(\gamma _{{\mathfrak {u}}} = \prod _{j\in {\mathfrak {u}}} \gamma _j\) for a sequence of positive reals \((\gamma _j)_{j\ge 1}\). Choosing the prime base as \(b=2\) allows for the use of bitwise operations, which facilitate an efficient implementation of the construction scheme. We remark that the major challenge for the implementation of the algorithm for \(b>2\) is an efficient computation of the polynomial multiplication modulo b; all other steps of the algorithm can be implemented analogously.

4.1 Implementation and Cost Analysis of the CBC-DBD Algorithm

Let \(q \in {\mathbb {F}}_2[x]\), let \(m,d \in {\mathbb {N}}\) be positive integers, and let \({\varvec{\gamma }}= (\gamma _{{\mathfrak {u}}})_{{\mathfrak {u}}\subseteq \{1:d\}}\), where \(\gamma _{\mathfrak {u}}=\prod _{j \in {\mathfrak {u}}} \gamma _j\) with positive reals \((\gamma _j)_{j\ge 1}\). We recall that for \(b=2\) and integers \(w \in \{1 {{:}}m\}\), \(r \in \{1 {{:}}d\}\) the digit-wise quality function \(h_{r,w,m,{\varvec{\gamma }}}\) in Definition 3, which is used in Algorithm 1, is given by

$$\begin{aligned} h_{r,w,m,{\varvec{\gamma }}}(q)&= \sum _{t=w}^m \frac{1}{2^{t-w}} \sum _{\begin{array}{c} \ell =1 \\ \ell \equiv 1 {\;(\mathrm{mod}\;2)} \end{array}}^{2^t-1} \left( 1 - \gamma _r \left( \left\lfloor {\log _2 \left( v_w \left( \frac{\ell (x) \, q(x)}{x^w} \right) \right) }\right\rfloor + 1 \right) \right) \\&\quad \times \prod _{j=1}^{r-1} \left( 1 - \gamma _j \left( \left\lfloor {\log _2 \left( v_t \left( \frac{\ell (x) \, g_j(x)}{x^t} \right) \right) }\right\rfloor + 1 \right) \right) , \end{aligned}$$

where the polynomials \(g_1,\ldots ,g_{r-1} \in {\mathbb {F}}_2[x]\) have been determined in the previous steps of the algorithm. Since the cost of a single evaluation of the function \(h_{r,w,m,{\varvec{\gamma }}}\) is crucial for the total cost of Algorithm 1, we are interested in an efficient evaluation procedure which will be discussed in the following paragraph.

For integers \(t \in \{2,\ldots ,m\}\) and odd \(\ell \in \{1,\ldots ,2^t - 1\}\), we define the term \(a(r, t, \ell )\) as

$$\begin{aligned} a(r,t,\ell ) := \prod _{j=1}^r \left( 1 - \gamma _j \left( \left\lfloor {\log _2 \left( v_t \left( \frac{\ell (x) \, g_j(x)}{x^t} \right) \right) }\right\rfloor + 1 \right) \right) \end{aligned}$$

and observe that for the evaluation of \(h_{r,w,m,{\varvec{\gamma }}}(q)\) we can compute and store the term \(a(r-1,t,\ell )\) since it is independent of w and q. This way we can rewrite \(h_{r,w,m,{\varvec{\gamma }}}(q)\) as

$$\begin{aligned} h_{r,w,m,{\varvec{\gamma }}}(q)&= \sum _{t=w}^m \frac{1}{2^{t-w}} \sum _{\begin{array}{c} \ell =1 \\ \ell \equiv 1 {\;(\mathrm{mod}\;2)} \end{array}}^{2^t-1} a(r-1,t,\ell ) \nonumber \\&\quad \times \left( 1 - \gamma _r \left( \left\lfloor {\log _2 \left( v_w \left( \frac{\ell (x) \, q(x)}{x^w} \right) \right) }\right\rfloor + 1 \right) \right) , \end{aligned}$$
(22)

where in Algorithm 1, after having determined \(g_{r,w}\), the values of \(a(r,w,\ell )\) for odd integers \({\ell \in \{1,\ldots ,2^w-1\}}\) are computed via the recurrence relation

$$\begin{aligned} a(r,w,\ell ) = a(r-1,w,\ell ) \left( 1 - \gamma _r \left( \left\lfloor {\log _2 \left( v_w \left( \frac{\ell (x) \, g_{r,w}(x)}{x^w} \right) \right) }\right\rfloor + 1 \right) \right) . \end{aligned}$$

For an algorithmic implementation, we introduce the vector \({\varvec{u}}=(u(1),\ldots ,u(2^m-1)) \in {\mathbb {R}}^{2^m-1}\) whose components, for the current \(r \in \{1,\ldots ,d\}\), are given by

$$\begin{aligned} u(\ell \, 2^{m-t}) = \prod _{j=1}^r \left( 1 - \gamma _j \left( \left\lfloor {\log _2 \left( v_t \left( \frac{\ell (x) \, g_j(x)}{x^t} \right) \right) }\right\rfloor + 1 \right) \right) = a(r,t,\ell ) \end{aligned}$$

for each \(t=1,\ldots ,m\) and corresponding odd index \(\ell \in \{1,\ldots ,2^t - 1\}\). In Algorithm 2, the quantity \({\varvec{u}}\) stores the values of \(a(r-1,t,\ell )\) for the current r and can therefore be used to evaluate \(h_{r,w,m,{\varvec{\gamma }}}\) according to equation (22). Furthermore, note that for the evaluation of \(h_{r,w,m,{\varvec{\gamma }}}\) we do not require the values of \(a(r,t,\ell )\) for \(t=2,\ldots ,w-1\). Combining these findings leads to the following fast implementation of Algorithm 1.
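The indexing \(u(\ell \, 2^{m-t}) = a(r,t,\ell )\) is well defined because the map \((t,\ell ) \mapsto \ell \, 2^{m-t}\), over \(t = 1,\ldots ,m\) and odd \(\ell \in \{1,\ldots ,2^t-1\}\), is a bijection onto \(\{1,\ldots ,2^m-1\}\): every positive integer below \(2^m\) factors uniquely as an odd number times a power of two. The following Python sketch (ours, for illustration only) confirms this for small m:

```python
def index_map(m):
    """Map (t, ell) -> ell * 2**(m - t) for t = 1..m and odd ell in {1,...,2**t - 1}."""
    return {
        (t, ell): ell * 2 ** (m - t)
        for t in range(1, m + 1)
        for ell in range(1, 2 ** t, 2)
    }

for m in range(1, 10):
    indices = sorted(index_map(m).values())
    # the indices cover {1, ..., 2**m - 1} with no collisions
    assert indices == list(range(1, 2 ** m))
```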

Algorithm 2 (fast implementation of the CBC-DBD construction)

The computational complexity of Algorithm 2 is then summarized in the following theorem.

Theorem 9

Let \(m,d \in {\mathbb {N}}\) and let \({\varvec{\gamma }}= (\gamma _j)_{j=1}^d\) be a given sequence of positive weights. Then, Algorithm 2 constructs a generating vector \({\varvec{g}}=(g_1,\ldots ,g_d) \in (G_{2,m}^*)^d\) using \({\mathcal {O}}(d \,m\, 2^m)\) operations and requiring \({\mathcal {O}}(2^m)\) memory.

Proof

Due to the relation in (22), the cost of evaluating \(h_{r,w,m,{\varvec{\gamma }}}(q)\) can be reduced to \({\mathcal {O}}(\sum _{t=w}^m 2^{t-1})\) operations. Thus, the number of calculations in the inner loop over \(w = 2,\dots ,m\) of Algorithm 2 is of order

$$\begin{aligned} {\mathcal {O}}\left( \sum _{w=2}^m 2 \sum _{t=w}^m 2^{t-1}\right) = {\mathcal {O}}\left( \sum _{w=2}^m \sum _{t=w}^m 2^t \right) = {\mathcal {O}}\left( m \, 2^m - 2(2^m - 1) \right) = {\mathcal {O}}\left( m \, 2^m \right) . \end{aligned}$$

Hence, the outer loop over \(r=2,\ldots ,d\), which is the main cost of Algorithm 2, can be executed in \({\mathcal {O}}\left( d \,m\, 2^m \right) \) operations. Furthermore, we observe that initialization and updating of the vector \({\varvec{u}}\in {\mathbb {R}}^{2^m-1}\) for each r can both be executed in

$$\begin{aligned} \sum _{w=2}^{m} {\mathcal {O}}\left( 2^{w-1} \right) = \sum _{w=1}^{m-1} {\mathcal {O}}\left( 2^{w} \right) = {\mathcal {O}}(2^m) \text { operations}, \end{aligned}$$

such that a total cost of \({\mathcal {O}}(d \, 2^m)\) operations for maintaining \({\varvec{u}}\) is required in Algorithm 2. Additionally, storing the vector \({\varvec{u}}\) requires \({\mathcal {O}}(2^m)\) memory. \(\square \)

We note that for the modulus polynomial \(x^m\) the evaluation of \(v_m(\ell (x) / x^m)\) requires only one integer division since \(v_m (\ell (x) / x^m) = \ell / 2^m\). Furthermore, we remark that the running time of Algorithm 2 can be reduced further by precomputing and storing the \(2^m-1\) values

$$\begin{aligned} \left( \left\lfloor {\log _2 \left( v_m \left( \frac{\ell }{x^m} \right) \right) }\right\rfloor + 1 \right) \quad \text {for} \quad \ell =1,\ldots ,2^m-1 . \end{aligned}$$
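For \(b=2\), polynomials in \({\mathbb {F}}_2[x]\) can be encoded as integer bit masks, and the quantity \(\left\lfloor \log _2 ( v_w(\ell (x) q(x)/x^w) ) \right\rfloor + 1\) reduces to pure bit operations: if c denotes the carry-less product of the bit masks of \(\ell \) and q, then \(v_w(\ell (x) q(x)/x^w) = (c \bmod 2^w)/2^w\), so the expression equals the bit length of \(c \bmod 2^w\) minus w. A hedged sketch of this observation (function names are ours, not from the paper's implementation):

```python
def clmul(a, b):
    """Carry-less product of bit masks a, b (polynomial multiplication over F_2)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result

def log_term(ell, q, w):
    """floor(log2(v_w(ell(x) q(x) / x^w))) + 1 for b = 2, with ell, q as bit masks.

    Assumes ell and q both have constant term 1, so the truncation is nonzero."""
    c = clmul(ell, q) & ((1 << w) - 1)  # coefficients of x^0, ..., x^(w-1)
    return c.bit_length() - w           # floor(log2(c / 2**w)) + 1

# Consistency with the remark above: for q(x) = 1, v_m(ell(x)/x^m) = ell / 2^m,
# so the precomputed values are simply ell.bit_length() - m.
m = 5
for ell in range(1, 2 ** m, 2):
    assert log_term(ell, 1, m) == ell.bit_length() - m
```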

The derivation leading to the fast implementation in Algorithm 2 uses arguments similar to those in [7], where a component-by-component digit-by-digit construction for lattice rules in weighted Korobov spaces has been studied. Theorem 9 shows that the fast implementation of the component-by-component digit-by-digit construction for polynomial lattice rules achieves the same computational complexity as state-of-the-art component-by-component methods, see, e.g., [3], which, for product weights, require \({\mathcal {O}}(d \, m \, 2^m)\) operations. In these constructions, the speed-up is achieved by reordering the involved matrices to be of circulant structure and then employing a fast matrix-vector product which uses fast Fourier transformations (FFTs). We refer to [19] for further details on an implementation for polynomial lattice rules. In contrast, our method does not rely on FFTs; the low time complexity of the resulting algorithm is due to the smaller search space for the components \(g_j\) of the generating vector \({\varvec{g}}\). Furthermore, we remark that the mentioned state-of-the-art CBC constructions mainly use a primitive or irreducible modulus \(p \in {\mathbb {F}}_2[x]\) since then the multiplicative group of \({\mathbb {F}}_2[x] / (p)\) is cyclic. While for reducible polynomials, such as \(p(x)=x^m\), a fast CBC construction is theoretically possible by using a strategy similar to that of the fast CBC construction for lattice rules with a composite number of points, there are, to the best of our knowledge, no explicit implementations of such an algorithm. The CBC-DBD construction considered in this article, on the other hand, immediately yields a fast algorithm for the construction of polynomial lattice rules with \(p(x)=x^m\) in \({\mathcal {O}}(d \,m\, 2^m)\) operations.

5 Numerical Results

In this section, we illustrate the error convergence behavior of the polynomial lattice rules constructed by the CBC-DBD algorithm and visualize the computational complexity of the construction by means of numerical experiments. As in the previous section, we consider polynomial lattice rules in the weighted Walsh space \(W_{d,{\varvec{\gamma }}}^\alpha \) for prime base \(b=2\) and product weights \(\gamma _{\mathfrak {u}}= \prod _{j \in {\mathfrak {u}}} \gamma _j\) given in terms of positive reals \((\gamma _j)_{j \ge 1}\).

In order to demonstrate the performance of the algorithm, we compare the worst-case errors of the constructed polynomial lattice rules as well as the algorithm’s computation times to the corresponding quantities obtained by a state-of-the-art component-by-component algorithm, see, e.g., [3]. As remarked in the previous section, no fast CBC construction is known for the case \(p(x)=x^m\), so we instead compare our algorithm with a CBC construction that uses a primitive polynomial \(p \in {\mathbb {F}}_2[x]\) of degree m as the modulus. Both constructions deliver polynomial lattice rules for the spaces \(W_{d,{\varvec{\gamma }}}^\alpha \) consisting of \(2^m\) cubature points.

The different algorithms have been implemented in MATLAB R2019b and Python 3.6.3. In Python, the implementations are available in double-precision as well as arbitrary-precision floating-point arithmetic, with the latter provided by the multiprecision Python library mpmath.

5.1 Error Convergence Behavior

Let \(m,d \in {\mathbb {N}}\), \(\alpha > 1\), and a sequence of positive weights \({\varvec{\gamma }}= (\gamma _j)_{j \ge 1}\) be given. By Theorem 2, the worst-case error of a polynomial lattice point set \(P({\varvec{g}},p)=\{{\varvec{x}}_0,\ldots ,{\varvec{x}}_{b^m-1}\}\) in base \(b=2\) with generating vector \({\varvec{g}}\) and modulus \(p \in {\mathbb {F}}_2[x]\), with \(\deg (p)=m\), in the space \(W_{d,{\varvec{\gamma }}}^\alpha \) is given by

$$\begin{aligned} e_{b^m,d,\alpha ,{\varvec{\gamma }}}({\varvec{g}})&= \sum _{{\varvec{0}}\ne {\varvec{k}}\in {\mathcal {D}}({\varvec{g}},p)} \left( r_{\alpha ,{\varvec{\gamma }}}({\varvec{k}}) \right) ^{-1} = \frac{1}{b^m} \sum _{n=0}^{b^m-1} \sum _{{\varvec{0}}\ne {\varvec{k}}\in {\mathbb {N}}_0^d} \gamma _{\mathrm{supp}({\varvec{k}})} \frac{\mathrm{wal}_{{\varvec{k}}} ({\varvec{x}}_n)}{r_{\alpha }({\varvec{k}})} . \end{aligned}$$

For \(b=2\) and product weights \(\gamma _{{\mathfrak {u}}} = \prod _{j \in {\mathfrak {u}}} \gamma _j\), this expression then equals

$$\begin{aligned} e_{2^m,d,\alpha ,{\varvec{\gamma }}}({\varvec{g}}) = -1 + \frac{1}{2^m} \sum _{n=0}^{2^m-1} \prod _{j=1}^d \left( 1 + \gamma _j \,\phi _\alpha (x_{n,j}) \right) \end{aligned}$$

with \(\phi _\alpha : [0,1] \rightarrow {\mathbb {R}}\) given by

$$\begin{aligned} \phi _\alpha (x) = \left\{ \begin{array}{ll} \mu _2(\alpha ), &{} \quad {\text {if }} x=0 , \\ \mu _2(\alpha ) - 2^{(1+t)(\alpha -1)} (\mu _2(\alpha )+1), &{} \quad {\text {otherwise, with }} t=\left\lfloor {\log _2(x)}\right\rfloor , \end{array}\right. \end{aligned}$$

see, e.g., [4, Theorem 2], where \(\mu _b(\alpha )\) is defined as in (3). For the polynomial lattice rules constructed by the algorithms considered, we will use this worst-case error expression as a measure of quality.
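The product formula above can be evaluated directly once the point set is available. The sketch below generates a polynomial lattice point set in base 2 with modulus \(p(x)=x^m\), encoding polynomials as bit masks so that the j-th coordinate of \({\varvec{x}}_n\) is \((n \odot g_j \bmod 2^m)/2^m\) with \(\odot \) denoting carry-less multiplication, and evaluates \(e_{2^m,d,\alpha ,{\varvec{\gamma }}}\). Since the definition (3) of \(\mu _b(\alpha )\) is not reproduced in this section, the code assumes the normalization \(\mu _2(\alpha ) = (1-2^{1-\alpha })^{-1}\); under a different normalization only that constant changes. All function names are ours, for illustration only:

```python
from math import floor, log2, prod

def clmul(a, b):
    """Carry-less (F_2 polynomial) product of the bit masks a and b."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result

def plr_points(g, m):
    """Points of the polynomial lattice rule with generating vector g (bit masks)
    and modulus p(x) = x^m in base 2: x_{n,j} = v_m(n(x) g_j(x) / x^m)."""
    mask = (1 << m) - 1
    return [[(clmul(n, gj) & mask) / 2 ** m for gj in g] for n in range(2 ** m)]

def phi(x, alpha, mu):
    """The function phi_alpha; mu is mu_2(alpha) (normalization assumed, see text)."""
    if x == 0:
        return mu
    t = floor(log2(x))
    return mu - 2 ** ((1 + t) * (alpha - 1)) * (mu + 1)

def worst_case_error(g, m, alpha, gamma):
    """e_{2^m,d,alpha,gamma} via the product formula, for product weights gamma."""
    mu = 1 / (1 - 2 ** (1 - alpha))  # assumed normalization of mu_2(alpha)
    pts = plr_points(g, m)
    return -1 + sum(
        prod(1 + gj * phi(xj, alpha, mu) for gj, xj in zip(gamma, x)) for x in pts
    ) / 2 ** m

# Tiny example: d = 1, m = 1, g = (1), gamma_1 = 1, alpha = 2.
e = worst_case_error([1], 1, 2.0, [1.0])
```

For this one-dimensional toy case the dual-space sum can be carried out by hand under the stated normalization and gives \(e = \gamma _1 (\mu _2(\alpha )-1)/2 = 1/2\), which the sketch reproduces; for realistic parameters one would instead plug in a generating vector produced by Algorithm 1 or 2.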

In particular, we consider the convergence behavior of the worst-case error \(e_{2^m,d,\alpha ,{\varvec{\gamma }}^\alpha }({\varvec{g}})\) for generating vectors \({\varvec{g}}\) obtained by the CBC-DBD algorithm (with modulus \(p(x)=x^m\)) and compare it with the error rates for polynomial lattice rules constructed by the standard fast CBC algorithm (with primitive polynomial \(p \in {\mathbb {F}}_2[x]\) of degree m) which uses the worst-case error \(e_{2^m,d,\alpha ,{\varvec{\gamma }}^\alpha }\) as the quality criterion. We display the computation results for dimension \(d=100\) for different sequences of product weights \({\varvec{\gamma }}= (\gamma _j)_{j\ge 1}\), different values of m, and different smoothness parameters \(\alpha \). We stress that the almost optimal error rates of \({\mathcal {O}}(N^{-\alpha +\delta })\), guaranteed by Theorem 8, may not always be visible for the weights and ranges of N considered in our numerical experiments. The graphs shown are therefore to be understood as an illustration of the pre-asymptotic behavior of the worst-case error.

Remark 6

We stress that in these numerical experiments we compare the CBC-DBD algorithm with modulus \(p(x)=x^m\) to the CBC construction with a primitive modulus polynomial. Both constructions yield polynomial lattices consisting of \(N=b^m\) points that have been constructed for the same function space \(W_{d,{\varvec{\gamma }}}^\alpha \), so that the comparison is fair. To the best of our knowledge, there is no known implementation of the fast CBC algorithm for polynomial lattice rules based on the modulus \(p(x)=x^m\). The reason for the elusiveness of such an implementation is the more involved structure of the group of units of the factor ring \({\mathbb {F}}_b[x]/(x^m)\) when factored into cyclic groups, see, e.g., [24]. While for lattice rules the group of integer units modulo \(N=b^m\) is either cyclic (for odd b) or can be factored into two cyclic subgroups (for \(b=2\)), which makes the corresponding generator easily computable, see, e.g., [19], the group of units of \({\mathbb {F}}_b[x]/(x^m)\) factors into a larger number of cyclic subgroups (for sufficiently large m), and their generating elements are less studied in the context of QMC methods.

Fig. 1

Convergence results of the worst-case error \(e_{2^m,d,\alpha ,{\varvec{\gamma }}^\alpha }({\varvec{g}})\) in the weighted space \(W_{d,{\varvec{\gamma }}^\alpha }^\alpha \) for smoothness parameters \(\alpha =1.5,2,3\) with dimension \(d=100\). The generating vectors \({\varvec{g}}\) are constructed via the component-by-component digit-by-digit algorithm and the standard CBC construction for polynomial lattice rules for \(N=2^m\), respectively

The results in Fig. 1 show that the CBC-DBD algorithm constructs generating vectors of good polynomial lattice rules which have worst-case errors that are comparable to those of polynomial lattice rules obtained by the fast CBC algorithm. We observe identical asymptotic error rates for both algorithms considered, and also note that the CBC-DBD construction always delivers slightly higher error values. The latter behavior can easily be explained by the fact that the CBC construction is directly tailored to the space \(W_{d,{\varvec{\gamma }}^\alpha }^{\alpha }\) for a particular \(\alpha \) since \(e_{b^m,d,\alpha ,{\varvec{\gamma }}^\alpha }\) is used as the quality measure. In contrast, the CBC-DBD construction is independent of the smoothness parameter \(\alpha \) and constructs polynomial lattices which have a good quality for all \(\alpha > 1\). This in turn also means that, from a theoretical perspective, the CBC-DBD algorithm only needs to be executed once while the CBC construction has to be run for all considered \(\alpha \) in order to obtain theoretically assured error convergence rates. Additionally, we observe that the pre-asymptotic error decay is determined by the weight sequence \({\varvec{\gamma }}= (\gamma _j)_{j\ge 1}\). The faster the weights \(\gamma _j\) decay, the closer the error rate is to the optimal rate of \({\mathcal {O}}(N^{-\alpha })\) for the space \(W_{d,{\varvec{\gamma }}^\alpha }^{\alpha }\).

Remark 7

In the recent article [1], it has been shown that polynomial lattice rules which were constructed for the weighted Walsh space \(W_{d,{\varvec{\gamma }}}^\alpha \) can also achieve the almost optimal worst-case error convergence rate for the space \(W_{d,{\varvec{\gamma }}'}^{\alpha '}\) with different smoothness parameter \(\alpha ' > 1\) and weight sequence \({\varvec{\gamma }}'\), provided that certain conditions on both weight sequences are satisfied. The result in [1], in particular Corollary 3, relies on a favorable relation between the different weight sequences and smoothness parameters and provides a theoretical foundation to use the standard CBC algorithm with quality measure \(e_{b^m,d,\alpha ,{\varvec{\gamma }}}\) to obtain error bounds for related function space settings (and possibly different types of weights). We would like to point out that, in contrast, our algorithm (when run with weights \({\varvec{\gamma }}\)) is independent of \(\alpha \) and delivers QMC rules for which error bounds hold simultaneously for all \(\alpha > 1\).

5.2 Computational Complexity

We demonstrate the computational complexity of Algorithm 2 which was proved in Theorem 9. For this purpose, we measure and compare the computation times of implementations of Algorithm 2 and the standard fast CBC algorithm for polynomial lattice rules with primitive modulus \(p \in {\mathbb {F}}_2[x]\), cf., e.g., [19]. For all timings, we perform five independent measurements and then display either the highest time (Table 1) or the mean (Fig. 2) out of these five runs. We consider multiple values of \(m,d \in {\mathbb {N}}\) and fix the positive weight sequence \({\varvec{\gamma }}= (\gamma _j)_{j \ge 1}\) with \(\gamma _j = 1 / j^2\). Note that the chosen weight sequence does not affect the computation times.

In Table 1, we display the timing results for the two considered algorithms. Furthermore, Fig. 2 provides a graphical illustration of the running times of both algorithms. We remark that the measured times only indicate the duration for the construction of the generating vectors but do not include the calculation of the corresponding worst-case error. All timings were performed on an Intel Core i5 CPU with 2.3 GHz using Python 3.6.3.

Table 1 Computation times (in s) for constructing the generating vector \({\varvec{g}}\) of a polynomial lattice rule with \(2^m\) points in d dimensions using the component-by-component digit-by-digit algorithm (bold font) and the standard fast CBC construction (normal font)

The timings displayed in Table 1 and Fig. 2 confirm that the computational complexity of both algorithms depends on m and d in a similar way, and the measured times are in accordance with Theorem 9. Additionally, the linear dependence of the construction cost on the dimension d is clearly observable. The measured construction times for Algorithm 2 are slightly higher than for the fast CBC algorithm, but in general both algorithms can be executed in comparable time. This is especially remarkable since the fast CBC construction is based on fast Fourier transformations which rely on compiled and optimized code via Python’s Discrete Fourier Transform (numpy.fft) library, while the CBC-DBD construction does not make use of any compiled libraries. Additionally, we remark that the spread between independent timing runs is negligible, see also Fig. 2.

Fig. 2

Mean computation times (in s) for constructing the generating vector \({\varvec{g}}\) of a polynomial lattice rule with \(2^m\) points in \(d \in \{50,2000\}\) dimensions using the component-by-component digit-by-digit algorithm (circles) and the standard fast CBC construction (crosses). The vertical error bars indicate the spread between the independent timing runs

6 Conclusion

In this paper, we presented an algorithm for constructing good polynomial lattice rules for numerical integration in weighted Walsh spaces. In particular, we studied a component-by-component digit-by-digit (CBC-DBD) construction with a quality measure that is independent of the smoothness parameter \(\alpha \), similar to [7], where such an algorithm was analyzed for ordinary lattice rules. The construction algorithm is formulated for the special case of product weights and yields polynomial lattice rules which admit error convergence rates that are arbitrarily close to the optimal convergence order. Furthermore, the proven error bounds become independent of the dimension if the weights satisfy suitable summability conditions. In addition to these theoretical results, we derived a fast implementation of the considered algorithm which exhibits the same computational complexity as the state-of-the-art fast CBC algorithm, but does not rely on the use of fast Fourier transformations (FFTs). The considered algorithm is, to the best of our knowledge, the first construction method for good polynomial lattice rules with modulus \(p(x)=x^m\) that requires only \({\mathcal {O}}(d \,m\, 2^m)\) operations. Extensive numerical experiments illustrated our findings and showed that the considered method is competitive with the standard fast CBC algorithm.