1 Introduction

Most existing methods for sensitivity analysis (SA) rely on special assumptions about the behavior of the model (such as linearity, monotonicity and additivity of the relationship between model input and model output) [22]. Such assumptions are applicable to a large range of mathematical models. At the same time there are models that contain significant nonlinearities and/or stiffness, for which assumptions of linearity and additivity do not hold. This is especially true for nonlinear systems of partial differential equations. The numerical study reported in this paper has been carried out with a large-scale mathematical model called the Unified Danish Eulerian Model (UNI-DEM) [33, 34]. The model enables us to study the transport of air pollutants and other species over a large geographical region. The system of partial differential equations describes the main physical processes, such as advection, diffusion and deposition, as well as chemical and photochemical processes between the studied species. The emissions and the quickly changing meteorological conditions are also described. The nonlinearity of the equations is mainly introduced when modeling chemical reactions [33]. If the model results are sensitive to a given process, that process can be described mathematically in a more adequate way or measured more precisely. Thus, the goal of our study is to increase the reliability of the results produced by the model, to identify processes that must be studied more carefully, and to find input parameters that need to be measured with a higher precision. A careful sensitivity analysis is also needed in order to decide where and how the model can be simplified. That is why it is important to develop and study adequate and reliable methods for sensitivity analysis. A good candidate for reliable sensitivity analysis of models containing nonlinearity is the variance-based method [22]. The idea of this approach is to estimate how the variation of an input parameter or a group of inputs contributes to the variance of the model output. As a measure in this analysis we use the total sensitivity indices (TSI) (see Sect. 2), expressed as multidimensional integrals:

$$\displaystyle{ I =\int _{\Omega }g(\mathrm{x})\,p(\mathrm{x})\,\mathrm{d}\mathrm{x},\qquad \Omega \subset {\mathbf{R}}^{d}, }$$
(1)

where g(x) is a square integrable function in Ω and p(x) ≥ 0 is a probability density function (p.d.f.), such that \(\int _{\Omega }p(\mathrm{x})\,\mathrm{d}\mathrm{x} = 1\).

Clearly, the progress in the area of sensitivity analysis is closely related to the progress in reliable algorithms for multidimensional integration.

2 Problem Setting

2.1 Modeling and Sensitivity

Assume that the mathematical model can be presented as a function

$$\displaystyle{ \mathrm{u} = f(\mathrm{x}),\quad \mbox{ where}\quad \mathrm{x} = (x_{1},x_{2},\ldots,x_{d}) \in {U}^{d} \equiv {[0;1]}^{d} }$$
(2)

is the vector of input parameters with a joint p.d.f. \(p(\mathrm{x}) = p(x_{1},\ldots,x_{d})\). Assume also that the input variables are independent (noncorrelated) and the density function p(x) is known, even if the \(x_{i}\) are not actually random variables (r.v.). The TSI of an input parameter \(x_{i},\ i \in \{ 1,\ldots,d\}\), is defined in the following way [9, 26]:

$$\displaystyle{ S_{i}^{tot} = S_{ i} +\sum _{l_{1}\neq i}S_{il_{1}} +\sum _{l_{1},l_{2}\neq i,l_{1}<l_{2}}S_{il_{1}l_{2}} +\ldots +S_{il_{1}\ldots l_{d-1}}, }$$
(3)

where \(S_{i}\) is called the main effect (first-order sensitivity index) of \(x_{i}\) and \(S_{il_{1}\ldots l_{j-1}}\) is the j-th order sensitivity index. The higher-order terms describe the interaction effects of the input parameters \(x_{i_{1}},\ldots,x_{i_{\nu }},\ \nu \in \{ 2,\ldots,d\}\), on the output variance.

The method of global SA used in this work is based on a decomposition of an integrable model function f in the d-dimensional factor space into terms of increasing dimensionality [26]:

$$\displaystyle{ f(\mathrm{x}) = f_{0} +\sum _{ \nu =1}^{d}\sum _{ l_{1}<\ldots <l_{\nu }}f_{l_{1}\ldots l_{\nu }}(x_{l_{1}},x_{l_{2}},\ldots,x_{l_{\nu }}), }$$
(4)

where \(f_{0}\) is a constant. The representation (4) is referred to as the ANOVA representation of the model function f(x) if each term satisfies the following condition [26]:

$$\displaystyle{\int _{0}^{1}f_{ l_{1}\ldots l_{\nu }}(x_{l_{1}},x_{l_{2}},\ldots,x_{l_{\nu }})\mathrm{d}x_{l_{k}} = 0,\quad 1 \leq k \leq \nu,\quad \nu = 1,\ldots,d.}$$
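For instance (an illustration we add here, following directly from the condition above), in the case d = 2 the ANOVA terms are obtained by successive integration:

$$\displaystyle{f_{0} =\int _{{U}^{2}}f(\mathrm{x})\,\mathrm{d}\mathrm{x},\quad f_{1}(x_{1}) =\int _{0}^{1}f(x_{1},x_{2})\,\mathrm{d}x_{2} - f_{0},\quad f_{2}(x_{2}) =\int _{0}^{1}f(x_{1},x_{2})\,\mathrm{d}x_{1} - f_{0},}$$

$$\displaystyle{f_{12}(x_{1},x_{2}) = f(x_{1},x_{2}) - f_{0} - f_{1}(x_{1}) - f_{2}(x_{2}),}$$

and each term indeed integrates to zero over each of its own variables.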

Note that using the whole right-hand side of (4) does not make the problem simpler. The hope is that a truncated sequence \(f_{0} +\sum _{ \nu =1}^{d_{tr}}\sum _{l_{ 1}<\ldots <l_{\nu }}f_{l_{1}\ldots l_{\nu }}(x_{l_{1}},x_{l_{2}},\ldots,x_{l_{\nu }})\), where \(d_{tr} < d\) (or even \(d_{tr} \ll d\)), can serve as a good approximation to the model function f.

The quantities

$$\displaystyle{ \mathbf{D} =\int _{{U}^{d}}{f}^{2}(\mathrm{x})\mathrm{d}\mathrm{x} - f_{ 0}^{2},\quad \mathbf{D}_{ l_{1}\ \ldots \ l_{\nu }} =\int f_{l_{1}\ \ldots \ l_{\nu }}^{2}\mathrm{d}x_{ l_{1}}\ldots \mathrm{d}x_{l_{\nu }} }$$
(5)

are the so-called total and partial variances, respectively, obtained after squaring and integrating the equality (4) over \({U}^{d}\), on the assumption that f(x) is a square integrable function (thus, all terms in (4) are also square integrable). Therefore, the total variance of the model output is split into partial variances in a way analogous to the model function, via the unique ANOVA decomposition: \(\mathbf{D} =\sum _{ \nu =1}^{d}\sum _{l_{1}<\ldots <l_{\nu }}\mathbf{D}_{l_{1}\ldots l_{\nu }}.\) The use of probability theory concepts rests on the assumption that the input parameters are random variables distributed in \({U}^{d}\), which makes \(f_{l_{1}\ \ldots \ l_{\nu }}(x_{l_{1}},x_{l_{2}},\ldots,x_{l_{\nu }})\) random variables as well, with variances (5). For example, \(f_{l_{1}}\) is represented by a conditional expectation, \(f_{l_{1}}(x_{l_{1}}) = \mathbf{E}(\mathrm{u}\vert x_{l_{1}}) - f_{0}\), and, respectively, \(\mathbf{D}_{l_{1}} = \mathbf{D}[f_{l_{1}}(x_{l_{1}})] = \mathbf{D}[\mathbf{E}(\mathrm{u}\vert x_{l_{1}})].\) Based on these assumptions about the model function and the output variance, the quantities

$$\displaystyle{ S_{l_{1}\ \ldots \ l_{\nu }} = \frac{\mathbf{D}_{l_{1}\ \ldots \ l_{\nu }}} {\mathbf{D}},\quad \nu \in \{ 1,\ldots,d\} }$$
(6)

are referred to as the global sensitivity indices [26]. From the formulas (5)–(6) it is clear that the mathematical treatment of the problem of providing global sensitivity analysis consists in evaluating total sensitivity indices (3) of the corresponding order, which in turn leads to computing multidimensional integrals of the form (1). It means that to obtain \(S_{i}^{tot}\) in general, one needs to compute \({2}^{d}\) (or \({2}^{d_{tr}}\), with \(d_{tr} \ll d\)) integrals of type (5).

The procedure for computing global sensitivity indices (see [26]) is based on the following representation of the variance:

$$\displaystyle{ \mathbf{D}_{\mathrm{y}} =\int f(\mathrm{x})\,f(\mathrm{y},\mathrm{z}^{\prime})\,\mathrm{d}\mathrm{x}\,\mathrm{d}\mathrm{z}^{\prime} - f_{0}^{2}, }$$
(7)

where \(\mathrm{y} = (x_{k_{1}},\ldots,x_{k_{m}}),\ 1 \leq k_{1} <\ldots < k_{m} \leq d,\) is an arbitrary set of m variables (1 ≤ m ≤ d − 1), z is the set of the d − m complementary variables, i.e. x = (y, z), and z′ denotes an independent sample of the complementary variables. The equality (7) enables the construction of a Monte Carlo algorithm for evaluating \(f_{0}\), D and \(\mathbf{D}_{\mathrm{y}}\):

$$\displaystyle{\begin{array}{ll} \frac{1} {n}\ \sum _{j=1}^{n}\ f(\xi _{ j}){ P \atop \rightarrow } f_{0}, &\qquad \frac{1} {n}\ \sum _{j=1}^{n}\ f(\xi _{ j})\ f(\eta _{j},\zeta ^{\prime}_{j}){ P \atop \rightarrow } \mathbf{D}_{\mathrm{y}} + f_{0}^{2}, \\ \frac{1} {n}\ \sum _{j=1}^{n}\ {f}^{2}(\xi _{ j}){ P \atop \rightarrow } \mathbf{D} + f_{0}^{2},&\qquad \frac{1} {n}\ \sum _{j=1}^{n}\ f(\xi _{ j})\ f(\eta ^{\prime}_{j},\zeta _{j}){ P \atop \rightarrow } \mathbf{D}_{\mathrm{z}} + f_{0}^{2}, \end{array} }$$

where ξ = (η, ζ) and ξ′ = (η′, ζ′) are independent random samples, and η corresponds to the input subset denoted by y.
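To make the double-sample construction concrete, the following minimal Python sketch (our illustration, not the authors' code; the test function and sample size are assumptions) combines the estimators above to approximate \(S_{\mathrm{y}} = \mathbf{D}_{\mathrm{y}}/\mathbf{D}\):

```python
import numpy as np

def sobol_index_mc(f, d, y_idx, n, rng=None):
    """Sketch of the double-sample estimators for f0, D and D_y.
    f: vectorized model mapping an (n, d) array to an (n,) array;
    y_idx: indices of the input subset y. Returns an estimate of S_y."""
    rng = rng or np.random.default_rng(0)
    xi = rng.random((n, d))            # xi = (eta, zeta)
    xi2 = rng.random((n, d))           # independent sample (eta', zeta')
    mixed = xi2.copy()
    mixed[:, y_idx] = xi[:, y_idx]     # (eta, zeta'): keep y, resample z
    fxi = f(xi)
    f0 = fxi.mean()                    # (1/n) sum f(xi_j)        -> f0
    D = (fxi ** 2).mean() - f0 ** 2    # (1/n) sum f^2(xi_j)      -> D + f0^2
    Dy = (fxi * f(mixed)).mean() - f0 ** 2  # mixed product       -> D_y + f0^2
    return Dy / D

# Assumed additive test function, for which S_{x1} = 9/10 exactly
g = lambda x: 3.0 * x[:, 0] + x[:, 1]
print(sobol_index_mc(g, d=2, y_idx=[0], n=200_000))
```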

Instead of randomized (Monte Carlo) algorithms for computing the above sensitivity parameters, one can use deterministic quasi-Monte Carlo (QMC) algorithms or randomized QMC [13, 14]. Randomized (Monte Carlo) algorithms have proven to be very efficient for multidimensional integrals over composite domains [3, 23]. At the same time, QMC based on well-distributed Sobol sequences can be considered a good alternative to Monte Carlo algorithms, especially for smooth integrands and not very high effective dimensions (up to d = 15) [12]. Sobol Λ Π τ sequences are good candidates for efficient QMC algorithms. Algorithms based on Λ Π τ sequences, while being deterministic, mimic the pseudorandom sequences used in Monte Carlo integration. One of the problems with Λ Π τ sequences is that they may have bad two-dimensional projections; bad here means that the distribution of the points is far from uniform. If such projections are used in a certain computational problem, the lack of uniformity may provoke a substantial loss of accuracy. To overcome this problem randomized QMC can be used. There are several ways of randomization, and scrambling is one of them. The original motivation of scrambling [10, 19] is to obtain more uniformity for quasi-random sequences in high dimensions, which can be checked via two-dimensional projections. Another way of randomization is to shake the quasi-random points according to some procedure. Actually, the scrambled algorithms obtained by shaking the quasi-random points can be considered as Monte Carlo algorithms with a special choice of the density function; it is a matter of definition. Thus, there is a reason to compare the two classes of algorithms: deterministic and randomized.
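As an illustration of the two alternatives (a sketch of ours; the scipy.stats.qmc module is an assumed tooling choice, not the software used in this paper), both a deterministic Sobol sequence and a scrambled randomization of it can be generated as follows:

```python
from scipy.stats import qmc

det = qmc.Sobol(d=4, scramble=False).random_base2(m=10)         # deterministic points
rnd = qmc.Sobol(d=4, scramble=True, seed=42).random_base2(m=10) # randomized (scrambled) QMC
# Both arrays contain 2**10 points in [0,1)^4; re-seeding the scrambled
# generator yields independent randomizations for error estimation.
```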

3 Complexity in Classes of Algorithms

One may pose the task of considering and comparing two classes of algorithms: deterministic algorithms and randomized (Monte Carlo) algorithms. Let I be the desired value of the integral. Assume that for a given r.v. θ one can prove that the mathematical expectation satisfies Eθ = I. Suppose that the mean value of n realizations \({\theta }^{(i)},\ i = 1,\ldots,n\), of θ is taken as a Monte Carlo approximation to the solution: \(\bar{\theta }_{n} = 1/n\sum _{i=1}^{n}{\theta }^{(i)} \approx I.\) In general, a randomized algorithm produces the result with a given probability error. So, dealing with randomized algorithms one has to accept that the result of the computation is true only with a certain (although high) probability. In most practical computations it is reasonable to accept an error estimate that holds with a probability smaller than 1.

Consider the following integration problem:

$$\displaystyle{ S(f) := I =\int _{{U}^{d}}f(\mathrm{x})d\mathrm{x}, }$$
(8)

where \(\mathrm{x} \equiv (x_{1},\ldots,x_{d}) \in {U}^{d} \subset {\mathbf{R}}^{d}\) and \(f \in C({U}^{d})\) is an integrable function on \({U}^{d}\). The computational problem can be considered as a mapping of a function \(f :{ [0,1]}^{d} \rightarrow \mathbf{R}\) to the real number \(S(f) =\int _{{U}^{d}}f(\mathrm{x})d\mathrm{x}\), where \(f \in F_{0} \subset C({U}^{d})\). We refer to S as the solution operator. The elements of \(F_{0}\) are the data for which the problem has to be solved, and for \(f \in F_{0}\), S(f) is the exact solution. For a given f, we want to compute S(f) exactly or approximately. One may be interested in cases when the integrand f has higher regularity, because in many practical computations f is smooth and has high-order bounded derivatives. If this is the case, then it is reasonable to try to exploit such smoothness. To be able to do that we define the functional class \(F_{0} \equiv {\mathbf{W}}^{k}(\|f\|;{U}^{d})\) in the following way:

Definition 3.1.

Let d and k be integers, d, k ≥ 1. We consider the class \({\mathbf{W}}^{k}(\|f\|;{U}^{d})\) (sometimes abbreviated to W k) of real functions f defined over the unit cube U d = [0, 1)d, possessing all the partial derivatives \(\frac{{\partial }^{r}f(\mathrm{x})} {\partial x_{1}^{\alpha _{1}}\ldots \partial x_{d}^{\alpha _{d}}},\,\,\,\alpha _{1} +\ldots +\alpha _{d} = r \leq k,\) which are continuous when r < k and bounded in sup norm when r = k. The seminorm \(\left \|\cdot \right \|\) on W k is defined as

$$\displaystyle{\left \|f\right \| =\sup \left \{\left \vert \frac{{\partial }^{k}f(\mathrm{x})} {\partial x_{1}^{\alpha _{1}}\ldots \partial x_{d}^{\alpha _{d}}}\right \vert,\,\,\,\,\alpha _{1} +\ldots +\alpha _{d} = k,\,\,\,\mathrm{x} \equiv (x_{1},\ldots,x_{d}) \in {U}^{d}\right \}.}$$

We keep the seminorm \(\|f\|\) in the notation for the functional class \({\mathbf{W}}^{k}(\|f\|;{U}^{d})\) since it is important for our further considerations. We call a quadrature formula any expression of the form

$$\displaystyle{{A}^{D}(f,n) =\sum _{ i=1}^{n}c_{ i}f(\mathrm{{x}}^{(i)}),}$$

which approximates the value of the integral S(f). The real numbers \(c_{i} \in \mathbf{R}\) are called weights and the d-dimensional points \(\mathrm{{x}}^{(i)} \in {U}^{d}\) are called nodes. It is clear that for fixed weights \(c_{i}\) and nodes \(\mathrm{{x}}^{(i)} \equiv (x_{i,1},\ldots,x_{i,d})\), the quadrature formula A D(f, n) may be used to define an algorithm with an integration error \(err(f,{A}^{D}) \equiv \int _{{U}^{d}}f(\mathrm{x})d\mathrm{x} - {A}^{D}(f,n)\). We call a randomized quadrature formula any formula of the following kind: \({A}^{R}(f,n) =\sum _{ i=1}^{n}\sigma _{i}f{(\xi }^{(i)}),\) where σ i and ξ (i) are random weights and nodes, respectively. The algorithm A R(f, n) belongs to the class of randomized (Monte Carlo) algorithms, denoted by \({\mathcal{A}}^{\mathcal{R}}\).

Definition 3.2.

Given a randomized (Monte Carlo) integration formula for the functions from the space W k, we define the integration error

$$\displaystyle{err(f,{A}^{R}) \equiv \int _{{ U}^{d}}f(\mathrm{x})d\mathrm{x} - {A}^{R}(f,n)}$$

by the probability error \(\varepsilon _{P}(f)\) in the sense that \(\varepsilon _{P}(f)\) is the least possible real number, such that

$$\displaystyle{Pr\left (\left \vert err(f,{A}^{R})\right \vert <\varepsilon _{ P}(f)\right ) \geq P,}$$

and the mean square error

$$\displaystyle{r(f) ={ \left \{E\left [er{r}^{2}(f,{A}^{R})\right ]\right \}}^{1/2}.}$$

We assume that it suffices to obtain an \(\varepsilon _{P}(f)\)-approximation to the solution with a probability 0 < P < 1. If we allow equality, i.e. 0 < P ≤ 1, in Definition 3.2, then \(\varepsilon _{P}(f)\) can be used as an accuracy measure for both randomized and deterministic algorithms. In this way it is consistent to consider a wider class \(\mathcal{A}\) of algorithms that contains both randomized and deterministic algorithms.

Definition 3.3.

  Consider the set \(\mathcal{A}\) of algorithms A:

$$\displaystyle{\mathcal{A} =\{ A : Pr(\vert err(f,A)\vert \leq \varepsilon ) \geq c\},\ \ \ A \in \{ {A}^{D},{A}^{R}\},\ \ 0 < c < 1}$$

that solve a given problem with an integration error e r r(f, A).

In such a setting it is correct to compare randomized algorithms with algorithms based on low-discrepancy sequences like Sobol Λ Π τ sequences.

4 The Algorithms

The algorithms we study are based on Sobol Λ Π τ sequences.

4.1 Λ Π τ Sobol Sequences

Λ Π τ sequences are uniformly distributed sequences (u.d.s.). The term u.d.s. was introduced by Hermann Weyl in 1916 [30]. For practical purposes a u.d.s. should satisfy the following three requirements [23, 25]: (i) the best asymptotic behavior as n → ∞, (ii) well-distributed points for small n and (iii) a computationally inexpensive algorithm.

All Λ Π τ sequences given in [25] satisfy the first requirement. Suitable distributions such as Λ Π τ sequences are also called (t, m, s)-nets and (t, s)-sequences in base b ≥ 2. To introduce them, define first an elementary s-interval in base b as a subset of \({U}^{s}\) of the form \(E =\prod _{ j=1}^{s}\left [ \frac{a_{j}} {{b}^{d_{j}}}, \frac{a_{j}+1} {{b}^{d_{j}}} \right ],\) where \(a_{j},d_{j} \geq 0\) are integers and \(a_{j} < {b}^{d_{j}}\) for all j ∈ { 1, …, s}. Given two integers 0 ≤ t ≤ m, a (t, m, s)-net in base b is a set of \({b}^{m}\) points \(\mathrm{{x}}^{(i)} \in {U}^{s}\) such that \(Card\ E \cap \{\mathrm{ {x}}^{(1)},\ldots,\mathrm{{x}}^{({b}^{m}) }\} = {b}^{t}\) for any elementary interval E in base b of hypervolume \(\lambda (E) = {b}^{t-m}\). Given a non-negative integer t, a (t, s)-sequence in base b is an infinite sequence of points x(i) such that for all integers k ≥ 0, m ≥ t, the set \(\{\mathrm{{x}}^{(k{b}^{m}) },\ldots,\mathrm{{x}}^{((k+1){b}^{m}-1) }\}\) is a (t, m, s)-net in base b.
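To give a small concrete example (ours, not from the cited sources): in base b = 2, the four points \(\{0,\,1/2,\,1/4,\,3/4\}\) form a (0, 2, 1)-net, since every elementary interval of length \({2}^{0-2} = 1/4\) contains exactly \({2}^{0} = 1\) of them.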

Sobol [23] defines his Π τ -meshes and Λ Π τ sequences, which are (t, m, s)-nets and (t, s)-sequences in base 2, respectively. The terms (t, m, s)-nets and (t, s)-sequences in base b (also called Niederreiter sequences) were introduced in 1988 by Niederreiter [18].

To generate the j-th component of the points in a Sobol sequence, we choose a primitive polynomial of some degree \(s_{j}\) over the Galois field of two elements, GF(2): \(P_{j} = {x}^{s_{j}} + a_{1,j}{x}^{s_{j}-1} + a_{2,j}{x}^{s_{j}-2} +\ldots +a_{s_{ j}-1,j}x + 1,\) where the coefficients \(a_{1,j},\ldots,a_{s_{j}-1,j}\) are either 0 or 1. A sequence of positive integers \(\{m_{1,j},m_{2,j},\ldots \}\) is defined by the recurrence relation

$$\displaystyle{m_{k,j} = 2a_{1,j}m_{k-1,j} \oplus {2}^{2}a_{ 2,j}m_{k-2,j} \oplus \ldots \oplus {2}^{s_{j} }m_{k-s_{j},j} \oplus m_{k-s_{j},j},}$$

where ⊕ is the bit-by-bit exclusive-or operator. The values \(m_{1,j},\ldots,m_{s_{j},j}\) can be chosen freely provided that each \(m_{k,j}\), \(1 \leq k \leq s_{j}\), is odd and less than \({2}^{k}\). Therefore, it is possible to construct different Sobol sequences for a fixed dimension s. In practice, these numbers must be chosen very carefully to obtain really efficient Sobol sequence generators [27]. The so-called direction numbers \(\{v_{1,j},v_{2,j},\ldots \}\) are defined by \(v_{k,j} = \frac{m_{k,j}} {{2}^{k}}\). Then the j-th component of the i-th point in a Sobol sequence is given by \(x_{i,j} = i_{1}v_{1,j} \oplus i_{2}v_{2,j}\oplus \ldots,\) where \(i_{k}\) is the k-th binary digit of \(i = (\ldots i_{3}i_{2}i_{1})_{2}\). Subroutines to compute these points can be found in [2, 24]. The work [15] contains more details.
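To make the construction concrete, the following minimal Python sketch (ours, using the direct binary-expansion form rather than the faster Gray-code implementations of [2, 24]) generates one coordinate of a Sobol sequence from a given primitive polynomial and initial direction integers:

```python
def sobol_coordinate(n_points, s, a, m_init, nbits=30):
    """Sketch: one coordinate of a Sobol sequence from a degree-s primitive
    polynomial over GF(2) with coefficients a = [a_1, ..., a_{s-1}] and
    initial odd direction integers m_init = [m_1, ..., m_s], m_k < 2**k."""
    m = list(m_init)
    for k in range(s, nbits):              # recurrence for m_k (0-indexed here)
        new = m[k - s] ^ (m[k - s] << s)   # 2^s m_{k-s}  XOR  m_{k-s}
        for q in range(1, s):
            if a[q - 1]:
                new ^= m[k - q] << q       # 2^q a_q m_{k-q}
        m.append(new)
    # direction numbers v_k = m_k / 2^k, kept as integers scaled by 2^nbits
    v = [m[k] << (nbits - k - 1) for k in range(nbits)]
    points = []
    for i in range(n_points):
        x, k = 0, 0
        while i >> k:
            if (i >> k) & 1:               # k-th binary digit of i
                x ^= v[k]
            k += 1
        points.append(x / 2.0 ** nbits)
    return points

# x^2 + x + 1 (s = 2, a_1 = 1) with m_1 = 1, m_2 = 3 reproduces the classical
# second Sobol coordinate: 0, 1/2, 3/4, 1/4, ...
print(sobol_coordinate(4, s=2, a=[1], m_init=[1, 3]))
```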

4.2 The Monte Carlo Algorithms Based on Modified Sobol Sequences: MCA-MSS

One of the algorithms based on a shaking procedure was proposed recently in [6]. The idea is the following: take a Sobol Λ Π τ point (vector) x of dimension d and consider it as the center of a sphere with a radius ρ < 1. Then take a random point ξ ∈ U d uniformly distributed on the sphere, and consider a random variable θ defined as the value of the integrand at that random point, i.e. θ = f(ξ). Consider random points \({\xi }^{(i)}(\rho ) \in {U}^{d},\ i = 1,\ldots,n\), with \({\xi }^{(i)}(\rho ) =\mathrm{ {x}}^{(i)} {+\rho \omega }^{(i)}\), where \({\omega }^{(i)}\) is a unit vector uniformly distributed on the sphere. The radius ρ is relatively small, \(\rho \ll \frac{1} {{2}^{d_{j}}}\), such that \({\xi }^{(i)}(\rho )\) stays in the same elementary i-th interval \(E_{i}^{d} =\prod _{ j=1}^{d}\left [\frac{a_{j}^{(i)}} {{2}^{d_{j}}}, \frac{a_{j}^{(i)}+1} {{2}^{d_{j}}} \right ]\) that contains the pattern Λ Π τ point x(i). We use the subscript i in \(E_{i}^{d}\) to indicate that the i-th Λ Π τ point x(i) lies in it. So, we assume that if \(\mathrm{{x}}^{(i)} \in E_{i}^{d}\), then \({\xi }^{(i)}(\rho ) \in E_{i}^{d}\) too.

It was proven in [6] that the mathematical expectation of the random variable θ = f(ξ) is equal to the value of the integral (8), that is, \(\mathbf{E}\theta = S(f) =\int _{{U}^{d}}f(\mathrm{x})d\mathrm{x}.\) This result allows one to define a randomized algorithm: take the Sobol Λ Π τ point x(i) and shake it slightly, where shaking means defining random points \({\xi }^{(i)}(\rho ) =\mathrm{ {x}}^{(i)} {+\rho \omega }^{(i)}\) according to the procedure described above. For brevity this algorithm is abbreviated as MCA-MSS-1.
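A minimal sketch of the shaking step follows (our reading of the procedure; the clipping to the unit cube, instead of the containment check discussed later, and the use of SciPy's Sobol generator are assumptions):

```python
import numpy as np
from scipy.stats import qmc

def mca_mss_1(f, d, n, rho, seed=0):
    """Sketch of MCA-MSS-1: shake each Sobol point by radius rho along a
    direction uniformly distributed on the unit sphere, then average the
    integrand values (estimates the integral of f over [0,1]^d)."""
    rng = np.random.default_rng(seed)
    x = qmc.Sobol(d=d, scramble=False).random(n)           # pattern points
    omega = rng.standard_normal((n, d))
    omega /= np.linalg.norm(omega, axis=1, keepdims=True)  # uniform on sphere
    xi = np.clip(x + rho * omega, 0.0, 1.0)  # assumption: clip rather than
                                             # re-check the elementary interval
    return f(xi).mean()

# Toy integrand (our choice): the exact integral over [0,1]^4 is 1.0
f = lambda x: np.prod(2.0 * x, axis=1)
print(mca_mss_1(f, d=4, n=2**10, rho=1e-3))
```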

The probability error of the algorithm MCA-MSS-1 was analysed in [7]. It was proved that for integrands with continuous and bounded first derivatives, i.e. f ∈ W 1(L; U d), where \(L =\| f\|\), it holds that

$$\displaystyle{err(f,d) \leq c_{d}^{{}^{{\prime}} }\left \|f\right \|{n}^{{}^{-\frac{1} {2} -\frac{1} {d} } }\quad \mbox{ and}\quad r(f,d) \leq c_{d}^{{}^{{\prime\prime}} }\left \|f\right \|{n}^{{}^{-\frac{1} {2} -\frac{1} {d} } },}$$

where the constants \(c_{d}^{{}^{{\prime}} }\) and \(c_{d}^{{}^{{\prime\prime}} }\) do not depend on n.

In this work a modification of algorithm MCA-MSS-1 is proposed and analysed. The new algorithm will be called MCA-MSS-2.

It is assumed that \(n = {m}^{d},\ m \geq 1\). The unit cube \({U}^{d}\) is divided into \({m}^{d}\) disjoint subdomains that coincide with the elementary d-dimensional subintervals defined in Sect. 4.1: \({U}^{d} =\bigcup _{ j=1}^{{m}^{d} }K_{j},\,\,\,\mathrm{where}\,\,\,K_{j} =\prod _{ i=1}^{d}[a_{i}^{(j)},b_{i}^{(j)}),\) with \(b_{i}^{(j)} - a_{i}^{(j)} = \frac{1} {m}\) for all i = 1, …, d.

In this way each d-dimensional subdomain \(K_{j}\) contains exactly one Λ Π τ point x(j). Assuming that after shaking the random point stays inside \(K_{j}\), i.e. \({\xi }^{(j)}(\rho ) =\mathrm{ {x}}^{(j)} {+\rho \omega }^{(j)} \in K_{j}\), one may try to exploit the smoothness of the integrand if it belongs to W 2(L; U d).

If p(x) is a p.d.f. such that \(\int _{{U}^{d}}p(\mathrm{x})d\mathrm{x} = 1\), then

$$\displaystyle{\int _{K_{j}}p(\mathrm{x})d\mathrm{x} = p_{j} \leq \frac{c_{1}^{(j)}} {n},}$$

where c 1 (j) are constants. If d j is the diameter of K j , then

$$\displaystyle{d_{j} =\sup _{x_{1},x_{2}\in K_{j}}\vert x_{1} - x_{2}\vert \leq \frac{c_{2}^{(j)}} {{n}^{1/d}},}$$

where \(c_{2}^{(j)}\) are other constants.

In the particular case when all subintervals have edge 1 ∕ m, we have \(c_{1}^{(j)} = 1\) and \(c_{2}^{(j)} = \sqrt{d}\) for all j. In each subdomain \(K_{j}\) the central point is denoted by s(j), where \(\mathrm{{s}}^{(j)} = (s_{1}^{(j)},s_{2}^{(j)},\ldots,s_{d}^{(j)})\).

Suppose two random points \({\xi }^{(j)}\) and \({\xi }^{(j)^{\prime}}\) are chosen, such that \({\xi }^{(j)}\) is selected by the procedure used in MCA-MSS-1. The second point \({\xi }^{(j)^{\prime}}\) is chosen to be symmetric to \({\xi }^{(j)}\) with respect to the central point s (j) of the cube \(K_{j}\). In this way the total number of random points is \(2{m}^{d}\). One may calculate all function values \(f{(\xi }^{(j)})\) and \(f{(\xi }^{(j)^{\prime}})\), for j = 1, …, m d, and approximate the value of the integral in the following way:

$$\displaystyle{ I(f) \approx \frac{1} {2{m}^{d}}\sum _{j=1}^{{m}^{d}}\left [f{(\xi }^{(j)}) + f{(\xi }^{(j)^{\prime}})\right ]. }$$
(9)

This estimate corresponds to MCA-MSS-2. We prove below that this algorithm has an optimal rate of convergence for functions with bounded second derivatives, i.e. for functions f ∈ W 2(L; U d), while the algorithm MCA-MSS-1 has an optimal rate of convergence for functions with bounded first derivatives, f ∈ W 1(L; U d).

One can prove the following:

Theorem 1.

The quadrature formula (9) constructed above for integrands f from \({\mathbf{W}}^{2}(L;{U}^{d})\) satisfies

$$\displaystyle{err(f,d) \leq \tilde{ c}_{d}^{\ {\prime}}\left \|f\right \|{n}^{{}^{-\frac{1} {2} -\frac{2} {d} } }}$$

and

$$\displaystyle{r(f,d) \leq \tilde{ c}_{d}^{\ {\prime\prime}}\left \|f\right \|{n}^{{}^{-\frac{1} {2} -\frac{2} {d} } },}$$

where the constants \(\tilde{c}_{d}^{\ {\prime}}\) and \(\tilde{c}_{d}^{\ {\prime\prime}}\) do not depend on n.

Proof.

One can see that

$$\displaystyle{\mathbf{E}\left \{ \frac{1} {2{m}^{d}}\sum _{j=1}^{{m}^{d}}\left [f{(\xi }^{(j)}) + f{(\xi }^{(j)^{\prime}})\right ]\right \} =\int _{{ U}^{d}}f(\mathrm{x})d\mathrm{x}.}$$

For the fixed Λ Π τ point x(j) ∈ K j one can use the d-dimensional Taylor formula to expand the function f(x(j)) in \(K_{j}\) around the central point s(j). Since f ∈ W 2(L; U d), there exists a d-dimensional point \({\eta }^{(j)} \in K_{j}\) lying between x(j) and s(j) such that

$$\displaystyle\begin{array}{rcl} f(\mathrm{{x}}^{(j)}) = f(\mathrm{{s}}^{(j)})& +& \nabla f(\mathrm{{s}}^{(j)})\ (\mathrm{{x}}^{(j)} -\mathrm{ {s}}^{(j)}) \\ & +& \frac{1} {2}{(\mathrm{{x}}^{(j)} -\mathrm{ {s}}^{(j)})}^{T}\ [{D}^{2}f{(\eta }^{(j)})](\mathrm{{x}}^{(j)} -\mathrm{ {s}}^{(j)}),{}\end{array}$$
(10)

where \(\nabla f(\mathrm{x}) = \left [\frac{\partial f(\mathrm{x})} {\partial x_{1}},\ldots, \frac{\partial f(\mathrm{x})} {\partial x_{d}} \right ]\) and \([{D}^{2}f(\mathrm{x})] = \left [ \frac{{\partial }^{2}f(\mathrm{x})} {\partial x_{i}\partial x_{k}}\right ]_{i,k=1}^{d}.\) For simplicity the superscript (j) of the argument is omitted in what follows, assuming that the formulas are written for the j-th cube \(K_{j}\). Now we can write formula (10) at the previously defined random points ξ and ξ ′, both belonging to \(K_{j}\). In this way we have

$$\displaystyle{ f(\xi ) = f(\mathrm{s}) + \nabla f(\mathrm{s})\ (\xi -\mathrm{s}) + \frac{1} {2!}{(\xi -\mathrm{s})}^{T}[{D}^{2}f(\eta )](\xi -\mathrm{s}), }$$
(11)
$$\displaystyle{ f(\xi ^{\prime}) = f(\mathrm{s}) + \nabla f(\mathrm{s})\ (\xi ^{\prime} -\mathrm{ s}) + \frac{1} {2!}{(\xi ^{\prime} -\mathrm{ s})}^{T}[{D}^{2}f(\eta ^{\prime})](\xi ^{\prime} -\mathrm{ s}), }$$
(12)

where η ′ is another d-dimensional point lying between ξ ′ and s. Adding (11) and (12), we get

$$\displaystyle\begin{array}{rcl} f(\xi ) + f(\xi ^{\prime}) = 2f(\mathrm{s})& +& \frac{1} {2}\ \left \{{(\xi -\mathrm{s})}^{T}[{D}^{2}f(\eta )](\xi -\mathrm{s})\right. + {}\\ & +& {(\xi ^{\prime} -\mathrm{ s})}^{T}[{D}^{2}f(\eta ^{\prime})]\left.(\xi ^{\prime} -\mathrm{ s})\right \}. {}\\ \end{array}$$

Because of the symmetry, the terms depending on the gradient ∇f(s) cancel in the previous formula. If we consider the variance D[f(ξ) + f(ξ ′)], taking into account that the variance of the constant 2f(s) is zero, we get

$$\displaystyle{\begin{array}{lll} \mathbf{D}[f(\xi )&+ &f(\xi ^{\prime})] =\\ \\ & =&\mathbf{D}\left \{\frac{1} {2}\left [{(\xi -\mathrm{s})}^{T}[{D}^{2}f(\eta )](\xi -\mathrm{s}) + {(\xi ^{\prime} -\mathrm{ s})}^{T}[{D}^{2}f(\eta ^{\prime})](\xi ^{\prime} -\mathrm{ s})\right ]\right \} \\ \\ & \leq &\mathbf{E}{\left \{\frac{1} {2}\left [{(\xi -\mathrm{s})}^{T}[{D}^{2}f(\eta )](\xi -\mathrm{s}) + {(\xi ^{\prime} -\mathrm{ s})}^{T}[{D}^{2}f(\eta ^{\prime})](\xi ^{\prime} -\mathrm{ s})\right ]\right \}}^{2}. \end{array} }$$

Since f ∈ W 2(L; U d), we can bound the last expression from above by substituting the seminorm L for the matrices [D 2 f(η)] and [D 2 f(η ′)], and the squared diameter of the subdomain \(K_{j}\) for the products \({(\xi -\mathrm{s})}^{T}\ (\xi -\mathrm{s})\) and \({(\xi ^{\prime} -\mathrm{ s})}^{T}\ (\xi ^{\prime} -\mathrm{ s})\). Now we return to the notation with superscripts, taking into account that the above consideration holds for an arbitrary subdomain \(K_{j}\). The variance can be estimated from above in the following way:

$$\displaystyle\begin{array}{rcl} \mathbf{D}[f(\xi ) + f(\xi ^{\prime})]& \leq & {L}^{2}\sup _{ x_{1}^{(j)},x_{2}^{(j)}}{\left \vert x_{1}^{(j)} - x_{ 2}^{(j)}\right \vert }^{4} \leq {L}^{2}{(c_{ 2}^{(j)})}^{4}{n}^{-4/d}. {}\\ \end{array}$$

Now the variance of \(\theta _{n} =\sum _{ j=1}^{n}{\theta }^{(j)}\) can be estimated:

$$\displaystyle\begin{array}{rcl} \mathbf{D}\theta _{n} =\sum _{ j=1}^{n}p_{ j}^{2}\mathbf{D}[f(\xi ) + f(\xi ^{\prime})]& \leq & \sum _{ j=1}^{n}{(c_{ 1}^{(j)})}^{2}{n}^{-2}{L}^{2}{(c_{ 2}^{(j)})}^{4}{n}^{-4/d} \\ & \leq &{ \left (Lc_{1}^{(j)}c_{ 2}^{(j)2}\right )}^{2}{n}^{-1-4/d}. {}\end{array}$$
(13)

Therefore, \(r(f,d) \leq \widetilde{ c}_{d}^{\ {\prime\prime}}\left \|f\right \|{n}^{{}^{-\frac{1} {2} -\frac{2} {d} } }.\) The application of Tchebycheff’s inequality to the variance (13) yields

$$\displaystyle{\varepsilon (f,d) \leq \widetilde{ c}_{d}^{\ {\prime}}\left \|f\right \|{n}^{{}^{-\frac{1} {2} -\frac{2} {d} } }}$$

for the probability error \(\varepsilon\), where \(\tilde{c}_{d}^{\ {\prime}} = \sqrt{2d}\), which concludes the proof.

One can see that the Monte Carlo algorithm MCA-MSS-2 has an optimal rate of convergence for functions with continuous and bounded second derivatives [3]. This means that the rate of convergence \({n}^{-\frac{1} {2} -\frac{2} {d} }\) cannot be improved for the functional class W 2 within the class \({\mathcal{A}}^{\mathcal{R}}\) of randomized algorithms.

Note that both MCA-MSS-1 and MCA-MSS-2 have one control parameter, namely the radius ρ of the sphere of shaking. At the same time, using this control parameter efficiently increases the computational complexity: after shaking, the random point may leave the multidimensional subdomain, so after each shake one should check whether the random point is still in the same subdomain. Such a check is computationally expensive when the number of points is large. A small modification of the MCA-MSS-2 algorithm allows one to overcome this difficulty. If we simply generate a random point \({\xi }^{(j)} \in K_{j}\) uniformly distributed inside \(K_{j}\) and then take the point \({\xi }^{(j)^{\prime}}\) symmetric to it with respect to the central point s (j), this procedure simulates the algorithm MCA-MSS-2, but with a different radius ρ in each subdomain. We call this algorithm MCA-MSS-2-S, because the approach resembles stratified symmetrised Monte Carlo; a sketch is given below. Obviously, MCA-MSS-2-S is less expensive than MCA-MSS-2, but there is no control parameter like the radius ρ; the radius can be considered as randomly chosen in each subdomain \(K_{j}\).
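The following Python sketch (our illustration of the stratified symmetrised variant; the toy integrand is an assumption) implements the estimator (9) with \(n = {m}^{d}\) strata:

```python
import numpy as np

def mca_mss_2s(f, d, m, seed=0):
    """Sketch of MCA-MSS-2-S: one uniform random point per cubic subdomain
    K_j of edge 1/m, plus its reflection through the subdomain center s^(j);
    returns the symmetrised average over all 2*m**d integrand values."""
    rng = np.random.default_rng(seed)
    # lower corners of all m**d subdomains K_j
    grid = np.stack(np.meshgrid(*[np.arange(m)] * d, indexing="ij"), -1)
    corners = grid.reshape(-1, d) / m
    xi = corners + rng.random(corners.shape) / m   # uniform inside K_j
    centers = corners + 0.5 / m
    xi_sym = 2.0 * centers - xi                    # reflection through s^(j)
    return 0.5 * (f(xi).mean() + f(xi_sym).mean())

# Toy smooth integrand (our choice): the exact integral over [0,1]^4 is 1.0
f = lambda x: np.prod(2.0 * x, axis=1)
print(mca_mss_2s(f, d=4, m=8))
```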

It is important to notice that all three algorithms MCA-MSS-1, MCA-MSS-2 and MCA-MSS-2-S have optimal (unimprovable) rates of convergence for the corresponding functional classes: MCA-MSS-1 is optimal in W 1(L; U d), and both MCA-MSS-2 and MCA-MSS-2-S are optimal in W 2(L; U d).

We also consider the known Owen nested scrambling algorithm [19], for which the proven rate of convergence is \({n}^{-3/2}{(\log n)}^{(d-1)/2}\); this is very good but still not optimal even for integrands in W 1(L; U d). If the logarithmic factor in the estimate could be omitted, the rate would become optimal. Let us mention that it is still not proven whether the above estimate is exact, that is, we do not know if the logarithm can be omitted. It should be mentioned that the proven convergence rate for the Owen nested scrambling algorithm significantly improves the rate for unscrambled nets, which is \({n}^{-1}{(\log n)}^{d-1}\). That is why it is important to compare our MCA-MSS algorithms numerically with Owen nested scrambling. The idea of Owen nested scrambling is based on randomization of a single digit at each iteration. Let \({x}^{(i)} = (x_{i,1},x_{i,2},\ldots,x_{i,s}),\ i = 1,\ldots,n\), be quasi-random numbers in \({[0,1)}^{s}\), and let \({z}^{(i)} = (z_{i,1},z_{i,2},\ldots,z_{i,s})\) be the scrambled version of the point x (i). Suppose that each \(x_{i,j}\) can be represented in base b as \(x_{i,j} = (0.x_{i1,j}\ x_{i2,j}\ldots x_{iK,j}\ldots )_{b}\), with K being the number of digits to be scrambled. Then nested scrambling as proposed by Owen [19, 20] can be defined as follows: \(z_{i1,j} =\pi _{\bullet }(x_{i1,j})\) and \(z_{il,j} =\pi _{\bullet x_{i1,j}x_{i2,j}\ldots x_{il-1,j}}(x_{il,j})\), with independent permutations \(\pi _{\bullet x_{i1,j}x_{i2,j}\ldots x_{il-1,j}}\) for l ≥ 2. Of course, a (t, m, s)-net remains a (t, m, s)-net under nested scrambling. However, nested scrambling requires \({b}^{l-1}\) permutations to scramble the l-th digit. Owen (nested) scrambling, which can be applied to all (t, s)-sequences, is powerful; however, from the implementation point of view, nested scrambling with its path-dependent permutations requires a considerable amount of bookkeeping and leads to a more problematic implementation. There are various versions of scrambling methods based on digital permutation, and the differences among them lie in the definitions of the π l ’s. These include Owen nested scrambling [19, 20], Tezuka’s generalized Faure sequences [29] and Matousek’s linear scrambling [17].
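A compact sketch of nested scrambling in base 2 follows (our illustration, not the implementation used in this paper; in base 2 each permutation of {0, 1} reduces to an XOR with a random bit, and the bit used for digit l is indexed by the l − 1 preceding original digits):

```python
import random

def owen_scramble_base2(points, nbits=30, seed=0):
    """Sketch of Owen nested scrambling for one coordinate in base 2.
    flips maps a prefix of original digits x_{i1}...x_{i,l-1} to the random
    bit realizing the permutation applied to the next digit; the mapping is
    shared by all points, so the net structure is preserved."""
    rng = random.Random(seed)
    flips = {}
    out = []
    for x in points:
        xi = int(x * (1 << nbits)) & ((1 << nbits) - 1)
        prefix, z = (), 0
        for l in range(nbits):
            d = (xi >> (nbits - 1 - l)) & 1
            if prefix not in flips:               # one permutation per prefix
                flips[prefix] = rng.getrandbits(1)
            z = (z << 1) | (d ^ flips[prefix])    # z_{il} = pi_prefix(x_{il})
            prefix += (d,)
        out.append(z / float(1 << nbits))
    return out

print(owen_scramble_base2([0.0, 0.5, 0.25, 0.75], nbits=8, seed=1))
```

The dictionary of prefixes grows with the number of scrambled digits, mirroring the \({b}^{l-1}\) permutations and the bookkeeping burden mentioned above.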

5 Case Study: Variance-Based Sensitivity Analysis of the Unified Danish Eulerian Model

The input data for the sensitivity analysis performed in this paper has been obtained from runs of a large-scale mathematical model for remote transport of air pollutants (UNI-DEM, [33]). The model enables us to study the concentration variations in time of a large number of air pollutants and other species over a large geographical region (4,800 × 4,800 km), covering the whole of Europe, the Mediterranean and parts of Asia and Africa. Such studies are important for environmental protection, agriculture and health care. The model, presented as a system of partial differential equations, describes the main processes in the atmosphere, including photochemical processes between the studied species, the emissions and the quickly changing meteorological conditions. Both the nonlinearity and the stiffness of the equations are mainly introduced when modeling chemical reactions [33]. The chemical scheme used in the model is the well-known condensed CBM-IV (Carbon Bond Mechanism). Thus, the motivation for choosing UNI-DEM is that it is one of the atmospheric chemistry models in which the chemical processes are taken into account in a very accurate way.

This large and complex task is not suitable for direct numerical treatment. For the purpose of numerical solution it is split into submodels representing the main physical and chemical processes. Sequential splitting [16] is used in the production version of the model, although other splitting methods have also been considered and implemented in some experimental versions [4, 5]. Spatial and time discretization makes each of these submodels a huge computational task, challenging even for the most powerful supercomputers available nowadays. That is why parallelization has always been a key point in the computer implementation of DEM since its very early stages.

Our main aim here is to study the sensitivity of the ozone concentration with respect to variations in the rates of some chemical reactions. We consider the chemical rates to be input parameters and the concentrations of pollutants to be output parameters.

6 Numerical Results and Discussion

Several numerical experiments are performed to study experimentally various properties of the algorithms. We are interested in both smooth and non-smooth integrands. The reason to consider both cases is that we deal with many different output functions of the UNI-DEM model. Formally the output functions should be sufficiently smooth, because the solution has bounded second derivatives by definition. Nevertheless, some functions of concentrations that depend on photochemical reactions in the air exhibit computational irregularities: the derivative of the function is very large in absolute value, which causes computational difficulties, and the function behaves as a non-smooth function.

Based on the theoretical results, the expectation is that for non-smooth functions the MCA-MSS algorithms based on shaking procedures outperform QMC even for relatively low dimensions. It is also interesting to observe how the randomized QMC based on scrambled Sobol sequences behaves.

For our numerical tests we use the following non-smooth integrand:

$$\displaystyle{ f_{1}(x_{1},x_{2},x_{3},x_{4}) =\sum _{ i=1}^{4}\vert {(x_{ i} - 0.8)}^{-1/3}\vert, }$$
(14)

for which even the first derivative does not exist. Such kinds of integrands also appear in some important problems of financial mathematics. The referent value of the integral S(f 1) is approximately equal to 7.22261. For comparison we also consider an integral with a smooth integrand:

$$\displaystyle{ f_{2}(x_{1},x_{2},x_{3},x_{4}) = x_{1}\ x_{2}^{2}\ {\mathbf{e}}^{x_{1}x_{2} }\sin x_{3}\cos x_{4}. }$$
(15)

The second integrand (15) is a function \(f_{2} \in {C}^{\infty }({U}^{d})\) with a referent value of the integral S(f 2) approximately equal to 0.10897. The integration domain in both cases is \({U}^{4} = {[0,1]}^{4}\).
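Both referent values can be verified directly (a short check we add here). By symmetry each term of (14) contributes the same convergent improper integral, so

$$\displaystyle{S(f_{1}) = 4\int _{0}^{1}\vert x - 0.8{\vert }^{-1/3}\,\mathrm{d}x = 4 \cdot \frac{3} {2}\left [{(0.8)}^{2/3} + {(0.2)}^{2/3}\right ] \approx 7.22261,}$$

while the separable structure of (15) gives

$$\displaystyle{S(f_{2}) = (3 -\mathrm{ e})(1 -\cos 1)\sin 1 \approx 0.10897.}$$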

Some results from the numerical integration tests with the smooth (15) and the non-smooth (14) integrand are presented in Tables 1 and 2, respectively. As measures of the efficiency of the algorithms, both the relative error (defined as the absolute error divided by the referent value) and the computational time are shown. For generating Sobol quasi-random sequences, the algorithm with Gray code implementation [1] and the sets of direction numbers proposed by Joe and Kuo [11] are used. The MCA-MSS-1 algorithm [6] involves generating random points uniformly distributed on a sphere with radius ρ. One of the best available random number generators, the SIMD-oriented Fast Mersenne Twister (SFMT) [21, 32], a 128-bit pseudorandom number generator of period \({2}^{19937} - 1\), has been used to generate the required random points. The SFMT generator provides a very efficient implementation of the plain Monte Carlo method [23]. The radius ρ depends on the integration domain, the number of samples and the minimal distance δ between the Sobol deterministic points. We observed experimentally that the behavior of the relative error of the numerical integration is significantly influenced by the fixed radius of the spheres. That is why the values of the radius ρ are presented according to the number of samples n used in our experiments, as well as to a fixed radius coefficient \(\kappa =\rho /\delta\). The latter parameter gives the ratio of the radius to the minimal distance between Sobol points. The code for the scrambled quasi-random sequences used in our studies is taken from the NAG C Library [31]. This implementation of scrambled quasi-random sequences is based on TOMS Algorithm 823 [10]. In the implementation of the scrambling there is a choice of three methods: the first is a restricted form of Owen scrambling [19], the second is based on the method of Faure and Tezuka [8] and the last combines the first two (it is referred to as a combined approach).

Table 1 Relative error and computational time for numerical integration of a smooth function (S(f 2) ≈ 0.10897)
Table 2 Relative error and computational time for numerical integration of a non-smooth function (S(f 1) ≈ 7.22261)

Random points for the MCA-MSS-1 algorithm have been generated using the original Sobol sequences and modeling a random direction in d-dimensional space. The computational time of the calculations with pseudorandom numbers generated by SFMT (see the columns labeled SFMT and MCA-MSS in Tables 1 and 2) has been estimated over all 10 algorithm runs.

Comparing the results in Tables 1 and 2 one observes that:

  • All algorithms under consideration are efficient and converge with the expected rate of convergence.

  • In the case of smooth functions, the Sobol algorithm is better than SFMT (the relative error is up to 10 times smaller than for SFMT).

  • The scrambled QMC and MCA-MSS-1 are much better than the classical Sobol algorithm; in many cases even the simplest shaking algorithm MCA-MSS-1 gives a higher accuracy than the scrambled algorithm.

  • In the case of non-smooth functions, the SFMT algorithm implementing the plain Monte Carlo method is better than the Sobol algorithm for relatively small sample sizes n.

  • In the case of non-smooth functions, our Monte Carlo shaking algorithm MCA-MSS-1 gives results similar to those of the scrambled QMC; for several values of n, MCA-MSS-1 shows an advantage in terms of accuracy.

  • Both MCA-MSS-1 and the scrambled QMC are better than SFMT and the Sobol QMC algorithm in the case of non-smooth functions.

Another observation is that for the chosen integrands the scrambling algorithm does not outperform the algorithm with the original Sobol points; however, the scrambled algorithm and the Monte Carlo algorithm MCA-MSS-1 are more stable with respect to the relative error for relatively small values of n.

In Table 3 we compare the Sobol QMC algorithm with MCA-MSS-2 and MCA-MSS-2-S, as well as with the simplest shaking algorithm MCA-MSS-1. The results show that MCA-MSS-1 gives relative errors similar to those of the Sobol QMC algorithm, which is expected since the Λ Π τ Sobol sequences are already quite well distributed; one should not expect improvement for a very smooth integrand. But the symmetrised shaking algorithm MCA-MSS-2 improves the relative error. This improvement rests on the fact that the second derivatives of the integrand exist and are bounded, and the construction of the MCA-MSS-2 algorithm gives a better convergence rate of order \(O({n}^{-1/2-2/d})\). The algorithm MCA-MSS-2-S has the same convergence rate, but it does not allow one to control the value of the radius of shaking. As expected, MCA-MSS-2-S gives better results than MCA-MSS-1. The relative errors obtained by MCA-MSS-2 and MCA-MSS-2-S are of the same magnitude (see Table 3). The advantage of MCA-MSS-2-S is that its computational complexity is much smaller. A comparison of the relative error and the computational complexity for different values of n is presented in Table 4. To have a fair comparison we again consider the smooth function (15). The observation is that the MCA-MSS-2-S algorithm outperforms the simplest shaking algorithm MCA-MSS-1 in terms of both relative error and complexity.

Table 3 Relative error and computational time for numerical integration of a smooth function (S(f) ≈ 0.10897)
Table 4 Relative error and computational time for numerical integration of a smooth function (S(f) ≈ 0.10897) (comparison between MCA-MSS-1 and MCA-MSS-2-S algorithms)

After testing the algorithms under consideration on the smooth and non-smooth functions, we studied the efficiency of the algorithms on real-life functions obtained after running UNI-DEM. Polynomials of 4th degree with 35 unknown coefficients are used to approximate the mesh functions containing the model outputs.

We use various values for the number of points, corresponding to situations when one needs to compute the sensitivity measures with different accuracy. We have computed results for \(g_{0}\) (\(g_{0}\) is the integral of the integrand \(g(x) = f(x) - c\), where f(x) is the approximate model function of UNI-DEM and c is a constant obtained as a Monte Carlo estimate of \(f_{0}\) [28]), for the total variance D, as well as for the total sensitivity indices \(S_{i}^{tot},\ i = 1,2,3\). These parameters are presented in Table 5 for a relatively low sample size n = 6,600.

One can notice that for most of the sensitivity parameters the simplest shaking algorithm MCA-MSS-1 outperforms, in terms of accuracy, the scrambled Sobol sequences as well as the algorithm based on the Λ Π τ Sobol sequences. For higher sample sizes this effect is even stronger.

Table 5 Relative error (in absolute value) and computational time for estimation of sensitivity indices of input parameters using various Monte Carlo and quasi-Monte Carlo approaches (n = 6,600, c ≈ 0.51365, δ ≈ 0.08)

One can clearly observe that the simplest shaking algorithm MCA-MSS-1 based on modified Sobol sequences improves the error estimates for non-smooth integrands. For smooth functions the modified algorithms MCA-MSS-2 and MCA-MSS-2-S give better results than MCA-MSS-1. Even for relatively large radii ρ the results are good in terms of accuracy. The reason is that the centers of the spheres are very well uniformly distributed by definition, so even for large shaking radii the generated random points remain well distributed. We should stress that for a relatively low number of points (< 1,000) the algorithm based on modified Sobol sequences gives results with a high accuracy.

7 Conclusions

A comprehensive theoretical and experimental study of the Monte Carlo algorithm MCA-MSS-2, based on symmetrised shaking of Sobol sequences, has been presented. The algorithm combines properties of two of the best available approaches: Sobol QMC integration and a high-quality SFMT pseudorandom number generator. It has been proven that this algorithm has an optimal rate of convergence, in terms of both the probability error and the mean square error, for functions with continuous and bounded second derivatives.

A comparison with the scrambling approach, as well as with the Sobol QMC algorithm and the algorithm using the SFMT generator, has been provided for numerical integration of smooth and non-smooth integrands. The algorithms mentioned above are also tested numerically for computing sensitivity measures for the UNI-DEM model, studying the sensitivity of the ozone concentration with respect to variations of the chemical reaction rates. All algorithms under consideration are efficient and converge with the expected rate of convergence. It is important to notice that the Monte Carlo algorithm MCA-MSS-2, based on modified Sobol sequences with symmetrised shaking, has an unimprovable rate of convergence and gives reliable numerical results.