Abstract
We prove a multivariate central limit theorem with explicit error bound in a non-smooth function distance for sums of bounded decomposable \(d\)-dimensional random vectors. The decomposition structure is similar to that of Barbour et al. (J Combin Theory Ser B 47:125–145, 1989) and is more general than the local dependence structure considered in Chen and Shao (Ann Probab 32:1985–2028, 2004). The error bound is of the order \(d^{\frac{1}{4}} n^{-\frac{1}{2}}\), where \(d\) is the dimension and \(n\) is the number of summands. The dependence on \(d\), namely \(d^{\frac{1}{4}}\), is the best known dependence even for sums of independent and identically distributed random vectors, and the dependence on \(n\), namely \(n^{-\frac{1}{2}}\), is optimal. We apply our main result to a random graph example.
1 Introduction
Let \(W=\frac{1}{\sqrt{n}}\sum _{i=1}^n X_i\) be a standardized sum of independent and identically distributed \(d\)-dimensional random vectors such that \({\mathbb {E}}(X_1)=0\), \(\mathop {\mathrm {Cov}}(X_1)=\mathbb {I}_d\), and \({\mathbb {E}}( |X_1|^3)<\infty \), where \(\mathbb {I}_d\) denotes the \(d\)-dimensional identity matrix and \(\vert \cdot \vert \) denotes the Euclidean norm of a vector. Bentkus [5] proved the following bound on a non-smooth function distance between the distribution of \(W\) and the standard \(d\)-dimensional Gaussian distribution:
\[ \sup _{A\in {\mathcal {A}}} \vert {\mathbb {P}}(W\in A)-{\mathbb {P}}(Z\in A)\vert \leqslant C d^{\frac{1}{4}}\, {\mathbb {E}}(|X_1|^3)\, n^{-\frac{1}{2}}, \tag{1.1} \]
where \({\mathcal {A}}\) denotes the collection of all the convex sets in \(\mathrm{I\!R}^d\), \(Z\) is a \(d\)-dimensional standard Gaussian vector, and \(C\) is an absolute constant.
In this paper, we aim to prove a multivariate central limit theorem with an error bound having the same order of magnitude in terms of \(d\) and \(n\) for the same non-smooth function distance as in (1.1), but for general bounded dependent random vectors. The dependence structure we consider is similar to that of Barbour et al. [4] and is more general than the local dependence structure considered in Chen and Shao [10]. The approach we use is the recursive approach in Stein’s method for multivariate normal approximation.
Stein’s method was introduced by Stein [18] for normal approximation and has become a powerful tool in proving distributional approximations. We refer to Barbour and Chen [3] for an introduction to Stein’s method. Stein’s method for multivariate normal approximation was first studied in Götze [13]. Along with an inductive argument for sums of independent random vectors, he proved an error bound whose dependence on the dimension is \(d^{\frac{3}{2}}\) for the distance in (1.1). The dependence on the dimension was improved to \(d^{\frac{1}{4}}\) in Bentkus [5] [cf. (1.1)] and Bentkus [6] by using the same inductive approach and a Lindeberg-type argument. Using the recursive approach in Stein’s method, Rinott and Rotar [17] proved a multivariate normal approximation result for sums of bounded random vectors that allow for a certain decomposition. The error bound they obtained for the distance in (1.1) is typically of the order \(O_d( n^{-\frac{1}{2}} \log n)\) with unspecified dependence on \(d\) and an additional logarithmic factor. Recently, Fang and Röllin [11] proved a multivariate normal approximation result under the general framework of Stein coupling [cf. Chen and Röllin [9]]. For sums of locally dependent bounded random vectors, their bound for the distance in (1.1) is typically of the order \(d^{\frac{7}{4}} n^{-\frac{1}{2}}\). Compared with the existing literature on bounding the non-smooth function distance in (1.1) for multivariate normal approximation for sums of bounded random vectors, our new result not only applies to a decomposition structure more general than local dependence, but also obtains an error bound typically of the same order as for sums of independent and identically distributed random vectors in (1.1).
Stein’s method has also been used to prove smooth function distances for multivariate normal approximation under various dependence structures; see, for example, Goldstein and Rinott [12], Raič [15], Chatterjee and Meckes [7], and Reinert and Röllin [16].
Many problems in random graph counting satisfy the decomposition structure considered in this paper. See, for example, Barbour et al. [4], Janson and Nowicki [14], Avram and Bertsimas [1], and Rinott and Rotar [17]. We will study an example from Rinott and Rotar [17] and show that the error bound obtained by our main theorem is better than that by Rinott and Rotar [17].
The paper is organized as follows. In the next section, we state our main result. In Sect. 3, we study a random graph counting problem. In Sect. 4, we prove our main result. Throughout this article, \(\vert \cdot \vert \) denotes the Euclidean norm of a vector or the cardinality of a set, and \(\mathbb {I}_d\) denotes the \(d\)-dimensional identity matrix. Define \([n]:=\{1,\dots , n\}\) and \(X_N:=\sum _{i\in N} X_i\) for an index set \(N\).
2 Main Results
For a sum of standardized \(d\)-dimensional random vectors \(W=\sum _{i=1}^n X_i\) with a certain decomposition structure, we aim to bound the quantity
\[ d_c\bigl({\mathcal {L}}(W),{\mathcal {L}}(Z)\bigr) := \sup _{A\in {\mathcal {A}}} \vert {\mathbb {P}}(W\in A)-{\mathbb {P}}(Z\in A)\vert , \tag{2.1} \]
where \(Z\) has the standard \(d\)-dimensional Gaussian distribution and \({\mathcal {A}}\) denotes the collection of all the convex sets in \(\mathrm{I\!R}^d\). The following is our main theorem.
Theorem 2.1
Let \(W=\sum _{i=1}^n X_i\) be a sum of \(d\)-dimensional random vectors such that \({\mathbb {E}}(X_i)=0\) and \(\mathop {\mathrm {Cov}}(W)=\mathbb {I}_d\). Suppose \(W\) can be decomposed as follows:
Suppose further that for each \(i\in [n], j\in N_i\) and \(k\in N_{ij}\),
Then there is a universal constant \(C\) such that
where \(d_c\) is defined as in (2.1) and \(Z\) is a \(d\)-dimensional standard Gaussian random vector.
Remark 2.2
Under the conditions of Theorem 2.1 but with \(\mathop {\mathrm {Cov}}(W)=\Sigma \), by considering \(\Sigma ^{-1/2}W\), we have
where \(||\Sigma ^{-1/2}||\) is the operator norm of \(\Sigma ^{-1/2}\).
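This reduction is standard: \(\Sigma ^{-1/2}W\) is again a sum of decomposable random vectors \(\Sigma ^{-1/2}X_i\) with identity covariance, and the boundedness constant grows by at most the operator norm. A sketch of the two facts used, assuming, as in condition (2.5), that \(\vert X_i\vert \leqslant \beta \) almost surely:

```latex
% Rescaling restores the identity covariance:
\operatorname{Cov}\bigl(\Sigma^{-1/2}W\bigr)
  = \Sigma^{-1/2}\,\Sigma\,\Sigma^{-1/2} = \mathbb{I}_d,
% while each rescaled summand satisfies
\bigl|\Sigma^{-1/2}X_i\bigr|
  \le \|\Sigma^{-1/2}\|\,|X_i|
  \le \|\Sigma^{-1/2}\|\,\beta .
```

Applying Theorem 2.1 to \(\Sigma ^{-1/2}W\) then yields the stated bound, since linear images of convex sets are again convex.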
Remark 2.3
The decomposition (2.2) and (2.3) is the same as that of Barbour et al. [4] and, as observed there, is more general than the local dependence structure studied in Chen [8] and later in Chen and Shao [10]. Condition (2.4) is a natural extension of the decomposition of Barbour et al. [4]. We need this extra condition to obtain the bound in (2.6).
Remark 2.4
As mentioned at the beginning of the Introduction, the dependence on the dimension \(d\) in (2.6) is the best known dependence even for sums of independent and identically distributed random vectors. In typical applications, \(\beta \) is of the order \(O(n^{-\frac{1}{2}})\). From the decomposition (2.2)–(2.4), the neighborhood sizes \(n_2\) and \(n_3\) are typically of the same order as \(n_1\). On the other hand, in the local dependence structure studied in Chen [8] and Chen and Shao [10], quantities corresponding to \(n_2\) and \(n_3\) are typically of the order \(O(n_1^2)\) and \(O(n_1^3)\), respectively. Rinott and Rotar [17] proved a bound similar to (2.6) for a different decomposition structure. Their bound does not have explicit dependence on \(d\) and has an additional \(\log n\) term. Recently, Fang and Röllin [11] proved a general multivariate central limit theorem for the non-smooth function distance under the framework of Stein coupling (cf. Chen and Röllin [9]) with boundedness conditions. Although their bound is more widely applicable, it does not yield optimal dependence on \(d\) and does not directly apply to the decomposition structure considered in this paper.
3 An Application
Let \(n\geqslant 2, m\geqslant 1, d\geqslant 2\) be positive integers. Consider a regular graph with \(n\) vertices, each of degree \(m\). Let \(N=nm/2\) be the total number of edges. We color each vertex independently with one of the colors \(c_1,\dots ,c_d\), choosing color \(c_i\) with probability \(\pi _i\), where \(\sum _{i=1}^d \pi _i=1\). For \(i\in \{1,\dots , d\}\), let \(W_i\) be the number of edges whose two endpoints are both of color \(c_i\). Formally, with edges indexed by \(j\in \{1,\dots , N\}\), we set
where \(X_{ji}\) is the indicator of the event that the edge \(j\) connects two vertices both of color \(c_i\). Let
Let \(\lambda ={\mathbb {E}}(W)\), \(\Sigma =\mathop {\mathrm {Cov}}(W)\). It is known that
and [cf. (3.1) of Rinott and Rotar [17]]
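The mean, for instance, follows from the independence of the vertex colors: each edge has both endpoints colored \(c_i\) with probability \(\pi _i^2\), so

```latex
\lambda_i \;=\; \mathbb{E}(W_i)
  \;=\; \sum_{j=1}^{N} \mathbb{E}(X_{ji})
  \;=\; \sum_{j=1}^{N} \pi_i^2
  \;=\; N\pi_i^2,
  \qquad 1 \leqslant i \leqslant d .
```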
We prove the following bound on the non-smooth function distance between the standardized distribution of \(W\) and the standard \(d\)-dimensional Gaussian distribution.
Proposition 3.1
Let \(W\) be defined as in (3.1). Let \(\lambda \) and \(\Sigma \) be the mean and covariance matrix of \(W\), respectively. We have
where \(d_c\) is defined as in (2.1), \(Z\) is a \(d\)-dimensional standard Gaussian vector, \(C\) is an absolute constant, and
Remark 3.2
Rinott and Rotar [17] proved an upper bound for the left-hand side of (3.2) as follows [cf. (3.2) of Rinott and Rotar [17]]:
where \(c_d\) is an unspecified constant depending on \(d\). Compared to (3.3), our bound in (3.2) has explicit dependence on \(d\) and does not have the logarithmic terms.
Proof of Proposition 3.1
Recall that
Observe that at most one of \(\{X_{j1},\dots , X_{jd}\}\) can be nonzero. Together with \(\sum _{i=1}^d \pi _i=1\) and the fact that the elements of \(\Sigma ^{-1/2}\) are bounded in modulus by \(N^{-1/2} L\) (cf. page 339 of Rinott and Rotar [17]), we have
Moreover, the summation \(\sum _{j=1}^N \xi _j\) can be easily seen to satisfy the decomposition structure (2.2)–(2.4) with
By applying Theorem 2.1, we obtain the bound (3.2). \(\square \)
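As an informal illustration of this example (a sketch of our own, not part of the proof; the cycle graph, the color probabilities, and the helper name below are arbitrary choices), one can simulate the monochromatic edge counts on a 2-regular graph and compare the empirical mean of \(W_i\) with \(N\pi _i^2\):

```python
import random

def monochromatic_edge_counts(edges, colors, d):
    # W[i] = number of edges whose two endpoints both have color i
    W = [0] * d
    for u, v in edges:
        if colors[u] == colors[v]:
            W[colors[u]] += 1
    return W

# A cycle on n vertices is 2-regular (m = 2), so N = nm/2 = n edges.
n, d = 10, 3
edges = [(v, (v + 1) % n) for v in range(n)]
pi = [0.5, 0.3, 0.2]  # color probabilities, summing to 1

rng = random.Random(0)
# Color each vertex independently, choosing color i with probability pi[i].
colors = [rng.choices(range(d), weights=pi)[0] for _ in range(n)]
W = monochromatic_edge_counts(edges, colors, d)
# E(W_i) = N * pi_i^2; e.g. E(W_0) = 10 * 0.25 = 2.5 for this cycle.
```

Averaging \(W_0\) over many independent colorings should concentrate near \(N\pi _0^2 = 2.5\) here.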
4 Proof of Main Theorem
For a given test function \(h\), we consider the Stein equation
\[ \Delta f(w) - w\cdot \nabla f(w) = h(w) - {\mathbb {E}}\,h(Z), \tag{4.1} \]
where \(\Delta \) denotes the Laplacian operator, \(\nabla \) denotes the gradient operator, and \(Z\) is a \(d\)-dimensional standard Gaussian vector. If \(h\) is not continuous (as for the indicator function of a convex set), then \(f\) is not smooth enough to apply Taylor’s expansion to the necessary degree, so more refined techniques are necessary.
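The operator on the left-hand side of (4.1) characterizes the standard Gaussian distribution: integrating by parts against the standard normal density \(\varphi \) (using \(\partial _r \varphi (z) = -z_r\varphi (z)\)), any sufficiently smooth \(f\) of moderate growth satisfies

```latex
\mathbb{E}\{ Z_r f_r(Z) \}
  = \int_{\mathbb{R}^d} z_r\, f_r(z)\, \varphi(z)\, dz
  = \int_{\mathbb{R}^d} f_{rr}(z)\, \varphi(z)\, dz
  = \mathbb{E}\{ f_{rr}(Z) \},
  \qquad 1 \le r \le d ,
```

so summing over \(r\) gives \({\mathbb {E}}\{\Delta f(Z) - Z\cdot \nabla f(Z)\} = 0\), matching the fact that the right-hand side of (4.1) has expectation zero when \(W\) is replaced by \(Z\).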
We follow the smoothing technique of Bentkus [5]. Recall that \({\mathcal {A}}\) is the collection of all the convex sets in \(\mathrm{I\!R}^d\). For \(A\in {\mathcal {A}}\), let \(h_A(x)=I_A(x)\), and define the smoothed function
where \(\mathop {\mathrm {dist}}(w,A) = \inf _{v\in A}\vert w-v\vert \) and
Define also
\[ A^{\varepsilon } := \{x\in \mathrm{I\!R}^d: \mathop {\mathrm {dist}}(x,A)\leqslant \varepsilon \} \quad \text{and}\quad A^{-\varepsilon } := \{x\in \mathrm{I\!R}^d: B(x,\varepsilon )\subseteq A\}, \]
where \(B(x,\varepsilon )\) denotes the closed ball of radius \(\varepsilon \) centered at \(x\) [note that in general \((A^{-\varepsilon })^\varepsilon \ne A\)].
We will use the following lemmas in the proof of Theorem 2.1.
Lemma 4.1
(Lemma 2.3 of Bentkus [5]) The function \(h_{A,\varepsilon }\) as defined above has the following properties:
Lemma 4.2
(Ball [2], Bentkus [5]) We have
\[ \sup _{A\in {\mathcal {A}}} \max \bigl\{ {\mathbb {P}}(Z\in A^{\varepsilon }{\setminus } A),\ {\mathbb {P}}(Z\in A{\setminus } A^{-\varepsilon }) \bigr\} \leqslant C d^{\frac{1}{4}} \varepsilon , \tag{4.9} \]
and the dependence on \(d\) in (4.9) is optimal.
Lemma 4.3
(Lemma 4.2 of Fang and Röllin [11]) For any \(d\)-dimensional random vector \(W\),
Lemma 4.4
(Lemma 4.3 of Fang and Röllin [11]) For each map \(a{:}\,\{1,\dots ,d\}^k\rightarrow \mathrm{I\!R}\), we have
where \(\varphi (z)\) is the density of \(d\)-dimensional standard normal distribution and
Now fix \(\varepsilon \) and a convex set \(A\subset \mathrm{I\!R}^d\). It can be verified directly that defining
the function
is a solution to (4.1) (cf. Götze [13]). In what follows, we keep the dependence on \(A\) and \(\varepsilon \) implicit and write \(g=g_{A, \varepsilon }, f=f_{A,\varepsilon }\) and \(h=h_{A,\varepsilon }\). For real-valued functions on \(\mathrm{I\!R}^d\), we write \(f_r(x)\) for \(\partial f(x)/\partial x_r\), \(f_{rs}(x)\) for \(\partial ^2 f(x)/(\partial x_r\partial x_s)\) and so forth. We also write \(g_r(w,\tau )=\partial g(w,\tau )/\partial w_r\) and so on. Moreover, let \(\nabla g(w,\tau )=(g_1(w,\tau ),\dots , g_d(w,\tau ))^t\) and let \(\Delta g(w,\tau )=\sum _{r=1}^d g_{rr}(w,\tau )\).
Using this notation and the integration by parts formula, we have for \(1\leqslant r,s,t\leqslant d\) that
and
Proof of Theorem 2.1
Fix \(A\in {\mathcal {A}}\) and \(\varepsilon >0\) (to be chosen later) and let \(f=f_{A,\varepsilon }\) be the solution to the Stein equation (4.1) corresponding to \(h=h_{A,\varepsilon }\) as defined by (4.2). Let
To avoid confusion, we will always use \(r,s,t\) to index the components of \(d\)-dimensional vectors. Define
By assumption (2.2) and because \({\mathbb {E}}(X_i)=0\), we have
Here and subsequently, we use \((X)_r\) to denote the \(r\)th component of a vector \(X\). By assumption (2.2) and \(\mathop {\mathrm {Cov}}(W)=\mathbb {I}_d\), we have
\[ \delta _{rs} = {\mathbb {E}}(W_r W_s) = \sum _{i=1}^n {\mathbb {E}}\{ (X_i)_r W_s \} = \sum _{i=1}^n \sum _{j\in N_i} {\mathbb {E}}\{ (X_i)_r (X_j)_s \}, \]
where \(\delta \) denotes the Kronecker delta and where the last equality holds because, by (2.2), \(W-X_{N_i}\) is independent of \(X_i\) and \({\mathbb {E}}(X_i)=0\). This implies that
Adding and subtracting the corresponding terms and using (2.3), we have
Taking \(g(w,\tau )=g_{A,\varepsilon }(w,\tau )\) in (4.12), it follows from (4.1) and (4.13) that
In the following, we will first give an upper bound for \(|\int _0^1 R_1(\tau ) \hbox {d}\tau |\) and then argue that an upper bound for \(|\int _0^1 R_2(\tau ) \hbox {d}\tau |\) can be derived similarly.
To estimate \(\int _0^1 R_1(\tau ) \hbox {d}\tau \), we consider the cases \(\varepsilon ^2< \tau \leqslant 1\) and \(0< \tau \leqslant \varepsilon ^2\) separately. For the first case, we use the first expression of \(g_{rs}(w,\tau )\) in (4.14), Taylor’s expansion
and the integration by parts formula and get
where \(U\) and \(V\) are independent random variables distributed uniformly on \([0,1]\). By writing \(h(\sqrt{1-\tau }W +\sqrt{\tau }z-\sqrt{1-\tau }UVX_{N_i})\) as a sum of differences and using the independence assumption (2.4), we have
where
where \(Z\) is an independent \(d\)-dimensional standard Gaussian random vector.
By the properties of \(h\) in (4.4) and (4.5), the boundedness condition (2.5), and the independence assumption (2.4),
By the boundedness condition (2.5), the definition of \(\kappa \) in (4.16) and (4.9),
By the Cauchy–Schwarz inequality, (4.11) and the boundedness condition (2.5),
Therefore, using \(\int _{\varepsilon ^2}^1 \frac{1}{\tau ^{3/2}}\hbox {d}\tau \leqslant C \frac{1}{\varepsilon }\),
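The integral bound used here is elementary:

```latex
\int_{\varepsilon^2}^{1} \tau^{-3/2}\, d\tau
  = \Bigl[\, -2\tau^{-1/2} \Bigr]_{\varepsilon^2}^{1}
  = 2\bigl( \varepsilon^{-1} - 1 \bigr)
  \le \frac{2}{\varepsilon} .
```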
Here and in the remainder of the proof, \(C\) denotes an absolute constant, which may differ from line to line.
By the same argument, \(|R_{1,2}|\) has the same upper bound as \(|R_{1,1}|\).
By the properties of \(h\) in (4.4) and (4.5), the definition of \(\kappa \) in (4.16), and (4.9),
By the same lower bound and a similar argument as for \(R_{1,1}\), we can bound \(R_{1,3}\) by
Using the first expression of \(g_{rst}(w,\tau )\) in (4.15), we have
Observe that from (4.12),
Differentiating with respect to \(w_r, w_s\), and \(w_t\) and evaluating at \(w=0\), we obtain
For the case \(0< \tau \leqslant \varepsilon ^2\), we use the second expression of \(g_{rs}(w,\tau )\) in (4.14) and the Taylor expansion
where we recall that \(U\) and \(V\) are independent random variables distributed uniformly on \([0,1]\). By (4.4), (4.5), (4.8), and (2.5),
Much as in (4.18),
Together with (2.2), (2.5), and (4.11), we obtain
and hence, from (4.19), (4.20), and (4.21),
Now we turn to bounding \(|\int _0^1 R_2(\tau ) \hbox {d}\tau |\). Much as for \(R_1(\tau )\),
where \(U\) is an independent random variable distributed uniformly on \([0,1]\). By the same arguments used in bounding \(|\int _0^1 R_1(\tau )\hbox {d}\tau |\), we have
By (4.10), (4.17), (4.22), and (4.23),
The final bound (2.6) is obtained by choosing \(\varepsilon =2C n \beta ^3 n_1 n_2\) for the same \(C\) as in (4.24), solving the recursive inequality (4.24) and observing that \(d\leqslant n \beta ^2 n_1\) from \(\mathop {\mathrm {Cov}}(W)=\mathbb {I}_d\) and (2.5). \(\square \)
References
Avram, F., Bertsimas, D.: On central limit theorems in geometrical probability. Ann. Appl. Probab. 3, 1033–1046 (1993)
Ball, K.: The reverse isoperimetric problem for Gaussian measure. Discrete Comput. Geom. 10, 411–420 (1993)
Barbour, A.D., Chen, L.H.Y.: An introduction to Stein’s method. Lecture Notes Series 4, Institute for Mathematical Sciences, National University of Singapore, Singapore University Press and World Scientific (2005)
Barbour, A.D., Karoński, M., Ruciński, A.: A central limit theorem for decomposable random variables with applications to random graphs. J. Combin. Theory Ser. B 47, 125–145 (1989)
Bentkus, V.: On the dependence of the Berry-Esseen bound on dimension. J. Stat. Plan. Inference 113, 385–402 (2003)
Bentkus, V.: A Lyapunov type bound in \({ R}^d\). Theory Probab. Appl. 49, 311–323 (2005)
Chatterjee, S., Meckes, E.: Multivariate normal approximation using exchangeable pairs. ALEA Lat Am. J. Probab. Math. Stat. 4, 257–283 (2008)
Chen, L.H.Y.: Two central limit problems for dependent random variables. Z. Wahrsch. Verw. Gebiete 43, 223–243 (1978)
Chen, L.H.Y., Röllin, A.: Stein couplings for normal approximation. Preprint. Available at arXiv:1003.6039 (2010)
Chen, L.H.Y., Shao, Q.-M.: Normal approximation under local dependence. Ann. Probab. 32, 1985–2028 (2004)
Fang, X., Röllin, A.: Rates of convergence for multivariate normal approximation with applications to dense graphs and doubly indexed permutation statistics. To appear in Bernoulli. arXiv:1206.6586 (2014)
Goldstein, L., Rinott, Y.: Multivariate normal approximations by Stein’s method and size bias couplings. J. Appl. Probab. 33, 1–17 (1996)
Götze, F.: On the rate of convergence in the multivariate CLT. Ann. Probab. 19, 724–739 (1991)
Janson, S., Nowicki, K.: The asymptotic distributions of generalized \(U\)-statistics with applications to random graphs. Probab. Theory Relat. Fields 90, 341–375 (1991)
Raič, M.: A multivariate CLT for decomposable random vectors with finite second moments. J. Theor. Probab. 17, 573–603 (2004)
Reinert, G., Röllin, A.: Multivariate normal approximation with Stein’s method of exchangeable pairs under a general linearity condition. Ann. Probab. 37, 2150–2173 (2009)
Rinott, Y., Rotar, V.: A multivariate CLT for local dependence with \(n^{-1/2}\log n\) rate and applications to multivariate graph related statistics. J. Multivar. Anal. 56, 333–350 (1996)
Stein, C.: A bound for the error in the normal approximation to the distribution of a sum of dependent random variables. In: Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability, vol. 2, pp. 583–602. University of California Press, Berkeley (1972)
Acknowledgments
This work is based on part of the author’s Ph.D. thesis. He is grateful to his advisor, Louis H. Y. Chen, for his guidance. He would like to thank Adrian Röllin for helpful discussions. He would also like to thank the referees for their helpful comments and suggestions which have significantly improved the presentation of this paper. This paper was written under the financial support of NUS-Overseas Postdoctoral Fellowship from the National University of Singapore.
Fang, X. A Multivariate CLT for Bounded Decomposable Random Vectors with the Best Known Rate. J Theor Probab 29, 1510–1523 (2016). https://doi.org/10.1007/s10959-015-0619-7
Keywords
- Stein’s method
- Multivariate normal approximation
- Non-smooth function distance
- Rate of convergence
- Random graph counting