Abstract
Motivated by a classical result in the independent identically distributed (i.i.d.) case for a pair of random variables X, Y, we look for a simple sufficient condition, allowing for possible dependence between X and Y, under which the ratios of the components X, Y to their sum are equal in distribution. Our finding extends easily to random vectors of higher (\(n \geq 2\)) dimension, showing that exchangeability of a finite sequence \(X_1, \cdots, X_n\) suffices to guarantee the desired result. Any Archimedean copula can be used as a generator of such random vectors. Our main result is applicable in many Bayesian contexts, where the observations are conditionally i.i.d. given an environmental variable with a prior.
6.1 Introduction and Summary
One often comes across problems where the assumption that the probabilities of male and female births are each equal to 50 % is questioned. This question can be phrased in terms of the human sex ratio X : Y (currently about 101 males to 100 females; CIA Fact Book, 2013), asking whether the corresponding proportions being the same amounts to the corresponding distributions being identical. In this context, X and Y are taken to be nonnegative random variables. If X and Y are independent identically distributed (i.i.d.), it is well known that the ratios \(X/(X+Y)\) and \(Y/(X+Y)\) are equal in distribution. This prompts the question: if we remove the assumption of mutual independence of X and Y, can the equidistribution of these ratios still hold, and under what reasonable conditions? In what follows, we explore some general answers to this question. We show that if X and Y merely have the same distribution, then \(\frac{X}{X+Y}\) need not have the same distribution as \(\frac{Y}{X+Y}\), and we identify sufficient conditions for an affirmative answer. An extension of our main result to n-dimensional random vectors \((X_1, \cdots, X_n)\), \(n \geq 2\), is also indicated.
Generically, the cumulative distribution function (c.d.f.) of a random vector \((X,Y)\) is denoted by \(F_{X,Y}\) and its probability density function (p.d.f.), when it exists, by \(f_{X,Y}\). For higher dimensional random vectors \((X_1, \cdots, X_n)\), \(n \geq 2\), \(F_{X_1, \cdots, X_n}\) and \(f_{X_1, \cdots, X_n}\) correspondingly denote the c.d.f. and p.d.f., respectively. We use \(\stackrel{d}{=}\) to denote equality in distribution of random variables (r.v.s).
6.2 Counterexample
We show a counterexample to demonstrate that \(X \stackrel{d}{=} Y\) does not guarantee equality of distribution of the ratios \(\frac{X}{X+Y}\) and \(\frac{Y}{X+Y}\). For this purpose, we use a suitable joint density of \((X,Y)\), which we construct via the standard normal density \(\phi\).
Consider the joint density function on \(R^2 = (-\infty, \infty) \times (-\infty, \infty)\) given by
$$f_{X,Y}(x,y) = \phi(x)\,\phi(y)\left[1 + xy\,\phi(x)\,\phi^2(y)\right].$$
To see that \(f_{X,Y}\) is a valid joint density, observe that \(\phi(x) < 1\) and that \(|x\phi(x)| < 1\) (because \(\frac{x^2}{2\pi} < \exp(x^2)\)). This in turn gives \(1 + xy\phi(x)\phi^2(y) > 0\), and since the mean of a scaled standard normal random variable is zero, \(f_{X,Y}\) is a valid density whose marginals are both standard normal. Hence, X and Y have the same distribution. We will now derive the density of \(V = \frac{Y}{X+Y}\) and then show that the densities of V and \(1-V = \frac{X}{X+Y}\) are not the same. Let \(W = X\), so that \(Y = \frac{VW}{1-V}\). The absolute value of the Jacobian is \(\frac{|w|}{(1-v)^2}\). Hence, the joint density \(f_{W,V}\) of \((W,V)\) on the \(R^2\) plane is given by
$$f_{W,V}(w,v) = f_{X,Y}\!\left(w,\, \frac{vw}{1-v}\right) \frac{|w|}{(1-v)^2},$$
which simplifies to,
In the above joint density, we integrate out the variable w to get the marginal density of V. A closed form for the density of V can be obtained using the facts that if N is a normal random variable with mean zero and variance \(\sigma_N^2\), then \(E|N| = \sqrt{\frac{2}{\pi}}\sigma_N\) and \(E|N|^3 = 2\sqrt{\frac{2}{\pi}}\sigma_N^3\). Hence, the density of V is given by
Clearly, \(f_V(v) \neq f_V (1-v)\), and the latter is the density of \(U:=\frac{X}{X+Y}\). The two ratios U and V are not equal in distribution.
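The validity argument above can be checked numerically. The sketch below assumes the joint density has the form \(f_{X,Y}(x,y) = \phi(x)\phi(y)[1 + xy\,\phi(x)\phi^2(y)]\), i.e., the form consistent with the correction term appearing in the validity argument; it verifies that the density is asymmetric in its arguments yet has standard normal marginals.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

phi = norm.pdf  # standard normal density

def f_xy(x, y):
    """Assumed joint density: phi(x)*phi(y)*(1 + x*y*phi(x)*phi(y)**2)."""
    return phi(x) * phi(y) * (1.0 + x * y * phi(x) * phi(y) ** 2)

# The density is NOT symmetric in its arguments, e.g. at (1, 2) vs (2, 1).
assert abs(f_xy(1.0, 2.0) - f_xy(2.0, 1.0)) > 1e-5

# Yet both marginals are standard normal: integrating out y recovers phi(x)
# (the correction term is odd in y, so it integrates to zero).
for x0 in (-1.5, 0.3, 2.0):
    marginal, _ = quad(lambda y: f_xy(x0, y), -np.inf, np.inf)
    assert abs(marginal - phi(x0)) < 1e-6
```

So X and Y are identically distributed while the joint law of \((X,Y)\) is not symmetric, which is exactly what the counterexample exploits.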
The dependence between X and Y in the counterexample does not mean that statistical independence is necessary for the equality in distribution of the ratios U, V to hold. In fact, our results are stated in terms of the joint distribution alone and cover independence as a special case.
6.3 Main Results
For a random vector \((X,Y)\), denote the ratios of the two component r.v.s to their sum by
$$U := \frac{X}{X+Y}, \qquad V := \frac{Y}{X+Y}. \qquad (6.1)$$
It may be noted that while \(U + V = 1\), the r.v.s U and V cannot, in general, be thought of as interchangeable proportional contributions of the components of \((X,Y)\) to their sum, as is obvious from the preceding counterexample.
If X, Y are absolutely continuous with a (joint) density, then so are U and V, with their respective densities related via
$$f_V(v) = f_U(1-v). \qquad (6.2)$$
Standard calculations yield an expression for the density of U. In particular, choosing the transformation
$$U = \frac{X}{X+Y}, \qquad T = X+Y,$$
the joint density of \((U,T)\) is easily seen to be \(f_{U,T}(u,t) = f_{X,Y}(ut,\,(1-u)t)\,|t|\), so that the marginal density of U is
$$f_U(u) = \int_{-\infty}^{\infty} f_{X,Y}(ut,\,(1-u)t)\,|t|\; dt,$$
which together with (6.2) implies
$$f_V(v) = f_U(1-v) = \int_{-\infty}^{\infty} f_{X,Y}\bigl((1-v)t,\, vt\bigr)\,|t|\; dt \neq f_U(v)$$
in general.
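As a quick sanity check of the marginal formula \(f_U(u) = \int f_{X,Y}(ut,(1-u)t)\,|t|\,dt\), one can plug in an illustrative joint density of our own choosing. For X, Y i.i.d. Exp(1), the integrand reduces to \(t e^{-t}\) on \(t > 0\), so \(f_U(u) = 1\) on (0, 1), recovering the well-known fact that the ratio is Uniform(0,1):

```python
import math
from scipy.integrate import quad

def f_xy(x, y):
    """Joint density of independent Exp(1) variables (illustrative choice)."""
    return math.exp(-x - y) if x > 0 and y > 0 else 0.0

def f_u(u):
    """Marginal density of U = X/(X+Y) via f_U(u) = integral of
    f_XY(ut, (1-u)t) |t| dt; the integrand vanishes for t < 0 here."""
    val, _ = quad(lambda t: f_xy(u * t, (1 - u) * t) * abs(t), 0, math.inf)
    return val

# For i.i.d. Exp(1) components the ratio is Uniform(0, 1): f_U(u) = 1.
for u in (0.1, 0.5, 0.9):
    assert abs(f_u(u) - 1.0) < 1e-6
```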
Define H to be symmetric in its arguments \((x,y)\) if
$$H(x,y) = H(y,x) \quad \text{for all } (x,y) \in R^2.$$
If, however, \(f_{X,Y}\) has this symmetry, then the equality of the densities of U and V obviously holds. We thus have the following proposition.
Proposition 6.1
If \((X,Y)\) admits a joint density that is symmetric in its arguments, then the ratios in (6.1) are equal in distribution (\(U \stackrel{d}{=} V\)).
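A small Monte Carlo sketch of Proposition 6.1, using a conditionally i.i.d. pair of our own choosing: given an environmental variable Z, X and Y are i.i.d. exponentials with rate Z, so the joint density is symmetric in its arguments while X and Y are dependent.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Conditionally i.i.d. pair: given Z ~ Uniform(1, 2), X and Y are i.i.d.
# Exp(Z).  The joint density is symmetric in (x, y), so U and V should
# be equal in distribution (here U | Z is Uniform(0, 1), hence so is U).
z = rng.uniform(1.0, 2.0, size=n)
x = rng.exponential(1.0 / z)
y = rng.exponential(1.0 / z)

u = x / (x + y)
v = y / (x + y)

# Compare a few empirical probabilities of U and V.
for q in (0.25, 0.5, 0.75):
    assert abs(np.mean(u <= q) - np.mean(v <= q)) < 0.01
assert abs(u.mean() - 0.5) < 0.005
```

This is also the Bayesian setting mentioned in the abstract: observations conditionally i.i.d. given a parameter with a prior.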
Remark 1.
There is no explicit assumption that \(X \stackrel{d}{=} Y\) in the premise of the above proposition, as it is an easy consequence of the symmetry; viz.,
$$f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x,y)\; dy = \int_{-\infty}^{\infty} f_{X,Y}(y,x)\; dy = f_Y(x).$$
Remark 2.
In view of Remark 1, in the absolutely continuous case, the classical result that X, Y i.i.d. implies \(U \stackrel{d}{=} V\) follows as a special case of Proposition 6.1: if X, Y are i.i.d. with common p.d.f. \(f_X(\cdot) \equiv f_Y(\cdot)\), then the joint p.d.f. satisfies
$$f_{X,Y}(x,y) = f_X(x)\,f_Y(y) = f_X(y)\,f_Y(x) = f_{X,Y}(y,x).$$
While Proposition 6.1 provides an answer to our question when X, Y are absolutely continuous, an affirmative answer in the general case, where the joint c.d.f. of X, Y may also have discrete and/or singular components, is given by our next proposition. Note that \(F_{X,Y}(x,y)\) being symmetric in \((x,y)\) implies that \(P\{(X,Y) \in (-\infty,x] \times (-\infty,y]\} = P\{(Y,X) \in (-\infty,x] \times (-\infty,y]\}\) for all \((x,y) \in R^2\). This, in turn, implies that \((X,Y) \stackrel{d}{=} (Y,X)\).
Proposition 6.2
If the joint c.d.f. \(F_{X,Y}(x,y)\) is symmetric in \((x,y)\), then \(U {\stackrel{d}{=} V}\).
Proof.
With \(F_{X,Y}\) also denoting the Lebesgue–Stieltjes measure on the plane induced by the joint c.d.f., we have, for every real t,
$$E\left[e^{itU}\right] = \int_{R^2} e^{itx/(x+y)}\; dF_{X,Y}(x,y) = \int_{R^2} e^{ity/(x+y)}\; dF_{X,Y}(x,y) = E\left[e^{itV}\right],$$
where the second equality uses the symmetry of the joint c.d.f.: the two corresponding measures coincide because they agree on the determining class of sets \((-\infty,x] \times (-\infty,y]\). Thus the ratios U and V have the same characteristic function and therefore must be equal in distribution. Alternatively, \(F_{X,Y}(x,y) = F_{X,Y}(y,x)\) implies that \((X,Y) \stackrel{d}{=} (Y,X)\), and since \(h(x,y) = \frac{x}{x+y}\) is a continuous function, \(h(X,Y) \stackrel{d}{=} h(Y,X)\). Interestingly, the converse of Proposition 6.2 is not true; namely, \(X/(X+Y) \stackrel{d}{=} Y/(X+Y)\) does not imply that the joint distribution of X and Y is symmetric. To see this, let \((X,Y)\) take the values (1,2) and (4,2) with probability 1/2 each. Then \(X/(X+Y)\) and \(Y/(X+Y)\) have identical distributions, each taking the values 1/3 and 2/3 with probability 1/2. Yet \(1/2 = P[X=1, Y=2] \neq P[X=2, Y=1] = 0\).
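The two-point counterexample to the converse can be verified exactly, using rational arithmetic and no assumptions beyond the two support points stated above:

```python
from fractions import Fraction

# The two equally likely support points of (X, Y) in the counterexample.
support = [(Fraction(1), Fraction(2)), (Fraction(4), Fraction(2))]

u_vals = sorted(x / (x + y) for x, y in support)   # values of X/(X+Y)
v_vals = sorted(y / (x + y) for x, y in support)   # values of Y/(X+Y)

# Both ratios take the values 1/3 and 2/3 with probability 1/2 each ...
assert u_vals == v_vals == [Fraction(1, 3), Fraction(2, 3)]

# ... yet the joint distribution is not symmetric: (1, 2) has probability
# 1/2 while the swapped point (2, 1) has probability 0.
assert (Fraction(2), Fraction(1)) not in support
```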
The joint c.d.f.’s symmetry condition was motivated by the corresponding assumption in Proposition 6.1 and the following observation.
Lemma 6.3
(i) Suppose X, Y are absolutely continuous. Then \(F_{X,Y}\) is symmetric in its arguments \((x,y)\) if and only if \(f_{X,Y}\) is.
(ii) The symmetry condition in Proposition 6.2 implies that X and Y are identically distributed.
Proof.
(i) Suppose \(f_{X,Y}\) is symmetric in \((x,y)\). Then the nonnegativity of the integrand and Fubini’s theorem imply
$$\begin{aligned}F_{X,Y}(x,y) = P(X \leq x,\;Y \leq y) &= \int_{-\infty}^x \int_{-\infty}^y f_{X,Y}(u,v)\; dv\, du\\ &= \int_{-\infty}^x \int_{-\infty}^y f_{X,Y}(v,u)\; dv\, du\\ &= \int_{-\infty}^y \int_{-\infty}^x f_{X,Y}(v,u)\; du\, dv\\ &= P(X \leq y,\;Y \leq x) \equiv F_{X,Y}(y,x).\end{aligned}$$
Conversely, supposing \(F_{X,Y}\) is symmetric in its arguments \((x,y)\) and has a joint density, we have
$$f_{X,Y}(x,y) = \frac{\partial^2}{\partial x\, \partial y} F_{X,Y}(x,y) = \frac{\partial^2}{\partial x\, \partial y} F_{X,Y}(y,x) = f_{X,Y}(y,x).$$
(ii) Using the pointwise symmetry of \(F_{X,Y}(\cdot,\cdot)\) on \(R^2\),
$$\begin{aligned} P(X \leq x) = \lim_{y \rightarrow \infty} F_{X,Y}(x,y) = \lim_{y \rightarrow \infty} F_{X,Y}(y,x)= P(Y \leq x).\end{aligned}$$
Remark 3.
The symmetry condition in Proposition 6.2 is of course equivalent to X, Y being “exchangeable”, i.e., \((X,Y) \stackrel{d}{=} (Y,X)\). For a pair of r.v.s, however, it is much more simply stated as the property that the joint c.d.f. \(F_{X,Y}(\cdot,\cdot): R^2 \longrightarrow [0,1]\) is symmetric in its arguments. For random vectors of higher dimensions, the corresponding condition that the c.d.f. \(F_{X_1, \cdots, X_n}\) is permutation invariant in its arguments is more succinctly and elegantly described as \(X_1, \cdots, X_n\) being exchangeable, which generalizes our earlier proposition as follows.
Proposition 6.4
If \(X_1, \cdots, X_n\) (\(n \geq 2\)) is a finite, exchangeable sequence, then
$$\frac{X_j}{S_n} \stackrel{d}{=} \frac{X_k}{S_n} \quad \text{for all } 1 \leq j, k \leq n,$$
where \(S_n := \sum_{i=1}^n X_i\).
Proof.
Suppose \(X_1, \cdots, X_n\) (\(n \geq 2\)) are exchangeable, i.e., \((X_{i_1}, \cdots, X_{i_n}) \stackrel{d}{=} (X_1, \cdots, X_n)\) for all permutations \((i_1, \cdots, i_n)\) of \((1, \cdots, n)\). For brevity, denote by \({\boldsymbol{X}} = (X_1, \cdots, X_n)\) the full vector and by \(0_j {\boldsymbol{X}}\) the corresponding vector that skips the j-th coordinate \(X_j\), with the values they assume denoted \({\boldsymbol{x}}\) and \(0_j {\boldsymbol{x}}\), respectively. Then,
where the value \(s_n\) of \(S_n\) is given by \(s_n = u + \sum_{i=1,\, i \neq j}^n x_i\) or \(s_n = u + \sum_{i=1,\, i \neq k}^n x_i\) in the second or third integrands above, respectively. Note that the two equalities preceding the last step hold since \((X_j, 0_j {\boldsymbol{X}}) \stackrel{d}{=} {\boldsymbol{X}} \stackrel{d}{=} (X_k, 0_k {\boldsymbol{X}})\) for all pairs j, k, by exchangeability. Alternatively, since \((X_j, 0_j {\boldsymbol{X}}) \stackrel{d}{=} (X_k, 0_k {\boldsymbol{X}})\) and \(h({\boldsymbol{x}}) = \frac{x_1}{s_n}\) is a continuous function, \(h(X_j, 0_j {\boldsymbol{X}}) \stackrel{d}{=} h(X_k, 0_k {\boldsymbol{X}})\). Hence the result.
In conclusion, any Archimedean copula can be used as a generator of such exchangeable r.v.s; see Nelsen (1999) and Genest and MacKay (1986). These results are also applicable in Bayesian contexts, where the observations are conditionally i.i.d. given an environmental variable with a prior distribution.
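As an illustration of the copula remark, the sketch below generates an exchangeable vector with Clayton (Archimedean) copula dependence via the standard Marshall–Olkin frailty construction and checks the conclusion of Proposition 6.4 empirically; the parameter choices (\(\theta = 2\), \(d = 3\), Exp(1) margins) are ours, chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n, theta, d = 200_000, 2.0, 3

# Marshall–Olkin sampling of a Clayton (Archimedean) copula:
# generator psi(t) = (1 + t)^(-1/theta);  U_i = psi(E_i / V),
# with V ~ Gamma(1/theta, 1) and E_i i.i.d. Exp(1).
v = rng.gamma(1.0 / theta, 1.0, size=n)
e = rng.exponential(1.0, size=(d, n))
u = (1.0 + e / v) ** (-1.0 / theta)   # exchangeable Uniform(0, 1) margins

x = -np.log(u)                        # exchangeable Exp(1) margins
ratios = x / x.sum(axis=0)            # X_j / S_n for j = 1, ..., d

# By Proposition 6.4 each X_j / S_n has the same distribution; in
# particular E[X_j / S_n] = 1/d for every j.
for j in range(d):
    assert abs(ratios[j].mean() - 1.0 / d) < 0.01
```

The same check works for any Archimedean generator, since the frailty construction applies the generator symmetrically to i.i.d. exponentials and hence always yields an exchangeable vector.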
References
The Central Intelligence Agency of the United States (2013) CIA Fact Book. https://www.cia.gov/library/publications/the-world-factbook/index.html
Nelsen RB (1999) An Introduction to Copulas. Lecture Notes in Statistics, vol 139. Springer, New York
Genest C, MacKay J (1986) The joy of copulas: bivariate distributions with uniform marginals. Amer Statist 40:280–285
Acknowledgement
We thank an anonymous reviewer for improving the presentation of this paper. Prof. Hira Lal Koul is my “statistics-guru”. I wish to thank him for all the knowledge he has given me.
© 2014 Springer International Publishing Switzerland
Dhar, S., Bhattacharjee, M. (2014). On Equality in Distribution of Ratios \(X/(X+Y)\) and \(Y/(X+Y)\). In: Lahiri, S., Schick, A., SenGupta, A., Sriram, T. (eds) Contemporary Developments in Statistical Theory. Springer Proceedings in Mathematics & Statistics, vol 68. Springer, Cham. https://doi.org/10.1007/978-3-319-02651-0_6