Abstract
For several decades, much attention has been paid to the two-sample Behrens-Fisher (BF) problem, which tests the equality of the means or mean vectors of two normal populations with unequal variance/covariance structures. Little work, however, has been done on the k-sample BF problem for high-dimensional data, which tests the equality of the mean vectors of several high-dimensional normal populations with unequal covariance structures. In this paper we study this challenging problem by extending Scheffe’s well-known transformation method, which reduces the k-sample BF problem to a one-sample problem. The induced one-sample problem can be easily tested by the classical Hotelling’s T² test when the size of the resulting sample is very large relative to its dimensionality. For high-dimensional data, however, the dimensionality of the resulting sample is often very large, and may even be much larger than its sample size, so that the classical Hotelling’s T² test has low power or is not even well defined. To overcome this difficulty, we propose and study an L²-norm based test. The asymptotic powers of the proposed L²-norm based test and Hotelling’s T² test are derived and compared theoretically. Methods for implementing the L²-norm based test are described. Simulation studies are conducted to compare the L²-norm based test with Hotelling’s T² test when the latter is well defined, and otherwise to compare the proposed implementation methods for the L²-norm based test. The methodologies are motivated and illustrated by a real data example.
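The contrast between the two tests on the induced one-sample problem can be illustrated with a minimal sketch. It assumes the transformed sample is already available as an n × p array and that the null hypothesis is a zero mean vector. The function names and the form n‖x̄‖² used for the L²-norm statistic are illustrative assumptions, not taken from the paper, whose exact statistic and null-distribution approximation are not given in the abstract.

```python
import numpy as np


def hotelling_t2(x):
    """One-sample Hotelling T^2 statistic for H0: mean vector = 0.

    x is an (n, p) array of observations. The statistic needs the inverse
    of the sample covariance matrix, which is singular when p > n - 1.
    """
    n, p = x.shape
    if p > n - 1:
        raise ValueError("sample covariance is singular when p > n - 1")
    xbar = x.mean(axis=0)
    # Unbiased sample covariance, forced to 2-D so that p = 1 also works.
    s = np.atleast_2d(np.cov(x, rowvar=False))
    return float(n * xbar @ np.linalg.solve(s, xbar))


def l2_norm_statistic(x):
    """Illustrative L2-norm type statistic n * ||xbar||^2 (assumed form).

    No covariance inverse is required, so it is defined for any p.
    """
    n = x.shape[0]
    xbar = x.mean(axis=0)
    return float(n * (xbar @ xbar))
```

With more columns than rows minus one, `hotelling_t2` raises an error while `l2_norm_statistic` still returns a value, which mirrors the motivation stated in the abstract. Turning either statistic into a test would additionally require a null distribution, which this sketch does not attempt.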
Additional information
Dedicated to Professor Zhidong Bai on the occasion of his 65th birthday
The work was supported by the National University of Singapore Academic Research Grant (Grant No. R-155-000-085-112)
About this article
Cite this article
Zhang, J., Xu, J. On the k-sample Behrens-Fisher problem for high-dimensional data. Sci. China Ser. A-Math. 52, 1285–1304 (2009). https://doi.org/10.1007/s11425-009-0091-x
DOI: https://doi.org/10.1007/s11425-009-0091-x
Keywords
- χ²-approximation
- χ²-type mixtures
- high-dimensional data analysis
- Hotelling’s T² test
- k-sample test
- L²-norm based test