Abstract
A finite mixture model using the Student's t distribution has been recognized as a robust extension of normal mixtures. Recently, a mixture of skew normal distributions has been found to be effective in the treatment of heterogeneous data involving asymmetric behaviors across subclasses. In this article, we propose a robust mixture framework based on the skew t distribution to efficiently deal with heavy-tailedness, extra skewness and multimodality in a wide range of settings. Statistical mixture modeling based on normal, Student's t and skew normal distributions can be viewed as special cases of the skew t mixture model. We present analytically simple EM-type algorithms for iteratively computing maximum likelihood estimates. The proposed methodology is illustrated by analyzing a real data example.
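The nesting of the normal, Student's t and skew normal models inside the skew t mixture can be illustrated numerically. The following is a minimal sketch, not the article's implementation. It assumes the univariate skew t density in the Azzalini and Capitanio form, f(y) = (2/σ) t_ν(z) T_{ν+1}(λ z sqrt((ν+1)/(ν+z²))) with z = (y − ξ)/σ. The function names, the parameter names (`xi`, `sigma`, `lam`, `nu`) and the quadrature used for the t distribution function are illustrative choices, not taken from the abstract.

```python
import math

def t_pdf(z, nu):
    """Density of the standard Student's t distribution with nu degrees of freedom."""
    log_c = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
             - 0.5 * math.log(nu * math.pi))
    return math.exp(log_c - (nu + 1) / 2 * math.log1p(z * z / nu))

def t_cdf(x, nu, steps=400):
    """CDF of the standard t distribution (sketch, intended for nu >= 1).

    Uses the substitution z = sqrt(nu) * tan(theta), which turns the integral
    of the density over [0, x] into k * integral of cos(theta)**(nu - 1) over
    [0, atan(x / sqrt(nu))], evaluated here with Simpson's rule (steps is even).
    """
    theta = math.atan(x / math.sqrt(nu))
    k = math.exp(math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)) / math.sqrt(math.pi)
    h = theta / steps
    total = 1.0 + math.cos(theta) ** (nu - 1)  # integrand at the two endpoints
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * math.cos(i * h) ** (nu - 1)
    return 0.5 + k * total * h / 3

def skew_t_pdf(y, xi, sigma, lam, nu):
    """Skew t density in the assumed Azzalini and Capitanio parameterization."""
    z = (y - xi) / sigma
    w = lam * z * math.sqrt((nu + 1) / (nu + z * z))
    return 2.0 / sigma * t_pdf(z, nu) * t_cdf(w, nu + 1)

def skew_t_mixture_pdf(y, weights, components):
    """Finite mixture density; each component is a tuple (xi, sigma, lam, nu)."""
    return sum(p * skew_t_pdf(y, *c) for p, c in zip(weights, components))

def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)

def normal_cdf(z):
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

if __name__ == "__main__":
    y = 0.3
    # lam = 0 gives the Student's t density.
    print(skew_t_pdf(y, 0.0, 1.0, 0.0, 5.0), t_pdf(y, 5.0))
    # Large nu with lam = 0 approaches the normal density.
    print(skew_t_pdf(y, 0.0, 1.0, 0.0, 1e4), normal_pdf(y))
    # Large nu with lam != 0 approaches the skew normal density 2 * phi(z) * Phi(lam * z).
    print(skew_t_pdf(y, 0.0, 1.0, 2.0, 1e4), 2 * normal_pdf(y) * normal_cdf(2 * y))
    # A two-component mixture with hypothetical weights and parameters.
    print(skew_t_mixture_pdf(y, [0.4, 0.6], [(-1.0, 1.0, 3.0, 5.0), (2.0, 0.5, -2.0, 8.0)]))
```

Under these assumptions, setting `lam = 0` gives a t mixture, letting `nu` grow large gives a skew normal mixture, and doing both gives a normal mixture. If SciPy is available, `scipy.stats.t.cdf` could replace the hand-written `t_cdf`.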
Cite this article
Lin, T.I., Lee, J.C. & Hsieh, W.J. Robust mixture modeling using the skew t distribution. Stat Comput 17, 81–92 (2007). https://doi.org/10.1007/s11222-006-9005-8