Abstract
Outlier detection based on the Mahalanobis distance (MD) requires an appropriate transformation when applied to compositional data. For the family of logratio transformations (additive, centered, and isometric logratio transformations), it is shown that the MDs based on classical estimates are invariant to the choice of transformation, and that the MDs based on affine equivariant estimators of location and covariance are the same for the additive and isometric logratio transformations. Moreover, for 3-dimensional compositions the data structure can be visualized by contour lines. In higher dimensions, the MDs of closed and opened data give an impression of the multivariate data behavior.
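The invariance claim for classical estimates can be checked numerically. The following is a minimal sketch, not the authors' implementation: the function names, the simulated data, and the Helmert-type basis used for the isometric logratio are illustrative assumptions. Because the covariance of centered-logratio data is singular, the sketch also assumes that a Moore-Penrose pseudo-inverse is used in that case.

```python
import numpy as np

def alr(x):
    # Additive logratio: log of each part relative to the last part.
    return np.log(x[:, :-1] / x[:, -1:])

def clr(x):
    # Centered logratio: log parts minus the row-wise mean of the logs.
    logx = np.log(x)
    return logx - logx.mean(axis=1, keepdims=True)

def ilr_basis(D):
    # One possible orthonormal basis (Helmert-type) of the clr hyperplane,
    # returned as a D x (D-1) matrix. Other orthonormal bases work as well.
    V = np.zeros((D, D - 1))
    for j in range(1, D):
        V[:j, j - 1] = 1.0 / j
        V[j, j - 1] = -1.0
        V[:, j - 1] *= np.sqrt(j / (j + 1.0))
    return V

def ilr(x):
    # Isometric logratio: clr coordinates expressed in the orthonormal basis.
    return clr(x) @ ilr_basis(x.shape[1])

def mahalanobis_sq(z, singular=False):
    # Squared MDs using the classical estimates (sample mean, sample covariance).
    centered = z - z.mean(axis=0)
    cov = np.cov(z, rowvar=False)
    # Assumption: pseudo-inverse for the singular clr covariance matrix.
    inv = np.linalg.pinv(cov, rcond=1e-10) if singular else np.linalg.inv(cov)
    return np.einsum("ij,jk,ik->i", centered, inv, centered)

# Simulated 4-part compositions (rows closed to sum 1); purely illustrative data.
rng = np.random.default_rng(0)
raw = rng.lognormal(size=(50, 4))
x = raw / raw.sum(axis=1, keepdims=True)

md_alr = mahalanobis_sq(alr(x))
md_clr = mahalanobis_sq(clr(x), singular=True)
md_ilr = mahalanobis_sq(ilr(x))

print(np.allclose(md_alr, md_ilr), np.allclose(md_clr, md_ilr))
```

The additive and isometric coordinates differ only by a nonsingular linear map, so any affine equivariant pair of location and covariance estimators would yield identical MDs for those two transformations; the centered case needs the separate pseudo-inverse treatment assumed here.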
Cite this article
Filzmoser, P., Hron, K. Outlier Detection for Compositional Data Using Robust Methods. Math Geosci 40, 233–248 (2008). https://doi.org/10.1007/s11004-007-9141-5
DOI: https://doi.org/10.1007/s11004-007-9141-5