Abstract
This paper describes an approach to handling multivariate training data that contain outliers. The aim is to analyze the training patterns and to detect anomalous ones. To this end, we explicitly model the presence of outliers in the training data using a widespread outlier distribution. Indicator variables assign each pattern either to the outlier distribution or to the distribution of normal patterns, so that the data distribution can be estimated with the EM algorithm or with Data Augmentation. We present the general approach as well as a concrete realization in which Gaussian mixture models describe the distribution of the patterns. Experimental results show that the approach is applicable in practical studies.
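As a rough illustration of the idea summarized in the abstract, the following Python sketch fits a model of "normal" patterns together with an explicit outlier component using EM. It is not the paper's implementation and makes several simplifying assumptions: the normal patterns are modeled by a single Gaussian rather than a Gaussian mixture, the widespread outlier distribution is taken to be uniform over the bounding box of the data and is kept fixed, and only EM (not Data Augmentation) is shown. The function name fit_gaussian_with_outliers and its parameters are hypothetical.

```python
import numpy as np

def fit_gaussian_with_outliers(X, n_iter=50, outlier_prior=0.1):
    """EM for one Gaussian plus a fixed, uniform outlier component.

    Returns the mean, covariance, estimated outlier weight, and the
    per-pattern outlier responsibilities (soft indicator variables).
    """
    X = np.asarray(X, dtype=float)
    n, d = X.shape

    # Assumption: the outlier density is uniform on the data's bounding box
    # (requires every feature to have nonzero range).
    extent = X.max(axis=0) - X.min(axis=0)
    outlier_density = 1.0 / np.prod(extent)

    # Simple initialization; the small ridge keeps the covariance invertible.
    mean = np.median(X, axis=0)
    cov = np.cov(X, rowvar=False) + 1e-6 * np.eye(d)
    pi_out = outlier_prior

    for _ in range(n_iter):
        # E-step: posterior probability that each pattern is an outlier.
        diff = X - mean
        maha = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)
        log_norm = -0.5 * (d * np.log(2 * np.pi) + np.linalg.slogdet(cov)[1])
        p_normal = (1.0 - pi_out) * np.exp(log_norm - 0.5 * maha)
        p_outlier = pi_out * outlier_density
        r_out = p_outlier / (p_normal + p_outlier)

        # M-step: re-estimate the Gaussian from patterns weighted by their
        # probability of being normal, and update the outlier weight.
        w = 1.0 - r_out
        mean = (w[:, None] * X).sum(axis=0) / w.sum()
        diff = X - mean
        cov = (w[:, None] * diff).T @ diff / w.sum() + 1e-6 * np.eye(d)
        pi_out = r_out.mean()

    return mean, cov, pi_out, r_out


# Usage on synthetic data: 200 normal patterns plus 5 hand-placed outliers.
rng = np.random.default_rng(0)
inliers = rng.normal(loc=0.0, scale=1.0, size=(200, 2))
outliers = np.array([[8.0, 8.0], [-8.0, 8.0], [8.0, -8.0], [-8.0, -8.0], [9.0, 0.0]])
X = np.vstack([inliers, outliers])
mean, cov, pi_out, r_out = fit_gaussian_with_outliers(X)
flagged = r_out > 0.5  # patterns more likely outlier than normal
```

Thresholding the responsibilities at 0.5 is only one possible decision rule; the responsibilities themselves can also be used directly as a graded anomaly score.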
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lauer, M. (2001). A Mixture Approach to Novelty Detection Using Training Data with Outliers. In: De Raedt, L., Flach, P. (eds) Machine Learning: ECML 2001. ECML 2001. Lecture Notes in Computer Science, vol 2167. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44795-4_26
DOI: https://doi.org/10.1007/3-540-44795-4_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42536-6
Online ISBN: 978-3-540-44795-5
eBook Packages: Springer Book Archive