Abstract
Non-stationarity is an important aspect of data stream mining. Non-stationary data streams require change detection and on-line adaptation of statistical estimators. Statistical hypothesis tests can be used for change detection. Their advantage over heuristic adaptation strategies is that they distinguish between fluctuations caused by the randomness inherent in a stationary underlying distribution and real changes in the distribution from which we sample. However, the problem of multiple testing must be taken into account when a test is carried out more than once. Even if the underlying distribution does not change over time, any test will eventually reject the null hypothesis of no change erroneously if it is carried out often enough. In this work, we propose methods that account for the multiple testing issue and consequently improve the reliability of change detection. We present and discuss a new method based on information about the distribution of p-values, alongside classical methods such as the Bonferroni correction and the Bonferroni-Holm method.
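The multiple testing problem and the two classical corrections named in the abstract can be illustrated with a minimal Python sketch. It is not the chapter's new method based on the distribution of p-values. The function names, the example p-values, and the significance level of 0.05 are assumptions chosen for illustration, and the false-alarm formula assumes independent tests.

```python
def familywise_error_rate(alpha, m):
    """Probability of at least one false rejection among m tests,
    assuming the tests are independent and every null hypothesis is true."""
    return 1.0 - (1.0 - alpha) ** m


def bonferroni_reject(p_values, alpha=0.05):
    """Bonferroni correction: reject H0_i if p_i <= alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def holm_reject(p_values, alpha=0.05):
    """Bonferroni-Holm step-down procedure.

    Sort the p-values in ascending order and compare the k-th smallest
    (k = 0, 1, ...) with alpha / (m - k). Stop at the first p-value that
    exceeds its threshold; all remaining hypotheses are retained.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break
    return reject


# Hypothetical p-values from four repeated change tests.
p_values = [0.01, 0.04, 0.02, 0.005]

fwer_100 = familywise_error_rate(0.05, 100)
bonferroni = bonferroni_reject(p_values)
holm = holm_reject(p_values)

print(f"False-alarm probability after 100 tests: {fwer_100:.3f}")
print("Bonferroni rejections:", bonferroni)
print("Holm rejections:      ", holm)
```

With these example values, Bonferroni rejects only the two smallest p-values, whereas Holm rejects all four. Holm is never more conservative than Bonferroni, and both control the family-wise error rate at the chosen level.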
Copyright information
© 2013 Springer-Verlag GmbH Berlin Heidelberg
About this chapter
Cite this chapter
Tschumitschew, K., Klawonn, F. (2013). Change Detection Based on the Distribution of p-Values. In: Borgelt, C., Gil, M., Sousa, J., Verleysen, M. (eds) Towards Advanced Data Analysis by Combining Soft Computing and Statistics. Studies in Fuzziness and Soft Computing, vol 285. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-30278-7_16
DOI: https://doi.org/10.1007/978-3-642-30278-7_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-30277-0
Online ISBN: 978-3-642-30278-7
eBook Packages: Engineering, Engineering (R0)