Randomization Methods for Assessing the Significance of Data Mining Results

Mannila, Heikki

doi:10.1007/978-3-642-04125-9_1

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5722)

Included in the following conference series:

International Symposium on Methodologies for Intelligent Systems

Abstract

Data mining research has developed many algorithms for various analysis tasks on large and complex datasets. However, assessing the significance of data mining results has received less attention. Analytical methods are rarely available, and hence one has to use computationally intensive methods. Randomization approaches based on null models provide, at least in principle, a general approach that can be used to obtain empirical p-values for various types of data mining approaches. I review some of the recent work in this area, outlining some of the open questions and problems.
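
As an illustration of the idea of an empirical p-value obtained by randomization, the following minimal sketch runs a simple permutation test. It is not taken from the chapter: the null model (group labels are exchangeable), the test statistic (difference of group means), and the names empirical_p_value, mean_difference and permutation_test are all assumptions chosen for this example.

```python
import random


def empirical_p_value(observed, null_stats):
    """One-sided empirical p-value with the usual +1 correction.

    Counts the null statistics that are at least as large as the
    observed one; the observed value itself is counted as one sample.
    """
    extreme = sum(1 for s in null_stats if s >= observed)
    return (extreme + 1) / (len(null_stats) + 1)


def mean_difference(xs, ys):
    """Test statistic assumed for this sketch: mean(xs) - mean(ys)."""
    return sum(xs) / len(xs) - sum(ys) / len(ys)


def permutation_test(xs, ys, n_samples=999, seed=0):
    """Randomization test under the assumed null model that the
    assignment of values to the two groups is arbitrary."""
    rng = random.Random(seed)
    observed = mean_difference(xs, ys)
    pooled = list(xs) + list(ys)
    null_stats = []
    for _ in range(n_samples):
        # Draw one randomized dataset by shuffling the pooled values
        # and splitting them into groups of the original sizes.
        rng.shuffle(pooled)
        null_stats.append(
            mean_difference(pooled[:len(xs)], pooled[len(xs):])
        )
    return observed, empirical_p_value(observed, null_stats)


if __name__ == "__main__":
    group_a = [10, 11, 12, 13, 14, 15]
    group_b = [0, 1, 2, 3, 4, 5]
    stat, p = permutation_test(group_a, group_b)
    print(f"observed statistic = {stat}, empirical p-value = {p}")
```

The same pattern carries over to other analysis tasks in principle: replace the shuffle with a sampler for the chosen null model and the mean difference with the quantity the mining algorithm reports.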

Author information

Authors and Affiliations

Helsinki Institute for Information Technology HIIT, Finland
Heikki Mannila
University of Helsinki and Helsinki University of Technology, Finland
Heikki Mannila

Editor information

Editors and Affiliations

Faculty of Informatics and Statistics, University of Economics, W. Churchill Sq. 4, 130 67, Prague 3, Czech Republic
Jan Rauch
Department of Computer Science, University of North Carolina, NC 27599-3175, Charlotte, USA
Zbigniew W. Raś
Faculty of Informatics and Statistics, University of Economics, W. Churchill Sq. 4, 130 67, Prague, Czech Republic
Petr Berka
Institute of Software Systems, Tampere University of Technology, P. O. Box 553, 33101, Tampere, Finland
Tapio Elomaa

Cite this paper

Mannila, H. (2009). Randomization Methods for Assessing the Significance of Data Mining Results. In: Rauch, J., Raś, Z.W., Berka, P., Elomaa, T. (eds) Foundations of Intelligent Systems. ISMIS 2009. Lecture Notes in Computer Science, vol 5722. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04125-9_1

DOI: https://doi.org/10.1007/978-3-642-04125-9_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04124-2
Online ISBN: 978-3-642-04125-9
eBook Packages: Computer Science, Computer Science (R0)
