Permutation Filtering: A Novel Concept for Significance Analysis of Large-Scale Genomic Data

Scheid, Stefanie; Spang, Rainer

doi:10.1007/11732990_29

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 3909)

Included in the following conference series:

Annual International Conference on Research in Computational Molecular Biology

Abstract

Permutation of class labels is a common approach to building null distributions for significance analysis of microarray data. It is assumed to produce random score distributions that are not affected by biological differences between samples. We argue that this assumption is questionable and show that basic requirements for null distributions are not met.

We propose a novel approach to the significance analysis of microarray data, called permutation filtering. We show that it leads to more accurate screening and to more precise estimates of false discovery rates. The method is implemented in the Bioconductor package twilight, available at http://www.bioconductor.org.
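As an illustration of the conventional label-permutation approach that the abstract calls into question, the following is a minimal sketch in Python. It is not the permutation filtering method and not the twilight implementation. The function names, the Welch-type t-score, the toy data, and the pooled empirical p-value are all assumptions made for this example.

```python
import random
from bisect import bisect_left
from statistics import mean, variance


def t_score(values, labels):
    """Welch-type t-score comparing class 1 against class 0 for one gene."""
    a = [v for v, lab in zip(values, labels) if lab == 1]
    b = [v for v, lab in zip(values, labels) if lab == 0]
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    return (mean(a) - mean(b)) / se


def permutation_null(expr, labels, n_perm=200, seed=0):
    """Pool the scores of all genes over n_perm random label shufflings."""
    rng = random.Random(seed)
    shuffled = list(labels)
    null_scores = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the link between samples and classes
        null_scores.extend(t_score(gene, shuffled) for gene in expr)
    return null_scores


def empirical_p_values(observed, null_scores):
    """Two-sided p-value: share of pooled null scores at least as extreme."""
    null_abs = sorted(abs(s) for s in null_scores)
    n = len(null_abs)
    p_values = []
    for obs in observed:
        n_extreme = n - bisect_left(null_abs, abs(obs))
        p_values.append((n_extreme + 1) / (n + 1))  # +1 avoids p = 0
    return p_values


# Hypothetical toy data: 50 genes, 12 samples, two classes of 6 samples each.
rng = random.Random(1)
labels = [0] * 6 + [1] * 6
expr = [[rng.gauss(0.0, 1.0) for _ in labels] for _ in range(50)]

# Assumption for the example: the first 5 genes are truly shifted in class 1.
for gene in expr[:5]:
    for j, lab in enumerate(labels):
        if lab == 1:
            gene[j] += 3.0

observed = [t_score(gene, labels) for gene in expr]
null_scores = permutation_null(expr, labels, n_perm=200)
p_values = empirical_p_values(observed, null_scores)

print("smallest p-values:", sorted(p_values)[:5])
```

In this toy setup the pooled null also contains permuted scores from the five shifted genes, so it is not free of the real class difference. This is the kind of effect the abstract questions, and the sketch makes no attempt to correct for it.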

Author information

Authors and Affiliations

Max Planck Institute for Molecular Genetics, Computational Diagnostics Group, Ihnestrasse 63-73, D-14195, Berlin, Germany
Stefanie Scheid & Rainer Spang

Editor information

Editors and Affiliations

Georgia Institute of Technology and Università di Padova,
Alberto Apostolico
Topic Chairs, P.O. Box
Concettina Guerra
Center for Molecular Biology and Computer Science Department, Brown University, 115 Waterman St., 02912, Providence, RI, USA
Sorin Istrail
University of California, San Diego, USA
Pavel A. Pevzner
Department of Molecular and Computational Biology, University of Southern California, 1050 Childs Way, 90089-2910, Los Angeles, CA, USA
Michael Waterman

About this paper

Cite this paper

Scheid, S., Spang, R. (2006). Permutation Filtering: A Novel Concept for Significance Analysis of Large-Scale Genomic Data. In: Apostolico, A., Guerra, C., Istrail, S., Pevzner, P.A., Waterman, M. (eds) Research in Computational Molecular Biology. RECOMB 2006. Lecture Notes in Computer Science, vol 3909. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11732990_29

DOI: https://doi.org/10.1007/11732990_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-33295-4
Online ISBN: 978-3-540-33296-1
eBook Packages: Computer Science, Computer Science (R0)
