Controlling Batch Effect in Epigenome-Wide Association Study

  • Protocol
Epigenome-Wide Association Studies

Part of the book series: Methods in Molecular Biology (MIMB, volume 2432)

Abstract

Methylation data, like other omics data, are susceptible to technical variation arising from factors that are unexplained or unrelated to the biology under study. Any difference in how DNA methylation is measured, such as laboratory operation or sequencing platform, may lead to batch effects. As large-scale omics data accumulate, scientists are making joint efforts to generate and analyze these data to answer various scientific questions. Batch effects are, however, inevitable in practice, and careful adjustment is needed. Multiple statistical methods for controlling bias and inflation between batches have been developed, either by correcting for known batch factors or by estimating them directly from the output data. In this chapter, we review and demonstrate several popular methods for batch effect correction and make practical recommendations for epigenome-wide association studies (EWAS).
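
To make the idea of correcting for a known batch factor concrete, the following is a minimal sketch and not the chapter's own procedure. It assumes methylation is stored as a CpG-by-sample matrix of beta-values, converts them to M-values, and removes per-batch mean shifts by simple mean-centering. The function names, the simulated data, and the choice of mean-centering are illustrative assumptions; this simple approach ignores covariates and is only reasonable when batches are not confounded with the variable of interest.

```python
import numpy as np


def beta_to_m(beta, eps=1e-6):
    """Convert beta-values in (0, 1) to M-values: log2(beta / (1 - beta))."""
    beta = np.clip(np.asarray(beta, dtype=float), eps, 1.0 - eps)
    return np.log2(beta / (1.0 - beta))


def center_by_batch(m_values, batch):
    """Remove per-batch mean shifts for each CpG (hypothetical helper).

    m_values: array of shape (n_cpgs, n_samples)
    batch:    array of n_samples known batch labels
    Each batch is centered at zero per CpG, then the overall per-CpG
    mean is added back so values stay on the original scale.
    """
    m_values = np.asarray(m_values, dtype=float)
    batch = np.asarray(batch)
    grand_mean = m_values.mean(axis=1, keepdims=True)
    adjusted = m_values.copy()
    for label in np.unique(batch):
        cols = batch == label
        adjusted[:, cols] -= m_values[:, cols].mean(axis=1, keepdims=True)
    return adjusted + grand_mean


# Simulated example (assumed data): 5 CpGs, 8 samples, two batches of 4.
rng = np.random.default_rng(0)
beta = rng.uniform(0.2, 0.8, size=(5, 8))
batch = np.array([0, 0, 0, 0, 1, 1, 1, 1])

m = beta_to_m(beta)
m[:, batch == 1] += 1.0  # add an artificial shift to the second batch

adjusted = center_by_batch(m, batch)

# After centering, both batches share the same per-CpG mean.
gap = adjusted[:, batch == 1].mean(axis=1) - adjusted[:, batch == 0].mean(axis=1)
print(np.allclose(gap, 0.0))
```

Methods that model batch alongside biological covariates, or that estimate hidden factors from the data, address cases this sketch does not handle.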

Author information

Corresponding author

Correspondence to Wei Chen.

Copyright information

© 2022 Springer Science+Business Media, LLC, part of Springer Nature

About this protocol

Cite this protocol

Jiang, Y., Chen, J., Chen, W. (2022). Controlling Batch Effect in Epigenome-Wide Association Study. In: Guan, W. (eds) Epigenome-Wide Association Studies. Methods in Molecular Biology, vol 2432. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-1994-0_6

  • DOI: https://doi.org/10.1007/978-1-0716-1994-0_6

  • Publisher Name: Humana, New York, NY

  • Print ISBN: 978-1-0716-1993-3

  • Online ISBN: 978-1-0716-1994-0

  • eBook Packages: Springer Protocols