Abstract
Methylation data, like other omics data, are susceptible to technical artifacts that may be associated with unexplained or unrelated factors. Any difference in how DNA methylation is measured, such as laboratory operation or sequencing platform, may introduce batch effects. As large-scale omics data accumulate, scientists are making joint efforts to generate and analyze these data to answer various scientific questions. Batch effects are inevitable in practice, however, and careful adjustment is needed. Multiple statistical methods have been developed to control bias and inflation between batches, either by correcting for known batch factors or by estimating batch effects directly from the output data. In this chapter, we review and demonstrate several popular methods for batch effect correction and make practical recommendations for epigenome-wide association studies (EWAS).
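The idea of correcting for a known batch factor can be illustrated with a minimal sketch. It is not one of the chapter's methods: it assumes methylation is stored as a CpG-by-sample matrix of beta-values, converts them to M-values with the standard transform M = log2(beta / (1 - beta)), and removes only the per-batch mean shift for each CpG. The function names `beta_to_m` and `center_by_batch`, the `eps` clipping constant, and the simulated data are all hypothetical and chosen for illustration.

```python
import numpy as np


def beta_to_m(beta, eps=1e-6):
    """Convert beta-values in [0, 1] to M-values, M = log2(beta / (1 - beta)).

    eps is an illustrative clipping constant that avoids log of zero.
    """
    beta = np.clip(np.asarray(beta, dtype=float), eps, 1 - eps)
    return np.log2(beta / (1 - beta))


def center_by_batch(m_values, batch):
    """Location-only adjustment for a known batch factor.

    m_values: array of shape (n_cpgs, n_samples)
    batch:    length n_samples array of batch labels
    For each CpG, subtract the batch mean and add back the grand mean.
    """
    m_values = np.asarray(m_values, dtype=float)
    batch = np.asarray(batch)
    adjusted = m_values.copy()
    grand_mean = m_values.mean(axis=1, keepdims=True)
    for label in np.unique(batch):
        cols = batch == label
        batch_mean = m_values[:, cols].mean(axis=1, keepdims=True)
        adjusted[:, cols] = m_values[:, cols] - batch_mean + grand_mean
    return adjusted


# Simulated example: 5 CpGs, 8 samples, two batches of 4 samples each.
rng = np.random.default_rng(0)
beta = rng.uniform(0.2, 0.8, size=(5, 8))
batch = np.array(["A"] * 4 + ["B"] * 4)

m = beta_to_m(beta)
m[:, batch == "B"] += 1.0  # inject an artificial shift into batch B

adjusted = center_by_batch(m, batch)
```

After adjustment, every CpG has the same mean in both batches. This simple centering also removes any true biological difference that happens to coincide with batch, which is one reason careful adjustment, rather than blanket centering, is needed when batch and phenotype are confounded.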
© 2022 Springer Science+Business Media, LLC, part of Springer Nature
About this protocol
Cite this protocol
Jiang, Y., Chen, J., Chen, W. (2022). Controlling Batch Effect in Epigenome-Wide Association Study. In: Guan, W. (eds) Epigenome-Wide Association Studies. Methods in Molecular Biology, vol 2432. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-1994-0_6
DOI: https://doi.org/10.1007/978-1-0716-1994-0_6
Publisher Name: Humana, New York, NY
Print ISBN: 978-1-0716-1993-3
Online ISBN: 978-1-0716-1994-0
eBook Packages: Springer Protocols