Abstract
Epigenetic regulation and interactions between transcription factors and regulatory genomic regions play crucial roles in controlling the transcriptional regulatory networks that drive development, environmental responses, and disease. Chromatin immunoprecipitation (ChIP) followed by high-throughput sequencing (ChIP-seq) and ChIP followed by genomic tiling microarray hybridization (ChIP-chip) are two of the most widely used technologies for genome-wide identification of protein-DNA interactions and histone modifications in vivo. Many algorithms and tools have been developed and evaluated for identifying transcription factor binding sites from ChIP-seq or ChIP-chip datasets. However, binding site identification is only the first step; the ultimate goal is to discover the regulatory network of the transcription factor (TF). Here, we present a common workflow for downstream analysis of ChIP-chip and ChIP-seq data, with an emphasis on annotating binding sites and integrating them with gene expression data to identify direct and indirect targets of the TF. These tools will help with the overall goal of unraveling transcriptional regulatory networks using datasets publicly available in GEO.
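The integration step described in the abstract can be illustrated with a minimal sketch. It is not the chapter's implementation: the data classes, the function names, the 5 kb distance cutoff, and the toy coordinates are all assumptions made for illustration. Each peak is assigned to the nearest transcription start site on the same chromosome, and differentially expressed genes are then split into bound genes (putative direct targets) and unbound genes (putative indirect targets).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peak:
    chrom: str
    start: int
    end: int


@dataclass(frozen=True)
class Gene:
    name: str
    chrom: str
    tss: int  # transcription start site coordinate


def nearest_gene(peak, genes):
    """Return (gene, distance) for the closest TSS on the peak's chromosome, or None."""
    midpoint = (peak.start + peak.end) // 2
    candidates = [g for g in genes if g.chrom == peak.chrom]
    if not candidates:
        return None
    best = min(candidates, key=lambda g: abs(g.tss - midpoint))
    return best, abs(best.tss - midpoint)


def classify_targets(peaks, genes, de_genes, max_distance=5000):
    """Split differentially expressed genes into bound and unbound sets.

    max_distance is a hypothetical cutoff (in bp) for calling a gene bound.
    """
    bound = set()
    for peak in peaks:
        hit = nearest_gene(peak, genes)
        if hit is not None and hit[1] <= max_distance:
            bound.add(hit[0].name)
    direct = sorted(bound & set(de_genes))    # bound and differentially expressed
    indirect = sorted(set(de_genes) - bound)  # differentially expressed, not bound
    return direct, indirect


# Toy data, invented purely for illustration.
genes = [
    Gene("geneA", "chrI", 1000),
    Gene("geneB", "chrI", 50000),
    Gene("geneC", "chrII", 2000),
]
peaks = [Peak("chrI", 900, 1300), Peak("chrII", 90000, 90200)]
de_genes = {"geneA", "geneC"}

direct, indirect = classify_targets(peaks, genes, de_genes)
print(direct, indirect)
```

In practice the gene coordinates would come from a genome annotation and the differentially expressed gene set from an expression experiment. A real analysis would also account for strand, alternative transcripts, and the statistical support for each peak.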
Acknowledgment
I would like to thank Dr. Michael Brodsky of the Program in Gene Function and Expression at the University of Massachusetts Medical School for his critical review of the manuscript and his excellent suggestions.
Copyright information
© 2013 Springer Science+Business Media New York
About this protocol
Cite this protocol
Zhu, L.J. (2013). Integrative Analysis of ChIP-Chip and ChIP-Seq Dataset. In: Lee, TL., Shui Luk, A. (eds) Tiling Arrays. Methods in Molecular Biology, vol 1067. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-62703-607-8_8
DOI: https://doi.org/10.1007/978-1-62703-607-8_8
Publisher Name: Humana Press, Totowa, NJ
Print ISBN: 978-1-62703-606-1
Online ISBN: 978-1-62703-607-8
eBook Packages: Springer Protocols