Abstract
Chromatin immunoprecipitation coupled with massively parallel sequencing (ChIP-seq) is a powerful technology for identifying the genome-wide locations of DNA-binding proteins such as transcription factors or modified histones. As more experimental laboratories adopt ChIP-seq to unravel transcriptional and epigenetic regulatory mechanisms, computational analyses of ChIP-seq data have become increasingly comprehensive and sophisticated. In this article, we review current computational methodology for ChIP-seq analysis, recommend useful algorithms and workflows, and introduce quality control measures for the different analytical steps. We also discuss how ChIP-seq can be integrated with other types of genomic assays, such as gene expression profiling and genome-wide association studies, to provide a more comprehensive view of gene regulatory mechanisms in important physiological and pathological processes.
Cite this article
Shin, H., Liu, T., Duan, X. et al. Computational methodology for ChIP-seq analysis. Quant Biol 1, 54–70 (2013). https://doi.org/10.1007/s40484-013-0006-2
DOI: https://doi.org/10.1007/s40484-013-0006-2