Abstract
Background
In eukaryotic genomes, chromatin is not randomly distributed in the cell nucleus but is organized into higher-order structures. Emerging evidence indicates that these higher-order chromatin structures play important roles in regulating genome functions such as transcription and DNA replication. With the advancement of 3C (chromosome conformation capture)-based technologies, Hi-C has been widely used to investigate genome-wide long-range chromatin interactions during cellular differentiation and oncogenesis. Since the first publication of the Hi-C assay in 2009, many bioinformatic tools have been implemented for processing Hi-C data, from mapping raw reads to normalizing contact matrices and higher-level interpretation, either providing a complete workflow pipeline or focusing on a particular step.
Results
This article reviews the general Hi-C data processing workflow and the currently popular Hi-C data processing tools. We highlight how these tools are used for a full interpretation of Hi-C results.
Conclusions
The Hi-C assay is a powerful tool for investigating higher-order chromatin structure. Continued development of novel methods for Hi-C data analysis will be necessary to better understand the regulatory function of genome organization.
Acknowledgments
This work is supported by the National Basic Research Program of China (Nos. 2016YFA0100703 and 2015CB964800) and the National Natural Science Foundation of China (No. 31271354).
Cite this article
Han, Z., Wei, G. Computational tools for Hi-C data analysis. Quant Biol 5, 215–225 (2017). https://doi.org/10.1007/s40484-017-0113-6
DOI: https://doi.org/10.1007/s40484-017-0113-6