Prediction of CpG Islands as an Intrinsic Clustering Property Found in Many Eukaryotic DNA Sequences and Its Relation to DNA Methylation

CpG Islands

Part of the book series: Methods in Molecular Biology ((MIMB,volume 1766))

Abstract

The promoter regions of around 70% of all genes in the human genome are overlapped by a CpG island (CGI). CGIs have known functions in transcription initiation and distinctive compositional features, such as high G+C content and CpG ratios, compared with bulk DNA. We have shown previously that CGIs manifest as clusters of CpGs in mammalian genomes and can therefore be detected using clustering methods. These techniques have several advantages over sliding-window approaches, which apply compositional properties as thresholds. In this protocol, we show how to determine local (CpG islands) and global (distance distribution) clustering properties of CG dinucleotides and how to generalize this analysis to any k-mer or combination of k-mers. In addition, we illustrate how to cross the output of a CpG island prediction algorithm with our methylation database to detect differentially methylated CGIs. The analysis is given as a step-by-step protocol, and all necessary programs are provided in a virtual machine; alternatively, the software can be downloaded and installed.
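
The idea of detecting CGIs as clusters of CpGs, rather than with a sliding window, can be illustrated with a minimal Python sketch. It is not the software distributed with the protocol: the function names, the use of the median neighbor distance as the threshold, and the `min_count` parameter are all assumptions made for illustration, and the statistical significance step that a real prediction tool would apply is omitted. The sketch collects the positions of a k-mer (CG by default), computes the distances between neighboring occurrences (the global distance distribution), and joins occurrences that lie within the threshold into clusters (the local property).

```python
from statistics import median


def kmer_positions(seq, kmer="CG"):
    """Return 0-based start positions of every occurrence of kmer (overlaps allowed)."""
    seq, kmer = seq.upper(), kmer.upper()
    positions = []
    start = seq.find(kmer)
    while start != -1:
        positions.append(start)
        start = seq.find(kmer, start + 1)
    return positions


def neighbor_distances(positions):
    """Start-to-start distances between consecutive occurrences."""
    return [b - a for a, b in zip(positions, positions[1:])]


def cluster_by_distance(positions, max_dist, min_count=2):
    """Group occurrences whose neighbor distance is <= max_dist.

    Returns (first_start, last_start, count) tuples. max_dist and min_count
    are illustrative parameters, not values taken from the protocol.
    """
    clusters = []
    if not positions:
        return clusters
    current = [positions[0]]
    for pos in positions[1:]:
        if pos - current[-1] <= max_dist:
            current.append(pos)
        else:
            if len(current) >= min_count:
                clusters.append((current[0], current[-1], len(current)))
            current = [pos]
    if len(current) >= min_count:
        clusters.append((current[0], current[-1], len(current)))
    return clusters


# Toy sequence: a CpG-dense stretch flanked by CpG-poor DNA.
toy = "ATATATATAT" + "CGCGACGTCG" + "ATATATATATATATATATAT" + "CGATATATATAT"
cpgs = kmer_positions(toy)                 # [10, 12, 15, 18, 40]
dists = neighbor_distances(cpgs)           # [2, 3, 3, 22]
islands = cluster_by_distance(cpgs, max_dist=median(dists))
print(islands)                             # [(10, 18, 4)]
```

Passing a different `kmer` argument (for example `"CAG"`) applies the same analysis to any other DNA word, which mirrors the generalization to arbitrary k-mers described in the abstract.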



Author information

Corresponding author

Correspondence to Michael Hackenberg.

Copyright information

© 2018 Springer Science+Business Media, LLC, part of Springer Nature

About this protocol

Cite this protocol

Gómez-Martín, C., Lebrón, R., Oliver, J.L., Hackenberg, M. (2018). Prediction of CpG Islands as an Intrinsic Clustering Property Found in Many Eukaryotic DNA Sequences and Its Relation to DNA Methylation. In: Vavouri, T., Peinado, M. (eds) CpG Islands. Methods in Molecular Biology, vol 1766. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-7768-0_3

  • DOI: https://doi.org/10.1007/978-1-4939-7768-0_3

  • Publisher Name: Humana Press, New York, NY

  • Print ISBN: 978-1-4939-7767-3

  • Online ISBN: 978-1-4939-7768-0

  • eBook Packages: Springer Protocols
