Promoter Analysis: Gene Regulatory Motif Identification with A-GLAM

  • Protocol
Bioinformatics for DNA Sequence Analysis

Part of the book series: Methods in Molecular Biology (MIMB, volume 537)

Abstract

Reliable detection of cis-regulatory elements in promoter regions is a difficult and unsolved problem in computational biology. The intricacy of transcriptional regulation in higher eukaryotes, primarily in metazoans, could be a major driving force of organismal complexity. Eukaryotic genome annotations have improved greatly owing to comparative genomics and to the large-scale characterization of full-length cDNAs and transcriptional start sites (TSSs). Regulatory elements are identified in promoter regions using a variety of enumerative or alignment-based methods. Here we present a survey of recent computational methods for eukaryotic promoter analysis and describe the use of an alignment-based method implemented in the A-GLAM program.
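
To make the enumerative family of methods mentioned above concrete, the following is a minimal sketch of word counting: it tallies every k-mer in a set of promoter sequences and ranks the words by how much more frequent they are than in a background set. It is not A-GLAM, which is alignment-based, and it is not taken from the chapter. The function names, the pseudocount, the frequency-ratio score, and the toy sequences are all assumptions made for illustration.

```python
from collections import Counter


def count_kmers(sequences, k):
    """Count every overlapping k-mer across a list of DNA sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def overrepresented(foreground, background, k, pseudocount=1.0):
    """Rank foreground k-mers by foreground/background frequency ratio.

    The pseudocount (an illustrative choice) keeps the ratio finite for
    words that never occur in the background set.
    """
    fg = count_kmers(foreground, k)
    bg = count_kmers(background, k)
    fg_total = sum(fg.values())
    bg_total = sum(bg.values())
    vocab = 4 ** k  # number of possible DNA words of length k
    scores = {}
    for word, n in fg.items():
        fg_freq = (n + pseudocount) / (fg_total + pseudocount * vocab)
        bg_freq = (bg[word] + pseudocount) / (bg_total + pseudocount * vocab)
        scores[word] = fg_freq / bg_freq
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Made-up toy sequences: the "promoters" share the word TATAAA.
promoters = ["ACGTTATAAAGG", "CCTATAAAGTCA", "GGGTATAAACCT"]
background = ["ACGTACGTACGT", "GGCCGGCCAATT", "TTGACCGTAGCA"]
ranked = overrepresented(promoters, background, k=6)
print(ranked[:3])
```

On these toy inputs the shared word TATAAA ranks first. A real analysis would need a proper statistical model of the background and a correction for multiple testing, neither of which this sketch attempts.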

Acknowledgments

This research was supported by the Intramural Research Program of the NIH, NLM, NCBI.

Copyright information

© 2009 Humana Press, a part of Springer Science+Business Media, LLC

About this protocol

Cite this protocol

Mariño-Ramírez, L., Tharakaraman, K., Spouge, J.L., Landsman, D. (2009). Promoter Analysis: Gene Regulatory Motif Identification with A-GLAM. In: Posada, D. (eds) Bioinformatics for DNA Sequence Analysis. Methods in Molecular Biology, vol 537. Humana Press. https://doi.org/10.1007/978-1-59745-251-9_13

  • DOI: https://doi.org/10.1007/978-1-59745-251-9_13

  • Publisher Name: Humana Press

  • Print ISBN: 978-1-58829-910-9

  • Online ISBN: 978-1-59745-251-9

  • eBook Packages: Springer Protocols
