Abstract
Molecular genetic markers are among the most powerful tools for analysing genomes and for associating heritable traits with underlying genetic variation. The development of high-throughput methods for detecting single nucleotide polymorphisms (SNPs) and simple sequence repeats (SSRs) has revolutionised their use as molecular markers. Large sequence data sets can now be mined for these markers, which may then be applied to genetic trait mapping, diversity analysis and marker-assisted selection in agriculture. Here we describe web-based automated methods for the discovery of SSRs using SSR taxonomy tree, the discovery of SNPs from sequence data using SNPServer, and the identification of validated SNPs within the dbSNP database. SSR taxonomy tree identifies pre-determined SSR amplification primers for virtually all species represented in the GenBank database. SNPServer uses a redundancy-based approach to identify SNPs within DNA sequences. Following submission of a sequence of interest, SNPServer uses BLAST to identify similar sequences, CAP3 to cluster and assemble these sequences, and then the SNP discovery software autoSNP to detect SNPs and insertion/deletion (indel) polymorphisms. The NCBI dbSNP database is a catalogue of molecular variation, hosting validated SNPs for several species within a public-domain archive.
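The redundancy-based idea mentioned above can be illustrated with a minimal Python sketch. It is not autoSNP's actual algorithm, which the abstract does not specify. It assumes the sequences have already been aligned to equal length (for example, taken from an assembly produced by a tool such as CAP3), and the function name `call_snps`, the `min_redundancy` threshold and the example reads are all hypothetical. A column is reported as a candidate SNP only when at least two different bases are each seen in several reads, so a base seen only once is treated as a likely sequencing error. Indels are ignored in this sketch.

```python
from collections import Counter


def call_snps(aligned_reads, min_redundancy=2, gap="-"):
    """Return candidate SNP columns from equal-length aligned sequences.

    Hypothetical helper for illustration only. A column is reported when
    at least two different bases each occur in at least `min_redundancy`
    reads. Gap characters and ambiguous 'N' calls are not counted.
    """
    if not aligned_reads:
        return []
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("all aligned reads must have the same length")

    snps = []
    for pos in range(length):
        # Count how many reads carry each base at this column.
        counts = Counter(read[pos].upper() for read in aligned_reads)
        counts.pop(gap, None)
        counts.pop("N", None)
        # Keep only alleles supported by enough independent reads.
        supported = {base: n for base, n in counts.items() if n >= min_redundancy}
        if len(supported) >= 2:
            snps.append((pos, supported))
    return snps


# Made-up aligned reads: column 3 has two well-supported alleles (T and A),
# while the single G in the last column is seen only once.
reads = [
    "ACGTTAGC",
    "ACGTTAGC",
    "ACGATAGC",
    "ACGATAGC",
    "ACGTTAGG",
]
candidates = call_snps(reads)
```

With the default threshold, only the column where both alleles appear in at least two reads is kept. Lowering `min_redundancy` to 1 would also report the singleton difference, which is the kind of call a redundancy requirement is meant to suppress.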
Copyright information
© 2009 Humana Press, a part of Springer Science+Business Media, LLC
About this protocol
Cite this protocol
Batley, J., Edwards, D. (2009). Mining for SNPs and SSRs Using SNPServer, dbSNP and SSR Taxonomy Tree. In: Posada, D. (ed.) Bioinformatics for DNA Sequence Analysis. Methods in Molecular Biology, vol 537. Humana Press. https://doi.org/10.1007/978-1-59745-251-9_15
DOI: https://doi.org/10.1007/978-1-59745-251-9_15
Publisher Name: Humana Press
Print ISBN: 978-1-58829-910-9
Online ISBN: 978-1-59745-251-9
eBook Packages: Springer Protocols