Quercus is one of the most prominent genera of plants in the world, comprising about 500 species distributed throughout the Northern Hemisphere (Nixon 1993). However, 35 species in the genus are listed in the International Union for Conservation of Nature and Natural Resources Red List (http://www.iucnredlist.org/) as critically endangered, endangered or vulnerable. The genetic characteristics of these threatened species should be carefully analyzed to develop effective conservation strategies.

One of the major obstacles to genetic studies of rare species is the general lack of genetic markers such as simple sequence repeats (SSRs), the development of which is costly and time-consuming. Fortunately, the massive accumulation of DNA sequences, especially expressed sequence tags (ESTs), in publicly available databases may facilitate the development of these markers (Pashley et al. 2006). Furthermore, since ESTs are derived from gene sequences and are therefore relatively conserved, primers designed from ESTs of one species are likely to work well in related species. Demonstrating the utility of DNA databases for developing SSR markers should accelerate the adoption of this strategy.

Quercus mongolica var. crispula is one of the most silviculturally important broad-leaved deciduous tree species in Japan. Since increasing importance is being attached to the multi-functional aspects of such trees, e.g. watershed protection, recreational use and maintenance of biodiversity as well as timber production, the areas reforested with broad-leaved trees in Japan, especially Quercus species, have increased (The Ministry of Agriculture 2002). Ideally, the genetic composition and diversity of the populations used as sources of planting material should be carefully analyzed in such reforestation programs. Therefore, in the present study, we developed SSR markers for Q. mongolica by mining 2,691 publicly available sequences originating from mRNA.

There were 1,439 and 1,233 ESTs for Q. robur and Q. petraea, respectively, registered in dbEST (http://www.ncbi.nlm.nih.gov/dbEST/). These sequences, along with 19 additional mRNA sequences, were downloaded and processed using read2Marker (Fukuoka et al. 2005) under default settings. The locations of the microsatellites (coding or non-coding region) and putative gene functions were estimated using ESTScan (Iseli et al. 1999) and blastx (Altschul et al. 1990), respectively. The e-value cutoff for blastx was set at 1e−10.
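The mining step can be illustrated with a minimal sketch that scans a sequence for perfect di- and tri-nucleotide repeats, using the thresholds applied for primer design in this study (at least nine di- and six tri-nucleotide repeats). This is not the read2Marker algorithm; the function name `find_ssrs` and the regular-expression approach are assumptions made purely for illustration.

```python
import re

# Minimum repeat counts from the text: >= 9 for di-SSRs, >= 6 for tri-SSRs.
MIN_REPEATS = {2: 9, 3: 6}

def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs.

    Hypothetical helper for illustration; not part of read2Marker.
    """
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in min_repeats.items():
        # ([ACGT]{k}) captures one motif; \1{n,} requires n or more extra copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if len(set(motif)) == 1:
                continue  # skip mononucleotide runs such as AAAAAA...
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)

example = "TTT" + "AG" * 9 + "CCC" + "CAG" * 6 + "T"
ssrs = find_ssrs(example)
```

In practice each EST would be read from a FASTA file and passed through such a function before primer design; imperfect or compound repeats, which dedicated tools may also report, are ignored here.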

A total of 22 primer pairs were designed for sequences with at least nine and six repeats for di- and tri-SSRs, respectively. PCR was carried out in 8 μl reaction mixtures containing ca. 10 ng genomic DNA, 1× PCR buffer, 200 μM of each dNTP, 1.5 mM MgCl2, 0.2 μM of each primer designed in the present study and 0.4 U of Taq polymerase (Promega), using the following program: 94°C for 3 min, then 40 cycles of 94°C for 45 s, 55°C for 45 s and 72°C for 45 s, followed by a final extension at 72°C for 7 min. The PCR products were electrophoretically separated on 2% agarose gels and stained with ethidium bromide. Polymorphism was screened in sixteen individuals by PCR in 6 μl reaction mixtures under the conditions described above, except that the amount of Taq polymerase was reduced to 0.15 U. PCR products were labeled with ChromaTide Rhodamine Green-5-dUTP (Molecular Probes) according to the method of Kondo et al. (2000), and analyzed using a 3100 Genetic Analyzer with GeneScan software. For each locus, the number of alleles (Na), observed heterozygosity (Ho) and expected heterozygosity (He) were calculated, and Hardy-Weinberg (HW) and genotypic equilibrium were tested using FSTAT software (Goudet 1995).

Ten loci showed clear polymorphic patterns, with Ho and He ranging from 0.31 to 0.87 and 0.28 to 0.94, respectively (Table 1). The locus DN949776 showed a significant deviation from HW equilibrium (P = 0.0265). However, no locus showed HW disequilibrium after the significance levels were adjusted for multiple tests. No significant genotypic disequilibrium was detected for any pair of loci. All of the di-SSR loci were estimated to be in non-coding regions, with Na ranging from 4 to 15, while all tri-SSR loci were estimated to be in coding regions, with Na of 3 or 4. Half of the ten loci in Table 1 showed similarity to known proteins, and putative functions were proposed for them.

In summary, we have efficiently developed 10 EST-SSR markers for Q. mongolica from 2,691 Q. robur and Q. petraea gene (EST) sequences. The study shows that developing SSR markers by database mining is rapid, convenient and cost-effective, and we hope that the ten markers developed here will be useful for analyzing the genetic diversity of Quercus species.
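The per-locus statistics reported here (Na, Ho, He) and the multiple-test adjustment can be sketched as follows. The estimator used for He is Nei's unbiased gene diversity, which is an assumption and may differ from the exact estimator implemented in FSTAT; the Bonferroni correction over ten loci is likewise an assumption, since the text does not name the adjustment method. The function names and the example genotypes are hypothetical.

```python
from collections import Counter

def locus_summary(genotypes):
    """Return (Na, Ho, He) for one locus.

    genotypes: list of (allele1, allele2) tuples for diploid individuals.
    He is Nei's unbiased gene diversity (an assumed estimator).
    """
    n = len(genotypes)
    total = 2 * n  # number of gene copies sampled
    counts = Counter(allele for pair in genotypes for allele in pair)
    ho = sum(1 for a, b in genotypes if a != b) / n
    sum_p2 = sum((c / total) ** 2 for c in counts.values())
    he = (total / (total - 1)) * (1.0 - sum_p2)
    return len(counts), ho, he

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level under a Bonferroni correction (assumed method)."""
    return alpha / n_tests

# Made-up genotypes (allele sizes in bp) for four individuals at one locus.
na, ho, he = locus_summary([(100, 100), (100, 102), (102, 104), (104, 104)])

# With ten loci and alpha = 0.05, P = 0.0265 does not pass the adjusted level.
threshold = bonferroni_threshold(0.05, 10)
dn949776_significant = 0.0265 < threshold
```

Under these assumptions the adjusted per-locus level is 0.005, so the deviation at DN949776 (P = 0.0265) is no longer significant, which is consistent with the result stated in the text.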

Table 1 Characteristics of the ten polymorphic microsatellite markers in Q. mongolica