Introduction

Bamboo is a monocotyledonous plant in the family Poaceae and subfamily Bambusoideae. China has the largest reserve of bamboo in the world. Phyllostachys pubescens (synonym Ph. edulis) is the most important bamboo species in China and the third most important plant species for timber production after Pinus massoniana and Cunninghamia lanceolata. China has 4.2 million ha of bamboo forest, of which Ph. pubescens accounts for 3 million ha (approximately 2% of China’s total forest area), having doubled over the last 30 years (Fu 2001). Ph. pubescens is the predominant source of bamboo shoots and plays an important ecological role.

Phyllostachys species flower at intervals of 60–120 years, often flowering simultaneously over an extensive area before dying (Janzen 1976; Watanabe et al. 1982). Populations recover mainly through the development of seedling cohorts, which suffer high mortality rates during early development, the survivors extending rhizomes and producing culms. Therefore, it is difficult to achieve genetic improvements in Ph. pubescens through plus-tree selection and sexual hybridization. Molecular markers that facilitate the analysis of genetic traits are important for plant improvement (Gupta et al. 1999) and several types of marker have been applied in bamboo research including RAPDs (SCARs), AFLPs, ISSRs and unidentified EST-derived SSRs (reviewed by Zhang and Tang 2007). Research focuses on species identification (Das et al. 2005; Lin et al. 2009), genetic diversity (Friar and Kochert 1991; Lai and Hsiao 1997; Barkley et al. 2005; Ruan et al. 2008), the clonal structure of populations (Suyama et al. 2000; Isagi et al. 2004), phylogenetic relationships and species evolution (Friar and Kochert 1994; Loh et al. 2000; Li et al. 2002; Barkley et al. 2005).

Microsatellite markers, also known as simple-sequence repeats (SSRs), are DNA sequences 1–6 bp in length that are tandemly repeated a variable number of times (Tautz 1989). They are particularly valuable in plant-breeding programs because they are polymorphic, co-dominant, relatively abundant, widely dispersed across the genome, and easy to score using automated methods (Powell et al. 1996). Microsatellites have been used in cultivar identification and parentage assessment (Buteler et al. 2002; Malysheva et al. 2003; Rajora and Rahman 2003), genetic diversity analysis (Goldstein and Clark 1995; Cho et al. 2000), evolutionary and phylogenetic studies (Pupko and Graur 1999; Zhu et al. 2000), the construction of molecular maps (Bell and Ecker 1994; Temnykh et al. 2000), and the support of patents and property rights for plant varieties (Powell et al. 1996; Gupta et al. 1999). Recently, six microsatellite markers were developed from a genomic library of Bambusa arundinacea, a bamboo species in India with a caespitose rhizome system (Nayak and Rout 2005). However, developing microsatellite markers for a new plant species through the use of genomic DNA or enriched-SSR libraries is laborious, expensive and inefficient (Powell et al. 1996). Fortunately, searching through published sequence databases offers an alternative. In this study, we used sequence databases of Ph. pubescens with a scattered rhizome system to (1) evaluate the frequency and distribution of different classes and types of SSRs in the genome, (2) establish and validate microsatellite markers for the detection of polymorphisms in a reference set of Ph. pubescens cultivars and provenance populations, and (3) assess cross-species transferability and identify Phyllostachys interspecies hybrids.

Materials and methods

Database search and primer design for microsatellite markers

DNA sequences in GenBank (http://www.ncbi.nlm.nih.gov/) were searched for the phrase “Phyllostachys pubescens” and the search results were downloaded as FASTA-formatted sequence files. The web tools RepeatMasker (Smit et al. 1996–2004: http://www.repeatmasker.org/) and Microsatellite Repeats Finder (Benson 1999: http://biophp.org/minitools/microsatellite_repeats_finder/) were used to detect tandem repeats of 2–6 nucleotides (Table 1). PCR primers were designed to anneal in the flanking regions of the identified repeat sequences using the computer program Primer Premier 5 (PREMIER Biosoft International, CA, USA). When two distinct microsatellite sequences were present at distant sites in one DNA sequence [for example, (TA)9 and (TTTTC)4 in ED018039], a primer pair was designed for each microsatellite. When two microsatellites in one DNA sequence were in close proximity [for example, (TA)24 and (GA)14 in ED018001], a single primer pair was designed to flank both microsatellites (Table 2). Primers that met these requirements generated PCR products in the range 100–250 bp.
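The repeat search described above can be illustrated with a minimal sketch. This is not the algorithm used by RepeatMasker or Microsatellite Repeats Finder; it is a simple regular-expression stand-in that reports perfect tandem repeats with motifs of 2–6 nucleotides. The function name `find_ssrs` and the minimum copy number (`min_repeats=3`) are assumptions made for illustration and are not taken from the study.

```python
import re


def find_ssrs(seq, min_unit=2, max_unit=6, min_repeats=3):
    """Return (start, motif, copies) for perfect tandem repeats in seq.

    Illustrative only: min_repeats is an assumed threshold, and imperfect
    or compound repeats are not handled.
    """
    seq = seq.upper()
    # Lazy {min,max}? tries the shortest motif first; \1{n,} extends greedily.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip single-nucleotide runs such as AAAAAA
        copies = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, copies))
    return hits


# Synthetic example sequence (not a real GenBank record) carrying
# a (TA)9 and a (TTTTC)4 repeat at distant sites.
example = "GGC" + "TA" * 9 + "CCGATG" + "TTTTC" * 4 + "AAG"
for start, motif, copies in find_ssrs(example):
    print(start, "(%s)%d" % (motif, copies))
```

Because the two repeats in the synthetic sequence are reported separately with their start positions, the output also shows whether they are far enough apart for separate primer pairs or close enough to be flanked by one pair.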

Table 1 Size distribution of microsatellite motifs observed in Ph. pubescens sequences in the GenBank database
Table 2 Candidate sequences and microsatellite marker information

Plant materials and DNA extraction

Two groups of cultivars or forms and provenances were used to evaluate microsatellite marker polymorphism (Fig. 1). The first group included a 1-year-old seedling and nine cultivars or forms showing morphological differences in stem shape and color, and leaf color (Lin et al. 2009). The second group consisted of 17 provenances collected from bamboo stands in eight provinces (Jiangsu, Fujian, Hunan, Hubei, Jiangxi, Guangdong, Zhejiang and Anhui), representing almost all Ph. pubescens habitats. The provenances were genetically divergent in growth characteristics (Chen et al. 2001) and physical and mechanical properties of timber (Liu et al. 2008). Phyllostachys pubescens and six additional Phyllostachys species (Table 3) collected from Anji Bamboo Germplasm Garden, Anji, Zhejiang Province, were used to test the amplification, sequencing and identification of SSRs (Fig. 2). Four clones of Phyllostachys interspecies hybrids sampled from Jiangxi province were used to test the new microsatellite loci (personal communication with Professor Liao of Jiangxi Forestry Research Institute). Genomic DNA was extracted from young leaves with the hexadecyltrimethylammonium bromide (CTAB) method (Doyle and Doyle 1987), with some modifications.

Fig. 1

a–c Polyacrylamide gel electrophoresis patterns of microsatellites derived from GSS sequences on a panel of 11 varieties and 17 provenances. Lanes 1 and 30: size marker; Lane 2: Ph. pubescens as a reference; Lane 3: seedling; Lanes 4–12: cultivars or forms of Ph. pubescens cv. Ventricosa, Ph. pubescens cv. Tao Kiang, Ph. pubescens cv. Viridisulcata, Ph. pubescens cv. Luteosulcata, Ph. pubescens cv. Gracilis, Ph. pubescens cv. Obliquinoda, Ph. pubescens cv. Tubaeformis, Ph. pubescens cv. Heterocycla, Ph. pubescens cv. Pachyloen; Lanes 13–29: provenances in Jurong of Jiangsu, Yixing of Jiangsu, Huoshan of Anhui, Wuhan of Hubei, Anji and Zhuzhou of Zhejiang, Jiujiang of Jiangxi, Shangrao of Jiangxi, Lechang and Conghua of Guangdong, and Wuyi, Songxi, Jian’ou, Shaxian, Hua’an and Longhai of Fujian Province. a: PBM014; b: PBM017; c: PBM019

Table 3 Amplicon size, microsatellite motifs and polymorphism in species of genus Phyllostachys
Fig. 2

a Polyacrylamide gel electrophoresis patterns of microsatellite alleles derived from locus PBM014. b Alignment of the nucleotide sequences of the microsatellite alleles at locus PBM014 amplified from seven representative bamboo species of genus Phyllostachys. Nucleotides conserved among these sequences (relative to Ph. pubescens) are shown by dots. The dashes indicate deletions. The arrows indicate the primer sequences used to amplify this microsatellite locus. The box highlights the microsatellite motif. The suffix numbers after bamboo species correspond to the DNA bands marked in a

PCR and sequencing of microsatellite loci

Newly synthesized primer pairs (Shanghai Sangon Biological Engineering Technology & Services Co., Ltd) were tested for PCR amplification using DNA from Ph. pubescens. PCR amplification was performed in a thermal cycler (PE 9700, ABI) in 20-μl reactions comprising 50–100 ng of template DNA, 0.2 μM of each primer, 200 μM of each dNTP and 1 unit of Taq DNA polymerase with 1× PCR universal buffer (10 mM Tris–HCl, pH 8.3 at 25°C; 50 mM KCl), and 1.5 mM MgCl2 (Shanghai Sangon Biological Engineering Technology & Services Co., Ltd). The reaction consisted of heating to 95°C for 5 min, followed by 30 cycles of 1 min denaturation at 95°C, 1 min annealing at 46–59°C depending on the primer pair (Table 2) and 2 min extension at 72°C, and a final step at 72°C for 5 min. Amplified microsatellite loci were further tested in six other Phyllostachys species (Table 3) and in interspecies hybrids (Fig. 3). For these tests, the PCR primer annealing temperature was lowered by 2–5°C according to the evolutionary distance between Ph. pubescens and the related species (Rossetto 2001). PCR products were separated on 6% denaturing polyacrylamide gels, and marker bands were revealed by silver staining as described by Panaud et al. (1996). Desired bands were excised from the gel, purified using the EZ-10 Spin Column DNA Gel Extraction Kit (Biobasic Inc.) and ligated into the pUC18 vector (TaKaRa, Japan). Three positive clones for each species were selected for sequencing using BigDye terminator V3.1 in a cycle sequencing protocol according to the manufacturer’s specifications (PE Applied Biosystems, ABI PRISM 3100-Avant Automatic DNA Sequencer). Sequences were deposited in the NCBI GenBank database (accession numbers FJ588714–FJ588848).

Fig. 3

a Microsatellite DNA fingerprints of Ph. kwangsiensis and Ph. bambusoides and their presumable hybrids at PBM014 locus. b Alignment of the nucleotide sequences of the microsatellite alleles at locus PBM014 amplified from Ph. kwangsiensis and Ph. bambusoides and their presumable hybrids. Nucleotides conserved among these sequences (relative to Ph. kwangsiensis) are shown by dots. The dashes indicate deletions. The arrows indicate the primer sequences used to amplify this microsatellite locus. The box highlights the microsatellite. The suffix numbers after bamboo species correspond to the DNA bands marked in a

Analysis of sequence data

Vector sequences were removed and the remaining sequences were edited using Vector NTI software (version 10.0, Invitrogen Corporation, USA). The DNA sequences were then aligned using the CLUSTAL method included in the software. Multiple gaps were closed manually to group the repeat sequences.

Results

Screening of GenBank and characteristics of Ph. pubescens SSRs

A total of 1,532 Ph. pubescens DNA sequences (including 200 cDNA sequences) were downloaded from GenBank using “Phyllostachys pubescens” as a search keyword (accessed before November 3rd 2008). After screening with RepeatMasker and Microsatellite Repeats Finder and excluding single-nucleotide repeats, we found 3,057 SSRs in 920 out of 996 GSSs (Gui et al. 2007) covering ~722 kb; 770 of these GSSs contained more than one SSR. In addition, 58 out of 68 cDNA sequences representing 66.8 kb contained 184 SSRs (Table 1). This corresponds to average distances of approximately 336 bp between SSRs in GSSs and 363 bp between SSRs in cDNAs. The size of the repeat unit was not evenly distributed among the SSR loci: 76.6 and 74.5% of the SSRs were dinucleotide repeats, 20.0 and 22.3% were trinucleotide repeats, and 3.4 and 3.2% were repeats with longer motifs in GSS and cDNA sequences, respectively. Dinucleotide SSRs are therefore the most frequent class in both GSS and cDNA sequences. Among the dinucleotide SSRs, the AG/CT (or GA/TC) repeat was the most common in both GSSs (36.7%) and cDNAs (39.7%), whereas the GC or CG repeat was comparatively rare in GSSs (8.6%) and cDNAs (10.7%). SSRs ≥20 nucleotides in length (Class I microsatellites) accounted for only 0.75% of GSS SSRs (a total of 23 sequences), whereas SSRs 12–19 nucleotides in length (Class II microsatellites) accounted for 2.4 and 2.6% of the total number of SSRs in GSS and cDNA sequences, respectively.
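The average distance between SSRs is simply the total sequence length divided by the number of SSRs found in it. The short sketch below works through that arithmetic using the totals quoted in the text; the function name `mean_spacing_bp` is introduced here for illustration only. With these inputs the cDNA figure reproduces the reported value of about 363 bp, whereas the quoted GSS totals give a smaller value than the reported 336 bp, which may reflect rounding of the ~722 kb total or inputs that differ from those quoted.

```python
def mean_spacing_bp(total_bp, n_ssrs):
    """Average number of base pairs of sequence per SSR."""
    return total_bp / n_ssrs


# Totals as quoted in the text (the GSS length is approximate, "~722 kb").
gss_spacing = mean_spacing_bp(722_000, 3_057)
cdna_spacing = mean_spacing_bp(66_800, 184)

print("GSS:  one SSR per %.0f bp" % gss_spacing)
print("cDNA: one SSR per %.0f bp" % cdna_spacing)
```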

Assessing the polymorphism of Ph. pubescens microsatellite markers

We sought to develop microsatellite markers based on public genomic sequence information, incorporating empirically derived data concerning the frequency, size polymorphism, and PCR-amplification properties of different types of Ph. pubescens SSR. Primers were designed for 30 of the longest SSR loci in 24 GSSs, comprising 23 Class I SSRs, six Class II SSRs and one SSR <12 nucleotides in length (Table 2). Primer pairs were successfully designed for 26 (86.7%) of the loci; the remainder either had insufficient flanking sequence [(TA)17 in ED018674 and (TA)15 in ED018770] or were unsuitable for primer modeling [(CA)24(TA)10 in ED018039 and (TA)12 in ED018674]. After PCR amplification with the appropriate primers, polyacrylamide gel electrophoresis and sequencing, 19 loci were developed further as microsatellite markers (summarized in Table 2).

To determine the polymorphism of the selected loci, one Ph. pubescens seedling, nine cultivars or forms and 17 provenances of Ph. pubescens were surveyed using the 19 corresponding primer sets listed in Table 2. We detected little size or sequence variation (data not shown) with the exception of locus PBM014 (Fig. 1), indicating that the 19 microsatellite loci display little polymorphism and that the genetic diversity of Ph. pubescens is low.

Microsatellite analysis in Phyllostachys interspecies hybrids

To test the transferability of the 19 microsatellites to other bamboo species, six diverse Phyllostachys species were selected for cross-species amplification with the SSR primers identified in Ph. pubescens (Table 3). All but one of the markers (94.7%) transferred successfully to at least one other Phyllostachys species. Ten microsatellites (52.6%), corresponding to loci PBM014, PBM016, PBM018, PBM020, PBM022, PBM025, PBM023, PBM027, PBM028 and PBM004 (PBM005), successfully transferred to all six species. The remaining microsatellites failed to amplify correctly or efficiently (e.g. loci PBM002 and PBM007 produced weak amplification products in all six species, and PBM026 produced a product lacking the SSR in five of the six species). The aggregate transferability of the 19 Ph. pubescens SSRs was 75.3%.

We observed clear interspecies variation in most of the 18 microsatellite loci although the size polymorphism was generally subtle, the major exception being locus PBM030 (Table 3). No variation was observed in loci PBM018, PBM020, PBM025 or PBM023, but significant variation was observed in loci PBM014, PBM016, PBM028 and PBM004 (PBM005). Thirteen of the 18 loci showed polymorphism in at least one other species using Ph. pubescens as the comparator (72.2%), which gave an average polymorphism of 66.7% for the 18 successfully transferred loci. PBM014 showed the richest polymorphism among the seven Phyllostachys species, with three amplicons in Ph. nidularia and two in the other species. In the smallest amplicon, all seven Phyllostachys species shared a 244-bp sequence containing the common SSR motif (CT)4 and minimal nucleotide difference. The largest amplicon contained both poly (CT) n with 7–21 copies and poly (CAT) n with 3–5 copies of compound repeat motifs. In addition, differences between the smallest and largest amplicons reflected the CT copy number, the presence or absence of (CAT) n and further insertions and deletions (indels) in the flanking region.

The different SSR copy numbers among these bamboo species provide a series of SSLP (simple sequence length polymorphism) markers that may be useful for the genetic analysis of interspecies hybrids (Fig. 3). Polymorphic microsatellites (SSLPs) were therefore selected to characterize the unidentified interspecies hybrid samples. PBM014 was polymorphic in all of the bamboo species tested and was selected because it yields species-specific alleles for the parental species Ph. kwangsiensis, Ph. bambusoides and Ph. pubescens (Fig. 2a). As indicated in Fig. 3a, clones I and II were heterozygous, carrying alleles from both Ph. kwangsiensis and Ph. bambusoides, whereas clone III was homozygous and carried only the allele of the female parent Ph. kwangsiensis. Sequencing the corresponding bands demonstrated that all of the bands contained the (CT)n and (CAT)n SSR motifs. These data show that clones I and II are interspecies hybrids of Ph. kwangsiensis and Ph. bambusoides, but clone III is not. Similarly, clone IV was confirmed as an interspecies hybrid of Ph. bambusoides and Ph. pubescens.
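The logic of hybrid identification with a co-dominant marker can be sketched as follows: a clone is a putative hybrid when it carries at least one allele that is diagnostic for each candidate parent. This is a minimal illustration, not the authors' scoring procedure. The function name `classify_clone` and all allele sizes are hypothetical; the actual PBM014 amplicon sizes are those given in Table 3 and Fig. 3.

```python
def classify_clone(clone, parent_a, parent_b):
    """Classify a clone from its allele sizes (bp) at one co-dominant locus.

    Each argument is a set of amplicon sizes. Alleles shared by both
    parents are uninformative and are ignored.
    """
    only_a = parent_a - parent_b  # alleles diagnostic for parent A
    only_b = parent_b - parent_a  # alleles diagnostic for parent B
    has_a = bool(clone & only_a)
    has_b = bool(clone & only_b)
    if has_a and has_b:
        return "putative hybrid"
    if has_a:
        return "parent A alleles only"
    if has_b:
        return "parent B alleles only"
    return "undetermined"


# Hypothetical allele sizes, for illustration only.
parent_a = {244, 300}
parent_b = {244, 320}
print(classify_clone({300, 320}, parent_a, parent_b))
print(classify_clone({244, 300}, parent_a, parent_b))
```

In practice the band sizes would be confirmed by sequencing, as was done for PBM014, because co-migrating bands of equal size can differ in sequence.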

Discussion

This study sought to develop microsatellite markers for bamboo based on sequences containing SSRs deposited in public databases. In the past, this method has only been suitable for well-characterized plants such as Arabidopsis thaliana (Bell and Ecker 1994), rice (Cho et al. 2000; Temnykh et al. 2001), and other cereals (Cordeiro et al. 2001; Thiel et al. 2003; La Rota et al. 2005). However, with the advent of low-cost, large-scale DNA sequencing, the pool of DNA sequence information in public databases for bamboo species, especially Ph. pubescens and Dendrocalamopsis oldhamii, has increased rapidly (reviewed by Tang 2009). In this study, we detected more than 3,200 SSRs from 966 GSS and 200 cDNA sequences and determined several notable characteristics of SSRs in the Ph. pubescens genome. Most of the SSRs were dinucleotide repeats (>75%), and 87.7% of the SSRs had fewer than four repeat units. As a result, the SSRs were short: 99% were <20 bp in length and 97% were <12 bp in length, in agreement with our previous characterization of SSR-enriched libraries (data not shown). Although rice shows a similar tendency (Temnykh et al. 2001), the bias in frequency and length variation appears much more pronounced in Ph. pubescens.

Microsatellites can be categorized as class I (SSRs ≥20 nucleotides) or class II (SSRs ≥12 but <20 nucleotides) (Temnykh et al. 2001). Class I microsatellites tend to be highly polymorphic, whereas class II microsatellites show less variability (Weber 1990; Cho et al. 2000; Temnykh et al. 2000). Microsatellites <12 bp tend to mutate at the same rate as unique sequences and therefore demonstrate stochastic variation (Pupko and Graur 1999). To develop polymorphic microsatellite markers for bamboo, we followed the procedure already adopted for rice (Temnykh et al. 2001). We identified 30 candidate microsatellite loci (predominantly class I) and selected 19 for further development (Table 2).
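The length classes defined above can be expressed as a small helper. This is a sketch for illustration: the function name `ssr_class` is an assumption, and the SSR length is taken as motif length multiplied by copy number, which applies to perfect repeats only.

```python
def ssr_class(motif, copies):
    """Assign an SSR to a length class (thresholds from Temnykh et al. 2001)."""
    length = len(motif) * copies  # total repeat length in nucleotides
    if length >= 20:
        return "class I"
    if length >= 12:
        return "class II"
    return "short (<12 nt)"


# Repeats mentioned in the text.
for motif, copies in [("TA", 24), ("TTTTC", 4), ("TA", 9), ("CT", 4)]:
    print("(%s)%d: %s" % (motif, copies, ssr_class(motif, copies)))
```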

Polymorphism was limited among the nine cultivars or forms and 17 provenances of Ph. pubescens (Fig. 1), which nevertheless show genetic variation when typed with AFLP and ISSR markers (Ruan et al. 2008; Lin et al. 2009). At the genus level, however, polymorphism reached an average of 66.7% with Ph. pubescens as the comparator (Table 3), much lower than the 86.0% transferred polymorphism statistically calculated from 1,800 species/primers at the same genus level (Rossetto 2001). Rich polymorphism is the norm for microsatellites (Tautz 1989). For example, in rice, the average polymorphism of microsatellites developed from the GenBank database is 54%, whereas that of microsatellites developed from genomic libraries is 83.8% at the intra-species level (Cho et al. 2000). Therefore, the limited polymorphism observed in this study appears to be another distinguishing feature of Ph. pubescens and other Phyllostachys species. Replication slippage (Schlotterer and Tautz 1992; Richards and Sutherland 1994) and recombination (Jakupiak and Wells 1999) during DNA replication are the principal mechanisms for the diversification of microsatellite alleles. However, the long flowering interval (Janzen 1976; Watanabe et al. 1982) means that Phyllostachys species are propagated clonally more often than sexually, reducing the likelihood of allele diversification, as described for B. arundinacea (Nayak and Rout 2005).

Microsatellite primers developed for one species can be used to detect polymorphism at homologous sites in related species. The transfer success of microsatellites was an average of 76.4% at the genus level and 35.2% at the family level (Rossetto 2001). In rice, the transfer success was >90% at the genus level (Wu and Tanksley 1993). In bamboo, the transfer success of microsatellites in B. arundinacea was 100% (6/6 loci) for species of Bambusa and 83.3% (5/6 loci) for other genera in the Bambusoideae subfamily (calculated from the data provided by Nayak and Rout, 2005). In this study, the transfer success of Ph. pubescens microsatellites was an average of 75.3% at the genus level, lower than that reported in Bambusa. However, when differences concerning amplicon size and the identification of repeat motifs are taken into account, our results are consistent with those of Nayak and Rout (2005).

The frequency of microsatellites in plants is higher in transcribed regions, especially in the untranslated region (Morgante et al. 2002). EST-derived and/or unigene-derived microsatellites demonstrate a high level of transferability to distantly related species, thereby providing additional markers for comparative genomics and evolutionary studies (Cordeiro et al. 2001; La Rota et al. 2005; Zhang et al. 2005; Parida et al. 2006). Barkley et al. (2005) assessed the genetic diversity and phylogenetic relationships of temperate bamboo species using EST-derived SSRs transferred from major cereals, but did not determine the transferability of SSRs between the cereals and bamboo species. In this study, most microsatellite loci could be transferred to all the tested species, which suggested that they might be located in transcribed regions. However, BLAST searches using these Ph. pubescens SSR loci showed no matches to any transcribed or EST regions except for locus PBM014, which is closely related to the sequence encoding a hypothetical rice protein (1074–1412 bp of Os07g0569100). Because it is polymorphic and transfers successfully to all six Phyllostachys species, the PBM014 locus could be used to identify Phyllostachys interspecies hybrids and the corresponding parental alleles. In another of our studies, the PBM014, PBM022, PBM025 and PBM028 loci transferred successfully to bamboo species beyond Phyllostachys. PBM014 and PBM025 provided species-specific alleles for the identification of interspecies hybrids of Sinobambusa tootsik × Pleioblastus distichus, Sasa tokugawana × S. borealis and Pleioblastus simonii × Phyllostachys praecox (Lu et al. 2009). These results demonstrate that microsatellite markers (especially PBM014) are ideal markers for bamboo hybrid identification, as reported in poplar (Rajora and Rahman 2003) and wheat-barley (Malysheva et al. 2003).

Compared with previous methods involving the construction of genomic DNA libraries for Bambusa arundinacea (Nayak and Rout 2005) and an SSR-enriched library for Ph. pubescens (unpublished data), developing microsatellite markers from public database searches is simple, rapid, cost-effective and highly suited to practical applications in plant breeding.