Introduction

Microsatellites or simple sequence repeats (SSRs) are tandem repeat sequences of short units of 2–6 nucleotides that occur frequently in all prokaryotic and eukaryotic genomes studied to date (Koelling et al. 2012). Due to their hyper-variability, reproducibility, high abundance and transferability, multi-allelic nature and co-dominant inheritance, SSR markers have become valuable and reliable tools for genetic diversity analysis, genetic mapping, gene tagging and comparative mapping in plants (Li et al. 2012; Silva et al. 2013). Additionally, SSR markers can be isolated from both conserved coding regions and non-coding nucleotide sequences of all higher organisms (Sraphet et al. 2011). However, the development of genomic SSRs is relatively laborious and time-consuming; alternatively, SSRs have been mined from the public sequence databases of expressed sequence tags (ESTs) (Huang et al. 2011; Koelling et al. 2012; Sraphet et al. 2011; Zeng et al. 2010) or coding sequences by RNA-seq technology (Kaur et al. 2012; Li et al. 2012). Nevertheless, compared with the SSR markers derived from transcriptome sequences, the development of SSRs from the large-scale cloning and sequencing of DNA or insufficient public EST libraries would yield multiple sets of markers at the same locus due to sequence redundancy. To circumvent the problem of redundancy in EST databases, a non-redundant unigene EST data set obtained by de novo transcriptome sequencing technologies should be used.

In recent years, transcriptome sequencing using the next-generation sequencing (NGS) technology platforms of SOLiD, Illumina and 454 has become increasingly popular in many crop species (Li et al. 2010; Lu and Lu 2010). The NGS technology is a tremendously efficient approach for the large-scale generation of reliable and robust transcript sequences (Blanca et al. 2011). Even in model species, such as Arabidopsis thaliana, this deep sequencing is desirable for identifying new transcripts not present in previous EST collections (Blanca et al. 2011). This technology offers a simple, direct and reliable approach for identification and development of massive unigene-based microsatellite markers (UGMS) with diverse motifs through data mining by bioinformatic methods. Additionally, the genic-SSRs target the transcribed regions specifically and increase the potential for linkage to loci that contribute to agronomic phenotypes. Because of their target coding domains that are more likely to be conserved between relatives, the markers can also facilitate better cross-genome comparisons (Dutta et al. 2011). In recent decades, the comprehensive transcriptome of EST sequences by NGS technology has become one of the significant SSR detection resources. To date, a large number of SSR markers based on transcriptome sequences has been developed and utilized in diverse species, such as rice (Yu et al. 2012), cotton (Wang et al. 2012a), orange (Song et al. 2012), linseed (Kale et al. 2012) and orchid (Fu et al. 2011). Recent advances in next-generation sequencing technologies have generated a huge wealth of sequence information, which offers the opportunity to develop genic-SSR markers on a large scale for molecular breeding, genetic mapping and identification of plant species and their varieties with the manual cultivar identification diagram (MCID) (Korir et al. 2013).

Radish, Raphanus sativus L. (2n = 2x = 18), is a major root vegetable crop of the Brassicaceae family. Several molecular markers have been utilized in radish including universal sequence-related amplified polymorphism (SRAP), random amplified polymorphic DNA (RAPD), restriction fragment length polymorphism (RFLP), amplified fragment length polymorphism (AFLP) markers (Kong et al. 2011; Shao et al. 2011) and species-specific SSR markers (Liu et al. 2008). The existence of informative molecular markers can provide valuable insights into genetic and genomic studies in radish (Zhai et al. 2013). Recently, a number of SSRs have been developed from the published EST database (Jiang et al. 2012; Shirasawa et al. 2011; Zhai et al. 2013). However, in comparison with other members of the Brassicaceae family, the development and application of genic-SSR markers from transcriptome sequences in radish are still largely limited. Wang et al. (2012b) identified a total of 14,641 potential EST–SSRs from the unigenes generated by high-throughput transcriptomic sequencing. However, these markers were not further detected and utilized. Thus, developing genic-SSRs by de novo RNA sequencing technology is needed in radish.

In this study, we developed a comprehensive set of genic-SSRs based on de novo transcriptome sequencing, and we characterized uni-transcript sequences of R. sativus (an advanced inbred line, NAU-RG) obtained by Illumina paired-end sequencing technology. A total of 11,928 genic-SSRs from 11,311 unigenes were generated in the de novo transcriptome. Moreover, a set of 5,503 novel genic-SSR markers were developed, and the application of genic-SSR markers was well demonstrated in the genetic diversity analysis of 32 genotypes in radish. All radish genotypes could be quickly distinguished by a combination of six genic-SSR primers with MCID, which was a new strategy making both morphological descriptors and molecular markers easy, referable and practical to use (Korir et al. 2013). Furthermore, for SSRs located in conserved coding regions, they showed high transferability across nine related species in the Brassicaceae family, which makes them beneficial for comparative analysis across relatives in Brassicaceae. These newly developed SSR markers will rapidly enrich the number of functional molecular markers directly related to expressed regions of the genes in radish and be very valuable in facilitating genetic mapping and comparative genome analysis in radish.

Materials and methods

Plant materials

Plants of a radish advanced inbred line, NAU-RG, with a white globular taproot, were grown in a temperature-controlled greenhouse (25 °C) with a relative humidity of 75 % and 14 h of light daily. The taproots for RNA extraction were harvested, placed immediately in liquid nitrogen and then stored at −80 °C. A total of 32 radish genotypes [Electronic Supplementary Material (ESM) Table S1] with different root colors and origins were selected for genetic diversity analysis and cultivar identification, and nine related species in the Brassicaceae were selected for transferability studies.

RNA isolation and cDNA library construction

For Illumina sequencing, the total RNA was extracted from the taproots at seedling, taproot thickening, and mature stages of NAU-RG with Trizol® Reagent (Invitrogen, Carlsbad, CA, USA) following the manufacturer’s instructions, and was stored at −80 °C. The cDNA library construction and sequencing with an Illumina HiSeq™ 2000 were performed at the Beijing Genomics Institute (BGI; Shenzhen, China). The unigenes assembled were annotated using BLASTX against the nr (non-redundant) protein database in the National Center for Biotechnology Information (NCBI) (http://www.ncbi.nlm.nih.gov/blast/Blast.cgi). The unigenes were allocated to the corresponding functional categories based on gene ontology (GO) terms by Blast2GO (Conesa et al. 2005) with GO weight 2 (Huang et al. 2011).

Novel genic-SSR identification and primer design

All the uni-transcript sequences generated by deep transcriptome sequencing in radish were screened for SSRs by SSR Locator software. In this study, SSRs were considered to contain di-, tri-, tetra-, penta- and hexa-nucleotides with minimum repeat numbers of 6, 4, 3, 3 and 3, respectively (Iorizzo et al. 2011; Kaur et al. 2012). Primer pairs flanking the SSRs were designed using Primer3 in accordance with the core criteria of predicted product size ranging from 100 to 500 bp, a GC percentage between 40 and 60 %, optimum primer length of 22 bp and melting temperature between 50 and 60 °C.
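The repeat-number thresholds above can be illustrated with a minimal sketch. This is not SSR Locator, whose internal algorithm is not described here; it is a plain regular-expression scan that applies the same minimum repeat numbers (6, 4, 3, 3 and 3 for di- to hexa-nucleotides). The names MIN_REPEATS, is_primitive and find_ssrs are hypothetical and introduced only for illustration.

```python
import re

# Minimum repeat numbers per motif length, as stated in the text
# (di-, tri-, tetra-, penta-, hexa-nucleotides).
MIN_REPEATS = {2: 6, 3: 4, 4: 3, 5: 3, 6: 3}


def is_primitive(motif):
    """Return False if the motif is itself a repeat of a shorter unit
    (e.g. 'AA' or 'AGAG'), so that each SSR is reported once only."""
    n = len(motif)
    return not any(
        n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n)
    )


def find_ssrs(seq):
    """Return a sorted list of (start, motif, repeat_count) tuples."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # One captured unit followed by at least (min_rep - 1) copies of it.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_primitive(motif):
                hits.append(
                    (match.start(), motif, len(match.group(0)) // unit_len)
                )
    return sorted(hits)


if __name__ == "__main__":
    # Toy sequence (not radish data): 7 x AG and 4 x GAA.
    toy = "TTT" + "AG" * 7 + "CCC" + "GAA" * 4 + "T"
    for start, motif, repeats in find_ssrs(toy):
        print(start, motif, repeats)
```

A dedicated tool additionally handles compound and imperfect repeats, which this sketch deliberately ignores.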

DNA extraction, PCR amplification and detection

Genomic DNA from 32 accessions of radish with different root colors and origins, and from 10 related species of Brassica, was extracted from young leaves using a modified CTAB protocol (Liu et al. 2003).

PCR amplifications were performed on a Thermal Cycler (SensoQuest) with a 15-μl final reaction volume containing 2.0 mM MgCl2, 0.2 mM dNTPs, 0.75 U Taq DNA polymerase (TaKaRa Bio Inc., Dalian, China), 0.1 μM of each primer and 10 ng of template DNA. The PCR conditions comprised an initial denaturation at 94 °C for 2 min; 35 cycles of denaturation at 94 °C for 40 s, annealing at 55–60 °C (depending on the Tm of each primer set) for 45 s and extension at 72 °C for 1 min; and a final extension at 72 °C for 7 min. PCR products were separated on 8 % non-denaturing polyacrylamide gels at 160 V for 2–2.5 h and visualized with a rapid silver staining method (Liu et al. 2008).

Partial amplification products with the expected sizes were recovered from PAGE gels using the AxyPrep DNA gel extraction kit (Axygen Bio Inc., Hangzhou, China). Extracted products were subcloned with the T-A cloning kit (TaKaRa), and the positive clones were sequenced with ABI 3730 at the BGI.

Survey of polymorphism and genetic diversity analysis

To validate the polymorphism of these loci and further study the application of these genic-SSR markers, 32 different radish genotypes consisting of R. sativus and its wild relatives (ESM Table S1) were employed and clustered based on the estimated genetic distance. For dendrogram analysis, only the data for the polymorphic alleles were entered for all DNA samples, scored for their presence (1) or absence (0). Through the similarity matrix based on Jaccard’s coefficient, a dendrogram was calculated using NTSYS-pc2.10e software by the un-weighted pair group method using the arithmetic averages (UPGMA) methodology in the SAHN module, which could illustrate the genetic relationships among the samples (Zhai et al. 2013; Zhu et al. 2012).
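The clustering step can be sketched in a few lines. The analysis in this study was run in NTSYS-pc; the code below is only an illustrative stand-in that computes Jaccard's coefficient from 1/0 band scores and joins clusters by average linkage (UPGMA). The function names and the three toy band profiles are hypothetical and are not data from this study.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard's coefficient for two presence (1) / absence (0) band vectors."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0


def upgma(profiles):
    """Agglomerate samples by average linkage on Jaccard similarity.

    profiles: dict mapping sample name -> list of 0/1 band scores.
    Returns the merge steps as (cluster_a, cluster_b, mean_similarity),
    most similar pair first.
    """
    def mean_similarity(c1, c2):
        # UPGMA: unweighted mean over all between-cluster sample pairs.
        pairs = [(x, y) for x in c1 for y in c2]
        return sum(jaccard(profiles[x], profiles[y]) for x, y in pairs) / len(pairs)

    clusters = [(name,) for name in profiles]
    steps = []
    while len(clusters) > 1:
        c1, c2 = max(combinations(clusters, 2), key=lambda p: mean_similarity(*p))
        steps.append((c1, c2, mean_similarity(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return steps


if __name__ == "__main__":
    # Hypothetical band scores for three samples at four allele positions.
    toy = {"A": [1, 1, 0, 1], "B": [1, 1, 0, 0], "C": [0, 0, 1, 1]}
    for left, right, sim in upgma(toy):
        print(left, right, round(sim, 3))
```

Recomputing the mean over all sample pairs at every step is slow for large panels but keeps the definition of average linkage explicit, which is adequate for a few dozen genotypes.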

For primers that produced the predicted fragments after PCR reactions, the number of alleles was noted, and the polymorphism information content (PIC) value (Smith et al. 1997) of an SSR locus was calculated according to the formula $\mathrm{PIC} = 1 - \sum_i p_i^2$, where $p_i$ is the frequency of the ith allele out of the total number of alleles at each SSR locus. Moreover, a new strategy called manual cultivar identification diagram (MCID) was adopted (Zhang et al. 2012c; Korir et al. 2013), which makes efficient use of the primers and is easy to apply. In this method, the clear specific bands were chosen and manually scored for cultivar identification. The cultivars with a specific band in the fingerprint generated from one primer were separated singly, and those sharing the same banding pattern were clustered into the same subgroup (Wang et al. 2011). Based on additional genic-SSR markers employed with specific band sizes, the 32 radish genotypes were completely separated from each other.
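As a worked illustration of the PIC calculation, the sketch below assumes the commonly used form of the Smith et al. (1997) index, PIC = 1 − Σ p_i², in which higher values indicate a more informative locus. The function name pic and the allele labels are hypothetical.

```python
from collections import Counter


def pic(allele_calls):
    """Polymorphism information content of one locus.

    allele_calls: list of allele labels observed across the samples.
    Assumes the form PIC = 1 - sum(p_i ** 2), where p_i is the
    frequency of the i-th allele among all observed alleles.
    """
    counts = Counter(allele_calls)
    total = sum(counts.values())
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


if __name__ == "__main__":
    # Hypothetical allele calls (band sizes in bp), not data from this study.
    print(pic([300, 300, 310, 310]))   # two equally frequent alleles
    print(pic([300, 300, 300, 300]))   # monomorphic locus
```

A monomorphic locus gives 0, and a locus with k equally frequent alleles gives 1 − 1/k, so the index rises with both the number and the evenness of alleles.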

Detection of the transferability of genic-SSR primers

To assess the transferability of the SSR markers, 37 novel primer pairs were used for amplification in 10 related species of crops in the Brassicaceae family under the above PCR conditions. Subsequently, DNA bands with the expected size fragments in the non-denaturing polyacrylamide gels were recovered and directly cloned into the pMD 18 TA cloning vector (TaKaRa). The sequences of positive clones were analyzed with ClustalX software (http://www.seekbio.com/DownloadShow.asp?id=2247) (Jiang et al. 2012; Zhai et al. 2013).

Results

Uni-transcript sequences and gene ontology analysis

The radish cDNA library was sequenced by the Solexa system and yielded a total of 71.95 million raw reads. After removing the short and low-quality reads and trimming off the adapter sequences and poly-A tails, approximately 66.11 million high-confidence reads were obtained, which were assembled into a total of 73,084 unigenes with an average length of 763 nucleotides and selected for further analyses.

According to the BLASTX results against NCBI databases, in total 3,935 unigenes were successfully associated with GO terms out of 9,876 unique unigene sequences containing SSR loci. The GO-annotated unigenes belonged to the cellular components, molecular functions and biological processes clusters and were classified into 37 categories at process level 2 (Fig. 1). Among the cellular component classification class, cell part (8.70 %), cell (8.70 %), organelle (3.50 %) and macromolecular complex (2.30 %) were the most represented categories. Moreover, the metabolic process (16.60 %) and cellular process (15.90 %) of the biological processes classification contributed the largest proportion of all annotations, followed by biological regulation (5.00 %), pigmentation (4.90 %), localization (2.70 %) and establishment of localization (2.70 %). In the molecular function classification class, binding (25.90 %) and catalytic activity (15.10 %) constituted the two major categories, followed by transcription regulator activity (2.30 %) and transporter activity (1.00 %). Other components were represented at proportions less than 1 % of the total (Fig. 1). These results suggested that those unigenes take part in various biological processes of regulation, growth, development, metabolism and apoptosis in radish.

Fig. 1 Gene Ontology (GO) classification of unigene sequences containing SSR loci from radish taproot. The results are summarized in three main categories: cellular component, molecular function and biological process

Characteristics of genic-SSRs in the radish transcriptome

SSRs were highly abundant in the assembled unigene dataset. In total, 11,928 potential SSRs with a minimum of four repetitions for all motifs were identified from 150,455 contigs generated by Illumina sequencing. The frequency of occurrence of SSR loci was one in every 4.93 kb of unigene sequence. Among all repeat types, the length of SSRs ranged from 12 to 114 bp, with an average of 16.29 bp. Incidences of different repeat types were determined. SSRs existed mainly as dinucleotide repeats (DNR) and trinucleotide repeats (TNR), which together accounted for 91.83 % of the total. TNR, comprising 52 % of the total SSRs, were the most abundant repeat unit, followed by di- (40 %), tetra- (5 %), hexa- (2 %) and penta-nucleotides (1 %), with the repeat unit number of SSR loci ranging from four to 23 (Fig. 2a, b). Most (98.09 %) of the DNRs and TNRs had 4–10 repeat units, while motifs with more than 10 reiterations were rare, with a frequency of <5 %. Within the identified SSRs, 400 motif sequence types were detected; the most frequent motifs among the tetra-, tri- and dinucleotide repeats were TTCC (6.47 %), GAA (7.04 %) and TC (28.13 %), respectively (Fig. 2c).

Fig. 2 Characterization of SSRs in radish taproot transcriptome. a Distribution of different SSR repeat motif types; b number of different repeat motifs; c frequency distribution of major SSRs based on main motif sequence type

The 20 types of major motifs and the frequencies of individual SSR units are shown in Fig. 2c. In this study, the most common types of all the motifs detected in radish contig sequences were AG/CT (1,970; 16.52 %) and GA/TC (15.81 %), followed by GAA/TTC (5.42 %), AGA/TCT (4.71 %), AAG/CTT (4.38 %), CTC/GAG (3.63 %), GGA/TCC (3.49 %) and ATC/GAT (2.95 %), reflecting the AG/CT-rich nature of the Raphanus genome. The other motifs shown in Fig. 2c each made up less than 2.5 % of the total SSRs, and the GC motif was not detected.
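Motif classes such as AG/CT pair each motif with its reverse complement on the opposite strand. A minimal sketch of that pairing and of a pooled tally is given below; the function names are hypothetical, and, in line with the text (which lists AG/CT and GA/TC separately), cyclic shifts of a motif are not pooled.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(motif):
    """Reverse complement of a DNA motif, e.g. 'GAA' -> 'TTC'."""
    return motif.upper().translate(_COMPLEMENT)[::-1]


def strand_pair(motif):
    """Label a motif together with its opposite-strand form, e.g. 'AG/CT'."""
    motif = motif.upper()
    return f"{motif}/{reverse_complement(motif)}"


def tally_motifs(motifs):
    """Count motifs, pooling each motif with its reverse complement.

    The alphabetically smaller form is written first so that 'AG'
    and 'CT' are both counted under the key 'AG/CT'.
    """
    counts = Counter()
    for motif in motifs:
        motif = motif.upper()
        rc = reverse_complement(motif)
        counts[f"{min(motif, rc)}/{max(motif, rc)}"] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical motif list, not data from this study.
    print(strand_pair("GGA"))
    print(tally_motifs(["AG", "CT", "GAA", "TTC", "AG"]))
```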

Development and detection of genic-SSR markers

From 73,084 unigenes of the radish taproot, a total of 11,311 SSR-containing sequences were employed for primer design, from which 5,503 (44.67 %) sequences containing SSRs could be successfully used for SSR primer development. A total of 1,052 SSR primers (ESM Table S2) were randomly selected, synthesized and tested for initial validation by amplification from three radish DNA templates, YB, DY13 and ZQH. Of these SSR primers, 823 (78.23 %) pairs exhibited stable and repeatable amplification. To further confirm the authenticity of the polymorphic microsatellite-containing amplicons, 20 co-dominantly segregating fragments were recovered and sequenced after T-A cloning. The sequences were consistent with those of the original unigenes, indicating that the developed SSR primers were highly specific.

Genetic diversity analysis of radish genotypes

A sample of 67 genic-SSR primers was selected to assess the genetic diversity and to differentiate a set of 32 radish accessions from different countries (ESM Table S1). Eleven primers resulted in weak amplifications or non-specific amplicons, and the remaining primers generated distinct bands of the expected size (Fig. 3a–c). Out of these 56 primers, 17 genic-SSR markers generated monomorphic or poorly polymorphic bands. A total of 39 analyzed primer pairs showed allelic polymorphisms, and 211 alleles were detected in total. The number of alleles per locus ranged from 1 to 10 with an average of 3.6. The PIC values of the primers with polymorphisms varied from 0.49 to 0.89, with a mean value of 0.66. The sequences of these informative genic-SSR primers and other major information including core motif, annealing temperature, PIC value and expected size of the PCR products are shown in Table 1.

Fig. 3 Genetic diversity analysis of 32 radish genotypes with SSR markers. Polyacrylamide gel electrophoresis patterns amplified with SSR primers UgRsr-6 (a), UgRsr-56 (b) and UgRsr-18 (c), and the UPGMA dendrogram constructed from SSR marker analysis (d). M: DNA ladder 100 bp; numbers (1–32) represent radish genotypes listed in ESM Table S1

Table 1 Genic-SSR primers used for radish genetic diversity analysis and cultivar identification

Based on the genetic similarity results, a dendrogram placed the 32 radish accessions into three main clusters with similarity coefficients ranging from 0.59 to 0.90 (Fig. 3d). The first main cluster (I) consisted of 11 accessions collected from China, most of which had a different color root skin. In this group, there were three subgroups at a cut-off point of 0.67 similarity coefficient primarily according to the color of the xylem and phloem of the taproot. Additionally, ZQH showed a high similarity with WH, both of which had red phloem and white xylem of taproot in medium maturity. WXQ, BJXLM, QTBJQ and YZH were distinctly divergent from the other subgroups in cluster I with a different colored phloem taproot. The second main cluster (II) comprised 15 accessions and further separated into five sub-clusters mainly based on geographical sources, color of taproot and the number of chromosomes. The cultivars in the subgroup IIb mainly originated from America and Korea, while the others were collected from China. In comparison to all accessions of radish in this study, only Lw23 was grouped into subgroup IIe with nine chromosomes. The five wild relatives showed a strong relationship, some of them having purple long siliques (more than 20 cm) or deep purple flowers, which was very different from the local radish cultivars. In addition, all accessions in the group IIIb were early maturing with a white skin root.

MCID of cultivar identification with genic-SSR markers

A total of six genic-SSR primers with reproducible and polymorphic bands were successfully employed to identify the 32 radish cultivars. Of the six primers used, primer UgRsr-6 was the first to be screened and used in the identification (Fig. 4). The electrophoresis results show that primer UgRsr-6 (Fig. 3a) generated two polymorphic bands in 32 cultivars, which could separate these radish genotypes into three groups by the presence or absence of distinct 300- and 310-bp bands. Subsequently, primers UgRsr-28 and UgRsr-18 further separated the three groups of cultivars into smaller groups or singly, such as NPZ, XBY and YZH. Following this cultivar identification procedure, the other five primers (Fig. 4; Table 1) were screened step by step and chosen to differentiate the radish cultivars. By primer UgRsr-11, all the 32 cultivars were completely separated in the MCID (Fig. 4). It should be emphasized that only the clear polymorphic bands generated with each primer were used to differentiate the cultivars. In addition, the sizes and the presence/absence of polymorphic bands used in the MCID (Fig. 4) can make the diagram clear and referable. These observations revealed that the MCID is a valuable and efficient strategy for cultivar identification in radish.
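The stepwise logic of the MCID can be sketched as a repeated partition of the cultivar set by the presence or absence of one polymorphic band at a time. The code below is an illustrative sketch only: the function name mcid_partition, the band labels and the four toy cultivars are hypothetical, and in practice the bands are chosen and scored manually from the gels.

```python
def mcid_partition(bands, markers):
    """Split cultivars step by step on presence/absence of chosen bands.

    bands:   dict mapping cultivar -> dict mapping band label -> 1 or 0.
    markers: band labels in the order in which they are applied.
    Returns (groups, number_of_bands_used); the procedure stops early
    once every cultivar sits in its own group.
    """
    groups = [sorted(bands)]
    used = 0
    for marker in markers:
        if all(len(group) == 1 for group in groups):
            break
        next_groups = []
        for group in groups:
            present = [c for c in group if bands[c][marker]]
            absent = [c for c in group if not bands[c][marker]]
            # Keep only non-empty branches of the diagram.
            next_groups.extend(g for g in (present, absent) if g)
        groups = next_groups
        used += 1
    return groups, used


if __name__ == "__main__":
    # Hypothetical presence/absence scores for four cultivars.
    toy = {
        "C1": {"b300": 1, "b310": 0, "b200": 1},
        "C2": {"b300": 1, "b310": 0, "b200": 0},
        "C3": {"b300": 0, "b310": 1, "b200": 1},
        "C4": {"b300": 0, "b310": 1, "b200": 0},
    }
    groups, used = mcid_partition(toy, ["b300", "b310", "b200"])
    print(groups, used)
```

In the toy data the second band adds no resolution because it mirrors the first one, which illustrates why the order and choice of bands determine how few primers are needed.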

Fig. 4 MCID analysis of the 32 radish genotypes with the DNA fingerprints of six genic-SSR primers. Number above each horizontal line in the diagram denotes the size of the polymorphic bands used to separate the genotypes following the line, reported in bp; (+) or (−) denotes the presence or absence of the polymorphic band

Applicability of genic-SSR primers to other species in Brassicaceae

To examine cross-species transferability, we randomly selected 45 stable and reliable SSR primers for amplification in nine different species of the Brassicaceae family (ESM Fig. S1). Overall, 39 of the 45 (86.67 %) genic-SSR primers showed transferability to one or more of the nine related Brassica species tested; among these, a total of 14 primers yielded distinct and stable bands in all nine species. The specific polymorphic markers were further sequenced. The DNA sequences were aligned, and SSR motifs with different copy numbers were identified (ESM Fig. S1), suggesting that the regions flanking the SSRs were sufficiently conserved across genera of the same family. The relatively high level of homology of the genic-SSRs observed in the coding regions demonstrated that these newly developed radish SSR markers were applicable and reliable in some related species of the Brassicaceae family.
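The transferability figures reduce to simple counts over a primer-by-species amplification table, as sketched below. The function name and the toy table are hypothetical; only the final line uses numbers from the text (39 of 45 primers).

```python
def transferability(amplified):
    """Summarize cross-species amplification.

    amplified: dict mapping primer name -> list of booleans, one per
               species (True if a distinct band of expected size was seen).
    Returns (percent of primers amplifying in at least one species,
             number of primers amplifying in every species).
    """
    n_primers = len(amplified)
    in_any = sum(1 for hits in amplified.values() if any(hits))
    in_all = sum(1 for hits in amplified.values() if all(hits))
    return 100.0 * in_any / n_primers, in_all


if __name__ == "__main__":
    # Hypothetical results for four primers in three species.
    toy = {
        "p1": [True, True, True],
        "p2": [True, False, False],
        "p3": [False, False, False],
        "p4": [True, True, True],
    }
    print(transferability(toy))
    # Rate reported in the text: 39 of 45 primers.
    print(round(100 * 39 / 45, 2))
```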

Discussion

Genic-SSR markers are considered to have strong potential for diversity analysis and genetic mapping in crop species due to their specificity and high degree of conservation (Zhang et al. 2012a; Dutta et al. 2011). An increasing number of studies have demonstrated that NGS-based SSR marker identification is an efficient and cost-effective strategy. In order to develop more genic-SSR markers, we sequenced the transcriptome of radish taproots and identified 5,503 genic-SSR primer pairs, out of which 1,052 SSR markers were selected and validated. Functional markers were used in genetic diversity analysis and cultivar identification among 32 different genotypes of radish and in transferability analysis across nine relatives in the Brassicaceae.

Due to the steady decrease in cost per sequenced nucleotide and increase in throughput data, NGS technologies have become a powerful approach for the high-throughput discovery of genes and generate a large amount of sequence data for molecular marker identification (Silva et al. 2013; Iorizzo et al. 2011). Recently, de novo transcriptome assemblies using Illumina sequences have been successfully developed and widely applied in a few important plant species, including rice, wheat and maize (Duan et al. 2012; Li et al. 2010; Lu and Lu 2010). Large-scale root-specific transcriptome analysis could provide useful reference data to profile systemic gene expression and clarify the genetic mechanisms underlying radish taproot formation. To provide a better assembly and coverage of the transcriptome in radish roots, in this study a large collection of longer reads was generated for further novel specific gene discovery in taproot and novel genic-SSR marker development (Wang et al. 2010, 2012c).

In recent years, transcriptome sequencing has proven to be one of the most powerful sources for SSR development in many important plant species, and a large number of transcriptome-based SSRs have been extensively used in genetic diversity analysis, molecular mapping and gene-based association studies (Zhang et al. 2012b; Zeng et al. 2010). Many studies showed that the de novo transcriptome could provide high-quality flanking regions for SSR primer design (Silva et al. 2013). In this study, the repeat sequences were also distributed throughout the genome; about 7.93 % of the radish transcriptome dataset possessed at least one SSR, which is within the range of frequencies reported previously for other dicotyledonous species (2.65–16.82 %). On average, one SSR occurred every 4.93 kb in radish, a higher density than the one SSR per 5.4 kb reported in wheat (Peng and Lapitan 2005), 14.0 kb in poplar and Arabidopsis (Cardle et al. 2000), 20.0 kb in cotton (Jena et al. 2011) and 23.8 kb in soybean (Gao et al. 2003). Differences in SSR frequency and abundance might be partially due to the genome composition, dataset size, SSR detection tools and microsatellite mining criteria (Biswas et al. 2012). Nevertheless, our results showed that the SSRs with trinucleotide repeats (52 %) were predominant, which is consistent with previous reports on radish (Wang et al. 2012c; Shirasawa et al. 2011) and other plant species, including rice, peanut, field pea and faba bean (Liang et al. 2009; Zhang et al. 2012b; Kaur et al. 2012).

Among trinucleotide repeats in coding regions, the GAA/TTC motif was the most frequent in radish; this finding was in accordance with other reports in pea (Gong et al. 2010), sesame (Zhang et al. 2012a) and peanut (Liang et al. 2009). Moreover, the dinucleotide motif AG/CT showed the highest frequency (16.52 %), followed by GA/TC, similar to the recent results in radish and other species (Wang et al. 2012c; Zhang et al. 2012a). Although the functional significance of SSRs in plant transcript regions is not quite clear, the motif AG/CT, a homopurine–homopyrimidine stretch with high frequency in the 5′ untranslated region, has been reported to play a role in the regulation of gene expression and nucleic acid metabolism in plants (Martienssen and Colot 2001; Scaglione et al. 2009; Wöhrmann and Weising 2011). Furthermore, it seems to be a common feature in eukaryotic genomes that (GC)n repeats are extremely rare (Biswas et al. 2012; Feng et al. 2009).

Functional genetic markers like EST–SSRs, which are attractive for their multi-allelic detection, high cross-species transferability and reproducibility, have increasingly become a powerful tool for gaining insight into genetic studies (Kaur et al. 2012). To date, with the advent of NGS technologies, a large-scale superior resource for gene-based SSR marker discovery has directly or indirectly enabled rapid progress in marker-assisted breeding (Iorizzo et al. 2011; Parchman et al. 2010). A great number of genic-SSR markers have been identified in many plants (Kale et al. 2012; Hiremath et al. 2011) by analysis of the transcriptome sequence data generated. In the present study, a very large number of SSR primer sets were successfully designed from transcriptome sequences of radish taproot, which was discriminative to a certain degree as compared to the previous transcriptome report in radish (Wang et al. 2012c). Most of the primers yielded specific bands, while several markers amplified much larger bands than expected, which was attributable to the presence of a small intron between the two primers of a pair or to variation in the repeat numbers (Gasic et al. 2009; Zeng et al. 2010). Furthermore, in spite of various tests under multiple amplification conditions, the remainder of the primers still did not produce PCR fragments, probably due to assembly errors in cDNA contigs, the existence of null alleles or large introns, or primer pairs designed across a splice site, as previously described (Dutta et al. 2011).

Recently, the application of genic-SSR markers for diversity analyses and fingerprinting has been reported in several plant species (Barchi et al. 2011; Triwitayakorn et al. 2011). In this study, genic-SSR markers were demonstrated to be efficient at differentiating 32 radish cultivars, and the distribution was not completely based on their geographical sources, which is consistent with some previous studies (Zhang et al. 2012a; Hernan and Petr 2006). Most of the varieties released in China were clustered in the same subgroup in the dendrogram, suggesting the limited genetic diversity and narrow basis of Chinese radish cultivars (Fig. 3d). To enlarge the genetic basis, more exotic accessions should be used in future radish breeding programs. In conclusion, the genic-SSR markers from the transcriptome sequence were appropriate and superior markers for the discrimination of cultivated landraces and wild species (Dutta et al. 2011).

Manual cultivar identification diagram (MCID) is a new strategy which is more effective, economical and practical for identifying plant cultivars using fewer primers (Zhang et al. 2012c). It can enhance the power of the markers developed in this study in radish cultivar identification and gives reliable information for future rapid identification work. In MCID, only the genic-SSR markers with clear polymorphic bands are selected to gradually distinguish the individual samples, and this method creates a readable and recordable flow chart (Wang et al. 2011), making plant cultivar identification much easier than before (Korir et al. 2013). In this study, six genic-SSR primers were sufficient to distinguish all 32 radish cultivars using the MCID strategy. This is the first report, to our knowledge, that assesses the use of genic-SSR markers for the complete identification of radish cultivars by the MCID strategy.

It is generally accepted that polymorphic SSR markers are invaluable in marker-assisted genetic studies. Moreover, the PIC values, indicating substantial genetic information, can be used to assess the degrees of polymorphism of informative SSR markers in radish. According to the criterion previously described where the three categories were defined as high (PIC > 0.5), moderate (0.25 < PIC < 0.5) and low (PIC < 0.25) (Yadav et al. 2011; Xu et al. 2012), approximately two-thirds of primers reported herein exhibited moderate or high levels of PIC (Table 1). Generally, genic-SSR markers show a lower level of polymorphism than genomic-SSR markers; in this study, however, the level was similar to or slightly higher than that reported in previous studies (Dutta et al. 2011; Wang et al. 2012d). In addition, the number of alleles produced may be correlated with heterozygosity level and microsatellite polymorphism.
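The three PIC categories can be expressed as a small helper. The text leaves the boundary values 0.25 and 0.5 themselves unassigned; placing them in the upper category is an assumption of this sketch, as is the function name classify_pic.

```python
def classify_pic(value):
    """Assign a PIC value to the high / moderate / low category.

    Thresholds follow the criterion cited in the text; assigning the
    exact boundary values (0.25 and 0.5) to the upper category is an
    assumption made here.
    """
    if value >= 0.5:
        return "high"
    if value >= 0.25:
        return "moderate"
    return "low"


if __name__ == "__main__":
    # 0.49, 0.66 and 0.89 are the minimum, mean and maximum PIC
    # values reported for the polymorphic primers in this study.
    for value in (0.49, 0.66, 0.89):
        print(value, classify_pic(value))
```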

As the gene-based genetic markers generated in the current study were designed in transcript regions, they could be highly conserved and transferable to closely related species (Kaur et al. 2012). Recently, a considerable degree of transferability of microsatellite markers from one species to other species or genera has been demonstrated within several plant families including cereals (Ince et al. 2010), Leguminosae (Gutierrez et al. 2005) and Rosaceae (Gasic et al. 2009). In this study, similar to EST–SSRs, the genic-SSR markers designed from the transcriptome sequences and representing putative function were observed with a relatively high level of transferability from radish to some related Brassicaceae species. The success of highly efficient amplification was also observed in other research studies, where 93 of 108 SSR primers (86.1 %) adopted from other Prunus species were transferable to chokecherry (Wang et al. 2012b). While a lower transferability rate was observed within Medicago truncatula microsatellites to chickpea (36.3 %) and pea (37.6 %) (Gutierrez et al. 2005), the rate of genic-SSR marker transferability in different species is related to the genetic distance between the species from which the genic-SSR markers were developed and other species. Therefore, these novel genic-SSR markers developed in the present study with a relatively high level of transferability will help advance the investigation of comparative mapping analyses in the Brassicaceae family.

In summary, the genic-SSR marker prediction and development in non-model organisms by Illumina paired-end sequencing technology is a much faster and more cost-effective approach. With their polymorphism, reproducibility and transferability, these newly developed genic-SSR markers will be valuable molecular tools for germplasm identification, genetic mapping, gene tagging, comparative mapping and genetic diversity analyses in radish and relative species.