Introduction

Zingiber officinale Roscoe (Zingiberaceae) is a perennial plant. It is native to tropical climates of India, Malaysia, Australia, China, Brazil, the United States and several other parts of the world (Langner et al. 1998). The rhizome of Z. officinale, commonly called ginger, is generally consumed as a spice for its flavor-enhancing effects. Gingerols, shogaols, paradols and zingerone are the main phytoconstituents responsible for its pungency and flavor (Pour et al. 2014). Ginger is also an age-old medicine used for its wide array of pharmacological activities. It has been reported to have carminative, gastroprotective, antiemetic, antitussive, antipyretic, spasmolytic, analgesic and peripheral circulatory stimulant effects (Suekawa et al. 1984; Ghosh et al. 2011; Marx et al. 2013; Haniadka et al. 2013; Pour et al. 2014). Ginger extract has been shown to possess anti-inflammatory and anticancer activities (Miyoshi et al. 2003; Habib et al. 2008). It has also shown effective glycaemic control in diabetes mellitus and related complications (Li et al. 2012).

Most commercially cultivated varieties of Z. officinale rarely flower and generally do not produce viable seeds (Inden et al. 1988). Nevertheless, the cultivated varieties show considerable diversity in rhizome color, size, aroma, fiber content and chemical profile (Pandotra et al. 2013a; Khan et al. 2016). The present study was planned to understand the genetic basis of the high morphological diversity observed in Z. officinale despite its vegetative propagation. Expressed sequence tag (EST) databases offer a rich source of information for this purpose, and EST-SSRs have been widely used for diversity analysis and for the development of new molecular markers in plant species. Simple sequence repeats (SSRs), or microsatellites, are short repetitive DNA sequences that arise through slipped-strand mispairing (Leclercq et al. 2010). SSRs found in coding sequences (EST-SSRs) have also been shown to be polymorphic across species, which may be helpful in phylogenetic analysis (Gandhi et al. 2010; Awasthi et al. 2012) as well as in plant breeding studies (Scott et al. 2000; Pashley et al. 2006).

Earlier, we studied chemical diversity, differential ability to accumulate heavy metals, morphological diversity, and genetic diversity based on inter simple sequence repeat and retrotransposon-based markers in various accessions of Z. officinale (Pandotra et al. 2013a, b; Khan et al. 2016). The present study characterizes the ESTs of Z. officinale for the types of SSRs present and annotates their putative functions and roles in different biological processes. Apart from studying diversity among accessions of Z. officinale, the EST-SSR markers developed in this study were also assessed for their transferability to other species of the family Zingiberaceae.

Materials and methods

Collection of plant material

Twenty-five accessions of Z. officinale were collected from different locations in India, as shown in Table 1. Zingiber zerumbet (L.) Roscoe ex Sm. (Zingiberaceae), Hedychium spicatum Sm. (Zingiberaceae), Curcuma longa L. (Zingiberaceae), C. amada Roxb. (Zingiberaceae), C. aeruginosa Roxb. (Zingiberaceae), C. aromatica Salisb. (Zingiberaceae) and C. angustifolia Roxb. (Zingiberaceae) plants were used to assess the cross-species transferability of the EST-SSR markers. These plants were grown in institutional experimental farms (Jammu, India; 32°44′N, 74°54′E), as described earlier (Pandotra et al. 2013b).

Table 1 Locations of different accessions of Z. officinale collected from various regions of India

Data mining for EST-SSR markers and primer designing

A total of 38,115 EST sequences of Z. officinale were downloaded from the NCBI GenBank database (http://www.ncbi.nlm.nih.gov). EST-SSRs were mined using the MIcroSAtellite identification tool (MISA, http://pgrc.ipk-gatersleben.de/misa/). The parameters used for SSR identification were the same as described in our previous studies (Awasthi et al. 2012; Gandhi et al. 2010; Mahajan et al. 2015). Primers for the EST-SSR study were designed using PRIMER3 software (http://primer3.ut.ee/).
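The kind of search that an SSR-mining tool performs can be illustrated with a minimal Python sketch. This is not MISA itself and does not reproduce the parameters of the study, which are given in the cited earlier papers. The minimum repeat counts in `MIN_REPEATS`, the helper names and the example sequence are all assumptions chosen for illustration.

```python
import re

# Assumed minimum repeat counts per motif length (hypothetical values;
# the study's actual MISA parameters are described in the cited papers).
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'ATAT')."""
    n = len(motif)
    return not any(
        n % d == 0 and motif[:d] * (n // d) == motif for d in range(1, n)
    )


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for k, n_min in sorted(min_repeats.items()):
        # One motif of length k followed by at least n_min - 1 further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n_min - 1))
        pos = 0
        while True:
            m = pattern.search(seq, pos)
            if m is None:
                break
            motif = m.group(1)
            if is_primitive(motif):
                hits.append((m.start(), motif, len(m.group(0)) // k))
                pos = m.end()
            else:
                # Skip motifs such as 'AA' or 'ATAT' and keep scanning.
                pos = m.start() + 1
    return sorted(hits)


# Hypothetical EST fragment: a (GA)8 and a (TCC)6 repeat with short flanks.
est = "GGCTC" + "GA" * 8 + "TTCAGA" + "TCC" * 6 + "AGT"
print(find_ssrs(est))  # [(5, 'GA', 8), (27, 'TCC', 6)]
```

The sketch reports only perfect repeats of di- to hexanucleotide motifs; compound or interrupted SSRs, which dedicated tools can also report, are outside its scope.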

DNA isolation and amplification of genic microsatellite markers

Total DNA was isolated from young leaves by the CTAB method (Doyle and Doyle 1987) and analyzed by electrophoresis and with a NanoDrop 2000c spectrophotometer (Thermo Scientific, MA, USA). Polymerase chain reaction (PCR) was carried out in a 20 µl reaction volume in a PCR machine (Eppendorf, Germany). The reaction mixture comprised 50–100 ng of genomic DNA, PCR buffer (10×; with MgCl2), 2 µl dNTPs (2 mM), 2 µl of each primer (0.5 µM) and 1 U of Taq DNA polymerase (New England Biolabs, England, UK). The PCR conditions were as follows: initial denaturation at 95 °C for 5 min, followed by 35 cycles of 95 °C for 1 min, 1 min at Tm (for details see Table 2) and 72 °C for 1 min, then a final extension at 72 °C for 10 min followed by a hold at 4 °C. A control reaction was performed using 18S rDNA primers to ascertain the presence of amplifiable genomic DNA. PCR products were evaluated by polyacrylamide gel electrophoresis (PAGE) for changes in amplicon size. Amplicon size changes of up to ± 15 base pairs were taken into consideration; the remainder were discarded as possible indels or mispriming.
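The ± 15 bp criterion can be written as a simple filter. The sketch below assumes that size changes are measured against the expected product size of a primer pair, which the text does not state explicitly. The accession labels and band sizes are hypothetical.

```python
TOLERANCE_BP = 15  # size window stated in the text


def within_tolerance(observed_bp, expected_bp, tolerance_bp=TOLERANCE_BP):
    """True if the band lies within +/- tolerance_bp of the expected size."""
    return abs(observed_bp - expected_bp) <= tolerance_bp


# Hypothetical band sizes (bp) for one marker with an assumed expected size.
expected_size = 180
observed = {"acc01": 180, "acc02": 186, "acc03": 171, "acc04": 240}

scored = {
    acc: size
    for acc, size in observed.items()
    if within_tolerance(size, expected_size)
}
print(scored)  # {'acc01': 180, 'acc02': 186, 'acc03': 171}
```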

Table 2 Primers used in the study

Statistical evaluations

EST-SSR bands were scored as discrete variables: the presence and absence of a band were scored as 1 and 0, respectively (binary data). The Dice coefficient of similarity (Dice 1945) was calculated between accessions to examine their genetic relatedness. To plot the dendrogram, the similarity matrix generated using the Dice coefficient was clustered with the ‘SAHN’ subroutine using the UPGMA (Unweighted Pair Group Method with Arithmetic mean) method. NTSYSpc ver 2.2 was used for the statistical evaluation.
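The two steps described above, Dice similarity on binary band scores followed by UPGMA clustering, can be sketched in plain Python. This is an illustration of the general technique, not a reproduction of the NTSYSpc routines. The band profiles, accession labels and function names are hypothetical.

```python
from itertools import combinations


def dice(x, y):
    """Dice similarity 2a / (2a + b + c) for two binary band profiles."""
    shared = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    total = sum(x) + sum(y)
    return 2 * shared / total if total else 0.0


def upgma(profiles):
    """Merge the most similar clusters first, using the average similarity
    over all member pairs. Returns a list of (cluster_1, cluster_2, similarity)."""
    names = list(profiles)
    sim = {
        frozenset(pair): dice(profiles[pair[0]], profiles[pair[1]])
        for pair in combinations(names, 2)
    }

    def avg_sim(c1, c2):
        return sum(sim[frozenset((a, b))] for a in c1 for b in c2) / (
            len(c1) * len(c2)
        )

    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > 1:
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda ij: avg_sim(clusters[ij[0]], clusters[ij[1]]),
        )
        merges.append((clusters[i], clusters[j], avg_sim(clusters[i], clusters[j])))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges


# Hypothetical 1/0 band scores for four accessions across six bands.
profiles = {
    "A": [1, 1, 0, 1, 0, 1],
    "B": [1, 1, 0, 1, 1, 1],
    "C": [0, 1, 1, 0, 1, 0],
    "D": [0, 1, 1, 0, 1, 1],
}
merges = upgma(profiles)
for left, right, similarity in merges:
    print(left, right, round(similarity, 3))
```

With these made-up profiles, A and B join first, C and D join second, and the two pairs merge last at the mean of the four between-pair similarities. The merge order and similarity levels are what a dendrogram displays.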

Results

Identification and characterization of EST-SSR markers from Z. officinale

We mined the EST resource of Z. officinale, containing a total of 38,115 ESTs, and clustered the sequences using MegaBLAST. Clusters were assembled using CAP3 to generate 7850 contigs and 10,762 singletons. SSRs were searched in these unigenes, and 512 microsatellite-containing ESTs were identified, with 500 ESTs having single SSRs while 15 ESTs had more than 2 SSRs. Of these, 349 EST sequences were found appropriate for primer designing. The frequency of SSRs was 1 per 25.21 kb. Five different repeat classes were identified (di-, tri-, tetra-, penta- and hexanucleotide). As expected, tri-nucleotide repeats were the most abundant SSRs in the ESTs (39.74%), followed by di- (24.48%), hexa- (13.75%), tetra- (12.24%) and penta-nucleotide repeats (9.79%), as summarized in Supplementary Table 1. Among the di-nucleotide repeats, there was a distinct predominance of TA (24.6%, 32/130) and GA (21.5%, 28/130) repeats, with low frequencies of other di-nucleotide repeats. Among the tri-nucleotide repeats, CGC (5.2%, 11/211), CTC (5.2%, 11/211), GAG (5.2%, 11/211), GGA (5.6%, 12/211) and TCC (7.1%, 15/211) repeats were the most common. The most frequent tetra- and penta-nucleotide repeats were TTTG (6.1%, 4/65) and AATAT (7.6%, 4/52), respectively, but their frequencies were low. We also identified hexa-nucleotide repeats, but most of them were found only once (Supplementary Table 2).
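The reported class percentages can be cross-checked against the reported counts with a short calculation. In the sketch below, the total number of SSRs and the hexa-nucleotide count are not stated in the text; they are inferred from the reported counts and percentages and should be read as derived values, not as reported data.

```python
# Reported (count, percent) per repeat class; the hexa count is not reported.
reported = {
    "di": (130, 24.48),
    "tri": (211, 39.74),
    "tetra": (65, 12.24),
    "penta": (52, 9.79),
}
hexa_percent = 13.75

# Each class implies a total via count / (percent / 100); average and round.
implied_totals = [count / (pct / 100) for count, pct in reported.values()]
total_ssrs = round(sum(implied_totals) / len(implied_totals))  # inferred
hexa_count = round(total_ssrs * hexa_percent / 100)  # inferred

# Share of unigenes (contigs + singletons) that contain an SSR.
unigenes = 7850 + 10762
ssr_unigene_percent = 100 * 512 / unigenes

print(total_ssrs, hexa_count)  # 531 73
print(round(ssr_unigene_percent, 2))  # 2.75
```

Under this inference the five classes sum to 531 SSRs, and each recomputed percentage matches the reported value to two decimal places.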

Functional annotation and classification of EST-SSR sequences from Z. officinale

The 349 EST sequences were annotated using the Blast2GO tool (Conesa et al. 2005). Each query sequence was aligned against the non-redundant protein sequences of the NCBI database. A match with an E-value of 10⁻⁶ or less was considered significant. Gene ontology analysis was performed to determine the biological processes, molecular functions and cellular components associated with these ESTs. Several ESTs were found to be involved in biological processes such as transport, protein modification, metabolic processes and responses to different stresses, while many had hydrolase or kinase activity and binding domains for nucleic acids, lipids and proteins (Supplementary Fig. 1).
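The significance threshold can be illustrated with a small filter over BLAST results. The sketch assumes hits exported in the standard 12-column BLAST tabular layout, in which the E-value is the eleventh column. It is not part of the Blast2GO workflow, and the query and subject identifiers are hypothetical.

```python
E_VALUE_CUTOFF = 1e-6  # threshold stated in the text


def significant_hits(tabular_lines, cutoff=E_VALUE_CUTOFF):
    """Keep the lowest E-value hit per query, if it is <= cutoff.

    Assumes tab-separated BLAST tabular output with 12 standard columns:
    query, subject, identity, length, mismatches, gap opens, query start,
    query end, subject start, subject end, E-value, bit score.
    """
    best = {}
    for line in tabular_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        query, subject, evalue = fields[0], fields[1], float(fields[10])
        if evalue <= cutoff and (query not in best or evalue < best[query][1]):
            best[query] = (subject, evalue)
    return best


# Hypothetical hits: two for EST_001, one weak hit for EST_002.
hits = [
    "EST_001\tprot_A\t85.0\t120\t18\t0\t1\t360\t10\t129\t3e-42\t150",
    "EST_001\tprot_B\t60.0\t100\t40\t0\t1\t300\t5\t104\t2e-08\t60",
    "EST_002\tprot_C\t40.0\t50\t30\t0\t1\t150\t1\t50\t0.003\t30",
]
print(significant_hits(hits))  # {'EST_001': ('prot_A', 3e-42)}
```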

EST-SSR marker development and cross-transferability within Zingiberaceae

Twenty-five accessions of Z. officinale, collected from different regions of India, were used for wet-lab validation of the EST-SSR markers (Table 1). Prior to validation, the cross-species transferability of these EST-SSR markers was assessed by in silico PCR against Curcuma longa unigenes. A total of 12,565 ESTs from C. longa were assembled, in the same way as described for Z. officinale, into 2978 contigs and 4072 singletons. In silico amplification was successful for 116 primer pairs (data not shown). Of the 349 primer pairs designed from the EST database of Z. officinale, 25 randomly selected primer pairs were used for PCR. PCR products were run on agarose gels to check for gross amplification, mispriming and null alleles. Null alleles were confirmed by using at least five times more template in the PCR. Of the 25 primer sets, 16 were optimized for the EST-SSR study. PCR products were then run on PAGE to test for changes in amplicon size. A representative image showing amplification across the 25 accessions with three EST-SSR markers (GES440, GES452 and GES454) is shown in Fig. 1. The optimized EST-SSR markers were checked for cross-species transferability among seven species of Zingiberaceae: Z. zerumbet, H. spicatum, C. longa, C. amada, C. aeruginosa, C. aromatica and C. angustifolia. The markers GES454, GES466, GES480 and GES486 were transferable to all the accessions of Zingiber, Curcuma and H. spicatum, while GES452 and GES456 were transferable to all species under study except one (Supplementary Fig. 2a). The percentage transferability was 100% for GES454, GES466, GES480 and GES486 (Supplementary Fig. 2b).
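The idea behind in silico PCR can be sketched as a search for both primer sites on a template sequence. The version below is a minimal illustration that accepts exact matches only. The tool and mismatch settings used in the study are not stated, and the template and primer sequences here are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def in_silico_pcr(template, forward, reverse):
    """Predicted amplicon length, or None if either primer lacks an exact site.

    The forward primer is searched on the given strand; the reverse primer
    is searched as its reverse complement downstream of the forward site.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return end + len(reverse_site) - start


# Hypothetical unigene: flank, (GA)8 repeat, flank.
template = "TTGACCATGCAAGTC" + "GA" * 8 + "CCTTAGGCATCGGA" + "AAT"
forward_primer = "GACCATGCAAGTC"
reverse_primer = reverse_complement("CCTTAGGCATCGGA")

print(in_silico_pcr(template, forward_primer, reverse_primer))  # 43
print(in_silico_pcr(template, "AAAAAAAAAAAAA", reverse_primer))  # None
```

A primer pair designed on one species would count as transferable in silico when a call like this returns a product on a unigene of the other species. Real tools usually also tolerate a small number of mismatches.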

Fig. 1 Development of EST-SSR markers in different accessions of Z. officinale collected from various regions of India. a Representative gel image of three EST-SSR markers (out of 25) used in the study. 18S rDNA was used as control. Optimized PCRs were run on PAGE to test for changes in amplicon size. Amplicon size changes of up to ± 15 base pairs were taken into consideration. b Dendrogram generated using UPGMA analysis, depicting phylogenetic relationships amongst the different accessions

Discussion

Zingiber officinale is an important medicinal spice with a wide range of pharmaceutical properties (Marx et al. 2013; Haniadka et al. 2013). Genetic variability observed in vegetatively propagated crops, such as Z. officinale, may be due to ancestral differences and/or spontaneous mutations (Elias et al. 2000; Jankowicz-Cieslak et al. 2012). Microsatellite expansion/contraction and retrotransposon mobility are amongst the key drivers of spontaneous mutations (Ellegren 2004). Here, we have made a systematic attempt to study the genic (protein-coding) sequences of this plant for the development of genetic markers. SSRs may be found in non-coding as well as protein-coding (genic) DNA (Tóth et al. 2000; Zhao et al. 2014). Variation in the length of SSRs present in genic sequences, however, may have profound effects on morphology, chemoprofile, fiber content, etc. We have earlier reported differences in such parameters amongst the collected accessions of Z. officinale (Khan et al. 2016; Pandotra et al. 2013a, b). EST-SSR markers (genic microsatellites) have been widely used for gene mapping and diversity analysis in many plant species (Kantety et al. 2002; Chen et al. 2006). Earlier, we developed EST-SSR markers for the genus Ocimum and correlated them with essential oil composition (Mahajan et al. 2015).

Abundance and distribution of microsatellites and EST-SSR primer development

In the case of ginger, we found that 2.7% of unigenes contained non-redundant microsatellites. In general, EST-SSR frequency ranges between 2.65 and 16.82% for dicotyledonous species, whereas in monocots it is generally lower (1.5–4.7%) (Kumpatla and Mukhopadhyay 2005). Among dicots, computational analysis of EST data shows that there are significant differences in the types and abundance of SSRs in various plants (Kantety et al. 2002; Kumpatla and Mukhopadhyay 2005). The present study recorded a relatively lower abundance of genic SSRs compared with other plant species such as grapes, tea, coffee and turmeric (Scott et al. 2000; Poncet et al. 2006; Joshi et al. 2010; Huang et al. 2011; Backiyarani et al. 2013). The frequency of different SSR repeats in Z. officinale revealed that tri-nucleotide repeats were the most plentiful SSRs, followed by di-, hexa-, tetra- and penta-nucleotide repeats. In general, tri-nucleotide repeats are the most common genic SSRs in both monocots and dicots (Kumpatla and Mukhopadhyay 2005).

Transferability of EST-SSR markers within Zingiberaceae

The 16 optimized EST-SSR markers were found to be cross-transferable among seven species of Zingiberaceae (Z. zerumbet, H. spicatum, C. longa, C. amada, C. aeruginosa, C. aromatica and C. angustifolia) (Supplementary Fig. 2). The high level of transferability of these genic SSRs reflects the conserved nature of coding sequences in the microsatellite flanking regions. This result suggests that these transferable genic microsatellite markers could be used to detect markers associated with specific traits in other Zingiberaceae species and related genera. Similar results have been reported among species of grapes and pines (Decroocq et al. 2003; Chagné et al. 2004). In another study, EST-SSR markers from C. longa showed 100% transferability, but only among other species of Curcuma (Siju et al. 2010).

Conclusion

Although ginger is a vegetatively propagated plant, it is found in multiple varieties that display considerable diversity in morphological characters and chemoprofiles. The present study has isolated genic microsatellite markers from this commercially important plant. Since these markers originate from coding gene sequences, they can be used not only for studying genetic diversity but also for identifying candidate genes for particular traits. Further, many markers exhibited a high degree of cross-genus and cross-species transferability, which implies that they will be a valuable resource for comparative mapping through the development of conserved ortholog set (COS) markers in evolutionary studies of different members of Zingiberaceae.