Introduction

Averrhoa, a fruit tree genus belonging to the family Oxalidaceae, merits special attention on account of its economic and medicinal importance. The genus includes two cultivated species (A. bilimbi L. and A. carambola L.) and three wild species (A. dolichocarpa Rugayah & Sunarti, A. leucopetala Rugayah & Sunarti and A. microphylla Tardieu). The present study is restricted to the cultivated species. Averrhoa bilimbi and A. carambola, commonly known as the cucumber tree and starfruit tree, are reported to have originated in Southeast Asia, most probably in Malaysia [4] and Indonesia [27], respectively. Though closely allied, the two species differ considerably in appearance, flowering, manner of fruiting, flavor and uses. Information on the genetic constitution of the species of Averrhoa and the relationships between them is scarce. A detailed knowledge of the amount and apportionment of genetic variation within a species is necessary for the development of appropriate strategies for the conservation and utilization of plant genetic resources [32].

Compared to the conventional marker systems, DNA-based molecular markers facilitate the simultaneous comparison of the genetic constitution of individuals growing at various geographical locations. The two molecular methods adopted for the present study were RAPD and SRAP. RAPD has been used extensively for the determination of genetic variability and construction of linkage maps because of its simplicity and speed [15, 38], while SRAP combines simplicity, reliability, moderate throughput ratio and facile sequencing of selected bands [18].

Among the many types of PCR-based molecular markers, SRAP has been demonstrated to be useful in the genetic analysis of different plant species [7, 23]. The SRAP technique consists of preferential amplification of open reading frames (ORFs) using PCR. The forward primer, which carries a CCGG core, preferentially amplifies exonic regions, which tend to be GC-rich, and the reverse primer, which carries an AATT core, preferentially amplifies AT-rich intronic regions and regions with promoters. The observed polymorphism fundamentally originates in the variation of the length of these introns, promoters and spacers, both among individuals and among species [18]. Sequencing demonstrated that SRAP polymorphism results from two events: fragment size changes due to insertions and deletions, which could lead to codominant markers, and nucleotide changes leading to dominant markers.

The objective of the present study is to determine the genetic relationship between and among nine collections of A. bilimbi and six of A. carambola obtained from different regions in Kerala state and also to evaluate the utility of the RAPD and SRAP techniques in generating DNA markers for genetic and taxonomic studies.

Materials and Methods

Plant Materials

The preliminary screening was carried out in Averrhoa bilimbi (40 trees) collected from different localities of nine districts (collections) and A. carambola (20 trees) collected from six districts in Kerala state (Supplementary Table 1). Since the variation among trees from the same district was small, the trees collected from each district were considered to represent one collection in the subsequent analysis, thereby reducing the effective sample size to fifteen collections. Fresh, young, disease-free leaves were used for DNA extraction. The leaves of all samples were surface-sterilized and stored frozen at −80 °C until use.

Genomic DNA Extraction

Total genomic DNA was extracted from 100 mg of leaf tissue of each collection following the method of Doyle & Doyle [6]. The quality of the DNA samples was determined from the A260/A280 absorbance ratio measured with a biophotometer (Biophotometer Plus, SI No 6132 ZG 4026, Eppendorf, Germany). The presence of DNA was confirmed by 0.8% agarose gel electrophoresis followed by ethidium bromide staining. Staining intensities of the total DNA were compared visually with a 500 bp molecular weight DNA ladder. After quantification, DNA was diluted to a final concentration of 50 ng µl−1.

All the required chemicals were obtained from Genei (Bangalore, India).

RAPD Analysis

Primer Selection

A total of 30 random primers belonging to the OPA, OPAE, OPAN, OPBA, OPC, OPD, OPN, OPP, OPT, OPX, P and UBC series (Integrated DNA Technologies Pvt. Ltd., USA, and Oligonucleotide Synthesis Laboratory, University of British Columbia) were initially screened using one sample from each species. Three primer concentrations (0.1, 0.3 and 0.5 µM) were tested to identify the one giving the clearest amplification.

Polymerase Chain Reaction of Genomic DNA

DNA amplification was performed in a thermal cycler (Eppendorf Mastercycler gradient, SI. No. 533149527, Germany) with the following program: initial denaturation at 95 °C (4 min), followed by 35 cycles of 94 °C (30 s), 37 °C (1 min) and 72 °C (2 min), and a final extension at 72 °C (5 min). Polymerase chain reactions were carried out in a total volume of 25 µl containing 50 ng of isolated genomic DNA, 200 µM each of dATP, dCTP, dGTP and dTTP, 10 pmol of primer, 1 unit of Taq DNA polymerase and 1× reaction buffer (10 mM Tris–HCl, 1.5 mM MgCl2 and 50 mM KCl, pH 8.3). A control containing all components except genomic DNA was included in each set of reactions to check for contamination. PCR conditions, including the concentrations of template DNA, primer, dNTPs, magnesium and Taq DNA polymerase, were optimized to generate RAPD profiles with intense, sharp bands and a clear background. The amplification reactions were repeated at least twice, and only reproducible and intense bands were scored.

SRAP Analysis

Primer Selection

A total of 36 SRAP primer combinations, formed from six forward (ME series) and six reverse (EM series) primers, were chosen based on previous studies [18]. As for RAPD, three primer concentrations (0.1, 0.3 and 0.5 µM) were screened.
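The 36 combinations follow from pairing every forward primer with every reverse primer (6 × 6). A minimal sketch of that enumeration is given below; the labels ME1 to ME6 and EM1 to EM6 are an assumed numbering of the two series, and the actual primer sequences are those listed in Supplementary Table 2.

```python
from itertools import product

# Assumed labels for the six forward (ME) and six reverse (EM) primers
forward_primers = [f"ME{i}" for i in range(1, 7)]
reverse_primers = [f"EM{i}" for i in range(1, 7)]

# Every forward primer is paired with every reverse primer
primer_combinations = [f"{fwd} + {rev}" for fwd, rev in product(forward_primers, reverse_primers)]

print(len(primer_combinations))   # number of combinations to screen
print(primer_combinations[:3])    # first few pairings
```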

Polymerase Chain Reaction of Genomic DNA

A standard polymerase chain reaction (PCR) cocktail mix was subjected to the following thermal cycling profile: 1 cycle of 94 °C for 5 min, 5 cycles of 94 °C for 1 min, 35 °C for 1 min, and 72 °C for 1 min, 35 cycles of 94 °C for 1 min, 50 °C for 1 min, and 72 °C for 1 min followed by a final extension of 72 °C for 10 min.
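The two thermal cycling programs can be written down as simple data structures, which makes the cycle counts and hold times easy to compare. The sketch below encodes the RAPD and SRAP profiles exactly as described in the text and sums the programmed hold times; ramping time between temperatures is not included, so the real run time on any given cycler will be longer. The variable and function names are illustrative only.

```python
# Each stage is (number of cycles, [(temperature in °C, hold time in minutes), ...])
RAPD_PROFILE = [
    (1, [(95, 4.0)]),                           # initial denaturation
    (35, [(94, 0.5), (37, 1.0), (72, 2.0)]),    # denature, anneal, extend
    (1, [(72, 5.0)]),                           # final extension
]

SRAP_PROFILE = [
    (1, [(94, 5.0)]),                           # initial denaturation
    (5, [(94, 1.0), (35, 1.0), (72, 1.0)]),     # low-stringency cycles (35 °C annealing)
    (35, [(94, 1.0), (50, 1.0), (72, 1.0)]),    # higher-stringency cycles (50 °C annealing)
    (1, [(72, 10.0)]),                          # final extension
]

def hold_minutes(profile):
    """Total programmed hold time in minutes, ignoring temperature ramps."""
    return sum(cycles * sum(minutes for _, minutes in steps) for cycles, steps in profile)

print(hold_minutes(RAPD_PROFILE))   # 131.5
print(hold_minutes(SRAP_PROFILE))   # 135.0
```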

Gel Electrophoresis and Documentation

Amplified products were separated by electrophoresis on agarose gels of different concentrations in 1× Tris-acetate-EDTA (TAE) buffer, stained with ethidium bromide and visualized on a UV transilluminator equipped with a gel documentation system (Alpha Innotech Corporation, USA). A 500 bp DNA ladder (Genei, Bangalore, India) was included as reference. Electrophoresis was performed in a horizontal gel electrophoresis unit (Scie Plas, UK), and the gels were documented and analyzed with the gel documentation system.

Data Scoring and Analysis

Scoring

Bands discerned on the agarose gels were scored as present (1) or absent (0), assembled into a data matrix and used to compute measures of genetic distance for all pairs of collections. Poorly resolved bands were excluded from scoring. To identify patterns of similarity and dissimilarity, the data matrix was subjected to cluster analysis using the UPGMA method. Multivariate analyses were performed using the Multivariate Statistical Package MVSP version 3.1 (Kovach Computing Services, Wales, UK).
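The scoring and clustering steps can be illustrated with a short, dependency-free sketch: a binary band matrix is converted to pairwise Jaccard dissimilarities, and the closest clusters are merged repeatedly using the average distance between their members (UPGMA). This is not the MVSP implementation; the function names are illustrative and the toy band profiles are invented for demonstration, not taken from the study.

```python
from itertools import combinations

def jaccard_distance(a, b):
    """Jaccard dissimilarity between two 0/1 band profiles (shared absences are ignored)."""
    both = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    return 0.0 if either == 0 else 1.0 - both / either

def upgma(profiles):
    """Build a nested-tuple tree by average-linkage (UPGMA) clustering."""
    # Each cluster is (subtree, list of member names)
    clusters = [(name, [name]) for name in profiles]
    dist = {frozenset((a, b)): jaccard_distance(profiles[a], profiles[b])
            for a, b in combinations(profiles, 2)}

    def average_distance(c1, c2):
        pairs = [(x, y) for x in c1[1] for y in c2[1]]
        return sum(dist[frozenset(p)] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        # Find the pair of clusters with the smallest average distance and merge it
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: average_distance(clusters[ij[0]], clusters[ij[1]]))
        merged = ((clusters[i][0], clusters[j][0]), clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0][0]

# Toy data (invented): rows are collections, columns are scored bands
toy_profiles = {
    "Ab1": [1, 1, 0, 0, 1, 0],
    "Ab2": [1, 1, 0, 0, 0, 0],
    "Ac1": [0, 0, 1, 1, 0, 1],
    "Ac2": [0, 0, 1, 1, 1, 1],
}
tree = upgma(toy_profiles)
print(tree)
```

In this toy example the two "Ab" profiles and the two "Ac" profiles each merge first, and the two pairs join last, which mirrors the two-cluster structure a species-level split would produce.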

Genotype Data

To assess the suitability of RAPD and SRAP markers for evaluating the genetic profiles of Averrhoa collections, the performance of the markers was measured using five parameters: polymorphic information content (PIC), effective multiplex ratio (EMR), marker index (MI), resolving power (Rp) and discrimination power (D).

The PIC value for each primer was calculated as proposed by Roldan-Ruiz et al. [28]: \(PIC_{i} = 2f_{i}(1 - f_{i})\), where \(f_{i}\) is the frequency of the amplified fragment (band present) and \(1 - f_{i}\) is the frequency of its absence. The effective multiplex ratio (EMR) is the product of the number of polymorphic loci (n) and the fraction of polymorphic loci (β): \(EMR = n \times \beta\). The marker index (MI) was calculated [31] as \(MI = PIC \times EMR\). The resolving power (Rp) of each primer was calculated according to Prevost and Wilkinson [26] as \(Rp = \sum I_{bi}\), where the band informativeness is \(I_{bi} = 1 - [2 \times |0.5 - p_{i}|]\) and \(p_{i}\) is the proportion of collections containing band i. Discrimination power (D) was estimated following Tessier et al. [29] as \(D = 1 - C\), where the confusion probability is \(C = \sum_{i} C_{i}\) with \(C_{i} = p_{i}\frac{(Np_{i}) - 1}{N - 1}\), \(p_{i}\) being the frequency of the i-th banding pattern and N the total number of genotypes.
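The five parameters can be computed directly from the binary band matrix of a single primer. The sketch below is one possible reading of the formulas, under stated assumptions: PIC is averaged over all bands of the primer, a band is counted as polymorphic when it is neither absent from nor present in every genotype, and band informativeness uses the absolute deviation of the band frequency from 0.5 as in Prevost and Wilkinson. The function name and the toy matrix are illustrative and not taken from the study.

```python
from collections import Counter

def marker_parameters(matrix):
    """matrix: list of 0/1 rows, one row per genotype, one column per band of a primer."""
    n_genotypes = len(matrix)
    bands = list(zip(*matrix))                          # transpose: one tuple per band
    freqs = [sum(band) / n_genotypes for band in bands] # frequency of each band

    n_poly = sum(1 for f in freqs if 0 < f < 1)         # number of polymorphic bands (n)
    beta = n_poly / len(bands)                          # fraction of polymorphic bands

    pic = sum(2 * f * (1 - f) for f in freqs) / len(bands)   # assumed: mean over all bands
    emr = n_poly * beta
    mi = pic * emr
    rp = sum(1 - 2 * abs(0.5 - f) for f in freqs)       # sum of band informativeness

    # Confusion probability: chance that two genotypes share the same banding pattern
    pattern_counts = Counter(tuple(row) for row in matrix)
    confusion = sum((count / n_genotypes) * (count - 1) / (n_genotypes - 1)
                    for count in pattern_counts.values())
    return {"PIC": pic, "EMR": emr, "MI": mi, "Rp": rp, "D": 1 - confusion}

# Toy matrix (invented): 4 genotypes x 4 bands; the first band is monomorphic
toy_matrix = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 0],
    [1, 0, 1, 0],
]
params = marker_parameters(toy_matrix)
print(params)
```

For the toy matrix, three of the four bands are polymorphic, so β = 0.75 and EMR = 3 × 0.75 = 2.25; every genotype has a distinct banding pattern, so the confusion probability is 0 and D = 1.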

The relationship between the primer efficiency parameters was assessed by the Pearson correlation coefficient using Minitab 2000 version 13.2 SPSS statistical software. An analysis of molecular variance (AMOVA) was performed using FAMD 1.31 (Fingerprint Analysis with Missing Data) software to partition the total molecular variance between and within populations. Significance was assessed by a permutation test (1000 permutations).
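For reference, the Pearson coefficient relating two primer parameters can be computed in a few lines. The sketch below is a plain implementation of the standard formula, not the statistical package used in the study, and the example values are invented.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov_xy / sqrt(var_x * var_y)

# Invented per-primer values for two parameters (for illustration only)
emr_values = [6.0, 9.5, 12.0, 15.0, 18.0]
mi_values = [1.8, 3.1, 3.9, 5.2, 6.0]
print(round(pearson_r(emr_values, mi_values), 3))
```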

Results

The present study deals with the genetic characterization of two species of the genus Averrhoa using two molecular marker systems, RAPD and SRAP.

RAPD Analysis

To investigate the genetic variability and relationships within and between species of Averrhoa, 30 RAPD primers were tested, 23 of which (76.67%) revealed reproducible polymorphic patterns and were used for further analysis (Table 1). A 1.7% agarose gel provided satisfactory resolution of the PCR amplification products.

The suitability of RAPD primers for estimating genetic variability of the selected samples was also verified. Two main aspects were evaluated: Marker informativeness (polymorphism and overall efficiency of informative band detection) and Marker performance (overall efficacy of the primer sets used in determining polymorphism level, genetic diversity and discriminatory power).

Marker Informativeness

A total of 309 RAPD loci were generated by the 23 RAPD primers selected for the study, and out of these, 293 (94.82%) were found to be polymorphic. The number of polymorphic bands varied from 6 (SK-03) to 18 (OPD-07) with an average of 12.74 bands per primer. RAPD product frequencies of all the collections of the two species are shown in Table 1.

Table 1 Polymorphism detected with 23 RAPD primers in fifteen collections of Averrhoa

Based on the number and frequency of amplified products produced in the present study, the primers, OPD-07, OPA-10, OPA-18 and OPP-1 (in the listed order) may be considered to be more efficient compared to the rest (Supplementary Fig. 1).

The frequency of polymorphic loci varied from primer to primer. Twelve primers, viz. OPA-18, OPAN-01, OPC-11, OPD-07, OPP-1, OPX-17, P4, UBC-210, UBC-292, UBC-465, UBC-509 and UBC-54, generated 100% polymorphism. The size of the amplified fragments ranged from 101 to 2543 bp. Of the 293 polymorphic bands, 72 were unique. The maximum number of unique bands was produced by OPD-07 (eight), followed by OPC-11 (six) and UBC-292 (six).

Marker Performance

The information on the genetic profile of each collection obtained using the 23 RAPD primers was used to assess the marker performance through the evaluation of five parameters: PIC, EMR, MI, Rp and D (Table 2).

Table 2 Marker parameters calculated for each RAPD primer with Averrhoa collections

OPD-07 appears to be the most suited primer for distinguishing among different collections of the two species of Averrhoa while considering the marker performance parameters: PIC, EMR, MI and Rp.

The relationship between efficiency of primer parameters was studied using Pearson correlation.

The highest positive Pearson correlation (r = 0.946) was between the MI and EMR. A significant positive correlation could be observed for PIC, EMR, MI with Rp. However, the D value showed a low magnitude of correlation with other parameters (Table 3).

Table 3 Pearson correlation coefficients between efficiency parameters of 23 RAPD primers (n = 23) in Averrhoa

Genetic Similarities

Cluster analysis was based on Jaccard’s dissimilarity coefficient matrices calculated from the RAPD binary data. The dendrogram derived from UPGMA cluster analysis clearly separated the Averrhoa collections into two major clusters (Fig. 1). Cluster 1 grouped all nine collections of A. bilimbi, and cluster 2 grouped all six collections of A. carambola, including both the sour and sweet types. The nine collections of the first cluster were further divided into two subclusters, and the subcluster consisting of Ab2, Ab3 and Ab4 could be clearly distinguished from the rest. Likewise, the six collections of the second cluster were divided into two subclusters. The first subcluster consisted of two sour (Ac2 and Ac4) and two sweet (Ac5 and Ac6) collections of A. carambola and could be distinguished from the other two sour collections (Ac1 and Ac3).

Fig. 1 UPGMA cluster of fifteen collections based on 23 RAPD primers

The RAPD primers employed in the present study were useful in distinguishing the two species. The UPGMA dendrogram derived from genetic similarity values and the PCA conducted directly on the binary data matrix agreed to a great extent. PCA revealed that the first principal component (PC) accounted for 32.773% of the variation (Fig. 2).
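The share of variation attributed to the first principal component is its eigenvalue divided by the sum of all eigenvalues (the trace of the covariance matrix). The sketch below estimates that share by power iteration on the covariance matrix of the band columns. The PCA settings used in MVSP (centering, standardization) are not stated in the text, so the covariance-based approach here is an assumption, and the toy matrix is invented.

```python
def first_pc_share(matrix, iterations=500):
    """Fraction of total variance explained by the first principal component."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    means = [sum(row[j] for row in matrix) / n_rows for j in range(n_cols)]
    centered = [[row[j] - means[j] for j in range(n_cols)] for row in matrix]

    # Covariance matrix of the band columns
    cov = [[sum(r[a] * r[b] for r in centered) / (n_rows - 1) for b in range(n_cols)]
           for a in range(n_cols)]
    total_variance = sum(cov[j][j] for j in range(n_cols))   # trace = sum of eigenvalues

    # Power iteration: repeated multiplication converges to the leading eigenvector.
    # A non-uniform start vector avoids starting orthogonal to it in simple cases.
    v = [1.0 + 0.1 * j for j in range(n_cols)]
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(n_cols)) for a in range(n_cols)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]

    # Rayleigh quotient gives the leading eigenvalue for the unit vector v
    leading = sum(v[a] * sum(cov[a][b] * v[b] for b in range(n_cols)) for a in range(n_cols))
    return leading / total_variance

# Toy matrix (invented): four bands split the two groups, one band varies within groups
toy_bands = [
    [1, 1, 0, 0, 1],
    [1, 1, 0, 0, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 0],
]
share = first_pc_share(toy_bands)
print(round(100 * share, 1))   # 80.0
```

In the toy matrix the four group-splitting bands contribute an eigenvalue of 4/3 and the remaining band contributes 1/3, so the first component carries 80% of the total variance.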

Fig. 2 Principal component analysis based on RAPD analysis

Results from AMOVA indicated that 51.84% of the genetic variation could be attributed to differences among species, while 48.16% was due to differences among collections within a species (Table 4).
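The partition reported by AMOVA can be sketched for the simplest case of one grouping level (species) with collections as the units within groups. The code below follows the standard sums-of-squares decomposition on squared Euclidean distances between binary profiles; the distance measure used by FAMD in the study is not stated, so this choice is an assumption, the permutation test is omitted, and the toy profiles are invented.

```python
from itertools import combinations

def squared_distance(a, b):
    """Squared Euclidean distance; for 0/1 profiles this is the number of differing bands."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def amova_one_level(groups):
    """groups: dict of group name -> list of 0/1 profiles. Returns (% among, % within)."""
    everyone = [profile for members in groups.values() for profile in members]
    n_total, n_groups = len(everyone), len(groups)

    ss_total = sum(squared_distance(a, b) for a, b in combinations(everyone, 2)) / n_total
    ss_within = sum(
        sum(squared_distance(a, b) for a, b in combinations(members, 2)) / len(members)
        for members in groups.values()
    )
    ss_among = ss_total - ss_within

    ms_among = ss_among / (n_groups - 1)
    ms_within = ss_within / (n_total - n_groups)

    # Average group size adjusted for unequal sampling
    n0 = (n_total - sum(len(m) ** 2 for m in groups.values()) / n_total) / (n_groups - 1)

    var_among = (ms_among - ms_within) / n0
    var_within = ms_within
    total = var_among + var_within
    return 100 * var_among / total, 100 * var_within / total

# Toy data (invented): two collections per species
toy_groups = {
    "A. bilimbi": [[1, 1, 0, 0], [1, 0, 0, 0]],
    "A. carambola": [[0, 0, 1, 1], [0, 1, 1, 1]],
}
among, within = amova_one_level(toy_groups)
print(round(among, 2), round(within, 2))
```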

Table 4 Analysis of molecular variance (AMOVA) of RAPD data

SRAP Analysis

In the SRAP analysis, 31 of the 36 primer combinations screened (86.11%) were chosen for further analysis based on their reproducibility and polymorphism. To compare the resolution of polyacrylamide and agarose gel electrophoresis, the amplicons were initially run on both gel types. The bands resolved on agarose gels were nearly identical to those obtained on polyacrylamide. Since agarose gels with ethidium bromide staining are more cost-effective and less time-consuming than acrylamide gels with silver staining for marker screening and genetic diversity studies, the former were chosen for the present experiments. The agarose concentration and the SRAP primer concentration were then standardized at 2% and 0.1 µM, respectively. The annealing temperatures were standardized at 35 °C for the first five cycles and 50 °C for the remaining 35 cycles. The primers and their sequences used for the analysis are listed in Supplementary Table 2.

Marker Informativeness

A total of 534 SRAP amplicons were produced by the 31 SRAP primer combinations selected for the study, and out of these, 510 (95.51%) were found to be polymorphic. The highest number of polymorphic bands was produced by the primer combination ME2 + EM6 (26) and the lowest by ME5 + EM6 (6); on average, each primer combination produced 17.23 bands. SRAP amplicon frequencies of all the collections of the two species are shown in Table 5. The size of the SRAP amplicons ranged from 38 to 1480 bp.

Table 5 Polymorphism detected with 31 SRAP primer combinations in fifteen collections of Averrhoa

Out of 31 primer combinations studied, 24 combinations generated 100% polymorphism. By the amplification of 31 primer combinations, a total of 108 unique bands were generated out of 510 polymorphic bands.
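The summary percentages for the SRAP data follow directly from the reported counts, and a quick check is shown below. The helper name is illustrative; all input numbers are those given in the text.

```python
def percent(part, whole):
    """Percentage rounded to two decimals."""
    return round(100 * part / whole, 2)

srap_amplicons, srap_polymorphic = 534, 510
combinations_screened, combinations_used = 36, 31
fully_polymorphic_combinations = 24

print(percent(combinations_used, combinations_screened))           # 86.11
print(percent(srap_polymorphic, srap_amplicons))                   # 95.51
print(round(srap_amplicons / combinations_used, 2))                # 17.23
print(percent(fully_polymorphic_combinations, combinations_used))  # 77.42
```

The average of 17.23 corresponds to all amplicons per primer combination (534/31); counting only polymorphic amplicons would give 510/31, about 16.45.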

The maximum number of unique bands was produced by ME3 + EM4 (eight), followed by ME2 + EM6 (seven), ME3 + EM6 (seven) and ME1 + EM6 (six). Some of the unique bands were specific to individual collections of Averrhoa, while others were species specific.

The primer combinations ME3 + EM4, ME2 + EM6, ME3 + EM6 and ME1 + EM6 were detected to be the most efficient for SRAP amplification of Averrhoa collections based on the number of polymorphic and unique bands generated (Supplementary Fig. 2).

Marker Performance

The genetic profiles of the Averrhoa collections obtained with the 31 SRAP primer combinations were used to assess marker performance through five parameters, viz. PIC, EMR, MI, Rp and D (Table 6).

Table 6 Marker parameters calculated for each SRAP primer used with Averrhoa collections

On the whole, considering the parameters PIC, EMR, MI and Rp, the primer combination best suited for distinguishing the different collections was ME2 + EM6.

Pearson correlation analysis revealed that the highest positive correlation (r = 0.969) was between MI and EMR, followed by that between Rp and MI (r = 0.968). A significant positive correlation was observed among all parameters, including D (Table 7).

Table 7 Pearson correlation coefficients between efficiency parameters of 31 SRAP primers (n = 31) in Averrhoa

Genetic Similarities

The UPGMA dendrogram derived from cluster analysis based on Jaccard’s dissimilarity coefficient matrices separated the fifteen collections of Averrhoa into two major clusters (Fig. 3). Cluster 1 grouped all nine collections of A. bilimbi, and cluster 2 grouped all six sour and sweet collections of A. carambola. The nine collections of the first cluster were further divided into two subclusters. The first subcluster, consisting of Ab2, Ab3 and Ab4, could be clearly distinguished from the rest, and within the second subcluster three collections of A. bilimbi (Ab7, Ab8 and Ab9) were set apart. Unlike in the RAPD analysis, relatively high genetic similarity was observed between Ab7 and Ab8 of A. bilimbi.

Fig. 3 UPGMA cluster of fifteen collections based on 31 SRAP primers

The six collections of the second cluster (A. carambola) were divided into two subclusters that separated the sour and sweet collections. Unlike in the RAPD dendrogram, the first subcluster consisted of all the sour collections of A. carambola (Ac1, Ac3, Ac2 and Ac4), and the second subcluster consisted of all the sweet collections (Ac6 and Ac5).

The results of the UPGMA dendrogram and the PCA agreed to a great extent. PCA revealed that the first PC accounted for 40.038% of the variation (Fig. 4).

Fig. 4 Principal component analysis based on SRAP analysis

Results from AMOVA showed that 46.69% of the genetic variation could be attributed to differences among species, while 53.31% was due to differences among collections within a species (Table 8).

Table 8 Analysis of molecular variance (AMOVA) of SRAP data

Integrated Data of RAPD and SRAP Analysis

Cluster analysis of the integrated RAPD and SRAP data divided the fifteen collections of Averrhoa into two distinct clusters (Fig. 5). The first major cluster consisted of all nine collections of A. bilimbi, and the second major cluster contained all six sour and sweet collections of A. carambola. Here also, the A. bilimbi collections Ab2, Ab3 and Ab4 formed a subcluster separate from the rest of the collections, as observed for RAPD and SRAP independently. The A. bilimbi collections Ab7 and Ab8 were most closely related, followed by Ab9, as observed for SRAP, whereas Ab8 and Ab9 were most closely related in the RAPD data. The integrated data distinguished sweet collections from sour collections, just as in the dendrogram generated from SRAP. The first principal component accounted for 40.042% of the total variation, and the second component accounted for 23.549% (Fig. 6).

Fig. 5 UPGMA dendrogram of fifteen collections with the combined data of RAPD and SRAP markers

Fig. 6 Principal component analysis based on integrated RAPD and SRAP markers

Discussion

Molecular markers are very useful in the genetic analysis of different plant species. This ability of molecular markers is directly related to the number of polymorphic bands detected [25]. Dominant (RAPD, AFLP) and codominant (RFLP, SNPs) molecular markers have their own merits and demerits. Dominant markers require no prior information of the genetic background [13, 24] and can display a large number of polymorphic loci [10, 35], while codominant markers, though more accurate, have procedural complexities.

In the current study, fifteen collections belonging to two species of genus Averrhoa were analyzed.

RAPD

In the present study, 94.82% of RAPD markers were found to be polymorphic. Greater polymorphism indicates that such primers can differentiate a given set of samples more efficiently. Some of the bands were unique to certain collections of Averrhoa and could be considered potential species-specific bands.

The marker performance was assessed using PIC, EMR, MI, Rp and D values. The PIC value of each marker represents the probability of finding this marker in two different states (present/absent) in two plants drawn at random from the population. The PIC value is often used to measure the informativeness of a genetic marker system [33], and the theoretical maximum PIC value for a dominant marker is 0.5. The EMR depends on the fraction of polymorphic loci. MI is the statistic used to calculate the overall utility of a marker system; higher MI values indicate greater suitability. The Rp is a parameter that indicates the discriminatory potential of the primers chosen. The D value determines the efficiency of primers on the basis of their banding patterns. The higher the number of amplified products, the better the discrimination of collections.

Among the five parameters studied, PIC, MI and Rp were the most informative and suitable for the genetic diversity analysis of Averrhoa, whereas D was the least informative. The maximum D value was observed for the primers OPAN-01 and UBC-210. The present investigation also revealed that the PIC, EMR, MI and Rp values were highest for the primer OPD-07. PCs with an eigenvalue > 1.0 are considered informative, and by this criterion all the PCs obtained in the present study were informative. Thus, RAPD proved to be an efficient DNA marker type for detecting the genetic diversity and relatedness of A. bilimbi and A. carambola collections.

SRAP

The SRAP marker system has been adapted for a variety of purposes in different crops, including map construction, gene tagging and genetic diversity studies [12]. Zaefizadeh and Golieb [39] reported that SRAP markers possess multi-loci and multi-allelic features, which make them potentially efficient for genetic diversity analysis, gene mapping and fingerprinting of genotypes.

Recently, SRAP markers have been demonstrated to be reliable, reproducible and inexpensive in several studies, including those on Brassica [18], Cucurbita pepo [9], elephant grass [36], Celosia argentea [8], olive varieties [14] and mung bean [3]. However, SRAP has not previously been used for genetic analysis in Averrhoa. Li et al. [19] confirmed that SRAP could preferentially amplify gene-rich regions in a genome. Consequently, SRAP markers could be more advantageous than ISSR markers because of the large difference in the number of polymorphic loci detected by individual SRAP primers. Many studies have found that SRAP markers provide a level of band-pattern variability and reproducibility similar to that of AFLP markers, but with significantly less technical effort and cost [17, 18, 20, 22, 34]. Furthermore, a modest degree of codominance (up to 20%) has been identified for SRAP markers [18], which is higher than that for AFLP markers [24]. These findings highlight the value of SRAP markers for genetic analysis.

SRAP markers were originally developed in Brassica to be run on polyacrylamide gels [18]. Agarose gels as well as denaturing PAGE have been used in resolving polymorphic bands in SRAP. Gulsen et al. [11] used 2.5% agarose gel for electrophoresis, and the number of bands scored per primer ranged from two to six. In the present study, when 2% agarose gel was used for electrophoresis, the total number of amplified bands ranged between 9 and 26.

Out of 36 primer combinations screened, 31 SRAP primer combinations were used for further studies. The rate of polymorphism for the SRAP markers (97.07%) was slightly higher than that for RAPD markers (94.82%). About 77.42% of the primer combinations used for the study revealed 100% polymorphism. Both the polymorphism percentage and the number of polymorphic fragments per primer combination were higher than those for RAPD markers. SRAP is as convenient to perform as RAPD but shows improved stability over RAPD [21]. In addition, unlike AFLP, SRAP requires no complex mix of enzymes for cutting, joining and pre-amplification, which makes it a more effective molecular marker system for genome studies [37].

Based on marker informativeness, the most efficient primer combinations identified were ME3 + EM4, ME2 + EM6, ME3 + EM6 and ME1 + EM6. In the present study, SRAP yielded more bands per primer combination than RAPD. The PIC values ranged from 0.17 to 0.40, below the theoretical maximum of 0.5. The primer combination ME2 + EM6 exhibited relatively high values for the primer efficiency parameters (PIC, EMR, MI and Rp) as well as high numbers of polymorphic and unique bands. A comparative table of primer efficiency parameters is provided (Table 9). The study thus revealed that the most efficient primer combination for the characterization of Averrhoa collections was ME2 + EM6.

Table 9 Comparison of primer parameters with previous studies on SRAP markers

Pearson correlation analysis indicated a significant correlation between all the primer parameters. A higher positive correlation could be observed for SRAP primers compared to RAPD primers.

An important feature of a good marker system is the capacity to distinguish among different collections.

Cluster analysis and principal component analysis conducted on RAPD and SRAP molecular data indicated similar patterns, i.e., collections from different localities were found to group together. Therefore, there was no definite correlation between different cluster groups and their geographical locations. Similar observations have been recorded previously in some other crops [5, 30].

However, the sour and sweet collections of A. carambola could be discriminated in both the SRAP and the integrated (RAPD + SRAP) analyses but not with RAPD alone. This discrimination possibly reflects differences in genes associated with fruit flavor (sour versus sweet). Aharoni et al. [2] have identified genes that directly influence fruit flavor formation, mainly in tomato, strawberry and melon fruits. The availability of information on genes and enzymes associated with pathways of flavor formation is a prerequisite for understanding their genetic control.

SRAP markers preferentially amplify ORFs, which may include coding regions of the genome involved in morphological and agronomic traits, and likely reflect differences in coding sequences, which are thought to be relatively conserved among species [9].

Thus, the results demonstrated that the SRAP technique is more informative and efficient for studying the genetic diversity in Averrhoa than RAPD. On the whole, it is evident from the present study that SRAP markers have an upper hand in the genetic analysis of the genus.

Though molecular studies have been conducted in discriminating the species of Averrhoa, a critical evaluation of various efficiency parameters of RAPD and SRAP markers for identifying the inherent genetic variations has not been reported earlier.

Conclusions

This is the first in-depth report of genetic characterization in Averrhoa using RAPD and SRAP markers. The exhibited polymorphism reflects the extent of genetic divergence among and within the collections of two species of Averrhoa. The precision and accuracy of SRAP in identifying genetic diversity at the molecular level makes it the marker system of choice when studying closely related genotypes. The outcome demonstrates the efficacy of SRAP over RAPD in disclosing the genetic variability in Averrhoa.