Introduction

Sponge gourd (Luffa cylindrica) is an important cucurbit vegetable cultivated worldwide in the rainy and summer seasons. It originated in subtropical Asia, particularly India, which is considered its primary center of origin (Kalloo 1993; Sirohi et al. 2005). It is a monoecious, cross-pollinated crop with a diploid chromosome number of 2n = 2x = 26. Its fruits are a rich source of vitamin A, vitamin C and iron (Yawalkar 2004). They are reported to have potent anti-inflammatory, anti-tumor and anti-viral activities, a protective role against fever, enteritis and diabetes, and a role in removing toxins and regenerating skin (Azeez et al. 2013; Lee and Yoo 2006). Dried fruits of sponge gourd are used worldwide as bath sponges and packaging material, in the manufacture of ethanol and as an adsorbent of heavy metals (Papanicolaou et al. 2015). Little attention has been given to the genetic diversity and genetic improvement of sponge gourd in India. Knowledge of genetic variability is essential for conservation and crop breeding (Talebi et al. 2008) and for reducing genetic susceptibility (Fatehi et al. 2011). Genetic variability can be estimated through the study of morphological characters and of molecular and biochemical markers (Carvalho 2004; de Vicente et al. 2005).

Morphological characterization is employed before molecular characterization because it estimates genetic diversity together with the performance of genotypes under given environmental conditions (Hoogendijk and Williams 2001). Molecular markers are considered more reliable and are used extensively both to investigate genetic differences and in plant breeding. Inter-simple sequence repeat (ISSR) analysis is an easy, quick technique for determining genetic diversity and for genetic mapping, gene tagging and evolutionary analyses without prior knowledge of the genome sequence (Biabani et al. 2013; Arcade et al. 2000; Sankar and Moore 2001). It combines nearly all of the advantages of SSR and AFLP markers with the broad applicability of RAPD markers (Reddy et al. 2002). Recently, besides random DNA markers, many new gene-targeted markers have been developed. The start codon targeted (SCoT) marker is a new, gene-directed marker used for cultivar identification, genetic investigation and mapping (Luo et al. 2010; Collard and Mackill 2009).

Hitherto, few molecular studies have been conducted to assess variability in Luffa; thus, more polymorphic markers are needed. Earlier investigations have described the genetic variability of Luffa based on morphological traits (Prakash et al. 2013), RAPD markers (Hoque and Rabbani 2009), morphological traits and RAPD (Junhui and Changping 2008), SSR and SRAP (Jun et al. 2010), morphological traits and SRAP (Tyagi et al. 2016), RAPD and ISSR (Rathod et al. 2015), morphological traits, ISSR and DAMD (Misra et al. 2017) and SSR (Pandey et al. 2017). To date, no data are available on the comparative evaluation of genetic variability within sponge gourd using ISSR, SCoT and morphological markers. Here, the use of the SCoT marker system for studying genetic variability in sponge gourd accessions is reported for the first time.

Thus, the aim of the present investigation was to estimate the efficiency of ISSR and SCoT markers, along with morphological traits, in ascertaining the genetic diversity and population structure of sponge gourd accessions.

Materials and methods

Plant material

Seeds of 45 accessions of L. cylindrica were collected for this study from 10 different states of India, namely West Bengal, Uttar Pradesh, Punjab, Jharkhand, Maharashtra, Gujarat, New Delhi, Himachal Pradesh, Uttarakhand and Madhya Pradesh (Table 1). These were selfed and maintained as pure lines in the fields of the Horticulture department, IARI, New Delhi, during March–May under sub-humid conditions. The experiment was laid out during the spring–summer seasons of 2013 and 2014 in a randomized block design with three replications. The lines were sown in rows of 2.5 m with 75 cm spacing between plants, with fifteen plants per replication. Observations were made on 10 randomly selected plants in each replication, and all 3 replications were analyzed.

Table 1 Detailed information of L. cylindrica accessions used in our investigation

Morphological analysis

Morphological measurements were recorded for 14 qualitative and 12 quantitative traits on 10 randomly chosen plants of each accession, based on the descriptors of Joshi et al. (2004) with some modifications (Table 2).

Table 2 List of morphological traits used in present study

DNA extraction

Total genomic DNA was extracted from young, healthy leaves using the CTAB procedure described by Saghai-Maroof et al. (1984). The purity and yield of the extracted DNA were assessed electrophoretically on a 0.8% agarose gel.

ISSR analysis

A set of 60 primers was screened, of which 20 were found to be reproducible and were selected for profiling. PCR was performed in a reaction volume of 25 µl containing 50 ng of genomic DNA, 10X PCR buffer, 0.2 mM of each dNTP, 1.5 mM MgCl2, 8 µM of primer and 0.5 U of Taq DNA polymerase (Thermo-Scientific). Amplification was carried out in a Bioer-Gene Pro-cycler under the following conditions: 5 min at 94 °C, followed by 35 cycles of 1 min at 94 °C, 1 min at 48–60 °C and 2 min at 72 °C, and a final extension of 7 min at 72 °C. The amplification products were separated on a 2% (w/v) agarose gel in 1X TBE buffer, using 50 bp and 1 kb ladders (Thermo-Scientific) as size standards.

SCoT analysis

A set of 80 primers was screened, of which 23 were found to be reproducible and were selected for profiling. PCR was performed in a reaction volume of 25 µl containing 50 ng of genomic DNA, 10X PCR buffer, 0.2 mM of each dNTP, 1.5 mM MgCl2, 8 µM of primer and 0.5 U of Taq DNA polymerase (Thermo-Scientific). Amplification was carried out in a Bioer-Gene Pro-cycler under the following conditions: 5 min at 94 °C, followed by 35 cycles of 1 min at 94 °C, 1 min at 48–60 °C and 2 min at 72 °C, and a final extension of 7 min at 72 °C. The amplification products were separated on a 2% (w/v) agarose gel in 1X TBE buffer, using 50 bp and 1 kb ladders (Thermo-Scientific) as size standards.

Data analysis

For morphological traits, scoring was carried out by assigning a series of numbers to the states of each trait. Pearson’s correlation coefficient was calculated between quantitative traits. Principal component analysis (PCA) was performed and the total variance explained was calculated. All these calculations were performed using SPSS 16.0 software. The morphological data were subjected to cluster analysis based on Euclidean distance using NTSYS-pc version 2.1 (Rohlf 2000).
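The two quantities named above can be illustrated with a minimal sketch. This is not the SPSS or NTSYS-pc implementation used in the study; the function names and the trait values are hypothetical and serve only to show the arithmetic behind Pearson's correlation coefficient and the Euclidean distance between two scored trait vectors.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares around the means
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


def euclidean_distance(u, v):
    """Euclidean distance between two trait vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Hypothetical trait values for four accessions (not data from this study)
fruit_length = [20.0, 25.0, 30.0, 35.0]
fruit_weight = [110.0, 135.0, 160.0, 185.0]  # exactly linear in fruit_length

r = pearson_r(fruit_length, fruit_weight)
d = euclidean_distance([0.0, 0.0], [3.0, 4.0])
```

In practice, traits measured on different scales are usually standardized before Euclidean distances are computed; whether this was done here is not stated in the text.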

Clear and distinct bands obtained with both molecular markers were scored as 0 (absence) or 1 (presence) and entered into a binary matrix. The Jaccard similarity coefficient was calculated from the binary matrices to determine similarity among accessions using NTSYS-pc version 2.1 (Rohlf 2000). UPGMA-based clustering was performed with the same software to infer genetic relatedness among accessions. Marker informativeness was investigated by calculating the number of amplified bands (AB), the number of polymorphic bands (PB) and the polymorphism information content (PIC). Genetic diversity parameters such as the percentage of polymorphic loci (P%), Nei’s gene diversity (H) (Nei 1973) and the Shannon information index (I) (Shannon and Weaver 1949) were calculated using Popgene version 1.31 (Yeh and Boyle 1997). Population structure was explored by Bayesian model-based clustering in STRUCTURE 2.3.4 (Pritchard et al. 2000), which uses a Markov Chain Monte Carlo (MCMC) algorithm (Falush et al. 2003) to cluster individuals into an optimal number of populations (K) on the basis of multilocus genotypic data and also identifies the membership of accessions. The admixture model-based simulations were carried out by performing five independent runs for each K (from 1 to 10) with a burn-in period of 100,000 and 100,000 MCMC replications. The optimal number of populations was determined by calculating the ΔK value (Evanno et al. 2005), which is based on the change in the likelihood function with respect to each K.
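As an illustration of the scoring step, the Jaccard coefficient between two 0/1 band profiles is the number of bands shared by both accessions divided by the number of bands present in at least one of them. The sketch below is a minimal example with hypothetical accession labels and band profiles; it is not the NTSYS-pc routine.

```python
def jaccard_similarity(a, b):
    """Jaccard coefficient between two 0/1 band profiles of equal length."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same number of bands")
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    # Two profiles with no bands at all have no basis for comparison
    return shared / either if either else 0.0


def similarity_matrix(profiles):
    """Pairwise Jaccard coefficients for a dict of label -> band profile."""
    labels = list(profiles)
    return {
        (p, q): jaccard_similarity(profiles[p], profiles[q])
        for p in labels
        for q in labels
    }


# Hypothetical band profiles (1 = band present, 0 = band absent)
profiles = {
    "acc_A": [1, 1, 0, 1, 0],
    "acc_B": [1, 1, 0, 0, 0],
    "acc_C": [0, 1, 1, 0, 1],
}
sim = similarity_matrix(profiles)
```

Shared absences (0/0) do not count towards similarity in this coefficient, which is why it is often preferred for dominant markers where band absence is less informative.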

Results

Morphological analysis

The descriptive statistics suggest that considerable variation was present among the eleven quantitative traits (Table 3). Pearson correlation was evaluated between nine important quantitative traits, and a significant positive correlation was observed among fruit traits (Table 4). Selecting for such correlated traits will save both time and labor. From an economic point of view, accessions with high yield and a large number of fruits per plant should be selected for future breeding. All morphological traits were highly polymorphic except five qualitative characters that displayed no marked variation: flower color (yellow), stem shape (angular), peduncle shape (sharply angled), leaf margin (dented) and tendrils (present). These five characters were therefore not used in the principal component analysis. Leaf shape showed three morphological classes: ovate in 11 accessions, orbicular in 10 accessions and reniform in the remaining 24 accessions. Green leaf spots were found in the largest number of accessions (21), followed by the admixture type (16 accessions) and silver leaf spots (8 accessions). Leaf lobes were shallow in 11 accessions and intermediate in 24 accessions. Clear variation was found in stem end shape (25 rounded, 14 pointed and 6 flattened) and blossom end shape (19 rounded, 10 pointed and 16 flattened). The highest polymorphism was observed in fruit traits: fruit color varied from green (16) and light green (17) to dark green (12); fruit shape varied from elliptical (4) and elongated (17) to elongated tapered (25); fruit ribs were absent in 2 accessions, deep in 12, intermediate in 12 and superficial in 19; and fruit skin texture showed two variants, smooth (8 accessions) and grainy (37 accessions).

Table 3 Mean, maximum, minimum, range and standard deviation (SD) for 11 quantitative traits
Table 4 Pearson correlation coefficient values between quantitative traits

Principal component analysis (PCA) was performed on 20 morphological characters, and the first 7 components, which had eigenvalues greater than 1, explained 72.70% of the total morphological variation (Table 5). PCA revealed that traits such as blossom end shape, stem end fruit shape, fruit width, fruit weight, seed length, seed weight and fruit yield were the major contributors to morphological variation in sponge gourd germplasm. Therefore, the variation observed in these traits could be exploited by plant breeders. A dendrogram constructed from Euclidean distances grouped the accessions into two clusters (Fig. 1). Cluster I included six accessions: VRSL12, VRSL13, DSG48, DSG98, DSG38 and NSG-1-11. The majority of these accessions showed orbicular leaf shape, green leaf spots, intermediate leaf lobes, rounded stem end shape and light green, elongated tapered fruits with grainy texture. The remaining accessions were included in cluster II. No relationship was found between morphological diversity and geographical origin, as accessions from different regions were intermixed within the clusters.
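The retention rule used above (keep components with eigenvalues greater than 1 and report the cumulative percentage of variance they explain) can be sketched as follows. The eigenvalues in the example are hypothetical and are not those of Table 5; the function name is likewise an assumption made for illustration.

```python
def retained_components(eigenvalues, threshold=1.0):
    """Return (eigenvalue, % variance, cumulative % variance) for each
    component whose eigenvalue exceeds the threshold."""
    total = sum(eigenvalues)  # total variance across all components
    kept = []
    cumulative = 0.0
    for ev in sorted(eigenvalues, reverse=True):
        if ev <= threshold:
            break
        cumulative += ev
        kept.append((ev, 100.0 * ev / total, 100.0 * cumulative / total))
    return kept


# Hypothetical eigenvalues for a PCA on four standardized traits
eigenvalues = [2.0, 1.5, 0.3, 0.2]
kept = retained_components(eigenvalues)
```

With standardized traits the eigenvalues sum to the number of traits, so an eigenvalue above 1 marks a component that explains more variance than any single original trait.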

Table 5 Eigenvalues and proportions of variance of the first seven principal components for morphological characters
Fig. 1 Dendrogram generated based on Euclidean distance using NTSYS-pc software, showing relationships among different accessions of sponge gourd with morphological traits

ISSR analysis

Based on the reproducibility of scorable bands, 20 ISSR primers were used to characterize genetic diversity. Amplification produced 182 clear and bright bands ranging in size from 225 to 3000 bp. Of these 182 DNA fragments, 137 were polymorphic, an average of 6.85 polymorphic fragments per primer (Table 6). The number of polymorphic fragments per primer ranged from 5 (ISSR821 and ISSR900) to 16 (ISSR2 and ISSR14). An average polymorphism of 74.6% was recorded. PIC is a good indicator of the efficiency of a primer in differentiating accessions and depends on the unique banding patterns. By this measure, ISSR10, with a PIC value of 0.29, was the most efficient primer. Nei’s gene diversity (H) averaged 0.27, with a range of 0.06–0.42. The Shannon information index (I), another measure of genetic variability, averaged 0.39. A similarity coefficient of 0.81 was obtained from pairwise comparison of the accessions. The dendrogram delineated all accessions into two well-defined clusters and one outgroup (Fig. 2). The highest genetic similarity was observed between VRSL1 and DSG31. Cluster I contained two accessions, DSG98 (Uttarakhand) and NSG28 (Maharashtra). Cluster II comprised the remaining accessions, and Pusa Supriya fell outside both clusters as the outgroup. Both major clusters were further subdivided into sub-clusters.
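The band-level statistics reported above can be illustrated with a minimal sketch. Several points are assumptions rather than the procedure of this study: the text does not state the PIC formula, so the form commonly used for dominant markers, PIC = 2f(1 - f) with f the band frequency, is assumed; and H and I are computed per locus from allele frequencies estimated under Hardy-Weinberg equilibrium, which may differ from the exact settings used in Popgene. The band matrix is hypothetical.

```python
import math


def band_frequencies(matrix):
    """Fraction of accessions showing each band (column) in a 0/1 matrix."""
    n = len(matrix)
    return [sum(row[j] for row in matrix) / n for j in range(len(matrix[0]))]


def percent_polymorphic(freqs):
    """Percentage of bands that are neither fixed nor absent in all accessions."""
    polymorphic = sum(1 for f in freqs if 0.0 < f < 1.0)
    return 100.0 * polymorphic / len(freqs)


def pic(f):
    """Assumed dominant-marker PIC for a band with frequency f: 2f(1 - f)."""
    return 2.0 * f * (1.0 - f)


def allele_frequencies(f):
    """Assumption: Hardy-Weinberg equilibrium, so the null-allele frequency q
    is the square root of the band-absent fraction (1 - f)."""
    q = math.sqrt(1.0 - f)
    return 1.0 - q, q


def nei_h(f):
    """Per-locus gene diversity, 1 - sum of squared allele frequencies."""
    p, q = allele_frequencies(f)
    return 1.0 - p * p - q * q


def shannon_i(f):
    """Per-locus Shannon index, -sum(p_i * ln p_i), skipping zero frequencies."""
    return -sum(x * math.log(x) for x in allele_frequencies(f) if x > 0.0)


# Hypothetical 0/1 matrix: rows are accessions, columns are bands
bands = [
    [1, 1, 1],
    [1, 1, 1],
    [1, 0, 1],
    [1, 0, 0],
]
freqs = band_frequencies(bands)  # [1.0, 0.5, 0.75]
mean_pic = sum(pic(f) for f in freqs) / len(freqs)
```

Under the assumed formula PIC cannot exceed 0.5 for a single band, which is consistent with the range of values reported for both marker systems.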

Table 6 Amplified bands (AB), polymorphic bands (PB), percentage polymorphism (P%), polymorphism information content (PIC), Nei’s gene diversity (H) and Shannon information index (I) shown by ISSR markers
Fig. 2 Dendrogram generated based on Jaccard similarity coefficient through NTSYS-pc software, showing relationships among different accessions of sponge gourd with ISSR markers

The genetic population structure was investigated using Bayesian model-based clustering to determine the most probable number of clusters in the population. ΔK was maximum at K = 2, which implied the existence of two sub-populations and revealed an admixed population (Fig. 3). As with UPGMA clustering, there was no clear demarcation between accessions according to their geographical origin. This indicates that no clear geographical differentiation of population structure could be identified.
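The ΔK criterion can be sketched as follows. For each K, the absolute second-order difference of the mean log-likelihood across replicate runs is divided by the standard deviation of the log-likelihood at that K. The sketch computes the second-order difference from per-K means, which is one common reading of the statistic, and the log-likelihood values are hypothetical rather than STRUCTURE output from this study.

```python
from statistics import mean, stdev


def evanno_delta_k(lnp_by_k):
    """Delta K for each interior K, given a dict K -> list of ln P(D) values
    from replicate runs. Delta K is undefined for the smallest and largest K."""
    ks = sorted(lnp_by_k)
    means = {k: mean(lnp_by_k[k]) for k in ks}
    delta = {}
    for k in ks[1:-1]:
        if k - 1 not in means or k + 1 not in means:
            continue  # needs both neighbouring K values
        sd = stdev(lnp_by_k[k])
        if sd == 0:
            continue  # avoid division by zero when runs are identical
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        delta[k] = second_diff / sd
    return delta


# Hypothetical ln P(D) values from three replicate runs per K
lnp = {
    1: [-1000.0, -1002.0, -998.0],
    2: [-900.0, -901.0, -899.0],
    3: [-890.0, -895.0, -885.0],
    4: [-885.0, -880.0, -890.0],
}
delta_k = evanno_delta_k(lnp)
best_k = max(delta_k, key=delta_k.get)
```

Because ΔK needs both neighbouring values of K, it cannot identify K = 1 as optimal, which is worth keeping in mind when a peak at K = 2 is interpreted.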

Fig. 3 The population structure of sponge gourd based on the optimal value of K for ISSR marker. Each vertical line represents an individual, and the different colors represent populations. The length of the colored segment illustrates the estimated proportion of membership in corresponding clusters as calculated through Structure 2.3.4

SCoT analysis

Out of 80 SCoT primers, 23 were selected for further analysis because they produced unambiguous and reproducible fragments. A total of 212 DNA fragments were obtained, of which 151 were polymorphic, with sizes between 200 and 3200 bp (Table 7). The number of amplified fragments per primer ranged from 3 (S53) to 23 (25). The detected polymorphism was 71.5%, ranging from 27% (S24) to 100% (S25, S26, S28 and S72). PIC values ranged from 0.03 (S15 and S53) to 0.33 (S25). H ranged from 0.11 (S30) to 0.33 (S15), and I varied from 0.12 (S59) to 0.53 (S32), indicating relatively low variation among accessions.

Table 7 Amplified bands (AB), polymorphic bands (PB), percentage polymorphism (P%), polymorphism information content (PIC), Nei’s gene diversity (H) and Shannon information index (I) shown by SCoT markers

All 212 DNA fragments were used for pairwise comparison of the accessions, and similarity coefficients were calculated. A dendrogram was constructed from the coefficient values (Fig. 4). The highest similarity, a coefficient of 0.94, was recorded between VRSL1 (West Bengal) and DSG7 (Uttar Pradesh). The dendrogram grouped all accessions into two major clusters, each of which was further subdivided into two sub-clusters. Limited grouping by geographical region was observed in sub-cluster I of cluster II, which contained accessions from West Bengal (VRSL1, VRSL2, VRSL3, VRSL5), Uttar Pradesh (DSG7, VRSL6, VRSL12, VRSL8, VRSL7, VRSL13, VRSL9, VRSL10, VRSL11, VRSL14, NDSG1, VRSL15) and Jharkhand (CHSG1 and CHSG2). Thus, the results suggest little correlation between geographic and genetic distances.
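The way a dendrogram is built from pairwise coefficients can be illustrated with a minimal UPGMA sketch. It takes distances (for example 1 minus the Jaccard similarity), repeatedly joins the two closest clusters, and recomputes distances to the new cluster as the size-weighted average of the distances to its two members. The labels and distances are hypothetical, and the function is a plain illustration rather than the NTSYS-pc routine.

```python
from itertools import combinations


def upgma(names, distances):
    """Average-linkage (UPGMA) clustering.

    distances maps a pair of labels (a, b) to their distance. Returns the
    merges in order as (cluster_1, cluster_2, distance_at_merge)."""
    sizes = {name: 1 for name in names}  # cluster label -> number of members
    dist = {frozenset(pair): d for pair, d in distances.items()}
    merges = []
    while len(sizes) > 1:
        # Find the closest pair among the current clusters
        a, b = min(
            combinations(sorted(sizes), 2),
            key=lambda pair: dist[frozenset(pair)],
        )
        merged = f"({a},{b})"
        merges.append((a, b, dist[frozenset((a, b))]))
        # Distance from the new cluster to every other cluster is the
        # size-weighted mean of the distances to its two members
        for other in sizes:
            if other in (a, b):
                continue
            dist[frozenset((merged, other))] = (
                sizes[a] * dist[frozenset((a, other))]
                + sizes[b] * dist[frozenset((b, other))]
            ) / (sizes[a] + sizes[b])
        sizes[merged] = sizes.pop(a) + sizes.pop(b)
    return merges


# Hypothetical distances, e.g. 1 - Jaccard similarity
distances = {("A", "B"): 0.2, ("A", "C"): 0.6, ("B", "C"): 0.8}
merges = upgma(["A", "B", "C"], distances)
```

Here A and B join first at distance 0.2, and C joins the pair at the average of its distances to A and B. A dendrogram drawn on a similarity scale, as in the figures of this study, shows the same joins at 1 minus these distances.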

Fig. 4 Dendrogram generated based on Jaccard similarity coefficient using UPGMA clustering through NTSYS-pc software, showing relationships among different accessions of sponge gourd with SCoT markers

To identify hidden population structure, Bayesian model-based clustering was performed, as it groups samples into genetically distinguishable groups. The sponge gourd population comprised two sub-populations, as ΔK peaked at K = 2. A clear pattern of ecological differentiation was not found; instead, spreading and mixing of the sub-populations was inferred (Fig. 5). These results were not in concordance with the UPGMA clustering. However, the ISSR and SCoT results resembled each other in both the dendrogram and the structure analysis.

Fig. 5 The population structure of sponge gourd based on the optimal value of K for SCoT marker. Each vertical line represents an individual, and the different colors represent populations. The length of the colored segment illustrates the estimated proportion of membership in corresponding clusters as calculated through Structure 2.3.4

Discussion

Understanding genetic diversity and genetic relationships is necessary for the conservation and sustainable management of species (Lynch et al. 1999). It facilitates the selection of diverse parental combinations with maximum genetic variability and the introgression of desirable genes from diverse germplasm into the available genetic base, which has proved useful in crop breeding programs (Barrett and Kidwell 1998; Thompson et al. 1998).

In the current study, two molecular marker systems, ISSR and SCoT, were used together with morphological traits to estimate genetic diversity within 45 accessions of sponge gourd. The parallel use of these three data sets allowed precise assessment of the relationships among the accessions. In our view, molecular investigation should be accompanied by analysis of morphological traits because, along with phylogeny, the study of agronomic traits is also essential, and this can be done through morphological analysis (Métais et al. 2000).

Higher morphological variation was recorded in leaf, fruit and seed traits. Correlation analysis among quantitative traits showed a significant correlation between fruit and seed traits. These results suggest that such polymorphic traits should be selected for in future breeding plans, as selecting plants with high-quality fruit traits will also tend to select plants with superior seed traits. Such a strategy would be beneficial in terms of both cost and time. The first seven principal components explained 72.70% of the total variation. Thus, accessions that were found to be diverse in these seven components could be selected. The results were in conformity with an earlier analysis in which high genetic variability was reported in fruit and seed characters of Indian sponge gourd accessions (Prakash et al. 2013). Leaf traits are considered reliable in the evaluation of genetic relationships and in taxonomic studies (Almajali et al. 2012). Hence, such traits could be used as markers for ascertaining genetic relationships between accessions and for distinguishing accessions.

Both molecular markers showed comparable genetic diversity values, but ISSR revealed a higher level of polymorphism. Differences were noticed in genetic similarity coefficient values and PIC values. The observed pattern of variation therefore depends on the marker system used. According to Karp and Edwards (1995), different markers have different characteristics and may reveal different features of genetic variability. Variation in the chromosomal location of the markers also influences the diversity pattern (Kojima et al. 1998). Overall, the genetic base of sponge gourd is inferred to be narrow in the present study. These results are in agreement with other studies, which concluded that genetic variation in sponge gourd is moderate (Tyagi et al. 2016; Marr et al. 2005), but not with those that reported high genetic diversity (Rathod et al. 2015; Prakash et al. 2014; Junhui and Changping 2008). Possible reasons include a single place of domestication or the use of a few key varieties for hybridization (Marr et al. 2005; Ghaffari et al. 2014). Other possible reasons are differences in the number of primers and the type of plant variety used (Hajibarat et al. 2015).

Dendrogram and population structure analyses did not demonstrate a definite pattern of relationship between geographical origin and genetic variability, as accessions from the same geographical region did not fall exclusively in one or two clusters. However, limited grouping was seen in 2–3 sub-clusters. This could be due to differences in the nature of each technique, such as the genomic regions from which the markers are designed, the genome coverage of each marker, the level of polymorphism and the number of loci (Gorji et al. 2011; Pakseresht et al. 2013; Souframanien and Gopalakrishna 2004). Hence, it is crucial to broaden the genetic base of sponge gourd to increase yield. This can be achieved by introgression of genes from wild species, which possess enormous variability.

In brief, the investigation suggested considerable diversity and provides a pathway for future crop variety identification and conservation. In addition, SCoT and ISSR markers, being economical, fast and informative, can be used in selecting diverse parents for crop improvement programs. The SCoT marker is a gene-targeted marker that can be converted into functional markers after sequencing and would be more useful for variability analysis and linkage map construction.