Introduction

Clerodendrum L. (Lamiaceae) is a large, diverse genus of small trees and shrubs distributed predominantly in the tropical and subtropical regions of Asia, Africa, America, Egypt, and Madagascar. About 580 species have been recorded globally and 23 species have been reported in India, of which 18 are documented in North East India [1]. The genus has well-established pharmacological properties and is used in folk medicine to treat disorders such as inflammation, dyspepsia, bronchitis, tumours, dropsy, asthma, hypertension, fever, leprosy, and cancer [2]. It shows a high degree of morphological and cytological variation (2n = 24 to 2n = 184) across its species, which points to a paraphyletic or polyphyletic origin of the genus [3]. Phylogenetic analysis of Clerodendrum sp. based on chloroplast DNA (cpDNA) and ITS sequences likewise revealed the polyphyletic nature of the genus [4]. Taxonomic classification based on morphological variation remains incoherent: different researchers have divided the genus into different subgenera, and the naming of species by more than one author has resulted in species dualism [5]. Various tribes of North East India use root and leaf extracts of different Clerodendrum species in traditional medicine as a vermifuge and to treat asthma, skin diseases, tumours, cancer, diabetes and obesity [6]. The leaves of C. colebrookianum are commonly used by the Khasi, Naga, Mishing and Assamese tribes of North East India to treat high blood pressure [7]. North East India covers 50% of India’s biodiversity, so it is crucial to evaluate the genetic structure and genetic diversity of the Clerodendrum species widely distributed in this region and eventually to use their valuable germplasm to classify the species into clusters according to their genetic similarities.

Assessing genetic variability supports both the genetic improvement and the effective conservation of a species [8]. Genetic diversity has generally been assessed using morphological traits, biochemical markers and molecular markers [9]. However, advances in molecular systematics suggest that molecular markers are more reliable than morpho-chemical characterization, because these markers are not influenced by environmental stress, natural selection or disease susceptibility [10]. Widely used molecular markers such as Amplified Fragment Length Polymorphism (AFLP), Randomly Amplified Polymorphic DNA (RAPD) and Inter Simple Sequence Repeats (ISSR), collectively known as Arbitrarily Amplified Dominant (AAD) markers, are used to study diversity within a genus based on the differing banding patterns of the species. In addition, newer DNA markers such as Sequence Related Amplified Polymorphism (SRAP), CAAT Box Derived Polymorphism (CBDP), Targeted Region Amplification Polymorphism (TRAP), Conserved Region Amplification Polymorphism (CRAP), and Start Codon Targeted Polymorphism (SCoT) are used in genetic fingerprinting, population genetic studies and quantitative trait loci (QTL) mapping. ISSR and RAPD are arbitrarily amplified markers based on non-coding regions of the genome, whereas SCoT primers are gene-targeted markers derived either from the gene itself or from its immediately flanking regions. SCoT has several advantages over RAPD, ISSR, and AFLP: the primers are easy to design, and they stably amplify reproducible bands from the conserved region around the translational initiation codon (ATG), so the bands correspond to functional genes and their associated traits [11]. To our knowledge, only a limited number of molecular marker-based studies on Clerodendrum sp., using SSR [12] and RAPD [13], have been reported to date. This study is therefore the first attempt to analyze genetic diversity in Clerodendrum sp. of North East India by comparing the efficiency of RAPD, ISSR and SCoT markers.

Methodology

Sample collection and extraction of genomic DNA

A total of 94 genotypes from 9 Clerodendrum species were collected from different locations in North East India. Details of the collected genotypes are listed in Support information Table S1. All collected leaf samples were stored at −20 °C for 24 h and then lyophilized (Scanvac, CoolSafe™) at −110 °C for 48 h. Lyophilization removes water from the frozen samples at low temperature and under high vacuum by sublimation (ice converts directly to vapor without passing through a liquid phase), which preserves the samples for long-term storage. For extraction of genomic DNA, 4 g of lyophilized leaves were ground and total genomic DNA was isolated using a modified CTAB method. DNA quality was checked on a 0.8% agarose gel and visualized in a gel documentation system (G:BOX, Syngene, U.K.). The total DNA yield (ng/µl) and the DNA purity (A260/A280 ratio) were estimated using a nanodrop (Eppendorf, Germany) (Support information Table S1). A working DNA stock of 15 ng/µl was then prepared and stored at 4 °C until further use.

Molecular markers genotyping

Clerodendrum species were screened with 3 dominant molecular marker systems: ISSR, RAPD, and SCoT. Initially, 20 primers for each marker system were screened using 9 Clerodendrum samples. A total of 12 ISSR primers, 16 RAPD primers, and 17 SCoT primers produced clear, reproducible banding patterns and were selected for further analysis. Primers that generated weak, absent or complex banding patterns were discarded (Support information Table S2). PCR amplification was performed in a total reaction volume of 20 µl in 96-well plates (Abdos) with 15 ng/µl of DNA template, 10X PCR buffer, 2.5 mM dNTP, 5 pmol primer and 1U Taq DNA polymerase (Sigma); the final volume was adjusted with nuclease-free water. Reactions were run in a Thermal Cycler (Applied Biosystem). The amplified products were separated on a 1.5% agarose gel at 95 V for 1.5 h and visualized in the gel documentation system (G:BOX, Syngene, U.K.). A low range DNA ladder of 100 bp to 3 kb (Genei) was used as the size reference for each marker analysis.

Scoring of data

ISSR, RAPD and SCoT bands were scored manually by visual inspection in a binary format based on the presence (1) or absence (0) of each band. Only distinct and reproducible bands were scored. To compare the marker systems, the total number of bands, the number of polymorphic and monomorphic bands, the percentage of polymorphism, the polymorphic information content (PIC), the resolving power (Rp), and the marker index (MI) were evaluated using Microsoft Excel. PIC was calculated as $\mathrm{PIC} = 1 - \sum_{i=1}^{n} p_i^2$, where $p_i$ is the frequency of the ith allele (band present) and the summation extends over n alleles [14]. The resolving power of each primer was estimated as $Rp = \sum I_b$, where the band informativeness is $I_b = 1 - [2 \times |0.5 - p_i|]$ and $p_i$ is the proportion of accessions containing the band [15].
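The band-level statistics described above can be illustrated with a minimal Python sketch. It is not the authors' Excel workflow: the function names and the toy matrix are hypothetical, and the PIC function assumes each dominant band has exactly two states (present with frequency p, absent with frequency 1 − p), which is one common reading of the PIC formula for dominant markers.

```python
def band_frequencies(matrix):
    """Fraction of genotypes showing each band (rows = genotypes, columns = bands)."""
    n = len(matrix)
    return [sum(row[j] for row in matrix) / n for j in range(len(matrix[0]))]


def pic_two_state(p):
    """PIC = 1 - sum of squared frequencies, assuming two states: present (p), absent (1 - p)."""
    return 1 - (p ** 2 + (1 - p) ** 2)


def resolving_power(freqs):
    """Rp = sum over bands of Ib, with Ib = 1 - 2 * |0.5 - p|."""
    return sum(1 - 2 * abs(0.5 - p) for p in freqs)


def percent_polymorphic(freqs):
    """Share of bands that are neither always present nor always absent."""
    polymorphic = sum(1 for p in freqs if 0 < p < 1)
    return 100 * polymorphic / len(freqs)


# Hypothetical scores for one primer: 4 genotypes x 3 bands.
scores = [
    [1, 0, 1],
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
]
freqs = band_frequencies(scores)           # band 1 is monomorphic, bands 2 and 3 are not
pic_values = [pic_two_state(p) for p in freqs]
rp = resolving_power(freqs)
poly = percent_polymorphic(freqs)
```

In this toy example the monomorphic band contributes nothing to either PIC or Rp, while a band present in exactly half of the genotypes is maximally informative for both.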

Distance matrices and cluster analyses of the datasets were generated using NTSYS (Numerical taxonomy and multivariate analysis system) software v2.1 [16]. The genetic relationships among all the Clerodendrum species were calculated as Jaccard’s similarity coefficients [17] using the SIMQUAL (similarity for qualitative data) module of NTSYS. A dendrogram representing the genetic relationships revealed by the similarity coefficients was generated by UPGMA using the SAHN (sequential agglomerative hierarchical and nested clustering) program in NTSYS.
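As a rough illustration of this step, the sketch below computes Jaccard similarities from binary band profiles and then performs UPGMA (average linkage) on the distances 1 − similarity. It is a simplified stand-in for the NTSYS SIMQUAL and SAHN modules, not a reimplementation of them; the genotype labels and profiles are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity: shared bands / bands present in at least one profile."""
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    return shared / either if either else 0.0


def upgma(dist, labels):
    """Average-linkage clustering; returns merges as (cluster_a, cluster_b, distance)."""
    clusters = {i: (labels[i],) for i in range(len(labels))}
    d = {(i, j): dist[i][j]
         for i in range(len(labels)) for j in range(i + 1, len(labels))}
    merges = []
    next_id = len(labels)
    while len(clusters) > 1:
        (i, j), height = min(d.items(), key=lambda kv: kv[1])
        ni, nj = len(clusters[i]), len(clusters[j])
        updated = {}
        for k in clusters:
            if k in (i, j):
                continue
            dik = d[(min(i, k), max(i, k))]
            djk = d[(min(j, k), max(j, k))]
            # Size-weighted mean keeps every original genotype equally weighted.
            updated[(k, next_id)] = (ni * dik + nj * djk) / (ni + nj)
        d = {pair: v for pair, v in d.items() if i not in pair and j not in pair}
        d.update(updated)
        merges.append((clusters[i], clusters[j], height))
        clusters[next_id] = clusters.pop(i) + clusters.pop(j)
        next_id += 1
    return merges


# Hypothetical band profiles for three genotypes.
labels = ["A", "B", "C"]
profiles = [[1, 1, 0, 1, 0], [1, 1, 0, 1, 1], [0, 0, 1, 0, 1]]
sim = [[jaccard(p, q) for q in profiles] for p in profiles]
dist = [[1 - s for s in row] for row in sim]
merges = upgma(dist, labels)
```

Here A and B share three of four bands and join first; C then joins the pair at the average of its distances to A and B.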

The binary (1/0) allelic data matrix was used for population genetic analysis in POPGENE v 1.32 [18], which reports the observed number of alleles (Na) and the effective number of alleles (Ne). Nei’s genetic diversity (H), polymorphism index (PI) and Shannon’s index (I) were computed for each Clerodendrum species. Measures of genetic differentiation, namely total gene diversity (Ht) and the coefficient of gene differentiation (Gst), were calculated using Nei’s gene diversity statistics [19].
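The diversity indices named above can be sketched for a single dominant locus as follows. This is an illustration under stated assumptions, not a description of POPGENE's internals: it assumes Hardy-Weinberg equilibrium, so that the null-allele frequency is the square root of the band-absent frequency, and it uses the standard relations Gst = (Ht − Hs)/Ht and Nm = 0.5(1 − Gst)/Gst. All input values are hypothetical.

```python
import math


def locus_diversity(absent_fraction):
    """Ne, Nei's H and Shannon's I for one dominant locus, assuming Hardy-Weinberg equilibrium."""
    q = math.sqrt(absent_fraction)   # null (band-absent) allele frequency
    p = 1 - q                        # band-present allele frequency
    homozygosity = p ** 2 + q ** 2
    ne = 1 / homozygosity            # effective number of alleles
    h = 1 - homozygosity             # Nei's gene diversity
    i = -sum(f * math.log(f) for f in (p, q) if f > 0)  # Shannon's index
    return ne, h, i


def gst(ht, hs):
    """Coefficient of gene differentiation: share of total diversity found among populations."""
    return (ht - hs) / ht


def gene_flow(gst_value):
    """Nm estimated from Gst as 0.5 * (1 - Gst) / Gst."""
    return 0.5 * (1 - gst_value) / gst_value


# Hypothetical locus: band absent in 25% of individuals.
ne, h, i = locus_diversity(0.25)

# Hypothetical diversity values, not taken from this study.
g = gst(ht=0.30, hs=0.15)
nm = gene_flow(g)
```

With the band absent in a quarter of individuals, both alleles have frequency 0.5, which gives the maximum two-allele values of Ne, H and I.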

Bayesian model-based clustering was used to infer population structure and to assign individuals to populations based on the ISSR, RAPD and SCoT data, using STRUCTURE software version 2.3.4 [20]. Membership of each accession was estimated for a range of genetic clusters from K = 2 to K = 10 with the admixture model and correlated allele frequencies. To obtain the optimum K value, a burn-in period of 5000 iterations was followed by 50,000 Markov chain Monte Carlo replicates, and each K was run 3 times. LnP(D) was derived for each K and plotted to locate the peak of the ΔK values [21]. ΔK was calculated with “structure harvester” (https://taylor0.biology.ucla.edu) from the second-order rate of change of the mean likelihood L(K) with respect to K. The partitioning of genetic variation among and within genotypes was also investigated with analysis of molecular variance (AMOVA) and Principal Coordinate Analysis (PCoA) using GenAlEx 6.501 software [22]. Cophenetic correlation coefficients were calculated for each marker with the COPH and MXCOMP procedures. Mantel tests comparing pairs of matrices, obtained from the cophenetic values of each marker, were performed in NTSYS v2.1 using the MXCOMP option.
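The ΔK statistic used to choose K can be sketched as below. This follows one common reading of Evanno's method, in which the absolute second difference of the mean LnP(D) across replicate runs is divided by the standard deviation of LnP(D) at that K; it is not the Structure Harvester source code, and the likelihood values are invented for illustration (three replicates per K, as in the run design above).

```python
from statistics import mean, stdev


def delta_k(lnp):
    """Map each interior K to |L(K+1) - 2 L(K) + L(K-1)| / sd(L(K)), using replicate means."""
    means = {k: mean(v) for k, v in lnp.items()}
    result = {}
    for k in sorted(lnp):
        if k - 1 not in lnp or k + 1 not in lnp:
            continue  # the second difference is undefined at the ends of the K range
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        sd = stdev(lnp[k])
        result[k] = second_diff / sd if sd > 0 else float("inf")
    return result


# Hypothetical LnP(D) values from three replicate runs per K.
runs = {
    2: [-5000.0, -5010.0, -4990.0],
    3: [-4500.0, -4510.0, -4490.0],
    4: [-4400.0, -4405.0, -4395.0],
    5: [-4380.0, -4390.0, -4370.0],
}
dk = delta_k(runs)
best_k = max(dk, key=dk.get)
```

Because ΔK needs both neighbours of K, it cannot be evaluated at the smallest or largest K tested, which is worth keeping in mind when the tested range is narrow.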

Results

To estimate the genetic diversity in Clerodendrum, three different PCR-based molecular markers (ISSR, RAPD, and SCoT) were used. The statistical details of each marker were as follows:

ISSR analysis

Of the 20 ISSR primers tested, 12 amplified a total of 79 clear bands in Clerodendrum sp., depending on their simple sequence repeat motifs. The number of amplified fragments per primer varied from 5 to 8, with an average of 6.58 (Table 1). Of the 79 bands, 56 were polymorphic, and the average polymorphism was 71.40% across all genotypes. The maximum number of polymorphic bands was observed with UBC809 and UBC810. The average PIC was 0.34, ranging from 0.28 to 0.37; it was highest for UBC811 and UBC836 and lowest for UBC810. MI and Rp of the ISSR primers varied from 0.27 to 1.72 and from 5.26 to 11.11, respectively. The maximum MI and Rp were found in UBC809 and UBC842. Nei’s gene diversity ranged from 0.13 ± 0.06 to 0.26 ± 0.10 with a mean of 0.18 ± 0.09, while the effective number of alleles ranged from 1.09 ± 0.23 to 1.30 ± 0.37 with a mean of 1.19 ± 0.31. The genetic diversity parameters Na, Ne, H and I were highest in C. infortunatum (Na = 1.53 ± 0.50, Ne = 1.30 ± 0.37, H = 0.26 ± 0.10, and I = 0.28 ± 0.08) and lowest in C. inerme (Na = 1.16 ± 0.39, Ne = 1.09 ± 0.23, H = 0.13 ± 0.06, and I = 0.20 ± 0.09) (Table 2). For the ISSR markers, the mean values of total gene diversity (Ht), gene diversity within populations (Hs), coefficient of gene differentiation (Gst) and level of gene flow (Nm) were 0.24 ± 0.04, 0.11 ± 0.01, 0.33, and 0.38, respectively (Support information Table S3). The Gst value of 0.33 indicates that about 33% of the total genetic divergence lies among the populations and the remaining 67% within the populations.

Table 1 Characteristics of ISSR, RAPD, and SCoT markers: total number of bands amplified, number of polymorphic bands, % polymorphism, PIC, MI and Rp values
Table 2 Genetic diversity parameters of Clerodendrum sp. based on ISSR, RAPD, and SCoT markers using POPGENE software

RAPD analysis

PCR amplification of Clerodendrum sp. with 16 RAPD primers yielded 126 scoreable bands, of which 111 were polymorphic (Table 1). The RAPD primers produced more polymorphic and scoreable bands than the ISSR primers. The PIC value ranged from 0.39 (OPB-1) to 0.69 (OPB-17), with an average of 0.55 per locus. The percentage of polymorphic fragments ranged from 71.42 to 100%, with an average of 88.23%. The MI and Rp of RAPD ranged from 1.40 (OPB-1) to 5.17 (OPA-18) and from 4.04 (OPB-5) to 9.67 (OPA-18), respectively. The level of genetic diversity measured with Na, Ne, H and I ranged from 1.06 ± 0.23 to 1.74 ± 0.48, 1.04 ± 0.16 to 1.36 ± 0.36, 0.09 ± 0.01 to 0.24 ± 0.11, and 0.13 ± 0.03 to 0.33 ± 0.17 with their standard deviation from 0.23 to 0.44, 0.16 to 0.36, 0.01 to 0.11 and 0.03 to 0.17, respectively (Table 2). Genetic diversity was highest in C. infortunatum, with Na, Ne, H and I values of 1.74 ± 0.48, 1.36 ± 0.36, 0.24 ± 0.11, and 0.33 ± 0.17, and lowest in C. thomsoniae, with Na, Ne, H and I values of 1.06 ± 0.23, 1.04 ± 0.16, 0.09 ± 0.01, and 0.13 ± 0.03. For the RAPD markers, the mean values of Ht, Hs, Gst, and Nm were 0.33 ± 0.02, 0.14 ± 0.01, 0.57 and 0.44, respectively (Support information Table S3). The Gst value of 0.57 indicates that about 57% of the total genetic divergence lies among the populations and the remaining 43% within the populations.

SCoT analysis

With SCoT, the 17 selected primers amplified 145 clear and scoreable bands, more than were obtained with the ISSR and RAPD markers. A total of 129 bands were polymorphic, an average of 7.59 per primer (Table 1). The PIC of SCoT ranged from 0.30 (S41) to 0.62 (S23, S43) with a mean of 0.57 per primer. Similarly, the values of MI and Rp varied from 0.95 (S41) to 7.35 (S17) and from 4.54 (S6) to 8.65 (S17), respectively. The genetic diversity values of Na, Ne, H and I ranged from 1.26 ± 0.44 to 1.83 ± 0.39, 1.16 ± 0.29 to 1.50 ± 0.37, 0.17 ± 0.06 to 0.29 ± 0.16 and 0.25 ± 0.05 to 0.43 ± 0.25, with averages of 1.58 ± 0.46, 1.35 ± 0.36, 0.23 ± 0.11, and 0.34 ± 0.16, respectively (Table 2). The SCoT primers revealed the highest genetic diversity in the C. infortunatum population (Na = 1.83 ± 0.39, Ne = 1.50 ± 0.37, H = 0.29 ± 0.16, I = 0.43 ± 0.25) and the lowest in C. thomsoniae (Na = 1.26 ± 0.44, Ne = 1.16 ± 0.29, H = 0.17 ± 0.06, I = 0.25 ± 0.05). The mean values of Ht, Hs, Gst, and Nm assessed with SCoT markers were 0.34 ± 0.03, 0.21 ± 0.11, 0.59 and 0.79, respectively (Support information Table S3). The Gst value of 0.59 indicates that about 59% of the total genetic divergence lies among the populations and the remaining 41% within the populations.

Genetic diversity and cluster analysis in Clerodendrum

Jaccard’s similarity coefficient was used to estimate the genetic diversity and relationships among the Clerodendrum sp. The genetic similarity ranged from 0.98 (Cine6 with Cine5, Cine7 with Cine4, Cine9 with Cine8, Cind4 with Cind2, Cind3, Cind5, Cind8 with Cind9 and Cind10, Cjap4 with Cjap2, Cjap3, Cpan3 with Cpan5) to 0.57 (Cinf2 and Cphi6) for ISSR; from 0.99 (Cinf12 with Cinf13 and Cine8 with Cine9) to 0.50 (Cinf9 and Cjap4) for RAPD; and from 0.99 (Ctho6 and Ctho7) to 0.41 (Ccol14 and Cine5) for SCoT.

The UPGMA dendrogram constructed in NTSYS for the ISSR markers resolved 6 major clusters (Fig. 1a). Cluster 1 includes 41 genotypes divided into 2 sub-clusters: sub-cluster 1a with 23 genotypes of C. colebrookianum (72.2%) and C. philippinum (90.9%), and sub-cluster 1b with 18 genotypes of C. indicum (33.3%), C. japonicum (100%) and C. paniculatum (100%). Cluster 2 consists of 20 genotypes, 10 each of C. inerme (100%) and C. indicum (66.7%). Cluster 3 contains 13 genotypes, 6 of C. serratum (100%) and 7 of C. thomsoniae (100%). Cluster 4 contains 6 genotypes, 5 of C. colebrookianum (27.8%) and 1 of C. infortunatum (7.1%). Cluster 5 and Cluster 6 are separate clusters containing 13 genotypes of C. infortunatum (92.9%) and 1 genotype of C. philippinum (9.1%), respectively.

Fig. 1

UPGMA tree-based grouping of 94 genotypes of Clerodendrum sp. based on a-ISSR, b-RAPD and c-SCoT markers. In the figure Clerodendrum sp. were represented as C1–C18 = C. colebrookianum; C19–C32 = C. infortunatum; C33–C43 = C. philippinum; C44–C53 = C. inerme; C54–C68 = C. indicum; C69–C74 = C. serratum; C75–C82 = C. japonicum; C83–C87 = C. paniculatum; C88–C94 = C. thomsoniae

RAPD cluster analysis resolved 3 major clusters (Cluster 1, Cluster 2 and Cluster 3) (Fig. 1b). Cluster 1 grouped 64 genotypes, whereas Cluster 2 and Cluster 3 comprise 14 genotypes of C. indicum (93.4%) and 14 genotypes of C. infortunatum (100%) along with 2 genotypes of C. philippinum (9.1%), respectively. Cluster 1 can be subdivided into 5 sub-clusters: Cluster 1a consists of 18 genotypes of C. colebrookianum (100%); Cluster 1b consists of 20 genotypes of C. inerme (100%), C. japonicum (100%), C. indicum (6.6%) and C. paniculatum (40%); Cluster 1c consists of 6 genotypes of C. serratum (100%); Cluster 1d consists of 9 genotypes of C. philippinum (90.9%); and Cluster 1e consists of 10 genotypes of C. paniculatum (60%) and C. thomsoniae (100%).

In cluster analysis of SCoT markers, the genotypes were divided into 3 major clusters (Fig. 1c). Cluster 1 contained 17 genotypes of C. colebrookianum (94.4%). In Cluster 2, 2 sub-clusters were formed as Cluster 2a with 26 genotypes of C. colebrookianum (5.6%), C. indicum (6.6%) and C. inerme (100%) and Cluster 2b with 26 genotypes of C. indicum (93.3%), C. serratum (100%), C. japonicum (100%), C. paniculatum (100%), and C. thomsoniae (100%). Cluster 3 consists of 25 genotypes of C. infortunatum (100%) and C. philippinum (100%).

Structure-based study

Population structure was analyzed for the 94 genotypes of Clerodendrum species with each of the three markers and evaluated with Structure harvester software (Fig. 2). Based on membership fractions, genotypes with ≥ 80% probability were assigned to the corresponding populations, while the others were categorized as admixture type.

Fig. 2

Population structure of 94 Clerodendrum genotypes inferred by STRUCTURE analysis based on a ISSR, c RAPD and e SCoT markers. The vertical bar with different color indicates the estimated membership coefficient for each genotype using Q statistic. Graph represent the estimated membership fraction using LnP(D)-derived ΔK with cluster number (K) ranged from 1 to 10 where b RAPD, d ISSR, f SCoT

According to Evanno’s method, the maximum peak of ΔK was observed at K = 6. This result indicates that six distinct populations were obtained with ISSR markers similar to the result of UPGMA cluster analysis. The individuals of 14 genotypes were grouped in Cluster 1 that includes all genotypes of C. japonicum and C. paniculatum along with 1 genotype of C. colebrookianum. In Cluster 2 and 3, all genotypes of C. infortunatum, C. indicum, and C. thomsoniae were found. The 17 genotypes of C. colebrookianum and 5 genotypes of C. philippinum were grouped in Cluster 4. Cluster 5 includes the entire genotypes of C. inerme with 1 each genotype of C. indicum and C. philippinum. Cluster 6 contained genotypes of C. serratum (6) and C. philippinum (3). 62 genotypes were pure and 32 genotypes were admixture type.

In the structure analysis of the RAPD dataset, the maximum peak of ΔK was observed at K = 6, followed by leveling off accompanied by an increase in variance. This result indicates 6 distinct populations of Clerodendrum sp. Of the 94 genotypes, Cluster 1 contained 18 genotypes among which 14 genotypes were from C. indicum and 5 genotypes from C. serratum. Cluster 2 (21 genotypes) includes genotypes from C. philippinum (9), C. japonicum (8), C. paniculatum (3), and C. serratum (1). Clusters 3, 4, 5 and 6 include all the genotypes of C. thomsoniae, C. infortunatum, C. colebrookianum, and C. inerme, respectively. Among all the genotypes, 77 were identified as pure and 17 as admixture type.

Four distinct populations were obtained with SCoT markers as high value of ΔK was observed at K = 4. In Cluster 1, 10 genotypes of C. colebrookianum were found. Similarly, Cluster 2 consists of 12 genotypes of C. indicum. Cluster 3 consists of 44 genotypes which includes 11 genotypes of C. philippinum, 10 genotypes of C. inerme, 14 genotypes of C. infortunatum, 8 genotypes of C. colebrookianum and 1 genotype of C. serratum. Cluster 4 consists of 28 genotypes of C. indicum (8 genotypes), C. thomsoniae (7 genotypes), C. japonicum (8 genotypes) and C. serratum (5 genotypes). 85 genotypes were pure and 9 were found to be admixture genotypes.

Analysis of molecular variance (AMOVA) based study

AMOVA was used to examine population differentiation among and within Clerodendrum species, and results were considered statistically significant at P < 0.001 (Support information Fig. S1). The ISSR data showed 47% of the variation among populations and 53% within populations. Similarly, RAPD-based AMOVA showed 53% and 47% of the variation among and within populations, respectively. SCoT-based AMOVA gave the least variation among populations (32%) and the greatest variation within populations (68%). Across the three marker systems, 32–53% of the variation was partitioned among populations and 47–68% within populations. Additionally, the SCoT marker showed high genetic variation within and among the populations of Clerodendrum species in contrast to the ISSR and RAPD markers.

Principal coordinate analysis (PCoA) based study

Principal coordinate analysis was performed to represent the genetic distances among individuals spatially and to check the consistency of the population differentiation defined by cluster analysis. The PCoA projection was plotted on a two-dimensional scatter plot (Fig. 3). In the ISSR-based analysis, the first three principal coordinates explained 15.80%, 14.36% and 12.18% of the total variation, respectively. The first two principal axes accounted for 30.16% of the total molecular variance. Three species, C. infortunatum, C. indicum and C. inerme, formed clearly distinct groups.

Fig. 3

Principal coordinates analysis of 94 genotypes of Clerodendrum sp. based on genetic similarity matrix derived from a- ISSR, b- RAPD and c- SCoT markers. In the figure Clerodendrum sp. were represented with coloured dots and sample I.D as: C1–C18 = C. colebrookianum; C19–C32 = C. infortunatum; C33–C43 = C. philippinum; C44–C53 = C. inerme; C54–C68 = C. indicum; C69–C74 = C. serratum; C75–C82 = C. japonicum; C83–C87 = C. paniculatum; C88–C94 = C. thomsoniae

Based on the RAPD data, the first three axes explained 42.27% of the total variation (axis 1: 18.86%, axis 2: 13.08% and axis 3: 10.33%). C. colebrookianum, C. infortunatum, C. thomsoniae and C. indicum formed distinct clusters. The PCoA based on SCoT markers agreed with the dendrogram, and 24.94% of the total variation was explained by the first three coordinates (PCoA1: 9.61%, PCoA2: 8.66% and PCoA3: 6.67%).

Combined data analysis for diversity and population structure

The data from the three markers (ISSR, RAPD, and SCoT) were combined to better understand the diversity and population structure. The UPGMA dendrogram grouped all the genotypes into three clusters (Support information Fig. S2). Cluster 1 consists of 18 genotypes of C. colebrookianum (100%); Cluster 2 contains 60 genotypes in two sub-clusters, Cluster 2a with 9 genotypes of C. philippinum (81.8%) and Cluster 2b with 51 genotypes of C. inerme (100%), C. indicum (100%), C. serratum (100%), C. japonicum (100%), C. paniculatum (100%) and C. thomsoniae (100%). Cluster 3 consists of 16 genotypes, with 14 genotypes of C. infortunatum and 2 genotypes of C. philippinum (18.2%). The cophenetic coefficients for the three molecular markers indicate a good fit for clustering, with values of 0.86 for ISSR, 0.89 for RAPD and 0.90 for SCoT. The Mantel test showed a positive correlation between the three marker types. The correlation coefficient (r) was 0.37 between ISSR and RAPD, 0.55 between ISSR and SCoT and 0.42 between RAPD and SCoT. The PCoA based on the combined markers grouped the genotypes similarly to the UPGMA clustering method. The first and second principal axes explained 9.81% and 8.80% of the variation, respectively. The structure of the 94 Clerodendrum genotypes was further analyzed with STRUCTURE software using the Bayesian clustering model. The result showed that the highest value of ΔK was observed for K = 4 which indicates three distinct populations (Support information Fig. S3). Cluster 1 consists of 32 genotypes which include 10 genotypes of C. inerme, 15 genotypes of C. indicum, 4 genotypes of C. philippinum and 3 genotypes of C. serratum. Cluster 2 consists of 39 genotypes of C. infortunatum (14 genotypes), C. colebrookianum (18 genotypes) and C. philippinum (7 genotypes). Cluster 3 contains 23 genotypes of C. thomsoniae (7 genotypes), C. japonicum (8 genotypes), C. paniculatum (5 genotypes) and C. serratum (3 genotypes). Of the 94 genotypes, 84 were pure and 10 were of admixture type. AMOVA revealed 42% of the variation among populations and 58% within populations. The principal coordinate analysis showed that the first three coordinates explained 25.56% of the total variation.

Discussion

Assessing genetic diversity with informative molecular markers can support genetic improvement, germplasm maintenance and the identification of unique genotypes or individuals [23]. In the present study, three PCR-based dominant molecular markers, ISSR (12 primers), RAPD (16 primers), and SCoT (17 primers), were selected to assess the genetic diversity among 9 species of Clerodendrum. The ISSR, RAPD and SCoT markers generated a high percentage of polymorphism among the genotypes, which indicates that the markers are highly informative [24]. The level of polymorphism was highest with SCoT (88.6%), followed by RAPD (88.23%) and ISSR (71.41%). Similar results were reported in crops such as grape [25], Cicer [26], peanut [27] and mango [28]. Higher levels of polymorphism in RAPD than in ISSR were also reported earlier in rice bean [29] and Myristica spp. [30]. The mean marker index (MI), which evaluates the overall efficacy of a marker system, was almost the same for RAPD and SCoT and higher than for ISSR. These close marker index values indicate that RAPD and SCoT are about equally efficient for fingerprinting Clerodendrum species.

The genetic parameters (Na, Ne, H and I) were highest with SCoT markers, followed by RAPD and ISSR markers. According to Nei, Gst is considered low when Gst < 0.05, medium when 0.05 < Gst < 0.15, and high when Gst > 0.15. Genetic differentiation among populations was greatest with SCoT (Gst = 0.59), followed by RAPD (Gst = 0.57) and ISSR (Gst = 0.33). This high level of population differentiation may arise from factors such as the breeding system of the species, genetic drift or the geographical origin of the populations [31, 32]. A value of Nm < 0.5 indicates a restriction of genetic drift due to an adequate amount of gene flow, and Nm > 1.0 indicates the initiation of variation as a result of genetic drift in the populations [33]. The observed Nm values for SCoT (0.79), ISSR (0.38), and RAPD (0.44) indicate extensive gene flow among the populations. In contrast to the RAPD and ISSR marker systems, SCoT can generate more information correlated with biological traits and reveals higher genetic polymorphism [9]. SCoT markers are also easier to amplify than RAPD and ISSR because their primers are longer [34]. Various studies have reported the efficacy of SCoT markers in diverse plant species such as potato [35], tomato [36], wheat [37], Kalmegh [8], Vigna unguiculata [38] and Taxus [39].

Furthermore, the UPGMA dendrograms of ISSR, RAPD, and SCoT grouped the genotypes similarly, with minor deviations for the ISSR marker. Such deviations in clustering could arise from variation in the size of the target genome [40]. Comparison of the clustering patterns in this study showed that the location specificity of SCoT was higher than that of the ISSR and RAPD markers, and that the clustering agreed with the results obtained with PCoA. Jaccard’s coefficient was used to assess the genetic diversity of Clerodendrum sp. ISSR showed the highest similarity (0.98) within C. inerme (Cine6–Cine5, Cine7–Cine4, Cine9–Cine8), C. indicum (Cind4–Cind2, Cind3, Cind8–Cind9, Cind10), C. japonicum (Cjap4–Cjap2, Cjap3) and C. paniculatum (Cpan3–Cpan5), and the lowest (0.57) between C. infortunatum (Cinf2) and C. philippinum (Cphi6). Similarly, RAPD showed the highest similarity (0.99) within C. infortunatum (Cinf12–Cinf13) and C. inerme (Cine8–Cine9), and the lowest (0.50) between C. infortunatum (Cinf9) and C. japonicum (Cjap4). With SCoT, the highest similarity (0.99) was observed within C. thomsoniae (Ctho6–Ctho7) and the lowest (0.41) between C. colebrookianum (Ccol14) and C. inerme (Cine5). It can be further concluded that C. infortunatum, C. philippinum, C. japonicum, C. colebrookianum, and C. inerme are distantly related to each other.

The grouping of the dendrogram clusters was further supported by the Bayesian clustering algorithm in STRUCTURE software, which determines the true number of clusters (K) in a given sample of genotypes using an ad hoc statistic, ΔK, based on the rate of change in the log probability of the data between successive values of K [21]. The best inference of ΔK was obtained at K = 6 for ISSR, K = 6 for RAPD and K = 4 for SCoT. AMOVA indicated that, for the ISSR and SCoT markers but not for RAPD, more of the genetic diversity was distributed within populations than among them. This could help in developing strategies for species collection and evaluation. Low variability among populations was also reported in fig [41] and tetraploid potato [35].

Combining molecular markers makes it possible to analyze various regions of the genome, both functional and neutral [42]. In this study, ISSR, RAPD, and SCoT were combined to analyze the genetic diversity of Clerodendrum sp. The dendrogram of ISSR + RAPD + SCoT showed a grouping similar to the dendrograms of the RAPD and SCoT markers. Based on SM similarity coefficients, the dendrogram displayed a close relationship between genotypes of C. thomsoniae (C93–C94) and a distant relationship between C. infortunatum (C20) and C. serratum (C71). The correlation between genetic distance and geographical origin of the Clerodendrum genotypes suggests that natural selection has significantly influenced the genome regions amplified with the ISSR, RAPD and SCoT markers.

The present study revealed a high level of genetic diversity among the Clerodendrum genotypes, and this level of genetic variation could be useful in systematic germplasm management and in plant breeding programmes. The SCoT marker, followed by RAPD and ISSR, showed a high level of polymorphism and could be further used in fingerprinting of Clerodendrum species.

Conclusion

Genetic diversity is a crucial indicator of the short- and long-term survival of species in their natural habitats. In this study, 94 genotypes from nine species of Clerodendrum were evaluated using ISSR, RAPD and SCoT markers, which revealed a high level of genetic diversity. Based on the values of PIC, Rp, MI, Ne, H, I, Ht, Hs and Nm, SCoT markers were more discriminatory and informative than ISSR and RAPD markers. UPGMA analysis revealed similar clustering of genotypes for the RAPD and SCoT markers, with minor deviations for the ISSR markers, and the clustering was similar to the PCoA results. In AMOVA, the maximum variation among populations was 53%, whereas the maximum variation within populations was 68%. Structure-based analysis with the three markers gave different numbers of populations, which did not correspond to the UPGMA trees except for the SCoT-based population structure. We therefore tentatively propose that SCoT markers could be useful in differentiating the nine species and other representatives of the genus Clerodendrum. Further, this study provides valuable baseline data on the population genetics of Clerodendrum species of North East India for marker-assisted selection, the choice of marker technologies for breeding, and comparative genetic studies.