Introduction

Terminalia bellerica, commonly known as belleric myrobalan, is a large deciduous tree belonging to the family Combretaceae. It is distributed throughout tropical Asia, including India [1]. It is a multipurpose tree providing raw material to the pharmaceutical, animal husbandry, leather, dyeing, soap, chemical, resin and gum, paper, railway, match, oil and cosmetic industries [2]. It is a secondary host of the tasar silkworm. It is well known for its antimicrobial [3], antifungal, anti-HIV and antimalarial activities [4] and is extensively used in herbal drug preparations in Indian Ayurveda.

Overexploitation of Terminalia by the timber, leather and pharmaceutical industries, together with losses from overgrazing, indiscriminate felling and conversion of forest land to agriculture and human settlements, has led to a rapid depletion of the genetic resources of Terminalia species [5]. A better understanding of genetic diversity and its distribution is essential for the conservation and use of the species, as it helps determine what to conserve as well as where to conserve it [6]. PCR-based markers such as RAPD and ISSR are considered powerful tools for assessing existing genetic variation at the species, genus and population levels. These markers are unaffected by environmental influences, reveal a large number of polymorphisms and do not require prior DNA sequence information for the species [7].

One of the major problems faced by the herbal pharmaceutical industry is the lack of quality control, which can affect drug efficacy and safety [8]. Quantifying medicinally active constituents helps identify regional populations and genotypes with high secondary metabolite yield, because the capacity of plants to produce secondary metabolites varies with geographical origin and developmental stage [9].

The present study is the first attempt at diversity analysis in T. bellerica, with the aim of ascertaining the nature and extent of genetic diversity among different accessions. It characterizes the morphological and phytochemical diversity of the germplasm and evaluates molecular diversity using RAPD and ISSR markers.

Materials and Methods

Plant Material

Terminalia bellerica leaf samples were collected from different geographic locations representing four districts of Rajasthan, i.e. Udaipur, Dungarpur, Banswara and Jaipur (Table 1). Total genomic DNA was extracted from young leaves following the standard CTAB method [10] with minor modifications. Five grams of leaves were ground in liquid nitrogen, homogenized in 25 ml of extraction buffer (2% CTAB, 20 mM EDTA, 2% PVP, 1.4 M NaCl, 100 mM Tris–HCl pH 8.0 and 1% β-mercaptoethanol) and incubated at 65°C for 1 h. The supernatant was treated with RNase A, incubated at 37°C for 30 min and extracted twice with chloroform:isoamyl alcohol (24:1 v/v). The DNA was pelleted with chilled isopropanol and washed twice with 70% ethanol. The pelleted DNA was air dried and stored at −20°C. DNA concentration was determined using known amounts of λ DNA as a standard.

Table 1 Details of T. bellerica accessions collected from different locations in Rajasthan

RAPD Amplification

PCR amplification was performed with 40 random decamer primers obtained from Operon Technologies (Alameda, USA). Amplification was performed in a 20 μl reaction volume containing 25 ng DNA, 1× PCR buffer (10 mM Tris pH 9.0, 50 mM KCl, 1.5 mM MgCl2), 200 μM of each dNTP, 20 pmol of RAPD primer and 0.5 U of Taq DNA polymerase (Bangalore Genei, India). PCR reactions were performed in a thermal cycler (Bio-Rad, UK) with an initial denaturation at 94°C for 3 min, followed by 45 cycles of 94°C for 45 s, 36°C for 45 s and 72°C for 2 min, and a final extension at 72°C for 7 min. Amplified products were separated on a 1.5% agarose gel in 1× TBE buffer by electrophoresis at 100 V and visualized by ethidium bromide staining in a GelDoc system (Bio-Rad, UK). The size of the amplification products was determined using a 1 kb marker as a molecular standard.

ISSR Amplification

PCR amplification was performed with 20 ISSR primers (University of British Columbia, Vancouver, Canada). The PCR mixture (20 μl) consisted of 25 ng DNA, 1× PCR buffer (10 mM Tris pH 9.0, 50 mM KCl, 1.5 mM MgCl2), an additional 1.5 μl of 25 mM MgCl2, 100 μM dNTP, 20 pmol of ISSR primer and 0.5 U Taq DNA polymerase (Bangalore Genei, India). PCR amplifications were performed with an initial denaturation at 94°C for 4 min, followed by 35 cycles of 30 s at 92°C, 1 min at the annealing temperature (depending upon the primer) and 2 min elongation at 72°C, and a final extension at 72°C for 7 min. The amplified products were electrophoresed in 1× TBE buffer at 100 V on a 1.8% agarose gel using a 1 kb marker as a molecular standard.
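The final magnesium and primer concentrations of the ISSR reaction follow from the volumes given above by simple dilution arithmetic. The sketch below assumes that the 1× buffer contributes 1.5 mM MgCl2 to the final mix and that the supplementary MgCl2 is added on top of it; the function name is illustrative only.

```python
def final_concentration(stock_conc, added_ul, total_ul):
    """Dilution: C_final = C_stock * V_added / V_total (same unit as stock_conc)."""
    return stock_conc * added_ul / total_ul


REACTION_UL = 20.0

# Assumption: the 1x buffer supplies 1.5 mM MgCl2 in the final reaction.
buffer_mgcl2_mM = 1.5
extra_mgcl2_mM = final_concentration(25.0, 1.5, REACTION_UL)  # 1.5 ul of 25 mM stock
total_mgcl2_mM = buffer_mgcl2_mM + extra_mgcl2_mM

# 20 pmol of primer in 20 ul: 1 pmol/ul is numerically equal to 1 uM.
primer_uM = 20.0 / REACTION_UL

print(f"supplementary MgCl2: {extra_mgcl2_mM:.3f} mM")
print(f"total MgCl2:         {total_mgcl2_mM:.3f} mM")
print(f"primer:              {primer_uM:.1f} uM")
```

Under these assumptions the ISSR reaction contains about 3.4 mM MgCl2, roughly twice the concentration of the RAPD reaction.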

Morphological Characterization

Data were recorded at least three times for eleven morphological traits from each accession. Morphological parameters were divided into five quantitative and six qualitative characters. The quantitative characters were number of primary branches, fruit length, leaf length, leaf width and weight of 50 fruits. The qualitative morphological descriptors included bark color (grey, brown), stem nature (cracked, rough, smooth), phyllotaxy (alternate, opposite), leaf shape (elliptical, ovate, lanceolate, obovate) and fruit shape (globular, ovate).

HPLC Analysis

Isolation of gallic acid from leaf samples was performed using the method described by Juang et al. [11]. HPLC analysis was performed on an Agilent (Germany) model 1200 system. Separation was achieved on a reverse-phase column (Eclipse XDB C-18, 4.5 × 150 mm, particle size 1.8 μm; Agilent) using water (A) and acetonitrile (B), each containing 0.1% acetic acid, as solvents, with an online UV diode array detector (UV-DAD) at 254 nm. The solvent gradient was as described by Juang et al. [11] and the flow rate was 0.4 ml min−1. A sample volume of 10 μl was injected and the column temperature was maintained at 27°C during the run. Authentic gallic acid (Chromadex, CA, USA) was used as a reference standard to confirm the identity and resolution of the gallic acid peak under these HPLC conditions. The gallic acid concentration in the samples was computed from a calibration curve of concentration versus detector response (peak area), prepared with different concentrations of standard solutions of gallic acid in methanol. HPLC data were analyzed with the Chemstation LC-3D software (Agilent).
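The calibration step can be illustrated with a minimal sketch: a straight line is fitted to peak area versus standard concentration by least squares and then inverted for an unknown sample. The standard concentrations, peak areas and units below are hypothetical and are not taken from this study; a linear detector response is assumed.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def concentration_from_area(area, slope, intercept):
    """Invert the calibration line to estimate concentration from peak area."""
    return (area - intercept) / slope


# Hypothetical gallic acid standards (ug/ml) and their peak areas (arbitrary units).
std_conc = [10.0, 20.0, 40.0, 80.0]
std_area = [105.0, 205.0, 405.0, 805.0]

slope, intercept = fit_line(std_conc, std_area)
sample_conc = concentration_from_area(305.0, slope, intercept)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, sample={sample_conc:.2f} ug/ml")
```

In practice the sample peak area should fall within the range covered by the standards, since the fitted line is not validated outside it.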

Statistical Analysis

The standardized mean values of the morphochemical traits were used to perform cluster analysis using NTSYS-pc version 2.02e [12], and a dendrogram was constructed using the unweighted pair group method with arithmetic average (UPGMA). Principal component analysis was undertaken with the STAND, CORR and EIGEN modules of NTSYS-pc.
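The standardization and UPGMA steps can be sketched in a few lines of plain Python. This is an illustration of the general technique, not a reimplementation of NTSYS-pc: the accession labels and trait values are hypothetical, and Euclidean distance is assumed as the dissimilarity measure.

```python
import math


def standardize(rows):
    """Z-score every trait (column): subtract the mean, divide by the sample SD."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / (len(c) - 1))
           for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def upgma(labels, dist):
    """Agglomerate clusters by mean pairwise distance between their members.

    dist[(a, b)] must hold the distance for every ordered pair of labels.
    Returns the merges as (cluster_1, cluster_2, height) in the order made.
    """
    def average(c1, c2):
        return sum(dist[(a, b)] for a in c1 for b in c2) / (len(c1) * len(c2))

    clusters = [(lab,) for lab in labels]
    merges = []
    while len(clusters) > 1:
        height, i, j = min(
            (average(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        merges.append((clusters[i], clusters[j], height))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges


# Hypothetical data: leaf length and leaf width (cm) for four accessions.
labels = ["A1", "A2", "A3", "A4"]
traits = [[12.0, 5.0], [12.5, 5.2], [18.0, 8.0], [18.4, 8.1]]

z = standardize(traits)
dist = {(a, b): euclidean(z[i], z[j])
        for i, a in enumerate(labels) for j, b in enumerate(labels)}
merges = upgma(labels, dist)
for left, right, height in merges:
    print(left, right, round(height, 3))
```

Standardizing first keeps traits measured on large scales, such as fruit weight, from dominating the distances.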

The DNA fingerprint patterns obtained were converted into binary data matrices of 0s and 1s. The RAPD and ISSR bands were scored visually for the presence (1) or absence (0) of bands of various molecular weights. Only polymorphic and reproducible bands were considered for the analysis. The binary character matrix was analysed using POPGENE ver 1.31 [13]. The following genetic diversity parameters were estimated: Shannon’s index of phenotypic diversity (I), Nei’s gene diversity (h), polymorphic band percentage (PBP), coefficient of genetic differentiation (GST) and gene flow (Nm) [14]. Nei’s genetic identity and genetic distance between populations were also calculated.
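To make these parameters concrete, the sketch below computes h, I and PBP from a small presence/absence matrix. It assumes dominant markers in Hardy-Weinberg equilibrium, so that the null-allele frequency is the square root of the band-absence frequency; this is a common convention for dominant data, but it is an assumption here and the exact estimators used by POPGENE may differ. The matrix is hypothetical.

```python
import math


def locus_stats(column):
    """Return (Nei's gene diversity h, Shannon's index I) for one band position."""
    n = len(column)
    q = math.sqrt(column.count(0) / n)  # null-allele frequency, assuming HWE
    p = 1.0 - q
    h = 1.0 - p * p - q * q
    i = -sum(f * math.log(f) for f in (p, q) if f > 0)
    return h, i


def diversity(matrix):
    """Mean h, mean I and polymorphic band percentage over all loci.

    matrix: one row per accession, one column per band (1 = present, 0 = absent).
    """
    loci = [list(col) for col in zip(*matrix)]
    stats = [locus_stats(col) for col in loci]
    polymorphic = sum(1 for col in loci if 0 < sum(col) < len(col))
    return {
        "h": sum(s[0] for s in stats) / len(loci),
        "I": sum(s[1] for s in stats) / len(loci),
        "PBP": 100.0 * polymorphic / len(loci),
    }


# Hypothetical population: 4 accessions scored at 3 band positions.
bands = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [1, 0, 1],
]
result = diversity(bands)
print({k: round(v, 3) for k, v in result.items()})
```

A band present in every accession contributes zero to both h and I, which is why monomorphic bands lower the population-level averages.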

Data were analysed using the SIMQUAL routine to generate Jaccard’s similarity coefficients in NTSYS-pc version 2.02e [12]. The similarity matrices were used to construct dendrograms from the pooled marker data using the UPGMA algorithm and SAHN clustering. Finally, an ordination analysis was performed to provide another means of testing the relationships among accessions: two-dimensional (2D) and three-dimensional (3D) principal component plots were constructed using the EIGEN program (NTSYS-pc).
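Jaccard’s coefficient for two banding profiles is the number of bands shared by both accessions divided by the number of bands present in at least one of them; joint absences are ignored. A minimal sketch with hypothetical profiles and labels:

```python
def jaccard(a, b):
    """Jaccard similarity of two 0/1 band profiles of equal length."""
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    # Convention chosen for this sketch: two profiles without any band score 0.
    return shared / either if either else 0.0


def similarity_matrix(profiles):
    """Pairwise Jaccard similarities for a dict of label -> band profile."""
    return {(p, q): jaccard(profiles[p], profiles[q])
            for p in profiles for q in profiles}


# Hypothetical profiles for three accessions scored at five band positions.
profiles = {
    "X1": [1, 1, 0, 1, 0],
    "X2": [1, 0, 0, 1, 1],
    "X3": [0, 0, 1, 0, 1],
}
sim = similarity_matrix(profiles)
print(sim[("X1", "X2")], sim[("X1", "X3")], sim[("X2", "X3")])
```

Because joint absences are excluded, the coefficient suits dominant markers, where a shared absent band need not reflect a shared genotype.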

The robustness of each phenogram was evaluated by a bootstrap analysis [15] of each data set using the computer program WINBOOT [16]. Each phenogram was reconstructed 1,000 times by repeated sampling with replacement. The frequency with which a particular grouping was identified was taken to reflect the strength of the grouping.
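The resampling idea can be sketched as follows. Band positions are drawn with replacement, similarities are recomputed, and the fraction of replicates in which two accessions group together is recorded. As a deliberate simplification, "grouping" is taken here to mean that the two accessions are each other's most similar partner; WINBOOT rebuilds the whole phenogram for every replicate, which this sketch does not attempt. All profiles and labels are hypothetical.

```python
import random


def jaccard(a, b):
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    return shared / either if either else 0.0


def nearest(label, data):
    """Most similar other accession (ties resolved by dictionary order)."""
    others = [o for o in data if o != label]
    return max(others, key=lambda o: jaccard(data[label], data[o]))


def pair_support(profiles, pair, n_boot=1000, seed=0):
    """Fraction of bootstrap replicates in which `pair` are mutual nearest neighbours."""
    rng = random.Random(seed)
    n_loci = len(next(iter(profiles.values())))
    first, second = pair
    hits = 0
    for _ in range(n_boot):
        # Resample band positions (columns) with replacement.
        cols = [rng.randrange(n_loci) for _ in range(n_loci)]
        boot = {lab: [prof[c] for c in cols] for lab, prof in profiles.items()}
        if nearest(first, boot) == second and nearest(second, boot) == first:
            hits += 1
    return hits / n_boot


# Hypothetical profiles: Y1 and Y2 are identical, Y3 shares no band with them.
profiles = {
    "Y1": [1, 1, 1, 0, 0, 0],
    "Y2": [1, 1, 1, 0, 0, 0],
    "Y3": [0, 0, 0, 1, 1, 1],
}
support_close = pair_support(profiles, ("Y1", "Y2"), n_boot=200, seed=1)
support_far = pair_support(profiles, ("Y1", "Y3"), n_boot=200, seed=1)
print(support_close, support_far)
```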

Results and Discussion

Two types of plant habit (erect and spreading trees) were seen in T. bellerica (Table 2). In the present study, 89% of the accessions were erect (Fig. 1a) and only 11% were spreading (Fig. 1b). Stem colour was of two types, brown and grey, and most accessions (78.5%) had grey stems. Considerable variability was observed in lamina shape: leaf shape (Fig. 1c–f) was ovate in 17 accessions, elliptical in 4, lanceolate in 1 and obovate in 6. Phyllotaxy was alternate in all the accessions. Ovate fruit shape was observed in only 2 accessions (Fig. 1h). Gallic acid content varied between 1.7 and 4.9%, with the highest content (4.9%) in accession G4.

Table 2 Morphological variation and gallic acid content observed among the T. bellerica accessions
Fig. 1 Morphological variation in T. bellerica. a Erect type of tree, b spreading type of tree, c elliptical leaf, d lanceolate leaf, e ovate leaf, f obovate leaf, g fruit size, h fruit shape (ovate, globular)

Principal component analysis (PCA) was conducted on the six quantitative morphochemical traits (leaf length, leaf width, fruit length, gallic acid content, fruit weight and number of primary branches) in order to summarize the data. The first two principal components (PC) explained 69.38% of the variation among the accessions. PC1, explaining 47.35% of the variation, was linked to fruit length, leaf length, fruit weight and number of primary branches. PC2, explaining 22.03% of the variation, was associated mainly with gallic acid content (Table 3). In the cluster analysis, the 28 T. bellerica accessions were grouped into 3 clusters at a coefficient of 1.47 (Fig. 2). The first cluster consisted of 12 accessions with higher leaf length, fruit length and fruit weight. The second cluster consisted of 15 accessions, and the third cluster had only one accession (S1), which had lower fruit length and weight than the other accessions.

Table 3 Eigenvectors and values of the first two PC for six quantitative morphochemical characters of 28 T. bellerica accessions
Fig. 2 Dendrogram based on 6 quantitative traits

Overall, 40 RAPD and 20 ISSR primers were used to examine the extent of genetic diversity in the 28 accessions. Of the 40 RAPD primers, 23 resulted in amplification, and the 15 primers that gave polymorphic and reproducible banding patterns were chosen for further study. Of the 20 ISSR primers, 4 gave polymorphic banding patterns. The 19 primers (RAPD + ISSR) produced a total of 81 scorable bands, out of which 55 were polymorphic with an average of 5.8 polymorphic bands per primer. Figure 3 shows the RAPD profile of the 28 accessions amplified with primer OPT-06. Percent polymorphism ranged from 25% (OPF-07, OPT-20) to 100% (OPT-09, OPJ-20) with an average of 57.2% (Table 4). Thus, a moderate level of polymorphism was observed in T. bellerica. Similar results were reported for Jatropha curcas accessions (60.2%) [17]. In contrast, 87.5% polymorphism has been reported using RAPD markers for Syzygium cumini [18]. The percent polymorphism may vary significantly among plant species, because the products amplified depend on the sequence of the random primers and their compatibility with the genomic DNA [18].

Fig. 3 RAPD profile of 28 accessions of T. bellerica using primer OPT-06. Lane M is the 1 kb marker

Table 4 RAPD and ISSR primers used for the screening of the 28 accessions of T. bellerica

To analyze the relatedness among the genotypes, a dendrogram was constructed from the similarity matrix of Jaccard’s coefficients using the UPGMA algorithm. As evident from the dendrogram, the 28 genotypes from different sites formed 4 clusters at 59% similarity. Bootstrap analysis with 1,000 replicates revealed high confidence values around the key nodes. As shown in Fig. 4, cluster 1 exclusively represented one major ecogeographical region, as it grouped all the samples collected from Udaipur (U1 to U9). Cluster 2 was represented by the Ghatol (Banswara) and Dungarpur populations, cluster 3 by the Peepalkhoont (Banswara) genotypes and cluster 4 by the Jaipur accessions. Clustering was region specific; however, some deviations were noticed, as the populations of Ghatol and Peepalkhoont (Banswara), which are separated by a smaller spatial distance, formed separate clusters in the RAPD + ISSR based dendrogram. Other workers have also reported geographical clustering patterns [19, 20], whereas non-ecogeographical clustering was observed in Terminalia arjuna [5].

Fig. 4 Dendrogram based on RAPD and ISSR markers

Similarity matrix values using Jaccard’s coefficient ranged from 0.28 to 0.96. The matrix revealed maximum similarity between the accession pairs G3, G4 (0.96); G1, G2 (0.93); and G5, G6 (0.93). The most divergent pairs, with the lowest similarity values, were U3, J1 (0.28); U7, J1 (0.28); U3, J4 (0.29); U6, J4 (0.29) and U7, J4 (0.29).

In addition to cluster analysis, PCA was performed to group the accessions. No major differences were observed between the dendrogram and the PCA plot, except that the Peepalkhoont accessions, which formed cluster 3 using Jaccard’s coefficient, appeared with the Jaipur population on the PCA plot. The UPGMA phenogram therefore classified the accessions into four clusters, while PCA grouped them into three (Fig. 5). The dendrogram and PCA plot thus depict an apparent correlation between geographical and genetic diversity. Morphochemical and molecular matrices were also compared using Mantel’s test [21]. The correlation between the matrices was low and nonsignificant (r = −0.3666).
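Mantel’s test correlates the corresponding off-diagonal entries of two distance matrices and judges significance by permuting the row and column order of one of them. The sketch below is a generic two-sided permutation version with hypothetical 4 × 4 matrices; the software and number of permutations used in this study are not specified in the text, so the settings here are assumptions.

```python
import math
import random


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def upper_triangle(matrix, order):
    """Upper-triangle entries after relabelling rows and columns by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def mantel(a, b, n_perm=999, seed=0):
    """Return (observed r, two-sided permutation p-value) for square matrices a, b."""
    rng = random.Random(seed)
    identity = list(range(len(a)))
    a_vec = upper_triangle(a, identity)
    r_obs = pearson(a_vec, upper_triangle(b, identity))
    extreme = 0
    for _ in range(n_perm):
        perm = identity[:]
        rng.shuffle(perm)
        r = pearson(a_vec, upper_triangle(b, perm))
        if abs(r) >= abs(r_obs) - 1e-12:
            extreme += 1
    # The observed arrangement is counted as one of the permutations.
    return r_obs, (extreme + 1) / (n_perm + 1)


# Hypothetical distance matrices; b is a rescaled copy of a, so r should be 1.
a = [[0, 1, 4, 5],
     [1, 0, 3, 6],
     [4, 3, 0, 2],
     [5, 6, 2, 0]]
b = [[2 * v for v in row] for row in a]
r_obs, p_value = mantel(a, b, n_perm=199, seed=42)
print(round(r_obs, 4), p_value)
```

Rows and columns are permuted together because the entries of a distance matrix are not independent observations, so an ordinary correlation test would overstate significance.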

Fig. 5 PCA based on RAPD and ISSR markers

Nei’s genetic identity and genetic distance showed the highest similarity between the Dungarpur and Banswara populations. The lowest similarity occurred between the Jaipur and Peepalkhoont populations (Table 5). The percentage of polymorphic loci at the population level was low, ranging from 19.64% (Ghatol and Peepalkhoont) to 41.07% (Udaipur) (Table 6), with an average of 28.21%. Nei’s genetic diversity ranged from 0.06 to 0.15 with an average of 0.1. Shannon’s information index ranged from 0.09 to 0.22 with an average of 0.15. The genetic diversity at the population level was thus relatively low, while the total genetic diversity at the species level was high (P = 98.21%; h = 0.29; I = 0.45). The coefficient of genetic differentiation between populations (GST, the proportion of total genetic diversity found among populations) was 0.63, indicating that 63% of the genetic variability was distributed among the populations and only 37% existed within populations. The number of migrants per generation (Nm) was 0.26, which suggested that gene flow in T. bellerica was low. Analogous results were obtained in Torreya jackii Chun (GST = 0.63) [22] and Omphalogramma souliei (GST = 0.603) [23]. However, population differentiation was lower among the fragmented populations of Terminalia amazonia [24].
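The relationship between these quantities can be sketched with two short formulas: GST = (HT − HS) / HT, where HT is the total gene diversity and HS the mean within-population gene diversity, and Nm = 0.5 (1 − GST) / GST, an estimator often used with dominant-marker data. Whether the values reported here were derived in exactly this way is an assumption, and the inputs below are hypothetical rather than taken from Tables 5 and 6.

```python
def gst(h_total, h_within):
    """Coefficient of genetic differentiation: share of diversity among populations."""
    return (h_total - h_within) / h_total


def gene_flow(gst_value):
    """Estimated number of migrants per generation, Nm = 0.5 * (1 - GST) / GST."""
    return 0.5 * (1.0 - gst_value) / gst_value


# Hypothetical inputs, chosen only to illustrate the arithmetic.
h_total = 0.30   # total gene diversity (HT)
h_within = 0.12  # mean within-population gene diversity (HS)

g = gst(h_total, h_within)
nm = gene_flow(g)
print(f"GST = {g:.2f}, Nm = {nm:.2f}")
```

An Nm below 1 is commonly read as too little migration to counteract genetic drift between populations.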

Table 5 Nei’s genetic identity and genetic distance between five populations of T. bellerica
Table 6 Genetic diversity within population of T. bellerica

The genetic structure of plant populations reflects the interaction of various factors, including the long-term evolutionary history of the species, genetic drift, mating system and gene flow [25]. The low level of genetic diversity within populations and the high genetic diversity among populations in T. bellerica could be attributed to many factors, including restricted geographical distribution and isolated populations. Such a high degree of genetic differentiation implies that populations are reproductively isolated, or have been isolated in the past, resulting in differentiation by inbreeding and genetic drift or through local adaptation [26]. As the species is restricted to moist valleys [27], it is conceivable that topographic features in the region constitute significant barriers to gene flow and may thereby have promoted genetic differentiation between populations of this species. In this case, the limited gene flow probably resulted from the geographical separation and discontinuity of the populations, which limit the long-distance dispersal of pollen and seed.

A weak correlation was observed between morphological and molecular markers, which is similar to earlier studies [28–30]. The lack of correlation between morphological and molecular markers suggests that most of the growth and morphological traits are influenced by several genes [29]. Molecular markers are neutral and thus do not reflect the diversity in functional characters [22]. Morphological traits are under natural selection and their expression is partially under the influence of environmental factors [31].

A systematic germplasm collection and evaluation programme to identify superior planting material from the existing natural variation is desirable in ex situ conservation strategies [32]. The analysis of genetic diversity in T. bellerica using morphochemical and molecular markers will assist in conservation and further improvement of the species through breeding. This study identified genotypes with high secondary metabolite yield and can thus help herbal drug manufacturers identify the correct raw material.