Introduction

Rubia cordifolia L., commonly called Indian Madder, belongs to the family Rubiaceae. It is an industrially viable medicinal plant widely distributed in India, China, and tropical Australia, with predominance in the South Western Ghats of India (Khare 2004). The plant is recorded in Ayurveda under the Sanskrit name ‘Manjistha’ as the source of the popular commercial product Manjith (Daman et al. 2006) and forms an integral part of the traditional herbal formulations Ghrit Kumari, Aswagandharistam, Chirata, Pit Papda, etc. (Ved et al. 2002). The plant is highly valued for its pharmacological, cardioprotective and industrial applications (Bhatt and Kushwah 2013). The roots have been reported to be of potential interest as a natural or herbal medicinal source (Deshkar et al. 2008). Apart from its high medicinal value, the plant is a source of natural dye used by many flavour and pharmaceutical industries. The plant contains substantial amounts of secondary metabolites, namely alkaloids, glycosides, flavonoids, phenols and anthraquinones, among which purpurin (1,2,4-trihydroxyanthraquinone) and munjistin (1,3-dihydroxy-9,10-dioxo-9,10-dihydroanthracene-2-carboxylic acid), both possessing potential antitumor activities, are predominant (Mishchenko et al. 2007; Bhatt and Kushwah 2013; Sisubalan et al. 2015).

Diversity characterization has always been a prerequisite for utilizing genetic resources to improve plant species that produce secondary metabolites of commercial interest. To quantify chemicals in plants and study their diversification, various techniques such as thin layer chromatography (TLC) (Fecka and Cisowski 2002), high-performance liquid chromatography (HPLC) (Liu et al. 2000; Ku et al. 2002), gas chromatography–mass spectrometry (GC–MS) (Liu et al. 1995) and liquid chromatography–mass spectrometry (LC–MS) (Ferran et al. 2003) have been used by researchers worldwide. High-performance thin layer chromatography (HPTLC) is an improvement over conventional TLC in which special plates and instrumental resources for sampling are used and the quantitative evaluation of separations is done by densitometry. HPTLC is a simple, sensitive, selective, precise, and robust method for the determination of phyto-constituents (Szepesi and Nyiredy 1992). It complements HPLC and other techniques in that it does not require extensive clean-up of crude plant extracts, even for quantitative analysis (Pereira et al. 2004). Chemical fingerprinting analysis yields a chromatographic display of the chemical components present in the respective plant, which can be corroborated with the botanical identity to develop potential biochemical markers for genetic resource management (Bobby et al. 2012).

Molecular markers in association with biochemical traits have demonstrated their potential applications in characterization of germplasm, genome organization, chromosome mapping, and molecular breeding, among others (Pandotra et al. 2013). Despite the large array of molecular markers available to characterize genetic divergence within and among species of various taxa, PCR-based dominant molecular markers are used more frequently because they are simple, fast, reproducible and cost-effective, and because they target abundant sequences throughout the genome without requiring any prior sequence information (Mishra et al. 2015). Inter simple sequence repeat (ISSR) analysis is a PCR-based molecular fingerprinting technique that amplifies the sequences lying between two simple sequence repeat regions (Zietkiewicz et al. 1994). The literature reveals wide application of ISSR in genetic management and in sustainable-use strategies for conservation, with a relatively lower cost of analysis and no need for prior sequence information, compared with amplified fragment length polymorphisms (AFLP) and simple sequence repeats (SSRs) (Yang et al. 1996; Bornet and Branchard 2001).

According to a literature survey, there are no previous reports of intra-specific biochemical profiling of R. cordifolia populations, or of attempts to correlate the chemical and genetic diversity of R. cordifolia populations from India. The main objectives of the study are to (a) determine the levels of genetic divergence among R. cordifolia populations across their geographic extent, using ISSR markers; (b) study the chromatographic variability among the populations through HPTLC analysis; (c) test for correlations between population genetic and chemical data, and between each of these and the geographical distances among populations; and (d) provide basic information for future conservation and management plans for the species.

Materials and Methods

Plant Material

Fresh leaves and roots of R. cordifolia were collected from seven different geographical locations in the Eastern Ghats, Tamil Nadu, India (Fig. 1), by random sampling of 5–7 individuals from each population (Table 1). Populations were georeferenced with a Trimble-JunoSB CE0678. Leaf samples were dried directly in silica gel and stored for DNA extraction. Root samples were air-dried (in the shade) and ground to a fine powder for chemical analysis.

Fig. 1 Geographic map showing the collection sites of seven populations of R. cordifolia. Population codes indicate the collection site as listed in Table 1

Table 1 Geographic locations of seven studied populations of Rubia cordifolia

Molecular Analyses

DNA Isolation and PCR Amplification

Genomic DNA of the individual genotypes was extracted from the leaves using the cetyltrimethylammonium bromide (CTAB) protocol. Necessary modifications to the chloroform:isoamyl alcohol (24:1) step, along with the concentrations of polyvinylpyrrolidone (PVP) and β-mercaptoethanol (BME), were standardized to obtain quality DNA free from secondary metabolites (Khanuja et al. 1999). The quality of the isolated DNA was analysed by gel electrophoresis on a 0.8% agarose gel in 1% tris–acetate EDTA buffer stained with 0.001% ethidium bromide (EtBr). The DNA was quantified spectrophotometrically (Nanodrop, ND-1000, USA) and diluted to 25–50 ng/µl for use as template in PCR reactions. Twenty-five ISSR primers procured from the University of British Columbia (UBC, Vancouver, Canada) were screened for repeatable amplification. From these, nine ISSR primers were selected for the present study based on their reproducibility (Table 2). Each primer was used for amplification in a 25-µl reaction volume containing 1X Taq DNA polymerase buffer (with MgCl2), 1 unit of Taq DNA polymerase, 200 µM dNTPs (GeNei™), 0.5 µM primer, and 50–100 ng of genomic DNA. PCR amplification was carried out in a DNA thermal cycler (Applied Biosystems, Bio-Rad, California, USA) with an initial denaturation for 5 min at 94 °C, followed by 45 cycles of denaturation for 1 min at 94 °C, annealing for 1 min at the respective annealing temperature (Tm), and extension at 72 °C for 2 min, with a final extension at 72 °C for 7 min. The amplified products were resolved on a 1.2–2% agarose gel stained with EtBr. Quick-Load 2-Log DNA ladder (New England Biolabs, UK) was used as the standard to record the size of DNA bands. Gel photographs were documented using BoxEF2 supported with the Quantity One software (Syngene, USA).

Table 2 Characteristics of ISSR markers used in this study and their polymorphism indices

ISSR Data and Statistical Analyses

Each ISSR-amplified locus was scored as 1 (presence) or 0 (absence) based on adequate band intensity. Different marker parameters, namely Effective Multiplex Ratio (EMR), Marker Index (MI) and Resolving Power (RP), were calculated to assess the efficiency of the marker system in R. cordifolia (Mishra et al. 2015). Additionally, the Polymorphic Information Content (PIC) for each primer was calculated according to Roldan-Ruiz et al. (2000). POPGENE software version 1.32 was used to analyse the underlying genetic diversity in terms of percentage of polymorphic bands (PPB), average gene diversity within populations (HS) and total diversity (HT) at the population and species level, assuming Hardy–Weinberg equilibrium (Yeh et al. 1999). Genetic differentiation (Gst) was estimated by Nei’s gene diversity statistics as well as Shannon’s Information Index (Lewontin 1972) in POPGENE. Gene flow (Nm) among the populations was estimated as Nm = 0.5(1 − Gst)/Gst (Nei 1987). A similarity matrix was calculated from the banding patterns of the genotypes using Jaccard’s similarity coefficient (Jaccard 1908) in SPSS Statistics 17.0 software (Leonard 2009). TREECON software (Van de Peer and De Wachter 1994) was used to assess the genetic relatedness (clustering) of the populations using the unweighted pair group method with arithmetic mean (UPGMA) based on pairwise genetic distance (Nei 1973). Bootstrap analysis of each data set over 1000 replicates was run as a module in TREECON to test the robustness of each UPGMA node (Felsenstein 1985). A principal coordinate analysis (PCoA) was conducted in GenAlEx using a matrix of genetic distances.
Several hierarchical analyses of molecular variance (AMOVA) with different grouping criteria were conducted in GenAlEx version 6.5 (Peakall and Smouse 2006) to estimate variance components, partitioning the variation among populations and individuals; the variance components were tested statistically by nonparametric randomization tests using 9999 permutations.
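The band-scoring statistics described above can be illustrated with a minimal Python sketch. It is not the POPGENE or SPSS implementation used in the study: the 0/1 matrix is invented toy data, the function names are hypothetical, and PIC is computed per locus as 2f(1 − f), the form commonly attributed to Roldan-Ruiz et al. (2000) for dominant markers, where f is the frequency of band presence.

```python
# Toy 0/1 band matrix: rows = individuals, columns = ISSR loci (hypothetical data).
bands = [
    [1, 1, 0, 1, 1],
    [1, 0, 0, 1, 1],
    [1, 1, 1, 0, 1],
    [1, 0, 1, 0, 1],
]


def jaccard(a, b):
    """Jaccard similarity: bands shared by both profiles / bands present in either."""
    both = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    return both / either if either else 0.0


def band_frequencies(matrix):
    """Frequency of band presence at each locus (column)."""
    n = len(matrix)
    return [sum(column) / n for column in zip(*matrix)]


def percent_polymorphic(matrix):
    """Percentage of loci that are neither fixed present nor fixed absent (PPB)."""
    freqs = band_frequencies(matrix)
    polymorphic = sum(1 for f in freqs if 0 < f < 1)
    return 100.0 * polymorphic / len(freqs)


def mean_pic(matrix):
    """Mean PIC over loci, with PIC = 2f(1 - f) per locus (assumed formula)."""
    freqs = band_frequencies(matrix)
    return sum(2 * f * (1 - f) for f in freqs) / len(freqs)


if __name__ == "__main__":
    print("Jaccard(ind 1, ind 2):", jaccard(bands[0], bands[1]))
    print("PPB (%):", percent_polymorphic(bands))
    print("Mean PIC:", mean_pic(bands))
```

In this toy matrix three of the five loci are polymorphic, so PPB is 60%; the two monomorphic loci contribute zero to the mean PIC.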

Chemical Analyses

Metabolic Profiling

The metabolic profiling of the R. cordifolia populations was carried out by HPTLC to check the existing variability among metabolites of the collected germplasm. Alizarin and purpurin, used as standards, were purchased from Himedia and Sigma-Aldrich, USA, respectively. One milligram each of alizarin and purpurin was weighed separately and made up to 10 ml with methanol to achieve a concentration of 100 µg/ml. Other HPLC-grade solvents, such as toluene, ethyl acetate, methanol and formic acid, were obtained from Merck, Mumbai, India.

Extraction of Plant Material

The completely air-dried plant roots were ground to a fine powder and 10 g was weighed. The volume was adjusted to 100 ml with methanol to achieve a concentration of 100 mg/ml. The extract was filtered through Whatman filter paper (No. 41) to remove insoluble components. The extraction protocol was repeated thrice for each sample and the pooled extracts were concentrated under vacuum. The filtrate was used for the HPTLC analysis.

HPTLC Conditions and Densitometric Chromatogram Evaluation

The HPTLC was performed on 20 cm × 20 cm TLC aluminium pre-coated plates with 0.2-mm layer thickness of silica gel GF254 (sd. fine chem. Ltd, Mumbai, India). Standards and samples were applied as bands using a Camag 100-µl sample syringe with a Linomat IV applicator (Camag, Switzerland) under a flow of N2 gas. The development was carried out in a linear ascending manner in a twin-trough glass chamber using the optimized mobile phase toluene:ethyl acetate:formic acid (85:14:2). The plate was kept in an oven at 110 °C for 5 min, followed by air-drying. The developed chromatogram was evaluated using a Camag TLC Scanner 3 at a wavelength of 232 nm with winCATS software version 3.2.1. Quantification was performed by linear regression of peak area against the amount applied (ng/band).
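The densitometric quantification step can be sketched as an ordinary least-squares calibration of peak area against the amount of standard per band, followed by back-calculation of the sample amount. This is only an illustration under stated assumptions: the calibration points, the sample peak area, the 5-µl application volume and the function names are all hypothetical and are not taken from the study or from the winCATS software.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amount_from_area(area, slope, intercept):
    """Invert the calibration line to get ng/band from a sample peak area."""
    return (area - intercept) / slope


def content_mg_per_g(ng_on_band, applied_ul, extract_mg_per_ml):
    """Convert ng/band to mg per g dry weight of root material."""
    dry_mg_on_band = applied_ul * extract_mg_per_ml / 1000.0  # µl x mg/ml -> mg
    ng_per_mg = ng_on_band / dry_mg_on_band                   # equals µg per g
    return ng_per_mg / 1000.0                                 # µg/g -> mg/g


if __name__ == "__main__":
    # Hypothetical standard series (ng/band) and densitometric peak areas.
    standard_ng = [100, 200, 300, 400, 500]
    peak_area = [510, 1010, 1510, 2010, 2510]
    slope, intercept = fit_line(standard_ng, peak_area)

    sample_ng = amount_from_area(1260, slope, intercept)
    # Assumes 5 µl of a 100 mg/ml extract was applied (application volume is hypothetical).
    print(content_mg_per_g(sample_ng, applied_ul=5, extract_mg_per_ml=100))
```

The unit conversion relies on ng per mg being numerically equal to µg per g, so only one division by 1000 is needed to reach mg/g dry weight.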

HPTLC Cluster Analyses

The samples were subjected to a cluster analysis based on retention factor variation patterns using the Pearson linear correlation coefficient (Snedecor and Cochran 1995) as implemented in SPSS Statistics 17.0 software. A dendrogram was generated based on the similarity matrix with DendroUPGMA (Garcia et al. 1999). A principal components analysis (PCA) was conducted on the similarity coefficient data (Leonard 2009).
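A rough sketch of this kind of cluster analysis is given below: Pearson correlations between chromatographic profiles are converted to distances and clustered by UPGMA (average linkage). It is not the SPSS or DendroUPGMA procedure used in the study; the profiles are invented, the population labels A to D are placeholders, and the conversion of similarity to distance as 1 − r is an assumption made for the example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson linear correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def upgma(names, dist):
    """Average-linkage clustering; returns the tree as nested tuples of names.

    dist[i][j] is the original pairwise distance between items i and j.
    The distance between two clusters is the mean over all member pairs.
    """
    clusters = [(name, [i]) for i, name in enumerate(names)]

    def average(a, b):
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        candidates = [
            (average(clusters[p][1], clusters[q][1]), p, q)
            for p in range(len(clusters))
            for q in range(p + 1, len(clusters))
        ]
        _, p, q = min(candidates)  # closest pair of clusters
        merged = (
            (clusters[p][0], clusters[q][0]),
            clusters[p][1] + clusters[q][1],
        )
        clusters = [c for k, c in enumerate(clusters) if k not in (p, q)]
        clusters.append(merged)
    return clusters[0][0]


# Hypothetical peak-intensity profiles for four placeholder populations.
names = ["A", "B", "C", "D"]
profiles = [
    [1, 2, 3, 4],
    [1, 2, 3, 5],
    [4, 3, 2, 1],
    [5, 3, 2, 1],
]
# Assumed conversion of similarity to distance: d = 1 - r.
dist = [[1 - pearson(p, q) for q in profiles] for p in profiles]
tree = upgma(names, dist)

if __name__ == "__main__":
    print(tree)
```

With these toy profiles, A pairs with B and C pairs with D before the two pairs are joined at the root.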

Correlation Analyses

Correlation analyses were performed between the molecular and chemical matrices, between the molecular and geographical distance matrices, and between the chemical and geographical distance matrices, to test the goodness of fit between them. The analysis was done using the Mantel test (Mantel 1967) in XLSTAT©-Pro version 7.5 (Addinsoft Inc., Brooklyn, NY, USA). The parameters were set to 10,000 permutations with a significance level of α = 0.05. Molecular and chemical data matrices were obtained with SPSS Statistics 17.0 software. The geographic distance matrix was generated from the latitude and longitude coordinates recorded during sampling using Geographic Distance Matrix Generator version 1.2.3 (Ersts 2016).
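For readers unfamiliar with the Mantel test, the following is a minimal permutation-based sketch in plain Python. It is not the XLSTAT implementation: the two-tailed p-value convention, the random seed, the function names and the toy distance matrix are assumptions made for illustration.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def upper_triangle(matrix):
    """Flatten the entries above the diagonal of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def mantel(a, b, permutations=10000, seed=0):
    """Two-tailed Mantel test between two symmetric distance matrices.

    Rows and columns of `a` are permuted together; the p-value is the share
    of permutations whose |r| is at least the observed |r| (the observed
    arrangement is counted once in numerator and denominator).
    """
    n = len(a)
    flat_b = upper_triangle(b)
    r_observed = pearson(upper_triangle(a), flat_b)
    rng = random.Random(seed)
    order = list(range(n))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(order)
        permuted = [[a[order[i]][order[j]] for j in range(n)] for i in range(n)]
        r_perm = pearson(upper_triangle(permuted), flat_b)
        if abs(r_perm) >= abs(r_observed) - 1e-12:
            hits += 1
    p_value = (hits + 1) / (permutations + 1)
    return r_observed, p_value


if __name__ == "__main__":
    # Toy example: five hypothetical sites on a line, compared with themselves.
    positions = [0, 1, 3, 7, 15]
    geo = [[abs(p - q) for q in positions] for p in positions]
    print(mantel(geo, geo, permutations=999, seed=1))
```

Permuting rows and columns together, rather than shuffling individual distances, preserves the dependence among entries that share a population, which is the reason a Mantel test is used instead of an ordinary correlation test on the flattened matrices.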

Results

Molecular Characterization

Genetic Diversity

The diversity analysis in R. cordifolia indicated a high level of primer polymorphism. Nine ISSR primers generated 39 products, with an average of 4.33 products per primer, of which 35 products were polymorphic. The size of the amplified products ranged from 200 to 2000 bp. The oligonucleotide sequences of the primers and their resultant products are summarized in Table 2. ISSR profiles from a representative gel of primer UBC 807 amplification are shown in Fig. 2. The PPB ranged from 50% (UBC 809) to 100% (UBC 807, 810, 826, 844, and 881), with an average of 88.88% polymorphism per primer at the species level (Table 2). The PIC value ranged from 0.09 (UBC 809) to 0.37 (UBC 881), with an average of 0.27 per primer. Other parameters, namely EMR, MI, and RP, had average values of 20.62, 5.62, and 1.74 per primer, respectively (Table 2). Population-level analysis showed that the PPB per population ranged from 30.77% [Pachamalai (P)] to 51.28% [Kolli Hills (KL) and Shervaroy Hills (SH)], with an average of 45.79%. The observed number of alleles (Na) ranged from 1.30 to 1.51 with a mean value of 1.45 ± 0.50, while the effective number of alleles (Ne) ranged from 1.23 to 1.39 with a mean value of 1.33 ± 0.39. The population average values of Nei’s gene diversity (h) and Shannon’s information index (I) were estimated at 0.187 ± 0.212 and 0.272 ± 0.304, respectively. When calculated at the species level, the h and I values were 0.266 ± 0.173 and 0.408 ± 0.230, respectively (Table 3).

Fig. 2 ISSR profile of R. cordifolia populations with primer UBC-807. Lanes M, 2-log DNA ladder; 1–38, accessions as mentioned in Table 1

Table 3 ISSR-based intra-population statistics of seven populations of Rubia cordifolia

Population Genetic Structure

The Jaccard coefficient among the 38 samples of R. cordifolia ranged from 0.302 (Chitteri Hills and Shervaroy Hills) to 0.962 (among several Pachamalai individuals). Based on the banding patterns at the species level, the individuals grouped into three major clusters (clusters I, II and III) with divergence into sub-clusters (Fig. 3), although only three subgroups had bootstrap support above 50%, each grouping two individuals from the same population or from two nearby populations. In agreement with the dendrogram, PCoA revealed that individuals grouped regardless of their geographical localities, although some weak structure seems to be present. The first two components accounted for 21.72% (axis 1 = 11.77%, axis 2 = 9.95%) of the total variability (Fig. 4a).

Fig. 3 Genetic similarity dendrogram based on ISSR band pattern (Nei and Li’s coefficient) showing the relationship among 38 R. cordifolia accessions

Fig. 4 a Principal coordinate analysis on molecular assays of 38 individuals of R. cordifolia populations based on covariance matrix (pooled data) with data standardization; b two-dimensional plots obtained using principal component analysis based on chemical assay

Regarding gene flow, the average Nm was estimated at 1.36 individuals per generation, indicating that gene exchange occurs between populations, albeit at a low level. The genetic distance between the populations of R. cordifolia ranged from 0.023 (between Pachamalai and Chitteri Hills) to 0.157 (between Pachamalai and Jawadhu Hills), and the average Nei’s genetic identity was 0.912 (ranging from 0.811 to 0.976).
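The relation Nm = 0.5(1 − Gst)/Gst given in the methods can be inverted algebraically to Gst = 1/(2Nm + 1), which shows what level of differentiation corresponds to a given Nm. The sketch below is only a worked illustration of that algebra with hypothetical function names; the Gst it prints is the value implied by the rounded Nm of 1.36, not a figure reported by the authors.

```python
def gene_flow(gst):
    """Nm estimated from Gst: Nm = 0.5 * (1 - Gst) / Gst."""
    return 0.5 * (1.0 - gst) / gst


def gst_from_gene_flow(nm):
    """Inverse relation: 2*Nm*Gst = 1 - Gst, hence Gst = 1 / (2*Nm + 1)."""
    return 1.0 / (2.0 * nm + 1.0)


if __name__ == "__main__":
    implied_gst = gst_from_gene_flow(1.36)
    print(round(implied_gst, 3))                 # about 0.269
    print(round(gene_flow(implied_gst), 2))      # round trip back to 1.36
```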

AMOVA based on three geographically close regions (YE+JW, SH+CH+KR, KL+P) indicated that the majority of genetic variation (95%) occurred within populations, while the variation among the three regions was 5% (Table 4). We also treated the seven populations as one group and compared the variation within and among populations. The partitioning was the same, with 95% of the variation within populations and 5% among populations, indicating that the genetic variation exists mainly at the intra-population level.

Table 4 Population genetic structures of Rubia cordifolia populations based on AMOVA

Chemical Characterization

The HPTLC analysis showed marked variation in the content of alizarin and purpurin among the genotypes studied (Table 5). The Yelagiri Hills population showed the highest alizarin content (0.115 ± 0.05 mg/g dry wt.), followed by Shervaroy Hills (0.093 ± 0.03 mg/g dry wt.), Pachamalai (0.0763 mg/g dry wt.), and Kolli Hills (0.072 ± 0.02 mg/g dry wt.), while the lowest alizarin content was observed in Chitteri Hills (0.029 ± 0.02 mg/g dry wt.). In contrast, the highest purpurin content was found in the Pachamalai population (0.284 ± 0.06 mg/g dry wt.), followed by Kolli Hills, Kalrayan Hills and Shervaroy Hills, with the lowest purpurin content recorded in Yelagiri Hills (0.063 ± 0.09 mg/g dry wt.). The limit of detection (LOD) and limit of quantification (LOQ) values indicate that the HPTLC method is sensitive.

Table 5 Amount of alizarin and purpurin content in Rubia cordifolia roots analysed by HPTLC

The HPTLC-based structure analysis of R. cordifolia populations produced similar results in the PCA (Fig. 4b) and the dendrogram (Fig. 5), with Pachamalai grouping separately and two major groups, one formed by the CH and SH populations and the other by KR and YE as sister to JW. The first two principal components of the PCA accounted for 69.49 and 26.27% of the variation, respectively (Fig. 4b).

Fig. 5 UPGMA dendrogram analysis derived from the chromatographic fingerprints of R. cordifolia distributed in different altitudinal populations

Correlation Analyses

No significant correlation was found between the genetic and chemical data (R² = 0.002, p = 0.973). The correlations of the genetic and chemical datasets with population geographic distances were also not significant (R² = 0.231, p = 0.315 and R² = 0.191, p = 0.397, respectively).

Discussion

Population Genetic Diversity and Structure

Genetic differentiation studies are considered a standard and more discrete means of identifying populations, as they reflect the genetic constitution of genotypes (Mishra et al. 2015). ISSR markers have been routinely used to detect the prevailing genetic variation in the genomes of individuals, and are among the genetic markers used successfully in population genetic studies of wild plants (Pither et al. 2003; Bahulikar et al. 2004; Naik et al. 2009; Bodare et al. 2013). In the present study, high polymorphism at the species level and medium polymorphism at the population level indicate the suitability of ISSR markers for depicting the underlying genetic diversity within the studied populations. Based on the present results from the ISSR analysis, the levels of genetic diversity in the populations were similar except for Pachamalai, which showed lower values. The results of earlier work by Prasanth and Nandagopal (2009) and Sisubalan et al. (2015) on genetic similarity and diversity in the same R. cordifolia populations as those in this study are similar to ours. High diversity was also reported in Rubia tinctorum populations using RAPD markers (Baghalian et al. 2010), which may indicate that similar levels of polymorphism also exist in other species across the genus. Additionally, relatively high levels of genetic variability have been reported previously in the family Rubiaceae with different molecular markers (Cieslak and Szelag 2010; Gaafar et al. 2014; da Silva et al. 2014), which agrees with the present results. Similar values of genetic diversity at the species and population level have been reported in many other studies on different plant systems, such as Rhodiola alsia (Xia et al. 2005), Torreya jackii (Li and Jin 2007), and Rheum species (Wang et al. 2012).

Genetic drift, pollination/breeding system and geographic distribution range generally shape the genetic differentiation of populations (Young et al. 1996; Panda et al. 2015), but little is known regarding the reproductive biology of R. cordifolia. Geographical distribution and topographical barriers can lead to low seed dispersal, resulting in limited gene flow among populations (Hamrick and Godt 1996). The populations used in this study were located in three major geographical ranges with significant landmasses between them, and the overall gene flow estimated among populations was not prominent; however, no clear genetic structure could be found.

HPTLC Characterization of R. cordifolia

In the present study, HPTLC analysis reflected considerable variation in the alizarin and purpurin content of the populations investigated. The Yelagiri Hills population was found to be the highest accumulator of alizarin, followed by Shervaroy Hills, while for purpurin the genotypes from the Pachamalai region were the highest accumulators, followed by Kolli Hills. There are several possible reasons for these findings, namely different environmental factors and habitat conditions at the collection sites, and individual genetic variability, all of which can alter secondary metabolite production (Pandotra et al. 2013). Furthermore, the growth stage affects the secondary metabolite content of plants, which increases at maturity; however, in our study all samples were collected at the flowering stage. Hence, both genetic and environmental factors might be responsible for the variation in alizarin and purpurin content, as proposed by Zhao et al. (2014) for Rhodiola sachalinensis. Considering that the populations of KL, SH and YE are rich in alizarin and purpurin as well as genetically diverse, they may potentially be used as a source of mother plants for large-scale commercial cultivation.

Correlation Between the Genetic and HPTLC Datasets

The studied genetic markers did not reflect any correlation with alizarin and purpurin biosynthesis, nor did the analysis reveal an influence of geographic factors on the genetic and chemical population structures obtained. The earlier work by Baghalian et al. (2010) on the genetic diversity of Rubia tinctorum similarly did not find a significant correlation between phytochemical and molecular diversity related to environmental influence, while Zhao et al. (2014) reported that Rhodiola sachalinensis populations were negatively correlated with the HPLC fingerprinting variations. A possible reason is that the studied markers are insufficient to capture genome information related to alizarin and purpurin biosynthesis, or that environment-based transitory activation of the gene(s) sometimes alters the production of secondary metabolites. The samples were collected from different geographical locations, which may account for the variation, as secondary metabolite content differs with environmental conditions. However, individual variation might also be a reason behind the results.

Concluding Remarks

The combined molecular and chemical analysis of R. cordifolia collected from different natural locations in the Eastern Ghats, Tamil Nadu, India, provided baseline population genetic and phytochemical information for the characterization and management of the species gene pool. Molecular marker-based and chemical characterization provided a better understanding of the existing variation within and across populations and allowed the identification of target populations for conservation purposes and commercial use. The ISSR marker system tested could be used efficiently in future studies at a wider scale to demonstrate genetic diversity and structure in other populations of R. cordifolia. It is possible that environmental factors, together with within-species variability, were among the major factors influencing the results. Because of its valuable medicinal properties and high industrial demand, R. cordifolia needs more attention, and the conservation of populations in different habitats is recommended to maintain the genetic diversity of the species.