Introduction

Livelihood security and sustainability in hot arid regions, which are characterized by limited and highly variable rainfall, extreme temperatures and frequent droughts, depend on tree-based production systems such as agri-silvi-pastoral, silvi-pastoral and agri-silvicultural systems. Trees are intricately associated with human life in these regions. However, the harsh agro-climatic conditions limit the number of tree species that can grow there. This necessitates both the strengthening and the conservation of the available woody germplasm diversity. Tecomella undulata (Sm.) Seem (family Bignoniaceae), the sole species of the monotypic genus Tecomella and commonly called Rohida, Marwar teak or Desert teak, is an important deciduous, timber-yielding tree of the hot arid regions (Shankarnarayan and Nanda 1963). Originating in India and Arabia (Randhawa and Mukhopadhyay 1986), it is distributed in the drier parts of Arabia, southern Pakistan and north-western India (Tewari 2007). The major population of this endangered species (Bhau et al. 2007) is found in the western parts of Rajasthan (Nadkarni 2000). Tecomella thrives in arid regions and survives extreme environmental conditions, viz., scanty rainfall (<150 to 500 mm annually), extreme temperatures (−2 to 50 °C) and pH (6.5 to 8.0). Its soft, durable wood is used for making engraved furniture, agricultural implements, carvings, turnery and toys (Bhau et al. 2007). However, many older trees have hollow trunks owing to infestation of the wood by borers and wood-decaying fungi (Kar et al. 2009). This important agroforestry tree, which has three colour morphotypes (yellow, orange and red flowers), plays an important role in sustaining the livelihood of local communities in hot arid regions.
It acts as a soil binder, a windbreak and a stabilizer of shifting sand dunes; provides shade and shelter to domestic and wild animals and leaf fodder for cattle, camels, sheep and goats; increases the carbon and nitrogen content of soil through profuse growth of microorganisms in its mycorrhizal rhizosphere; and is used for phytoremediation of chromium-contaminated soil and rehabilitation of lignite mine backfills (reviewed by Kalia et al. 2014). The presence of many bioactive constituents in its leaves, stem and bark accounts for its varied medicinal and pharmacological uses in the indigenous Ayurvedic system of medicine. Tecomella is used for treating abdominal ailments, jaundice, enlarged spleen, anaemia, intestinal worms, urinary disorders, syphilis, migraine, hepatitis, gynaecological troubles, etc. Many therapeutic activities, including hepatoprotective, antibacterial, antimicrobial, antifungal, immunomodulatory, anticancer, cytotoxic, analgesic, anti-inflammatory, anti-HIV and anti-obesity activities, have been attributed to Tecomella (reviewed by Kalia et al. 2014).

Over-exploitation coupled with minimal plantation and conservation efforts has led to severe depletion of natural populations of this important, slow-growing tree of hot arid deserts. Young plantlets are seldom found growing in the field (Negi et al. 2011) because seedlings are browsed by goats and gerbils (Kaul et al. 1962) in common lands and ploughed into the soil in farmers’ fields. Mature trees, on the other hand, are indiscriminately used for timber and fuel, depleting the rich diversity stock. Developmental, commercial and agricultural activities are also seriously threatening the diversity of this species. Genetic diversity that has evolved over millions of years, however, enables a plant to adapt to future environmental changes arising from pollution, climate change, diseases and other environmental adversities (Khan 1998). In addition, because Tecomella grows under very harsh agro-climatic conditions, it is likely to harbour many important genes conferring abiotic stress tolerance that are yet to be identified and harnessed for enhancing the resilience of other economically important crops and trees. Few efforts have been made to date to assess the available genetic diversity in this species apart from the reports of Jindal et al. (1987a, b) and Negi et al. (2011), in which morphological [approximate age, tree height, forking height, basal diameter, diameter at breast height (DBH), canopy diameter, branches per tree, leaf length and breadth, and flower colour] and biochemical [SDS-PAGE, crude protein content, acid detergent fibre (ADF), cellulose and hemi-cellulose content, neutral detergent fibre (NDF), lignin and silica] markers were used for diversity assessment.

In contrast to morphological and biochemical methods, DNA fingerprinting has emerged as a more reliable technique capable of estimating genetic variability at the molecular level using PCR- or hybridization-based markers. In recent years, these DNA-based marker systems have advanced our understanding of genetic resources. Molecular markers have been utilized for a variety of purposes, including examination of genetic relationships between individuals, population genetics, phylogenetic studies, breeding and conservation (Kalia et al. 2011). To date, only Bhau et al. (2007) have studied genetic diversity in T. undulata using DNA-based (AFLP) markers, among 42 accessions collected from 3 districts of western Rajasthan. Many arbitrary, semi-arbitrary and sequence-based markers have been used for studying genetic diversity, population structure and germplasm management in genebanks (Alikhani et al. 2014). Collard and Mackill (2009) developed start codon targeted (SCoT) markers, based on the short conserved nucleotide sequences that flank the start codon ATG. This PCR-amenable, agarose gel-separable, highly polymorphic marker system targets the start codon region of genes, which are abundant in the genome.

The present study aimed to estimate genetic diversity among and within 21 T. undulata populations collected from all 12 districts of western Rajasthan, India, using SCoT markers. The three colour morphotypes were included in populations wherever they were available. This is the first report estimating the diversity of this endangered tree species across its natural range in India using PCR-based SCoT markers.

Materials and methods

Young leaves were collected from a total of 108 trees of T. undulata growing in all 12 districts of western Rajasthan, its natural habitat, during February–March. The sampling locations along with their geographical coordinates are summarized in Table 1 and Fig. 1. Usually five samples (sometimes 6–7) were collected from each of the 21 sites, and the samples from one site were designated as one population. The sampled trees were chosen to provide maximum representation of the area; trees from the wild as well as from farmers’ fields were included. All three colour morphotypes (Fig. 2) were included in the sampled populations wherever available. Samples of one population were collected from within an area of 2500 m² (50 × 50 m), while the distance between populations varied from 25 to 200 km.

Table 1 Details of 108 Tecomella undulata genotypes used for diversity analysis using SCoT markers
Fig. 1 Geographic location of collection sites of 21 populations of T. undulata in western Rajasthan

Fig. 2 Three colour morphotypes of T. undulata. a Yellow. b Orange. c Red

Total genomic DNA was isolated from young leaves stored at −40 °C using the modified CTAB method of Doyle and Doyle (1990). The yield and purity of the isolated DNA were determined electrophoretically on a 0.8% agarose gel and spectrophotometrically from the absorbance ratio at 260/280 nm using a Biophotometer (D30, Eppendorf, Germany). Thirty-six SCoT primers developed by Collard and Mackill (2009) were screened to shortlist primers generating distinct and unambiguous amplicons. Finally, 22 primers were used for studying polymorphism among the 21 populations of T. undulata. DNA amplification reactions were performed in a 25 μl volume containing 50–70 ng of template DNA, 1× PCR buffer (Sigma Aldrich), 0.2 mM dNTP mix (Thermo Scientific), 1.5 mM MgCl2, 1 μM SCoT primer (Integrated DNA Technologies Inc., India) and 0.6 U Taq DNA polymerase (Sigma Aldrich). PCR was performed using the following program: initial denaturation at 94 °C for 4 min, followed by 40 cycles of denaturation at 94 °C for 45 s, annealing at 51 °C for 1 min and extension at 72 °C for 1 min, with a final extension at 72 °C for 10 min. DNA amplification was performed in an Eppendorf thermocycler (Mastercycler Nexus GSX1). The amplification products were separated on 2% agarose (Amresco, USA) gels in 1× TAE buffer, stained with ethidium bromide (1.0 μg/ml gel; Sigma Aldrich), run at a steady voltage of 110 V for 120 min and then photographed under a gel documentation system (Alphaimager EC, Protein Simple, California, USA). A GeneRuler 100 bp plus DNA ladder (Thermo Scientific, EU, Lithuania) was used as the size marker in all agarose gels.
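
As a rough planning aid, the cycling program described above can be tallied to estimate the run length. The sketch below uses only the hold times stated in the text; it ignores ramp times between temperatures, which is an assumption, so the real run on a thermocycler will take somewhat longer. All variable and function names are illustrative.

```python
# Hold times taken from the stated PCR program.
# ASSUMPTION: ramp times between temperature steps are ignored.
INITIAL_DENATURATION_S = 4 * 60
CYCLES = 40
CYCLE_STEPS_S = {
    "denaturation at 94 C": 45,
    "annealing at 51 C": 60,
    "extension at 72 C": 60,
}
FINAL_EXTENSION_S = 10 * 60


def program_seconds():
    """Total programmed hold time of the PCR run, in seconds."""
    per_cycle = sum(CYCLE_STEPS_S.values())
    return INITIAL_DENATURATION_S + CYCLES * per_cycle + FINAL_EXTENSION_S


total_minutes = program_seconds() / 60
print(f"Programmed hold time: {total_minutes:.0f} min")
```

Under these assumptions the programmed holds add up to 124 min, of which the 40 cycles account for 110 min.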

Data analysis

To ensure the consistency of results, all reactions were performed twice. The consistently produced, unambiguous and well-resolved fragments were scored manually. The amplified DNA fragments were scored as ‘1’ for their presence and ‘0’ for absence. Polymorphic information content was calculated as described by Smith et al. (1997).

$$ \mathrm{PIC}=1-\sum_{i=1}^{n}{f_i}^2 $$

where f_i is the frequency of the ith amplicon and n is the number of amplicons, i.e. the number of scored DNA fragments. The frequency of an amplicon was obtained by dividing the number of genotypes in which it was found by the total number of genotypes. POPGENE software version 1.32 (Yeh et al. 1997) was used for the estimation of genetic parameters (number of polymorphic loci, their percentage, Nei’s gene diversity and Shannon information index) using the 1/0 binary data. A dendrogram of all the T. undulata genotypes was constructed from the SCoT marker data using the unweighted pair-group method with arithmetic mean (UPGMA; Michener and Sokal 1957). Principal coordinate analysis (PCoA) of all the genotypes was performed on the variance-covariance matrix of the marker data using NTSYS-pc (Rohlf 2000).
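
The quantities named above can be illustrated on a small 1/0 score matrix. The following is a minimal sketch, not the POPGENE implementation: the score matrix is hypothetical, `pic` simply evaluates 1 − Σf² for whatever frequencies it is given (how band frequencies were normalised per primer is not stated in the text), and the Nei and Shannon estimates assume a diploid dominant-marker model under Hardy-Weinberg equilibrium, which the text does not confirm.

```python
import math


def band_frequency(column):
    """Fraction of genotypes in which the band is present (scored 1)."""
    return sum(column) / len(column)


def is_polymorphic(column):
    """A band is polymorphic if it is present in some genotypes but not all."""
    return 0 < sum(column) < len(column)


def pic(frequencies):
    """Evaluate PIC = 1 - sum(f_i^2) for the supplied frequencies."""
    return 1.0 - sum(f * f for f in frequencies)


def nei_and_shannon(column):
    """Nei's gene diversity (h) and Shannon index (I) for one band.

    ASSUMPTION: diploid dominant locus in Hardy-Weinberg equilibrium, so the
    null-allele frequency is q = sqrt(frequency of band absence).
    """
    q = math.sqrt(1.0 - band_frequency(column))
    p = 1.0 - q
    h = 1.0 - p * p - q * q
    i = -sum(x * math.log(x) for x in (p, q) if x > 0)
    return h, i


# Hypothetical scores: rows are genotypes, columns are bands of one primer.
scores = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 0],
    [1, 0, 0, 1],
]
columns = list(zip(*scores))
percent_polymorphic = 100.0 * sum(is_polymorphic(c) for c in columns) / len(columns)

print(f"Polymorphic bands: {percent_polymorphic:.1f}%")
print(f"PIC for frequencies 0.5, 0.3, 0.2: {pic([0.5, 0.3, 0.2]):.2f}")
for index, column in enumerate(columns, start=1):
    h, i = nei_and_shannon(column)
    print(f"band {index}: h = {h:.3f}, I = {i:.3f}")
```

In this toy matrix the first band is present in every genotype, so three of the four bands (75%) are polymorphic. Under the assumed model h cannot exceed 0.5 and I cannot exceed ln 2 for any single band.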

Results

SCoT analysis

The SCoT marker analysis of 108 T. undulata genotypes representing 21 populations collected from all 12 districts of western Rajasthan, India, using 22 primers is presented in Table 2. A total of 294 scorable, unambiguous amplicons were obtained using 22 SCoT primers. The total number of amplicons per primer varied from 4 to 24 with an average of 13.4. The highest (24) and lowest (4) number of amplicons per primer were amplified by SCoT-15 and SCoT-9 primers, respectively. Among the total of 294 DNA fragments amplified, 212 were polymorphic with an average polymorphism of 71.6%. The polymorphism varied from 50% (SCoT-1) to 81.3% (SCoT-12 and SCoT-30) and average number of polymorphic fragments per primer was 9.74. A representative SCoT amplification profile of 108 T. undulata genotypes with SCoT-21 primer is presented in Fig. 3.

Table 2 Genetic diversity and extent of polymorphism among the 108 Tecomella undulata genotypes using SCoT primers
Fig. 3 Gel profile of 108 samples of T. undulata with primer SCoT-21. Lanes: M, 100 bp plus DNA ladder; 1–108, the 108 samples

Polymorphism information content (PIC) ranged from 0.31 for SCoT-1 to 0.71 for SCoT-12 with an average PIC value of 0.54. The size of the amplified DNA fragments ranged from 100 to 3000 bp; however, most of them were in the size range of 300 to 2000 bp. Shannon’s information index (I) varied from 0.25 (SCoT-9) to 0.47 (SCoT-23) with an average of 0.36 which implied the presence of considerable genetic diversity among the genotypes. Nei’s genetic diversity (H) values ranged from 0.13 to 0.32 for primers SCoT-9 and SCoT-23, respectively.

Cluster analysis of T. undulata genotypes

The cluster analysis carried out by UPGMA method using the total number of amplicons generated by 22 SCoT primers clustered the 108 genotypes belonging to 21 different populations into 5 major clusters (C1–C5; Fig. 4). Cluster 1 (C1) included 17 genotypes of populations collected from Jalore (5), Hanumangarh (4), Jaisalmer (5) and Nagaur (3) regions. Cluster 1 could be further divided into 3 subgroups. Four genotypes of Jalore (JAL-1-1 to JAL-1-4) and a single genotype from Hanumangarh (HGH-3-1) were included in subgroup 1. JAL-1-1, JAL-1-4 and HGH-3-1 had yellow flowers while JAL-1-2 and JAL-1-3 had orange flowers. Subgroup 2 included three genotypes from Hanumangarh (HGH-3-2 to HGH-3-4) and one from Jalore (JAL-1-5). The three genotypes grouping close together from Hanumangarh had red flowers. Subgroup 3 of C1 included all five genotypes from Jaisalmer (JSM-2-1 to JSM-2-5) and three genotypes of Nagaur-4 population (NAG-4-1 to NAG-4-3). The genetic distance of JSM-2-1 with the other genotypes of Jaisalmer population (JSM-2-2 to JSM-2-5) or Nagaur population (NAG-4-1 to NAG-4-3) was greater than the distance between other members of these two (Jaisalmer and Nagaur) populations.

Fig. 4 UPGMA dendrogram of 108 samples of T. undulata based on SCoT primer analysis

Cluster 2 (C-2), comprising 19 genotypes from Churu (4), Ganganagar (4), Pali (5), Barmer (1) and Jhunjhunu (5), was further divided into 2 subgroups: subgroup 1 included four genotypes from Churu (CHR-8-2 to CHR-8-5), four from Ganganagar (GGN-11-1 to GGN-11-4) and three from Pali (PAL-9-3 to PAL-9-5), whereas subgroup 2 had one genotype from Barmer (BAR-12-1), all five from Jhunjhunu (JJN-10-1-1 to JJN-10-1-5) and two from Pali (PAL-9-1, PAL-9-2). Among the three different populations of Churu (8, 16, 17), only four genotypes (CHR-8-2 to CHR-8-5) from the Churu-8 population were grouped in C-2. Similarly, among the four different populations of Barmer (12, 13, 18 and 19), only one genotype (BAR-12-1) was grouped under C-2. The Churu and Ganganagar genotypes in subgroup 1 grouped more closely with each other than with the Pali genotypes. In subgroup 2, all five genotypes from Jhunjhunu (JJN-10-1-1 to JJN-10-1-5) were closely grouped along with BAR-12-1. Cluster 3 (C-3) had 17 genotypes from 3 districts, viz., Bikaner (6), Barmer (9) and Churu (1). Four genotypes of the Barmer 12 (BAR-12-2 to BAR-12-5) and five of the Barmer 13 (BAR-13-1 to BAR-13-5) populations were represented in C-3 along with six genotypes of the Bikaner 14 population (BKN-14-1 to BKN-14-6). The Churu genotype (CHR-16-1) was genetically most distant from the Barmer and Bikaner genotypes. Among the closely clustered genotypes, some had similar flower colour, viz., BAR-12-2 and BAR-12-3 were red, BAR-13-3 and BAR-13-4 were orange, and BKN-14-6 and BAR-12-4 had yellow flowers.

Cluster 4 (C-4) was the largest cluster, including 32 genotypes from populations of Barmer (9), Bikaner (4), Churu (9) and Jodhpur (10). C-4 could be divided into 2 subgroups containing 16 genotypes each. Subgroup 1 included three genotypes from Barmer 18 (BAR-18-1 to BAR-18-3), four from Bikaner 15 (BKN-15-2 to BKN-15-5), four from Churu 16 (CHR-16-2 to CHR-16-5) and all five genotypes from Churu 17 (CHR-17-1 to CHR-17-5) populations. The genotypes of the Churu 16 and Churu 17 populations grouped closely with genotypes from the Bikaner 15 (except BKN-15-2) and Barmer 18 populations, respectively. Subgroup 2 included six genotypes from two populations of Barmer (BAR-18-4, BAR-18-5; BAR-19-1, BAR-19-2, BAR-19-4, BAR-19-5) and ten genotypes from two Jodhpur populations (JDP-20-1 to JDP-20-5, and JDP-21-1 to JDP-21-5). Cluster 5 (C-5) included 18 genotypes from the Churu 8 (1), Nagaur 4 (2), Nagaur 5 (5), Sikar 6 (5) and Sikar 7 (5) populations. The Sikar and Nagaur genotypes from their two respective populations segregated into two different branches despite being closely grouped. Among the five genotypes of the Nagaur 5 population, four had orange flowers while only NAG-5-4 had yellow flowers. Interestingly, two genotypes, one from Bikaner (BKN-15-1) and one from Barmer (BAR-19-3), were distinctly separated from all the other 106 genotypes falling under the 5 clusters.
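
To show how UPGMA arrives at groupings like these from 1/0 band scores, here is a small self-contained sketch. It is illustrative only: the four genotype labels and their band profiles are hypothetical, and Jaccard dissimilarity is an assumed choice, since the text does not state which similarity coefficient was used to build the dendrogram.

```python
def jaccard_distance(a, b):
    """1 - (shared bands / bands present in either profile)."""
    shared = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return 0.0 if either == 0 else 1.0 - shared / either


def upgma(labels, profiles):
    """Return the merge history as (members_a, members_b, distance) tuples."""
    clusters = {i: (labels[i],) for i in range(len(labels))}
    dist = {}
    for i in clusters:
        for j in clusters:
            if i < j:
                dist[(i, j)] = jaccard_distance(profiles[i], profiles[j])
    merges = []
    next_id = len(labels)
    while len(clusters) > 1:
        # Join the two clusters that are currently closest.
        (i, j), d = min(dist.items(), key=lambda item: item[1])
        merges.append((clusters[i], clusters[j], d))
        size_i, size_j = len(clusters[i]), len(clusters[j])
        # Distance to every other cluster is the size-weighted mean,
        # i.e. the average over all pairs of original genotypes.
        new_dist = {}
        for k in clusters:
            if k in (i, j):
                continue
            d_ik = dist[(min(i, k), max(i, k))]
            d_jk = dist[(min(j, k), max(j, k))]
            new_dist[k] = (size_i * d_ik + size_j * d_jk) / (size_i + size_j)
        merged = clusters[i] + clusters[j]
        del clusters[i], clusters[j]
        dist = {pair: v for pair, v in dist.items()
                if i not in pair and j not in pair}
        for k, v in new_dist.items():
            dist[(k, next_id)] = v
        clusters[next_id] = merged
        next_id += 1
    return merges


# Hypothetical genotypes and band profiles, for illustration only.
labels = ["G1", "G2", "G3", "G4"]
profiles = [
    [1, 1, 0, 0, 1],
    [1, 1, 0, 0, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 0],
]
merges = upgma(labels, profiles)
for left, right, distance in merges:
    print(left, "+", right, f"at distance {distance:.3f}")
```

In this toy case G1 joins G2 and G3 joins G4 first, and the two pairs then merge at the mean of the four between-pair distances. A dendrogram such as Fig. 4 is a drawing of exactly this merge history.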

Principal coordinate analysis

An alternative view of the relationships among the 108 genotypes was obtained by subjecting the SCoT marker data to principal coordinate analysis (PCoA; Fig. 5). The two-dimensional PCoA plot gave results comparable to the UPGMA dendrogram, with similar clustering of the genotypes. The overall genetic relationships among the genotypes were well revealed by the two-dimensional projection of the data along the first two principal coordinates, which explained 8.1 and 13.4% of the total variation, respectively, indicating the multi-dimensional nature of the SCoT data. Three clusters (C1, C2 and C3) lay close to one another while showing considerable divergence, whereas the remaining two clusters (C4 and C5) were distinctly separated and also showed considerable divergence.

Fig. 5 PCoA plot of 108 samples of T. undulata based on SCoT primer analysis
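
The share of variation attributed to each principal coordinate is its eigenvalue divided by the sum of all eigenvalues. The sketch below illustrates that calculation and is not the NTSYS-pc procedure: it assumes Euclidean distances between 1/0 profiles, uses plain power iteration with deflation in place of a full eigendecomposition, and runs on hypothetical profiles.

```python
import math


def pcoa_explained_variation(profiles, n_axes=2, iterations=200):
    """Percent of total variation carried by the leading principal coordinates.

    ASSUMPTIONS: Euclidean distance between 1/0 profiles; eigenvalues found
    by power iteration, which converges slowly when eigenvalues are close.
    """
    n = len(profiles)
    # Squared Euclidean distance between every pair of band profiles.
    d2 = [[float(sum((x - y) ** 2 for x, y in zip(a, b))) for b in profiles]
          for a in profiles]
    row_mean = [sum(row) / n for row in d2]
    grand_mean = sum(row_mean) / n
    # Gower double-centring turns distances into a covariance-like matrix.
    b = [[-0.5 * (d2[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]
    total = sum(b[i][i] for i in range(n))  # trace = sum of eigenvalues
    shares = []
    for _axis in range(n_axes):
        v = [float(i + 1) for i in range(n)]  # fixed, arbitrary start vector
        eigenvalue = 0.0
        for _step in range(iterations):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # no variation left to extract
                eigenvalue = 0.0
                break
            v = [x / norm for x in w]
            eigenvalue = norm
        shares.append(100.0 * eigenvalue / total if total > 0 else 0.0)
        # Deflation: remove the extracted axis before finding the next one.
        for i in range(n):
            for j in range(n):
                b[i][j] -= eigenvalue * v[i] * v[j]
    return shares


# Hypothetical profiles: bands 1 and 2 always co-occur, band 3 is independent.
example = [[0, 0, 0], [1, 1, 0], [0, 0, 1], [1, 1, 1]]
first, second = pcoa_explained_variation(example)
print(f"Coordinate 1: {first:.1f}%, coordinate 2: {second:.1f}%")
```

For these toy profiles the two coordinates carry about 66.7% and 33.3% of the variation. When hundreds of bands vary largely independently, the variation is spread over many coordinates and the leading two account for only a small share.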

Discussion

SCoT markers are PCR-based single-primer markers designed from the short conserved sequence flanking the start codon ATG in plant genes (Collard and Mackill 2009); this region is conserved among genes. Although similar to other single-primer techniques such as RAPD and ISSR, in which the same primer serves as both forward and reverse primer, SCoT markers are more reliable (annealing temperature of 50 °C compared with 37–42 °C), more reproducible (they target genes instead of arbitrary or non-coding regions) and easy to design. SCoT polymorphism detection is less costly and less labour-intensive than AFLP and similar techniques (SAMPL, FIASCO). Further, SCoT polymorphisms can produce markers that are linked to traits and could therefore be used in marker-assisted breeding. In potato, SCoT markers were more efficient and informative than other markers for diversity studies of varieties in terms of polymorphism (%), PIC values and Shannon index (Gorji et al. 2011). SCoT markers have been effectively used to study diversity in several species, including Mangifera indica (Luo et al. 2011a, b), Vitis vinifera (Guo et al. 2011), Dendrobium nobile (Bhattacharya et al. 2013) and Citrus spp. (Jiang et al. 2011).

We utilized 22 SCoT primers to study the genetic diversity of populations covering all 12 districts of western Rajasthan, India. Overall, 71.6% polymorphism was observed among the genotypes, indicating good genetic diversity and a high discriminating ability of SCoT analysis. The high percentage of polymorphism could be due to the out-crossing nature of T. undulata (Negi et al. 2011); earlier studies have also reported its out-crossing (often cross-pollinated) nature (Jindal et al. 1985, 1992). The polymorphism observed in our study (71.6%) agrees with that reported by Bhau et al. (2007) using AFLP markers in T. undulata (71.7%). However, our study assessed the diversity of a larger number of genotypes (108 vs 42), including wild/natural and planted populations, and used more primers (22 SCoT primers compared with the 8 AFLP primer pairs used by Bhau et al. 2007). Further, and most importantly, this study covered a larger geographical area (21 locations in 12 districts) than the 3 districts covered by Bhau et al. (2007; Jodhpur, 2 samples; Jalore, 6 samples; Barmer, 32 samples; and 2 samples from selfed flowers).

Despite the species being highly valued for its timber and medicinal properties, only one report (Bhau et al. 2007) is available on the genetic diversity of T. undulata at the molecular level. Our study is one of the earliest reports on the genetic diversity of this endangered species using easy-to-handle, PCR-amenable SCoT markers. The average PIC value observed was 0.54, which is moderately high and indicates that these loci are informative. Further, higher PIC values indicate the effectiveness of the SCoT primers in fingerprinting varieties or genotypes. As there are no earlier reports in T. undulata, these PIC values were compared with PIC values obtained with SCoT markers in other species such as Cocos nucifera (0.16 to 0.47; Rajesh et al. 2015), Jatropha curcas (0.04 to 0.50; Mulpuri et al. 2013) and Tinospora cordifolia (0.19 to 0.49; Paliwal et al. 2013), and were found to be higher. In contrast, PIC values obtained with SCoT markers in some other species such as D. nobile (0.78; Bhattacharya et al. 2013), V. vinifera (0.75 to 0.91; Guo et al. 2011) and Pistacia spp. (0.23 to 0.98; Sorkheh et al. 2016) were higher than those in T. undulata. Further, the genetic diversity among the genotypes was estimated based on Nei’s genetic diversity (H) and Shannon’s information index (I). ‘H’ estimates the diversity of a population irrespective of the number of alleles per locus, the pattern of evolutionary forces such as mutation/selection/migration and the reproductive method of the organism (Nei 1973), while ‘I’ provides an estimate of genetic variation within the population. The average H and I values obtained in this study were 0.22 and 0.36, respectively.

Our study encompassed all three morphotypes based on flower (corolla) colour, viz., yellow, red and orange. Among the 41 genotypes with orange flowers, 29 (72.7%) occurred in 6 arid districts of western and north-western Rajasthan (Barmer, Bikaner, Churu, Jaisalmer, Jodhpur and Jhunjhunu), confirming the dominance of the orange morphotype in regions under severe climatic stress. The dominance and survival of the orange morphotype under extreme climatic conditions could be due to its hybrid nature (between yellow and red), which is manifested in its vigorous growth (Negi et al. 2011). Based on our results, it might be possible to identify candidate populations best adapted to harsh environmental conditions for in situ and ex situ conservation. A hybrid-based study focusing on the adaptability features of these populations could yield fruitful results in the future. Further, our results agree with other studies correlating higher genetic diversity with higher out-crossing rates (Miller and Tanksley 1990). Additionally, the distribution of red and yellow morphotypes was very limited in the Churu/Jalore/Pali and Jhunjhunu/Ganganagar districts, respectively.

Taxonomic classification is generally based on morphological traits, but the availability of novel and powerful molecular techniques is helping to modify and refine the existing systems of classification and to establish phylogenetic relationships based on DNA/protein sequences, especially in morphologically identical or related species. PCR-based molecular markers have contributed significantly to establishing phylogenetic relationships. Distinct genetic relationships among the 108 genotypes were obtained using cluster analysis, which grouped the genotypes into 5 major clusters (C-1 to C-5; Fig. 4). This SCoT marker-based study identified greater genetic diversity among populations and moderate diversity within populations. Even within populations, however, genotypes such as BAR-19-3 (Barmer population) and BKN-15-1 (Bikaner population) diverged markedly from the rest of their respective populations. Since all the genotypes representing Barmer populations were of wild type, we assume on the basis of the molecular analysis that the genotype BKN-15-1 could be of a similar type.

Close association of genotypes within a population, especially at the subgroup level, based on geography and corolla colour was demonstrated for most of the populations analysed. Further, this study identified grouping based on geographical proximity among populations, viz., Jodhpur (20 and 21) and Barmer (19) in subgroup 2 of cluster C-4, and Churu (CHR-8-1), Nagaur (4 and 5) and Sikar (6 and 7) in cluster C-5. These districts are adjacent or close to each other (Fig. 1). Furthermore, clusters 3, 4 and 5 include most of the samples collected from the arid region. This closeness can be attributed to the fact that these samples were collected mostly from the hotter and drier regions of the Indian Thar Desert, which have harsh climatic conditions. The closeness of genotypes from a similar geographic region, i.e., the hot and dry regions of the Indian Thar Desert, has also been reported in Prosopis cineraria (Sharma et al. 2010) and Commiphora wightii (Harish et al. 2014), important woody plant species of the Indian Thar Desert. Using SCoT markers, relationships based on geographic region have also been reported in V. vinifera (Guo et al. 2011). Additionally, clustering of genotypes belonging to different populations based on flower colour was observed to a limited extent, viz., JAL-1-1, JAL-1-4 and HGH-3-1 had yellow flowers and JAL-1-2 and JAL-1-3 had orange flowers in subgroup 1 of cluster C-1, while subgroup 2 included three genotypes of the Hanumangarh population (HGH-3-2 to HGH-3-4) having red flowers. Similarly, BAR-12-4 and BKN-14-6 in cluster C-3 had yellow flowers, and NAG-5-3, CHR-8-1, NAG-5-5 and SKR-6-1 in cluster C-5 had orange flowers.

This SCoT marker-based study effectively identified and established phylogenetic relationships among the T. undulata genotypes. Molecular analysis confirmed the existence of wide genetic diversity, which could be characterized and exploited for the varied uses of this important arid tree species. Thus, there is an urgent need to identify, collect and maintain such genetically diverse populations/genotypes, which would be of utmost importance in formulating management and improvement strategies. The economic and medicinal value of this important tree species of the desert ecosystem underscores the need for its conservation and improvement.