Introduction

Orchids, which belong to the monocot family Orchidaceae, are well known for their ornamental and medicinal value (Swarts and Dixon 2009) and are traded as cut flowers in international markets. Orchidaceae is one of the largest families of flowering plants, with approximately 25,000 species (Atwood 1986; Hsu et al. 2011). Orchids grow in a wide range of ecological habitats around the world and show very high species diversity (Tremblay et al. 2005). They occupy both epiphytic and terrestrial habitats and have acquired specialized reproductive and ecological strategies that improve their adaptability to different environmental conditions. Orchid flowers carry specialized structures, notably the labellum and the gynostemium (the column formed by fusion of the androecium and gynoecium), which promote pollination by attracting pollinators (birds and insects). These modifications have facilitated the co-evolution of orchid flowers and their pollinators (Yu and Goh 2001; Schiest et al. 2003). Orchids are considered an advanced group among the flowering plants. They are estimated to have originated 40–80 million years ago (Mya), possibly as early as the late Cretaceous (Dressler 1993). Some orchid species are considered rare and threatened because of uncontrolled exploitation by traders and orchid growers, in addition to habitat deterioration caused by modern industrial civilization (Swarts and Dixon 2009). Protection measures are therefore needed to conserve orchid germplasm, and with it a valuable gene pool, across different ecological regions and habitats before these species become extinct. Information on the genetic diversity of a species' gene pool and on its population structure is essential for conservation, because it helps in understanding the growth pattern and survival potential of the germplasm (Pillon et al. 2007). Wide genetic diversity is a basic requirement for conserving any genetic resource (Brown 1988; Izawa et al. 2007), because it gives individual plants the adaptability to cope with environmental fluctuation in their ecogeographical habitat. When individuals disappear from a population for any reason, the genetic traits they carry are ultimately lost (Izawa et al. 2007).

Genetic diversity data help to clarify evolutionary processes such as genetic drift, mutation, the direction of gene flow and divergence among populations (Dressler 1993; Chase et al. 2005; George et al. 2009). Orchids show a high level of genomic complexity, with a wide range of variation in DNA content: DNA amount varies from 0.33 to 55.4 pg/cell, indicating diversity in evolution (Leitch et al. 2009). The pattern of genetic diversity present in an organism indicates its complexity (Lynch and Conery 2003) and contributes to the sustainability of the ecosystem (Reusch et al. 2005). Species sensitivity to environmental change is necessary for evolution (O’Brien 1994). Low diversity in a population indicates possible concealed endangerment (Amos and Balmford 2001). For example, genetic diversity was measured in Calanthe tsoongiana to support the conservation of this orchid, which is endemic to China (Qian et al. 2013).

Different molecular techniques have been employed to measure genetic diversity among plant species, including gene-specific sequences such as matK, rbcL and ITS regions (Hilu and Liang 1997; Ma and Yin 2009; Cao et al. 2001; Koch et al. 2001; Zhu et al. 2003; Yang et al. 2004; Zhang et al. 2013; Chen et al. 2014; Xing 2014). More recently, high-throughput SNP genotyping techniques based on next generation sequencing (NGS) have been used to analyze genetic diversity (Kumar et al. 2012; Shirasawa et al. 2016). NGS-based techniques such as genotyping-by-sequencing (GBS) (Elshire et al. 2011; Poland and Rife 2012) and restriction site-associated DNA sequencing (RAD-Seq) have become popular choices for genetic analysis because they are low cost and flexible in experimental design (Davey et al. 2011).

To our knowledge, no published reports are available on nucleotide diversity in these four orchid species based on NGS techniques. Therefore, in this study, we analyze preliminary double-digest RAD (ddRAD) sequencing data from four orchid species (Dendrobium densiflorum, Geodorum densiflorum, Cymbidium aloifolium and Rhynchostylis retusa) to evaluate genetic diversity and phylogenetic relationships based on DNA polymorphism, using the DnaSP5 and MEGA6 programs.

Materials and methods

Plant materials

Four orchid species were used in this study for DNA polymorphism analysis. Dendrobium densiflorum Lindl., Geodorum densiflorum (Lam.) Schltr., Cymbidium aloifolium (L.) Sw. and Rhynchostylis retusa (L.) Blume were collected from Raiganj, Dist-Uttar Dinajpur, WB (geographical location: latitude 25.62°N, longitude 88.12°E, altitude 40 m) and maintained in the garden of the Department of Botany, University of North Bengal, India.

DNA isolation and purification

Young leaves were harvested from the plants and kept at −20 °C for 24 h before DNA extraction. Total genomic DNA was extracted using a Qiagen DNeasy plant mini kit (Qiagen, Germany). High quality DNA (concentration >300 ng/μl, A260/A280 ratio = 1.8–2.0 and A260/A230 ratio >1.7) was used for ddRAD sequencing (Peterson et al. 2012).

Library construction for next generation sequencing (NGS)

The four DNA samples that met the required QC parameters were used for library preparation following standard protocols (Illumina) for next generation sequencing. After purification and quality check, about 1 μg of genomic DNA from each of the four orchid species was double digested with the restriction enzymes SphI and MluCI by incubating at 37 °C for 16–20 h following the standard protocol (Peterson et al. 2012). The digested products (250–400 bp) were cleaned with AMPure XP beads (Beckman Coulter Genomics) (with Dynabeads, Invitrogen) using standard protocols (Beckman Coulter Genomics) before library construction.
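To illustrate the double-digest step, the following minimal sketch performs an in silico digestion of a single DNA sequence and keeps the fragments that are flanked by one SphI cut and one MluCI cut and that fall within the 250–400 bp window. It is an illustration only and was not part of the wet-lab protocol. The function names are hypothetical, and the recognition sites (GCATG^C for SphI, ^AATT for MluCI) are the enzymes' standard specifications rather than details reported in this study.

```python
def cut_positions(seq, site, offset):
    """Return cut coordinates for every (possibly overlapping) occurrence of `site`."""
    positions = []
    i = seq.find(site)
    while i != -1:
        positions.append(i + offset)
        i = seq.find(site, i + 1)
    return positions


def ddrad_fragments(seq, min_len=250, max_len=400):
    """Return (start, end) of fragments bounded by two different enzymes.

    Assumption: both sites are palindromic, so scanning one strand is enough.
    """
    seq = seq.upper()
    # SphI cuts GCATG^C (offset 5); MluCI cuts ^AATT (offset 0).
    cuts = [(p, "SphI") for p in cut_positions(seq, "GCATGC", 5)]
    cuts += [(p, "MluCI") for p in cut_positions(seq, "AATT", 0)]
    cuts.sort()
    fragments = []
    # Only adjacent cuts define a fragment after complete digestion.
    for (start, enzyme_a), (end, enzyme_b) in zip(cuts, cuts[1:]):
        if enzyme_a != enzyme_b and min_len <= end - start <= max_len:
            fragments.append((start, end))
    return fragments


# Toy sequence (hypothetical): one SphI site and one MluCI site 301 bp apart.
toy = "CCCC" + "GCATGC" + "G" * 300 + "AATT" + "CCCC"
print(ddrad_fragments(toy))
```

Requiring two different enzymes at the fragment ends mirrors the ddRAD design, in which only fragments carrying both adapter types are amplified.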

Adapters were ligated to the restriction cut sites to add barcodes and common PCR priming sequences. Barcodes were placed downstream of the sequencing primer so that the products of different ligations could be resolved after sequencing, i.e., so that each individual’s reads could be separated. Ligation of the P1 (barcoded) and P2 adapters was carried out using T4 DNA ligase, followed by indexing with the addition of Index 1 and Index 2 sequences, each 8 nt long. Adapter-ligated fragments were separated by 2% agarose gel electrophoresis and visualized after staining with SYBR Safe DNA stain. DNA fragments of 250 to 400 bp were excised from the gel under a UV transilluminator, extracted from the gel and purified using the AMPure beads system. To increase the concentration of DNA fragments in the sequencing libraries, PCR amplification (8–12 cycles) was performed using the Phusion™ polymerase kit. The PCR products were analyzed on an Agilent Bioanalyzer to quantify molarity and fragment size distribution (BA profiles) in each sequencing library. Each library was sequenced in six or more lanes using Illumina TruSeq chemistry on the HiSeq 2000 platform (SciGenom Labs Pvt. Ltd, Cochin, Kerala), generating 90 bp paired-end reads; the Solexa pipeline version 1.0 was used to read the raw fluorescence images. To ensure high quality, the raw data were filtered by deleting reads with adapter contamination or with more than 50% low quality bases (quality value ≤ 5). The reads were assembled using SOAPdenovo version 1.05 with default parameters (Peterson et al. 2012).
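The read-filtering rule described above (discard reads with adapter contamination or with more than 50% of bases at quality ≤ 5) can be sketched as follows. This is a minimal illustration, not the pipeline actually used: the function names are hypothetical, the Phred offset of 33 is an assumption (older Illumina pipelines used 64), and the adapter string in the usage example is a placeholder.

```python
def low_quality_fraction(quality_string, threshold=5, phred_offset=33):
    """Fraction of bases whose Phred score is <= threshold.

    Assumption: qualities are ASCII-encoded with the given offset.
    """
    if not quality_string:
        return 1.0  # treat an empty read as entirely low quality
    low = sum(1 for ch in quality_string if ord(ch) - phred_offset <= threshold)
    return low / len(quality_string)


def keep_read(sequence, quality_string, adapters=(), max_low_fraction=0.5,
              threshold=5, phred_offset=33):
    """Keep a read unless it contains an adapter or is mostly low quality."""
    if any(adapter in sequence for adapter in adapters):
        return False
    fraction = low_quality_fraction(quality_string, threshold, phred_offset)
    # "More than 50%" low-quality bases is the rejection criterion in the text.
    return fraction <= max_low_fraction


# Usage with a placeholder adapter sequence (hypothetical, for illustration).
print(keep_read("ACGTACGT", "IIII!!!!"))          # half the bases are low quality
print(keep_read("ACGTTTAG", "IIIIIIII", ("TTAG",)))  # contains the placeholder adapter
```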

Analysis of nucleotide diversity from ddRAD sequencing data

Annotated and assembled contig sequences from the ddRAD-Seq data were aligned using the multiple sequence alignment software ClustalW (http://www.ebi.ac.uk/clustlw/). The aligned output files were saved in FASTA format and used as input for further genetic and phylogenetic analysis in DnaSP5 and MEGA6. DNA polymorphism was studied using DnaSP (DNA Sequence Polymorphism) version 5.10.01 (Librado and Rozas 2009) (http://www.ub.es/dnasp/). DnaSP5 was used to estimate molecular diversity, including haplotype (gene) diversity (Hd), nucleotide diversity Pi (π) (Nei 1987), theta (θ) (Watterson 1975), the number of haplotypes (Nh) and the number of segregating sites (S) in the four orchid species. Phylogenetic analysis was conducted using MEGA6, and phylogenetic trees were generated using the Neighbor-Joining (NJ) method with 1000 bootstrap replicates. The dendrograms were reconstructed from partial gene sequences (matK, Ycf2, psbD) of the four orchids. Pairwise synonymous (dS) and non-synonymous (dN) nucleotide substitution rates were estimated using the Jukes and Cantor (1969) distance model with the Nei–Gojobori method (Nei and Gojobori 1986) implemented in MEGA v6 (Tamura et al. 2013). The mean pairwise ratio dN/dS (ω) was calculated and used to examine whether the gene sequences evolve under purifying constraint on amino acid sequences (ω < 1) or under positive selection for amino acid changes (ω > 1). Evidence for non-neutral evolution was investigated using Tajima’s D (Tajima 1989), Fu and Li’s D* and F* (Fu and Li 1993), the McDonald and Kreitman (1991) test and the HKA test (Hudson et al. 1987), all implemented in DnaSP5.
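For readers unfamiliar with the summary statistics listed above, the sketch below computes S, π, Watterson’s θ, Nh and Hd from a small aligned set of sequences using their standard textbook definitions. It is not the DnaSP implementation; the function name is hypothetical, and the handling of gaps (complete deletion of any column containing a non-ACGT character) is an assumption that may differ from the settings used in this study.

```python
from collections import Counter
from itertools import combinations


def diversity_stats(aligned_seqs):
    """Summary statistics for equal-length aligned sequences."""
    seqs = [s.upper() for s in aligned_seqs]
    n = len(seqs)
    # Complete deletion: keep only columns where every base is A, C, G or T.
    cols = [c for c in zip(*seqs) if all(b in "ACGT" for b in c)]
    if n < 2 or not cols:
        raise ValueError("need at least two sequences and one usable site")
    sites = len(cols)
    cleaned = ["".join(col[i] for col in cols) for i in range(n)]

    seg_sites = sum(1 for c in cols if len(set(c)) > 1)        # S
    pairs = list(combinations(cleaned, 2))
    mean_diff = sum(sum(a != b for a, b in zip(x, y))           # k
                    for x, y in pairs) / len(pairs)
    a1 = sum(1.0 / i for i in range(1, n))                      # harmonic term
    haplotypes = Counter(cleaned)
    hd = n / (n - 1) * (1 - sum((c / n) ** 2 for c in haplotypes.values()))

    return {
        "n": n,
        "sites": sites,
        "S": seg_sites,
        "pi": mean_diff / sites,            # nucleotide diversity per site
        "theta_w": seg_sites / a1 / sites,  # Watterson's theta per site
        "Nh": len(haplotypes),
        "Hd": hd,
    }


# Toy alignment (hypothetical data, not from this study).
print(diversity_stats(["ACGTACGT", "ACGTACGA", "ACGAACGT", "ACGTACGT"]))
```

Because π weights variants by their frequency while θ depends only on the count of segregating sites, comparing the two is the basis of the neutrality tests named in the text.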

Results and discussion

Analysis of nucleotide diversity

The raw sequence data (fastq format) of the four orchid species (G. densiflorum, D. densiflorum, C. aloifolium and R. retusa) obtained by ddRAD sequencing are summarized in Table 1. The total sequence read was 553.3 Kbp in Geodorum (submitted to the NCBI SRA archive with bioproject Accession no. SRP065790, PRJNA294125), 1.1 Mbp in Dendrobium (submitted to the NCBI SRA archive with bioproject Accession no. SRP063543, PRJNA295128), 1.6 Gbp in Cymbidium (submitted to the NCBI SRA archive with bioproject Accession no. SRP072201, PRJNA316048) and 1.4 Gbp in Rhynchostylis (submitted to the NCBI SRA archive with bioproject Accession no. SRP072378, PRJNA316496). The number of assembled contigs was 285 in Geodorum (longest 275 bp), 14326 in Dendrobium (longest 512 bp), 126051 in Cymbidium (longest 423 bp) and 42869 in Rhynchostylis (longest 484 bp). The four orchids showed similar GC content: 43.9, 43.7, 41.2 and 42.3% in Geodorum, Dendrobium, Cymbidium and Rhynchostylis, respectively (Table 1). By comparison, the GC content of Phalaenopsis equestris was 35.95% according to the whole genome sequencing report (Jiangfeng et al. 2014), with a genome size of 1600 Mb (diploid chromosome number 2n = 38, with 3.37 pg DNA/diploid genome). This indicates that P. equestris has an AT-rich genome, with approximately 47,007 genes. In general, GC content in monocot plants ranges from 33.6 to 48.9% (Jiangfeng et al. 2014). The GC content of the four orchid species in the present investigation is relatively high within this range, which may reflect a more complex genome architecture than in other plant groups. High GC content in the genome helps plants adapt to different environmental conditions (cold/dry seasons) and is associated with more complex gene regulation (Smarda et al. 2014). A partial genome sequence was first obtained for one important orchid, Phalaenopsis equestris (Hsu et al. 2011), and its whole genome was later sequenced for characterization and phylogenetic analysis (Cai et al. 2015). In many other organisms, however, only a reduced representation of the genome has been sequenced using NGS-based RAD, ddRAD and GBS methods to study genetic diversity and phylogenetic relationships with SNP markers (Baird et al. 2008; Etter et al. 2011; Peterson et al. 2012; Jiangfeng et al. 2014; Wataru et al. 2014).
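The GC percentages discussed above follow from a simple base count. A minimal sketch is given below; the function name is hypothetical, and excluding ambiguous bases such as N from the denominator is an assumption, since the assembly report does not state how they were treated.

```python
def gc_percent(sequence):
    """GC content as a percentage of unambiguous (A, C, G, T) bases."""
    seq = sequence.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / counted


# Toy contig (hypothetical): 4 of the 8 unambiguous bases are G or C.
print(gc_percent("GGCCAATTNN"))
```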

Table 1 Summary of the ddRAD sequencing contig assembly report (Illumina HiSeq 2000) along with NCBI SRA submission details and accession numbers

The Fastq files of the four orchids were analyzed with NCBI BLAST (http://blast.ncbi.nlm.nih.gov) using default parameters and taxonomic filters (plant, angiosperm and monocot) to check their taxonomic identity against the morphological identification. The BLAST analyses showed maximum similarity with the respective orchid genera (Geodorum, Dendrobium, Cymbidium and Rhynchostylis). Some of the contig assemblies from the orchid species were characterized, and each annotated partial gene sequence was deposited in NCBI GenBank with an accession number (Table 2). DNA barcoding involves sequencing a standard region of DNA as a tool for species identification. Plastid DNA regions such as the matK and rbcL genes are mainly used to identify species of land plants and their phylogenetic relationships (Liu et al. 2013; Xu et al. 2015). Other gene sequences were also considered for evolutionary phylogenetic analysis, such as the 28S, 18S and 16S rRNA genes, the Ycf2 gene, GTP-transmembrane like sequences, NdhB2, Nad5, NADH dehydrogenase subunit-9, 30S ribosomal S11 like, D1 like photosystem II gene, PsbH protein like gene, Rps12 ribosomal gene like, tRNA-serine like gene, receptor protein like gene and many hypothetical protein genes. In the present investigation, the partial matK gene sequence of Geodorum densiflorum (Accession no. KU136316, 235 bp) was used for nucleotide diversity analysis in DnaSP ver 5 and for phylogenetic relationship analysis in MEGA6.

Table 2 Partial sequences of different genes characterized from the SRA data of ddRAD sequencing contig assembly for four orchids and submitted to NCBI GenBank

The matK gene is the chloroplast gene encoding maturase K and is widely used to study phylogenetic relationships in plants. In Dendrobium densiflorum, the partial Ycf2 gene sequence (Accession no. KU136288, 386 bp) was used for nucleotide diversity analysis in DnaSP ver 5 and for dendrogram reconstruction in MEGA6. Partial matK gene sequences of other orchids (13 in number) were retrieved from NCBI GenBank by searching with the matK gene sequence of Geodorum densiflorum from Raiganj (Trillium discolor as outgroup).

The matK gene sequence of Oncidium was used to study phylogenetic relationships among 36 Oncidiinae species (Pan et al. 2012). In the same way, partial Ycf2 gene sequences of other orchids (10 in number) were retrieved from NCBI GenBank using the Ycf2 sequence of Dendrobium densiflorum from Raiganj (Scadoxus cinnabarinus as outgroup). In Cymbidium aloifolium, the partial psbD gene sequence (Accession no. KX033492, 484 bp) was used for nucleotide diversity analysis in DnaSP ver 5 and for dendrogram reconstruction in MEGA6. Partial psbD gene sequences of other orchids (8 in number) were retrieved from NCBI GenBank by searching with the psbD gene sequence of Cymbidium aloifolium from Raiganj. In Rhynchostylis, the partial Ycf2 gene sequence (Accession no. KX064256, 363 bp) was used for nucleotide diversity analysis in DnaSP ver 5 and for dendrogram reconstruction in MEGA6. Partial Ycf2 gene sequences of other orchids (10 in number) were retrieved from NCBI GenBank by searching with the Ycf2 gene sequence of Rhynchostylis retusa from Raiganj. The DnaSP5 results for the four orchids are summarized in Table 3. Several plastid regions, such as matK, atpB, psbB, psbC and rpoC1, have been used to identify phylogenetic relationships in orchids (Cameron 2004; Cameron and Carmen Molina 2006), and those results are consistent with our analysis of these four orchids. To test the efficiency of gene sequences as markers in phylogenetic analyses, the coding regions of four genes (accD, ccsA, matK and ycf1) were used as a case study to construct phylogenetic trees in the subfamily Epidendroideae (including Dendrobium officinale and Cypripedium macranthos) (Luo et al. 2014).

Table 3 Nucleotide diversity and neutrality test reports obtained from DnaSP5 for four orchid species of Raiganj, India

The haplotype (gene) diversity Hd was 1.00 and the nucleotide diversity per site Pi (π) was 0.10560 in Dendrobium, indicating that the sequences are genetically diverse. In Geodorum, based on the matK gene sequence, π was 0.03586 and Hd was 0.945. The matK sequence variation was low within the same group of orchids but quite high among diverse groups. These values of Hd and π (0.945 and 0.03586, respectively) signify a high level of genetic diversity among the orchid species. The Geodorum dataset included 10 different haplotypes. On average, the nucleotide diversities were π = 0.03586 and θ = 0.06641 in these 10 sequences of Geodorum. Both estimates of nucleotide diversity, π and θ, were high in these matK gene sequences. The two estimates were of the same order, with θ higher than π; because θ exceeds π, the neutrality test statistics tend to be negative. In a related study, the slipper orchids were analyzed to trace phylogenetic relationships among orchid species based on chloroplast gene sequences (accD, rbcL, matK, rpoC1, rpoC2, ycf1, ycf2 and ndhF) in PAUP version 4.0b10 (Guo et al. 2012).

Patterns of molecular evolution can be assessed from the nucleotide variation present in a population. DNA sequence comparison is one of the most important tools for studying genetic variation in natural populations (Kreitman 1983). In the classical view associated with Ronald Fisher, natural selection plays the major role in evolution and genetic drift has minimal influence. Kimura (1968), however, revolutionized this idea by proposing the neutral theory of molecular evolution. According to the neutral theory, most molecular differences that arise in gene sequences through spontaneous mutation do not influence the fitness of the individual, and such genetic changes can spread through a population by genetic drift. Departures from neutrality can be detected with the McDonald and Kreitman (1991) test. Tajima’s neutrality test (Tajima 1989), or D statistic, is also used to test the neutral theory of molecular evolution (Kimura 1983). In the present investigation of four orchid species, Tajima’s D was negative (−2.01655 in Geodorum, −2.17959 in Dendrobium, −2.12362 in Cymbidium and −1.54222 in Rhynchostylis), which suggests lower average pairwise polymorphism than expected from the number of segregating sites, i.e., an excess of low-frequency variants. This result is consistent with expansion of population size after a genetic bottleneck or with a selective sweep, and a negative D value is also commonly interpreted as a sign of purifying selection. The values of Tajima’s D and Fu and Li’s D* and F* were all statistically significant in the four orchid genera. The results for these gene sequences (matK, Ycf2 and psbD) indicate that they have not evolved neutrally, suggesting that selection might have played a role in the evolution of these genes in the four groups of orchids.
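To make the sign of the statistic concrete, the sketch below computes Tajima’s D from the sample size n, the number of segregating sites S and the mean number of pairwise differences k, following the constants defined by Tajima (1989). It is a minimal illustration rather than the DnaSP implementation; the function name is hypothetical and the input values in the usage example are toy numbers, not data from this study.

```python
import math


def tajimas_d(n, seg_sites, mean_pairwise_diff):
    """Tajima's D from n sequences, S segregating sites and k mean differences."""
    if n < 4 or seg_sites < 1:
        raise ValueError("need at least 4 sequences and 1 segregating site")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)
    # Numerator: difference between the two estimators of theta (k and S/a1).
    numerator = mean_pairwise_diff - seg_sites / a1
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return numerator / math.sqrt(variance)


# Toy values (hypothetical): k is smaller than S/a1, so D is negative.
print(tajimas_d(n=4, seg_sites=2, mean_pairwise_diff=1.0))
```

D is negative whenever the pairwise-difference estimate falls below the estimate based on segregating sites, which happens when many variants are present at low frequency.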

The present investigation also showed a dN/dS ratio below one (dN/dS < 1), indicating that these genes are mainly under negative selection, possibly together with a recent population size expansion. Many factors influence these patterns, such as mutation, population size, recombination rate, gene conversion and selection intensity (Hudson et al. 1987; McDonald and Kreitman 1991; Kreitman 2000). Tajima’s test, or D statistic (Tajima 1989), was used to assess whether the data fit the neutral theory of molecular evolution (Kimura 1983). The present results indicate non-neutral molecular evolution.

Results of Tajima’s D test and Fu and Li’s D* and F* tests are listed in Table 3. These tests compare the proportion of alleles at high or intermediate frequencies with the proportion at low frequencies among the four orchid species. Almost all the values are negative, indicating an excess of rare alleles in all four orchids. An excess of rare alleles is consistent with positive selection, a selective sweep or population size expansion. The present observation is consistent with earlier reports on the genetic relationships of Mediterranean orchids based on gene sequences (Aceto et al. 2007).

All the gene sequences (Ycf2, matK and psbD) showed bias in codon usage in these orchids. The effective number of codons (ENC) was 51.00 in Dendrobium, 40.83 in Geodorum, 51.84 in Cymbidium and 47.336 in Rhynchostylis. The codon bias index (CBI) was 0.543 in Dendrobium, 0.531 in Geodorum, 0.376 in Cymbidium and 0.516 in Rhynchostylis, indicating moderate codon bias. The CBI measures the deviation from equal use of synonymous codons by the genes and ranges from 0 to 1 (0 = equal use of synonymous codons, 1 = maximum codon bias). Nonsynonymous rate variation among genes correlates most strongly with gene expression, perhaps owing to selection for translational robustness. Among lineages, perennial plants evolve more slowly than annuals, but the mechanism driving this effect remains unclear (Gaut et al. 2011). Evolutionary rates vary among nucleotide sites as a consequence of selection and mutational biases. The codon biases observed in these four orchids support these earlier views (Gaut et al. 2011). Linkage disequilibrium (LD) was measured in the four orchid genera using the ZnS parameter (Kelly 1997), which was 0.6897 in Geodorum, 0.8630 in Dendrobium, 0.2175 in Cymbidium and 0.9342 in Rhynchostylis (all significant by the Bonferroni procedure).

The ZnS value is the average of R² (Hill and Robertson 1968) over all pairwise comparisons of polymorphic sites. If R² = 1, the two SNPs are in complete LD, meaning that the allele at one site can be predicted directly from the allele at the other. When R² = 0, the two SNPs are considered independent. The linkage disequilibrium of the four orchids is shown graphically in Fig. 1. Linkage disequilibrium was significant for both G. densiflorum (R² > 0.6897) and D. densiflorum (R² > 0.8630) (0.001 < P < 0.01), with R² > 0.2175 in Cymbidium aloifolium and R² > 0.9513 in Rhynchostylis retusa. R² values in the LD plots revealed slight decay across the analyzed sequences (Fig. 1). Pairwise comparisons were significant in Fisher’s exact test as well as in the Chi-square test.
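The relationship between R² and ZnS described above can be sketched as follows. The code computes R² for a pair of biallelic sites from haplotype frequencies and averages it over all pairs of biallelic sites in an alignment. It is a minimal illustration, not the DnaSP implementation; the function names are hypothetical, and restricting the calculation to strictly biallelic A/C/G/T columns is an assumption.

```python
from itertools import combinations


def r_squared(site_a, site_b):
    """R^2 between two biallelic sites given as equal-length allele strings."""
    n = len(site_a)
    ref_a, ref_b = site_a[0], site_b[0]  # arbitrary reference alleles
    p_a = sum(1 for x in site_a if x == ref_a) / n
    p_b = sum(1 for y in site_b if y == ref_b) / n
    p_ab = sum(1 for x, y in zip(site_a, site_b)
               if x == ref_a and y == ref_b) / n
    d = p_ab - p_a * p_b                  # linkage disequilibrium coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


def zns(aligned_seqs):
    """Average R^2 over all pairs of biallelic sites (after Kelly 1997)."""
    columns = ["".join(col) for col in zip(*aligned_seqs)]
    biallelic = [c for c in columns
                 if len(set(c)) == 2 and set(c) <= set("ACGT")]
    pairs = list(combinations(biallelic, 2))
    if not pairs:
        raise ValueError("need at least two biallelic sites")
    return sum(r_squared(a, b) for a, b in pairs) / len(pairs)


# Toy alignments (hypothetical data): complete LD, then independent sites.
print(zns(["AA", "AA", "TT", "TT"]))
print(zns(["AA", "AT", "TA", "TT"]))
```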

Fig. 1 Linkage disequilibrium of Geodorum (from above left), Dendrobium, Cymbidium and Rhynchostylis depicted as graphical representation using matK, Ycf2 gene, and psbD gene respectively

The overall transition/transversion bias in the matK gene sequences of Geodorum was R = 0.517, where R = [A*G*k1 + T*C*k2]/[(A + G)*(T + C)]; here A, G, T and C are the nucleotide frequencies, and k1 and k2 are the transition/transversion rate ratios for purines and pyrimidines, respectively. The analysis involved 14 nucleotide sequences with 232 positions in the final dataset (Table 4) and was run in MEGA6 (Tamura et al. 2013); transitional substitutions are shown in bold and transversional substitutions in italics. For the Ycf2 gene sequences of Dendrobium, the overall transition/transversion bias was R = 1.144 (Table 5). For the psbD gene of Cymbidium, R = 1.023 using 8 gene sequences (Table 6). The Ycf2 gene sequences of Rhynchostylis retusa showed a transition/transversion bias of R = 1.372 (Table 7). The emerging picture of the nucleotide substitution process in plants is a complex one. Evolutionary rates are quite variable, both among genes and among plant lineages. The present results for the four orchids are consistent with the view that nucleotide substitution rates, including the transition/transversion ratio, depend on the gene and the species, and that genes and lineages do not evolve at the same rate (Muse 2000).
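The formula for R given above can be evaluated directly once the nucleotide frequencies and the two rate ratios are known. The sketch below is a minimal illustration of that arithmetic; the function and argument names are hypothetical, and the input values are toy numbers rather than the estimates reported in Tables 4 to 7.

```python
def transition_transversion_bias(freq_a, freq_g, freq_t, freq_c, k1, k2):
    """R = [A*G*k1 + T*C*k2] / [(A + G)*(T + C)].

    freq_* are nucleotide frequencies (summing to 1); k1 and k2 are the
    transition/transversion rate ratios for purines and pyrimidines.
    """
    purines = freq_a + freq_g
    pyrimidines = freq_t + freq_c
    if purines == 0 or pyrimidines == 0:
        raise ValueError("both purines and pyrimidines must be present")
    numerator = freq_a * freq_g * k1 + freq_t * freq_c * k2
    return numerator / (purines * pyrimidines)


# Toy inputs (hypothetical): equal base frequencies.
print(transition_transversion_bias(0.25, 0.25, 0.25, 0.25, k1=1.0, k2=1.0))
print(transition_transversion_bias(0.25, 0.25, 0.25, 0.25, k1=2.0, k2=2.0))
```

With equal base frequencies and k1 = k2 = 1 the expression evaluates to 0.5, so values of R close to 0.5 (as in Geodorum) point to little transition bias, whereas values above 1 (as in the other three orchids) point to a stronger bias toward transitions.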

Table 4 Estimate of nucleotide substitution pattern based on MCL model in 14 matK gene sequences of Geodorum densiflorum of Raiganj
Table 5 Nucleotide Substitution pattern was estimated using MCL model in 11 Ycf2 gene sequences of Dendrobium densiflorum of Raiganj
Table 6 Estimate of nucleotide substitution pattern based on MCL model in 8 psbD gene sequences of Cymbidium aloifolium of Raiganj
Table 7 Estimate of nucleotide substitution pattern based on MCL model in 10 Ycf2 gene sequences of Rhynchostylis retusa of Raiganj

Phylogenetic trees reconstructed for the four orchids of Raiganj based on Ycf2, matK and psbD partial gene sequences

Phylogenetic dendrograms were reconstructed to depict the genetic relatedness among the partial gene sequences (matK, Ycf2 and psbD) of the four orchid species from Raiganj (Fig. 2a–d; Tables 8, 9, 10, 11) in MEGA6 (Tamura et al. 2013) (www.megasoftware.net/mega6.html). For Geodorum, matK gene sequences of 14 different orchids were analyzed based on pairwise distances between the gene sequences (Table 8). The maximum likelihood method was used to reconstruct the evolutionary history based on the Jukes-Cantor model (Jukes and Cantor 1969). A bootstrap consensus tree was inferred from 1000 replicates to represent the relationships among the taxa (Felsenstein 1985). Initial trees for the heuristic search were obtained by applying the Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances. A gamma distribution was used to model evolutionary rate differences among sites. The analysis involved 14 nucleotide sequences in Geodorum (matK gene), 11 sequences in Dendrobium (Ycf2 gene), 8 in Cymbidium (psbD gene) and 10 in Rhynchostylis (Ycf2 gene). Three clusters were formed in the dendrogram of Geodorum. Geodorum densiflorum of Raiganj was closely related to the other Geodorum densiflorum (JN004445) and to Geodorum recurvum (KF673833) in this phylogenetic tree (Fig. 2a; Table 8). Trillium discolor was used as the outgroup and was placed in a separate clade in the dendrogram. The same method was applied to reconstruct the phylogenetic trees based on Ycf2 partial gene sequences of 11 species for Dendrobium (Fig. 2b; Table 9) and 10 sequences for Rhynchostylis (Fig. 2d; Table 11) based on pairwise genetic distances. Broadly, two clusters were formed in the dendrogram of Geodorum, in which the Raiganj sample was closely related to the other Geodorum species in this study (Fig. 2a), and Trillium discolor, used as the outgroup, was placed outside the cluster. For Dendrobium densiflorum (Raiganj), two clusters were formed in the dendrogram, which showed a close relationship with the other Dendrobium species in this investigation (Fig. 2b); Scadoxus cinnabarinus was used as the outgroup. For Rhynchostylis retusa, the dendrogram based on Ycf2 gene sequences formed one cluster, and Rhynchostylis of Raiganj was placed in a different clade, which signifies that it was not closely related to the other species considered in this reconstruction (Fig. 2d; Table 11). Three separate clusters were formed in the dendrogram based on the psbD partial gene sequences of 8 orchid species, including Cymbidium of Raiganj (Fig. 2c; Table 10). One intriguing relationship revealed by studies of branch lengths on molecular phylogenies is a link between the rate of molecular evolution and the net diversification rate. A correlation between evolutionary rates and species diversity has been found in several groups, including flowering plants (Barraclough and Savolainen 2001). The present investigation shows different rates of phylogenetic diversification, supporting the findings of other workers.
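The pairwise distances underlying these dendrograms rest on the Jukes-Cantor correction, which converts the observed proportion of differing sites p into an estimated number of substitutions per site, d = −(3/4) ln(1 − 4p/3). The sketch below is a minimal illustration of that correction, not the MEGA6 implementation; the function name is hypothetical, and skipping alignment columns with gaps or ambiguous bases is an assumption.

```python
import math


def jukes_cantor_distance(seq1, seq2):
    """Jukes-Cantor distance between two aligned sequences of equal length."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    # Compare only columns where both sequences carry an unambiguous base.
    pairs = [(a, b) for a, b in zip(seq1.upper(), seq2.upper())
             if a in "ACGT" and b in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    p = sum(1 for a, b in pairs if a != b) / len(pairs)
    if p >= 0.75:
        raise ValueError("distance undefined when p >= 0.75 (saturation)")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Toy sequences (hypothetical): one difference in four sites, p = 0.25.
print(jukes_cantor_distance("ACGT", "ACGA"))
```

The corrected distance is always at least as large as p, because the correction accounts for multiple substitutions at the same site.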

Fig. 2 a Dendrogram reconstructed for molecular phylogenetic analysis using 14 matK gene partial sequences of orchid species by maximum likelihood method. b Dendrogram reconstructed for molecular phylogenetic analysis using 11 Ycf2 partial gene sequences of orchid species by maximum likelihood method. c Dendrogram reconstructed for molecular phylogenetic analysis using 8 psbD partial gene sequences of orchid species by maximum likelihood method. d Dendrogram reconstructed for molecular phylogenetic analysis using 10 Ycf2 partial gene sequences of orchid species by maximum likelihood method

Table 8 Estimates of evolutionary divergence between sequences and pairwise distance among the 14 matK partial gene sequences of various orchids
Table 9 Estimates of evolutionary divergence between sequences and pairwise distance among the 11 Ycf2 partial gene sequences of various orchids
Table 10 Estimates of evolutionary divergence between sequences and pairwise distance among the 8 psbD partial gene sequences of various orchids
Table 11 Estimates of evolutionary divergence between sequences and pairwise distance among the 10 Ycf2 partial gene sequences of various orchids

A comparative report of Tajima’s neutrality test (Tajima 1989) based on the partial gene sequences of matK, psbD and ycf2 is summarized in Table 16. The number of segregating sites was highest in the ycf2 sequences of Dendrobium (185) and lowest in the psbD sequences of Cymbidium (19), with 153 in Rhynchostylis and 46 in Geodorum. Nucleotide diversity (π) was highest in the Ycf2 sequences of Dendrobium (0.105603), followed by the Ycf2 sequences of Rhynchostylis (0.086563), the matK sequences of Geodorum (0.035856) and the psbD sequences of Cymbidium (0.011193).

The value of Tajima’s D statistic was negative in all cases, suggesting that purifying selection may have occurred (Table 16) in all four orchid species of Raiganj. Tajima’s test indicates a noteworthy difference between the values of π and θ, and thus a departure from the expectations of the neutral theory. A neutrality test was also conducted on these gene sequences using the codon-based Z-test in MEGA6, with the Nei-Gojobori method (Nei and Gojobori 1986). The probability of rejecting the null hypothesis of strict neutrality (dN = dS) is shown below the diagonal in Tables 12, 13, 14, 15.

Table 12 Codon based Z-test of neutral evolution using 14 nucleotide sequences of matK gene of Geodorum sp.
Table 13 Codon based Z-test of neutral evolution using 11 nucleotide sequences of Ycf2 gene of Dendrobium sp.
Table 14 Codon based Z-Test of Neutral evolution using 8 nucleotide sequences of psbD gene of Cymbidium sp.
Table 15 Codon based Z-test of neutral evolution using 10 nucleotide sequences of Ycf2 gene of Rhynchostylis sp.

Here dS and dN denote the numbers of synonymous and nonsynonymous substitutions per site, respectively. The neutrality test statistic (dN − dS) is shown above the diagonal in the respective tables. P values less than 0.05 were considered significant (at the 5% level) and are shown in italics. We found significant differences in the ratio of non-synonymous to synonymous substitutions among the four orchid species, which differ in net diversification rate; this may signal population size changes or altered selection pressures underlying this relationship (Duchene and Bromham 2013) (Table 16).
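The logic of the codon-based Z-test can be sketched as follows: the difference dN − dS is divided by its standard error and compared with a standard normal distribution. The sketch below assumes that the variances of dN and dS are already available (MEGA estimates them internally, for example by bootstrap) and uses a two-sided normal approximation; the function name and the input values are hypothetical and are not taken from Tables 12 to 15.

```python
import math


def codon_z_test(dn, ds, var_dn, var_ds):
    """Return (Z, two-sided P) for the null hypothesis dN = dS.

    Assumptions: var_dn and var_ds are supplied by the caller, and Z is
    treated as standard normal under the null hypothesis.
    """
    standard_error = math.sqrt(var_dn + var_ds)
    if standard_error == 0:
        raise ValueError("variance of dN - dS must be positive")
    z = (dn - ds) / standard_error
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return z, p_value


# Toy values (hypothetical): dN < dS gives a negative Z, as under purifying selection.
z, p = codon_z_test(dn=0.01, ds=0.05, var_dn=0.0002, var_ds=0.0002)
print(round(z, 3), round(p, 4))
```

A negative Z with P < 0.05 corresponds to dN significantly lower than dS, the pattern expected under purifying selection.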

Table 16 Comparative summary report of Tajima’s neutrality test (Tajima 1989) based on matK, ycf2 and psbD partial gene sequences of four orchids of Raiganj

Conclusion

The genetic diversity analysis using DnaSP5 showed that all four orchid species of Raiganj carry a moderate level of genetic diversity compared with the same gene sequences in other orchid species. Genetic diversity was evaluated on the basis of nucleotide diversity and haplotype (gene) diversity. All four orchid species showed negative values of Tajima’s D, suggesting less variability in the natural populations of these orchids than expected under neutrality. This may also indicate that the populations are expanding after a genetic bottleneck and, taken as a whole, points to purifying selection. The ratio dN/dS was less than one (dN/dS < 1), meaning that the sequences are under negative selection, which is also compatible with population size expansion. Different factors influence these patterns, such as mutation, population size, recombination rate, gene conversion and selection intensity. These four orchids (Cymbidium, Dendrobium, Geodorum and Rhynchostylis) of Raiganj can therefore be conserved in their respective habitats (in situ conservation) so that they continue to evolve naturally under their eco-climatic conditions. Overall, the results illustrate a complex evolutionary pattern in the populations of the four orchids under study. A variety of mechanisms could maintain variation at these loci at different levels, such as point mutations filtered by diffuse purifying selection and different selective constraints acting on synonymous and non-synonymous sites.