Introduction

The Himalayan range is among the most intricate and diverse mountain systems in the world. It forms a distinct geological and ecological entity that influences the climate and biota of the region. Its varying topography promotes environmental heterogeneity at both temporal and spatial scales, affecting the diversity and distribution patterns of biodiversity elements. In India, the Himalaya covers around 18–20% of the total geographical area and holds more than 50% of the country’s forest, with 40% of its endemic species. Uttarakhand state, located in the central Himalayan region, encompasses highly varied tropical to temperate regimes within a span of just a few hundred kilometres and, owing to its variable altitudinal terrain, offers an abode for rich, luxuriant flora and fauna. Through our extensive explorations over the past decade, we have documented more than 90 drosophilid species from this region (Sati et al. 2013), a significant number of them new to science, compared with around 300 drosophilid species recorded so far from varied ecogeographical zones in India (Gupta 2005; Kumar and Ajai 2009). This high proportion of diversity affirms Uttarakhand state as a potential drosophilid diversity hotspot with implications for molecular systematics and evolutionary biological research.

Drosophilidae is a large family of acalyptrate Diptera with a worldwide distribution. The first catalogue listed more than 2500 species belonging to 55 genera in this family (Wheeler 1981), with two subfamilies, Steganinae and Drosophilinae. Yassin et al. (2010) more recently documented 4000 described species distributed in 76 genera. At present, however, the taxonomic database of Drosophilidae compiled by Gerhard Bächli (dated March 2015) reports 6633 species, of which the genus Drosophila Fallén 1823 is the largest, with more than 1150 species recorded so far (Bächli 2015). Drosophila is also among the most versatile and best-studied model organisms in modern biology, with 12 fully sequenced genomes made available in the public domain by the Drosophila 12 Genomes Consortium (2007). In spite of the availability of such robust genetic and genomic resources, little is known so far about the molecular phylogeny, ecological preferences and evolutionary history of species inhabiting the Indian subcontinent, especially the biodiversity-rich Uttarakhand region of India. Although researchers in this region have documented several novel species reflecting a high degree of endemism, sexual dimorphism and host inclination (Singh et al. 1985; Singh and Bhatt 1986, 1988; Singh and Negi 1989, 1992, 1995; Singh and Dash 1998; Singh et al. 2000, 2004; Singh and Fartyal 2002; Fartyal and Singh 2002, 2004; Fartyal et al. 2005, 2013), no molecular studies have yet been performed using mitochondrial DNA-based markers. These markers can determine the levels of genetic differentiation among geographically distributed and ecologically isolated populations of drosophilids from this region. This study provides robust data on the extent to which these populations differ genetically, permitting interpretation of the evolutionary patterns of reproductive isolation and speciation mechanisms.
Because the diversification of genes among populations is an important evolutionary mechanism in speciation, mitochondrial DNA-based barcoding is widely acknowledged as an efficient tool not only for rapid species identification, but also for the discovery and delimitation of species, i.e. molecular taxonomy (Hebert et al. 2003b; Jinbo et al. 2011; Bast 2015).

Intraspecific molecular analysis is also a reliable approach to decipher how populations have been influenced by historical and existing processes (Emerson et al. 2001; Posada and Crandall 2001). mtDNA has been extensively used for reconstructing phylogenies and tracing historical demographies (Avise et al. 1987; Moritz et al. 1987; Harrison 1989; Avise 1991, 2000, 2004) in closely related taxa and in populations within species, owing to its low recombination rate, maternal inheritance, conserved structure, reduced effective population size and comparatively high evolutionary rate (Brown et al. 1979; Moritz et al. 1987; Avise 2004). Despite some drawbacks, mtDNA analysis provides an efficient way to detect gene flow, reproductive isolation, species boundaries and population structure (Rubinoff and Holland 2005). Thus, in the present study, we assessed intraspecific variation in three mitochondrial genes, namely 16S rRNA, cytochrome c oxidase subunit I (COI) and cytochrome c oxidase subunit II (COII), in 26 drosophilid species collected from varying habitats along an altitudinal transect between 550 and 2700 m above mean sea level (table 1). Several known and some novel species were sampled, making this the first molecular analysis to date from the central Himalayan region of India. It should aid in understanding the various isolating mechanisms in operation and their impact on genetic divergence among genomes as a whole. It also provides baseline data for identifying emerging reproductive isolating mechanisms and evolutionarily diverging populations before speciation.

Table 1 Sampling locations of different drosophilid species from Uttarakhand with their GenBank accession numbers for partial mitochondrial sequences of 16S rRNA, COI and COII.

Materials and methods

Sample collections

Extensive sampling surveys were carried out for a year along an altitudinal transect, covering varied ecological habitats ranging from near-tropical regimes to mid-elevation forests dominated by cool-temperate taxa. The stations sampled were Srinagar-Garhwal (550 m asl, 30 22 N and 78 78 E, Pauri-district), Augustyamuni (800 m asl, 30 39 N and 79 02 E, Rudraprayag-district), Upper Chamoli (1150 m asl, 30 24 N and 79 21 E, Chamoli-district), Mandal (1600 m asl, 30 46 N and 79 26 E, Chamoli-district), Kanchula-Kharak (2100 m asl, 28 43 N and 77 34 E, Chamoli-district) and Chopta (2700 m asl, 30 29 N and 79 10 E, Rudraprayag-district), all located in Uttarakhand. Cosmopolitan drosophilid species were collected over rotten fruits, vegetables, fungi and flowers, while rare species were sampled from their natural habitats using a range of techniques, as follows.

Trap bait:

Small containers baited with yeasted banana or other fermenting fruits such as orange, tomato, guava and apple were suspended by strings from bushes and trees. The method was very effective for collecting species belonging to the two subgenera Sophophora and Drosophila of the genus Drosophila. The banana trap design of Prof. Toda was also used (Toda and Kitching 2002).

Net sweeping:

A modified insect net was swept over natural feeding sites such as decaying fruits and leaves and wild grasses to capture flies of other genera of the family Drosophilidae, which are rarely attracted to fermenting fruits.

Collection with aspirator:

Rare flies that appeared only occasionally and could not be collected by the above methods were caught directly with an aspirator while they were courting or resting on leaves, petals, fungi, etc.

Identification and morphological study

Collected flies were etherized, sorted and subsequently identified using published monographs and guides to species identification (Gupta 2005; Markow and O’Grady 2006) and online identification tools such as BioCIS, JDD and FlyBase. The external morphology of adult flies (head, thorax, wing and body length) was examined under a stereo zoom microscope (Magnus MS24 model, India, at 3× objective zoom), and metric characters were measured with an ocular micrometer. For further confirmation, the detailed structures of the male and female terminalia were observed under a trinocular microscope (Magnus MLX-DX model, India, at 10× magnification). The respective genital organs were detached from the adult body, cleared by warming in 10% KOH solution at around 100°C for 20–30 min and observed in a droplet of glycerol. The morphological terminology and the definitions of measurements and indices mostly follow McAlpine (1981), Zhang and Toda (1992) and Hu and Toda (2001). The examined specimens of all species were deposited in the Systematics, Cytogenetics and Molecular Systematics Laboratory, Department of Zoology, H. N. B. Garhwal University, Uttarakhand, India.

DNA extraction

The DNA isolation procedure was standardized for a single fly. Each fly was mashed in 50 μL of fly lysis buffer (20 mM Tris-Cl, pH 8.5, 1.5 mM EDTA, 30 mM NaCl and 250 μg/mL of freshly prepared Proteinase K). The mixture was incubated at 25–37°C for 45 min, and Proteinase K was then inactivated by incubation at 95°C for 5 min. Three μL of this preparation was used per 25 μL PCR reaction mix.

Amplification and sequencing

The 16S rRNA gene was amplified in a 25 μL reaction mix that included 1× Taq polymerase buffer, 0.2 mM of each dNTP, 0.2 mM of each forward and reverse primer, 2 mM MgCl2 and 1 unit of Taq polymerase, using a Gradient DNA Thermo-Cycler (Techne, UK). The primers used for amplification of the partial 16S rRNA gene were 16sF (5′-CCGGTTTGAACTCAGATCACGT-3′) and 16sR (5′-CGCCTGTTTAACAAAAACAT-3′) (Simon et al. 1994). The thermal cycling conditions were: initial denaturation for 3 min at 94°C, followed by 39 cycles of denaturation at 94°C for 45 s, annealing at 50°C for 1 min and extension at 72°C for 1.5 min, and a final extension at 72°C for 5 min.

The COI and COII genes were also amplified in a final volume of 25 μL with 1× Taq polymerase buffer, 0.2 mM of each dNTP, 0.2 mM of each forward and reverse primer, 2 mM MgCl2 and 1 unit of Taq polymerase. The primers used for amplification were: COI (F: 5′-CCAGCTGGAGGAGGAGATCC-3′ and R: 5′-CCAGTAAATAATGGGTATCAGTG-3′) (Palumbi 1996) and COII (F: 5′-ATGGCAGATTAGTGCAATGG-3′ and R: 5′-GTTTAAGAGACCAGTACTTG-3′) (Simon et al. 1994). For both genes, an initial denaturation step of 3 min at 94°C was followed by 39 cycles of denaturation at 94°C for 30 s, annealing at 55°C for 45 s and extension at 72°C for 1 min, and a final extension at 72°C for 7 min.
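As a quick planning aid, the programmed hold time of the two cycling profiles can be totalled. The sketch below is illustrative only: the function name `program_seconds` is hypothetical, and the totals exclude ramping between temperatures, which depends on the thermocycler.

```python
def program_seconds(initial_s, cycles, step_seconds, final_s):
    """Total programmed hold time of a PCR profile, in seconds.

    Ramp times between temperature steps are NOT included (assumption:
    they vary with the instrument and are ignored here).
    """
    return initial_s + cycles * sum(step_seconds) + final_s


# 16S rRNA: 3 min at 94 C; 39 x (45 s, 60 s, 90 s); 5 min at 72 C
time_16s = program_seconds(3 * 60, 39, (45, 60, 90), 5 * 60)

# COI / COII: 3 min at 94 C; 39 x (30 s, 45 s, 60 s); 7 min at 72 C
time_co = program_seconds(3 * 60, 39, (30, 45, 60), 7 * 60)

print(time_16s, round(time_16s / 60, 1))  # 8085 s, about 134.8 min
print(time_co, round(time_co / 60, 1))    # 5865 s, about 97.8 min
```

Under these assumptions the 16S rRNA profile holds for about 135 min and the COI/COII profile for about 98 min, before ramp time is added.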

The PCR products were visualized on 1.5% agarose gels and the most intense products were selected for sequencing. Products were labelled with the Big-Dye Terminator ver. 3.1 Cycle Sequencing kit (Applied Biosystems, Foster City, USA) and sequenced bidirectionally using an ABI 3730 capillary sequencer following the manufacturer’s instructions.

Sequence analysis

Sequences of the three mitochondrial genes were aligned separately using Clustal X 2.0 (Larkin et al. 2007). Differences in length were resolved by inserting alignment gaps, and positions that could not be aligned unambiguously were excluded. The degree of sequence disparity between species was calculated by averaging pair-wise comparisons of sequence differences across all individuals. For each of the three genes (16S rRNA, COI and COII), analysed separately, at least four to six individuals of each species per sampling station were sequenced and aligned to yield a final alignment. Overall base composition, the numbers of transitions and transversions in the aligned sequences and pair-wise evolutionary distances among mitochondrial haplotypes were determined under the Kimura 2-parameter (K2P) method using Molecular Evolutionary Genetics Analysis (MEGA) software, ver. 6.0 (Tamura et al. 2013). This method gives better distance estimates than most other methods when the rates of transitional and transversional substitutions differ. MEGA 6.0 was also used to construct a neighbour-joining (NJ) tree based on the K2P model, as recommended by Hebert et al. (2003a). Statistical support for nodes on the tree was evaluated using 1000 bootstrap replications. The number of polymorphic sites, G + C content, number of haplotypes, haplotype diversity, variance of haplotype diversity, nucleotide diversity and average number of nucleotide differences were determined for each population with DnaSP 5.10.1 (Librado and Rozas 2009).
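For illustration, the K2P distance between two aligned sequences can be written as d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of sites showing transitions and transversions, respectively. The sketch below is a minimal stand-alone version, not the MEGA implementation: the function name `k2p_distance` is hypothetical, and gapped or ambiguous sites are assumed to be dropped pair-wise.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    s1 = seq1.upper().replace("U", "T")
    s2 = seq2.upper().replace("U", "T")
    compared = transitions = transversions = 0
    for a, b in zip(s1, s2):
        # Assumption: skip gaps and ambiguity codes (pair-wise deletion).
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        # A<->G or C<->T is a transition; anything else is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    term1 = 1 - 2 * p - q
    term2 = 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(term1 * math.sqrt(term2))
```

A single transition across ten compared sites (P = 0.1, Q = 0) gives d of about 0.112, slightly above the uncorrected proportion of 0.1, because the correction allows for multiple substitutions at the same site.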

Several tests of neutrality were applied to the molecular data obtained from the altitudinally isolated populations using DnaSP 5.10.1 (Librado and Rozas 2009): Tajima’s D test of selective neutrality (Tajima 1989) and Fu and Li’s F and D tests (Fu and Li 1993). Although these tests do not detect departures from neutrality to the same degree, their combination allows inferences to be made on the patterns of selection affecting a specific region of DNA.
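To make the first of these statistics concrete, Tajima's D compares the mean number of pair-wise differences (k) with the number of segregating sites (S) scaled by a1 = sum of 1/i for i = 1 to n-1, and divides the difference by its estimated standard deviation (Tajima 1989). The sketch below is a minimal version under stated assumptions, not the DnaSP implementation: the name `tajimas_d` is hypothetical, the input is assumed to be gap-free aligned sequences of equal length, and no significance test is performed.

```python
import math
from itertools import combinations


def tajimas_d(seqs):
    """Tajima's D for a list of aligned, gap-free sequences."""
    n = len(seqs)
    if n < 4:
        raise ValueError("at least four sequences are needed")
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must have equal length")

    # S: number of segregating (polymorphic) sites.
    s_sites = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if s_sites == 0:
        raise ValueError("no segregating sites; D is undefined")

    # k: mean number of nucleotide differences over all pairs.
    pairs = list(combinations(seqs, 2))
    k = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n ** 2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    variance = e1 * s_sites + e2 * s_sites * (s_sites - 1)
    return (k - s_sites / a1) / math.sqrt(variance)
```

With this definition, an excess of rare variants (for example, every polymorphism being a singleton) gives a negative D, whereas an excess of intermediate-frequency variants gives a positive D.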

Results

mt16S rRNA sequence analysis

We sequenced a total of 82 individuals of 15 species for partial sequence analysis of the 16S rRNA gene (GenBank accession numbers KP730750–KP730831). All the sequences were simple and unambiguous. Sequencing of the 16S rRNA gene produced an average of 541 bp, ranging from 432 to 847 bp. The analysis revealed nucleotide frequencies of A = 39.27%, T/U = 38.15%, C = 11.97% and G = 10.60%. Pair-wise nucleotide differences and sequence divergences are given in table 1, a&b in electronic supplementary material at http://www.ias.ac.in/jgenet/. Overall, 21 unique mitochondrial haplotypes were observed in 82 individuals of 15 drosophilid species. The number of polymorphic sites, G + C content, number of haplotypes, haplotype diversity, variance of haplotype diversity, nucleotide diversity and average number of nucleotide differences are provided in table 2a. All 21 haplotypes detected in the 16S rRNA gene and their consensus sequences are shown in figure 1a, along with their NJ tree in figure 2a. The neutrality tests showed a significant deviation from the null hypothesis of neutrality. For D. busckii, however, Fu and Li’s F* and D* tests indicated departure from neutrality, whereas Tajima’s D test did not show any significant departure (table 3a). 16S rRNA gene sequences for D. mukteshwarensis, Liodrosophila nitida, Lordiphosa parantillaria, Lordiphosa ayarpathaensis and Stegana minuta are provided in the public domain for the first time through this study.
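Nucleotide frequencies of this kind are simple pooled counts over the sequences. As a minimal sketch (the function name `base_composition` is hypothetical, and gaps and ambiguity codes are assumed to be ignored), they could be tallied as follows.

```python
from collections import Counter


def base_composition(seqs):
    """Percentage of A, C, G and T (U counted as T) pooled over sequences."""
    counts = Counter()
    for seq in seqs:
        cleaned = seq.upper().replace("U", "T")
        # Assumption: gaps and ambiguity codes are not counted.
        counts.update(base for base in cleaned if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no unambiguous bases found")
    return {base: 100 * counts[base] / total for base in "ACGT"}
```

For example, `base_composition(["AATT", "ACGT"])` gives 37.5% each for A and T and 12.5% each for C and G; the A + T share of 75% in this toy input is the kind of AT bias reported above for the 16S rRNA gene.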

Figure 1

Different haplotypes with their consensus sequences for partial mitochondrial (a) 16S rRNA, (b) COI and (c) COII genes (top to bottom).

Figure 2

NJ tree based on partial mitochondrial (a) 16S rRNA, (b) COI and (c) COII genes (left to right). Statistical support for nodes on the tree was evaluated using 1000 bootstrapping replications.

mtCOI and COII sequence analyses

The COI and COII genes were also amplified and sequenced for 238 (GenBank accession numbers KP730832–KP731069) and 223 (KP731070–KP731292) individuals, respectively, to determine genetic variation. No insertions, deletions or stop codons were observed in any of the sequences for either gene. Overall, 25 species were analysed for mtCOI gene variation, with an average of 598 bp ranging from 407 to 867 bp. The nucleotide frequencies observed were A = 28.90%, T/U = 39.52%, C = 14.89% and G = 16.68%, with an estimated transition/transversion bias (R) of 1.26. For the mtCOII gene, 23 species were analysed, generating an average of 679 bp ranging from 369 to 805 bp. The nucleotide frequencies observed were A = 33.01%, T/U = 39.84%, C = 13.51% and G = 13.65%, with a transition/transversion bias (R) of 1.46. Substitution patterns and rates were estimated under the K2P method. Pair-wise nucleotide differences and sequence divergences for the COI and COII genes are provided in table 1, c–f in electronic supplementary material.

Forty-seven unique mitochondrial haplotypes were found for the COI gene and 46 for the COII gene. Table 2, b&c, displays the number of polymorphic sites, G + C content, number of haplotypes, haplotype diversity, variance of haplotype diversity, nucleotide diversity and average number of nucleotide differences. All the mitochondrial haplotypes with their consensus sequences are shown in figure 1, b&c, along with their NJ trees in figure 2, b&c. Here too, for almost all the species, neutrality tests showed a significant deviation from the null hypothesis of neutrality across all localities. Table 3, b&c, shows the results of the neutrality tests for both genes. For the D. busckii mtCOI region, Fu and Li’s D* test did not indicate departure from neutrality. In the mtCOII gene analysis, only Fu and Li’s F* test showed a significant departure from neutrality for D. immigrans, while for D. repleta, Tajima’s D was the only test that did not indicate departure from neutrality.

Table 2(a). 16S rRNA partial gene sequence based diversity analysis.
Table 2(b). COI partial gene sequence based diversity analysis.
Table 2(c). COII partial gene sequence based diversity analysis.
Table 3(a). Neutrality tests based on 16S rRNA partial gene sequence.
Table 3(b). Neutrality tests based on COI partial gene sequence.
Table 3(c). Neutrality tests based on COII partial gene sequence.

Mitochondrial COI and COII barcodes for some rare species, namely D. mukteshwarensis, Liodrosophila nitida, Lordiphosa parantillaria, Lordiphosa ayarpathaensis, Scaptomyza himalayana, Scaptomyza tistai, Zaprionus grandis and Stegana minuta, are released to the public domain for the first time through this study.

Discussion

The enormous biodiversity that we witness today is the result of several biological processes, including adaptive radiation (Gavrilets and Losos 2009; Glor 2010; Losos 2010), which is the rapid diversification of a lineage into different ecological niches. Given ecological opportunity in an unutilized environment, an ancestral species can diversify rapidly into the newly available niches. The central Himalayan region of India, owing to its variable altitudinal terrain, encompasses highly varied tropical to temperate-like habitats within a span of just a few hundred kilometres and thus offers a well-suited arena for studying diversification across lineages, which can be useful in identifying adaptive radiations and understanding the processes underlying the origin of biological diversity (Losos 2010). Our extensive explorations in the region in past decades revealed around 95 species (Sati et al. 2013), a significant number of them new to science, which could be attributed to rapid divergence of the lineage as its descendants adapted over time to the range of available ecological resources. Thus, we assessed variation in three mitochondrial genes, namely 16S rRNA, COI and COII, in 26 drosophilid species collected along an altitudinal gradient from 550 to 2700 m above mean sea level. In all, we generated 543 sequences: 82 for 16S rRNA, 238 for COI and 223 for COII, with 21, 47 and 45 mitochondrial haplotypes for the 16S rRNA, COI and COII genes, respectively.

Most species were represented by 2–3 unique mitochondrial haplotypes, suggesting a significant impact of environmental heterogeneity along the altitudinal gradient on genetic diversity. Of the six localities surveyed, all individuals sequenced from Srinagar-Garhwal, Augustyamuni and Upper Chamoli showed the same haplotype (designated ‘S’), Kanchula-Kharak and Chopta formed another haplotype (designated ‘C’) in some cases, while Mandal formed a discrete haplotype (designated ‘M’) in most of the species. Thus Mandal (1600 m), a transition area between the subtropical and temperate zones with a high degree of heterogeneity in climate and vegetation, has an evident impact on the genetic diversity of drosophilid species. Large population size and environmental heterogeneity promote high haplotype diversity within populations (Nei 1987; Avise 1998), while nucleotide diversity (π) gives the mean number of nucleotide differences per site among all pairs of haplotypes in a population. Together, these values provide an informative measure of the genetic diversity of a population (Nei and Li 1979). Almost all species exhibited high haplotype diversity (>0.5) with low nucleotide diversity (<0.02), indicating that the studied populations diverged from each other recently. Although all species showed low π values, these were within the range observed for mtDNA in other Drosophila species (Powell 1997). We also observed a high level of polymorphism, as all the flies sequenced came from natural populations caught in the wild rather than from laboratory cultures.
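The two diversity measures discussed here can be illustrated with a short calculation. Haplotype diversity follows Nei (1987) as h = n/(n - 1) × (1 - Σ p_i²), where p_i is the frequency of haplotype i among n sequences, and nucleotide diversity is the mean number of pair-wise differences divided by the sequence length. The sketch below is a minimal version, not the DnaSP implementation: the function names are hypothetical, and the input is assumed to be gap-free aligned sequences of equal length.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """Nei's haplotype diversity: n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(seqs)
    if n < 2:
        raise ValueError("at least two sequences are needed")
    counts = Counter(seqs)  # identical sequences count as one haplotype
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)


def nucleotide_diversity(seqs):
    """Mean pair-wise differences per site (pi) for aligned sequences."""
    if len(seqs) < 2:
        raise ValueError("at least two sequences are needed")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be non-empty and of equal length")
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return diffs / (len(pairs) * length)
```

For four sequences split evenly between two haplotypes that differ at one of four sites, h = 4/3 × (1 - 0.5), about 0.67, and π = 4/24, about 0.17. In real alignments of several hundred base pairs, a few haplotypes separated by one or two substitutions give the combination of high h and low π described above.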

We also obtained significantly positive results for neutrality tests in this study, indicating balancing selection or population substructure. Most mitochondrial DNA sequence analyses have indicated that whenever the neutral model is rejected for a region, it is due to a significantly negative test (Rand et al. 1994; Nachman 1998; Nielsen and Weinreich 1999). A negative result may be due to recent directional selection, a population bottleneck, recent population growth or background selection against deleterious alleles (Tajima 1989). Although the ultimate consequence of these events may be an accumulation of rare polymorphic sites, differences in the pattern of nucleotide substitution allow some of these processes to be distinguished (Fu 1997; Wayne and Simonsen 1998). Our analysis of all three genes, with each species pooled individually, indicated a significant departure from neutrality in almost all the species. Several different evolutionary forces may cause such a departure. The neutrality tests do not all have the same power to detect departures caused by different evolutionary forces such as hitchhiking, population expansion, and background or directional selection (Fu 1997), but their combined application permits inferences on the patterns of selection affecting a specific gene.

Most genetic diversity studies in Drosophila and several other taxa primarily use closely related species pairs (Coyne and Orr 1989, 1997) rather than populations of the same species at an early stage of speciation (Harrison 1998). Although most interspecific studies provide information about the plausible correlation between genetic diversity and the strength of an isolating mechanism after speciation, they do not provide a clear picture of the early stages of speciation, i.e. of when particular isolating mechanisms first arose or how they relate to the degree of genetic diversity among genomes. Here, six altitudinally isolated populations of 26 drosophilid species using different ecological niches were analysed. We characterized evolutionarily diverging populations of these drosophilid species based on three mitochondrial genes with respect to both reproductive isolation and the degree of genetic differentiation among them. The pattern observed is generally considered an outcome of adaptive radiation, in which a rapid origin of species adapted to a diversity of ecological niches is followed by a slowdown in the diversification rate over time as the new niches become occupied (Gavrilets and Losos 2009; Yoder et al. 2010).

With the advent of a plethora of molecular taxonomic techniques, the recent sequencing of the genomes of several Drosophila species and their expanding role in evolutionary biology, it was important to present the Drosophila research community with a better taxonomic framework for the common and novel species that we have reported from the Uttarakhand region. Considering the high rate of discovery of novel drosophilid species from this region, we employed molecular characterization tools for the first time to speed up the recognition of new units that represent undescribed species. Molecular data for some rare species such as D. mukteshwarensis, Liodrosophila nitida, Lordiphosa parantillaria, Lordiphosa ayarpathaensis, Scaptomyza himalayana, Scaptomyza tistai, Zaprionus grandis and Stegana minuta are now available in the public domain for the first time through this effort, which should not only strengthen the drosophilid taxonomic database of the region but may also help us to better understand the ecological adaptations that occurred over the history of this group.