Introduction

Pakistan has five breeds of water buffalo: Nili, Ravi, Nili-Ravi, Kundi, and Azi-Kheli. These animals are raised in small herds (1–10 animals/herd), mostly in rural areas (Khan et al., 2007). The Azi-Kheli buffalo is a native breed of Pakistan bred in different areas of the Swat, Shangla, Malakand, Dir, and Bajur districts of Khyber Pakhtunkhwa (Khan et al., 2013). The breed was included for the first time in the 2006 census report, in which it accounted for 2.9% of the total provincial buffalo population, numbering 107,000 animals, with a declining population trend (Khan, 2003; Khan et al., 2007).

The ancestor of the present-day domestic water buffalo is Bubalus arnee, a wild buffalo found in Nepal, Bhutan, Thailand, and a few areas of India (Scherf, 2001). The domestic water buffalo, Bubalus bubalis, is thought to have been domesticated as early as 5000 years ago in the Indus Valley civilization (Cockrill, 1981) and 7000 years ago in China (Chen and Li, 1989). There are two hypotheses about the domestication of water buffalo: one holds that river and swamp buffaloes were domesticated independently, while the other holds that both types resulted from a single domestication event (Kierstein et al., 2004).

In eukaryotic cells, mitochondria are abundant organelles located in the cytoplasm and are termed the powerhouse of the cell (Mandal et al., 2011). In addition to the nuclear genome, cells carry a small mitochondrial genome that is maternally transmitted, lacks genetic recombination, and is regarded as a valuable tool for phylogenetic studies among species or among breeds within a species (Chen et al., 2013; Zinovkina, 2018). DNA analysis, especially of the mitochondrial genome, is used to trace the past domestication and diversity of mammalian species (Robinson et al., 2010; Gonzalez-Freire et al., 2015). A small but notable number of mitochondrial genome studies on the phylogeny of the domestic water buffalo (Bubalus bubalis) have been reported (Kierstein et al., 2004).

The CO1 gene was first investigated as a DNA barcoding marker for species identification and phylogenetic analysis by Hebert et al. (2003). The nucleotide sequence of the CO1 gene is expected to be nearly identical within a species (Hebert et al., 2004). Besides its genes, mitochondrial DNA contains a large non-coding region (~1150 bp) called the D-loop (Shadel and Clayton, 1997). The D-loop contains hypervariable regions that have a high mutation rate compared with the rest of the mitochondrial genome, and these regions play a key role in the phylogenetic analysis of eukaryotes (Lang et al., 1999). Data obtained from genetic methodologies, such as genetic information, evolution, and phylogeny, are essential for conservation management and species monitoring (Schwartz et al., 2007). In Pakistan, phylogenetic analysis using the D-loop has been attempted in Kundi (Hussain et al., 2009) and in Nili, Ravi, and Nili-Ravi buffaloes (Zahoor et al., 2016; Bhatt et al., 2020). Moreover, genetic relationship and diversity analyses using the D-loop have been reported for the same breeds in comparison with Indian riverine and Chinese swamp buffalo sequences retrieved from GenBank (Bhatt et al., 2020). Our project, “Characterization of cattle genetic resources of Khyber Pakhtunkhwa through Genetic Markers and Molecular techniques,” aims to characterize cattle breeds native to the Khyber Pakhtunkhwa Province of Pakistan. Four buffalo breeds, i.e., Kundi, Nili, Ravi, and Nili-Ravi, are native to the Punjab and Sindh provinces of Pakistan and were therefore excluded from the current study. The Azi-Kheli buffalo breed, being native to Khyber Pakhtunkhwa Province and within the mandate of the project, was selected to investigate its origin and level of genetic diversity using CO1 and the D-loop.

Materials and methods

Sample collection and DNA extraction

For this research, the breeding tract of the Azi-Kheli buffalo, district Swat in the northern belt of Khyber Pakhtunkhwa Province, was visited twice, at the start and end of February 2020. A small compact body, good adaptation to mountain slope grazing, a predominantly brown coat color, and sickle-shaped horns are the typical morphometric features of the Azi-Kheli buffalo breed (Khan et al., 2013). A team of experts from LR&DS, Charbagh, District Swat, and the Azi-Kheli Buffalo Improvement and Development Farm, Charbagh, District Swat (34.8346° N, 72.5441° E) accompanied the sampling to ensure that only pure Azi-Kheli animals were sampled. Moreover, a history-based pedigree was recorded from the owner of each animal to confirm that it was purebred and not cross-bred. A total of 30 unrelated Azi-Kheli buffaloes, including bulls, cows, and calves, were sampled for blood collection from different herds. The skin over the jugular vein of each animal was disinfected, the vein was punctured with a sterile disposable syringe, and 3 ml of blood was collected in EDTA tubes (REF-XLGA-E3K3, Xinle®, China). DNA was extracted from the blood samples by the non-enzymatic salting-out method described by Suguna et al. (2014). The research work was conducted at the Genomic Laboratory, Centre of Microbiology and Biotechnology (CMB), Veterinary Research Institute (VRI), Peshawar (34.0170° N, 71.5699° E), from February 2020 to November 2020.

PCR amplification of CO1 and D-loop

The partial region of the CO1 gene was amplified using the primers F 5′-TCTCAACCAACCATAAAGATATCGG-3′ and R 5′-TATACTTCAGGGTGTCCGAAGAATCA-3′ (accession no. AF547270.1), whereas the whole D-loop region was amplified using the primers F 5′-TAGTGCTAATACCAACGGCC-3′ and R 5′-AGGCATTTTCAGTGCCTTGC-3′ (accession no. AY488491.1); the primers were designed from sequences in the NCBI database (www.ncbi.nlm.nih.gov). The designed primers were sent to Macrogen®, Korea, for synthesis. A 25-µl PCR reaction was prepared for each sample in a PCR tube by adding 5 µl of master mix (Cat. No. SM213-0250, GeneDireX, Inc.), 1.5 µl of forward primer, 1.5 µl of reverse primer, 12 µl of PCR water, and 5 µl of template DNA. The reaction was carried out in a BIORAD® thermal cycler using the following protocol: initial denaturation at 94 °C for 5 min; 34 cycles of denaturation at 94 °C for 30 s, annealing at 58 °C and 59 °C for 30 s, and extension at 72 °C for 1 min; and a final extension at 72 °C for 10 min.
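The reaction set-up above can be checked with a few lines of code. The following is a minimal sketch, not part of the original protocol: it confirms that the listed component volumes add up to the stated 25 µl and reports the length and GC content of each primer. The dictionary and function names are hypothetical and chosen only for illustration.

```python
# Reaction components in microlitres, as listed in the text.
REACTION_UL = {
    "master_mix": 5.0,
    "forward_primer": 1.5,
    "reverse_primer": 1.5,
    "pcr_water": 12.0,
    "template_dna": 5.0,
}

# Primer sequences (5' to 3') as given in the text; the keys are illustrative labels.
PRIMERS = {
    "CO1_F": "TCTCAACCAACCATAAAGATATCGG",
    "CO1_R": "TATACTTCAGGGTGTCCGAAGAATCA",
    "Dloop_F": "TAGTGCTAATACCAACGGCC",
    "Dloop_R": "AGGCATTTTCAGTGCCTTGC",
}


def total_volume(components):
    """Sum the component volumes of one reaction (µl)."""
    return sum(components.values())


def gc_percent(seq):
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def reverse_complement(seq):
    """Reverse complement of a DNA sequence containing only A, C, G, T."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


if __name__ == "__main__":
    print(f"Total reaction volume: {total_volume(REACTION_UL)} µl")
    for name, seq in PRIMERS.items():
        print(f"{name}: length {len(seq)} nt, GC {gc_percent(seq):.1f}%")
```

The reverse-complement helper is included because a reverse primer binds the opposite strand, so its reverse complement is what appears in the reference sequence.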

Sequencing of CO1 gene and D-loop

The two target regions of the isolated DNA, the complete D-loop and the partial CO1 gene, were amplified by PCR using specific primers. Sequencing of the mitochondrial CO1 gene and D-loop was performed by the Sanger method, as described by Sanger et al. (1977). Both target regions were subjected to chain-termination PCR, in which millions to billions of copies were terminated at random lengths by ddNTPs. The terminated fragments were then separated by size using gel electrophoresis. Finally, the DNA sequence was read from the fluorescent tags by automated Sanger sequencing, and the results were generated in AB1 format.

Data analysis

The reference sequences of the CO1 gene (accession no. AF547270.1) and D-loop (accession no. AY488491.1) were downloaded from NCBI GenBank (www.ncbi.nlm.nih.gov). The CO1 and D-loop sequences from the current study were compared with their respective reference sequences to detect single-nucleotide polymorphism (SNP) positions. Identical sequences were treated as a single haplotype. Haplotype diversity and nucleotide diversity were also estimated using DnaSP 6.0 software (Rozas et al., 2017).
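As a rough illustration of the two summary statistics named above, the sketch below computes haplotype diversity as Hd = n/(n - 1) × (1 - Σp_i²), where p_i is the frequency of haplotype i among n sequences, and nucleotide diversity (π) as the mean number of pairwise differences per site. It assumes aligned, gap-free sequences of equal length; the function names and toy sequences are hypothetical, and the way DnaSP handles gaps and missing data is not reproduced here.

```python
from collections import Counter
from itertools import combinations


def _check_alignment(seqs):
    """Require at least two sequences of identical length."""
    if len(seqs) < 2:
        raise ValueError("at least two sequences are required")
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")


def haplotype_diversity(seqs):
    """Hd = n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    _check_alignment(seqs)
    n = len(seqs)
    counts = Counter(seqs)  # identical sequences collapse into one haplotype
    return n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts.values()))


def nucleotide_diversity(seqs):
    """Mean number of pairwise differences per site (pi)."""
    _check_alignment(seqs)
    n, length = len(seqs), len(seqs[0])
    total_diffs = sum(
        sum(a != b for a, b in zip(s, t)) for s, t in combinations(seqs, 2)
    )
    n_pairs = n * (n - 1) / 2
    return total_diffs / (n_pairs * length)


def polymorphic_sites(seqs):
    """1-based alignment positions at which the sequences differ."""
    _check_alignment(seqs)
    return [i + 1 for i, col in enumerate(zip(*seqs)) if len(set(col)) > 1]


if __name__ == "__main__":
    toy = ["AAAA", "AAAA", "AAAT", "AATT"]  # illustrative data only
    print(polymorphic_sites(toy))
    print(round(haplotype_diversity(toy), 4))
    print(round(nucleotide_diversity(toy), 4))
```

Because Hd depends only on haplotype frequencies while π also depends on how many sites separate the haplotypes, a sample can show high Hd together with low π when many haplotypes differ at only a few positions.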

Multiple sequence alignment was conducted with the ClustalW method in MEGA X software (Kumar et al., 2018). Two phylogenetic trees were constructed. The first tree was constructed from the CO1 haplotypes by the maximum likelihood method, while the second was constructed from the D-loop haplotypes by the neighbor-joining method, both in MEGA X (Kumar et al., 2018). The Kimura 2-parameter method was used in the same software to estimate genetic distances for the CO1 gene, both within the current study and against other related species. Similarly, the nucleotide composition and nucleotide pair frequencies of the D-loop sequences of the current study were estimated in MEGA X (Kumar et al., 2018).
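For readers unfamiliar with the Kimura 2-parameter (K2P) distance, a minimal sketch of the standard formula follows: d = -1/2 ln[(1 - 2P - Q) √(1 - 2Q)], where P and Q are the proportions of sites that differ by a transition and by a transversion, respectively. The function name is hypothetical, sites containing anything other than A, C, G, or T are simply skipped (an assumption of this sketch), and MEGA X options such as gap treatment or rate variation among sites are not reproduced.

```python
import math

PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases (sketch assumption)
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent
    # for the correction to be defined.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


if __name__ == "__main__":
    # Illustrative sequences only: one transition in ten sites.
    print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"), 4))
```

The correction makes the distance slightly larger than the raw proportion of differing sites, because it allows for multiple substitutions at the same position.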

Results and discussion

Polymorphism (SNPs) in CO1 gene

The sequencing results of the CO1 gene showed four polymorphic sites, at positions 119, 231, 453, and 644. The average A/T and G/C contents of the CO1 gene were 55.3% and 44.7%, respectively. A higher A/T content (54.2%) and lower G/C content (44.8%) in the CO1 gene of lowland anoa (genus Bubalus) have also been reported by Priyono et al. (2018). Multiple sequence alignment of the CO1 sequences with the reference sequence (AF547270.1) revealed four haplotypes, designated Azi-Kheli haplotypes one to four (AZHap1 to AZHap4), as shown in Table 1. This small number of haplotypes indicates a low level of polymorphism in the CO1 gene; the four polymorphic sites are comparable with those reported in Egyptian river buffalo (Hassan et al., 2018). The CO1 gene also showed very low nucleotide variation compared with another mitochondrial region in Egyptian river buffalo, in which Hassan et al. (2009) found 77 variable sites in the mitochondrial D-loop.

Table 1 The identified haplotypes of the CO1 gene relative to the reference sequence from the buffalo whole mitochondrial genome, obtained with DnaSP 6.0 software; dots (.) show bases identical to the reference sequence. AZ = Azi-Kheli, Hap = haplotype
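The dot notation used in Table 1 can be generated from aligned sequences as in the sketch below. It is an illustration only: the function names, the haplotype labels, and the toy sequences are hypothetical, and the sequences are assumed to be aligned to the reference with no gaps.

```python
def variable_sites(reference, haplotypes):
    """1-based positions where any haplotype differs from the others or the reference."""
    columns = zip(reference, *haplotypes.values())
    return [i + 1 for i, col in enumerate(columns) if len(set(col)) > 1]


def dot_table(reference, haplotypes):
    """Build rows showing only variable sites, with '.' for bases matching the reference."""
    sites = variable_sites(reference, haplotypes)
    rows = {"Reference": "".join(reference[s - 1] for s in sites)}
    for name, seq in haplotypes.items():
        rows[name] = "".join(
            "." if seq[s - 1] == reference[s - 1] else seq[s - 1] for s in sites
        )
    return sites, rows


if __name__ == "__main__":
    # Toy data for illustration; not the sequences from this study.
    ref = "ACGTAC"
    haps = {"Hap1": "ACGTAC", "Hap2": "ATGTAC", "Hap3": "ATGTAG"}
    sites, rows = dot_table(ref, haps)
    print("Sites:", sites)
    for name, row in rows.items():
        print(f"{name:<10}{row}")
```

Showing only the variable columns keeps such a table compact even when the alignment spans several hundred bases.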

Divergence of CO1 gene between Azi-Kheli buffalo and related buffalo species

Inter-specific divergence was determined by the Kimura 2-parameter method (Table 2). The divergence between the CO1 gene of Azi-Kheli buffalo and that of closely related species, i.e., swamp buffalo and lowland anoa, ranged from 0.024 to 0.027 (average 0.0256, i.e., 2.56%), while it was 11.8% with African buffalo and 16.6% with domestic goat. In contrast, intra-specific diversity of the CO1 gene within Azi-Kheli buffalo ranged from 0.001 to 0.004, with an average of 0.25%. A previous study of CO1 sequences recommended 2.5% sequence divergence for species identification (Tobe et al., 2010), and another reported that more than 2% nucleotide sequence polymorphism in CO1 is sufficient to identify animal species (Yan et al., 2013). The inter-specific divergence found in the current study (2.56%) is above both of these previously reported thresholds. In the current study, the CO1 nucleotide sequence thus identified Azi-Kheli buffalo as a river buffalo (Bubalus bubalis) that is distinct from three related types of buffalo, i.e., swamp buffalo (Bubalus bubalis carabanensis), lowland anoa (Bubalus depressicornis), and African buffalo (Syncerus caffer).

Table 2 Diversity in Azi-Kheli buffalo and other buffalo types based on the K2P method. Bold values differentiate Azi-Kheli from swamp buffalo, African buffalo, and anoa

Phylogenetic tree construction through CO1 gene

A phylogenetic tree was constructed by the maximum likelihood method with the Tamura-Nei model from the haplotypes of the current study and published sequences of related species (Table 3). The tree revealed four clades. The first clade consisted of the Azi-Kheli buffalo haplotypes (AZHap1 to AZHap4) together with river buffalo haplotypes, the second of lowland anoa, the third of swamp buffalo, and the fourth of African buffalo. The tree was rooted with Capra aegagrus hircus, which is also a member of the family Bovidae, as shown in Fig. 1. A similar investigation by Hassan et al. (2018) in Egyptian water buffalo revealed distinct clustering of river buffalo, swamp buffalo, and lowland anoa clades.

Table 3 Specimens with scientific names and NCBI GenBank accession numbers used in construction of the phylogenetic tree
Fig. 1 Azi-Kheli buffalo phylogenetic tree constructed by the maximum likelihood method and Tamura-Nei model, with the addition of published haplotypes of related species

Polymorphism (SNP) in D-loop

A total of 28 variable sites were identified in the D-loop sequences. A similarly high level of variation in the D-loop of water buffaloes has been reported by Kierstein et al. (2004) (128 variable sites in 36 haplotypes). The ratio of transitions to transversions in the D-loop of Azi-Kheli buffalo was 10.7:1, showing a bias towards transitions. Such a bias has also been reported in previous studies: the ratio was 10:1 in Kundi buffalo (Hussain et al., 2009), 17:1 in Indian river buffaloes (Kumar et al., 2007), and 4.8:1 in Egyptian river buffalo (Hassan et al., 2009). The average A/T and G/C contents were 59.9% and 40.1%, respectively. Similar base compositions with high A/T content have been reported by Kierstein et al. (2004): 59.93% in Mediterranean buffalo, 59.6% in Murrah buffalo, and 61.7% in cattle, whereas the A/T content was lower in swamp buffalo (58.3%).
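A transition-to-transversion ratio of this kind can be approximated by counting nucleotide pairs across all pairwise sequence comparisons, as in the following sketch. This simple count is an assumption made for illustration: the ratio reported by MEGA X may be estimated differently, so the sketch is not expected to reproduce the 10.7:1 value. The function names and toy sequences are hypothetical.

```python
from itertools import combinations

# A<->G and C<->T are transitions; every other mismatch is a transversion.
TRANSITIONS = {frozenset("AG"), frozenset("CT")}


def pair_frequencies(seqs):
    """Count identical, transitional, and transversional site pairs over all sequence pairs."""
    identical = transitions = transversions = 0
    for s, t in combinations(seqs, 2):
        for a, b in zip(s.upper(), t.upper()):
            if a not in "ACGT" or b not in "ACGT":
                continue  # skip gaps and ambiguous bases
            if a == b:
                identical += 1
            elif frozenset((a, b)) in TRANSITIONS:
                transitions += 1
            else:
                transversions += 1
    return identical, transitions, transversions


def ts_tv_ratio(seqs):
    """Ratio of transitional to transversional pairs."""
    _, transitions, transversions = pair_frequencies(seqs)
    if transversions == 0:
        raise ValueError("no transversions observed; ratio is undefined")
    return transitions / transversions


if __name__ == "__main__":
    toy = ["AAAA", "GAAA", "GAAC"]  # illustrative data only
    print(pair_frequencies(toy))
    print(ts_tv_ratio(toy))
```

With only four nucleotides there are twice as many possible transversions as transitions, so any ratio well above 0.5 already indicates a bias towards transitions.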

Multiple sequence alignment of the D-loop sequences with the reference sequence showed a total of five haplotypes, with identical sequences treated as the same haplotype, as shown in Table 4. The haplotype diversity (Hd) was 0.9601 ± 0.096 (SD) and the nucleotide diversity (π) was 0.01208 ± 0.00182 (SD). Similar D-loop haplotype and nucleotide diversities have been reported in Nili-Ravi buffalo (Hd = 0.9561 ± 0.00010, π = 0.00988 ± 0.00060) and in Kundi buffalo (Hd = 0.9386 ± 0.00020, π = 0.01043 ± 0.00083) (Bhatt et al., 2020), while among swamp buffalo, Hd = 0.900 ± 0.161 and π = 0.00328 ± 0.00219 have been reported in Xinglong buffalo by Lei Chu-Zhao (2007).

Table 4 The identified haplotypes of the D-loop relative to the reference buffalo mitochondrial genome (accession no. AY488491.1); dots (.) show bases identical to the reference

Phylogenetic tree construction through D-loop

A phylogenetic tree was constructed by the neighbor-joining method from the D-loop haplotypes of Azi-Kheli buffalo and published haplotypes of river buffalo and swamp buffalo, as shown in Table 5. The tree showed that the newly found D-loop haplotypes of Azi-Kheli buffalo (AZHap1 to AZHap5) intermingled with those of other river buffalo breeds from different geographic regions of the world and formed a single clade. The swamp buffalo haplotypes clustered in a separate clade that was distinct from the river buffalo clade, as shown in Fig. 2. The current study supports the clustering pattern reported in Kundi river buffalo (Hussain et al., 2009) and Egyptian river buffalo, whose haplotypes intermingled with those of river buffalo from other geographical regions of the world (Hassan et al., 2009). A similar clustering pattern has also been reported in Indian river buffalo (Kumar et al., 2007). The mitochondrial D-loop results of the current study also indicate separate origins of swamp and river buffalo.

Table 5 Specimens with scientific names and NCBI GenBank accession numbers used in construction of the D-loop phylogenetic tree
Fig. 2 Neighbor-joining tree constructed from newly found D-loop haplotypes and published D-loop haplotypes of river buffaloes, swamp buffaloes, and African buffaloes, with Bos taurus as outgroup

Conclusions

Azi-Kheli buffalo has high genetic diversity, belongs to the river buffalo (Bubalus bubalis), and is distinct from the swamp buffalo (Bubalus bubalis carabanensis), with a CO1 divergence (2.56%) above the recommended threshold for species identification. The maximum likelihood tree also places Azi-Kheli buffalo within the river buffalo clade and apart from the swamp buffalo clade, whereas the D-loop neighbor-joining tree indicates a common origin of Azi-Kheli buffalo with river buffalo breeds from other parts of the world.