Introduction

The fitness of an organism and its potential for evolutionary interactions with pathogens or other species are related to immunological functions (Lazzaro and Little 2009). The extent of genetic diversity is known to be associated with the capacity to adapt and evolve in response to environmental changes (Reed and Frankham 2003). Genetic diversity is pivotal for immune functions and can be associated with resistance or susceptibility to pathogens (Tibayrenc 2004; Trowsdale and Parham 2004). A cluster of associated genes, the major histocompatibility complex (MHC), plays a key role in presenting antigenic peptides to T lymphocytes (Klein 1986). MHC genes are known to be the most variable genes in the vertebrate genome; this variation seems to be maintained by balancing selection, predating speciation events and reflecting the coevolution of hosts with their pathogens (Bernatchez and Landry 2003).

The MHC class II genes encode the \(\upalpha \) and \(\upbeta \) chains of the DR and DQ dimer molecules, which present antigenic peptides to helper T cells (McKinney et al. 2013). In rat, mouse, rabbit and pig, there is a single DQ gene, whereas in dogs and humans multiple DQ gene copies have been identified but only one of them appears to be expressed (Kappes and Strominger 1988; Trowsdale 2001). The number of DQ loci in ruminants varies among species; e.g., haplotypes in cattle and buffalo contain two copies of the DQ genes (Andersson and Rask 1988; Sigurdardottir et al. 1992; Sena et al. 2011) and both genes are expressed (Russell et al. 1997). Hence, polymorphism as well as duplication of the DQ genes increases the diversity at the cell surface through interhaplotype and intrahaplotype pairing of \(\upalpha \) and \(\upbeta \) chains during dimerization. Functional restriction elements are formed by interhaplotype combination of DQA and DQB molecules with duplicated DQA haplotypes (Glass et al. 2009).

Gayal or mithun (Bos frontalis) is a natural inhabitant of hilly forests. In India, it is kept by ethnic groups living in the hills of Tripura, Mizoram, Assam, Arunachal Pradesh, Nagaland and the Chittagong hill tracts. Gayal are also found in the Trung and Salween river basins of northern Burma and in Yunnan province of China (Simoons 1984). Gayal is a more important source of meat in these areas than other cattle and is considered to be highly resistant to diseases (Rajkhowa et al. 2004). Gayal normally feed on local bamboo, the leaves of other plants and grasses, and can adapt to a wide range of harsh environments, from cold to tropical regions (Zhao et al. 2003; Xi et al. 2007). However, its genetic composition has been controversial, as many biologists have regarded the gayal as the domestic gaur because of the morphological similarity between the gayal (B. frontalis) and the gaur (B. gaurus) (Walker et al. 1968; Lan et al. 1993; Nie et al. 1995). The findings of karyotyping, mtDNA and Y-chromosome analyses have made the picture somewhat more complex. Nevertheless, most studies have suggested that the gaur is one of the immediate ancestral species of gayal (Nie et al. 1995; Verkaar et al. 2004; Chi et al. 2005; Gou et al. 2010; Sun et al. 2014). A recent investigation of Yunnan gayal suggested that the maternal lineages of both Yunnan gayal and cattle were an admixture of B. indicus and B. taurus, while the Y-chromosomal phylogeny indicated that their paternal lineages are mostly B. frontalis and B. indicus, respectively (Gou et al. 2010).

In the present study, we isolated and characterized two cDNAs, DQA1 and DQA2, from gayal and compared them with homologous MHC sequences from other animal species, with a view to identifying genetic factors involved in disease resistance and susceptibility. This work may strengthen our understanding of disease control in pet animals as well as of MHC diversity in common ruminants. The study will also help to open new avenues for investigating immunological functions and the selective and evolutionary forces that affect MHC variation within and between species.

Materials and methods

Liver samples were collected from three healthy gayal (B. frontalis) at the National Jiumudang Stud Gayal Farm, Gongshan, China. RNA was extracted using a commercial kit (Tiangen Biotech, Beijing, China). The extracted RNA was incubated with DNase I to remove contaminating DNA. The cDNA was synthesized using the RevertAid First Strand cDNA synthesis kit (Fermentas, Ontario, Canada), following the manufacturer's protocol.

The Bofr-DQA1 (784 bp) and Bofr-DQA2 (801 bp) fragments were amplified from gayal template cDNA using three primers, i.e. A1A2F, A1R and A2R, published previously for swamp buffalo (Niranjan et al. 2009). The forward primer (A1A2F: 5\(^{\prime }\)-ACCTTGAGAAGAGGATGGTCCTG-3\(^{\prime }\)) was located in a consensus region shared by both genes. The two reverse primers (A1R: 5\(^{\prime }\)-ATTGCACCTTCCTTCTGGAGTGT-3\(^{\prime }\) and A2R: 5\(^{\prime }\)-TCATAGATCGGCAGAACCACCTT-3\(^{\prime }\)) were specific for DQA1 and DQA2, respectively. Thus, the primer pair A1A2F and A1R amplified the Bofr-DQA1 fragment, and the primer pair A1A2F and A2R amplified the Bofr-DQA2 fragment. The polymerase chain reaction (PCR) was performed on a Bioer Life Express Thermocycler in a reaction volume of \(25\, \mu \hbox {L}\), containing \(2.0\, \mu \hbox {L}\) template cDNA, \(12.5\, \mu \hbox {L}\) PCR Power Mix, \(1.0\, \mu \hbox {L}\) of each primer (\(10\,\, \hbox {pmol}\,\, \mu \hbox {L}^{-1}\)) and \(8.5\, \mu \hbox {L}\) double-distilled water. The PCR programme consisted of an initial denaturation at \(94^{\circ }\hbox {C}\) for 3 min, followed by 35 cycles of \(94^{\circ }\hbox {C}\) for 1 min, \(59^{\circ }\hbox {C}\) for 45 s and \(72^{\circ }\hbox {C}\) for 45 s, with a final extension at \(72^{\circ }\hbox {C}\) for 10 min. Finally, the PCR products were sequenced bidirectionally using an ABI 3730 DNA Analyzer (Applied Biosystems, Foster City, USA).
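The primer set and reaction mix described above can be sanity-checked with a few lines of Python. This is only an illustrative sketch: the primer sequences and volumes are taken from the text, while the function names and the assumption that each tube holds exactly one forward and one reverse primer are ours.

```python
# Primer sequences (5'->3') exactly as listed in the text.
PRIMERS = {
    "A1A2F": "ACCTTGAGAAGAGGATGGTCCTG",  # shared forward primer
    "A1R": "ATTGCACCTTCCTTCTGGAGTGT",    # DQA1-specific reverse primer
    "A2R": "TCATAGATCGGCAGAACCACCTT",    # DQA2-specific reverse primer
}

# Per-tube reaction mix in microlitres. Assumption (ours): one forward
# and one reverse primer per tube, 1.0 uL each.
REACTION_UL = {
    "template_cDNA": 2.0,
    "PCR_Power_Mix": 12.5,
    "forward_primer": 1.0,
    "reverse_primer": 1.0,
    "ddH2O": 8.5,
}


def gc_count(seq):
    """Number of G or C bases in a primer."""
    return sum(base in "GC" for base in seq.upper())


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def total_volume(mix):
    """Sum of all component volumes in the reaction mix."""
    return sum(mix.values())


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        gc_pct = 100.0 * gc_count(seq) / len(seq)
        print(f"{name}: {len(seq)} nt, GC = {gc_pct:.1f}%")
    print("Total reaction volume (uL):", total_volume(REACTION_UL))
```

Under the stated assumption, the component volumes add up to the reported reaction volume.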

Table 1 Sequence comparison of the \(\upalpha 1, \upalpha 2\), CP/TM/CY motifs between the Bofr-DQA1/DQA2 and BoLA-DQA1/DQA2 genes.

The cDNA sequences were translated into amino acid sequences using GenScan software (http://genes.mit.edu/GENSCAN.html) and compared with the orthologous sequences. The theoretical isoelectric point (pI) and molecular weight (Mw) of the two putative gayal proteins were computed using the online pI/Mw tool (http://www.expasy.org/tools/pi_tool.html). Sequence predictions were made using the open reading frame (ORF) Finder software (http://www.ncbi.nlm.nih.gov/projects/gorf/), and a neighbour-joining phylogenetic tree was constructed using MEGA software (Tamura et al. 2007) based on the coding regions of orthologous DQA alleles from different species. The ratios of nonsynonymous (\(d_{\mathrm{n}}\)) to synonymous (\(d_{\mathrm{s}}\)) substitutions between gayal and other livestock species for the DQA1 and DQA2 genes were estimated using PAML (Yang 2007), and the changes that have altered amino acids in other livestock relative to gayal were examined using the web version of PAL2NAL (http://www.bork.embl.de/pal2nal/).

Results and discussion

We searched for the sequences most homologous to the Bofr-DQA1 and Bofr-DQA2 genes using the BLAST tool on the NCBI server (http://www.ncbi.nlm.nih.gov/BlAST). The similarity search revealed that the two genes were not similar to any known gayal gene but possess high similarity to other ruminant genes. The nucleotide sequences of DQA1 and DQA2 were deposited in the NCBI GenBank database under accession numbers KT318732 and KT318733, respectively. The sequences were also deposited in the Immuno Polymorphism Database (www.ebi.ac.uk/ipd/mhc/bola/nomenclature) and assigned the official names Bogr-DQA*0101 (for Bofr-DQA1) and DQA*2001 (for Bofr-DQA2). Sequence prediction showed that the 784 bp and 801 bp cDNA sequences represent two separate genes, each with an ORF of 768 bp encoding a polypeptide of 255 amino acid residues. The computed pI values of the putative Bofr-DQA1 and Bofr-DQA2 proteins were 4.93 and 4.84, respectively. The computed Mw values of the two putative proteins were 28298.34 and 27953.88 Da for Bofr-DQA1 and Bofr-DQA2, respectively.
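The relationship between the ORF length and the polypeptide length can be illustrated with a short sketch. A 768 bp ORF corresponds to 256 codons, which is consistent with 255 residues if the terminal stop codon is counted in the ORF length (our reading; the text does not state this explicitly). The toy sequence and the function names below are hypothetical and are not the gayal sequences.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def residues_from_orf(orf_bp):
    """Residues encoded by an ORF whose length includes the stop codon."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_bp // 3 - 1


def translate_first_orf(cdna):
    """Translate from the first ATG to the first in-frame stop codon.

    Returns an empty string if no ATG is present. Translation simply
    stops at the end of the sequence if no stop codon is found.
    """
    seq = cdna.upper()
    start = seq.find("ATG")
    if start == -1:
        return ""
    protein = []
    for pos in range(start, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[pos:pos + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


if __name__ == "__main__":
    print(residues_from_orf(768))                   # residues for a 768 bp ORF
    print(784 - 768, 801 - 768)                     # untranslated flanking bases
    print(translate_first_orf("CCATGGTCCTGTGAAA"))  # hypothetical toy cDNA
```

The same arithmetic shows that the two amplified fragments carry 16 bp and 33 bp of sequence outside the ORF, respectively.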

Comparison of the Bofr-DQA nucleotide sequences with the BoLA-DQA genes showed that Bofr-DQA1 and Bofr-DQA2 share 91 and 100% sequence identity with BoLA-DQA1 and BoLA-DQA2, respectively (table 1). However, the nucleotide sequence identity between Bofr-DQA1 and Bofr-DQA2 was only 88%. These findings are consistent with the study conducted by Niranjan et al. (2009) on water buffalo, in which the Bubu-DQA genes showed higher identity with the corresponding cattle genes (93.9 and 97.7%) than the buffalo DQA1 and DQA2 genes showed with each other (85.7%).
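As an illustration of how such identity percentages are typically obtained, the sketch below computes percent identity between two pre-aligned sequences. The text does not state which convention was used, so the handling of gap columns here (skipped by default) is an assumption, and the example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, ignore_gaps=True):
    """Percent identity between two aligned sequences of equal length.

    Columns containing a gap ('-') in either sequence are skipped when
    ignore_gaps is True; otherwise they are counted as mismatches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        has_gap = a == "-" or b == "-"
        if has_gap and ignore_gaps:
            continue
        compared += 1
        if a == b and not has_gap:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical toy alignment, not the gayal or cattle sequences.
    seq_a = "ACGTACGT-A"
    seq_b = "ACGTTCGTGA"
    print(round(percent_identity(seq_a, seq_b), 1))
    print(round(percent_identity(seq_a, seq_b, ignore_gaps=False), 1))
```

Because the two conventions can give different values for the same alignment, identities reported by different studies are strictly comparable only when the gap treatment is the same.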

Fig. 1

An alignment between the amino acid sequences of Bofr-DQA and orthologous DQA sequences. The arrows indicate the amino acid positions constituting part of the peptide-binding sites (PBS). The putative N-linked glycosylation sites are underlined. The square (\(\blacksquare \)) indicates the position of residues associated with binding of CD4\(^{+}\) molecules. A point (\(\cdot \)) indicates amino acid identity and a hyphen (-) indicates a gap inserted to maximize the alignment. The reference GenBank accession numbers for DQA1 alignment are Y07898 (BoLA-DQA*0101), U80884 (BoLA-DQA*0102), U80872 (BoLA-DQA*0204), U80871 (BoLA-DQA*0401), AB257109 (BoLA-DQA*10011), Y07819 (BoLA-DQA*12011), D50454 (BoLA-DQA*12021), U80869 (BoLA-DQA*1401), DQ440647 (Bubu-DQA*0101) and M93430 (OLA-DQA1). The reference GenBank accession numbers for DQA2 alignment are Y07820 (BoLA-DQA*2201), D50045 (BoLA-DQA*22021), U80868 (BoLA-DQA*2401), Y14020 (BoLA-DQA*25012), Y14021 (BoLA-DQA*2602), Y14022 (BoLA-DQA*27012), AF037314 (BoLA-DQA*2801), DQ440648 (Bubu-DQA*2001), M93433 (OLA-DQA2) and AY464652 (CLA-DQA).

A total of 49 amino acid polymorphisms, resulting from 95 nucleotide polymorphisms within the coding regions, were observed when Bofr-DQA1 and Bofr-DQA2 were compared with other alleles (figure 1). Of these, 29 amino acid replacements, deduced from 51 of the nucleotide mutations, were found within the \(\upalpha 1\) domain encoded by exon 2. The remaining amino acid differences comprised four in the signal peptide (SP), 11 in the \(\upalpha 2\) domain, two in the connecting peptide (CP), two in the transmembrane (TM) region and one in the cytoplasmic (CY) domain. These results show that gayal has more amino acid substitutions than buffalo, in which 45 variable amino acid positions were reported (Niranjan et al. 2009).

Additionally, the peptide-binding sites (PBS, marked by green arrows), one N-glycosylation site (NFT) within the \(\upalpha 1\) domain and another (NIT) within the \(\upalpha 2\) domain, one intrapeptide disulphide bond and the CD4\(^{+}\) binding site (marked by a square) were identified, underlining the importance of maintaining the molecular conformation and function of these molecules against invading pathogens (Rudd et al. 1999). There were 20 PBS (figure 1), which are specific functional motifs that make contact with antigens (Brown et al. 1993; Kuduk et al. 2012). Only eight of these residues, at positions 11, 25, 29, 35, 57, 60, 63 and 70, were highly conserved between DQA1 and DQA2 homologues from different animal species. The other 12 PBS had different amino acids in the two polypeptide chains, suggesting that they could be associated with the adaptation of gayal to its specific environment. Moreover, Indian buffaloes have three additional rare polymorphisms, at positions 57 (hydrophilic > hydrophobic) and 36 and 94 (hydrophobic > hydrophilic), which reverse the water affinity of the residue (Niranjan et al. 2009). This may reflect the buffalo germplasm, because buffalo are well adapted to tropical areas (Perera 2011). The replacements within the \(\upalpha 1\) domain affect the antigen-binding groove and could reflect differential binding ability to wide profiles of pathogens in different environments during the evolution of livestock (Germain 1995; Williams et al. 2002). We conclude that the Bofr-DQA1 and -DQA2 genes are more similar to the corresponding cattle genes than to each other. Similarly, the low nucleotide sequence homology between Bofr-DQA1 and -DQA2, as well as the high proportion of nucleotide and amino acid substitutions, indicates that the two sequences are not allelic forms of a single gene. Our results also support the findings of Ballingall et al. (1998) that the bovine DQA3*01 and DQA3*02 sequences, as nonallelic types, have 92% nucleotide homology and a larger genetic distance within the two gene clusters.

Fig. 2

Phylogenetic tree based on DQA nucleotide sequences of gayal (neighbour-joining method).

The phylogenetic tree based on the nucleotide sequences shows that the DQA1 and DQA2 sequences from gayal and other ruminants split into two major clades, which further indicates that the gayal DQA1 and DQA2 sequences have evolved independently (figure 2). We infer from the results that gayal is genetically closer to cattle, which is in accordance with previous studies (He et al. 2014; Sun et al. 2014). Further, the large distance between the two clades indicates that Bofr-DQA1 and Bofr-DQA2 belong to two separate loci. It has been described previously that in cattle and buffalo the DQA genes are present in duplicated form and both can be expressed (Russell et al. 1997; Niranjan et al. 2009), which seems to be the case in gayal as well. These duplicated genes with different mutations could help to promote the immunological response as well as the environmental adaptation of gayal.

Several nonsynonymous changes in the livestock species relative to gayal have altered the amino acid sequences of both the DQA1 and DQA2 genes. Detailed results are presented in the electronic supplementary material at http://www.ias.ac.in/jgenet/ (see tables 1 & 2 for the \(d_{\mathrm{n}}/d_{\mathrm{s}}\) ratios and tables 3 & 4 for the synonymous/nonsynonymous substitutions).
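For readers less familiar with this measure, the ratio is conventionally denoted \(\omega \) and read against a value of 1. The thresholds below are the standard textbook interpretation and are given only as a reading aid; they are not results of the present study.

```latex
\[
\omega = \frac{d_{\mathrm{n}}}{d_{\mathrm{s}}}, \qquad
\begin{cases}
\omega > 1 & \text{consistent with positive (diversifying) selection},\\
\omega = 1 & \text{consistent with neutral evolution},\\
\omega < 1 & \text{consistent with purifying selection}.
\end{cases}
\]
```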

In conclusion, the Bofr-DQA1 and Bofr-DQA2 genes have been characterized, extending our understanding of MHC-DQA in rare ruminants. Like the DQA genes of other animals, Bofr-DQA1 and Bofr-DQA2 were highly variable, especially in the \(\upalpha 1\) domain, as in most ruminants. It would be interesting to clarify, in further studies, the effect of the mutations in Bofr-DQA1 and Bofr-DQA2 on pathogen resistance and adaptation in gayal.