Introduction

Natural killer (NK) cells express a diverse array of membrane-bound receptors that largely control the NK response to infection and malignancy. These receptors fall into two major groups, the immunoglobulin-like receptors, encoded by genes in the leukocyte receptor complex, and the lectin-like receptors, encoded by genes in the NK complex (Trowsdale 2001). Killer cell Ig-like receptors (KIR), originally described in primates (Vilches and Parham 2002), have now been identified in many species, but do not appear to be functional in rodents, which have a functionally equivalent family of genes (Ly49/KLRA1) encoding lectin-like receptors (Raulet et al. 1997). Potentially functional Ly49 genes were recently described in other mammalian species, e.g., cattle, cat, and dog (McQueen et al. 2002; Gagnier et al. 2003). Both of these sets of receptors primarily engage classical MHC class I molecules, and the expression of different combinations of activating and inhibitory receptors largely controls NK activation.

Another set of genes encoding lectin-like receptors was described in both primates and rodents; these are the CD94/NKG2 (KLRD1/KLRC) genes (Gunturi et al. 2004). CD94 is a single-copy, monomorphic gene in human and mouse (Vance et al. 1997; Lohwasser et al. 2000). Three NKG2 genes were identified encoding proteins that form heterodimers at the cell surface with CD94, namely, NKG2A/B, NKG2C, and NKG2E/H. NKG2F encodes a protein that is expressed intracellularly (Kim et al. 2004), and NKG2D encodes a protein with a distinct function that is expressed at the cell surface as a homodimer together with the adaptor molecule DAP10 (Gonzalez et al. 2006). NKG2B is an alternatively spliced form of NKG2A that lacks exon 4, encoding the stem region (Lieto et al. 2006); both isoforms carry two immunoreceptor tyrosine-based inhibitory motifs (ITIMs) and thus function in an inhibitory manner. NKG2C, NKG2E, and its isoform NKG2H do not have ITIMs and instead interact with the adaptor molecule DAP12 via a charged residue in the transmembrane (TM) domain. These receptors are assumed to function in an activating manner (Lanier et al. 1998). Alternatively spliced forms of all these genes were reported in primates. NKG2A is monomorphic in humans and shows very slight polymorphism in chimpanzees. NKG2C and NKG2E demonstrate very limited polymorphism in human and chimpanzee, and in the latter species the NKG2C gene is duplicated (Shum et al. 2002).

The CD94/NKG2A/B and -C receptors monitor cell status through expression of classical MHC class I genes, but in contrast to the KIR and Ly49 receptors they do not interact directly with these molecules. In human CD94/NKG2A/B and -C molecules interact with the nonclassical MHC class I molecule HLA-E (Braud et al. 1998), and in mice the ligand is also a nonclassical MHC class I molecule, Qa-1b (Vance et al. 2002). The HLA-E and Qa-1b genes appear to have evolved independently to have the same function: to encode molecules that bind and present peptides derived from the leader sequences of some classical MHC class I molecules (O’Callaghan et al. 1998). In the case of HLA-E, a suitable peptide can also be derived from the nonclassical HLA-G (Llano et al. 1998). Thus, if classical MHC class I expression is downregulated in a cell, for example, after viral infection, HLA-E may not find a suitable peptide and will not then be expressed at the cell surface. As with KIR/Ly49 and classical MHC class I, the combination of activating and inhibitory NKG2 genes expressed and the level of the nonclassical MHC class I ligand on the target cell surface determine the activation status of the NK cell. Because NK cells may be expressing both receptor types, in addition to others with non-MHC ligands, the true situation is very complex.

Cattle were recently shown to express both KIR and Ly49 genes (McQueen et al. 2002), and additional NK receptors were also cloned, for example, NKp46 (Storset et al. 2003). While a single cattle CD94 cDNA sequence was published (Storset et al. 2003), little is known about the NKG2 genes. The aim of this study was to extend our knowledge of these genes in cattle and to clone the genes to ultimately identify their ligands in this species. In this way, we hope to gain further insight into the evolution of NK receptor genes in mammals and to better understand their interactions with MHC genes.

Materials and methods

Animals, RNA extraction, and cDNA synthesis

The cattle used in this study were Holsteins and were part of the Institute for Animal Health herd. Four animals were selected to represent distinct lines with each animal homozygous for a particular MHC haplotype. Peripheral blood mononuclear cells (PBMCs) were obtained from venous blood by density gradient centrifugation. Polyadenylated mRNA was isolated from 5 × 10^6 cells using the Dynal mRNA DIRECT kit (Invitrogen, Paisley, UK). First-strand cDNA was synthesized from the mRNA using an oligo(dT)12-18 primer and Superscript II reverse transcriptase (Invitrogen).

Amplification and sequencing of cattle CD94 and NKG2 genes

Primers are detailed in Table 1. Full-length CD94 and NKG2 genes were amplified by PCR from cDNA using the following primer pairs: CD94 5′ HindIII and CD94 3′ XhoI for CD94; NKG2A 5′UTR 2 and NKG2A 3′UTR 2 for NKG2A-01; NKG2A 5′UTR 1 and NKG2A 3′UTR 1 for NKG2A-02 to -07; and NKG2C 5′ and NKG2A 3′UTR 2 for NKG2C.

Table 1 Sequence of primers (5′–3′)

PCR reactions were carried out on approximately 20 ng of cDNA template in a final volume of 25 μl containing 1× PCR buffer (20 mM Tris–HCl at pH 8.4, 50 mM KCl; Invitrogen), 2.5 mM of MgCl2, 0.25 mM each dNTP, 1 μM each primer, and 1.25 U of Taq polymerase (Invitrogen). The following thermal cycling profile was used: 95°C for 1 min; 38 cycles of 95°C for 20 s, 55°C for 20 s, 72°C for 45 s; followed by 72°C for 5 min. Thermal cycling was performed on a PTC-200 thermal cycler (MJ Research, Incline Village, NV, USA) set to use calculated reaction temperatures.
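As a rough check on run length, the cycling profile above can be written down as data and summed. This is a minimal sketch: the names `CYCLE_STEPS` and `total_hold_seconds` are illustrative only, and the total covers programmed hold times, not the ramping between temperatures that the instrument adds.

```python
# Thermal cycling profile from the text, as (temperature in °C, hold time in s).
INITIAL_DENATURATION = (95, 60)
CYCLE_STEPS = [(95, 20), (55, 20), (72, 45)]  # denature, anneal, extend
N_CYCLES = 38
FINAL_EXTENSION = (72, 300)


def total_hold_seconds(initial, cycle_steps, n_cycles, final):
    """Sum programmed hold times; ramp times between steps are not included."""
    per_cycle = sum(seconds for _temp, seconds in cycle_steps)
    return initial[1] + n_cycles * per_cycle + final[1]


total = total_hold_seconds(INITIAL_DENATURATION, CYCLE_STEPS, N_CYCLES, FINAL_EXTENSION)
print(f"{total} s (~{total / 60:.0f} min) of programmed hold time")
# prints "3590 s (~60 min) of programmed hold time"
```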

The 5′ rapid amplification of cDNA ends (5′ RACE) was carried out on polyadenylated mRNA using the 5′ RACE System, version 2.0 (Invitrogen). Briefly, cDNA was synthesized from mRNA as above except that an NKG2C-specific primer (NKG2C RACE1) was used instead of the oligo(dT) primer. The cDNA was then dC-tailed and PCR was carried out using a nested NKG2C-specific primer (NKG2C RACE2) and the anchor primer supplied with the kit. A further round of PCR was performed on the product, using another nested NKG2C-specific primer (NKG2C RACE4) and universal amplification primer supplied with the kit.

PCR products, including those generated by 5′ RACE, were purified from agarose gels using the QiaQuick gel extraction kit (Qiagen, UK) and cloned into pGEM-T (Promega, Southampton, UK). Individual clones were sequenced using forward and reverse vector primers with the GenomeLab Dye Terminator Cycle Sequencing Quick Start Kit (Beckman Coulter, Fullerton, CA, USA) and a CEQ 8000 Genetic Analysis System sequencer (Beckman Coulter, USA). All full-length sequences were confirmed by analysis of multiple clones.

Results

CD94

CD94 was shown to be conserved within species, and human and chimpanzee CD94 share all but one amino acid (Shum et al. 2002). A single cattle CD94 cDNA sequence was reported previously (AF422180; Storset et al. 2003). In the current study, primers based on that sequence were used to amplify CD94 from four animals. Sequence analysis revealed four distinct sequences, including the previously published sequence, that differ by between one and seven amino acids (EF081290, EF081291, and EF081292; Fig. 1). Most of the polymorphic residues are within the predicted carbohydrate recognition domain (CRD). We have numbered these sequences CD94-01 to CD94-04. Each of the four animals transcribed one or two of these, but none appeared to be consistently transcribed (Table 2). An apparently alternatively spliced form of CD94-04 was seen in both animals carrying this sequence. In this case, 62 nucleotides were missing, which are predicted to correspond to the 3′ portion of exon 4; this deletion results in a frame shift (data not shown). Similar alternative splicing was reported in chimpanzee CD94 (Shum et al. 2002). Two putative CD94 sequences were identified in the most recent draft of the cattle (Bos taurus) genome (http://www.hgsc.bcm.tmc.edu/projects/bovine/) located close together on chromosome 5 (Fig. 3), so it is possible that these four sequences derive from two genes, but we have no formal evidence for this at present.
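The frame-shift inference for the 62-nucleotide deletion follows from simple arithmetic: a deletion preserves the reading frame only if its length is a multiple of three. A minimal sketch (the function name is illustrative, not part of any analysis pipeline used in the study):

```python
def shifts_reading_frame(deleted_nucleotides: int) -> bool:
    """A deletion shifts the reading frame unless its length is a multiple of 3."""
    return deleted_nucleotides % 3 != 0


# The 62-nucleotide deletion seen in the CD94-04 splice form: 62 = 20 * 3 + 2.
print(shifts_reading_frame(62))      # True
# For comparison, a deletion of 24 whole codons (72 nucleotides) stays in frame.
print(shifts_reading_frame(24 * 3))  # False
```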

Fig. 1

Alignment of the predicted amino acid sequences derived from four cattle CD94 cDNA sequences. CD94-01 was previously published (Storset et al. 2003); CD94-02 to -04 were additionally identified in this study. Dashes indicate identity. GenBank accession numbers AF422180, EF081290, EF081291, and EF081292

Table 2 Transcribed CD94 and NKG2 genes in four animals

NKG2A

Amplification of full-length NKG2A from four animals revealed at least ten different sequences, some of which appear to be minor sequence variants (Table 2). Putative genes were provisionally numbered NKG2A-01 to -07 (EF081282-EF081288); minor variants were not shown (Fig. 2). All of the sequences have two putative ITIMs in the cytoplasmic domain, with some slight variation in sequence. NKG2A-07, in addition to the ITIMs, has an arginine in the TM domain. Alternative splicing around the exon 3/4 boundary was observed, as in human. The majority of the sequences align with human NKG2A, having a predicted stalk region of 23 amino acids. NKG2A-02, -03, and -04 have a predicted additional seven amino acids, presumably due to missplicing, although the correct isoform in these cases was not observed. Additional isoforms of NKG2A-02 and -03 were also found, with another nine predicted amino acids (data not shown). An isoform of NKG2A-01 was found that missed the 5′ part of exon 4, resulting in a predicted 24-amino-acid deletion (data not shown). NKG2A-04 is predicted to encode two additional amino acids in the carboxyl terminus; a similar phenomenon was reported in some chimp NKG2C alleles (Shum et al. 2002).

Fig. 2

Alignment of the predicted amino acid sequences derived from seven cattle NKG2A cDNA sequences (EF081282-EF081288), cattle NKG2C (EF081289), human NKG2A (hsNKG2A; NM_002259), and human NKG2C (hsNKG2C; NM_002260). Dashes indicate identity; dots represent gaps introduced to maximize alignment. ITIM sequences are underlined; charged residues in TM domain are shown in bold. Predicted domain boundaries are indicated by vertical lines

Clones with the whole of exon 4 missing (corresponding to most of the extracellular stalk), as seen in human NKG2B, were found only very rarely (data not shown). For this reason, all of the sequences described in this study were provisionally named NKG2A, although it is possible that NKG2A-07 will be renamed if it is confirmed that it is functionally distinct. BLAST searches of the most recent draft of the cattle genome showed that the NKG2A-01 gene (XM_605520) is next to and in the same orientation as the NKG2C gene on chromosome 5. Additional NKG2A-like genes were found on the other side of the NKG2A-01 gene in the opposite orientation (Fig. 3). They each have a number of nucleotide differences to the sequences described in this study and are not obviously allelic to any of them. Two other NKG2A-like genes were predicted (XM_592813 and XM_867124), which have not yet been assigned a chromosomal location.

Fig. 3

Approximate location of CD94 and NKG2 genes on cattle chromosome 5 (derived from http://www.hgsc.bcm.tmc.edu/projects/bovine/). Genes shown in gray appear to be incomplete. Asterisk indicates assignment to a particular gene group is provisional based on sequence similarity. Distances and positions shown are approximate

NKG2A-01 (or a minor sequence variant) was transcribed in all four animals, as was NKG2A-03. At present, it appears that NKG2A haplotype composition may otherwise be variable, but this has yet to be confirmed. The number and combination of sequences found in individual animals suggest that these sequences correspond to at least four distinct genes. However, a relatively small number of clones was sequenced in each case, so additional transcripts may have been present but undetected (Table 2).

NKG2C

Amplification of full-length NKG2C from four animals revealed three sequences, each differing by one or two nucleotides; only one sequence is shown (EF081289; Fig. 2, Table 2). BLAST searches against the most recent draft of the cattle genome revealed a very similar sequence on chromosome 5 (XM_870891), together with a second partial and more divergent sequence that seems likely to be a pseudogene (Fig. 3). The NKG2C sequence has a basic arginine residue in the TM domain and no ITIM, indicating activating function. An NKG2C cDNA sequence was identified in each of the four animals.

Sequence comparisons

Sequence comparisons show that NKG2A-01 is divergent when compared to the other six sequences designated NKG2A (Table 3). However, in the CRD the NKG2A-01 sequence demonstrates 93.5% predicted amino acid identity with NKG2C; this mirrors the situation in human, where NKG2A and NKG2C show 95% identity in the CRD (Fig. 2). Table 3 shows that NKG2A-02 to -07 do not differ dramatically from one another, showing between 89 and 95% amino acid identity across the whole coding region. If the CRD is considered in isolation, the level of similarity drops very slightly (84–94%).
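Percentage identity figures of this kind can be computed from a pairwise alignment by counting matching residues. The sketch below is one possible convention, not necessarily the one used for Table 3: it assumes both sequences are written out in full (not in the dash-for-identity notation of Fig. 2), uses a dot for gaps, and skips any column where either sequence has a gap. The function name and the two toy fragments are invented for illustration and are not real NKG2 sequence.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = ".") -> float:
    """Percent identical residues over aligned columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # assumption: gapped columns are excluded from the denominator
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical): one gap column, one mismatch in 10 compared columns.
toy_a = "MKTAYIAK.QR"
toy_b = "MKSAYIAKLQR"
print(percent_identity(toy_a, toy_b))  # 90.0
```

Counting gapped columns as mismatches instead would lower the reported identity, so the convention should be stated alongside any table of values.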

Table 3 Percentage of identity between cattle NKG2A amino acid sequences

Discussion

The most striking finding in this study is the number and diversity of genes/alleles within the cattle CD94/NKG2 family. NKG2A-like sequences were provisionally assigned to seven putative genes. NKG2A-01 is clearly a distinct gene, while designation of the remaining six NKG2A sequences (A-02 to A-07) is more difficult. Data from the cattle genome sequence support the existence of multiple NKG2A (or closely related) genes, but clear interpretation is problematic due to likely haplotype and allelic differences and the fact that the genome sequence is derived from a cattle breed other than Holstein. However, the arrangement of the CD94, NKG2D, NKG2C, and NKG2A-01 genes is broadly as seen in other mammals. Taken together with the number and combination of sequences found in individual animals (Table 2) and the extent of variation (25 variable amino acid positions within the CRD), it seems reasonable to propose preliminary assignment of the putative additional NKG2A sequences shown to discrete genes. All of the genes have two ITIMs that are predicted to be functional, although the first ITIM is more typical in NKG2A-01 (VIYAEL) than the other genes (ATYAEL) (Kabat et al. 2002). There is nothing to indicate that any of these genes is nonfunctional and, because they show no additional homology to NKG2 genes in other species other than NKG2A, we do not propose to give them alternative names at present.
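The comparison of the two first-ITIM variants can be illustrated with a simple pattern check. This is a minimal sketch that assumes the commonly cited ITIM consensus (I/V/L/S)-x-Y-x-x-(L/V), which is not stated in this paper; the names `ITIM_CONSENSUS` and `matches_itim_consensus` are illustrative.

```python
import re

# Assumed consensus: (I/V/L/S)-x-Y-x-x-(L/V), where x is any residue.
ITIM_CONSENSUS = re.compile(r"[IVLS].Y..[LV]")


def matches_itim_consensus(hexamer: str) -> bool:
    """True if the six-residue motif fits the assumed consensus exactly."""
    return ITIM_CONSENSUS.fullmatch(hexamer) is not None


for motif in ("VIYAEL", "ATYAEL"):
    print(motif, matches_itim_consensus(motif))
# VIYAEL True
# ATYAEL False
```

Under this assumed pattern, ATYAEL differs from the consensus only at the first position. A strict pattern match of this kind says nothing about function by itself; both motifs are predicted to be functional in the text.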

The sequence carrying a positively charged (basic) residue in the TM domain (NKG2A-07) is unique in that it also has two ITIMs and is not therefore equivalent to any previously reported NKG2 genes. The charged residue (arginine) is encoded at the same position as in human NKG2C, E, and F and as in cattle NKG2C. It is therefore possible that this allows an interaction to occur with the ITAM-bearing adaptor protein DAP12, although it does not exclude the possibility of interaction with a distinct adaptor protein, as was demonstrated for some other activating receptors (MacFarlane and Campbell 2006). Although human NKG2F was initially reported to encode both a basic TM residue and an ITIM-like sequence, it was subsequently shown that this putative ITIM is not functional, and that the molecule does associate with DAP12 and may therefore have activating function (Kim et al. 2004). The cattle NKG2A-07 has exactly the same ITIMs as most of the other NKG2A sequences, and a full-length CRD, so it does not appear to be equivalent to human NKG2F. The primate KIR2DL4 is also unusual in encoding both a functional ITIM and a charged TM residue, but this appears to be predominantly an activating receptor (Kikuchi-Maki et al. 2005). At present, it is impossible to do more than speculate that the cattle NKG2A-07 molecule may be bifunctional, the activating function mediated by as yet unconfirmed accessory proteins or other molecular interactions.

NKG2A-01 is the most divergent of the NKG2A sequences (Fig. 2, Table 3) and interestingly shows slightly more similarity to human NKG2A than the other sequences do (Fig. 2). However, the most striking feature is its similarity in the CRD to the single NKG2C sequence (93.5% identity), suggesting that these two receptors share a ligand, as is seen with NKG2A and NKG2C in human and mouse.

Four different CD94 cDNA sequences were obtained, with some animals apparently expressing only one and some two of these. It is therefore apparent that one or more cattle CD94 genes are polymorphic, which was not observed in other species. Although a possible explanation for this is that different CD94 alleles partner different NKG2A or NKG2C molecules, this is not supported by the data, which show none of the CD94 variants to be present in all four animals despite them all expressing both NKG2A-01 and NKG2C. In addition, it still remains to be demonstrated that CD94 in cattle forms heterodimers with any of the NKG2 molecules.

Several studies were carried out in human to define the nature of CD94/NKG2 binding to the nonclassical class I molecule HLA-E (Wada et al. 2004). A number of residues crucial for binding to occur were identified, and these all reside in the top of the alpha 1 and 2 domains of HLA-E, suggesting that the interaction is similar to that of NKG2D and MICA, shown by crystallographic analysis (Li et al. 2001). In addition, it was demonstrated that the leader sequence-derived peptide bound to HLA-E has a significant influence on CD94/NKG2 binding (Miller et al. 2003). A nonclassical MHC class I gene with equivalent function or characteristics to HLA-E has not yet been identified in cattle; however, nine full-length MHC class I cDNAs were described (BoLA-N*50001-N*50501) that are defined as nonclassical (http://www.ebi.ac.uk/ipd/mhc/bola). These are believed to be encoded by at least four genes, each with very limited polymorphism (Davies et al. 2006), but it is not yet clear if they are all present on all MHC haplotypes (Birch et al. 2006), as would be predicted for an HLA-E functional homolog. Little is known of their expression patterns or function. Some can be found transcribed at very low levels in PBMC, and all were found in trophoblast (Davies et al. 2006). BoLA-N*50001 was implicated in the immune response to infection because, unlike classical class I, it is not downregulated by bovine papilloma virus (Araibi et al. 2006). The genes have some distinguishing characteristics, for example, BoLA-N*50001 (and related alleles) and N*50201 have truncated cytoplasmic domains, and N*50501 has no TM domain, which suggest distinct functions. Extensive analysis of both genomic and cDNA demonstrates that additional class I genes exist that have not yet been fully sequenced or characterized (Birch et al. 2006); thus, while it is possible that one or more of the defined nonclassical class I genes is the functional equivalent to HLA-E, it is equally possible that it has yet to be identified.

Because the role of CD94/NKG2 in primates and rodents is primarily to monitor classical MHC class I expression, albeit by interaction with nonclassical class I, it must remain a possibility that in cattle some or all of these heterodimers are binding directly to classical class I molecules. MHC class I gene arrangement and expression are complex in cattle, with potentially six or more functional genes expressed in a variety of combinations on different haplotypes, with none consistently expressed (Ellis et al. 1999). This raises questions concerning KIR ligand specificity. In humans, KIRs bind to subsets of classical alleles, mostly encoded at the HLA-C or -B loci. Because cattle class I haplotypes are very variable, it is difficult to see how this would operate, unless there are rather more KIR genes or they are less restricted in their binding. Current data suggest that cattle have at least 13 KIR genes, with variable haplotypes as in human (Ellis et al., unpublished data), but there is no information regarding ligands. In addition, cattle have at least one apparently functional and potentially polymorphic Ly49 gene (McQueen et al. 2002; Ellis et al., unpublished data).

The fact that NKG2C appears monomorphic in cattle is surprising given the level of polymorphism in the putative NKG2A genes. This suggests that control of NK cell activation in the context of these genes cannot depend on a balance between the inhibitory NKG2A and activating NKG2C molecules. One explanation is that NKG2A-01 and NKG2C function as in human and mouse, and presumably encode molecules that interact with a nonclassical, nonpolymorphic MHC ligand, while the remaining NKG2A genes and alleles encode molecules with alternative, possibly diverse ligands, such as classical MHC class I molecules.

The position, proximity, and similarity of the cattle NKG2 genes indicate multiple duplication events as seen in related gene families in other species (Hao et al. 2006). Although there is evidence for allelic variation and in some cases gene duplication of NKG2A and NKG2C in some primate species (Shum et al. 2002; LaBonte et al. 2004), in these cases the degree of variation between genes and alleles is relatively small and does not indicate diverse function or ligands. In cattle, the driving force may have been a need to generate a backup system for the KIR genes: given the very variable MHC class I haplotypes and the lack of flexibility in the KIR system, such a backup would help ensure sufficient NK receptor/ligand interaction in all individuals. Alternatively, expression patterns of these receptor families may be distinct in cattle, with more separation of KIR and NKG2 expression on different cell types/subsets.

Until information becomes available concerning NK receptor ligands in cattle, it is impossible to definitively interpret the data presented in this study. However, they show that, although closely related gene families exist in many species, the precise function and interactions of the genes may distinctly differ, and understanding these interactions may shed light on the broader picture of immune system evolution. These data show that cattle have more potentially functional NK receptors likely to bind MHC class I than some other species, and this may reflect complexity in the cattle MHC class I region or simply an alternative evolutionary path.