Transmissible spongiform encephalopathies (TSEs) are a heterogeneous group of fatal neurodegenerative disorders characterized by changes in trafficking and conformation of the mammalian prion protein (PrP) [for review see (Prusiner 1998)]. Natural prion diseases may occur as genetic, infectious, or sporadic disorders in a variety of mammals, most notably in humans, mink, sheep, cattle, and deer. Cattle with bovine spongiform encephalopathy (BSE) have been implicated in one human TSE, variant Creutzfeldt-Jakob disease (vCJD) (Scott et al. 1999), through the consumption of beef from affected animals. Consequently, many countries are developing policies aimed at eliminating TSE-affected animals from their food chains. Key components of TSE eradication programs include: excluding ruminant meat and bone meal from animal feed, identifying and disposing of animals with clinical disease, tracing infected animals to their population source, culling in endemic areas, and restocking with animals from TSE-free regions. Individuals with genetic resistance to TSE are desired for restocking because disease recurrence from exposure to environmental prion contamination may occur in endemic areas (Thorgeirsdottir et al. 1999). Thus, identifying genetic variation correlated with TSE resistance is an important step in plans to eliminate TSEs from the food chain.

Various PrP isoforms may influence TSE susceptibility. In sheep, 14 single nucleotide polymorphisms (SNPs) in the prion gene (PRNP) coding region have been published and/or reported in GenBank. Nucleotide variants affecting the translation of codons 136, 154, and 171 are the most often-studied polymorphisms associated with variation in susceptibility to scrapie, the TSE of sheep. PRNP haplotype alleles encoding alanine, arginine, and arginine (ARR) at the respective 136, 154, and 171 positions are correlated with increased scrapie resistance, whereas valine, arginine, and glutamine (VRQ) haplotypes are correlated with increased scrapie susceptibility (Belt et al. 1995; Hunter et al. 1996; Baylis et al. 2002). The most resistant individuals are those with homozygous ARR/ARR genotypes, and the most susceptible are the VRQ/VRQ individuals. The scrapie susceptibility for four other known PRNP haplotype alleles (ARQ, AHQ, ARH, ARK) is less clear because susceptibility appears to vary somewhat among breeds and populations. Nevertheless, individuals with two alleles encoding glutamine at codon 171 (i.e., 171QQ) are susceptible to natural scrapie (Goldmann et al. 1994; Westaway et al. 1994; O’Rourke et al. 1997). PRNP alleles encoding arginine at position 171 are dominant for increased scrapie resistance, and thus, heterozygous individuals (171QR) are considered resistant, with few exceptions (Ikeda et al. 1995; Tranulis et al. 1999; Baylis et al. 2002). The apparent dominant resistance conferred by the 171R allele implies that flocks derived from 171RR founders are expected to remain scrapie-free, even in environments where exposure to the infectious form of PrP is likely.
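The genotype rules summarized above can be illustrated with a minimal sketch. The function names and the coarse category labels are hypothetical and chosen only for illustration; the sketch assumes each haplotype is written as a three-letter string for codons 136, 154, and 171, and it returns "unclear" for any combination the paragraph does not explicitly address.

```python
def codon171_genotype(hap_a: str, hap_b: str) -> str:
    """Return the codon-171 genotype (e.g. 'QR') from two three-letter haplotypes.

    Each haplotype lists the amino acids at codons 136, 154, and 171,
    so the third letter is the codon-171 residue.
    """
    for hap in (hap_a, hap_b):
        if len(hap) != 3:
            raise ValueError(f"expected a three-letter haplotype, got {hap!r}")
    return "".join(sorted((hap_a[2].upper(), hap_b[2].upper())))


def scrapie_category(hap_a: str, hap_b: str) -> str:
    """Coarse category following only the rules stated in the text (illustrative)."""
    pair = tuple(sorted((hap_a.upper(), hap_b.upper())))
    if pair == ("ARR", "ARR"):
        return "most resistant"
    if pair == ("VRQ", "VRQ"):
        return "most susceptible"
    genotype = codon171_genotype(*pair)
    if genotype == "QQ":
        return "susceptible"      # two glutamine alleles at codon 171
    if genotype in ("QR", "RR"):
        return "resistant"        # 171R treated as dominant, with few exceptions
    return "unclear"              # e.g. ARH or ARK carriers: varies among breeds


if __name__ == "__main__":
    for a, b in [("ARR", "ARR"), ("ARR", "ARQ"), ("ARQ", "VRQ"), ("ARH", "ARQ")]:
        print(a, b, codon171_genotype(a, b), scrapie_category(a, b))
```

This is a simplification of the published associations, not a diagnostic rule; the reported exceptions among 171QR individuals are not modeled.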

BSE in cattle and chronic wasting disease (CWD) in deer are similar prion diseases in species where relatively few PRNP polymorphisms have been characterized. A comparison of PRNP coding sequence (CDS) within Bos taurus or Odocoileus spp. in GenBank shows five and nine nucleotide differences, respectively, including insertion/deletion (indel) polymorphisms. The polymorphisms in cattle have not been significantly correlated with BSE (Hunter et al. 1994; Neibergs et al. 1994; Hernandez-Sanchez et al. 2002), and the correlation of deer polymorphisms with CWD is unresolved. In the human PRNP gene, both intronic and upstream regulatory regions appear to influence susceptibility to CJD (McCormack et al. 2002). However, polymorphisms have not been described in the PRNP promoter region in ungulates. Three bovine SNPs have been reported in the 5′ untranslated regions containing exons 1a and 1b (Humeny et al. 2002). Informative markers spanning both the promoter and coding regions may be useful for extended haplotype analysis of the PRNP gene locus.

Although TSEs occur in goats (Billinis et al. 2002), elk (Williams and Young 1992; Williams and Miller 2002), moufflon (Wood et al. 1992), kudu, and many other exotic captive ungulates (Kirkwood and Cunningham 1994), cattle and free-ranging deer populations were chosen as the focus of the present study because an extended set of polymorphic markers is needed for epidemiologic studies involving BSE and CWD. The aim of this study was to describe the nucleotide diversity in the promoter and coding regions of the PRNP locus in diverse populations of healthy U.S. sheep, U.S. cattle, and free-ranging deer (O. virginianus and O. hemionus from Wyoming).

Materials and methods

Animal groups and genomic DNA samples

Three different panels of ruminant DNAs were analyzed in the present study: the U.S. Meat Animal Research Center (MARC) Sheep Diversity Panel (MSDP) version 1.1 (Freking et al. 2002); the MARC Beef Cattle Diversity Panel (MBCDP) version 2.1 (Heaton et al. 2001); and a sample of white-tailed deer (O. virginianus) and mule deer (O. hemionus) populations from Wyoming (MARC Odocoileus panel version 1.0, this report). Each panel was designed to contain the most diverse germplasm available and, where possible, represents wide ranges of performance for a variety of economically important traits.

The sheep panel consisted of DNA from 90 individuals representing nine genetically diverse breeds of sheep. Ten rams were sampled for each breed, with no rams sharing a common sire. The breeds were divided into four classifications: 1) general purpose breeds including Dorset, Rambouillet, and Texel; 2) terminal-sire breeds including Suffolk and Composite III [1/4 Suffolk, 1/4 Hampshire, and 1/2 Columbia (Leymaster 1991)]; 3) prolific breeds including Finnsheep and Romanov; and 4) hair-shedding breeds including Dorper and Katahdin. These breeds are presently being evaluated for a wide range of performance traits at MARC and represent a diverse cross section of popular U.S. sheep germplasm. Germplasms from other popular breeds, such as Hampshire and Columbia, are significant components of the Composite III breed and thus contribute alleles to the group.

The cattle panel consisted of 92 sires from 16 popular beef breeds and four sires from the Holstein dairy breed. Sires within each breed were selected for pedigrees with minimal relationships between ancestors, to maximize the total number of unshared haploid genomes. The beef breeds in this panel comprise greater than 99% of the germplasm used in the U.S. beef cattle industry, based on the number of registered progeny for each breed.

The deer panel consisted of DNA from 50 white-tailed and 43 mule deer sampled from wildlife management areas across Wyoming from 1996 to 1998. White-tailed deer were sampled from five management areas in northeast Wyoming (Region A, Areas 2–5, and Region B, Area 7; http://gf.state.wy.us/HTML/afs/pdf/NR03deermap.pdf), and mule deer were sampled from 14 management areas across the state (Regions A, B, D, K, and T). These 14 areas included four of the same areas where white-tailed deer were sampled (Region A, Areas 2, 3, 5; Region B, Area 7). One mule deer from south-central Nebraska was also included in this panel. DNA was extracted from tissues and arrayed in 96-well plates as previously described (Heaton et al. 2001).

Estimating the minimum allele frequency required for detection in the above panels was based on the probability of observing the allele at least once in an animal group, as previously described (Heaton et al. 2001). Briefly, the probability of observing an allele at least once is $1 - (1 - p)^n$, where $p$ is the frequency of the allele and $n$ is the number of independent samplings or, in this case, the number of unshared haploid genomes for diploid organisms. This assumes that samplings (haploid genomes) are independent and identically distributed (e.g., the same $p$ applies to all animal subpopulations). Setting power, or the probability of observing the allele at least once, to 0.95 results in the equation $0.95 = 1 - (1 - p)^n$. Solving this equation for $p$ yields $p = 1 - (0.05)^{1/n}$ for all $p$ between 0 and 1. Based on ancestors in pedigrees with at least four to seven generations present, the cattle panel was estimated to contain 187 unshared haploid genomes and is expected to allow a 95% probability of detecting any allele with a frequency greater than 0.016 in the panel (Heaton et al. 2001). Individuals in the sheep and deer panels were also selected for minimal relationships; however, pedigree information was not available for estimating the respective number of unshared haploid genomes. If one assumes that 10% of the haploid genomes are shared among the individuals in the respective sheep and deer panels, the minimum allele frequency that would allow 95% probability of detection is less than 0.02 for each panel.
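The detection threshold described above can be checked numerically with a short sketch. The function names are hypothetical; the sheep calculation assumes 90 animals (180 haploid genomes) with 10% of genomes shared, as stated in the text.

```python
def detection_probability(p: float, n_haploid: float) -> float:
    """Probability of observing an allele of frequency p at least once
    among n_haploid independent haploid genomes: 1 - (1 - p)^n."""
    return 1.0 - (1.0 - p) ** n_haploid


def min_detectable_frequency(n_haploid: float, power: float = 0.95) -> float:
    """Smallest allele frequency detected with the given power,
    obtained by solving power = 1 - (1 - p)^n for p."""
    if n_haploid <= 0:
        raise ValueError("n_haploid must be positive")
    if not 0.0 < power < 1.0:
        raise ValueError("power must lie strictly between 0 and 1")
    return 1.0 - (1.0 - power) ** (1.0 / n_haploid)


if __name__ == "__main__":
    # Cattle panel: 187 unshared haploid genomes (from pedigree analysis).
    print(f"cattle: {min_detectable_frequency(187):.4f}")
    # Sheep panel: 90 animals, assuming 10% of the 180 haploid genomes are shared.
    print(f"sheep:  {min_detectable_frequency(0.9 * 2 * 90):.4f}")
```

Under these assumptions the cattle threshold comes out near 0.016 and the sheep threshold near 0.018, in line with the values quoted in the text.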

Selection of PRNP regions for analysis, PCR amplification, and DNA sequencing

PCR cocktails were designed to generate PRNP fragments containing as much CDS as possible for cattle, sheep, and deer. Primers used for amplification of genomic DNA are listed in Table 1. Oligonucleotide primers used in PCR have the effect of “remodeling” the template in the binding site, causing polymorphisms in those regions to be unreadable. Thus, it was desirable to have minimum overlap between the primer binding sites and the CDS. The sense amplification primer binding sites overlapped the first 5 nt in cattle and sheep and 12 nt in deer on the 5′ end of the CDS. Consequently, any polymorphisms present in these first few nucleotides corresponding to the N-terminus of the PrP leader signal peptide (MV in cattle and sheep; MVKS in deer) were not measured in our experiments. A larger fragment was not designed for amplification because a number of stem-loop structures with high melting temperatures in the adjacent 5′ non-coding region were predicted to interfere with PCR primers. Beyond the stop codon on the 3′ end of the PRNP gene, there were 49 nt in sheep, 33 nt in cattle, and 31 nt in deer that were included in the PCR products and did not overlap the binding sites for the antisense amplification primer. Thus, nucleotide variation in the 3′ end of the PRNP CDS was measured in these experiments. In the PRNP promoter region, PCR cocktails were designed to generate an approximately 600-bp fragment centered in a 4.3-kb region of the bovine prion gene. This 4.3-kb region was previously shown to have promoter activity (Lemaire-Vieille et al. 2000). With the location of this bovine amplicon as a reference point, additional oligonucleotides were designed to generate the corresponding amplicons in sheep and deer.

Table 1 Oligonucleotides for PRNP amplification and DNA sequencing

A standard amplification reaction contained 50 ng of genomic DNA, 0.5 µM of each amplification primer, 200 µM of each dNTP, 1.5 mM MgCl2, 1.25 U of either HotStarTaq DNA polymerase (Qiagen, Inc., Valencia, Calif.) or Thermo-Start DNA polymerase (ABgene, Epsom, UK), and 10% vol/vol reaction buffer provided by the manufacturer in a total volume of 55 µL. PCR was performed with either the PTC-200, the PTC-220 Dyad, or the PTC-225 Tetrad thermal cycler chassis (MJ Research, Watertown, Mass.). Reactions were denatured at 94°C for 15 min and subjected to 45 cycles of denaturation at 94°C for 20 s, annealing at the appropriate temperature (Table 1) for 30 s, and a 72°C extension for 1 min. After cycling, an additional 3-min incubation at 72°C was included before storage at 4°C. A 5-µL portion of each amplified product was analyzed by agarose gel electrophoresis in buffer containing 90 mM Tris-borate (pH 8.0), 2 mM ethylenediamine tetraacetic acid, and 0.1 µg/mL ethidium bromide.

Both DNA strands of each amplicon were sequenced at least once with either the amplification primers or nested sequencing primers as previously described (Grosse et al. 1999). Sequencing reactions for the 280 individuals of three panels were performed according to the manufacturer’s instructions with BigDye terminator chemistry (version 2.0) and resolved on an ABI PRISM 3700 DNA analyzer (PE Applied Biosystems, Foster City, Calif.). For amplicons spanning the PRNP CDS, additional pairs of outward-facing sequencing primers were designed to hybridize to conserved regions in the center of the amplicons to ensure that DNA sequences near the ends of the CDS were accurately determined. All of the resulting sequences were analyzed with the assistance of PolyPhred software (Nickerson et al. 1997) in conjunction with Phred/Phrap/Consed software (Ewing and Green 1998; Ewing et al. 1998; Gordon et al. 1998), and consensus sequences were constructed for each group of sheep, cattle, and deer as previously described (Heaton et al. 2001). Because the geographic ranges of white-tailed and mule deer overlap, and they are known to interbreed in the wild, a single consensus sequence was constructed for these Odocoileus spp.

The consensus sequence for each group of sheep, cattle, or deer contained monomorphic sites that were unique for the respective group and thus provided a control for inadvertent amplification of contaminating DNA from other species. Annotated consensus sequences for the amplicons have been deposited in GenBank (see Tables 2 through 4 for accession numbers).

Table 2 Allele and genotype frequencies of PRNP gene polymorphisms in the MARC Sheep Diversity Panel Version 1.1
Table 3 Allele and genotype frequencies of PRNP gene polymorphisms in the MARC Odocoileus Panel version 1.0

Results

Sequence comparisons in the PRNP coding region

Comparing DNA sequences from the present study and those previously reported revealed a total of 53 polymorphic sites in the PRNP coding region: 20 in sheep, 13 in cattle, and 20 in deer (Fig. 1C). These include three, seven, and 11 previously unreported polymorphisms, respectively. Some sites were monomorphic in our panels but polymorphic in previous reports (six in sheep, one in cattle, and three in deer). Two sites in the PRNP coding region were polymorphic in more than one species. Specifically, the first position of codon 151 contained a C/T polymorphism (R151C) in both sheep and deer, and the third position of codon 202 contained a C/T polymorphism (T202, synonymous) in cattle and deer. In both the 151 and the 202 codons, the C nucleotide was the common allele in all three species and appears to represent the allelic state of the most recent common ancestor.

Figure 1

Physical maps of PRNP polymorphic sites in sheep, cattle, and deer. DNA sequences were obtained and analyzed as described in the Materials and methods. Panel A: PRNP genomic regions derived from GenBank accessions U67922 (Lee et al. 1998), AF163764 (Lemaire-Vieille et al. 2000), D10612 (Yoshimoto et al. 1992), and AJ298878 (Hills et al. 2001). Panels B and C: DNA segments amplified from the promoter and CDS regions by PCR. The relative positions of the SNPs are indicated with vertical tick marks, and the numbers correspond to the last two digits in the SNP identifier column of Tables 2 through 4. The symbol legend for the feature maps is as follows: red squares, polymorphisms identified in this report but not previously reported in GenBank or the published literature; green squares, polymorphisms observed both in this report and in GenBank or the published literature; black numbers, polymorphisms identified in GenBank or published literature but not observed in the present study; vertical lines connecting two numbers, SNPs present in two species; yellow arrows, CDS regions; white arrows, non-coding exon regions; blue arrow, promoter region; thin light blue rectangles, intron or intergenic regions; crosshatched squares, octapeptide repeat regions; thick black lines beneath octapeptide repeat regions, significant stem-loop regions (72–99°C Tm); and black rectangles in promoter or CDS regions, indels.

With regard to sheep, 18 of the 20 SNPs in the coding region are predicted to encode amino acid differences in the translated prion protein. A summary of the known PRNP SNPs in sheep is presented in Table 2. Three of these predicted amino acid differences have not been previously reported (G127A, H180T, P241S), although their frequency in the sheep diversity panel was low (0.01). Five haplotypes affecting the translation of codons 136, 154, and 171 were observed (ARQ, ARR, AHQ, VRQ, ARH) with frequencies of 0.57, 0.33, 0.03, 0.03, and 0.03, respectively, in the MARC sheep diversity panel (data not shown). With regard to key genotypes in scrapie eradication programs, the frequency of the most resistant genotype (ARR/ARR) was 0.13 (CI95% = 0.09 to 0.19), and the combined frequency of the susceptible genotypes (ARQ/ARQ, ARQ/VRQ, and VRQ/VRQ) was 0.43 (CI95% = 0.36 to 0.51).

In cattle, only three of the 13 polymorphisms in the coding region are predicted to encode amino acid differences in the translated prion protein. None of the seven newly recognized SNPs is predicted to affect the amino acid sequence of PrP. The frequencies of the minor alleles for these SNPs ranged from 0.01 to 0.14 (Table 3), with some SNPs (e.g., AH25-22) having both alleles present in more than half of the breeds tested (data not shown).

Table 4 Allele and genotype frequencies of PRNP gene polymorphisms in the MARC Beef Cattle Diversity Panel version 2.1

In deer, seven of the 20 SNPs in the coding region are predicted to encode amino acid differences in the translated prion protein. Of the 11 newly recognized SNPs, five were predicted to encode amino acid differences, five were synonymous substitutions, and one was in the 3′ untranslated region (UTR). Two of the non-synonymous substitutions are predicted to affect the translation of codon 20 in the signal peptide region of PrP (Table 4, D20G, MDS001-22 and MDS001-23). Most of the newly recognized SNPs were observed in both O. virginianus and O. hemionus, although allele frequencies of some SNPs appeared to differ between species (Table 4, footnote b). These latter SNPs may be useful in estimating the extent of interbreeding in populations where the ranges of O. virginianus and O. hemionus overlap. One SNP was highly informative in both species (Table 4, MDS001-41). For this SNP, the frequencies of the minor allele and the minor homozygous genotype were greater than 0.41 and 0.09, respectively, and thus it may be useful for animal identity and parentage testing in both Odocoileus spp.

Sequence comparisons in the PRNP promoter region

Sequence comparison of the DNA segments amplified from the PRNP promoter region revealed 33 polymorphic sites: six each in sheep and cattle and 21 in deer (Fig. 1B). No polymorphic sites were in common among the three groups. All sites were SNPs except in deer, where one complex site (MDS001-10) contained an eight-base indel, with the insertion allele having an additional SNP (MDS001-11). None of the 33 sites have been previously reported and thus represent new markers for evaluating the PRNP promoter regions of sheep, cattle, and deer for association with TSE susceptibility by linkage disequilibrium.

Discussion

These results describe previously unrecognized nucleotide diversity in the PRNP promoter and coding regions and provide allele frequency estimates in healthy individuals from U.S. sheep, cattle, and free-ranging Wyoming deer populations. Knowledge of the minor allele frequencies is critical for 1) estimating genotyping error rates caused by SNPs that lie within a primer binding site, 2) assessing whether markers may be useful for animal identification or parentage testing, and 3) evaluating their association with TSE susceptibility. The number of individuals sampled within each animal group was sufficient to provide a 95% probability of detecting any allele present at a frequency greater than 0.02. The total number of polymorphisms observed in approximately 1400 bp of sequence was similar between species when deer were counted as two species, i.e., 26 in sheep, 19 in cattle, and 41 in white-tailed and mule deer combined. In cattle, this is slightly higher than the 15 SNPs expected on the basis of the average from previous analyses of SNP density in the same panel (Heaton et al. 2002). Similar estimates have not yet been made for the sheep and deer panels. A striking species difference was observed, however, in the number of polymorphisms predicted to affect the amino acid sequence. The ratio of non-synonymous to synonymous substitutions in the coding region was 9:1 in sheep, 0.3:1 in cattle, and 0.6:1 in deer. The reasons for this difference are unknown.
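The quoted ratios can be reproduced from the coding-region counts given in the Results with the arithmetic sketched below. The split into non-synonymous and remaining sites is an assumption made for illustration: every coding-region polymorphism not predicted to change an amino acid is counted as synonymous, except that the single deer SNP reported in the 3′ UTR is excluded.

```python
# Coding-region counts taken from the Results section.
# Assumption: sites not predicted to alter the amino acid sequence are
# treated as synonymous; the one deer SNP in the 3' UTR is excluded.
coding_sites = {
    # species: (total polymorphic sites, non-synonymous sites, excluded non-coding sites)
    "sheep": (20, 18, 0),
    "cattle": (13, 3, 0),
    "deer": (20, 7, 1),
}


def nonsyn_to_syn_ratio(total: int, nonsyn: int, excluded: int = 0) -> float:
    """Ratio of non-synonymous to synonymous sites under the stated assumption."""
    syn = total - nonsyn - excluded
    if syn <= 0:
        raise ValueError("no synonymous sites left to form a ratio")
    return nonsyn / syn


if __name__ == "__main__":
    for species, (total, nonsyn, excluded) in coding_sites.items():
        ratio = nonsyn_to_syn_ratio(total, nonsyn, excluded)
        print(f"{species}: {ratio:.1f}:1")
```

With these inputs the ratios round to 9:1, 0.3:1, and 0.6:1, matching the values in the text; a different treatment of indels or UTR sites would shift the cattle and deer figures slightly.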

The present report provides aggregate frequency estimates for the most resistant and susceptible genotypes in sheep breeds that contribute significantly to U.S. production. In this group of sheep, 13% (CI95% = 9–19%) have the most resistant genotype (ARR/ARR), whereas 43% (CI95% = 36–51%) have the most susceptible genotypes (ARQ/ARQ, ARQ/VRQ, and VRQ/VRQ combined). Because the overall ARR allele frequency is relatively low (33%, CI95% = 27–41%), selection for this allele would substantially increase the number of animals with resistant ARR/ARR genotypes and significantly reduce the overall genetic risk of developing scrapie. Furthermore, this selection strategy is compliant with scrapie eradication programs.

It is important to note that in spite of significant sampling, any population, breed, or lineage that has not been analyzed may contain additional polymorphisms. The minor allele of unrecognized polymorphisms may be rare in the overall national population, yet be quite common in isolated or inbred populations. This phenomenon is a common source of error when genotype tests are developed from information in one population and applied towards other untested populations. Any previously unrecognized SNP in the binding site of any primer used in genotype assays may prevent the detection of a disease-predisposing allele. Such genotyping errors may significantly affect livestock disease control and eradication programs when the sires are the focus of selection and the ratio of dams to sires in breeding is 10:1 or 25:1. Both of these conditions are typical in many sheep and cattle production systems. Thus, the success of disease control and eradication programs may be significantly diminished if even a few disease carriers are incorrectly classified.

In summary, the goal of this project was to obtain informative markers spanning both the promoter and coding regions to facilitate extended haplotype analysis of the PRNP gene locus. Such analysis may lend power to future genetic epidemiologic studies involving BSE and CWD because disease-associated SNPs may occur independently, yet share a common haplotype. By identifying many SNPs across 20 kb of the PRNP locus, haplotypes and the mutational steps in the haplotype network may be identified and tested for correlation with genetic predisposition to TSE diseases. The ability to identify individuals with genetic resistance to TSE is predicted to significantly enhance the efficiency with which TSEs are eliminated from the food chain.

Note added in proof: While this manuscript was in press, Hills et al. reported PRNP sequence variation in European populations of cattle and sheep (Anim Genet. 2003 Jun;34(3):183–90).