Introduction

European bison Bison bonasus (EB) were originally distributed throughout the vast deciduous forests of Western, Central and Eastern Europe. However, by the beginning of the twentieth century, only two populations remained: one in the Białowieża Forest in Poland (Bison b. bonasus) and one in the northwest Caucasus (Bison b. caucasicus). The last free-living European bison was shot in Poland in 1919, and the last individual in the Caucasus region died before 1927 (Heptner et al. 1966; Pucek et al. 2004). After the extinction of the species in the wild, only 54 Bison bonasus individuals (29 males and 25 females) with registered pedigrees survived in European zoological gardens. The restored population of EB stems from only 7 animals. This breeding line originates from the Białowieża Forest and includes purebred animals of B. b. bonasus (Pucek et al. 2004). Consequently, due to the founder effect, the current free-living EB population (about 440 individuals in 2007) is expected to have lower genetic diversity than the historic population.

There are two recognized subspecies of the American bison: the plains bison Bison bison bison (PB) and the wood bison Bison bison athabascae (WB). About 500 American bison survived the period of intensive subsistence and commercial exploitation during European settlement of the interior of North America in the nineteenth century. Survivors included fewer than 100 PB (Hedrick 2009) and fewer than 250 WB (Soper 1941). Population size has increased over the past 100 years, and census population sizes are now estimated to be large for both PB and WB, although most of the bison in North America occur in commercial herds (data after American Bison Society; http://www.americanbisonsocietyonline.org; see also Freese et al. 2007; Hedrick 2009). Coalescent theory supports a demographic model in which the North American bison population rose to a peak around 37,000 years ago and subsequently declined (Shapiro et al. 2004). Thus, although human hunting was responsible for the severe bottleneck in the North American population during the last 200 years, the earlier population decline seems to have been caused by climatic events, which could explain the depletion of genetic variability in the bison populations prior to the recent bottleneck (Shapiro et al. 2004).

Population genetic variability in extant post-bottleneck populations is poorly known. Only a few data are available on mitochondrial (mtDNA) variation in the EB from the Białowieża Forest (Tiedemann et al. 1998; Burzyńska et al. 1999; Anderung et al. 2006; Wójcik et al. 2009). Molecular markers such as microsatellites and MHC genes have suggested low genetic variability in EB. An investigation of 14 microsatellite loci estimated the mean number of alleles per locus to be 2.3 (Luenser et al. 2005). Radwan et al. (2007) found only four alleles at the MHC class II DRB3 locus in EB from the Białowieża Forest, one of which (the rarest allele in EB) was identical to a homologous allele in PB.

The main aim of this investigation was to carry out a genome-wide screen of EB from the Białowieża Forest population and of the American bison (subspecies PB and WB) and to determine the consequences of the population bottlenecks that EB, PB and WB have been through. Given the different demographic histories of EB, PB and WB, we expected to see different levels of genomic variability. Furthermore, genetic variability was compared among seven breeds of domestic cattle. We genotyped bison and cattle samples for approximately 54,000 single nucleotide polymorphisms (SNPs) across the entire bovine genome. By screening a large number of SNPs, we could considerably reduce the sampling error of the estimated overall genomic variability compared to traditional investigations using a limited number of markers (SNPs or microsatellites). Furthermore, the large amount of data obtained using the BeadChip will facilitate the design of marker-assisted selection (MAS) strategies, which can be applied for increasing the effective population size (N e), minimizing genetic drift, increasing generation length and decreasing the deleterious effects of inbreeding (Wang and Hill 2000).

Materials and methods

BovineSNP50

The BovineSNP50 BeadChip included more than 54,000 evenly spaced SNPs. More than half of the SNPs were discovered using the sequencing system Genome Analyzer by Illumina®. The additional SNPs were derived from publicly available sources such as Btau (ftp://ftp.hgsc.bcm.tmc.edu/pub/data/Btaurus/fasta), the bovine reference genome, and the Bovine HapMap Consortium data set (www.bovinehapmap.org). The BeadChip has an average minor allele frequency (MAF) of 0.25 across all loci and has been validated in both dairy and beef cattle. The SNPs are approximately uniformly distributed with an average spacing of 51.5 kb. The number of polymorphic loci varies between breeds. The average call rate over all segregating loci is 99.57% in cattle and was found to be 99.3% for the segregating SNPs when 18 individuals of the species Bos bison, Bos gaurus, Bos grunniens, Bos javanicus, Bubalus depressicornis and Syncerus caffer were analysed. The average call rate refers to the number of useable SNPs. If the genotype of a given animal falls into one of the three genotype clusters that are formed by the SNP, it can be “called”. The mean MAF of the outgroup consisting of all the species combined was found to be 0.05 in 11,206 polymorphic loci (Illumina, Inc. Pub No. 370-2007-029).

Animal samples

A total of 50 EB (from the Białowieża Forest, Poland), 25 WB (from Wood Buffalo National Park, Canada) and 26 PB (from Elk Island National Park, Canada) specimens were genotyped. The two American bison populations have been identified, together with 7 other herds, as having no evidence of domestic cattle introgression (Halbert and Derr 2007), like the EB population from the Białowieża Forest (Wójcik et al. 2009). The WB population at Wood Buffalo National Park is one of only two indigenous herds of American bison. It reached a population low of approximately 250 animals in 1895 (Soper 1941). The PB population at Elk Island National Park was founded in the early 1900s by approximately 45 animals (Wilson and Strobeck 1999). Both of these populations have previously been found to be among the most diverse of their subspecies (Wilson and Strobeck 1999). Additionally, 216 Bos taurus specimens from seven different dairy and beef cattle breeds were included in the investigation (all from Denmark, although some had parents from the US and Canada). The breeds were: Red Danish (n = 32), Jersey (n = 31), Limousine (n = 30), Aberdeen Angus (n = 27), Hereford (n = 29), Holstein (n = 36), and Simmental (n = 31).

DNA was either isolated from blood using the BioSprint 96 (QIAGEN), or from soft tissue using the DNeasy Blood & Tissue Kit (QIAGEN), or purified from blood and muscle tissue by treatment with proteinase K followed by sodium chloride precipitation (Sambrook et al. 1989).

Infinium II assay protocol

SNPs were genotyped on the BovineSNP50 BeadChip according to the Infinium II Multi-Sample assay protocol provided by Illumina® (Manual Experienced User Card, 11208000 Rev. A., Illumina Inc.). Proprietary reagents were provided by Illumina®. Isothermal amplification was carried out overnight in 96-well plates using 50–200 ng of genomic DNA. The amplification increased the quantity of DNA about one thousand-fold.

The amplified products were fragmented by a controlled enzymatic process. After alcohol precipitation, the DNA samples were re-suspended, denatured and loaded on the BeadChips. During overnight hybridization, the amplified and fragmented DNA samples annealed to locus-specific 50-mers covalently linked to one of the 54,000 bead types. Following hybridization, allelic specificity was conferred by enzymatic base extension. Products were subsequently stained with repeated application of staining and anti-staining reagents. Following staining, the chips were washed and coated. Fluorescence of the beads was detected by the Illumina BeadArray™ Reader.

SNPs were genotyped on the BovineSNP50 BeadChip in two batches with an overlap of six replicate samples (6 individuals were screened twice). These samples came from the EB population, from which we extracted DNA from two different tissues (muscle and blood). The first batch (run on 21 April 2008) comprised 22 EB, PB and WB individuals, including three EB families in which the relationships (mother, father and offspring) were known. The second batch (run on 30 June 2008) comprised all the other bison samples and all samples from the cattle breeds.

Genotyping of the SNPs

The scanned bead intensities were loaded into the BeadStudio software for allelic discrimination. All samples, both bison and cattle, were genotyped together, and the SNPs were called by applying the Bovine50SNP_A.egt cluster file for cluster separation. Sample reliability was examined with the Call Rate option in order to remove unreliable samples. The thresholds were set to 0.96 and 0.98 for bison and cattle, respectively, when all polymorphic SNPs were examined in all breeds at the same time. The SNPs were sorted according to the Call Freq option and examined manually if the genotyping frequency was below 0.95. All SNPs segregating in the bison were checked manually to ensure correct cluster calls, and a SNP was accepted only when the clusters of the bison samples were located within the same range of intensity as those of the cattle. Questionable SNPs were allocated to one of the following categories: unsuccessful reclustering, overlap in clusters, not distinctly separated clusters, low intensity, parent-parent-offspring error or software limitations. These SNPs were excluded from further analyses.

The three EB families used to check for parent–parent–child errors (P–P–c errors) confirmed that the 929 SNPs analysed showed a Mendelian inheritance pattern. No errors were detected among the 929 loci, suggesting that the SNPs can be genotyped reliably and that the true error rate is extremely low. Furthermore, six individuals sampled in the PB population were represented by two different tissue types and run on the system on different dates. These individuals were used to check for replication errors as well as reproducibility. All six individuals run twice were 100% identical for all segregating SNPs in the two runs, confirming the reproducibility of the data between runs and tissue types. Together, the check of Mendelian inheritance and the replication of samples confirm that the SNPs detected in bison are genuine.

Data analysis

The polymorphic SNPs for bison were plotted according to their known positions on the 29 autosomal bovine chromosomes. Given that EB, WB and PB hybridize with each other and that all the bison successfully hybridize with the different cattle breeds, it is unlikely that the order of, and distance between, the markers differ much between cattle and bison.

The percent of polymorphic loci (P%) for each bison species and for each cattle breed was calculated relative to the total number of loci that were polymorphic when the EB, WB and PB samples were pooled. Expected heterozygosity (H E) at loci found to be polymorphic in EB, PB and WB and the mean distance (along every chromosome) between the polymorphic SNPs were estimated. Average observed individual heterozygosity (H Oi) was compared between EB, WB and PB by a one-way ANOVA followed by a Tukey’s pairwise test.
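The summary statistics named above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in the study: it assumes biallelic SNPs with genotypes coded as the number of copies of one allele (0, 1 or 2, with None for a missing call), computes H E as 2pq without a small-sample correction, and treats a locus as polymorphic when its minor allele frequency reaches a threshold (0.05 here, following the 5% criterion mentioned in the caption of Table 1). All function names and the toy data are hypothetical.

```python
def allele_freq(genotypes):
    """Frequency of the counted allele; genotypes are 0/1/2 copies, None = no call."""
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("locus has no called genotypes")
    return sum(called) / (2 * len(called))


def expected_heterozygosity(genotypes):
    """H E = 2pq for a biallelic locus (no small-sample correction; an assumption)."""
    p = allele_freq(genotypes)
    return 2 * p * (1 - p)


def percent_polymorphic(loci, n_reference_loci, min_maf=0.05):
    """P%: loci with MAF >= min_maf, relative to a reference number of loci."""
    n_poly = 0
    for genotypes in loci:
        p = allele_freq(genotypes)
        if min(p, 1 - p) >= min_maf:
            n_poly += 1
    return 100.0 * n_poly / n_reference_loci


def individual_heterozygosity(individual_genotypes):
    """H Oi: share of called loci at which one individual is heterozygous."""
    called = [g for g in individual_genotypes if g is not None]
    return sum(1 for g in called if g == 1) / len(called)


# Toy data: three loci (rows) scored in four animals (columns).
loci = [
    [0, 1, 2, 1],     # segregating
    [0, 0, 0, 0],     # fixed
    [1, 1, None, 1],  # segregating, one missing call
]
he_per_locus = [expected_heterozygosity(g) for g in loci]
p_percent = percent_polymorphic(loci, n_reference_loci=len(loci))
first_animal = individual_heterozygosity([locus[0] for locus in loci])
```

In the study the reference number of loci would be the number of SNPs polymorphic in the pooled EB, WB and PB samples, and the per-animal H Oi values would then be passed to the ANOVA.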

To test if the distribution of the distances between the polymorphic loci was random, uniform or aggregated along the chromosomes we performed χ2 analyses for randomness, for EB, PB and WB (Green 1966).
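One common way to carry out a variance-to-mean test of this kind is sketched below. The text does not describe how the statistic was computed, so this is an assumption-laden illustration rather than the authors' procedure: SNPs are counted in fixed, non-overlapping windows along a chromosome, and the index of dispersion (variance divided by mean) is scaled by n − 1 to give a statistic that is approximately χ2-distributed with n − 1 degrees of freedom under a random (Poisson) distribution. Values well above n − 1 indicate aggregation, values well below indicate uniformity. The window size and all names are hypothetical.

```python
from statistics import mean, variance


def window_counts(positions_kb, chrom_length_kb, window_kb):
    """Count SNPs in consecutive non-overlapping windows (trailing partial window dropped)."""
    n_windows = int(chrom_length_kb // window_kb)
    counts = [0] * n_windows
    for pos in positions_kb:
        idx = int(pos // window_kb)
        if idx < n_windows:
            counts[idx] += 1
    return counts


def dispersion_test(counts):
    """Return (variance/mean, chi-square statistic, degrees of freedom)."""
    m = mean(counts)
    if m == 0:
        raise ValueError("no SNPs in any window")
    index = variance(counts) / m      # about 1 if random, > 1 if aggregated
    df = len(counts) - 1
    return index, df * index, df


# Toy example: eight windows with all SNPs piled into one of them.
clumped = [0, 0, 0, 0, 8, 0, 0, 0]
index, chi2, df = dispersion_test(clumped)
```

The statistic would then be compared with the upper and lower tails of the χ2 distribution for the given degrees of freedom.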

After considering the distribution of the distances between the polymorphic loci, the haplotype blocks were calculated according to the four gamete rule implemented in the Haploview 4.1 software (Barrett et al. 2005). The four gamete rule (under the assumption of random mating) checks whether all four possible gametes are present for each pair of SNPs. If they are present, there is evidence of recombination somewhere between the two SNPs. The command line version was used with its default settings (minMAF = 0.001, hwcutoff = 0.001, maxMendel = 1 and minGeno = 0.75), except for the maxDistance parameter, which was set to 50,000 kb to ensure correct haplotype block partitioning. The genotype pedigree file format was used as input, together with the marker info files, for each breed and each chromosome separately.
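The logic of the four gamete rule can be illustrated with a short sketch. It does not reproduce Haploview's procedure: it assumes that phased haplotypes are already available (alleles coded 0/1), applies no frequency threshold to the gametes, and uses a simple greedy scan to close a block as soon as a new SNP shows all four gametes with any SNP already in the block. Function names and the toy haplotypes are hypothetical.

```python
from itertools import combinations


def shows_four_gametes(haplotypes, i, j):
    """True if all four two-locus gametes (00, 01, 10, 11) occur at SNPs i and j."""
    gametes = {(h[i], h[j]) for h in haplotypes}
    return len(gametes) == 4


def recombinant_pairs(haplotypes):
    """All SNP pairs with evidence of historical recombination between them."""
    n_snps = len(haplotypes[0])
    return [
        (i, j)
        for i, j in combinations(range(n_snps), 2)
        if shows_four_gametes(haplotypes, i, j)
    ]


def partition_blocks(haplotypes):
    """Greedy partition into runs of SNPs with no four-gamete pair inside a run."""
    n_snps = len(haplotypes[0])
    blocks, start = [], 0
    for k in range(1, n_snps):
        if any(shows_four_gametes(haplotypes, i, k) for i in range(start, k)):
            blocks.append((start, k - 1))  # close the block before SNP k
            start = k
    blocks.append((start, n_snps - 1))
    return blocks


# Toy phased haplotypes: SNPs 0 and 1 are in complete association,
# SNP 2 shows all four gametes with both of them.
haps = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)]
pairs = recombinant_pairs(haps)
blocks = partition_blocks(haps)
```

With real data, the number of blocks, the number of SNPs per block and the block lengths would be summarized per chromosome and per population.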

As the number of polymorphic markers in EB, WB and PB was clearly below the number segregating in the cattle breeds, five subsets of SNP markers of decreasing size were selected at random from the Jersey breed. The Jersey breed was selected because (1) it had the lowest number of segregating SNPs, (2) the analysis showed that it had the lowest number of blocks, the highest number of SNPs per block and the largest average block size of all cattle breeds, and (3) the percentage of SNPs represented in blocks was highest in this breed. Based on these findings we predict that the Jersey breed has the lowest genetic diversity of the cattle breeds and hence that it is suitable for comparison with the bison.

Results

A total of 52,978 SNPs were genotyped in the cattle breeds and the European and American bison. There were 42,659 polymorphic SNPs in the seven Bos taurus breeds. Only 2,209 of the SNPs were polymorphic in the bison when EB, PB and WB were pooled. The average call rate was 97.60% (96.7–98.7%) for the bison and 99.57% (98.4–99.9%) for the cattle when all segregating SNPs were considered simultaneously. The bison samples showed no tendency to cluster within this interval, suggesting that EB, PB and WB have the same ability to hybridize to the bovine SNP sequences on the chip. Clustering here refers to the genotyping step carried out by the software, in which the clusters of the cattle samples all fall within a very narrow range of signal intensity (the intensity used by the scanner to differentiate the genotypes). The small difference in average call rates (1.97%) confirms the reliability of our comparisons and also reflects differences in genomic DNA between cattle and bison, such as the possible presence of other alleles or of deletions in the sequence surrounding the SNPs or at the SNP itself.

A total of 929 SNPs were found to be polymorphic in EB. More polymorphic SNPs were found in WB and PB (1,524 and 1,403 SNPs, respectively, see Fig. 1). Expected heterozygosity was lowest in EB (H E = 0.135) followed by WB (H E = 0.197) and PB (H E = 0.199) (Table 1). In cattle the lowest H E and P% were found in Jersey (H E = 0.278, P% = 83.30%) and the highest H E and P% were found in Aberdeen Angus (H E = 0.328, P% = 92.50%) (Table 1).

Fig. 1

The distribution of the 929, 1,524 and 1,403 polymorphic SNPs that were mapped to the BTA 4.0 assembly in the European bison (EB) (red), the wood bison (WB) (green) and the plains bison (PB) (blue), respectively. All the polymorphic SNPs are aligned according to their position on the chromosomes.

Table 1 Overview of the relative degree of polymorphism (P%) and expected heterozygosity (H E) found in the European bison (EB), plains bison (PB), wood bison (WB) and seven cattle breeds. The percent of polymorphic loci (P%) (5% criterion) for each bison species and for each cattle breed was calculated relative to the total number of loci that were polymorphic when the EB, WB and PB samples were pooled

Differences in H Oi among the bison samples were highly significant (one-way ANOVA; F = 291.2, P < 0.001). Post-hoc tests indicated that H Oi of PB and WB did not differ significantly from each other, but both had significantly higher H Oi than EB (Tukey’s test P < 0.001).

The chromosome alignment in Fig. 1 shows that there are common regions of shared polymorphism in EB, PB and WB. Of the 2,209 SNPs that were found to be polymorphic in at least one of the bison populations, 767 were polymorphic in only one of them (EB = 480, PB = 85 and WB = 202). There were clear differences in the distribution of segregating SNPs that were shared among EB, PB and WB (Fig. 2). Very few segregating SNPs were shared only between EB and PB (1.58%) or only between EB and WB (1.76%) when compared to the overlap between PB and WB (41.1%). In addition, 375 SNPs (16.98%) were polymorphic in all three bison populations.
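The partition of segregating SNPs into population-specific and shared classes is a plain set operation. The sketch below shows one way to tabulate the seven classes and their percentages relative to the pooled set; the SNP identifiers are invented placeholders and the function name is hypothetical.

```python
def sharing_classes(eb, pb, wb):
    """Count SNPs in each of the seven exclusive sharing classes.

    Returns {class name: (count, percent of the pooled polymorphic set)}.
    """
    pooled = eb | pb | wb
    classes = {
        "EB only": eb - pb - wb,
        "PB only": pb - eb - wb,
        "WB only": wb - eb - pb,
        "EB and PB only": (eb & pb) - wb,
        "EB and WB only": (eb & wb) - pb,
        "PB and WB only": (pb & wb) - eb,
        "EB, PB and WB": eb & pb & wb,
    }
    return {
        name: (len(snps), 100.0 * len(snps) / len(pooled))
        for name, snps in classes.items()
    }


# Placeholder SNP identifiers, for illustration only.
eb = {"snp1", "snp2", "snp5"}
pb = {"snp2", "snp3", "snp5"}
wb = {"snp3", "snp4", "snp5"}
table = sharing_classes(eb, pb, wb)
```

Because the seven classes are mutually exclusive, their counts sum to the size of the pooled set, which gives a quick consistency check on any such tabulation.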

Fig. 2

Haplotype block partitioning of the Bos taurus chromosome 14 (BTA 14) in the European bison (EB), the plains bison (PB) and wood bison (WB) together with the haplotype blocks found in Jersey after reduction of the data set. Hatch marks of all SNPs in a certain block are connected by a line. Each hatch represents a SNP

The mean distance ± SD between the polymorphic SNPs in bison was 2,777 kb ± 3,444 kb for EB, 1,870 kb ± 1,965 kb for PB and 1,710 kb ± 2,085 kb for WB. The mean SNP density was 0.36 SNP/Mb for EB, 0.53 SNP/Mb for PB and 0.58 SNP/Mb for WB. The variance of the distribution of the distances between the polymorphic loci along the chromosomes in EB, PB and WB was significantly higher than the mean of the distribution (P < 0.01), confirming that the polymorphic loci are aggregated rather than randomly or uniformly distributed.

The results of the haplotype block partitioning for the seven cattle breeds are shown in supplemental Table 1. As pointed out in the methods section, the Jersey breed is the least variable of all the breeds. Table 2 presents both the Jersey experiment, in which the number of SNPs in the dataset was reduced, and the bison haplotype block partitioning. The data show that the bison are less variable than the Jersey cattle. First, the percentage of SNPs represented in blocks decreased rapidly in Jersey as the number of SNPs in the analysis was reduced. Second, the much longer maximum length of the bison blocks, combined with the fact that the blocks contained more SNPs, also points to lower variability. The distribution of SNPs per block in the seven cattle breeds can be seen in supplemental Table 2. Figure 2 illustrates the haplotype block partitioning on BTA 14 in EB, WB and PB and in the Jersey dataset containing 1,188 SNPs. The figure shows that EB, WB and PB have greater overlap in block partitioning with each other than they have with the Jersey breed. Moreover, the two American bison subspecies were found to be the most similar. A complete comparison of the bison haplotypes can be seen in supplemental Fig. 1.

Table 2 Reduction in the amount of single nucleotide polymorphism (SNP) found in Jersey along with haplotype statistics compared to the bison haplotype statistics

Discussion

We observed a lower level of polymorphism and lower H E in EB than in WB and PB, which confirms the expectation, as the European bison underwent a more extreme bottleneck in population size at the beginning of the twentieth century (seven founders: three females and four males). The EB founder effect was further exacerbated by the skewed genetic contribution of the founder females, with one female contributing nearly six times more than the two other females (Wójcik et al. 2009). Furthermore, the Y chromosome of all contemporary Lowland line males originates from only one ancestor (Pucek et al. 2004). A certain level of ascertainment bias could have been introduced in this data set, because a bovine DNA chip was used to identify SNPs in two different, although closely related, species (genus Bison). To alleviate this problem, SNPs were ascertained by comparing the EB data set with the PB and WB data sets and selecting all the segregating sites originating from this comparison. In this way we have minimized the influence of ascertainment bias on the estimated level of genetic variability.

The low variability detected in this study agrees with theoretical expectation for populations which have undergone a severe bottleneck (Nei et al. 1975). Both PB and WB have been through a strong population size reduction (see Soper 1941; Wilson and Strobeck 1999; Freese et al. 2007; Hedrick 2009). The fact that WB and PB share a recent common ancestor could also partially explain the lack of differences between the average H Oi.

Inspection of the polymorphic SNPs in the bison (see Fig. 1) reveals long chromosomal regions fixed for one allele and leaves little doubt that the European and American bison have extremely depauperate genomes. There are several possible reasons for the presence of such haplotype blocks, including genetic hitchhiking, variable mutation and recombination rates, gene flow, drift and inbreeding (Hayes et al. 2003; Tenesa et al. 2007). It is difficult to differentiate between these possible reasons on the basis of the existing data. Sequencing of the bison genome would help to distinguish between them. However, it is likely that many of the shared blocks and highly polymorphic regions are ancestral: the small N e of the bison, the relatively low mutation rate of SNPs and the fact that the North American bison is a relatively recent evolutionary product, coming into existence about 4,000–5,000 Y.B.P. (Wilson and Strobeck 1999), make it unlikely that the observed polymorphisms are due to mutations that occurred recently in bison. Intensive sequencing of the bison genome would answer important evolutionary questions in this regard.

No comparison of SNP variation between bison and cattle breeds has been conducted here. This is because an ascertainment bias is introduced when comparing genetic variability in cattle and bison, as the markers on the chip were selected based on polymorphisms in cattle. Hence, comparisons between bison and cattle should be interpreted with caution. However, this selection of markers would not constitute a problem when comparing genetic variability between cattle breeds.

The low genome-wide level of genetic variability found in EB compared to PB and WB provides the best evidence yet for a low potential to adapt to a variable environment in a flagship species and might constitute a threat to the long-term persistence of this bison. Furthermore, despite rapid population growth in the last century, the N e of EB has increased only slowly (Tokarska et al. 2009). This is because the long-term N e is a function of the harmonic mean of the population sizes over generations, which is strongly influenced by the minimum population size reached (Lynch and Walsh 1998; Pertoldi et al. 2007).
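The effect of a single small generation on the harmonic mean is easy to see numerically. In the sketch below the long-term N e over t generations is taken as t divided by the sum of the reciprocals of the per-generation sizes; the population sizes are invented for illustration and are not estimates for any bison population.

```python
def long_term_ne(sizes):
    """Harmonic mean of per-generation effective sizes: t / sum(1 / N_i)."""
    return len(sizes) / sum(1.0 / n for n in sizes)


# Hypothetical trajectory: a large population that passes through one
# generation of seven breeders and then recovers completely.
sizes = [1000, 1000, 7, 1000, 1000]

arithmetic_mean = sum(sizes) / len(sizes)  # dominated by the large generations
harmonic_ne = long_term_ne(sizes)          # dominated by the bottleneck
```

In this toy trajectory the arithmetic mean stays above 800 while the harmonic mean falls to roughly 34, which is why census recovery alone does not restore the long-term N e.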

The domestication of the wild ox or aurochs (Bos primigenius), the direct ancestor of the extant cattle populations, started as early as 10,000 years B.P. (Bradley et al. 1996). Signs of human manipulation of cattle have also been documented by archaeological findings of cattle bones, which declined in size over time (Clutton-Brock 1999). The different degrees of genetic variability between cattle breeds (Table 1) are due to differences in N e and in the demographic history of the breeds. Evidence for selection in cattle on the basis of phenotypic traits has been documented for the 18th and 19th centuries (Myrdal 1994). Formal breed definition with herd-books began about 200 years ago. This process has tended to sharpen the differences between breeds, and a large amount of genetic variability has been lost through these breeding practices (Lenstra and Bradley 1999).

Genome-wide based breeding schemes designed to preserve rare alleles and minimize inbreeding (by estimating the true relationships between individuals) should be undertaken for the bison and other captive populations. Traditional methods for making breeding decisions to reduce the level of inbreeding (by increasing N e) use only pedigree information, which describes the expected relationship among individuals. Individuals with the same pedigree, however, still vary in the realized genetic relationship between them (Nielsen et al. 2007). Information obtained from genetic markers can therefore be useful in this respect, as it provides the realized genetic relationships. With the information obtained from the BeadChip it will be possible to create a SNP panel based on the polymorphisms described here and to use it in marker-assisted breeding. Information from genome-wide screening also enables detection of genes associated with inbreeding depression and hereditary genetic diseases. SNP-based association mapping has recently been used successfully to identify recessive mutations that cause inherited defects in livestock and dogs (Karlsson et al. 2007; Charlier et al. 2008). Low N e has increased the rate of inbreeding in many domestic species, increasing the expression of recessive deleterious alleles. Thus, detection of recessive deleterious alleles allows rapid control of emerging recessive defects.

Provided the genomic tools are available, similar methods can also be used on small populations held in zoological gardens or under semi-natural conditions. The most effective method to minimize drift and inbreeding is to equalize the contribution of offspring from all potential ancestors. This is generally realized by selecting for breeding, in each generation, those individuals that have the lowest average coancestry among them (minimum coancestry or equal contribution of parents) (Caballero and Toro 2000). The genetic information provided by the SNP chip will therefore make it possible to start an innovative breeding strategy that uses marker information to select the offspring with the minimum average probability of identity by descent, which may lead to an increase in both N e and genetic variability.
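The idea of minimum coancestry selection can be sketched as a small search problem. The example below assumes that a symmetric matrix of pairwise coancestry coefficients is already available (for instance estimated from SNP genotypes, a step not shown here), considers only coancestry between distinct individuals, and uses an exhaustive search that is practical only for small candidate sets. It is an illustration of the principle, not the method of Caballero and Toro (2000); the matrix values and function names are hypothetical.

```python
from itertools import combinations


def mean_pairwise_coancestry(theta, group):
    """Average coancestry over all pairs of distinct individuals in the group."""
    pairs = list(combinations(group, 2))
    return sum(theta[i][j] for i, j in pairs) / len(pairs)


def select_min_coancestry(theta, k):
    """Exhaustively pick the k individuals (k >= 2) with the lowest mean coancestry."""
    if k < 2:
        raise ValueError("k must be at least 2")
    candidates = combinations(range(len(theta)), k)
    return min(candidates, key=lambda g: mean_pairwise_coancestry(theta, g))


# Hypothetical coancestry matrix for four candidate breeders.
theta = [
    [0.50, 0.25, 0.05, 0.10],
    [0.25, 0.50, 0.10, 0.20],
    [0.05, 0.10, 0.50, 0.02],
    [0.10, 0.20, 0.02, 0.50],
]
best_pair = select_min_coancestry(theta, 2)
best_trio = select_min_coancestry(theta, 3)
```

For realistic herd sizes the exhaustive search would be replaced by an optimization over parental contributions, but the selection criterion stays the same.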

SNP arrays are so far available for only a few species. However, our study on bison and an ongoing project using the canine Affymetrix GeneChip to investigate genetic variation in grey wolves (Canis lupus) and domestic dogs (E. Randi, pers. comm.) illustrate that genome-wide scanning is not limited to model organisms, livestock or pet animal species. We expect that these new opportunities will have a huge impact on the genetic management of captive populations in the future and will potentially revolutionize the field of conservation genetics.