Introduction

The limited quality and quantity of nuclear DNA extracted from non-invasively collected samples, like single hairs (Vigilant 1999; Bengtsson et al. 2011), is a challenge for accurate genotyping (Gagneux et al. 1997; Goossens et al. 1998). Current methods rely on pooling hairs from the same individual to have sufficient DNA for accurate genotyping. In addition, genotyping is repeated to assess error rates (Taberlet et al. 1997; Goossens et al. 1998). However, these approaches are usually not applicable to hairs collected on hair traps. Pooling hairs from lure sticks can lead to erroneous genotyping when hairs belong to different individuals. In addition, single hair samples often yield too little DNA for accurate genotyping. Nevertheless, conservation and population genetic studies often rely on non-invasively collected samples, because it is an efficient way to sample elusive species (Valière et al. 2003; Schwartz et al. 2004; Henry and Russello 2011; Heurich et al. 2012; Barbosa et al. 2013). For instance, non-invasive hair sampling using lure stick traps has been put forward as a useful way to survey European wildcats (Felis silvestris silvestris; Kéry et al. 2011; Steyer et al. 2013).

Introgression with domestic cats (F. silvestris catus) is thought to be a threat to the European wildcat (Daniels et al. 2001; Oliveira et al. 2008; Randi 2008; Driscoll and Nowell 2010), which could lead to its genetic extinction (Rhymer and Simberloff 1996). Thus, it is crucial to monitor and better understand the process of introgression in wildcat populations. However, the microsatellite markers used so far to monitor wildcats based on hairs do not detect introgression reliably, since they are highly polymorphic and were designed to detect population structure or to recognize individuals (Hertwig et al. 2009; Say et al. 2012). Further, single nucleotide polymorphism (SNP) markers developed to detect introgression have so far been genotyped with Sanger sequencing, which requires DNA samples of high quality and quantity (Nussberger et al. 2013). These markers have not yet been adapted to DNA samples of low quality and quantity, like single hair samples.

In the present study, we provide a method that tackles these challenges to assess the introgression rates in wildcats based on non-invasive hair sampling using lure stick traps. We investigated (I) whether genotyping with a newly designed 96.96 Fluidigm SNP chip containing the previously mentioned SNP markers (Nussberger et al. 2013) reliably reflects genotypes generated with Sanger sequencing and (II) whether this chip yields reliable genotypes even in samples of low DNA quality and quantity, such as single hairs. Moreover, this chip represents a new set of SNP genotyping assays for high-throughput genotyping of European wildcats and domestic cats that enables the identification of individuals and the assessment of individual introgression levels from single hair samples.

Materials and methods

Cat samples were provided by the Centre for Fish and Wildlife Health in Berne, gamekeepers and private collections (Nussberger et al. 2013). Blood and tissue (muscle, liver, spleen) samples were stored at −20 °C until used and extracted using the DNeasy Blood & Tissue Kit (Qiagen), following the manufacturer’s protocol. Hair samples were plucked from known specimens and stored dry at room temperature for 15 to 53 months prior to DNA extraction. DNA was extracted with the Sample-to-SNP-kit (Applied Biosystems) using the following modified protocol. We checked every hair under the microscope for the presence of a root, placed each hair root singly into a 0.2-ml PCR tube, added 9 μl lysis solution and placed the tube in a thermocycler at 75 °C for 10 min and 95 °C for 4 min. Finally, we added 9 μl stabilization solution.

We quantified the cat-specific DNA amount available for genotyping in 16 singly extracted hairs (four single hairs from four individuals) using quantitative real-time PCR on a StepOnePlus instrument (Applied Biosystems). PCR contained 2 μl DNA, 10 μl FastStart Universal SYBR Green Master (ROX) 2× (Roche Applied Science), 6.64 μl molecular grade water, 0.16 μl BSA and 0.6 μl forward and 0.6 μl reverse cat-specific primer of 10 μM (F: ACGCACAACGTCTTGGAAC; R: TGGCCTTTTTAAGGATCACC, on conserved region of c-Myc proto-oncogene). Initial incubation was set to 10 min at 95 °C, followed by 50 cycles of 95 °C for 15 s and 60 °C for 1 min. Melt curve stage was 95 °C for 15 s, 60 °C for 1 min (step and hold +0.3 °C) and 95 °C for 15 s. Quadruple sets of four standards containing 10 ng/μl, 1 ng/μl, 100 pg/μl and 10 pg/μl domestic cat DNA, respectively, as well as one blank were amplified with the DNA samples of unknown quantity. We quantified the samples with StepOne Software v2.2 (Applied Biosystems).
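The standard-curve quantification described above can be sketched as follows. This is a minimal illustration and not the algorithm of StepOne Software: it assumes a linear relationship between the quantification cycle (Cq) and log10 concentration, the Cq values are invented for illustration, and the function names are hypothetical.

```python
import math
from typing import List, Tuple


def fit_standard_curve(conc_pg_per_ul: List[float], cq: List[float]) -> Tuple[float, float]:
    """Least-squares fit of Cq = slope * log10(concentration) + intercept."""
    xs = [math.log10(c) for c in conc_pg_per_ul]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cq) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cq))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def concentration_from_cq(cq: float, slope: float, intercept: float) -> float:
    """Invert the standard curve to obtain a concentration in pg/µl."""
    return 10 ** ((cq - intercept) / slope)


# The four standards named in the text, converted to pg/µl.
STANDARDS_PG_PER_UL = [10_000.0, 1_000.0, 100.0, 10.0]
# Invented Cq values, for illustration only (not measured data).
ILLUSTRATIVE_CQ = [20.0, 23.3, 26.6, 29.9]

slope, intercept = fit_standard_curve(STANDARDS_PG_PER_UL, ILLUSTRATIVE_CQ)
hair_conc = concentration_from_cq(28.0, slope, intercept)  # hypothetical hair sample
print(f"slope={slope:.2f}, intercept={intercept:.2f}, estimate={hair_conc:.1f} pg/µl")
```

Multiplying the estimated concentration by the template volume gives the DNA amount per reaction, which is the quantity compared against the input thresholds used in this study.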

To distinguish individuals and to assess introgression levels, we developed 96 Fluidigm SNPtype™ Assays for SNP genotyping (Fluidigm, San Francisco, USA). The set of assays contains nuclear SNP markers (Nussberger et al. 2013) as well as mitochondrial DNA (mtDNA) markers described by Driscoll et al. (2007): 75 nuclear markers with an FST value (genetic differentiation index) between wildcats and domestic cats ranging from 0.6 to 1 for introgression level diagnosis, 11 nuclear markers with FST values <0.5 and four mtDNA markers to distinguish individuals, four diagnostic mtDNA markers for maternal lineage assessment and two diagnostic Y-linked markers for sex determination and paternal lineage assessment. Assay primers and the sequences used to order them are shown in Online Resource 1. All assay primers were designed by Fluidigm.

Fluidigm SNP genotyping is analogous to the Amplifluor Genotyping System (for details, see Morin and McCarthy 2007). In the first step, two pre-amplification primers [locus-specific primer (LSP) and specific target amplification (STA) primer] amplify the target region containing the SNP to be genotyped. In the second step, an additional PCR amplifies a portion of that target region, using the LSP and two fluorescently labeled allele-specific primers, ASP1 and ASP2, which are internal primers containing either the first or the second allele, respectively. Finally, the SNP genotype is determined by measuring the fluorescence intensity of both alleles. All 96 SNPs are pre-amplified simultaneously in one multiplex PCR, for each sample separately, on a Veriti Thermal Cycler (Applied Biosystems), with the following conditions: hold at 95 °C for 15 min, 14 cycles at 95 °C for 15 s and 60 °C for 4 min. The second PCR is performed on a Fluidigm 96.96 Dynamic Array (SNP chip), where the reactions occur in separate nano-wells for each SNP and sample combination, allowing simultaneous genotyping of 96 samples at 96 SNP loci. This PCR is performed on a BioMark HD System (Fluidigm), with the following PCR cycling conditions: 50 °C for 2 min, 70 °C for 30 min, 25 °C for 10 min and 95 °C for 5 min, followed by four touchdown cycles (95 °C for 15 s, from 64 °C to 61 °C for 45 s, 72 °C for 15 s) and 34 additional cycles (95 °C for 15 s, 60 °C for 45 s, 72 °C for 15 s). The PCR ends with 1 cycle at 20 °C for 10 s (for details, see Fluidigm genotyping user guide).
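The cycling part of the second PCR can be written out as an explicit schedule. The sketch below is illustrative only: it assumes that the annealing temperature drops by 1 °C per touchdown cycle (64, 63, 62, 61 °C), which the text implies but does not state, and `build_cycles` is a hypothetical helper, not part of any instrument software.

```python
from typing import List, Tuple

Step = Tuple[float, int]  # (temperature in °C, duration in s)


def build_cycles(additional_cycles: int = 34) -> List[List[Step]]:
    """Return the touchdown cycles followed by the constant-temperature cycles."""
    # Assumption: annealing temperature decreases by 1 °C per touchdown cycle.
    touchdown = [
        [(95.0, 15), (float(anneal), 45), (72.0, 15)]
        for anneal in (64, 63, 62, 61)
    ]
    constant = [
        [(95.0, 15), (60.0, 45), (72.0, 15)]
        for _ in range(additional_cycles)
    ]
    return touchdown + constant


schedule = build_cycles()
print(len(schedule), "cycles; annealing temperatures of the first four:",
      [cycle[1][0] for cycle in schedule[:4]])
```

Keeping the number of additional cycles as a parameter makes it easy to represent protocol variants that differ only in cycle count.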

We genotyped blood and tissue samples of 20 cats following the manufacturer’s SNP genotyping protocol (see Fluidigm genotyping user guide). For hair samples, we modified the protocol as follows. In the pre-amplification step, we used 2 or 4 μl genomic DNA extraction solution to increase the total number of DNA copies in the reaction above an a priori threshold of 50 pg DNA per reaction. DNA was pre-amplified using 4 μl Qiagen Master Mix 2×, 0.8 μl specific target amplification primer pool and 1.2 μl molecular grade water. The pre-amplification PCR product was diluted 1:10. The number of additional cycles in the second PCR protocol was increased from 34 to 46 (Online Resource 2). We included eight references (two domestic cats, two wildcats, one first-generation hybrid in duplicate and one backcrossed wildcat in duplicate) and eight no template controls (NTCs, for fluorescence plot normalization) in each chip. Genotypes of the reference individuals were known from previous genotyping based on Sanger sequencing (Nussberger et al. 2013). Fluorescence plots for each SNP were provided by the Fluidigm SNP genotyping analysis software. All plots were checked visually and corrected for errors such as NTCs with fluorescence values >0.1 or clusters that were inconsistent with our reference samples. Except for the reference samples, we were naive to the true genotype of the samples during manual correction of the automatically generated calls. Three of the 75 diagnostic nuclear markers (Fst01_SNP033, Fst03_SNP149 and Fst33_SNP152) were excluded from further analysis, because their fluorescence plots were ambiguous.
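The choice between 2 and 4 μl of extract can be expressed as a simple rule. The sketch below is an assumption about how such a decision could be made from a measured concentration; the text gives only the two volumes and the 50 pg threshold, and `choose_template_volume` is a hypothetical name.

```python
from typing import Optional

THRESHOLD_PG = 50.0              # a priori threshold per reaction, from the text
ALLOWED_VOLUMES_UL = (2.0, 4.0)  # extract volumes used in the pre-amplification


def choose_template_volume(conc_pg_per_ul: float) -> Optional[float]:
    """Return the smallest allowed volume whose DNA content exceeds the threshold.

    Returns None when even the largest volume stays at or below the threshold.
    """
    for volume in ALLOWED_VOLUMES_UL:
        if conc_pg_per_ul * volume > THRESHOLD_PG:
            return volume
    return None


for conc in (30.0, 15.0, 5.0):  # invented concentrations in pg/µl
    print(conc, "pg/µl ->", choose_template_volume(conc))
```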

We tested the accuracy of our SNP genotyping assays by comparing loci genotyped by Sanger sequencing in a previous study (Nussberger et al. 2013) and by Fluidigm for 17 blood or tissue samples. We calculated the genotyping error rate as the number of mismatches between Sanger genotype and Fluidigm genotype, divided by the total number of diploid markers genotyped with both methods. To estimate the rate of allelic dropout and false alleles (Pompanon et al. 2005), we assumed that the genotyping based on Sanger sequencing (Nussberger et al. 2013) showed the true genotype of an individual.

We genotyped four cats from which we had blood or tissue (high-quality) samples as well as hair (low-quality) samples to test whether our SNP assays yield reliable genotypes for low-quality DNA samples. We independently analysed four hairs from each of the four individuals. For two individuals, we further duplicated these four low-quality samples from the DNA extraction onwards, thus generating 24 hair genotypes. We compared genotypes of high- and low-quality samples, both generated using the 96 Fluidigm SNPtype™ Assays as previously defined. We calculated the error rate in the genotypes from low-quality samples using the genotype of the high-quality sample as reference (genotypes are shown in Online Resource 3). Here, we defined the error rate as the number of loci with mismatches between the high- and low-quality sample genotypes divided by the total number of diploid loci genotyped. The proportion of false alleles was estimated as the number of homozygous loci in the reference genotype that were called heterozygous in the hair genotype, divided by the number of homozygous loci in the reference genotype. The proportion of allelic dropout was estimated as the number of heterozygous loci in the reference genotype that were called homozygous in the hair genotype, divided by the number of heterozygous loci in the reference genotype.
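The three quantities defined above can be computed as in the following sketch. It is an illustration under stated assumptions rather than the code used in this study: genotypes are represented as allele pairs per locus, a non-called locus (`None`) is counted as a mismatch, and all names and the toy data are hypothetical.

```python
from typing import Dict, Optional, Tuple

Genotype = Optional[Tuple[str, str]]  # None marks a non-called locus


def error_metrics(reference: Dict[str, Genotype],
                  sample: Dict[str, Genotype]) -> Dict[str, float]:
    """Compare a low-quality genotype against a high-quality reference."""
    n_loci = n_mismatch = n_hom = n_het = n_false = n_dropout = 0
    for locus, ref in reference.items():
        if ref is None:
            continue  # locus without reference genotype is not evaluated
        n_loci += 1
        ref_het = ref[0] != ref[1]
        if ref_het:
            n_het += 1
        else:
            n_hom += 1
        call = sample.get(locus)
        # Assumption: a non-called locus counts as a mismatch.
        if call is None or sorted(call) != sorted(ref):
            n_mismatch += 1
        if call is not None:
            call_het = call[0] != call[1]
            if not ref_het and call_het:
                n_false += 1    # false allele: homozygote called heterozygous
            elif ref_het and not call_het:
                n_dropout += 1  # allelic dropout: heterozygote called homozygous
    return {
        "error_rate": n_mismatch / n_loci if n_loci else 0.0,
        "false_allele_rate": n_false / n_hom if n_hom else 0.0,
        "dropout_rate": n_dropout / n_het if n_het else 0.0,
    }


# Toy data, invented for illustration.
tissue = {"snp1": ("A", "A"), "snp2": ("A", "G"), "snp3": ("C", "C"), "snp4": ("T", "T")}
hair = {"snp1": ("A", "A"), "snp2": ("G", "G"), "snp3": ("C", "T"), "snp4": None}
print(error_metrics(tissue, hair))
```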

Finally, we checked whether the errors in the 24 hair genotypes affect the assessment of identity and introgression levels. We used Gimlet (Valière 2002) to recognize individuals. Here, we considered an individual as recognized when at least 95 % of all examined SNP genotypes of two samples were identical. We assessed individual introgression level based on 72 diagnostic nuclear SNP markers, using Bayesian model-based clustering to compute posterior probabilities for six different hybrid classes (two parental classes, hybrids of first and second generation, and two backcrosses) in NewHybrids (Anderson and Thompson 2002). We checked whether the hybrid class attributed to each of the hair genotypes was consistent within individuals. As a further control, we checked whether the hair genotypes of one individual led to the same hybrid class as the tissue genotype.
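The 95 % identity criterion can be illustrated with a short sketch. This is not Gimlet's algorithm; it only shows the threshold rule stated above, assuming that loci without a call in either sample are left out of the comparison. Function names are hypothetical.

```python
from typing import Dict, Optional, Tuple

Genotype = Optional[Tuple[str, str]]  # None marks a non-called locus


def proportion_identical(a: Dict[str, Genotype], b: Dict[str, Genotype]) -> float:
    """Share of loci called in both samples that carry the same genotype."""
    # Assumption: loci missing or non-called in either sample are skipped.
    shared = [
        locus for locus in a
        if a[locus] is not None and b.get(locus) is not None
    ]
    if not shared:
        return 0.0
    same = sum(1 for locus in shared if sorted(a[locus]) == sorted(b[locus]))
    return same / len(shared)


def same_individual(a: Dict[str, Genotype], b: Dict[str, Genotype],
                    threshold: float = 0.95) -> bool:
    """Apply the identity threshold from the text (at least 95 % identical)."""
    return proportion_identical(a, b) >= threshold


# Toy genotypes, invented for illustration.
hair_1 = {"snp1": ("A", "A"), "snp2": ("A", "G"), "snp3": ("C", "C")}
hair_2 = {"snp1": ("A", "A"), "snp2": ("G", "A"), "snp3": None}
print(proportion_identical(hair_1, hair_2), same_individual(hair_1, hair_2))
```

Sorting the allele pair before comparison makes the check independent of allele order, so ("A", "G") and ("G", "A") are treated as the same heterozygous genotype.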

Results and discussion

Here, we present a SNP genotyping method that is reliable even in samples of low quantity and quality: genotyping error rates in single hair samples were low and altered neither identity nor introgression level assessment. However, a minimal amount of genomic DNA of about 200 pg is recommended. We believe that this genotyping method is applicable to detecting introgression in wildcats based on non-invasive samples.

Four out of 17 individual Fluidigm genotypes based on high-quality (tissue) samples contained errors when compared to Sanger genotypes (Table 1 (a)). The genotyping error rate per locus estimated from comparisons between Sanger and Fluidigm was 0.9 %. Further, SNP genotypes were consistent between the four hair samples and the reference sample for all four individuals analysed (Table 1 (b)). Overall, the genotyping error rate per locus was 1.6 %. Non-called loci were the most commonly observed error type. In the 16 hair genotypes having at least 200 pg DNA in STA pre-amplification, the overall error rate was 0.7 %, allelic dropout was not observed, false alleles occurred in 0.1 % of all homozygous SNP calls and non-called loci occurred in 0.6 % of all SNP loci. The error rates observed here are somewhat below most error rates estimated in studies using non-invasive sampling summarized by Valière et al. (2007), where allelic dropout ranges between 0 and 31.3 % and false alleles between 0 and 4 %, even though DNA was extracted from more than one hair in most of those studies. The low error rate observed in our study could partly be due to the use of biallelic and almost diagnostic SNP markers instead of polymorphic microsatellite markers. All the studies referred to in Valière et al. (2007) used polymorphic microsatellites. Moreover, the SNP assays we used here are substantially shorter (between 52 and 116 bp, mean = 82 bp) than average microsatellite fragments of around 150 bp and are therefore less sensitive to genotyping errors in highly fragmented DNA.

Table 1 Genotyping errors in cats (Felis silvestris) with Fluidigm SNPtype Assays when evaluating Fluidigm genotypes with Sanger sequencing genotypes as a reference and hair sample genotypes with tissue sample genotypes as a reference. DNA input quantity for specific target amplification is given in picograms (>10 ng if not indicated)

Our data show that, if samples contain over 200 pg DNA, SNP genotyping error rates are negligible. Therefore, repeats of samples with >200 pg DNA are theoretically no longer needed. However, we would still recommend repeating a small part of the samples for quality control. In addition, our threshold value of 200 pg is an empirical value relative to our real-time PCR (RT-PCR) standards and therefore should not be generalized. In fact, using different RT-PCR standards may yield slightly different threshold values. Thus, we recommend performing a short pilot study to determine the reliability threshold value for each new set of standards. This pilot study should genotype, as we did here, single hairs as well as high concentration samples from the same individuals.

The quantification of input DNA through RT-PCR is a crucial step, since in some cases even samples with very small DNA input amounts (<200 pg) yield an apparently complete genotype. However, these genotypes may contain many false alleles. Such highly heterozygous genotypes might be misinterpreted as hybrids if they are not excluded through the previous quantification step. For example, Morin et al. (2001) demonstrated that PCR failures drastically increased below 100 pg in orang-utan hair and faecal samples. Thus, it is crucial to accurately quantify the DNA available in a sample prior to genotyping in order to anticipate genotype quality (Morin et al. 2001; Beja-Pereira et al. 2009).

The high number of SNP markers and the low genotyping error rates in hair samples allow an accurate assessment of identity and introgression level. Gimlet attributed all except one hair sample to the correct individual out of the four genotyped individuals (Online Resource 4). Sample HK87_1, with 73 pg of genomic DNA in the STA, had only 92 % of loci identical with the other three hair genotypes from this individual and was thus considered not correctly identified. The four DNA extractions from single hairs of the same individual always led to the same hybrid category as the reference genotype with a minimum posterior probability >0.99, even in the samples with the highest number of observed errors (Online Resource 5). The high accuracy of the introgression level assessment presented here was previously demonstrated (Nussberger et al. 2013) and mainly relies on numerous independently inherited diagnostic SNP markers with a strong differentiation in allele frequencies between wildcats and domestic cats. Thus, the introgression level in wildcat populations can now be assessed without invasive sampling and with more statistical power than shown in previous studies (Oliveira et al. 2008; Hertwig et al. 2009; Say et al. 2012). This represents a major improvement for the conservation of the European wildcat, since representative DNA sampling from this elusive species relies mostly on non-invasive sampling.

An additional challenge when dealing with non-invasive sampling is the accurate identification of the studied species. For example, Monterroso et al. (2013) showed that the accuracy of wildcat scat identification was low (11.5 %) when based on the morphology of scat alone. Thus, it is worthwhile to include genetic identification in non-invasive studies (Oliveira et al. 2010). With the method presented here, identification of the species F. silvestris ssp. is assured by the use of cat-specific primers already in the first DNA quantification step (quantitative real-time PCR). A preliminary test (data not shown) showed that the application of these primers to high-quality blood or tissue samples (20 ng/μl) of human (Homo sapiens), squirrel (Sciurus vulgaris), stone marten (Martes foina), pine marten (Martes martes), European badger (Meles meles), brown hare (Lepus europaeus), raccoon dog (Nyctereutes procyonoides), European lynx (Lynx lynx) and red fox (Vulpes vulpes) did not yield any PCR product exceeding a concentration of 2 pg/μl. Thus, we concluded that hair samples from species other than F. silvestris are effectively eliminated prior to the subsequent SNP assay, which consequently becomes more efficient and cost-effective.

In conclusion, the presented method allows simultaneous genotyping of 96 SNP markers in 96 samples even with DNA of low quality and quantity. This protocol is suitable for non-invasively collected hair samples and can further be applied to other low-quality DNA samples, such as faeces or historical specimens. The SNP chip presented here will help conservationists to monitor the introgression rate in wildcat populations based on non-invasive sampling and thus to better understand the process of hybridization.