Introduction

Species identification of biological specimens such as bloodstains, hairs, tissues, and bones is one of the most important aspects in forensic science. These investigations have conventionally been performed using immunological and morphological tests. The former type of test only discriminates between human and non-human origin because of insufficient commercial support and the limited stability of antibodies against various species. The latter requires skill for the characterization of biological specimens such as tiny animal hairs and bones. It is very difficult to obtain information about non-human samples using these tests.

With the recent progress in molecular biology, new methods have been developed on the basis of the genetic differences among species. The cytochrome b (CYB) [1–8], cytochrome oxidase I [9], and 12S [10–12] and 16S ribosomal RNA (rRNA) genes [12, 13] in mitochondrial DNA (mtDNA) have been widely used to discriminate among species. The D-loop has also been targeted with universal primers that amplify similar-sized fragments in all species [14–16]. Consequently, sequencing [1–3, 5, 6, 8–12, 14, 16], restriction fragment length polymorphism analysis [4, 15], and species-specific polymerase chain reaction (PCR) [7, 13] are indispensable for species discrimination. Due to the high copy number of mtDNA (∼100 to >1,000 copies/cell), these analytical methods are highly sensitive and reliable and are very useful for poor-quality samples such as old bloodstains, animal hairs, bird feathers, and bone particles [2, 3, 5, 7, 11–13]. Another advantage of these assays is that a BLAST search can be used to determine the species of a sample, even if there is no reference sample [2, 3, 5, 11, 12]. Recently, to distinguish mammal species in mixtures, a multiplex assay was developed based on different sizes of PCR products using many species-specific primers designed on the CYB gene [17].

The TP53 tumor suppressor gene [18] and the 28S rRNA gene [19] in nuclear DNA have also been chosen for species identification. Amplification of these genes produces different sized fragments in different animals, and appropriate gel electrophoresis enables easy and rapid classification. Bellis et al. [18] previously compared five markers (β-actin, CYB, D-loop, 28S rRNA, and TP53) for identification of 11 mammal and bird species and reported that the TP53 gene was the most suitable for forensic purposes. However, they did not investigate low-quality samples. Methods using nuclear DNA are often not applicable to highly degraded samples.

In this study, two novel systems for species identification based on the size variation of mtDNA hypervariable regions (mtDNA-HV) among animals are reported. These systems distinguish not only among mammals but also among birds and fishes, which are often encountered in human environments. The present systems have been successfully applied to forensic cases.

Materials and methods

Samples

Biological samples (e.g., blood, hair, muscle, and buccal swabs) were obtained from 41 animals, including 18 mammals, four birds, and 19 fishes (Electronic supplementary material Table S1). DNA extraction from the samples was performed using a QIAamp DNA Mini Kit (Qiagen, Hilden, Germany) following the manufacturer’s protocol. DNA extracts from samples except for those from hairs were quantified using a DU® 640 UV spectrometer (Beckman, Fullerton, CA, USA).

Conventional system for species identification

A method was developed using non-fluorescent primers and agarose gel electrophoresis. The sequences and priming positions of the PCR primers used in this study are shown in Table 1 and Fig. 1. The mammal primer set consisted of two primer pairs, mt-U1/mt-U2 and mt-HV2F/mt-HV2R, that were specific for mammals and humans, respectively. The bird primer set (mt-BdF6-7/mt-BdR4) and the fish primer set (mt-FhF3-5/mt-FhR3-5) were specific for birds and fishes, respectively. Three separate multiplex PCR reactions for mammals, birds, and fishes were carried out as follows: the PCR reaction mixture (25 μl) contained 1 μl of the extracted DNA (0.1–1 ng of DNA), 1× supplemented buffer, 0.2 mM of dNTPs, 0.2 μM of each primer except for the primers mt-HV2F/mt-HV2R (0.3 μM each), 0.16 mg/ml of BSA, and 0.25 units of AmpliTaq Gold® DNA polymerase (Applied Biosystems, Foster City, CA, USA). PCR was performed in a GeneAmp® 9700 thermal cycler (Applied Biosystems) at 94°C for 11 min, followed by 32 cycles of 94°C for 1 min, 56°C for 1 min, 72°C for 1.5 min, and a final extension at 72°C for 6 min. The extraction reagent blank and a negative PCR control were amplified in parallel. The amplification products were visualized by electrophoresis of 4 μl aliquots on a 3% Nusieve® 3:1 agarose gel (Cambrex, Rockland, ME, USA). The size of the products was estimated by comparison with a Trackit™ 100 bp DNA ladder (Invitrogen, Carlsbad, CA, USA).
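The reaction set-up and cycling program above reduce to simple arithmetic, which the following minimal sketch works through. It uses only the final concentrations, the 25 μl volume, and the hold times stated in the text; the function names are illustrative, and the total run time ignores ramping between temperatures, so an actual run would take longer.

```python
REACTION_VOLUME_UL = 25.0  # reaction volume stated in the text


def amount_per_reaction(final_conc, volume_ul=REACTION_VOLUME_UL):
    """Concentration x volume.

    Units pair up as: uM * ul = pmol, mM * ul = nmol, mg/ml * ul = ug.
    """
    return final_conc * volume_ul


def program_minutes(initial_min, cycles, step_minutes, final_extension_min):
    """Sum of the programmed hold times only (ramping is ignored)."""
    return initial_min + cycles * sum(step_minutes) + final_extension_min


primer_pmol = amount_per_reaction(0.2)        # each primer at 0.2 uM
human_primer_pmol = amount_per_reaction(0.3)  # mt-HV2F / mt-HV2R at 0.3 uM
dntp_nmol = amount_per_reaction(0.2)          # dNTPs at 0.2 mM
bsa_ug = amount_per_reaction(0.16)            # BSA at 0.16 mg/ml

# 94 C for 11 min, 32 cycles of 1 + 1 + 1.5 min, final extension of 6 min
conventional_min = program_minutes(11, 32, (1, 1, 1.5), 6)

print(primer_pmol, human_primer_pmol, dntp_nmol, bsa_ug, conventional_min)
```

Under these assumptions each reaction contains 5 pmol of each primer (7.5 pmol for the human-specific pair), and the programmed hold times add up to 129 min.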

Fig. 1

The strategy of mtDNA-HV amplification for species identification. The size of the products obtained with the primers mt-U1/mt-U2, mt-BdF6-7/mt-BdR4, and mt-FhF3-5/mt-FhR3-5 differed among species. The primers mt-HV2F/mt-HV2R were human specific, and products containing mtDNA-HV2/3 were observed in human DNA only

Table 1 Sequences of the primers for species identification

Automatic system for species identification

To develop an automatic species identification system based on the size variation among PCR products from various animal mtDNA, we used six fluorescent dye-labeled primers, 6FAM-mt-U1, 6FAM-mt-HV2F, HEX-mt-BdR4, and NED-mt-FhR3-5, instead of the non-fluorescent primers mt-U1, mt-HV2F, mt-BdR4, and mt-FhR3-5. The multiplex PCR reaction mixture and conditions were identical to those of the abovementioned amplification methods, except that the final extension time was 30 min. Each sample for analysis in the ABI Prism® 310 Genetic Analyzer (Applied Biosystems) was prepared by adding 0.5 μl of PCR product to 20 μl of Hi-Di™ formamide (Applied Biosystems) containing 0.4 μl of X-Rhodamine MapMaker® 1000 (BioVentures, Murfreesboro, TN, USA). The samples were separated at 1.5 kV for 40 min at 60°C using POP-4 polymer. Data from the ABI 310 were analyzed using GeneMapper™ ID software v3.2 (Applied Biosystems).

Sequencing and database research

The remainders of the PCR products after species identification were purified using Microcon® YM-100 centrifugal filter devices (Millipore, Billerica, MA, USA) as described by the manufacturer. DNA sequencing was performed using a BigDye™ v3.1 Terminator Cycle Sequencing Kit (Applied Biosystems) and an ABI Prism® 310 Genetic Analyzer (Applied Biosystems). The primers for sequencing were identical to those for the abovementioned PCR, except that mt-U2seq (5′-TGGCCCTGAAGTAAGAACCAGATG-3′) was used as a sequencing primer instead of mt-U2. After the sequencing analysis, the obtained nucleotide sequences were subjected to a BLAST search (http://www.ddbj.nig.ac.jp/search/blast-j.html) for comparison with the numerous animal mtDNA sequences on the DDBJ/EMBL/GenBank sequence database.

Results

Conventional system for human and mammal identification

To search for highly conserved regions, we compared the nucleotide sequences of mtDNA-HV and their surrounding regions in humans and the 12 other mammals that were available from the DDBJ/EMBL/GenBank database (Electronic supplementary material Table S2). As shown in Table 1 and Fig. 1, the universal primers mt-U1 and mt-U2 were designed to hybridize to highly conserved portions. The nucleotide sequences of these primers were the same as those of dogs and differed by at most 3 bp from those of other animals, but the 3′-regions of all animals were identical to each other. This primer set amplified the mtDNA-HV corresponding to human mtDNA-HV1. The size of PCR products ranged from about 350 bp in weasels to 900 bp in goats, indicating that these animals were distinguishable from each other. Human DNA resulted in a product of about 550 bp in size, which was similar to that of cow. To resolve this problem, a second primer set, mt-HV2F/mt-HV2R, was prepared, which was specific for human mtDNA (Fig. 1). A product containing mtDNA-HV2/3 was observed in the human DNA only (about 600 bp). When duplex PCR was carried out using the mammal primer sets, the human DNA showed two bands, whereas all of the animal DNA showed one band except for cat DNA. The cat products showed a few bands of over 700 bp because of a length heteroplasmy arising from the variable number of tandem repeats in the mtDNA-HV [20–22]. The sizes of the products from mammals were consistent with the sizes estimated from the corresponding sequences of animal mtDNAs in the DDBJ/EMBL/GenBank database (Electronic supplementary material Table S2). Thus, the mammals were successfully identified (Fig. 2a). The sensitivity of this method was assessed by analysis of serial dilutions of total DNA extracted from human buccal swabs. When >10 pg of template was used in the PCR, two human bands were readily detected. When >10 ng of DNA was applied, some species, especially cats, showed additional bands in the agarose gel. These results may be attributable to co-amplification of nuclear mitochondrial pseudogenes (Numt) [20, 23]. The addition of excessive template was therefore undesirable for this PCR reaction. The primers mt-U1 and mt-U2 were specific for mammals, and no product was observed in the birds and fishes tested.
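Estimating a product size from a database sequence, as described above, amounts to locating both priming sites on the reference and measuring the distance between them. The following is a minimal sketch of that idea, not the procedure actually used in this study: the primer and template sequences are toy placeholders (the real primer sequences are in Table 1), and only exact matches are searched, whereas the real primers tolerate up to 3 mismatches in some species.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def predicted_amplicon_size(template, forward, reverse):
    """Predicted product length in bp, both primers included.

    Returns None if either priming site is not found by exact match.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return end + len(reverse_site) - start


# Toy example: hypothetical 8-mer primers flanking a 30 bp insert.
forward = "ACGTTGCA"
reverse = "GGATCCTA"
template = "TTTT" + forward + "C" * 30 + reverse_complement(reverse) + "GG"

size = predicted_amplicon_size(template, forward, reverse)

# An extra tandem repeat unit between the primers lengthens the product by
# the unit length, as with the 80 bp repeat reported for cat mtDNA-HV.
longer = "TTTT" + forward + "C" * 30 + "G" * 80 + reverse_complement(reverse)
size_with_repeat = predicted_amplicon_size(longer, forward, reverse)
```

Here `size` is 46 bp (8 + 30 + 8) and `size_with_repeat` is 126 bp, which illustrates why length heteroplasmy in a tandem repeat appears as several discrete bands.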

Fig. 2

Amplification of mtDNA-HV for species identification. a Products from mammals obtained with the mammal primer set. Lanes 1 ferret, 2 raccoon dog, 3 dog, 4 rat, 5 horse, 6 pig, 7 and 17 human, 8 cow, 9 rabbit, 10 cat, 11 goat, 12 negative control, 13 weasel, 14 whale, 15 sheep, 16 Japanese monkey. b Products from birds obtained with the bird primer set. Lanes 1 quail, 2 turkey, 3 duck, 4 and 5 chicken. c Products from fishes obtained with the fish primer set. Lanes 1 thread-sail filefish, 2 yellow goosefish, 3 globe fish, 4 crucian carp, 5 bastard halibut, 6 chicken grunt, 7 pacific saury, 8 salmon, 9 eel, 10 red seabream, 11 Japanese sardine, 12 round herring. M Trackit™ 100 bp ladder (Invitrogen)

Conventional system for bird and fish identification

In the same way as for the design of the mammal primer set, we searched for conserved regions among birds and fishes and chose the corresponding regions as the priming positions for amplification (Fig. 1). We designed several primers for each set because the sequences of these regions contained some mutation sites that could have prevented stable primer annealing. The bird primer set was created with two forward primers, mt-BdF6 and mt-BdF7, and one reverse primer, mt-BdR4 (Table 1). The amplification products from the bird primer set are shown in Fig. 2b. The products from birds, which were approximately 530–600 bp in length, differed from each other. The fish primer set consisted of mt-FhF3-5 and mt-FhR3-5 (Table 1). Figure 2c shows the different sizes of fragments for 12 of the 19 fishes. The sizes ranged from about 380 bp in a thread-sail filefish to 900 bp in a round herring; however, in most fishes, the products were between 400 and 500 bp in length.

Automatic system for species identification

Because three separate PCR reactions were necessary for the abovementioned conventional system, which used non-fluorescent primers and agarose gel electrophoresis, a multiplex PCR system using fluorescent mammal, bird, and fish primer sets was developed. The products from animals, obtained by the multiplex PCR reaction mentioned earlier, were analyzed in an ABI Prism® 310 Genetic Analyzer (Applied Biosystems). The products from mammals, birds, and fishes were detected in the blue, green, and yellow windows, respectively, which corresponded to the color of each fluorescent dye-labeled primer (6FAM, HEX, and NED). The product sizes were estimated using the GeneMapper™ ID software (Applied Biosystems). The size ± threshold value (base pair) of bins was defined for each animal (Electronic supplementary material Table S1). Since insertion/deletion of nucleotide(s) may occur in mtDNA-HV, we set ±2 bp as the threshold value for fragments shorter than 600 bp and ±3 bp as the threshold value for longer fragments. For animals with a similar fragment size, the animal names were displayed together, e.g., Mouse, Rat and Hum1, Cow. The amplicons obtained with the human-specific primers 6FAM-mt-HV2F/mt-HV2R showed fragment sizes ranging from 592 to 601 bp among individuals. This difference arose from the number of cytosines in the C-stretch region in mtDNA-HV3. We set 597 ± 5 bp as the size and threshold value of these fragments and named this bin Hum2. Analysis of cat DNA revealed a few fragments differing in size by multiples of an 80 bp repeat unit, and these fragments were named Cat3-Cat5, depending on the number of repeat units. Figure 3 shows some representative electropherograms obtained from various animals. By automated species calling using the ABI Prism® 310 Genetic Analyzer and the GeneMapper™ ID software, all animals investigated were successfully identified.
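The bin logic described above can be sketched in a few lines. This is only an illustration of the size ± threshold rule, not the GeneMapper™ ID implementation: the `Bin` class and function names are hypothetical, and apart from Hum2 (597 ± 5 bp, from the text) the bin centers below are placeholders, since the actual values are listed in Table S1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bin:
    name: str       # label displayed for a peak falling in this bin
    dye: str        # dye of the labeled primer: "6FAM", "HEX", or "NED"
    size_bp: float  # bin center
    tol_bp: float   # accepted deviation from the center


def default_tolerance(size_bp):
    """+/-2 bp for fragments shorter than 600 bp, +/-3 bp for longer ones."""
    return 2.0 if size_bp < 600 else 3.0


def make_bin(name, dye, size_bp, tol_bp=None):
    """Build a bin, applying the default tolerance unless one is given."""
    if tol_bp is None:
        tol_bp = default_tolerance(size_bp)
    return Bin(name, dye, size_bp, tol_bp)


def call_peak(dye, size_bp, bins):
    """Return the names of all bins in the same dye channel that contain the peak."""
    return [b.name for b in bins
            if b.dye == dye and abs(size_bp - b.size_bp) <= b.tol_bp]


bins = [
    make_bin("Hum1, Cow", "6FAM", 550.0),       # center is a placeholder
    make_bin("Hum2", "6FAM", 597.0, tol_bp=5),  # 597 +/- 5 bp, as in the text
    make_bin("Goat", "6FAM", 900.0),            # center is a placeholder
    make_bin("Chicken", "HEX", 560.0),          # center is a placeholder
]

human_calls = [call_peak("6FAM", 551.2, bins), call_peak("6FAM", 594.3, bins)]
```

With these placeholder bins, a human sample would yield one peak called `Hum1, Cow` and a second called `Hum2`, so it is the second, human-specific peak that separates human from cow. A peak outside every bin returns an empty list, which corresponds to the casework situation where sequencing and a BLAST search are needed.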

Fig. 3

Automated electropherogram of several animals. The species name (upper) and the fragment size (lower) are automatically displayed

Confirmation by sequencing of PCR products

The products from mammals were successfully sequenced using the primers mt-U1 and mt-U2seq. Although the human products consisted of two fragments, mtDNA-HV1 and mtDNA-HV2/3, the presence of the mtDNA-HV2/3 fragment did not interfere with the sequencing of mtDNA-HV1. Sequencing of the products from birds and fishes was performed using a mixture of several sequencing primers, i.e., mt-BdF6-7, mt-FhF3-5, or mt-FhR3-5, because it was not known which of these primers had participated in the amplification. The redundant primers in the sequencing reaction did not interfere with the analysis.

Application to criminal cases

The present systems have been used in species identification of various biological samples from 12 criminal cases (Table 2). Cases 1 to 8 were subjected to the conventional system and cases 9 to 12 to both systems. In nine cases, the species was successfully identified by comparison with the reference DNA markers and by automated species calling. The amplicons from bloodstains in the other cases (3, 5, and 7) did not coincide with the reference DNA markers when visualized on agarose gel electrophoresis. A subsequent BLAST search permitted successful identification of these species. These bloodstains were found to belong to a Japanese weasel (Mustela itatsi), a sea bass (Lateolabrax japonicus), and a deer (Cervus nippon), respectively. In cases 1, 10, and 12, we were requested to identify species and individuals. Case 1 was a dog attack on an old woman involving two suspected dogs. Agarose gel electrophoresis of the products from animal hairs, which were collected from the body of the woman, revealed they had the expected fragment size for dogs. Furthermore, two nucleotide sequences in the products obtained from the animal hairs were identical to those of the two suspected dogs. For further confirmation, we analyzed the nucleotide sequence of samples taken from 82 different dogs and compared sequences of about 180 bp. A total of 21 haplotypes (DH1-21, Electronic supplementary material Table S3) were found with frequencies ranging from 0.012 to 0.195. The two suspected dogs belonged to haplotypes DH2 (0.195) and DH4 (0.024). Tissues in the gastric contents from the dog with haplotype DH2 were of human origin, and nuclear DNA typing showed the old woman’s profile. Case 10 was a robbery case, where the PCR product from one animal hair collected from an offender’s shoe coincided in size with the reference DNA markers for cat. The sequence from the cat hair was identical to that of a cat hair that was collected in the suspect’s house and identified using the above systems (he had once kept a cat at home). The haplotypes obtained from both cat hairs were identical to each other and turned out to be CH7 (0.162), one of 16 haplotypes (CH1-16, Electronic supplementary material Table S4), ranging in frequency from 0.027 to 0.189, classified in 37 different cats by comparing nucleotide sequences of about 470 bp. Case 12 was another robbery case. The PCR product from one animal hair adhering to a sock of a suspect showed the same size as the reference DNA marker for dog. The sequence from the dog hair was typed as DH1 (0.134) and was identical to that of one of two dogs that were kept indoors by the victim. The other dog was of type DH2.
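The haplotype frequencies quoted above are simple proportions of the sampled animals. The sketch below reproduces them by the counting method; note that the per-haplotype counts are not reported in the text and are back-calculated here as assumptions from the stated frequencies and sample sizes (82 dogs, 37 cats). The actual counts are in Tables S3 and S4.

```python
def haplotype_frequency(count, n_sampled):
    """Proportion of sampled animals carrying a given haplotype."""
    return count / n_sampled


N_DOGS = 82  # dogs sequenced, from the text
N_CATS = 37  # cats sequenced, from the text

# Assumed counts, inferred from the reported frequencies (not from the text).
assumed_dog_counts = {"DH1": 11, "DH2": 16, "DH4": 2}
assumed_cat_counts = {"CH7": 6}

dog_freqs = {h: round(haplotype_frequency(c, N_DOGS), 3)
             for h, c in assumed_dog_counts.items()}
cat_freqs = {h: round(haplotype_frequency(c, N_CATS), 3)
             for h, c in assumed_cat_counts.items()}

# The lowest reported frequencies match a haplotype seen in a single animal.
rarest_dog = round(haplotype_frequency(1, N_DOGS), 3)
rarest_cat = round(haplotype_frequency(1, N_CATS), 3)
```

Under these assumed counts the rounded values agree with the reported figures (DH1 0.134, DH2 0.195, DH4 0.024, CH7 0.162). With samples of this size, a frequency is only a rough indication of how often an unrelated animal would share the haplotype, so a match to a common type such as DH2 carries limited weight on its own.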

Table 2 Casework results using the present system for species identification

Discussion

Two major principles for species identification by molecular analysis have been described to date. One is based on differences in nucleotide sequences. Universal primers directed toward highly conserved regions can amplify DNA from most animals. This type of investigation requires confirmation of PCR products by electrophoresis followed by nucleotide sequencing of PCR products [1–3, 5, 6, 8–12, 14]. It is rather tedious but exact. The other is based on the differences in the size of amplicons that are identified by electrophoresis [17–19]. This technique is rapid and simple but may sometimes be inexact [18]. As an amplification target, both nuclear DNA and mtDNA have been used. It is advantageous to target mtDNA because of its high copy number and the substantial database of its sequences. The mtDNA-HV analysis has been used for human identification of specimens such as shed hairs, old bones, and degraded DNA samples [24–26].

Kocher et al. [14] designed universal primers to amplify the D-loop of mtDNA. However, the fragments obtained were similar in size among all species tested. In this study, we have found highly conserved regions in the D-loop and its flanking regions among various species. The use of primers for amplification of mtDNA-HV permitted us to distinguish between species on the basis of size variation. We developed systems for species identification of mammals, birds, and fishes, which are often encountered in human environments. One pair of primers for mammals was able to amplify mtDNA-HV in all of the 18 mammals investigated here, suggesting that these universal primers can amplify mtDNA-HV from many other mammals. To identify birds and fishes, a few primer sets were needed because of their high diversity. These primers may only be specific for a limited number of animals. The conventional system using non-fluorescent primers is a powerful screening method, especially for human and mammal identification. In the automatic system using fluorescent primers, species identification was carried out after a multiplex PCR reaction. Species identification was more accurate due to the resolving power of the apparatus. When necessary, the PCR products can be subjected to sequencing, which would not only help accurate identification of species but also provide information about individuals because of the high diversity in nucleotide sequences of mtDNA-HV within species [27–32]. The present systems were applied to 12 criminal cases. In nine cases, the species was easily identified by comparison with the reference DNA markers. In the three other cases where the PCR products were not identical in size to the references, species identification was successfully achieved by sequencing of the products and subsequent BLAST analysis of the obtained nucleotide sequence data. As shown in Table 2, low E values, which indicate the number of matches one can expect to see by chance when searching a database [33], and high degrees of sequence identity were obtained in all cases investigated.

Sequence variations in the products amplified for species identification provided us with information about individuals from animal hairs. Nucleotide sequences of mtDNA-HV should be interpreted with caution because of the risk of heteroplasmy and/or contamination [26, 34]. However, identification by the comparison of animal mtDNA-HV sequences is very important for forensic investigation since shed hairs from domestic animals, especially dogs and cats, are frequently recovered from crime scenes and can be used as evidence [11, 35]. By morphological observations alone, it is impossible to distinguish between individuals of a species [36]. Species identification and the subsequent haplotyping presented here could also be used in many other types of cases such as illegal hunting and trading [5–8, 16, 37]. Finally, the applicability of the present systems to forensic fields has been demonstrated by practical case studies. These two systems would also be more useful for mixtures of human and animal DNA than many methods based on mtDNA gene loci. We recommend the present species identification systems for forensic investigations.