Abstract
Background
Forensic scientists are often required to identify the species of origin of unknown biological samples. Although methods based on sequencing of DNA barcode regions are the gold standard for species identification in single-source forensic samples, they are cumbersome to implement as routine work in forensic laboratories that perform many tests, including human DNA typing. We have developed a species identification workflow that incorporates direct sequencing of real-time PCR products (real-time PCR–direct sequencing) as a technical trick for easy testing in forensic practice.
Method and results
Following our workflow, DNA samples from vertebrates, including mammals, amphibians, reptiles, birds, and fish, were subjected to species identification using vertebrate universal primers targeting each of four DNA barcode regions. In real-time PCR melting curve analysis, human and nonhuman animal samples could be differentiated by comparing melting temperatures, and subsequent real-time PCR–direct sequencing simplified the sequencing step. The results of searches against public DNA databases using the obtained sequences were consistent with the origin of the samples, indicating that this method might be used to identify animal species at the genus level. Furthermore, this workflow was effective in actual casework, providing rapid test results according to the needs of the investigating agencies.
Conclusions
The species identification workflow simplifies sequencing as much as possible and can be integrated into routine forensic practice. The real-time PCR–direct sequencing used in this workflow might be beneficial not only for species identification but also for Sanger sequencing in a variety of life science fields.
Introduction
Identification of the species from which biological samples are derived is often necessary in forensic science. Examples include poaching and illegal trade in endangered species, animal cruelty, and the detection of species misrepresentation and forgery in the food industry [1]. Species identification is also crucial for determining the presence or absence of criminal activity when bloodstains of unknown origin are found outdoors, when a piece of tissue is recovered from a suspected hit-and-run vehicle, or when incomplete bones that cannot be morphologically identified as human remains are discovered. Species identification methods using molecular biological techniques target the barcode regions of mitochondrial DNA, which has a high mutation rate and is employed for evolutionary phylogenetic analysis [2], and the forensic scientist in charge must decide which of the several methods to use. These methods can be divided into two major categories: methods based on PCR using species-specific primers [3, 4] and methods based on sequencing the barcode region using universal primers [5, 6]. The former is user-friendly and can identify samples in which two or more target species are mixed, but it can only detect the species targeted by the primers in the reaction system. The latter can identify a species from among the large number registered in DNA sequence databases, using search tools such as the Basic Local Alignment Search Tool (BLAST), but it is challenging to incorporate sequencing into routine work, and the approach is not suitable for complex mixed samples. If the species from which an unknown sample is derived is predictable, it can be confirmed by multiplex PCR containing specific primers for the predicted species, but the biological material encountered in forensic science is so diverse that it is challenging to predict the species. To identify species quickly and readily in a criminal investigation, we simplified sequencing with a technical trick and developed a practical workflow.
Materials and methods
DNA sample
This study used control genomic DNA from animals including cattle, chicken, dog, pig, rabbit, and rat that was purchased from BioChain Institute Incorporated. As human control DNA, DNA derived from one male and one female was used from Standard Reference Material 2372a Human DNA Quantitation Standard purchased from ATCC (American Type Culture Collection). DNA was also extracted from the blood or tissue of a chimpanzee, a Japanese macaque, a gorilla, an orangutan, a cat, a crocodile, a goose, a frog, a puffer fish, a tuna, a shrimp, and a squid. DNA extraction was carried out on an EZ1 Advanced XL (Qiagen, Hilden, Germany) using the EZ1 DNA Investigator kit (Qiagen), and the concentration of the extracted DNA was measured using a NanoDrop-1000 spectrophotometer (Thermo Fisher Scientific, Wilmington, DE, USA). Each DNA sample was adjusted to approximately 1 ng/µL.
Real-time PCR
The DNA barcode regions used in forensic and systematic studies are the 12S and 16S ribosomal RNA (rRNA) genes, cytochrome b (cyt b), and cytochrome c oxidase subunit 1 (COI), all of which are located within the mitochondrial genome. Four primer sets reported as vertebrate-universal primers, one for each locus, were selected for this workflow [5, 7, 8, 9] (Table 1).
Real-time PCR was carried out in 20 µL reactions containing 10 µL of 2× TB Green Premix Ex Taq™ (RR420) (TaKaRa Bio Inc., Shiga, Japan), 0.4 µL of 50× ROX Reference Dye II, 0.4 µM each of forward and reverse universal primers, and 2 µL of DNA (singleplex per mtDNA locus). The reaction was run on a QuantStudio 5 with QuantStudio™ Design & Analysis software v1.5.1 (Thermo Fisher Scientific Inc., Waltham, MA, USA) under the following conditions: 95 °C for 30 s; 40 cycles of 95 °C for 3 s, 57 °C (12S and 16S rRNA) or 50 °C (cyt b and COI) for 20 s, and 72 °C for 20 s; and melting curve analysis (95 °C for 15 s, 60 °C for 1 min, and a ramp to 95 °C at a rate of 0.15 °C/s).
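The reaction composition above fixes most volumes, but the primer and water volumes depend on the primer stock concentration, which the text does not state. As a minimal sketch, assuming a hypothetical 10 µM stock for each primer and that water makes up the remaining volume, the per-reaction volumes can be worked out with C1V1 = C2V2:

```python
def primer_volume_ul(final_um, reaction_ul, stock_um):
    """Volume of primer stock that gives final_um in reaction_ul (C1*V1 = C2*V2)."""
    return final_um * reaction_ul / stock_um


REACTION_UL = 20.0      # total reaction volume (from the text)
PREMIX_UL = 10.0        # 2x TB Green Premix Ex Taq (from the text)
ROX_UL = 0.4            # 50x ROX Reference Dye II (from the text)
DNA_UL = 2.0            # template DNA (from the text)
PRIMER_FINAL_UM = 0.4   # final concentration of each primer (from the text)
PRIMER_STOCK_UM = 10.0  # ASSUMPTION: stock concentration is not given in the text

each_primer_ul = primer_volume_ul(PRIMER_FINAL_UM, REACTION_UL, PRIMER_STOCK_UM)
# ASSUMPTION: the remainder of the reaction is made up with water.
water_ul = REACTION_UL - PREMIX_UL - ROX_UL - DNA_UL - 2 * each_primer_ul

print(f"each primer: {each_primer_ul:.2f} µL, water: {water_ul:.2f} µL")
```

Under that assumed stock concentration, each primer contributes 0.8 µL and water makes up the remaining 6.0 µL; a different stock concentration changes only these two values.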
Real-time PCR with the control DNA was performed three times on separate occasions, and the mean and standard deviation of the melting temperatures were calculated (technical triplicate). Because the technical triplicate of the control DNA confirmed the reproducibility of the real-time PCR, the real-time PCR for the additional samples and the case samples was performed once (no repeats).
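The triplicate summary described here is a plain mean and standard deviation per locus. A minimal sketch follows; the Tm readings are made-up placeholders rather than data from the study, and the use of the sample (n − 1) standard deviation is an assumption, since the text does not say which form was used.

```python
from statistics import mean, stdev


def summarize_tm(tm_values):
    """Return (mean, sample standard deviation) of replicate melting temperatures."""
    if len(tm_values) < 2:
        raise ValueError("at least two replicates are needed for a standard deviation")
    return mean(tm_values), stdev(tm_values)


# Hypothetical Tm readings (°C) for one locus from three separate runs.
runs = [80.1, 80.3, 80.2]
tm_mean, tm_sd = summarize_tm(runs)
print(f"Tm = {tm_mean:.2f} ± {tm_sd:.2f} °C")
```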
Direct sequencing
Without a purification step, the real-time PCR products were diluted 20-fold in TE buffer (Thermo Fisher Scientific, Inc.). Direct sequencing was conducted using 2 µL of diluted real-time PCR product, 4 µL of BigDye Terminator v1.1 Ready Reaction mix (Thermo Fisher Scientific, Inc.), 2 µL of 5× Sequencing Buffer, and either the forward or the reverse primer (3.2 µM) in a total volume of 20 µL. The reaction was then carried out on a ProFlex PCR System (Thermo Fisher Scientific, Inc.) under the following conditions: 96 °C for 1 min and 25 cycles of 96 °C for 10 s, 50 °C for 5 s, and 60 °C for 2 min. The reaction product was purified using a BigDye XTerminator Purification Kit (Thermo Fisher Scientific, Inc.) according to the manufacturer’s instructions and run on a 3500xL Genetic Analyzer (Thermo Fisher Scientific, Inc.) using 36-cm capillary arrays (Thermo Fisher Scientific, Inc.) with POP-4 Polymer (Thermo Fisher Scientific, Inc.). Because electrophoresis can occasionally result in low resolution, the run was performed twice.
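The dilution arithmetic in this step can be checked directly. The short sketch below uses only the volumes given above (the function name is ours) and computes what fraction of the sequencing reaction is undiluted real-time PCR product; the result agrees with the "1 part in 200" figure quoted in the Results and discussion.

```python
from fractions import Fraction


def product_fraction(dilution_fold, volume_added_ul, reaction_volume_ul):
    """Fraction of the final reaction volume that is undiluted PCR product."""
    return Fraction(volume_added_ul, reaction_volume_ul) / dilution_fold


# 20-fold dilution, 2 µL of diluted product, 20 µL sequencing reaction.
frac = product_fraction(20, 2, 20)
equivalent_ul = frac * 20  # undiluted product per reaction, in µL

print(frac)                  # 1/200
print(float(equivalent_ul))  # 0.1
```

In other words, each 20 µL sequencing reaction carries the equivalent of 0.1 µL of the original real-time PCR product.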
Data analysis
Sequence data that passed quality checks in the 3500 Series Data Collection Software 4 (Thermo Fisher Scientific Inc.) were used for sequence analysis. The sequence of the target region was completed by checking and merging the sequence data obtained from both strands using the MEGA X software [10] (primer sequences were deleted). With the exception of the COI sequence, the completed sequences were searched for homology against sequence data in public databases such as GenBank using BLAST [11]. The COI sequence was matched against registered, publicly available COI sequences using the Barcode of Life Data (BOLD) system (http://www.boldsystems.org/) [12]. The genus of the database sequences showing 99–100% identity to the input sequence was defined as the identification result.
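The identification rule above (take the genus of database hits at 99–100% identity) can be sketched as a small filter. This is an illustration only: the hit list is invented, the tuple layout is an assumption rather than the output format of BLAST or BOLD, and the source does not say how hits from several genera above the threshold were handled, so the sketch simply returns all of them.

```python
def genera_at_threshold(hits, min_identity=99.0):
    """Return sorted genus names of hits whose percent identity is >= min_identity.

    hits: iterable of (scientific_name, percent_identity) pairs, where the
    genus is taken as the first word of the scientific name.
    """
    genera = set()
    for name, identity in hits:
        if identity >= min_identity:
            genera.add(name.split()[0])
    return sorted(genera)


# Hypothetical search results, not taken from the study.
hits = [
    ("Sus scrofa", 100.0),
    ("Sus scrofa", 99.6),
    ("Sus cebifrons", 98.1),
    ("Bos taurus", 85.2),
]
print(genera_at_threshold(hits))  # ['Sus']
```

A single genus in the returned list corresponds to a genus-level identification; an empty list means no database sequence met the identity criterion.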
Case study
Casework Sample 1 (CS1): Bloodstains (drip pattern) found in a large area spanning the parking lot to the roadway.
Casework Sample 2 (CS2): Bloodstains (swipe pattern) left on the floor of the front door of the house.
Casework Sample 3 (CS3): Tissue fragments found on the vehicle suspected of hitting a person.
Casework Sample 4 (CS4): Bone fragments found on a mountain trail.
Although the casework samples described here proved to be nonhuman and unrelated to the cases, species identification was done for confirmation. DNA was extracted from these casework samples in a manner appropriate for each forensic sample type, and human DNA was quantified with the Quantifiler HP DNA Quantification Kit (Thermo Fisher Scientific, Inc.) according to the manufacturer’s instructions but was not detected. Since the extracted DNA was expected to also contain bacterial DNA of environmental origin, spectrophotometric measurements were not performed, and 2 µL of DNA solution of unknown concentration was used for real-time PCR. The products that reached a plateau were then diluted 20-fold in TE buffer, and direct sequencing and sequence analysis were carried out as above.
Results and discussion
We have developed a practical species identification workflow for unknown biological samples (Fig. 1). If DNA taken from an unknown sample is most likely of human origin, a commercially available kit is used to quantify human DNA. If human DNA cannot be detected or the sample is assumed to be of animal origin, real-time PCR is carried out using vertebrate universal primers. Thereafter, it is confirmed that the amplification has reached a plateau, and the melting temperature is checked (Check 1 and 2 in Fig. 1). The real-time PCR product is diluted and directly sequenced (real-time PCR–direct sequencing), and the resulting sequence is used to identify the animal species by homology search against public DNA databases (DDB).
It is often important in forensic practice to determine whether an unknown sample is of human origin. Examples include bones discovered in the mountains, and blood and tissues recovered from a knife suspected to have been used in a murder or from a car suspected to have run over a person. Once such samples are proven to be of nonhuman origin, they are excluded from evidence. However, even if a commercially available qPCR-based human DNA quantification kit does not detect human DNA, it cannot be concluded that these suspected human samples are not of human origin. This is because forensic samples are subjected to a variety of conditions: DNA extraction failure and high levels of degradation cannot be ruled out. Thus, if human DNA is not detected with the quantification kit, it is unclear whether the sample is not of human origin or whether there is no detectable DNA in the sample. Amplifications reaching a plateau were obtained at two or more loci in all of the animal control DNAs used in this study, and their melting temperatures differed from those of humans (Tm in Table 2). Even higher primates that are closely related to humans, such as a chimpanzee and an orangutan, had melting temperatures that differed from those of humans, particularly in cyt b (Tm in Table 3). Thus, for suspected human samples, testing could be terminated at this step if amplification is confirmed in the real-time PCR with vertebrate universal primers and the melting temperatures differ from those of humans, because the sample has then been confirmed as not being of human origin (No arrow under “Investigative Necessity” in Fig. 1).
On the other hand, one can proceed to direct sequencing if identification of the animal species provides information relevant to the case (Yes arrow under “Investigative Necessity” in Fig. 1). Examples include meat counterfeiting, violations of animal protection laws, and cases in which hairs found at a crime scene may be derived from animals related to (kept by) the suspect. If melting temperatures have previously been measured for each target locus of the animal species assumed from the case details and investigative information, they could be used as a preliminary confirmation aid prior to sequencing. It should be noted that melting temperatures depend on the reagents used (reaction composition and salt concentration); therefore, the same reagents and reaction composition must be used for comparison (i.e., TB Green Premix Ex Taq must be used when referring to the Tm values described in this paper).
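The decision points described in the preceding paragraphs (plateau check, Tm comparison with the human reference, investigative necessity) can be laid out as a small function for a single locus. This is a sketch under stated assumptions, not the authors' procedure: the `tolerance` parameter is a hypothetical cut-off for treating two Tm values as the same, since the text gives no numeric threshold, and the example temperatures are invented.

```python
def next_step(plateau_reached, sample_tm, human_tm, investigative_need, tolerance=1.0):
    """Suggest the next workflow step for one locus.

    tolerance (°C) is a HYPOTHETICAL cut-off for calling two Tm values
    'the same'; the source does not define a numeric threshold.
    """
    if not plateau_reached:
        return "no plateau: result inconclusive"
    if abs(sample_tm - human_tm) <= tolerance:
        return "Tm matches human: human origin not excluded"
    if investigative_need:
        return "nonhuman: proceed to direct sequencing"
    return "nonhuman: report and stop"


# Illustrative values only (not measurements from the study).
print(next_step(True, 78.4, 81.9, investigative_need=False))
print(next_step(True, 78.4, 81.9, investigative_need=True))
```

Any such comparison is only meaningful when the sample and the human reference were run with the same reagents and reaction composition, as noted above.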
One of the characteristics of real-time PCR is the plateau phenomenon, which may be caused by primer depletion and in which the amplified product does not significantly exceed a certain amount [13]. Therefore, a real-time PCR product that has reached a plateau in the amplification curve can be fed into the sequencing reaction in an appropriate amount without concentration measurement. In preliminary experiments, we determined the amount of real-time PCR product to be used in the sequencing reaction (1 part in 200 of the sequencing reaction solution in this study). The amount may be crucial because it must contain enough template DNA for the sequencing reaction while keeping unreacted primers and dNTPs at levels low enough not to affect the sequencing reaction. As expected, good sequencing results were obtained, except in cases where the amplicon was not a single sequence. A possible reason for a non-single sequence is nonspecific amplification of nuclear mitochondrial DNA segments (NUMTs), which are mtDNA sequences that have migrated into the nuclear DNA and may co-amplify with true mtDNA [14]. Heteroplasmy, in which many haplotypes are mixed together, also makes sequencing difficult, especially in the case of length heteroplasmy. Therefore, it is important to target the barcode regions of multiple loci for sequencing. Naturally, it is challenging to sequence samples that contain two or more species. Although outside the scope of this study, which aims to simplify species identification, techniques such as metabarcoding with next-generation sequencing are recommended for complex samples containing multiple unknown species (e.g., traditional Chinese medicine products) [1].
Sequences from two or more loci were obtained from the vertebrate-derived DNA samples used in this study, including those from mammals, amphibians, reptiles, birds, and fish, and the search results against the public databases were consistent with the origin of each sample, which allowed identification to the genus level (Tables 2 and 3). Furthermore, although a sequence was obtained for only one locus (COI), the genus could be inferred even for the squid (a mollusk) and the shrimp (an arthropod) (Table 3). Meanwhile, the COI amplicon in some animals could not be sequenced with the forward primer, most likely as a result of slippage at the poly-C sequence adjacent to the primer junction. Nevertheless, species identification by the BOLD system, which relied only on the sequence obtained with the reverse primer, was consistent with the origin of the samples (superscript b in Tables 2 and 3).
We tested four actual casework samples following our workflow. All of these samples exhibited melting temperatures that differed from those of humans (Tm in Table 3), allowing us to complete our testing quickly and inform the investigating agency that they were not of human origin. Determining the presence or absence of criminality at an early stage is valuable because it reduces labor and cost. Although it was clear that there was no criminality, we performed real-time PCR–direct sequencing out of our own interest, and homology searches allowed us to estimate the animal species at the genus level (Table 3). The amount of DNA that could be recovered from these forensic samples would vary significantly with sample type (bloodstains, tissue, bone), age, and environmental exposure conditions. Despite this, we were able to identify the animal species easily by real-time PCR–direct sequencing without measuring DNA concentrations.
Although vertebrate universal primers were used in this study, plant universal primers and universal primers for each class, including reptiles, birds, and fish, have also been reported [1]. The identification of plant species and the species-level identification of animals may both be possible through the use of such universal primers. It should be noted that in species-level identification, hybrid animals are incorrectly designated as the maternal species, because mitochondrial DNA is maternally inherited [15]. In light of this, it is recommended that the report to the investigating agency state that the test is for mitochondrial DNA.
DNA extraction, DNA quantification, PCR amplification, confirmation of amplified products, purification of PCR products, and direct sequencing are all steps in sequencing-based species identification methods in forensic science [5, 6, 16]. As long as the amplification is confirmed to have reached a plateau by real-time PCR, the real-time PCR–direct sequencing incorporated in this workflow requires neither purification of the amplified product before the sequencing reaction nor confirmation of the product by agarose gel electrophoresis; thus, sequencing results can be obtained quickly and easily. In supplementary experiments, real-time PCR products from another intercalator reagent (PowerUp SYBR Green Master Mix, Thermo Fisher Scientific, Inc.) and from TaqMan reagents (TaqPath qPCR Master Mix, CG, Thermo Fisher Scientific, Inc.) were also successfully directly sequenced (data not shown). Thus, this technical trick might be applied to Sanger-based direct sequencing in a variety of life science fields, and workflows for species identification involving this trick could be integrated into routine forensic practice.
Conclusion
We have developed a workflow for species identification that incorporates real-time PCR–direct sequencing as a technical trick to enable species identification to be routinely performed in the forensic community. Four actual samples of different types were tested based on this workflow, which immediately indicated their nonhuman origin and subsequently easily identified the animal species of origin to genus level. Therefore, in forensic practice, this workflow is effective for routine species identification.
Data Availability
The data of this study are available from the corresponding author on reasonable request.
References
Staats M, Arulandhu AJ, Gravendeel B, Holst-Jensen A, Scholtens I, Peelen T, Prins TW, Kok E (2016) Advances in DNA metabarcoding for food and wildlife forensic species identification. Anal Bioanal Chem 408:4615–4630. https://doi.org/10.1007/s00216-016-9595-8
Mori C, Matsumura S (2021) Current issues for mammalian species identification in forensic science: a review. Int J Legal Med 135:3–12. https://doi.org/10.1007/s00414-020-02341-w
Ishida N, Sakurada M, Kusunoki H, Ueno Y (2018) Development of a simultaneous identification method for 13 animal species using two multiplex real-time PCR assays and melting curve analysis. Leg Med (Tokyo) 30:64–71. https://doi.org/10.1016/j.legalmed.2017.11.007
Mori C, Matsumura S (2022) Development and validation of simultaneous identification of 26 mammalian and poultry species by a multiplex assay. Int J Legal Med 136:1–12. https://doi.org/10.1007/s00414-021-02711-y
Kitano T, Umetsu K, Tian W, Osawa M (2007) Two universal primer sets for species identification among vertebrates. Int J Legal Med 121:423–427. https://doi.org/10.1007/s00414-006-0113-y
Mitani T, Akane A, Tokiyasu T, Yoshimura S, Okii Y, Yoshida M (2009) Identification of animal species using the partial sequences in the mitochondrial 16S rRNA gene. Leg Med (Tokyo) 11:S449–S450. https://doi.org/10.1016/j.legalmed.2009.02.002
Lopez-Oceja A, Gamarra D, Borragan S, Jiménez-Moreno S, de Pancorbo MM (2016) New cyt b gene universal primer set for forensic analysis. Forensic Sci Int Genet 23:159–165. https://doi.org/10.1016/j.fsigen.2016.05.001
Leray M, Yang JY, Meyer CP, Mills SC, Agudelo N, Ranwez V, Boehm JT, Machida RJ (2013) A new versatile primer set targeting a short fragment of the mitochondrial COI region for metabarcoding metazoan diversity: application for characterizing coral reef fish gut contents. Front Zool 10:34. https://doi.org/10.1186/1742-9994-10-34
Geller J, Meyer C, Parker M, Hawk H (2013) Redesign of PCR primers for mitochondrial cytochrome c oxidase subunit I for marine invertebrates and application in all-taxa biotic surveys. Mol Ecol Resour 13:851–861. https://doi.org/10.1111/1755-0998.12138
Kumar S, Stecher G, Li M, Knyaz C, Tamura K (2018) MEGA X: Molecular Evolutionary Genetics Analysis across Computing platforms. Mol Biol Evol 35:1547–1549. https://doi.org/10.1093/molbev/msy096
Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ (1990) Basic local alignment search tool. J Mol Biol 215:403–410. https://doi.org/10.1016/S0022-2836(05)80360-2
Ratnasingham S, Hebert PD (2007) BOLD: the Barcode of Life Data System (http://www.barcodinglife.org). Mol Ecol Notes 7:355–364. https://doi.org/10.1111/j.1471-8286.2007.01678.x
Jansson L, Hedman J (2019) Challenging the proposed causes of the PCR plateau phase. Biomol Detect Quantif 17:100082. https://doi.org/10.1016/j.bdq.2019.100082
Marshall C, Parson W (2021) Interpreting NUMTs in forensic genetics: seeing the forest for the trees. Forensic Sci Int Genet 53:102497. https://doi.org/10.1016/j.fsigen.2021.102497
Amorim A, Pereira F, Alves C, García O (2020) Species assignment in forensics and the challenge of hybrids. Forensic Sci Int Genet 48:102333. https://doi.org/10.1016/j.fsigen.2020.102333
Linacre A (2021) Animal forensic genetics. Genes (Basel) 12:515. https://doi.org/10.3390/genes12040515
Acknowledgements
This work was supported by the Cooperative Research Program of Primate Research Institute, Kyoto University.
Funding
No funding was received to assist with the preparation of this manuscript.
Author information
Contributions
MD: Conceptualization, Methodology, Investigation, Visualization, Writing–Original draft preparation, Reviewing and Editing; TN: Investigation, Resources, Writing–Reviewing and Editing; MA: Project administration, Resources, Writing–Reviewing and Editing. All authors read and approved the final manuscript.
Ethics declarations
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
Conflict of interest
There are no conflicts of interest to declare.
Ethics approval and consent to participate
The committee on animal experiments in Ehime University has confirmed that no ethical approval is required.
About this article
Cite this article
Doi, M., Nakagawa, T. & Asano, M. A practical workflow for forensic species identification using direct sequencing of real-time PCR products. Mol Biol Rep 51, 17 (2024). https://doi.org/10.1007/s11033-023-08980-7
DOI: https://doi.org/10.1007/s11033-023-08980-7