Introduction

Forensic mitochondrial (mt) DNA cases often involve compromised, low template evidence such as aged hair shafts and skeletal remains. Therefore, DNA damage is routinely encountered when performing sequence analysis of these sample types. The need to further identify and characterize DNA damage, especially with the onset of newer technologies like massively parallel sequencing (MPS), is clear, as the ability to resolve sites of damage has increased.

Patterns of DNA damage across genomic and mtDNA sequence have been well studied in the ancient DNA community [1,2,3,4,5]. The majority of damage studies conducted in the forensic community have focused on short tandem repeat (STR) loci [for example, 6, 7], with historical studies examining mtDNA damage associated with conventional Sanger-type sequencing (STS) data [8]. While the impact of mtDNA damage on MPS results has been reported in a limited number of recent studies [9,10,11], the value of the current study is the introduction of damage in a controlled environment using hydrolysis (DNA stored in water at room temperature) to create a simulation of damage occurring naturally in forensic evidentiary samples at low template amounts and when using different amplification and library preparation approaches. In addition, the ability to compare the results to known mtDNA MPS profiles addresses the limitations of ancient and forensic DNA damage studies, which are restricted to observing the aftermath of damage with no clear knowledge of the true mtDNA profile [10, 12].

Different forms of DNA damage have been characterized on a molecular level, such as oxidative lesions [13], damage induced by ultraviolet radiation [14], depurination [15, 16], and deamination [17]. The current study focused on DNA damage caused by depurination and deamination, as both types of damage are a direct result of hydrolysis; oxidative damage can also occur through passive hydrolysis, but at a lower rate. Depurination typically involves the loss of purine bases (adenine and guanine) due to hydrolytic cleavage of the N-glycosyl bond, resulting in an abasic site [5, 18]. During the extension step of PCR, any base can be incorporated opposite the abasic site, although adenine is preferentially incorporated [19]. Either way, the result may be a miscoding event, including a transversion.

Deamination, the removal of an amine group from a nitrogenous base, most commonly results in the hydrolytic conversion of cytosine to uracil, an analog of thymine [2, 20]. Deamination of adenine to hypoxanthine, an analog of guanine, has also been documented as another form of DNA damage [21]. These deamination conversions result in two complementary groups of transitions, termed “type 1” (A-G/T-C) and “type 2” (C-T/G-A) [21]. Because both depurination and deamination can generate miscoding lesions, manifested as either base modifications or mixed sites, they can in turn lead to erroneous interpretation of mtDNA sequence data. One potential way of addressing these misincorporation events is through the use of repair treatments on deaminated residues [20, 22]. In fact, recent repair studies have begun to illustrate that damage lesions can often be repaired prior to MPS analysis [10, 23].

The reporting of heteroplasmy will enhance the overall value of mtDNA testing in forensic casework by increasing the discrimination potential and weight of an mtDNA match [24] and allowing for the differentiation of maternal relatives [25]. Forensic DNA laboratories using STS technology are limited in their ability to report heteroplasmy, especially at low levels, as the detection limit for minor variant single nucleotide polymorphisms (SNPs) is 10–20%, depending on the analytical noise or error in the sequencing instrument [26]. MPS approaches allow for increased resolution, providing the forensic community the opportunity to identify, report, and capitalize on heteroplasmic variants [27]. It has been shown that MPS technology can detect true heteroplasmic variants down to a level of at least 2% [27, 28], supported through an analysis of analytical noise or error [29]. Therefore, the ability to detect and report low-level heteroplasmy, along with the diminished quality of evidence samples routinely encountered in forensic casework, means that DNA damage will impact the interpretation process and will require appropriate modifications to mtDNA MPS reporting guidelines. The objectives of the current study were to assess the impact of DNA damage on mtDNA MPS data associated with a range of template DNA concentrations, and when using different amplification and library preparation strategies for MPS analysis.

Materials and methods

Biological sample collection and preparation

Buccal samples from ten healthy donors were collected in accordance with the Penn State University institutional review board (IRB) approved protocol STUDY00000970. Samples were organically extracted: lysis with 400 μL of UV-irradiated stain extraction buffer (molecular biology grade (MBG) 2% SDS (Amresco; Solon, OH), 10 mM EDTA (Promega; Madison, WI), 100 mM NaCl (Dot Scientific; Burton, MI), 7.6 mM Tris-HCl, pH 8.0 (Quality Biological; Gaithersburg, MD)) and 10 μL of 20 mg/mL MBG proteinase K (ThermoFisher Scientific; Carlsbad, CA) for approximately 2 h at 56 °C was followed by MBG phenol-chloroform-isoamyl alcohol (PCIA) (Sigma-Aldrich; St. Louis, MO) purification, ethanol precipitation, and resuspension in 50 μL of MBG water (G-Biosciences; St. Louis, MO). Low binding tubes (Eppendorf; Hamburg, Germany) were used to store the DNA extracts and subsequent dilutions. Quantification was performed using a custom duplex mtDNA quantitative PCR (mtqPCR) assay (see quantification section) on the Applied Biosystems 7500 instrument (according to the manufacturer’s recommended protocol), and each extract was diluted with the same lot of MBG water to a concentration of approximately 100,000 mtDNA copies/μL (equivalent to 0.66–1.32 ng/μL of nuclear (n) DNA when assuming 500–1000 copies of mtDNA and 6.6 pg of nDNA per cell). Extracts were stored at 4 °C for no longer than 72 h prior to being diluted and subjected to conditions designed to induce DNA damage. An extraction reagent blank control was quantified with each set of five DNA extracts. The first reagent blank quantified below the mtqPCR standard curve (3 mtcopies) and the second reagent blank quantified between the lowest two standards (28 mtcopies). Therefore, the reagent blanks were not carried through PCR and MPS.
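The nDNA equivalence quoted above can be checked with a short calculation. The following is a minimal sketch, assuming only the figures stated in the text (500–1000 mtDNA copies and 6.6 pg of nDNA per cell); the function name is illustrative and not part of any assay software.

```python
# Convert an mtDNA copy concentration into an approximate nDNA concentration.
# Assumptions taken from the text: 6.6 pg of nDNA per cell and
# 500-1000 mtDNA copies per cell. The function name is hypothetical.
PG_NDNA_PER_CELL = 6.6


def ndna_ng_per_ul(mt_copies_per_ul: float, mt_copies_per_cell: float) -> float:
    """Return the nDNA concentration (ng/uL) equivalent to an mtDNA copy concentration."""
    cell_equivalents_per_ul = mt_copies_per_ul / mt_copies_per_cell
    return cell_equivalents_per_ul * PG_NDNA_PER_CELL / 1000.0  # pg -> ng


low = ndna_ng_per_ul(100_000, 1000)  # cells carrying 1000 mtDNA copies
high = ndna_ng_per_ul(100_000, 500)  # cells carrying 500 mtDNA copies
print(f"{low:.2f}-{high:.2f} ng/uL")
```

The same conversion, run in reverse, reproduces the kit recommendation cited later for the 10-plex (0.5 ng of nDNA corresponds to roughly 38,000–76,000 mtDNA copies).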

Preparation of damaged DNA samples and experimental design

From each dilution of 100,000 mtcopies/μL, two further dilutions were made: 25,000 mtcopies/μL and 2500 mtcopies/μL. The dilutions were made with the same lot of water that was used to resuspend the original DNA extracts. One undiluted aliquot was taken from each 100,000 mtcopies/μL sample and stored at −20 °C to be used as an undamaged control for each of the five donors in each of the two experiments. The three dilutions for each donor were incubated at room temperature for 3 weeks. Preliminary experiments were attempted using a 37 °C incubation for a shorter time period instead of room temperature; however, this led to complete degradation rather than observable damage lesions. Two damage negative controls containing just the MBG water were also carried through the experiment. After the 3-week incubation period, the damaged 100,000 mtcopies/μL sample for each donor was diluted to 25,000 mtcopies/μL and 2500 mtcopies/μL. This allowed DNA damage to be observed in low template samples that were diluted either before or after the damage occurred.

Two identical experiments involving five different donor samples were conducted, for a total of ten donor samples tested. Each experiment included a comparison of two different PCR amplification/library preparation approaches. Six samples per donor were tested for each approach (Fig. 1), with each of the six samples run in duplicate, and the PCR amplifications replicated, for a total of 120 samples per experiment, and an overall total of 240 samples in this study.

Fig. 1

Experimental design schematic showing how the dilutions of donor mtDNA extracts were performed in order to observe DNA damage in low template conditions. 25 K and 2.5 K copies/μL dilutions were created from 100 K copies/μL both before and after the damage was induced by storing the samples in water at room temperature for 3 weeks. An undamaged control of 100 K copies/μL was also created for each donor by taking an aliquot from the undamaged 100 K copies/μL sample and storing at −20 °C

mtqPCR assay

After the 3-week incubation at room temperature and subsequent dilutions, an mtqPCR assay was performed on the 240 samples. The custom duplex mtqPCR assay [30] was used to assess mtDNA yield for the different dilutions before and after inducing DNA damage. A small target of 69 base pairs (bps), utilizing the previously published mtND1 primers and TaqMan probe from [31], was combined with a large target of 283 bps, utilizing the previously published mt8154 primers and probe from [32], with the dye on the mt8154 probe changed from VIC to FAM. This strategy allowed for quantification of DNA extracts (small target) and relative analysis of DNA degradation (large target). The DNA standard used for the mtqPCR assay was the Quantifiler® Human Standard DNA (Applied Biosystems; Foster City, CA). Aliquots of the standard were sent to the National Institute of Standards and Technology (NIST) to certify the number of mtDNA copies per μL. Eight concentrations of the standard (200,000, 20,000, 2000, 200, 150, 100, 50, and 10 mtDNA copies/μL) were used to create the standard curve. Copies of mtDNA per unit volume were determined within 24 h of the completion of the 3-week incubation.

The two damage negative controls along with an mtqPCR no template control (NTC) were run with each set of 30 samples. All negative controls quantified below the mtqPCR standard curve. MIQE guidelines [33] related to the mtqPCR assay have been addressed according to forensic considerations [34], including in previous studies [11, 30].

Amplification and library preparation using the PowerSeq CRM kit (10-plex)

Following incubation at room temperature and subsequent quantification, each of the sample dilutions was amplified using the PowerSeq™ CRM Nested System kit from Promega (Madison, WI), referred to as “10-plex” moving forward. Duplicate amplifications were performed for all samples. Template input amounts ranged from 1000 to 100,000 copies of mtDNA. Kit recommendations call for ~38,000–76,000 copies (0.5 ng of nDNA). Experience in previous studies [11, 30] supports the production of quality MPS results from as little as 1000–5000 copies of mtDNA. The kit utilizes a single multiplexed PCR reaction with ten overlapping targets spanning the mtDNA control region (CR); CRM stands for control region mitochondrial. The primers for amplification include priming adapters for incorporation of indices and P5/P7 sequences during the original PCR, streamlining library preparation. The amplification and subsequent purification, quantification, and normalization steps followed the manufacturer’s instructions. If the concentration of amplicon for a sample was less than the desired 4 nM, then the sample was not diluted prior to being added to the pooled library. The pooled library was spiked with 15% PhiX Control v3 (Illumina), instead of the recommended 10%, due to a lack of diversity in the target amplicons. The final loading concentration of the pooled library was 10 pM, instead of the recommended 12.5–15 pM, due to over-clustering observed during initial run assessments. The prepared library was run on an Illumina® MiSeq with a 600-cycle v3 kit (Illumina; San Diego, CA), providing 300 × 300 paired end reads. Negative controls (the two damage negatives and one amplification blank) were run with each set of 60 samples. A PCR positive control (2800M) was also included in each run, resulting in a total of 64 samples per MiSeq run. The 100,000 mtcopies/μL samples stored at −20 °C served as an undamaged positive control for each donor, as it was assumed that DNA damage would be negligible or non-existent given previous findings [9].

Amplification and library preparation using the Nextera XT kit with entire CR amplicon (1 kb)

The second set of 120 samples was amplified using a PCR approach that targets a 1.16 kb amplicon encompassing the entire mtDNA CR [29]. This amplification scenario will be referred to as “1 kb” from this point forward. Template input amounts ranged from 700 to 75,000 copies of mtDNA. The following oligonucleotide primers (Integrated DNA Technologies Inc.; Coralville, IA) were used: 5′-CACCATTAGCACCCAAAGCT-3′ (forward primer F15997) and 5′-TCAGTGTATTGCTTTGAGGAGGT-3′ (reverse primer R590). Both primers were modified to include the Illumina overhang adapter sequence at the 5′ end of the locus specific primer [35]. The transposase adapted primers allow for the binding of indices and P5/P7 sequencing adapters during library preparation for sequencing on the Illumina MiSeq. Each amplification reaction had a total volume of 25 μL containing 2.5 μL of 10× TaKaRa Ex PCR buffer with 20 mM Mg²⁺ (Clontech, Mountain View, CA), 0.4 mM of each deoxynucleoside triphosphate (TaKaRa dNTP mixture; Clontech), 1.25 units of TaKaRa Ex Taq polymerase (Clontech), 0.4 μM of each primer (IDT), and up to 10 μL of template DNA. For negative controls, 10 μL of MBG water was added in place of template DNA. Duplicate amplifications were carried out for all samples. PCR was performed in a GeneAmp® PCR System 9700 (Applied Biosystems, Foster City, CA) thermal cycler using the following conditions: a 94 °C soak for 1 min; 35 cycles of 98 °C for 10 s, 56 °C for 30 s, and 68 °C for 1 min; 72 °C for 5 min; and a 4 °C hold. Prior to library preparation, the PCR products were quantified using the non-specific Qubit® dsDNA BR assay kit and the Qubit® 2.0 Fluorometer (Invitrogen, Carlsbad, CA) according to the manufacturer’s recommendations. Library preparation was performed according to the Nextera® XT protocol (Illumina, San Diego, CA). The final loading concentration of the pooled library was 10 pM and included a 5% spike of PhiX Control v3 (Illumina). Samples with less than the desired 4 nM were not diluted prior to being added to the pooled library. Sequencing runs were performed on an Illumina® MiSeq with a 300-cycle v2 reagent kit (Illumina), providing 150 × 150 paired end reads.

MPS data analysis

Secondary analysis of the fastq files generated by the MiSeq Reporter software (Illumina) was performed using GeneMarker® HTS (GM HTS; SoftGenetics, State College, PA), version 1.2.2 [36]. Sequences were aligned to the revised Cambridge reference sequence (rCRS) [37, 38] with the aid of a custom motif file to ensure that phylogenetically correct calls were made for each sample. SNPs present in 50% or more of the reads were considered major variants and collectively referred to as the sample haplotype. Minor variants, present in less than 50% of the reads, represented sites of potential heteroplasmy or DNA damage. The filter settings were as follows: nucleotide positions (nps) 16024-16569 and 1-576 of the mtDNA CR as the range of sequence analyzed, 200 reads of minimum coverage at each np to be reported, 40 reads of minimum coverage for a minor variant to be reported, 1% threshold for including variant positions in the minor reporting table, ≤ 10 for the allele score difference, ≤ 2.5 for the SNP balance ratio, and ≤ 5.0 for the indel (insertion/deletion) balance ratio. A 2% reporting threshold was used for heteroplasmic positions, with the 1% filter serving as an analytical threshold. Given a 2% reporting threshold and the minor variant read coverage set at 40, each position required a minimum coverage of 2000 reads to report ≥ 2% heteroplasmy. Likewise, when calculating the damage rates, each damage site required a minimum coverage of 4000 reads to report ≥ 1% damage. Variants observed at a higher percentage could be reported with lower total read depth. Only SNP-related heteroplasmy or damage was considered for this study.
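The relationship between the minor variant read requirement and the total coverage needed at each threshold can be sketched as follows. This is an illustration of the arithmetic only, not a reimplementation of GM HTS: the function names are hypothetical, and the allele score and balance ratio filters are not modeled.

```python
import math

MIN_TOTAL_READS = 200    # minimum coverage for a position to be reported (from the text)
MIN_VARIANT_READS = 40   # minimum reads supporting a minor variant (from the text)


def min_total_coverage(threshold_percent: float,
                       min_variant_reads: int = MIN_VARIANT_READS) -> int:
    """Smallest depth at which min_variant_reads reads sit exactly at the threshold."""
    return math.ceil(100 * min_variant_reads / threshold_percent)


def is_reportable(variant_reads: int, total_reads: int, threshold_percent: float) -> bool:
    """Apply the coverage and frequency filters only (other GM HTS filters are ignored)."""
    if total_reads < MIN_TOTAL_READS or variant_reads < MIN_VARIANT_READS:
        return False
    return 100 * variant_reads >= threshold_percent * total_reads


print(min_total_coverage(2))      # depth needed to report a variant at exactly 2%
print(min_total_coverage(1))      # depth needed to report a variant at exactly 1%
print(is_reportable(40, 800, 2))  # a 5% variant passes at a lower total depth
```

The last call illustrates the point made above that variants present at a higher percentage can be reported at lower total read depth.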

A conservative estimation of the total substitution error for all nucleotides was used to establish the level of background noise in the sequencing data, and calculated as previously described [29]. The total average error rate was determined to be 0.205 ± 0.002 and 0.297 ± 0.069 per 100 nucleotides for the 1 kb and 10-plex MiSeq runs, respectively. Lower error rates were observed for high quality DNA samples (without damage): 0.182 ± 0.065 and 0.231 ± 0.066 per 100 nucleotides for 1 kb and 10-plex MiSeq runs, respectively. Higher error rates were observed for damaged, low quality samples when using the 10-plex in previous studies: 0.325 ± 0.101 for hair and 0.485 ± 0.005 for metal cartridge casings [29]. Therefore, the overall rates for the current study were consistent with expectations. In turn, the analytical threshold of 1% was established by taking the average total error rate, adding ten standard deviations, and rounding up to 1%. Supplemental Table 1 lists the SNPs and indels in the mtDNA CR for each of the ten donor haplotypes. Heteroplasmy observed at the reporting threshold of ≥ 2% for minor variants was included in the donor profile, along with the corresponding minor variant frequency (MVF). Donor concordance was verified through replicate analysis.
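As a check on the analytical threshold, the calculation described above (average error rate plus ten standard deviations, rounded up) can be sketched with the per-run values reported in this section. Rounding up to the next whole percent is an assumption about how "rounding up to 1%" was applied, and the function names are illustrative.

```python
import math


def mean_plus_k_sd(mean_error: float, sd: float, k: int = 10) -> float:
    """Average error rate (per 100 nucleotides, i.e. percent) plus k standard deviations."""
    return mean_error + k * sd


def analytical_threshold(mean_error: float, sd: float, k: int = 10) -> int:
    """Round the mean + k*SD value up to the next whole percent (assumed rounding rule)."""
    return math.ceil(mean_plus_k_sd(mean_error, sd, k))


# Total average error rates reported in the text for each MiSeq run type.
runs = {"1 kb": (0.205, 0.002), "10-plex": (0.297, 0.069)}
for name, (mean_error, sd) in runs.items():
    raw = mean_plus_k_sd(mean_error, sd)
    print(f"{name}: mean + 10 SD = {raw:.3f}%, threshold = {analytical_threshold(mean_error, sd)}%")
```

Under these assumptions both run types fall below 1% before rounding, with the 10-plex value (0.987%) being the one that sits closest to the threshold.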

Calculations and statistical analysis

The mtDNA quantification results were expressed as percent yield (observed/expected). Statistical significance was evaluated for the percent yield values by applying the non-parametric Kruskal-Wallis test [39], with post hoc testing using Dunn’s test [40] with Benjamini-Hochberg correction for multiple testing.

Degradation ratios were calculated by dividing the number of mtDNA copies based on the 69 bp (small) target amplicon by the number of copies based on the 283 bp (large) target amplicon (i.e., # of copies of small fragment/# of copies of large fragment). Statistical significance was evaluated by applying the non-parametric Kruskal-Wallis test, with post hoc testing using Dunn’s test with Benjamini-Hochberg correction for multiple testing.
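The two mtqPCR metrics defined above reduce to simple ratios. The sketch below is a minimal illustration; the function names and the example quantities are hypothetical and are not measurements from this study.

```python
def percent_yield(observed_copies: float, expected_copies: float) -> float:
    """Percent yield = observed / expected x 100."""
    if expected_copies <= 0:
        raise ValueError("expected_copies must be positive")
    return 100 * observed_copies / expected_copies


def degradation_ratio(small_target_copies: float, large_target_copies: float) -> float:
    """Copies from the 69 bp target divided by copies from the 283 bp target."""
    if large_target_copies <= 0:
        raise ValueError("large_target_copies must be positive")
    return small_target_copies / large_target_copies


# Hypothetical example: a 100,000 copies/uL sample that quantifies at 76,000 copies/uL
# with the small target and 19,000 copies/uL with the large target.
print(percent_yield(76_000, 100_000))
print(degradation_ratio(76_000, 19_000))
```

A ratio near 1 indicates that both targets amplify equally well, whereas larger ratios indicate that fewer intact templates span the longer target.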

Read depths were evaluated using base call counts taken from GM HTS output files. The average read depth per sample was calculated by averaging the total number of base calls at each np across the CR (1122 np total) for each sample. Individual average read depth values were then grouped together based on dilution category. Because the 1 kb data were generated using a 300-cycle kit and the Promega 10-plex data were generated using a 600-cycle kit, the read depth data were normalized to account for this difference. This was accomplished by obtaining the yield values from each MiSeq run using the MiSeq Sequencing Analysis Viewer (SAV) software. The 600-cycle yield was then divided by the 300-cycle yield to obtain a correction factor. Each 1 kb value was then multiplied by this correction factor.
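The normalization described above can be sketched as follows. The run yields and read depths used here are placeholders chosen for illustration, not values from the MiSeq runs in this study, and the function names are hypothetical.

```python
from typing import Sequence


def average_read_depth(base_calls_per_np: Sequence[int]) -> float:
    """Mean number of base calls per nucleotide position across the analyzed range."""
    return sum(base_calls_per_np) / len(base_calls_per_np)


def correction_factor(yield_600_cycle: float, yield_300_cycle: float) -> float:
    """600-cycle run yield divided by 300-cycle run yield (both read from SAV)."""
    return yield_600_cycle / yield_300_cycle


def normalize_1kb_depth(avg_depth_1kb: float, factor: float) -> float:
    """Scale a 1 kb (300-cycle) average read depth to the 600-cycle run."""
    return avg_depth_1kb * factor


# Placeholder inputs: three positions stand in for the 1122 nps of the CR.
depth = average_read_depth([30_000, 40_000, 50_000])
factor = correction_factor(yield_600_cycle=12.0, yield_300_cycle=6.0)
print(normalize_1kb_depth(depth, factor))
```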

The MPS results were placed into three categories: full profiles, partial profiles, and no results. Samples were still considered full profiles if major profile SNPs in the HV2 C-stretch region (nps 303–315) fell below the 200 read cutoff, because of the characteristically low read depth obtained in this region [28]. Partial profiles were defined as incomplete profiles in which not all known SNPs were observed in the major profile (missing SNPs in addition to the HV2 C-stretch). Samples in which all SNPs fell below the 200 read cutoff were defined as no results. It is important to note that read depths below 200 are acceptable for reporting haplotypes. However, when considering the reporting of heteroplasmy, read depths exceeding 2000 are desired. Therefore, read depths under 200 were categorized as examples of no results for the purpose of this analysis. Two-proportion z-tests [41] for equality of proportions without continuity correction were applied to evaluate the statistical significance of success rates for generating an mtDNA MPS profile. The alpha value for the z-test was adjusted for multiple testing using Holm’s method.
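The three profile categories can be expressed as a small decision rule. The following is a sketch of one reading of the definitions above, in which each known major-profile SNP is represented by its nucleotide position and read depth; the function name and the example donor are hypothetical.

```python
from typing import Dict

READ_CUTOFF = 200            # minimum coverage for a position to be reported
C_STRETCH = range(303, 316)  # HV2 C-stretch region, nps 303-315 inclusive


def classify_profile(snp_depths: Dict[int, int]) -> str:
    """Classify a sample from the read depths at its known major-profile SNP positions."""
    missing = {np for np, depth in snp_depths.items() if depth < READ_CUTOFF}
    if len(missing) == len(snp_depths):
        return "no results"      # every known SNP fell below the cutoff
    if all(np in C_STRETCH for np in missing):
        return "full"            # nothing missing outside the HV2 C-stretch
    return "partial"             # at least one SNP missing outside the C-stretch


# Hypothetical donor with SNPs at nps 73, 263, 315, and 16519.
print(classify_profile({73: 5000, 263: 4800, 315: 150, 16519: 3000}))
print(classify_profile({73: 5000, 263: 100, 315: 150, 16519: 3000}))
print(classify_profile({73: 50, 263: 100, 315: 150, 16519: 20}))
```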

Two different read depth cutoffs were chosen for the damage rate calculations: 4000 reads and 1000 reads. The 4000 read cutoff was selected based on our GM HTS filter settings requiring a minimum of 40 variant calls and an analytical threshold of ≥ 1%. The 1000 read cutoff was used for comparison purposes to assess the loss of damage sites due to low read depth. It should be noted that 95% of the damage sites had read depths ≥ 1000. Therefore, using a threshold of 1000 served as an exercise to ensure that a greater percentage of sites could be evaluated for damage rate calculations. Damage rates were calculated on a per nucleotide basis as follows: (total number of damage sites at a given analytical threshold/total number of viable nucleotide positions) × 100, where a viable np is defined as a position passing the minimum coverage reporting parameter corresponding to a given analytical threshold. Specifically, any np with fewer than 4000 or 1000 reads failed the corresponding minimum coverage requirement, so any damage present at that position would not be reported by the GM HTS software; such positions were not included in the calculation as viable nps. Damage rates per sample were evaluated using Kruskal-Wallis testing, post hoc Dunn’s test, and Benjamini-Hochberg correction for multiple tests.
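The damage rate formula above can be illustrated with a short sketch. The coverage and damage positions below are synthetic values constructed for the example, not data from this study, and the function name is hypothetical.

```python
from typing import Dict, Optional, Set


def damage_rate(depths: Dict[int, int], damage_sites: Set[int], cutoff: int) -> Optional[float]:
    """Damage sites per 100 viable nps; a viable np has read depth >= cutoff."""
    viable = {np for np, depth in depths.items() if depth >= cutoff}
    if not viable:
        return None  # no position passed the coverage requirement
    counted = damage_sites & viable  # damage below the cutoff would not be reported
    return 100 * len(counted) / len(viable)


# Synthetic sample: 1122 positions, the first 1000 at 5000 reads and the rest at 2000.
depths = {np: (5000 if np <= 1000 else 2000) for np in range(1, 1123)}
# Ten damage sites in the high-coverage block, two in the low-coverage block.
damage_sites = set(range(1, 11)) | {1001, 1002}

print(damage_rate(depths, damage_sites, cutoff=4000))
print(round(damage_rate(depths, damage_sites, cutoff=1000), 3))
```

Lowering the cutoff increases both the number of viable positions and the number of damage sites that can be counted, which is why the two cutoffs can give different rates for the same sample.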

A five-point Likert scale, which is a rating scale approach that assumes that the strength or intensity of the response being scaled is linear [42], was applied for the overall damage assessment. Each data point (sample) in each of the four categories was assigned a damage coefficient ranging from one to five (1–5) with five representing little to no damage and one representing the most damaged samples. The mtqPCR, degradation ratios, and damage rates data sets were each divided into thirds: the lowest third (highest for mtqPCR yield) were assigned a damage coefficient of five (5), the middle third a three (3), and the highest third (lowest for mtqPCR yield) a value of one (1). For the MPS profile results, full profiles were assigned a five (5), partial profiles a three (3), and no results a one (1). For comparison purposes, this data set was reduced to 120 data points by averaging the duplicate amplifications as follows: one full profile and one partial profile was assigned a four (4); one partial profile and one no results was assigned a two (2); and one full profile and one no results was assigned a three (3). The overall damage assessment was calculated by averaging the damage coefficients for all four categories. Likert scale data was generated using the R package Likert [43].
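The scoring scheme can be sketched as follows. This is an illustration rather than the analysis code used in the study (which was run in R): splitting each data set into thirds by rank is an assumption, ties are broken by input order, and all function names are hypothetical.

```python
from typing import List, Sequence

PROFILE_COEFFICIENT = {"full": 5, "partial": 3, "no results": 1}


def tertile_coefficients(values: Sequence[float], higher_is_better: bool = False) -> List[int]:
    """Assign 5, 3, or 1 to the least, middle, and most damaged third (by rank)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i], reverse=higher_is_better)
    coefficients = [0] * n
    for rank, index in enumerate(order):
        coefficients[index] = (5, 3, 1)[rank * 3 // n]
    return coefficients


def duplicate_profile_coefficient(first: str, second: str) -> float:
    """Average the coefficients of the two duplicate amplifications."""
    return (PROFILE_COEFFICIENT[first] + PROFILE_COEFFICIENT[second]) / 2


def overall_damage(yield_c: float, ratio_c: float, rate_c: float, profile_c: float) -> float:
    """Mean of the four category coefficients for one sample."""
    return (yield_c + ratio_c + rate_c + profile_c) / 4


print(tertile_coefficients([1.0, 3.7, 4.8, 9.7, 36.7, 145.0]))      # degradation ratios
print(tertile_coefficients([97, 76, 70], higher_is_better=True))    # percent yields
print(duplicate_profile_coefficient("full", "partial"))
print(overall_damage(5, 3, 1, 4))
```

Averaging the duplicates reproduces the intermediate values described above: four for one full and one partial profile, two for one partial profile and one no result, and three for one full profile and one no result.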

All statistical analyses were conducted using RStudio v.1.1.383 (RStudio Team, 2015) and R v.3.6.1 [44].

Results and discussion

mtQuantification and degradation

The mtDNA quantification results were expressed as percent yield (observed/expected) and compared between each dilution group: 100,000 mtcopies/μL undamaged (100 K Undam), 100,000 mtcopies/μL damaged (100 K Dam), 25,000 mtcopies/μL diluted before damage (25 K dBD), 25,000 mtcopies/μL diluted after damage (25 K dAD), 2500 mtcopies/μL diluted before damage (2.5 K dBD), and 2500 mtcopies/μL diluted after damage (2.5 K dAD), with 20 samples in each dilution group, comprising 10 donors and duplicates to cover both amplification scenarios. The mean percent yield was 97% for the 100 K undamaged samples and decreased to 76% for the 100 K damaged samples. The other dilution groups had similar mean percent yields, with the 2.5 K dBD group having the lowest mean (70%). In this dilution group, the lower end of the lower quartile also fell below 40% yield. The only statistically significant comparisons, both marginal, were between the 100 K Undam and the 100 K Dam samples (p value of 0.047) and between the 100 K Undam and the 25 K dAD samples (p value of 0.039) (Fig. 2). As expected, these findings illustrate that damaging DNA lowers yields, as seen by the 20–30% average mtDNA loss following controlled damage events. However, it is important to note that damage can manifest itself as fragmentation or as lesions that make primer binding less optimal, so lower yields can also be associated with fewer templates being available for amplification.

Fig. 2

Box and whisker plot depicting mtqPCR results as percent yield (observed/expected) for each dilution category (n = 20 per category) after exposure to DNA damage. Two outliers were observed for the 100 K Undam category (82% and 85%). Median values are depicted by lines within box plots. Kruskal-Wallis test (with Benjamini-Hochberg correction for multiple testing) p values are given in the bottom portion of the figure with the ends of the brackets denoting which datasets are being compared

Degradation ratios were calculated for each dilution category (n = 20) and depicted as a box and whisker plot (Fig. 3). As expected, there was no degradation observed in the 100 K undamaged samples (a mean ratio of 1). A 4- to 5-fold increase in degradation was observed in the 100 K and 25 K damaged samples, with mean ratios between 3.7 and 4.8. The mean degradation ratio for the 2.5 K dAD samples increased to 9.7 and increased another 4-fold to 36.7 for the 2.5 K dBD samples. The most statistically significant comparisons are as follows: p values of 8.49 × 10⁻⁷ for 100 K Undam compared with 25 K dBD, 2.31 × 10⁻⁸ for 100 K Undam compared with 2.5 K dAD, and 3.22 × 10⁻¹⁰ for 100 K Undam compared with 2.5 K dBD, with a complete list provided in Supplemental Table 2. Overall, these data clearly illustrate that as DNA template amounts decrease, the level of degradation increases. In addition, and as expected, ratios were higher when samples were diluted prior to damage, with the 2.5 K dBD dilution group having the highest degradation ratios, including the two highest outlier ratios of 145 and 349.

Fig. 3

Box and whisker plot depicting mtqPCR results as degradation ratios (mtcopies of 69 bp target/mtcopies of 283 bp target) for each dilution category (n = 20 per category) after exposure to DNA damage. The following outliers were removed from the figure: 2.1 (100 K Undam), 11 (100 K Dam), 145 and 349 (2.5 K dBD), and 28 and 39 (2.5 K dAD). Median values are depicted by lines within box plots. Kruskal-Wallis test (with Benjamini-Hochberg correction for multiple testing) p values are given in the top portion of the figure with the ends of the brackets denoting which datasets are being compared

MPS read depth/coverage characteristics

Evaluating the coverage of MPS sequence data is an effective way to assess the quality and quantity of mtDNA sequence information obtained from each sample. As a comparative assessment, normalized average read depths for the different dilution categories, as well as between the two amplification schemes, are presented in Fig. 4. Similar read depths were observed for the 100 K Undam samples when using either of the amplification approaches: an average of 83 K reads for the 1 kb samples and 69 K reads for the 10-plex samples. Once damage was applied, there was an overall pattern of higher read depths in the 10-plex samples compared with the 1 kb samples, with the exception of the 2.5 K dAD group. A small increase was observed in the read depths of the 10-plex 100 K damaged samples (61 K v. 45 K). Observations in the other dilution groups were as follows: a 7-fold read depth increase in the 10-plex for the 25 K dBD sample group; a 2.5-fold increase in the 25 K dAD group; and a 6-fold increase in the 2.5 K dBD group. These observations are expected since the target amplicons are much smaller for the 10-plex. While the 2.5 K dAD group yielded a 2.8-fold increase in the read depth for the 1 kb samples over the 10-plex samples, these findings may have been impacted by the limited data for the 1 kb samples.

Fig. 4

Comparison of normalized average read depths between different DNA damage dilution categories and two amplification and library prep strategies (1 kb v. Promega 10-plex; n = 20 per category)

Another interesting comparison was the point at which the average read depths started to decrease dramatically for each amplification scheme when considering the different damaged dilution groups. As expected, the average read depth continued to steadily decline as the input template DNA decreased with each dilution group. The largest decrease for the 1 kb amplification scenario was going from the 100 K Dam sample group to the 25 K dBD dilution group (10-fold decrease), whereas the largest decrease for the 10-plex amplification scheme was going from the 25 K dAD dilution group to the 2.5 K dAD dilution group (9-fold). These results suggest that average read depth is more negatively impacted for the 1 kb amplification approach when compared with the 10-plex amplification method. Nonetheless, it is important to note that these findings are impacted by template availability, library preparation, and software alignment and filtering parameters, so they provide only a comparative view of the quality of the data generated from samples with varying degrees of damage.

MPS mtDNA profile results

Another way to assess the quality and quantity of mtDNA MPS sequence data is by analyzing the profile results and measuring the effect of damage and degradation. To achieve this, the sequence data were placed into three categories: full profiles, partial profiles, and no results.

Samples were divided into dilution categories and the two amplification approaches were compared. Each group of data points was expressed as percentages of full profiles, partial profiles, and no results (Table 1 and Fig. 5). The results clearly illustrate that the 10-plex amplification yields higher percentages of full profiles across the dilution groups when compared with the 1 kb samples (p value of 2.25 × 10⁻¹²), although full profiles were obtained from the undamaged samples for both amplification schemes. As damage was introduced in the 100 K Dam samples, an immediate decrease was observed in the percentage of full profiles (70%) for the 1 kb strategy, with continued substantial decreases as the amount of template DNA was reduced. In contrast, 90–100% full profiles were observed in every dilution category of the 10-plex samples except for the 2.5 K dilution categories. This pattern mirrors the read depth results, with the higher read depth and higher percentage of full profiles obtained from the low template 10-plex samples most likely attributable to the increased ability of the 10-plex to amplify smaller, more degraded fragments of mtDNA.

Table 1 MPS results depicted as percentages of full profiles, partial profiles, and no results across each dilution category and comparing the two amplification and library prep strategies (1 kb v. Promega 10-plex). n = 20 per category
Fig. 5

MPS results depicted as percentages of full profiles, partial profiles, and no results across each dilution category and comparing the two amplification and library prep strategies (1 kb v. Promega 10-plex). n = 20 per category

For both amplification approaches, the 2.5 K dBD category had the lowest percentage of full profiles and the highest percentage of no results. In fact, this dilution category is the only group that had no results (40%) in the 10-plex samples. Comparing full profile results of the 2.5 K dBD dilution group to the others resulted in the following statistically significant p values: 2.47 × 10⁻¹⁰ compared with 100 K Undam, 2.04 × 10⁻⁶ compared with 100 K Dam, and 3.45 × 10⁻⁴ compared with 25 K dAD. This trend also correlates with the degradation ratio results, where the 2.5 K dBD samples had the highest degradation ratios, adding further evidence to the conclusion that diluting samples before introducing damage had more of a negative impact than diluting after the damage was introduced. The overall pattern observed in both amplification formats is that as the template DNA decreases, the amount of damage/degradation increases, which in turn negatively affects the ability to obtain full mtDNA profiles.

DNA damage rates

Damage rates (damaged sites per 100 nps) were calculated for each sample and grouped together in different ways to assess overall damage patterns. Two different read depth cutoffs were used in the calculations: 4000 and 1000 reads. To account for zero value damage rates obtained in samples with no damage, and also for samples that were too degraded and gave no results, the following values were used: 0.001 for undamaged samples and 3.11 (highest observed damage rate value +10%) for degraded samples with no results. Results across dilution groups when using the 4000 read cutoff (Fig. 6) illustrate that the 100 K Undam samples were void of damage for both amplification approaches. Following damage, rate patterns were quite dissimilar between the two approaches. The 100 K Dam samples had very low rates of damage when using the 10-plex approach (mean value of 0.005), while the 1 kb samples had significantly higher rates (mean value of 1.3). Mean damage rates increased to ~ 0.2 for the 10-plex 25 K samples, whether the damage occurred before or after dilution, while the 1 kb mean rates increased to between 2 and 3 for all 25 K and 2.5 K samples. The mean rates of the 10-plex samples increased 10-fold when going from the 25 K dilution categories to the 2.5 K. The most statistically significant increase in mean damage rate for the 1 kb samples was between the 100 K Undam group and the 100 K Dam group (p value of 5.57 × 10⁻⁵), compared with the 10-plex samples where the most significant increase was between the 25 K dBD and 2.5 K dBD groups (p value of 3.52 × 10⁻⁴). For a comprehensive list of all p values related to damage rates, see Supplemental Table 3.

Fig. 6

Box and whisker plot depicting damage rates (number of damage sites/number of total sites reported × 100) calculated across each dilution category using a 4000 read cutoff and a comparison of the 1 kb and Promega 10-plex amplification and library prep strategies (n = 20 per category)
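The damage rate defined above (number of damage sites / number of total sites reported × 100), together with the two substitute values, can be sketched as follows. The function names are hypothetical, and treating a position as reported when its depth is greater than or equal to the cutoff is an assumption; the text does not state whether the cutoff is inclusive.

```python
UNDAMAGED_RATE = 0.001  # substitute used in the text for samples with no damage
NO_RESULT_RATE = 3.11   # substitute used in the text for samples with no results


def sites_meeting_cutoff(depths, cutoff=4000):
    """Count nucleotide positions whose read depth meets the cutoff.

    Assumption: the comparison is inclusive (depth >= cutoff).
    """
    return sum(1 for depth in depths if depth >= cutoff)


def damage_rate(damage_sites, sites_reported):
    """Damaged sites per 100 reported nucleotide positions."""
    if damage_sites < 0 or sites_reported < 0:
        raise ValueError("counts must be non-negative")
    if sites_reported == 0:
        return NO_RESULT_RATE   # too degraded to yield any result
    if damage_sites == 0:
        return UNDAMAGED_RATE   # avoids a zero-valued rate
    return 100.0 * damage_sites / sites_reported


# Made-up example: 3 damage sites across 1200 reported positions.
print(damage_rate(3, 1200))
```

Lowering the cutoff from 4000 to 1000 reads can add damage sites to the numerator, but it also adds reportable positions to the denominator, so the rate does not necessarily rise.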

When comparing the two amplification scenarios side by side for each dilution group, the mean rates for the 100 K Dam, 25 K dBD and 25 K dAD groups were all significantly higher for the 1 kb samples (p values of 8.35 × 10⁻⁵, 2.85 × 10⁻⁶, and 3.52 × 10⁻⁴, respectively). For the 2.5 K groups, although the mean rates for the 1 kb samples were closer to three and the 10-plex sample rates were closer to two, the comparisons were not statistically significant. It should be noted that many of the 1 kb samples in the 25 K dBD and 2.5 K dilution groups gave lower amounts of MPS data, including a large percentage of samples with no results due to low amplification yield or read depth, and were therefore given damage rate values of 3.11. The high levels of degradation for these samples resulted in the inability to effectively amplify a 1 kb target (1.16 kb amplicon size). When damaging low amounts of template, it is expected that a greater proportion of the template pool will be impacted through damage lesions and resulting fragmentation. Therefore, the dramatic increase in rates as template decreased is expected for both amplification approaches.

The overall mean damage rate of the 1 kb samples was 2.0, significantly higher than the rate of 0.79 for the 10-plex samples (p value of 5.89 × 10⁻¹⁰). As damage occurs to DNA templates, they begin to fragment at or near the lesion sites. Given that the 1 kb approach requires longer templates for amplification, it is expected that rates of damage will be higher, as the approach captures templates that possess a higher number of damage lesions before fragmentation, consistent with previous observations in forensic casework [8]. The damage rates were also compared with those reported in [9], which ranged from 0.09 to 1.5 per 100 nucleotides. That study involved 0.5 ng/μL of nDNA (38,000–76,000 copies of mtDNA/μL), with damage induced by storage in water at room temperature for 8–34 weeks, and the target amplicon was ~ 2 kb. Given that the damage rates fell within the same general range as seen in the current study, this supports the contention that more damage will be detected as the amplicon length increases. Overall, it is clear that the 10-plex samples yielded less DNA damage and produced more MPS profile information.

Interestingly, when the damage rates were calculated using a 1000 read cutoff, the trend in rates was very similar (Supplemental Fig. 1). However, the overall mean decreased for both amplification scenarios (down to 1.53 for 1 kb and 0.46 for 10-plex). It was predicted that the 10-plex damage rate may increase closer to the 1 kb rate due to the increased number of damage sites observed within the 1000–4000 read window for the 10-plex samples (111 compared with 22 for the 1 kb samples), but it was clear that the number of total viable nucleotides also increased dramatically, especially in the lower template samples. It is therefore evident that target amplicon length is the primary factor in determining damage rates for samples when performing mtDNA MPS analysis on low amounts of highly damaged DNA templates.

Characterization of DNA damage

A major benefit of conducting a controlled study on DNA damage is that it allowed for a unique opportunity to examine and characterize the frequency and type of damage present in hundreds of DNA samples. A total of 486 damage sites were observed and analyzed across all samples in the study, with 144 sites observed in the 1 kb samples and 342 sites in the 10-plex samples. Three different areas were examined in order to characterize the DNA damage: (1) the type of base change observed, (2) the minor variant frequencies (MVF), and (3) the damage hotspots observed three or more times. It should be noted that the resulting misincorporations at damage lesions were observed in the MPS data as a minor fraction of the sequencing reads covering a specific nucleotide position and were typically not associated with SNPs. As expected, this was impacted by the number of starting templates available for the initial amplification.

The type of base change across all damage sites was expressed as a percentage of the total number of damage sites for each amplification scenario (Fig. 7). For the 1 kb damage sites, the A-G base change had the highest frequency at 29%, followed by C-T and T-C (~ 20%), C-A and G-T (~ 10%), and G-A (~ 6%). For the 10-plex damage sites, the C-T base change had the highest frequency at 33%, followed by G-T at 17%, C-A at 15%, G-A at 12%, T-C at 11%, and A-G at 9%. Both amplification formats had negligible or no damage manifesting as C-G, G-C, T-A, T-G, A-C, or A-T base changes, all of which are transversions. Instead, the types of damage correlated with deamination and depurination events, consistent with the exposure of samples to hydrolytic damage. It should be noted that type 1 deamination (A-G and T-C base changes) occurred more frequently in the 1 kb samples, whereas type 2 deamination (C-T and G-A base changes) occurred more often in the 10-plex samples. Damage lesions occurring in mtDNA casework samples such as aged bones and hair typically represent type 2 deamination when using smaller primer sets in PCR amplification [8]. In contrast, a previous DNA damage study on control samples reported predominantly type 1 deamination when using a 2 kb PCR amplification scenario [9]. These findings suggest that the type of base change may be correlated to the type of PCR amplification approach, specifically the size of the targets being amplified, and to the nature of how the damage was inflicted. Examining exclusively the occurrence of assumed type 2 deamination, 26% of the 1 kb damage sites showed type 2 lesions, compared with 45% of the 10-plex sites. A further assumption can be made that the remaining base changes were a result of depurination or other types of damage.

Fig. 7

Frequencies of each type of base change observed in 486 damage lesions across all samples and comparison of two amplification and library prep strategies (1 kb v. Promega 10-plex). Data is stacked, not overlapping. Type 1 deamination is represented by A-G and T-C base changes and type 2 deamination by C-T and G-A base changes
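The grouping of base changes used above (type 1 deamination: A-G and T-C; type 2 deamination: C-T and G-A; everything else a transversion) can be expressed as a small classifier. This is a sketch only: the function names and the example list are assumptions, and the "transversion" label does not by itself establish the underlying lesion, which the text attributes to depurination or other damage by assumption.

```python
from collections import Counter

BASES = {"A", "C", "G", "T"}
TYPE1 = {("A", "G"), ("T", "C")}  # type 1 deamination, as defined in the text
TYPE2 = {("C", "T"), ("G", "A")}  # type 2 deamination, as defined in the text


def classify_base_change(ref, alt):
    """Classify a major-to-minor base change into the categories used in the text."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in BASES or alt not in BASES or ref == alt:
        raise ValueError(f"not a valid base change: {ref}-{alt}")
    if (ref, alt) in TYPE1:
        return "type 1 deamination"
    if (ref, alt) in TYPE2:
        return "type 2 deamination"
    # The four transitions are covered above, so the remainder are transversions.
    return "transversion"


def class_percentages(changes):
    """Percentage of damage sites falling in each category."""
    changes = list(changes)
    if not changes:
        raise ValueError("at least one base change is required")
    counts = Counter(classify_base_change(ref, alt) for ref, alt in changes)
    return {label: 100.0 * n / len(changes) for label, n in counts.items()}


# Made-up list of four lesions (not study data).
print(class_percentages([("C", "T"), ("G", "A"), ("A", "G"), ("G", "T")]))
```

Applied to the reported frequencies, the type 2 totals follow directly: roughly 20% + 6% = 26% for the 1 kb sites and 33% + 12% = 45% for the 10-plex sites.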

The MVFs of damage sites are indicative of the relative amount of damage being observed at each nucleotide position and were plotted (Fig. 8) for every site in order to assess the importance of setting analytical and reporting thresholds to mitigate the effect of damage in forensic DNA casework. For example, 56% of the 1 kb damage sites and 63% of the 10-plex sites fell under the 2% MVF reporting threshold set by our laboratory and supported by assessments of error rates in empirical data [29]. In turn, 88% of the 1 kb sites and 90% of the 10-plex sites fell under a 5% threshold, while 97% of the 1 kb sites and 99% of the 10-plex sites fell under a 10% threshold. Therefore, at least for the samples in this study, setting a 5–10% reporting threshold leaves the observed damage with little impact on the interpretation of the MPS data. However, in forensic casework it is often desirable to report heteroplasmy at a lower level, ideally down to at least a 2% threshold. In this study, the MVFs of the known heteroplasmic sites among the ten donors were as follows: 13%, 5%, 3%, and two sites at 2%. This allowed for an evaluation of whether relatively low-level heteroplasmic sites could be distinguished from damage. For the vast majority of sites in this study, damage was not duplicated in the replicate PCR amplicon, further supporting that replicate analysis of extracts will mitigate the inaccurate reporting of damage as heteroplasmy [8, 9]. In contrast, a consistent pattern of heteroplasmy was duplicated in the replicate MPS data, including reproducibility in the MVFs, consistent with previous studies [26, 46, 47]. Data showing the reproducibility of the heteroplasmy MVFs in duplicate amplifications can be found in Supplemental Table 4 (standard deviations ranged from 0 to 2.8, with one outlier at 4.9). In general, duplicate amplifications showed lower standard deviations from the average compared with the overall standard deviation for each heteroplasmic site. This is due in part to the higher standard deviations observed as the amount of damage increased in most of the lower template samples. Interestingly, when looking at the individual MVFs themselves (both heteroplasmy and damage sites), no correlation was observed between increased MVFs and lower template samples, including as they became more damaged. This is a promising finding for casework applications.
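The two ideas in this paragraph, the share of damage sites falling under a given MVF threshold and the use of duplicate amplifications to separate reproducible minor variants from unreplicated ones, can be sketched as below. The function names, the data layout (a dictionary keyed by position and minor base), and the example values are assumptions for illustration and do not describe the laboratory's actual pipeline.

```python
def percent_below(mvfs, threshold):
    """Percentage of minor variant frequencies (in %) falling under a threshold."""
    mvfs = list(mvfs)
    if not mvfs:
        raise ValueError("at least one MVF is required")
    return 100.0 * sum(1 for mvf in mvfs if mvf < threshold) / len(mvfs)


def split_by_replication(rep_a, rep_b, threshold=2.0):
    """Separate minor variants seen in both duplicate amplifications from the rest.

    rep_a, rep_b: dict mapping (position, minor_base) -> MVF in percent.
    A variant counts as called only if its MVF is at or above the threshold
    (an assumption of this sketch).
    """
    called_a = {key for key, mvf in rep_a.items() if mvf >= threshold}
    called_b = {key for key, mvf in rep_b.items() if mvf >= threshold}
    replicated = sorted(called_a & called_b)    # candidate heteroplasmy
    unreplicated = sorted(called_a ^ called_b)  # possible damage
    return replicated, unreplicated


# Made-up duplicate amplifications (positions and MVFs are hypothetical).
rep_a = {(100, "T"): 13.0, (200, "A"): 3.0, (300, "T"): 2.5}
rep_b = {(100, "T"): 12.0, (200, "A"): 2.8, (300, "T"): 0.4}
print(split_by_replication(rep_a, rep_b))
print(percent_below([0.5, 1.0, 3.0, 6.0], 2.0))
```

Replication alone is not conclusive. As the hotspot results in this study show, a small number of damage lesions in the lowest template samples were also observed in both duplicate amplifications.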

Fig. 8

Individual MVFs of 486 damage lesions observed across the control region for all samples (144 in the 1 kb samples and 342 in the 10-plex samples). Proposed analytical (2%) and reporting (5%) thresholds are marked by the first and second circles moving outward, respectively, for each dataset. The minor variant frequency plots for the two library preparation kits were generated using the shiny Circos package [45]

Potential damage hotspots may be helpful when analyzing MPS sequence data and differentiating between damage and heteroplasmy. A listing of 35 potential damage hotspots observed in this study is provided (Table 2), with a “hotspot” defined as three or more occurrences of a damage lesion across all samples in the study. Out of the 35 hotspots, 20 (57%) were duplicated across the two independent experiments. The distribution of potential hotspots occurs as follows: 63% in the first half of the CR (between np 16024–16569) and 37% in the second half of the CR (1–576). Each hotspot in this study was observed three or four times, except for np 16390, which was seen seven times. Within 18 of the 35 hotspots, alternate variants or base changes were observed. For example, four instances of damage were observed at np 545, with two instances of a minor G545T lesion and two instances of a minor G545A lesion. Damage lesions most often occurred at a C or G nucleotide position, with two instances occurring at an A (nps 189 and 16265) and no instances occurring at a T. There were six instances in a total of three samples of damage lesions that occurred across duplicate PCR amplifications of the same sample. All three samples were in the lowest dilution group (2.5 K) and all six damage lesions were considered hotspots in this study (nps 172, 389, 412, 545, 16348, and 16401). When comparing the 35 hotspots to a previous damage study [9], five sites were observed in both: nps 67, 189, 16265, 16390, and 16519. The hotspots were also compared with heteroplasmy hotspots published or reported in previous studies. The first study used an STS approach and published 48 positions of heteroplasmy observed two or more times [48], and a second study applied an MPS approach and reported 26 positions of heteroplasmy that were observed five or more times (unpublished data). Interestingly, when comparing them to the current study, there is relatively little overlap. A total of seven sites (out of 35) overlapped: np 16179 (MPS study only), nps 16261 and 16291 (STS study only), and nps 189, 195, 16390, and 16519 (both heteroplasmy studies), with nps 189 and 195 exhibiting heteroplasmy in two of the current study donors. Of these seven sites, only nps 195 and 16519 were included within the top ten most observed heteroplasmic hotspots of the two previous studies. These findings will be helpful to practitioners when developing guidelines for differentiating between sites of damage and heteroplasmy.
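The hotspot definition used here (three or more occurrences of a damage lesion at the same nucleotide position across all samples) lends itself to a simple tally. The sketch below uses hypothetical sample identifiers and a hypothetical function name. Only the np 545 example (two G545T and two G545A lesions) is taken from the text, and the lesion at position 100 is made up.

```python
from collections import Counter


def find_hotspots(lesions, min_occurrences=3):
    """Return positions with at least `min_occurrences` damage lesions.

    lesions: iterable of (sample_id, position, major_base, minor_base).
    """
    lesions = list(lesions)
    counts = Counter(position for _, position, _, _ in lesions)
    variants = {}
    for _, position, major, minor in lesions:
        variants.setdefault(position, set()).add(f"{major}{position}{minor}")
    return {
        position: {"count": n, "variants": sorted(variants[position])}
        for position, n in sorted(counts.items())
        if n >= min_occurrences
    }


# Sample IDs are hypothetical; the 545 lesions follow the example in the text.
lesions = [
    ("s1", 545, "G", "T"),
    ("s2", 545, "G", "T"),
    ("s3", 545, "G", "A"),
    ("s4", 545, "G", "A"),
    ("s1", 100, "C", "T"),  # a single occurrence, so not a hotspot
]
print(find_hotspots(lesions))
```

Recording the distinct variants at each position makes it easy to flag hotspots with alternate base changes, as was observed for 18 of the 35 hotspots.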

Table 2 Potential damage hotspots (n = 35) observed three or more times across all samples in the current study

Overall damage assessment

In an attempt to assess the overall damage incurred by each sample, and by proxy each dilution group, the following four features of damage were evaluated and combined: mtqPCR yield, degradation ratios, MPS profile results, and damage rates.

Using the damage coefficients, each of the four data sets was graphed individually and then combined together as an overall damage assessment (Fig. 9). These plots accurately represent the trends and patterns observed throughout the data, as illustrated by the decreasing damage coefficients as DNA template decreased. As the overall damage assessment plot indicates, 100% of the 100 K Undam samples had a damage coefficient of five (5), immediately dropping to 57% (4), 29% (3), and 14% (2) for the 100 K Dam samples. It should be noted that the 2.5 K dBD samples had the highest overall damage (lowest damage coefficients) with 29% (3) and 71% (1/2). The overall plot also highlights the disparity between the dAD and dBD groups, clearly illustrating that the dBD samples (both 25 K and 2.5 K) have lower damage coefficients than their dAD counterparts. This is consistent once again with the conclusion that samples diluted before damage will exhibit greater damage.

Fig. 9

Overall damage assessment using a damage coefficient scale of 1–5 (1 = highest damage; 5 = little to no damage). Individual plots are shown for the four categories (datasets): damage rates, degradation ratios, MPS profile results, and mtqPCR yield, from top to bottom. The last plot shows the overall damage assessment when averaging the damage coefficients for all four categories. Comparisons between the dilution categories are depicted by the percentages of each dilution category that fall into the different damage coefficient bins (1–5)
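The final plot averages the damage coefficients of the four categories for each sample. A minimal sketch of that averaging step is given below. The function name is hypothetical, the rule for assigning each category's coefficient is not reproduced here, and rounding the mean half-up to place it in a 1–5 bin is an assumption of this sketch.

```python
CATEGORIES = ("mtqPCR yield", "degradation ratio", "MPS profile result", "damage rate")


def overall_coefficient(coefficients):
    """Average the four per-category damage coefficients (1 = highest damage, 5 = little to none).

    Returns the mean and the bin it falls in, rounding half up
    (the rounding rule is an assumption, not stated in the text).
    """
    missing = [name for name in CATEGORIES if name not in coefficients]
    if missing:
        raise ValueError(f"missing categories: {missing}")
    values = [coefficients[name] for name in CATEGORIES]
    if any(value < 1 or value > 5 for value in values):
        raise ValueError("coefficients must lie between 1 and 5")
    mean = sum(values) / len(values)
    return mean, int(mean + 0.5)


# Made-up coefficients for a single sample (not study data).
sample = {
    "mtqPCR yield": 5,
    "degradation ratio": 4,
    "MPS profile result": 3,
    "damage rate": 4,
}
print(overall_coefficient(sample))
```

Averaging gives each of the four features equal weight, so a sample with a full MPS profile but poor yield and a high degradation ratio still lands in a lower bin.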

Analysis of positive and negative controls

All MPS profile results gave the correct known haplotypes, including all positive controls. No contamination was observed in the sequencing results of the 10-plex samples. Only two instances of minimal contamination were observed in the sequencing results of the 1 kb samples. The first involved nps 16249 and 16256, which were observed in the positive control at MVFs of 16% and 22%, respectively. These two sites were also observed in two different dilution samples of Donor 2, 100 K undamaged (4% MVF) and 100 K damaged (4–6% MVF). This was most likely a random contaminant, as the two sites were not present in the known profiles of any of the ten donors and they were only observed in Experiment 1. The second instance of contamination involved np 16069, which was observed in Donors 3 and 5 in Experiment 1 and Donors 6 and 10 in Experiment 2. Each time this contaminant was observed, it was in highly damaged samples with low read depth and the 16069 variant was the only sequencing result obtained. In both experiments, there was at least one donor that had 16069 as part of their known profile; it is therefore possible that this was carry-over contamination from one donor to another during DNA extraction or PCR. No mtDNA sequence profile information was obtained from any of the negative controls across both experiments; in fact, each negative control contained an average of fewer than 50 reads per np. It should also be noted that the Promega 10-plex kit contains two degenerate primer sets to account for common variants that lie within primer binding sites, which can elicit artifacts at nps 228 and 16111. In the current study, these two artifacts were manifested as transition SNPs in the minor profile, with an MVF of 3–4% for the 228 artifact (G-A) and 7–10% for the 16111 artifact (C-T). To simplify interpretation, it is recommended that primer sequences be removed during data analysis [49].

Conclusions

The findings of this study clearly demonstrate, in a controlled environment, that DNA damage impacts the quantity and quality of mtDNA sequence when interpreting data from low template, compromised samples. As damage increases, mtDNA yield decreases, by an average of 20–30% for this study. The lower the input template DNA, the more impactful the damage, which was further evidenced by increased degradation ratios, decreased MPS read depth and profile results, and increased damage rates in the lower template samples. As expected, damage increased when occurring after the dilution of samples (dBD). These patterns were clearly illustrated through an assessment of damage coefficients. Therefore, it is particularly important to protect against further DNA damage when extracting low amounts of DNA from forensic samples and storing them in an aqueous environment. Based on this and previous studies [9, 11], it is important to store extracts at or below −20 °C and in a buffered solution (e.g., low TE) for subsequent MPS analysis and interpretation of low-level heteroplasmy.

When comparing the 1 kb and 10-plex approaches, the 10-plex samples had higher read depth in every dilution category (with the exception of one), had a greater percentage of full MPS profiles, and had a lower overall damage rate. The 1 kb samples exhibited much higher damage rates in the 100 K and 25 K dilution groups compared with the 10-plex samples, presumably due to the capturing of damage lesions from longer amplicon targets. It was also clear that a large percentage of the 25 K and 2.5 K samples were too degraded for the 1 kb approach while being successfully amplified with the smaller 10-plex targets (~ 162 versus 1160 base pairs).

An important component of this study was the comparison of samples diluted before or after the damage was applied. The majority of the findings support the conclusion that diluting samples prior to damage had a negative impact on the ability to obtain high quality MPS sequencing results, especially for the 2.5 K samples. The samples in the 2.5 K dBD group had a significantly higher degradation ratio, the lowest read depth (in the 1 kb samples only), the lowest percentage of full MPS profile results, the highest percentage of no MPS results, and the highest damage rates in the 10-plex samples. This trend was also clearly depicted in the overall damage assessment plot. This illustrates that samples with lower starting amounts of template DNA are generally more susceptible to DNA damage.

This study presented a rare opportunity to characterize hundreds of damage lesions that were a result of hydrolytic damage in a controlled setting using donor samples with known mtDNA profiles. Examination of the types of base changes, including hotspots, provides useful tools in the interpretation of mtDNA results with respect to differentiating between damage and heteroplasmy, as well as using the MVF data as guidance in setting appropriate thresholds in the MPS analysis of damaged mtDNA. For example, it is apparent that in this study and in practice, most damage can be differentiated from heteroplasmy by performing duplicate PCR amplifications. This practice is supported and recommended [8, 50], including in the ancient DNA community [20]. In this study, the only instances of damage sites being replicated through duplicate amplifications turned out to be damage hotspots. Also, in the determination of analysis thresholds, because most of the damage lesions in this study occurred at 5% MVF or less, a 5% reporting threshold may be warranted when confronted with damaged samples. In contrast, in mtDNA MPS analyses where DNA damage is not an issue, it may be possible to lower the threshold to 2%, allowing for inclusion of low-level heteroplasmy, which has the potential to increase the statistical value of an mtDNA match.

The results of this study provide an assessment of the Promega 10-plex MPS kit as an improved method of diminishing the effects of mtDNA damage on low template samples. As more forensic DNA laboratories begin to transition to an MPS protocol, it will be essential to develop robust interpretation guidelines that include not only strategies for mitigating DNA damage but also integrating the analysis of heteroplasmy as a more routine practice in mtDNA analysis.