Introduction

In forensics, mixed DNA samples are those in which body fluids or secretions from two or more individuals are combined. They often arise in sexual crimes and at large disaster scenes, as well as in products of conception and in fingernail cuttings taken by police or at autopsy [1]. Because the types of material, the number of donors, and the proportion of each component all vary, identification and interpretation are much more complicated for mixtures than for single-source samples. DNA mixtures occur routinely in forensic investigations, and years of effort have brought developments and improvements in DNA separation/extraction methods, useful genetic markers, detection platforms, and analytical tools for mixtures [2,3,4,5]. Despite such advances, forensic analysis of complicated mixtures remains time-consuming and labor-intensive. In this review, we summarize the development and progress of DNA mixture analysis and outline the existing problems, which may provide a reference for future forensic research and practical work.

Separation and DNA extraction from mixtures

For the mixed samples of sperm and vaginal fluid collected to investigate sexual crimes, Gill et al. [6] proposed a two-step differential extraction (DE) method based on the characteristics of sperm nuclei, which are protected by cross-linked thiol-rich proteins. Female epithelial cells are first preferentially lysed by preliminary incubation in SDS/proteinase K buffer, releasing their DNA into suspension; the sperm are then pelleted by centrifugation and lysed in a second buffer containing SDS/proteinase K/DTT, which breaks the protein disulfide bridges of the sperm nuclear membrane. At present, this method is the most commonly used in forensic laboratories. However, when the specimen is old, the sperm content is very low, or multiple male contributors are mixed within a complex sample, the separation efficiency of differential lysis becomes poor; moreover, the repeated washing steps can cause loss of sperm, and the pellet may retain female components due to incomplete lysis. Over the past two decades, researchers have made many attempts to improve the efficiency of the DE method [7,8,9], and several novel techniques, such as laser capture microdissection (LCM), immunomagnetic separation (IMS), and microfluidic chip analysis, have been developed to separate and extract DNA from complex mixtures [10,11,12].

Exploration of DE method

Despite the widespread application of DE, the DNA typing efficiency of sperm in a male-female mixture after sexual intercourse is still limited by the small number of sperm in the context of a wide range of female-contributed materials. Therefore, many researchers are committed to improving the effectiveness of the DE method through a variety of different approaches. Wiegand et al. proposed a modified DE method that uses a mild preferential lysis to avoid further loss of sperm DNA in a low sperm content mixture [13]. Later, Yoshida et al. developed a method in which the temperature for incubation was elevated and the concentration of proteinase K was increased in the DE process [14].

In sexual assault and rape cases, genetic analysis of perpetrator and victim DNA from vaginal cotton swabs is a well-established forensic technique [6, 15]. Voorhees et al. tested a cellulase-based enzyme mixture obtained from the fungi Aspergillus niger, Trichoderma reesei, and Trichoderma viride, and indicated its potential to improve the release of sperm and epithelial cells from a common cotton swab compared with the buffer elution of DE alone, permitting forensic DNA analysis [7]. Afterwards, their team developed a method for chemically enhancing cell elution and recovery from cotton swabs. They found that different detergents, as well as proteinase K, affected the sperm cell yield, with anionic detergent (e.g., SDS) and suitable use of proteinase K having the greatest effect [16]. Later, Benschop and colleagues indicated that although nylon flocked swabs dried much more slowly than cotton swabs after post-coital vaginal sampling, which may promote microbial growth, the cellular eluate from the nylon flocked swabs contained more cells and more intact cells; additionally, less cell material was retained on the swab after DNA extraction, and the yield of extracted DNA was higher [8]. However, cotton swabs are still the most commonly used collection tools in sexual assault cases, possibly because the two kinds of swabs differ less in practical casework than in research studies.

Most recently, a semi-quantitative ratio-based analysis, termed Separation Potential Ratio of the Extraction Differential (SPRED), was conducted to evaluate the separation efficiency of both sperm DNA recovery and non-sperm DNA removal in the DE methods [3]. A higher SPRED ratio indicates a higher potential for obtaining a primarily male component in the sperm fraction. The SPRED value of the two-step DE could be improved significantly by performing a second non-sperm lysis step, which reduced the non-sperm cell DNA carryover with no concomitant reduction in sperm DNA recovery.

LCM

LCM technology has provided a significant breakthrough for the separation of a limited number of sperm cells from an overwhelming quantity of female epithelial cells [10]. It combines laser beam technology with light microscopic devices and targets specific cells or tissue sections that need to be isolated from others. Under direct microscopic visualization, the cells of interest are isolated by infrared (IR) capture systems [17] or ultraviolet (UV) cutting systems [18,19,20], and then separated cells or tissue regions are placed into independent tubes for DNA extraction and analysis [21,22,23].

Lucy et al. noted that because spermatozoa are haploid, each contains only half of a contributor’s genetic material, and that the theoretical number of sperm that must be pooled for a full representation of the alleles in the contributor profile is 15–20 intact, non-degraded sperm [24]. However, manual screening of microscope slides for morphological identification of sperm is time-consuming and labor-intensive. Vandewoestyne et al. introduced an automated screening method to detect spermatozoa stained with Sperm HY-LITER, followed by LCM for isolation. Their DNA analysis showed that a minimum of 30 spermatozoa recovered from post-coital samples could generate a robust DNA profile without allelic dropout [25]. Other studies have proposed staining methods for sex-specific labeling of cells for LCM [26,27,28]. Fluorescence in situ hybridization (FISH) with Y-chromosome-specific probes has also been used to separate male cells from male-female mixtures [29].
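The pooling argument can be illustrated with a short calculation. The sketch below is not the derivation used in [24]; it assumes that each sperm carries either allele of a heterozygous locus with probability 1/2, that loci segregate independently, and that no allele is lost to dropout or degradation. The function name and the choice of 15 heterozygous loci are hypothetical.

```python
def prob_all_alleles_seen(n_sperm: int, n_het_loci: int) -> float:
    """Probability that both alleles of every heterozygous locus appear
    at least once among n_sperm pooled haploid cells.

    Assumptions (illustrative only): independent loci, each allele
    transmitted with probability 1/2, no dropout.
    """
    if n_sperm < 1:
        return 0.0
    # A locus is incompletely represented only if all cells carry the
    # same allele: 2 * (1/2)^n.
    per_locus = 1.0 - 2.0 * 0.5 ** n_sperm
    return per_locus ** n_het_loci


if __name__ == "__main__":
    for n in (5, 10, 15, 20, 30):
        print(n, round(prob_all_alleles_seen(n, n_het_loci=15), 5))
```

Under these idealized assumptions the probability of a complete profile rises steeply with the number of pooled cells, which is in line with the order of magnitude quoted above. Real samples need more cells because of degradation and stochastic amplification effects.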

For a long period, researchers used multiple techniques to extract DNA from cells obtained from LCM. Vandewoestyne and his colleagues compared three different DNA isolation methods for laser microdissected blood cells and found that the PicoPure DNA extraction kit (Arcturus, Mountain View, CA, USA) performed better than the DNA IQ™ system extraction kit (Promega Corporation, Madison, WI, USA) and the short alkaline DNA extraction method. The cell collection, lysis, and PCR could all be conducted in the same tube when using the PicoPure DNA extraction kit, which reduced the contamination risk, and as few as 10 cells could be used to detect the full DNA profiles [11]. Han et al. used a strategy for sperm isolation and short tandem repeat (STR) typing from multidonor sperm mixtures in which they applied LCM and low volume-PCR (LV-PCR) for single sperm isolation and detection [30]. In their study, the platform was so sensitive that the profiling of a single sperm cell could generate a minimum of 13–16 loci in nearly three quarters of the Y-chromosome STR (Y-STR) assays.

IMS

The IMS of eukaryotic and prokaryotic cells has been widely explored. Cell separation by IMS involves two steps. The first is the specific binding of immunomagnetic beads to cells, the core of which is the antigen-antibody reaction. The second is the exposure of the resulting immune complexes (antigen-antibody and magnetic beads) to an appropriate magnetic field [31]. In 2002, Eisenberg et al. attached three monoclonal anti-sperm antibodies (MHS-10, NUH-2, and HS-21) to magnetic beads and evaluated them for capturing sperm from mixtures containing varying numbers of vaginal epithelial cells and sperm. None of the three antibodies bound the vaginal epithelial cells, and the MHS-10 antibody bound as much as 90–95% of the input sperm [32]. In Anslinger’s study, another three monoclonal anti-sperm antibodies, 1E10, 4E3, and 4E10, directed against the testicular isoform of angiotensin-converting enzyme (tACE), were tested, among which 4E3 proved the most effective [33]. However, the vaginal swabs had to be stored in PBS buffer to provide a sufficient amount of intact sperm with mid-piece and flagellum, which is a drawback for forensic application. Zhao et al. isolated sperm from different mixtures of sperm cells and buccal cells with an anti-SPAG8 antibody, which binds a specific antigenic protein located on the sperm head [34]. Wang et al. used a complex of biotin-labeled rabbit anti-human sperm antibody and avidin-labeled magnetic beads to separate sperm from mixtures; their results indicated that polyclonal antibodies interact with multiple sperm surface proteins and thus have a greater likelihood of binding sperm and a stronger ability to form stable sperm-antibody-biotin complexes [35].

Microfluidic chip/LOC

With multiple laboratory techniques, including cell sorting, DNA extraction, DNA quantitation, and DNA amplification, all integrated in a chip of a few tens of square centimeters, microfluidic devices can provide fast genetic analysis for forensic application. The microfluidic chip, also called lab-on-a-chip (LOC), minimizes sample handling by keeping the sample in a sealed microfluidic environment, which reduces the risk of cross-contamination and opens the possibility of direct analysis at the crime scene [36,37,38]. Many studies have improved on the microfluidic chip [39]. Because sperm and epithelial cells differ in size, chemical composition, and membrane structure, they respond differently to a non-uniform electric field, and the resulting difference in motion under dielectrophoresis (DEP) may be used to separate them. Buoncristiani’s group applied a commercially available Silicon Biosystems DEPSlide™ System that could separate sperm and epithelial cells in a microfluidic chip. However, they found that the purity and yield of the separated cells were not as good as expected from visual observation and that DEP separation was no better than standard DE [40]. In further developmental efforts, the DEP platform was integrated into the microfluidic system [41], which may allow inexpensive, fast, highly sensitive, and label-free detection and analysis of sperm and epithelial cells in sexual assault cases. In addition, acoustic differential extraction (ADE) was developed on a microfluidic device; it relies on the acoustic trapping of sperm cells while the female components are transferred to a separate outlet [42, 43]. Because it involves lysis steps similar to those of the DE method, problems with typing failure caused by over-digestion or incomplete digestion will inevitably arise. Furthermore, Fontana et al. applied an image-based microfluidic digital system, DEPArray™ Technology (Menarini Silicon Biosystems, Italy, MSB), to recover and detect pure homogeneous cells from simulated blood/saliva and semen/saliva mixtures as well as casework mixtures with outstanding precision [44]. Although this approach is not a current standard, it could promote the investigation and development of this technology for forensic biological mixtures.

The LOC has been proven to be a versatile technology that is fast, efficient, and integrated, with low sample input volume and low reagent consumption. The chip not only can carry out multithroughput operations but also is lightweight and small in size such that it is portable. Therefore, the LOC is very conducive to the realization of automated separation of sperm from challenging biological mixtures. However, the technology of microfluidic chips is still in development, and before it can truly be applied in forensic scenes, several problems need to be overcome: the cost must be reduced, additional DNA or RNA analysis techniques must be integrated for challenging samples, and uniform standardization for microfluidic interconnection, chip dimension, and vocabulary needs to be established to fully exploit the superiorities provided by LOC [37].

Other improved methods

In 2003, Garvin et al. created a vacuum-driven filtration system that allowed the DNA of digested diploid cells to pass through while sperm were stuck on the membrane surface. In this way, sperm were separated from mixtures containing a large number of female epithelial cells [45]. This method was considered to be more automatable and efficient than standard DE, but for old or severely degraded samples, cell recovery was not ideal due to the sperm clogging or adhering within the filtration membranes.

Schneider et al. described a new procedure in their work to evaluate the combination of the sensitive staining method Sperm Hy-Liter™, which allows staining even if the sperm morphology has disintegrated, with micromanipulation using the aureka® cell isolation system to isolate a low number of sperm on a microscope slide (as few as 20 sperm) to generate full STR profiles. Typically, laser microdissection (LMD) technology needs transparent objects, whereas aureka® is an open system that can handle various kinds of materials on different surfaces at different heights [46, 47]. Subsequently, Moors et al. also indicated that the Sperm Hy-Liter™ assay is highly specific and sensitive for human spermatozoa and reliable and simple to use for the detection of spermatozoa in a variety of sexual assault samples [48].

The high specificity and sensitivity of flow cytometry, in the form of fluorescence-activated cell sorting (FACS), provide a possible means of sorting sperm cells for DNA analysis [49, 50]. Technological advances have broadened the potential applications of FACS, enabling it not only to separate sperm from mixtures containing vaginal epithelial cells [51, 52] but also to sort other cells of forensic interest, for example separating saliva-derived epithelial cells from blood-derived leukocytes [53]. However, the loss of some cells during staining and separation requires attention.

In other research directions, aptamer technology has been used to capture and separate intact sperm cells in the presence of female epithelial cells and other non-sperm semen components [54, 55]. The haplotype-specific extraction (HSE) method has been used to separate Y-chromosomal DNA from two-male mixtures into haploid fractions and to separate and analyze the mitochondrial DNA (mtDNA) of samples containing DNA from two individuals, although it is difficult to apply when there are more than two donors or the DNA quality is low [56,57,58,59]. Koukouvinos et al. demonstrated a sensitive and accurate biosensor based on white light reflectance spectroscopy for identifying prostate-specific antigen (PSA) in forensic samples containing semen; the assay could be completed within 10 min and had a low detection limit of 0.5 ng/mL PSA [60].

Another article proposed a robust real-time PCR system, termed InnoQuant® HY, which provided accurate quantitation of the DNA of a male from challenging male-female mixtures with a male to female ratio of as low as 1:128,000, thus allowing better decision-making about the appropriate DNA input amount for PCR [61]. The system provides significant data on the male to female ratio, degradation state, and the presence or absence of PCR inhibitors to support a more efficient and clear downstream workflow involving complex biological specimens.

Blood, saliva, or other body fluids can also be common components of mixtures from forensic scenes, and several studies have been devoted to their identification [62, 63]. For example, Yano and his colleagues used anti-human leukocyte CD45 and ABO blood group antibodies to separate leukocytes from mixed bloodstains involving different contributors, and the target DNA could be detected accurately with a ratio of as low as 1:512 [64]; this was only applicable for discriminating from a mixture containing different blood types.

Detection technology and genetic markers for mixture analysis

Detection technology

Capillary electrophoresis (CE) is still the most prevalent technology in contemporary forensic investigation of mixed DNA samples, and most of the available calculation methods and statistical software are based on STR genotypes generated by CE. The CE platform has some limitations for mixture analysis, such as the limited number and length of fluorescently labeled PCR fragments in a multiplex system [65], the low efficiency of handling degraded or trace samples, and the difficulty in distinguishing genuine alleles of minor contributors from stutter peaks of major contributors [66]. Many forensic analysts believe that the rapidly developing massively parallel sequencing (MPS) technology may overcome some of these limitations. For example, Morling’s team compared traditional PCR-CE and MPS at three STR loci in the Danish population and observed that approximately 30% of the samples typed as homozygous with CE turned out to be heterozygous when sequenced by MPS [67]. With MPS, STR alleles can be recognized not only by repeat number but also by sequence variation, which may simplify mixture interpretation [68]. Moreover, the MPS platform can detect a large number of different types of genetic markers with overlapping sizes, including STRs, single nucleotide polymorphisms (SNPs), and insertions or deletions (indels), and the short PCR fragments may help in dealing with degraded or trace mixtures [5, 68]. Last but not least, MPS is more sensitive than CE: the former can detect the minor donor in a 1:100 mixture, while the latter cannot handle ratios below 1:10 [5, 69, 70]. However, population genetic data generated on the MPS platform are still limited because the technology is not yet widely adopted or standardized, which should be addressed by further study.
In general, MPS technology has the absolute advantage that the detailed sequence information provided by MPS may facilitate mixture interpretation and increase the statistical weight of the evidence, which offers new possibilities for forensic genetic casework [71].
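The difference between length-based and sequence-based allele calling can be shown with a toy example. The repeat sequences below are hypothetical and flanking regions are ignored; the sketch only illustrates why two alleles of identical length collapse into one CE call but remain distinct on a sequencing platform.

```python
def ce_alleles(repeat_regions, motif_len=4):
    """Length-based calling: CE distinguishes fragments only by size,
    expressed here as the number of repeat units."""
    return sorted({len(seq) // motif_len for seq in repeat_regions})


def mps_alleles(repeat_regions):
    """Sequence-based calling: alleles of equal length are still
    distinguished if their sequences differ."""
    return sorted(set(repeat_regions))


# Hypothetical repeat regions from one sample: both are 10 repeats long,
# but the second carries an internal variant motif (TCTG).
sample = [
    "TCTA" * 10,
    "TCTA" * 4 + "TCTG" + "TCTA" * 5,
]

if __name__ == "__main__":
    print("CE calls: ", ce_alleles(sample))        # one length-based allele
    print("MPS calls:", len(mps_alleles(sample)))  # two sequence-based alleles
```

In a mixture, the extra sequence-level alleles increase the chance that a minor contributor shows an allele not shared with the major contributor.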

Genetic markers

Many genetic markers, including STRs, SNPs, and indels of autosomes and Y-chromosomes, are commonly used for mixture detection and analysis [30, 72, 73]. Y-STR analysis can detect male components directly to avoid the loss of DNA. The number of donors in a mixture can be estimated on the basis of the number of Y-STR alleles, which is of great significance in the analysis of gang rape cases [74, 75]. In addition, mtDNA plays an important role in degraded specimens and can also be used to determine the number of individuals in a mixture [76, 77].
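Allele counting gives a lower bound on the number of donors. The sketch below assumes single-copy Y-STR loci, where each male contributes at most one allele, and autosomal loci, where each person contributes at most two. Multi-copy Y loci, drop-in, and stutter are not handled, and the profiles are hypothetical.

```python
import math


def min_male_contributors(y_profile):
    """Lower bound on male donors from single-copy Y-STR loci:
    each male contributes at most one allele per locus."""
    return max((len(alleles) for alleles in y_profile.values()), default=0)


def min_autosomal_contributors(profile):
    """Lower bound on donors from autosomal loci:
    each person contributes at most two alleles per locus."""
    return max((math.ceil(len(alleles) / 2) for alleles in profile.values()),
               default=0)


# Hypothetical mixture profiles (locus -> set of observed alleles).
y_mix = {"DYS19": {"14", "15"}, "DYS390": {"23", "24", "25"}, "DYS391": {"10"}}
auto_mix = {"D16S539": {"9", "10", "13", "14"}, "TH01": {"6", "7", "9.3"}}

if __name__ == "__main__":
    print(min_male_contributors(y_mix))          # at least 3 males
    print(min_autosomal_contributors(auto_mix))  # at least 2 donors
```

Because donors can share alleles, the true number of contributors may be higher than this bound.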

STRs

STR markers have been used in the analysis of mixed samples for many years, while allelic dropout and drop-in often occur due to the unbalanced DNA of the contributors, which makes it difficult to establish consensus guidelines [78,79,80]. Limited numbers of STRs available in commercial kits provide low statistical strength of inclusion in mixtures with more than three contributors or in cases where the minor contributor component is extremely low. From that point of view, the combination of autosomal and Y-chromosomal STR analysis was conducted, which provided an additional means for the investigation of mixtures in sexual assault cases, because Y-STR is highly valuable in some cases in which the minor male proportion in the DNA mixtures is undetectable by autosomal STR analysis [30, 81, 82]. Recently, nine novel pentameric STRs with lower stutter peak ratios were strongly proposed to aid in the analysis of mixed DNA profiles when minor donor alleles may coincide with stutter peaks from the major components [83].

The collaborative RNA/DNA co-analysis exercise results of the European DNA Profiling Group (EDNAP) illustrated special mRNA profiling as a reliable body fluid identification method that can easily be combined with standard STR typing technology [84,85,86,87]. Uchimoto et al. applied an in-house miRNA analysis to stains containing a mixture of blood and saliva that could determine the presence of more than one body fluid and major/minor contributors with a lower limit of detection than the enzymatic equivalent [88]. Future work will be expected to explore the use of the abovementioned methods with more body fluids, including seminal fluid, menstrual blood, and vaginal fluid.

SNPs

Voskoboinik et al. [89] reasoned that if an individual carries a number of rare alleles and a complex DNA mixture also contains these particular alleles, then this individual is likely represented in the mixture. Based on this rationale, they proposed a theoretical framework for a method using a large number of SNPs in 2011 and, in 2015, presented an implementation utilizing a panel of 3000 SNPs with relatively low minor allele frequency (MAF), below 0.25 [72]. The later study could robustly identify individuals who contributed at least 5% to a mixture and examined the usefulness of whole genome amplification (WGA) of complex mixtures before allelotyping to handle low DNA quantities. However, this method has some shortcomings: it can only be applied in cases with a known suspect and is not compatible with common STR databases. Establishing a SNP database and combining it with existing STR databases would both require arduous work. Further research should be carried out to validate the proposed panel and to investigate forensically focused SNP microarrays for DNA mixtures.

A study by Liu et al. used a sensitive system containing primer-specific alleles of 14 indel markers to detect ratios as low as 1:500 to 1:1000 minor components in two-source mixtures [73]. Recently, Hwa and his colleagues established a 1204 SNP/indel panel optimized for MPS consisting of a different number of SNPs and indels located on the autosomes, the X/Y chromosomes, and the mitochondria, combined with a scoring system, which could accurately identify minor contributors contributing 1% or more to DNA mixtures [5]. This panel enabled the successful simultaneous analysis of numerous different markers and was believed to be more sensitive and flexible than traditional CE approaches. The large number of loci in the panel could also increase the statistical power of the discrimination, which could be applied in individual identification for forensic DNA mixtures as a primary method or as a supplement to STR analysis.

mtDNAs

The Spanish and Portuguese working group of the International Society for Forensic Genetics (GEP-ISFG) has performed mtDNA collaborative exercises over many years [90,91,92,93,94,95]. They undertook extraordinary efforts to improve the quality and standardization of mtDNA analysis, including the analysis of different mixtures (semen-saliva, hair-saliva, and saliva-saliva), from both methodological and theoretical perspectives. The success rate for analyzing mixed samples by mtDNA was moderate, and they noted that errors in mtDNA testing occur mainly due to the lack of a soundly devised experimental approach. In some cases involving mixed semen stains, it is more difficult to identify individuals than merely to exclude unrelated ones. Zhang et al. first applied the LCM system to isolate single sperm cells. Then, the mtDNA hypervariable region I (HVR I), which is polymorphic among individuals from different maternal lineages, was amplified from each cell; by pooling the cellular DNA of cells sharing the same HVR I sequence, enough nuclear DNA was obtained for STR typing by reamplification [96]. They indicated that a collection of 20 sperm could be typed correctly, but more than 30 sperm cells should be collected to avoid errors during the final typing process. Zander et al. performed haplotype-specific extraction (HSE) to separate the mtDNA of each donor from a two-source mixture and subsequently sequenced the mtDNA to reveal the underlying individual haplotypes successfully; the method is limited in cases with more than two donors or low-quality DNA, which can make separating mtDNA haplotypes by HSE difficult [59]. However, mtDNA cannot differentiate individuals in the same maternal line, so other markers are necessary to identify individuals from a mixture. Therefore, Hwa et al. combined 129 mitochondrial SNPs and other markers into their 1204-marker panel to analyze non-degraded and highly degraded DNA mixtures [5].

New markers

DIP-STRs

There have been several novel sets of genetic markers proposed in recent years, such as pairing deletion/insertion polymorphisms (DIP) with standard STR, called DIP-STR. Hall’s team proposed DIP-STR as an innovative genetic marker composed of a DIP tightly adjacent to an STR polymorphism, detected by a specially designed pair of PCR primers (L-long allele for insertion and S-short allele for deletion). These compound markers are able to target a minor component in the presence of a 1000-fold excess of background DNA from an unbalanced DNA mixture [97]. They compared DIP-STRs with traditional STRs and Y-STRs from a statistical and forensic perspective, which indicated that each method has its own advantages and that the use of DIP-STR markers could be of interest for all kinds of DNA mixtures [98]. They then applied the first set of 10 DIP-STR markers to estimate the allelic frequencies of a group from the Swiss population and the presence of informative alleles, as well as to calculate the random match probability of the minor DIP-STR profile detected across a large number of DNA mixtures in silico [99]. Six of these 10 DIP-STRs were also powerful for use in unbalanced DNA mixture investigations of the southwest Chinese Han population [100]. Recently, Hall et al. used another set of six DIP-STR markers for the analysis of unbalanced mixtures from challenging “touch” DNA samples to detect the minor donor and found an analogous sensitivity and similar occurrence of allelic dropout in comparison with Y-STRs [4]. They also reported the first use of 18 validated DIP-STRs in eight real cases, in which these markers were found to be more sensitive and specific and performed well on challenging DNA samples of both sexes [101].

It is noteworthy that DIP-STRs can be used in DNA mixtures containing low levels of DNA or extremely unbalanced major/minor ratios, irrespective of the sexes of the donors. They are more powerful than Y-STRs when male suspects from the same male lineage or from a highly inbred population are involved. For same-sex DNA mixtures or a female perpetrator, it can be practical to use DIP-STRs instead of Y-STRs in further investigation. In fact, DIP-STR markers can complement Y-STR data and help to provide new clues for investigation. In addition, DIP-STR data may be helpful in estimating the number of contributors and strengthening the DNA evidence from a mixed sample. Even so, DIP-STR markers have several limitations. First, a marker is informative only when the major contributor is homozygous for one DIP allele and the minor contributor is heterozygous or homozygous for the opposite allele. Second, multiplexed assays for DIP-STRs have not yet been established, so genotyping these markers still requires several hundred picograms of DNA template, which is frequently not available in routine forensic practice. Lastly, DIP-STR markers cannot be used to analyze a DNA mixture of more than two individuals.
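The informativeness condition stated above can be written as a small check. This is a minimal sketch with hypothetical names: DIP genotypes are represented as pairs of "L" (insertion) and "S" (deletion), and the STR part of the marker is ignored.

```python
def minor_is_detectable(major_dip, minor_dip):
    """Return True if an allele-specific DIP primer can target the minor
    contributor without amplifying the major contributor.

    Genotypes are 2-tuples of "L" (insertion) or "S" (deletion).
    """
    major_alleles = set(major_dip)
    if len(major_alleles) != 1:
        # A heterozygous major contributor is amplified by both the
        # L-specific and the S-specific primer, masking the minor donor.
        return False
    (major_allele,) = major_alleles
    # The minor donor must carry at least one copy of the opposite allele.
    return any(allele != major_allele for allele in minor_dip)


if __name__ == "__main__":
    print(minor_is_detectable(("S", "S"), ("L", "S")))  # True
    print(minor_is_detectable(("L", "L"), ("S", "S")))  # True
    print(minor_is_detectable(("L", "S"), ("S", "S")))  # False
    print(minor_is_detectable(("S", "S"), ("S", "S")))  # False
```

A panel is useful for a given pair of donors only if at least some of its markers pass this check, which is why the number of informative markers varies from case to case.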

Microhaplotypes

To achieve the purposes of human familial identification and ancestry inference, Kidd’s team defined a mini-haplotype as a locus containing three or more SNPs within a span of 10 kilobase pairs (kb) [102]. Similarly, they identified many loci with two or more SNPs within a smaller molecular interval of 200 base pairs in the human genome; when linkage disequilibrium among the SNPs is not complete, these loci are regarded as microhaplotype loci (microhaps) [103]. They illustrated that microhaps, with high heterozygosity and regional or global Fst, should be capable of providing information for ancestry inference, lineage-clan-family relationships, and individual identification [104]. Additionally, microhaps have great value for disentangling mixtures. When microhaplotype loci are genotyped by sequencing, a DNA mixture can be detected qualitatively when three or more alleles with a sufficient number of reads appear at a locus, and the different numbers of reads for each allele provide additional information. MPS makes it feasible to genotype a cluster of SNPs together and thus detect microhaps. Kidd and Speed defined the term “Ae” as the effective number of alleles at a microhaplotype locus; microhaps with Ae greater than three will be useful in routine forensic practice, and three-SNP microhaps will sometimes meet this criterion. They indicated that the Ae value correlates positively with the ability of microhaps to discriminate mixed samples. When five microhaps with an average Ae value of four were typed, the cumulative probability of qualitative detection of a mixture was greater than 99% [105]. Later, Kidd proposed a nomenclature for microhaps [106], and Zhu et al. designed the novel software FLfinder to type microhaps [107]. The great potential of optimally selected microhaps in forensic casework has encouraged the search for novel loci better suited for forensic applications.
Of course, there are some aspects that need to be improved in the development of microhaps. For example, the applicability of more than 100 existing loci proposed by Kidd’s team in other populations remains to be explored. MPS technology is less prevalent in forensic laboratories, which is also one of the factors limiting its development.
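The effective number of alleles is conventionally computed as the reciprocal of the sum of squared allele frequencies (the reciprocal of expected homozygosity). The text above does not spell out the formula, so the sketch below assumes this standard population-genetics definition; the haplotype frequencies are hypothetical.

```python
def effective_alleles(freqs, tol=1e-6):
    """Effective number of alleles, Ae = 1 / sum(p_i^2).

    freqs: haplotype (allele) frequencies at one microhaplotype locus;
    they must sum to 1 within the tolerance.
    """
    if abs(sum(freqs) - 1.0) > tol:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 / sum(p * p for p in freqs)


if __name__ == "__main__":
    # Four equally frequent haplotypes give the maximum Ae for four alleles.
    print(effective_alleles([0.25, 0.25, 0.25, 0.25]))
    # Skewed frequencies lower Ae even though four haplotypes exist.
    print(round(effective_alleles([0.7, 0.1, 0.1, 0.1]), 3))
```

The second case shows why the count of observed haplotypes alone is not enough: a locus dominated by one common haplotype contributes little to mixture detection.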

Interpretation of mixture profiles

Calculation models

Owing to the extensive use of STR markers in mixture analysis on the CE platform, researchers have established several computational models for interpreting the STR typing of mixed samples. In forensic casework, DNA mixture interpretation using STR loci was initially based on a binary model, which may be qualitative or semi-quantitative; it is simple to apply but does not use peak heights in the computation for two-source mixtures [108, 109]. Peak height information is of benefit for analyzing mixed profiles. For example, four peaks were detected at the D16S539 locus for a two-donor mixture (Fig. 1); if peak height is not considered, the six possible genotype combinations are given the same weight (Table 1). Although the binary model has served well for many years in forensic cases, it is not suitable for questionable low-level or low-template DNA (LT-DNA) samples, which often contain loci showing non-concordances, and it thus carries a risk of being misused. In addition, the binary model cannot consider multiple replicates and has difficulty handling multiperson mixtures [110].

Fig. 1 The electropherogram of the D16S539 locus for a two-donor mixture in casework from our lab. Allelic peak height data are shown under the allelic call

Table 1 Six possible genotype combinations for Fig. 1 without peak height information. The only combination consistent with the peak height information is shown in bold
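
As an illustration of where the six combinations come from, the sketch below enumerates every way of assigning four observed alleles to two heterozygous contributors. The allele labels 9, 10, 13, and 14 are those named in the text for this locus; the function name is hypothetical.

```python
from itertools import combinations


def two_donor_combinations(alleles):
    """List (contributor 1, contributor 2) genotype pairs for a four-peak locus.

    Each contributor is assumed to be heterozygous and each observed
    allele is assumed to come from exactly one contributor.
    """
    alleles = sorted(set(alleles))
    if len(alleles) != 4:
        raise ValueError("exactly four distinct alleles are expected")
    pairs = []
    for first in combinations(alleles, 2):
        second = tuple(a for a in alleles if a not in first)
        pairs.append((first, second))
    return pairs


for contributor_1, contributor_2 in two_donor_combinations([9, 10, 13, 14]):
    print(contributor_1, contributor_2)
```

Choosing two of the four alleles for the first contributor fixes the genotype of the second, which gives the six ordered combinations that a binary model without peak heights weights equally.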

The shortfalls of the binary model mentioned above have motivated the development of other models. Like the binary model, a semi-continuous model does not make direct use of peak heights; however, its parameters can be informed by the data, and a different dropout rate is allowed for each contributor [111, 112]. In the semi-continuous model, drop-in can be distinguished from contamination, and multiple replicates can be handled. The probabilities of all possible genotype groups in the mixture profile and the likelihood ratio (LR) for a series of propositions can be rapidly calculated in software employing this model [113]. The semi-continuous model is thus an improvement that can be used in the analysis of complex mixtures and LT-DNA, but it still ignores peak height information, and its implementation always requires specialized software.
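
The treatment of dropout and drop-in in a semi-continuous model can be sketched for a single locus and a single replicate as follows. This is a deliberately simplified illustration and not the implementation of any particular software: the function name, the parameter values, and the assumption that each allele copy drops out independently (so that a homozygote drops out with the squared probability) are all assumptions made here.

```python
def replicate_probability(observed, genotypes, dropouts, drop_in, freqs):
    """Probability of the observed allele set given proposed genotypes.

    observed  : set of allele labels seen in the replicate
    genotypes : one (allele, allele) tuple per contributor
    dropouts  : per-contributor allele dropout probability, same order
    drop_in   : probability of a drop-in event (hypothetical parameter)
    freqs     : allele frequencies, used to weight drop-in alleles
    """
    # For each carried allele, probability that every copy of it drops out.
    all_copies_drop = {}
    for genotype, d in zip(genotypes, dropouts):
        for allele in genotype:
            all_copies_drop[allele] = all_copies_drop.get(allele, 1.0) * d

    prob = 1.0
    for allele, p_drop in all_copies_drop.items():
        # A carried allele is either seen (no full dropout) or missing.
        prob *= (1.0 - p_drop) if allele in observed else p_drop

    # Observed alleles that no contributor carries must be drop-ins.
    unexplained = [a for a in observed if a not in all_copies_drop]
    if unexplained:
        for allele in unexplained:
            prob *= drop_in * freqs[allele]
    else:
        prob *= 1.0 - drop_in
    return prob


# Hypothetical example: a major donor with low dropout and a minor donor
# with high dropout; allele 13 of the minor donor is missing.
p = replicate_probability(
    observed={9, 10, 14},
    genotypes=[(10, 14), (9, 13)],
    dropouts=[0.05, 0.4],
    drop_in=0.05,
    freqs={9: 0.1, 10: 0.1, 13: 0.2, 14: 0.05},
)
print(round(p, 5))
```

In a sketch of this kind, replicates would typically be treated as independent given the genotypes, so their probabilities are multiplied, and the LR is obtained by summing such terms over the unknown genotypes under each proposition.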

The continuous model, proposed a few years ago, incorporates peak heights and uses this information to assess all possible genotype combinations of the contributors by calculating their probabilities. For instance, peak height and mixture ratio information (based on data from multiple loci) can help eliminate possible genotype combinations in Table 1. For a 3:1 mixture, the correct genotypes are 10,14 for contributor 1 (the major one) and 9,13 for contributor 2 (the minor one) (shown in bold font). The continuous model has the potential to cope with any kind of non-concordance and can assess multiple replicates without preprocessing or information loss [110]. PCR stochastic effects and potential stutters are taken into account, and the quantitative information used by this model makes it relatively objective [114, 115]. The continuous method also requires specialized software and must be calibrated for different STR systems and conditions.
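
The way peak heights and a mixture ratio narrow down the six combinations can be illustrated with a simple least-squares ranking. This is a stand-in for intuition only, not the probabilistic continuous model itself; the peak heights below are hypothetical values invented for the example (they are not the data of Fig. 1), and the function name is likewise an assumption.

```python
from itertools import combinations


def rank_by_peak_height(heights, major_fraction):
    """Rank major/minor genotype assignments by fit to the peak heights.

    heights        : dict mapping allele label to peak height (RFU)
    major_fraction : proportion of the signal from the major donor
    Both donors are assumed heterozygous with balanced peaks.
    """
    alleles = sorted(heights)
    total = sum(heights.values())
    ranked = []
    for major in combinations(alleles, 2):
        minor = tuple(a for a in alleles if a not in major)
        expected = {a: major_fraction * total / 2 for a in major}
        expected.update({a: (1 - major_fraction) * total / 2 for a in minor})
        # Sum of squared differences between observed and expected heights.
        sse = sum((heights[a] - expected[a]) ** 2 for a in alleles)
        ranked.append((sse, major, minor))
    return sorted(ranked)


# Hypothetical peak heights for a 3:1 mixture at a four-allele locus.
hypothetical_heights = {9: 310, 10: 905, 13: 295, 14: 880}
best_sse, best_major, best_minor = rank_by_peak_height(hypothetical_heights, 0.75)[0]
print("major:", best_major, "minor:", best_minor)
```

With these illustrative heights, the assignment of 10,14 to the major donor and 9,13 to the minor donor fits best. A continuous model goes further by assigning a probability to every combination rather than simply selecting one.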

Among the three calculation models, the binary model is the simplest to apply and explain, and the continuous model is the most complex but makes the best use of all available information from the mixture profiles. Moreover, model choice dictates the probabilistic calculation, and different models may produce different results, particularly for complex DNA mixtures [116, 117]. The relatively simple models are prone to misinterpretation because they ignore several useful pieces of information, whereas the risk with the continuous model is that analysts may not be fully aware of the limitations of the complicated calculation programs in the software, which may result in inappropriate applications. There is no so-called gold standard for the use of these computing models, as the choice depends on the circumstances, although the continuous model is advocated for its reliability and objectivity. Additionally, Bayesian networks and the Markov chain Monte Carlo (MCMC) method have been implemented with different statistical approaches for probabilistically resolving DNA mixtures in several studies [115, 118,119,120,121].

Statistical approach

In the daily work of forensic science, although statistical approaches for reporting DNA mixtures vary, two calculations are the most commonly used by forensic communities for evaluation. One is the LR, which is also the preferred method to determine the weight of evidence according to the ISFG DNA commission and several other forensic communities [122,123,124,125]. For the forensic evidence (E), a classical analysis involves the prosecution hypothesis (Hp) and the defense hypothesis (Hd), two competing hypotheses under which the strength of the evidence provided by genetic analysis is evaluated. For a DNA profile with more than one donor, Hp may state that the suspect (S) and one unknown person (U) were the donors (Hp = S + U), whereas Hd states that there were two unknown donors, U1 and U2 (Hd = U1 + U2). The LR formula is represented as: LR = Pr (E | Hp) / Pr (E | Hd). When the LR is greater than 1, the evidence favors Hp; when it is less than 1, the evidence favors Hd [122]. Due to its ability to handle PCR stochastic effects that occur frequently in mixtures, such as dropout and stutter, the LR approach can maximize the use of the genetic information obtained from mixed profiles. Consequently, many forensic practitioners recognize the mathematical efficacy of the LR and apply it to the analysis of mixtures [126, 127]. The abilities and limitations of the LR approach in the analysis of complex mixtures have been characterized in many studies. Marsden et al. indicated that the strength of the evidence could be extracted from complex mixtures containing a maximum of five donors under conditions involving no dropout [128]. Benschop and colleagues examined the effect of incorrectly estimating the number of donors (caused by allele sharing) and showed that it can strongly influence the LR value [129]. Slooten used the standard statistical technique of integrating the LR, by which he could avoid estimating the number of donors or their probabilities of dropout in his study [130]. The assessment depended only on the allele frequencies and the mixture data, which was considered a practical advantage because it is more objective. The complex calculation process for the LR makes it difficult to explain in court, and thus, several computer software programs have been designed for LR calculation and mixture interpretation [131, 132].
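
As a worked illustration of the LR formula above, the sketch below evaluates Hp = S + U against Hd = U1 + U2 at a single locus showing four alleles. It assumes a binary model with no dropout or drop-in, Hardy-Weinberg proportions, and no subpopulation correction; the allele frequencies and the function name are hypothetical.

```python
def lr_four_allele_locus(freqs, suspect):
    """LR for Hp: S + U versus Hd: U1 + U2 at a four-allele locus.

    freqs   : dict mapping each of the four observed alleles to its
              population frequency (hypothetical values in the example)
    suspect : heterozygous suspect genotype as a tuple of two alleles
    """
    alleles = list(freqs)
    if len(alleles) != 4:
        raise ValueError("exactly four observed alleles are expected")
    if suspect[0] == suspect[1] or not set(suspect) <= set(alleles):
        raise ValueError("suspect must be heterozygous for observed alleles")

    # Hp: the unknown donor must carry the two alleles S does not explain.
    rest = [a for a in alleles if a not in suspect]
    pr_e_given_hp = 2 * freqs[rest[0]] * freqs[rest[1]]

    # Hd: six ways to split the alleles between U1 and U2, each with
    # probability (2 * p * p) * (2 * p * p) for two heterozygotes.
    product = 1.0
    for allele in alleles:
        product *= freqs[allele]
    pr_e_given_hd = 6 * 4 * product

    return pr_e_given_hp / pr_e_given_hd


hypothetical_freqs = {9: 0.1, 10: 0.1, 13: 0.2, 14: 0.05}
print(round(lr_four_allele_locus(hypothetical_freqs, (10, 14)), 2))
```

Under these assumptions the ratio simplifies to 1 / (12 × p10 × p14), so the rarer the suspect's alleles, the stronger the support for Hp. Real casework calculations additionally account for dropout, drop-in, stutter, and population substructure.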

Another calculation method is the combined probability of inclusion/exclusion (CPI/CPE). The CPI evaluates the proportion of a population that would be included as a potential donor to the DNA mixture, and the CPE is its mathematical complement. The CPI is also referred to as random man not excluded (RMNE), which is the probability that a random person would be included as a possible contributor, that is, could not be excluded by the evidence. When applying the RMNE method in the mixture calculation, the main advantage is that it is straightforward to implement and to explain in court because it does not require assumptions about the number of contributors, the peak heights, or the genotypes of known donors to a mixture; on the other hand, for these same reasons, the RMNE statistic is also deemed to underestimate the strength of the evidence and to waste information that should be utilized [133]. However, RMNE results for a mixture are still conceptually correct, and the method is constantly being improved by researchers [134, 135]. The CPI evaluation includes three steps: profile assessment; comparison with reference profiles and inclusion/exclusion determination; and calculation of the statistic [1]. Nevertheless, both alleles at each locus of a donor being considered must exceed the threshold for analysis, which restricts the CPI approach to unambiguous DNA profiles. Ambiguity arises, for example, in low-level DNA mixtures with allele dropout or when the distinction between minor alleles and stutters of major alleles becomes difficult [122, 136]. Thus, the CPI approach has less flexibility than the LR method under the condition of allele dropout in challenging mixtures and needs to be performed by more experienced practitioners under strict guidelines. Bieber et al. described a protocol as a guideline for applying the CPI approach that could help forensic communities reduce variation in DNA mixture interpretation and promote a more defensible application of the CPI [1].
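
The calculation step of the CPI can be sketched as follows, using the usual formulation in which the probability of inclusion at a locus is the square of the summed frequencies of the observed alleles and the combined value is the product across loci. The allele frequencies and the function name are hypothetical, and loci affected by possible dropout are assumed to have been disqualified beforehand.

```python
def combined_probability_of_inclusion(observed_allele_freqs_per_locus):
    """Return the CPI across loci.

    observed_allele_freqs_per_locus : list of lists; each inner list holds
    the population frequencies of the alleles observed at one locus.
    """
    cpi = 1.0
    for allele_freqs in observed_allele_freqs_per_locus:
        # Probability that a random person carries only observed alleles.
        cpi *= sum(allele_freqs) ** 2
    return cpi


# Hypothetical two-locus mixture profile.
loci = [
    [0.1, 0.1, 0.2, 0.05],  # four observed alleles
    [0.3, 0.2, 0.1],        # three observed alleles
]
cpi = combined_probability_of_inclusion(loci)
cpe = 1.0 - cpi
print(round(cpi, 4), round(cpe, 4))
```

The calculation uses neither peak heights nor an assumed number of contributors, which is why it is simple to explain but discards information that the LR approach would use.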

Slooten and Egeland indicated that the RMNE could be expressed as a certain average of the LR, implying that the expected value of the LR is at least equal to the inverse of the RMNE probability in the case of an actual donor to the mixture. In a great many of the examples for mixtures without dropout, however, the LR even revealed a smaller weight of evidence than 1/RMNE based on the binary model [137]. Meanwhile, mixture profiles are often characterized by artifacts such as dropout/drop-in and stutter, which makes the concept of exclusion hard to define in the RMNE. Therefore, although it takes more skill to correctly interpret complicated evidence, the LR is considered to be the more powerful method with which to address these situations. Beyond these two kinds of statistical approaches, Bille and colleagues presented an extension of the concept of random match probability (RMP), which is routinely used for single-source samples, so that it can be applied to mixtures [138]. They proposed the RMP as an intermediate method: it is the probability that a random person is included in the mixture (similar to the RMNE), but it requires profile information, including the number of donors and the peak heights, in the analysis; the latter property is analogous to the LR method.

Interpretation software

Both the computational models and the LR approach are too complicated for routine hand calculations on mixed samples, so the use and optimization of appropriate software is highly encouraged by Forensic Science International Genetics (FSIG) [139] to avoid hand-calculation errors. Thus, a series of probabilistic DNA profile interpretation software programs have come into being. Curran et al. proposed a framework based on set theory, on which an expert interpretation system named LoComatioN was built [140]. It is the first feasible expert system to rapidly evaluate different explanations in an LR approach, considering the factors of allelic dropout and drop-in [113]. The first open-source bio-statistical tool, Forensim, is a package for the R statistical software that is dedicated to forensic DNA evidence interpretation, including that of mixtures [141]. It can simulate mixed DNA profiles and conduct common statistical calculations, and it has been used in research on the efficiency of the maximum-likelihood approach for DNA mixture interpretation [142, 143]. Moreover, LRmix, an open-source program within the Forensim package, was applied by Gill and Haned to evaluate complex mixtures. They illustrated that there is no need for the known contributor and the mixed stain to have (any) matching alleles, and the calculations of the strength of evidence are simplified [144]. Based on LRmix, Haned et al. quantified the relative risk of analyzing high-order mixtures by comparing the “gold standard” LR (for which the genotypes and number of donors are known) with the LR obtained in casework (for which unknown donors are assumed). They presented some of the exploratory approaches used when encountering complex sample analyses and encouraged the forensic communities to evaluate the risk before applying any LR-based interpretation software [131]. 
Later, Gill and his colleagues demonstrated that the LRmix program can not only interpret multiallelic STRs but also be extended to bi-allelic SNPs generated from Life Technologies’ HID-Ion AmpliSeq™ Identity Panel v2.2 using the Ion PGM™ MPS system [145]. However, when faced with complex SNP profiles containing three or more donors, LRmix is generally inefficient. The open-source EuroForMix software could significantly improve the analysis of complex SNP mixtures by incorporating the “sequence read” coverage value into the quantitative model and showed a great benefit over the qualitative approach [146]. The EuroForMix program can also interpret STR profiles from a mixture of contributors with artifacts based on the continuous model presented by Cowell and his colleagues [147]; it is implemented in the R-package euroformix and is freely accessible at www.euroformix.com [148]. More recently, the LRmix model was further modified and implemented in user-friendly software, SmartRank, which Benschop and colleagues used to search several national DNA databases with mixed DNA profiles [132]. They identified the profile types for which SmartRank can be complementary to CODIS, provided guidelines, and defined the applicable domains for the SmartRank software.

Other software programs have been employed for genotype determination and mixed DNA profile interpretation, such as GeneMapper® ID-X, TrueAllele®, LikeLTD, STRmix™, and Lab Retriever [149,150,151,152,153]. GeneMapper® ID-X and TrueAllele® can conduct complex quantitative analysis of DNA mixtures, LikeLTD may remain useful as a robust method for the analysis of LT-DNA profiles, Lab Retriever applies the semi-continuous model, and STRmix™ and TrueAllele® use a fully continuous model. The ISFG has established validation guidelines and presented recommendations for bio-statistical software to be used in forensic genetics [154]. We also advocate that forensic DNA profile interpretation software be cross-validated and demonstrate its performance under relevant guidelines, in order to facilitate improved evaluation of complex evidence in court.

Conclusion

Mixed DNA samples are a common biological material in forensic crime cases, and the analysis and interpretation of their results is one of the difficulties in forensic casework. The analytic approaches and detection technologies described herein are intended to provide guidance for forensic practitioners on how to apply different methods in the evaluation of forensic DNA mixtures. In addition to the two-step DE method used in routine work, other separation and extraction methods have advanced in recent years. Additionally, some novel genetic markers, such as DIP-STRs and microhaps, have been defined and applied to improve the ability to present legally powerful evidence when handling DNA mixtures. Moreover, many open-source and commercial software programs have been proposed to facilitate mixture analysis and reduce manual calculation errors.

High-throughput sequencing has accelerated investigation in many fields of biological science. Although the use of MPS in forensic genetics has been questioned in the past, we now know that it can be applied in forensic casework, including human identification, phenotypic trait determination, and mixture detection [68, 155]. MPS technology has many advantages over the traditional CE method, especially the capability to simultaneously detect many different kinds of genetic markers and to export detailed sequence information, which significantly enhances the forensic investigation of complex DNA mixtures. As a result, it is increasingly being implemented in forensic laboratories. Currently, the validation of software solutions and the cost of instruments and kits are key factors in the introduction of MPS into forensic genetics.

In general, the interpretation of mixed DNA profiles obtained from multiple contributors has proven to be a particularly difficult problem in forensic science in terms of providing legal evidence, although the development of various appropriate methods, software, and detection techniques over the years has significantly improved the ability to address this type of data. Increased detection sensitivity also means more challenges for mixture interpretation. From our point of view, DNA mixtures are too varied for a unified standard of detection and interpretation to be established, so their analysis requires extensive experience and careful training. Future research should focus on assessing mixture results quantitatively and comprehensively, as well as on developing a reliable sequencing system with corresponding analysis software.