Introduction

Several studies indicate that males commit the majority of violent crimes. For example, the US Bureau of Justice Statistics reports that males commit about 80% of all violent crimes and 95% of sexual offenses in the United States [1]. Although autosomal short tandem repeat (STR) markers, the current loci of choice for the forensic analysis of biological evidence, are normally able to fully discriminate between unrelated individuals, there are several circumstances in which Y chromosome polymorphisms could be a useful addition to the forensic scientist's armamentarium.

Although the human Y chromosome has often been considered an evolutionary relic of the X chromosome that has retained the authority to dictate gender but has little other functional significance, recent studies have demonstrated that it possesses numerous functional genes including some that appear to be critical for normal male development [2]. Approximately 300 million years ago, the X and Y chromosomes were true homologues, comparable in size and genetic content [3]. Through the passage of time, the Y chromosome underwent a series of deletion mutations reducing it to its present size of approximately 60 Mb, although significant X chromosome sequence homology still persists [3]. The chromatin of the Y chromosome exists in at least three functionally different forms including the pseudoautosomal regions (PARs), the euchromatin and the heterochromatin [3]. The PARs, located in the telomeric regions of the chromosome, pair and recombine with the X chromosome during male meiosis. The euchromatin (containing the functional genes) and the transcriptionally inert heterochromatin together form the non-recombining region (NRY) of the Y chromosome [3]. Therefore, the NRY region of the Y chromosome is inherited in a patrilineal manner in which a haplotype of physically linked genetic markers is transmitted unchanged, barring the occasional mutation, from father to son. This non-independent segregation of genetic markers on the Y chromosome, which is in sharp contradistinction to the independently segregating behavior of commonly used autosomal STR markers, results in reduced genetic variability. Thus, more Y chromosome markers would be required to provide the same ability to discriminate individuals (the discriminatory potential, DP) as that obtained by autosomal markers. 
It has even been suggested that this mode of inheritance could enable a particular Y haplotype recovered at a crime scene to be equated with a surname although the confounding effects of non-paternity and multiple independent origins of surnames would need to be taken into account [4, 5]. Notwithstanding the above, it is possible, and in certain circumstances advantageous, to make use of the unique biology of the Y chromosome for forensic purposes.

Y chromosome-specific systems may prove invaluable for the identification of the genetic profile of the male component in mixed male/female specimens in those instances in which the female portion is present in overwhelming quantities relative to the male. This could be due to the deposition of semen by an azoospermic or oligospermic male, to cases of oral sodomy where only trace amounts of male buccal epithelial cells may be present, or due to the normal post-coital degradative and sample loss processes that occur with the passage of time. Additionally, Y chromosome systems could be used to determine the presence of a number of semen donors in cases of multiple perpetrator rape. A third reason for employing Y chromosome polymorphisms would be for criminal paternity analysis or disaster victim identification. The Y chromosome haplotype of a missing individual may be determined by typing a male relative such as a son, brother, father, nephew or uncle. Fourth, the ability to specifically detect a male profile could obviate the need for the time-consuming and oft-times inefficient differential extraction procedure for the separation of sperm and non-sperm fractions. Finally, male-specific systems may aid the investigation of cases involving mixtures or close biological relatives by providing additional statistical discriminatory power.

A variety of polymorphic genetic markers have been identified in the euchromatin portion of the Y chromosome, including a number of STR [6, 7, 8, 9] and single nucleotide polymorphism (SNP) loci [10] and a single hypervariable minisatellite locus [11]. At this time, the three classes of polymorphic loci have varying degrees of utility for forensic purposes. A number of candidate Y-SNP loci have been identified, but they suffer from a limited discrimination potential and their implementation in forensic casework is dependent upon the development of additional markers and appropriately validated detection technologies. The hypervariable minisatellite, MSY1, is the most polymorphic single locus system found on the Y chromosome but difficulties with the required multivariant repeat (MVR) technique have discouraged its operational use. Y-STR loci, on the other hand, offer a number of advantages including good discrimination potential, ease of analysis and a number of available candidate loci for multiplex analysis [6].

Although more than a hundred STR loci have been described on the Y chromosome a much more limited number have been appropriately evaluated for forensic casework use and some of these have presented a particular challenge for assay design. The Y-STR loci comprise di-, tri-, tetra-, and penta-nucleotide repeats with the dinucleotides exhibiting the most polymorphism but an excessively high level of stutter artifacts [12]. Due to their evolutionary relatedness, many homologous sequences are found on both the X and Y chromosomes, which can confound the analysis of mixed male/female specimens. Additionally, some of the loci are bi-local in the sense that one or both of the primer sites and associated tandem repeats are duplicated upstream or downstream of the parent sequence [6, 13]. In these particular cases, two alleles are co-amplified, and there is some, but not complete, correlation between the alleles at both loci. Importantly, sample quantity limitations with forensic specimens require that candidate Y-STR loci be analyzed together in a parallel fashion by incorporating them into a multiplex PCR assay format, the design of which can be complicated by some of the aforementioned factors.

A major international multi-center study of 13 candidate Y-STR markers resulted in recommendations for the use of seven 'core' loci for standard haplotyping (designated Yh1) together with a second set of four hypervariable markers to provide increased individualization of unrelated males [6]. The seven 'core' loci include DYS19, DYS389 (I and II), DYS390, DYS391, DYS392 and DYS393, whereas the four hypervariable loci comprise DYS385 (a) and (b), YCAII and YCAIII [6]. Originally, several analytical strategies were recommended to incorporate the Yh1 loci into two multiplex systems [6, 7]. Subsequently a number of different strategies for multiplex amplification of Y polymorphisms have been described and many of them incorporate all or some of the Yh1 loci plus DYS385 [14, 15, 16, 17]. The seven Yh1 core loci plus DYS385 (a) and (b) are referred to in this paper as the core minimal haplotype. Several World Wide Web accessible Y-STR haplotype reference databases have been constructed for the core minimal haplotype loci from various population samples obtained from Europe (http://ystr.charite.de) [18] and the United States (http://www.ystr.org/usa) [19].

Compared to autosomal STR loci, the ability of the above-described core minimal Y-STR haplotype loci to discriminate between unrelated males is modest and results in a relatively high rate of coincidental matches between samples that do not originate from the same person. Since the development of the core minimal haplotype, additional microsatellite loci have been reported which may have potential utility in forensic genetics [8, 9]. In this report, we have sought to improve the discriminatory potential, and hence the probative value, of Y-STR-based testing by extending the set of Y chromosome STR loci available for forensic casework use. Our goal was to develop new Y-STR multiplex systems that incorporate all, or most, of the newly described loci as well as the core minimal haplotype loci described above. In accordance with the requirements of a multiplex system developed for forensic use, we also sought to maximize the number of loci able to be co-amplified, ensure appropriate assay sensitivity (1–3 ng of input genomic DNA), balance inter-locus signals and minimize confounding female DNA artifacts.

We have developed two systems, multiplex I (MPI) and multiplex II (MPII), which allow for the robust co-amplification of 18 STRs and their subsequent separation and detection using a standard capillary electrophoresis analytical platform. The loci include DYS19, DYS385 (a) and (b), DYS388, DYS389I and II, DYS390, DYS391, DYS392, DYS393, DYS425, DYS434, DYS437, DYS438, DYS439, Y-GATA-C4, Y-GATA-A7.1 (DYS460) and Y-GATA-H4 [6, 8, 9, 20]. The two multiplex systems are robust over a wide range of primer, magnesium and DNA polymerase concentrations and perform well under a variety of cycling conditions. Complete male haplotypes can be obtained with as little as 100–250 pg of template DNA. Although a limited number of female DNA artifacts are observed in male/female DNA mixtures when the male comprises 1/100 of the DNA tested, the nature and location of these artifacts on the electropherogram do not preclude the ability to obtain the haplotype of the male donor. Slightly modified versions of the multiplexes demonstrate a significant reduction of female DNA artifacts. Thus, it may not be necessary to employ a differential extraction strategy to obtain a male haplotype (or haplotypes in the case of multiple male donors) in cases of sexual assault. The potential utility of MPI and MPII in forensic casework is exemplified by their ability to determine the number of male donors in mixed body fluid stains and to dissect out the male haplotype of the semen donor in post-coital vaginal swabs.

Methods

Preparation of body fluid stains

Body fluids were collected using procedures approved by the University's Institutional Review Board. Blood samples were obtained by venepuncture or by fingerstick, and dried on cotton cloth. Buccal samples were collected by swabbing the internal surface of the subject's mouth with sterile swabs. Neat semen was collected in a plastic cup and dried onto swabs. Post-coital cervicovaginal swabs were taken from female subjects at various time intervals after sexual intercourse. All samples were stored at −20 °C until required.

DNA Isolation and purification

DNA was extracted from dried blood stains, buccal swabs, semen swabs, or post-coital cervicovaginal swabs using a standard phenol:chloroform method [21]. Stains or swabs were cut into small pieces and placed in Spin-Ease tubes (Gibco-BRL, Grand Island NY). The tubes were incubated overnight at 56 °C in 800 µL DNA extraction buffer (100 mM NaCl, 10 mM Tris-HCl, pH 8.0, 25 mM EDTA, 0.5% SDS, 0.1 mg mL−1 Proteinase K, 10% 0.39 M DTT). The fabric or swab fragments were removed, placed into a Spin-Ease basket, inserted into the original tube containing the extract and centrifuged at 16,000 g for 5 min to facilitate the efficient removal of absorbed fluid.

The crude extract and an equal volume of phenol/chloroform/isoamyl alcohol (Fisher, Norcross GA, or Ambion, Austin TX) were added to a Phase Lock Gel (PLG) tube (15 mL, heavy, Eppendorf, Boulder CO), and centrifuged for 5 min at 1,100 g. The organic phase was trapped beneath the 'phase-lock' gel. The aqueous phase, containing the DNA, was filtered through a Centricon 100 Centrifugal Filter Device (Millipore, Bedford MA), and washed twice with TE-4 buffer (10 mM Tris-HCl, 0.1 mM EDTA, pH 7.5). Samples were stored at 4 °C until analysis.

Differential cell lysis for the recovery of sperm DNA

For details of the differential lysis procedure, see ref. [22]. Swabs were incubated overnight at 37 °C in 800 µL extraction buffer (100 mM NaCl, 10 mM Tris-HCl, pH 8.0, 25 mM EDTA, 0.5% SDS, 0.1 mg mL−1 Proteinase K). The swab pieces were transferred to Spin-Ease baskets and centrifuged for 5 min at 16,000 g. The supernatant, containing the non-sperm DNA fraction, was transferred to a new tube and the remaining sperm pellet was washed twice in extraction buffer. After washing, the sperm fraction was gently re-suspended in 400 µL DNA extraction buffer, to which 40 µL of freshly prepared 0.39 M DTT was added, and incubated for a minimum of 2 h at 37 °C. Both the non-sperm and sperm fractions were then placed in separate PLG tubes and purified as described above.

DNA quantitation

DNA was quantitated using ethidium bromide/UV light-induced fluorescence on a 1% agarose yield gel or by hybridization to the primate-specific α-satellite probe, D17Z1 [23] using the Quantiblot Human DNA Quantitation Kit (Applied Biosystems, Foster City, CA).

Standard PCR conditions

Standard Reaction

Optimization of the multiplex systems resulted in a set of standard conditions. Multiplex I (MPI): the 50 µL reaction 'multi-mix' contained: 3 ng template DNA, 0.05–1.1 µM primers (see below), 250 µM dNTPs, 1X PCR Buffer II (10 mM Tris-HCl, pH 8.3, 50 mM KCl), 3.25 mM MgCl2, and 2.5 Units AmpliTaq Gold DNA Polymerase (Applied Biosystems). Multiplex II (MPII): the 50 µL reaction 'multi-mix' contained: 1.5 ng template DNA, 0.05–0.4 µM primers (see below), 250 µM dNTPs, 1X PCR Buffer II, 2.5 mM MgCl2, and 2.5 Units AmpliTaq Gold DNA Polymerase.
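As a rough illustration of how the final concentrations above translate into pipetting volumes, the following minimal sketch applies C1·V1 = C2·V2 to a 50 µL reaction. The final concentrations are taken from the text; the stock concentrations (25 mM MgCl2, 10 mM dNTP mix, 10X buffer, 5 U/µL polymerase) and all function names are assumptions made for illustration and are not stated in the source.

```python
REACTION_VOLUME_UL = 50.0

# Final concentrations per 50 uL reaction, as given in the text.
FINAL = {
    "MPI":  {"MgCl2_mM": 3.25, "dNTP_uM": 250.0, "buffer_X": 1.0, "polymerase_U": 2.5},
    "MPII": {"MgCl2_mM": 2.5,  "dNTP_uM": 250.0, "buffer_X": 1.0, "polymerase_U": 2.5},
}

# ASSUMED stock concentrations (hypothetical, not from the source).
STOCK = {
    "MgCl2_mM": 25.0,
    "dNTP_uM": 10000.0,
    "buffer_X": 10.0,
    "polymerase_U_per_uL": 5.0,
}


def stock_volume_ul(final_conc, stock_conc, total_volume_ul=REACTION_VOLUME_UL):
    """Volume of stock needed so that C_stock * V_stock = C_final * V_total."""
    return final_conc * total_volume_ul / stock_conc


def per_reaction_volumes(multiplex):
    """Return the stock volume (uL) of each component for one reaction."""
    f = FINAL[multiplex]
    return {
        "MgCl2": stock_volume_ul(f["MgCl2_mM"], STOCK["MgCl2_mM"]),
        "dNTPs": stock_volume_ul(f["dNTP_uM"], STOCK["dNTP_uM"]),
        "buffer": stock_volume_ul(f["buffer_X"], STOCK["buffer_X"]),
        "polymerase": f["polymerase_U"] / STOCK["polymerase_U_per_uL"],
    }


def enzyme_concentration_u_per_ul(multiplex):
    """Polymerase concentration in the final reaction (U per uL)."""
    return FINAL[multiplex]["polymerase_U"] / REACTION_VOLUME_UL


if __name__ == "__main__":
    for name in ("MPI", "MPII"):
        print(name, per_reaction_volumes(name), enzyme_concentration_u_per_ul(name))
```

Primer and template volumes are omitted because they depend on locus-specific working stocks and on the DNA concentration of each extract.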

Primers

Primer sequences were obtained from published sources, or designed using Oligo 6 Primer Analysis Software (Lifescience Software Resource, Long Lake, MN). The forward or the reverse primer at each locus was labeled with a fluorescent phosphoramidite dye (Applied Biosystems or Invitrogen). Tables 1 and 2 list the primer sequences and associated dye labels. The primer concentrations were as follows: MPI: DYS385–1.0 µM; DYS19–1.0 µM; DYS389–0.08 µM; DYS391–0.07 µM; DYS392–1.1 µM; DYS393–0.8 µM. MPII: DYS388–0.3 µM; DYS390–0.4 µM; DYS425–0.1 µM; DYS434–0.05 µM; DYS437–0.05 µM; DYS438–0.15 µM; DYS439–0.2 µM; Y-GATA-C4–0.05 µM; Y-GATA-A7.1–0.09 µM; Y-GATA-H4–0.1 µM. Modified versions of MPI and MPII used the following primer concentrations: MPI: DYS385–1.0 µM; DYS389–0.15 µM; DYS391–0.07 µM; DYS392–1.0 µM; DYS393–0.06 µM; DYS438–0.15 µM. MPII: DYS19–0.5 µM; DYS388–0.15 µM; DYS390–0.1 µM; DYS425–0.09 µM; DYS434–0.06 µM; DYS437–0.04 µM; DYS439–0.06 µM; Y-GATA-C4–0.04 µM; Y-GATA-A7.1–0.07 µM; Y-GATA-H4–0.09 µM. The modified version of MPI also contained the locus Y-GATA-A7.2 (DYS461). The primers for this locus were used at a concentration of 0.1 µM and had the following sequences: 5′-HEX-agg cag agg ata gat gat ag gat (forward) and 5′-ttc agg taa atc tgt cca gta gtg a (reverse).

Table 1. MPI characteristics
Table 2. MPII characteristics

Cycling conditions

MPI: (1) 95 °C 11 min, (2) 2 cycles: 96 °C 30 s, 62 °C 1 min, 72 °C 1 min, (3) 2 cycles: 96 °C 30 s, 60 °C 1 min, 72 °C 1 min, (4) 30 cycles: 96 °C 30 s, 58 °C 1 min, 72 °C 1 min, and (5) final extension: 72 °C 45 min. MPII: (1) 95 °C 11 min, (2) 2 cycles: 96 °C 30 s, 62 °C 1 min, 72 °C 1 min, (3) 2 cycles: 96 °C 30 s, 60 °C 1 min, 72 °C 1 min, (4) 6 cycles: 96 °C 30 s, 58 °C 1 min, 72 °C 1 min, (5) 10 cycles: 96 °C 30 s, 58 °C 1 min, 72 °C 1.15 min, (6) 13 cycles: 96 °C 30 s, 58 °C 1 min, 72 °C 1.5 min, (7) final extension: 72 °C 45 min.
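The two cycling programs can be written down as plain data, which makes it easy to check the total number of amplification cycles and the programmed hold time. The sketch below is only an illustration: the representation and function names are hypothetical, ramp times between temperatures are ignored, and the "1.15 min" extension step is taken as decimal minutes, which is an assumption about the source notation.

```python
# Each stage is (number_of_cycles, [(temperature_C, minutes), ...]).
DENATURE = (96, 0.5)


def cycle_stage(n_cycles, anneal_c, extend_min):
    """One touchdown stage: denature, anneal for 1 min, extend at 72 C."""
    return (n_cycles, [DENATURE, (anneal_c, 1.0), (72, extend_min)])


MPI_PROGRAM = [
    (1, [(95, 11.0)]),            # polymerase activation
    cycle_stage(2, 62, 1.0),
    cycle_stage(2, 60, 1.0),
    cycle_stage(30, 58, 1.0),
    (1, [(72, 45.0)]),            # final extension
]

MPII_PROGRAM = [
    (1, [(95, 11.0)]),
    cycle_stage(2, 62, 1.0),
    cycle_stage(2, 60, 1.0),
    cycle_stage(6, 58, 1.0),
    cycle_stage(10, 58, 1.15),    # ASSUMPTION: 1.15 read as decimal minutes
    cycle_stage(13, 58, 1.5),
    (1, [(72, 45.0)]),
]


def amplification_cycles(program):
    """Count cycles in three-step (denature/anneal/extend) stages only."""
    return sum(n for n, steps in program if len(steps) == 3)


def hold_time_min(program):
    """Sum of programmed hold times in minutes, ignoring ramping."""
    return sum(n * sum(minutes for _, minutes in steps) for n, steps in program)


if __name__ == "__main__":
    for name, prog in (("MPI", MPI_PROGRAM), ("MPII", MPII_PROGRAM)):
        print(name, amplification_cycles(prog), "cycles,",
              round(hold_time_min(prog), 1), "min of programmed holds")
```

Under these assumptions MPI runs 34 amplification cycles and MPII 33, with about 141 min and 146.5 min of programmed hold time, respectively.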

PCR product detection

Amplified fragments were detected with the ABI Prism 310 capillary electrophoresis system (Applied Biosystems). A 1.5-µL aliquot of each amplified sample was added to 24 µL deionized formamide (Amresco, Solon, OH) and 1 µL of GeneScan 500 TAMRA internal lane standard (Applied Biosystems). The tubes were heated at 95 °C for 3 min and snap-cooled on ice for at least 3 min. Samples were electrokinetically injected into capillaries using Module C (5-s injection, 15 kV, 60 °C) and data analyzed with GeneScan Analysis v2.1 software using Filter Set C (Applied Biosystems).

Multiplex PCR optimization

Magnesium concentration

A standard reaction mixture was used with the following exceptions: 5 ng of template DNA was used and the concentration of magnesium was varied from 2.25 mM to 3.75 mM in increments of 0.25 mM.

Enzyme concentration

A standard reaction mixture was used with the following exceptions: 5 ng of template DNA was added and the amount of AmpliTaq Gold DNA Polymerase tested ranged from 2.25–3.75 Units per 50 µL PCR reaction (0.045–0.075 U µL−1) in increments of 0.25 Units.
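The magnesium and enzyme titrations both step from 2.25 to 3.75 in increments of 0.25 (mM and Units per 50 µL reaction, respectively). A minimal sketch of how the test points and the corresponding enzyme concentrations in U µL−1 can be enumerated is given below; the helper name `titration_series` is hypothetical.

```python
def titration_series(start, stop, step):
    """Inclusive list of evenly spaced test points, rounded to avoid float drift."""
    n_steps = round((stop - start) / step)
    return [round(start + i * step, 4) for i in range(n_steps + 1)]


REACTION_VOLUME_UL = 50.0

# Magnesium concentrations tested (mM).
MG_MM = titration_series(2.25, 3.75, 0.25)

# Polymerase amounts tested (Units per 50 uL reaction) and as U per uL.
ENZYME_UNITS = titration_series(2.25, 3.75, 0.25)
ENZYME_U_PER_UL = [round(u / REACTION_VOLUME_UL, 4) for u in ENZYME_UNITS]

if __name__ == "__main__":
    print("MgCl2 (mM):", MG_MM)
    print("Enzyme (U/uL):", ENZYME_U_PER_UL)
```

Each series contains seven test points, and the enzyme series spans 0.045–0.075 U µL−1, matching the range quoted in the text.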

Cycle number

A standard reaction was carried out with different cycle numbers. Numbers tested were 31, 33, 35, and 37.

Annealing conditions

A standard reaction mixture was prepared, and the effect of varying the annealing time from 30 s to 1.5 min was tested. Annealing temperatures from 56 to 62 °C were applied, and several different 'touchdown PCR' strategies were evaluated.

PCR additives

The effect of BSA on multiplex PCR performance was tested by the addition of 20 µg of non-acetylated BSA to the standard reaction mixture.

Multiplex system performance

Multiplex sensitivity

A standard reaction mixture was prepared using different input quantities of template DNA. The amounts tested were: 50 pg, 100 pg, 125 pg, 150 pg, 200 pg, 250 pg, 500 pg, 1 ng, 3 ng, 5 ng, 10 ng, and 100 ng.

Precision

Amplified DNA from one male was injected twenty times, the size (in bp) of each peak was recorded, and the standard deviation calculated for each locus.
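The precision calculation amounts to a standard deviation over repeated size measurements of the same allele. The sketch below is illustrative only: the source does not say whether a sample or population standard deviation was used (the sample form is assumed here), and the twenty size values are invented for the example rather than taken from the study.

```python
import statistics


def sizing_precision(sizes_bp):
    """Sample standard deviation (bp) of repeated sizings of one allele."""
    if len(sizes_bp) < 2:
        raise ValueError("at least two injections are required")
    return statistics.stdev(sizes_bp)


def precision_by_locus(injections):
    """Map each locus name to the standard deviation of its measured sizes."""
    return {locus: sizing_precision(sizes) for locus, sizes in injections.items()}


# HYPOTHETICAL sizes (bp) for one allele over twenty injections.
example_sizes = [
    250.10, 250.05, 250.18, 250.12, 250.07, 250.15, 250.09, 250.11, 250.14, 250.06,
    250.13, 250.08, 250.16, 250.10, 250.12, 250.04, 250.17, 250.09, 250.11, 250.13,
]

if __name__ == "__main__":
    print(f"SD = {sizing_precision(example_sizes):.3f} bp")
```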

Stutter

Stutter was calculated as the peak height ratio of the stutter peak to the parent allele peak using DNA from 24 different male individuals. Several alleles at each locus were evaluated and an average locus percent stutter calculated.
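The stutter calculation described above can be sketched as follows. The function names are hypothetical and the peak heights in the usage example are invented; only the ratio itself (stutter peak height divided by parent allele peak height, averaged over the alleles examined at a locus) comes from the text.

```python
from statistics import mean


def percent_stutter(stutter_height_rfu, parent_height_rfu):
    """Stutter peak height as a percentage of the parent allele peak height."""
    if parent_height_rfu <= 0:
        raise ValueError("parent peak height must be positive")
    return 100.0 * stutter_height_rfu / parent_height_rfu


def average_locus_stutter(observations):
    """Average percent stutter over (stutter_rfu, parent_rfu) pairs for one locus."""
    return mean(percent_stutter(s, p) for s, p in observations)


if __name__ == "__main__":
    # HYPOTHETICAL peak heights (rfu) for three alleles at one locus.
    locus_observations = [(120, 1500), (95, 1100), (210, 2000)]
    print(f"average stutter = {average_locus_stutter(locus_observations):.1f}%")
```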

Somatic stability

DNA was extracted and amplified from blood, saliva, and semen collected from the same individual.

Monoplex versus multiplex analysis

Each locus was amplified separately and as part of a multiplex system.

Specificity

To evaluate possible female DNA cross-reactivity, DNA from the erythroleukemia cell line K562 and from female volunteers was tested using all multiplex and monoplex primer sets with 3 ng, 30 ng, 300 ng and 1 µg of input DNA template.

Mixture studies

Male/female

DNA (2.5 ng) from a male individual was mixed with increasing amounts of female K562 DNA (Gibco BRL) in the following ratios: 1/2–2.5 ng male DNA/2.5 ng female DNA; 1/10–2.5 ng male DNA/22.5 ng female DNA; 1/100–2.5 ng male DNA/247.5 ng female DNA; 1/1000–2.5 ng male DNA/2,500 ng female DNA; 1/10,000–2.5 ng male DNA/22,500 ng female DNA. After the addition of 10% 3 M sodium acetate, pH 5.2, each DNA mixture was precipitated overnight in 2.5 volumes of cold absolute ethanol. The DNA was washed with room temperature 70% ethanol and dried at 56 °C. The DNA was re-solubilized overnight in 17 µL (MPI), or 27 µL (MPII) TE-4 buffer at 56 °C, and the entire volume added to the amplification reaction.
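For reference, the male fraction of each admixture can be computed directly from the stated input amounts. This is a simple arithmetic sketch with hypothetical names; the quantities are those listed above.

```python
MALE_NG = 2.5

# Nominal ratio label -> female DNA added (ng), as listed in the text.
MIXTURES = {
    "1/2": 2.5,
    "1/10": 22.5,
    "1/100": 247.5,
    "1/1000": 2500.0,
    "1/10,000": 22500.0,
}


def male_fraction(male_ng, female_ng):
    """Fraction of the total DNA that is male."""
    return male_ng / (male_ng + female_ng)


if __name__ == "__main__":
    for label, female_ng in MIXTURES.items():
        frac = male_fraction(MALE_NG, female_ng)
        print(f"{label:>9}: male fraction = {frac:.6f} (about 1 in {1 / frac:.0f})")
```

With these amounts the first three mixtures correspond exactly to male fractions of 1/2, 1/10 and 1/100, whereas the two highest dilutions work out to roughly 1 in 1001 and 1 in 9001, so those labels are best read as nominal.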

Male/male

DNA from two or three individual males was combined in differing proportions (1/2, 1/4, 1/6, 1/12, 1/30) to give 3 ng of total DNA for amplification.

Non-probative casework

Post-coital cervicovaginal swabs were taken from a female donor at various times after intercourse (immediately, 12 h and 24 h). After an individual act of sexual intercourse, only one set of swabs was taken for a particular time point to ensure that the amount of semen present at that given time interval was not artificially reduced by prior removal of sample. The female subject took four swabs for each time point: two were subject to a differential extraction, and the remaining two carried through a normal non-differential extraction procedure. To achieve a uniform distribution of sperm, two swabs were taken simultaneously and separated. The other two were likewise taken and separated to form two pairs. Swabs were allowed to dry overnight at room temperature and stored at −20 °C until required.

Autosomal STR analysis

Autosomal STRs were analyzed using the AmpFLSTR Profiler PCR Amplification Kit (Applied Biosystems) using conditions recommended by the manufacturer.

Results and discussion

Description of Y-STR multiplex systems

Multiplex I (MPI) consists of the core minimal haplotype loci with the exception of DYS390, which is incorporated into Multiplex II (MPII). In MPI, six pairs of primers amplify eight loci and the resulting alleles range in size from approximately 100–400 bp (Fig. 1a). These loci include DYS19, DYS385 (a) and (b), DYS389I and II, DYS391, DYS392, DYS393. DYS385 and DYS389 are bi-local and therefore each of their respective primer pairs co-amplify two alleles [6, 13]. The 389 alleles can be resolved into distinct, non-overlapping loci designated DYS389I and DYS389II [6]. The 385 alleles cannot be discretely resolved since their size ranges overlap and are referred to as 385(a) and 385(b) [13]. This nomenclature indicates the relative size of the observed 385 alleles and not their physical location on the chromosome.

Fig. 1a,b. a Multiplex I, and b Multiplex II. 18 Y chromosome STR loci are co-amplified in two reactions, separately detected by capillary electrophoresis, and displayed as an electropherogram. The x-axis represents the allele size in base pairs (bp). The y-axis gives peak height in relative fluorescence units. Each locus is labeled with a fluorescent dye: top window 6-FAM (blue); middle window TET (green); bottom window HEX (yellow)

Multiplex II comprises ten loci, the alleles of which range in size from approximately 100–300 bp (Fig. 1b). The loci include DYS388, DYS390, DYS425, DYS434, DYS437, DYS438, DYS439, Y-GATA-C4, Y-GATA-A7.1 (DYS460), Y-GATA-H4 [6, 8, 9, 20].

Multiplex PCR optimization

The goals of the PCR optimization process included the presence of inter-locus peak balance, adequate allelic signal intensity (individual peak heights greater than 1,000 relative fluorescence units (rfu)) and the minimization of artifacts.

Primers

Primers were designed to function efficiently in a multiplex format. The primer sequences for the MPI loci DYS19 (forward only), DYS389, DYS391, DYS392, and DYS393 were obtained from published sources (Table 1). In an effort to reduce observed artifacts located in the green (TET) channel the DYS19 reverse sequence was re-designed which resulted in an amplimer 18 bp longer than that originally described. Although the artifacts were not completely removed, the re-designed primer also improved inter-locus peak balance. The DYS385 forward and reverse primers were re-designed to yield smaller products while encompassing both allele ranges. The Y chromosome contains numerous segmental duplications of DNA that originated from other regions of the Y chromosome. Initial experiments with MPI produced an additional male peak 51 bp smaller than the corresponding DYS391 allele. A detailed study of the relevant sequences revealed the presence of a sequence homologous to the DYS389 reverse primer sequence within the DYS391 amplimer. The artifact was successfully eliminated by altering the relative primer concentrations of the DYS389 and DYS391 loci.

The MPII primer sequences were obtained from published sources (Table 2). In our hands, a labeled forward DYS390 primer gave no amplification product, while a dye labeled reverse primer yielded the desired amplimer.

Magnesium and DNA polymerase concentration

The optimal magnesium concentration was determined to be 3.25 mM and 2.5 mM for MPI and MPII, respectively.

The optimal enzyme concentration for both MPI and MPII was determined to be 2.5 U per PCR reaction volume (0.05 U µL−1). Although MPII performed well across the range tested (0.045–0.075 U µL−1), MPI produced undesirable female DNA amplification products at higher enzyme concentrations (>0.06 U µL−1). Further study revealed an inverse relationship between magnesium and enzyme concentration in that lower magnesium concentrations permitted the addition of larger amounts of enzyme without the appearance of female DNA artifacts. We have opted to use a higher concentration of magnesium and a lower quantity of enzyme to achieve the appropriate specificity.

Thermal cycling parameters

Various thermal cycling conditions were evaluated by adjusting the annealing temperature and time and the cycle number. The standard reaction parameters described earlier (in the section entitled cycling conditions) allowed for the most efficient amplification and detection of male DNA.

PCR additives

Bovine serum albumin (BSA) is a commonly used PCR additive, believed to act by effectively removing inhibitors from the mix and stabilizing the DNA polymerase, thus lending greater specificity to the reaction [24]. To determine if its addition would affect multiplex amplification, 20 μg of non-acetylated BSA per PCR reaction (0.4 µg µL−1) was added to samples originating from single source blood and semen swabs, as well as from post-coital vaginal swabs that had been subject to both a differential and a non-differential extraction. Since BSA appeared to have no effect on the sensitivity or the specificity of the reaction, it was not included as a component of the standard reaction for the purposes of the present study. However, more recent work has demonstrated the efficacy of BSA in improving signal sensitivity and specificity in compromised samples and it is now routinely incorporated into the standard reaction.

Y-STR multiplex performance

Sensitivity

It is important to determine the sensitivity limits of a method to evaluate possible artifacts caused by too much input DNA template, such as detector saturation and non-specific artifacts, or too little DNA template, such as allelic drop-out or other stochastic effects. The sensitivity limits of MPI and MPII were tested by amplifying various quantities of DNA (50–500 pg, 1–100 ng). The minimum quantity of DNA required to produce a full multi-locus profile was determined to be 250 pg and 100 pg for MPI (Fig. 2a) and MPII (Fig. 2b), respectively. Although MPII was more sensitive than MPI, both systems performed well across a range of input DNA concentrations (0.25–10 ng) and, as expected, detector saturation and non-specific artifacts became increasingly evident in the 10–100 ng input DNA template range.

Fig. 2a,b. Sensitivity of MPI and MPII. A complete male Y-STR haplotype is obtained with 250 pg or 100 pg of template DNA with MPI (a) or MPII (b), respectively

Precision

Allele sizes were determined with excellent precision by use of an internal lane standard. Depending upon the locus the standard deviation of the measurement was determined to be 0.06–0.17 bp (Tables 1 and 2). Determination of the allelic state of a sample, including the ability to detect rare variant alleles that differ from the parental alleles by a non-integral number of repeat units, is therefore straightforward. However, although it is not strictly necessary to employ allelic ladders to characterize alleles their use is still recommended for inter-laboratory comparison purposes.

Stutter

Stutter is a PCR artifact, attributed to DNA polymerase slippage, which is observed during the amplification of simple sequence repeats such as STRs. Stutter is characterized by the presence of an allelic-like signal that is typically one repeat shorter than the parent peak and is significantly less intense than the parent peak (<20%) [25]. It is important to be able to distinguish between stutter and a true allele in order to be able to resolve mixtures of DNA from at least two individuals. The average percent stutter was determined for all MPI and MPII loci (Tables 1 and 2). Average stutter was less than or equal to 13% for all loci with the exception of DYS389II, which had an average stutter of 16%. The stutter range observed is similar to that seen with autosomal STRs and consequently stutter signals should normally be distinguishable from allele signals.

Somatic stability

Forensic samples are obtained from a number of tissue types, most frequently blood, saliva, and semen. It is, therefore, necessary to confirm that there are no tissue-specific effects on multiplex amplification and that DNAs from various tissues originating from the same individual give the same multi-locus haplotype. DNA extracted from blood, saliva, and semen from a single male subject was amplified using MPI and MPII. The multi-locus haplotype was found to be identical in each tissue type examined.

Monoplex versus multiplex analysis

Alleles at each locus were sized identically whether primer pairs were used individually in a monoplex reaction or incorporated into the multiplex formats of the MPI and MPII systems.

Male specificity

One of the chief goals in the development of Y chromosome polymorphisms for forensic use is to permit the specific amplification of male DNA in a background of greater quantities of female DNA, rendering unnecessary a separation of the two fractions prior to analysis. However, due to its evolutionary history, the Y chromosome is not only home to a variety of intra-chromosomal segmental duplications, but it also retains a considerable degree of sequence homology with the X chromosome [3]. Accordingly, most primers designed to recognize specific Y-STR loci, such as those incorporated into MPI and MPII, possess homologous sequences on the X chromosome. The degree of homology will determine to what extent confounding X chromosome derived artifacts are produced by DNA isolated from male (XY) versus female (XX) individuals. The object in Y-STR assay design is to remove, or at least minimize, such artifacts and this is accomplished by judicious primer design (to maximize differences with X chromosome sequences) and by stringent PCR cycling conditions (to reduce non-specific hybridization to homologous sequences). Although MPI and MPII assays were designed to produce male-specific haplotype profiles at 1–3 ng of male input DNA it is important to check for X chromosome derived artifacts. Our approach has been to test each Y-STR locus in both monoplex and multiplex formats using varying amounts of female input DNA (3 ng–1 μg).

With MPI, no artifacts were observed with 3 ng of female DNA. At 30 ng, female peaks began to appear most prominently in the green (TET) channel, but all were out of the range of any male Y-STR alleles. At 300 ng of female DNA or greater a limited number of potentially confounding products were observed (Fig. 3a). Female monomorphic products of relatively low signal intensity were located within the DYS392, DYS19, DYS389I and DYS389II loci. Thus, in the absence of male DNA, four of the eight MPI loci were affected by high concentrations of female DNA.

Fig. 3a,b. Female DNA products: 300 ng of female DNA was amplified using a MPI and b MPII. The allelic size ranges of the Y-STR loci are indicated

No female DNA signals were detected with MPII using 3 ng of female DNA. At 30 ng, however, significant products were found within the DYS439 and DYS434 male allelic ranges, with other minor products detected in the Y-GATA-A7.1, DYS437 and DYS438 ranges. The addition of 300 ng (Fig. 3b) or 1 μg of template DNA resulted in amplification of the same female fragments. Thus, five of the ten MPII loci would be affected by female artifacts at high female DNA concentration.

Interestingly, the female products observed at nine of the eighteen MPI and MPII loci appeared only in the multiplex formats. No female products were obtained for any of the eighteen loci when 300 ng of female DNA was employed in a monoplex format. Thus, the female products are likely the result of interactions between primers from different loci due to their binding to homologous regions of the X chromosome that happen to be in close proximity.

Mixtures

Male/female

A complementary approach to characterizing potentially confounding female DNA artifacts, and one that more realistically represents situations encountered in casework, is to check for their existence in the presence of male DNA. Some or all of the products observed with female DNA alone may be the result of Y-chromosome-specific primers binding to partially homologous X chromosome sequences. Thus, the presence of the truly complementary sequences on the Y chromosome (in male/female mixtures) could result in a competition for primers with a concomitant beneficial diminution or loss of female signal. In order to observe the effects of a mixed male/female sample on MPI and MPII amplification, a constant amount of male DNA (2.5 ng) was diluted with increasing quantities of female DNA (2.5 ng–22.5 μg) and the resulting total admixture subjected to MPI and MPII analysis.
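The dilution series reduces to simple arithmetic. The following minimal sketch computes how much female DNA must be added to a fixed quantity of male DNA for the male component to make up one part in a given number of total parts. The function name and the choice of ratios are illustrative assumptions, not part of the original protocol.

```python
def female_dna_needed(male_ng: float, total_parts: int) -> float:
    """Return ng of female DNA so that male DNA is 1/total_parts of the total.

    Hypothetical helper for illustration only.
    """
    if total_parts < 1:
        raise ValueError("total_parts must be at least 1")
    # One part is male, so the remaining (total_parts - 1) parts are female.
    return male_ng * (total_parts - 1)


MALE_NG = 2.5  # constant male input quoted in the text

for parts in (2, 10, 100, 1000):
    female_ng = female_dna_needed(MALE_NG, parts)
    print(f"male = 1/{parts} of total: add {female_ng:g} ng female DNA")
```

Expressing the ratio as an integer number of parts, rather than as a floating-point fraction, keeps the arithmetic exact for the ratios of interest.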

With MPI, a complete eight-locus male haplotype without female products was obtained when the male DNA comprised one half or one tenth of the total DNA. When the male DNA comprised one hundredth of the total, the male haplotype was still discernible at six of the loci, although female products appeared within the allelic range of two loci, DYS19 and DYS389II (Fig. 4a). With MPII, female products appeared within three loci (DYS439, DYS434 and Y-GATA-A7.1) at a one in ten dilution of male DNA, but a seven-locus haplotype was obtained even at a one in one hundred dilution (Fig. 4b). At higher dilutions of male DNA (1/1000, 1/10,000), the male profile was not apparent.

Fig. 4a,b. Male/female mixed DNA samples. (a) MPI, male/female (1/100 ratio); (b) MPII, male/female (1/100 ratio). Major female DNA artifact peaks are indicated by an asterisk adjacent to the peak

Importantly, these experiments demonstrate that even in the presence of a vast excess of female DNA it is possible, with MPI and MPII, to obtain a thirteen-locus Y-STR profile of the male donor (six MPI loci plus seven MPII loci at the 1/100 ratio). The number of Y-STR loci affected by female products was reduced from nine to five in the presence of competing male DNA, which is consistent with the previously described primer competition hypothesis.

Male/male

Forensic samples can originate from more than one male donor. Unlike autosomal loci, for which an individual may be either heterozygous or homozygous, Y chromosomal loci are, by definition, hemizygous, and normally a profile will comprise only one allele from each individual at most of the Y-STR loci. Such a situation pertains to all MPI and MPII loci, with the exception of DYS385 for which an individual can possess one or two alleles due to the inability to distinguish alleles from the DYS385a and DYS385b loci. The ability to precisely determine the number of male donors in a mixture could aid in the interpretation of autosomal STR data from such samples.

We conducted a series of experiments to determine the level at which both male DNAs in an admixed sample could be detected and typed. DNA from two males was mixed in various ratios (1/2, 1/4, 1/6, 1/12, 1/30) and a total of 3 ng was amplified and typed using MPI and MPII. The presence of two individuals, as determined by the presence of two allelic signals at a single locus (except DYS385), was clearly discernible when the minor donor was present at 1/2 (Fig. 5a), 1/4 and 1/6 the concentration of the major donor. Similarly, three donors can be discerned by the presence of three alleles at a single locus (Fig. 5b).
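The allele-counting rule described above can be written down compactly. The sketch below is a hypothetical illustration, not the authors' software: it returns a lower bound on the number of male contributors, treating DYS385 as a locus at which one male may show up to two alleles. The allele values in the example are invented.

```python
from typing import Dict, Set

# Loci at which a single male can show two alleles
# (DYS385a and DYS385b cannot be distinguished).
MULTI_COPY_LOCI = {"DYS385"}


def min_male_donors(profile: Dict[str, Set[int]]) -> int:
    """Lower bound on the number of male contributors to a Y-STR profile."""
    donors = 0
    for locus, alleles in profile.items():
        copies = 2 if locus in MULTI_COPY_LOCI else 1
        # Ceiling division: donors needed to explain the alleles at this locus.
        needed = -(-len(alleles) // copies)
        donors = max(donors, needed)
    return donors


# Invented allele calls for illustration only.
mixed_profile = {
    "DYS19": {14, 15},    # two alleles at a single-copy locus
    "DYS390": {24},
    "DYS385": {11, 14},   # two alleles here are compatible with one male
}
print(min_male_donors(mixed_profile))
```

The result is only a minimum: two donors who happen to share alleles at every typed locus would be counted as one.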

Fig. 5a,b. Mixed DNA samples from multiple males. MPI profiles of (a) 2 males (1:1 ratio) and (b) 3 males (1:1:1 ratio)

In order to provide a preliminary estimate of the potential efficacy of MPI and MPII in determining the number of male donors (assuming the presence of detectable quantities of the minor donor(s)), simulated mixture experiments were conducted. For two-person mixtures, ten pairwise combinations of complete eighteen-locus Y-STR profiles from different individuals were compared and the number of loci exhibiting two alleles was computed. Similarly, thirty-six three-person profile combinations were compiled and compared to ascertain the number of loci exhibiting three alleles. For the two-individual admixtures, an average of nine loci (range 3–14) exhibited the two alleles diagnostic of a mixture. In contrast, an average of four loci (range 2–7) produced three-allele patterns when the mixture contained DNA from three individuals. Thus, MPI and MPII profiling should provide, in the vast majority of cases, definitive information about the number of male donors in a mixture.
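A simulated mixture comparison of this kind can be sketched in a few lines. The example below is an assumption-laden illustration: the haplotypes, donor names and four-locus panel are invented, DYS385 is omitted for simplicity, and the code is not the procedure actually used to obtain the figures quoted above.

```python
from itertools import combinations
from statistics import mean

# Invented haplotypes (repeat numbers are for illustration only).
HAPLOTYPES = {
    "male_A": {"DYS19": 14, "DYS390": 24, "DYS391": 10, "DYS393": 13},
    "male_B": {"DYS19": 15, "DYS390": 24, "DYS391": 11, "DYS393": 13},
    "male_C": {"DYS19": 16, "DYS390": 23, "DYS391": 10, "DYS393": 13},
}


def loci_with_n_alleles(haplotypes, n):
    """Count loci at which the combined haplotypes show exactly n distinct alleles."""
    loci = haplotypes[0].keys()
    return sum(1 for locus in loci if len({h[locus] for h in haplotypes}) == n)


def simulate(group_size):
    """Per-mixture counts of loci showing `group_size` distinct alleles."""
    return [
        loci_with_n_alleles([HAPLOTYPES[name] for name in names], group_size)
        for names in combinations(sorted(HAPLOTYPES), group_size)
    ]


pair_counts = simulate(2)
triple_counts = simulate(3)
print("two-donor mixtures:", pair_counts, "mean =", mean(pair_counts))
print("three-donor mixtures:", triple_counts)
```

With real eighteen-locus profiles the same counts would give the average and range of informative loci per mixture.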

Non-probative casework

A DNA typing system needs to be tested using simulated or non-probative casework material prior to use in actual casework. Since Y-STR typing is expected to have a significant impact on the analysis of rapes and other sexual assaults, we decided to concentrate our initial efforts in this area. A series of post-coital vaginal swabs was collected from a female subject at 0 h, 12 h, and 24 h after intercourse. The subject carried on routine daily activities, allowing the normal processes of semen drainage and degradation to occur. Cervicovaginal swabs were taken as described in the Methods section. DNA was isolated either by a differential extraction method that separated the sperm from the non-sperm DNA or by a non-differential organic extraction method in which the sperm and non-sperm DNA were co-extracted. The isolated DNA was analyzed with Y-STRs (MPI/II) and a set of nine autosomal STRs.

The DNA recovered immediately after intercourse yielded complete MPI and MPII male profiles from both the differentially extracted sperm fraction and the non-differential admixed extract using 3 ng of input template DNA (data not shown). A similar amount of input DNA from the 12 h sperm fraction gave a full MPI and MPII Y-STR male profile but, in this case, the non-differential admixture failed to yield a Y-STR profile (data not shown). Interestingly, however, a hundredfold increase in the amount of input DNA from the non-differential admixture (i.e., 300 ng) gave a complete male Y-STR profile, albeit with the concurrent appearance of the aforementioned female product signals (data not shown).

The results from the 24 h post-coital samples proved to be significant. A differential extract yielded no quantifiable DNA in the sperm cell fraction and consequently no autosomal or Y-STR profiles from the male donor were obtained (Figs. 6a(i), 6b(i), 6c(i)). The non-sperm fraction from the differential extract yielded an autosomal STR profile from the female donor only and, furthermore, no Y-STR allelic signals were obtained (data not shown). However, when 300 ng of DNA from a non-differentially extracted swab was used, an MPI and MPII Y-STR haplotype profile was obtained (Figs. 6b(ii), 6c(ii)). Autosomal STR analysis of the non-differential extract demonstrated the presence of the female donor's profile only (Fig. 6a(ii)). These data are consistent with the hypothesis that the few remaining sperm present in these 24 h post-coital samples were lost during the differential extraction process.

Fig. 6a–c. Post-coital cervicovaginal samples. DNA from cervicovaginal samples taken 24 h after intercourse was isolated using both a differential and a non-differential extraction protocol. (a) Autosomal STR profiles: (i) sperm fraction from differential extract; (ii) non-differential extract. (b) Y-STR profiles (MPI): (i) sperm fraction from differential extract; (ii) non-differential extract. (c) Y-STR profiles (MPII): (i) sperm fraction from differential extract; (ii) non-differential extract

The striking ability to detect the Y-STR profile of the male donor in extended post-coital interval samples despite the failure to do so using a differential extraction protocol is probably due to two factors. First, the use of a standard organic non-differential extraction method requires less sample manipulation and therefore less opportunity for loss of fragile or few cells. Second, the few remaining male cells are present in all likelihood with a 100- to 1,000-fold overabundance of female vaginal epithelial cells. The high specificity of the MPI and MPII primers for the Y chromosome allows input of hundreds of nanograms of admixed male/female DNA into the PCR reaction with few of the confounding artifacts normally observed with autosomal STR systems at such high input DNA concentrations.
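A back-of-the-envelope calculation illustrates why a large total input is needed in such samples. The sketch below assumes, as a simplification that is not stated in the text, that the DNA mass ratio tracks the quoted 100- to 1,000-fold cell excess; the function name is hypothetical.

```python
def male_ng_in_input(total_input_ng: float, female_fold_excess: float) -> float:
    """Male DNA (ng) in an input in which female DNA is
    `female_fold_excess` times the male DNA (simplifying assumption)."""
    return total_input_ng / (1 + female_fold_excess)


for fold in (100, 1000):
    for total in (3, 300):
        male_ng = male_ng_in_input(total, fold)
        print(f"{fold}-fold female excess, {total} ng input: {male_ng:.3f} ng male DNA")
```

Under this assumption a 3 ng input would contain only picogram quantities of male DNA, whereas a 300 ng input would contain roughly 0.3 to 3 ng.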

Although it was possible to obtain the male donor Y-STR profile from 24 h post-coital samples, a number of co-amplified female DNA products were also obtained. Some of these co-localized with certain of the Y-STR loci and interfered with allele designations at those loci. In an attempt to improve the performance of MPI and MPII with extended interval post-coital samples, an additional set of optimization experiments was conducted.

Enhancements to MPI and MPII

As previously described, the unwanted female DNA products arise from an interaction between primers at different loci. We conducted a series of experiments in which one primer pair at a time was removed, and the remaining sets were allowed to react with 300 ng of female DNA. After a specific locus pair was implicated in the production of a particular artifact, additional primer subtraction experiments revealed specifically which primer was at fault. This approach identified primers that were responsible for a significant fraction of, but not all, MPI and MPII female DNA artifacts.
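The logic of the two-round subtraction experiment can be illustrated with a toy model. Everything in the sketch below is hypothetical: the locus and primer names are invented, and `artifacts_observed` merely stands in for the laboratory step of amplifying female DNA with a given primer set under the assumption that an artifact needs both members of an interacting primer pair.

```python
# Toy model: an artifact appears only when both interacting primers are present.
ARTIFACT_SOURCES = {
    "artifact_1": {"locusA_rev", "locusC_rev"},
}
PRIMER_PAIRS = {
    "locusA": ("locusA_fwd", "locusA_rev"),
    "locusB": ("locusB_fwd", "locusB_rev"),
    "locusC": ("locusC_fwd", "locusC_rev"),
}


def artifacts_observed(primers):
    """Stand-in for an amplification of female DNA with this primer set."""
    present = set(primers)
    return {name for name, needed in ARTIFACT_SOURCES.items() if needed <= present}


def all_primers(exclude=()):
    """Full multiplex primer list minus any excluded primers."""
    return [p for pair in PRIMER_PAIRS.values() for p in pair if p not in exclude]


def implicated_loci(artifact):
    """Round 1: remove one primer pair at a time; keep loci whose removal abolishes the artifact."""
    return [
        locus
        for locus, pair in PRIMER_PAIRS.items()
        if artifact not in artifacts_observed(all_primers(exclude=pair))
    ]


def implicated_primers(artifact, loci):
    """Round 2: within implicated loci, remove single primers to find those at fault."""
    return [
        primer
        for locus in loci
        for primer in PRIMER_PAIRS[locus]
        if artifact not in artifacts_observed(all_primers(exclude=(primer,)))
    ]


loci = implicated_loci("artifact_1")
print(loci, implicated_primers("artifact_1", loci))
```

Restricting the second round to the implicated loci keeps the number of reactions small compared with removing every primer individually.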

Many of the MPI green channel artifacts were due to the DYS19 reverse primer (data not shown). Similarly, the cause of two of the major MPII artifacts was an inter-locus interaction between two labeled primers, the DYS390 and DYS438 primers (Fig. 7). Elimination of these particular artifacts was achieved by a combination of primer re-design (DYS19) and moving DYS19 to MPII and DYS438 to MPI. This facilitated the addition of a new locus, Y-GATA-A7.2 (DYS461) [9], to MPI. These enhanced versions of MPI and MPII permit the analysis of nineteen Y-STR loci (Fig. 8) and exhibit reduced female product formation (Fig. 9). The performance of these modified systems for operational use, particularly in sexual assault cases involving an extended post-coital time interval, is currently under investigation.

Fig. 7. Example of an X chromosome artifact. The two predominant MPII female peaks result from the interaction of the 390 reverse primer (390RF) and the 438 reverse primer (438RH)

Fig. 8. Modified MPI and MPII. As a result of efforts to eliminate X chromosome artifacts, slightly modified versions of MPI and MPII were designed. DYS19 was incorporated into MPII and DYS438 was switched to MPI. An additional locus, Y-GATA-A7.2, was added to MPI that, together with MPII, now permits the analysis of 19 Y chromosome STRs

Fig. 9a,b. Reduced female DNA artifact peaks with modified versions of MPI and MPII: 300 ng of female DNA was amplified using modified (a) MPI and (b) MPII. The allelic size ranges of the Y-STR loci are indicated

Conclusions

We have developed two Y chromosome STR systems, multiplex I (MPI) and multiplex II (MPII), which permit the robust co-amplification of 18 Y-STRs, including DYS19, DYS385(a) and (b), DYS388, DYS389I and II, DYS390, DYS391, DYS392, DYS393, DYS425, DYS434, DYS437, DYS438, DYS439, Y-GATA-C4, Y-GATA-A7.1 (DYS460) and Y-GATA-H4. In accordance with the requirements of a Y chromosome multiplex analytical system developed specifically for forensic casework use, we have sought to maximize the number of loci able to be co-amplified, ensure appropriate assay sensitivity (1–2 ng of input genomic DNA), balance inter-locus signals and minimize confounding female DNA artifacts.

These systems have been assessed for their potential forensic utility with promising results. Complete male haplotypes were obtained with as little as 100–250 pg of template DNA, and the number of male donors in mixed stains could be determined with relative ease. Although a limited number of female DNA artifacts are observed in mixed stains in which the male DNA comprises 1/100 of the total, the male profile is easily discernible.

Slightly modified versions of MPI and MPII demonstrate a significant reduction in female artifacts. Thus, it may not be necessary to employ a differential extraction strategy to obtain a male haplotype (or haplotypes in the case of multiple male donors) in cases of sexual assault. The potential utility of MPI and MPII for forensic casework is exemplified by their ability to dissect out the male haplotype in extended interval post-coital vaginal swabs.

This study has emphasized the need for novel Y-STR multiplexes developed for forensic use to undergo a series of validation exercises that go beyond simply optimizing the PCR reaction conditions. Specifically, stringent performance checks on their efficacy need to be carried out using casework-type specimens in order to determine potential confounding effects from female DNA.