Abstract
An X-chromosomal multiplex amplifying ten short tandem repeats (STRs) in a single PCR reaction was developed and optimized in this work. The X-STRs included were DXS8378, DXS9898, DXS8377, HPRTB, GATA172D05, DXS7423, DXS6809, DXS7132, DXS101, and DXS6789. The performance of the decaplex was tested on 377 male samples from three United States population groups, namely, 130 African Americans, 104 Asians, and 143 Hispanics. DXS8377 was the most polymorphic locus across all three populations, whereas DXS7423 was the least informative marker. Genetic distance analysis (R_ST and F_ST) performed for the three populations residing in New York showed significant genetic distances between population groups for most pairwise comparisons; the exceptions were HPRTB, DXS6809, and DXS7132, at which no significant distances were found. When linkage disequilibrium was tested for all pairs of loci in the three groups, no significant association was found between any pair of loci after applying the Bonferroni correction. The high values for the average probability of excluding a random man obtained in all three populations, whether both mother and daughter are tested or father/daughter relationships are evaluated, support the potential of this decaplex system in kinship analysis. Likewise, the overall high power of discrimination values for samples of female and male origin confirm the usefulness of this decaplex system in identification analysis. As expected, the results also support the use of independent databases comprising these ten X-linked loci for the three US populations evaluated.
Introduction
The human X chromosome has become the subject of extensive research within the fields of population genetics and forensics in recent years (e.g., [1, 4, 7, 8, 16–18]). X chromosomal STRs (X-STRs) can complement the analysis of Y and autosomal (AS) STRs, especially in cases where the offspring is female and the alleged father is unavailable, so that close blood relatives must be studied, or in the less common situation of maternity cases. Szibor et al. [18] summarized formulae for calculating the mean exclusion chance (MEC) when evaluating the forensic efficiency of X and AS loci in the context of kinship analysis. In trios involving a daughter, X chromosome markers are clearly more efficient than AS markers, as the MEC is higher for these specific markers. Before any forensic application, it is important to collect population data and construct reference databases to document the genetic variation of these specific STRs among worldwide populations.
Data on mutation rates and linkage disequilibrium of X-STRs, which are essential for evaluating the potential applications of these genetic markers, are also lacking. Large PCR multiplexes for X-linked genetic markers make population studies and databasing more efficient, and therefore need to be designed and optimized. Several X-STR multiplexes have been described in the literature (e.g., [2, 4, 15, 16, 21, 22]); however, amplification of a large number of STR loci in a single PCR reaction has so far been uncommon. This study presents information on the development and optimization of a ten-X-linked locus fluorescent STR multiplex and its application to the study of three United States population groups, namely, African Americans, Hispanics, and Asians. The aim of this study was to present and compare the distribution of allele frequencies of a newly developed decaplex X-chromosomal STR system in three US population groups.
Materials and methods
Samples and DNA extraction
Post-mortem blood stains available for research purposes at the Forensic Biology Department of the Office of Chief Medical Examiner, NY, USA were selected for this study. A total of 377 male samples from US-residing populations were typed for the ten X-STRs included in the present work: 130 African Americans, 104 Asians, and 143 Hispanics. DNA was extracted with silica-coated magnetic bead purification technology on the automated M-48 bio-Robot (Qiagen, Hilden, Germany) following the manufacturer’s instructions (see also [13]).
X chromosomal STR amplification
Amplification was performed in a multiplex system, amplifying the following ten X-STRs in a single PCR reaction: DXS8378, DXS9898, DXS8377, HPRTB, GATA172D05, DXS7423, DXS6809, DXS7132, DXS101, and DXS6789. Primer sequences are listed in Table S1. For DXS9898, DXS6809, DXS7132, DXS6789, and DXS101, new primers were designed using PRIMER3 software (http://frodo.wi.mit.edu/cgi-bin/primer3/primer3_www.cgi). PCR amplification was carried out using the QIAGEN Multiplex PCR kit (Qiagen) at 1× Qiagen multiplex PCR master mix and 0.5–5 ng of genomic DNA in a 10-μl final reaction volume. The final concentration of each primer in the reaction was 0.2 μM. Thermocycling conditions, using a GeneAmp PCR system 9700 thermocycler (Applied Biosystems, Foster City, CA, USA), were: pre-incubation for 15 min at 95°C, followed by ten cycles of 30 s at 94°C, 90 s at 60°C, and 60 s at 72°C; then 20 cycles of 30 s at 94°C, 90 s at 58°C, and 60 s at 72°C; and a final incubation for 30 min at 72°C.
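The thermocycling program above can be written down as a small data structure, which makes it easy to check the number of cycles and the programmed hold time. The following is only an illustrative sketch: the class and function names (`Step`, `Stage`, `hold_time_seconds`, `total_cycles`) are hypothetical, and the computed time ignores ramping between temperatures, so the real run time on the instrument would be longer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    seconds: int   # programmed hold time
    celsius: int   # programmed temperature


@dataclass(frozen=True)
class Stage:
    cycles: int    # how many times the steps are repeated
    steps: tuple   # tuple of Step objects


# Program as described in the text (hypothetical encoding).
PROGRAM = (
    Stage(1, (Step(15 * 60, 95),)),                         # pre-incubation
    Stage(10, (Step(30, 94), Step(90, 60), Step(60, 72))),  # ten cycles, 60 degC annealing
    Stage(20, (Step(30, 94), Step(90, 58), Step(60, 72))),  # 20 cycles, 58 degC annealing
    Stage(1, (Step(30 * 60, 72),)),                         # final incubation
)


def hold_time_seconds(program):
    """Sum of programmed hold times; temperature ramping is not included."""
    return sum(
        stage.cycles * sum(step.seconds for step in stage.steps)
        for stage in program
    )


def total_cycles(program):
    """Number of amplification cycles (stages with more than one step)."""
    return sum(stage.cycles for stage in program if len(stage.steps) > 1)


if __name__ == "__main__":
    print(total_cycles(PROGRAM), "cycles")
    print(hold_time_seconds(PROGRAM) / 60, "min of programmed hold time")
```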
Detection, typing, and analysis of PCR products
Separation and detection were performed using the ABI PRISM 3100 Genetic Analyser 16-capillary electrophoresis system. To each PCR product, 9.5 μl of HiDi formamide (Applied Biosystems) and 1.5 μl of internal size standard GS500 LIZ (Applied Biosystems) were added. Fragment sizes were determined automatically using GeneScan Analysis 2.1 software (Applied Biosystems). Genotyping was performed with Genotyper software (Applied Biosystems) by comparison with the DNA control reference samples 9947A (female) and 9948 (male), taken from Promega commercial kits (Promega, Madison, WI, USA; originally established by Coriell Institute as NA9947 and NA9948, respectively; http://locus.umdnj.edu/nigms) and previously typed by Szibor et al. [19].
Sequencing analysis
In this study, a few intermediate alleles were detected, as well as previously unreported alleles at some loci. DNA sequencing was therefore performed to confirm these results, and reference samples were also sequenced for loci DXS9898, HPRTB, DXS8377, GATA172D05, and DXS6809. To confirm evidence of allele dropout at the GATA172D05 locus, a new set of primers was designed using PRIMER3 software and used for sequencing (Table S1). PCR-amplified fragments were purified with Microspin S-300 HR columns (GE Healthcare, Amersham Place, UK) and the sequencing reaction was performed using the ABI Big Dye Terminator Cycle Sequencing Ready Reaction kit (Applied Biosystems). Products were visualized in an ABI PRISM 3100 Genetic Analyser electrophoresis system and analyzed with Sequencing analysis 3.7 software (Applied Biosystems).
Statistical analysis
Allelic frequencies and standard errors, gene diversities, analysis of molecular variance (AMOVA), and population pairwise genetic distances (R_ST for all markers except DXS9898, for which F_ST was calculated; see Pereira et al. [14] for details) were calculated using ARLEQUIN ver 3.0 software [9]. Linkage disequilibrium tests were performed for all pairs of loci included in this work using the same software. Statistics for forensic efficiency evaluation of each locus, namely, the MEC in trios involving daughters (MECT) and in father/daughter duos (MECD), and the power of discrimination in women (PDF) and in men (PDM), were calculated using the formulae of Desmarais et al. [6].
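As an illustration of how these forensic efficiency statistics depend on allele frequencies, the sketch below implements the closed-form expressions in the form in which they are commonly quoted for X-linked markers (attributed to Desmarais et al. and summarized by Szibor et al. [18]). The expressions are reproduced here as an assumption and should be checked against [6] before use; the function names are hypothetical, and the combination across loci assumes independent loci. With f_i the frequency of allele i: MECT = 1 − Σf_i² + Σf_i⁴ − (Σf_i²)², MECD = 1 − 2Σf_i² + Σf_i³, PDF = 1 − 2(Σf_i²)² + Σf_i⁴, and PDM = 1 − Σf_i².

```python
import math


def _power_sum(freqs, k):
    """Sum of the k-th powers of the allele frequencies."""
    return sum(f ** k for f in freqs)


def _check(freqs):
    # Allele frequencies of one locus are expected to sum to 1.
    if not math.isclose(sum(freqs), 1.0, abs_tol=1e-6):
        raise ValueError("allele frequencies must sum to 1")


def mec_trio(freqs):
    """MECT: mean exclusion chance in mother/daughter/alleged-father trios."""
    _check(freqs)
    s2 = _power_sum(freqs, 2)
    return 1 - s2 + _power_sum(freqs, 4) - s2 ** 2


def mec_duo(freqs):
    """MECD: mean exclusion chance in father/daughter duos."""
    _check(freqs)
    return 1 - 2 * _power_sum(freqs, 2) + _power_sum(freqs, 3)


def pd_female(freqs):
    """PDF: power of discrimination in women (two X chromosomes)."""
    _check(freqs)
    return 1 - 2 * _power_sum(freqs, 2) ** 2 + _power_sum(freqs, 4)


def pd_male(freqs):
    """PDM: power of discrimination in men (one X chromosome)."""
    _check(freqs)
    return 1 - _power_sum(freqs, 2)


def combine(values):
    """Combined value over loci, assuming the loci are independent."""
    return 1 - math.prod(1 - v for v in values)


if __name__ == "__main__":
    toy_locus = [0.5, 0.5]  # hypothetical two-allele locus, not data from this study
    print(mec_trio(toy_locus), mec_duo(toy_locus))
    print(pd_female(toy_locus), pd_male(toy_locus))
```

For the hypothetical two-allele locus, the values can be verified by enumerating genotypes by hand, which is a useful sanity check before applying the functions to the frequencies in Table 1.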
Results and discussion
Amplification X-STR multiplex performance
The ten X-linked STR markers were successfully amplified in a single multiplex PCR reaction under the conditions reported above (Fig. S1). To achieve balanced amplification across the multiplex, primer sequences for loci DXS9898, DXS6809, DXS7132, and DXS6789 had to be redesigned. The new primers were designed to fit the PCR product sizes within the range available for each dye color and to have similar annealing temperatures. New primer sequences also had to be selected for locus DXS101, as the initial primers (Table S1) amplified poorly in comparison with the other X-linked STR loci selected for this work, showing much lower amplification intensity in the multiplex even when the primer concentration for this STR was twice that of all other primers. After the new primers had been designed and tested, amplification of locus DXS101 was finally balanced within the decaplex. DNA amounts ranging from 0.10 to 10 ng were tested for this multiplex, and the best balance among STRs was observed with 0.5 ng of DNA per reaction.
Nomenclature
Allelic nomenclature used for sample genotyping for most loci was according to Szibor et al. [19], except for two markers, where our sequencing data of reference samples revealed different typing results. After sequencing analysis of DNA reference samples 9947A and 9948 for locus HPRTB (Table S2), a TCTA repeat motif structure was considered and consequently these samples gained an extra repeat for this locus, having a 15-repeat TCTA motif instead of 14 repeats. In accordance with the DNA recommendations of the International Society for Forensic Genetics [3], we highlight the fact that both reference DNA samples 9947A and 9948 have a 15 allele for this specific marker. Genotyping of reference samples 9947A and 9948 for marker DXS9898 revealed incompatibilities among them. After DNA sequencing of 9948, we found 13 repeats for this sample instead of 14 repeat units (Table S2). Consequently, our 9948 genotype profile for calibration is a 13 allele.
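Allele designation from sequence data comes down to counting consecutive copies of the repeat motif. A minimal sketch of that counting step is given below; the function name and the test sequence are hypothetical (the sequence is synthetic, not the HPRTB sequence from Table S2), and the approach is a simplification that handles only uninterrupted runs of a single motif.

```python
import re


def longest_repeat_run(sequence, motif):
    """Return the largest number of consecutive copies of `motif` in `sequence`."""
    pattern = re.compile(f"(?:{re.escape(motif.upper())})+")
    runs = [
        len(match.group(0)) // len(motif)
        for match in pattern.finditer(sequence.upper())
    ]
    return max(runs, default=0)


if __name__ == "__main__":
    # Synthetic example: arbitrary flanks around 15 TCTA repeats.
    synthetic = "GGCA" + "TCTA" * 15 + "AGGT"
    print(longest_repeat_run(synthetic, "TCTA"))
```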
X-STR alleles
Allele frequencies obtained for the ten X-STR loci studied (DXS8378, DXS9898, DXS8377, HPRTB, GATA172D05, DXS7423, DXS6809, DXS7132, DXS101, and DXS6789) in the three US population groups (African American, Asian, and Hispanic) are presented in Table 1. In this work four alleles not previously described were found, two at DXS9898 in African Americans and Hispanics (alleles 7 and 13.3), and two at DXS6809 in African Americans (alleles 31.1 and 33.1). These new alleles were sequenced and results are presented in Table S2.
A rare intermediate allele at locus HPRTB
Sequencing of what was initially considered an intermediate allele at locus HPRTB, with a size equivalent to 12.2 repeats, revealed 13 TCTA repeat motifs (Table S2). Comparison with the sequences of reference samples 9947A and 9948 for the same locus showed an uninterrupted repeat motif in all three sequencing results. However, an AG deletion at bases 48 and 49 downstream from the repeat unit was detected in the “12.2” allele; a similar result was reported by Mertens et al. [12]. According to our results and to the recommendations of the DNA commission of the ISFG [10], we named this allele 13 (D48AGdel) instead of 12.2. Thus, if a different reverse primer is used that anneals upstream of the mutation, smaller amplicons will be produced and the fragment size will be compatible with 13 repeats, as the amplicon will not include the deletion.
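The arithmetic behind the apparent “12.2” designation can be made explicit: size-based typing only sees the total amplicon length, so a 2-bp deletion in the flanking region makes a 13-repeat tetranucleotide allele look two bases short. The sketch below is illustrative only; the function name and parameters are hypothetical, and it assumes that fragment size maps linearly onto repeat number.

```python
def apparent_allele(repeats, motif_len=4, flank_indel_bp=0):
    """Size-based allele call for an allele with `repeats` full repeats.

    `flank_indel_bp` is the net length change in the flanking region covered
    by the amplicon (negative for a deletion, positive for an insertion).
    """
    total_bp = repeats * motif_len + flank_indel_bp
    if total_bp < 0:
        raise ValueError("indel is larger than the repeat region")
    full_repeats, extra_bp = divmod(total_bp, motif_len)
    return f"{full_repeats}.{extra_bp}" if extra_bp else str(full_repeats)


if __name__ == "__main__":
    # 13 TCTA repeats with a 2-bp flanking deletion inside the amplicon.
    print(apparent_allele(13, motif_len=4, flank_indel_bp=-2))
    # Same allele when the amplicon does not span the deletion.
    print(apparent_allele(13, motif_len=4, flank_indel_bp=0))
```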
A null allele at locus GATA172D05
A loss of an allele at locus GATA172D05 was observed in one sample in the Hispanic population group, after several attempts of multiplex and singleplex amplification. After decreasing the PCR specificity by lowering annealing temperatures of primers to 50°C in the singleplex reaction, a DNA fragment was amplified with six repeats. New primer sequences were designed, away from the previous flanking regions and used for sequencing and detection of the polymorphism responsible for the allele dropout. Results revealed a nucleotide substitution G→A at nucleotide 7 from the 3′ end of the reverse primer sequence (Table S2). The mutation was only observed in one out of the 377 chromosomes studied confirming the rarity of this null allele. For this reason, no measure was taken to change primer sequences for the GATA172D05 locus.
Electrophoretic mobility of shorter alleles at DXS8377 and DXS9898
For DXS8377, we found evidence of anomalous mobility in two samples with sizes initially thought to be equivalent to alleles 37.1 and 39.1, which could lead to genotyping errors. DNA sequencing revealed no polymorphisms in or outside the repeat regions of these two samples that could explain the non-consensus genotypes (Table S2). The same was observed at the DXS9898 locus, where an allele initially genotyped as 6.3 based on its size was found to carry seven repeats without any point mutations in the regions flanking the repeat (Table S2). Therefore, the electrophoretic behavior of the shorter alleles at DXS8377 and DXS9898 reinforces the need for sequenced allelic ladders for accurate typing [3].
Forensic efficiency
Forensic statistical evaluation parameters were calculated for all three groups and are shown in Table S3. All loci selected for this decaplex proved to be highly polymorphic, which confirms their potential use for forensic purposes. DXS8377 was the most polymorphic locus in all three population groups, followed by DXS101 in African Americans and in Hispanics. In Asians, the second most polymorphic marker was DXS6809. The least discriminating locus in all populations was DXS7423. The high values obtained for combined MECT and MECD in all three populations support the potential of this decaplex system in a specific kinship analysis context, namely when the offspring is female or when father/daughter relationships are being investigated.
The overall values of PDF and PDM were likewise high in all three populations. These high values of power of discrimination, obtained both in females and in males, support the value of this X-STR multiplex in forensic identity testing.
Population pairwise comparisons
Population differentiation among the three US population groups studied in the present work was evaluated by genetic distance analysis. The results (Table S4) show that for loci HPRTB, DXS6809, and DXS7132 there are no significant genetic distances between population groups. On the other hand, for GATA172D05 and DXS6789, highly significant R_ST values were obtained in all pairwise comparisons. At DXS8378, DXS101, and DXS9898, the only non-significant genetic distances were those between African Americans and Hispanics, and for DXS7423 the only non-significant R_ST value was observed between Asians and Hispanics. In contrast, Asians and Hispanics were the only pair of populations showing a significant genetic distance at DXS8377. As expected, these results are consistent with other studies of the genetic structure of US populations using different genetic markers [5, 11]. Consequently, a pooled global database cannot be used for this decaplex X-STR system; instead, independent databases should be employed for each of the tested New York ethnic groups.
Linkage disequilibrium
The exact test for linkage disequilibrium was performed for all pairs of loci in the three population groups. In Hispanics, the only significant result out of 45 pairwise comparisons (p = 0.016) was obtained between DXS8378 and DXS7132. Nevertheless, as these two markers are quite distant on the chromosome and no significant associations between intermediately located markers were found, no real linkage disequilibrium is expected to exist between them, and the result is best attributed to sampling effects. The same was observed in Asians, where two pairs of distant loci also revealed significant association (p = 0.0326 between DXS9898 and DXS101 and p = 0.0434 between DXS7132 and DXS6789). In African Americans, p values below 5% were observed in five pairs of loci.
The strongest association (p = 0.0165) was found between loci DXS6809 and DXS6789, which have been described as part of the same haplotype cluster group composed of DXS6801-DXS6809-DXS6789 [20]. Recent population admixture has probably not yet allowed recombination to break down this association. However, as the p values do not withstand Bonferroni’s correction (p < 0.0011), it cannot be considered established that, in forensic applications, DXS6809 and DXS6789 should be treated in African Americans as a haplotype rather than as independent loci. In any case, to allow future comparisons and sample size enlargements, haplotype frequencies for these two loci are shown in Table S5.
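The Bonferroni threshold quoted above follows directly from the number of locus pairs: ten loci give 45 pairwise tests per population, and 0.05/45 ≈ 0.0011. The short sketch below reproduces that calculation and compares it with the p values reported in the text. The function and variable names are hypothetical, and only the p values explicitly stated in this section are included.

```python
from math import comb


def bonferroni_threshold(n_loci, alpha=0.05):
    """Return (number of pairwise tests, per-test significance threshold)."""
    n_tests = comb(n_loci, 2)
    return n_tests, alpha / n_tests


# p values explicitly reported in the text (population, locus pair).
REPORTED = {
    ("Hispanics", "DXS8378", "DXS7132"): 0.016,
    ("Asians", "DXS9898", "DXS101"): 0.0326,
    ("Asians", "DXS7132", "DXS6789"): 0.0434,
    ("African Americans", "DXS6809", "DXS6789"): 0.0165,
}

if __name__ == "__main__":
    n_tests, threshold = bonferroni_threshold(10)
    print(f"{n_tests} tests, threshold {threshold:.4f}")
    for key, p in REPORTED.items():
        verdict = "significant" if p < threshold else "not significant"
        print(key, p, verdict, "after correction")
```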
The ten markers included in the present multiplex are distributed along the four different linkage groups on the X chromosome. Among these markers, only DXS6809 and DXS6789, belonging to the second linkage group, have been shown to be in strong linkage disequilibrium [16, 20]. The lack of association between these ten X-STRs contributes to the increased power of discrimination of this multiplex. Nevertheless, haplotype analysis has been demonstrated to be a valuable tool in pedigree-based kinship testing. Therefore, similar to the strategy followed by Robino et al. [16], the development of a second multiplex assay including markers closely linked to these ten X-STRs can be useful in solving these particular cases. The two sixplex systems developed by Robino et al. [16] overlap in eight out of ten markers included in the present decaplex. Hence, it would be necessary to type only two additional markers (DXS7424 and DXS6801) to complete the two groups described by Robino et al. [16] as being in strong linkage disequilibrium.
In conclusion, this work demonstrates the usefulness of this X-STR decaplex system in both anthropological and identification analysis in the three studied US population groups, as well as for population genetic studies.
References
1. Asamura H, Sakai H, Kobayashi K, Ota M, Fukushima H (2006) MiniX-STR multiplex system population study in Japan and application to degraded DNA analysis. Int J Leg Med 120:174–181
2. Athanasiadou D, Stradmann-Bellinghausen B, Rittner C, Alt KW, Schneider PM (2003) Development of a quadruplex PCR system for the genetic analysis of X-chromosomal STR loci. Int Congr Ser 1239:311–314
3. Bär W, Brinkmann B, Budowle B et al (1997) DNA recommendations: further report of the DNA Commission of the ISFH regarding the use of short tandem repeat systems. Int J Leg Med 110:175–176
4. Bini C, Ceccardi S, Ferri G et al (2005) Development of a heptaplex PCR system to analyse X-chromosome STR loci from five Italian population samples. A collaborative study. Forensic Sci Int 153:231–236
5. Budowle B, Adamowicz M, Aranda XG et al (2005) Twelve short tandem repeat loci Y chromosome haplotypes: genetic analysis on populations residing in North America. Forensic Sci Int 150:1–15
6. Desmarais D, Zhong Y, Chakraborty R, Perreault C, Busque L (1998) Development of a highly polymorphic STR marker for identity testing purposes at the human androgen receptor gene (HUMARA). J Forensic Sci 43:1046–1049
7. Edelmann J, Hering S, Michael M et al (2001) 16 X-chromosome STR loci frequency data from a German population. Forensic Sci Int 124:215–218
8. Edelmann J, Deichsel D, Hering S, Plate I, Szibor R (2002) Sequence variation and allele nomenclature for the X-linked STRs DXS9895, DXS8378, DXS7132, DXS6800, DXS7133, GATA172D05, DXS7423 and DXS8377. Forensic Sci Int 129:99–103
9. Excoffier L, Laval G, Schneider S (2005) Arlequin ver. 3.0: an integrated software package for population genetics data analysis. Evol Bioinformatics Online 1:47–50
10. Gusmão L, Butler JM, Carracedo A et al (2006) DNA Commission of the International Society of Forensic Genetics (ISFG): an update of the recommendations on the use of Y-STRs in forensic analysis. Int J Leg Med 120:191–200
11. Kayser M, Brauer S, Schädlich H et al (2003) Y chromosome STR haplotypes and the genetic structure of U.S. populations of African, European, and Hispanic ancestry. Genome Res 13:624–634
12. Mertens G, Gielis M, Mommers N et al (1999) Mutation of the repeat number of the HPRTB locus and structure of rare intermediate alleles. Int J Leg Med 112:192–194
13. Nagy M, Otremba P, Kruger C et al (2005) Optimization and validation of a fully automated silica-coated magnetic beads purification technology in forensics. Forensic Sci Int 152:13–22
14. Pereira R, Gomes I, Amorim A, Gusmão L (2006) Genetic diversity of 10 X-chromosome STRs in northern Portugal. Int J Leg Med. DOI 10.1007/s00414-006-0144-4
15. Poetsch M, Petersmann H, Repenning A, Lignitz E (2005) Development of two pentaplex systems with X-chromosomal STR loci and their allele frequencies in a northeast German population. Forensic Sci Int 155:71–76
16. Robino C, Giolitti A, Gino S, Torre C (2006) Development of two multiplex PCR systems for the analysis of 12 X-chromosomal STR loci in a northwestern Italian population sample. Int J Leg Med 120:315–318
17. Shin SH, Yu JS, Park SW, Min GS, Chung KW (2005) Genetic analysis of 18 X-linked short tandem repeat markers in Korean population. Forensic Sci Int 147:35–41
18. Szibor R, Krawczak M, Hering S, Edelmann J, Kuhlisch E, Krause D (2003a) Use of X-linked markers for forensic purposes. Int J Leg Med 117:67–74
19. Szibor R, Edelmann J, Hering S et al (2003b) Cell line DNA typing in forensic genetics: the necessity of reliable standards. Forensic Sci Int 138:37–43
20. Szibor R, Hering S, Kuhlisch E, Plate I, Demberger S, Krawczak M, Edelmann J (2005) Haplotyping of STR cluster DXS6801-DXS6809-DXS6789 on Xq21 provides a powerful tool for kinship testing. Int J Leg Med 119:363–369
21. Tabbada KA, De Ungria MCA, Faustino LP, Athanasiadou D, Stradmann-Bellinghausen B, Schneider PM (2005) Development of a pentaplex X-chromosomal short tandem repeat typing system and population genetic studies. Forensic Sci Int 154:173–180
22. Zarrabeitia MT, Amigo T, Sanudo C, Zarrabeitia A, González-Lamuño D, Riancho JA (2002) A new pentaplex system to study short tandem repeat markers of forensic interest on X chromosome. Forensic Sci Int 129:85–89
Acknowledgements
This work was partially supported by Fundação Luso-Americana para o Desenvolvimento (FLAD) and Fundação para a Ciência e a Tecnologia through grant SFRH/BD/21647/2005 and by “Programa Operacional Ciência e Inovação 2010” (POCI 2010), VI Programa-Quadro (2002–2006).
Electronic Supplementary Material
Below is the link to the electronic supplementary material.
Table S1
Primer sequences, labels, and respective references (DOC 29 KB).
Table S2
Sequencing results for DXS9898, DXS8377, HPRTB, DXS6809, and GATA172D05 (DOC 30 KB).
Table S3
Forensic efficiency statistics for each X-STR locus and overall values for the decaplex system (DOC 37.5 KB).
Table S4
R_ST values for X-STRs, except for DXS9898 where F_ST values are presented, and AMOVA results for global variation (% var) within population groups (DOC 29 KB).
Table S5
Haplotype frequencies and standard errors (S.E.) for DXS6809 and DXS6789 (DOC 34.5 KB).
Figure S1
Electropherogram of a male profile for the X-STR decaplex system. (DOC 230 KB).
Cite this article
Gomes, I., Prinz, M., Pereira, R. et al. Genetic analysis of three US population groups using an X-chromosomal STR decaplex. Int J Legal Med 121, 198–203 (2007). https://doi.org/10.1007/s00414-006-0146-2
DOI: https://doi.org/10.1007/s00414-006-0146-2